
Valentin Liévin

@valentinlievin.bsky.social


ML research at Google DeepMind
Better LLMs for healthcare and science.


Thank you @mike-shake.bsky.social, Anil Palepu, Tao Tu, Alan Kharti, Adam Rodman, Vivek Natarajan, Wei-Hung, and many others!

March 6, 2025 at 7:10 PM

[7/n] This is a major update for AMIE. While further research is needed, our study hints that AI could become a powerful tool for improving clinical decisions and healthcare access! Congratulations to the team for this incredible piece of work!

March 6, 2025 at 7:10 PM

[6/n] We also challenged AMIE with RxQA, a new medication reasoning benchmark requiring precise pharmacological knowledge. Again, AMIE outperformed PCPs.

March 6, 2025 at 7:10 PM

[5/n] We assessed AMIE’s clinical skills via a multi-visit Objective Structured Clinical Examination (OSCE), a common tool to evaluate medical professionals. AMIE scored (slightly) higher than primary care physicians (PCPs) in both management quality and alignment with guidelines!

March 6, 2025 at 7:10 PM

[4/n] In management reasoning, there is generally not a single ground truth, but rather a space of acceptable solutions. Capturing a reliable signal to gauge plan quality was a major challenge. Gemini-as-a-judge provided a robust signal, which we hill-climbed on to improve our agent.

March 6, 2025 at 7:10 PM

[3/n] Management reasoning is hard. AMIE can retrieve specific recommendations from the guidelines in context, reason about them, and compile the findings into a personalized plan with citations. Structured generation was a key ingredient to elicit long and controllable reasoning.

March 6, 2025 at 7:10 PM

[2/n] We upgraded AMIE’s original dialogue agent with a reasoning partner. The new reasoning agent taps into Gemini’s long-context reasoning and retrieval capabilities to process 100+ pages of evidence-based clinical guidelines from @NICEComms and @BMJBestPractice, in real time!

March 6, 2025 at 7:10 PM
