Luisa Zintgraf
@luisazintgraf.bsky.social
RL & Meta-Learning @ DeepMind.
Huge shout-out to my co-first authors @dancalian.bsky.social, @gregfar.bsky.social, & Iurii Kemaev.
And to our amazing collaborators: Matteo Hessel, Jeremy Shar, Junhyuk Oh, András György, @schaul.bsky.social, @jeffdean.bsky.social, Hado van Hasselt, & Dave Silver.
November 6, 2025 at 11:29 AM
We believe that the DataRater is a promising step towards more automated and principled dataset curation. This could be especially important for filtering and making the best use of massive synthetic datasets in the future.
For a deeper dive, check out arxiv.org/pdf/2505.17895
November 6, 2025 at 11:29 AM
So what does the DataRater learn? It automatically identifies and down-weights data that aligns with human intuitions of low quality, such as incorrect text encodings, OCR errors, and irrelevant content.
November 6, 2025 at 11:29 AM
The result? The DataRater is highly effective at filtering data, leading to significant compute efficiency improvements. In our experiments, we observed up to a 46.6% net compute gain while often improving final model performance.
November 6, 2025 at 11:29 AM
We introduce the DataRater, a meta-learning method that learns to rate the value of each data point for training. Instead of manually specifying filtering rules, we train the DataRater to optimize for a simple goal: improving the training efficiency on a held-out dataset.
November 6, 2025 at 11:29 AM
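To make the meta-learning idea above concrete, here is a minimal JAX sketch of a meta-gradient through a single inner training step. It is a toy under stated assumptions, not the paper's implementation: the linear rater, the softmax weighting over the batch, the squared-error losses, the learning rates, and all names (`rater_scores`, `inner_step`, `outer_loss`, and so on) are hypothetical choices for illustration.

```python
import jax
import jax.numpy as jnp

def rater_scores(eta, x):
    # Hypothetical linear rater: one scalar score per training example.
    return x @ eta

def weighted_loss(theta, eta, x, y):
    # Assumption: a softmax over the batch turns scores into weights summing to 1.
    w = jax.nn.softmax(rater_scores(eta, x))
    per_example = (x @ theta - y) ** 2
    return jnp.sum(w * per_example)

def inner_step(theta, eta, x, y, lr=0.1):
    # One SGD step on the rater-weighted training loss (gradient w.r.t. theta).
    return theta - lr * jax.grad(weighted_loss)(theta, eta, x, y)

def outer_loss(eta, theta, train, heldout):
    # Held-out loss of the model *after* the weighted inner update.
    theta_new = inner_step(theta, eta, *train)
    xh, yh = heldout
    return jnp.mean((xh @ theta_new - yh) ** 2)

# Toy data: targets follow y = x @ [1, -1], except the last training label is corrupted.
x_train = jnp.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]])
y_train = jnp.array([1.0, -1.0, 0.0, 5.0])
heldout = (jnp.array([[1.0, 0.0], [0.0, 1.0]]), jnp.array([1.0, -1.0]))
train = (x_train, y_train)

theta0 = jnp.zeros(2)  # model parameters
eta0 = jnp.zeros(2)    # rater parameters

# Meta-gradient: differentiate the held-out loss w.r.t. the rater parameters,
# backpropagating through the inner update.
g = jax.grad(outer_loss)(eta0, theta0, train, heldout)

# One small meta-update of the rater.
eta1 = eta0 - 0.1 * g
loss_before = outer_loss(eta0, theta0, train, heldout)
loss_after = outer_loss(eta1, theta0, train, heldout)
```

In this sketch the rater is updated only through its effect on the model's next step, which is the sense in which it "learns to rate" data against a held-out objective. A realistic setup would unroll more than one inner step and use a much larger rater and model.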
Foundation models are trained on large datasets, but not all data is created equal. Dataset curation often relies on manual, coarse-grained filtering and hand-crafted rules. This is becoming a major challenge, especially with the rise of synthetic data.
November 6, 2025 at 11:29 AM
Tagging first author @jakeabeck.bsky.social who just joined bsky! Welcome 🎉
April 9, 2025 at 2:22 PM
📘 Journal: nowpublishers.com/article/Deta...
📝 ArXiv: arxiv.org/abs/2301.08028
🎙️ Podcast: www.talkrl.com/episodes/jac...
🎥 Talk: youtu.be/XUQ9jLOZqGc
[AUTOML23] A Tutorial on Meta-Reinforcement Learning
YouTube video by AutoMLConf
youtu.be
April 9, 2025 at 9:54 AM