Sjoerd van Steenkiste
@svansteenkiste.bsky.social
96 followers 52 following 24 posts
Researching AI models that can make sense of the world @GoogleAI. Gemini Thinking.
Pinned
Can language models perform implicit Bayesian inference over user preference states? Come find out at the “System-2 Reasoning at Scale” #NeurIPS2024 workshop, 11:30pm, West Ballroom B.
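For a concrete sense of the computation the post asks about, here is a toy sketch of an explicit Bayesian update over discrete user-preference states (the state names, numbers, and observation are invented for illustration; this is not code from the paper):

```python
import numpy as np

# Explicit Bayes rule over two hypothetical user-preference states.
# An LM performing this *implicitly* would shift its responses as if
# it maintained such a posterior, without ever computing it outright.
states = ["prefers_short_answers", "prefers_detailed_answers"]
prior = np.array([0.5, 0.5])

# Hypothetical likelihood P(observation | state) of one observed user
# action, e.g. the user asking for more detail after a short reply.
likelihood = np.array([0.2, 0.8])

posterior = prior * likelihood
posterior /= posterior.sum()          # normalize: P(state | observation)
print(dict(zip(states, posterior)))   # {'prefers_short_answers': 0.2, ...}
```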
Reposted by Sjoerd van Steenkiste
How do language models generalize from information they learn in-context vs. via finetuning? In arxiv.org/abs/2505.00661 we show that in-context learning can generalize more flexibly, illustrating key differences in the inductive biases of these modes of learning — and ways to improve finetuning. 1/
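To make the contrast concrete, here is a toy sketch of the two conditions (the fact, query, and generate stub are invented for illustration; this is not the paper's code): the same piece of information is either placed in the prompt or baked into the weights via finetuning, and the model is probed with a reversed form of the fact.

```python
def generate(prompt: str) -> str:
    """Stand-in for a real LM call (hypothetical stub)."""
    return "<model output>"

# Made-up training fact and a reversal-style test query.
fact = "Nyberg is the mayor of Virelle."
query = "The mayor of Virelle is named"

# Condition 1: in-context learning: the fact appears in the prompt.
icl_answer = generate(fact + "\n" + query)

# Condition 2: finetuning: the fact is in the training data instead,
# so the test prompt contains only the reversed query.
ft_answer = generate(query)  # run after finetuning on `fact`
```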
Reposted by Sjoerd van Steenkiste
🚨 Deadline Extension Alert for #VLMs4All Challenges! 🚨

We have extended the challenge submission deadline
🛠️ New challenge deadline: Apr 22

Show your stuff in the CulturalVQA and GlobalRG challenges!
👉 sites.google.com/view/vlms4al...

Spread the word and keep those submissions coming! 🌍✨
Excited to announce that we will be organizing a #CVPR2025 Workshop on Building Geo-Diverse and Culturally Aware VLMs. Aside from fantastic speakers and a short-paper track, the workshop includes two challenges, one of them based on our CulturalVQA benchmark. Links below!
📢Excited to announce our upcoming workshop - Vision Language Models For All: Building Geo-Diverse and Culturally Aware Vision-Language Models (VLMs-4-All) @CVPR 2025!
🌐 sites.google.com/view/vlms4all
Our team @GoogleAI is hiring an intern. We are interested in having LMs understand and respond to users better. Topics include: teaching LMs to build “mental models” of users; improving LMs’ reasoning capabilities over long contexts.

@GoogleAI internship deadline is Feb 28.
Reposted by Sjoerd van Steenkiste
🔥Excited to introduce RINS - a technique that boosts model performance by recursively applying early layers during inference, without increasing model size or training FLOPs! Not only does it significantly improve LMs, but also multimodal systems like SigLIP.
(1/N)
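A minimal sketch of the core idea as stated in the post (my own illustration with made-up sizes, not the released RINS code): the early block is re-applied with the same weights, so the parameter count is unchanged and only inference-time compute grows with the recursion depth.

```python
import torch
import torch.nn as nn

class RecursiveEarlyBlock(nn.Module):
    """Illustrative only: re-apply the early layers n_recur times."""
    def __init__(self, d_model=64, n_heads=4, n_early=3, n_late=3, n_recur=2):
        super().__init__()
        mk = lambda: nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.early = nn.ModuleList(mk() for _ in range(n_early))
        self.late = nn.ModuleList(mk() for _ in range(n_late))
        self.n_recur = n_recur

    def forward(self, x):
        # Same weights on every pass: no extra parameters, extra
        # inference compute only when n_recur > 1.
        for _ in range(self.n_recur):
            for blk in self.early:
                x = blk(x)
        for blk in self.late:
            x = blk(x)
        return x

x = torch.randn(1, 16, 64)             # (batch, tokens, d_model)
print(RecursiveEarlyBlock()(x).shape)  # torch.Size([1, 16, 64])
```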
Reposted by Sjoerd van Steenkiste
If you are interested in developing large-scale, multimodal datasets & benchmarks, and advancing AI through data-centric research, check out this great opportunity. Our team is hiring!
boards.greenhouse.io/deepmind/job...
Research Scientist, Zurich
Zurich, Switzerland
The ICLR 2025 decisions are out! It was an honor to serve as a Senior Area Chair for this year’s iteration and to be more involved in overseeing the review process.
ICLR 2025 decisions and meta-reviews are now available on OpenReview.

We reviewed 11,565 submissions, with an overall acceptance rate of 32.08% (roughly 3,710 accepted papers). Oral/poster decisions will be announced at a later date. The camera-ready deadline is March 1st.
Reposted by Sjoerd van Steenkiste
Financial Assistance applications are now open! If you face financial barriers to attending ICLR 2025, we encourage you to apply. The program offers prepay and reimbursement options. Applications are due March 2nd with decisions announced March 9th. iclr.cc/Conferences/...
ICLR 2025 Financial Assistance
Reposted by Sjoerd van Steenkiste
Check out @tkipf.bsky.social's post on MooG, the latest in our line of research on self-supervised neural scene representations learned from raw pixels:

SRT: srt-paper.github.io
OSRT: osrt-paper.github.io
RUST: rust-paper.github.io
DyST: dyst-paper.github.io
MooG: moog-paper.github.io
Reposted by Sjoerd van Steenkiste
TRecViT: A Recurrent Video Transformer
arxiv.org/abs/2412.14294

Causal, 3× fewer parameters, 12× less memory, 5× lower FLOPs than (non-causal) ViViT, matching / outperforming on Kinetics & SSv2 action recognition.

Code and checkpoints out soon.
Can language models perform implicit Bayesian inference over user preference states? Come find out at the “System-2 Reasoning at Scale” #NeurIPS2024 workshop, 11:30pm, West Ballroom B.
Neural Assets poster is happening now. Join us at East Exhibit Hall A-C #1507
I will be at the @GoogleAI booth until 2pm. Come say hello if you have questions about Google Research!
Excited to be at #NeurIPS2024. A few papers we are presenting this week:

MooG: arxiv.org/abs/2411.05927
Neural Assets: arxiv.org/abs/2406.09292
Probabilistic reasoning in LMs: openreview.net/forum?id=arYXg…

Let’s connect if any of these research topics interest you!
Interesting perspective on ICL and great suggestions for future research in this space!
What counts as in-context learning (ICL)? Typically, you might think of it as learning a task from a few examples. However, we’ve just written a perspective (arxiv.org/abs/2412.03782) suggesting interpreting a much broader spectrum of behaviors as ICL! Quick summary thread: 1/7
The broader spectrum of in-context learning
The ability of language models to learn a task from a few examples in context has generated substantial interest. Here, we provide a perspective that situates this type of supervised few-shot learning...
Reposted by Sjoerd van Steenkiste
🚀🚀PaliGemma 2 is our updated and improved PaliGemma release using the Gemma 2 models and providing new pre-trained checkpoints for the full cross product of {224px,448px,896px} resolutions and {3B,10B,28B} model sizes.

1/7
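The “full cross product” here means nine pretrained checkpoints. A quick sketch of the grid (the naming scheme below is my guess for illustration; check the release page for the exact identifiers):

```python
from itertools import product

resolutions = [224, 448, 896]  # px
sizes = ["3b", "10b", "28b"]

# Hypothetical names for the 3 x 3 = 9 pre-trained checkpoints.
for size, res in product(sizes, resolutions):
    print(f"paligemma2-{size}-pt-{res}")
```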
Reposted by Sjoerd van Steenkiste
Looking forward to seeing what can be built on top of such "particle" representations. While conceptually simple, they are one step closer to representing scenes (the underlying causal structure) rather than videos (a mixture of many factors together), and could be useful for robotics tasks.
Excited to announce MooG for learning video representations. MooG allows tokens to move “off-the-grid” enabling better representation of scene elements, even as they move across the image plane through time.

📜https://arxiv.org/abs/2411.05927
🌐https://moog-paper.github.io/
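A rough sketch of what “off-the-grid” tokens could look like, as I read the post (illustrative only, not the MooG architecture; modules and sizes are made up): a fixed set of latent tokens reads from each frame’s features via cross-attention and is carried recurrently through time, so a token can keep tracking a scene element as it moves across the image plane rather than being tied to a fixed patch location.

```python
import torch
import torch.nn as nn

class OffGridTokens(nn.Module):
    """Illustrative sketch only (not the MooG model)."""
    def __init__(self, n_tokens=32, dim=128, n_heads=4):
        super().__init__()
        self.tokens = nn.Parameter(torch.randn(n_tokens, dim))
        self.read = nn.MultiheadAttention(dim, n_heads, batch_first=True)

    def forward(self, frames):             # frames: (B, T, n_patches, dim)
        B, T, _, _ = frames.shape
        z = self.tokens.expand(B, -1, -1)  # token state, not tied to the grid
        per_frame = []
        for t in range(T):
            # Tokens query the current frame's patch features ("read"),
            # then the updated state is carried to the next frame.
            upd, _ = self.read(z, frames[:, t], frames[:, t])
            z = z + upd
            per_frame.append(z)
        return torch.stack(per_frame, dim=1)  # (B, T, n_tokens, dim)

feats = torch.randn(2, 5, 196, 128)  # e.g. 14x14 patch features per frame
print(OffGridTokens()(feats).shape)  # torch.Size([2, 5, 32, 128])
```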
If you are reviewing for ICLR, please engage with the author response!
✍️ Reminder to reviewers: Check author responses to your reviews, and ask follow-up questions if needed.

50% of papers have discussion - let’s bring this number up!