Stephanie Chan
@scychan.bsky.social
Staff Research Scientist at Google DeepMind. Artificial and biological brains 🤖 🧠
Reposted by Stephanie Chan
In neuroscience, we often try to understand systems by analyzing their representations — using tools like regression or RSA. But are these analyses biased towards discovering a subset of what a system represents? If you're interested in this question, check out our new commentary! Thread:
August 5, 2025 at 2:36 PM
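(A minimal sketch of the kind of analysis the post refers to: representational similarity analysis, or RSA, compares two systems by correlating their pairwise dissimilarity structures. The response matrices below are random placeholders, not data from the paper.)

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

# Hypothetical responses of two systems (e.g., a brain region and a model
# layer) to the same 50 stimuli; rows are stimuli, columns are units.
rng = np.random.default_rng(0)
brain_responses = rng.normal(size=(50, 100))
model_responses = rng.normal(size=(50, 64))

# Representational dissimilarity matrix (RDM): pairwise distances between
# stimulus-evoked response patterns. pdist returns the condensed upper
# triangle, which is all RSA needs.
brain_rdm = pdist(brain_responses, metric="correlation")
model_rdm = pdist(model_responses, metric="correlation")

# RSA compares representational geometries by rank-correlating the RDMs.
rho, p = spearmanr(brain_rdm, model_rdm)
print(f"RSA similarity (Spearman rho): {rho:.3f}")
```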
Great new paper by @jessegeerts.bsky.social, looking at a certain type of generalization in transformers -- transitive inference -- and what training conditions induce it
🧠 How do transformers learn relational reasoning? We trained small transformers on transitive inference (if A>B and B>C, then A>C) and discovered striking differences between learning paradigms. Our latest work reveals when and why AI systems generalize beyond training data 🤖
June 6, 2025 at 3:50 PM
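(For context, a hedged sketch of what a transitive-inference task can look like: train only on adjacent pairs from an underlying order, then test on non-adjacent pairs that never co-occur in training. The symbolic format below is illustrative, not the paper's actual setup.)

```python
import random

# Assume a linear order over symbolic items: A > B > ... > G.
items = ["A", "B", "C", "D", "E", "F", "G"]

def rank(x):
    return items.index(x)  # lower index = "greater" item

def make_example(x, y):
    # Label which of the two items is greater under the hidden order.
    answer = x if rank(x) < rank(y) else y
    return f"{x} vs {y} -> {answer}"

# Training premises: only adjacent pairs, in both presentation orders.
adjacent = [(items[i], items[i + 1]) for i in range(len(items) - 1)]
train_set = [make_example(x, y) for x, y in adjacent]
train_set += [make_example(y, x) for x, y in adjacent]

# Evaluation probes transitive generalization: non-adjacent pairs such as
# (A, C), which require chaining A>B and B>C.
test_pairs = [(x, y) for i, x in enumerate(items) for y in items[i + 2:]]
test_set = [make_example(x, y) for x, y in test_pairs]

print(random.choice(train_set))  # e.g. "C vs B -> B"
print(random.choice(test_set))   # e.g. "B vs E -> B"
```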
New paper: Generalization from context often outperforms generalization from finetuning.
And you might get the best of both worlds by spending extra compute and training time to augment finetuning.
How do language models generalize from information they learn in-context vs. via finetuning? In arxiv.org/abs/2505.00661 we show that in-context learning can generalize more flexibly, illustrating key differences in the inductive biases of these modes of learning — and ways to improve finetuning. 1/
May 2, 2025 at 5:48 PM
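(A rough sketch of the augmentation idea the post alludes to: prompt the model in context to draw inferences from new facts, then add those inferences to the finetuning set. `StubModel` and its `generate` method are placeholders for any text-generation call, not the paper's API, and the fact is fictional.)

```python
class StubModel:
    """Placeholder for a real LM; returns a canned inference for the demo."""
    def generate(self, prompt: str) -> str:
        return "Freldonia's capital is Niremberg."

FACTS = [
    "Niremberg is the capital of Freldonia.",  # fictional example fact
]

def infer_in_context(model, fact):
    # In-context generalization: the model sees the fact in its prompt and
    # is asked to state what follows from it (e.g., the reversed relation).
    prompt = (
        f"Fact: {fact}\n"
        "List statements that follow from this fact, reworded:\n"
    )
    return model.generate(prompt)

def build_finetuning_set(model, facts):
    # Plain finetuning would train on `facts` alone; the augmented set also
    # includes the model's own in-context inferences, spending extra compute
    # at data-construction time to improve post-finetuning generalization.
    dataset = list(facts)
    for fact in facts:
        dataset.extend(infer_in_context(model, fact).splitlines())
    return dataset

print(build_finetuning_set(StubModel(), FACTS))
```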
New work led by @aaditya6284.bsky.social
"Strategy coopetition explains the emergence and transience of in-context learning in transformers."
We find some surprising things!! E.g. that circuits can simultaneously compete AND cooperate ("coopetition") 😯 🧵👇
March 11, 2025 at 6:18 PM
Sadly, we have lost a brilliant researcher and colleague, Felix Hill. Please see this note, where I have tried to compile some of his writings: docs.google.com/document/d/1...
For Felix
Devastatingly, we have lost a bright light in our field. Felix Hill was not only a deeply insightful thinker -- he was also a generous, thoughtful mentor to many researchers. He majorly changed my life...
January 4, 2025 at 12:28 PM
Reposted by Stephanie Chan
What counts as in-context learning (ICL)? Typically, you might think of it as learning a task from a few examples. However, we’ve just written a perspective (arxiv.org/abs/2412.03782) suggesting interpreting a much broader spectrum of behaviors as ICL! Quick summary thread: 1/7
The broader spectrum of in-context learning
The ability of language models to learn a task from a few examples in context has generated substantial interest. Here, we provide a perspective that situates this type of supervised few-shot learning...
December 10, 2024 at 6:17 PM
Reposted by Stephanie Chan
Introducing the :milkfoamo: emoji
December 9, 2024 at 8:12 PM
I won't be at NeurIPS this week. Let's grab coffee if you want to fomo-commiserate with me
December 9, 2024 at 2:16 AM
Hello hello. Testing testing 123
December 9, 2024 at 12:48 AM