🔵 Early: a simplicity bias (the prior) favors the less complex strategy (G).
🔴 Late: reducing loss (the likelihood) favors the better-fitting but more complex strategy (M). (Toy sketch below.)
8/
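A minimal numeric sketch of this tradeoff (my assumptions, not the authors' code): if the strategy a model adopts behaves like Bayesian model selection, the log-posterior odds of M over G are a fixed complexity penalty (the prior) plus a data-fit advantage that accumulates with training (the likelihood), so the preferred strategy flips from G to M over training. Every constant below is illustrative.

import numpy as np

# Assumed decomposition: log-odds(M vs G) = log-prior odds + accumulated fit advantage.
log_prior_odds_M_vs_G = -50.0   # illustrative simplicity bias penalizing the more complex strategy M
fit_gain_per_step = 0.01        # illustrative per-step log-likelihood advantage of M (fits seen tasks better)

steps = np.arange(0, 20_000, 100)
log_posterior_odds = log_prior_odds_M_vs_G + fit_gain_per_step * steps

crossover = steps[np.argmax(log_posterior_odds > 0)]
print(f"G favored early; M favored after roughly step {crossover}")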
We now have a predictive model of task diversity effects and transience!
7/
🔴 Memorizing (M): discrete prior on seen tasks.
🔵 Generalizing (G): continuous prior matching the true task distribution.
These match known strategies from prior work! (Sketch of both predictors below.)
2/
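A minimal sketch of the two predictors (my assumptions, not the authors' code), using the linear-regression in-context-learning setup common in this literature: tasks are weight vectors w with y = w·x + Gaussian noise, and the two predictors differ only in their prior over tasks. Function names and constants are illustrative.

import numpy as np

def generalizing_predict(X, y, x_query, noise_var=0.25):
    # G: continuous Gaussian prior over tasks, w ~ N(0, I); the posterior mean is ridge regression.
    d = X.shape[1]
    w_post = np.linalg.solve(X.T @ X + noise_var * np.eye(d), X.T @ y)
    return x_query @ w_post

def memorizing_predict(X, y, x_query, seen_tasks, noise_var=0.25):
    # M: discrete uniform prior over the finite set of training tasks (rows of seen_tasks).
    log_liks = np.array([-np.sum((y - X @ w) ** 2) / (2 * noise_var) for w in seen_tasks])
    post = np.exp(log_liks - log_liks.max())
    post /= post.sum()
    return x_query @ (post @ np.asarray(seen_tasks))  # posterior-weighted average over seen tasks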
Our work explains this & *predicts Transformer behavior throughout training* without its weights! 🧵
1/
Once a cornerstone for studying human reasoning, the think-aloud method declined in popularity as manual coding limited its scale. We introduce a method to automate analysis of verbal reports and scale think-aloud studies. (1/8)🧵