“this disconnect between eval performance and actual real-world performance,”
Next time someone goes - LLMs beat ‘So & So’ Olympiad - just quote Ilya.
www.dwarkesh.com/p/ilya-sutsk...
(screenshots not chronological)
www.dwarkesh.com/p/ilya-sutsk...
> This supports our assertion that the ceiling on LLM creativity (0.25) corresponds to the boundary between little-c and Pro-c human creative performance (Figure 6).
www.academia.edu/144621465/_T...
Not the model, not the prompt - still the human.
The amount of shilling these guys do, no wonder they can’t get anything serious built.
cdn.openai.com/pdf/4a25f921...
storage.googleapis.com/deepmind-med...
Actually their realization dawned a few weeks back, but these things take a little while to surface externally.
Image of tweet from bird site because I won’t link to it.
Summary: Conjectural with nice diagrams but no quantitative measures and ignores prior literature.
arxiv.org/pdf/2510.26745
- Kolb: Experiential learning theory
- Feynman: Explain in your own words
- Dweck: Growth mindset scale
The medium is the message.
arxiv.org/pdf/2506.10772
Folks still refusing to acknowledge the obvious are invested in it, directly or indirectly.
For the rest of us it’s just another tool.
- Verifier assisted code snippet optimization
- The "before" solutions are sub-optimal, and so are the "after" solutions
- Some of the paper’s commentary is contradictory
- If the framework included a chaos monkey to simulate the real world, would these hold?
arxiv.org/abs/2510.061...
Buddy - the rest of us knew.
(Not going to link, but it’s on the bird site)
bsky.app/profile/tech...
Not a great idea to listen to ‘"center-left, corporate and GOP donor-funded nonprofit", which advocates for neoliberal policies and is staunchly opposed to Medicare for All.’
en.m.wikipedia.org/wiki/Third_W...