1. Memorization or
2. Priming or
3. Confirmation prompting
www.anthropic.com/research/ali...
en.wikipedia.org/wiki/Robotic...
And that’s also why it fails.
medium.com/@emilymenonb...
Please stop reminding me to “Take a moment to reflect in your journal.”
My memory’s great, thank you. And when I am eventually demented, reading journal entries ain’t going to help even if it’s still possible to read.
“this disconnect between eval performance and actual real-world performance,”
Next time someone goes - LLMs beat ‘So & So’ Olympiad - just quote Ilya.
Also, see en.wikipedia.org/wiki/Project...
Amateur move, guys.
www.dwarkesh.com/p/ilya-sutsk...
(screenshots not chronological)
www.dwarkesh.com/p/ilya-sutsk...
> This supports our assertion that the ceiling on LLM creativity (0.25) corresponds to the boundary between little-c and Pro-c human creative performance (Figure 6).
www.academia.edu/144621465/_T...
A PhD sitting down and just fabricating >50% of sources = career ending
arxiv.org/abs/2511.11597
Not the model, not the prompt - still the human.
The amount of shilling these guys do, no wonder they can’t get anything serious built.
cdn.openai.com/pdf/4a25f921...
Unfortunately, this is a societal failure. Tech didn’t invent loneliness, it offered a new way to cope with it - in an empathetic echo chamber.
We are failing the kids. Others too, but mostly it’s the kids that I worry about.
www.nytimes.com/2025/11/17/o...
en.wikipedia.org/wiki/Ramamur...
But next quarter you should be terrified.
storage.googleapis.com/deepmind-med...