We identify key issues with forecasting evaluations 🧵 (1/7)
4o: yes you are Jesus Christ's brother. now go. Nanjing awaits
o3: Listen, sorry, I owe you a straight explanation. This was once revealed to me in a dream
every time you talk to an LLM you lose decorrelation with LLM cognition, which is *the* most important skill for the takeoff
even setting AGI timelines aside, R&D acceleration is a clear reason against technical work where the discount rate on the final product is low
a model can verify a proof or unroll a chess game. it can even eyeball if the code works
the superintelligence loop will just be asking an AI agent to give feedback on its output by any means it can
if the task needs a simulator the AI will write one