nathanlabenz.bsky.social
@nathanlabenz.bsky.social
Host of the all-AI Cognitive Revolution podcast – cognitiverevolution.ai
Western AI execs often claim that "China will never slow their AI development down – and so of course we can't either!"

But is it true? Brian Tse of Concordia AI says China is more focused on practical applications & less AGI-pilled than the US

Full episode out tomorrow!
October 17, 2025 at 8:38 PM
@OpenAI Operator is good now!

Tell it that you don't want it to confirm every step. It will still confirm sometimes, and in some cases that's for the better.

It can do ~100-action sequences; it tries new strategies rather than getting stuck.

Here it uses @waymark – sadly no HTML5 canvas support! 😦
May 6, 2025 at 8:37 PM
"It's important to know that Elon's right.

OpenAI is attempting the second-biggest theft in human history.

The amicus brief is completely correct. The terms they've suggested are completely unacceptable."

@TheZvi on the @elonmusk vs @OpenAI lawsuit
April 28, 2025 at 7:17 PM
"o3 is misaligned. It lies to the user & defends it onto death. It's significantly worse than o1

and GPT-4.1 seems less aligned than GPT-4o

The more RL you apply, the more misaligned they get, & we don't seem to be trying that hard"

@TheZvi on why his p(doom) is up to 70%
April 25, 2025 at 5:18 PM
On the other hand, China "hold[s] our infrastructure at risk. One Taiwan invasion scenario, if things are not going as well as they hope ... just turn off all of our grids. Turn out the lights and let the chaos reign. Why wouldn't they do that?"

Doesn't sound good either!
April 24, 2025 at 4:55 PM
On the AI safety & control side: "The biological weapon analogy has more in common with AI than the nuclear analogy. There is a control problem. It also has the same potential for devastation – you could design a weapon that exterminates the entire population"

Scary stuff!
April 24, 2025 at 4:54 PM
"the AI safety camp says: It's a coordination problem, we can't solve alignment, so we need to slow down.

the NatSec folks say: I've seen how China operates. A deal is impossible. We have to win, we have to race"

@jeremiecharris & @harris_edouard on the USA's AI dilemma

🧵
April 24, 2025 at 4:54 PM
and finally, "the best way to do whistleblower protections is to pair it with rules around what information needs to be shared ... as opposed to this vague standard of like, 'If you're worried, call this hotline'"

Overall, an excellent conversation. Enjoy!
April 21, 2025 at 7:03 PM
On AI developers' responsibility for the impact of their work, Helen says it's OK to decouple technical & societal questions, BUT ...

"if technical progress outpaces society's ability to adapt, then the people doing technical work might end up having huge societal consequences"
April 21, 2025 at 7:03 PM
re: AI company CEOs' calls for an AI race with China, Helen agrees that "the rhetoric change has been striking", but chalks it up to "the path of least resistance" – with "so little agreement on what to do about [AI], the one thing people agree on is: We gotta beat China"
April 21, 2025 at 7:03 PM
Despite what you might've heard, Helen is not a "doomer" or even particularly hawkish on most AI safety issues.

Here, she argues that "Nonproliferation is the wrong approach to AI misuse" and instead promotes the concept of "adaptation buffers"

x.com/hlntnr/stat...
April 21, 2025 at 7:03 PM
And finally... where's the Gemini 2.5 System Card?

Jack says that "experimental" models, which are launched with lower rate limits & sometimes limited access, don't always get a full write-up, but reassures us that industry-leading safety testing has been done.
April 8, 2025 at 2:11 PM
Can interpretability keep up with model progress?

Maybe, but it probably depends on getting models to do more and more of the interpretability work for us.
April 8, 2025 at 2:11 PM
How did Google Deepmind decide to share the Gemini 2.5 chain of thought?
April 8, 2025 at 2:11 PM
We've recently seen deep integration of text & image – will we see the same for more exotic modalities?

Jack says that "anything you can train jointly will have deeper understanding"

But for each modality, it depends on "how much positive transfer there is to this new task"
April 8, 2025 at 2:11 PM
Just in case you've been away... Gemini 2.5 Pro is one of those rare models that feels different – primarily because it has such incredible command of long inputs – and invites you to re-imagine workflows to take full advantage of its capabilities.

x.com/gfodor/stat...
April 8, 2025 at 2:11 PM
Why have leading AI companies converged on reasoning models in recent months?

Was it simply that the same next steps were obvious to all, or are people swapping secrets at SF parties?

@jack_w_rae, who led "Thinking" for Gemini 2.5, shares his perspective

Reasoning Model🧵↓
April 8, 2025 at 2:11 PM
When is it worth training your own models?

Guy says:
- "bias towards simplicity – the simpler, the better"
- "the faster you run experiments, the more likely you'll find something good"
- "let the experiments tell you which way to go"

Tons of practical wisdom here. Enjoy!
April 4, 2025 at 5:38 PM
How does one build the "#1 open-source agent on the SWE-bench Verified leaderboard"?

For one thing, "turn your hyper-parameters up!"

x.com/augmentcode...
April 4, 2025 at 5:38 PM
Augment's Agent mode is now generally available.

Like running an AI lab in general, it's ... "pretty capital intensive"

x.com/augmentcode...
April 4, 2025 at 5:38 PM
The Revolution in software runs much deeper than vibe coding.

@augmentcode's @guygr & I discussed AI assistance for professional engineers & large-scale codebases.

Ever heard of "Reinforcement Learning from Developer Behaviors"?

Listen to "Code Context is King" now!

👂↓🧵
April 4, 2025 at 5:37 PM
Applications of AI to frontier biology endlessly fascinate me.

Here's @SiyuHe7 talking about Squidiff, a model that runs experiments in silico by predicting how single-cell transcriptomes will respond to perturbations

This can save researchers months!

x.com/SiyuHe7/sta...
April 3, 2025 at 3:16 PM