Ólafur Páll Geirsson
geirsson.com
@geirsson.com
Building Agents at Sourcegraph. Posts about coding, AI, and family (3 kids). @olafurpg elsewhere. Based in Oslo, Norway.

https://geirsson.com
Meanwhile we’ll get ads in ChatGPT.
May 31, 2025 at 9:15 PM
On a second iteration, it seems the web tool is what's causing trouble. Disabling the web tool lets Claude 4 reach the right syntactic solution, although not with the optimal token edits. It goes to show that you need to be careful about which tools you expose. Less is more.
May 29, 2025 at 8:02 PM
Sonnet 3.7 is the only model I've seen that delivers the perfect solution: it replaces the tokens for `.` and `apply` and nothing else. All other models I've tested use the worse tree replacement APIs.
May 29, 2025 at 7:51 PM
One last note: even with code that can be unit tested, I still think most of the tests that AI generates are crap. The AI-generated commit messages also miss the point. I'm seeing lots of PRs now where people add AI-generated tests that aren't testing anything meaningful.
May 27, 2025 at 7:53 AM
The Dwarkesh episode still gave a fresh perspective on how these models work, and I have probably underestimated how powerful they will become. If you're still judging AI capabilities by today's products and today's models then you are probably also underestimating how weird things are going to get.
May 27, 2025 at 7:51 AM
I am knee-deep in the AI hype, and I don't think software engineering will ever be the same again. I love working on ampcode.com, and I see daily anecdotes of how AI coding is turning software development upside down for our users.
Amp
Everything will change.
ampcode.com
May 27, 2025 at 7:51 AM
Even components that can be unit tested or e2e tested via behavioral assertions have lots of implicit constraints, with respect to latency or how features interact with each other in long-running user sessions, that are impractical to test in an automated fashion.
May 27, 2025 at 7:51 AM
The fallacy is thinking that all software engineers do is deliver code that can be tested in isolation, which AI is now very good at. The problem is that tests only cover maybe 0-50% of real-world constraints.
May 27, 2025 at 7:51 AM
I keep shaking my head when I hear AI folks claim that software engineering will be automated this year. After listening to this conversation, I at least better understand what they mean by this. These AI researchers are super smart, but they're also sort of clueless about what "software engineering" is.
May 27, 2025 at 7:51 AM
Memory reminds me of the Facebook feed circa 2016. It was clearly beneficial for the company and it sure boosted engagement, but I deleted my Facebook account and was better off for it.
May 25, 2025 at 3:15 PM
Contrary to popular belief, the models are surprisingly bad at writing CSS.
May 22, 2025 at 1:56 PM