Lightnews — Scholar-powered news

Ólafur Páll Geirsson

@geirsson.com

610 followers 720 following 280 posts

Building Agents at Sourcegraph. Posts about coding, AI, and family (3 kids). @olafurpg elsewhere. Based in Oslo, Norway.

https://geirsson.com

Ólafur Páll Geirsson

@geirsson.com

Meanwhile we’ll get ads in ChatGPT.

May 31, 2025 at 9:15 PM

Ólafur Páll Geirsson

@geirsson.com

On a second iteration, it seems like it's the web tool that's causing trouble. Disabling the web tool makes Claude 4 reach the right syntactic solution, although not with the optimal token edits. Goes to show that you need to be careful with what tools you're exposing. Less is more.

May 29, 2025 at 8:02 PM

Ólafur Páll Geirsson

@geirsson.com

Sonnet 3.7 is the only model I've seen that delivers the perfect solution: it replaces the tokens for `.` and `apply` and nothing else. All other models I've tested use the worse tree replacement APIs.

May 29, 2025 at 7:51 PM

Ólafur Páll Geirsson

@geirsson.com

Last note: even with code that can be unit tested, I still think most of the tests that AI generates are crap. And the AI-generated commit messages also miss the point. I'm seeing lots of PRs now where people add AI-generated tests that aren't even testing anything meaningful.

May 27, 2025 at 7:53 AM

Ólafur Páll Geirsson

@geirsson.com

The Dwarkesh episode still gave a fresh perspective on how these models work, and I have probably underestimated how powerful they will become. If you're still judging AI capabilities by today's products and today's models then you are probably also underestimating how weird things are going to get.

May 27, 2025 at 7:51 AM

Ólafur Páll Geirsson

@geirsson.com

I am knee-deep in the AI hype, and I don't think software engineering will ever be the same again. I love working on ampcode.com and I see daily anecdotes of how AI coding is turning software development upside-down for our users.

Amp

Everything will change.

ampcode.com

May 27, 2025 at 7:51 AM

Ólafur Páll Geirsson

@geirsson.com

Even components that can be unit tested or e2e tested via behavioral assertions have lots of implicit constraints wrt. latency or how features interact with each other in long-running user sessions that are impractical to test in an automated fashion.

May 27, 2025 at 7:51 AM

Ólafur Páll Geirsson

@geirsson.com

The fallacy is thinking that all software engineers do is deliver code that can be tested in isolation, and AI is very good at doing that now. The problem is that tests only cover maybe 0-50% of real-world constraints.

May 27, 2025 at 7:51 AM

Ólafur Páll Geirsson

@geirsson.com

I keep shaking my head hearing AI folks claim software engineering will be automated this year. After listening to this conversation, I at least better understand what they mean by this. These AI researchers are super smart, but they're also sort of clueless about what "software engineering" is.

May 27, 2025 at 7:51 AM

Ólafur Páll Geirsson

@geirsson.com

Memory reminds me of the Facebook feed circa 2016. It was clearly beneficial for the company, it sure boosted engagement, but I deleted my Facebook account and was better off for it.

May 25, 2025 at 3:15 PM

Ólafur Páll Geirsson

@geirsson.com

Contrary to popular belief, the models are surprisingly bad at writing CSS.

May 22, 2025 at 1:56 PM
