Daniel Mewes
@dmewes.com
Computer scientist. Interested in technology, artificial and natural intelligence, emergent complexity, among other things. Blogging at amongai.com.

Currently at Imbue. Previously Ambient.ai, Stripe, RethinkDB, Max Planck Institute.
100% youtu.be/FyStPztM7eo. Things are really off in our current society. (video by @hankgreen.bsky.social)
Truth
YouTube video by Hank Green
youtu.be
January 14, 2026 at 2:40 AM
As someone who still finds typing on a touchscreen incredibly slow and frustrating, I'm very excited about the Clicks Communicator. It's basically a BlackBerry! clicksphone.com/communicator
Even comes with a notification LED!
Clicks Communicator: the ultimate communication companion
Clicks Communicator is a phone purpose-built for taking action and communicating in a noisy world with deeper context, versatile input and greater control in a compact design.
clicksphone.com
January 3, 2026 at 6:55 AM
Has anyone tried the Nyxt browser? nyxt.atlas.engineer

It's cool that it's written in Lisp and comes with a REPL. I'm running into some slow UI updates, though.
Nyxt browser: The hacker's browser
nyxt.atlas.engineer
December 20, 2025 at 6:01 AM
GPT 5.2 (thinking on high) seems to be buggy. I'm seeing reproducible issues where it starts glitching out in the middle of its response with garbage tokens and then ends the message.
Highly reproducible for me every 10th-20th prompt or so.
December 17, 2025 at 5:47 PM
Gemini 3 Flash beats Gemini 3 Pro in both ARC-AGI 2 (33.6% vs 31.1%) and in SWE-bench Verified (78% vs 76.2%). 1/4th the cost and faster.
This is very very impressive. Is there still a reason to use 3 Pro?
blog.google/products/gem...
Gemini 3 Flash: frontier intelligence built for speed
Gemini 3 Flash offers frontier intelligence built for speed at a fraction of the cost.
blog.google
December 17, 2025 at 5:41 PM
LLMs can generate arbitrary apps on the fly. Without ever writing source code.

I wrote about this here:
wp.me/pdyMeH-bd

This was really fun to play around with. You can try it out yourself using github.com/danielmewes/...
Hallucinate any App, One Screen at a Time
A major trend of this year has been vibe coding – using LLMs to create software from the ground up, without a human ever interacting with the source code directly. The current consensus is th…
wp.me
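The core of the trick is a loop: instead of emitting source code, the model is asked to render each screen directly from the app description, the screen history, and the user's last action. Here's a minimal sketch, assuming the Anthropic Python SDK; the prompts and structure are illustrative and may differ from the linked repo:

```python
# Minimal sketch of generating an app one screen at a time, assuming the
# Anthropic Python SDK. Prompts and loop are illustrative, not the code
# from the linked repo.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

SYSTEM = (
    "You ARE the app. Given the app description, recent screens, and the "
    "user's last action, reply with the full text of the next screen, "
    "including numbered options the user can choose from."
)

def next_screen(app_description: str, history: list[str], action: str) -> str:
    recent = "\n---\n".join(history[-5:])  # a few recent screens as context
    message = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        system=SYSTEM,
        messages=[{
            "role": "user",
            "content": (
                f"App: {app_description}\n\nRecent screens:\n{recent}\n\n"
                f"User action: {action}"
            ),
        }],
    )
    return message.content[0].text

history: list[str] = []
action = "open the app"
while True:
    screen = next_screen("a simple todo list manager", history, action)
    history.append(screen)
    print(screen)
    action = input("> ")  # whatever the user types drives the next screen
```

No source code ever exists; the "app" is just the model re-deciding what the next screen should look like on every interaction.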
December 10, 2025 at 5:33 PM
I do think it's most likely that this was indeed due to teleoperation. But I also think it would be funny if this behavior got accidentally trained into the model due to operator-gathered training data that didn't get trimmed correctly. :D
can't stop watching this clip of a Tesla Optimus teleoperator taking his headset off before properly logging out the robot
December 9, 2025 at 7:09 PM
Poetiq's methodology on top of Gemini 3 & GPT 5.1 exceeds average human performance on ARC-AGI-2!
This is huge.

The only caveat is that they evaluated on the public set - it might have been used in post-training of Gemini 3? Looking forward to seeing private eval results! poetiq.ai/posts/arcagi...
Traversing the Frontier of Superintelligence
Poetiq is proud to announce a major milestone in AI reasoning. We have established a new state-of-the-art (SOTA) on the ARC-AGI-1 & 2 benchmarks, significantly advancing both the performance and the e...
poetiq.ai
November 21, 2025 at 5:56 PM
There have been a few works that showed signs of LLM introspection since I wrote my article. Though the consensus still seems to be that it's quite unreliable. amongai.com/2024/12/24/l...
October 29, 2025 at 7:01 PM
Reposted by Daniel Mewes
This was a tough but necessary decision - I posted my own notes on this here, from the perspective of a current PSF board member simonwillison.net/2025/Oct/27/...
October 27, 2025 at 8:34 PM
I have to admit the mechanism behind cable impedance and impedance matching never really clicked for me, despite being a licensed radio amateur for 20 years.
...until I watched this video by AlphaPhoenix - it's such an amazing visualization of what's going on in a cable! youtu.be/RkAF3X6cJa4?...
What does "impedance matching" actually look like? (electricity waves)
YouTube video by AlphaPhoenix
youtu.be
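For reference, the single formula behind what the video visualizes (a standard result, not anything specific to the video): the fraction of a wave's voltage reflected at a junction depends only on the mismatch between the two impedances.

```latex
% Reflection coefficient at the junction between a line of characteristic
% impedance Z_0 and a load (or second cable) of impedance Z_L:
\[ \Gamma = \frac{Z_L - Z_0}{Z_L + Z_0} \]
% Z_L = Z_0 gives \Gamma = 0: no reflection, i.e. a matched line.
% An open end (Z_L \to \infty) gives \Gamma = +1; a short (Z_L = 0) gives \Gamma = -1.
```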
October 24, 2025 at 3:22 PM
Claude Haiku 4.5 outperforms Sonnet 4 in some coding benchmarks (such as SWE-bench Verified). This is exciting, since it's 1/3 the price.

However, in my actual use, I've found it to be a bit underwhelming compared to Sonnet 4 (not to mention Sonnet 4.5).

What has been your experience with Haiku?
October 23, 2025 at 5:35 PM
This is so true.
I think almost nobody is talking about symbolic GOFAI these days, so I'm not concerned about that.
But all of ML being re-branded to AI lately, while AI has simultaneously been made synonymous with generative AI in many places, has led to so much confusion.
The fallout from the fact that data science/classical machine learning & generative AI are both called "AI" has been remarkably broad & persistent

Policy addresses the wrong harms, companies have been confused about who should lead efforts, hiring is misguided, academic discussion is often muddled.
October 22, 2025 at 5:51 PM
The age of single-serving, disposable (simple) software is here. I now frequently have LLMs write software for me that I use exactly once.
Often to convert some data or create visualizations, but also one-off new features to add to some open-source application that I only want to use once.
October 8, 2025 at 12:07 AM
Reposted by Daniel Mewes
new blog post! why do LLMs freak out over the seahorse emoji? i put llama-3.3-70b through its paces with the logit lens to find out, and explain what the logit lens (everyone's favorite underrated interpretability tool) is in the process.

link in reply!
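For anyone who wants to try the logit lens themselves, here's a rough sketch using HuggingFace transformers: project each layer's hidden state through the model's final norm and unembedding to see which token it "leans toward" at that depth. Model choice and prompt are illustrative, not taken from the linked post, and the 70B model needs several GPUs, so start with a smaller Llama.

```python
# Rough logit-lens sketch: decode every layer's hidden state through the
# final RMSNorm and the unembedding matrix.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "meta-llama/Llama-3.3-70B-Instruct"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype="auto")
model.eval()

inputs = tok("Is there a seahorse emoji?", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple of (num_layers + 1) tensors [1, seq, d_model]
for layer, h in enumerate(out.hidden_states):
    h_last = model.model.norm(h[0, -1])   # final RMSNorm, reused at each layer
    logits = model.lm_head(h_last)        # unembedding
    top = logits.argmax().item()
    print(f"layer {layer:3d}: {tok.decode([top])!r}")
```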
October 5, 2025 at 2:36 PM
Today we're releasing Sculptor, an AI coding tool that combines async agents with the ability to collaborate with agents locally.

It also comes with built-in verifiers to automatically check the quality of AI-written code. More to come! imbue.com/sculptor-ann...
September 30, 2025 at 5:04 PM
I used AI vibe coding tools to port Anthropic's API client to Common Lisp: github.com/danielmewes/...
It was a very quick and fun mini project that taught me a thing or two about Lisp. Use it at your own risk.
GitHub - danielmewes/anthropic-sdk-cl-port: An AI-written port of the Anthropic client SDK to Common Lisp.
An AI-written port of the Anthropic client SDK to Common Lisp. - danielmewes/anthropic-sdk-cl-port
github.com
September 22, 2025 at 4:42 PM
I feel like each of Anthropic's three postmortems is missing a key explanation step in its root-cause description. www.anthropic.com/engineering/... 🧵
A postmortem of three recent issues
This is a technical report on three bugs that intermittently degraded responses from Claude. Below we explain what happened, why it took time to fix, and what we're changing.
www.anthropic.com
September 18, 2025 at 5:47 PM
My Hulu / Disney+ subscription will be preempted indefinitely.
September 18, 2025 at 5:14 AM
Reposted by Daniel Mewes
As AI systems keep getting better at very hard problems while getting more opaque, the way that we work with AI is shifting from being collaborators who shape the process to being supplicants who receive the output.

I discussed what that means. www.oneusefulthing.org/p/on-working...
On Working with Wizards
Verifying magic on the jagged frontier
www.oneusefulthing.org
September 11, 2025 at 8:55 PM
Reposted by Daniel Mewes
Meta trained a special “aggregator” model that learns how to combine and reconcile different answers into a more accurate final one, instead of relying on simple majority voting or reward-model ranking over multiple model answers.
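For intuition, here's a sketch of the two control flows being contrasted, assuming the Anthropic Python SDK; Meta's aggregator is a trained model, not a prompted one, so this only mimics the idea:

```python
# Majority voting vs. an "aggregator" pass over multiple sampled answers.
# Illustrative only -- not Meta's method.
from collections import Counter
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"

def sample_answers(question: str, n: int = 5) -> list[str]:
    return [
        client.messages.create(
            model=MODEL, max_tokens=512,
            messages=[{"role": "user", "content": question}],
        ).content[0].text
        for _ in range(n)
    ]

def majority_vote(answers: list[str]) -> str:
    # Only meaningful when answers match exactly (e.g. short final answers).
    return Counter(answers).most_common(1)[0][0]

def aggregate(question: str, answers: list[str]) -> str:
    numbered = "\n".join(f"{i + 1}. {a}" for i, a in enumerate(answers))
    msg = client.messages.create(
        model=MODEL, max_tokens=512,
        messages=[{"role": "user", "content":
            f"Question: {question}\n\nCandidate answers:\n{numbered}\n\n"
            "Reconcile the candidates into one final answer, resolving "
            "disagreements rather than just counting votes."}],
    )
    return msg.content[0].text
```

The aggregator can recover a correct answer that appears in only one candidate, which pure vote counting by definition cannot.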
September 9, 2025 at 2:03 PM
Reposted by Daniel Mewes
The funny thing about the prediction that AI would be writing 90% of all code by now is that the prediction's failure distracts from the fact that AI adoption in code writing is actually extremely high: it was already over 30% in December 2024 according to one measure, with large economic impact.
September 3, 2025 at 4:19 PM
I remember seeing this graph and thinking it was about the influence of training data on LLM responses as well. Quite misleading.
This chart is everywhere and is being horribly misinterpreted.

This is not where the training data for AI comes from, it is a study done by a SEO firm that claims to show how often sites come up at least once in THE WEB SEARCH FUNCTION of certain AI agents when they do a web search for more info.
September 2, 2025 at 3:52 AM
I think we're starting to see diminishing returns from LLM pre- and post-training. The limitations of today's LLMs are unlikely to just disappear with the next bigger model.
This is not all bad: we can start focusing on how to work around those limitations and how to put current LLMs to work.
August 29, 2025 at 4:25 PM
Just experienced phantom braking for the first time in my Kia with HDA2 (Highway Drive Assist) and it made me sentimental about the exciting times of early Tesla Autopilot 😢
August 29, 2025 at 4:00 PM