Simon Willison
simonwillison.net
@simonwillison.net
Independent AI researcher, creator of datasette.io and llm.datasette.io, building open source tools for data journalism, writing about a lot of stuff at https://simonwillison.net/
I couldn't resist recreating this using the new GPT-5-Codex-Mini, which produced the worst ray-traced pelican on a bicycle yet! simonwillison.net/2025/Nov/9/p...
Pelican on a Bike—Raytracer Edition
beetle_b ran this prompt against a bunch of recent LLMs: Write a POV-Ray file that shows a pelican riding on a bicycle. This turns out to be a harder challenge …
simonwillison.net
November 9, 2025 at 5:23 PM
Doing this easily was very much not the point!
November 9, 2025 at 6:29 AM
For comparison, here are the pelicans riding bicycles drawn by GPT-5-Codex-Mini (the new model), GPT-5-Codex and full GPT-5 - all produced via the same hacked version of the Codex CLI tool
November 9, 2025 at 3:47 AM
I also recorded a 7 minute YouTube video showing how I got Codex to reverse-engineer and then extend itself in order to draw me that pelican www.youtube.com/watch?v=9o1_...
Reverse engineering Codex CLI to get GPT-5-Codex-Mini to draw me a pelican
YouTube video by Simon Willison
www.youtube.com
November 9, 2025 at 3:39 AM
Since there's no API access yet I got OpenAI's Codex coding agent to rewrite itself (in Rust) to add a new "codex prompt ..." command which I could use to run prompts against the private models that are only available within that tool - full details here simonwillison.net/2025/Nov/9/g...
Reverse engineering Codex CLI to get GPT-5-Codex-Mini to draw me a pelican
OpenAI partially released a new model yesterday called GPT-5-Codex-Mini, which they describe as "a more compact and cost-efficient version of GPT-5-Codex". It’s currently only available via their Code...
simonwillison.net
November 9, 2025 at 3:38 AM
Provided the publishing industry can resist the temptation to bankrupt the Internet Archive we might be OK
November 9, 2025 at 1:59 AM
I love this project infinitemac.org
Infinite Mac
A classic Mac loaded with everything you'd want.
infinitemac.org
November 8, 2025 at 10:11 PM
I think old language execution is more or less a solved problem now thanks to emulators - we can run code from 50 years ago on an emulator, or even an emulator inside an emulator

A lot of those emulators run in the browser now thanks to WebAssembly!
November 8, 2025 at 10:10 PM
Good error messages are so important these days! The past couple of Python versions included some big improvements to their error messages for which I'm very grateful
November 8, 2025 at 5:16 AM
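To make the error message point concrete, here is a minimal sketch that runs a snippet with a misspelled variable name and prints the final line of the resulting error. The helper name `render_error` and the sample snippet are made up for illustration. The exact wording of the message, including whether it suggests a similarly named variable, depends on which Python version runs it.

```python
import traceback


def render_error(source: str) -> str:
    """Run a snippet and return the error text Python reports for it.

    Hypothetical helper, written only for this illustration.
    """
    try:
        exec(compile(source, "<example>", "exec"), {})
    except Exception as exc:
        # Only the final "ErrorType: message" part, without the stack frames.
        return "".join(traceback.format_exception_only(type(exc), exc))
    return ""


# "totl" is a deliberate typo for "total".
snippet = "total = 1\nprint(totl)"
print(render_error(snippet))
```

Running the same script under an older and a newer interpreter is a quick way to compare how much detail each one gives.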
Yes I'd love to see an example of this too, the challenge is finding an interesting novel language that definitely hasn't made it into any LLM training data yet!
November 7, 2025 at 10:24 PM
Yes I'm here for good I think
November 7, 2025 at 4:30 PM
I have some undisclosed ones that I use - I'm hoping some day I'll catch a model drawing an amazing pelican on a bicycle and failing horribly at some other creature on some other mode of transport!
November 7, 2025 at 4:03 AM
Which means it's not OSI compliant, which is my bar for calling something "open source" over just "open weights"
November 7, 2025 at 4:02 AM
Reposted by Simon Willison
Then I wanted to make it easier to play with, so another hour with Claude Code and I had a plugin for @simonwillison.net's llm: github.com/btucker/llm-...

What's cool about this is you don't have to install anything other than some python packages & you have full access to a reasonably capable LLM.
GitHub - btucker/llm-apple: LLM plugin for local apple-foundation-models available on macOS 26
LLM plugin for local apple-foundation-models available on macOS 26 - btucker/llm-apple
github.com
November 6, 2025 at 9:57 PM
Ooh I haven't tried that yet, thanks for the tip!
November 6, 2025 at 8:18 PM
uv makes testing different projects against upgraded dependencies so much easier - no need to think about virtual environments, uv handles them almost invisibly

I wrote more about my uv testing tricks in this TIL til.simonwillison.net/python/uv-te...
Testing different Python versions with uv with-editable and uv-test
A quick uv recipe I figured out today, for running the tests for a project against multiple Python versions.
til.simonwillison.net
November 6, 2025 at 6:43 PM
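A rough sketch of what a uv recipe along these lines might look like, built as a command list in Python so it can be inspected without running anything. The linked TIL is truncated here, so this is not necessarily the recipe it describes. The function name `uv_test_command`, the `test` extra and the choice of pytest are assumptions for illustration. The `--python`, `--isolated` and `--with-editable` options belong to `uv run`.

```python
import shlex


def uv_test_command(python_version: str, extras: str = "test") -> list[str]:
    """Build a uv command that runs pytest against one Python version.

    Hypothetical helper: assumes the project defines a "test" extra
    and uses pytest as its test runner.
    """
    return [
        "uv", "run",
        "--python", python_version,          # interpreter version to test against
        "--isolated",                        # use a throwaway environment
        "--with-editable", f".[{extras}]",   # install the current project in editable mode
        "pytest",
    ]


# Print one command per Python version to try.
for version in ("3.12", "3.13"):
    print(shlex.join(uv_test_command(version)))
```

Each printed line can be pasted into a shell from the project root, or the list can be passed to `subprocess.run` directly.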
No, because it's not creating the data - it's writing code that then produces that data.

It might make mistakes in the code it writes but those are a lot easier to spot.
November 6, 2025 at 5:44 PM
I doubt it, personally. I spend a lot of time experimenting with local LLMs and I've not yet experienced any that I'd trust with a task remotely as complex as "search for PyPI markdown packages, download them and design and implement a comparative benchmark that tries them out"
November 6, 2025 at 5:43 PM
I find Claude Sonnet 4.5 fits my own personal coding style slightly better, but that might just be that I have more practice in prompting it - they're very close to each other!

I like Claude Code for web more than Codex Cloud because you can send Claude new instructions while it's still running
November 6, 2025 at 4:42 PM