Lightnews — Scholar-powered news

Simon Willison

@simonwillison.net

38K followers 1.4K following 3.4K posts

Independent AI researcher, creator of datasette.io and llm.datasette.io, building open source tools for data journalism, writing about a lot of stuff at https://simonwillison.net/

Simon Willison

@simonwillison.net

I couldn't resist recreating this using the new GPT-5 Codex Mini, which produced the worst ray-traced pelican on a bicycle yet! simonwillison.net/2025/Nov/9/p...

Pelican on a Bike—Raytracer Edition

beetle_b ran this prompt against a bunch of recent LLMs: Write a POV-Ray file that shows a pelican riding on a bicycle. This turns out to be a harder challenge …

simonwillison.net

November 9, 2025 at 5:23 PM

Simon Willison

@simonwillison.net

Doing this easily was very much not the point!

November 9, 2025 at 6:29 AM

Simon Willison

@simonwillison.net

For comparison, here are the pelicans riding bicycles drawn by GPT-5-Codex-Mini (the new model), GPT-5-Codex and full GPT-5 - all produced via the same hacked version of the Codex CLI tool

GPT-5-Codex-Mini. This is terrible. The pelican is an abstract collection of shapes, the bicycle is likewise very messed up

GPT-5 Codex. It's a dumpy little pelican with a weird face, not particularly great but better than Mini.

GPT-5: Much better bicycle, pelican is a bit line-drawing-ish but does have the necessary parts in the right places

November 9, 2025 at 3:47 AM

Simon Willison

@simonwillison.net

I also recorded a 7 minute YouTube video showing how I got Codex to reverse-engineer and then extend itself in order to draw me that pelican www.youtube.com/watch?v=9o1_...

Reverse engineering Codex CLI to get GPT-5-Codex-Mini to draw me a pelican

YouTube video by Simon Willison

www.youtube.com

November 9, 2025 at 3:39 AM

Simon Willison

@simonwillison.net

Since there's no API access yet I got OpenAI's Codex coding agent to rewrite itself (in Rust) to add a new "codex prompt ..." command which I could use to run prompts against the private models that are only available within that tool - full details here simonwillison.net/2025/Nov/9/g...

Reverse engineering Codex CLI to get GPT-5-Codex-Mini to draw me a pelican

OpenAI partially released a new model yesterday called GPT-5-Codex-Mini, which they describe as "a more compact and cost-efficient version of GPT-5-Codex". It’s currently only available via their Code...

simonwillison.net

November 9, 2025 at 3:38 AM

Simon Willison

@simonwillison.net

Provided the publishing industry can resist the temptation to bankrupt the Internet Archive we might be OK

November 9, 2025 at 1:59 AM

Simon Willison

@simonwillison.net

I love this project infinitemac.org

Infinite Mac

A classic Mac loaded with everything you'd want.

infinitemac.org

November 8, 2025 at 10:11 PM

Simon Willison

@simonwillison.net

I think old language execution is more or less a solved problem now thanks to emulators - we can run code from 50 years ago on an emulator, or even an emulator inside an emulator

A lot of those emulators run in the browser now thanks to WebAssembly!

November 8, 2025 at 10:10 PM

Simon Willison

@simonwillison.net

Good error messages are so important these days! The past couple of Python versions included some big improvements to their error messages for which I'm very grateful

November 8, 2025 at 5:16 AM

Simon Willison

@simonwillison.net

Yes I'd love to see an example of this too, the challenge is finding an interesting novel language that definitely hasn't made it into any LLM training data yet!

November 7, 2025 at 10:24 PM

Simon Willison

@simonwillison.net

Yes I'm here for good I think

November 7, 2025 at 4:30 PM

Simon Willison

@simonwillison.net

I have some undisclosed ones that I use - I'm hoping some day I'll catch a model drawing an amazing pelican on a bicycle and failing horribly at some other creature on some other mode of transport!

November 7, 2025 at 4:03 AM

Simon Willison

@simonwillison.net

Which means it's not OSI compliant, which is my bar for calling something "open source" over just "open weights"

November 7, 2025 at 4:02 AM

Reposted by Simon Willison

Ben Tucker

@btucker.net

Then I wanted to make it easier to play with, so another hour with Claude Code and I had a plugin for @simonwillison.net's llm: github.com/btucker/llm-...

What's cool about this is you don't have to install anything other than some python packages & you have full access to a reasonably capable LLM.

GitHub - btucker/llm-apple: LLM plugin for local apple-foundation-models available on macOS 26

LLM plugin for local apple-foundation-models available on macOS 26 - btucker/llm-apple

github.com

November 6, 2025 at 9:57 PM

Simon Willison

@simonwillison.net

Ooh I haven't tried that yet, thanks for the tip!

November 6, 2025 at 8:18 PM

Simon Willison

@simonwillison.net

uv makes testing different projects against upgraded dependencies so much easier - no need to think about virtual environments, uv handles them almost invisibly

I wrote more about my uv testing tricks in this TIL til.simonwillison.net/python/uv-te...

Testing different Python versions with uv with-editable and uv-test

A quick uv recipe I figured out today, for running the tests for a project against multiple Python versions.

til.simonwillison.net

November 6, 2025 at 6:43 PM
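
As a rough illustration of the kind of uv workflow described in the post above, here is a minimal sketch that builds a `uv run` command for several Python versions. It is not necessarily the recipe from the linked TIL (the link is truncated here). It assumes the project declares a `test` extra and uses pytest. The helper name `uv_test_command` and the version list are hypothetical. The flags `--python`, `--isolated` and `--with-editable` are uv options as I understand them, so check `uv run --help` before relying on them.

```python
import shlex


def uv_test_command(python_version, extras="test", runner="pytest"):
    """Build a uv command that runs the test runner on one Python version.

    Assumption: the project in the current directory defines an optional
    dependency group named by `extras` (hypothetical default: "test").
    """
    return [
        "uv", "run",
        "--python", python_version,          # interpreter version for uv to use
        "--isolated",                        # fresh environment for this run
        "--with-editable", f".[{extras}]",   # install the project in editable mode
        runner,
    ]


if __name__ == "__main__":
    # Print one shell-ready command per Python version (illustrative versions).
    for version in ("3.12", "3.13", "3.14"):
        print(shlex.join(uv_test_command(version)))
```

Running the script only prints the commands, so it works without uv installed. Paste a printed line into a shell inside the project directory to run the tests.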

Simon Willison

@simonwillison.net

No, because it's not creating the data - it's writing code that then produces that data.

It might make mistakes in the code it writes but those are a lot easier to spot.

November 6, 2025 at 5:44 PM

Simon Willison

@simonwillison.net

I doubt it, personally. I spend a lot of time experimenting with local LLMs and I've not yet experienced any that I'd trust with a task remotely as complex as "search for PyPI markdown packages, download them and design and implement a comparative benchmark that tries them out"

November 6, 2025 at 5:43 PM

Simon Willison

@simonwillison.net

I find Claude Sonnet 4.5 fits my own personal coding style slightly better, but that might just be that I have more practice in prompting it - they're very close to each other!

I like Claude Code for web more than Codex Cloud because you can send Claude new instructions while it's still running

November 6, 2025 at 4:42 PM
