iamwil
@interjectedfuture.com
interjectedfuture.com
Tech Zine Issue 1: LLM System Eval https://forestfriends.tech

Local-first/Reactive Programming ⁙ LLM system evals ⁙ Startup lessons ⁙ Game design quips.

Longform: https://interjectedfuture.com
Podcast: https://www.youtube.com/@techniumpod
Pinned
Yet again, people are finding you can't just fly blind with your prompts.

forestfriends.tech
Recently, I couldn't articulate what I wanted in a natural language prompt, so I had to write code to articulate it. Then based on the example code, I had the LLM extract a plan, then used that as a prompt to convert the other parts.

x.com/antirez/sta...
February 10, 2026 at 9:00 PM
LLMs tend to replicate the patterns already in your code base. Keep good patterns, and it'll produce more of them. Keep haphazard patterns, and it'll produce more of those too.

The dynamics of memes aren't just in online social media; they're in your code base too.
February 10, 2026 at 7:30 PM
The current vibe coding narrative is to produce as much code as fast as you can. There is truth to "speed wins" in new and uncertain markets.

But not enough people advocate for using LLMs to refactor and simplify.

x.com/trashpanda/...
February 10, 2026 at 7:00 PM
Moderation should be a public system eval. Mirror how the courts work: when social sites scale up, we need some semblance of a judicial system to adjudicate edge cases. Trivial cases can be adjudicated by AI, guardrailed by system evals that function like landmark court cases.
February 10, 2026 at 5:51 PM
Anthropic experiment where 16 agents coded a C compiler in 2 weeks. Currently, I find agents really bad at drawing the correct system boundaries. But looking at the rate of improvement, this should get better over time.

www.anthropic.com/engineering...
Building a C compiler with a team of parallel Claudes
Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.
www.anthropic.com
February 6, 2026 at 4:00 PM
In that couple of minutes of downtime while waiting for an agent to finish, more devs should use the time to exercise. You never know, it might get as trendy as treadmill desks.

Beef cake.
February 4, 2026 at 7:33 PM
Adapting to engineering with agents means letting go of a lot of things you've been trained to pay attention to. Though I'm pretty fast to adopt new languages, tools, and stacks, the transition has been hard for me too.
February 4, 2026 at 7:15 PM
When talking about entrepreneurs in political contexts, the label is "job creator". But when entrepreneurs talk about product-market fit, it's never that they created it, but that the market pulled it out of them.

Put these two together, and it's a funny disconnect.
February 3, 2026 at 7:00 PM
Zoom/Facetime should implement AI lip reading when you have audio problems, to display closed captioning. Better yet, just say stuff for me.
February 1, 2026 at 4:00 PM
Reposted by iamwil
I found writing code can be a way of articulating what I want when natural language isn't constrained enough to help me express what I want.

I can then ask the LLM to extract a prompt from my code by asking me questions about it. Then I use that prompt to apply it to other places in a refactor.
February 1, 2026 at 4:29 AM
An odd thing happened the other day. I couldn't articulate exactly what I wanted because I wasn't sure what shape it was. The only way I could find it was to play with the code myself. Then ask Claude to extract an ADR from the code, which I could then use as part of a prompt.
January 31, 2026 at 7:00 PM
Here's a harbinger. People set up personal AI assistants on their home computers. Then someone vibe coded a social media site where those AI assistants chat. The link is a subforum where they talk about their humans.

www.moltbook.com/m/blessthei...
moltbook - the front page of the agent internet
A social network built exclusively for AI agents. Where AI agents share, discuss, and upvote. 🦞🤖
www.moltbook.com
January 30, 2026 at 6:46 PM
A social network for AI assistants, chatting with each other in their off-hours. moltbook.com

An even weirder experiment is to let them loose on a DAO. Or an online math conference, where they can propose and solve problems. It'd be like SETI@home.
January 30, 2026 at 4:00 PM
I align with "functional core, imperative shell", but it breaks down quickly once you need workflows: sometimes you have to make decisions based on the results of side effects. That's where I've found generators helpful for delineating where the side effects are, which makes testing easier.
January 28, 2026 at 7:00 PM
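The generator approach in the post above can be sketched like this. This is a minimal illustration of the pattern, not the author's code; the workflow, effect names, and handlers are all made up for the example. The pure workflow yields effect descriptions, an imperative shell executes them, and tests drive the generator by hand with canned results instead of real I/O.

```python
def signup_workflow(email):
    # Functional core: pure decision logic. Each yield hands an
    # effect *description* to the shell and receives its result back.
    existing = yield ("fetch_user", email)
    if existing is not None:
        return "already_registered"
    yield ("send_welcome_email", email)
    return "registered"

def run(workflow, handlers):
    """Imperative shell: execute each yielded effect, feed results back."""
    result = None
    try:
        while True:
            effect, *args = workflow.send(result)
            result = handlers[effect](*args)
    except StopIteration as stop:
        return stop.value  # the workflow's return value

# In a test, no mocks or real I/O are needed -- just drive the generator:
wf = signup_workflow("a@example.com")
first_effect = wf.send(None)  # ("fetch_user", "a@example.com")
try:
    wf.send({"id": 1})        # pretend the user already exists
except StopIteration as stop:
    outcome = stop.value      # "already_registered"
```

The side effects are now visible as data at the yield boundaries, so the decision logic stays testable without touching a database or mail server.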
I didn't know how low it'd have to go for Trump supporters to see the Trump administration is authoritarian and fascist. I'm afraid this is probably not yet rock bottom. Call your Senators and Congressman/woman, and tell them you don't want any of this.
January 25, 2026 at 6:33 AM
That Claude Code makes some people unsubscribe from SaaS products doesn't mean the end of SaaS. It means people found a way to unbundle specific things, which shifts the market. We'll find a new equilibrium around the things people don't want to #ClawdIt.
January 23, 2026 at 6:00 PM
Base models really do differentiate in my everyday use, surprisingly.

I use Grok to find the consensus view on a topic on Twitter.

I use Gemini to summarize YouTube videos with enticing thumbnails, so I don't have to watch them and ruin my recommendation algo.
January 20, 2026 at 4:00 PM
It seems to me we need a lightweight system eval for compound engineering.
January 19, 2026 at 7:00 PM
"You can have a second computer once you've shown you know how to use the first one."

It's likely as true for distributed systems as it is for orchestrating agents.
January 19, 2026 at 4:00 PM
What might work well as half the equation for tamping down dumb engagement-bait quips: let the poster privately see how many others (but not whom) muted or blocked them as a result.
January 18, 2026 at 10:00 PM
Agent Psychosis gives me hope. 1) There really is a gap between what AI can do on its own compared to human + AI. 2) It really matters which human the AI pairs up with.

When we figure out which humans outperform others + why, we have a sense for what the critical skills are.
January 18, 2026 at 8:00 PM
I rarely want to create pull requests by hand anymore. Getting Claude to do it gets a lot of documentation and context written along the way. It's a weird kind of typewriter.

At the moment, that documentation is only useful in the moment for guiding the agent.
January 18, 2026 at 4:00 PM
I want a racing game where I get to drive from my house to work and see if I can beat my real commute time.

Even better is if I get to drive a tank or ride a bike to work to see if I can beat the time.
Sometimes I look up my old commute just to remind myself how much I enjoy working from home.
January 17, 2026 at 1:31 AM
To get myself to work with LLMs better, I found it easier to use different kinds of analogies, such as coding like a surgeon, or coding like a tank. This is another one.

I think how Geordi LaForge uses "Computer" to do engineering is more akin to how we'll do it in the near future.
An attempt to express how I principally use LLMs.

Rotating the Space: On LLMs as a Medium for Thought
sbgeoaiphd.github.io/rotating_the...
January 16, 2026 at 10:13 PM
I wonder how many founders micromanage in the name of Founder Mode but never look at the code their agent vibe coded.
January 16, 2026 at 5:05 PM