Nicolay Gerold
nicolaygerold.com
@nicolaygerold.com
Daytime: building AI systems & platforms @ Aisbach
Nighttime: hacking on generative AI

Host of How AI Is Built
OpenCode is really stepping it up. I love the sidebar stuff. Really plays well with nvim.
May 16, 2025 at 8:16 AM
Thanks buddy
April 25, 2025 at 5:46 AM
EU mobilizes 200 billion Euros for AI.

Unless we have a massive political change, that money will just go to waste.

For innovation to happen, we don't need money first; we need deregulation.

We cannot work on breakthrough technologies when we are in constant fear of being sued.
February 13, 2025 at 9:15 AM
RAG is dead. Long live RAG.

LLMs suck at long context.

This paper shows what I have seen in most deployments.

With longer contexts, performance degrades.
February 13, 2025 at 6:11 AM
Dropping some new episodes on @howaiisbuilt.fm. Links below.
January 31, 2025 at 12:44 PM
New podcast with Alex Garcia on search in SQLite
January 25, 2025 at 5:29 AM
That's surprisingly on brand.
January 18, 2025 at 7:33 AM
Developers treat search as a black box.

Throw everything in a vector database and hope something good comes out.

Throw all ranking signals into one big ML model and hope it makes something good out of it.

You don’t want to create this witch’s cauldron.

New episode on @howaiisbuilt.fm
January 9, 2025 at 1:58 PM
The biggest lie in RAG is that semantic search is simple.

The reality is that it's easy to build, it's easy to get up and running, but it's really hard to get right.

And if you don't have a good setup, it's near impossible to debug.

One of the reasons it's really hard is chunking.
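
To make it concrete, here is a minimal sketch of the naive fixed-size chunking most people start with (the chunk size and overlap are arbitrary picks, not recommendations):

```python
# Naive fixed-size chunking with overlap: the usual first attempt.
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows."""
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks

doc = "Semantic search retrieves by meaning, not keywords. " * 20
for chunk in chunk_text(doc)[:3]:
    print(repr(chunk[:60]))  # note: windows cut mid-sentence, losing context
```

The windows cut straight through sentences and ideas, and that lost context is exactly what makes retrieval so hard to debug later.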
January 3, 2025 at 11:28 AM
@merve.bsky.social

Can you show the Amazon people how to use a VLM to do handwriting recognition?

That's atrocious.
December 23, 2024 at 6:59 PM
Getting ads for the reMarkable Pro while watching a review of the new Kindle Scribe. Someone got their targeting down.
December 23, 2024 at 6:57 PM
You usually have supervisors and workers, and it is super easy to spin them up. The workers run in "lightweight processes": when they crash, they can be spun up again super fast, and since they are isolated, they don't bring down the entire system.
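
For illustration only, a toy version of the restart-on-crash idea in Python (the real thing here is presumably Erlang/Elixir's BEAM with OTP supervisors; the names and timings below are made up):

```python
# Toy supervisor: restarts isolated worker processes when they crash.
import multiprocessing as mp
import time

def worker(name: str) -> None:
    time.sleep(1)                           # pretend to do some work...
    raise RuntimeError(f"{name} crashed")   # ...then die

def supervise(names: list[str], seconds: float = 5.0) -> None:
    procs = {n: mp.Process(target=worker, args=(n,)) for n in names}
    for p in procs.values():
        p.start()
    deadline = time.time() + seconds
    while time.time() < deadline:
        for name, p in list(procs.items()):
            if not p.is_alive():            # isolated crash: only this worker died
                print(f"supervisor: restarting {name}")
                procs[name] = mp.Process(target=worker, args=(name,))
                procs[name].start()
        time.sleep(0.1)
    for p in procs.values():
        p.terminate()

if __name__ == "__main__":
    supervise(["w1", "w2"])
```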
December 21, 2024 at 11:40 AM
They use a large model (e.g. gpt-4o) to generate training data for a smaller one (gpt-4o-mini).

This lets you build fast, cheap models that do one thing well, or models that pack (nearly) identical capabilities into a smaller number of parameters.
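
As a rough sketch of that loop (the prompts and file name are placeholders; the call is the standard OpenAI chat completions API):

```python
# Distillation data generation: the teacher labels prompts, the pairs
# become fine-tuning data for the student model.
import json
from openai import OpenAI

client = OpenAI()
prompts = ["Summarize: ...", "Classify sentiment: ..."]  # your task inputs

with open("distillation_data.jsonl", "w") as f:
    for prompt in prompts:
        resp = client.chat.completions.create(
            model="gpt-4o",  # teacher
            messages=[{"role": "user", "content": prompt}],
        )
        answer = resp.choices[0].message.content
        # JSONL in the chat format used for fine-tuning a smaller model
        f.write(json.dumps({"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": answer},
        ]}) + "\n")
```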
December 19, 2024 at 12:43 PM
Didn't find the on-demand pricing for it. Probably around 500/h if it becomes available :D
December 17, 2024 at 8:38 AM
@chris.blue

Getting reminded to buy myself a Christmas gift.
December 17, 2024 at 5:33 AM
If you are lost in all the fuss around the Byte Latent Transformer by Meta, read on.

Meta has created BLT, a new AI model that works with raw bytes instead of tokens.

Current AI models split text into tokens (chunks of characters from a fixed vocabulary) before processing it.

>>
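
A quick way to see the difference (the tiktoken encoding is just one tokenizer; the byte view is what a byte-level model like BLT consumes):

```python
# Tokens are ids from a fixed vocabulary; bytes are the raw UTF-8 stream.
import tiktoken

text = "Byte Latent Transformer"
enc = tiktoken.get_encoding("cl100k_base")
print(enc.encode(text))             # token ids: a handful of subword chunks
print(list(text.encode("utf-8")))   # raw bytes: one integer per byte
```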
December 16, 2024 at 3:18 PM
Claude is surprisingly good at workout programming.
December 16, 2024 at 5:57 AM
"Instead of being a one-way pipeline, agentic RAG allows you to check, 'Am I actually answering the user's question?'"

Different questions need different approaches.

➡️ 𝗤𝘂𝗲𝗿𝘆-𝗕𝗮𝘀𝗲𝗱 𝗙𝗹𝗲𝘅𝗶𝗯𝗶𝗹𝗶𝘁𝘆:
- Structured data? Use SQL
- Context-rich query? Use vector search
- Date-specific? Apply filters first
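
A bare-bones sketch of that routing; in practice the classification step is an LLM call and the branches are real retrievers, so every function body here is a hypothetical stand-in:

```python
# Query routing: pick the retrieval strategy per query type.
def run_sql(query: str) -> str:
    return f"SQL result for: {query}"          # stand-in for a SQL engine

def vector_search(query: str, filters: dict | None = None) -> str:
    return f"vector hits for: {query} (filters={filters})"

def route(query: str, query_type: str) -> str:
    if query_type == "structured":       # structured data -> SQL
        return run_sql(query)
    if query_type == "date_specific":    # date-specific -> filter first
        return vector_search(query, filters={"after": "2024-01-01"})
    return vector_search(query)          # context-rich -> vector search

print(route("revenue by region last quarter", "structured"))
```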
December 14, 2024 at 2:07 PM
Damn....
December 14, 2024 at 12:16 PM
Inequality joins in Polars are massive.
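
A minimal sketch, assuming a recent Polars version with join_where (the column names are made up):

```python
# Inequality join: match rows on range predicates, no equality key needed.
import polars as pl

events = pl.DataFrame({"ts": [1, 5, 9]})
windows = pl.DataFrame({"start": [0, 4], "end": [3, 10]})

# Join every event to the window(s) it falls into.
joined = events.join_where(
    windows,
    pl.col("ts") >= pl.col("start"),
    pl.col("ts") <= pl.col("end"),
)
print(joined)
```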
December 14, 2024 at 6:56 AM
Has been a while since I last did gradient accumulation.

I always tend to forget the last step (the check for hitting the end of the dataset) on the first implementation.
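
For reference, the pattern with that final flush included (a minimal PyTorch sketch with a made-up model and data):

```python
# Gradient accumulation: step every accum_steps batches, AND on the
# final batch so a partial accumulation window is not silently dropped.
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = torch.nn.MSELoss()
data = [(torch.randn(4, 10), torch.randn(4, 1)) for _ in range(10)]
accum_steps = 4

optimizer.zero_grad()
for i, (x, y) in enumerate(data):
    loss = loss_fn(model(x), y) / accum_steps  # scale so grads average out
    loss.backward()
    # The easy-to-forget part: also step when the dataset ends mid-window.
    if (i + 1) % accum_steps == 0 or (i + 1) == len(data):
        optimizer.step()
        optimizer.zero_grad()
```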
December 11, 2024 at 5:26 AM
Coding a project > reading articles.

Coding a project forces you to apply concepts directly. It’s a richer learning experience than just reading technical articles.

You discover gaps, solve real problems, and solidify your understanding.
December 10, 2024 at 7:24 PM
15 million AI builders on Hugging Face. We are still early.
December 10, 2024 at 2:24 PM
Data Drift for Dummies.

Data drift happens when the real world changes but your model doesn't.

- Input drift: The data coming in changes (like cameras getting better resolution)
- Label drift: What you're predicting changes (like what counts as "spam" evolving)

>>
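
A minimal sketch of catching input drift: compare a live feature's distribution against the training distribution with a two-sample KS test (the threshold is an arbitrary choice):

```python
# Input drift check: has the production feature distribution shifted
# away from what the model was trained on?
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)
train_feature = rng.normal(0.0, 1.0, 5_000)   # what the model was fit on
live_feature = rng.normal(0.5, 1.0, 5_000)    # what production now sends

stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:
    print(f"input drift detected (KS={stat:.3f}, p={p_value:.4f})")
```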
December 10, 2024 at 2:13 PM