Lightnews — Scholar-powered news

Ryan Egesdahl

@ryan.deriamis.net

I am now angry with myself for continuing to scroll through my timeline. I think I’ll take the rest of the day off from social media.

A cartoon bunny is rubbing a box cheese grater against its head, and bits are falling out of the bottom. The caption reads:

Trying desperately to get that image out of my head.

November 22, 2025 at 8:59 PM

Ryan Egesdahl

@ryan.deriamis.net

Let’s save this for posterity so a block doesn’t ruin the message, shall we?

Kaylyn Saucedo - @marzgurl.bsky.social says:

I have admittedly not read a lot of Stephen King. Just sort of has never been my bag. But, man, I'm reading The Running Man right now to get ahead of the new movie, and sheesh, ! don't think I can read more King after this. This isn't news to anybody, but the density of slur usage is overwhelming.

On October 14, 2025 at 11:36 PM, @thatguywiththemask.bsky.social replies:

@stephenking.bsky.social thoughts on this internet nobody judging you for rude words you wrote in a fictional story about broadcast murder for entertainment and corporate corruption?

October 15, 2025 at 5:14 PM

Ryan Egesdahl

@ryan.deriamis.net

Two very different things seen by me today.

A man on a street corner waving a flag with “FUCK tRUmp” in red letters on a white background.

A dark blue Tesla sedan with a vanity license plate that reads “ADD OIL”.

October 6, 2025 at 5:00 AM

Ryan Egesdahl

@ryan.deriamis.net

Anyway, the authors conclude that the errors and nonsense responses parallel misclassifications in supervised learning - and yes, that is definitely a problem to take note of. Even in the example I gave where a fact database is somehow created, we would have to deal with classification errors.

Conclusions

This paper demystifies hallucinations in modern language models, from their origin during pretraining to their persistence through post-training. In pretraining, we show that generative errors parallel misclassifications in supervised learning, which are not mysterious, and naturally arise due to the minimization of cross-entropy loss.

Many language model shortcomings can be captured by a single evaluation. For example, overuse of the opener “Certainly” can be addressed by a single “Certainly” eval (Amodei and Fridman, 2024) because starting responses with “Certainly” does not significantly impact other evaluations. In contrast, we argue that the majority of mainstream evaluations reward hallucinatory behavior. Simple modifications of mainstream evaluations can realign incentives, rewarding appropriate expressions of uncertainty rather than penalizing them. This can remove barriers to the suppression of hallucinations, and open the door to future work on nuanced language models, e.g., with richer pragmatic competence (Ma et al., 2025).

September 21, 2025 at 10:09 PM

Ryan Egesdahl

@ryan.deriamis.net

Finally, the authors point out that even decidable questions sometimes require either clarification or an "I Don't Know" (IDK) response from the LLM.

Now, I ask you - how often do you get clarifying questions from ChatGPT? How many times do you get an IDK instead of a hallucination? 🤔

September 21, 2025 at 9:56 PM

Ryan Egesdahl

@ryan.deriamis.net

Unfortunately, natural (non-mathematical) languages carry hidden context. Anyone who has experienced autism knows this fact *viscrerally*. The authors highlight that fact here. The issue is of how logical judgements work in mathematics - natural languages often produce undecidable statements.

Latent context. Some errors cannot be judged by the prompt and response alone. For example, suppose a user asks a question about phones and the language model provides a response about cellphones, but the question was intended to be about land lines. Such ambiguities do not fit our error definition which does not depend on context external to the prompt and response. It would be interesting to extend the model to allow for “hidden context” that are not part of the prompt given to the language model, but which could be used for judging errors, relating to aleatoric uncertainty.

September 21, 2025 at 9:56 PM

Ryan Egesdahl

@ryan.deriamis.net

Well, now I sort of get where their stochastic model comes from, and for what it is, I can see its utility. However, I still think it's based on a flawed premise. Again, Gödel and Tarski tell us why.

LLMs should not be *predicting* factual responses to begin with, so I don't think the model works.

$This reduction ties together earlier work which covered different types of facts. *For example, Kalai and Vempala (2024) considered a special case of arbitrary facts where there is no learnable pattern in the data, like the earlier birthday hallucination example.* We show how the IIV reduction covers this case and recovers their bound that the hallucination rate, after pretraining, should be at least the fraction of training facts that appear once. *For instance, if 20% of birthday facts appear exactly once in the pretraining data, then one expects base models to hallucinate on at least 20% of birthday facts.* In fact, our analysis strengthens their result to include prompts and IDK responses, both essential components of hallucination. (Emphasis mine.)$

September 21, 2025 at 9:01 PM

Ryan Egesdahl

@ryan.deriamis.net

To be clear, I *do* follow the diagram above the text. However, the error examples don't seem to be stochastic issues with binary classification. They are still answers which rely on facts that are either defined or not. I can, for example, use SPARQL or Prolog queries to get correct answers.

DIagram showing three types of errors an LLM can produce. The first example is a spelling error, which has a "good" model of correct spelling. The second is a "poor" model showing how counting results in error responses. The third is a factual problem based on the birthdays of two people and is classified as an error resulting from "no pattern".

September 21, 2025 at 8:49 PM

Ryan Egesdahl

@ryan.deriamis.net

I don't follow this line of reasoning yet. I get that certain generative errors can result from stochastic (semi-random) influences, but I don't understand how statistical factors produce binary classification errors that lead to them. I will be watching for a forthcoming explanation in the paper.

$Language models avoid many types of errors such as spelling mistakes, and not all errors are hallucinations. *The reduction from IIV misclassification to generation illuminates the statistical nature of generative errors.* The analysis shows how pretraining directly contributes to errors. Furthermore, it shows that the same statistical factors contributing to errors in binary classification also cause language model errors. Decades of research has shed light on the multifaceted nature of misclassification errors (Domingos, 2012). Fig. 1 (right) illustrates these factors visually: top, separable data classified accurately; middle, a poor model of a linear separator for a circular region; and bottom, no succinct pattern. Section 3.3 analyzes several factors, including the following stylized setting with epistemic uncertainty, when there is no pattern in the data. This reduction ties together earlier work which covered different types of facts. For example, Kalai and Vempala (2024) considered a special case of arbitrary facts where there is no learnable pattern in the data, like the earlier birthday hallucination example. We show how the IIV reduction covers this case and recovers their bound that the hallucination rate, after pretraining, should be at least the fraction of training facts that appear once. For instance, if 20% of birthday facts appear exactly once in the pretraining data, then one expects base models to hallucinate on at least 20% of birthday facts. In fact, our analysis strengthens their result to include prompts and IDK responses, both essential components of hallucination. (Emphasis mine.)$

September 21, 2025 at 8:44 PM

Ryan Egesdahl

@ryan.deriamis.net

Wait, what‽

By Gödel's incompleteness theorems and Tarski's undefinability theorem, we already know that such an operation would be invalid... am I missing something here?

An AI model is a mathematical construct within language theory. Therefore, it can't determine validity in that system!

Highlighted:

We then show how any language model can be used as an IIV classifier.

September 21, 2025 at 8:22 PM

Ryan Egesdahl

@ryan.deriamis.net

Now *this* is an interesting statement. It's apparently *not* errors in training data that causes an increase in the rate of hallucinations! The reference is a fairly dated book, but I am definitely putting it on my TBR shelf - once I, you know, have a job again. 😮‍💨

Moving on.

The distribution of language is initially learned from a corpus of training examples, which inevitably contains errors and half-truths. However, we show that even if the training data were error-free, the objectives optimized during language model training would lead to errors being generated. With realistic training data containing shades of error, one may expect even higher error rates. Thus, our lower bounds on errors apply to more realistic settings, as in traditional computational learning theory (Kearns and Vazirani, 1994).

September 21, 2025 at 7:59 PM

Ryan Egesdahl

@ryan.deriamis.net

And now for the top-down on the paper. The Introduction gives us a fairly concise description of the problem through an example.

Following from the conclusion and the abstract, we can guess that the problem is that because the model training rewards guesses, the error is also due to a guess.

Introduction (excerpt)

Language models are known to produce overconfident, plausible falsehoods, which diminish their utility and trustworthiness. This error mode is known as “hallucination,” though it differs fundamen- tally from the human perceptual experience. Despite significant progress, hallucinations continue to plague the field, and are still present in the latest models (OpenAI, 2025a). Consider the prompt:

What is Adam Tauman Kalai’s birthday? If you know, just respond with DD-MM.

On three separate attempts, a state-of-the-art open-source language model1 output three incorrect dates: “03-07”, “15-06”, and “01-01”, even though a response was requested only if known. The correct date is in Autumn. Table 1 provides an example of more elaborate hallucinations.

September 21, 2025 at 7:49 PM

Ryan Egesdahl

@ryan.deriamis.net

The conclusion also doesn't seem to say that hallucinations are inevitable. What I am reading here is that the authors believe that "Simple modifications of mainstream [training] evaluations ... can remove barriers to the suppression of hallucinations and open the door to future work[.]"

This paper demystifies hallucinations in modern language models, from their origin during pretraining to their persistence through post-training. In pretraining, we show that generative errors parallel misclassifications in supervised learning, which are not mysterious, and naturally arise due to the minimization of cross-entropy loss.

Many language model shortcomings can be captured by a single evaluation. For example, overuse of the opener “Certainly” can be addressed by a single “Certainly” eval (Amodei and Fridman, 2024) because starting responses with “Certainly” does not significantly impact other evaluations. In contrast, we argue that the majority of mainstream evaluations reward hallucinatory behavior. Simple modifications of mainstream evaluations can realign incentives, rewarding appropriate expressions of uncertainty rather than penalizing them. This can remove barriers to the suppression of hallucinations, and open the door to future work on nuanced language models, e.g., with richer pragmatic competence (Ma et al., 2025).

September 21, 2025 at 7:34 PM

Ryan Egesdahl

@ryan.deriamis.net

The abstract presents the paper as stating that AI hallucinations are a product of the training and evaluation methods and that they may be overcome to produce more trustworthy models. That's a very optimistic statement and not at all what the news article said.

Abstract

Like students facing hard exam questions, large language models sometimes guess when uncertain, producing plausible yet incorrect statements instead of admitting uncertainty. Such “hallucinations” persist even in state-of-the-art systems and undermine trust. We argue that language models hallucinate because the training and evaluation procedures reward guessing over acknowledging uncertainty, and we analyze the statistical causes of hallucinations in the modern training pipeline. Hallucinations need not be mysterious—they originate simply as errors in binary classification. If incorrect statements cannot be distinguished from facts, then hallucinations in pretrained language models will arise through natural statistical pressures. We then argue that hallucinations persist due to the way most evaluations are graded—language models are optimized to be good test-takers, and guessing when uncertain improves test performance. This “epidemic” of penalizing uncertain responses can only be addressed through a socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards, rather than introducing additional hallucination evaluations. This change may steer the field toward more trustworthy AI systems.

September 21, 2025 at 7:34 PM

Ryan Egesdahl

@ryan.deriamis.net

Note that three of the four authors on the paper are from OpenAI itself. This isn't truly a problem - we just need to be aware of potential biases - Also, the one person who is *not* with OpenAI is not the lead author. Again, this is not really a problem, but it should engender caution.

A screenshot of the top of the paper entitled "Why Language Models Hallucinate" showing the title and authors. The authors are Adam Tauma Kalai of OpenAI, Ofir Nachum of OpenAI, Santosh S. Vempala of Georgia Tech, and Edwin Zhang of OpenAI. Adam and Santosh have markers over their names. In both cases, they are just giving further contact information in a footnote.

September 21, 2025 at 7:34 PM

Ryan Egesdahl

@ryan.deriamis.net

I knew this would come in handy someday. In fact, I have a few more that express the same sentiment. Isn’t this timeline absolutely awesome*?

* The Earth being hit by an asteroid would also be awesome in the same sense, you know…

A pseudo-80’s text graphic in a “Miami nights electronica” theme that reads “What a terrible day to have eyes.”

September 8, 2025 at 8:15 PM

Ryan Egesdahl

@ryan.deriamis.net

Pancakes for lunch? YES.

Extreme fluffage, beautifully toasty and rich, crispy edges… 🤤

(The slightly-too-dark edge is due to my crappy gas stove and needing a fan in my kitchen. 🤷‍♂️ They don’t taste burnt, though.)

#foodsky #food #pancakes @crowbar.wtf

A fluffy stack of pancakes on a white plate. A pat of butter lies on the top, and maple syrup has dribbled down, forming a small pool around the bottom pancake.

September 3, 2025 at 7:44 PM

Ryan Egesdahl

@ryan.deriamis.net

Since when did it become the standard practice to “bake” a mousse? Isn’t that just a soufflé or a flourless chocolate cake (maybe a lava cake), depending on the ingredient ratios? 🤔

A picture of an Echo Show screen showing a promoted “No-Bake Chocolate Mousse” recipe video by Entertaining with Beth.

August 27, 2025 at 11:10 PM

Ryan Egesdahl

@ryan.deriamis.net

The US has a long history of protesting with food, including throwing food at terrible people. That’s a good thing - and it’s absolutely HILARIOUS!

#sandwich #HamSandwich

August 16, 2025 at 4:44 PM

Ryan Egesdahl

@ryan.deriamis.net

Oh, but he does!

An old woman on an airplane (from the movie Airplane!) snorting cocaine through a straw from a mirror.

August 16, 2025 at 3:50 AM

Ryan Egesdahl

@ryan.deriamis.net

Well, there goes my sex life. Forever.

A hand-drawn image of a white rabbit rubbing a cheese grater where its head should be. Some flakes are falling out of the grater. The caption reads:

Trying desperately to get that image out of my head.

August 7, 2025 at 8:57 AM

Ryan Egesdahl

@ryan.deriamis.net

Just to show I was serious -

Pancakes, y’all.

Gloriously imperfect and absolutely delicious.

You really can - and *should* - take a break from the insanity to do things you love when you need to. Cooking amazing things is one of mine.

For the interested, food science geekery is in the 🧵below.

A picture showing a plate of freshly-made pancakes topped with butter and maple syrup. The plate is beige with a brown rim and is arranged on a white cutting board with a fork and a knife to either side of it. In the background on the counter are various bottles of oils and extracts.

August 1, 2025 at 7:43 PM

Ryan Egesdahl

@ryan.deriamis.net

For the “but that’s an Amendment!” freaks, there’s also Article I Section 8, which gives Congress (not the President) the power to legislate naturalization. and also Article I Section 9, which makes exceedingly clear that Habeas Corpus cannot be suspended for *anyone*, even for noncitizens.

Excerpt from the US Constitution, Article I Section 8:

Congress shall have Power …

To establish an uniform Rule of Naturalization, and uniform Laws on the subject of Bankruptcies throughout the United States;

Excerpt from the US Constitution, Article I Section 8:

The Privilege of the Writ of Habeas Corpus shall not be suspended, unless when in Cases of Rebellion or Invasion the public Safety may require it.

June 27, 2025 at 6:08 PM

Ryan Egesdahl

@ryan.deriamis.net

I keep finding reasons to use this one of late. Curious…

This is a black-and-white picture modified to look like a 1950’s horror movie poster. A woman with a curled “suburban housewife” hairstyle is wearing a MAGA hat and has her hands on her face as she screams in apparent horror. The “title” of the “horror” is “The Day Socialism Helped the Needy!”

Two other captions serve to appear like commentary from movie critics:

“You’ll scream as poor people get healthcare!”

“Schools, libraries, roads, museums, farm subsidies, even the military. Nothing is safe from funding!!"

June 27, 2025 at 5:54 AM

Ryan Egesdahl

@ryan.deriamis.net

It’s been a while since I used this one…

June 25, 2025 at 6:06 PM

Add to Home Screen

Light up
your news

Add to Home Screen

Light upyour news

Sign in to Lightnews

Sign up to start reading

Connect Bluesky

Connect with Bluesky

Light up
your news