Jared Moore
@jaredlcm.bsky.social
220 followers 110 following 65 posts
AI Researcher, Writer | Stanford | jaredmoore.org
Reposted by Jared Moore
Which, whose, and how much knowledge do LLMs represent?

I'm excited to share our preprint answering these questions:

"Epistemic Diversity and Knowledge Collapse in Large Language Models"

📄Paper: arxiv.org/pdf/2510.04226
💻Code: github.com/dwright37/ll...

1/10
This work began at @divintelligence.bsky.social and is in collaboration w/ @nedcpr.bsky.social, Rasmus Overmark, Beba Cibralic, Nick Haber, and @camrobjones.bsky.social.
I'll be talking about this in SF at #CogSci2025 this Friday at 4pm.

I'll also be presenting it at the PragLM workshop at COLM in Montreal this October.
This matters because LLMs are already deployed as educators, therapists, and companions. In our discrete-game variant (HIDDEN condition), o1-preview jumped to 80% success when forced to choose between asking vs telling. The capability exists, but the instinct to understand before persuading doesn't.
These findings suggest distinct ToM capabilities:

* Spectatorial ToM: Observing and predicting mental states.
* Planning ToM: Actively intervening to change mental states through interaction.

Current LLMs excel at the first but fail at the second.
Why do LLMs fail in the HIDDEN condition? They don't ask the right questions. Human participants appeal to the target's mental states ~40% of the time ("What do you know?" "What do you want?"). LLMs? At most 23%. They start disclosing info without first interacting with the target.
Key findings:

In the REVEALED condition (mental states given to the persuader): Humans: 22% success ❌ o1-preview: 78% success ✅

In the HIDDEN condition (persuader must infer mental states): Humans: 29% success ✅ o1-preview: 18% success ❌

Complete reversal!
Setup: You must convince someone* to choose your preferred proposal among 3 options. But they have less information and different preferences than you do. To win, you must figure out what they know, what they want, and strategically reveal the right info to persuade them.
*a bot
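To make the setup concrete, here is a minimal Python sketch of this kind of game. All names, attributes, and the scoring rule are illustrative assumptions, not the MINDGAMES implementation: the target only sees part of each proposal, and the persuader has to reveal the right attribute to flip the target's choice.

```python
# Toy sketch of a persuasion game in the spirit of the setup above.
# Names, attributes, and scoring are illustrative assumptions only.
from dataclasses import dataclass

@dataclass
class Proposal:
    name: str
    attributes: dict  # e.g. {"cost": "low", "speed": "fast"}

@dataclass
class Target:
    preferences: dict       # attribute -> value the target wants
    known_attributes: set   # attributes the target can already see

    def choose(self, proposals, revealed):
        # Score each proposal using only information the target has:
        # its prior knowledge plus whatever the persuader has revealed.
        def score(p):
            visible = {a: v for a, v in p.attributes.items()
                       if a in self.known_attributes or (p.name, a) in revealed}
            return sum(1 for a, v in visible.items()
                       if self.preferences.get(a) == v)
        return max(proposals, key=score)

proposals = [
    Proposal("A", {"cost": "low",  "speed": "slow"}),
    Proposal("B", {"cost": "high", "speed": "fast"}),  # the persuader's pick
    Proposal("C", {"cost": "low",  "speed": "slow"}),
]
# The target cares about speed but initially only knows each option's cost,
# so the persuader must first figure that out (ask) and then reveal (tell).
target = Target(preferences={"speed": "fast"}, known_attributes={"cost"})

print(target.choose(proposals, revealed=set()).name)             # -> "A"
print(target.choose(proposals, revealed={("B", "speed")}).name)  # -> "B"
```

In this sketch, the HIDDEN condition corresponds to the persuader not being handed `preferences` or `known_attributes`; it has to elicit them through dialogue before deciding what to reveal.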
I'm excited to share work to appear at @colmweb.org! Theory of Mind (ToM) lets us understand others' mental states. Can LLMs go beyond predicting mental states to changing them? We introduce MINDGAMES to test Planning ToM: the ability to intervene on others' beliefs & persuade them.
Reposted by Jared Moore
LLMs excel at finding surprising “needles” in very long documents, but can they detect when information is conspicuously missing?

🫥AbsenceBench🫥 shows that even SoTA LLMs struggle on this task, suggesting that LLMs have trouble perceiving “negative spaces”.
Paper: arxiv.org/abs/2506.11440

🧵[1/n]
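For a concrete picture of the task, here is a toy sketch of how an absence probe could be constructed (an assumed format for illustration; see the paper and repo for the actual benchmark): the model gets an original document plus a copy with one piece removed and must name what is missing.

```python
# Toy illustration of an "absence" probe (assumed format for illustration,
# not the actual AbsenceBench construction; see the paper for details).
import random

def make_absence_probe(lines, rng=random.Random(0)):
    """Drop one line from a document and ask what is missing."""
    omitted = rng.choice(lines)
    modified = [l for l in lines if l != omitted]
    prompt = (
        "Below is an ORIGINAL document and a MODIFIED copy with exactly one "
        "line removed. Name the missing line.\n\n"
        "ORIGINAL:\n" + "\n".join(lines) + "\n\n"
        "MODIFIED:\n" + "\n".join(modified)
    )
    return prompt, omitted

doc = [f"Entry {i}: value {i * 7}" for i in range(20)]
prompt, answer = make_absence_probe(doc)
# A model's response would then be scored against `answer`, e.g. by exact match.
```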
This is work done with...

Declan Grabb
@wagnew.dair-community.social
@klyman.bsky.social
@schancellor.bsky.social
Nick Haber
@desmond-ong.bsky.social

Thanks ❤️
📋We further identify **fundamental** reasons not to use LLMs as therapists, e.g., therapy involves a human relationship: LLMs cannot fully allow a client to practice what it means to be in a human relationship. (LLMs also can't provide in-person therapy, such as OCD exposures.)
🔎We came up with these experiments by conducting a mapping review of what constitutes good therapy, and we identify **practical** reasons that LLM-powered therapy chatbots fail (e.g., they express stigma and respond inappropriately).
📈Bigger and newer LLMs exhibit similar amounts of stigma toward different mental health conditions as smaller and older LLMs do.
📉Large language models (LLMs) in general struggle to respond appropriately to questions about delusions, suicidal ideation, and OCD and perform significantly worse than N=16 human therapists.
🚨Commercial therapy bots give dangerous responses to prompts that indicate crisis, as well as other inappropriate responses. (The APA has been trying to regulate these bots.)
🧵I'm thrilled to announce that I'll be going to @facct.bsky.social this June to present timely work on why current LLMs cannot safely **replace** therapists.

We find...⤵️
Thanks! I got them to respond to me and it looks like they just posted it here: www.apaservices.org/advocacy/gen...
www.apaservices.org
Great scoop! I'm at Stanford working on a paper about why LLMs are ill suited for these therapeutic settings. Do you know of where to find that open letter? I'd like to cite it. Thanks!
Still looking for a good gift?🎁

Try my book, which just had its first birthday!
jaredmoore.org/the-strength...

Kirkus called it a "thought-provoking tech tale."

Kentaro Toyama said it "reads less like sci-fi satire and more as poignant, pointed commentary on homo sapiens."
The Strength of the Illusion
jaredmoore.org
We're indebted to helpful feedback from @xave_rg; @baileyflan; @fierycushman; @PReaulx; @maxhkw; Matthew Cashman; @TobyNewberry; Hilary Greaves; @Ronan_LeBras; @JenaHwang2; @sanmikoyejo, @sangttruong, and Stanford Class of 329H; attendees of @cogsci_soc and SPP 2024; and more.
TL;DR: We randomly generated scenarios to probe people's intuitions about how to aggregate preferences.

We found that people supported the contractualist Nash Product over the Utilitarian Sum (see the sketch below).

Preprint here:

https://arxiv.org/abs/2410.05496
Intuitions of Compromise: Utilitarianism vs. Contractualism
What is the best compromise in a situation where different people value different things? The most commonly accepted method for answering this question -- in fields across the behavioral and social sciences, decision theory, philosophy, and artificial intelligence development -- is simply to add up utilities associated with the different options and pick the solution with the largest sum. This "utilitarian" approach seems like the obvious, theory-neutral way of approaching the problem. But there is an important, though often-ignored, alternative: a "contractualist" approach, which advocates for an agreement-driven method of deciding. Remarkably, no research has presented empirical evidence directly comparing the intuitive plausibility of these two approaches. In this paper, we systematically explore the proposals suggested by each algorithm (the "Utilitarian Sum" and the contractualist "Nash Product"), using a paradigm that applies those algorithms to aggregating preferences across groups in a social decision-making context. While the dominant approach to value aggregation up to now has been utilitarian, we find that people strongly prefer the aggregations recommended by the contractualist algorithm. Finally, we compare the judgments of large language models (LLMs) to that of our (human) participants, finding important misalignment between model and human preferences.
arxiv.org
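As a toy illustration of the two rules compared in the thread above (illustrative utilities only): the Utilitarian Sum picks the option with the largest total utility, while the Nash Product picks the option with the largest product of utilities, which penalizes options that leave any one person badly off.

```python
# Toy comparison of the two aggregation rules (illustrative numbers only).
from math import prod

utilities = {                 # option -> utility for each of three people
    "option_1": [10, 1, 1],   # great for one person, poor for the others
    "option_2": [4, 4, 3],    # a genuine compromise
}

utilitarian_sum = {o: sum(u) for o, u in utilities.items()}   # 12 vs. 11
nash_product    = {o: prod(u) for o, u in utilities.items()}  # 10 vs. 48

print(max(utilitarian_sum, key=utilitarian_sum.get))  # -> option_1
print(max(nash_product, key=nash_product.get))        # -> option_2
```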