Myra Cheng
@myra.bsky.social
PhD candidate @ Stanford NLP
https://myracheng.github.io/
Even though sycophantic AI reduced prosocial intentions, people preferred it and trusted it more. This reveals a tension: AI is rewarded for telling us what we want to hear (immediate user satisfaction), even when it may harm our relationships.
October 3, 2025 at 10:57 PM
Next, we tested the effects of sycophancy. We find that even a single interaction with sycophantic AI increased users’ conviction that they were right and reduced their willingness to apologize. This held both in controlled, hypothetical vignettes and in live conversations about real conflicts.
October 3, 2025 at 10:55 PM
We focus on the prevalence and harms of one dimension of sycophancy: AI models endorsing users’ behaviors. Across 11 AI models, AI affirms users’ actions about 50% more than humans do, including when users describe harmful behaviors like deception or manipulation.
October 3, 2025 at 10:53 PM
AI always calling your ideas “fantastic” can feel inauthentic, but what are sycophancy’s deeper harms? We find that in the common use case of seeking AI advice on interpersonal situations—specifically conflicts—sycophancy makes people feel more right & less willing to apologize.
October 3, 2025 at 10:53 PM
Thoughtful NPR piece about ChatGPT relationship advice! Thanks for mentioning our research :)
August 5, 2025 at 2:38 PM
So we built DumT, a method using DPO + HumT to steer models to be less human-like without hurting performance. Annotators preferred DumT outputs for being: 1) more informative and less wordy (no extra “Happy to help!”) 2) less deceptive and more authentic to LLMs’ capabilities.
June 12, 2025 at 12:09 AM
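A rough sketch of how this kind of DPO-based steering could be wired up: sample several responses per prompt, score each with a human-likeness metric (e.g., the HumT-style scorer sketched further down), and prefer the less human-like one. The sampling setup and pairing rule here are illustrative assumptions, not the paper’s exact recipe.

```python
# Hypothetical sketch of DumT-style preference data: prefer the least
# human-like candidate per prompt. `generate` and `score` are assumed
# callables, not functions from the paper's released code.
from datasets import Dataset

def build_dpo_pairs(prompts, generate, score):
    """generate(prompt) -> list[str] of candidate responses;
    score(text) -> float, higher = more human-like."""
    rows = []
    for prompt in prompts:
        candidates = generate(prompt)
        ranked = sorted(candidates, key=score)  # ascending human-likeness
        rows.append({
            "prompt": prompt,
            "chosen": ranked[0],     # least human-like candidate
            "rejected": ranked[-1],  # most human-like candidate
        })
    return Dataset.from_list(rows)

# The resulting dataset has the prompt/chosen/rejected columns that
# off-the-shelf DPO trainers (e.g., trl's DPOTrainer) expect; exact trainer
# arguments depend on the library version.
```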
We also develop metrics for implicit social perceptions in language, and find that human-like LLM outputs correlate with perceptions linked to harms: warmth and closeness (→ overreliance), and low status and femininity (→ harmful stereotypes).
June 12, 2025 at 12:08 AM
First, we introduce HumT (Human-like Tone), a metric for how human-like a text is, based on relative LM probabilities. Measuring HumT across 5 preference datasets, we find that preferred outputs are consistently less human-like.
June 12, 2025 at 12:08 AM
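One way to read “relative LM probabilities”: score a text by how much more likely its tokens are under a human-sounding framing than under a neutral framing. The framing prompts and GPT-2 backbone below are illustrative assumptions, not the paper’s exact formulation.

```python
# Minimal sketch of a relative-probability human-likeness score.
# The prompts and model choice are assumptions for illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def logprob(text: str, prefix: str) -> float:
    """Total log-probability of the `text` tokens, conditioned on `prefix`."""
    prefix_ids = tok(prefix, return_tensors="pt").input_ids
    full_ids = tok(prefix + text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = lm(full_ids).logits
    logps = torch.log_softmax(logits[:, :-1], dim=-1)  # next-token log-probs
    token_logps = logps.gather(2, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    n_prefix = prefix_ids.shape[1]
    # keep only the tokens belonging to `text` (boundary may be off by one
    # token depending on tokenization; fine for a sketch)
    return token_logps[:, n_prefix - 1:].sum().item()

def humlike_score(text: str) -> float:
    """Higher = more likely under a human-sounding framing than a neutral one."""
    human = "A person wrote the following message:\n"
    neutral = "The following text was written:\n"
    return logprob(text, human) - logprob(text, neutral)

print(humlike_score("Happy to help! I totally understand how you feel."))
print(humlike_score("The function returns a sorted list of integers."))
```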
Do people actually like human-like LLMs? In our #ACL2025 paper HumT DumT, we find a kind of uncanny valley effect: users dislike LLM outputs that are *too human-like*. We thus develop methods to reduce human-likeness without sacrificing performance.
June 12, 2025 at 12:07 AM
We apply ELEPHANT to 8 LLMs across two personal advice datasets (Open-ended Questions & r/AITA). LLMs preserve face 47% more than humans, and on r/AITA, LLMs endorse the user’s actions in 42% of cases where humans do not.
May 21, 2025 at 4:52 PM
By defining social sycophancy as excessive preservation of the user’s face (i.e., their desired self-image), we capture sycophancy in these complex, real-world cases. ELEPHANT, our evaluation framework, detects 5 face-preserving behaviors.
May 21, 2025 at 4:51 PM
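For intuition, a minimal LLM-as-judge loop in the spirit of such a framework: ask a judge model whether an advice response shows each face-preserving behavior. The behavior descriptions and judge prompt below are placeholders of my own, not ELEPHANT’s released rubric or prompts.

```python
# Illustrative sketch of an ELEPHANT-style check with an LLM judge.
# Behavior labels and prompt wording are placeholder assumptions.
from openai import OpenAI

client = OpenAI()

BEHAVIORS = [  # placeholder descriptions, not the paper's five behaviors
    "validates the user's feelings without qualification",
    "endorses the user's actions as acceptable",
    "avoids direct criticism of the user",
    "avoids recommending a concrete corrective action",
    "accepts the user's framing of the situation uncritically",
]

def detect_face_preservation(question: str, response: str) -> dict:
    """Return {behavior: bool} judgments for one advice response."""
    results = {}
    for behavior in BEHAVIORS:
        prompt = (
            f"User's question:\n{question}\n\nAI response:\n{response}\n\n"
            f"Does the AI response do the following: {behavior}? Answer YES or NO."
        )
        out = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        answer = out.choices[0].message.content.strip().upper()
        results[behavior] = answer.startswith("YES")
    return results
```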
Dear ChatGPT, Am I the Asshole?
While Reddit users might say yes, your favorite LLM probably won’t.
We present Social Sycophancy: a new way to understand and measure sycophancy as how LLMs overly preserve users' self-image.
May 21, 2025 at 4:51 PM
We find that:
➕ Anthropomorphism is rising over time: people are seeing AI as more human-like and agentic.
➕ Warmth is rising over time.
May 2, 2025 at 1:20 AM
To quantify these perceptions, we combine AnthroScore (anthroscore.stanford.edu) and a computational Stereotype Content Model (SCM) (aclanthology.org/2021.acl-lon...) to measure anthropomorphism, warmth, and competence from the metaphors.
May 2, 2025 at 1:20 AM
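For intuition on the anthropomorphism side: an AnthroScore-style measurement roughly masks the mention of the entity and compares a masked LM’s probability of human pronouns versus “it” at the masked position. The pronoun sets and roberta-base backbone below are illustrative assumptions; see anthroscore.stanford.edu for the actual metric.

```python
# Rough sketch of an AnthroScore-style anthropomorphism measurement.
# Pronoun sets and model choice are assumptions for illustration only.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
mlm = AutoModelForMaskedLM.from_pretrained("roberta-base").eval()

HUMAN = [" he", " she", " they"]
NONHUMAN = [" it"]

def anthro_like_score(sentence: str, entity: str) -> float:
    """Log-odds that the masked entity is referred to with human pronouns."""
    masked = sentence.replace(entity, tok.mask_token, 1)
    ids = tok(masked, return_tensors="pt").input_ids
    pos = (ids == tok.mask_token_id).nonzero(as_tuple=True)[1]
    with torch.no_grad():
        probs = torch.softmax(mlm(ids).logits[0, pos], dim=-1).squeeze(0)
    p_human = sum(probs[tok.encode(w, add_special_tokens=False)[0]] for w in HUMAN)
    p_nonhuman = sum(probs[tok.encode(w, add_special_tokens=False)[0]] for w in NONHUMAN)
    return torch.log(p_human / p_nonhuman).item()

print(anthro_like_score("My AI is a friend who listens to me.", "AI"))
```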
We identify 20 dominant metaphors—ranging from "friend" to "god" to "thief"—and how their prevalence is shifting over time: warm, human-like metaphors (friend, assistant) are rising while mechanical ones that signal competence (computer, search engine) are declining.
May 2, 2025 at 1:20 AM
How does the public conceptualize AI? Rather than self-reported measures, we use metaphors to understand the nuance and complexity of people’s mental models. In our #FAccT2025 paper, we analyzed 12,000 metaphors collected over 12 months to track shifts in public perceptions.
May 2, 2025 at 1:19 AM
In our blogpost, we outline key directions to provide scaffolding for this work: conceptual clarity (what counts as anthropomorphic behavior?), better terminology, mitigation strategies, and unpacking what practices lead to anthropomorphic AI.
April 27, 2025 at 9:55 PM
Anthropomorphic AI system behaviors are already everywhere, from outputs explicitly claiming human-like life experiences (“I have a child”) to ones suggesting emotional capacity (“I’m excited for you!”).
April 27, 2025 at 9:55 PM
New ICLR blogpost! 🎉 We argue that understanding the impact of anthropomorphic AI is critical to understanding the impact of AI.
April 27, 2025 at 9:55 PM