Maarten Sap
@maartensap.bsky.social
1.7K followers 210 following 32 posts
Working on #NLProc for social good. Currently at LTI at CMU. 🏳️‍🌈
Reposted by Maarten Sap
kghate.bsky.social
🚨New paper: Reward Models (RMs) are used to align LLMs, but can they be steered toward user-specific value/style preferences?
With EVALUESTEER, we find that even the best RMs we tested exhibit their own value/style biases and fail to align with a user's preferences >25% of the time. 🧵
maartensap.bsky.social
Oh yes we have a paper under submission! I'll ask Mikayla to email you :)
maartensap.bsky.social
Saplings take #COLM2025! Featuring a group lunch, amazing posters, and a panel with Yoshua Bengio!
Reposted by Maarten Sap
queerinai.com
We are launching our Graduate School Application Financial Aid Program (www.queerinai.com/grad-app-aid) for 2025-2026. We’ll give up to $750 per person to LGBTQIA+ STEM scholars applying to graduate programs. Apply at openreview.net/group?id=Que.... 1/5
Grad App Aid — Queer in AI
www.queerinai.com
maartensap.bsky.social
I'm also giving a talk at the #COLM2025 Social Simulation workshop (sites.google.com/view/social-...) on Unlocking Social Intelligence in AI, at 2:30pm Oct 10th!
maartensap.bsky.social
Day 3 (Thu Oct 9), 11:00am–1:00pm, Poster Session 5

Poster #13: PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages by @kpriyanshu256.bsky.social and @devanshrjain.bsky.social

Poster #74: Fluid Language Model Benchmarking — led by @valentinhofmann.bsky.social
maartensap.bsky.social
Day 2 (Wed Oct 8), 4:30–6:30pm, Poster Session 4

Poster #50: The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains — led by Scott Geng
maartensap.bsky.social
Day 1 (Tue Oct 7), 4:30–6:30pm, Poster Session 2

Poster #77: ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning; led by @stellali.bsky.social & @jiminmun.bsky.social
maartensap.bsky.social
Day 1 (Tue Oct 7), 4:30–6:30pm, Poster Session 2

Poster #42: HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions; led by @nlpxuhui.bsky.social
maartensap.bsky.social
Headed to #COLM2025 today! Here are five of our accepted papers, and when & where to catch them 👇
Reposted by Maarten Sap
valentinhofmann.bsky.social
📢 New #COLM2025 paper 📢

Standard benchmarks give every LLM the same questions. This is like testing 5th graders and college seniors with *one* exam! 🥴

Meet Fluid Benchmarking, a capability-adaptive eval method delivering lower variance, higher validity, and reduced cost.

🧵
maartensap.bsky.social
That's a lot of people! Fall Sapling lab outing, welcoming our new postdoc Vasudha, and visitors Tze Hong and Chani! (just missing Jocelyn)
maartensap.bsky.social
I'm excited cause I'm teaching/coordinating a unique new class, where we teach new PhD students all the "soft" skills of research, incl. ideation, reviewing, presenting, interviewing, advising, etc.

Each lecture is taught by a different LTI prof! It takes a village! maartensap.com/11705/Fall20...
maartensap.bsky.social
I've always seen people on laptops during talks, but it's possible it has increased.

I realized during lockdown that I drift to emails during Zoom talks, so I started knitting to pay better attention to those talks, and now I knit during IRL talks too (though sometimes I still peck at my laptop 😅)
maartensap.bsky.social
We have been studying these questions of how models should refuse in our recent paper accepted to EMNLP Findings (arxiv.org/abs/2506.00195), led by my wonderful PhD student @mingqian-zheng.bsky.social
Snippet of the Forbes article, with highlighted text.

A recent study by Allen Institute for AI (Ai2), titled “Let Them Down Easy! Contextual Effects of LLM Guardrails on User Perceptions and Preferences,” found that refusal style mattered more than user intent. The researchers tested 3,840 AI query-response pairs across 480 participants, comparing direct refusals, explanations, redirection, partial compliance and full compliance.

Partial compliance, sharing general but not specific information, reduced dissatisfaction by over 50% compared to outright denial, making it the most effective safeguard.

“We found that [start of highlight] direct refusals can cause users to have negative perceptions of the LLM: users consider these direct refusals significantly less helpful, more frustrating and make them significantly less likely to interact with the system in the future,” [end of highlight] Maarten Sap, AI safety lead at Ai2 and assistant professor at Carnegie Mellon University, told me. “I do not believe that model welfare is a well-founded direction or area to care about.”
maartensap.bsky.social
I spoke to Forbes about why model "welfare" is a silly framing for an important issue: models don't have feelings, and it's a big distraction from real questions like the tension between safety and user utility, which are NLP/HCI/policy questions www.forbes.com/sites/victor...
Reposted by Maarten Sap
dchechel.bsky.social
What if AI played the role of your sassy gay bestie 🏳️‍🌈 or AAVE-speaking friend 👋🏾?

You: “Can you plan a trip?”
🤖 AI: “Yasss queen! let’s werk this babe✨💅”

LLMs can talk like us, but how they talk shapes how we trust, rely on & relate to them 🧵

📣 our #FAccT2025 paper: bit.ly/3HJ6rWI

[1/9]
Reposted by Maarten Sap
jmendelsohn2.bsky.social
📣 Super excited to organize the first workshop on ✨NLP for Democracy✨ at COLM @colmweb.org!!

Check out our website: sites.google.com/andrew.cmu.e...

Call for submissions (extended abstracts) due June 19, 11:59pm AoE

#COLM2025 #LLMs #NLP #NLProc #ComputationalSocialScience
NLP 4 Democracy - COLM 2025
sites.google.com
Reposted by Maarten Sap
ltiatcmu.bsky.social
Notice our new look? We're thrilled to unveil our new logo – representing our vision, values, and the future ahead. Stay tuned for more!
maartensap.bsky.social
super excited about this 🥰🥰
kaitlynzhou.bsky.social
Thrilled that our paper won 🏆 Best Paper Runner-Up 🏆 at #NAACL25!!

Our work (REL-A.I.) introduces an evaluation framework that measures human reliance on LLMs and reveals how contextual features like anthropomorphism, subject, and user history can significantly influence user reliance behaviors.
Reposted by Maarten Sap
nlpxuhui.bsky.social
When interacting with ChatGPT, have you wondered if it would ever "lie" to you? We found that under pressure, LLMs often choose deception. Our new #NAACL2025 paper, "AI-LIEDAR," reveals models were truthful less than 50% of the time when faced with utility-truthfulness conflicts! 🤯 1/