Niklas Stoehr
@niklasstoehr.bsky.social
Gemini Post-Training ⚫️ Research Scientist at Google DeepMind ⚫️ PhD from ETH Zurich
⚖️ Measuring Scalar Constructs in Social Science with LLMs

with rising (and established) stars in Computational Social Science

@haukelicht.bsky.social
@rupak-s.bsky.social
@patrickwu.bsky.social
@pranavgoel.bsky.social
@elliottash.bsky.social
@alexanderhoyle.bsky.social

arxiv.org/abs/2509.03116
November 17, 2025 at 9:29 AM
Reposted by Niklas Stoehr
[corrected link]

LLMs are often used for text annotation in social science. In some cases, this involves placing text items on a scale: e.g., 1 for liberal and 9 for conservative

There are a few ways to handle this task. Which work best? Our new EMNLP paper has some answers🧵
arxiv.org/abs/2509.03116
October 28, 2025 at 6:23 AM
Reposted by Niklas Stoehr
Evaluating topic models (and document clustering methods) is hard. In fact, since our paper critiquing standard evaluation practices four years ago, there hasn't been a good replacement metric

That ends today (we hope)! Our new ACL paper introduces an LLM-based evaluation protocol 🧵
July 8, 2025 at 12:40 PM
🎓 I recently defended my PhD and moved from one dream team at ETH Zurich to another at DeepMind—a huge thank you to the many people who have supported me along the way!
June 11, 2025 at 9:39 AM
Reposted by Niklas Stoehr
Our paper "A Practical Method for Generating String Counterfactuals" has been accepted to Findings of NAACL 2025! Joint work with @matan-avitan.bsky.social, @yoavgo.bsky.social, and Ryan Cotterell. We propose "Intervention Lens", a technique to explain interventions in natural language. (1/6)
February 12, 2025 at 3:19 PM
Reposted by Niklas Stoehr
Are LLMs biased when they write about political issues?

We just released IssueBench – the largest, most realistic benchmark of its kind – to answer this question more robustly than ever before.

Long 🧵with spicy results 👇
February 13, 2025 at 2:08 PM
Reposted by Niklas Stoehr
Can we understand and control how language models balance context and prior knowledge? Our latest paper shows it’s all about a 1D knob! 🎛️
arxiv.org/abs/2411.07404

Co-led with @kevdududu.bsky.social - @niklasstoehr.bsky.social, Giovanni Monea, @wendlerc.bsky.social, Robert West & Ryan Cotterell.
November 22, 2024 at 3:49 PM
Reposted by Niklas Stoehr
If you’re interested in mechanistic interpretability, I just found this starter pack and wanted to boost it (thanks for creating it @butanium.bsky.social !). Excited to have a mech interp community on bluesky 🎉

go.bsky.app/LisK3CP
November 19, 2024 at 12:28 AM
Reposted by Niklas Stoehr
Just launched a Political Comm/NLP/Text-as-Data Starter Pack. 🦋🤗

Join us and/or drop a message to be added!

go.bsky.app/39MWTjg #starterpack #polsci
November 18, 2024 at 3:01 PM
Reposted by Niklas Stoehr
Trying to bring ML/NLP/et al. people from ETH Zürich together. Ping me to add you. 🙂
bsky.app/starter-pack...
November 18, 2024 at 10:51 AM