Maria Antoniak
@mariaa.bsky.social
10K followers 1.3K following 2K posts
☀️ Assistant Professor of Computer Science at CU Boulder 👩‍💻 NLP, cultural analytics, narratives, online communities 🌐 https://maria-antoniak.github.io 💬 books, bikes, games, art
Pinned
mariaa.bsky.social
some little bluesky tips 🦋

your blocks, likes, lists, and just about everything except chats are PUBLIC

you can pin custom feeds; i like quiet posters, best of follows, mutuals, mentions

if your chronological feed is overwhelming, you can make and pin a personal list of "unmissable" people
mariaa.bsky.social
Umm… well… history shows… 😭
Reposted by Maria Antoniak
bsky.app
Quick Tip: Don’t want people getting pinged when you post or reply?

You can turn that off by heading to: Settings → Privacy and Security → Allow others to be notified of your posts and toggling it off.
A screenshot of “Allow others to be notified of your posts” with “No one” selected
Reposted by Maria Antoniak
abosselut.bsky.social
If you're interested in doing a postdoc at @icepfl.bsky.social, there's still time to apply for the @epfl-ai-center.bsky.social postdoctoral fellowships.

Apart from this, I'm also recruiting postdocs in developing novel training algorithms for reasoning models and agentic AI.
mariaa.bsky.social
"Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection"
Kabir Ahuja et al. arxiv.org/abs/2504.11900
Title, authors, and abstract of the paper.
Figure 1: Example of FLAWEDFICTIONSMAKER (without the filtering step) in action that can be used to introduce plot holes in a plot hole-free story.
mariaa.bsky.social
"Supposedly Equivalent Facts That Aren't? Entity Frequency in Pre-training Induces Asymmetry in LLMs" by Yuan He et al. arxiv.org/abs/2503.22362
Title, authors, and abstract of the paper.
Figure 1: LLMs can exhibit asymmetry when recognising equivalent facts, often identifying facts from high-frequency to low-frequency entities but struggling with the inverse. Shown here is a working example from our tests with the OLMo2-13B model.
mariaa.bsky.social
Inspired to share some papers that I found at #COLM2025!

"Register Always Matters: Analysis of LLM Pretraining Data Through the Lens of Language Variation" by Amanda Myntti et al. arxiv.org/abs/2504.01542
Title, authors, and abstract of the paper.
Figure 3: Change of accuracy from first to final checkpoint on individual benchmarks shown as a range, with grey indicating the first checkpoint and colours indicating the last checkpoint. The random-guess threshold is shown as a grey vertical line in cases where at least one model falls below it. Bars and legend shown in order of average accuracy.
mariaa.bsky.social
This takes me back down an old rabbit hole...

"Translator Jay Rubin cut about 61 of 1,379 pages, including three chapters... These chapters contain plot elements not found elsewhere in the book.... some chapters were moved ahead of others, taking them out of the context of the original order... "
The Wind-Up Bird Chronicle - Wikipedia
en.wikipedia.org
Reposted by Maria Antoniak
dmshanmugam.bsky.social
I am on the job market this year! My research advances methods for reliable machine learning from real-world data, with a focus on healthcare. Happy to chat if this is of interest to you or your department/team.
Reposted by Maria Antoniak
arxiv-cs-cl.bsky.social
Federica Bologna, Tiffany Pan, Matthew Wilkens, Yue Guo, Lucy Lu Wang
LONGQAEVAL: Designing Reliable Evaluations of Long-Form Clinical QA under Resource Constraints
https://arxiv.org/abs/2510.10415
Reposted by Maria Antoniak
mryskina.bsky.social
⭐ A thread for some cool recent work I learned about at #COLM2025, either from the paper presentations or from the keynotes!
Reposted by Maria Antoniak
mmvty.bsky.social
📣 New preprint! We know humans are biased against AI-creativity. But what about LLMs, now often judging creativity in various contexts? Do they replicate, transform, or amplify this bias? We tested it. Turns out: AI is 2.5X more biased against its own work than humans. arxiv.org/pdf/2510.08831 🧵
Reposted by Maria Antoniak
chenhaotan.bsky.social
What is the best way to find faculty job advertisements for CS/info students on the job market? It seems that students in my group are spending an unreasonable amount of time on this. I thought CRA was good, but it no longer seems to be active.
mariaa.bsky.social
I also had a great time at #COLM2025! I especially liked the long poster sessions (no need to rush through, plenty of time to see everything and chat with everyone) and single track talks.
juand-r.bsky.social
#COLM2025 was one of my favorite conferences -- a really high fraction of interesting papers and people, but small enough to see everything!
Thank you to the organizers for putting it together!
Reposted by Maria Antoniak
dustinbwright.com
Which, whose, and how much knowledge do LLMs represent?

I'm excited to share our preprint answering these questions:

"Epistemic Diversity and Knowledge Collapse in Large Language Models"

📄Paper: arxiv.org/pdf/2510.04226
💻Code: github.com/dwright37/ll...

1/10
mariaa.bsky.social
Yeah that’s the same route I was looking at and in total, 2+ hours even in the best case (middle of the day when there are somewhat frequent buses).
mariaa.bsky.social
I believe that would take 2+ hours unless I’m missing something!
mariaa.bsky.social
That’s good to know!
mariaa.bsky.social
Yeah my current frustration is definitely also fueled by arriving from Montreal 😭
mariaa.bsky.social
I’ve never owned a car, only occasionally renting or using a car share, for environmental and public health convictions. But Boulder might finally be the city to break me. Bad infrastructure mostly built for weekend warriors. Currently in a $90 Lyft from the airport because the hourly bus was full.
mariaa.bsky.social
I tried the same query with @kagi.com and the results seem better to me! First hit is a set of opinionated reddit threads from r/houseplants.
Reposted by Maria Antoniak
bayesianboy.bsky.social
What problem is explainability/interpretability research trying to solve in ML, and do you have a favorite paper articulating what that problem is?
mariaa.bsky.social
"come on pleeeease we only want to persuade you of good things! please please just let us persuade you a little bit! we promise only for GOOD things! come on!"
mariaa.bsky.social
"nlp for social good"
unenthusiast.com
In honour of spooky month, share a 4 word horror story that only someone in your profession would understand.

rm -rf ~/
hammancheez.bsky.social
"The chancellor approved it"
Reposted by Maria Antoniak
What does an LLM do when it translates from Italian "amore" to Spanish "amor" or French "amour"?

That's easy! (you might think) Because surely it knows: amore, amor, amour are all based on the same Latin word. It can just drop the "e", or add a "u".