Jeremiah Milbauer
@jerelev.bsky.social
NLP PhD student at Carnegie Mellon, resident at Mozilla AI, making things for thinking with // previously Google Research, Google X, cs+phil at UChicago // jeremiah.milbauer.info
'tis the season of getting cold emails from the boldest phd applicants.

In the interest of fairness for those who did not know they could ask, please DM if you'd like an inside perspective on AI/NLP at CMU. (Or share with those you know who might!)
October 22, 2025 at 6:34 PM
really enjoying COLM 2025, everyone keeps telling me it feels like ICLR or NeurIPS 10 years ago... so good vibes at an AI conference only happen once a decade?
October 10, 2025 at 8:29 PM
Emily's work looks at one of the most important issues emerging in the LLM era of social science
💡Can we trust synthetic data for statistical inference?

We show that synthetic data (e.g., LLM simulations) can significantly improve the performance of inference tasks. The key intuition lies in the interactions between the moment residuals of synthetic data and those of real data
October 10, 2025 at 8:27 PM
“He has endeavoured to prevent the Population of these States; for that Purpose obstructing the Laws for Naturalization of Foreigners; refusing to pass others to encourage their Migrations hither”
July 4, 2025 at 10:04 PM
how do we even handle tech that's this fragmented, where every layer of the stack wants to be the one to deliver it? are there historical examples where this gets resolved nicely?
Trying to read a PDF and I am being prompted in three separate places to use an LLM to summarize the fucking article. Is this what they mean when they say that gen AI is inevitable
June 11, 2025 at 2:52 PM
Reposted by Jeremiah Milbauer
Last year we started a project to download and preserve public data. lil.law.harvard.edu/blog/2025/01... Since saving public data is in the news today — but is always needed — let’s talk about what you can do to help.
Preserving Public U.S. Federal Data | Library Innovation Lab
lil.law.harvard.edu
January 31, 2025 at 8:59 PM
I love reading work that draws on ideas outside NLP to examine and amplify actual practitioner perspectives – so crucial as we work toward purposeful & humanistic AI.

We saw similar trends re: de-contextualization and loss of document narrative in upcoming research led by @siree.sh
December 18, 2024 at 9:08 PM
Reposted by Jeremiah Milbauer
New project with @profjamesevans and my @dsi-uchicago.bsky.social friends on the largest (?) examination (~11 million papers) of how computational social science emerges from -- and shapes -- Econ, Sociology, Psych, and PoliSci. Paper: arxiv.org/abs/2412.08087 A thread: (1/7)
December 12, 2024 at 5:31 PM
Reposted by Jeremiah Milbauer
Mosaic v0.12 is out: database-powered scalable, interactive visualization! 📈 One new addition is support for dynamic changes in the backing data. Move between smaller and larger samples to balance speed and comprehensive coverage.
November 21, 2024 at 5:43 PM
Reposted by Jeremiah Milbauer
1/ Introducing ᴏᴘᴇɴꜱᴄʜᴏʟᴀʀ: a retrieval-augmented LM to help scientists synthesize knowledge 📚
@uwnlp.bsky.social & Ai2
With open models & 45M-paper datastores, it outperforms proprietary systems & matches human experts.
Try out our demo!
openscholar.allen.ai
November 19, 2024 at 4:30 PM
very useful feed from @siree.sh, and for now I'm enjoying seeing research from outside my usual bubble!
I've been missing paper recommendations as a regular part of my feed, so I threw together a feed that pulls every skeet (minus link repost bots) with a link to arXiv, the ACL Anthology, NeurIPS proceedings, or OpenReview: bsky.app/profile/did:...

It's noisy, but maybe that's a feature?
November 13, 2024 at 10:13 PM
Just spun up a starter pack for people thinking about tools for thought & scholarship, future of work & science, human-AI collaboration, etc -- go.bsky.app/jSgrE5

Please reply if I've missed you, if you'd like me to remove you, or if you have suggestions on who to add!
November 9, 2024 at 4:33 PM
The guy in this photo taught my AP CS class in high school. He’s a fantastic teacher. I am not joking.
Man On Internet Almost Falls Into World Of DIY Mustard Enthusiasts
theonion.com/man-on-inter...
November 8, 2024 at 4:54 PM