Emma Pierson
emmapierson.bsky.social
Assistant professor of CS at UC Berkeley, core faculty in Computational Precision Health. Developing ML methods to study health and inequality. "On the whole, though, I take the side of amazement."

https://people.eecs.berkeley.edu/~emmapierson/
Apply here by 11/15: aprecruit.berkeley.edu/JPF05028. Review of applications is ongoing, so sooner is better! (The application deadline currently says 9/15 but will be extended.)
Postdoctoral Employee - Artificial Intelligence - Electrical Engineering and Computer Sciences Department
August 22, 2025 at 2:11 PM
Broad project areas include:

1) language modelling methods for scientific discovery (building on our recent work - arxiv.org/abs/2502.04382)

2) using language models to support equity (ai.nejm.org/doi/full/10....)

both in collaboration with health+social scientists.

2/3
Sparse Autoencoders for Hypothesis Generation
We describe HypotheSAEs, a general method to hypothesize interpretable relationships between text data (e.g., headlines) and a target variable (e.g., clicks). HypotheSAEs has three steps: (1) train a ...
August 22, 2025 at 2:11 PM
Thanks, Megan!! This is kind :) hope you’re doing well.
April 26, 2025 at 10:11 PM
This work is led by @gsagostini.bsky.social, who gets more excited about geospatial data than anyone I've ever met, and with Rachel Young, Maria Fitzpatrick, and @nkgarg.bsky.social.

Paper: arxiv.org/abs/2503.20989
Website (and data): migrate.tech.cornell.edu
Thread: bsky.app/profile/gsag...
Migration data lets us study responses to environmental disasters, social change patterns, policy impacts, etc. But public data is too coarse, obscuring these important phenomena!

We build MIGRATE: a dataset of yearly flows between 47 billion pairs of US Census Block Groups. 1/5
March 28, 2025 at 4:04 PM
This work is led by the wonderful @rajmovva.bsky.social and @kennypeng.bsky.social with coauthors @nkgarg.bsky.social and Jon Kleinberg. See Raj’s full thread for details, Python package, and project website!

bsky.app/profile/rajm...
💡New preprint & Python package: We use sparse autoencoders to generate hypotheses from large text datasets.

Our method, HypotheSAEs, produces interpretable text features that predict a target variable, e.g. features in news headlines that predict engagement. 🧵1/
March 18, 2025 at 6:26 PM
HypotheSAEs outperforms strong LLM baselines, generates new discoveries even on well-studied datasets, and comes with easy-to-use code.

We hope this will be helpful not just to CS folks, but to many in social/health sciences - please reshare to help reach them.
March 18, 2025 at 6:26 PM