compare 2 surveys:
1. 100% coverage, but response probability P[R = 1 | Y] differs a lot by Y
2. only 5% coverage, but P[R = 1 | Y] is roughly constant across Y
which would you use? both?
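a minimal simulation sketch of the tradeoff (all numbers invented): survey 1's nonresponse is nonignorable and biases the raw mean, while survey 2 behaves like a small random sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical population: binary Y with true mean 0.50
N = 100_000
y = rng.binomial(1, 0.5, N)

# survey 1: 100% coverage, but response depends strongly on Y
r1 = rng.binomial(1, np.where(y == 1, 0.30, 0.10))

# survey 2: only 5% coverage, but response is constant across Y
r2 = rng.binomial(1, 0.05, N)

print(y.mean())           # truth: ~0.50
print(y[r1 == 1].mean())  # survey 1 raw mean: ~0.75, badly biased
print(y[r2 == 1].mean())  # survey 2 raw mean: ~0.50, unbiased, just noisier
```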
we’ve focused on estimating means E[Y].
but say Y are open-ends ("describe how you feel about the candidate") and you want to read through a few draws from the population, not only survey responders.
what should you do?
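one natural answer (a sketch, not the only option): keep the survey weights and resample whole responses proportional to them, so what you read approximates population draws rather than responder draws. the answers and weights below are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical open-ended answers and their survey weights
answers = ["likes his economy message", "distrusts him", "undecided", "angry"]
weights = np.array([0.5, 2.0, 1.0, 3.5])

# draw a few responses with probability proportional to weight
p = weights / weights.sum()
for text in rng.choice(answers, size=3, replace=True, p=p):
    print(text)
```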
so far we've talked about weights and MRP for E[Y], vote choice in the population overall.
but what if you want E[Y | V = 1], vote choice in the population of voters?
what are the weights, and how do you modify MRP?
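one standard answer, sketched: fit a turnout model alongside the outcome model and reweight each poststratification cell by its turnout probability (notation mine).

```latex
% theta_c = E[Y | cell c] from the outcome model
% pi_c    = P(V = 1 | cell c) from a turnout model
% N_c     = population count of cell c
E[Y \mid V = 1] \;\approx\; \frac{\sum_c N_c \, \pi_c \, \theta_c}{\sum_c N_c \, \pi_c}
```

the implied weights pick up an extra factor of pi_c: a respondent in cell c gets w proportional to N_c * pi_c / n_c.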
We are looking for a teammate with expertise in both LLM tools and statistical modeling.
Someone who clearly communicates assumptions, results, and uncertainty, with care and kindness.
typical machine learning loss looks at one individual at a time
but for MRP, we care about aggregates
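a sketch of the contrast (function names and shapes are mine, not a standard API):

```python
import numpy as np

# individual-level: average log-loss, one respondent at a time
def individual_log_loss(y, p):
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# aggregate-level: error of the poststratified estimate itself,
# rolling cell predictions up with population counts N_c
def poststratified_error(cell_pred, cell_truth, N_c):
    w = N_c / N_c.sum()
    return np.sum(w * cell_pred) - np.sum(w * cell_truth)
```

a model can score well on the first and still miss on the second if its errors don't cancel across cells the way the population mix requires.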
you've got a survey collected by someone else, and they gave you weights.
how can you use those weights in MRP (Multilevel Regression and Poststratification)?
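one idea (a sketch under strong assumptions, not a settled recipe): treat the provided weight as another respondent-level covariate, e.g. bin the log-weights and use the bin as a grouping factor in the multilevel model; poststratification then needs each bin's population share, crudely approximated here by its share of total weight.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# hypothetical survey with provider-supplied weights w
df = pd.DataFrame({
    "y": rng.binomial(1, 0.5, 500),
    "w": rng.lognormal(0.0, 0.5, 500),
})

# bin the log-weights; the bin joins the MRP model as a grouping factor
df["w_bin"] = pd.qcut(np.log(df["w"]), q=5, labels=False)

# crude stand-in for each bin's population share: its share of total weight
pop_share = df.groupby("w_bin")["w"].sum() / df["w"].sum()
print(pop_share)
```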
you've done MRP.
someone asks you for survey weights.
how to get them?
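for pure poststratification (no pooling) the implied weight is just the cell's population share over its sample share; with partial pooling the exact weights come from linearizing the estimator, but this toy sketch gives the flavor.

```python
import pandas as pd

# hypothetical respondents assigned to poststratification cells
sample = pd.DataFrame({"cell": ["a", "a", "b", "c", "c", "c"]})
pop_counts = pd.Series({"a": 5000, "b": 3000, "c": 2000})  # e.g. census counts

n_c = sample["cell"].value_counts()
n, N = len(sample), pop_counts.sum()

# implied weight for a respondent in cell c: (N_c / N) / (n_c / n)
sample["w"] = sample["cell"].map((pop_counts / N) / (n_c / n))
print(sample)
```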
in midterms, voters tend to support the out party for balance
do polls still help predict midterms? yes
Basu's Bears is a lesson in:
1) using auxiliary information (pre-salmon-feasting weights)
2) how bad an unbiased estimator can be
statmodeling.stat.columbia.edu/2025/09/23/s...
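a tiny numerical version of both lessons (bears, weights, and design all invented):

```python
import numpy as np

rng = np.random.default_rng(0)

# 50 bears: pre-salmon weights known, post-salmon weights unknown
pre = rng.normal(200.0, 20.0, 50)
post = 1.3 * pre + rng.normal(0.0, 5.0, 50)  # each bear gains ~30%

# a Basu-style design: weigh bear 0 with prob 0.99,
# otherwise pick one of the other 49 uniformly at random
if rng.random() < 0.99:
    i, pi = 0, 0.99
else:
    i, pi = int(rng.integers(1, 50)), 0.01 / 49

# Horvitz-Thompson estimate of the herd total:
# exactly unbiased over the design, absurd in any single draw
ht_total = post[i] / pi

# ratio estimator using the auxiliary pre-weights: biased but sensible
ratio_total = (post[i] / pre[i]) * pre.sum()

print(ht_total, ratio_total, post.sum())
```

with probability 0.99 the HT estimate is roughly 1/50 of the true total (and a wild overestimate otherwise), yet it averages out exactly right over the design; the ratio estimator leans on the pre-weights and lands near the truth every time.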
we turned to a response instrument Z because random sampling is "dead"
but does this method still rely on starting with random sampling?
we want E[Y|X] but X can be missing
@lucystats.bsky.social @sarahlotspeich.bsky.social @glenmartin.bsky.social @maartenvsmeden.bsky.social et al. say:
random imputation should use Y
deterministic imputation shouldn't
statmodeling.stat.columbia.edu/2025/09/09/s...
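a sketch of the two rules, with an invented imputation setup (all coefficients made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# pretend we fit two imputation models for a missing covariate x:
# stochastic:    x | z, y ~ Normal(a1 + b1*z + c1*y, s**2)   (uses y)
# deterministic: x | z    ~ Normal(a0 + b0*z, ...)           (ignores y)
a1, b1, c1, s = 0.0, 1.0, 0.5, 1.0
a0, b0 = 0.0, 1.2

def impute_random(z, y):
    # draw from the conditional distribution given z AND y
    return a1 + b1 * z + c1 * y + rng.normal(0.0, s)

def impute_deterministic(z):
    # plug in the conditional mean given z only; y stays out
    return a0 + b0 * z
```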
split-plot designs are analogous to cluster sampling.
blocking is analogous to stratification.
featuring an experiment by Arjun Potter and colleagues at NM-AIST!
what are the problems with using LLMs as survey respondents?
how are these similar to problems with poststratification?
CC @tslumley.bsky.social
2 weeks ago we learned about the CES employer survey that produces the jobs count.
we asked: why use employment size in stratification but not nonresponse adjustment?
BLS responded!
statmodeling.stat.columbia.edu/2025/08/19/s...
in political surveys, we "logit shift" predictions to match known aggregates (e.g. total Democratic votes).
but what happens for multinomial outcomes?
a fun excuse to review IPF/raking 🍂
statmodeling.stat.columbia.edu/2025/08/12/s...
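in the binary case the logit shift is one line of root-finding: pick the constant delta that moves every prediction on the logit scale until the weighted mean hits the known total. a sketch (numbers invented):

```python
import numpy as np
from scipy.optimize import brentq

def logit_shift(p, w, target):
    """Add a common delta on the logit scale so the weighted
    mean of the shifted predictions matches a known aggregate."""
    lp = np.log(p) - np.log1p(-p)

    def gap(d):
        return np.average(1 / (1 + np.exp(-(lp + d))), weights=w) - target

    return 1 / (1 + np.exp(-(lp + brentq(gap, -10.0, 10.0))))

# e.g. shift predicted Democratic support to match a known 52% share
p = np.array([0.40, 0.55, 0.70])
w = np.array([1.0, 2.0, 1.0])
print(logit_shift(p, w, 0.52))
```

with K > 2 categories there is no single delta, and matching K known totals becomes an iterative adjustment, which is where IPF/raking enters.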
let's learn about the CES employer survey that produces the jobs count.
late reporting (a form of nonresponse) results in revisions.
my first (naive!) question: why use employment size in stratification but not nonresponse adjustment?
whether you respond to a survey (R) may depend on outcome (Y), even after controlling for covariates (X)
what if we can expand this set of X to include interest in politics?
so far we assumed response R is independent of outcome Y **within X**
but if R can depend on Y, what to do?
one idea: use a response instrument Z
statmodeling.stat.columbia.edu/2025/07/22/s...
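one common formalization from the nonignorable-nonresponse literature (a sketch; details vary by paper): Z predicts the outcome but is excluded from the response mechanism.

```latex
% exclusion: given X and Y, Z carries no extra information about response
P(R = 1 \mid X, Y, Z) = P(R = 1 \mid X, Y)
% relevance: Z still predicts the outcome within levels of X
Y \not\perp Z \mid X
```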
panel data includes repeated surveys of the same people over time.
this structure can be incorporated into models using person-level effects.
but misspecifying the person-level effects distribution can cause bias.
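a minimal version of the model in question (notation mine):

```latex
% person i, wave t: a person-level effect alpha_i shared across waves
y_{it} = \alpha_i + x_{it}^\top \beta + \varepsilon_{it},
\qquad \alpha_i \sim F
% the concern: F is typically taken normal and independent of x;
% if the true alpha_i are skewed or correlated with x_{it}, estimates can be biased
```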
With nonresponse worsening, we want to adjust for a lot of covariates.
This often means handling many missing covariates.
In theory, fit one big model for everything. But how can practitioners handle this?
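one practitioner workaround (not the one-big-model ideal): chained-equations imputation, which models each incomplete covariate given the others and cycles. a toy sketch with scikit-learn:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)

# hypothetical covariate matrix with ~20% scattered missingness
X = rng.normal(size=(200, 5))
X[rng.random(X.shape) < 0.2] = np.nan

# chained-equations style: regress each column on the others, iterate
imputer = IterativeImputer(sample_posterior=True, random_state=0)
X_complete = imputer.fit_transform(X)
print(np.isnan(X_complete).sum())  # 0: no missing values remain
```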
love the name: Structural Zero.
a structural zero is a cell that must be zero by logic, not by chance; no amount of extra sampling can fill it.
Alan Agresti's Categorical Data Analysis book offers a good explanation (which I'm sure the amazing authors at @hrdag.org will get into):