Pausal Zivference
@pausalz.bsky.social
2.2K followers 570 following 840 posts
Paul Zivich, Assistant (to the Regional) Professor Computational epidemiologist, causal inference researcher, amateur mycologist, and open-source enthusiast. https://github.com/pzivich #epidemiology #statistics #python #episky #causalsky
pausalz.bsky.social
I forgot to include the visualization for forward-mode autodiff...

Essentially, we break everything into simple functions and evaluate those functions using the object pairs from before. That gives us the derivative of the function at a particular point
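A minimal sketch of that forward pass in Python (the tuple-pair representation here is my own illustration, not how `delicatessen` implements it):

```python
import math

# Forward-mode pass for f(x) = x*x + sin(x) at x = 2.0, carrying a
# (value, derivative) pair through each sub-function
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])                # sum rule

def mul(u, v):
    return (u[0] * v[0], u[1] * v[0] + u[0] * v[1])  # product rule

def sin(u):
    return (math.sin(u[0]), math.cos(u[0]) * u[1])   # chain rule for sin

x = (2.0, 1.0)                          # seed pair: value 2, dx/dx = 1
value, deriv = add(mul(x, x), sin(x))
print(deriv)                            # 3.5838...
print(2 * 2.0 + math.cos(2.0))          # analytic f'(x) = 2x + cos(x), matches
```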
pausalz.bsky.social
To summarize, M-estimators provide an entry point into many computational aspects of epidemiology. Here, we looked at a few ways of computing the derivative
pausalz.bsky.social
Through this setup, we can recursively apply the rules of differentiation to any function. The caveat is that any functions within our function must have their rules explicitly programmed in (eg we must program what happens to sin(x) manually)

Numerical approximation doesn't require this of us
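For example, a hand-programmed rule for sin might look like this sketch (again with a (value, derivative) pair standing in for the object type):

```python
import math

def dual_sin(u):
    # the sin rule written out explicitly: d/dx sin(u) = cos(u) * u'
    value, deriv = u
    return (math.sin(value), math.cos(value) * deriv)

print(dual_sin((0.0, 1.0)))  # (0.0, 1.0): sin(0) = 0 and cos(0) * 1 = 1
```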
pausalz.bsky.social
The first object of the pair is the standard operator output (eg x + x would return 2x). The second object evaluates the derivative (eg x + x would return 2)
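As a tiny sketch of the two slots (tuples standing in for the object type):

```python
x = (3.0, 1.0)    # x at the point 3, with derivative dx/dx = 1

def add(u, v):
    # first slot: the standard output; second slot: the derivative
    return (u[0] + v[0], u[1] + v[1])

print(add(x, x))  # (6.0, 2.0) -> the value 2x = 6 and the derivative 2
```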
pausalz.bsky.social
This works programmatically by what is called 'operator overloading'. Essentially, we create a new object type and we redefine what the operators (addition, subtraction, multiplication) do for that object

Specifically, our new object when used with operators returns a pair
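A minimal sketch of what that overloading could look like (the `Dual` class name and fields are my own illustration; the actual `delicatessen` implementation differs):

```python
class Dual:
    """A (value, derivative) pair with overloaded arithmetic."""
    def __init__(self, value, deriv):
        self.value = value   # standard operator output
        self.deriv = deriv   # derivative carried alongside

    def __add__(self, other):
        # sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        # product rule: (u * v)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

x = Dual(3.0, 1.0)       # x at the point 3, seeded with dx/dx = 1
y = x + x
print(y.value, y.deriv)  # 6.0 2.0
```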
pausalz.bsky.social
autodiff exists somewhere between numerical approximation and symbolic manipulation. autodiff uses symbolic manipulation but only does so for the particular point we want the derivative at (so it doesn't need to store the full expression)

It does all this by programming all the derivative rules
pausalz.bsky.social
Numerical approximation does what it sounds like it does

The computer approximates what the slope would be at a particular point. We do this by evaluating the function at two nearby points (exactly where depends on the approximation method we use). Then we do 'rise over run' to compute the slope (which approximates the derivative at that point)
Visualization of numerically approximating the derivative. There is a gray line we want to compute the derivative of. Then there are two red points we evaluate the function at. We then compute the slope of the red dashed line that connects those two points
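A sketch of one common version, the central difference (the step size h and the evaluation point are just illustrative choices):

```python
import math

def approx_deriv(f, x, h=1e-6):
    # rise over run between two nearby points (central difference)
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: (x**x + x**2) ** math.sin(x)  # the weird function from before
print(approx_deriv(f, 1.5))
```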
pausalz.bsky.social
For the example function I gave previously, we can see the derivative is a bit of a beast. It has lots of terms (thanks to the chain rule)
A very long expression with lots of terms (because the chain rule requires lots of steps for the derivative of this function)
pausalz.bsky.social
Symbolic manipulation is how you learn derivatives in school. It is also the process websites like Wolfram Alpha provide.

It is cool and useful, but it is not always computationally efficient. For something like the sandwich, we don't need the full expression, just the evaluation at a point
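A quick sketch of the symbolic route using `sympy` (one option in Python; not what `delicatessen` uses internally):

```python
import sympy

x = sympy.symbols('x')
f = (x**x + x**2) ** sympy.sin(x)

df = sympy.diff(f, x)            # the full symbolic expression (a beast)
print(df)
print(df.subs(x, 1.5).evalf())   # the sandwich only ever needs this number
```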
pausalz.bsky.social
With a computer, there are essentially three options for evaluating derivatives: symbolic manipulation, numerical approximation, or autodiff
pausalz.bsky.social
As you might recall from a calculus class, the derivative can be thought of as the slope of a line at a particular point. You also might remember that there are a multitude of rules for computing a derivative

Below is a weird function we will take the derivative of
f(x) = (x^x + x^2)^(sin(x))
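In Python, that function could be written as:

```python
import math

def f(x):
    # f(x) = (x^x + x^2)^(sin(x))
    return (x**x + x**2) ** math.sin(x)

print(f(1.5))  # evaluate at an arbitrary point
```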
pausalz.bsky.social
autodiff is actually a very neat idea. I learned the most about it while coding it by hand for `delicatessen`. So this is an overview of what I learned during that process

github.com/pzivich/Deli...
Delicatessen/delicatessen/derivative.py at main · pzivich/Delicatessen
Delicatessen: the Python one-stop sandwich (variance) shop 🥪 - pzivich/Delicatessen
pausalz.bsky.social
I have been bad about keeping up with #MEstimatorMonday so this will be a week of M-estimators (covering 37 to 41 of 52)

To start, let's talk computational aspects. As you might recall from the weeks on the sandwich variance, we need to compute derivatives. One option is automatic differentiation (autodiff)
pausalz.bsky.social
That trick? Running different regression models on the same data until you get a small P-value for that super nutrient
NPR @npr.org · 3h
When it comes to rice and pasta, dietitians recommend eating brown or whole grain because they're more nutritious. But you can create a super nutrient in white rice and white pasta. Here's the trick.
There's a secret superfood in white rice and pasta: Here's how to unlock it
pausalz.bsky.social
I wanted to see what he was up to, so I looked up what he has listed on Google Scholar for 2025

There are forty-nine (49) papers he is an author on for the past 348 days...
pausalz.bsky.social
I haven't really paid attention to the lalonde data much, so what is the positivity violation that occurs?
pausalz.bsky.social
Feel free, I can also send any additional details you need
pausalz.bsky.social
Give it a read, there are some fun visualizations in there, like this one
pausalz.bsky.social
Glad I get to continue my anti-R (and associated R products) persona
pausalz.bsky.social
The correct version would have instead re-sampled the age distribution from the data

This error is subtle (it doesn't crash the program, and I also missed it on my first glance), and anyone using LLMs like the quoted person is going to make these errors
pausalz.bsky.social
Now the trick is that the code it output is actually wrong in a subtle way. I hadn't noticed the error it introduced because it doesn't raise an error. The red boxes mark the error.

I won't get into the finer details, but this causes the whole procedure to underestimate the variance
pausalz.bsky.social
And, like, the summary was fine. It was very basic and cursory, but there were no errors

It didn't highlight that I had provided code as part of the paper. So, I followed up by asking it for code. It generated a new example (using the details from the paper's example)