Pausal Zivference
@pausalz.bsky.social
2.2K followers 570 following 840 posts
Paul Zivich, Assistant (to the Regional) Professor Computational epidemiologist, causal inference researcher, amateur mycologist, and open-source enthusiast. https://github.com/pzivich #epidemiology #statistics #python #episky #causalsky
pausalz.bsky.social
I forgot to include the visualization for forward-mode autodiff...

Essentially, we break everything into simple functions and evaluate those functions using the object pairs from before. That gives us the derivative of the function at a particular point
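A minimal sketch of that forward pass in Python (the tuple-pair representation here is my own illustration, not how `delicatessen` implements it):

```python
import math

# Forward-mode pass for f(x) = x*x + sin(x) at x = 2.0, carrying a
# (value, derivative) pair through each sub-function
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])                # sum rule

def mul(u, v):
    return (u[0] * v[0], u[1] * v[0] + u[0] * v[1])  # product rule

def sin(u):
    return (math.sin(u[0]), math.cos(u[0]) * u[1])   # chain rule for sin

x = (2.0, 1.0)                          # seed pair: value 2, dx/dx = 1
value, deriv = add(mul(x, x), sin(x))
print(deriv)                            # 3.5838...
print(2 * 2.0 + math.cos(2.0))          # analytic f'(x) = 2x + cos(x), matches
```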
pausalz.bsky.social
To summarize, M-estimators provide an entry point into many computational aspects of epidemiology. Here, we looked at a few ways of computing the derivative
pausalz.bsky.social
Through this setup, we can recursively apply the rules of differentiation to any function. The caveat is that any functions within our function must have their rules explicitly programmed in (eg we must program what happens to sin(x) manually)

Numerical approximation doesn't require this of us
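For example, a hand-programmed rule for sin might look like this sketch (again with a (value, derivative) pair standing in for the object type):

```python
import math

def dual_sin(u):
    # the sin rule written out explicitly: d/dx sin(u) = cos(u) * u'
    value, deriv = u
    return (math.sin(value), math.cos(value) * deriv)

print(dual_sin((0.0, 1.0)))  # (0.0, 1.0): sin(0) = 0 and cos(0) * 1 = 1
```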
pausalz.bsky.social
The first object of the pair is the standard operator output (eg x + x would return 2x). The second object evaluates the derivative (eg x + x would return 2)
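As a tiny sketch of the two slots (tuples standing in for the object type):

```python
x = (3.0, 1.0)    # x at the point 3, with derivative dx/dx = 1

def add(u, v):
    # first slot: the standard output; second slot: the derivative
    return (u[0] + v[0], u[1] + v[1])

print(add(x, x))  # (6.0, 2.0) -> the value 2x = 6 and the derivative 2
```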
pausalz.bsky.social
This works programmatically by what is called 'operator overloading'. Essentially, we create a new object type and we redefine what the operators (addition, subtraction, multiplication) do for that object

Specifically, our new object when used with operators returns a pair
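A minimal sketch of what that overloading could look like (the `Dual` class name and fields are my own illustration; the actual `delicatessen` implementation differs):

```python
class Dual:
    """A (value, derivative) pair with overloaded arithmetic."""
    def __init__(self, value, deriv):
        self.value = value   # standard operator output
        self.deriv = deriv   # derivative carried alongside

    def __add__(self, other):
        # sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        # product rule: (u * v)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

x = Dual(3.0, 1.0)       # x at the point 3, seeded with dx/dx = 1
y = x + x
print(y.value, y.deriv)  # 6.0 2.0
```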
pausalz.bsky.social
autodiff exists somewhere between numerical approximation and symbolic manipulation. autodiff uses symbolic manipulation but only does so for the particular point we want the derivative at (so it doesn't need to store the full expression)

It does all this by programming all the derivative rules
pausalz.bsky.social
Numerical approximation does what it sounds like it does

The computer approximates what the slope would be at a particular point. We do this by evaluating the function at two nearby points (exactly where depends on the approximation method we use). Then we do 'rise over run' to compute the slope (which approximates the derivative at that point)
Visualization of numerically approximating the derivative. There is a gray line we want to compute the derivative of. Then there are two red points we evaluate the function at. We then compute the slope of the red dashed line that connects those two points
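A sketch of one common version, the central difference (the step size h and the evaluation point are just illustrative choices):

```python
import math

def approx_deriv(f, x, h=1e-6):
    # rise over run between two nearby points (central difference)
    return (f(x + h) - f(x - h)) / (2 * h)

f = lambda x: (x**x + x**2) ** math.sin(x)  # the weird function from before
print(approx_deriv(f, 1.5))
```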
pausalz.bsky.social
For the example function I gave previously, we can see the derivative is a bit of a beast. It has lots of terms (thanks to the chain rule)
A very long expression with lots of terms (because the chain rule requires lots of steps for the derivative of this function)
pausalz.bsky.social
Symbolic manipulation is how you learn derivatives in school. It is also the process websites like Wolfram Alpha provide.

It is cool and useful, but it is not always computationally efficient. For something like the sandwich, we don't need the full expression, just the evaluation at a point
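A quick sketch of the symbolic route using `sympy` (one option in Python; not what `delicatessen` uses internally):

```python
import sympy

x = sympy.symbols('x')
f = (x**x + x**2) ** sympy.sin(x)

df = sympy.diff(f, x)            # the full symbolic expression (a beast)
print(df)
print(df.subs(x, 1.5).evalf())   # the sandwich only ever needs this number
```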
pausalz.bsky.social
With a computer, there are essentially three options for evaluating derivatives: symbolic manipulation, numerical approximation, or autodiff
pausalz.bsky.social
As you might recall from a calculus class, the derivative can be thought of as the slope of a line at a particular point. You also might remember that there are a multitude of rules for computing a derivative

Below is a weird function we will take the derivative of
f(x) = (x^x + x^2)^(sin(x))
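In Python, that function could be written as:

```python
import math

def f(x):
    # f(x) = (x^x + x^2)^(sin(x))
    return (x**x + x**2) ** math.sin(x)

print(f(1.5))  # evaluate at an arbitrary point
```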
pausalz.bsky.social
autodiff is actually a very neat idea. I learned the most about it while coding it by hand for `delicatessen`. So this is an overview of what I learned during that process

github.com/pzivich/Deli...
Delicatessen/delicatessen/derivative.py at main · pzivich/Delicatessen
Delicatessen: the Python one-stop sandwich (variance) shop 🥪 - pzivich/Delicatessen
pausalz.bsky.social
I have been bad about keeping up with #MEstimatorMonday so this will be a week of M-estimators (covering 37 to 41 of 52)

To start, let's talk computational aspects. As you might recall from the weeks on the sandwich variance, we need to compute derivatives. One option is automatic differentiation (autodiff)
pausalz.bsky.social
That trick? Running different regression models on the same data until you get a small P-value for that super nutrient
NPR @npr.org · 3h
When it comes to rice and pasta, dietitians recommend eating brown or whole grain because they're more nutritious. But you can create a super nutrient in white rice and white pasta. Here's the trick.
There's a secret superfood in white rice and pasta: Here's how to unlock it
pausalz.bsky.social
I wanted to see what he was up to, so I looked up what he has listed on Google Scholar for 2025

There are forty-nine (49) papers he is an author on for the past 348 days...
pausalz.bsky.social
I haven't really paid attention to the lalonde data much, so what is the positivity violation that occurs?
pausalz.bsky.social
Feel free, I can also send any additional details you need
pausalz.bsky.social
Give it a read, there are some fun visualizations in there, like this one
pausalz.bsky.social
Glad I get to continue my anti-R (and associated R products) persona
pausalz.bsky.social
The correct version would have instead re-sampled the age distribution from the data

This error is subtle (it doesn't crash the program, and I also missed it on my first glance), and anyone using LLMs like the quoted person is going to make these errors
pausalz.bsky.social
Now the trick is that the code it output is actually wrong in a subtle way. I hadn't noticed the error it introduced because it doesn't raise an error. The red boxes mark the error.

I won't get into the finer details, but this causes the whole procedure to underestimate the variance
pausalz.bsky.social
And, like, the summary was fine. It was very basic and cursory, but there were no errors

It didn't highlight that I had provided code as part of the paper. So, I followed up by asking it for code. It generated a new example (using the details from the paper's example)