David Bindel
@dbindel.bsky.social
1K followers 640 following 650 posts
Professing computing and applied math at Cornell. Numerical methods for data science, plasma physics, other stuff depending on the day. Director, Cornell Center for Applied Mathematics; Director, Simons Collaboration on Hidden Symmetries and Fusion Energy.
dbindel.bsky.social
(A postscript: In Fall 2008, as a Courant instructor at NYU math, I taught an upper division undergrad probability course in which I gave myself free license to say things like "covariance is an inner product over a space of mean zero rvs," as a treat. At least a few students appreciated this.)
dbindel.bsky.social
Googling for this mostly brings up my own notes, which suggests to me that I'm falling prey to idiosyncrasies of my own academic upbringing. Pointers to alternate terminology that might lead to more successful searches would also be welcome!
dbindel.bsky.social
Now, here's the question: the concept of bias-variance decomposition is well known in statistics and machine learning. The concept of quasi-optimality is well known in approximation theory. Putting them together is not deep. Surely there must be a reference?
dbindel.bsky.social
One can do something similar with the noise term, and this is something I teach in my matrix computations class:

www.cs.cornell.edu/courses/cs62...
CS 6210: Matrix Computations
www.cs.cornell.edu
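To make the noise-term version concrete, here is a numerical sketch (a toy setup of my own, not from the course notes): for a ridge fit c = (AᵀA+λI)⁻¹Aᵀy with y = f + e and e ~ N(0, σ²I), the prediction variance at a test point has a closed form, which we can check by Monte Carlo.

# Noise-term sketch (toy setup; assumed details: fixed design A with
# rows w(x_i), ridge parameter lam, noise level sigma). The claim:
#   Var[w(x) c] = sigma^2 * || w(x) (A^T A + lam I)^{-1} A^T ||^2.
import numpy as np

rng = np.random.default_rng(0)
n, p, lam, sigma = 50, 5, 1e-2, 0.1
A = rng.standard_normal((n, p))      # design matrix, rows w(x_i)
f = rng.standard_normal(n)           # fixed "true" values at the samples
wx = rng.standard_normal(p)          # features w(x) at a test point

M = np.linalg.solve(A.T @ A + lam * np.eye(p), A.T)   # regularized pinv
var_formula = sigma**2 * np.sum((wx @ M) ** 2)

preds = np.array([wx @ (M @ (f + sigma * rng.standard_normal(n)))
                  for _ in range(50_000)])
print(var_formula, preds.var())      # the two should agree closely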
dbindel.bsky.social
For a regularized least squares fit, we can show that 𝔼[c]-c_* is a (regularized) Moore-Penrose pseudoinverse applied to a piece of r. And from there, we can show *quasi-optimality* of the squared bias: the version learned from data gives a residual within some factor of the best possible.
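Filling in details that did not fit in the skeet (my assumptions: a fixed design matrix A with rows w(x_i), samples y = A c_* + r + noise, ridge parameter λ, and fit c = (AᵀA+λI)⁻¹Aᵀy):

\begin{align*}
\mathbb{E}[c] - c_*
&= (A^T A + \lambda I)^{-1} A^T (A c_* + r) - c_* \\
&= (A^T A + \lambda I)^{-1} A^T r - \lambda (A^T A + \lambda I)^{-1} c_*,
\end{align*}

which at λ = 0 is exactly the Moore-Penrose pseudoinverse applied to the sampled piece of r.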
dbindel.bsky.social
Consider a generalized linear model s(X) = w(X) c. There is a c_* minimizing 𝔼[(f(X)-w(X) c)²] over c; write r(X) = f(X)-w(X) c_* for the residual at that minimizer. Then we can show (again by orthogonality) that 𝔼[(m(X)-f(X))²] = 𝔼[(w(X) (𝔼[c]-c_*))²] + 𝔼[r(X)²].
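The orthogonality step, written out (my sketch, not a quote): since c_* minimizes 𝔼[(f(X)-w(X)c)²], the normal equations give 𝔼[w(X)ᵀ r(X)] = 0, and with m(X) = w(X) 𝔼[c],

\begin{align*}
\mathbb{E}[(m(X)-f(X))^2]
&= \mathbb{E}[(w(X)(\mathbb{E}[c]-c_*) - r(X))^2] \\
&= \mathbb{E}[(w(X)(\mathbb{E}[c]-c_*))^2]
   - 2(\mathbb{E}[c]-c_*)^T \underbrace{\mathbb{E}[w(X)^T r(X)]}_{=0}
   + \mathbb{E}[r(X)^2].
\end{align*}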
dbindel.bsky.social
Now let X be random, and consider

𝔼[(f(X)-s(X))²] = Var[s(X)] + 𝔼[(m(X)-f(X))²]

where m is the mean of s with respect to the sampling distribution (so the remaining expectation is just over X).
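Spelled out (my rendering of the step): condition on X, then apply the pointwise identity with λ = f(X), so the Var[s(X)] above is the averaged conditional variance:

\begin{align*}
\mathbb{E}[(f(X)-s(X))^2]
&= \mathbb{E}\big[\mathbb{E}[(f(X)-s(X))^2 \mid X]\big] \\
&= \mathbb{E}\big[\textrm{Var}[s(X) \mid X] + (m(X)-f(X))^2\big].
\end{align*}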
dbindel.bsky.social
Now suppose f(x) is a (deterministic) function and s(x) is a predictor (based on a noisy sample). Then

𝔼[(f(x)-s(x))²] = Var[s(x)]+(𝔼[s(x)]-f(x))²

This is the bias-variance decomposition (sometimes people add a term for test-time noise, but I'll drop it here). Same idea.
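A quick Monte Carlo sanity check (a made-up toy example, not from the thread): fit a line to noisy samples of f, and compare 𝔼[(f(x)-s(x))²] against Var[s(x)] + (𝔼[s(x)]-f(x))² at a fixed test point.

# Toy check of the pointwise bias-variance decomposition (assumed
# setup: s(x) is a degree-1 polyfit to noisy samples of f).
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(x)                  # deterministic target
xs = np.linspace(0.0, 1.0, 20)           # fixed training grid
x0, sigma, ntrials = 0.5, 0.1, 20_000

preds = np.empty(ntrials)
for k in range(ntrials):
    y = f(xs) + sigma * rng.standard_normal(xs.size)  # noisy sample
    c = np.polyfit(xs, y, 1)                          # fit the predictor
    preds[k] = np.polyval(c, x0)                      # s(x0)

mse = np.mean((f(x0) - preds) ** 2)
var = preds.var()
bias2 = (preds.mean() - f(x0)) ** 2
print(mse, var + bias2)   # should agree up to rounding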
dbindel.bsky.social
OK, #MathSky. This reminds me of a reference-chasing exercise that came up again recently, and maybe someone here will know.

First, a perspective: covariance is almost an inner product (it is an inner product on mean 0 rvs). So what @ccanonne.github.io cites is the Pythagorean theorem.
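To make the Pythagorean reading concrete (my sketch): with the inner product ⟨U,V⟩ = 𝔼[UV], the split X-λ = (X-𝔼[X]) + (𝔼[X]-λ) is orthogonal, since the first piece has mean zero and the second is a constant, so

\begin{align*}
\|X-\lambda\|^2 &= \|X-\mathbb{E}[X]\|^2 + \|\mathbb{E}[X]-\lambda\|^2 \\
&= \textrm{Var}[X] + (\mathbb{E}[X]-\lambda)^2.
\end{align*}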
ccanonne.github.io
Here's a classic (but fun to show) fact: if X is any random variable (with a finite variance) and λ is a real, then

𝔼[(X-λ)²] = Var[X]+(𝔼[X]-λ)²

(In particular, this shows that 𝔼[X] is the quantity minimizing 𝔼[(X-λ)²] over all λ, and that Var[X] is the resulting minimum value.)
A short proof: here is the LaTeX code.

**Proof.** We have, for any $\color{blue}{\lambda} \in\mathbb{R}$,
\begin{align*}
\mathbb{E}[(X-\color{blue}{\lambda})^2]
&= \mathbb{E}[(X-\color{red}{\mathbb{E}[X]} + \color{red}{\mathbb{E}[X]} - \color{blue} {\lambda})^2] \\
&=\mathbb{E}[(X-\color{red}{\mathbb{E}[X]})^2 + 2(X-\color{red}{\mathbb{E}[X]})(\color{red}{\mathbb{E}[X]} - \color{blue} {\lambda}) + (\color{red}{\mathbb{E}[X]} - \color{blue} {\lambda})^2]\\
&=\underbrace{\mathbb{E}[(X-\color{red}{\mathbb{E}[X]})^2]}_{=\textrm{Var}[X]} + 2\underbrace{\mathbb{E}[X-\color{red}{\mathbb{E}[X]}]}_{=0}(\color{red}{\mathbb{E}[X]} - \color{blue}{\lambda}) + (\color{red}{\mathbb{E}[X]} - \color{blue}{\lambda})^2
\end{align*}
and that's all. (The first step is a trick known as *"hiding zero"*: writing $0=a-a$. 🤷)
dbindel.bsky.social
Things are all pear-shaped, and yet, I got some bunny paper clips along with my order of whiteboard marker refills, and that sparks joy.
Reposted by David Bindel
paysmaths.bsky.social
"There are lots of problems in mathematics that are interesting but have not been solved, and every time you solve one you think up a new one. Mathematics, therefore, is something that expands rather than contracts." – Mary Ellen Rudin (1924– 2013)
#quote #mathematics #problems #maths #math
dbindel.bsky.social
Also got to have lunch with a former student of mine yesterday. Time flies.
dbindel.bsky.social
On the AC Transit to Berkeley for day 3 of Developments in Modern Methods for Linear Algebra... aka DMML, aka happy 70th to my PhD advisor, Jim Demmel.

Realized that I am now the same age he was about halfway through my PhD years...
Reposted by David Bindel
bghgmg.bsky.social
gender norms admit the existence of gender metrics which can be abstracted to gender topologies which, here, we categorify into gender locales, also known as pointless genders
dbindel.bsky.social
- Asvine just released a new pen! The V800 is very pretty. I don't need more fountain pens, but can still admire.
- I get to check out a CT scanner setup tomorrow! Research reasons, nothing medical.
- The CAM students remain an excellent community. They did tie-dye shirts today.
dbindel.bsky.social
- I am learning about the mechanics of hair! Which is also awesome.
- Apparently, thinking these things are awesome is a thing the administration now thinks can be cured with Vitamin B. My vitamin B levels are fine, and I am happy they are wrong (even if I wish they were less in my face about it).
dbindel.bsky.social
- I am also writing a recommendation for a postdoc, who is awesome and should land that faculty job he wants. And if the market sucks, well, I am still happy he's awesome and working some of the time with me.
dbindel.bsky.social
- I am astounded that I don't yet have a better alternative than TikZ and dvisvgm for programmatically producing the kinds of math diagrams I want for notes and slides (sketch below), and it's fun to Google variations of "really?" in the hopes of finding a different answer.
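For concreteness, roughly the pipeline in question (a minimal sketch from memory; exact flags and drivers may vary by setup):

% Build (assumed): latex fig.tex && dvisvgm --no-fonts fig.dvi -o fig.svg
\documentclass{standalone}
\usepackage{tikz}
\begin{document}
\begin{tikzpicture}
  \draw[->] (0,0) -- (2.2,0) node[right] {$x$};
  \draw[->] (0,0) -- (0,2.2) node[above] {$y$};
  \draw[thick,domain=0:2,smooth] plot (\x, {\x*\x/2});  % a sample curve
\end{tikzpicture}
\end{document}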
dbindel.bsky.social
- My background music is K2 singing in the shower.
- I spent time redoing lecture notes today, and am enjoying the process.
- Next weekend is the DMML meeting, aka JimFest -- in honor of my advisor (a) retiring and (b) turning 70. Looking forward to a couple of days in Berkeley.
dbindel.bsky.social
In a day of wtf nat'l news moments (again!), I also know:
- Today was AppleFest in Ithaca! And yesterday, and tomorrow.
- K1 is out enjoying a concert, and has been noodling around on his guitar all week.
- K2 just got back from playing basketball and is heading to a sleepover soon.
Reposted by David Bindel
verybadllama.bsky.social
I have taken
the Tylenol
that was in
the medicine cabinet

and which
they think probably
is the reason
you like trains

forgive me
but that’s bullshit
you got autism
from your dad
dbindel.bsky.social
I approve of books and bunny (?) both!
dbindel.bsky.social
This generalizes beyond m=2, but that's just a little longer than one skeet!
dbindel.bsky.social
Consider m=2. If switching arguments switches sign, then f(u,u)=-f(u,u), so f(u,u)=0. Conversely, suppose f(u,u)=0 for all u. Then 0 = f(u+v,u+v) = f(u,u)+f(u,v)+f(v,u)+f(v,v) = f(u,v)+f(v,u). So vanishing whenever two arguments agree is equivalent to switching sign when two arguments are swapped.
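A sketch of the general case (my filling-in of that slightly longer argument): for fixed slots i < j, freeze the other m-2 arguments and set g(u,v) = f(...,u,...,v,...) with u and v in slots i and j. Then g is bilinear with g(u,u) = 0, and the m=2 computation applies verbatim:

\[
0 = g(u+v, u+v) = \underbrace{g(u,u)}_{=0} + g(u,v) + g(v,u) + \underbrace{g(v,v)}_{=0},
\]

so g(u,v) = -g(v,u), i.e., swapping slots i and j flips the sign.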