Jason Hartline
@jasonhartline.bsky.social
Professor at Northwestern CS. Economics, by courtesy. Study mechanism design, economics of algorithms, regulation of algorithms, AI and society. https://sites.northwestern.edu/hartline/
This is a really lovely result. Again, it's in this paper:

"Loss Minimization Yields Multicalibration for Large Neural Networks"
arxiv.org/abs/2304.09424

It seems the paper is not very well known and, hence, worth posting about.
November 12, 2025 at 11:39 PM
3. so for most sizes n, increasing to n+k+2 does not improve the loss much, and hence, by step 1, those networks must have low multicalibration bias.

4. minimizing loss with a regularizer that penalizes large network size will avoid the sizes n where there is significant improvement from making the network a little bigger (i.e., to n+k+2).
November 12, 2025 at 11:37 PM
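Step 4 can be illustrated numerically: with a per-unit size penalty lam added to the loss, any size whose loss still drops by more than lam at the next size cannot be the regularized minimizer. A toy sketch with made-up numbers (the paper compares sizes n and n+k+2; for simplicity this compares consecutive sizes):

```python
# Toy illustration of step 4: pick the size n minimizing loss[n] + lam * n.
# If growing from n to n+1 improved the loss by more than lam, size n
# could not be the minimizer, so the chosen size has a small improvement gap.

def best_regularized_size(losses, lam):
    """Size minimizing the regularized loss (synthetic example, not the paper's setup)."""
    return min(range(len(losses)), key=lambda n: losses[n] + lam * n)

# Made-up, non-increasing optimal losses as a function of network size.
losses = [1.0, 0.6, 0.55, 0.3, 0.28, 0.27, 0.1, 0.09, 0.09, 0.08]
lam = 0.05  # hypothetical per-unit size penalty

n_star = best_regularized_size(losses, lam)
print(n_star)  # → 6

# At the minimizer, the one-step improvement is at most the marginal penalty.
assert losses[n_star] - losses[n_star + 1] <= lam
```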
2. consider the sequence of optimal losses as a function of network size. there can't be too many sizes n where the networks of size n+k+2 are significantly better: the improvements telescope, so their cumulative sum over any sequence of networks is at most the maximum loss of 1.
November 12, 2025 at 11:23 PM
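The counting argument in step 2 can be checked numerically: for a non-increasing loss sequence with values in [0, 1], the gaps telescope to at most 1, so at most 1/eps sizes can improve by eps or more. A minimal sketch with synthetic losses (comparing consecutive sizes rather than n and n+k+2, which is the same argument run along each residue class):

```python
# Counting argument: losses lie in [0, 1] and are non-increasing in size,
# so the gaps losses[n] - losses[n+1] telescope to at most 1.
# Hence at most 1/eps sizes n can have a gap of eps or more.

def big_gap_sizes(losses, eps):
    """Indices n where growing the network improves the loss by at least eps."""
    return [n for n in range(len(losses) - 1) if losses[n] - losses[n + 1] >= eps]

# Synthetic non-increasing losses (illustrative only, not from the paper).
losses = [1.0, 0.6, 0.55, 0.3, 0.28, 0.27, 0.1, 0.09, 0.09, 0.08]
eps = 0.2

gaps = big_gap_sizes(losses, eps)
# Telescoping: the gaps sum to losses[0] - losses[-1] <= 1,
# so at most 1/eps = 5 indices can have a gap of at least eps.
assert len(gaps) <= 1 / eps
print(gaps)  # → [0, 2]
```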
Proof outline:
1. neural networks are rich enough that, if a size-n network has expected multicalibration bias on the points identified by a size-k network, then the two networks can be combined into a network of size n+k+2 that reduces the squared loss by the squared bias.
November 12, 2025 at 11:20 PM
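To spell out the squared-loss accounting in step 1 (notation mine, not the paper's): suppose the size-n predictor f has bias b = E[y − f(x) | g(x) = 1] on the group picked out by a size-k indicator network g, and that group has probability p. Patching f to f + b·g changes the loss as follows:

```latex
\mathbb{E}\big[(y - f(x) - b\,g(x))^2\big]
  = \mathbb{E}\big[(y - f(x))^2\big]
    - 2b\,\mathbb{E}\big[(y - f(x))\,g(x)\big]
    + b^2\,\mathbb{E}\big[g(x)^2\big].
```

With g(x) ∈ {0, 1}, we have E[(y − f(x)) g(x)] = pb and E[g(x)²] = p, so the patch reduces the squared loss by 2b·pb − b²p = pb². Presumably the few extra units needed to wire b·g(x) into f account for the +2 in n+k+2.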
Back to "Loss Minimization Yields Multicalibration for Large Neural Networks". Main Theorem:

The minimizer of "squared loss plus a regularizer term that penalizes neural network models by their size" is automatically multicalibrated with respect to small neural networks.
November 12, 2025 at 11:18 PM
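In symbols, my reading of the theorem (λ, s(f), and α are hypothetical notation, not from the post): the regularized minimizer

```latex
f^{*} \in \arg\min_{f}\; \mathbb{E}\big[(f(x) - y)^2\big] + \lambda\, s(f),
```

where s(f) denotes the size of the network f, satisfies, for every size-k network g,

```latex
\Big|\,\mathbb{E}\big[(y - f^{*}(x))\, g(x)\big]\,\Big| \le \alpha
```

for some small α depending on λ and k; see the paper for the exact quantitative statement.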
The calibration module was heavily based on the paper reading course co-taught with @yifanwu.bsky.social last year. Details of both courses are here:

- paper reading on calibration: sites.northwestern.edu/hartline/cs-...
- lecture based on data economics: sites.northwestern.edu/hartline/cs-...
November 12, 2025 at 11:13 PM
On the other hand, some esteemed scholars are finding the AI slop to be pretty good. bsky.app/profile/lanc...
ChatGPT Pulse yesterday, of its own accord, saw that I was teaching basic circuits in my class, and it gave me formatted notes on a high-level intuitive approach to the switching lemma. It even had the class date, time, and location.

Bowing to my AI overlords, I used the example in my lecture today.
November 8, 2025 at 7:33 PM