Aaron Roth
@aaroth.bsky.social
Professor at Penn, Amazon Scholar at AWS. Interested in machine learning, uncertainty quantification, game theory, privacy, fairness, and most of the intersections therein
My own STOC submission. Really did use elegant properties of linear regression that I didn't know about until embarrassingly recently!
November 5, 2025 at 1:32 AM
i.e. in this framework, you get the decision theoretic benefits of full calibration at an extremely low (and computationally tractable) level of this hierarchy. The paper is here: arxiv.org/abs/2510.23471 and is joint with Shayan Kiyani, Hamed Hassani, and George Pappas.
Robust Decision Making with Partially Calibrated Forecasts (arxiv.org)
October 30, 2025 at 7:02 PM
What lies in between? Maybe an infinite hierarchy of ever less conservative decision rules as we add to H. But one surprise that we find is that as soon as H contains the decision calibration tests (just one for each action), the optimal decision rule collapses to best response.
October 30, 2025 at 7:02 PM
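A minimal sketch of what per-action decision calibration tests look like for a binary outcome (my own illustration; the utility matrix and function names are hypothetical, not the paper's):

import numpy as np

# Hypothetical utility matrix for two actions and a binary outcome: u[a, y].
u = np.array([[1.0, -2.0],
              [0.2,  0.1]])

def best_response(p):
    """Action maximizing expected utility when the outcome is Bernoulli(p)."""
    expected = (1 - p) * u[:, 0] + p * u[:, 1]
    return int(expected.argmax())

def decision_test(a):
    """Decision calibration test for action a: h_a(p) = 1 iff a is the best response to p."""
    return lambda p: float(best_response(p) == a)

# If f passes the calibration test h_a for every action a, the claim in the
# post above is that the robust rule collapses to best responding to f(x).
for p in [0.1, 0.5, 0.9]:
    print(p, best_response(p), [decision_test(a)(p) for a in range(2)])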
We can interpolate between full calibration and no information: optimize for the worst distribution that is consistent with the H-calibration guarantees of f. When H is empty, we recover the minimax safety strategy. When H is all functions, we recover the best-response rule.
October 30, 2025 at 7:02 PM
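A toy sketch of this interpolation for a binary outcome, two actions, and forecasts supported on a small grid (my own construction, not the paper's algorithm; the grid, weights, utilities, test class, and eps are all hypothetical). For each deterministic decision rule, compute its worst-case expected utility over conditional label means consistent with the H-calibration constraints (a small linear program), then keep the best rule:

import itertools
import numpy as np
from scipy.optimize import linprog

V = np.array([0.1, 0.5, 0.9])          # forecast values f(x) can take
w = np.array([0.3, 0.4, 0.3])          # marginal weight of each forecast value
u = np.array([[1.0, -2.0],             # u[a, y]: utility of action a if outcome y
              [0.2,  0.1]])
eps = 0.05

# Test class H, each test evaluated on the forecast grid.
H = [np.ones_like(V),                  # constant test (average bias)
     V.copy()]                         # linear test

def worst_case_value(rule):
    """Min over conditional means m(v) in [0,1], subject to the constraints
    |sum_v w(v) h(v) (v - m(v))| <= eps for each h in H, of the expected
    utility of playing action rule[v] at forecast value v."""
    coef = np.array([w[i] * (u[a, 1] - u[a, 0]) for i, a in enumerate(rule)])
    const = sum(w[i] * u[a, 0] for i, a in enumerate(rule))
    A_ub, b_ub = [], []
    for h in H:
        c_h = float(np.sum(w * h * V))
        A_ub.append(w * h);    b_ub.append(c_h + eps)   #  sum w h m <= c_h + eps
        A_ub.append(-(w * h)); b_ub.append(eps - c_h)   # -sum w h m <= eps - c_h
    res = linprog(coef, A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(0.0, 1.0)] * len(V))
    return const + res.fun

# Brute-force the best deterministic rule (tiny example: 2^3 rules).
best = max(itertools.product([0, 1], repeat=len(V)), key=worst_case_value)
print("robust rule (action per forecast value):", best)

With H empty the constraints vanish and this reduces to the minimax safety strategy; as H grows the feasible set of distributions shrinks and the rule becomes less conservative.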
Full calibration is hard and so rarely satisfied. But predictions aren't useless either. Maybe the forecaster is partially calibrated in that for some class of tests H={h1,...,hk}, we know that |E[(f(x)-y)*h(f(x))]| <= eps. Most relaxations of calibration have this format.
October 30, 2025 at 7:02 PM
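A minimal sketch of checking this condition empirically (my own; the forecasts, labels, test class, and eps below are hypothetical):

import numpy as np

def h_calibration_errors(p, y, tests):
    """Empirical |E[(f(x) - y) * h(f(x))]| for each test h, where p = f(x)."""
    return {name: abs(np.mean((p - y) * h(p))) for name, h in tests.items()}

# Hypothetical data: forecasts p = f(x) in [0, 1] and binary outcomes y.
rng = np.random.default_rng(0)
p = rng.uniform(size=10_000)
y = rng.binomial(1, p)   # in this toy example f happens to be well calibrated

# A small test class H; many relaxations of calibration fix some such class.
tests = {
    "constant":     lambda q: np.ones_like(q),           # average bias
    "identity":     lambda q: q,                         # correlation-style test
    "upper_bucket": lambda q: (q > 0.5).astype(float),   # bias on high forecasts
}

eps = 0.02   # tolerance, on the order of the sampling noise at this sample size
errors = h_calibration_errors(p, y, tests)
print({name: (round(err, 4), err <= eps) for name, err in errors.items()})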
If the forecasts have no bearing on the outcome at all, then you should ignore them, and you might conservatively play your minimax strategy: argmax_a min_o u(a,o). The forecasts don't tell you how to do anything better. But generally we aren't in either of these two cases.
October 30, 2025 at 7:02 PM
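A minimal sketch of the minimax safety strategy argmax_a min_o u(a,o) on a hypothetical utility matrix (my own example, not from the thread):

import numpy as np

# Hypothetical utility matrix: u[a, o] is the payoff of action a under outcome o.
u = np.array([
    [1.0, -2.0],   # action 0: great under outcome 0, costly under outcome 1
    [0.2,  0.1],   # action 1: a "safe" action with a modest payoff either way
])

# Minimax safety strategy: argmax_a min_o u(a, o).  No forecast is consulted.
worst_case = u.min(axis=1)
minimax_action = int(worst_case.argmax())
print(minimax_action, worst_case[minimax_action])   # action 1, guaranteed 0.1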
(but that is not the same thing, I think, as having "no understanding" and only extruding text)
October 21, 2025 at 7:58 PM
I totally agree they are worse on out-of-distribution examples, and cannot learn/improve on these tasks the way people can. They seem to have modest understanding of a huge collection of things rather than deep understanding of anything, and no ability to learn.
October 21, 2025 at 7:58 PM
I do not --- but I have been using them quite a bit to draft mathematics using coding tools like Cursor and Windsurf, and I have found them useful. It is very much human in the loop --- but also very useful in my experience.
October 21, 2025 at 7:54 PM
But now that they write working code, and can be useful assistants in mathematical research (including my own), I don't see how it is defensible to say that all value/understanding comes from interpretation on the part of the human user. I'd be interested in hearing the best version of the argument.
October 21, 2025 at 3:33 PM
I have to say, because of my upbringing in computer science (and in particular TCS), I am partial to the functionalist argument. When LLMs were just chatting and writing poems, I could believe that we were reading more into them than was there because of our anthropomorphic biases.
October 21, 2025 at 3:33 PM