Come chat if you’re interested in learning more!
This is work done with wonderful collaborators: Yang Liu, @fcalmon.bsky.social, and @berkustun.bsky.social.
For example, we demonstrate that by abstaining from prediction with our algorithm, we can make fewer mistakes than standard approaches.
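As a minimal sketch of what abstention looks like in practice: this is an illustrative stand-in, not the paper's algorithm. The confidence rule and `threshold` value are assumptions made for the example.

```python
import numpy as np

def predict_with_abstention(probs, threshold=0.8):
    """Return class predictions, or -1 (abstain) when the top probability is below threshold."""
    preds = probs.argmax(axis=1)          # most likely class per example
    confident = probs.max(axis=1) >= threshold
    return np.where(confident, preds, -1) # abstain on low-confidence examples

probs = np.array([[0.95, 0.05],   # confident -> predict class 0
                  [0.55, 0.45]])  # uncertain -> abstain
print(predict_with_abstention(probs).tolist())  # [0, -1]
```

Abstaining trades coverage for accuracy: the model answers on fewer examples, but makes fewer mistakes on the examples it does answer.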
Regret is inevitable with label noise, but it can tell us where models silently fail and how to guide safer predictions.
If we can’t tell which predictions are wrong, we can’t improve models, we can’t debug them, and we can’t trust them in high-stakes tasks like healthcare.
Plenty of algorithms have been designed to handle label noise by predicting well on average, but we show how they still fail on specific individuals.
This helps us spot when a model is unreliable at the individual level.
💬 Come chat with me at #ICLR2025 Poster Session 2!
Shoutout to my amazing colleagues behind this work:
@tomhartvigsen.bsky.social
@berkustun.bsky.social
We applied our method to stress detection from smartwatch data, where noisy self-reported labels can be compared against clean physiological measures.
📈 Our model tracks the true time-varying label noise—reducing test error over baselines.
💥 Results:
On 4 real-world time series tasks:
✅ Temporal methods beat static baselines
✅ Our methods better approximate the true noise function
✅ They work when the noise function is unknown!
A temporal label noise function defines how likely each true label is to be flipped—as a function of time.
Using this function, we propose a new time series loss function that is provably robust to label noise.
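To make the idea concrete, here is a hedged sketch of a time-varying noise function and one standard way (forward correction) to build a noise-robust loss from it. The linear schedule in `noise_matrix` is made up for illustration; this is not the paper's exact loss.

```python
import numpy as np

def noise_matrix(t):
    """Hypothetical temporal noise for binary labels: flip probability grows with t in [0, 1]."""
    eps = 0.1 + 0.2 * t
    return np.array([[1 - eps, eps],
                     [eps, 1 - eps]])  # Q_t[i, j] = P(noisy label j | true label i, time t)

def forward_corrected_nll(clean_probs, noisy_label, t):
    """Push the model's clean-label probabilities through Q_t, then score the observed noisy label."""
    noisy_probs = noise_matrix(t).T @ clean_probs  # P(noisy label | x, t)
    return -np.log(noisy_probs[noisy_label])

p = np.array([0.9, 0.1])  # model's estimate of P(true label | x)
print(round(forward_corrected_nll(p, noisy_label=0, t=0.0), 3))  # 0.198
```

Because the correction matches the distribution of the noisy labels we actually observe, minimizing this loss can recover the clean-label predictor even though training never sees clean labels.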
In many real-world time series (e.g., wearables, EHRs), label quality fluctuates over time
➡️ Participants fatigue
➡️ Clinicians miss more during busy shifts
➡️ Self-reports drift seasonally
Existing methods assume static noise → they fail here