Interested in RL, AI Safety, Cooperative AI, TCS
https://karim-abdel.github.io
🚨 Goal misgeneralization occurs when AI agents learn the wrong reward function instead of the human's intended goal.
😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!
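Roughly, a minimax regret objective trains the policy against the worst-case regret over a set of training environments (a standard formulation; the symbols E, V_E, and pi*_E below are illustrative notation, not necessarily the paper's):

\min_{\pi} \; \max_{E \in \mathcal{E}} \; \mathrm{Regret}(\pi, E), \qquad \mathrm{Regret}(\pi, E) = V_E(\pi^{*}_{E}) - V_E(\pi),

where V_E(\pi) is the expected return of policy \pi in environment E and \pi^{*}_{E} is the optimal policy for E.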