Karim Abdel Sadek
karimabdel.bsky.social
Incoming PhD, UC Berkeley

Interested in RL, AI Safety, Cooperative AI, TCS

https://karim-abdel.github.io
*New Paper*

🚨 Goal misgeneralization occurs when AI agents learn an unintended proxy for the reward function instead of the human's intended goal.

😇 We show that training with a minimax regret objective provably mitigates it, promoting safer and better-aligned RL policies!
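The minimax regret idea can be sketched with a toy example (hypothetical numbers, not from the paper): among candidate policies, prefer the one whose worst-case regret — the gap to the optimal return — across training environments is smallest, rather than the one that maximizes return on any single environment.

```python
# Toy sketch of the minimax regret objective (hypothetical numbers).
# opt[i]: best achievable return in environment i.
opt = [10.0, 8.0]

# Returns of two candidate policies in each environment:
# "proxy" exploits env 0 but fails when the goal shifts (env 1);
# "aligned" is slightly suboptimal everywhere but robust.
policies = {
    "proxy":   [10.0, 0.0],
    "aligned": [9.0, 7.0],
}

def max_regret(returns, opt):
    # Regret in env i = optimal return minus the policy's return there.
    return max(o - r for r, o in zip(returns, opt))

# Minimax regret selects the policy with the smallest worst-case regret.
best = min(policies, key=lambda p: max_regret(policies[p], opt))
print(best)  # "aligned": worst-case regret 1.0 vs 8.0 for "proxy"
```

A pure return-maximizing objective on the training environment would pick the "proxy" policy here; the regret objective penalizes its catastrophic failure in the shifted environment.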
July 8, 2025 at 5:16 PM