Abhishek Sharma
abhishekshar.bsky.social
CS PhD @Harvard w/ Finale Doshi-Velez | Research in {Reinforcement Learning | Healthcare | Representation Learning}

🌐 https://abhishekshar.com/
Our paper: Decision-Point Guided Safe Policy Improvement
We show that a simple approach to learning safe RL policies can outperform most offline RL methods. (+theoretical guarantees!)

How? Just allow the state-actions that have been seen enough times! 🤯

arxiv.org/abs/2410.09361
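
To make the "seen enough times" idea concrete, here is a rough tabular sketch. It is not the paper's algorithm, just an illustration under simplifying assumptions: the function name `count_filtered_policy`, the threshold `n_min`, the precomputed `q_values`, and a deterministic behavior policy stored as a dict are all hypothetical choices for this example. The policy may only switch to an action that appears at least `n_min` times in the dataset for that state; otherwise it keeps the behavior policy's action.

```python
from collections import Counter


def count_filtered_policy(transitions, q_values, behavior_policy, n_min):
    """Illustrative sketch only, not the method from the paper.

    transitions: iterable of (state, action) pairs from the offline dataset.
    q_values: dict mapping (state, action) -> estimated return (assumed given).
    behavior_policy: dict mapping state -> action taken by the behavior policy
        (assumed deterministic here for simplicity).
    n_min: hypothetical count threshold for "seen enough times".
    """
    counts = Counter(transitions)
    policy = {}
    for state, default_action in behavior_policy.items():
        # Only actions observed at least n_min times in this state are allowed.
        allowed = [a for (s, a), n in counts.items() if s == state and n >= n_min]
        # Start from the behavior action and deviate only if an allowed
        # action has a strictly higher estimated value.
        best = default_action
        best_q = q_values.get((state, default_action), float("-inf"))
        for action in allowed:
            q = q_values.get((state, action), float("-inf"))
            if q > best_q:
                best, best_q = action, q
        policy[state] = best
    return policy


# Toy data: in s0, action "b" looks better but was seen only once.
transitions = [("s0", "a")] * 5 + [("s0", "b")] + [("s1", "a")] * 4 + [("s1", "b")] * 3
q_values = {("s0", "a"): 1.0, ("s0", "b"): 5.0, ("s1", "a"): 0.5, ("s1", "b"): 2.0}
behavior_policy = {"s0": "a", "s1": "a"}

policy = count_filtered_policy(transitions, q_values, behavior_policy, n_min=3)
print(policy)
```

In the toy data, the policy stays with the behavior action in `s0` because the tempting alternative has too little support, and it switches in `s1` where the better-looking action has been seen often enough.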
Within batch reinforcement learning, safe policy improvement (SPI) seeks to ensure that the learnt policy performs at least as well as the behavior policy that generated the dataset. The core challeng...
January 23, 2025 at 6:23 PM
Wow this is amazing! Thanks for sharing!
December 9, 2024 at 6:32 PM
The notes are great! Thank you!
November 22, 2024 at 3:23 PM
Would be cool to be included! (I work on RL in healthcare.)
November 22, 2024 at 4:30 AM