Mattie Fellows
mattieml.bsky.social
Reinforcement Learning Postdoc at FLAIR, University of Oxford @universityofoxford.bsky.social

All opinions are my own.
1/2 Offline RL has always bothered me. It promises that by exploiting offline data, an agent can learn to behave near-optimally once deployed. In real life, it breaks this promise: it requires a large number of online samples for tuning and offers no guarantees of behaving safely to achieve desired goals.
May 30, 2025 at 8:39 AM