Scott Jeen
enjeeneer.io
PhD Student at Cambridge University. AI and reinforcement learning.
It's dedicated to the late Barry Sealey CBE and Helen Sealey whose funding of my earlier postgraduate studies opened the door to a PhD. I'm hugely indebted to them for their kindness and generosity.
September 3, 2025 at 9:01 PM
My PhD thesis--On Zero-Shot Reinforcement Learning--is now on arXiv.
September 3, 2025 at 9:01 PM
We explored different sequence models: Transformers, GRUs, LSTMs, S4d, S5.

To our surprise, we found GRUs to be far and away the most effective, and Transformers to be disappointingly ineffective.

Why? The combined F^T x B representation seems unstable for all non-GRU methods.
July 31, 2025 at 9:01 PM
We run experiments on amended ExORL environments with different types of partial observability. In particular, we explore partially observed states, and partially observed changes in dynamics.

In aggregate, we improve performance across all partially observed settings.
July 31, 2025 at 9:01 PM
We solve both failure modes by replacing BFMs' standard MLPs with sequence models that condition on trajectories of observations and actions.

We call the resultant family of methods: Behaviour Foundation Models with Memory.
July 31, 2025 at 9:01 PM
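To make the idea in the post above concrete, here is a minimal sketch of a GRU that folds a trajectory of observations and actions into a single memory vector, which could then stand in for the state a model would otherwise receive. Everything here is an assumption for illustration: the names `GRUCell` and `encode_history`, the random untrained weights, and the omission of bias terms are not taken from the paper.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng, scale=0.5):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Hypothetical bias-free GRU cell with random (untrained) weights."""

    def __init__(self, input_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        cat = input_dim + hidden_dim
        self.hidden_dim = hidden_dim
        self.W_z = random_matrix(hidden_dim, cat, rng)  # update gate
        self.W_r = random_matrix(hidden_dim, cat, rng)  # reset gate
        self.W_h = random_matrix(hidden_dim, cat, rng)  # candidate state

    def step(self, x, h):
        xh = x + h  # concatenate input and previous hidden state
        z = [sigmoid(v) for v in matvec(self.W_z, xh)]
        r = [sigmoid(v) for v in matvec(self.W_r, xh)]
        x_rh = x + [ri * hi for ri, hi in zip(r, h)]
        h_tilde = [math.tanh(v) for v in matvec(self.W_h, x_rh)]
        # Blend the old memory with the candidate memory.
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, h_tilde)]


def encode_history(cell, observations, actions):
    # Summarise the (observation, action) trajectory as one vector.
    h = [0.0] * cell.hidden_dim
    for o, a in zip(observations, actions):
        h = cell.step(list(o) + list(a), h)
    return h


cell = GRUCell(input_dim=2, hidden_dim=4)
# Two histories that end in the same observation and action.
h_a = encode_history(cell, [[0.0], [1.0]], [[1.0], [1.0]])
h_b = encode_history(cell, [[1.0], [1.0]], [[1.0], [1.0]])
```

The two histories end identically, yet `h_a` and `h_b` differ, because the memory carries information from earlier steps that the final observation alone does not reveal.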
When Behaviour Foundation Models are fed unreliable observations, rather than states, they fail in two predictable ways.

We call these failure modes *state* misidentification, and *task* misidentification.

Each inhibits performance in isolation; together they kill the model.
July 31, 2025 at 9:01 PM
BFMs are amazing.

Train them on expressive (s,a,s′) data and you'll get the optimal policy for *any* reward function in an env.

But what if, instead of states, you have observations, as is almost always the case in practice?

Excited to share our new @rl-conference.bsky.social paper! 🧵
July 31, 2025 at 9:01 PM
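As a rough illustration of the "any reward function" claim in the post above, here is a toy sketch in the forward-backward style, where a task vector is estimated as the average of r(s) · B(s) over reward-labelled states, and actions are chosen greedily with respect to F(s, a, z)ᵀz. The five-state chain, the hand-written `F` and `B`, and the function names are all assumptions made for illustration; a real BFM learns F and B from data, and this `F` is only a one-step lookahead stand-in that ignores `z`.

```python
STATES = [0, 1, 2, 3, 4]  # hypothetical 1-D chain environment
ACTIONS = [-1, 1]         # step left or step right


def B(s):
    # Toy backward embedding: one-hot encoding of the state.
    return [1.0 if i == s else 0.0 for i in STATES]


def F(s, a, z):
    # Toy forward embedding: one-hot of the next state (ignores z).
    nxt = min(max(s + a, 0), 4)
    return [1.0 if i == nxt else 0.0 for i in STATES]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def infer_task_vector(B, states, rewards):
    # z is the sample mean of r(s) * B(s).
    d = len(B(states[0]))
    z = [0.0] * d
    for s, r in zip(states, rewards):
        b = B(s)
        for i in range(d):
            z[i] += r * b[i]
    return [zi / len(states) for zi in z]


def greedy_action(F, state, actions, z):
    # Pick the action maximising Q(s, a, z) = F(s, a, z) . z
    return max(actions, key=lambda a: dot(F(state, a, z), z))


# Two different reward functions, same F and B, no retraining.
z_right = infer_task_vector(B, STATES, [1.0 if s == 4 else 0.0 for s in STATES])
z_left = infer_task_vector(B, STATES, [1.0 if s == 0 else 0.0 for s in STATES])

print(greedy_action(F, 3, ACTIONS, z_right))  # 1
print(greedy_action(F, 1, ACTIONS, z_left))   # -1
```

Only the task vector changes between the two rewards; the representations stay fixed, which is the sense in which the policy is obtained zero-shot.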
I’m in Whistler/Vancouver for #NeurIPS2024, and I’ll be around all week to chat RL. Swing by our poster on Friday, or hit me up on here and we can find time for a coffee!

Poster #6008
West Ballroom A-D
Friday 13th Dec 4:30-7:30pm

More details: neurips.cc/virtual/2024...
December 9, 2024 at 2:17 PM