Sai Prasanna
saiprasanna.in
See(k)ing the surreal

Causal World Models for Curious Robots @ University of Tübingen/Max Planck Institute for Intelligent Systems 🇩🇪

#reinforcementlearning #robotics #causality #meditation #vegan
But this is just from the vibes of Tübingen after a 1.5-day visit. I have lived in Freiburg for 3 years
March 27, 2025 at 1:55 PM
Freiburg
March 27, 2025 at 1:55 PM
Tübingen
March 27, 2025 at 1:50 PM
Had a discussion with a fellow not-so-political Indian colleague doing a PhD in computer science in Europe. He is now thinking twice about his plan to go on an exchange at a US lab
March 27, 2025 at 9:29 AM
Curious to know which show
March 2, 2025 at 5:22 PM
One strategy, I guess, is to have a steady stream of good (BS-filtered) and diverse (topics, areas) inputs (books, research papers, what not)

And to not get bogged down by the fact that I am too distracted to go deep into one input stream (book, podcast, article, or paper) at a time
March 1, 2025 at 10:12 PM
Do any of my fellow fox-brained folks (@vgr.bsky.social) have good strategies for aiding background processing? Intuitively, background processing feels like a more foxy thing

@visakanv.com (not sure if you identify as a fox in the fox-hedgehog dichotomy though)
March 1, 2025 at 10:06 PM
I guess the trick would be to take actions that make the mind and emotional state fertile ground for the background processing to happen consistently!
March 1, 2025 at 10:02 PM
The conditioning gap in latent-space world models arises because uncertainty can go either into the latent posterior distribution or into the learnt prior (the dynamics model), and not conditioning on the future puts the uncertainty incorrectly into the dynamics model.
March 1, 2025 at 9:50 PM
On re-thinking, the problems could be orthogonal. Clever Hans pertains to teacher forcing during training leading to easy solutions for a lot of the timesteps, skewing the model away from learning the hard timesteps that matter most at test time.
March 1, 2025 at 9:50 PM
(Shame that argmax.org/blog is down now!! They're a really nice, lesser-known research group at Volkswagen doing important work on world models.)

Anyways, if these two problems are related, just establishing that would make an amazing paper!
March 1, 2025 at 9:29 PM
Conditioning gap: when you train a variational encoder that computes an approximate posterior conditioned only partially (say, on past tokens), the posterior yields a worse lower bound than one conditioned on everything (future tokens too).
March 1, 2025 at 9:29 PM
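To make the gap concrete, here is a toy linear-Gaussian example (mine, not from the thread): the exact posterior over a latent conditioned only on a past observation is strictly wider than one conditioned on past and future observations, and that extra width is uncertainty a learnt prior would have to absorb.

```python
# Toy linear-Gaussian model: latent z ~ N(0, 1) observed twice with noise,
#   x_past = z + eps_1,  x_future = z + eps_2,  eps_i ~ N(0, sigma2).
# The exact posterior over z is Gaussian; its precision is the prior
# precision plus 1/sigma2 per observation we condition on.
sigma2 = 0.5
prior_precision = 1.0
obs_precision = 1.0 / sigma2

# "Filtering" posterior: conditioned only on the past observation.
var_past_only = 1.0 / (prior_precision + obs_precision)

# "Smoothing" posterior: conditioned on past AND future observations.
var_past_and_future = 1.0 / (prior_precision + 2 * obs_precision)

print(f"posterior variance given past only:       {var_past_only:.3f}")
print(f"posterior variance given past and future: {var_past_and_future:.3f}")

# The partially conditioned posterior is strictly wider; the extra
# uncertainty it carries is what the dynamics model would have to absorb.
assert var_past_and_future < var_past_only
```

Here the gap is exact arithmetic; in a learnt latent-variable model the same mismatch shows up as a looser variational bound.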
It reminds me of another problem, and I'm not sure if it's equivalent or if it's some dual problem. It's called the conditioning gap in latent space inference.
March 1, 2025 at 9:29 PM
The fix involves modelling both forward and backward directions. I haven't grokked it fully, but that's where I learnt about the above problem. I find these two papers a really nice sequence: a fundamental problem, then a solution!
March 1, 2025 at 9:29 PM
And there is a new paper that claims to fix this for the transformer architecture!!! They call it the "belief state transformer". Apparently it fixes lots of practical problems arising from the Clever Hans cheat!

arxiv.org/abs/2410.23506
The Belief State Transformer
We introduce the "Belief State Transformer", a next-token predictor that takes both a prefix and suffix as inputs, with a novel objective of predicting both the next token for the prefix and the previ...
arxiv.org
March 1, 2025 at 9:29 PM
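A rough sketch of the objective as I read it from the abstract (my toy code, not the paper's): a forward encoder reads a prefix, a backward encoder reads a suffix, and from the combined state two heads predict the next token after the prefix and the previous token before the suffix. All weights here are random; this only shows the loss structure.

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB, DIM = 16, 8
embed = rng.normal(size=(VOCAB, DIM))
W_fwd, W_bwd = rng.normal(size=(DIM, DIM)), rng.normal(size=(DIM, DIM))
W_next = rng.normal(size=(2 * DIM, VOCAB))  # head: next token after prefix
W_prev = rng.normal(size=(2 * DIM, VOCAB))  # head: previous token before suffix

def encode(tokens, W):
    # Stand-in for a transformer encoder: mean embedding + linear map.
    h = embed[tokens].mean(axis=0) if len(tokens) else np.zeros(DIM)
    return np.tanh(h @ W)

def log_softmax(logits):
    logits = logits - logits.max()
    return logits - np.log(np.exp(logits).sum())

def bst_loss(seq):
    # For each split, combine the forward prefix state and backward suffix
    # state, then pay cross-entropy for both the next and previous targets.
    total, count = 0.0, 0
    for t in range(1, len(seq) - 1):
        prefix, suffix = seq[:t], seq[t + 2:]
        state = np.concatenate([encode(prefix, W_fwd), encode(suffix, W_bwd)])
        total -= log_softmax(state @ W_next)[seq[t]]      # next after prefix
        total -= log_softmax(state @ W_prev)[seq[t + 1]]  # prev before suffix
        count += 2
    return total / count

seq = rng.integers(0, VOCAB, size=10)
print(f"toy BST loss: {bst_loss(seq):.3f}")
```

The point of the joint objective is that the combined state has to carry a belief about the whole sequence, which is exactly the information a purely left-to-right, teacher-forced model is allowed to skip.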