causalwizard.bsky.social
@causalwizard.bsky.social
In addition, by analyzing the divergence of the latent state we found evidence that the resulting plans (paths) are more consistent over time.
October 29, 2025 at 7:35 AM
We found that the HRM-Agent can learn to navigate in dynamic and uncertain maze environments, with doors which open and close randomly.
October 29, 2025 at 7:31 AM
The Hierarchical Reasoning Model (HRM) has impressive reasoning abilities given its small size, but has only been applied to supervised, static, fully-observable problems.

We wanted to see if we could train a HRM to navigate in a maze using only reinforcement learning.
October 29, 2025 at 7:30 AM