Lightnews — Scholar-powered news

Piyush Mishra

@peeyoushh.bsky.social

32 followers 82 following 12 posts

PhD student, percussionist
piyushmishra12.github.io


Piyush Mishra

@peeyoushh.bsky.social

Congratulations Alice 🎉

October 11, 2025 at 3:53 PM

Piyush Mishra

@peeyoushh.bsky.social

So the Bayesian approach is great for the actual smoothing, but transformers are remarkable for pruning the hypothesis-set. Can we hybridise to use the best of both worlds? Stay tuned :)

ALT: a woman stands in front of a large sign that says hannah montana


December 23, 2024 at 4:08 PM

Piyush Mishra

@peeyoushh.bsky.social

We thus see the emergence of two regimes, one where we have a lower no. of hypotheses (where the Bayesian approach is unmatched) and another with a higher no. of hypotheses (where transformers take the lead).

December 23, 2024 at 4:08 PM

Piyush Mishra

@peeyoushh.bsky.social

While the transformer is heavier for lower lookback, the compute of the Bayesian method increases super-exponentially with increasing lookback! This is a perfect illustration of our combinatorial challenge of tracking and how transformers could help in resolving it.

December 23, 2024 at 4:08 PM

Piyush Mishra

@peeyoushh.bsky.social

Not only is the transformer suboptimal, it remains suboptimal even when the Bayesian method is optimal (hint: AI alignment problem). Increasing the amount of data starts decreasing the accuracy!

December 23, 2024 at 4:08 PM

Piyush Mishra

@peeyoushh.bsky.social

But what if we had a world where this was possible (i.e., short sequences of 8 time steps, hence fewer hypotheses)? No matter how much we train the transformer, it never matches the optimal performance!

December 23, 2024 at 4:08 PM

Piyush Mishra

@peeyoushh.bsky.social

This suggests that increasing the past information of sequences leads to better robustness for both strategies. So if the Bayesian approach can access all the past information of the sequence, it should be optimal! But doing that is intractable for realistic scenarios!

December 23, 2024 at 4:08 PM

Piyush Mishra

@peeyoushh.bsky.social

Transformers are robust when dealing with large amounts of information. On increasing noise (for 2 particles undergoing Brownian motion for 150 timesteps) we see a prolongation in the breakpoint of accuracy in all cases. An increase in sequence lookback shows further prolongation!

December 23, 2024 at 4:08 PM

Piyush Mishra

@peeyoushh.bsky.social

The Bayesian multiple hypothesis tracking approach is the theoretically optimal solution, but it can only handle a certain number of hypotheses before it becomes intractable. We look at where the switch happens and what we can do about it.

December 23, 2024 at 4:08 PM
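
A rough sketch of why the hypothesis set gets out of hand, not taken from the thread itself. It assumes a toy setting where every particle is detected exactly once per frame, so each frame-to-frame transition admits n! possible assignments and the joint count over a lookback window is their product. The function name and parameters are hypothetical, and the count ignores missed detections, false alarms, and track births or deaths, all of which would make the real hypothesis set larger.

```python
from math import factorial


def count_association_hypotheses(n_particles: int, lookback: int) -> int:
    """Count joint association hypotheses over `lookback` frame transitions.

    Toy assumption (not from the thread): each particle is detected exactly
    once per frame, so every transition has n_particles! possible assignments.
    """
    if n_particles < 1 or lookback < 0:
        raise ValueError("need n_particles >= 1 and lookback >= 0")
    return factorial(n_particles) ** lookback


if __name__ == "__main__":
    # Two particles, as in the Brownian-motion experiment described above.
    for lookback in (1, 2, 4, 8, 16):
        print(lookback, count_association_hypotheses(2, lookback))
```

Even with only 2 particles the count doubles with every extra step of lookback, and adding particles multiplies the base factorially, which is the combinatorial pressure the exact Bayesian method runs into.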

Piyush Mishra

@peeyoushh.bsky.social

We know that transformers work well. But should we just replace all our previous techniques with transformers and call it a day? (Spoiler: no)

December 23, 2024 at 4:08 PM
