Shreyans
pyparrot.bsky.social
Interpretability, AI ethics, Reinforcement Learning
It talks about the origins of Mechanistic Interpretability as a term, a field, and a community... and some drama :P

paper link: arxiv.org/abs/2410.09087
blogpost link: shreyansjainn.github.io/blog/2026/me...
Mechanistic?
The rise of the term "mechanistic interpretability" has accompanied increasing interest in understanding neural models -- particularly language models. However, this jargon has also led to a fair amou...
January 30, 2026 at 5:13 PM