Shahab Bakhtiari
@shahabbakht.bsky.social
6.2K followers 1.1K following 1.2K posts
|| assistant prof at University of Montreal || leading the systems neuroscience and AI lab (SNAIL: https://www.snailab.ca/) 🐌 || associate academic member of Mila (Quebec AI Institute) || #NeuroAI || vision and learning in brains and machines
Pinned
shahabbakht.bsky.social
So excited to see this preprint released from the lab into the wild.

Charlotte has developed a theory of how the learning curriculum influences generalization of learning.
Our theory makes straightforward neural predictions that can be tested in future experiments. (1/4)

🧠🤖 🧠📈 #MLSky
charlottevolk.bsky.social
🚨 New preprint alert!

🧠🤖
We propose a theory of how learning curriculum affects generalization through neural population dimensionality. Learning curriculum is a determining factor of neural dimensionality - where you start from determines where you end up.
🧠📈

A 🧵:

tinyurl.com/yr8tawj3
The curriculum effect in visual learning: the role of readout dimensionality
Generalization of visual perceptual learning (VPL) to unseen conditions varies across tasks. Previous work suggests that training curriculum may be integral to generalization, yet a theoretical explan...
tinyurl.com
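The preprint above frames the curriculum effect in terms of neural population (readout) dimensionality. For readers unfamiliar with the term, here is a minimal numpy sketch of one standard way dimensionality is quantified: the participation ratio of the population covariance spectrum. This is an illustrative example only, not necessarily the measure used in the paper.

```python
import numpy as np

def participation_ratio(activity):
    """Participation ratio of a (trials x neurons) activity matrix.

    PR = (sum_i lambda_i)^2 / sum_i lambda_i^2, where lambda_i are the
    eigenvalues of the neuron-by-neuron covariance matrix. PR ranges from
    1 (activity confined to a single dimension) up to n_neurons (isotropic).
    """
    centered = activity - activity.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (activity.shape[0] - 1)
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0, None)
    return eigvals.sum() ** 2 / (eigvals ** 2).sum()

# Toy example: 200 trials of 50 neurons driven by a 3-dimensional latent.
rng = np.random.default_rng(0)
latents = rng.normal(size=(200, 3))
activity = latents @ rng.normal(size=(3, 50)) + 0.1 * rng.normal(size=(200, 50))
print(participation_ratio(activity))  # close to 3, far below 50
```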
Reposted by Shahab Bakhtiari
sungkim.bsky.social
Diffusion models are not truly serial models

Diffusion models:
- look serial methodologically (step by step),
- but behave less like a truly serial model (autoregression).

They find that the diffusion model solves each problem at the same convergence rate; it will never be a serial model.
Reposted by Shahab Bakhtiari
sungkim.bsky.social
Meta released a paper on Hybrid RL

It offers a promising way to go beyond purely verifiable rewards - combining the reliability of verifier signals with the richness of learned feedback. The results are: +11.7 pts vs RM-only and +9.2 pts vs verifier-only on hard-to-verify reasoning tasks.
shahabbakht.bsky.social
Haha … I can reassure you that wasn’t my intended message at all :)
Reposted by Shahab Bakhtiari
dlevenstein.bsky.social
So I get that a Neuroscientist Couldn’t Understand a Microprocessor, and TBH I’m ok with that. But could a neuroscientist understand a deep RNN? Because that seems like a more pressing issue.

*assuming you think the brain operates through the parallel activity of many connected input/output units
shahabbakht.bsky.social
Most models within the neuroAI framework have been unimodal so far. I think moving towards multimodal models that are scalable and can satisfy some level of behavioural and neural alignment will force us to deal with the "readout" problem in a more serious way.
shahabbakht.bsky.social
Yeah there wasn’t enough space to expand on that.

I see neuroAI as a framework that gives us scalable, behaviorally relevant computational hypotheses. When it comes to readout, we haven’t even mapped out the space of possible mechanisms and algorithms yet.
shahabbakht.bsky.social
What do we talk about when we talk about "readout"?

I argued that our overly specialized, modular approach to studying the brain has given us a simplistic view of readout.

🧠📈
shahabbakht.bsky.social
Good article on AI boom by Noah Smith: open.substack.com/pub/noahpini...

"the great railroad bust did not happen because America built too many railroads. America didn’t build too many railroads! What happened was that America financed its railroads faster than they could capture value"
America's future could hinge on whether AI slightly disappoints
If the economy's single pillar goes down, Trump's presidency will be seen as a disaster.
open.substack.com
shahabbakht.bsky.social
Wait … how do you know that? :)
Reposted by Shahab Bakhtiari
eugenevinitsky.bsky.social
One strategic thing about making this an appealing scientific community is to overshare and boost work from graduate students here
shahabbakht.bsky.social
It's definitely not 50/50 for me. More like 10/90 ;)
shahabbakht.bsky.social
This feels a lot like systems neuro, honestly. You could hear similar advice there, especially from the more experimentally-oriented minds.
shahabbakht.bsky.social
I guess the whole predictive circuit finding approach can be seen as a convergent evolution, which probably doesn’t scale or generalize outside of the experimental setting?
shahabbakht.bsky.social
Having full observation and control over the studied system is definitely the main advantage of MI. But the unintuitive mess of high-d computation is their shared problem, which seems to need more theories than experiments.
shahabbakht.bsky.social
A systems neuroscientist turned mech interp researcher should write a paper on what the field should absolutely avoid, then observe how thoroughly they’ll be ignored :)

Though what I find intriguing in this domain (watching from afar): its much slower rate of progress compared to the rest of AI.
shahabbakht.bsky.social
Regardless of what explainability/mech interp in AI is actually after, and whether or not they know what they’re searching for, we can confidently say they’re pursuing what systems neuroscience has pursued for decades, with very similar puzzles and confusions.
bayesianboy.bsky.social
What problem is explainability/interpretability research trying to solve in ML, and do you have a favorite paper articulating what that problem is?
Reposted by Shahab Bakhtiari
bayesianboy.bsky.social
What problem is explainability/interpretability research trying to solve in ML, and do you have a favorite paper articulating what that problem is?
shahabbakht.bsky.social
I don't see a direct causal path, but pessimistically speaking, when bubbles burst they often leave subconscious biases against the bubbled topic, e.g., in evaluation committees. In other words, the current abundance of AI funding (relative to other fields) might not last.
shahabbakht.bsky.social
What if the bubble collapse also takes down our funding so we can't even afford H100s at half price?! :)
Reposted by Shahab Bakhtiari
drlaschowski.bsky.social
Imagine a brain decoding algorithm that could generalize across different subjects and tasks. Today, we’re one step closer to achieving that vision.

Introducing the flagship paper of our brain decoding program: www.biorxiv.org/content/10.1...
#neuroAI #compneuro @utoronto.ca @uhn.ca
Reposted by Shahab Bakhtiari
sushrutthorat.bsky.social
and the low-D part has been on the horizon for a while now - proceedings.neurips.cc/paper/2019/h... - given complex numbers you can go loooowwww haha (O(1)). Also this is linked to top-down attention: arxiv.org/abs/1907.12309 , arxiv.org/abs/2502.15634 - which is a low-D modulation (O(N) vs O(N^2)).
Superposition of many models into one
proceedings.neurips.cc
shahabbakht.bsky.social
Yeah, it all makes sense in hindsight. I think the low-d structure of weights was actually the rationale behind LoRA when it was proposed.
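To make the low-rank connection concrete, here is a minimal numpy sketch of the idea LoRA builds on: the pretrained weight matrix W is frozen and adapted through a rank-r update B @ A with r much smaller than the layer width, so the fine-tuning update lives in a low-dimensional subspace of weight space. The shapes and rank below are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 1024, 1024, 8                      # rank r << d_out, d_in

W = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)  # frozen pretrained weights
A = 0.01 * rng.normal(size=(r, d_in))               # trainable low-rank factor
B = np.zeros((d_out, r))                            # zero-init, so the adapted
                                                    # layer starts out equal to W

x = rng.normal(size=d_in)
y = W @ x + B @ (A @ x)                             # frozen path + low-rank update

# The update B @ A has d_out*r + r*d_in trainable parameters instead of
# d_out*d_in (here 16,384 vs ~1M), exploiting low-d structure in the weight
# updates needed for adaptation.
print(B.size + A.size, W.size)
```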