Neehar Kondapaneni
@therealpaneni.bsky.social
Researching interpretability and alignment in computer vision.
PhD student @ Vision Lab Caltech
To summarize our experimental results, we propose a new metric that quantifies how well a method isolates model differences. We found that RDX consistently outperforms baseline approaches on this metric, as well as on several other established metrics.
November 19, 2025 at 4:50 PM
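The post doesn't spell the metric out, so here is a minimal hypothetical sketch of what "quantifying how well a method isolates model differences" could look like: score each candidate concept by how much of its energy lies in the subspace where the two models' activations actually differ. The function name, the SVD construction, and the toy data are all illustrative assumptions, not the paper's definition.

```python
import numpy as np

def difference_isolation_score(concepts, acts_a, acts_b, tol=1e-6):
    """Mean fraction of each concept's energy that lies in the subspace
    spanned by the model difference (1.0 = pure difference)."""
    diff = acts_a - acts_b
    _, s, vt = np.linalg.svd(diff, full_matrices=False)
    basis = vt[s > tol * s[0]]            # basis of the changed subspace
    concepts = concepts / np.linalg.norm(concepts, axis=1, keepdims=True)
    return float(((concepts @ basis.T) ** 2).sum(axis=1).mean())

rng = np.random.default_rng(0)
acts_a = rng.normal(size=(512, 64))       # "model A" activations on probe images
acts_b = acts_a.copy()
acts_b[:, 0] += 1.5                       # the two "models" differ only on axis 0
aligned = np.eye(64)[:1]                  # a concept aligned with the change
shared = rng.normal(size=(1, 64))         # a concept capturing shared structure
print(difference_isolation_score(aligned, acts_a, acts_b))  # ~1.0
print(difference_isolation_score(shared, acts_a, acts_b))   # small (~1/64 in expectation)
```

Under this toy scoring, a method that explains shared structure sits near chance, while one that isolates the change scores near 1.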
Beyond controlled experiments, RDX uncovers previously unknown differences. For example, we found DINOv2 has extra structure for distinguishing monkey species, helping explain its improved fine-grained performance.
November 19, 2025 at 4:50 PM
In a case study on models with small performance differences, baselines (like SAEs and NMF) mostly captured shared concepts. RDX, in contrast, localized groups of images that were incorrectly clustered in the weaker model, revealing subtle but important differences.
November 19, 2025 at 4:50 PM
In the game, you’ll see that prior methods often highlight the wrong parts of each model’s representations by explaining shared structure. RDX, by contrast, consistently focuses on the differences between models, making it much easier to interpret what changed during training. 🔍
November 19, 2025 at 4:50 PM
Excited to share that our paper Representational Difference Explanations (RDX) was accepted to #NeurIPS2025! 🎉 RDX is a new method for model diffing, designed to isolate 🔍 representational differences. 1/7
November 19, 2025 at 4:50 PM
Even on a simple MNIST model, it is essentially impossible to anticipate that a weighted sum over these explanations results in this normal-looking five. Linear combinations of explanation grids are tricky to understand!
July 8, 2025 at 3:43 PM
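To make that concrete, here is a rough, self-contained illustration using scikit-learn's 8x8 digits in place of full MNIST (the component count and setup are assumptions, not the post's exact experiment): NMF learns a set of "explanation grids," and a digit is reconstructed as a weighted sum of them; no single grid looks like a five, yet the combination does.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import NMF

digits = load_digits()
X = digits.data                              # (1797, 64) non-negative pixels

nmf = NMF(n_components=16, init="nndsvda", max_iter=500, random_state=0)
W = nmf.fit_transform(X)                     # per-image weights over components
H = nmf.components_                          # 16 "explanation grids" (each 8x8)

idx = np.flatnonzero(digits.target == 5)[0]  # pick a five
recon = W[idx] @ H                           # weighted sum of the grids
print("weights:", np.round(W[idx], 2))
print("reconstruction error:", float(np.linalg.norm(X[idx] - recon)))
# Plot H.reshape(16, 8, 8) next to recon.reshape(8, 8): no individual grid
# resembles a five, but their weighted sum does.
```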
We compare RDX to several popular dictionary-learning (DL) methods (like SAEs and NMF) and find that the DL methods struggle. In the spotted wing (SW) comparison experiment, we find that NMF shows model similarities rather than differences.
July 8, 2025 at 3:43 PM
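A toy illustration of that failure mode (synthetic data, not the paper's experiment): when two models share strong factors and differ only in a weak one, a dictionary fit to each model independently recovers essentially the same components, so the decomposition describes what the models share rather than how they differ.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
shared = rng.random((5, 40))                  # strong shared dictionary
unique = rng.random((1, 40)) * 0.2            # weak factor present only in model B
coefs = rng.random((300, 5))
acts_a = coefs @ shared
acts_b = coefs @ shared + rng.random((300, 1)) @ unique

comp_a = NMF(n_components=5, max_iter=500, random_state=0).fit(acts_a).components_
comp_b = NMF(n_components=5, max_iter=500, random_state=0).fit(acts_b).components_

def cos(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# Best-match similarity between the two dictionaries: close to 1 for every
# component, i.e. the decompositions surface shared structure, and the weak
# model-B-only factor is drowned out.
print([round(max(cos(a, b) for b in comp_b), 2) for a in comp_a])
```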
After demonstrating that RDX works when there are known differences, we compare models with unknown differences. For example, when comparing DINO and DINOv2, we find that DINOv2 has learned a color-based categorization of gibbons that is not present in DINO.
July 8, 2025 at 3:43 PM
We apply RDX to trained models with known differences and show that it isolates the core differences. For example, we compare model representations with and without a “spotted wing” (SW) concept and find that RDX shows that only one model groups birds according to this feature.
July 8, 2025 at 3:43 PM
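Here is a hedged sketch of one simple check in that spirit (illustrative synthetic embeddings, not RDX itself): cluster each model's embeddings of the same bird images and test whether the clusters track a known "spotted wing" attribute. Only the model that encodes the attribute produces clusters aligned with it.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)
sw = rng.integers(0, 2, size=400)                  # ground-truth SW attribute
base = rng.normal(size=(400, 16))                  # shared embedding structure
emb_with_sw = np.hstack([base, 4.0 * sw[:, None]])           # model that encodes SW
emb_without = np.hstack([base, rng.normal(size=(400, 1))])   # model that ignores it

for name, emb in [("with SW", emb_with_sw), ("without SW", emb_without)]:
    clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(emb)
    # Agreement between clusters and the SW attribute: high only for the
    # model whose representation actually carries the concept.
    print(name, adjusted_rand_score(sw, clusters))
```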
We found these unique and important concepts to be fairly complex, requiring deep analysis. We use GPT-4o to analyze the concept collages and find that it gives detailed, clear explanations of the differences between models. More examples here -- nkondapa.github.io/rsvc-page/
April 11, 2025 at 4:11 PM
We then look at “in-the-wild” models. We compare ResNets and ViTs trained on ImageNet. We measure concept importance and concept similarity. Do models learn unique and important concepts? Yes, sometimes they do!
April 11, 2025 at 4:11 PM
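Both measurements lend themselves to simple operationalizations; the helpers below are hypothetical stand-ins, not the RSVC implementation: importance as the accuracy drop after projecting a concept direction out of the features, and similarity as the best cosine match against the other model's concept set.

```python
import numpy as np

def concept_importance(features, labels, predict_fn, concept):
    """Accuracy drop after projecting the concept direction out of the features."""
    c = concept / np.linalg.norm(concept)
    ablated = features - np.outer(features @ c, c)
    base = (predict_fn(features) == labels).mean()
    return base - (predict_fn(ablated) == labels).mean()

def concept_similarity(concept, other_concepts):
    """Best cosine match against the other model's concept set."""
    c = concept / np.linalg.norm(concept)
    o = other_concepts / np.linalg.norm(other_concepts, axis=1, keepdims=True)
    return float(np.max(o @ c))

rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 8))
labels = (feats[:, 0] > 0).astype(int)
predict = lambda f: (f[:, 0] > 0).astype(int)    # toy stand-in for a model head
print(concept_importance(feats, labels, predict, np.eye(8)[0]))   # large drop
print(concept_importance(feats, labels, predict, np.eye(8)[1]))   # ~0.0
print(concept_similarity(np.eye(8)[0], rng.normal(size=(4, 8))))  # chance-level
```

A concept that is both important to one model and dissimilar from everything in the other model is the interesting case this post describes.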
We first show this approach can recover known differences. We train Model 1 to use a pink square to make classification decisions and Model 2 to ignore it. Our method, RSVC, isolates this difference.
April 11, 2025 at 4:11 PM
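For context, the controlled setup could be constructed along these lines (a sketch under assumed details like patch size and color; the post doesn't describe the actual training data): the pink square predicts the label in Model 1's data and appears at random in Model 2's, so only Model 1 has an incentive to rely on it.

```python
import numpy as np

def add_pink_square(img, size=4):
    """Paste a pink patch in the top-left corner of an RGB image."""
    img = img.copy()
    img[:size, :size] = np.array([1.0, 0.4, 0.7])   # assumed "pink" RGB value
    return img

rng = np.random.default_rng(0)
images = rng.random((1000, 32, 32, 3))
labels = rng.integers(0, 2, size=1000)

# Model 1's training set: the square appears iff label == 1 (a usable shortcut).
data_m1 = np.array([add_pink_square(im) if y else im
                    for im, y in zip(images, labels)])
# Model 2's training set: the square appears at random, so it carries no signal.
data_m2 = np.array([add_pink_square(im) if rng.random() < 0.5 else im
                    for im in images])
```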
Have you ever wondered what makes two models different?
We all know that a ViT-Large performs better than a ResNet-50, but what visual concepts drive this difference? Our new ICLR 2025 paper addresses this question! nkondapa.github.io/rsvc-page/
April 11, 2025 at 4:11 PM