Lightnews — Scholar-powered news

BlackboxNLP

@blackboxnlp.bsky.social

Our panel "Evaluating Interpretability Methods: Challenges and Future Directions", moderated by @danaarad.bsky.social, just started! 🎉 Come learn more about the MIB benchmark and hear the takes of @michaelwhanna.bsky.social, Michal Golovanevsky, Nicolò Brunello and Mingyang Wang!

November 9, 2025 at 6:55 AM

BlackboxNLP

@blackboxnlp.bsky.social

Next up: Kentaro Ozeki presenting "Normative Reasoning in Large Language Models: A Comparative Benchmark from Logical and Modal Perspectives" aclanthology.org/2025.blackbo...

November 9, 2025 at 6:32 AM

BlackboxNLP

@blackboxnlp.bsky.social

After a productive poster session, BlackboxNLP returns with the second keynote "Memorization: Myth or Mystery?" by @vernadankers.bsky.social!

November 9, 2025 at 5:48 AM

BlackboxNLP

@blackboxnlp.bsky.social

Nadav Shani is giving the first oral presentation of the day: Language Dominance in Multilingual Large Language Models. Find the paper here: aclanthology.org/2025.blackbo...

November 9, 2025 at 2:19 AM

BlackboxNLP

@blackboxnlp.bsky.social

Next up: Circuit-Tracer: A New Library for Finding Feature Circuits presented by @michaelwhanna.bsky.social! Paper: aclanthology.org/2025.blackbo...

November 9, 2025 at 2:18 AM

BlackboxNLP

@blackboxnlp.bsky.social

Quanshi Zhang is giving the first keynote of the day: Can Neural Network Interpretability Be the Key to Breaking Through Scaling Law Limitations in Deep Learning?

November 9, 2025 at 1:38 AM

BlackboxNLP

@blackboxnlp.bsky.social

BlackboxNLP is up and running! Here are the topics covered by this year's edition at a glance. Excited to see so many interesting topics, and the growing interest in reasoning!

November 9, 2025 at 1:38 AM

BlackboxNLP

@blackboxnlp.bsky.social

📢 Call for Papers! 📢
#BlackboxNLP 2025 invites the submission of archival and non-archival papers on interpreting and explaining NLP models.

📅 Deadlines: Aug 15 (direct submissions), Sept 5 (ARR commitment)
🔗 More details: blackboxnlp.github.io/2025/call/

August 12, 2025 at 7:10 PM

BlackboxNLP

@blackboxnlp.bsky.social

Just 5 days left to submit your method to the MIB Shared Task at #BlackboxNLP!

Have last-minute questions or need help finalizing your submission?
Join the Discord server: discord.gg/n5uwjQcxPR

August 3, 2025 at 6:40 AM

BlackboxNLP

@blackboxnlp.bsky.social

With the new extended deadline, there's still plenty of time to submit your method to the MIB Shared Task!

We welcome submissions of existing methods, experimental POCs, or any approach addressing circuit discovery or causal variable localization 💡

July 30, 2025 at 5:57 AM

BlackboxNLP

@blackboxnlp.bsky.social

Results deadline extended by one week!
Following requests from participants, we’re extending the MIB Shared Task submission deadline by one week.

🗓️ New deadline: August 8, 2025
Submit your method via the MIB leaderboard!

July 29, 2025 at 9:35 AM

BlackboxNLP

@blackboxnlp.bsky.social

📝 Technical report guidelines are out!

If you're submitting to the MIB Shared Task at #BlackboxNLP, feel free to take a look to help you prepare your report: blackboxnlp.github.io/2025/task/

July 28, 2025 at 12:34 PM

BlackboxNLP

@blackboxnlp.bsky.social

Just 10 days to go until the results submission deadline for the MIB Shared Task at #BlackboxNLP!

If you're working on:
🧠 Circuit discovery
🔍 Feature attribution
🧪 Causal variable localization
now’s the time to polish and submit!

Join us on Discord: discord.gg/n5uwjQcxPR

July 23, 2025 at 7:42 AM

BlackboxNLP

@blackboxnlp.bsky.social

⏳ Three weeks left! Submit your work to the MIB Shared Task at #BlackboxNLP, co-located with @emnlpmeeting.bsky.social

Whether you're working on circuit discovery or causal variable localization, this is your chance to benchmark your method in a rigorous setup!

July 13, 2025 at 5:56 AM

BlackboxNLP

@blackboxnlp.bsky.social

Working on feature attribution, circuit discovery, feature alignment, or sparse coding?
Consider submitting your work to the MIB Shared Task, part of this year’s #BlackboxNLP

We welcome submissions of both existing methods and new or experimental POCs!

July 8, 2025 at 9:35 AM

BlackboxNLP

@blackboxnlp.bsky.social

New to mechanistic interpretability?
The MIB shared task is a great opportunity to experiment:
✅ Clean setup
✅ Open baseline code
✅ Standard evaluation

Join the Discord server for ideas and discussions: discord.gg/n5uwjQcxPR

July 7, 2025 at 8:42 AM

BlackboxNLP

@blackboxnlp.bsky.social

🚨 Excited to announce two invited speakers at #BlackboxNLP 2025!

Join us to hear from two leading voices in interpretability:
🎙️ Quanshi Zhang (Shanghai Jiao Tong University)
🎙️ Verna Dankers (McGill University)

@vernadankers.bsky.social

July 4, 2025 at 8:14 AM

BlackboxNLP

@blackboxnlp.bsky.social

A typical pipeline:
• Build contrastive input pairs differing only in the target variable.
• (If supervised) train the featurizer on these pairs.
• To evaluate: Transform activation, intervene in feature space, transform back out, and check if behavior shifts as expected.
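
The pipeline above can be sketched on a toy example. This is a minimal illustration, not the MIB baseline code: the 2-D "activations", the rotation used as a featurizer, and the names `encode`, `readout`, and `interchange` are all hypothetical. It assumes the featurizer is an orthogonal map, so its inverse is its transpose.

```python
import math

def rotation(theta):
    """2x2 rotation matrix, used here as a toy invertible featurizer."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

# Hypothetical toy model: the target variable x and a nuisance value z
# are stored in a rotated basis of the hidden activation.
R = rotation(0.7)

def encode(x, z):
    """Hidden activation for an input with target variable x and nuisance z."""
    return matvec(transpose(R), [x, z])

def readout(act):
    """Toy behavior: a linear readout along the direction that carries x."""
    return round(sum(w * a for w, a in zip(R[0], act)))

def interchange(base_act, source_act, featurizer, feature_idx):
    """Transform into feature space, swap one feature, transform back."""
    inverse = transpose(featurizer)  # valid because a rotation is orthogonal
    base_feats = matvec(featurizer, base_act)
    source_feats = matvec(featurizer, source_act)
    base_feats[feature_idx] = source_feats[feature_idx]
    return matvec(inverse, base_feats)

# Contrastive pair: the inputs differ only in the target variable x.
base_act = encode(x=0, z=0.3)
source_act = encode(x=1, z=0.3)

patched = interchange(base_act, source_act, R, feature_idx=0)
control = interchange(base_act, source_act, R, feature_idx=1)
```

If the featurizer isolates the variable, patching feature 0 should flip the readout to the source's value, while patching the other feature should leave the base behavior unchanged.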

July 1, 2025 at 4:49 PM

BlackboxNLP

@blackboxnlp.bsky.social

One month to go! ⏰
Working on featurization methods - ways to transform LM activations to better isolate causal variables?
Submit your work to the Causal Variable Localization Track of the MIB Shared Task!

July 1, 2025 at 4:49 PM

BlackboxNLP

@blackboxnlp.bsky.social

Working on the MIB shared task?
Join the Discord server: discord.gg/n5uwjQcxPR

🔍 Check out submission ideas
🔍 Brainstorm possible directions
🔍 Ask questions and get help with setup issues

Full task description: blackboxnlp.github.io/2025/task/

June 30, 2025 at 8:32 AM

BlackboxNLP

@blackboxnlp.bsky.social

The Circuit Localization Track benchmarks methods for discovering causal circuits: subgraphs of a model that are responsible for a specific behavior.

These methods typically:
• Score model components or edges
• Ablate all but the top-ranked ones
• Evaluate the performance of the resulting subgraph
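
The three steps above can be sketched on a toy additive "model". This is only an illustration under stated assumptions: the component names, the zero-ablation choice, the leave-one-out scoring, and the output-ratio measure are hypothetical stand-ins, not the scoring methods or metrics used by MIB.

```python
def run_model(x, weights, keep=None):
    """Toy model: the output is the sum of per-component contributions.

    Components not listed in `keep` are zero-ablated (an assumption;
    other ablation schemes exist).
    """
    total = 0.0
    for name, w in weights.items():
        if keep is None or name in keep:
            total += w * x
    return total

def score_components(x, weights):
    """Step 1: score each component by the output change when it alone is ablated."""
    full = run_model(x, weights)
    scores = {}
    for name in weights:
        others = set(weights) - {name}
        scores[name] = abs(full - run_model(x, weights, keep=others))
    return scores

def top_k_circuit(scores, k):
    """Step 2: keep the k top-ranked components; everything else is ablated."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])

def output_ratio(x, weights, circuit):
    """Step 3: compare the circuit's output with the full model's output."""
    return run_model(x, weights, keep=circuit) / run_model(x, weights)

# Hypothetical component weights: two components matter, two barely do.
WEIGHTS = {"head_0": 2.0, "head_1": 0.05, "mlp_0": 1.5, "mlp_1": -0.05}

scores = score_components(1.0, WEIGHTS)
circuit = top_k_circuit(scores, k=2)
ratio = output_ratio(1.0, WEIGHTS, circuit)
```

Here the two large components form the circuit, and a ratio close to 1 means the subgraph reproduces the full model's output on this input.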

June 24, 2025 at 2:24 PM

BlackboxNLP

@blackboxnlp.bsky.social

Working on circuit discovery in LMs?
Consider submitting your work to the MIB Shared Task, part of #BlackboxNLP at @emnlpmeeting.bsky.social 2025!

The goal: benchmark existing MI methods and identify promising directions to precisely and concisely recover causal pathways in LMs >>

June 24, 2025 at 2:24 PM

BlackboxNLP

@blackboxnlp.bsky.social

The task builds on the new Mechanistic Interpretability Benchmark (MIB) by Mueller* & Geiger* et al. (2025), with two tracks:
* Circuit Localization – identify subgraphs that carry out specific computations
* Causal Variable Localization – align internal representations with known causal factors

June 23, 2025 at 2:46 PM

BlackboxNLP

@blackboxnlp.bsky.social

Have you heard about this year's shared task? 📢

Mechanistic Interpretability (MI) is quickly advancing, but comparing methods remains a challenge. This year at #BlackboxNLP, we're introducing a shared task to rigorously evaluate MI methods in language models 🧵

June 23, 2025 at 2:46 PM

BlackboxNLP

@blackboxnlp.bsky.social

BlackboxNLP, the leading workshop on interpretability and analysis of language models, will be co-located with EMNLP 2025 in Suzhou this November! 📆

This edition will feature a new shared task on circuits/causal variable localization in LMs, details here: blackboxnlp.github.io/2025/task

May 15, 2025 at 8:21 AM
