Nishant Subramani @ ACL
nsubramani23.bsky.social
PhD student @CMU LTI - working on model #interpretability, student researcher @google; prev predoc @ai2; intern @MSFT
nishantsubramani.github.io
Congrats 🥳🥳🥳🥳
June 13, 2025 at 7:08 PM
Come to our poster in Albuquerque on Thursday, 2-3:30pm, in the Interpretability & Analysis section!

Paper: aclanthology.org/2025.naacl-l...
Code (coming soon): github.com/microsoft/mi...

🧵/🧵
April 29, 2025 at 1:41 PM
MICE 🐭:
🎯 - significantly beats baselines on expected tool-calling utility, especially in high-risk scenarios
✅ - matches expected calibration error of baselines
✅ - is sample efficient
✅ - generalizes zero-shot to unseen tools

5/🧵
April 29, 2025 at 1:41 PM
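For reference, expected calibration error is the standard binned metric: group predictions by confidence, then average the gap between each bin's accuracy and its mean confidence, weighted by bin size. A minimal sketch (the binning scheme here is the common equal-width version, not anything specific to this paper's evaluation):

```python
def ece(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # clamp c == 1.0 into last bin
        bins[idx].append((c, ok))
    n = len(confidences)
    err = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        accuracy = sum(ok for _, ok in b) / len(b)
        err += (len(b) / n) * abs(accuracy - avg_conf)
    return err
```

A perfectly calibrated predictor (confidence matches accuracy in every bin) scores 0; an always-confident, always-wrong one scores 1.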
Calibration is not sufficient: both an oracle and a model that just predicts the base rate are perfectly calibrated🤦🏽‍♂️

We develop a new metric, expected tool-calling utility 🛠️, to measure the utility of deciding whether or not to execute a tool call via a confidence score!

4/🧵
April 29, 2025 at 1:41 PM
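One way to picture such a utility metric (the exact definition, threshold, and reward/penalty values below are illustrative assumptions, not the paper's): execute a tool call only when confidence clears a threshold, score +reward for a correct executed call, -penalty for an incorrect one, and 0 for abstaining, then average over examples.

```python
def expected_utility(confidences, correct, threshold=0.5,
                     reward=1.0, penalty=5.0):
    """Average utility of thresholded tool execution (hypothetical values).

    Execute only when confidence >= threshold: +reward if the call was
    correct, -penalty if not; abstaining contributes 0.
    """
    total = 0.0
    for c, ok in zip(confidences, correct):
        if c >= threshold:
            total += reward if ok else -penalty
    return total / len(confidences)
```

This also shows why calibration alone is not enough: a base-rate predictor assigns every call the same confidence, so no threshold can separate good calls from bad ones, and its utility collapses even though its calibration is perfect.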
We propose 🐭 MICE to better assess confidence when calling tools:

1️⃣ decode from each intermediate layer of an LM
2️⃣ compute similarity scores between each layer's generation and the final output
3️⃣ train a probabilistic classifier on these features

3/🧵
April 29, 2025 at 1:41 PM
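The three steps above can be sketched in toy form. Everything here is an assumption for illustration, not the paper's actual implementation: the logit-lens-style layer decoding, the cosine similarity over softmaxed distributions, and the classifier weights are all made up.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, d_model, vocab = 6, 16, 50            # toy dimensions (assumed)
W_U = rng.normal(size=(d_model, vocab))          # toy unembedding matrix

# Step 1: decode from each intermediate layer (logit-lens style):
# project each layer's hidden state through the unembedding into token logits.
hidden = rng.normal(size=(n_layers, d_model))    # one hidden state per layer
layer_logits = hidden @ W_U                      # (n_layers, vocab)

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Step 2: similarity between each layer's decoded distribution and the
# final layer's output (cosine similarity here, as a stand-in).
probs = softmax(layer_logits)
final = probs[-1]
features = probs @ final / (np.linalg.norm(probs, axis=1)
                            * np.linalg.norm(final))

# Step 3: a probabilistic classifier (a tiny logistic model with made-up
# weights) maps the per-layer similarity features to a confidence score.
w, b = rng.normal(size=n_layers), 0.0
confidence = 1.0 / (1.0 + np.exp(-(features @ w + b)))
```

In a real setting the classifier would be trained on labeled tool-call outcomes; here the weights are random just to show the shape of the pipeline.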
1️⃣ Tool-using agents need to be useful and safe as they take actions in the world
2️⃣ Language models are poorly calibrated

🤔 Can we use model internals to better calibrate language models to make tool-using agents safer and more useful?

2/🧵
April 29, 2025 at 1:41 PM
Congrats!!
April 24, 2025 at 4:30 AM
Congrats! 🥳
March 27, 2025 at 3:10 AM
👍🏽 looks good to me!
December 14, 2024 at 1:27 AM
🙋🏽
November 21, 2024 at 2:25 PM
🙋🏽
November 19, 2024 at 2:45 PM