Alessandro Stolfo
alestolfo.bsky.social
PhD @ ETHZ - LLM Interpretability
alestolfo.github.io
Our paper "Improving Instruction-Following in Language Models through Activation Steering" has been accepted to #ICLR2025!

We're also excited to share that our public GitHub repo is now live.
Code: github.com/microsoft/ll...
Camera-ready: arxiv.org/abs/2410.12877
April 15, 2025 at 4:35 PM