Huge thanks to the team: Michal Klein, Eleonora Gualdoni, Valentino Maiorca, Arno Blaas, Luca Zappella, Marco Cuturi, & Xavier Suau (who contributed like a 1st author too🥇)!
💻https://github.com/apple/ml-lineas
📄https://arxiv.org/abs/2503.10679
Extra 👏 to Xavi for making this so great! As a friend would say, he's the Rolls-Royce of co-authors, and he should be regarded as a first author too!
🤝 Unifying activation steering w/ OT.
✨ Linear-AcT preserves distributions w/ interpretable ([0, 1]) strength.
💪 Robust: models/layers/modalities
💬 LLMs: toxicity mitigation, truthfulness, and concept induction.
🌄 T2I: style induction and concept negation.
🚀 Negligible cost!
In the image, Stable Diffusion XL is prompted with: “2 tier cake with multicolored stars attached to it and no {white bear, pink elephant, gorilla} can be seen.”
✨Linear-AcT makes the negated concept disappear✨
In this example, we induce a specific style (Art Nouveau 🎨), which we can accurately control with our λ parameter.
And the best result is always obtained at λ=1, unlike with vector-based steering methods!
🍰 All we need is two small sets of sentences {a},{b} from source and target distributions to estimate the Optimal Transport (OT) map 🚚
🚀 We linearize the map for speed/memory, thus ⭐Linear-AcT⭐
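To make the idea concrete, here's a minimal NumPy sketch of estimating a per-dimension linear transport map from sorted source/target activations and applying it with a strength λ ∈ [0, 1]. This is an illustrative toy (function names, the least-squares-on-sorted-samples estimator, and the λ-interpolation are my assumptions), not the paper's exact estimator — see the repo for the real implementation.

```python
import numpy as np

def fit_linear_ot_map(a, b):
    """Fit a per-dimension linear map x -> omega * x + beta approximating
    the 1D OT (monotone) map between activation samples a and b.

    a, b: arrays of shape (n_samples, d), activations collected from the
    source and target sets of sentences.
    """
    # In 1D, the OT map pairs the sorted samples of both distributions.
    a_sorted = np.sort(a, axis=0)
    b_sorted = np.sort(b, axis=0)
    # Least-squares line through the sorted pairs, per dimension.
    a_mean, b_mean = a_sorted.mean(0), b_sorted.mean(0)
    cov = ((a_sorted - a_mean) * (b_sorted - b_mean)).mean(0)
    var = ((a_sorted - a_mean) ** 2).mean(0)
    omega = cov / var
    beta = b_mean - omega * a_mean
    return omega, beta

def transport(a, omega, beta, lam=1.0):
    """Interpolate between identity (lam=0) and full transport (lam=1)."""
    return (1 - lam) * a + lam * (omega * a + beta)
```

With Gaussian toy data, `transport(a, omega, beta, lam=1.0)` moves the source samples onto the target's mean and spread, and `lam=0` leaves them untouched — which is what gives λ its interpretable [0, 1] scale.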
Most activation steering (AS) techniques perform a vector addition such as a* = a + λv, where v is some estimated steering vector and λ the conditioning strength. How v is estimated differs for each method.
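As a hypothetical minimal example of this family, here's vector-addition steering with v estimated as a difference of means between target and source activations (one common choice; other methods estimate v differently, e.g., from probe weights):

```python
import numpy as np

def steering_vector(a_src, a_tgt):
    """Diff-in-means estimate: v = mean(target) - mean(source).

    a_src, a_tgt: arrays of shape (n_samples, d) of activations.
    """
    return a_tgt.mean(axis=0) - a_src.mean(axis=0)

def steer(a, v, lam):
    """Vector-addition steering: a* = a + lam * v."""
    return a + lam * v
```

Note that λ here has no canonical scale — it can grow unboundedly and shift activations off-distribution, in contrast to Linear-AcT's interpretable λ ∈ [0, 1].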
- Pre-prompting
- Fine-tuning
- RLHF
However, these techniques can be slow/expensive! 🐢