Research in information retrieval and conversational search: towards language models that know what they know 🧠
Homepage : victormorand.github.io
Building upon these findings, we've managed to externalize this internal mechanism, creating a general-purpose mention detector with promising results. Stay tuned! 🔜
Our method enables reconstruction of entity mentions from any representation within LLMs, allowing us to ask: “What entity is the model thinking about right now?”
💡 When reading ‘the City of Lights’ iconic monument’, the model internally “thinks” of Paris and the Eiffel Tower!
By successfully learning “Task Vectors” that steer the model to reconstruct the mention, we uncover new evidence that LLMs form dedicated internal circuits to represent and manipulate multi-token entities.
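The steering idea can be sketched in a few lines: a learned vector is added to the hidden states at one layer, nudging the model toward the mention-reconstruction behavior. This is a minimal toy illustration in PyTorch, not the paper's actual setup; the model, layer choice, and vector here are all placeholders.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-in for a stack of transformer layers (a real LLM in the paper).
hidden_dim = 16
model = nn.Sequential(*[nn.Linear(hidden_dim, hidden_dim) for _ in range(4)])

# Illustrative "task vector": in the paper it is learned so that adding it
# to the residual stream steers the model to reconstruct the entity mention.
task_vector = torch.randn(hidden_dim)

def add_task_vector(module, inputs, output):
    # Returning a value from a forward hook replaces the layer's output.
    return output + task_vector

# Inject the vector at a middle layer.
handle = model[2].register_forward_hook(add_task_vector)
x = torch.randn(1, hidden_dim)
steered = model(x)
handle.remove()
unsteered = model(x)

# The steering vector changes the downstream computation.
print(torch.allclose(steered, unsteered))
```

The forward-hook pattern is the standard way to intervene on intermediate activations without modifying model code.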
We show that common multi-token mentions (e.g. "Eiffel Tower") can be recovered from the middle-layer hidden state of their last token alone!
Uncommon mentions aren't fully encoded this way; they are instead retrieved from the context when needed.
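Reading out that single state is mechanically simple: capture the middle-layer activation at the last token position with a hook. Below is a hedged toy sketch in PyTorch; the layer stack, dimensions, and layer index are illustrative, and the actual mention decoding in the paper happens on top of this captured vector.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

hidden_dim = 16
layers = nn.ModuleList([nn.Linear(hidden_dim, hidden_dim) for _ in range(6)])

captured = {}

def capture_last_token(module, inputs, output):
    # Keep only the final position's vector: per the claim above, this single
    # middle-layer state suffices to recover a common multi-token mention.
    captured["state"] = output[:, -1, :].detach()

middle = len(layers) // 2
handle = layers[middle].register_forward_hook(capture_last_token)

x = torch.randn(1, 5, hidden_dim)  # batch of 1, a 5-token "mention" context
h = x
for layer in layers:
    h = layer(h)
handle.remove()

print(captured["state"].shape)  # one vector per batch element
```

With a real LLM, the equivalent is requesting `output_hidden_states=True` and indexing the chosen layer at the last token.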