danielsc4.it
As M scales, EAGer consistently:
🚀 Achieves HIGHER Pass@k,
✂️ Uses FEWER tokens than baseline,
🕺 Shifts the Pareto frontier favorably across all tasks.
🧵5/
As M scales, EAGer consistently:
🚀 Achieves HIGHER Pass@k,
✂️ Uses FEWER tokens than baseline,
🕺 Shifts the Pareto frontier favorably across all tasks.
🧵5/
Full EAGer uses labels to catch failing prompts, lowering threshold to branch or add sequences. Great for verifiable pipelines!
🧵4/
Full EAGer uses labels to catch failing prompts, lowering threshold to branch or add sequences. Great for verifiable pipelines!
🧵4/
We cap at M sequences/prompt, saving budget on easy ones without regen. Training-free!
🧵3/
We cap at M sequences/prompt, saving budget on easy ones without regen. Training-free!
🧵3/
Meet EAGer: We show that monitoring token-level uncertainty lets LLMs allocate compute dynamically - spending MORE on hard problems, LESS on easy ones.
🧵👇
Meet EAGer: We show that monitoring token-level uncertainty lets LLMs allocate compute dynamically - spending MORE on hard problems, LESS on easy ones.
🧵👇
Following SpARE (@yuzhaouoe.bsky.social @alessiodevoto.bsky.social), we propose ✨ contrastive SAE steering ✨ with mutual info to personalize literary MT by tuning latent features 4/
Following SpARE (@yuzhaouoe.bsky.social @alessiodevoto.bsky.social), we propose ✨ contrastive SAE steering ✨ with mutual info to personalize literary MT by tuning latent features 4/
✓ Classifiers can find styles with high acc. (humans kinda don’t)
✓ Multi-shot prompting boosts style a lot
✓ We can detect strong style traces in activations (esp. mid layers) 3/
✓ Classifiers can find styles with high acc. (humans kinda don’t)
✓ Multi-shot prompting boosts style a lot
✓ We can detect strong style traces in activations (esp. mid layers) 3/
We steer LLM generations to mimic human translator styles on literary novels in 7 languages. 📚
SAE steering can beat few-shot prompting, leading to better personalization while maintaining quality.
🧵1/
We steer LLM generations to mimic human translator styles on literary novels in 7 languages. 📚
SAE steering can beat few-shot prompting, leading to better personalization while maintaining quality.
🧵1/