Star Attention is a new way to make large language models process very long texts much faster while maintaining accuracy.
Author @shantanuacharya.bsky.social is on alphaXiv this week to answer your questions on his paper!
✅ Speeds up inference by up to 11x while preserving 95-100% accuracy
✅ Integrates with any LLM without any fine-tuning
Paper: arxiv.org/abs/2411.17116