Shantanu Acharya
shantanuacharya.bsky.social
Researcher at NVIDIA - Working on Long Context LLMs
Reposted by Shantanu Acharya
Star Attention

Star Attention is a new way to make large language models process very long texts much faster while maintaining accuracy.

Author @shantanuacharya.bsky.social is on alphaXiv this week to answer your questions on his paper!
December 2, 2024 at 6:39 PM
🚀 Introducing Star Attention - a novel inference method that combines local and global attention to perform LLM inference over long sequences.

✅ Speeds up inference by up to 11x while preserving 95-100% accuracy
✅ Integrates with any Transformer-based LLM without finetuning
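The local-plus-global split described above can be illustrated with a toy sketch: context blocks first attend only within themselves (so they can be processed independently), and query tokens then attend over the full context. This is a simplified illustration, not the paper's actual algorithm - Star Attention additionally prepends an anchor block to each context block and aggregates distributed softmax statistics, which are omitted here. All function names are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Standard scaled dot-product attention.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v

def two_phase_attention_sketch(context, query, block_size):
    """Toy two-phase scheme inspired by the local/global split.

    Phase 1: each context block attends only to itself (blockwise-local,
             so blocks could run in parallel on separate hosts).
    Phase 2: query tokens attend globally to every context token.
    """
    blocks = [context[i:i + block_size]
              for i in range(0, len(context), block_size)]
    local_out = np.concatenate([attention(b, b, b) for b in blocks])
    query_out = attention(query, context, context)
    return local_out, query_out
```

Because Phase 1 never forms the full context-by-context score matrix, its cost grows with `block_size * len(context)` rather than quadratically in the context length, which is where the speedup over dense self-attention comes from.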

Paper: arxiv.org/abs/2411.17116
Star Attention: Efficient LLM Inference over Long Sequences
Inference with Transformer-based Large Language Models (LLMs) on long sequences is both costly and slow due to the quadratic complexity of the self-attention mechanism. We introduce Star Attention, a ...
arxiv.org
November 27, 2024 at 2:09 AM