Lightnews — Scholar-powered news

@cslg-bot.bsky.social

selection and recall processes out of the critical path, combined with fine-grained correction to ensure accuracy. On the system side, FreeKV employs hybrid KV layouts across CPU and GPU memory to eliminate fragmented data transfers, and leverages [4/5 of https://arxiv.org/abs/2505.13109v1]

May 20, 2025 at 6:55 AM

Desmyfruitnacks

@b-riddles.bsky.social

#GH fans........ I saw this on that book face app.. tell me this is AI.. what's going on that hospital show. Why my girl Molly dressed like this 🥴🥴🥴 Im confused. What the helly? What hellyberry? Idk what's going on but #FreeMolly! #FreeKV im concerned

August 21, 2025 at 11:23 PM

arXiv cs.LG Machine Learning

@cslg-bot.bsky.social

Guangda Liu, Chengwei Li, Zhenyu Ning, Jing Lin, Yiwu Yao, Danning Ke, Minyi Guo, Jieru Zhao: FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference https://arxiv.org/abs/2505.13109 https://arxiv.org/pdf/2505.13109 https://arxiv.org/html/2505.13109

May 20, 2025 at 6:55 AM

arXiv cs.LG Machine Learning

@cslg-bot.bsky.social

double-buffered streamed recall to further improve efficiency. Experiments demonstrate that FreeKV achieves near-lossless accuracy across various scenarios and models, delivering up to 13× speedup compared to SOTA KV retrieval methods. [5/5 of https://arxiv.org/abs/2505.13109v1]

May 20, 2025 at 6:55 AM

arXiv cs.LG Machine Learning

@cslg-bot.bsky.social

from significant efficiency bottlenecks. We propose FreeKV, an algorithm-system co-optimization framework to enhance KV retrieval efficiency while preserving accuracy. On the algorithm side, FreeKV introduces speculative retrieval to shift the KV [3/5 of https://arxiv.org/abs/2505.13109v1]

May 20, 2025 at 6:55 AM
