cs.AI, cs.LG

CSAttention: Centroid-Scoring Attention for Accelerating LLM Inference

arXiv:2604.08584v1 Announce Type: new
Abstract: Long-context LLMs increasingly rely on long, reusable prefill prompts for agents and domain Q&A, making attention computation and the KV cache the dominant decode-time bottlenecks. While sparse attention …
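The abstract is truncated, so the paper's exact mechanism is not shown here. As an illustration of the general idea a "centroid-scoring" sparse attention might take, the sketch below clusters keys into fixed-size blocks, scores the query against each block's centroid, and runs exact attention only over the top-scoring blocks. Block size, top-k, and the dot-product scoring rule are assumptions, not the paper's method.

```python
import numpy as np

def centroid_sparse_attention(q, K, V, block_size=4, top_k=2):
    """Attend only to KV blocks whose key centroid scores highest against q.

    Illustrative sketch; not the paper's algorithm.
    """
    n, d = K.shape
    n_blocks = n // block_size
    Kb = K[: n_blocks * block_size].reshape(n_blocks, block_size, d)
    centroids = Kb.mean(axis=1)            # one centroid per KV block
    scores = centroids @ q                 # cheap block-level relevance scores
    keep = np.argsort(scores)[-top_k:]     # indices of the top-k blocks
    idx = (keep[:, None] * block_size + np.arange(block_size)).ravel()
    logits = (K[idx] @ q) / np.sqrt(d)     # exact attention over kept keys only
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ V[idx]

rng = np.random.default_rng(0)
q = rng.standard_normal(8)
K = rng.standard_normal((16, 8))
V = rng.standard_normal((16, 8))
out = centroid_sparse_attention(q, K, V)
print(out.shape)  # (8,)
```

The appeal of this family of methods is that block scoring costs O(n/block_size) dot products per query instead of O(n), so most of the KV cache is never read at decode time.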