MAC-Attention: a Match-Amend-Complete Scheme for Fast and Accurate Attention Computation
arXiv:2604.00235v1 Announce Type: cross
Abstract: Long-context decoding in LLMs is IO-bound: each token re-reads an ever-growing KV cache. Prior acceleration methods cut bytes via compression, which lowers fidelity, or via selection/eviction, which restricts wha…