cs.CL, cs.LG

LongFlow: Efficient KV Cache Compression for Reasoning Models

arXiv:2603.11504v2 Announce Type: replace-cross
Abstract: Recent reasoning models such as OpenAI-o1 and DeepSeek-R1 have shown strong performance on complex tasks including mathematical reasoning and code generation. However, this performance gain com…