Bottlenecked Transformers: Periodic KV Cache Consolidation for Generalised Reasoning
arXiv:2505.16950v4
Abstract: Transformer LLMs have been shown to exhibit strong reasoning ability that scales with inference-time compute, most prominently through token-space “thinking” chains of thought. A growing line o…