cs.AI, cs.CL, cs.LG

Universal Transformers Need Memory: Depth-State Trade-offs in Adaptive Recursive Reasoning

arXiv:2604.21999v1 Announce Type: cross
Abstract: We study learned memory tokens as a computational scratchpad for a single-block Universal Transformer (UT) with Adaptive Computation Time (ACT) on Sudoku-Extreme, a combinatorial reasoning benchmark. We …
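
For readers unfamiliar with the setup named in the abstract, the following is a minimal sketch of ACT-style halting around one shared block with learned memory tokens prepended to the input. It is not the paper's implementation: the function names, the `tanh` stand-in for a Transformer block, the sequence-level (rather than per-position) halting unit, and all shapes are assumptions made for illustration.

```python
import numpy as np


def act_halting_weights(halt_probs, eps=0.01):
    """Turn per-step halting probabilities into ACT mixing weights.

    Steps before the halting step keep their probability; the halting step
    receives the remainder so that the weights sum to 1.
    """
    weights = np.zeros(len(halt_probs))
    cumulative = 0.0
    for n, h in enumerate(halt_probs):
        is_last = n == len(halt_probs) - 1
        if cumulative + h >= 1.0 - eps or is_last:
            weights[n] = 1.0 - cumulative  # remainder goes to the final step
            return weights, n + 1
        weights[n] = h
        cumulative += h
    return weights, 0  # empty input: no steps taken


def run_shared_block(tokens, memory, W, w_halt, max_steps=8, eps=0.01):
    """Apply one shared block recurrently with ACT-style halting.

    Assumptions (hypothetical, for illustration only):
      - tokens: (T, d) input embeddings; memory: (M, d) learned memory tokens
      - the "block" is tanh(state @ W), reused at every step
      - halting is decided once per sequence from the mean state
      - max_steps >= 1
    """
    state = np.concatenate([memory, tokens], axis=0)  # memory acts as scratchpad
    output = np.zeros_like(state)
    cumulative = 0.0
    for step in range(1, max_steps + 1):
        state = np.tanh(state @ W)  # same weights at every depth
        h = 1.0 / (1.0 + np.exp(-(state.mean(axis=0) @ w_halt)))
        if cumulative + h >= 1.0 - eps or step == max_steps:
            output += (1.0 - cumulative) * state  # remainder weight
            return output[len(memory):], step  # drop memory positions
        output += h * state
        cumulative += h
```

In this sketch the memory rows are updated by the shared block at every step alongside the input rows, and only the input positions are returned; the number of steps taken is the adaptive "depth" for that input.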