cs.AI

Reasoning Fails Where Step Flow Breaks

arXiv:2604.06695v1 Announce Type: new
Abstract: Large reasoning models (LRMs) that generate long chains of thought now perform well on multi-step math, science, and coding tasks. However, their behavior is still unstable and hard to interpret, and exi…