ThinkBrake: Efficient Reasoning via Log-Probability Margin Guided Decoding
arXiv:2510.00546v5 Announce Type: replace
Abstract: Large Reasoning Models (LRMs) allocate substantial inference-time compute to Chain-of-Thought (CoT) reasoning, improving performance on mathematics, scientific QA, and tool usage. However, this intro…