Post Reasoning: Improving the Performance of Non-Thinking Models at No Cost
arXiv:2605.06165v1 Announce Type: new
Abstract: As adoption of Large Language Models (LLMs) accelerates, token consumption from intermediate reasoning traces increasingly contributes to inference latency and operational cost. Recent stu…