cs.AI

Policy-Guided Stepwise Model Routing for Cost-Effective Reasoning

arXiv:2605.06116v1 Announce Type: new
Abstract: Inference-time computation has greatly enhanced the performance of large language models (LLMs) on challenging reasoning tasks, but the added computation can incur high inference costs. One solution is to route …