Wenwen Si, Insup Lee, Osbert Bastani

Policy-Guided Stepwise Model Routing for Cost-Effective Reasoning

Wenwen Si, Insup Lee, Osbert Bastani / May 8, 2026

arXiv:2605.06116v1 Announce Type: new
Abstract: Inference-time computation has greatly enhanced the performance of large language models (LLMs) on challenging reasoning tasks, but this strategy can incur high inference costs. One solution is to route …
