Evaluating Small Language Models for Front-Door Routing: A Harmonized Benchmark and Synthetic-Traffic Experiment
arXiv:2604.02367v1 Announce Type: cross
Abstract: Selecting the appropriate model at inference time (the routing problem) requires jointly optimizing output quality, cost, latency, and governance constraints. Existing approaches delegate this deci…