Authors: Warren Johnson, Charles Lee

Evaluating Small Language Models for Front-Door Routing: A Harmonized Benchmark and Synthetic-Traffic Experiment

Warren Johnson, Charles Lee / April 6, 2026

arXiv:2604.02367v1 Announce Type: cross
Abstract: Selecting the appropriate model at inference time — the routing problem — requires jointly optimizing output quality, cost, latency, and governance constraints. Existing approaches delegate this deci…

cs.CL

Prompt Compression in Production Task Orchestration: A Pre-Registered Randomized Trial

Warren Johnson, Charles Lee / March 26, 2026

arXiv:2603.23525v1 Announce Type: new
Abstract: The economics of prompt compression depend not only on reducing input tokens but on how compression changes output length, which is typically priced several times higher. We evaluate this in a pre-regist…