Three Roles, One Model: Role Orchestration at Inference Time to Close the Performance Gap Between Small and Large Agents
arXiv:2604.11465v2 Announce Type: replace
Abstract: Large language model (LLM) agents show promise on realistic tool-use tasks, but deploying capable agents on modest hardware remains challenging. We study whether inference-time scaffolding alone, wit…