Training LLMs for Multi-Step Tool Orchestration with Constrained Data Synthesis and Graduated Rewards
arXiv:2603.24709v1 Announce Type: cross
Abstract: Multi-step tool orchestration, where LLMs must invoke multiple dependent APIs in the correct order while propagating intermediate outputs, remains challenging. State-of-the-art models frequently fail o…
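
To make "multiple dependent APIs in the correct order while propagating intermediate outputs" concrete, here is a minimal sketch of such a call chain. It is illustrative only and not the paper's method: the three tools (`find_user`, `list_orders`, `get_order_total`), the `$step.field` reference syntax, and the `run_plan` executor are all hypothetical names invented for this example.

```python
from typing import Any, Callable, Dict, List

# Hypothetical tools standing in for real APIs (assumptions, not from the paper).
def find_user(name: str) -> Dict[str, Any]:
    return {"user_id": 7, "name": name}

def list_orders(user_id: int) -> Dict[str, Any]:
    return {"user_id": user_id, "latest_order_id": 102}

def get_order_total(order_id: int) -> Dict[str, Any]:
    prices = {101: 25.0, 102: 40.0}
    return {"order_id": order_id, "total": prices[order_id]}

TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "find_user": find_user,
    "list_orders": list_orders,
    "get_order_total": get_order_total,
}

def resolve(value: Any, results: Dict[str, Dict[str, Any]]) -> Any:
    """Replace a '$step.field' reference with the output of an earlier step."""
    if isinstance(value, str) and value.startswith("$"):
        step_id, field = value[1:].split(".", 1)
        # Raises KeyError if the referenced step has not run yet.
        return results[step_id][field]
    return value

def run_plan(plan: List[Dict[str, Any]]) -> Dict[str, Dict[str, Any]]:
    """Execute steps in order, feeding earlier outputs into later arguments."""
    results: Dict[str, Dict[str, Any]] = {}
    for step in plan:
        args = {k: resolve(v, results) for k, v in step["args"].items()}
        results[step["id"]] = TOOLS[step["tool"]](**args)
    return results

# Each step depends on a field produced by the step before it.
PLAN = [
    {"id": "s1", "tool": "find_user", "args": {"name": "Ada"}},
    {"id": "s2", "tool": "list_orders", "args": {"user_id": "$s1.user_id"}},
    {"id": "s3", "tool": "get_order_total", "args": {"order_id": "$s2.latest_order_id"}},
]

if __name__ == "__main__":
    print(run_plan(PLAN)["s3"]["total"])
```

In this sketch, running the steps out of order fails with a `KeyError` because a later step references an output that does not exist yet. That is one simple way to see why both call ordering and output propagation have to be right for the chain to succeed.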