Revisiting DAgger in the Era of LLM-Agents
arXiv:2605.12913v1 Announce Type: new
Abstract: Long-horizon LM agents learn from multi-turn interaction, where a single early mistake can alter the subsequent state distribution and derail the whole trajectory. Existing recipes fall short in compleme…