Breaking Down and Building Up: Mixture of Skill-Based Vision-and-Language Navigation Agents
arXiv:2508.07642v4 Announce Type: replace-cross
Abstract: Vision-and-Language Navigation (VLN) poses significant challenges for agents, which must interpret natural language instructions and navigate complex 3D environments. While recent progress has been driv…