cs.AI, cs.CV

Aerial Vision-Language Navigation with a Unified Framework for Spatial, Temporal and Embodied Reasoning

arXiv:2512.08639v3 Announce Type: replace-cross
Abstract: Aerial Vision-and-Language Navigation (VLN) aims to enable unmanned aerial vehicles (UAVs) to interpret natural language instructions and navigate complex urban environments using onboard visua…