Structured Observation Language for Efficient and Generalizable Vision-Language Navigation
arXiv:2603.27577v1 Announce Type: cross
Abstract: Vision-Language Navigation (VLN) requires an embodied agent to navigate complex environments by following natural language instructions, which typically demands tight fusion of visual and language moda…