cs.RO

VLAs are Confined yet Capable of Generalizing to Novel Instructions

arXiv:2505.03500v5 Announce Type: replace
Abstract: Vision-language-action models (VLAs) often achieve high performance on demonstrated tasks but struggle significantly when required to extrapolate, combining skills learned from different tasks in nov…