cs.RO

Learning Structured Robot Policies from Vision-Language Models via Synthetic Neuro-Symbolic Supervision

arXiv:2604.02812v1 Announce Type: new
Abstract: Vision-language models (VLMs) have recently demonstrated strong capabilities in mapping multimodal observations to robot behaviors. However, most current approaches rely on end-to-end visuomotor policies…