cs.CV, cs.RO

SIMPACT: Simulation-Enabled Action Planning using Vision-Language Models

arXiv:2512.05955v2 Announce Type: replace-cross
Abstract: Vision-Language Models (VLMs) exhibit remarkable common-sense and semantic reasoning capabilities. However, they lack a grounded understanding of physical dynamics. This limitation arises from …