V-CAGE: Vision-Closed-Loop Agentic Generation Engine for Robotic Manipulation
arXiv:2604.09036v1
Abstract: Scaling Vision-Language-Action (VLA) models requires massive datasets that are both semantically coherent and physically feasible. However, existing scene generation methods often lack context-awareness,…