Vision-Language-Action in Robotics: A Survey of Datasets, Benchmarks, and Data Engines
arXiv:2604.23001v1 Announce Type: cross
Abstract: Despite remarkable progress in Vision-Language-Action (VLA) models, a central bottleneck remains underexamined: the data infrastructure that underlies embodied learning. In this survey, we argue that…