From Video to Control: A Survey of Learning Manipulation Interfaces from Temporal Visual Data
arXiv:2604.04974v1 Announce Type: new
Abstract: Video is a scalable observation of physical dynamics: it captures how objects move, how contact unfolds, and how scenes evolve under interaction — all without requiring robot action labels. Yet translat…