cs.CV, cs.RO

HumanNet: Scaling Human-centric Video Learning to One Million Hours

arXiv:2605.06747v1 Announce Type: cross
Abstract: Progress in embodied intelligence increasingly depends on scalable data infrastructure. While vision and language have scaled with internet corpora, learning physical interaction remains constrained by…