From Observation to Action: Latent Action-based Primitive Segmentation for VLA Pre-training in Industrial Settings
arXiv:2511.21428v2 Announce Type: replace
Abstract: We present a novel unsupervised framework for unlocking vast amounts of unlabeled human demonstration data from continuous industrial video streams for Vision-Language-Action (VLA) model pre-training. Our method fi…