DenseStep2M: A Scalable, Training-Free Pipeline for Dense Instructional Video Annotation
arXiv:2604.26565v1 Announce Type: new
Abstract: Long-term video understanding requires interpreting complex temporal events and reasoning over procedural activities. While instructional video corpora such as HowTo100M offer rich resources for model tra…