cs.CV

SD-MVSum: Script-Driven Multimodal Video Summarization Method and Datasets

arXiv:2510.05652v2 Announce Type: replace
Abstract: We present a method and two large-scale datasets for Script-Driven Multimodal Video Summarization. The proposed method, SD-MVSum, builds on our earlier SD-VSum method for script-driven …