cs.AI, cs.CV, cs.LG

LVSum: A Benchmark for Timestamp-Aware Long Video Summarization

arXiv:2604.10024v1 Announce Type: cross
Abstract: Long video summarization presents significant challenges for current multimodal large language models (MLLMs), particularly in maintaining temporal fidelity over extended durations and producing summar…