Yuning Huang, Xiaoyu Ji, Joseph Huang, Yichi Zhang, Fengqing Zhu

Adaptive Greedy Frame Selection for Long Video Understanding

Yuning Huang, Xiaoyu Ji, Joseph Huang, Yichi Zhang, Fengqing Zhu / May 8, 2026

arXiv:2603.20180v2 Announce Type: replace-cross
Abstract: Large vision–language models (VLMs) are increasingly applied to long-video question answering, yet inference is often bottlenecked by the number of input frames and resulting visual tokens. Na…
