Shida Wang, YongXiang Hua, Zhou Tao, Haoyu Cao, Linli Xu

Dynamic Token Compression for Efficient Video Understanding through Reinforcement Learning

March 30, 2026

arXiv:2603.26365v1 Announce Type: new
Abstract: Multimodal Large Language Models have demonstrated remarkable capabilities in video understanding, yet face prohibitive computational costs and performance degradation from "context rot" due to massive…
