HieraVid: Hierarchical Token Pruning for Fast Video Large Language Models
arXiv:2604.01881v1 Announce Type: new
Abstract: Video Large Language Models (VideoLLMs) have demonstrated impressive capabilities in video understanding, yet the massive number of input video tokens incurs a significant computational burden for deploy…