Why and When Visual Token Pruning Fails? A Study on Relevant Visual Information Shift in MLLMs Decoding
arXiv:2604.12358v2 Announce Type: replace
Abstract: Recently, visual token pruning has been studied to handle the vast number of visual tokens in Multimodal Large Language Models (MLLMs). However, we observe that while existing pruning methods perform reliabl…