cs.CV

Video Patch Pruning: Efficient Video Instance Segmentation via Early Token Reduction

arXiv:2604.00827v1 Announce Type: new
Abstract: Vision Transformers (ViTs) have demonstrated state-of-the-art performance on several benchmarks, yet their high computational cost hinders their practical deployment. Patch Pruning offers significant sav…