Fabio Montello, Ronja Güldenring, Lazaros Nalpantidis

ClustViT: Clustering-based Token Merging for Semantic Segmentation

Fabio Montello, Ronja Güldenring, Lazaros Nalpantidis / May 4, 2026

arXiv:2510.01948v2 Announce Type: replace
Abstract: Vision Transformers can achieve high accuracy and strong generalization across various contexts, but their practical applicability on real-world robotic systems is limited due to their quadratic atte…
