VER: Vision Expert Transformer for Robot Learning via Foundation Distillation and Dynamic Routing
arXiv:2510.05213v2
Abstract: Pretrained vision foundation models (VFMs) advance robotic learning via rich visual representations, yet individual VFMs typically excel only in specific domains, limiting generality across tas…