Hongjie Wang, Niraj K. Jha

LinMU: Multimodal Understanding Made Linear

Hongjie Wang, Niraj K. Jha / May 5, 2026

arXiv:2601.01322v2 Announce Type: replace-cross
Abstract: Modern Vision-Language Models (VLMs) achieve impressive performance but are limited by the quadratic complexity of self-attention, which prevents their deployment on edge devices and makes thei…
