cs.CV

Long-Tailed Distribution-Aware Router For Mixture-of-Experts in Large Vision-Language Model

arXiv:2507.01351v2 Announce Type: replace
Abstract: The mixture-of-experts (MoE) architecture, which replaces dense networks with sparse ones, has attracted significant attention in large vision-language models (LVLMs) for achieving comparable performance…
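
As context for the routing mechanism the title refers to, below is a minimal sketch of a standard top-k softmax gating router for an MoE layer. This illustrates only the generic MoE routing step (a learned gate selects a sparse subset of experts per token); it is not the paper's long-tailed distribution-aware router, whose details are not given in the truncated abstract, and all names and hyperparameters here are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Generic top-k gating router (sketch, not the paper's method)."""
    def __init__(self, hidden_dim: int, num_experts: int, k: int = 2):
        super().__init__()
        self.k = k
        # Linear gate producing one logit per expert for each token.
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)

    def forward(self, x: torch.Tensor):
        # x: (num_tokens, hidden_dim)
        logits = self.gate(x)                          # (tokens, experts)
        topk_logits, topk_idx = logits.topk(self.k, dim=-1)
        # Normalize the selected experts' logits into routing weights,
        # so each token is processed by only k of the experts.
        weights = F.softmax(topk_logits, dim=-1)       # (tokens, k)
        return topk_idx, weights

# Usage example with assumed sizes:
router = TopKRouter(hidden_dim=64, num_experts=8, k=2)
tokens = torch.randn(16, 64)
expert_idx, expert_w = router(tokens)
print(expert_idx.shape, expert_w.shape)  # torch.Size([16, 2]) torch.Size([16, 2])

Sparse routing of this kind is what lets an MoE layer keep per-token compute close to a dense layer of one expert's size while scaling total parameters with the number of experts, which is the efficiency trade-off the abstract alludes to.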