cs.CV

Long-Tailed Distribution-Aware Router For Mixture-of-Experts in Large Vision-Language Model

arXiv:2507.01351v2 Announce Type: replace
Abstract: The mixture-of-experts (MoE) architecture, which replaces dense networks with sparse ones, has attracted significant attention in large vision-language models (LVLMs) for achieving comparable performance…
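
As context for the routing mechanism the title refers to, below is a minimal sketch of a standard top-k softmax gating router for an MoE layer. This illustrates only the generic MoE routing step (a learned gate selects a sparse subset of experts per token); it is not the paper's long-tailed distribution-aware router, whose details are not given in the truncated abstract, and all names and hyperparameters here are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Generic top-k gating router (sketch, not the paper's method)."""
    def __init__(self, hidden_dim: int, num_experts: int, k: int = 2):
        super().__init__()
        self.k = k
        # Linear gate producing one logit per expert for each token.
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)

    def forward(self, x: torch.Tensor):
        # x: (num_tokens, hidden_dim)
        logits = self.gate(x)                          # (tokens, experts)
        topk_logits, topk_idx = logits.topk(self.k, dim=-1)
        # Normalize the selected experts' logits into routing weights,
        # so each token is processed by only k of the experts.
        weights = F.softmax(topk_logits, dim=-1)       # (tokens, k)
        return topk_idx, weights

# Usage example with assumed sizes:
router = TopKRouter(hidden_dim=64, num_experts=8, k=2)
tokens = torch.randn(16, 64)
expert_idx, expert_w = router(tokens)
print(expert_idx.shape, expert_w.shape)  # torch.Size([16, 2]) torch.Size([16, 2])

Sparse routing of this kind is what lets an MoE layer keep per-token compute close to a dense layer of one expert's size while scaling total parameters with the number of experts, which is the efficiency trade-off the abstract alludes to.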