VisMMOE: Exploiting Visual-Expert Affinity for Efficient Visual-Language MoE Offloading
arXiv:2605.05899v1 Announce Type: new
Abstract: Large-scale vision-language mixture-of-experts (VL-MoE) models provide strong multimodal capability, but efficient deployment on memory-constrained platforms remains difficult. Existing MoE offloading sy…
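The abstract is cut off before the method is described, so the paper's actual mechanism is unknown here. As a purely illustrative sketch of the general idea the title suggests (using the affinity between visual tokens and particular experts to decide which experts stay GPU-resident versus offloaded), consider the following hypothetical scheme; every name in it (`AffinityExpertCache`, `record_routing`, `plan_prefetch`, `gpu_budget`) is invented for illustration and is not from the paper.

```python
# Hypothetical illustration only: affinity-guided expert residency for MoE
# offloading. Not the paper's method; the abstract above is truncated.
from collections import Counter


class AffinityExpertCache:
    """Keep the experts most often routed to by visual tokens resident in
    fast (GPU) memory; the rest remain offloaded to host memory or disk."""

    def __init__(self, num_experts: int, gpu_budget: int):
        self.gpu_budget = gpu_budget                      # max experts on-device
        self.visual_hits = Counter()                      # expert id -> visual routing count
        self.resident = set(range(min(gpu_budget, num_experts)))

    def record_routing(self, expert_ids, is_visual: bool) -> None:
        """Update affinity statistics from one router decision."""
        if is_visual:
            self.visual_hits.update(expert_ids)

    def plan_prefetch(self):
        """Pick the next resident set: the top-k experts by visual-token
        affinity. Returns (experts to load, experts to evict)."""
        top = {e for e, _ in self.visual_hits.most_common(self.gpu_budget)}
        if not top:                                       # no stats yet: keep current set
            return set(), set()
        to_load, to_evict = top - self.resident, self.resident - top
        self.resident = top
        return to_load, to_evict


# Example usage with made-up routing decisions:
cache = AffinityExpertCache(num_experts=64, gpu_budget=8)
cache.record_routing([3, 17], is_visual=True)   # visual tokens hit experts 3 and 17
load, evict = cache.plan_prefetch()
```

Frequency-based residency is only one plausible policy; a real offloading system would likely also overlap expert transfers with computation and account for text-token routing, but the truncated abstract gives no details either way.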