EMO: Frustratingly Easy Progressive Training of Extendable MoE
arXiv:2605.13247v2 Announce Type: replace
Abstract: Sparse Mixture-of-Experts (MoE) models offer a powerful way to scale model size without a matching increase in compute, since per-token FLOPs depend only on the k active experts rather than on the total pool of E experts….
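To make the k-of-E compute argument concrete, here is a minimal top-k routing sketch in PyTorch. It is a generic illustration, not the paper's EMO method: the class name TopKMoE, the expert MLP shape, and the hyperparameters are all assumptions, and the per-expert loop is written for clarity rather than efficiency. Each token is sent through only its k selected experts, so the per-token cost scales with k while E can grow freely.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Illustrative top-k MoE layer (not the paper's implementation).
    Each token is routed to k of E experts, so per-token FLOPs scale with k, not E."""
    def __init__(self, d_model: int, num_experts: int, k: int):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, 4 * d_model),
                nn.GELU(),
                nn.Linear(4 * d_model, d_model),
            )
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, d_model); scores: (tokens, E)
        scores = self.router(x)
        topk_scores, topk_idx = scores.topk(self.k, dim=-1)
        weights = F.softmax(topk_scores, dim=-1)  # normalize over the k selected experts

        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                # Only tokens routed to expert e in this slot pay for its forward pass.
                mask = topk_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

# Usage sketch: 8 experts, 2 active per token.
layer = TopKMoE(d_model=64, num_experts=8, k=2)
tokens = torch.randn(16, 64)
y = layer(tokens)  # (16, 64); each token touched only 2 of the 8 experts
```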