A Theoretical Framework for Auxiliary-Loss-Free Load Balancing of Sparse Mixture-of-Experts in Large-Scale AI Models
arXiv:2512.03915v3 Announce Type: replace-cross
Abstract: In large-scale AI training, Sparse Mixture-of-Experts (s-MoE) layers enable scaling by activating only a small subset of experts per token. An operational challenge in this design is load balan…
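
To make the setting concrete, here is a minimal sketch of sparse top-k expert routing combined with a bias-based balancing rule, i.e. adjusting a per-expert routing bias from observed load rather than adding an auxiliary loss. This is a generic illustration of the auxiliary-loss-free idea, not this paper's framework; all function names, the update rule, and hyperparameters (e.g. `lr`) are illustrative assumptions.

```python
# Illustrative sketch only: sparse top-k routing with a per-expert bias that is
# nudged after each batch to even out expert load, instead of an auxiliary loss.
import torch

def route_top_k(logits: torch.Tensor, bias: torch.Tensor, k: int = 2):
    """Pick top-k experts per token from biased scores; return indices and gates."""
    # The bias only influences which experts are selected; the gate weights use
    # the raw logits, so the balancing mechanism adds no extra loss term.
    scores = logits + bias                        # [tokens, num_experts]
    topk_idx = scores.topk(k, dim=-1).indices     # [tokens, k]
    gates = torch.softmax(logits.gather(-1, topk_idx), dim=-1)
    return topk_idx, gates

def update_bias(bias: torch.Tensor, topk_idx: torch.Tensor,
                num_experts: int, lr: float = 1e-3) -> torch.Tensor:
    """Boost underloaded experts and damp overloaded ones (assumed update rule)."""
    counts = torch.bincount(topk_idx.flatten(), minlength=num_experts).float()
    target = counts.mean()                        # ideal per-expert load
    return bias + lr * torch.sign(target - counts)

# Usage: 8 experts, 16 tokens, top-2 routing
num_experts, tokens = 8, 16
bias = torch.zeros(num_experts)
logits = torch.randn(tokens, num_experts)
idx, gates = route_top_k(logits, bias, k=2)
bias = update_bias(bias, idx, num_experts)
```

The design point the sketch highlights is that only a small subset of experts is active per token, so imbalance in the selection counts directly wastes capacity; the bias update redistributes future routing decisions without perturbing the training objective.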