cs.IT, cs.LG, math.IT, stat.ML

Mixture-of-Experts under Finite-Rate Gating: Communication–Generalization Trade-offs

arXiv:2602.15091v2 Announce Type: replace
Abstract: Mixture-of-Experts (MoE) architectures decompose prediction tasks into specialized expert sub-networks selected by a gating mechanism. This letter adopts a communication-theoretic view of MoE gating,…
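The gating mechanism described above can be illustrated with a minimal sketch of standard top-k softmax gating (a generic MoE routing scheme, not the finite-rate scheme analyzed in this paper); `experts`, `gate_w`, and the dimensions are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative top-k softmax gating over linear experts (generic MoE,
# not the paper's finite-rate gating). All sizes are arbitrary choices.
d_in, d_out, n_experts, k = 8, 4, 5, 2

# Each expert is a simple linear map; the gate is a linear scorer.
experts = [rng.standard_normal((d_in, d_out)) for _ in range(n_experts)]
gate_w = rng.standard_normal((d_in, n_experts))

def moe_forward(x):
    """Route input x to its top-k experts by gate score and mix outputs."""
    scores = x @ gate_w                       # one score per expert
    top = np.argsort(scores)[-k:]             # indices of the k best experts
    w = np.exp(scores[top] - scores[top].max())
    w /= w.sum()                              # softmax renormalized over top-k
    return sum(wi * (x @ experts[i]) for wi, i in zip(w, top))

y = moe_forward(rng.standard_normal(d_in))
print(y.shape)
```

Only k of the n_experts sub-networks are evaluated per input, which is the source of the communication/computation savings that a rate-limited gate trades against generalization.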