cs.AI, cs.CL

Cosine-Similarity Routing with Semantic Anchors for Interpretable Mixture-of-Experts Language Models

arXiv:2509.14255v2 Announce Type: replace
Abstract: Mixture-of-Experts (MoE) models improve efficiency through sparse activation, but their learned gating functions provide limited insight into routing decisions. This work introduces the Semantic Reso…
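
As a generic illustration of the idea named in the title, routing by cosine similarity to per-expert anchor vectors might look like the minimal sketch below. The abstract is truncated before the method is described, so this is not the paper's formulation: the function names `cosine` and `route`, the parameters `top_k` and `temperature`, and the softmax over the selected experts are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def route(token, anchors, top_k=2, temperature=1.0):
    """Hypothetical router: pick the top_k experts whose anchors are most similar to the token.

    Returns (weights, sims), where weights is a list of (expert_index, weight)
    pairs that sum to 1, and sims holds the raw cosine score for every expert.
    """
    sims = [cosine(token, anchor) for anchor in anchors]
    # Sparse activation: keep only the top_k most similar experts.
    chosen = sorted(range(len(anchors)), key=lambda i: sims[i], reverse=True)[:top_k]
    # Softmax over the selected scores (assumed normalisation, numerically stabilised).
    scaled = [sims[i] / temperature for i in chosen]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    weights = [(i, e / total) for i, e in zip(chosen, exps)]
    return weights, sims


if __name__ == "__main__":
    anchors = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]  # one illustrative anchor per expert
    weights, sims = route([2.0, 0.5], anchors, top_k=2)
    print(weights)
    print(sims)
```

Because each score is a bounded similarity to a fixed anchor rather than an arbitrary learned logit, the raw `sims` list can be inspected directly to see why a token was sent to a given expert, which is the kind of insight the abstract says learned gating functions lack.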