cs.AI, cs.CL, cs.LG

Polysemantic Experts, Monosemantic Paths: Routing as Control in MoEs

arXiv:2604.17837v1 Announce Type: cross
Abstract: An LLM’s residual stream is both state and instruction: it encodes the current context and determines the next transformation. We introduce a parameter-free decomposition for Mixture-of-Experts models …