Polysemantic Experts, Monosemantic Paths: Routing as Control in MoEs
arXiv:2604.17837v1 Announce Type: cross
Abstract: An LLM’s residual stream is both state and instruction: it encodes the current context and determines the next transformation. We introduce a parameter-free decomposition for Mixture-of-Experts (MoE) models …