Accidentally discovered you can teach frozen MoE models new knowledge by just steering their expert routing — no training needed
Was probing Gemma 4's MoE routing patterns out of curiosity and noticed something weird: the experts route differently when the model "knows" something vs. when it doesn't. So I recorded that difference and replayed it. Called it Adaptive …
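For anyone wondering what "record the difference and replay it" could look like mechanically, here is a toy NumPy sketch. It is an assumption, not the actual method: it treats the recorded difference as a per-expert offset between mean router logits in the "knows" and "doesn't know" cases, then adds that offset back as a bias before top-k expert selection. The function names (`record_routing_delta`, `steer`, `top_k_experts`) and the made-up logits are hypothetical and are not tied to Gemma or any library API.

```python
import numpy as np

def top_k_experts(logits, k):
    """Return the indices of the k largest router logits, highest first."""
    return np.argsort(logits)[::-1][:k]

def record_routing_delta(known_logits, unknown_logits):
    """Assumed 'recording' step: difference of mean router logits.

    Both inputs have shape (num_tokens, num_experts).
    """
    return known_logits.mean(axis=0) - unknown_logits.mean(axis=0)

def steer(logits, delta, strength=1.0):
    """Assumed 'replay' step: bias the router logits by the recorded delta.

    No model weights are touched; only the routing decision shifts.
    """
    return logits + strength * delta

# Toy router logits for 4 experts (illustrative values, not real measurements).
known = np.array([[2.0, 0.0, 0.0, 1.0],
                  [2.0, 0.0, 0.0, 1.0]])
unknown = np.array([[0.0, 2.0, 0.0, 1.0],
                    [0.0, 2.0, 0.0, 1.0]])

delta = record_routing_delta(known, unknown)

# A new token whose routing looks like the "doesn't know" pattern.
token_logits = np.array([0.0, 2.0, 0.0, 1.0])
before = top_k_experts(token_logits, k=2).tolist()
after = top_k_experts(steer(token_logits, delta), k=2).tolist()
print("experts before steering:", before)
print("experts after steering: ", after)
```

In a real model the same idea would presumably be applied per MoE layer through a hook on the router output, with `strength` tuned so the steering does not overwhelm the router's own preferences.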