MuonEq: Balancing Before Orthogonalization with Lightweight Equilibration
arXiv:2603.28254v2
Abstract: Orthogonalized-update optimizers such as Muon improve training of matrix-valued parameters, but existing extensions typically either rescale updates after orthogonalization or use heavier white…
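
To make the title's idea of "balancing before orthogonalization" concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the paper's algorithm: the abstract above is truncated and does not specify the equilibration, so the row/column scaling in `equilibrate` is a hypothetical Ruiz-style stand-in. The orthogonalization step uses the quintic Newton-Schulz iteration and coefficients from the widely circulated public Muon implementation. All function names are illustrative.

```python
import numpy as np


def equilibrate(M, iters=1, eps=1e-8):
    """Hypothetical lightweight balancing (an assumption, not the paper's rule):
    divide each row, then each column, by the square root of its 2-norm."""
    M = np.array(M, dtype=np.float64)
    for _ in range(iters):
        row_scale = np.sqrt(np.linalg.norm(M, axis=1, keepdims=True)) + eps
        M = M / row_scale
        col_scale = np.sqrt(np.linalg.norm(M, axis=0, keepdims=True)) + eps
        M = M / col_scale
    return M


def newton_schulz(G, steps=5, eps=1e-7):
    """Approximate orthogonalization via the quintic Newton-Schulz iteration
    used in the public Muon implementation (coefficients taken from there)."""
    a, b, c = 3.4445, -4.7750, 2.0315
    # Normalize by the Frobenius norm so every singular value is at most 1.
    X = G / (np.linalg.norm(G) + eps)
    transposed = X.shape[0] > X.shape[1]
    if transposed:
        X = X.T  # iterate on the wide orientation so X @ X.T is the small Gram matrix
    for _ in range(steps):
        A = X @ X.T
        B = b * A + c * (A @ A)
        X = a * X + B @ X
    return X.T if transposed else X


def balanced_orthogonal_update(G):
    """Balance first, then orthogonalize (the ordering named in the title)."""
    return newton_schulz(equilibrate(G))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    G = rng.standard_normal((8, 16))  # stand-in for a gradient or momentum matrix
    U = balanced_orthogonal_update(G)
    print(np.linalg.svd(U, compute_uv=False))
```

With these coefficients the iteration does not converge to exact singular values of 1; it pushes them into a band around 1, which is the behavior the public Muon implementation accepts in exchange for fewer steps.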