Muon Does Not Converge on Convex Lipschitz Functions
arXiv:2605.08980v1 Announce Type: cross
Abstract: Muon and its variants have shown strong empirical performance in a variety of deep learning tasks. Existing convergence analyses of Muon rely on smoothness assumptions, though arguably the most success…