math.OC

cs.AI, cs.LG, math.OC

The Newton-Muon Optimizer

arXiv:2604.01472v1 Announce Type: cross
Abstract: The Muon optimizer has received considerable attention for its strong performance in training large language models, yet the design principle behind its matrix-gradient orthogonalization remains largel…
