Generalization at the Edge of Stability
arXiv:2604.19740v1 Announce Type: new
Abstract: Training modern neural networks often relies on large learning rates, operating at the edge of stability, where the optimization dynamics exhibit oscillatory and chaotic behavior. Empirically, this regim…
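As a rough illustration of the stability threshold the abstract alludes to, the sketch below runs plain gradient descent on a one-dimensional quadratic f(x) = 0.5 * sharpness * x^2. This toy model is an assumption made here for illustration and is not taken from the paper; the names gradient_descent_1d and regime are hypothetical. On this quadratic each step multiplies x by (1 - learning_rate * sharpness), so the iterates converge when learning_rate * sharpness < 2, flip sign forever at exactly 2, and diverge above 2.

```python
import math


def gradient_descent_1d(sharpness, learning_rate, x0=1.0, steps=50):
    """Run gradient descent on f(x) = 0.5 * sharpness * x**2.

    Returns the list of iterates, starting with x0.
    """
    xs = [x0]
    for _ in range(steps):
        grad = sharpness * xs[-1]          # f'(x) = sharpness * x
        xs.append(xs[-1] - learning_rate * grad)
    return xs


def regime(sharpness, learning_rate):
    """Classify the toy dynamics by the product learning_rate * sharpness."""
    product = learning_rate * sharpness
    if math.isclose(product, 2.0):
        return "edge"                      # |x| stays constant, sign flips
    if product > 2.0:
        return "divergent"                 # |x| grows every step
    if product > 1.0:
        return "oscillating-convergent"    # sign flips, |x| shrinks
    return "monotone-convergent"           # no sign flips, |x| shrinks


if __name__ == "__main__":
    sharpness = 4.0  # assumed curvature of the toy quadratic
    for lr in (0.1, 0.4, 0.5, 0.6):
        final = gradient_descent_1d(sharpness, lr)[-1]
        print(f"lr={lr}: {regime(sharpness, lr)}, final |x| = {abs(final):.3e}")
```

In a real network the curvature changes during training, so this fixed-sharpness quadratic only shows where the 2 / learning_rate threshold comes from, not the oscillatory or chaotic behavior the abstract describes.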