On the Loss Landscape Geometry of Regularized Deep Matrix Factorization: Uniqueness and Sharpness
arXiv:2603.27072v1 Announce Type: cross
Abstract: Weight decay is ubiquitous in training deep neural network architectures. Its empirical success is often attributed to capacity control; nonetheless, our theoretical understanding of its effect on the …