cs.LG, math.OC

On the global convergence of gradient descent for wide shallow models with bounded nonlinearities

arXiv:2605.10775v1 Announce Type: cross
Abstract: A surprising phenomenon in the training of neural networks is the ability of gradient descent to find global minimizers of the training loss despite its non-convexity. Following earlier works, we inves…
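The phenomenon the abstract describes can be illustrated with a toy numpy sketch (not the paper's construction): a wide shallow model with the bounded nonlinearity tanh, trained by plain gradient descent on a small random regression problem. The width, data, learning rate, and 1/sqrt(m) scaling below are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: n samples in d dimensions, width m much larger than n.
n, d, m = 10, 5, 512
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = rng.normal(size=n)

# Shallow model f(x) = (1/sqrt(m)) * a^T tanh(W x); tanh is bounded.
W = rng.normal(size=(m, d))
a = rng.normal(size=m)

def loss(W, a):
    pred = (np.tanh(X @ W.T) @ a) / np.sqrt(m)
    return 0.5 * np.mean((pred - y) ** 2)

lr = 0.2
losses = [loss(W, a)]
for _ in range(2000):
    H = np.tanh(X @ W.T)                 # (n, m) hidden activations
    pred = (H @ a) / np.sqrt(m)
    r = (pred - y) / n                   # residual, scaled for the mean
    # Gradients of the (non-convex) squared loss in both layers.
    grad_a = (H.T @ r) / np.sqrt(m)
    grad_W = (((1 - H**2) * a[None, :] * r[:, None]).T @ X) / np.sqrt(m)
    a -= lr * grad_a
    W -= lr * grad_W
    losses.append(loss(W, a))

print(losses[0], losses[-1])  # training loss decreases substantially
```

Despite the non-convexity of the loss in (W, a), gradient descent from a random initialization drives the training loss steadily toward zero in this overparameterized regime, which is the behavior the paper analyzes.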