On the global convergence of gradient descent for wide shallow models with bounded nonlinearities
arXiv:2605.10775v1 Announce Type: cross
Abstract: A surprising phenomenon in the training of neural networks is the ability of gradient descent to find global minimizers of the training loss despite its non-convexity. Following earlier works, we inves…