Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
arXiv:2206.00939v3
Abstract: The training of neural networks by gradient descent methods is a cornerstone of the deep learning revolution. Yet, despite some recent progress, a complete theory explaining its success is still miss…