Ioannis Bantzis, James B. Simon, Arthur Jacot

Saddle-To-Saddle Dynamics in Deep ReLU Networks: Low-Rank Bias in the First Saddle Escape

April 21, 2026

arXiv:2505.21722v2
Abstract: When a deep ReLU network is initialized with small weights, gradient descent (GD) is at first dominated by the saddle at the origin in parameter space. We study the so-called escape directions …
