Criticality and Saturation in Orthogonal Neural Networks
arXiv:2605.06563v1 Announce Type: new
Abstract: It has long been known that initializing weight matrices to be orthogonal, rather than drawing their entries as i.i.d. Gaussians, can improve training performance. This phenomenon can be analyzed using f…
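
To make the two initialization schemes contrasted in the abstract concrete, here is a minimal NumPy sketch. It is an illustration only and is not taken from the paper: the function names `gaussian_init` and `orthogonal_init`, the square shape, the 1/n variance scaling, and the QR-based construction are all assumptions chosen for this example.

```python
import numpy as np


def gaussian_init(n, rng):
    # Assumed convention: i.i.d. Gaussian entries with variance 1/n,
    # so that singular values stay of order one as n grows.
    return rng.standard_normal((n, n)) / np.sqrt(n)


def orthogonal_init(n, rng):
    # Assumed construction: QR-decompose a Gaussian matrix and keep Q.
    # Multiplying each column of Q by the sign of the matching diagonal
    # entry of R removes the sign ambiguity of the QR factorization.
    a = rng.standard_normal((n, n))
    q, r = np.linalg.qr(a)
    signs = np.sign(np.diag(r))
    signs[signs == 0] = 1.0
    return q * signs  # broadcasting scales column j by signs[j]


rng = np.random.default_rng(0)
n = 64
w_gauss = gaussian_init(n, rng)
w_orth = orthogonal_init(n, rng)

# Singular values show the structural difference between the two schemes.
sv_gauss = np.linalg.svd(w_gauss, compute_uv=False)
sv_orth = np.linalg.svd(w_orth, compute_uv=False)

print("Gaussian singular values:   min", sv_gauss.min(), "max", sv_gauss.max())
print("Orthogonal singular values: min", sv_orth.min(), "max", sv_orth.max())
```

In this sketch every singular value of the orthogonal matrix equals one up to floating-point error, while those of the Gaussian matrix are spread over a wide range. That difference in spectrum is the structural distinction between the two schemes; how it relates to training performance is the subject of the paper and is not reproduced here.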