Why and When Deep is Better than Shallow: Implementation-Agnostic State-Transition Model of Deep Learning
arXiv:2505.15064v4
Abstract: Why and when does depth improve generalization? We study this question in an implementation-agnostic state-transition model, where a depth-$k$ predictor is a readout class $H$ composed with the…