Deep Learning as a Convex Paradigm of Computation: Minimizing Circuit Size with ResNets
arXiv:2511.20888v2 Announce Type: replace-cross
Abstract: This paper argues that DNNs implement a computational Occam’s razor, finding the ‘simplest’ algorithm that fits the data, and that this could explain their incredible and wide-ranging succe…