The Geometric Cost of Normalization: Affine Bounds on the Bayesian Complexity of Neural Networks
arXiv:2603.27432v1 Announce Type: new
Abstract: LayerNorm and RMSNorm impose fundamentally different geometric constraints on their outputs – and this difference has a precise, quantifiable consequence for model complexity. We prove that LayerNorm’s m…
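The geometric difference named in the abstract can be checked numerically. The sketch below is not taken from the paper and does not reproduce its theorem or its complexity bound. It assumes the standard textbook definitions of the two layers with no learned gain or bias, and the helper names `layer_norm` and `rms_norm` are hypothetical. Under those assumptions, both outputs have Euclidean norm sqrt(d), but only the LayerNorm output is additionally forced to have zero mean, so it lies in the hyperplane orthogonal to the all-ones vector.

```python
import numpy as np

def layer_norm(x, eps=0.0):
    # Assumed definition: center each row, then divide by its standard deviation.
    # No learned gain or bias is applied.
    centered = x - x.mean(axis=-1, keepdims=True)
    return centered / np.sqrt((centered ** 2).mean(axis=-1, keepdims=True) + eps)

def rms_norm(x, eps=0.0):
    # Assumed definition: divide each row by its root mean square, without centering.
    return x / np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)

rng = np.random.default_rng(0)
d = 8
# Shift the inputs away from zero so the effect of mean-centering is visible.
x = rng.normal(size=(4, d)) + 3.0

ln = layer_norm(x)
rms = rms_norm(x)

# Per-row mean and Euclidean norm of each output.
ln_means, ln_norms = ln.mean(axis=-1), np.linalg.norm(ln, axis=-1)
rms_means, rms_norms = rms.mean(axis=-1), np.linalg.norm(rms, axis=-1)

print("LayerNorm row means:", ln_means)
print("LayerNorm row norms:", ln_norms)
print("RMSNorm row means:  ", rms_means)
print("RMSNorm row norms:  ", rms_norms)
```

With `eps=0.0`, the LayerNorm rows satisfy two constraints (zero mean and fixed norm), while the RMSNorm rows satisfy only the fixed-norm constraint. Adding learned scale and shift parameters would change the picture, and this sketch does not cover that case.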