Grokking as Dimensional Phase Transition in Neural Networks
arXiv:2604.04655v1 Announce Type: cross
Abstract: Neural network grokking — the abrupt memorization-to-generalization transition — challenges our understanding of learning dynamics. Through finite-size scaling of gradient avalanche dynamics across e…