cs.LG, math.PR, stat.ML

Gaussian Equivalence for Self-Attention: Asymptotic Spectral Analysis of Attention Matrix

arXiv:2510.06685v2 Announce Type: replace
Abstract: Self-attention layers have become fundamental building blocks of modern deep neural networks, yet their theoretical understanding remains limited, particularly from the perspective of random matrix t…