Linearized Attention Cannot Enter the Kernel Regime at Any Practical Width
arXiv:2603.13085v2 Announce Type: replace-cross
Abstract: Understanding whether attention mechanisms converge to the kernel regime is foundational to the validity of influence functions for transformer accountability. Exact NTK characterization of sof…
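Since the abstract discusses linearized attention without showing its form, a minimal sketch may help. The feature map `elu(x) + 1` below is one common choice (in the style of kernelized linear attention); it is an illustrative assumption, not necessarily the formulation analyzed in this paper.

```python
import numpy as np

def elu_feature_map(x):
    # phi(x) = elu(x) + 1: a positive feature map often used to
    # approximate the softmax kernel in linearized attention.
    return np.where(x > 0, x + 1.0, np.exp(x))

def linear_attention(Q, K, V):
    # Linearized attention: phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1).
    # Associativity of matrix products gives O(n d^2) cost instead of
    # the O(n^2 d) cost of forming the full attention matrix.
    phi_q, phi_k = elu_feature_map(Q), elu_feature_map(K)
    kv = phi_k.T @ V                       # (d, d_v) summary of keys/values
    z = phi_q @ phi_k.sum(axis=0)          # (n,) per-query normalizer
    return (phi_q @ kv) / z[:, None]

def softmax_attention(Q, K, V):
    # Standard O(n^2) softmax attention, for comparison.
    scores = Q @ K.T / np.sqrt(Q.shape[1])
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    return w @ V

rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out_lin = linear_attention(Q, K, V)
out_soft = softmax_attention(Q, K, V)
print(out_lin.shape, out_soft.shape)
```

Both mechanisms produce a convex combination of the value rows (the linearized weights are positive and normalized), which is what makes the kernel-regime comparison between the two forms meaningful.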