Krause Synchronization Transformers
arXiv:2602.11534v3 Announce Type: replace
Abstract: Self-attention in Transformers relies on globally normalized softmax weights, causing all tokens to compete for influence at every layer. When composed across depth, this interaction pattern induces …
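To make the "globally normalized" point concrete, here is a minimal NumPy sketch of standard single-head scaled dot-product self-attention, not the paper's Krause synchronization mechanism; the function and weight names (`self_attention`, `Wq`, `Wk`, `Wv`) are illustrative. Each row of the attention matrix is a softmax over all tokens, so it sums to one and any gain in one token's weight comes at the expense of the others.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax: shift by the max before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    X: (n_tokens, d_model) token representations.
    Returns attended values of shape (n_tokens, d_model).
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # (n, n) pairwise affinities
    A = softmax(scores, axis=-1)      # each row is normalized over ALL tokens
    return A @ V

# Toy usage: rows of A are probability distributions over every token,
# so tokens compete for influence at every layer.
rng = np.random.default_rng(0)
n, d = 5, 8
X = rng.normal(size=(n, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
```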