cs.CL, cs.LG

Coupled Query-Key Dynamics for Attention

arXiv:2604.01683v1 Announce Type: cross
Abstract: Standard scaled dot-product attention computes scores from static, independent projections of the input. We show that evolving queries and keys jointly through shared learned dynamics before sco…
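
For reference, the baseline named in the abstract can be sketched as follows. This is a minimal pure-Python illustration of standard scaled dot-product attention weights, softmax(QK^T / sqrt(d_k)), in which Q and K are independent linear projections of the same input. The helper names (`matmul`, `softmax`, `attention_weights`) and the toy matrices are assumptions for illustration, not taken from the paper. The coupled query-key dynamics are not sketched, since the abstract is truncated before describing them.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(x, w_q, w_k):
    """Baseline attention weights: softmax(Q K^T / sqrt(d_k)).

    Q and K are computed once, independently, from the same input x.
    This is the static setup the abstract contrasts against.
    """
    q = matmul(x, w_q)                      # (n, d_k) queries
    k = matmul(x, w_k)                      # (n, d_k) keys
    d_k = len(q[0])
    k_t = [list(col) for col in zip(*k)]    # (d_k, n) transposed keys
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    return [softmax(row) for row in scores]


# Toy input (hypothetical values): three tokens, model width 2.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = attention_weights(x, identity, identity)
```

Each row of `weights` is a probability distribution over the three tokens. With identity projections, the first token attends equally to itself and the third token, and less to the second.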