Working Memory Constraints Scaffold Learning in Transformers under Data Scarcity
arXiv:2604.20789v2 Announce Type: cross
Abstract: We investigate the integration of human-like working memory constraints into the Transformer architecture and implement several cognitively inspired attention variants, including fixed-width windows ba…