cs.AI, cs.CL, cs.LG

Toeplitz MLP Mixers are Low Complexity, Information-Rich Sequence Models

arXiv:2605.06683v1 Announce Type: cross
Abstract: Transformer-based large language models are limited in part by the quadratic time and space complexity of attention. We introduce the Toeplitz MLP Mixer (TMM), a transformer-like…
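The truncated abstract does not spell out the TMM architecture, but the name suggests replacing the dense token-mixing matrix of an MLP-Mixer with a Toeplitz matrix, i.e. one whose entry (i, j) depends only on the lag i - j. A key property of such matrices is that they can be applied in O(T log T) rather than O(T^2) time via a circulant embedding and the FFT. The sketch below illustrates that trick in NumPy; the function name `toeplitz_mix` and the kernel layout are illustrative assumptions, not the paper's API.

```python
import numpy as np

def toeplitz_mix(kernel, x):
    """Apply a Toeplitz token-mixing matrix in O(T log T) via FFT.

    kernel : length 2T-1 real array; kernel[T-1+d] holds the value t_d
             for lag d in [-(T-1), T-1], so the implied mixing matrix
             is M[i, j] = t_{i-j}.
    x      : (T, C) array of T tokens with C channels; mixing acts
             along the token axis (axis 0).
    """
    T = x.shape[0]
    # First column of the (2T-1)-point circulant embedding of M:
    # [t_0, t_1, ..., t_{T-1}, t_{-(T-1)}, ..., t_{-1}]
    c = np.concatenate([kernel[T - 1:], kernel[:T - 1]])
    n = 2 * T - 1
    # Zero-pad the tokens to the circulant size, multiply in the
    # frequency domain, and keep the first T outputs.
    xp = np.zeros((n, x.shape[1]))
    xp[:T] = x
    y = np.fft.ifft(np.fft.fft(c)[:, None] * np.fft.fft(xp, axis=0), axis=0)
    return y[:T].real

# Correctness check against the explicit dense Toeplitz matrix.
rng = np.random.default_rng(0)
T, C = 8, 4
kernel = rng.standard_normal(2 * T - 1)
x = rng.standard_normal((T, C))
M = np.array([[kernel[T - 1 + i - j] for j in range(T)] for i in range(T)])
assert np.allclose(toeplitz_mix(kernel, x), M @ x)
```

The O(T) kernel parameters (versus O(T^2) for a dense mixer) and the FFT-based application are generic properties of Toeplitz operators, which is consistent with the "low complexity" claim in the title; how TMM combines such layers with channel MLPs is not recoverable from the truncated abstract.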