MARS: Enabling Autoregressive Models Multi-Token Generation
arXiv:2604.07023v1 Announce Type: new
Abstract: Autoregressive (AR) language models generate text one token at a time, even when consecutive tokens are highly predictable given earlier context. We introduce MARS (Mask AutoRegreSsion), a lightweight fi…