Attention-Based Sampler for Diffusion Language Models
arXiv:2604.08564v1 Announce Type: cross
Abstract: Auto-regressive models (ARMs) are the dominant paradigm in language modeling. However, their strictly sequential decoding imposes fundamental constraints on both inference efficienc…