cs.CL, cs.LG

Attention-Based Sampler for Diffusion Language Models

arXiv:2604.08564v1 Announce Type: cross
Abstract: Auto-regressive models (ARMs) have established a dominant paradigm in language modeling. However, their strictly sequential decoding imposes fundamental constraints on both inference efficienc…