S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation
arXiv:2603.25702v1 Announce Type: new
Abstract: Block-diffusion language models offer a promising path toward faster-than-autoregressive generation by combining block-wise autoregressive decoding with within-block parallel denoising. However, in the f…