DepCap: Adaptive Block-Wise Parallel Decoding for Efficient Diffusion LM Inference
arXiv:2604.15750v1 Announce Type: cross
Abstract: Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive language generation due to their potential for parallel decoding and global refinement of the entire sequence…