cs.AI, cs.LG

DepCap: Adaptive Block-Wise Parallel Decoding for Efficient Diffusion LM Inference

arXiv:2604.15750v1 Announce Type: cross
Abstract: Diffusion language models (DLMs) have emerged as a promising alternative to autoregressive language generation due to their potential for parallel decoding and global refinement of the entire sequence…