cs.AI, cs.CL

TAD: Temporal-Aware Trajectory Self-Distillation for Fast and Accurate Diffusion LLM

arXiv:2605.09536v1 Announce Type: cross
Abstract: Diffusion large language models (dLLMs) offer a promising paradigm for parallel text generation, but in practice they face an accuracy-parallelism trade-off, where increasing tokens per forward pass (TPF) o…