Liran Ringel, Yaniv Romano

Accelerating Speculative Decoding with Block Diffusion Draft Trees

Liran Ringel, Yaniv Romano / April 15, 2026

arXiv:2604.12989v1 Announce Type: new
Abstract: Speculative decoding accelerates autoregressive language models by using a lightweight drafter to propose multiple future tokens, which the target model then verifies in parallel. DFlash shows that a blo…
