DFlash: The Trick That Makes LLMs Stop Crawling One Token at a Time
Speculative decoding was already clever. DFlash makes the draft stage parallel, turning diffusion from a clumsy text generator into a very…Continue reading on Medium »
Speculative decoding was already clever. DFlash makes the draft stage parallel, turning diffusion from a clumsy text generator into a very…Continue reading on Medium »