UniVer: A Unified Perspective for Multi-step and Multi-draft Speculative Decoding
arXiv:2605.04543v1 Announce Type: cross
Abstract: Speculative decoding accelerates Large Language Model (LLM) inference through a draft-then-verify procedure, in which verification can be framed as an Optimal Transport (OT) problem. Existing approaches typically handle multi-draft and…
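
For readers unfamiliar with draft-then-verify, the following is a minimal sketch of the standard single-draft, single-step verification rule from the speculative sampling literature. It is not the UniVer method. The function names `verify_token` and `acceptance_rate` and the toy distributions are assumptions made for illustration. In OT terms, this rule couples the draft distribution q with the target distribution p so that the probability of keeping the drafted token is as large as possible, namely the sum over tokens of min(p, q), while the emitted token is still distributed exactly according to p.

```python
import random


def verify_token(p, q, draft_token, rng):
    """Verify one drafted token (standard single-draft rule, illustrative only).

    p: target-model probabilities over the vocabulary (list of floats)
    q: draft-model probabilities over the vocabulary (list of floats)
    draft_token: index sampled from q, so q[draft_token] > 0
    rng: a random.Random instance
    Returns (token, accepted).
    """
    # Accept the draft with probability min(1, p/q) at the drafted token.
    accept_prob = min(1.0, p[draft_token] / q[draft_token])
    if rng.random() < accept_prob:
        return draft_token, True
    # On rejection, resample from the residual max(p - q, 0).
    # A rejection implies p != q, so the residual has positive mass.
    residual = [max(pi - qi, 0.0) for pi, qi in zip(p, q)]
    token = rng.choices(range(len(p)), weights=residual, k=1)[0]
    return token, False


def acceptance_rate(p, q):
    """Expected acceptance probability: sum of the token-wise minimum of p and q."""
    return sum(min(pi, qi) for pi, qi in zip(p, q))


if __name__ == "__main__":
    # Toy distributions over a 3-token vocabulary (hypothetical values).
    p = [0.5, 0.3, 0.2]
    q = [0.2, 0.5, 0.3]
    rng = random.Random(0)
    draft = rng.choices(range(len(q)), weights=q, k=1)[0]
    print(verify_token(p, q, draft, rng), acceptance_rate(p, q))
```

The multi-draft and multi-step settings named in the abstract generalize this picture to several candidate tokens and several positions; the sketch covers only the one-token, one-draft base case.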