Jie Jiang, Xing Sun - Provide.ai

Performance-Driven Policy Optimization for Speculative Decoding with Adaptive Windowing

Jie Jiang, Xing Sun / May 15, 2026

arXiv:2605.14978v1
Abstract: Speculative decoding accelerates LLM inference by having a lightweight draft model propose speculative windows of candidate tokens for parallel verification by a larger target model. In practice, specula…
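
The draft-then-verify loop described in the abstract can be illustrated with a minimal sketch. It assumes greedy decoding, a fixed window size, and toy stand-in models; the names `speculative_step`, `toy_target`, and `toy_draft` are hypothetical, and the sketch does not represent the paper's adaptive windowing policy.

```python
from typing import Callable, List

Token = int
NextToken = Callable[[List[Token]], Token]


def speculative_step(prefix: List[Token], draft_next: NextToken,
                     target_next: NextToken, window: int) -> List[Token]:
    """One draft-then-verify round with greedy acceptance (illustrative only)."""
    # The draft model proposes `window` candidate tokens autoregressively.
    proposed: List[Token] = []
    ctx = list(prefix)
    for _ in range(window):
        tok = draft_next(ctx)
        proposed.append(tok)
        ctx.append(tok)

    # The target model checks each position. A real system scores all
    # positions in one batched forward pass; this loop only mimics the logic.
    accepted: List[Token] = []
    ctx = list(prefix)
    for tok in proposed:
        expected = target_next(ctx)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # Every proposal matched, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical target model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Hypothetical draft model: agrees with the target except after token 3."""
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


print(speculative_step([0], toy_draft, toy_target, window=4))
print(speculative_step([4], toy_draft, toy_target, window=2))
```

In the first call the draft's fourth proposal is rejected, so the round yields three accepted draft tokens plus the target's correction. In the second call both proposals are accepted and the target adds one more token. How many tokens a round yields therefore depends on both the window size and how often the draft agrees with the target.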
