Sungkyun Kim, Jaemin Kim, Dogyung Yoon, Jiho Shin, Junyeol Lee, Jiwon Seo

Speculative Verification: Exploiting Information Gain to Refine Speculative Decoding

April 21, 2026

arXiv:2509.24328v2
Abstract: LLMs have low GPU efficiency and high latency due to autoregressive decoding. Speculative decoding (SD) mitigates this using a small draft model to speculatively generate multiple tokens, which are t…
