Calibrated Speculative Decoding: Frequency-Guided Candidate Selection for Efficient Inference
arXiv:2604.13634v1 Announce Type: cross
Abstract: Speculative decoding accelerates autoregressive generation by letting draft tokens bypass full verification, but conventional frameworks suffer from frequent false rejections, particularly when draft m…
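For context, here is a minimal sketch of the conventional speculative-sampling acceptance test, the baseline in which rejections occur whenever the draft distribution puts more mass on a token than the target does. It is not the calibrated, frequency-guided method this paper proposes, which the truncated abstract does not describe. The function names (`verify_draft_token`, `expected_acceptance`) and the toy distributions are illustrative assumptions.

```python
import random


def verify_draft_token(token, p_target, q_draft, rng):
    """Conventional speculative-sampling check for a single draft token.

    p_target, q_draft: dicts mapping token -> probability (toy stand-ins
    for the target and draft model distributions at one position).
    Returns (accepted, emitted_token).
    """
    p = p_target.get(token, 0.0)
    q = q_draft[token]  # the draft sampled this token, so q > 0
    # Accept with probability min(1, p/q).
    if rng.random() < min(1.0, p / q):
        return True, token
    # Rejected: resample from the residual max(0, p - q), renormalized.
    residual = {t: max(0.0, p_target[t] - q_draft.get(t, 0.0)) for t in p_target}
    total = sum(residual.values())
    tokens = list(residual)
    weights = [residual[t] / total for t in tokens]
    return False, rng.choices(tokens, weights=weights, k=1)[0]


def expected_acceptance(p_target, q_draft):
    """Probability that a draft token is accepted: sum over tokens of min(p, q)."""
    tokens = set(p_target) | set(q_draft)
    return sum(min(p_target.get(t, 0.0), q_draft.get(t, 0.0)) for t in tokens)


if __name__ == "__main__":
    # Hypothetical two-token vocabulary where the draft is overconfident in "a".
    p = {"a": 0.5, "b": 0.5}
    q = {"a": 0.9, "b": 0.1}
    print(expected_acceptance(p, q))
    print(verify_draft_token("a", p, q, random.Random(0)))
```

Under this rule, the acceptance rate falls as the draft and target distributions diverge, even when the rejected draft token would have been a reasonable continuation.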