cs.CL

Double: Breaking the Acceleration Limit via Double Retrieval Speculative Parallelism

arXiv:2601.05524v3 Announce Type: replace
Abstract: Parallel Speculative Decoding (PSD) accelerates traditional Speculative Decoding (SD) by overlapping draft generation with verification. However, it remains hampered by two fundamental challenges: (1…