LocalLLaMA

Qwen-3.6-27B, llamacpp, speculative decoding – appreciation post

First, a little explanation of what is happening in the pictures. I ran a small experiment to see how much speculative decoding improves the speed of the new Qwen (TL;DR: a lot!). The image shows my simple prom…