LocalLLaMA

Got MTP + TurboQuant running — Qwen3.6-27B — 80+ t/s at 262K context on a single RTX 4090

So I've been messing around trying to get MTP working alongside TBQ4_0 (TurboQuant's lossless 4.25 bpv KV cache) on Qwen3.6-27B for my own use. So after a day of vibecoding I think I may have gotten something viable. Went from about 43 t/s whe…