LocalLLaMA

Qwen3.5-122B-Q5-MTP – Qwen3.5-122B-Q6-MTP

For anyone who cares… 😄

prompt = spend a 1000 tokens
unsloth MTP models, Strix Halo

llama.cpp:server-rocm-mtp \
  --spec-type draft-mtp \
  --spec-draft-n-max 3

Qwen3.5-122B-Q5-MTP-General
n_decoded = 100 tg = 29.77 t/s
n_decoded = 179 tg = 27.95 t/s
n_dec…
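
If you want to pull these numbers out of the server output to compare quants side by side, a minimal Python sketch might look like the following. It assumes the log lines have exactly the `n_decoded = … tg = … t/s` layout pasted in this post; `LOG`, `PATTERN`, and `parse_tg` are names made up for illustration and are not part of llama.cpp.

```python
import re

# Two sample lines copied from the post. The regex below assumes this
# exact layout and may need adjusting for real server output.
LOG = """\
n_decoded = 100 tg = 29.77 t/s
n_decoded = 179 tg = 27.95 t/s
"""

# Captures the decoded-token count and the reported generation speed.
PATTERN = re.compile(r"n_decoded\s*=\s*(\d+)\s+tg\s*=\s*([\d.]+)\s*t/s")


def parse_tg(text):
    """Return a list of (n_decoded, tokens_per_second) pairs found in text."""
    return [(int(n), float(tg)) for n, tg in PATTERN.findall(text)]


samples = parse_tg(LOG)
for n, tg in samples:
    print(f"{n:>6} tokens  {tg:6.2f} t/s")
```

Running the same parser over the Q5 and Q6 logs gives two lists you can line up directly. It only reads the reported values and does not average them, since the post does not say whether each `tg` figure is cumulative or per interval.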