Strix Halo Llama.cpp MTP Benchmarks: 27B Gets Much Faster, 35B Is Mixed
TL;DR
All models were Qwen3.6.

27B-MTP vs Base 27B (15k single-turn): Faster overall
Total Time (wall): 87.44s → 77.39s (10.05s faster / -11.50%)
Generation: 7.63 → 16.15 t/s (+111.77% speedup)
Prompt Processing: 279.75 → 244.90 t/s (-12.46% slowdown)
…
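
For anyone who wants to sanity-check the 27B percentages, here is a minimal sketch of the arithmetic in Python. It only uses the rounded figures quoted above; the helper `pct_change` and the dictionary keys are names made up for this illustration, not part of any benchmark tooling.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Rounded (before, after) figures as quoted in the TL;DR.
# Keys are hypothetical labels chosen for this sketch.
results = {
    "total_time_s": (87.44, 77.39),            # lower is better
    "generation_tps": (7.63, 16.15),           # higher is better
    "prompt_processing_tps": (279.75, 244.90), # higher is better
}

changes = {name: pct_change(b, a) for name, (b, a) in results.items()}

for name, delta in changes.items():
    print(f"{name}: {delta:+.2f}%")
```

Recomputing from the rounded values lands within roughly 0.1 percentage points of the quoted percentages. The small gap is presumably because the quoted percentages were derived from unrounded measurements, though that is an assumption on my part.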