LocalLLaMA

Qwen 3.6 27b MTP vLLM

Hello everyone, I am banging my head trying to properly configure Qwen 3.6 27B MTP in vLLM. I am using vLLM v0.20.0 in Docker, with the unquantized model, tp4 (4x 3090s), and max context length. At low context size, MTP with a value of 3 gives the best results: 4…