/u/3VITAERC - Provide.ai

Testing llama.cpp MTP support on Qwen3.6 – RTX 5090

/u/3VITAERC / May 17, 2026

Setup:
– RTX 5090, 32 GB, Linux
– Built llama.cpp from 4f13cb7 (the official ghcr.io/ggml-org/llama.cpp:server-cuda image hasn't picked up the merge yet as of writing, so I had to docker build from source with CUDA_DOCKER_ARCH=120)
– Unsloth'…
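For anyone wanting to reproduce the source build, here is a rough sketch of what it could look like. The commit and the CUDA_DOCKER_ARCH value come from the setup notes; the image tag is made up for illustration, and the Dockerfile path (.devops/cuda.Dockerfile) and the `server` target are assumptions based on the llama.cpp repository layout, so check them against the checkout you actually build. The script only prints the commands unless RUN=1 is set.

```shell
#!/usr/bin/env sh
# Sketch only: the image tag is hypothetical, and the Dockerfile path and
# target name are assumed from the llama.cpp repository layout.
set -eu

COMMIT="4f13cb7"                              # commit named in the post
ARCH="120"                                    # CUDA_DOCKER_ARCH value from the post
IMAGE_TAG="local/llama.cpp:server-cuda-mtp"   # hypothetical local tag

# Assemble the docker build command as a string so it can be inspected first.
build_cmd() {
  printf 'docker build -t %s --target server --build-arg CUDA_DOCKER_ARCH=%s -f .devops/cuda.Dockerfile .' \
    "$IMAGE_TAG" "$ARCH"
}

# Print by default; set RUN=1 to clone, check out the commit, and build.
if [ "${RUN:-0}" = "1" ]; then
  git clone https://github.com/ggml-org/llama.cpp
  cd llama.cpp
  git checkout "$COMMIT"
  eval "$(build_cmd)"
else
  echo "git checkout $COMMIT"
  build_cmd; echo
fi
```

Pinning the arch to 120 should keep the build limited to kernels for the 5090 rather than a wider default set, which is presumably why the post sets it; the base CUDA version in the Dockerfile may also need to be recent enough for that arch.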
