Qwen3.6 27B NVFP4 + MTP on a single RTX 5090: 200k context working in vLLM
So I spent some time testing Qwen3.6 27B NVFP4 on my RTX 5090 and wanted to share the numbers, since most of the recent good posts are either about 48GB cards, FP8, or llama.cpp/GGUF. This is not a "best possible setup" claim. More like: this is what I actually got working on a single card.
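To make the setup concrete, here is a minimal sketch of how a run like this is typically launched through vLLM's Python API. The model repo id, memory-utilization value, and the speculative-decoding config (how MTP gets enabled) are my assumptions, not the exact settings from this post, and the method string may differ depending on your vLLM version.

```python
# Minimal sketch, not the author's exact config.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3.6-27B-NVFP4",  # hypothetical repo id for the NVFP4 checkpoint
    max_model_len=200_000,           # the ~200k context window discussed here
    gpu_memory_utilization=0.95,     # leave a little headroom on the 32 GB 5090
    # MTP goes through vLLM's speculative decoding config; the exact
    # method string is an assumption and depends on vLLM version/model support.
    speculative_config={"method": "mtp", "num_speculative_tokens": 1},
)

out = llm.generate(["Hello"], SamplingParams(max_tokens=64))
print(out[0].outputs[0].text)
```

NVFP4 quantization is normally picked up automatically from the checkpoint's config, so no explicit quantization flag should be needed.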