/u/swizzcheezegoudaSWFA

Benchmarking the new b9200 update: Optimizing Qwen 3.6 27B mtp for Hermes Agent on a single RTX 3090

/u/swizzcheezegoudaSWFA / May 18, 2026

UPDATED (post-b9200) TL;DR: If you're running rigid agent frameworks locally with mtp on consumer hardware, drop your draft window to 3, lock parallel slots to 1, and build b9200 or newer to get your memory bandwidth back. The numbers back it u…
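For anyone who wants to script those two settings, here's a minimal sketch in Python. It rests on several assumptions that the post itself does not confirm: that the runtime is llama.cpp's `llama-server`, that "draft window" maps to its `--draft-max` flag, and that "parallel slots" maps to `--parallel`. The model filename is a hypothetical placeholder, and `build_server_args` is just an illustrative helper name. Check `llama-server --help` on your own build before relying on any of it.

```python
import shlex


def build_server_args(model_path, draft_max=3, parallel=1):
    """Assemble a launch command as an argv list.

    ASSUMPTION: the flag names below are llama-server's --draft-max and
    --parallel; the original post does not name the exact flags it used.
    """
    if draft_max < 1 or parallel < 1:
        raise ValueError("draft_max and parallel must both be >= 1")
    return [
        "llama-server",
        "-m", model_path,                # path to the GGUF model file
        "--draft-max", str(draft_max),   # assumed equivalent of "draft window"
        "--parallel", str(parallel),     # assumed equivalent of "parallel slots"
    ]


if __name__ == "__main__":
    # Hypothetical model filename, for illustration only.
    args = build_server_args("qwen-27b.gguf")
    print(shlex.join(args))
```

Returning a list instead of a single string means it can go straight into `subprocess.run(args)` with no shell quoting to worry about.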
