LocalLLaMA

Very happy with Qwen 3.5 122B output. But is slowness expected?

I'm running the 122-billion-parameter Qwen 3.5, specifically Qwen3.5-122B-A10B-Q5_K_M, on a DGX Spark (128 GB unified memory). I'm (very!) impressed with its general knowledge. I can talk to it in multiple languages, and don't feel the…