Qwen3.6-35B-A3B – even in VRAM-limited scenarios, it can be better to use bigger quants than you’d expect!
This may be a no-brainer to many experienced local LLM users, but it was not obvious to me. I am running a 3070 8 GB + 64 GB DDR4. It's a pretty lightweight setup, so I chose the smallest Q4 unsloth model, Qwen3.6-35B-A3B-UD-IQ4_XS.gguf, which is ~18 GB. It …