Recommended parameters for Qwen 3.6 35B A3B on an 8GB VRAM card and 24GB RAM?
I was running Q3_K_S with 90k context and getting 21 tok/s, which drops to around 19.5 after a few messages and then keeps slowly decreasing. (I am using mmproj-F16 because I need vision for some tasks.) Is there any way to get a bit better performance while keeping high …