Switched from Qwen3.6 35b-a3b to Qwen3.6 27b mid coding and it’s noticeably better!

A bit of context. I was coding up a little HTML tower defense game where you can alter the path by placing additional waypoints.
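The core mechanic described above (enemies follow a path you can reroute by dropping waypoints) can be sketched in plain JavaScript. This is an illustrative guess at the data structure, not the game's actual code; the names `insertWaypoint` and `stepEnemy` are hypothetical.

```javascript
// A path is just an ordered list of {x, y} waypoints; enemies walk it in order.
const path = [{x: 0, y: 100}, {x: 200, y: 100}, {x: 200, y: 300}];

// Player places a waypoint: splice it in after a segment to reroute the path.
function insertWaypoint(path, index, point) {
  path.splice(index + 1, 0, point);
  return path;
}

// Move an enemy a fixed distance toward its next waypoint each frame.
function stepEnemy(enemy, path, speed) {
  const target = path[enemy.next];
  if (!target) return enemy; // past the last waypoint: reached the end
  const dx = target.x - enemy.x, dy = target.y - enemy.y;
  const dist = Math.hypot(dx, dy);
  if (dist <= speed) {
    // Close enough: snap to the waypoint and aim for the next one.
    enemy.x = target.x; enemy.y = target.y; enemy.next += 1;
  } else {
    enemy.x += (dx / dist) * speed;
    enemy.y += (dy / dist) * speed;
  }
  return enemy;
}
```

Because enemies only ever track an index into the list, inserting a waypoint mid-game reroutes everyone behind that segment automatically.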

My setup: 32 GB RAM with a 16 GB VRAM 5070 Ti. Using AesSedai/Qwen3.6-35B-A3B-GGUF IQ4_XS on LM Studio with OpenCode. I've graduated from one-shot vibe-coding prompts.

The spec for this game was complicated enough that it couldn't have been done in LM Studio alone, so I tried OpenCode. The project was chugging along and Qwen3.6 35b-a3b was getting things done when 27b dropped. Naturally I had to try it. The only problem is that I couldn't use any of the Q4 models due to VRAM issues, so I dropped to an IQ3_M model from mradermacher/Qwen3.6-27B-i1-GGUF.
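The back-of-the-envelope math on why Q4 of a 27B won't fit in 16 GB, but IQ3_M will, is simple: weight count times bits per weight. The bits-per-weight figures below are approximate llama.cpp averages (~4.25 bpw for IQ4_XS, ~3.66 bpw for IQ3_M), and actual GGUF files differ a bit because of embeddings and mixed-precision layers, so treat this as a rough sketch.

```javascript
// Approximate on-disk/in-VRAM size of a quantized model, ignoring KV cache.
function ggufSizeGB(params, bitsPerWeight) {
  return (params * bitsPerWeight) / 8 / 1e9;
}

const iq4xs = ggufSizeGB(27e9, 4.25); // ~14.3 GB: almost no room left on a 16 GB card
const iq3m  = ggufSizeGB(27e9, 3.66); // ~12.4 GB: leaves headroom for context
```

Even a couple of GB of headroom matters, since the KV cache and compute buffers also need VRAM on top of the weights.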

I worried that IQ3_M would be too much compression, but it did fine and even found a difficult bug that the IQ4_XS version of Qwen3.6 35b-a3b couldn't. They say dense models handle compression better than MoE models. Is that the reason for this? What are other people's experiences with the 35b-a3b vs 27b versions of Qwen3.6?

Using LM Studio:

I got 50-60 tokens per second with Qwen3.6 35b-a3b (AesSedai/Qwen3.6-35B-A3B-GGUF IQ4_XS), but prompt processing gets really slow sometimes.

I got 40ish tokens per second with mradermacher/Qwen3.6-27B-i1-GGUF IQ3_M, but the speed was consistent throughout.

How are people's experiences with these two models at 16 GB VRAM? Anyone doing actual work with IQ3 quants of the 27b?

Oh, the Waypoint Tower Defense game is done and can be played on htmlbin. The save/load doesn't seem to work on their site, but if you download the file and open it in a browser, it'll work fine. It's a self-contained single-HTML game. Meant to be like Minesweeper, but for tower defense.
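A hedged guess at why save/load fails on the hosting site but works from a downloaded file: paste sites often render uploads in a sandboxed iframe, where `localStorage` access throws, while a file opened directly in the browser gets normal storage access. A defensive wrapper like the one below keeps a game playable either way; the key name and function names are illustrative, not from the actual game.

```javascript
// Save/load game state via localStorage, degrading gracefully when storage
// is blocked (e.g. a sandboxed iframe) or unavailable entirely.
function saveGame(state) {
  try {
    localStorage.setItem('towerDefenseSave', JSON.stringify(state));
    return true;
  } catch (err) {
    return false; // storage blocked: skip saving instead of crashing
  }
}

function loadGame() {
  try {
    const raw = localStorage.getItem('towerDefenseSave');
    return raw ? JSON.parse(raw) : null;
  } catch (err) {
    return null; // storage blocked or corrupt save: start fresh
  }
}
```

The try/catch matters because in a sandboxed context even reading `localStorage` can throw a SecurityError, not just return empty.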

submitted by /u/LocalAI_Amateur
