Qwen 3.6 35b a3b Q4 vs qwen 3.6 27b q6, on m5 pro 64gb

Tried to test the two versions of models in my own m5 pro 64, curated the results on claude, not an expert so settings/config might not be the best. do share what results or improvements that can be attempted. test prompts were generated in claude for testing purposes.

Qwen3.6 35B A3B vs 27B UD — M5 Pro 64GB benchmark

Hardware: MacBook Pro M5 Pro 18-core · 64GB unified memory · LM Studio · MLX runtime · thinking OFF (/no_think) · 128K context

Specs

	35B A3B MLX 4bit	27B UD MLX 6bit
Model size	~21.7GB	~30.5GB
Architecture	MoE — 3B active/token	Dense — 27B active/token
RAM at 128K ctx	~27GB	~38GB

Speed

Test	35B A3B	27B UD
800 token test	~72 tok/s · 11s	~9 tok/s · 32s
1200 token test	~70 tok/s · 16s	~9 tok/s · 70s
Advantage	8x faster	baseline

Intelligence — 4-task coding benchmark

Task	35B A3B	27B UD
Auth hook (useRequireAuth)	9.5/10 — typed, mounted cleanup	8/10 — used any, no cleanup
Conflict resolution (500ms rules)	10/10	10/10
Delete account (ordered ops)	10/10	10/10
Bug identification (syncBatch)	10/10 — found 3 bugs + improvements	7/10 — found 1 bug
Overall	9.8/10	8.75/10

Test prompt: 4 coding tasks · max_tokens 1200 · temp 0.6 · /no_think system prompt

Verdict: 35B A3B wins on both speed and quality for coding tasks on 64GB Apple Silicon. 27B is slower (8x) and didn't demonstrate the reasoning depth advantage expected from a dense model on these tasks.

wanted to have some number/references when i was looking for mac to get, hopefully this helps someone out there.

submitted by /u/skyyyy007
[link] [comments]

Leave a Comment