Tried to test the two versions of models in my own m5 pro 64, curated the results on claude, not an expert so settings/config might not be the best. do share what results or improvements that can be attempted. test prompts were generated in claude for testing purposes.
Qwen3.6 35B A3B vs 27B UD — M5 Pro 64GB benchmark
Hardware: MacBook Pro M5 Pro 18-core · 64GB unified memory · LM Studio · MLX runtime · thinking OFF (/no_think) · 128K context
Specs
| 35B A3B MLX 4bit | 27B UD MLX 6bit | |
|---|---|---|
| Model size | ~21.7GB | ~30.5GB |
| Architecture | MoE — 3B active/token | Dense — 27B active/token |
| RAM at 128K ctx | ~27GB | ~38GB |
Speed
| Test | 35B A3B | 27B UD |
|---|---|---|
| 800 token test | ~72 tok/s · 11s | ~9 tok/s · 32s |
| 1200 token test | ~70 tok/s · 16s | ~9 tok/s · 70s |
| Advantage | 8x faster | baseline |
Intelligence — 4-task coding benchmark
| Task | 35B A3B | 27B UD |
|---|---|---|
| Auth hook (useRequireAuth) | 9.5/10 — typed, mounted cleanup | 8/10 — used any, no cleanup |
| Conflict resolution (500ms rules) | 10/10 | 10/10 |
| Delete account (ordered ops) | 10/10 | 10/10 |
| Bug identification (syncBatch) | 10/10 — found 3 bugs + improvements | 7/10 — found 1 bug |
| Overall | 9.8/10 | 8.75/10 |
Test prompt: 4 coding tasks · max_tokens 1200 · temp 0.6 · /no_think system prompt
Verdict: 35B A3B wins on both speed and quality for coding tasks on 64GB Apple Silicon. 27B is slower (8x) and didn't demonstrate the reasoning depth advantage expected from a dense model on these tasks.
wanted to have some number/references when i was looking for mac to get, hopefully this helps someone out there.
[link] [comments]