DTree on MLX: tiny win over DFlash on Qwen3.5-4B (M2)

I ported DTree to MLX and finally got one setting that seems to beat matched DFlash locally.

M2 Max 32GB, Qwen3.5-4B, q4_g64, spec=16, tree_budget=24:

- DFlash: 45.07 e2e tok/s
- DTree: 48.31 e2e tok/s

So that's ~1.07x over DFlash, or about 7% more end-to-end throughput. Not massive, but it looks real and repeatable enough to mention.
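For anyone who wants to check the ratio, here is the arithmetic as a quick Python snippet. It uses only the two tok/s figures quoted above; the variable names are just for illustration.

```python
# End-to-end throughput figures reported in this post (tokens per second).
dflash_tok_s = 45.07
dtree_tok_s = 48.31

# Speedup of DTree relative to the matched DFlash run.
speedup = dtree_tok_s / dflash_tok_s
percent_gain = (speedup - 1.0) * 100.0

print(f"speedup: {speedup:.2f}x, gain: {percent_gain:.1f}%")
# prints "speedup: 1.07x, gain: 7.2%"
```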

A lot of the other things I tried were flat or just worse, so my current read is that MLX verifier cost is still the main limiter here.
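To show why verifier cost would cap the gain, here is a toy throughput model. It is a sketch, not a measurement: it assumes each step costs one draft pass plus one verify pass and emits the accepted draft tokens plus one verifier token. The function name and every number in it are hypothetical, picked only to show the shape of the trade-off.

```python
def tokens_per_second(accepted_per_step, draft_ms, verify_ms):
    """Toy throughput model for one speculative decoding step.

    Assumes each step emits the accepted draft tokens plus one token
    from the verifier, and costs one draft pass plus one verify pass.
    """
    tokens = accepted_per_step + 1.0
    step_seconds = (draft_ms + verify_ms) / 1000.0
    return tokens / step_seconds


# Hypothetical values, NOT measured on MLX: the tree accepts a bit more
# per step, but its verify pass is also a bit more expensive.
chain = tokens_per_second(accepted_per_step=4.0, draft_ms=10.0, verify_ms=90.0)
tree = tokens_per_second(accepted_per_step=4.5, draft_ms=10.0, verify_ms=95.0)

print(f"chain: {chain:.1f} tok/s, tree: {tree:.1f} tok/s, ratio: {tree / chain:.3f}")
```

In this model the tree's throughput gain is always smaller than its gain in accepted tokens whenever the verify pass gets slower, which is consistent with a small but real win.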

Has anyone gotten bigger DTree gains on MLX?

https://github.com/drbh/dtree-mlx

submitted by /u/naftalinus