DTree on MLX … tiny win over DFlash on Qwen3.5-4B (M2)
I ported DTree to MLX … and finally got one setting that seems to beat matched DFlash locally.

M2 Max 32GB, Qwen3.5-4B, q4_g64, spec=16, tree_budget=24

– DFlash: 45.07 e2e tok/s
– DTree: 48.31 e2e tok/s

So basically ~1.07x over DFlash. Not massive, …