LocalLLaMA

DFlash Doubles the T/S Gen Speed of Qwen3.5 27B (BF16) on Mac M5 Max

The new DFlash support in oMLX 0.3.5 RC1 looks like it doubles (!!!) the speed of Qwen3.5 27B (BF16). Initial test. Generation T/S went from 9 to 22 T/S! Models used (HuggingFace) Main Model: Jackrong/MLX-Qwopus3.5-27B-v3-bf16 Draft Model: z-lab/…