Are Unsloth models as good as I've read?

Has anybody compared the models that Unsloth offers with their standard counterparts?
For example: I've been using qwen3.6:35b-a3b Q4_K_M, and on my MBP 64GB I get around 39 t/s.
Using Unsloth Studio with unsloth/qwen3.6:35b-a3b UD-Q4_K_XL, I get around 57 t/s.

The difference in speed is significant: roughly 46% faster (57 vs. 39 t/s). From what I understand, Unsloth's quantization runs a per-layer sensitivity analysis and assigns a different quantization level to each layer depending on how "important" it is. This is supposed to make the model smaller, and from what I've been reading, the model should even perform better.
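
To make the per-layer idea concrete, here is a toy sketch of how I picture it. This is not Unsloth's actual code: the function names (`quantize`, `assign_bits`), the 4-bit/6-bit choices, and the weight-space error used as a "sensitivity" score are all my own assumptions for illustration. Real pipelines presumably measure sensitivity on model outputs with calibration data rather than on raw weights.

```python
import random

def quantize(weights, bits):
    """Uniform symmetric quantization of a list of floats to `bits` bits."""
    levels = 2 ** (bits - 1) - 1
    # Fall back to a scale of 1.0 if the layer is all zeros.
    scale = max(abs(w) for w in weights) / levels or 1.0
    return [round(w / scale) * scale for w in weights]

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def assign_bits(layers, low_bits=4, high_bits=6, keep_fraction=0.25):
    """Give the most quantization-sensitive layers more bits (toy heuristic)."""
    # Hypothetical sensitivity score: error introduced by quantizing at low_bits.
    sensitivity = {name: mse(w, quantize(w, low_bits)) for name, w in layers.items()}
    ranked = sorted(sensitivity, key=sensitivity.get, reverse=True)
    n_high = max(1, int(len(ranked) * keep_fraction))
    high = set(ranked[:n_high])
    return {name: (high_bits if name in high else low_bits) for name in layers}

# Made-up layers: blk.0 has a much wider weight range, so it loses more at 4 bits.
rng = random.Random(0)
layers = {
    "blk.0": [rng.gauss(0, 5.0) for _ in range(256)],
    "blk.1": [rng.gauss(0, 0.05) for _ in range(256)],
    "blk.2": [rng.gauss(0, 0.05) for _ in range(256)],
    "blk.3": [rng.gauss(0, 0.05) for _ in range(256)],
}
bits = assign_bits(layers)
print(bits)
```

The point of the sketch is only the shape of the trade-off: most layers stay at the low bit width, and the few that suffer most from quantization get extra bits.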

What are your experiences?

submitted by /u/denis-craciun