Qwen3.6-27B-3bit-mlx · Hugging Face: 3- & 5-bit mixed quant for RAM-poor Mac users.

Just dropped a 3-bit mixed quant (5-bit for the embedding and prediction layers) for Mac users.

Until now there was only one 3-bit version of this model (from Unsloth), but it was very heavy and painfully slow:

https://huggingface.co/models?other=base_model:quantized:Qwen%2FQwen3.6-27B&sort=trending&search=3-bit

This one is twice as fast and, in my own agentic tests, equally good. Turn on preserved thinking in the Jinja template in LM Studio with:

{%- set preserve_thinking = true %}
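For context, here is a minimal sketch of how such a flag could gate whether prior thinking blocks are kept in the conversation history. The `preserve_thinking` variable comes from the post; the surrounding template structure (the `<|im_start|>` markers and the `split('</think>')` trick common in Qwen-style chat templates) is an assumption for illustration, not LM Studio's or the model's actual template:

```jinja
{#- Hypothetical sketch: when preserve_thinking is false, strip everything
    up to and including </think> from earlier assistant turns -#}
{%- set preserve_thinking = true %}
{%- for message in messages %}
{%- if message.role == 'assistant' and not preserve_thinking %}
<|im_start|>assistant
{{ message.content.split('</think>')[-1].lstrip('\n') }}<|im_end|>
{%- else %}
<|im_start|>{{ message.role }}
{{ message.content }}<|im_end|>
{%- endif %}
{%- endfor %}
```

Keeping the flag on preserves the model's reasoning traces across agentic turns instead of discarding them at each step.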

submitted by /u/JLeonsarmiento
