MiniMax-M2.7 Q3_K_L & Q8_0 — First GGUF quants, Apple Silicon (M3 Max 128GB)

Just quantized MiniMax-M2.7 (229B MoE) — first GGUF quants available on HuggingFace.

Files:

- Q3_K_L (~110 GB) — fits in 128 GB unified memory

- Q8_0 (~243 GB) — for 256 GB+ setups

https://huggingface.co/ox-ox/MiniMax-M2.7-GGUF

Perplexity (PPL) benchmark running now (c=512, seed=1337) — will update with results.
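For anyone who wants to reproduce the PPL run, this is roughly the llama.cpp invocation implied by those settings — a sketch, assuming a recent llama.cpp build with the `llama-perplexity` binary and a local WikiText-2-style test file (model and dataset filenames here are placeholders, not from the post):

```shell
# Sketch, not the author's exact command. Filenames are placeholders.
# -c 512 matches the post's context size; --seed 1337 matches its seed.
./llama-perplexity \
  -m MiniMax-M2.7-Q3_K_L.gguf \
  -f wiki.test.raw \
  -c 512 \
  --seed 1337
```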

Baseline from M2.5 Q3_K_L: 8.7948 PPL, 28.7 t/s

Architecture: MiniMax-M2 MoE, 256 experts, 8 active/token.

Source: FP8 safetensors → Q8_0 → Q3_K_L via llama.cpp.
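The FP8 → Q8_0 → Q3_K_L pipeline can be sketched with standard llama.cpp tooling. Exact script and binary names vary by llama.cpp version, and whether the converter ingests this FP8 checkpoint directly is an assumption (some FP8 checkpoints need upcasting first); paths are placeholders:

```shell
# Sketch, not the author's exact commands.
# 1) Convert the safetensors checkpoint to a Q8_0 GGUF
#    (assumes convert_hf_to_gguf.py handles this checkpoint's dtype).
python convert_hf_to_gguf.py /path/to/MiniMax-M2.7 \
  --outfile MiniMax-M2.7-Q8_0.gguf --outtype q8_0

# 2) Requantize the Q8_0 GGUF down to Q3_K_L.
./llama-quantize MiniMax-M2.7-Q8_0.gguf MiniMax-M2.7-Q3_K_L.gguf Q3_K_L
```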

submitted by /u/Remarkable_Jicama775
