Big thanks to jukofyork and AesSedai for the tips today on patching and quantizing the "full size" Kimi-K2.6 "Q4_X". It runs on both ik and mainline llama.cpp if you have over ~584GB of RAM+VRAM. I'll follow up soon with an imatrix for anyone else making custom quants, plus some smaller quants that run on ik_llama.cpp. AesSedai will likely have mainline MoE-optimized recipes up soon too! Cheers, and I'm curious how this big one compares with GLM-5.1.