GitHub – intel/auto-round: A SOTA quantization algorithm for high-accuracy low-bit LLM inference, seamlessly optimized for CPU/XPU/CUDA, with multi-datatype support and full compatibility with vLLM, SGLang, and Transformers.
By /u/muyuu / May 1, 2026