cs.AI, cs.CL

Fitting Is Not Enough: Smoothness in Extremely Quantized LLMs

arXiv:2605.08894v1 Announce Type: cross
Abstract: Large language models (LLMs) achieve strong performance but incur high deployment costs, motivating extremely low-bit, albeit lossy, quantization. Existing quantization algorithms mainly focus on improving …