Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
arXiv:2408.00923v2 Announce Type: replace-cross
Abstract: This paper explores a novel paradigm in low-bit (i.e., 4 bits or lower) quantization that differs from existing state-of-the-art methods by framing optimal quantization as an architecture search …