SEPTQ: A Simple and Effective Post-Training Quantization Paradigm for Large Language Models
arXiv:2604.10091v1 Announce Type: new
Abstract: Large language models (LLMs) have shown remarkable performance in various domains, but they are constrained by massive computational and storage costs. Quantization, an effective technique for compressin…