Robust Ultra Low-Bit Post-Training Quantization via Stable Diagonal Curvature Estimate
arXiv:2604.13806v1 Announce Type: new
Abstract: Large Language Models (LLMs) are widely used across many domains, but their scale makes deployment challenging. Post-Training Quantization (PTQ) reduces the memory footprint without retraining by leveraging …
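The abstract is truncated, so the paper's specific method is not shown here. As context for the title's "diagonal curvature estimate", the following is a minimal, hypothetical sketch (not the paper's algorithm) of the generic ingredients such PTQ methods build on: uniform round-to-nearest low-bit quantization of a weight vector, a damped diagonal Hessian proxy estimated from calibration activations, and a curvature-weighted quantization error. All function names and the damping scheme are illustrative assumptions.

```python
import numpy as np

def quantize_rtn(w, bits=2):
    """Uniform symmetric round-to-nearest quantization of a weight vector.
    Illustrative sketch; not the method proposed in the paper."""
    qmax = 2 ** (bits - 1) - 1          # e.g. 1 for 2-bit symmetric
    max_abs = np.max(np.abs(w))
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def diag_curvature(X, damp=0.01):
    """Diagonal of a damped Hessian proxy H ~ 2 X X^T for a linear layer,
    estimated from calibration activations X of shape (in_features, n_samples).
    Damping (a common stabilization, assumed here) keeps near-zero entries
    from blowing up downstream weighting."""
    h = 2.0 * np.mean(X ** 2, axis=1)   # per-input-feature curvature estimate
    return h + damp * np.mean(h)

def curvature_weighted_error(w, w_q, h):
    """Second-order proxy for the layer output error: sum_i h_i (w_i - w_q_i)^2."""
    return float(np.sum(h * (w - w_q) ** 2))
```

A typical use is to quantize each layer's weights and score candidate configurations by the curvature-weighted error rather than the plain L2 error, so that weights acting on high-curvature (high-sensitivity) inputs are preserved more accurately.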