cs.LG

UniComp: A Unified Evaluation of Large Language Model Compression via Pruning, Quantization and Distillation

arXiv:2602.09130v3 Announce Type: replace
Abstract: Model compression is increasingly essential for deploying large language models (LLMs), yet existing comparative studies focus largely on pruning and quantization, evaluated primarily on knowledge-cen…