Vanishing Contributions: A Unified Framework for Smooth and Iterative Model Compression
arXiv:2510.09696v2 Announce Type: replace
Abstract: The growing scale of Deep Neural Networks (DNNs) drives the need for compression techniques such as pruning, quantization, and low-rank decomposition. While these methods are very effective at reducing memory, computation, and energy consumption, they may introduce severe accuracy degradation, which is often mitigated through iterative, gradual compression. However, different compression techniques require distinct iterative approaches, and some result in unstable, discontinuous fine-tuning. We introduce Vanishing Contributions (VCON), a unified framework for the smooth, iterative transition of DNNs into a compressed form. Rather than replacing the original network directly with its compressed version, VCON executes both in parallel during fine-tuning. The contribution of the original (uncompressed) model is progressively reduced, while that of the compressed model is gradually increased. This affine combination allows the network to adapt slowly, improving stability and mitigating accuracy degradation. We evaluate VCON on computer vision and natural language processing benchmarks, using multiple compression strategies. In most settings, our framework improves accuracy over one-shot and iterative baselines: typical gains exceed 1%, and some configurations exhibit improvements above 15%. VCON is thus compatible with existing compression techniques and consistently improves performance across diverse tasks.
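
The abstract describes VCON as running the original and compressed networks in parallel and blending their outputs with complementary, scheduled weights. Below is a minimal sketch of that blending idea in PyTorch; the class name, the linear ramp schedule, and the per-layer wrapping are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn


class VanishingContribution(nn.Module):
    """Runs the original and compressed sub-modules in parallel and
    returns an affine combination of their outputs. As alpha moves
    from 0 to 1 over fine-tuning, the original module's contribution
    vanishes and the compressed module takes over."""

    def __init__(self, original: nn.Module, compressed: nn.Module):
        super().__init__()
        self.original = original      # uncompressed layer
        self.compressed = compressed  # pruned/quantized/low-rank replacement
        self.register_buffer("alpha", torch.tensor(0.0))

    def set_alpha(self, alpha: float) -> None:
        # alpha = 0.0 -> purely original; alpha = 1.0 -> purely compressed.
        self.alpha.fill_(alpha)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (1.0 - self.alpha) * self.original(x) + self.alpha * self.compressed(x)


# Hypothetical usage: linearly ramp alpha over the fine-tuning steps.
layer = VanishingContribution(nn.Linear(512, 512), nn.Linear(512, 512))
total_steps = 1000
for step in range(total_steps):
    layer.set_alpha(step / (total_steps - 1))
    # ... forward pass, loss computation, optimizer step ...
```

Because both branches stay differentiable throughout, the network can adapt gradually instead of facing the discontinuous jump that direct replacement would cause; the actual weighting schedule used by VCON is not specified in the abstract.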