Activation Compression in LLMs: Theoretical Analysis and Efficient Algorithm
arXiv:2605.01255v1 Announce Type: new
Abstract: Training large language models (LLMs) is highly memory-intensive, as it must store not only the weights and optimizer states but also the intermediate activations needed for backpropagation. While existing memory…
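The abstract is truncated before the paper's algorithm is described, but the general idea behind activation compression is to save a lossy, low-bit copy of each activation during the forward pass and decompress it on demand during the backward pass, trading a small amount of gradient accuracy for memory. The following is a minimal illustrative sketch of that idea, not the paper's method: a hypothetical PyTorch autograd function (the name `CompressedReLU` and all details are assumptions) that stores an int8-quantized activation instead of the full-precision tensor.

```python
# Illustrative sketch of activation compression (NOT the paper's algorithm):
# the forward pass saves an int8-quantized activation for backward instead
# of the full-precision tensor, and the backward pass dequantizes it.
import torch

class CompressedReLU(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        y = torch.relu(x)
        # Per-tensor symmetric quantization of the activation kept for backward.
        scale = max(float(y.abs().max()), 1e-8) / 127.0
        q = torch.clamp((y / scale).round(), -128, 127).to(torch.int8)
        ctx.save_for_backward(q)   # ~4x smaller than a float32 copy
        ctx.scale = scale
        return y

    @staticmethod
    def backward(ctx, grad_out):
        (q,) = ctx.saved_tensors
        # Decompress on demand; quantization makes the gradient mask approximate,
        # which is exactly the accuracy/memory trade-off of lossy compression.
        y = q.to(grad_out.dtype) * ctx.scale
        return grad_out * (y > 0).to(grad_out.dtype)  # ReLU gradient mask

x = torch.randn(4, 16, requires_grad=True)
out = CompressedReLU.apply(x)
out.sum().backward()
print(x.grad.shape)  # torch.Size([4, 16])
```

In this sketch the memory saving comes from what is retained between the forward and backward passes; the forward output itself is still produced in full precision for downstream layers.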