cs.AI, cs.CL, cs.LG, cs.LO

Structural Sensitivity in Compressed Transformers: Relative Error Propagation and Layer Removal

arXiv:2603.20991v2 Announce Type: replace-cross
Abstract: Compressing transformer weights makes large language models cheaper to deploy. But each layer’s compression introduces an error. These errors accumulate as the signal passes through later layer…
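
The accumulation described in the abstract can be illustrated with a toy sketch. This is not the paper's method. It assumes a hypothetical residual stack of random `tanh` layers and uses small Gaussian weight noise as a stand-in for compression error. The function name `relative_error_by_layer` and the parameters `depth`, `width`, and `noise` are invented for illustration. The sketch perturbs every layer and records the relative error of the hidden state after each one.

```python
import math
import random


def _matvec(w, x):
    # Plain matrix-vector product on nested lists.
    return [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in w]


def _norm(v):
    # Euclidean norm of a list of floats.
    return math.sqrt(sum(a * a for a in v))


def relative_error_by_layer(depth=6, width=16, noise=0.05, seed=0):
    """Return ||x_cmp - x_ref|| / ||x_ref|| after each layer of a toy stack.

    Assumption: Gaussian weight noise stands in for compression error;
    real compression schemes (quantization, pruning, ...) behave differently.
    """
    rng = random.Random(seed)
    x_ref = [rng.gauss(0.0, 1.0) for _ in range(width)]
    x_cmp = list(x_ref)  # both paths start from the same input
    scale = 1.0 / math.sqrt(width)
    errors = []
    for _ in range(depth):
        # Reference weights for this layer.
        w = [[rng.gauss(0.0, scale) for _ in range(width)] for _ in range(width)]
        # "Compressed" weights: reference plus noise scaled by `noise`.
        w_cmp = [[w_ij + noise * rng.gauss(0.0, scale) for w_ij in row] for row in w]
        # Hypothetical residual update: x <- x + tanh(W x).
        h_ref = _matvec(w, x_ref)
        h_cmp = _matvec(w_cmp, x_cmp)
        x_ref = [a + math.tanh(h) for a, h in zip(x_ref, h_ref)]
        x_cmp = [a + math.tanh(h) for a, h in zip(x_cmp, h_cmp)]
        diff = [a - b for a, b in zip(x_cmp, x_ref)]
        errors.append(_norm(diff) / _norm(x_ref))
    return errors


if __name__ == "__main__":
    for layer, err in enumerate(relative_error_by_layer(), start=1):
        print(f"layer {layer}: relative error {err:.4f}")
```

Each layer's output error combines the error inherited from earlier layers with the error introduced by that layer's own perturbed weights, which is the propagation effect the abstract refers to. Setting `noise=0` reproduces the reference path exactly, which gives a simple sanity check.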