BASIS: Balanced Activation Sketching with Invariant Scalars for “Ghost Backpropagation”
arXiv:2604.16324v1 Announce Type: new
Abstract: The activation memory required for exact backpropagation scales linearly with network depth, context length, and feature dimensionality, forming an O(L * BN) spatial bottleneck (where B is the sequence-…