FOAM: Blocked State Folding for Memory-Efficient LLM Training
arXiv:2512.07112v2
Abstract: Large language models (LLMs) have demonstrated remarkable performance due to their large parameter counts and extensive training data. However, their scale leads to significant memory bottlenec…