Does Your Optimizer Care How You Normalize? Normalization-Optimizer Coupling in LLM Training
arXiv:2604.01563v1 Announce Type: cross
Abstract: In LLM training, normalization layers and optimizers are typically treated as independent design choices. In a 3×2 factorial study at 1B parameters over 1000 training steps, we show that this assumption can fail: …
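The 3×2 factorial design can be sketched as an enumeration over the two factors, normalization layer and optimizer. This is a minimal illustrative sketch only: the truncated abstract does not name the specific levels, so the normalization and optimizer variants below are hypothetical placeholders.

```python
from itertools import product

# Hypothetical factor levels -- the abstract reports a 3x2 factorial
# (normalization x optimizer) but does not say which variants were used.
NORMALIZATIONS = ["layernorm", "rmsnorm", "no_norm"]  # assumed 3 levels
OPTIMIZERS = ["adamw", "sgd"]                         # assumed 2 levels

def factorial_configs(norms, opts):
    """Enumerate every normalization/optimizer pairing in the grid."""
    return [{"normalization": n, "optimizer": o} for n, o in product(norms, opts)]

configs = factorial_configs(NORMALIZATIONS, OPTIMIZERS)
print(len(configs))  # 3 * 2 = 6 training runs
```

Testing for a normalization-optimizer interaction then amounts to comparing outcomes across all six cells rather than across each factor in isolation.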