cs.AI

Preservation Is Not Enough for Width Growth: Regime-Sensitive Selection of Dense LM Warm Starts

arXiv:2604.04281v1
Abstract: Width expansion offers a practical route to reuse smaller causal-language-model checkpoints, but selecting a widened warm start is not solved by zero-step preservation alone. We study dense width growth …