Grokking or Glitching? How Low-Precision Drives Slingshot Loss Spikes
arXiv:2605.06152v1 Announce Type: cross
Abstract: Deep neural networks exhibit periodic loss spikes during long-term unregularized training, a phenomenon known as the “Slingshot Mechanism.” Existing work usually attributes this to intrinsic optimizati…