cs.LG, math.ST, stat.ML, stat.TH

Perturbation is All You Need for Extrapolating Language Models

arXiv:2605.04344v1 Announce Type: new
Abstract: We introduce a simple yet powerful framework for training large language models. In contrast to standard autoregressive next-token prediction conditioned on an exact prefix, we propose a perturbation-based…