cs.LG

A geometric relation of the error introduced by sampling a language model’s output distribution to its internal state

arXiv:2605.04899v1 Announce Type: new
Abstract: GPT-style language models are sensitive to single-token changes at generation points where the predicted probability distribution is spread across multiple tokens. Viewing this sensitivity as a geometric…