Konstantin Krestnikov

Truth as a Compression Artifact in Language Model Training

Konstantin Krestnikov / April 7, 2026

arXiv:2603.11749v3 Announce Type: replace
Abstract: Why do language models trained on contradictory data prefer correct answers? In controlled experiments with small transformers (3.5M–86M parameters), we show that this preference tracks the compress…
