cs.AI, cs.CL

Truth as a Compression Artifact in Language Model Training

arXiv:2603.11749v3 Announce Type: replace
Abstract: Why do language models trained on contradictory data prefer correct answers? In controlled experiments with small transformers (3.5M–86M parameters), we show that this preference tracks the compress…