Amelie Knecht, Lucas Florin, Thilo Hagendorff

Evaluation Awareness in Language Models Has Limited Effect on Behaviour

May 8, 2026

arXiv:2605.05835v1
Abstract: Large reasoning models (LRMs) sometimes note in their chain of thought (CoT) that they may be under evaluation. Researchers worry that this verbalised evaluation awareness (VEA) causes models to adapt th…
