cs.CL, cs.CY

Evaluation Awareness in Language Models Has Limited Effect on Behaviour

arXiv:2605.05835v1 Announce Type: new
Abstract: Large reasoning models (LRMs) sometimes note in their chain of thought (CoT) that they may be under evaluation. Researchers worry that this verbalised evaluation awareness (VEA) causes models to adapt th…