Amelie Knecht - Provide.ai

Verbalised evaluation awareness in language models has little effect on their behaviour

Amelie Knecht / May 12, 2026

TL;DR: We provide evidence that the presence of verbalised evaluation awareness (VEA) in chains of thought (CoTs) does not imply evaluation gaming. We tested this across 8 open-weight large reasoning models (LRMs) and 4 benchmarks (safety, alignment, moral dilemmas, political opinion) by comparing answ…
