Shahriar Golchin, Marc Wetter

Intent Laundering: AI Safety Datasets Are Not What They Seem

Shahriar Golchin, Marc Wetter / April 24, 2026

arXiv:2602.16729v3
Abstract: We systematically evaluate the quality of widely used adversarial safety datasets from two perspectives: in isolation and in practice. In isolation, we examine how well these datasets reflect r…
