Alexandra Souly, Robert Kirk, Jacob Merizian, Abby D'Cruz, Xander Davies

UK AISI Alignment Evaluation Case-Study

Alexandra Souly, Robert Kirk, Jacob Merizian, Abby D'Cruz, Xander Davies / April 2, 2026

arXiv:2604.00788v1 Announce Type: new
Abstract: This technical report presents methods developed by the UK AI Security Institute for assessing whether advanced AI systems reliably follow intended goals. Specifically, we evaluate whether frontier model…
