The AI risk repository: A meta-review, database, and taxonomy of risks from artificial intelligence

arXiv:2408.12622v3 Announce Type: replace-cross Abstract: Artificial intelligence (AI) is reshaping society, from video generation to medical diagnosis, coding agents to autonomous vehicles. Yet researchers, policymakers, and technology companies lack shared terminology for discussing AI risks. Consider "privacy": one framework uses this term to describe a model's ability to leak sensitive training data, while another uses it to mean freedom from government surveillance. Conversely, researchers have introduced "Goodhart's law," "specification gaming," "reward hacking," and "mesa-optimization" to describe the same phenomenon of AI systems optimizing for measured proxies rather than intended goals. This terminological diversity creates friction: comparing findings across studies requires mapping between frameworks, and comprehensive risk coverage requires consulting multiple taxonomies that use different organizing principles. This paper addresses this challenge by creating a comprehensive catalog of AI risks. We systematically analyzed every major AI risk framework published to date (74 frameworks containing 1,725 distinct risks) and organized them into a unified system. Our two classification systems reveal important patterns: contrary to common assumptions, human decisions cause nearly as many AI risks (38%) as the AI systems themselves (42%). The work provides practical tools for anyone working on AI safety, from developers conducting risk assessments to policymakers writing regulations to auditors evaluating AI systems. By establishing a common reference point, this repository creates the foundation for more coordinated and comprehensive approaches to managing AI's risks while realizing its benefits.
