Safety & Alignment
Frontier Model Forum
We’re forming a new industry body to promote the safe and responsible development of frontier AI systems: advancing AI safety research, identifying best practices and standards, and facilitating information sharing among policymakers and industry.
Moving AI governance forward
OpenAI and other leading labs reinforce AI safety, security, and trustworthiness through voluntary commitments.
Insights from global conversations
We are sharing what we learned from our conversations across 22 countries, and how we will be incorporating those insights moving forward.
Governance of superintelligence
Now is a good time to start thinking about the governance of superintelligence—future AI systems dramatically more capable than even AGI.
Language models can explain neurons in language models
We use GPT-4 to automatically write explanations for the behavior of neurons in large language models and to score those explanations. We release a dataset of these (imperfect) explanations and scores for every neuron in GPT-2.
Our approach to AI safety
Ensuring that AI systems are built, deployed, and used safely is critical to our mission.
Planning for AGI and beyond
Our mission is to ensure that artificial general intelligence—AI systems that are generally smarter than humans—benefits all of humanity.
How should AI systems behave, and who should decide?
We’re clarifying how ChatGPT’s behavior is shaped and our plans for improving that behavior, allowing more user customization, and getting more public input into our decision-making in these areas.