Publication

Detecting and reducing scheming in AI models

OpenAI News / September 17, 2025

Apollo Research and OpenAI developed evaluations for hidden misalignment (“scheming”) and found behaviors consistent with scheming in controlled tests across frontier models. The team shared concrete examples and stress tests of an early method to redu…

Publication

Collective alignment: public input on our Model Spec

OpenAI News / August 27, 2025

OpenAI surveyed over 1,000 people worldwide on how AI should behave and compared their views to our Model Spec. Learn how collective alignment is shaping AI defaults to better reflect diverse human values and perspectives.

Publication

Accelerating life sciences research

OpenAI News / August 22, 2025

Discover how a specialized AI model, GPT-4b micro, helped OpenAI and Retro Bio engineer more effective proteins for stem cell therapy and longevity research.

Publication

GPT-5 System Card

OpenAI News / August 7, 2025

This GPT-5 system card explains how a unified model routing system powers fast and smart responses using gpt-5-main, gpt-5-thinking, and lightweight versions like gpt-5-thinking-nano, optimized for different tasks and developer use.

Publication

ChatGPT agent System Card

OpenAI News / July 17, 2025

OpenAI’s agentic model unites research, browser automation, and code tools with safeguards under the Preparedness Framework.

Publication

Introducing HealthBench

OpenAI News / May 12, 2025

HealthBench is a new benchmark that evaluates AI models in realistic healthcare scenarios. Built with input from 250+ physicians, it aims to provide a shared standard for model performance and safety in health.

Publication

OpenAI o3 and o4-mini System Card

OpenAI News / April 16, 2025

OpenAI o3 and OpenAI o4-mini combine state-of-the-art reasoning with full tool capabilities—web browsing, Python, image and file analysis, image generation, canvas, automations, file search, and memory.

Publication

Our updated Preparedness Framework

OpenAI News / April 15, 2025

Sharing our updated framework for measuring and protecting against severe harm from frontier AI capabilities.

Publication

PaperBench: Evaluating AI’s Ability to Replicate AI Research

OpenAI News / April 2, 2025

We introduce PaperBench, a benchmark evaluating the ability of AI agents to replicate state-of-the-art AI research.