A model that could reshape drug discovery. A cyber variant built for defenders. And the week that proved AI has become genuinely dangerous territory — in more ways than one.
Last week, every major AI story came from one company.
GPT-Rosalind — a purpose-built life sciences model — hit benchmarks that put it above the 95th percentile of human experts on novel biological data.
GPT-5.4-Cyber gave vetted security professionals access to a model with the restrictions turned down. The Agents SDK got an overhaul that makes agentic infrastructure significantly easier to deploy. And then, on a Friday at 4 am, a 20-year-old man from Texas threw a Molotov cocktail at Sam Altman’s house.
These are not separate stories. They’re the same story told at different temperatures.

GPT-Rosalind: Why drug discovery needs its own model
The problem with applying general AI to life sciences isn’t intelligence. It’s workflow. A researcher working on a new gene therapy has to survey hundreds of recent papers, identify patterns in protein structures, design a cloning protocol, and predict how a specific RNA sequence will behave in a cell — each step requiring different tools, different databases, and significant domain expertise. General models can help with individual pieces. They struggle with the chain.
GPT-Rosalind is built for modern scientific work across published evidence, data, tools, and experiments. OpenAI says it delivers the best performance on tasks that require reasoning over molecules, proteins, genes, pathways, and disease-relevant biology, and that it is more effective at using scientific tools and databases in multi-step workflows: literature review, sequence-to-function interpretation, experimental planning, and data analysis.

The number that actually matters comes from the Dyno Therapeutics evaluation. The model was evaluated on RNA sequence-to-function prediction using unpublished sequences, data that had never been part of any public training set, which rules out memorization as a confounding factor. The model’s best-of-ten submissions ranked above the 95th percentile of human experts on prediction tasks and reached the 84th percentile for sequence generation. 95th percentile. On data the model had never seen before. That’s not benchmark optimization; that’s a model that is genuinely useful in a real scientific workflow, on novel problems, against expert competition.
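To make that concrete, here is a minimal sketch of what a sequence-to-function query might look like through OpenAI’s standard Responses API. Everything model-specific here is an assumption: the identifier gpt-rosalind is hypothetical, and the real model sits behind the gated enterprise access described below.

```python
# Hypothetical sketch: querying a gated life-sciences model for
# RNA sequence-to-function prediction. The model identifier
# "gpt-rosalind" is an assumption, not a published API name.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

rna_sequence = "AUGGCUACGGAGCUUCGGAGCUAG"  # toy sequence for illustration

response = client.responses.create(
    model="gpt-rosalind",  # hypothetical identifier
    input=(
        "Predict the likely functional behavior of this RNA sequence "
        "in a mammalian cell, and rank the three most plausible "
        f"regulatory mechanisms:\n{rna_sequence}"
    ),
)

print(response.output_text)
```

The interesting part is not the prompt. It is that domain-specialized reasoning would ride on the same interface developers already use, which is what makes the multi-step workflows described above tractable.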
OpenAI is working with Amgen, Moderna, Thermo Fisher Scientific, the Allen Institute, and Los Alamos National Laboratory on applications ranging from drug candidate identification to protein and catalyst design.
These aren’t pilot agreements. They’re production partnerships in domains where the stakes are measured in years of human life.
OpenAI is calling Rosalind the first model in a Life Sciences series and views it as the beginning of a long-term commitment to building AI that can accelerate scientific discovery across human health and broader biological research. The access is gated — qualified enterprise customers in the US only, with governance requirements and beneficial use verification. That’s not just liability management. A model this capable of reasoning about biological sequences carries biosecurity implications that general models don’t. OpenAI is being deliberate about that. The same pattern as Mythos, applied to a different domain.
Drug development in the US typically takes 10 to 15 years from target discovery to regulatory approval. Most of that time is spent not in breakthrough moments but in painstaking analytical work — sifting through literature, designing reagents, interpreting complex biological data. Rosalind is aimed directly at the front of that pipeline. If the early signal holds, the compounding effect on development timelines could be significant.

GPT-5.4-Cyber: the Glasswing response
OpenAI watched Anthropic launch Project Glasswing with 12 hand-picked partners and $100 million in compute credits, and came back a week later with a different answer to the same problem.
GPT-5.4-Cyber is a variant of GPT-5.4 that lowers the refusal boundary for legitimate cybersecurity work and enables new capabilities for advanced defensive workflows, including binary reverse engineering: analyzing compiled software for malware potential, vulnerabilities, and security robustness without needing access to source code. That last capability matters. Most AI security tools require source code, but the majority of real-world security work involves compiled binaries, software whose source is proprietary, lost, or simply never shared. Binary reverse engineering is a specialized, time-intensive skill. A model that can do it reliably changes the speed at which defenders can work.
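For a sense of what that workflow could look like, here is a hedged sketch of a defensive binary-triage loop: disassemble locally, then hand the listing to the model for a first pass. The identifier gpt-5.4-cyber is an assumption, and real access would sit behind the identity verification described below.

```python
# Hypothetical sketch of a defensive binary-triage workflow:
# disassemble a compiled binary with objdump, then ask a
# security-tuned model to flag suspicious patterns. The model
# identifier "gpt-5.4-cyber" is an assumption.
import subprocess

from openai import OpenAI

client = OpenAI()

# Disassemble the binary locally (objdump ships with GNU binutils).
disassembly = subprocess.run(
    ["objdump", "-d", "--no-show-raw-insn", "./suspect_binary"],
    capture_output=True, text=True, check=True,
).stdout[:20_000]  # truncate so the prompt stays a manageable size

response = client.responses.create(
    model="gpt-5.4-cyber",  # hypothetical identifier
    input=(
        "You are assisting with defensive malware triage. Review this "
        "x86-64 disassembly and list any indicators of packing, "
        "anti-debugging tricks, or suspicious syscall usage:\n\n"
        + disassembly
    ),
)

print(response.output_text)
```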
The philosophical difference between Anthropic and OpenAI here is worth being precise about. Anthropic built a coalition of 12 organizations and gave them exclusive access to the most capable model they’ve built. OpenAI is building a tiered system that scales to thousands. OpenAI says it aims to make tools “as widely available as possible while preventing misuse” through identity verification and monitoring systems rather than manual gatekeeping decisions. Neither approach is obviously correct. Anthropic bets that the capability is dangerous enough to require very tight control and maximum defensive leverage per unit of access. OpenAI bets that democratized access to defenders — at scale, with verification — produces better aggregate security outcomes than concentrating capability in a small coalition.
Codex Security has already contributed to fixing over 3,000 critical and high-severity vulnerabilities, and through Codex for Open Source, more than 1,000 open-source projects have received free security scans. Those numbers suggest the broad-access approach is generating real defensive value. The question is whether it also generates proportional risk, and that question won’t have a clear answer for months.
The Agents SDK: infrastructure that enables everything else
Less visible than the model launches, the Agents SDK update is doing something important in the background: making it significantly easier to deploy AI agents that can actually operate in the real world.
The update introduces a model-native harness that allows agents to work across files and tools directly on a computer, a new sandbox execution environment, and configurable memory and orchestration systems. Before this, developers building agentic systems had to manage most of that infrastructure themselves. Now OpenAI is handling it natively.
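A rough sketch helps here. The code below uses the Agents SDK’s existing public pattern, an agent plus a function tool executed by the Runner; the new harness, sandbox, and memory features would presumably layer onto this same structure as configuration, though the exact parameters are an assumption until the documentation lands.

```python
# Minimal sketch using the OpenAI Agents SDK's existing pattern:
# an agent with one function tool, executed synchronously by the
# Runner. How the new sandbox and memory options attach to this
# is an assumption; they are not shown here.
from agents import Agent, Runner, function_tool

@function_tool
def read_file(path: str) -> str:
    """Return the contents of a local text file."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

agent = Agent(
    name="file-analyst",
    instructions="Answer questions about local files the user points to.",
    tools=[read_file],
)

result = Runner.run_sync(agent, "Summarize the TODOs in ./notes.txt")
print(result.final_output)
```

The point of the update is that agent code stays this short while the surrounding infrastructure, file access, sandboxed execution, and memory, is handled natively rather than hand-wired by the developer.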
The business logic is clear — more agentic infrastructure inside OpenAI’s ecosystem means more token consumption, more dependency, more stickiness. The practical effect is equally clear: the barrier to building serious agentic systems just dropped. Tools that previously required significant engineering investment to wire together can now be deployed faster, with better built-in security and memory handling.
Both the Rosalind and GPT-5.4-Cyber launches are only as useful as the agentic infrastructure underneath them. A model that can reason about biological sequences needs to actually query the databases, run the analyses, and return results in context. A security model doing binary analysis needs persistent state across long workflows. The SDK update is the plumbing that makes both of those things work at scale.
The attack that changed the register
At 4 am on April 10, Daniel Moreno-Gama, 20 years old, from Spring, Texas, was caught on surveillance video throwing an incendiary device at the gate of Sam Altman’s San Francisco home. He was arrested an hour later outside OpenAI’s headquarters, where he was attempting to break through the glass doors. He had kerosene in his backpack and additional incendiary devices on his person. When arrested, he was carrying a document he had written expressing opposition to AI and the executives of AI companies. The document discussed AI’s purported risk to humanity and “our impending extinction.” He had a list of names and addresses of board members, CEOs of AI companies, and investors.

Moreno-Gama faces charges including two counts of attempted murder, one for Altman and one for a security guard at the residence, along with attempted arson and federal charges for possession of an unregistered firearm and destruction of property by means of explosives. US prosecutors indicated they may treat the attack as domestic terrorism if evidence shows it was intended to influence public policy. No one was injured. That’s what happened.
What it means is harder to write about without reaching for easy framings. This is not “AI backlash gone violent” as a category. It’s one person’s extreme act. But it doesn’t exist in a vacuum either. The week it happened, Anthropic was running a controlled model that found thousands of critical security vulnerabilities. OpenAI was launching a model aimed at accelerating drug discovery. Google was teasing a goal-driven coding agent. The pace of capability development is genuinely alarming to a lot of people — researchers, policy experts, and clearly some members of the public — and that fear is not irrational even if this response to it was.
Altman posted a photo of his husband and their toddler hours after the attack. “Normally we try to be pretty private, but in this case I am sharing a photo in the hopes that it might dissuade the next person from throwing a Molotov cocktail at our house, no matter what they think about me.” He added that “fear and anxiety about AI is justified” but called for de-escalating the rhetoric. That’s a reasonable response to an unreasonable week. The fear part, at least, is calibrated correctly. AI is moving faster than most institutions can track it. People who are worried about that are not wrong to be worried. The question is what a legitimate, productive version of that concern looks like — and what happens to the policy conversation when it gets contaminated by events like this one.
The pattern underneath all four stories
GPT-Rosalind is access-controlled because biological reasoning at this level has biosecurity implications. GPT-5.4-Cyber is tiered and verified because security models are dual-use by definition. The Agents SDK is the infrastructure that makes both of them deployable. And somewhere in the middle of all of it, a 20-year-old drove from Texas to San Francisco to try to kill a tech executive over what AI might become.
The common thread is stakes. AI moved from “interesting research” to “infrastructure that affects biological research pipelines, national cybersecurity posture, and how software gets built” — and it moved there fast enough that the social, regulatory, and emotional infrastructure hasn’t caught up.
The labs are doing their own version of catching up: restricted access programs, tiered verification, coordinated disclosure, partner coalitions. Whether that’s enough, or whether it’s the right shape of response, is a conversation the industry is having mostly with itself.
That conversation needs to get louder, involve more people, and produce clearer answers faster than it currently does. Because the capability isn’t waiting.