I see a lot of AI safety strategies that don’t fully engage with the complexity of the real world—and therefore are unlikely to succeed in the real world.
To take a simple example: many strategies rely heavily on government playing a leading role through regulation and perhaps even nationalization. That’s a reasonable strategy in the abstract, but the recent conflict between DoW and Anthropic raises serious questions about the real-world viability of that approach. Too many people are stuck thinking about some idealized government they’d like to have, rather than the government we actually have in 2026.
My thinking about AI safety strategy is anchored by six foundational beliefs about the world in which that strategy has to operate:
- Timelines are probably short
- Many open questions have been resolved
- The future is high variance
- We need a portfolio of strategies
- It’s all about the game theory
- Expect tough tradeoffs
1: Timelines are probably short
I believe timelines are probably short, and that belief drives much of my thinking about safety strategy. For the sake of this article, I'm going to go with Daniel Kokotajlo's most recent timeline:
- 25% chance of AGI by the end of 2027, and 50% by the end of 2029
- 50% chance of superintelligence by the end of 2030
Humanity’s fate will likely be sealed—for better or for worse—no later than the arrival of superintelligence. There is a substantial chance that the decisions that determine humanity’s future will be made within the next 4 years.
It follows that choosing and implementing a strategy is urgent, at both a personal and a global level. A less obvious consequence of short timelines is that we now know a great deal about the world in which the AI transition will occur.
2: Many open questions have been resolved
Ten years ago, many questions about AI strategy and governance were necessarily abstract. It was useful, in those days, to ask “when America navigates the AI transition, what role should government take and what is the purview of the private sector?”
In 2026, that conversation is much more concrete: “what AI decisions are best made by the Trump administration, and what decisions should be left to Dario Amodei and Sam Altman?”
Given short timelines, all of the following are likely to be true during the development of AGI:
- The US will be governed by the Trump administration
- China will be governed by Xi Jinping
- AGI will be developed by Anthropic, OpenAI, or Google DeepMind
- The rules-based international order will be essentially non-functional
- International trust and cooperation will be at a generational low
- In the US, AI politics will be heavily entangled with populism, distrust of big tech, and concern about jobs
3: The future is high variance
We know a lot about the world, but it is simultaneously true that the future is high variance:
- China may or may not invade Taiwan, disrupting America’s main source of new compute
- If China invades Taiwan, the US and China may or may not engage in a shooting war in the Western Pacific
- The US may be led by a president with an iron grip on power, or one who is crippled by an antagonistic Congress controlled by the opposition
- The US government may or may not try to destroy America’s leading AI lab
- The US and Europe may or may not be allies
Each of those contingencies has profound implications for AI strategy, and each one is highly unpredictable. It is therefore not possible to come up with a single, fixed plan that will work well in all possible future worlds.
4: We need a portfolio of strategies
An international treaty to pause AI development might be a great option in some worlds, but isn't realistic if the US and China are at war. And a plan to nationalize AI might be feasible if the Republicans keep control of Congress in the midterms, but not if Congress is at loggerheads with the executive branch.
In a simpler world, it might be possible to devise the One True Plan that would guarantee humanity’s survival no matter what. That isn’t possible in this world: there are simply too many unknowns. We therefore need to develop and pursue a portfolio of different strategies. Some strategies (like greater transparency requirements for AI labs) will be useful in many possible futures, while others (designing verification protocols for a pause treaty) will be vital in some futures but irrelevant in others.
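To make the portfolio logic concrete, here is a toy expected-value sketch in Python. Every number in it is invented for illustration; the point is only the structure of the argument: when no single strategy dominates in every scenario, a portfolio that lets us lean on whichever strategy fits the world we actually get outperforms committing to any one plan in advance.

```python
# Toy expected-value model of the portfolio argument.
# All probabilities here are invented for illustration.

# Hypothetical probability of each future scenario.
scenarios = {"us_china_war": 0.25, "cold_peace": 0.45, "cooperation": 0.30}

# Hypothetical P(strategy succeeds | scenario) for three strategies
# drawn from the examples above.
strategies = {
    "pause_treaty": {"us_china_war": 0.00, "cold_peace": 0.20, "cooperation": 0.60},
    "transparency": {"us_china_war": 0.30, "cold_peace": 0.40, "cooperation": 0.50},
    "nationalize":  {"us_china_war": 0.40, "cold_peace": 0.10, "cooperation": 0.20},
}

def expected_success(name):
    """Expected success of committing to a single strategy in advance."""
    return sum(p * strategies[name][s] for s, p in scenarios.items())

for name in strategies:
    print(f"{name}: {expected_success(name):.3f}")

# A portfolio prepares every strategy, then leans on whichever one fits
# the scenario that actually arrives.
portfolio = sum(p * max(strategies[n][s] for n in strategies)
                for s, p in scenarios.items())
print(f"portfolio: {portfolio:.3f}")  # beats every single-strategy plan
```

With these made-up numbers, transparency is the best single bet at 0.405, but the portfolio reaches 0.460 because a different strategy carries each scenario.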
Naturally, different people and organizations will have different areas of expertise, and will choose to focus in different places. That diversity is vital for maximizing our chance of success no matter what the next few years throw at us.
5: It’s all about the game theory
Contrary to the naive vision many of us had 10 or 20 years ago, AGI will come of age in a complex political landscape. Multiple countries, companies, and individuals have key decision-making roles, and many of them are driven by complex motivations that do not necessarily prioritize humanity's long-term flourishing.
For example: Donald Trump and Xi Jinping are both old enough that their personal chance of survival is likely maximized by proceeding quickly to AGI (and therefore longevity medicine), even if doing so entails a significant risk of human extinction. Any attempt to pause AI development needs to contend with the fact that for those two key actors, a significant pause might be a death sentence.
Any useful strategy needs to fully engage with that challenging reality. It isn’t enough to have a plan that would guarantee humanity’s survival if everyone adopts it: you need to have a robust strategy for ensuring the key actors are motivated to enact your plan.
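As a hedged illustration of that point, here is a toy two-player payoff matrix in Python. All of the payoffs are invented, and real incentives are far messier, but the sketch shows the structure any pause proposal has to overcome: if racing to AGI dominates pausing for each key actor individually, mutual racing is the equilibrium even when mutual pause would be better for humanity.

```python
# Toy two-player game (all payoffs invented) illustrating why key actors
# may race to AGI even when a mutual pause is better for humanity. Payoffs
# are (US, China) and fold in leaders' personal stakes in fast AGI.

payoffs = {
    ("pause", "pause"): (2, 2),  # safest for humanity, costly for actors with short horizons
    ("pause", "race"):  (0, 4),  # the pauser falls behind; the racer gets AGI first
    ("race",  "pause"): (4, 0),
    ("race",  "race"):  (1, 1),  # both race: more extinction risk, but neither falls behind
}

def best_response(player, opponent_move):
    """The move that maximizes `player`'s payoff against `opponent_move`."""
    idx = 0 if player == "US" else 1
    options = {}
    for move in ("pause", "race"):
        key = (move, opponent_move) if player == "US" else (opponent_move, move)
        options[move] = payoffs[key][idx]
    return max(options, key=options.get)

# Racing dominates pausing for both players under these payoffs, so
# (race, race) is the unique equilibrium: a classic prisoner's dilemma.
for player in ("US", "China"):
    for opp in ("pause", "race"):
        print(f"{player} best response to {opp}: {best_response(player, opp)}")
```

A strategy that merely points at the (pause, pause) cell doesn't change anyone's behavior; it has to change the payoffs themselves, or the actors' beliefs about them.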
6: Expect tough tradeoffs
I don’t love this, but that doesn’t mean it isn’t true:
- Any plan that entails a significant risk of human extinction is a bad plan
- There are no feasible plans that do not entail a significant risk of human extinction
- Therefore, our task is to pick the least bad plan from the available options
So what now?
Everything I’ve said here is compatible with a wide range of strategies—my purpose today is not to champion a specific strategy, but simply to establish a baseline of engagement with reality that any serious strategy ought to meet.