I want to start with this provocative claim: Stopping AI is easier than regulating AI.
I often hear people say “Stopping is too hard, so we should do XYZ instead,” where XYZ is some other form of regulation, such as mandating safety testing. It seems like the purpose of safety testing would be to stop building AIs if we can’t get them to pass the tests. So unless that’s not the purpose, or proponents are confident that we can get AIs to pass the tests (and, hopefully, confident that the tests work, which they quite likely do not…), this particular idea doesn’t make a lot of sense. But people might in general think that we can instead regulate the way AI is used, or something like that.
But I think this line of argument gets it exactly backwards. Stopping AI is easier than regulating it.
Why? Well, let’s dive in. First, I need to explain what I mean…
I mean, specifically, that stopping AI is an easier way to reduce the risks from AI to an acceptable level than other approaches to regulating AI.
The way I imagine stopping AI is actually a particular form of regulating AI, specifically via an international treaty along the lines of Systematically Dismantling the AI Compute Supply Chain.
Also, what do I mean by “easier”? There are a few ways in which stopping is hard. I’d separate technical and incentive challenges from political challenges, and I’m setting aside the political challenges, because I think we should first be clear about what should happen and why, and then seek to accomplish it politically.
Besides politics, the main underlying issue preventing meaningful AI regulation is international competition, especially between the US and China.1 So what I mean is that stopping AI is the most effective way to address this key barrier to international cooperation, which is necessary to reduce AI risks to an acceptable level.
I believe in the fundamentals of AI, and I believe alignment is not doomed, so I believe that AI could indeed end up giving one nation or company control over the future. It’s still not clear that it’s rational to race to build AI, given the risks involved. But it seems hard for me to imagine a stable situation unless governments are confident that their adversaries aren’t building super powerful AI in secret.
Proposals to govern super powerful AI internationally while still building it suffer from a bunch of challenges that stopping it doesn’t. In short, approaches that instead try to regulate the development or use of AI to ensure it is safe and beneficial are harder to monitor and enforce, and hence more likely to fail.
Challenges
Let’s consider a hypothetical agreement between the US and China (leaving out the other countries for simplicity), and consider some of these challenges in detail.
Monitoring hardware
Suppose you have an agreement that allows AI to proceed in some particular “authorized” directions. How do you verify compliance? This basically boils down to: How can you be sure that no significant fraction of the world’s computing power is being used in unauthorized ways? This seems hard for a few reasons:
How can you be sure you know where all the computer chips are? This is a problem in any case, but it’s more of a problem if you keep making more chips and keep around the factories that make them. Right now, we know where a ton of the chips are -- they’re in data centers, which are easy to spot. But what’s to stop countries from secretly siphoning off some chips here and there? Or building a secret factory to produce more secret chips? We can certainly try to monitor for such things, but there’s an ongoing risk of failure. What happens when a shipment of chips goes missing unexpectedly? If the US (e.g.) actually lost them (and wasn’t secretly using them), China would have to trust that that is the case, or the agreement might collapse. In general, whenever monitoring breaks down, the “enforcement clock” starts ticking, and enforcement could easily and quickly escalate to war.
As chip manufacturing technology advances and it becomes easier and easier to build or acquire a dangerous amount of computing power, it also becomes harder and harder to be sure that nobody has done so secretly.
We need to agree on which uses of computing power, i.e. which computations, are and are not authorized.
One solution commonly proposed is a whitelist allowing existing AI models to be used, but prohibiting further training that would make AI more powerful. Note that this is now essentially a form of stopping AI, but it’s not clear if it goes far enough.
One problem with this is that it’s possible to use AIs to drive AI progress, even if you never “train” them, e.g. by automating research into developing better tools and ways of using the AI in combination with other tools. If we analogize the AI to a person: You could make that person vastly more powerful by giving them new tools and instruction manuals, even if you don’t teach them new concepts.
We could try to further restrict which queries of AI are authorized. But it seems possible to decompose arbitrary queries into authorized queries, and it might be easier to hide this activity than to detect it.
The problem is much harder if you need to continuously update the list of authorized computations, or wish to use a blacklist instead of a whitelist. Then you get into the problem of agreeing on standards.
Agreeing on standards
If we move away from a static whitelist of authorized computations, we then need a process for determining which computations should be authorized. This is hard for a few reasons:
There is still a lot of technical uncertainty about how to do AI assurance to a high standard. Testing AI systems is largely a matter of vibes. For instance, there is no suite of tests where, if an AI passed those tests, we could conclude it was not going to “go rogue”.
In addition, for any particular test(s), AIs can be designed specifically to fool those tests. So both the US and China have an incentive to use tests that maximally advantage their own AI. One solution here might be to ensure that AI passes all the tests proposed by either side, but again, it might be easier to fool the tests the other side runs -- even without advance knowledge of which tests those would be -- than to create a reliable set of tests that cannot be fooled.
The US and China might have very different standards for what they consider to be “safe”, e.g. due to differences in values and priorities. I expect that such disagreements could be resolved, but they still create an extra challenge that could stall or sink negotiations.
In general, every point which requires some element of subjective judgment and negotiation is a potential point of failure.
Violations of the agreement
It’s going to be easier to violate the agreement if there are a bunch of AIs and AI chips around that are being used according to the agreement. You just say “we’re done with this treaty”, and then start doing whatever you want with the ones you control. There are proposals to make it technically difficult to use AI chips in ways that aren’t authorized, but they aren’t mature or tested, and it’s likely that the US and/or China could find ways to subvert such controls.
Once a violation occurs, the other side might need to intervene rapidly to protect themselves. In the current paradigm, training a new, more powerful AI might take months, but that’s not a comfortable amount of time for resolving a tense international security dispute. And if all that’s required to be a threat is for an adversary to “fine-tune” an existing AI, or use it in an unauthorized way, lead time might be measured in days -- or even seconds.
On the other hand, if the infrastructure needed to build dangerous AI systems does not exist in any form, and a violator would need to build up the compute supply chain again, this would probably give other parties years to negotiate an arrangement that undoes the violation and doesn’t involve war.
Conclusion
Summing things up, if you are concerned that stopping AI altogether might be too hard to enforce, you should only expect alternative approaches to international governance to be harder. From this point of view, alternative approaches add unnecessary complexity and fail the KISS (“Keep it simple, stupid”) design principle. They may provide more of an opportunity to capture the benefits of AI, but this doesn’t matter if they aren’t actually workable. If you believe international governance of AI is needed to reduce the risk to an acceptable level, the coherent points of view available seem to be:
1. We cannot regulate AI internationally in any substantive way.
2. Stopping AI is possible and would reduce the risk to an acceptable level, but this is also true of more nuanced approaches that allow us to capture more of the benefits.
3. Stopping AI is the only way to reduce the risk to an acceptable level.
I’m not sure which of these is right, but my money is on (3). Note that “Stopping AI is too hard, we need to regulate it in a different way instead” is not on the list.
But that claim -- that stopping AI is too hard -- is also often used, politically, as an argument for why pausing is impossible. And this means that addressing this concern is also a big way to address the political barriers to pausing.