Introduction
I posted a quick take a few days back claiming AI moratoriums won't work, and people seemed to really disagree with it. I honestly thought it was a commonly held opinion, but the more I engage with LessWrong, the more I realize the way I'm approaching things seems, at least to me, different.
So let me elaborate with a refined approach and a proper introduction. The problem of constraining AI development is a social problem and the way I believe we should treat society is as a sociophysical system determined by sets of natural societal laws. Society isn't a collection of people who collectively control what society does. Rather a society is a collection of people with qualified collective influence over society. My point summed up is there are things as a society which we simply can't do. Just like Jupiter can't suddenly start orbiting backwards due to the laws of physics, we can't change society in a manner that breaks the sociophysical dynamics of the system.
Think of this analogy: electrons are to physical systems as humans are to social systems. Electrons can behave very differently than how their encompassing physical system behaves. If we let the electron be part of say, a metal teaspoon, then if you throw the teaspoon against a wall it's going to collide and bounce off, probably with a loud jarring noise. But the electron, being a quantum particle, has a not insignificant probability of quantum tunneling through a thin wall. This is because electrons exist as complex probability clouds. The teaspoon can't quantum tunnel through the wall, it's restrained by physical laws of motion which exist at the macroscopic scale only.
Its the same with society. Humans are like electrons, existing with completely different governing laws than the larger object they compose. Assuming a simple relationship of humans being "in control", or even assuming some kind of human-society feedback loop, misses the point. These two things exist in different worlds altogether. Some macroscopic properties of society are not influenceable, as they exist as properties of what a society is.
An Ad-Hoc Model
Now let me get my hands dirty and justify my position on AGI moratoriums with a model of my own design created to get my point across formally. I worked on this over the course of maybe 2 days, so there's bound to be a mistake somewhere.
Generalizing, define a irreversible social rupture as a society-level event which causes a disregard of past social norms, laws, and customs. It is an event from which there is no going back. If such an event happens society is altered forever. There are endless examples:
- Alexander The Great's conquest of the mid-east
- Suez Canal Crisis in 1956
- The Invention of Gunpowder
- AGI
- Leibniz creating calculus (his notation is better so fight me Newton fans)
- etc.
Each of these events caused society to shift so significantly that returning to society pre-event was impossible. These are akin to irreversible processes from physics. You can't unbreak an egg just like you can't give the British back their empire.
Further, these events were instantiated sometimes by individuals, but most often by organizations. This is the viewpoint I will take for this model, using organizations as the atom, instead of individual people. Organizations are collections of people with clearly defined goals and methods, such that they persist beyond the individual members that make them up. An organization can still exist 100 years later, even if everyone who founded it is long dead.
Laying the Groundwork
Consider
We can break
Now let each organization choose to pursue social ruptures via an expected reward function
Where
is the reward for causing a social rupture is the cost of attempting and failing to cause a social rupture is the expected reward/cost from allowing another organization to cause the rupture first
Note
You can picture
Then
And we get
Likewise
We can now analyze dynamics. Assume we have an organization who is pursuing a social rupture. Then it's
And the problem reveals itself: as an organization grows in ability
An Example: Competition vs Cooperation
Lets consider two societies filled with organizations. Society A is one of ruthless competition, with the only way to benefit being to step on others. Society B is the exact opposite. These two configurations control the sign of
with organizations in society A having
Take organization
becomes
Since
For society
Note On Future Work
I'm going to keep playing around with the model to see what falls out. If I find anything striking I'll make another post documenting more examples.
Conclusion
We live in a mixed society, where some organizations experience net cooperation and others net competition. However when we look at the reality of the few large AI companies in a zero-sum race, it seems we have subsocieties that look a lot like society A from the previous example. The incentive to reach AGI first is massive, and some AI companies have already demonstrated cut-throat business practices, indicating they experience net competition. From this the model paints a dark picture depending on how close we actually are to an AI system which can cause an irreversible society-level event.
Now: AGI moratoriums. If I think they are impossible at this point then what do we do? We follow the sociophysical dynamics towards a branching point at which point we as humans do have the ability to affect the outcome. Think of it like a high entropy partition of the societal configuration space: the system is at an unstable point and can go in different directions. So, we lean into AGI research, hard, but take different approaches. LLM control is a dead end, we need a shift in perspective that recognizes we can't control LLMs as they are before one is created that can cause a society-level event from which there is no going back (looking at you Mythos). Instead divert funding towards, and in my opinion further subsidize, AGI research that promotes interpretability first. This maintains and strengthens the flow of capital, preventing economic collapse, while giving us a better shot at not going extinct.
We have to lean into the physics, not fight it.
Let me know if I made a mistake or a wrong turn somewhere. I'm open to changing my mind as well.
Discuss