Could Frontier AI Researchers Collectively Slow the Race? A Conditional Pledge Mechanism

Overview

This is a project proposal and early research on the question of whether and how frontier AI researchers (not the companies themselves) might take on personal risk and pledge to conditionally pause AI development. I am looking for feedback on whether there is a version of this that researchers might find palatable, and if so, what the details might look like. I am especially interested in hearing from people with experience in frontier AI development or in similar advocacy and outreach work. My specific questions are listed at the end of this post.

Many AI researchers have concerns about the risks of continuing the race toward AGI and ASI capabilities, yet continue to advance the research, for any number of reasons. The public may be drawing unwarranted inferences about the severity of the risks and harms from the fact that researchers in the frontier labs, the people who theoretically know the technology best, continue to work on it.

I hypothesize that at least part of the reason researchers continue to work on technology they have concerns about is that they feel powerless to change the trajectory on their own. I propose that, with the right mechanism in place, they could change it collectively. Here I propose a collective pledge mechanism: individual researchers would agree to temporarily step away from AI capabilities work for some number of months if enough of their peers across all frontier labs made the same pledge. The goal of such a work stoppage would be to slow the race and to call public and governmental attention to the urgency of governing AI in a way consistent with how societies and governments handle other significant risks to human safety and wellbeing.

My inspiration comes partly from the recent mirror life moratorium, in which synthetic biology researchers voluntarily agreed to halt a promising line of work they had pioneered, because of catastrophic risks they felt they could neither rule out nor control once the technology existed. I'm curious whether something analogous is possible in AI, despite the many differences between the two communities.

This is not something I am able to spearhead myself at this time. I offer it as a possible mechanism for action to anyone who might wish to pick it up, and I'm seeking feedback to make the proposal as useful as possible to the AI safety community and as actionable as possible for anyone who might want to implement it, including possibly myself at a future time.

The risks and harms motivating a need for action

The 2023 AI Impacts survey showed that many top AI researchers believe there is a non-negligible chance of “human inability to control future advanced AI systems causing human extinction or similarly permanent and severe disempowerment of the human species”. The same survey found even greater concern about other near-term catastrophic scenarios: 73% of respondents expressed substantial or extreme concern about both AI-enabled bioweapons and AI-assisted authoritarian control of populations, 79% about the manipulation of large-scale public opinion trends, and 86% about AI-enabled disinformation (e.g., via deepfakes).

Harms to humans from AI are already with us today. The AI Incident Database (AIID) collects reports of cases where AI systems have caused or nearly caused harm. In 2025 there were 362 reported incidents, up from 281 in 2024 and 168 in 2023. Reported harms include AI chatbots implicated in multiple teen suicides; algorithmic bias causing documented discrimination in hiring, lending, criminal justice, and healthcare; and AI-generated disinformation deployed across 30+ countries during their 2024 elections.

The proposed mechanism

Could frontier AI researchers organize a coordinated work stoppage, and would they want to? I believe that such an action would be best achieved in two steps:

  1. A survey of frontier AI lab employees to ascertain whether there is any version of the following conditional pledge that they would be willing to publicly sign on to, and if not, what the barriers to participation are: 

If at least N employees from each of the frontier AI labs internationally sign this statement, I will agree to walk away from my work on AI model advancement for a period of at least K months.

The intent of “walk away from my work” is to stop doing anything that advances AI capabilities. This may look like taking vacation, a sabbatical, or unpaid leave. It may look like walking out of work against policy. Or it may look like showing up but refusing to work on anything except AI safety projects.

  2. Assuming a palatable enough version of the pledge can achieve sufficient consensus, build a secure website to collect pledges. Pledgers would need to sign with their real names and verify their employment at one of the frontier labs; however, names would be held confidentially and publicized only if the trigger condition is met. 

The website would publicly post the tallies of conditional pledgers at each organization, with organization names anonymized (e.g., Org A: 27, Org B: 98, Org C: 125, etc.). This would serve two purposes. The first is to provide a safe but public reflection of the number of researchers willing to take a personally and professionally costly action to slow the race, which may have value even if the trigger condition is never met. The second is a practical indicator that lets pledgers know when the trigger condition is close, so that they can plan accordingly. 
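To make the mechanism concrete, here is a minimal sketch of the tally display and trigger check in Python. The threshold value, the example counts, and the anonymization approach are illustrative assumptions on my part, not a specification.

```python
# Minimal sketch of the public tally and trigger check described above.
# The threshold N, the example counts, and the anonymization scheme are
# illustrative assumptions, not a specification.

N = 100  # hypothetical per-lab threshold from the pledge statement

# Verified pledge counts per lab; pledger identities are held privately elsewhere.
pledge_counts = {
    "OpenAI": 27,
    "Anthropic": 98,
    "Google DeepMind": 125,
}

def public_tally(counts: dict[str, int]) -> dict[str, int]:
    """Report counts under anonymized labels (Org A, Org B, ...).

    A real site would fix a random lab-to-label mapping once and keep it
    stable; alphabetical order is used here only to make the example
    reproducible, and would be guessable in practice."""
    ordered = sorted(counts.items())
    return {f"Org {chr(ord('A') + i)}": n for i, (_, n) in enumerate(ordered)}

def trigger_met(counts: dict[str, int], threshold: int = N) -> bool:
    """The pledge triggers only when every listed lab reaches the threshold."""
    return all(n >= threshold for n in counts.values())

print(public_tally(pledge_counts))  # {'Org A': 98, 'Org B': 125, 'Org C': 27}
print(trigger_met(pledge_counts))   # False: not every lab has reached N
```

Note that labs differ enormously in headcount, so a single absolute N may need to become a per-lab or percentage threshold; the sketch treats that as an open design question.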

My initial thought on which labs to count as “frontier AI labs internationally” is: OpenAI, Anthropic, Google DeepMind, Meta AI, xAI, Alibaba, and DeepSeek. There are interesting questions about whether Western researchers would still participate if the Chinese labs were excluded, and about the extent to which Chinese researchers would be less willing or able to participate in something like this.

Theory of Change

At each stage of this project, there would be benefits that could help move us toward a pause in AI development, even if the walkout condition is never achieved. 

Survey 

We know from prior belief surveys that researchers have safety concerns. This project's proposed survey would go beyond beliefs to investigate the circumstances under which that concern becomes actionable. Ideally, it would identify a path forward for initiating the conditional pledge mechanism. In any case, it would identify the key barriers to researcher action, which AI safety organizations could then work to remove.

Conditional Pledge

Even before triggering, the number of people willing to sign the conditional pledge could raise public and government awareness more strongly than prior open letters have. It is a public expression of researchers' willingness to take a personally costly action to slow capabilities research, which is a stronger statement than merely asserting that research should be paused or slowed. 

Triggered Halt

If the condition is met and a halt is triggered, this would temporarily slow AI capabilities advancement, creating a concrete window for AI safety research to advance and buying a little more time for governments to act. It would also strengthen the case for government action by signaling unmistakably that AI researchers are deeply concerned about AI risks and harms.

Conditional Pledge never initiated or triggered

If no path is found to a conditional pledge that a significant number of researchers would be willing to sign, or if the pledge is initiated but never triggered, this could strengthen the case for government action in a different way: by providing evidence that researchers are reluctant to act, rebutting the argument that the industry should be left to self-regulate.

Technical and logistical challenges

Implementing something like this would have several technical and logistical challenges. Here are the ones that I foresee being the most pressing:

  1. How to get the proposal (or initial survey) in front of enough frontier lab employees
  2. How to validate that survey respondents are frontier lab employees without alerting their employers to their participation, and without associating responses with individual identities
  3. How to validate which frontier lab employs each pledger, in order to increment public counts, but without alerting employers or leaking the identity of any pledger until the trigger condition is met

I have some thoughts about each of these, but no real solutions. I may explore this further in a separate post. Ultimately I don’t think there is a zero-risk way to authenticate a work email address. 
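For challenge 3, one standard partial mitigation is a cryptographic commitment: the site publishes only a salted hash of each verified pledger's work email, and the pledger keeps the salt, revealing it only if the trigger fires. Below is a minimal sketch under that assumption; the function names are mine, and this does not solve the harder problem of verifying employment at signup without any exposure.

```python
import hashlib
import secrets

def make_commitment(work_email: str) -> tuple[str, str]:
    """Commit to a work email without revealing it.

    Returns (digest, salt). The site stores only the digest; the pledger
    keeps the salt. The random salt prevents dictionary attacks against
    guessable employee addresses."""
    salt = secrets.token_hex(32)
    digest = hashlib.sha256((salt + work_email.lower()).encode()).hexdigest()
    return digest, salt

def verify_commitment(digest: str, salt: str, work_email: str) -> bool:
    """If the trigger fires, the pledger reveals (salt, email), and anyone
    can confirm the pair matches the digest published at pledge time."""
    candidate = hashlib.sha256((salt + work_email.lower()).encode()).hexdigest()
    return secrets.compare_digest(candidate, digest)
```

Even with commitments, some trusted party or process still has to confirm at signup that the address belongs to a frontier lab, which is exactly the residual risk noted above.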

I also believe that the survey probably needs IRB (Institutional Review Board) approval. IRB review is a formal ethical oversight process required in the US for systematic research involving human subjects that is intended to produce generalizable, publishable findings. This survey likely qualifies because it collects information from potentially identifiable people about sensitive beliefs and intended behaviors, because participation may carry risks (particularly to continued employment), and because publishing results is part of the theory of change. An IRB would need answers to the questions about anonymization and data handling before approving the study. As a practical matter, this means the survey would need an institutional partner such as a university, think tank, or research organization with an existing IRB. 

A minor logistical issue: there should be a mechanism for pledgers to revoke their pledge if their circumstances change. In particular, they should revoke if they leave the lab under which they pledged. 

Anticipated barriers to participation

There are any number of reasons why researchers might not be willing to sign such a pledge. In the wake of the Center for AI Safety Statement on AI Risk, Daniel Eth wrote a Medium post on why researchers who are concerned about extinction from ASI don't just quit their jobs; the reasons include believing that their own work in particular is not dangerous, or that the benefits of AGI outweigh the risks. Many of those reasons could apply to an action like this one as well.

While this action doesn't call for researchers to quit their jobs, there is a real risk that by walking away against management's wishes, they could lose them. People may be understandably reluctant to take that risk for many reasons, including financial and career consequences and damage to their reputations. Some may not have the savings or liquidity to forgo their income even for the duration of the walkout. 

There are also potential visa consequences for the significant portion of researchers at American firms who are not US citizens. (MacroPolo's Global AI Talent Tracker suggests that about two-thirds of top AI talent in the US was educated elsewhere, mostly in China and India. Some will be naturalized or have permanent resident status, but those on visas such as the H-1B would need to find a new sponsor within 60 days of losing their sponsoring job.) 

Researchers at the Chinese labs may face additional barriers beyond those shared with their Western counterparts. In addition to the professional and financial risks, they may be subject to greater government scrutiny. They operate within a regulatory and cultural environment where collective action of this kind has fewer precedents, and may face consequences that extend beyond job loss. 

Finally, people may not be interested in signing on to such a pledge if they don’t believe it would have a real impact on improving AI outcomes.

Questions to the community

I welcome opinions, comments, and suggestions from the community. It would be helpful if commenters would state the context for their opinions, such as their relationship to the frontier AI research community. 

Some specific questions for discussion:

  • Is the conditional pledge mechanism the right frame, or is there a better one?
  • Is this the right version of the conditional pledge, or should the condition be of a different shape?
  • What evidence could you receive in the future that would make you much more likely to sign the pledge? 
  • Is there a worthwhile version of this that doesn’t rely on Chinese researchers being willing/able to participate?
  • Which barriers do you think are most significant and hardest to address?
  • What support structures could make participation more realistic? 
  • Do you have thoughts on how to resolve the technical/logistical challenges?
  • Is there an organization that might be interested and well-positioned to organize something like this?

Feel free to PM me if you'd like to discuss off the record.



