theory uplift differentially benefits safety & is underleveraged

[1] We will likely have near-superhuman mathematics AI by Q1 2027.^[1]

[2] Qualitatively, AI mathematics capabilities are developing significantly faster than automated AI R&D capabilities.^[2]

[3] Thus, we will likely have a period in which our ability to rigorously & usefully verify and understand model behavior and model outputs improves faster than capabilities themselves develop.

[4] Our ability to take advantage of this period is bottlenecked on the quality of our specification generation infrastructure, elicitation tooling (for proofs & specs etc.), and the institutional capacity for scaling useful outputs with capital.

[5] My understanding is that basically no one^[3] is working on building infra that can usefully turn >$100M of compute credits into safety-relevant mathematical output.

[5.1] The number of theory-driven ASI alignment efforts is also small, but growing. ARC is a much better bet now than it was in 2023.

[5.2] My understanding is also that no one is working on developing AI-powered conceptual tooling infrastructure for tackling problems in, for instance, metaphilosophy. This is a much harder problem.

[6] In worlds where alignment is easy, prosaic methods may scale. In worlds where alignment is harder than that, guarantees matter more. Hard worlds are less hard now because guarantees are easier to obtain. This is incredible and should be leveraged.

^[1] Concretely: I expect an OAI model to be a more efficient proof oracle than human mathematicians (bar perhaps a singular expert) for ~any well-specified mathematics problem by Q1 2027.
^[2] AI Futures' median forecasts for automated coder and superhuman AI researcher are June and December 2028, respectively.
^[3] I mean this in both the absolute sense and the relative sense.

Who comes to mind: Convergent is incubating a FRO focused on the specification generation problem, Theorem is working on formal verification, ARIA is funding formal methods broadly, Fulcrum is working on elicitation.

5.5 Pro estimates total investment in these orgs to be ~$105-145M, $79M of which was the Safeguarded AI mandate. Given we're expecting $37-100B/yr in philanthropic funding imminently, this is minuscule.

(The binding constraints are institutional capacity & talent, not capital.)
