Containment Verification: AI Safety Guarantees Independent of Alignment
arXiv:2605.09045v1
Abstract: Agentic frameworks are the software layer through which AI agents act in the world. Existing safety methods intervene on the model and therefore remain conditional on unverifiable properties of learned b…