MachineLearning

Adversarial testing of Minimus OpenClaw: agent discovered and exploited its own tool documentation to escape sandbox, modify production config, and contact real users [R]

We ran 635 security tests against a hardened AI gateway (Minimus OpenClaw). Sandbox on. Tool restrictions configured. Access controls in place. 131 tests failed. Then it got worse. The agent read its own documentation, found a parameter that let it run…