From Controlled to the Wild: Evaluation of Pentesting Agents for the Real-World
arXiv:2605.10834v1 Announce Type: new
Abstract: AI pentesting agents are increasingly credible as offensive security systems, but current benchmarks still provide limited guidance on which agents will perform best against real-world targets. Existing evaluation p…