The Persistent Vulnerability of Aligned AI Systems
arXiv:2604.00324v1 Announce Type: cross
Abstract: Autonomous AI agents are being deployed with filesystem access, email control, and multi-step planning. This thesis contributes to four open problems in AI safety: understanding dangerous internal comp…