Structural Enforcement of Goal Integrity in AI Agents via Separation-of-Powers Architecture
arXiv:2604.23646v1 Announce Type: new
Abstract: Recent evidence suggests that frontier AI systems can exhibit agentic misalignment, generating and executing harmful actions derived from internally constructed goals, even without explicit user requests…