Khanh Linh Nguyen, Hoa Nghiem, Tu Tran

TraceGuard: Structured Multi-Dimensional Monitoring as a Collusion-Resistant Control Protocol

April 7, 2026

arXiv:2604.03968v1 Announce Type: cross
Abstract: AI control protocols use monitors to detect attacks by untrusted AI agents, but standard single-score monitors face two limitations: they miss subtle attacks where outputs look clean but reasoning is o…
