Willful Disobedience: Automatically Detecting Failures in Agentic Traces
arXiv:2603.23806v1 Announce Type: cross
Abstract: AI agents are increasingly embedded in real software systems, where they execute multi-step workflows through multi-turn dialogue, tool invocations, and intermediate decisions. These long execution his…