Machine Learning

When Should AI Step Aside?: Teaching Agents When Humans Want to Intervene

Recent advances in large language models (LLMs) have enabled AI agents to perform increasingly complex web navigation tasks. Despite this progress, effective use of such agents still relies on human involvement to correct misinterpretations or adjust outputs that diverge from user preferences. However, current agentic systems lack an understanding of when and why humans intervene. As a result, they may overlook user needs and proceed incorrectly, or interrupt users too frequently with unnecessary confirmation requests.

This blog post is based on our recent work, "Modeling Distinct Human Interaction in Web Agents," where we shift the focus from autonomy to collaboration. Instead of optimizing agents solely for an end-to-end autonomous pipeline, we ask: Can agents anticipate when humans are likely to intervene?

CowCorpus: Learning from Real Interaction

To formulate this task, we collect CowCorpus, a novel dataset of interleaved human and agent action trajectories. Unlike existing datasets, which comprise either agent-only or human-only trajectories, CowCorpus captures collaborative task execution by a human–agent team. In total, CowCorpus has:

- 400 real human–agent web sessions
- 4,200+ interleaved actions
- Step-level annotations of intervention moments

We curate CowCorpus from 20 real-world users using CowPilot, […]
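To make the idea of interleaved trajectories with step-level intervention labels concrete, here is a minimal sketch of what one session record might look like. The field names, actions, and helper method are illustrative assumptions for this post, not the actual CowCorpus schema:

```python
from dataclasses import dataclass
from typing import List

# Hypothetical sketch of an interleaved human-agent trajectory.
# Field names and values are illustrative, not the CowCorpus schema.

@dataclass
class Step:
    actor: str          # "agent" or "human"
    action: str         # e.g. "click", "type", "scroll"
    intervention: bool  # step-level label: did the human take over here?

@dataclass
class Session:
    session_id: str
    steps: List[Step]

    def intervention_rate(self) -> float:
        """Fraction of steps at which the human intervened."""
        if not self.steps:
            return 0.0
        return sum(s.intervention for s in self.steps) / len(self.steps)

# Example: a short session where the human steps in once to correct the agent.
session = Session(
    session_id="demo-001",
    steps=[
        Step("agent", "click", False),
        Step("agent", "type", False),
        Step("human", "click", True),   # human takes over at this step
        Step("agent", "scroll", False),
    ],
)
print(session.intervention_rate())  # 0.25
```

A model trained to anticipate interventions would, in this framing, predict the `intervention` flag for each step from the preceding context, rather than executing the whole task autonomously.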