NC2.5 ↔ HORIZON: On the Structural Reducibility of Long-Horizon Agent Failures to a Single…

The HORIZON benchmark (arXiv:2604.11978) documents empirical failures of long-horizon agentic systems across four cognitive domains. NC2.5…