cs.AI

AutomationBench

arXiv:2604.18934v1 Announce Type: new
Abstract: Existing AI benchmarks for software automation rarely combine cross-application coordination, autonomous API discovery, and policy adherence. Real business workflows demand all three: a single task may s…