cs.AI

Evaluating Strategic Reasoning in Forecasting Agents

arXiv:2604.26106v1
Abstract: Forecasting benchmarks produce accuracy leaderboards but little insight into why some forecasters are more accurate than others. We introduce Bench to the Future 2 (BTF-2), 1,417 pastcasting questions wi…