cs.RO

LongBench: Evaluating Robotic Manipulation Policies on Real-World Long-Horizon Tasks

arXiv:2604.16788v1 Announce Type: new
Abstract: Robotic manipulation policies often degrade over extended horizons, yet existing benchmarks provide limited insight into why such failures occur. Most prior benchmarks are either simulation-based or repo…