HELM: Harness-Enhanced Long-horizon Memory for Vision-Language-Action Manipulation
arXiv:2604.18791v1
Abstract: Vision-Language-Action (VLA) models fail systematically on long-horizon manipulation tasks despite strong short-horizon performance. We show that this failure is not resolved by extending context length …