cs.RO

Embodied Interpretability: Linking Causal Understanding to Generalization in Vision-Language-Action Models

arXiv:2605.00321v1
Abstract: Vision-Language-Action (VLA) policies often fail under distribution shift, suggesting that decisions may depend on spurious visual correlations rather than task-relevant causes. We formulate visual-actio…