DeepThinkVLA: Enhancing Reasoning Capability of Vision-Language-Action Models
arXiv:2511.15669v2 Announce Type: replace-cross
Abstract: Does Chain-of-Thought (CoT) reasoning genuinely improve Vision-Language-Action (VLA) models, or does it merely add overhead? Existing CoT-VLA systems report limited and inconsistent gains, yet …