Yi Chen, Yuying Ge, Hui Zhou, Mingyu Ding, Yixiao Ge, Xihui Liu

DIAL: Decoupling Intent and Action via Latent World Modeling for End-to-End VLA

Yi Chen, Yuying Ge, Hui Zhou, Mingyu Ding, Yixiao Ge, Xihui Liu / April 1, 2026

arXiv:2603.29844v1 Announce Type: cross
Abstract: The development of Vision-Language-Action (VLA) models has been significantly accelerated by pre-trained Vision-Language Models (VLMs). However, most existing end-to-end VLAs treat the VLM primarily as…
