Chengxin Liu, Wonseok Choi, Chenshuang Zhang, Tae-Hyun Oh

Aligning What Vision-Language Models See and Perceive with Adaptive Information Flow

Chengxin Liu, Wonseok Choi, Chenshuang Zhang, Tae-Hyun Oh / April 20, 2026

arXiv:2604.15809v1 Announce Type: new
Abstract: Vision-Language Models (VLMs) have demonstrated strong capability in a wide range of tasks such as visual recognition, document parsing, and visual grounding. Nevertheless, recent work shows that while V…
