Fanpu Cao, Xin Zou, Xuming Hu, Hui Xiong

When Looking Is Not Enough: Visual Attention Structure Reveals Hallucination in MLLMs

May 13, 2026

arXiv:2605.11559v1
Abstract: Multimodal large language models (MLLMs) have become a key interface for visual reasoning and grounded question answering, yet they remain vulnerable to visual hallucinations, where generated responses c…
