Don’t Blink: Evidence Collapse during Multimodal Reasoning
arXiv:2604.04207v1 Announce Type: new
Abstract: Reasoning vision-language models (VLMs) can become more accurate while progressively losing visual grounding as they think. This creates task-conditional danger zones where low-entropy predictions are confident but ungrounded, a…