cs.CV

Beyond Shortcuts: Mitigating Visual Illusions in Frozen VLMs via Qualitative Reasoning

arXiv:2604.26250v1 Announce Type: new
Abstract: While Vision-Language Models (VLMs) have achieved state-of-the-art performance on general visual tasks, they remain perceptually brittle when confronted with optical illusions. Thes…