cs.AI, cs.CV

When Seeing Overrides Knowing: Disentangling Knowledge Conflicts in Vision-Language Models

arXiv:2507.13868v2 Announce Type: replace
Abstract: Vision-language models (VLMs) increasingly combine visual and textual information to perform complex tasks. However, conflicts between their internal knowledge and external visual input can lead to h…