cs.CL

System-Mediated Attention Imbalances Make Vision-Language Models Say Yes

arXiv:2601.12430v2 Announce Type: replace
Abstract: Vision-language model (VLM) hallucination is commonly linked to imbalanced allocation of attention across input modalities: system, image and text. However, existing mitigation strategies tend toward…