cs.CL, cs.HC

Believing without Seeing: Quality Scores for Contextualizing Vision-Language Model Explanations

arXiv:2509.25844v3 Announce Type: replace
Abstract: When people query Vision-Language Models (VLMs) but cannot see the accompanying visual context (e.g., blind and low-vision users), augmenting VLM predictions with natural language explanations can…