cs.AI, cs.CV

Good Scores, Bad Data: A Metric for Multimodal Coherence

arXiv:2603.25924v1 Announce Type: new
Abstract: Multimodal AI systems are evaluated by downstream task accuracy, but high accuracy does not guarantee that the underlying data is coherent. A model can score well on Visual Question Answering (VQA) while its input…