cs.AI, cs.CV

SIEVES: Selective Prediction Generalizes through Visual Evidence Scoring

arXiv:2604.25855v1 Announce Type: new
Abstract: Multimodal large language models (MLLMs) achieve ever-stronger performance on vision-language tasks. Even as traditional visual question answering benchmarks approach saturation, reliable deployment requ…