Beyond Standard Benchmarks: A Systematic Audit of Vision-Language Model’s Robustness to Natural Semantic Variation Across Diverse Tasks
arXiv:2604.04473v1 Announce Type: new
Abstract: Recent advances in vision-language models (VLMs) trained on web-scale image-text pairs have enabled impressive zero-shot transfer across a diverse range of visual tasks. However, comprehensive and indepe…