Pay Less Attention to Function Words for Free Robustness of Vision-Language Models
arXiv:2512.07222v4 Announce Type: replace
Abstract: To address the trade-off between robustness and performance in robust vision-language models (VLMs), we observe that function words could make VLMs vulnerable to cross-modal adversarial attacks, and propose Funct…