cs.CV, cs.MM

PDA: Text-Augmented Defense Framework for Robust Vision-Language Models against Adversarial Image Attacks

arXiv:2604.01010v1 Announce Type: new
Abstract: Vision-language models (VLMs) are vulnerable to adversarial image perturbations. Existing defenses based on adversarial training against task-specific adversarial examples are computationally expensive and …