Aharon Azulay, Jan Dubiński, Zhuoyun Li, Atharv Mittal, Yossi Gandelsman

Jailbreaking Vision-Language Models Through the Visual Modality

Aharon Azulay, Jan Dubiński, Zhuoyun Li, Atharv Mittal, Yossi Gandelsman / May 4, 2026

arXiv:2605.00583v1 Announce Type: cross
Abstract: The visual modality of vision-language models (VLMs) is an underexplored attack surface for bypassing safety alignment. We introduce four jailbreak attacks exploiting the vision component: (1) encoding…
