Jailbreaking Vision-Language Models Through the Visual Modality
arXiv:2605.00583v1 Announce Type: cross
Abstract: The visual modality of vision-language models (VLMs) is an underexplored attack surface for bypassing safety alignment. We introduce four jailbreak attacks exploiting the vision component: (1) encoding…