Rui Yang Tan, Yujia Hu, Roy Ka-Wei Lee

Structured Visual Narratives Undermine Safety Alignment in Multimodal Large Language Models

Rui Yang Tan, Yujia Hu, Roy Ka-Wei Lee / April 24, 2026

arXiv:2603.21697v2 Announce Type: replace-cross
Abstract: Multimodal Large Language Models (MLLMs) extend text-only LLMs with visual reasoning, but also introduce new safety failure modes under visually grounded instructions. We study comic-template j…
