Junhao Dong, Yifei Zhang, Hao Zhu, Yew-Soon Ong, Piotr Koniusz

Hierarchically Robust Zero-shot Vision-language Models

April 22, 2026

arXiv:2604.18867v1 Announce Type: cross
Abstract: Vision-Language Models (VLMs) can perform zero-shot classification but are susceptible to adversarial attacks. While robust fine-tuning improves their robustness, existing approaches align fixed text e…
