Biao Wu, Yiwu Zhong, Meng Fang, Ling Chen

DOSE: Data Selection for Multi-Modal LLMs via Off-the-Shelf Models

Biao Wu, Yiwu Zhong, Meng Fang, Ling Chen / April 21, 2026

arXiv:2604.16979v1 Announce Type: new
Abstract: High-quality and diverse multimodal data are essential for improving vision-language models (VLMs), yet existing datasets often contain noisy, redundant, and poorly aligned samples. To address these prob…
