Information Router for Mitigating Modality Dominance in Vision-Language Models
arXiv:2604.16264v1 Announce Type: cross
Abstract: Vision-language models (VLMs) have demonstrated strong performance across a wide range of benchmarks, yet they often suffer from modality dominance, where predictions rely disproportionately on a singl…