Shuitsu Koyama, Yuiga Wada, Daichi Yashima, Komei Sugiura

MLLM-as-a-Judge Exhibits Model Preference Bias

Shuitsu Koyama, Yuiga Wada, Daichi Yashima, Komei Sugiura / April 14, 2026

arXiv:2604.11589v1
Abstract: Automatic evaluation using multimodal large language models (MLLMs), commonly referred to as MLLM-as-a-Judge, has been widely used to measure model performance. If such MLLM-as-a-Judge methods were biase…
