Authors: Sua Lee, Sanghee Park, Jinbae Im

MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge

Sua Lee, Sanghee Park, Jinbae Im / April 22, 2026

arXiv:2604.18164v2 Announce Type: replace-cross
Abstract: Multimodal Large Language Models (MLLMs) have been increasingly used as automatic evaluators, a paradigm known as MLLM-as-a-Judge. However, their reliability and vulnerabilities to biases remain…

cs.AI, cs.CL, cs.CV

MM-JudgeBias: A Benchmark for Evaluating Compositional Biases in MLLM-as-a-Judge

Sua Lee, Sanghee Park, Jinbae Im / April 21, 2026

arXiv:2604.18164v1 Announce Type: new
Abstract: Multimodal Large Language Models (MLLMs) have been increasingly used as automatic evaluators, a paradigm known as MLLM-as-a-Judge. However, their reliability and vulnerabilities to biases remain underexpl…