Jindi Guo, Xi Fang, Chaozheng Huang

Can MLLMs “Read” What is Missing?

April 24, 2026

arXiv:2604.21277v1
Abstract: We introduce MMTR-Bench, a benchmark designed to evaluate the intrinsic ability of Multimodal Large Language Models (MLLMs) to reconstruct masked text directly from visual context. Unlike conventional qu…
