Kyosuke Takami, Yuka Tateisi, Satoshi Sekine, Yusuke Miyao

Human-Grounded Multimodal Benchmark with 900K-Scale Aggregated Student Response Distributions from Japan’s National Assessment of Academic Ability

Kyosuke Takami, Yuka Tateisi, Satoshi Sekine, Yusuke Miyao / May 13, 2026

arXiv:2605.11663v1 Announce Type: new
Abstract: Authentic school examinations provide a high-validity test bed for evaluating multimodal large language models (MLLMs), yet benchmarks grounded in Japanese K-12 assessments remain scarce. We present a mu…
