Junha Song, Sangdoo Yun, Dongyoon Han, Jaegul Choo, Byeongho Heo

RL makes MLLMs see better than SFT

April 14, 2026

arXiv:2510.16333v2
Abstract: A dominant assumption in Multimodal Language Model (MLLM) research is that its performance is largely inherited from the LLM backbone, given its immense parameter scale and remarkable capabilit…
