Visual Reasoning through Tool-supervised Reinforcement Learning
arXiv:2604.19945v1 Announce Type: new
Abstract: In this paper, we investigate the problem of how to effectively master tool-use to solve complex visual reasoning tasks for Multimodal Large Language Models. To achieve that, we propose a novel Tool-supe…