Glance-or-Gaze: Incentivizing LMMs to Adaptively Focus Search via Reinforcement Learning
arXiv:2601.13942v2 Announce Type: replace-cross
Abstract: Large Multimodal Models (LMMs) have achieved remarkable success in visual understanding, yet they struggle with knowledge-intensive queries involving long-tail entities or evolving information …