cs.AI, cs.CV

Glance-or-Gaze: Incentivizing LMMs to Adaptively Focus Search via Reinforcement Learning

arXiv:2601.13942v2 Announce Type: replace-cross
Abstract: Large Multimodal Models (LMMs) have achieved remarkable success in visual understanding, yet they struggle with knowledge-intensive queries involving long-tail entities or evolving information …