CoLVR: Enhancing Exploratory Latent Visual Reasoning via Contrastive Optimization
arXiv:2605.08802v2 Announce Type: replace
Abstract: Because Latent Visual Reasoning offers potential for exploratory reasoning, recent works often enable Multimodal Large Language Models (MLLMs) to perform visual reasoning by propagating continuous h…