cs.CL, cs.CV

Forest Before Trees: Latent Superposition for Efficient Visual Reasoning

arXiv:2601.06803v2
Abstract: While Chain-of-Thought empowers Large Vision-Language Models with multi-step reasoning, explicit textual rationales suffer from an information bandwidth bottleneck, where continuous visual deta…