cs.CV

SpatialMosaic: A Multiview VLM Dataset for Partial Visibility

arXiv:2512.23365v2 Announce Type: replace
Abstract: The rapid progress of Multimodal Large Language Models (MLLMs) has unlocked the potential for enhanced 3D scene understanding and spatial reasoning. A recent line of work explores learning spatial re…