SpatialMosaic: A Multiview VLM Dataset for Partial Visibility
arXiv:2512.23365v2 Announce Type: replace
Abstract: The rapid progress of Multimodal Large Language Models (MLLMs) has unlocked the potential for enhanced 3D scene understanding and spatial reasoning. A recent line of work explores learning spatial re…