Chat-Scene++: Exploiting Context-Rich Object Identification for 3D LLM
arXiv:2603.27507v2
Abstract: Recent advancements in multi-modal large language models (MLLMs) have shown strong potential for 3D scene understanding. However, existing methods struggle with fine-grained object grounding and cont…