cs.CV

EgoMind: Activating Spatial Cognition through Linguistic Reasoning in MLLMs

arXiv:2604.03318v1 Announce Type: new
Abstract: Multimodal large language models (MLLMs) are increasingly applied to spatial cognition tasks, where they are expected to understand and interact with complex environments. Most existing works impro…