AssemLM: Spatial Reasoning Multimodal Large Language Models for Robotic Assembly
arXiv:2604.08983v1 Announce Type: new
Abstract: Spatial reasoning is a fundamental capability for embodied intelligence, especially for fine-grained manipulation tasks such as robotic assembly. While recent vision-language models (VLMs) exhibit prelim…