Key-Gram: Extensible World Knowledge for Embodied Manipulation
arXiv:2605.18556v1 Announce Type: new
Abstract: Embodied control increasingly requires models to follow compositional language instructions while reasoning over dynamic visual states. However, current vision-language-action policies and world-action m…