cs.CV

PLUME: Latent Reasoning Based Universal Multimodal Embedding

arXiv:2604.02073v1 Announce Type: new
Abstract: Universal multimodal embedding (UME) maps heterogeneous inputs into a shared retrieval space with a single model. Recent approaches improve UME by generating explicit chain-of-thought (CoT) rationales be…
