cs.LG

SMA: Submodular Modality Aligner For Data Efficient Multimodal Learning

arXiv:2605.12872v1 Announce Type: new
Abstract: Despite the recent success of Multimodal Foundation Models (FMs), their reliance on massive paired datasets limits their applicability in low-data and rare-scenario settings where aligned data is scarce …