CoLA: Cross-Modal Low-rank Adaptation for Multimodal Downstream Tasks
arXiv:2604.03314v1 Announce Type: new
Abstract: Foundation models have revolutionized AI, but adapting them efficiently for multimodal tasks, particularly in dual-stream architectures composed of unimodal encoders, such as DINO and BERT, remains a sig…