SRL-CLIP: Efficient CLIP Video Adaptation via Structured Semantic Role Labels
arXiv:2401.07669v2 Announce Type: replace
Abstract: Adapting CLIP to video has gained popularity because of CLIP's semantically rich representations. While CLIP is a good starting point, it typically undergoes post-pretraining (contrastive finetuning) on la…