cs.CV

Representation Learning with Semantic-aware Instance and Sparse Token Alignments

arXiv:2601.08165v2 Announce Type: replace
Abstract: Medical contrastive vision-language pre-training (VLP) has demonstrated significant potential in improving performance on downstream tasks. Traditional approaches typically employ contrastive learnin…