Shivika, Kartik Bose, Pankaj Gupta

CLIP Architecture for Abdominal CT Image-Text Alignment and Zero-Shot Learning: Investigating Batch Composition and Data Scaling

April 16, 2026

arXiv:2604.13561v1 Announce Type: cross
Abstract: Vision-language models trained with contrastive learning on paired medical images and reports show strong zero-shot diagnostic capabilities, yet the effect of training batch composition on learned repr…
