CVA: Context-aware Video-text Alignment for Video Temporal Grounding
arXiv:2603.24934v1 Announce Type: cross
Abstract: We propose Context-aware Video-text Alignment (CVA), a novel framework to address a significant challenge in video temporal grounding: achieving temporally sensitive video-text alignment that remains r…