cs.CV

2nd of the 5th PVUW MeViS-Audio Track: ASR-SaSaSa2VA

arXiv:2604.23935v1 Announce Type: new
Abstract: Audio-based video object segmentation aims to locate and segment objects in videos conditioned on audio cues, requiring precise understanding of both appearance and motion. Recent audio-driven video segm…