Retrieving to Recover: Towards Incomplete Audio-Visual Question Answering via Semantic-consistent Purification
arXiv:2604.10695v2 Announce Type: replace
Abstract: Recent Audio-Visual Question Answering (AVQA) methods have advanced significantly. However, most of them lack effective mechanisms for handling missing modalities and suffer from severe performa…