cs.AI, cs.SD, eess.AS

Spatial-Aware Conditioned Fusion for Audio-Visual Navigation

arXiv:2604.02390v1 Announce Type: cross
Abstract: Audio-visual navigation tasks require agents to locate and navigate toward continuously vocalizing targets using only visual observations and acoustic cues. However, existing methods mainly rely on sim…