Collision-Aware Object-Goal Visual Navigation via Two-Stage Deep Reinforcement Learning
arXiv:2502.13498v2 Announce Type: replace-cross
Abstract: Object-goal visual navigation aims to reach a specific target object using egocentric visual observations. Recent deep reinforcement learning (DRL) approaches have achieved promising success rates but often neglect collisions during evaluation, limiting real-world deployment. To address this issue, this letter introduces a collision-aware evaluation metric, namely collision-free success rate (CF-SR), to explicitly measure navigation performance under collision constraints. In addition, collision-free success weighted by path length (CF-SPL) is adopted to further evaluate navigation efficiency. Furthermore, a two-stage DRL training framework with collision prediction is proposed to improve collision-free navigation performance. In the first stage, a collision prediction module is trained by supervising the agent's collision states during exploration. In the second stage, leveraging the trained collision prediction, the agent learns to navigate toward target objects while avoiding collision. Extensive experiments across multiple navigation models in the AI2-THOR environment demonstrate consistent improvements in both CF-SR and CF-SPL. Real-world experiments further validate the effectiveness and generalization capability of the proposed framework.