Zaiwei Chen, Phalguni Nanda

From Set Convergence to Pointwise Convergence: Finite-Time Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes

Zaiwei Chen, Phalguni Nanda / April 7, 2026

arXiv:2504.18743v2 Announce Type: replace-cross
Abstract: This work presents the first finite-time analysis for the last-iterate convergence of average-reward $Q$-learning with an asynchronous implementation. A key feature of the algorithm we study is…
