Saunak Kumar Panda, Tong Li, Ruiqi Liu, Yisha Xiang

Online Statistical Inference of Constant Sample-averaged Q-Learning

Saunak Kumar Panda, Tong Li, Ruiqi Liu, Yisha Xiang / March 31, 2026

arXiv:2603.26982v1 Announce Type: cross
Abstract: Reinforcement learning algorithms have been widely used for decision-making tasks in various domains. However, the performance of these algorithms can be impacted by high variance and instability, part…
