Online Statistical Inference of Constant Sample-averaged Q-Learning
arXiv:2603.26982v1 Announce Type: cross
Abstract: Reinforcement learning algorithms are widely used for decision-making tasks across many domains. However, their performance can suffer from high variance and instability, part…