cs.LG, math.OC, stat.ML

Achieving $\epsilon^{-2}$ Sample Complexity for Single-Loop Actor-Critic under Minimal Assumptions

arXiv:2605.13639v1 Announce Type: cross
Abstract: We establish last-iterate convergence rates for off-policy actor–critic methods in reinforcement learning. In particular, under a single-loop, single-timescale implementation and a broa…