A Measure-Theoretic Finite-Sample Theory for Adaptive-Data Fitted Q-Iteration
arXiv:2605.05791v1
Abstract: While reinforcement learning (RL) promises to revolutionize the control of complex nonlinear robotic systems, a profound gap persists between the heuristic success of model-free off-policy deep RL and th…