cs.LG, math.OC

A Finite-Iteration Theory for Asynchronous Categorical Distributional Temporal-Difference Learning

arXiv:2605.06866v1 Announce Type: new
Abstract: Recent non-asymptotic analyses have substantially advanced the theory of distributional policy evaluation, but they largely concern synchronous full-state updates under a generative model, model-based es…