Analysis of On-policy Policy Gradient Methods under the Distribution Mismatch
arXiv:2503.22244v2 Announce Type: replace
Abstract: Policy gradient methods are among the most successful approaches for solving challenging reinforcement learning problems. Despite their empirical successes, many state-of-the-art policy gradient alg…