cs.LG

Augmented Lagrangian Method for Last-Iterate Convergence for Constrained MDPs

arXiv:2605.11694v1
Abstract: We study policy optimization for infinite-horizon, discounted constrained Markov decision processes (CMDPs). While existing theoretical guarantees typically hold for the mixture policy, deploying such a …