cs.LG, stat.ML

Constrained Policy Optimization with Cantelli-Bounded Value-at-Risk

arXiv:2601.22993v3 Announce Type: replace
Abstract: We introduce the Value-at-Risk Constrained Policy Optimization algorithm (VaR-CPO), a sample-efficient and conservative method designed to optimize Value-at-Risk (VaR) constrained reinforcement learn…