Model-Based Proactive Cost Generation for Learning Safe Policies Offline with Limited Violation Data
arXiv:2605.01356v1 Announce Type: cross
Abstract: Learning constraint-satisfying policies from offline data without risky online interaction is crucial for safety-critical decision making. Conventional methods typically learn cost value functions from…