cs.AI, cs.LG

SB-TRPO: Towards Safe Reinforcement Learning with Hard Constraints

arXiv:2512.23770v3 Announce Type: replace
Abstract: In safety-critical domains, reinforcement learning (RL) agents must often satisfy strict, zero-cost safety constraints while accomplishing tasks. Existing model-free methods frequently either fail to…