How Does the Lagrangian Guide Safe Reinforcement Learning through Diffusion Models?
arXiv:2602.02924v2
Abstract: Diffusion policy sampling enables reinforcement learning (RL) to represent multimodal action distributions beyond suboptimal unimodal Gaussian policies. However, existing diffusion-based RL methods p…