Florian Wolf, Ilyas Fatkhullin, Niao He

Global Optimality for Constrained Exploration via Penalty Regularization

Florian Wolf, Ilyas Fatkhullin, Niao He / May 1, 2026

arXiv:2604.28144v1
Abstract: Efficient exploration is a central problem in reinforcement learning and is often formalized as maximizing the entropy of the state-action occupancy measure. While unconstrained maximum-entropy explorati…
