Axel Friedrich Wolter, Tobias Sutter

A Two-Timescale Primal-Dual Framework for Reinforcement Learning via Online Dual Variable Guidance

Axel Friedrich Wolter, Tobias Sutter / April 15, 2026

arXiv:2505.04494v3 Announce Type: replace-cross
Abstract: We study reinforcement learning by combining recent advances in regularized linear programming formulations with the classical theory of stochastic approximation. Motivated by the challenge of …
