Gugan Thoppe, L. A. Prashanth, Ankur Naskar, Sanjay Bhat

Reinforcement Learning for Exponential Utility: Algorithms and Convergence in Discounted MDPs

May 11, 2026

arXiv:2605.08053v1 Announce Type: new
Abstract: Reinforcement learning (RL) for exponential-utility optimization in discounted Markov decision processes (MDPs) lacks principled value-based algorithms. We address this gap in the fixed risk-aversion set…
