cs.LG, math.OC, stat.ML

Faster Fixed-Point Methods for Multichain MDPs

arXiv:2506.20910v2 Announce Type: replace-cross
Abstract: We study value-iteration (VI) algorithms for solving general (a.k.a. multichain) Markov decision processes (MDPs) under the average-reward criterion, a fundamental but theoretically challenging…