Haoxing Tian, Zaiwei Chen, Ioannis Ch. Paschalidis, Alex Olshevsky

Bridging the Gap Between Average and Discounted TD Learning

May 5, 2026

arXiv:2605.02103v1
Abstract: The analysis of Temporal Difference (TD) learning in the average-reward setting faces notable theoretical difficulties because the Bellman operator is not contractive with respect to any norm. This compl…
