Offline Two-Player Zero-Sum Markov Games with KL Regularization
arXiv:2605.13025v1
Abstract: We study the problem of learning Nash equilibria in offline two-player zero-sum Markov games. While existing approaches often rely on explicit pessimism to address distribution shift, we show that KL reg…