cs.LG, cs.MA

High entropy leads to symmetry equivariant policies in Dec-POMDPs

arXiv:2511.22581v3 Announce Type: replace
Abstract: We prove that in any Dec-POMDP, sufficiently high entropy regularization ensures that the policy gradient flow with tabular softmax parametrization always converges, for any initialization, to the sa…
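
The convergence claim in the abstract can be illustrated on a toy problem. The sketch below is not the paper's construction: it assumes a hypothetical one-step, two-agent, common-payoff coordination game (a degenerate Dec-POMDP with a single state and a trivial observation), an arbitrarily chosen regularization weight TAU = 1.0, and plain Euler steps in place of the continuous gradient flow. The payoff matrix R, the step size, and the initial logits are all illustrative assumptions. The game is unchanged when both agents relabel their two actions, so a policy that respects this symmetry must be uniform.

```python
import math

# Hypothetical toy problem (not from the paper): both agents receive payoff 1
# when they pick the same action and 0 otherwise. Relabelling the actions of
# both agents at once leaves R unchanged, which is the symmetry of interest.
R = [[1.0, 0.0],
     [0.0, 1.0]]
TAU = 1.0  # entropy regularization weight, assumed "sufficiently high" here


def softmax(theta):
    """Numerically stable softmax over a list of logits."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    z = sum(exps)
    return [e / z for e in exps]


def regularized_grad(theta, q_values, tau):
    """Gradient of sum_a pi_a * q_a + tau * H(pi) w.r.t. the softmax logits."""
    pi = softmax(theta)
    # "Soft" action values: payoff minus the entropy penalty term.
    soft = [q - tau * math.log(p) for q, p in zip(q_values, pi)]
    baseline = sum(p * s for p, s in zip(pi, soft))
    return [p * (s - baseline) for p, s in zip(pi, soft)]


def train(theta1, theta2, tau=TAU, lr=0.1, steps=5000):
    """Simultaneous Euler steps on both agents' logits (a discretized flow)."""
    theta1, theta2 = list(theta1), list(theta2)
    for _ in range(steps):
        pi1, pi2 = softmax(theta1), softmax(theta2)
        # Expected payoff of each action given the other agent's policy.
        q1 = [sum(R[a][b] * pi2[b] for b in range(2)) for a in range(2)]
        q2 = [sum(R[a][b] * pi1[a] for a in range(2)) for b in range(2)]
        g1 = regularized_grad(theta1, q1, tau)
        g2 = regularized_grad(theta2, q2, tau)
        theta1 = [t + lr * g for t, g in zip(theta1, g1)]
        theta2 = [t + lr * g for t, g in zip(theta2, g2)]
    return softmax(theta1), softmax(theta2)


# Three deliberately asymmetric initializations (illustrative values).
inits = [
    ([2.0, -1.0], [-1.5, 0.5]),
    ([-3.0, 3.0], [0.0, 1.0]),
    ([0.3, 0.0], [2.5, -2.5]),
]
results = [train(t1, t2) for t1, t2 in inits]
for pi1, pi2 in results:
    print([round(p, 4) for p in pi1], [round(p, 4) for p in pi2])
```

In this toy setting every run is expected to end close to the uniform joint policy, whichever action the initial logits favored. The uniform policy is the only one left unchanged by the action relabelling, so the shared limit point is also the symmetric one. For this particular game the regularized objective is strictly concave in the policy probabilities once TAU exceeds 0.5, which is why TAU = 1.0 suffices here; the threshold for a general Dec-POMDP is whatever the paper establishes and is not reproduced by this sketch.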