cs.LG

Scale-free adaptive planning for deterministic dynamics & discounted rewards

arXiv:2604.18312v1 Announce Type: new
Abstract: We address the problem of planning in an environment with deterministic dynamics and stochastic rewards with discounted returns. The optimal value function is not known, nor are the rewards bounded. We p…