cs.LG

Provably Efficient Offline-to-Online Value Adaptation with General Function Approximation

arXiv:2604.13966v1
Abstract: We study value adaptation in offline-to-online reinforcement learning under general function approximation. Starting from an imperfect offline pretrained $Q$-function, the learner aims to adapt it to the…