cs.AI, cs.RO

RankQ: Offline-to-Online Reinforcement Learning via Self-Supervised Action Ranking

arXiv:2605.11151v1 Announce Type: cross
Abstract: Offline-to-online reinforcement learning (RL) improves sample efficiency by leveraging pre-collected datasets prior to online interaction. A key challenge, however, is learning an accurate critic in la…