cs.LG, cs.RO

Adaptive Q-Chunking for Offline-to-Online Reinforcement Learning

arXiv:2605.05544v1 Announce Type: cross
Abstract: Offline-to-online reinforcement learning with action chunking eliminates multi-step off-policy bias and enables temporally coherent exploration, but all existing methods use a fixed chunk size across e…
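
Illustrative note on the bias claim (a minimal sketch, not the paper's implementation): when a critic scores a whole chunk of h actions, its h-step backup sums only rewards produced by the actions the critic is conditioned on, so the intermediate steps need no off-policy correction. The function name `chunked_td_target` and its parameters are hypothetical. The sketch takes h to be the number of rewards passed in and says nothing about how an adaptive chunk size would be chosen, since the abstract is truncated before that point.

```python
def chunked_td_target(rewards, gamma, bootstrap_q, done=False):
    """h-step TD target for a critic defined over action chunks.

    rewards     -- the h rewards observed while executing one action chunk
    gamma       -- per-step discount factor
    bootstrap_q -- critic value at the state reached after the chunk,
                   evaluated for the next chunk (assumed to be given)
    done        -- True if the episode ended inside the chunk
    """
    h = len(rewards)
    # Discounted sum of the rewards collected inside the chunk.
    n_step_return = sum(gamma ** i * r for i, r in enumerate(rewards))
    if done:
        # No bootstrapping past a terminal state.
        return n_step_return
    # Bootstrap h steps ahead, discounted by gamma^h.
    return n_step_return + gamma ** h * bootstrap_q


# Chunk of h = 3 actions with rewards 1, 0, 2 and gamma = 0.5.
target = chunked_td_target([1.0, 0.0, 2.0], gamma=0.5, bootstrap_q=8.0)
print(target)  # 2.5
```

Here the target is 1 + 0.5·0 + 0.25·2 + 0.125·8 = 2.5. The same call with a different number of rewards yields the backup for a different chunk size.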