cs.LG, stat.ML

Sample-Mean Anchored Thompson Sampling for Offline-to-Online Learning with Distribution Shift

arXiv:2605.10289v1 Announce Type: cross
Abstract: Offline-to-online learning aims to improve online decision-making by leveraging offline logged data. A central challenge in this setting is the distribution shift between offline and online environment…