Replicable Bandits with UCB-Based Exploration
arXiv:2604.20024v1 Announce Type: new
Abstract: We study replicable algorithms for stochastic multi-armed bandits (MAB) and linear bandits with UCB-based (Upper Confidence Bound) exploration. A bandit algorithm is $\rho$-replicable if two executions u…
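For context, the classical (non-replicable) UCB1 index policy that the paper's exploration strategy builds on can be sketched as follows; this is the standard algorithm of Auer et al., not the paper's replicable variant, whose construction is not included in this excerpt, and the Bernoulli arm setup is illustrative only:

```python
import math
import random

def ucb1(pull, n_arms, horizon, seed=0):
    """Standard UCB1: pull each arm once, then pick the arm maximizing
    empirical mean + sqrt(2 ln t / n_a) exploration bonus."""
    rng = random.Random(seed)
    counts = [0] * n_arms        # number of pulls per arm
    sums = [0.0] * n_arms        # cumulative reward per arm
    history = []
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1          # initialization: try every arm once
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        r = pull(arm, rng)
        counts[arm] += 1
        sums[arm] += r
        history.append((arm, r))
    return history, counts

# Usage: two Bernoulli arms with (hypothetical) means 0.2 and 0.8.
means = [0.2, 0.8]
hist, counts = ucb1(lambda a, rng: 1.0 if rng.random() < means[a] else 0.0,
                    n_arms=2, horizon=1000)
```

Note that two runs of UCB1 on independent reward samples generally produce different arm sequences, which is exactly the failure of replicability that the paper's $\rho$-replicable algorithms are designed to control.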