Best of both worlds: Stochastic & adversarial best-arm identification
arXiv:2604.14860v1
Abstract: We study bandit best-arm identification with arbitrary and potentially adversarial rewards. A simple learner that samples arms uniformly at random obtains the optimal error rate in the adversarial scenario. However, this ty…