Nearly-Optimal Algorithm for Adversarial Kernelized Bandits

arXiv:2605.10299v1 Abstract: This paper studies kernelized bandits (also known as Gaussian process bandits) in an adversarial environment, where the reward functions in a known reproducing kernel Hilbert space (RKHS) may be adversarially chosen at each round. We show that the exponential-weight algorithm achieves $\tilde{O}(\sqrt{T \gamma_T})$ adversarial regret, where $T$ and $\gamma_T$ denote the total number of rounds and the maximum information gain, respectively. For squared exponential (SE) and $\nu$-Mat\'ern kernels, we also show algorithm-independent lower bounds that guarantee the optimality of our algorithm up to polylogarithmic factors. Furthermore, we present a computationally efficient variant of our algorithm using Nystr\"om approximation while maintaining nearly optimal regret guarantees.
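To make the exponential-weight idea concrete, here is a minimal sketch, not the paper's algorithm: exponential weights run over a finite discretization of the domain, against adversarially chosen reward functions built from a squared-exponential (RBF) kernel. The domain, discretization size, learning rate, and kernel length-scale below are illustrative assumptions; the paper's analysis covers the continuous RKHS setting and bandit feedback, which this full-information toy omits.

```python
import numpy as np

# Illustrative sketch (assumed setup, not the paper's method):
# exponential weights over a discretized domain with rewards drawn
# from an RBF (squared-exponential) kernel function class.

rng = np.random.default_rng(0)

# Discretize the domain [0, 1] into candidate arms (assumed grid size).
arms = np.linspace(0.0, 1.0, 50)

def rbf(x, c, ell=0.1):
    """Squared-exponential kernel bump centered at c (rewards in [0, 1])."""
    return np.exp(-((x - c) ** 2) / (2 * ell ** 2))

T = 200                                # number of rounds
eta = np.sqrt(np.log(len(arms)) / T)   # standard exponential-weights rate
weights = np.ones(len(arms))

total_reward = 0.0
cumulative = np.zeros(len(arms))       # per-arm cumulative reward

for t in range(T):
    # Adversary picks a bounded RKHS function (random center here,
    # purely for illustration).
    center = rng.uniform(0.0, 1.0)
    rewards = rbf(arms, center)

    # Play an arm sampled from the exponential-weight distribution.
    probs = weights / weights.sum()
    i = rng.choice(len(arms), p=probs)

    total_reward += rewards[i]
    cumulative += rewards

    # Full-information exponential-weight update; the bandit setting
    # would instead use importance-weighted reward estimates.
    weights *= np.exp(eta * rewards)

regret = cumulative.max() - total_reward
print(f"regret vs best fixed arm after {T} rounds: {regret:.2f}")
```

The regret against the best fixed arm grows sublinearly in $T$ here, consistent in spirit with the $\tilde{O}(\sqrt{T \gamma_T})$ bound, though this toy does not model the information-gain term.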
