cs.LG, stat.ML

Optimal Regret for Single Index Bandits

arXiv:2605.09454v1 Announce Type: new
Abstract: We study the $\textit{single-index bandit}$ problem, where rewards depend on an unknown one-dimensional projection of high-dimensional contexts through an unknown reward function. This model extends line…
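
As an illustration of the reward model described in the abstract, the following is a minimal sketch of a simulated single-index environment. It is not taken from the paper: the additive Gaussian noise, the dimension `d`, the `tanh` link, and the names `make_single_index_env`, `theta`, and `link` are all assumptions chosen for illustration. In the bandit setting, both `theta` and `link` would be hidden from the learner.

```python
import math
import random


def make_single_index_env(theta, link, noise_std=0.1, seed=0):
    """Return a function mapping a context x to link(<theta, x>) + noise.

    theta     : hidden direction defining the one-dimensional projection
    link      : hidden scalar reward function applied to the projection
    noise_std : std. dev. of additive Gaussian noise (an assumption here)
    """
    rng = random.Random(seed)

    def reward(x):
        # One-dimensional projection of the high-dimensional context.
        index = sum(t * xi for t, xi in zip(theta, x))
        # The reward depends on x only through this scalar index.
        return link(index) + rng.gauss(0.0, noise_std)

    return reward


# Hypothetical instance: 5-dimensional contexts, unit-norm theta, tanh link.
d = 5
theta = [1.0 / math.sqrt(d)] * d
reward = make_single_index_env(theta, math.tanh, noise_std=0.0)

# Two candidate contexts (arms); noise is switched off to make the
# dependence on the projection easy to inspect.
arms = [[1.0, 0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0, 1.0]]
rewards = [reward(x) for x in arms]
```

With a linear `link` the sketch reduces to a linear bandit reward. The choice of `tanh` here is arbitrary and only stands in for a reward function the learner does not know.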