Online Learning of Whittle Indices for Restless Bandits with Non-Stationary Transition Kernels
arXiv:2506.18186v3 Announce Type: replace
Abstract: The restless multi-armed bandit (RMAB) framework is a popular approach to solving resource allocation problems in networked systems. In this paper, we study optimal resource allocation in RMABs facin…