cs.LG, stat.ML

A single algorithm for both restless and rested rotting bandits

arXiv:2604.21432v1 Announce Type: new
Abstract: In many application domains (e.g., recommender systems, intelligent tutoring systems), the rewards associated with actions tend to decrease over time. This decay is either caused by the actions execute…