Near-Optimal Privacy-Preserving Learning for Max-Min Fair Multi-Agent Bandits
arXiv:2306.04498v3 Announce Type: replace
Abstract: We study fair multi-agent multi-armed bandit learning under collision-only coordination. Agents cannot communicate explicitly during learning and observe only their own rewards and whether collisions…
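To make the setting concrete, here is a minimal sketch of collision-only feedback and the max-min fair objective. It is an illustration under stated assumptions, not the paper's algorithm: rewards are assumed Bernoulli, colliding agents are assumed to receive zero reward, the mean matrix `mu` is made up, and the names `max_min_assignment` and `pull` are hypothetical. The brute-force search uses the true means, which a learner would not know, and is only feasible for tiny instances.

```python
import itertools
import random


def max_min_assignment(mu):
    """Brute-force the collision-free assignment maximizing the minimum mean reward.

    mu[i][k] is the (assumed known) mean reward of agent i on arm k.
    Returns (arms, value), where arms[i] is the arm given to agent i.
    """
    n_agents, n_arms = len(mu), len(mu[0])
    best_arms, best_value = None, float("-inf")
    # Each permutation assigns distinct arms, so no two agents collide.
    for arms in itertools.permutations(range(n_arms), n_agents):
        value = min(mu[i][arms[i]] for i in range(n_agents))
        if value > best_value:
            best_arms, best_value = arms, value
    return best_arms, best_value


def pull(mu, choices, rng):
    """Simulate one round: each agent sees only its own reward and a collision flag.

    Assumption: agents that pick the same arm collide and receive reward 0.
    """
    counts = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1
    feedback = []
    for agent, arm in enumerate(choices):
        collided = counts[arm] > 1
        if collided:
            reward = 0.0
        else:
            reward = 1.0 if rng.random() < mu[agent][arm] else 0.0
        feedback.append((reward, collided))
    return feedback


# Hypothetical 3-agent, 3-arm instance.
mu = [
    [0.9, 0.5, 0.1],
    [0.8, 0.2, 0.3],
    [0.7, 0.6, 0.4],
]
arms, value = max_min_assignment(mu)
print(arms, value)

rng = random.Random(0)
print(pull(mu, [0, 0, 1], rng))  # agents 0 and 1 collide on arm 0
```

Note that the max-min fair assignment can differ from the one maximizing total reward: here every agent prefers arm 0, yet fairness is measured by the worst-off agent's mean reward.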