Multi-Objective Multi-Agent Bandits: From Learning Efficiency to Fairness Optimization
arXiv:2605.06864v1 Announce Type: new
Abstract: We study multi-objective multi-agent multi-armed bandits (MO-MA-MAB) under stochastic rewards, where agents observe heterogeneous reward vectors and communicate over time-varying graphs. We formulate thi…
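
The setting named in the abstract (agents with heterogeneous reward vectors communicating over time-varying graphs) can be illustrated with a small simulation. This is only a sketch of the problem setup, not the paper's algorithm, which the truncated abstract does not describe. Everything here is an assumption made for illustration: the Gaussian noise model, the Metropolis mixing rule, the alternating two-graph schedule, and all function names (`metropolis_weights`, `mix`, `pull`, `demo`).

```python
import random


def metropolis_weights(n, edges):
    """Doubly stochastic mixing matrix for an undirected graph (Metropolis rule).

    `edges` is a list of distinct (i, j) pairs with i != j. Assumed rule, not from the paper.
    """
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    w = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w[i][j] = w[j][i] = 1.0 / (1 + max(deg[i], deg[j]))
    for i in range(n):
        # Diagonal is still zero here, so sum(w[i]) is the off-diagonal mass.
        w[i][i] = 1.0 - sum(w[i])
    return w


def mix(w, estimates):
    """One communication round: each agent takes a weighted average of neighbor vectors."""
    n, d = len(estimates), len(estimates[0])
    return [
        [sum(w[i][j] * estimates[j][k] for j in range(n)) for k in range(d)]
        for i in range(n)
    ]


def pull(rng, mean_vector, noise=0.1):
    """Stochastic reward vector: one noisy sample per objective (Gaussian noise assumed)."""
    return [m + rng.gauss(0.0, noise) for m in mean_vector]


def demo(rounds=200, seed=0):
    rng = random.Random(seed)
    # Heterogeneous means: three agents see different 2-objective means for the same arm.
    means = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
    estimates = [pull(rng, m) for m in means]
    # Network-wide average of the initial observations, per objective.
    target = [sum(col) / len(estimates) for col in zip(*estimates)]
    # Time-varying graph: neither snapshot is connected, but their union is.
    graphs = [[(0, 1)], [(1, 2)]]
    for t in range(rounds):
        estimates = mix(metropolis_weights(3, graphs[t % 2]), estimates)
    return estimates, target


if __name__ == "__main__":
    est, target = demo()
    print("network average:", target)
    print("agent estimates:", est)
```

Because each mixing matrix is doubly stochastic, the network-wide average is preserved at every round, and because the union of the two graph snapshots is connected, every agent's vector approaches that average even though no single snapshot connects all agents.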