cs.AI, cs.LG, cs.MA

Descent-Guided Policy Gradient for Scalable Cooperative Multi-Agent Learning

arXiv:2602.20078v3
Abstract: Scaling cooperative multi-agent reinforcement learning (MARL) is fundamentally limited by cross-agent noise. When agents share a common reward, each agent’s learning signal is computed from a s…
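
The cross-agent noise the abstract refers to can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's method: each agent has a unit-variance Gaussian policy with mean 0, the shared reward is simply the sum of all actions, and agent 0 uses a plain score-function (REINFORCE-style) gradient estimate. The function names `gradient_samples` and `gradient_variance` are hypothetical and chosen for illustration only.

```python
import random
import statistics


def gradient_samples(n_agents, n_samples=20000, seed=0):
    """Score-function gradient samples for agent 0 under a shared reward.

    Assumed toy setup (not from the paper): every agent samples its action
    from N(0, 1), and the common reward is the sum of all actions.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        # Each agent draws one action from its own Gaussian policy.
        actions = [rng.gauss(0.0, 1.0) for _ in range(n_agents)]
        # Every agent observes the same scalar reward.
        shared_reward = sum(actions)
        # d/d(mean) of log N(a; mean, 1), evaluated at mean = 0, is just a.
        score = actions[0]
        # Single-sample policy-gradient estimate for agent 0's mean.
        samples.append(shared_reward * score)
    return samples


def gradient_variance(n_agents, **kwargs):
    """Empirical variance of agent 0's gradient estimate."""
    return statistics.pvariance(gradient_samples(n_agents, **kwargs))


if __name__ == "__main__":
    for n in (2, 8, 32):
        print(n, round(gradient_variance(n), 2))
```

In this toy setup the expected gradient for agent 0 is the same for every team size, but the variance of the estimate grows roughly linearly with the number of agents, because the other agents' actions enter the shared reward as noise that is unrelated to agent 0's own choice.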