AI RESEARCH
CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning
arXiv cs.LG
arXiv:2604.23308v1 Announce Type: new Offline multi-agent reinforcement learning (MARL) enables policy learning from fixed datasets, but is prone to coordination failure: agents trained on static, off-policy data converge to suboptimal joint behaviours because they cannot co-adapt as their policies change. We