CODA: Coordination via On-Policy Diffusion for Multi-Agent Offline Reinforcement Learning

arXiv:2604.23308v1 Announce Type: new Offline multi-agent reinforcement learning (MARL) enables policy learning from fixed datasets, but is prone to coordination failure: agents trained on static, off-policy data converge to suboptimal joint behaviours because they cannot co-adapt as their policies change. We