Zero Shot Coordination for Sparse Reward Tasks with Diverse Reward Shapings

arXiv:2604.25076v1 Announce Type: new Many Multi-Agent Reinforcement Learning (MARL) agents fail to adapt when cooperating with agents trained with the same objectives but different seeds, algorithms, or other