CAPO: Counterfactual Credit Assignment in Sequential Cooperative Teams

arXiv:2604.17693v1 Announce Type: new In cooperative teams where agents act in a fixed order and share a single team reward, it is hard to know how much each agent contributed, and harder still when agents are updated one at a time, because data collected earlier no longer reflects the new policies. We