Internal State-Based Policy Gradient Methods for Partially Observable Markov Potential Games

arXiv:2604.00433v1 Announce Type: cross This letter studies multi-agent reinforcement learning in partially observable Markov potential games. Solving this problem is challenging due to partial observability, decentralized information, and the curse of dimensionality. First, to address the first two challenges, we leverage the common information framework, which allows agents to act based on both shared and local information. Second, to ensure tractability, we study an internal state that compresses accumulated information, preventing it from growing unboundedly over time.
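
To make the idea of a bounded internal state concrete, the following is a minimal illustrative sketch, not the construction used in the letter. It assumes a single agent with a finite set of internal states, a softmax policy conditioned only on the current internal state, and a fixed lookup table for the internal-state update. The class name `InternalStateAgent`, the table-based update, and all sizes are hypothetical choices made for illustration; the common-information (shared) component is omitted.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class InternalStateAgent:
    """Hypothetical agent whose entire memory is one index in range(num_states)."""

    def __init__(self, num_states, num_obs, num_actions, seed=0):
        self.rng = random.Random(seed)
        self.num_states = num_states
        self.state = 0  # initial internal state (assumed)
        # One row of policy logits per internal state; these would be the
        # parameters a policy gradient method adjusts.
        self.policy_logits = [
            [self.rng.gauss(0.0, 1.0) for _ in range(num_actions)]
            for _ in range(num_states)
        ]
        # Assumed update rule: a fixed table (state, obs, action) -> next state.
        self.transition = {
            (z, o, a): self.rng.randrange(num_states)
            for z in range(num_states)
            for o in range(num_obs)
            for a in range(num_actions)
        }

    def act(self):
        """Sample an action from the policy conditioned on the internal state only."""
        probs = softmax(self.policy_logits[self.state])
        return self.rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, obs, action):
        """Fold the newest observation and action into the internal state."""
        self.state = self.transition[(self.state, obs, action)]
```

In this sketch the agent never stores its observation-action history: each call to `update` overwrites a single integer, so memory use stays constant no matter how many steps have elapsed, whereas a history-based policy would need an input that grows with time.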