GradMAP: Gradient-Based Multi-Agent Proximal Learning for Grid-Edge Flexibility

arXiv:2604.24549v1. Coordinating large populations of grid-edge devices requires learning methods that remain fully decentralised in deployment while still respecting three-phase AC distribution-network physics. This paper proposes gradient-based multi-agent proximal learning (GradMAP) to address this challenge. GradMAP trains an independent neural-network policy for each agent, with no parameter sharing, and each agent makes its online decisions from its own local observation alone, without communication. During offline.
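The decentralised execution described above can be illustrated with a minimal sketch. It is not the paper's implementation: the `LocalPolicy` class, the network size, the random initialisation, and the three-element observations are all hypothetical, chosen only to show that each agent owns its parameters and acts on its own observation.

```python
import math
import random


class LocalPolicy:
    """Tiny one-hidden-layer network owned by a single agent (hypothetical)."""

    def __init__(self, obs_dim, hidden_dim, seed):
        # Per-agent seed: every agent gets its own parameters, none are shared.
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(obs_dim)]
                   for _ in range(hidden_dim)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]

    def act(self, local_obs):
        """Map this agent's own observation to a setpoint in [-1, 1]."""
        hidden = [math.tanh(sum(w * x for w, x in zip(row, local_obs)))
                  for row in self.w1]
        return math.tanh(sum(w * h for w, h in zip(self.w2, hidden)))


def decentralised_step(policies, observations):
    """Each agent acts on its own observation only; nothing is exchanged."""
    return [policy.act(obs) for policy, obs in zip(policies, observations)]


# Three agents, each with an assumed 3-element local observation.
policies = [LocalPolicy(obs_dim=3, hidden_dim=4, seed=i) for i in range(3)]
observations = [[0.98, 0.2, 0.5], [1.02, -0.1, 0.3], [1.00, 0.0, 0.8]]
actions = decentralised_step(policies, observations)
```

Because `act` reads only the calling agent's observation, changing one agent's input leaves every other agent's action unchanged, which is the property that lets deployment run without communication.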