Asynchronous Policy Gradient Aggregation for Efficient Distributed Reinforcement Learning

arXiv:2509.24305v2 Announce Type: replace

We study distributed reinforcement learning (RL) with policy gradient methods under asynchronous and parallel computations and communications. While non-distributed methods are well understood theoretically and have achieved remarkable empirical success, their distributed counterparts remain less explored, particularly in the presence of heterogeneous asynchronous computations and communication bottlenecks. We