AI RESEARCH

LangMARL: Natural Language Multi-Agent Reinforcement Learning

arXiv CS.CL

arXiv:2604.00722v1

Large language model (LLM) agents struggle to autonomously evolve coordination strategies in dynamic environments, largely because coarse global outcomes obscure the causal signals needed for local policy refinement. We identify this bottleneck as a multi-agent credit assignment problem, which has long been studied in classical multi-agent reinforcement learning (MARL) but remains under-addressed in LLM-based systems.