Hail to the Thief: Exploring Attacks and Defenses in Decentralised GRPO
arXiv CS.LG
arXiv:2511.09780v2 Announce Type: replace Group Relative Policy Optimization (GRPO) has demonstrated wide adoption in the post-