Can GRPO Be 10x More Efficient? Kwai AI's SRPO Suggests Yes

Synced Review
Generative AI, Reinforcement Learning

Kwai AI's SRPO framework slashes LLM RL post-