SEIF: Self-Evolving Reinforcement Learning for Instruction Following

arXiv:2605.07465v1 Announce Type: new Instruction following is a fundamental capability of large language models (LLMs), yet continuously improving this capability remains challenging. Existing methods typically rely either on costly external supervision from humans or strong teacher models, or on self-play