AI RESEARCH

ReflectRM: Boosting Generative Reward Models via Self-Reflection within a Unified Judgment Framework

arXiv cs.CL

arXiv:2604.07506v1 Announce Type: cross

Reward Models (RMs) are critical components in the Reinforcement Learning from Human Feedback (RLHF) pipeline, directly determining the alignment quality of Large Language Models (LLMs). Recently, Generative Reward Models (GRMs) have emerged as a superior paradigm, offering greater interpretability and stronger generalization than traditional scalar RMs. However, existing GRM methods focus primarily on outcome-level supervision and neglect the quality of the analytical process, which constrains their potential.