AI RESEARCH

Reward Under Attack: Analyzing the Robustness and Hackability of Process Reward Models

arXiv cs.LG

arXiv:2603.06621v1 Announce Type: new Process Reward Models (PRMs) are rapidly becoming the backbone of LLM reasoning pipelines, yet we demonstrate that state-of-the-art PRMs are systematically exploitable under adversarial optimization pressure. To address this, we