Decoding-Time Debiasing via Process Reward Models: From Controlled Fill-in to Open-Ended Generation

arXiv:2605.02348v1 Announce Type: cross Large language models pick up social biases from the data they are trained on and carry those biases into downstream applications, often reinforcing stereotypes around gender, race, religion, disability, age, and socioeconomic status. The standard fixes (re