Process Supervision via Verbal Critique Improves Reasoning in Large Language Models
arXiv CS.CL
arXiv:2604.21611v1 Announce Type: new Inference-time scaling for LLM reasoning has focused on three axes: chain depth, sample breadth, and learned step-scorers (PRMs). We