Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

arXiv:2508.15202v2

Process Reward Models (PRMs) supervise intermediate reasoning steps in large language models (LLMs), but existing PRMs are mainly trained on general-domain data and struggle with the structured, symbolic, and fact-sensitive nature of financial reasoning. Financial tasks require not only correct final answers but also verifiable intermediate steps grounded in domain knowledge.