Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models
arXiv:2508.15202v2
Abstract: Process Reward Models (PRMs) supervise intermediate reasoning steps in large language models (LLMs), but existing PRMs are mainly trained on general-domain data and struggle with the structured, symb…