AI RESEARCH
Enhanced LLM Reasoning by Optimizing Reward Functions with Search-Driven Reinforcement Learning
arXiv cs.CL
arXiv:2605.02073v1 Announce Type: new Mathematical reasoning is a key benchmark for large language models. Reinforcement learning is a standard post-