AI RESEARCH
Is Mathematical Problem-Solving Expertise in Large Language Models Associated with Assessment Performance?
arXiv CS.AI
arXiv:2603.25633v1. Large Language Models (LLMs) are increasingly used in math education not only as problem solvers but also as assessors of learners' reasoning. However, it remains unclear whether stronger math problem-solving ability is associated with stronger step-level assessment performance. This study examines that relationship using the GSM8K and MATH subsets of PROCESSBENCH, a human-annotated benchmark for identifying the earliest erroneous step in mathematical reasoning.