AI RESEARCH
Physics-R1: An Audited Olympiad Corpus and Recipe for Visual Physics Reasoning
arXiv cs.CL
arXiv:2605.14040v1. Announce Type: new. We audit the multimodal-physics evaluation pipeline end-to-end and document three undetected construction practices that distort how the field measures vision-language reasoning: train-eval contamination, translation drift, and MCQ saturation. (1) Public