AI RESEARCH

Detecting Data Contamination from Reinforcement Learning Post-training for Large Language Models

arXiv CS.LG

arXiv:2510.09259v2 Announce Type: replace-cross Data contamination poses a significant threat to the reliable evaluation of Large Language Models (LLMs). This issue arises when benchmark samples may inadvertently appear in