AI RESEARCH
SciCoQA: Quality Assurance for Scientific Paper-Code Alignment
arXiv CS.AI
arXiv:2601.12910v2 Announce Type: replace-cross We present SciCoQA, a dataset for detecting discrepancies between scientific publications and their codebases to ensure faithful implementations. We construct SciCoQA from GitHub issues and reproducibility papers; to scale the dataset, we propose a synthetic data generation method for constructing paper-code discrepancies. We analyze the paper-code discrepancies in detail and propose discrepancy types and categories to better understand the mismatches that occur.