NeurIPS Should Require Reproducibility Standards for Frontier AI Safety Claims

arXiv:2605.08192v1 Announce Type: cross

Frontier AI safety claims - published assertions that a highly capable general-purpose model is below a threshold of concern, adequately mitigated, or suitable for release - increasingly shape model deployment, governance, and public trust. Yet the artefacts needed to evaluate them are routinely withheld, producing an evidential inversion: the most consequential claims in AI safety are often the least reproducible.