AI RESEARCH
Likelihood scoring for continuations of mathematical text: a self-supervised benchmark with tests for shortcut vulnerabilities
arXiv cs.LG
arXiv:2605.10810v1 Announce Type: new