VeriContest: A Competitive-Programming Benchmark for Verifiable Code Generation

arXiv:2605.08553v1 Announce Type: cross Large language models can generate useful code from natural language, but their outputs come without correctness guarantees. Verifiable code generation offers a path beyond testing by requiring models to produce not only executable code, but also formal specifications and machine-checkable proofs.
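
As a rough illustration of the three artifacts named above (executable code, a formal specification, and a machine-checkable proof), the following is a minimal Lean 4 sketch. It is not drawn from the benchmark: the function `myMax` and the theorem names are hypothetical, chosen only to show what the combination can look like on a toy problem.

```lean
-- Executable code: the maximum of two natural numbers.
-- (`myMax` is a hypothetical name used only for this illustration.)
def myMax (a b : Nat) : Nat := if a ≤ b then b else a

-- Formal specification, stated as theorems: the result bounds both inputs.
-- Machine-checkable proof: Lean accepts the file only if these proofs check.
theorem myMax_ge_left (a b : Nat) : a ≤ myMax a b := by
  unfold myMax
  split <;> omega

theorem myMax_ge_right (a b : Nat) : b ≤ myMax a b := by
  unfold myMax
  split <;> omega
```

Unlike a test suite, which samples a finite set of inputs, the two theorems hold for every pair of natural numbers once the proof checker accepts them.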