AI RESEARCH

TaoBench: Do Automated Theorem Prover LLMs Generalize Beyond MathLib?

arXiv cs.AI • March 16, 2026

arXiv:2603.12744v1 (cross-listed). Automated theorem proving (ATP) benchmarks largely consist of problems formalized in MathLib, so current ATP