AI RESEARCH

TaoBench: Do Automated Theorem Prover LLMs Generalize Beyond MathLib?

arXiv cs.AI

arXiv:2603.12744v1 Announce Type: cross Automated theorem proving (ATP) benchmarks largely consist of problems formalized in MathLib, so current ATP