AI RESEARCH
TaoBench: Do Automated Theorem Prover LLMs Generalize Beyond MathLib?
arXiv CS.AI
arXiv:2603.12744v1 Announce Type: cross Automated theorem proving (ATP) benchmarks largely consist of problems formalized in MathLib, so current ATP