DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality

arXiv:2603.05912v1. Search-augmented LLM agents can produce deep research reports (DRRs), but verifying claim-level factuality remains challenging. Existing fact-checkers are primarily designed for general-domain, factoid-style atomic claims, and there is no benchmark to test whether such verifiers transfer to DRRs. Yet building such a benchmark is itself difficult.