AI RESEARCH

[P] Cold Validation: Open-source system where one AI agent audits another with zero shared context

r/MachineLearning

We released an open-source architecture for independent AI agent verification. The core idea: the agent that built something should never review it. Cold validation uses two agents with strict separation - Builder (Claude Code) produces plans and code - Reviewer (Codex CLI) audits only artifacts - never sees reasoning - An orchestrator enforces phase gates and convergence The reviewer runs filesystem-isolated (temp dir, no repo access). Findings are tracked with durable fingerprints across rounds. The controller independently reconciles verdicts against blocking findings. Apache 2.0.