Why I Make Claude and Gemini Argue: Building an Adversarial Agentic Workflow (Open-Source Skill)

In traditional engineering, you'd never let a developer merge code without a peer review. So why are we letting AI grade its own homework? I've been building with Claude Code for 750+ sessions across multiple projects - a content pipeline, a marketing site, a design system, decision frameworks. Somewhere around session 200, I noticed a pattern: Claude is brilliant, but it has consistent blind spots. It favors certain architectures. It misses edge cases in its own prompts. It quietly accepts assumptions that a different perspective would challenge.