Stop AI from hallucinating E2E test selectors — code analysis + live browser exploration via Claude Agent SDK and 2 MCP servers

Generating E2E tests with an LLM sounds great in a demo. You hand a Playwright test spec to Claude, ask it to produce TypeScript code, and the tests pass against the toy app. Plug it into a real codebase and the wheels come off immediately. The AI confidently generates await page.click('-button') for a project where the actual element is Sign in. Selectors are invented from common patterns ("most projects use -button") rather than read from your code or your DOM. Result: nearly every generated test fails on first run, because the selectors are pure hallucination.