Why Single-Pass AI Test Generation Produces Garbage

After 9 years of writing test cases manually, I built an AI tool that generates them from User Stories. The first version used a single API call. The output looked reasonable until I tried to automate it.

Take "Verify the system works correctly." What does that mean in Playwright? Or "Enter valid data and submit." What data? Which field? What's the expected state after submit?

Single-pass AI treats test case writing like creative writing. But test cases are engineering artifacts: each step has to name the field, the data, and the expected state before anyone can automate it.
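To make the gap concrete, here is a minimal sketch of a check that asks the same three questions of a generated step. The `Step` structure, the `missing_details` helper, and the sample selector and values are all hypothetical, invented for illustration; they are not the format the tool itself uses.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical step structure, assumed for illustration only.
@dataclass
class Step:
    action: str                     # e.g. "fill" or "click"
    target: Optional[str] = None    # which field or element the step touches
    value: Optional[str] = None     # what data gets entered
    expected: Optional[str] = None  # observable state after the step


def missing_details(step: Step) -> List[str]:
    """Return the questions someone automating this step would still have to ask."""
    gaps = []
    if not step.target:
        gaps.append("Which field?")
    if step.action == "fill" and not step.value:
        gaps.append("What data?")
    if not step.expected:
        gaps.append("What is the expected state after submit?")
    return gaps


# "Enter valid data and submit." carries an action and nothing else.
vague = Step(action="fill")

# A step with enough detail to automate (selector and values are made up).
concrete = Step(
    action="fill",
    target="#email",
    value="qa@example.com",
    expected="Submit button is enabled",
)
```

Calling `missing_details(vague)` returns all three questions, while `missing_details(concrete)` returns an empty list. A step that still has open questions is prose, not something a Playwright script can be written from.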