MIT tested 41 AI models on 11,000 real tasks. The "good enough" problem is worse than you think.

Everyone's debating whether AI will replace jobs. The MIT study this week asks a better question: what happens when AI delivers "acceptable" work and nobody checks?

The numbers:
→ 65% of text tasks pass at minimal quality
→ 0% reliably hit "superior" on complex tasks
→ Management, judgment, coordination: 53% success rate

Real consequences already documented:
- A consulting firm delivered hallucinated reports to government clients
- Law firms submitted fake citations in court filings
- Media outlets published articles under fake bylines

In every case, someone had "reviewed" the output.