AcademiClaw: The Benchmark Where Even the Best AI Agents Flunk 45% of Real Student Work

Towards AI • May 07, 2026

Generative AI • AI Research

80 Real Student Tasks Reveal a 55% Ceiling, a Token-Quality Disconnect, and Three Distinct Ways AI Agents Fail