AI RESEARCH

Prompt Architecture Determines Reasoning Quality: A Variable Isolation Study on the Car Wash Problem

arXiv cs.AI

arXiv:2602.21814v2 Announce Type: replace

Large language models consistently fail the "car wash problem," a viral reasoning benchmark that requires inferring an implicit physical constraint. We present a variable isolation study (n=20 per condition, 6 conditions, 120 total trials) examining which prompt architecture layers in a production system enable correct reasoning.