The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More

arXiv:2603.23971v1 Announce Type: cross Developers and consumers increasingly choose reasoning language models (RLMs) based on their listed API prices. However, how accurately do these prices reflect actual inference costs? We conduct the first systematic study of this question, evaluating 8 frontier RLMs across 9 diverse tasks covering competition math, science QA, code generation, and multi-domain reasoning.
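
To make the question concrete, the following is a minimal sketch of how a listed per-token price and an actual per-query cost can diverge. It is not taken from the paper: the `Pricing` class, the `query_cost` helper, and every price and token count are hypothetical, and the sketch assumes that reasoning tokens are billed at the output-token rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical listed prices in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


def query_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one query.

    Assumption: output_tokens includes any reasoning tokens, billed at the
    output rate.
    """
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


# Hypothetical models: model_a lists half the price of model_b.
model_a = Pricing(input_per_million=1.0, output_per_million=4.0)
model_b = Pricing(input_per_million=2.0, output_per_million=8.0)

# Hypothetical token usage on the same prompt: model_a emits four times
# as many output tokens as model_b.
cost_a = query_cost(model_a, input_tokens=1_000, output_tokens=20_000)
cost_b = query_cost(model_b, input_tokens=1_000, output_tokens=5_000)

# True when the model with the lower listed price costs more per query.
reversal = cost_a > cost_b
```

With these illustrative numbers, the model with the lower listed price is the more expensive one per query, because its higher token consumption outweighs its price advantage.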