AI models suck slightly less at math than they did last year

Latest ORCA test results out

Exclusive: Current-day LLMs are prediction engines and, as such, they can only find the most likely solution to a problem, which is not necessarily the correct one. Though popular models have mostly become better at math, even top performer Gemini 3 Flash would receive a C if assessed with a letter grade.