SynthWorlds: Controlled Parallel Worlds for Disentangling Reasoning and Knowledge in Language Models

arXiv:2510.24427v3 Announce Type: replace Evaluating the reasoning ability of language models (LMs) is complicated by their extensive parametric world knowledge: benchmark performance often reflects factual recall rather than genuine reasoning. Existing datasets and approaches (e.g., temporal filtering, paraphrasing, adversarial substitution) cannot cleanly separate the two. We present SynthWorlds, a framework that disentangles task reasoning complexity from factual knowledge.