Evaluating Counterfactual Strategic Reasoning in Large Language Models

arXiv:2603.19167v1 Announce Type: new

We evaluate Large Language Models (LLMs) in repeated game-theoretic settings to assess whether strategic performance reflects genuine reasoning or reliance on memorized patterns. We consider two canonical games, Prisoner's Dilemma (PD) and Rock-Paper-Scissors (RPS), upon which we