AI RESEARCH
Evaluating Prompting and Execution-Based Methods for Deterministic Computation in LLMs
arXiv CS.AI
arXiv:2605.03227v1 Announce Type: new Large Language Models (LLMs) have demonstrated strong capabilities in natural language understanding and reasoning. However, their ability to perform exact, deterministic computation remains unclear. In this work, we systematically evaluate multiple prompting strategies, including Chain-of-Thought (CoT), Least-to-Most decomposition, Program-of-Thought (PoT), and Self-Consistency (SC), on tasks requiring precise and error-free outputs, including binary counting, longest substring detection, and arithmetic evaluation. To this study, we