Relative Kinetic Utility for Reasoning-Aware Structural Pruning in Large Language Models

arXiv:2605.09008v1 Announce Type: new Chain-of-Thought (CoT) prompting has substantially improved the reasoning capabilities of Large Language Models (LLMs). However, scaling up test-time computation yields extensive CoT sequences,