AI RESEARCH

The Power of Power Law: Asymmetry Enables Compositional Reasoning

arXiv cs.LG

arXiv:2604.22951v1 Announce Type: cross Natural language data follows a power-law distribution, with most knowledge and skills appearing at very low frequency. While a common intuition suggests that reweighting or curating data towards a uniform distribution may help models better learn these long-tail skills, we find a counterintuitive result: across a wide range of compositional reasoning tasks, such as state tracking and multi-step arithmetic