AI RESEARCH
The Power of Power Law: Asymmetry Enables Compositional Reasoning
arXiv CS.LG
arXiv:2604.22951v1 Announce Type: cross Natural language data follows a power-law distribution, with most knowledge and skills appearing at very low frequency. While a common intuition suggests that reweighting or curating data towards a uniform distribution may help models better learn these long-tail skills, we find a counterintuitive result: across a wide range of compositional reasoning tasks, such as state tracking and multi-step arithmetic