AI RESEARCH
How Hyper-Datafication Impacts the Sustainability Costs in Frontier AI
arXiv CS.AI
arXiv:2602.00056v3 Announce Type: replace-cross
Large-scale data has fuelled the success of frontier artificial intelligence (AI) models over the past decade. This expansion has relied on sustained efforts by large technology corporations to aggregate and curate internet-scale datasets. In this work, we examine the environmental, social, and economic costs of large-scale data in AI through a sustainability lens. We argue that the field is shifting from building models from data to actively creating data for building models.