BenCSSmark: Making the Social Sciences Count in LLM Research

arXiv:2605.04886v1 Announce Type: new This position paper argues that the under-representation of social science tasks in contemporary LLM benchmarks limits advances in both LLM evaluation and social scientific inquiry. Benchmarks -- standardized tools for assessing computational systems -- are pivotal in the development of artificial intelligence (AI), including large language models (LLMs). Benchmarks do more than measure progress -- they actively structure it, shaping reputations, research agendas, and commercial outcomes.