AI RESEARCH

Learning Who Disagrees: Demographic Importance Weighting for Modeling Annotator Distributions with DiADEM

arXiv CS.CL

arXiv:2604.08425v1 Announce Type: cross When humans label subjective content, they disagree, and that disagreement is not noise. It reflects genuine differences in perspective shaped by annotators' social identities and lived experiences. Yet standard practice still flattens these judgments into a single majority label, and recent LLM-based approaches fare no better: we show that prompted large language models, even with chain-of-thought reasoning, fail to recover the structure of human disagreement. We.