Evaluating LLMs for Demographic-Targeted Social Bias Detection: A Comprehensive Benchmark Study

arXiv:2510.04641v3 Announce Type: replace Large-scale web-scraped text corpora used to train general-purpose AI models often contain harmful demographic-targeted social biases, creating a regulatory need for data auditing and for scalable bias-detection methods. Although prior work has investigated biases in text datasets and related detection methods, these studies remain narrow in scope. They typically focus on a single content type (e.g., hate speech), cover a limited set of demographic axes, overlook biases that affect multiple demographics simultaneously, and analyze a limited range of techniques.