Assessing and Mitigating Miscalibration in LLM-Based Social Science Measurement

arXiv:2605.11954v1 Announce Type: new Large language models (LLMs) are increasingly used in social science as scalable measurement tools for converting unstructured text into variables that can enter standard empirical designs. Measurement validity demands more than high average accuracy: it also requires well-calibrated confidence that faithfully reflects the empirical probability that each measurement is correct. This paper studies model miscalibration in LLM-based social science measurement.
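To make the notion of calibration concrete, the sketch below computes the expected calibration error (ECE), a common summary of the gap between stated confidence and empirical accuracy. The abstract does not say which metric the paper uses, so ECE is an assumption here. The function name `expected_calibration_error`, the equal-width binning, and the toy inputs are illustrative only and are not taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Expected calibration error with equal-width confidence bins.

    confidences: floats in [0, 1], one per LLM-produced measurement
    correct:     booleans, True where the measurement matched the reference label
    n_bins:      number of equal-width bins (an illustrative default)
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        raise ValueError("at least one measurement is required")

    # Assign each measurement to a bin by its confidence; 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    # Weighted average of |mean confidence - accuracy| over non-empty bins.
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(avg_conf - accuracy)
    return ece


# Hypothetical toy data: ten labels, each reported at 0.9 confidence, six correct.
overconfident = expected_calibration_error([0.9] * 10, [True] * 6 + [False] * 4)

# Hypothetical toy data: four labels at 0.75 confidence, three correct.
calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])
```

In the first toy case the model claims 90% confidence but is right 60% of the time, so the ECE is about 0.3. In the second, confidence and accuracy coincide and the ECE is 0. A measurement tool can therefore have acceptable average accuracy and still be poorly calibrated.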