AI RESEARCH
Evaluating Human-AI Safety: A Framework for Measuring Harmful Capability Uplift
arXiv CS.AI
arXiv:2603.26676v1 Announce Type: cross
Current frontier AI safety evaluations emphasize static benchmarks, third-party annotations, and red-teaming. In this position paper, we argue that AI safety research should focus on human-centered evaluations that measure harmful capability uplift: the marginal increase in a user's ability to cause harm with a frontier model beyond what conventional tools already enable. We frame harmful capability uplift as a core AI safety metric, ground it in prior social science research, and provide concrete methodological guidance for systematic measurement.