Size-adaptive Hypothesis Testing for Fairness

arXiv:2506.10586v2 Announce Type: replace-cross Determining whether an algorithmic decision-making system discriminates against a specific demographic typically involves comparing a single point estimate of a fairness metric against a predefined threshold. This practice is statistically brittle: it ignores sampling error and treats small demographic subgroups the same as large ones. The problem intensifies in intersectional analyses, where multiple sensitive attributes are considered jointly, giving rise to a larger number of smaller groups.
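
To make the brittleness concrete, here is a minimal sketch of how the same point estimate can carry very different uncertainty depending on subgroup size. It is not the method proposed in the paper: the fairness metric (a difference in positive-outcome rates), the normal-approximation (Wald) interval, the function name `parity_difference_ci`, and all counts are assumptions chosen for illustration only.

```python
import math

def parity_difference_ci(pos_a, n_a, pos_b, n_b, z=1.96):
    """Difference in positive-outcome rates between groups A and B,
    with a normal-approximation (Wald) interval. Hypothetical helper;
    the Wald interval is itself unreliable for very small groups."""
    p_a = pos_a / n_a
    p_b = pos_b / n_b
    diff = p_a - p_b
    # Standard error of a difference of two independent proportions.
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se

# Identical observed rates (50% vs 60%), but group A has 20 members
# in the first case and 2000 in the second. Counts are made up.
small = parity_difference_ci(10, 20, 600, 1000)
large = parity_difference_ci(1000, 2000, 600, 1000)

for name, (diff, lo, hi) in (("small", small), ("large", large)):
    print(f"{name}: diff={diff:.3f}, interval=({lo:.3f}, {hi:.3f})")
```

Both cases yield the same point estimate, so a fixed threshold treats them identically. Under these assumed counts, however, the interval for the 20-member group spans zero, while the interval for the 2000-member group does not.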