AI RESEARCH

Locating Demographic Bias at the Attention-Head Level in CLIP's Vision Encoder

arXiv cs.AI

arXiv:2603.11793v1 Announce Type: cross

Standard fairness audits of foundation models quantify that a model is biased, but not where inside the network the bias resides. We propose a mechanistic fairness audit that combines projected residual-stream decomposition, zero-shot Concept Activation Vectors, and bias-augmented TextSpan analysis to locate demographic bias at the level of individual attention heads in vision transformers. As a feasibility study, we apply this pipeline to the CLIP ViT-L-14 encoder on 42 profession classes of the FACET benchmark, auditing both gender and age bias.
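To make the head-level idea concrete, here is a minimal sketch of one way per-head contributions could be scored against a concept direction. It is not the authors' pipeline: the function names `zero_shot_cav` and `head_bias_scores`, the array shapes, and the synthetic data are all assumptions for illustration, and the TextSpan step is not covered. The sketch assumes the image embedding has already been decomposed into additive per-head contributions in the joint image-text space, and it takes the concept direction to be the normalized difference of two text embeddings.

```python
import numpy as np


def zero_shot_cav(text_emb_a, text_emb_b):
    """Hypothetical helper: unit direction between two text embeddings,
    e.g. prompts describing two demographic groups."""
    direction = text_emb_a - text_emb_b
    return direction / np.linalg.norm(direction)


def head_bias_scores(head_contribs, group_labels, cav):
    """Hypothetical helper: per-head gap in mean projection onto the CAV.

    head_contribs: (n_images, n_layers, n_heads, dim) additive per-head
        contributions to the projected image embedding (assumed given).
    group_labels: (n_images,) boolean array marking one of two groups.
    Returns an (n_layers, n_heads) array of between-group gaps.
    """
    proj = head_contribs @ cav  # (n_images, n_layers, n_heads)
    return proj[group_labels].mean(axis=0) - proj[~group_labels].mean(axis=0)


# Synthetic stand-in data; a real audit would use CLIP activations.
rng = np.random.default_rng(0)
n_images, n_layers, n_heads, dim = 200, 4, 8, 32
cav = zero_shot_cav(rng.normal(size=dim), rng.normal(size=dim))
labels = rng.random(n_images) < 0.5
contribs = rng.normal(scale=0.1, size=(n_images, n_layers, n_heads, dim))

# Plant a group-dependent signal in layer 3, head 5.
contribs[labels, 3, 5] += 2.0 * cav

scores = head_bias_scores(contribs, labels, cav)
top = np.unravel_index(np.abs(scores).argmax(), scores.shape)
print("head with largest gap (layer, head):", tuple(int(i) for i in top))
```

On this synthetic data the head with the largest gap is the planted one, layer 3, head 5. A gap in mean projection is only one possible scoring rule; the abstract does not specify which statistic the authors use.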