AI RESEARCH

Redirected, Not Removed: Task-Dependent Stereotyping Reveals the Limits of LLM Alignments

arXiv CS.CL

ArXi:2604.02669v1 Announce Type: new How biased is a language model? The answer depends on how you ask. A model that refuses to choose between castes for a leadership role will, in a fill-in-the-blank task, reliably associate upper castes with purity and lower castes with lack of hygiene. Single-task benchmarks miss this because they capture only one slice of a model's bias profile. We