Self-Debias: Self-correcting for Debiasing Large Language Models

arXiv:2604.08243v1 Announce Type: new Although Large Language Models (LLMs) demonstrate remarkable reasoning capabilities, inherent social biases often cascade throughout the Chain-of-Thought (CoT) process, leading to continuous "Bias Propagation". Existing debiasing methods primarily focus on static constraints or external interventions, failing to identify and interrupt this propagation once triggered. To address this limitation, we