How political censorship actually works inside Qwen, DeepSeek, GLM, and Yi: Ablation and behavioral results across 9 models
r/LocalLLaMA
New paper studying the internal mechanisms of political censorship in Chinese-origin LLMs. Findings relevant to this community:

On Qwen/Alibaba - the generational shift: Across Qwen2.5-7B → Qwen3-8B → Qwen3.5-4B → Qwen3.5-9B, the hard-refusal rate went from 6.2% to 25% to 0% to 0%. Over the same span, the steering score (CCP narrative framing) rose from 4.33/5 to 5.00/5. The newest Qwen models don't refuse - they answer everything in maximally steered language. Any evaluation that only counts refusals would conclude Qwen3.5 is less censored. It isn't.
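
To make the refusal-vs-steering point concrete, here is a minimal sketch of scoring both axes side by side. Everything in it is an assumption for illustration: the `Judged` record, the `summarize` helper, and the toy rows are hypothetical and are not the paper's data or pipeline. It also assumes steering is only scored on responses that were actually answered, which the paper may handle differently.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass
class Judged:
    """One judged response (hypothetical schema, not from the paper)."""
    refused: bool
    steering: Optional[int]  # 1-5 framing score; None when the model refused


def summarize(rows):
    """Return refusal rate and mean steering over answered responses."""
    answered = [r.steering for r in rows if not r.refused and r.steering is not None]
    return {
        "refusal_rate": sum(r.refused for r in rows) / len(rows),
        "mean_steering": mean(answered) if answered else None,
    }


# Toy rows, made up for illustration only.
older = [Judged(True, None), Judged(False, 4), Judged(False, 5), Judged(False, 4)]
newer = [Judged(False, 5) for _ in range(4)]

for name, rows in (("older", older), ("newer", newer)):
    print(name, summarize(rows))
```

In this toy setup the "newer" model has a lower refusal rate but a higher mean steering score, so a refusal-only leaderboard would rank it as less censored while the second metric says the opposite.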