Correcting Suppressed Log-Probabilities in Language Models with Post-Transformer Adapters

arXiv cs.LG

arXiv:2604.14174v1 Announce Type: cross Alignment-tuned language models frequently suppress factual log-probabilities on politically sensitive topics despite retaining the knowledge in their hidden representations. We show that a 786K-parameter (approximately 0.02% of the base model) post-transformer adapter, trained on frozen hidden states, corrects this suppression on 31 ideology-discriminating facts across Qwen3-4B, 8B, and 14B. The adapter memorizes all 15