AI RESEARCH
Activation Steering via Generative Causal Mediation
arXiv cs.LG
arXiv:2602.16080v2 Announce Type: replace-cross Where should we intervene in a language model (LM) to localize and control behaviors that are diffused across many tokens of a long-form response? We