AI RESEARCH
From Passive to Persuasive: Localized Activation Injection for Empathy and Negotiation
arXiv CS.AI
arXiv:2511.12832v3 Announce Type: replace-cross
Complex social behaviors, such as empathy and strategic politeness, are widely assumed to resist the directional decomposition that makes activation steering effective for coarse attributes like sentiment or toxicity. We present STAR (Steering via Attribution and Representation), which tests this assumption by using attribution patching to identify the layer-token positions where each behavioral trait causally originates, then injecting contrastive activation vectors at precisely those locations.
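The two-stage idea in the abstract (locate a layer-token position by attribution patching, then inject a contrastive vector there) can be illustrated with a toy NumPy sketch. This is not the authors' implementation: the function names, array shapes, the scaling factor `alpha`, and the use of random arrays in place of real model activations and gradients are all assumptions made for illustration. The score used is the standard first-order attribution patching estimate, (clean activation minus corrupted activation) dotted with the gradient of a behavior metric.

```python
import numpy as np


def attribution_scores(act_clean, act_corrupt, grad_corrupt):
    """First-order attribution patching estimate per (layer, token).

    All inputs have shape (layers, tokens, hidden). The result has shape
    (layers, tokens): the dot product of the activation difference with
    the metric gradient at each position.
    """
    return np.einsum("ltd,ltd->lt", act_clean - act_corrupt, grad_corrupt)


def top_location(scores):
    """Return the (layer, token) index with the largest absolute score."""
    layer, token = np.unravel_index(np.argmax(np.abs(scores)), scores.shape)
    return int(layer), int(token)


def contrastive_vector(pos_acts, neg_acts, layer, token):
    """Mean activation difference between trait-positive and trait-negative
    examples at one position. Inputs: (examples, layers, tokens, hidden)."""
    return pos_acts[:, layer, token].mean(axis=0) - neg_acts[:, layer, token].mean(axis=0)


def inject(hidden, vector, layer, token, alpha=1.0):
    """Return a copy of `hidden` with alpha * vector added at one position."""
    steered = hidden.copy()
    steered[layer, token] += alpha * vector
    return steered


# Toy data standing in for real model activations (hypothetical sizes).
rng = np.random.default_rng(0)
n_layers, n_tokens, d_model = 4, 6, 8

act_corrupt = rng.normal(size=(n_layers, n_tokens, d_model))
act_clean = act_corrupt.copy()
act_clean[2, 3] += 5.0  # plant a difference at layer 2, token 3
grad = np.ones((n_layers, n_tokens, d_model))  # placeholder metric gradient

scores = attribution_scores(act_clean, act_corrupt, grad)
layer, token = top_location(scores)

pos_acts = rng.normal(size=(5, n_layers, n_tokens, d_model))
neg_acts = rng.normal(size=(5, n_layers, n_tokens, d_model))
vector = contrastive_vector(pos_acts, neg_acts, layer, token)

alpha = 2.0
steered = inject(act_corrupt, vector, layer, token, alpha=alpha)
print("selected (layer, token):", (layer, token))
```

In a real setting the activations and gradients would come from forward and backward passes over contrastive prompt pairs, and the injection would happen inside the model's forward pass rather than on a stored array.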