Painless Activation Steering: An Automated, Lightweight Approach for Post-Training Large Language Models

arXiv:2509.22739v3 Announce Type: replace-cross Language models (LMs) are typically post-trained for desired capabilities and behaviors via weight-based or prompt-based steering, but the former is time-consuming and expensive, and the latter is not precisely controllable and often requires manual trial-and-error. While activation steering (AS) promises a cheap, fast, and controllable alternative to the two existing post-