Steer Like the LLM: Activation Steering that Mimics Prompting

arXiv:2605.03907v1 Announce Type: cross Large language models can be steered at inference time through prompting or activation interventions, but activation steering methods often underperform prompt-based approaches. We propose a framework that formulates prompt steering as a form of activation steering and investigate whether distilling successful prompt-steering behavior into simpler, interpretable models can close this gap.
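
The framing of prompt steering as activation steering can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not specify the intervention, so the additive form, the per-token difference, and every name below (`steer`, `prompt_induced_shift`, `alpha`, the array shapes, the random toy data) are assumptions made for illustration. The idea shown is that a steering prompt changes the hidden states of the tokens that follow it, and that change can be read as a position-dependent activation edit, which a fixed steering vector can only approximate.

```python
import numpy as np


def steer(hidden, direction, alpha=1.0):
    """Additive activation steering: shift each token's hidden state by alpha * direction."""
    return hidden + alpha * direction


def prompt_induced_shift(h_prompted, h_plain):
    """Per-token activation difference between a prompted and an unprompted run."""
    return h_prompted - h_plain


# Toy stand-ins for one layer's hidden states (4 tokens, hidden size 8).
# In a real model these would come from two forward passes over the same
# tokens, one with a steering prompt prepended and one without (assumption).
rng = np.random.default_rng(0)
h_plain = rng.normal(size=(4, 8))
h_prompted = h_plain + rng.normal(size=(4, 8))

# The prompt's effect expressed as an activation edit, one vector per token.
delta = prompt_induced_shift(h_prompted, h_plain)

# A single fixed steering vector (here the mean shift) is a coarser,
# position-independent approximation of that per-token edit.
mean_direction = delta.mean(axis=0)
h_steered = steer(h_plain, mean_direction)
```

Adding `delta` back to `h_plain` reproduces the prompted activations exactly, whereas `h_steered` applies the same shift to every token. The gap between the two is one way to picture why a fixed vector may fall short of prompting.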