Beyond Linear Steering: Unified Multi-Attribute Control for Language Models

arXiv:2505.24535v3 Announce Type: replace-cross Controlling multiple behavioral attributes in large language models (LLMs) at inference time is a challenging problem due to interference between attributes and the limitations of linear steering methods, which assume additive behavior in activation space and require per-attribute tuning. We