AI RESEARCH
Mechanisms of Introspective Awareness
arXiv cs.LG
arXiv:2603.21396v1 Announce Type: new Recent work shows that LLMs can sometimes detect when steering vectors are injected into their residual stream and identify the injected concept, a phenomenon cited as evidence of "