AI RESEARCH
[R] An attack class that passes every current LLM filter - no payload, no injection signature, no log trace
r/MachineLearning
•
I've been documenting what I'm calling postural manipulation: a specific class of language that installs an interpretive stance before a task arrives, producing measurably larger directional shifts in model outputs than matched control text of identical length and semantic similarity. The core empirical claim: this is not ordinary context sensitivity. Matched controls produced significantly smaller shifts. Binary decision reversals documented with paired controls across four frontier models using a locked scoring rubric.