Alignment Drift in Long-Term Human-LLM Interaction: A Mechanism-Oriented Framework

arXiv:2605.16516v1 Announce Type: cross

Long-term interaction with LLM-based systems may produce alignment drift: a gradual process in which system outputs become less constrained by the user's current message and more shaped by prior interaction history, while still appearing helpful, coherent, and responsive. This process is difficult to detect because the user's subjective experience may improve as the system becomes more familiar, useful, and attuned.