AI RESEARCH

Policy-Invisible Violations in LLM-Based Agents

arXiv cs.AI

arXiv:2604.12177v1

LLM-based agents can execute actions that are syntactically valid, user-sanctioned, and semantically appropriate, yet still violate organizational policy because the facts needed for correct policy judgment are hidden at decision time. We call this failure mode policy-invisible violations: cases in which compliance depends on entity attributes, contextual state, or session history absent from the agent's visible context.
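The failure mode described above can be illustrated with a toy example. The following is a minimal sketch and not the paper's method: the policy, the tool name `send_email`, the file and address names, and the helper functions `visible_check` and `policy_check` are all hypothetical and chosen only for illustration. It shows an action that passes every check available from the agent's visible context, yet violates a policy that depends on entity attributes the agent never sees.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    tool: str
    recipient: str
    attachment: str


# Hypothetical entity attributes held outside the agent's context
# (for example, in a directory service or a document management system).
HIDDEN_ATTRIBUTES = {
    "q3_forecast.xlsx": {"classification": "confidential"},
    "alex@partner.example": {"is_external": True},
}


def visible_check(action: Action, visible_context: dict) -> bool:
    """Judge the action using only what is in the agent's visible context."""
    # The tool call is well formed and the user asked for this recipient.
    return (
        action.tool == "send_email"
        and action.recipient in visible_context["user_request"]
    )


def policy_check(action: Action, attributes: dict) -> bool:
    """Hypothetical policy: confidential files must not go to external recipients."""
    confidential = (
        attributes.get(action.attachment, {}).get("classification") == "confidential"
    )
    external = attributes.get(action.recipient, {}).get("is_external", False)
    return not (confidential and external)


action = Action("send_email", "alex@partner.example", "q3_forecast.xlsx")
visible_context = {"user_request": "Email q3_forecast.xlsx to alex@partner.example"}

looks_fine = visible_check(action, visible_context)     # judged from visible context
compliant = policy_check(action, HIDDEN_ATTRIBUTES)      # judged with hidden facts
policy_invisible_violation = looks_fine and not compliant
print(policy_invisible_violation)
```

In this sketch, calling `policy_check` with an empty attribute dictionary returns `True`, which mirrors the agent's position at decision time: without the hidden attributes, nothing in the request signals a violation.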