Agent Privilege Separation in OpenClaw: A Structural Defense Against Prompt Injection

ArXi:2603.13424v1 Announce Type: cross Prompt injection remains one of the most practical attack vectors against LLM-integrated applications. We replicate the Microsoft LLMail-Inject benchmark (Greshake, 2024) against current generation models running inside OpenClaw, an open source multitool agent platform. Our proposed defense combines two mechanisms: agent isolation, implemented as a privilege separated two-agent pipeline with tool partitioning, and JSON formatting, which produces structured output that strips persuasive framing before the action agent processes it.