AI RESEARCH

Intentional Deception as Controllable Capability in LLM Agents

arXiv CS.AI

arXiv:2603.07848v1 Announce Type: new

As LLM-based agents increasingly operate in multi-agent systems, understanding adversarial manipulation becomes critical for defensive design. We present a systematic study of intentional deception as an engineered capability. Our experimental testbed is a text-based RPG with LLM-to-LLM interactions, in which agents are assigned parameterized behavioral profiles (9 alignments × 4 motivations, yielding 36 profiles with explicit ethical ground truth).
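The profile count follows from a simple cross product, which can be sketched as below. This is only an illustration of the 9 × 4 = 36 arithmetic, not the authors' implementation. The abstract does not name the alignments or the motivations, so the 3 × 3 alignment grid, the `motivation_N` placeholders, and the `Profile` class are all assumptions introduced here.

```python
from dataclasses import dataclass
from itertools import product

# Assumption: the 9 alignments form a 3 x 3 grid of labels.
# The source states only that there are 9 alignments.
ORDER_AXIS = ("lawful", "neutral", "chaotic")
MORAL_AXIS = ("good", "neutral", "evil")
ALIGNMENTS = tuple(f"{o} {m}" for o, m in product(ORDER_AXIS, MORAL_AXIS))

# Placeholder names: the source does not list the 4 motivations.
MOTIVATIONS = ("motivation_1", "motivation_2", "motivation_3", "motivation_4")


@dataclass(frozen=True)
class Profile:
    """Hypothetical container for one behavioral profile."""
    alignment: str
    motivation: str


def build_profiles():
    """Return every (alignment, motivation) pairing."""
    return [Profile(a, m) for a, m in product(ALIGNMENTS, MOTIVATIONS)]


profiles = build_profiles()
print(len(profiles))  # 36
```

Keeping each profile as an immutable value makes it straightforward to use one as a ground-truth label when scoring an agent's behavior against its assigned profile.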