AI RESEARCH

DeceptGuard: A Constitutional Oversight Framework for Detecting Deception in LLM Agents

arXiv cs.CL

arXiv:2603.13791v1

Reliable detection of deceptive behavior in Large Language Model (LLM) agents is an essential prerequisite for safe deployment in high-stakes agentic contexts. Prior work on scheming detection has focused exclusively on black-box monitors that observe only externally visible tool calls and outputs, discarding potentially rich internal reasoning signals.