PRISM: Prompt Reliability via Iterative Simulation and Monitoring for Enterprise Conversational AI

arXiv:2605.15665v1

Deploying large language model (LLM)-driven conversational agents in enterprise settings requires prompts that are simultaneously correct at launch and resilient to the non-deterministic behavioral drift that characterizes production LLM deployments. Existing prompt optimization frameworks address prompt quality as a one-time compile-time problem, leaving open the equally critical question of how to detect and repair prompt regressions caused by silent LLM behavior changes over time.