Microsoft: LLMs Corrupt ~25% of Document Content in Long Edits

Dev.to AI

Microsoft researchers report that current AI assistants corrupt about 25% of document content, on average, during long editing jobs, and that the failures compound silently. The paper, titled "LLMs Corrupt Your Documents When You Delegate," tests 19 models across 52 domains, with 20 sequential editing interactions per run.

Key facts

- 19 models tested across 52 domains.
- 20 sequential editing interactions per run.
- ~25% of document content corrupted on average.
- Agentic tool use did not improve outcomes.
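
To get a feel for how small per-edit losses could add up to the ~25% figure over 20 edits, here is a back-of-the-envelope sketch. It assumes each edit independently preserves the same fraction of the document, which is a simplification made for illustration and not something the paper is stated to claim; the function names are hypothetical. Under that assumption, the implied loss works out to roughly 1.4% per edit.

```python
def per_edit_retention(final_retention: float, num_edits: int) -> float:
    """Retention per edit r such that r ** num_edits == final_retention.

    Assumes (for illustration only) that every edit preserves the same
    fraction of content, independently of the others.
    """
    return final_retention ** (1.0 / num_edits)


def retention_after(per_edit: float, num_edits: int) -> float:
    """Fraction of content still intact after num_edits edits."""
    return per_edit ** num_edits


# Figures from the article: ~25% corrupted after 20 sequential edits.
r = per_edit_retention(final_retention=0.75, num_edits=20)
print(f"implied per-edit loss: {(1 - r) * 100:.2f}%")

# Show how the loss accumulates over the session under this assumption.
for n in (1, 5, 10, 20):
    print(f"after {n:2d} edits: {(1 - retention_after(r, n)) * 100:.1f}% corrupted")
```

The point of the sketch is only that a loss too small to notice in any single edit can still reach about a quarter of the document by the end of a 20-edit session.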