LLMs Corrupt Your Documents When You Delegate
A recent study reveals that Large Language Models (LLMs) can significantly degrade document quality during delegated tasks. The research, conducted using a new framework called DELEGATE-52, found that even advanced models corrupt an average of 25% of document content. This raises concerns about the reliability of LLMs in professional workflows, as errors can accumulate over time.
- The study introduced DELEGATE-52 to evaluate LLM performance in delegated workflows across 52 professional domains.
- Current LLMs, including Gemini 3.1 Pro and GPT 5.4, were found to corrupt an average of 25% of document content during long interactions.
- The degradation of documents was exacerbated by factors such as document size and the presence of distractor files.
arXiv:2604.15597 (cs), Computation and Language. Submitted on 17 Apr 2026.
Authors: Philippe Laban, Tobias Schnabel, Jennifer Neville
Abstract: Large Language Models (LLMs) are poised to disrupt knowledge work, with the emergence of delegated work as a new interaction paradigm (e.g., vibe coding). Delegation requires trust - the expectation that the LLM will faithfully execute the task without introducing errors into documents. We introduce DELEGATE-52 to study the readiness of AI systems in delegated workflows.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at arXiv.org.