A maintenance agent: 412 fixed, 14 refused. The 14 are the point
A maintenance agent ran against a real SaaS codebase for 10 days. It automatically fixed 412 of the 559 issues it filed and escalated 14 for human review. The agent operated under a strict 'audit-only, never add' rule, which barred it from introducing new infrastructure, services, or dependencies. The 14 escalations, where the agent refused to act because a change could have system-wide impact, are the evidence that those constraints worked.
- The agent filed 559 bugs and auto-resolved 412 of them (73.7%). It escalated 14 for human review because of their high-risk implications.
- Specialist agents handled different domains, with auto-fix rates ranging from 90% in security to 51% in API design, reflecting the risk and scope of the changes in each domain.
- The agent followed a strict 'audit-only' rule, refusing to implement fixes that required adding new infrastructure, services, or dependencies.
- Escalated issues included high-risk scenarios such as schema migrations that could cause data loss or break existing functionality.
- The system prioritized safety by reverting fixes that failed tests and committing only changes that passed the full test suite.
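The escalate-or-fix decision and the test gate described in the bullets above can be sketched in a few lines. This is a minimal illustration, not the author's implementation: `ProposedFix`, its two flags, `handle_fix`, and the four callbacks are hypothetical names invented here, and the real agent presumably decides what counts as "adding" in a richer way.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Outcome(Enum):
    COMMITTED = "committed"
    REVERTED = "reverted"
    ESCALATED = "escalated"


@dataclass
class ProposedFix:
    # All fields are hypothetical; the article does not describe its data model.
    bug_id: str
    adds_dependency: bool = False  # new package, service, or infrastructure
    touches_schema: bool = False   # migration that could lose data


def handle_fix(
    fix: ProposedFix,
    apply: Callable[[ProposedFix], None],
    run_full_suite: Callable[[], bool],
    commit: Callable[[ProposedFix], None],
    revert: Callable[[ProposedFix], None],
) -> Outcome:
    # Audit-only rule: refuse before touching the working tree.
    if fix.adds_dependency or fix.touches_schema:
        return Outcome.ESCALATED

    apply(fix)
    # Commit only when the full test suite passes; otherwise undo the change.
    if run_full_suite():
        commit(fix)
        return Outcome.COMMITTED
    revert(fix)
    return Outcome.REVERTED
```

Checking the rule before `apply` means an escalated fix never modifies the codebase, so there is nothing to roll back when a human takes over.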
Opening excerpt (first ~120 words)
I ran a maintenance agent for 10 days. The number that mattered was 14
April 26, 2026
I ran a maintenance agent loop against a real codebase for 10 days. It filed 559 bugs, fixed 412 on its own, asked me for help on 14, and marked 31 as won’t-fix. The remaining 100 are in the open queue.
The interesting number isn’t the 412. It’s the 14.
The setup
The codebase is a working SaaS: Go backend, TypeScript/Next.js frontend, roughly 200k lines between them, plus a non-trivial migration history. Not a toy project.
The loop is boring. Fresh claude -p per bug, no conversation chains, no long-lived agent memory.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Adriacidre.
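The loop mentioned in the excerpt (a fresh `claude -p` per bug, with no conversation chains and no long-lived memory) might look roughly like the sketch below. Only the `claude -p` invocation comes from the excerpt. The prompt wording, the function names, and the injectable `run` parameter are assumptions made here for illustration, and the sketch leaves out bug filing, test gating, and escalation.

```python
import subprocess
from typing import Callable, Iterable, Sequence

Runner = Callable[[Sequence[str]], str]


def run_cli(argv: Sequence[str]) -> str:
    """Run one command in a new process and return its stdout."""
    result = subprocess.run(list(argv), capture_output=True, text=True, check=False)
    return result.stdout


def build_command(bug_description: str) -> list[str]:
    # The prompt wording is illustrative, not taken from the article.
    prompt = f"Fix this bug and nothing else:\n{bug_description}"
    return ["claude", "-p", prompt]


def process_queue(bugs: Iterable[str], run: Runner = run_cli) -> list[str]:
    """Handle each bug in isolation: one process per bug, nothing carried over."""
    outputs = []
    for bug in bugs:
        # Each call builds its prompt from this bug alone, so no state leaks
        # from one bug to the next.
        outputs.append(run(build_command(bug)))
    return outputs
```

Passing `run` as a parameter lets the loop be exercised with a stub instead of the real CLI, for example `process_queue(["stale cache key"], run=lambda argv: "ok")`.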