Why AI Coding Burns Money
$100–$1,100 token burns in a single session are documented, not hypothetical. Here's exactly where the money goes — and how to stop it.
The Five Cost Leak Sources
Retry Inflation
35% of total waste
Failed attempts compound token consumption. Each retry adds more context, making subsequent retries more expensive.
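The compounding can be made concrete with a rough estimate. The sketch below assumes each retry resends the original context plus a fixed amount of new material from every earlier failure. The function name, token counts, and price are illustrative placeholders, not measured values or any vendor's actual rates.

```python
def retry_cost(base_tokens: int, added_per_retry: int, retries: int,
               usd_per_million_tokens: float) -> float:
    """Estimate input-token cost of one attempt plus `retries` retries.

    Assumption: attempt k (0-based) resends the original context plus
    everything the previous k failures appended, so it sends
    base_tokens + k * added_per_retry tokens.
    """
    total_tokens = sum(base_tokens + k * added_per_retry
                       for k in range(retries + 1))
    return total_tokens * usd_per_million_tokens / 1_000_000


# Hypothetical numbers: 20k-token context, 10k tokens added per failure,
# $3 per million input tokens.
for n in (0, 3, 67):
    print(n, "retries ->", round(retry_cost(20_000, 10_000, n, 3.0), 2), "USD")
```

Under these assumptions the total grows roughly with the square of the retry count, which is why a cap on retries limits cost far more than its small number suggests.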
Context Waste
25% of total waste
Verbose error messages, stale conversation history, and failed file reads consume tokens without producing value.
Scope Creep
20% of total waste
Agent modifies files outside the requested scope, then spends tokens fixing the unintended changes.
Unattended Execution
15% of total waste
Agent runs overnight or during meetings with no human oversight, burning tokens on circular logic.
Orchestration Loops
5% of total waste
In multi-agent workflows, agents can keep agreeing with each other without doing any work, consuming compute at scale.
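One possible way to spot such a loop is to look for a run of turns in which no agent invokes a tool. The sketch below is a minimal illustration under that assumption. The `Turn` record, the `is_agreement_loop` name, and the window of 10 turns are hypothetical choices, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    agent: str
    tool_calls: int  # number of tool invocations made during this turn


def is_agreement_loop(turns: list[Turn], window: int = 10) -> bool:
    """Return True if the last `window` turns contain no tool invocations."""
    if len(turns) < window:
        return False  # not enough history to judge yet
    return all(t.tool_calls == 0 for t in turns[-window:])


# Two agents trading approvals without touching any tool.
history = [Turn("planner" if i % 2 == 0 else "reviewer", 0) for i in range(10)]
if is_agreement_loop(history):
    print("halt: no tool invocations in the last 10 turns")
```

A single turn with a tool call resets the check, so productive conversations are not interrupted. The window size is a tuning choice.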
Real Documented Incidents
Agent in retry loop for 6h 36m. Zero usable output.
340 turns of agents agreeing. Zero code produced.
67 retry attempts on a simple task.
How Governance Contains Costs
- Per-task budget caps — execution halts at $25 by default
- Per-session budget caps — hard ceiling at $50 per session
- Retry limits — maximum 3 retries before human escalation
- Unattended timeout — automatic halt after 30 minutes without interaction
- Agreement loop detection — halts multi-agent workflows with no tool invocations
- Scope enforcement — blocks file modifications outside the approved scope
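As a rough illustration, the budget, retry, timeout, and scope controls above could be combined in a single guard object. This is a sketch under stated assumptions rather than any product's implementation. The `Governor` class, `GovernanceHalt` exception, method names, and the `src` directory are all hypothetical. Only the default limits ($25, $50, 3 retries, 30 minutes) come from the list above.

```python
import posixpath
import time
from dataclasses import dataclass, field
from pathlib import PurePosixPath


class GovernanceHalt(Exception):
    """Raised when a limit is hit and a human should take over."""


@dataclass
class Governor:
    task_cap_usd: float = 25.0
    session_cap_usd: float = 50.0
    max_retries: int = 3
    idle_timeout_s: float = 30 * 60
    allowed_dirs: tuple[str, ...] = ("src",)  # hypothetical approved scope
    task_spent: float = 0.0
    session_spent: float = 0.0
    retries: int = 0
    last_interaction: float = field(default_factory=time.monotonic)

    def start_task(self) -> None:
        # Per-task counters reset; session spend keeps accumulating.
        self.task_spent = 0.0
        self.retries = 0

    def record_spend(self, usd: float) -> None:
        self.task_spent += usd
        self.session_spent += usd
        if self.task_spent >= self.task_cap_usd:
            raise GovernanceHalt("per-task budget cap reached")
        if self.session_spent >= self.session_cap_usd:
            raise GovernanceHalt("per-session budget cap reached")

    def record_retry(self) -> None:
        self.retries += 1
        if self.retries > self.max_retries:
            raise GovernanceHalt("retry limit exceeded; escalate to a human")

    def touch(self, now=None) -> None:
        # Call whenever a human interacts with the session.
        self.last_interaction = time.monotonic() if now is None else now

    def check_idle(self, now=None) -> None:
        now = time.monotonic() if now is None else now
        if now - self.last_interaction >= self.idle_timeout_s:
            raise GovernanceHalt("unattended timeout reached")

    def check_path(self, path: str) -> None:
        # Lexical check only: collapses ".." but does not resolve symlinks.
        p = PurePosixPath(posixpath.normpath(path))
        if not any(p.is_relative_to(d) for d in self.allowed_dirs):
            raise GovernanceHalt(f"{path} is outside the approved scope")


gov = Governor()
gov.start_task()
gov.check_path("src/app/main.py")  # inside scope, no exception
try:
    gov.check_path("src/../deploy/prod.yaml")
except GovernanceHalt as halt:
    print("halted:", halt)
```

In this sketch the agent loop would call `record_spend` after each model call, `record_retry` before each retry, and `check_path` before each file write. Raising an exception makes the halt hard to ignore, which is the point of a hard ceiling.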