Real Agentic Failures.
Real Costs. Real Containment.
Documented runtime incidents from Claude Code, Cursor, Windsurf, and multi-agent systems. Each incident maps to the governance system that would have prevented it.
The $1,100 Overnight Token Burn
11:47 PM — 6:23 AM (6h 36m unattended)
$1,147 in API tokens consumed. Zero usable output.
Agent entered recursive retry loop on a failing test. No financial circuit breaker. No unattended execution limits. Agent burned through context window 14 times, each time restarting from scratch.
AI Cost Containment System would have halted execution at $25 budget cap (97.8% savings). Unattended timeout would have triggered at 30 minutes.
Deploy AI Cost ContainmentThe 47-File Cursor Rewrite
2:15 PM — 2:52 PM (37 minutes)
47 files modified. 12 new phantom dependencies introduced. 3 config files overwritten.
Agent was asked to refactor a single utility function. Without scope enforcement, it followed import chains across the entire codebase, "fixing" each file it touched. Ghost dependencies imported from packages not in package.json.
Repository Drift Prevention would have blocked out-of-scope mutations at file 2. Import validator would have caught phantom dependencies immediately.
Deploy Repository Drift PreventionThe .env Credential Leak via MCP
10:30 AM — 10:31 AM (instant)
AWS access keys, database credentials, and Stripe API keys exposed to third-party MCP server.
Agent connected to an MCP tool server that requested file system access. Server read .env file containing production credentials. No context isolation. No capability manifest validation.
MCP Governance System would have blocked .env access via file-guard, validated server against manifest, and enforced context isolation.
Deploy MCP GovernanceThe $890 Agreement Loop
9:00 AM — 3:15 PM (6h 15m)
$890 in compute. 340 turns of agents agreeing with each other. Zero tool invocations. Zero code produced.
Three agents entered an agreement loop — each validating the previous agent's output without performing any actual work. No turn limit. No tool-invocation requirement. No agreement loop detection.
Orchestration Entropy System would have detected the agreement loop at turn 10 and halted the workflow (99% cost prevention).
Deploy Orchestration EntropyThe Rubber-Stamp PR Avalanche
Sprint duration (2 weeks)
34 AI-generated PRs merged with <2 min review. 8 contained bugs. 3 reached production. 1 caused a customer-facing outage.
AI code generation volume exceeded team review capacity. Engineers began rubber-stamping PRs to clear the queue. No confidence scoring. No review timer. No burnout detection.
Verification Burden Collapse Prevention would have flagged rubber-stamp reviews, throttled AI generation when queue exceeded 8 PRs, and routed low-confidence code to deep review.
Deploy Verification Burden CollapseContext Rot: Agent Forgot Its Own Architecture
10:00 AM — 1:45 PM (3h 45m)
23 files corrupted with contradictory implementations. Agent began patching its own patches. 6 hours remediation.
After 90 minutes, the agent's context window filled. Original architecture instructions were pushed out. Agent continued generating code that contradicted the initial design, then tried to "fix" the contradictions by patching files it had just modified.
Context Rot Prevention would have triggered checkpoint rotation at 65% utilization and mandatory semantic reset at 85%. Patch chain detector would have halted at depth 3.
Deploy Context Rot PreventionEvery incident above was preventable.
Deploy runtime governance infrastructure to contain these failures before they occur.