Failure Intelligence

Runtime Failure Index

The canonical taxonomy of agentic runtime failures: 15 documented failure modes, ranked by frequency, cost, blast radius, and trend direction. Each failure mode maps to a deployable governance containment module.

Most Common

Context Rot

Affects virtually all sessions longer than 60 minutes

Most Expensive

$1,100 single session

Overnight retry inflation burn

Fastest Growing

MCP Credential Exposure

MCP adoption scaling without governance

Highest Blast Radius

Repository Drift

94 files modified in one incident

Most Underestimated

Governance Theater

System prompts ≠ deterministic governance

Largest Organizational Cost

$135K/quarter

AI tools net-negative at enterprise scale

#	Failure Mode	Affects	Category	Frequency	Avg Cost	Trend	Risk Score	Module
1	Context Rot	All agents	Cognition	Very High	$80-$340/incident	↑ Growing	98	Context Rot Prevention
2	Retry Inflation	All agents	Economics	Very High	$25-$1,100/incident	↑ Growing	96	Retry Inflation Control
3	Repository Drift	Cursor, Windsurf	Environment	High	$200-$2,000/incident	↑ Growing	94	Repository Drift Prevention
4	Identity Drift	All agents	Identity	Very High	$50-$200/incident	→ Stable	90	Deterministic Agentic Engineering
5	MCP Credential Exposure	Claude Code, Cline	Security	Medium	$5K-$500K/breach	↑ Growing rapidly	92	MCP Governance
6	Tool Permission Leak	Windsurf, Roo Code	Security	Medium	$500-$10K/incident	↑ Growing	88	Tool Permission Governance
7	Verification Bypass	All agents	Quality	High	$50K-$200K/quarter	↑ Growing	86	Verification Burden Collapse
8	Orchestration Collapse	Multi-agent	Architecture	Medium	$100-$890/incident	↑ Growing	82	Orchestration Entropy
9	Hallucination Debt	Codex, Claude Code	Quality	High	$100-$500/incident	→ Stable	80	Hallucination Debt Reduction
10	Context Window Overflow	All agents	Cognition	Very High	$30-$150/incident	→ Stable	78	Context Window Compression
11	Token Cost Overrun	All agents	Economics	High	$100-$1,100/incident	↑ Growing	85	AI Cost Containment
12	Scope Creep Mutation	Cursor, Claude Code	Environment	High	$200-$1,000/incident	→ Stable	76	Agentic Change Management
13	Autonomous Execution Risk	All agents	Security	Medium	$500-$5K/incident	↑ Growing rapidly	84	Autonomous Execution Safety
14	Governance Theater	All agents	Architecture	Very High	Unquantified	→ Stable	74	Runtime Governance
15	Engineering Economics Collapse	Enterprise-scale	Economics	High	$135K/quarter	↑ Growing	88	AI Engineering Economics

Failure Categories

Cognition

2 failure modes

Economics

3 failure modes

Environment

2 failure modes

Identity

1 failure mode

Security

3 failure modes

Quality

2 failure modes

Architecture

2 failure modes
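
The per-category totals can be tallied directly from the index table. The sketch below transcribes each row's failure mode and category and counts them. The type and function names (`FailureMode`, `countByCategory`) are illustrative assumptions, not part of any published module.

```typescript
// Hypothetical record shape; only the name and category columns are transcribed.
interface FailureMode {
  name: string;
  category: string;
}

// Rows copied from the index table above, in rank order.
const failureIndex: FailureMode[] = [
  { name: "Context Rot", category: "Cognition" },
  { name: "Retry Inflation", category: "Economics" },
  { name: "Repository Drift", category: "Environment" },
  { name: "Identity Drift", category: "Identity" },
  { name: "MCP Credential Exposure", category: "Security" },
  { name: "Tool Permission Leak", category: "Security" },
  { name: "Verification Bypass", category: "Quality" },
  { name: "Orchestration Collapse", category: "Architecture" },
  { name: "Hallucination Debt", category: "Quality" },
  { name: "Context Window Overflow", category: "Cognition" },
  { name: "Token Cost Overrun", category: "Economics" },
  { name: "Scope Creep Mutation", category: "Environment" },
  { name: "Autonomous Execution Risk", category: "Security" },
  { name: "Governance Theater", category: "Architecture" },
  { name: "Engineering Economics Collapse", category: "Economics" },
];

// Count how many failure modes fall into each category.
function countByCategory(rows: FailureMode[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const row of rows) {
    counts[row.category] = (counts[row.category] ?? 0) + 1;
  }
  return counts;
}

const categoryCounts = countByCategory(failureIndex);
// Cognition 2, Economics 3, Environment 2, Identity 1,
// Security 3, Quality 2, Architecture 2 (15 in total)
```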

Key Findings

No agent ships runtime governance

Claude Code, Cursor, Windsurf, Cline, Roo Code, and Codex all lack deterministic governance enforcement.

MCP risks are scaling fastest

Without governance, credential exposure and supply chain risks grow in step with MCP adoption.

AI agents are frequently net-negative

In documented enterprise deployments without governance, remediation costs exceed productivity gains.

Governance reduces costs by 60-93%

Documented containment across all 15 failure modes shows consistent 60-93% cost reduction when governance is deployed.

System prompts are not governance

Text-based instructions in CLAUDE.md/.cursorrules are routinely bypassed under context pressure. Only middleware enforcement is deterministic.
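
To make the contrast concrete, here is a minimal sketch of what middleware enforcement could look like: a pure policy check that sits between the agent and its tools, so the decision depends only on the call and the policy, never on what the model remembers from its prompt. Every name here (`ToolCall`, `Policy`, `evaluate`, `withGovernance`) is a hypothetical illustration and is not taken from any agent SDK or from the modules this index refers to.

```typescript
// Hypothetical shapes for illustration only.
interface ToolCall {
  tool: string;
  args: Record<string, string>;
}

interface Policy {
  allowedTools: string[];        // tools the agent may invoke
  deniedPathPrefixes: string[];  // paths the agent may never touch
}

interface Decision {
  allowed: boolean;
  reason?: string;
}

// Pure function: the same call and policy always produce the same decision.
function evaluate(call: ToolCall, policy: Policy): Decision {
  if (policy.allowedTools.indexOf(call.tool) === -1) {
    return { allowed: false, reason: `tool "${call.tool}" is not on the allowlist` };
  }
  const path = call.args["path"];
  if (typeof path === "string") {
    for (const prefix of policy.deniedPathPrefixes) {
      // Prefix comparison without relying on newer string helpers.
      if (path.slice(0, prefix.length) === prefix) {
        return { allowed: false, reason: `path "${path}" is under denied prefix "${prefix}"` };
      }
    }
  }
  return { allowed: true };
}

// Wraps a tool executor so that every call passes through the policy check first.
function withGovernance<T>(
  policy: Policy,
  execute: (call: ToolCall) => T,
): (call: ToolCall) => T {
  return (call: ToolCall): T => {
    const decision = evaluate(call, policy);
    if (!decision.allowed) {
      throw new Error(`Blocked by policy: ${decision.reason}`);
    }
    return execute(call);
  };
}
```

Because the check runs in code outside the model, a long or degraded context cannot cause it to be skipped. An instruction in a rules file offers no such guarantee.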

The 4-layer model contains all failures

Every documented failure maps to Identity, Skill, Tool, or Environment governance, confirming that the runtime architecture is complete.

Incident Reports

15 Documented Incidents →

Full timelines, telemetry, and containment analysis

Telemetry

Governance Metrics →

Before/after governance impact data

Executive

Board-Ready Briefing →

Maturity model, risk matrix, ROI analysis

Deploy Containment for Any Failure Mode

Every failure in this index maps to a deployable runtime infrastructure module with TypeScript middleware, YAML policy manifests, and operational tooling.
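
As an illustration of the kind of middleware such a module might contain, the sketch below caps both attempts and cumulative spend for a single task, the sort of control one would expect behind retry inflation containment. It is an assumption-laden sketch: the interface names, the synchronous `attempt` callback, and the budget fields are all hypothetical, and the actual modules' interfaces are not described here.

```typescript
// Hypothetical budget: hard caps per task.
interface RetryBudget {
  maxAttempts: number; // maximum number of attempts
  maxCostUsd: number;  // spend ceiling checked before each attempt
}

// Hypothetical result reported by one attempt.
interface AttemptResult<T> {
  ok: boolean;
  value?: T;
  costUsd: number;
}

interface RunOutcome<T> {
  status: "succeeded" | "budget_exhausted";
  value?: T;
  attempts: number;
  spentUsd: number;
}

// Runs `attempt` until it succeeds or either cap is reached.
// The spend check happens before each attempt, so total spend can
// exceed maxCostUsd by at most the cost of the final attempt.
function runWithBudget<T>(
  budget: RetryBudget,
  attempt: (attemptNumber: number) => AttemptResult<T>,
): RunOutcome<T> {
  let attempts = 0;
  let spentUsd = 0;
  while (attempts < budget.maxAttempts && spentUsd < budget.maxCostUsd) {
    attempts += 1;
    const result = attempt(attempts);
    spentUsd += result.costUsd;
    if (result.ok) {
      return { status: "succeeded", value: result.value, attempts, spentUsd };
    }
  }
  return { status: "budget_exhausted", attempts, spentUsd };
}
```

A real agent loop would be asynchronous and would read its limits from a policy manifest. The point of the sketch is only that the stop condition lives in code instead of in a prompt.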


Richard Ewing — AI Economist & Capital Auditor