The 80/20 Rule of ML Costs
Training gets all the attention, but it typically accounts for only about 20% of total cost. The other 80% is spread across model serving (30%), data pipelines (20%), monitoring and evaluation (15%), and retraining (15%).
Optimization levers: serve models on CPU where latency budgets allow (often ~10x cheaper than GPU serving), batch predictions for non-real-time use cases, apply model compression such as quantization or distillation (typically a 2-5x inference cost reduction), and trigger retraining on detected drift rather than on a fixed schedule.
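A drift-based retraining trigger can be sketched with a Population Stability Index (PSI) check on a feature's distribution. This is a minimal illustration, not the author's method; the function names, the 10-bin histogram, and the common 0.2 alert threshold are assumptions for the sketch.

```python
import math

def psi(reference, current, bins=10):
    """Population Stability Index between a reference sample (e.g. the
    training distribution of one feature) and a current production sample.
    Higher PSI means the current distribution has drifted further."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # guard against a constant feature

    def bucket_fractions(xs):
        counts = [0] * bins
        for x in xs:
            i = min(max(int((x - lo) / width), 0), bins - 1)
            counts[i] += 1
        total = len(xs)
        # floor each bucket at a tiny value so log() never sees zero
        return [max(c / total, 1e-4) for c in counts]

    ref = bucket_fractions(reference)
    cur = bucket_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))

def should_retrain(reference, current, threshold=0.2):
    """Fire the retraining trigger only when drift exceeds the threshold
    (0.2 is a commonly cited PSI alert level, assumed here)."""
    return psi(reference, current) > threshold
```

In practice this check would run per feature on a monitoring cadence, and retraining fires only when drift is detected, instead of on a calendar schedule.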