Module 2.1: AI COGS Analysis
Master the economics of AI features: inference costs, token economics, and pricing architecture. Build AI features that protect — not destroy — your gross margins.
🎯 What You'll Learn
- ✓ How AI COGS differs from traditional SaaS COGS
- ✓ How to analyze token economics (input/output, model selection, caching)
- ✓ How to design pricing models that account for AI variable costs
- ✓ How to project AI margin at 10x and 100x scale
Lesson 1: The AI COGS Equation
Unlike traditional software where COGS is near-zero after development, AI features have per-request costs that scale linearly with usage. This fundamentally changes the margin equation.
Traditional SaaS COGS: hosting + bandwidth + support = 15-25% of revenue. The marginal cost of serving one more user is nearly zero.
AI SaaS COGS: traditional COGS + inference costs + embedding storage + model fine-tuning + guardrail processing. Each user interaction has a real cost.
If AI feature cost per user > revenue per user, you lose money on every interaction. More users = more losses. This is AI margin collapse.
Calculate your AI feature's COGS: (tokens consumed × cost per token) + (embedding storage) + (guardrail processing). Express as cost per user per month.
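The equation above can be sketched as a short calculation. All rates and volumes below are hypothetical placeholders for illustration, not real vendor pricing.

```python
# Sketch of the AI COGS equation: (tokens consumed x cost per token)
# + embedding storage + guardrail processing, per user per month.
# All numbers are hypothetical assumptions.

def ai_cogs_per_user(tokens_per_month: int,
                     cost_per_1k_tokens: float,
                     embedding_storage: float,
                     guardrail_cost: float) -> float:
    inference = (tokens_per_month / 1000) * cost_per_1k_tokens
    return inference + embedding_storage + guardrail_cost

# Example: 500K tokens/user/month at an assumed $0.002 per 1K tokens,
# plus $0.10 vector storage and $0.05 guardrail overhead per user.
cogs = ai_cogs_per_user(500_000, 0.002, 0.10, 0.05)
print(f"AI COGS: ${cogs:.2f} per user per month")  # AI COGS: $1.15 per user per month
```

Compare the result against revenue per user: if COGS exceeds it, every additional user deepens the loss.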
Lesson 2: Token Economics Deep Dive
Tokens are the fundamental unit of AI cost. Understanding token economics — input vs. output, model selection, caching — determines whether your AI feature is profitable.
Output tokens cost 2-4x more than input tokens. A chatbot that generates long responses costs dramatically more than one that gives concise answers.
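The asymmetry is easy to quantify. A minimal sketch, assuming a hypothetical model priced at $0.50 per million input tokens and $2.00 per million output tokens (a 4x spread):

```python
# Cost of a single request under asymmetric token pricing.
# Prices are hypothetical, not any vendor's actual rate card.

INPUT_PRICE_PER_1M = 0.50    # $ per 1M input tokens (assumed)
OUTPUT_PRICE_PER_1M = 2.00   # $ per 1M output tokens (assumed, 4x input)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * INPUT_PRICE_PER_1M
            + output_tokens * OUTPUT_PRICE_PER_1M) / 1_000_000

# Same 1K-token prompt, two response styles:
verbose = request_cost(1_000, 2_000)   # long-winded answer
concise = request_cost(1_000, 300)     # tight answer
print(f"verbose: ${verbose:.4f}  concise: ${concise:.4f}")
```

Constraining response length cuts the per-request cost roughly 4x in this example, which is why output-length limits are a pricing lever and not just a UX choice.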
Choosing GPT-4o-mini over GPT-4o can reduce costs 15-20x. Most features don't need frontier models. Matching model capability to task complexity is the #1 cost lever.
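One common way to pull that lever is request routing: send only the requests that need frontier capability to the expensive model. A toy sketch, with hypothetical prices and a stand-in `needs_reasoning` flag (a real router needs an evaluated classifier):

```python
# Route each request to the cheapest model that can handle it.
# Prices and the routing signal are illustrative assumptions.

PRICE_PER_1M_TOKENS = {"frontier": 10.00, "small": 0.60}  # assumed $/1M tokens

def route(needs_reasoning: bool) -> str:
    # Stand-in for a real capability classifier.
    return "frontier" if needs_reasoning else "small"

def task_cost(tokens: int, needs_reasoning: bool) -> float:
    model = route(needs_reasoning)
    return tokens * PRICE_PER_1M_TOKENS[model] / 1_000_000

print(f"frontier: ${task_cost(5_000, True):.4f}")   # frontier: $0.0500
print(f"small:    ${task_cost(5_000, False):.4f}")  # small:    $0.0030
```

Under these assumed prices, every request the router keeps off the frontier model costs about 6% of what it otherwise would.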
Shorter, more precise prompts = fewer tokens = lower cost. System prompts repeated on every request are the biggest hidden cost multiplier.
Semantic caching (storing responses for similar queries) can reduce inference calls 30-70%. The cache hit rate directly reduces your AI COGS.
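The impact of the hit rate falls straight out of the arithmetic: only cache misses reach the model API. A sketch with hypothetical volume and per-request cost:

```python
# Inference spend as a function of semantic-cache hit rate.
# Request volume and per-request cost are illustrative assumptions.

def monthly_inference_cost(requests: int, cost_per_request: float,
                           cache_hit_rate: float) -> float:
    misses = requests * (1 - cache_hit_rate)  # only misses call the model
    return misses * cost_per_request

base = monthly_inference_cost(1_000_000, 0.004, 0.0)
cached = monthly_inference_cost(1_000_000, 0.004, 0.5)
print(f"no cache: ${base:,.0f}/mo   50% hit rate: ${cached:,.0f}/mo")
```

Every point of hit rate is a point off inference COGS, which is why cache metrics belong on the same dashboard as margin.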
Use the AUEB calculator at /tools/aueb to model your AI feature's token economics at current volume, 10x volume, and 100x volume. At what scale does margin collapse occur?
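Away from the calculator, the same projection can be sketched directly. The numbers below are hypothetical: $20 ARPU, $6 of AI COGS per user, and $10K of fixed monthly infrastructure. Fixed costs amortize with scale, but linear AI COGS puts a hard ceiling on margin at 1 − (AI COGS / ARPU).

```python
# Gross margin at 1x / 10x / 100x volume with linear per-user AI COGS.
# All inputs are hypothetical assumptions for illustration.

def gross_margin(users: int, arpu: float,
                 ai_cogs_per_user: float, fixed_monthly: float) -> float:
    revenue = users * arpu
    cogs = users * ai_cogs_per_user + fixed_monthly
    return (revenue - cogs) / revenue

for scale in (1, 10, 100):
    users = 1_000 * scale
    m = gross_margin(users, arpu=20.0, ai_cogs_per_user=6.0,
                     fixed_monthly=10_000.0)
    print(f"{scale:>3}x ({users:>7,} users): {m:.1%} gross margin")
# Margin approaches but never exceeds 1 - 6/20 = 70%. If AI COGS per user
# ever exceeds ARPU, margin is negative at every scale (margin collapse).
```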
Lesson 3: API Pricing Architecture
How you price AI features determines whether they're profit centers or cost centers. The pricing model must account for the variable cost nature of AI.
Usage-based pricing: charge per API call, per query, or per action. Aligns costs with revenue. Risk: usage spikes can overwhelm infrastructure.
Tiered pricing: free tier (limited queries) → Pro tier (more queries + features) → Enterprise (unlimited + SLAs). The free tier is your PLG acquisition engine.
Hybrid pricing: per-seat pricing plus an AI "credits" budget per seat. When credits run out, the user upgrades. Combines predictability with usage correlation.
Design a pricing model for an AI feature with: 1) known cost per query, 2) variable usage patterns, 3) a free tier for PLG. Calculate break-even at each tier.
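A worked sketch of the exercise, with every number (query cost, prices, quotas) a hypothetical assumption. Worst-case margin here assumes each user exhausts their quota:

```python
# Worst-case margin per tier when cost per query is known.
# All prices, quotas, and the $0.01/query cost are assumptions.

COST_PER_QUERY = 0.01  # assumed fully loaded AI cost per query

TIERS = {
    # name: (monthly price, included queries per month)
    "Free":       (0.00,     50),
    "Pro":        (29.00,  2_000),
    "Enterprise": (199.00, 15_000),
}

for name, (price, quota) in TIERS.items():
    max_cost = quota * COST_PER_QUERY   # every query in the quota consumed
    margin = price - max_cost
    print(f"{name:<10} price ${price:>7.2f}  max AI cost ${max_cost:>7.2f}  "
          f"worst-case margin ${margin:>7.2f}")
```

The free tier's negative worst case ($0.50/user here) is an explicit acquisition cost; the real exercise is sizing quotas so that paid tiers stay positive even for power users.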
📊 Module Assessment
Complete the assessment to demonstrate mastery of Module 2.1: