What is Model Right-Sizing?
Model right-sizing is the practice of selecting the smallest, cheapest AI model that achieves acceptable accuracy for a given use case. It directly addresses the Cost of Predictivity curve — the exponential relationship between AI accuracy and inference cost.
The Right-Sizing Principle:
- Simple queries (classification, routing): Use a small, fast model (GPT-4o-mini, Claude Haiku)
- Medium complexity (summarization, extraction): Use a mid-tier model
- High complexity (reasoning, code generation): Use a frontier model
- Critical decisions: Use a frontier model with verification layer
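The principle above can be sketched as a small routing table. This is a minimal illustration, not a production router: the task labels, the tier names ("small", "mid", "frontier"), the `Route` type, and the choice to send unknown task types to the frontier tier are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical task labels mapped to hypothetical tier names.
# The groupings follow the Right-Sizing Principle list above.
TIER_FOR_TASK = {
    "classification": "small",
    "routing": "small",
    "summarization": "mid",
    "extraction": "mid",
    "reasoning": "frontier",
    "code_generation": "frontier",
}


@dataclass(frozen=True)
class Route:
    tier: str     # which model tier should handle the request
    verify: bool  # whether to run a verification layer on the answer


def route(task_type: str, critical: bool = False) -> Route:
    """Pick a model tier for a request based on its task type."""
    if critical:
        # Critical decisions: frontier model plus a verification layer.
        return Route(tier="frontier", verify=True)
    # Assumption: unrecognized task types fall back to the frontier tier,
    # trading cost for safety until they are classified properly.
    tier = TIER_FOR_TASK.get(task_type, "frontier")
    return Route(tier=tier, verify=False)


if __name__ == "__main__":
    print(route("classification"))
    print(route("summarization", critical=True))
```

In practice the task type would itself come from a cheap classifier or from the calling feature, so the routing step adds little cost of its own.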
A well-right-sized AI system can serve 80% of requests at 10-20% of the cost of using a single frontier model for everything.
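To see what this means for the total bill, here is a rough worked calculation. It assumes one reading of the claim: 80% of requests go to a model that costs 10-20% as much per request as the frontier model, and the remaining 20% stay on the frontier model. The function name and parameters are illustrative.

```python
def blended_cost_ratio(cheap_share: float, cheap_cost_ratio: float) -> float:
    """Cost of a right-sized system as a fraction of the all-frontier cost.

    cheap_share:      fraction of requests served by the cheaper model
    cheap_cost_ratio: cheaper model's per-request cost relative to frontier
    """
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    frontier_share = 1.0 - cheap_share
    # Frontier requests cost 1.0 by definition (they are the baseline).
    return cheap_share * cheap_cost_ratio + frontier_share * 1.0


if __name__ == "__main__":
    for ratio in (0.10, 0.20):
        blended = blended_cost_ratio(0.80, ratio)
        print(f"cheap model at {ratio:.0%} of frontier cost: "
              f"total is {blended:.0%} of the all-frontier bill")
```

Under these assumptions the total bill comes to roughly 28% to 36% of the all-frontier baseline. The frontier share dominates what is left, so shrinking that share matters more than finding an even cheaper small model.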
Why It Matters
Most AI products use a single large model for all requests, the equivalent of driving a Ferrari to the mailbox. Paying frontier prices for requests a small model could handle erodes gross margins unnecessarily.
Richard Ewing's AUEB calculator (richardewing.io/tools/aueb) helps teams identify the optimal model for each use case by modeling the accuracy-cost tradeoff.
How to Measure
Map each use case to accuracy requirements. Test model options at each tier. Calculate cost per request at each accuracy level. Select the model that meets accuracy requirements at minimum cost.
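The selection step can be sketched as follows. The `Candidate` type, the function name, and every accuracy and cost figure below are made up for illustration; real values would come from testing each model on your own evaluation set.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float          # measured on your own test set, 0.0 to 1.0
    cost_per_request: float  # in dollars, from your own usage data


def cheapest_meeting(
    candidates: Iterable[Candidate], required_accuracy: float
) -> Optional[Candidate]:
    """Return the lowest-cost candidate that meets the accuracy requirement."""
    eligible = [c for c in candidates if c.accuracy >= required_accuracy]
    # default=None covers the case where no model is accurate enough.
    return min(eligible, key=lambda c: c.cost_per_request, default=None)


# Illustrative numbers only, not benchmarks of any real model.
candidates = [
    Candidate("small", accuracy=0.95, cost_per_request=0.0005),
    Candidate("mid", accuracy=0.97, cost_per_request=0.004),
    Candidate("frontier", accuracy=0.99, cost_per_request=0.01),
]

if __name__ == "__main__":
    for use_case, required in [("routing", 0.94), ("extraction", 0.96)]:
        choice = cheapest_meeting(candidates, required)
        print(use_case, "->", choice.name if choice else "no model qualifies")
```

A `None` result is a useful signal in its own right: it means either the accuracy requirement needs revisiting or the use case needs something beyond a model swap, such as a verification layer.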
Frequently Asked Questions
Won't using smaller models hurt quality?
Only if accuracy requirements are mismatched. For classification and routing tasks, small models achieve 95%+ accuracy at 5% of the cost. For complex reasoning, frontier models are necessary — but these should be the exception, not the default.
Need Expert Help?
Richard Ewing is a Product Economist and AI Capital Auditor. He helps companies translate technical complexity into financial clarity.