Small Language Models (SLM)
Coined by Richard Ewing, Product Economist
Definition
Small Language Models (SLMs) are highly distilled AI models, typically containing under 8 billion parameters, optimized for specific, deterministic tasks rather than emergent general reasoning. While frontier models (e.g., GPT-4) incur per-token API costs that compound at scale and carry network latency, SLMs can run locally on edge devices (laptops, phones) or on highly optimized serverless endpoints. They drastically reduce inference costs and eliminate the need to send data off-site.
Why It Matters
In the pursuit of positive Return on AI Investment (ROAI), using a 1-trillion parameter model to route support tickets is economically devastating. SLMs right-size the intelligence to the task, achieving margin preservation.
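The scale of that cost gap can be sketched with illustrative numbers. Everything below is an assumption for demonstration: the per-token prices, ticket volume, and token counts are hypothetical, not quotes from any provider.

```python
# Illustrative only: hypothetical per-ticket cost comparison between a
# frontier API and a locally hosted SLM. All prices are assumed figures.
FRONTIER_PRICE_PER_1K_TOKENS = 0.03   # assumed frontier API price (USD)
SLM_PRICE_PER_1K_TOKENS = 0.0002      # assumed amortized local SLM cost (USD)

TOKENS_PER_TICKET = 500
TICKETS_PER_MONTH = 1_000_000

def monthly_cost(price_per_1k_tokens: float) -> float:
    """Total monthly spend at the given per-1K-token price."""
    return price_per_1k_tokens * TOKENS_PER_TICKET / 1000 * TICKETS_PER_MONTH

frontier = monthly_cost(FRONTIER_PRICE_PER_1K_TOKENS)  # $15,000/mo
slm = monthly_cost(SLM_PRICE_PER_1K_TOKENS)            # $100/mo
print(f"frontier: ${frontier:,.0f}/mo, SLM: ${slm:,.0f}/mo, "
      f"delta: ${frontier - slm:,.0f}/mo")
```

Under these assumed prices, routing the same ticket volume through an SLM cuts the monthly bill by roughly two orders of magnitude, which is the margin-preservation effect described above.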
How to Calculate
1. Identify repetitive classification tasks in the AI orchestration chain.
2. Calculate the cost delta between frontier API calls and local SLM inference.
3. Implement a routing architecture to leverage SLMs as the frontline tier.
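Step 3 can be sketched as a confidence-gated router: the SLM handles every request first, and only low-confidence cases escalate to the frontier model. The functions `classify_with_slm` and `call_frontier_model` are hypothetical stand-ins, and the threshold is an assumed value to be tuned per task.

```python
# A minimal sketch of the frontline-tier routing pattern. The two model
# calls are hypothetical stand-ins, not real library APIs.
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.85  # assumed cutoff; tune per task


@dataclass
class RoutingResult:
    label: str
    tier: str  # "slm" (cheap frontline) or "frontier" (expensive fallback)


def classify_with_slm(text: str) -> tuple[str, float]:
    """Hypothetical local SLM classifier returning (label, confidence)."""
    # Stand-in logic: a real system would run a local model here.
    if "refund" in text.lower():
        return "billing", 0.95
    return "general", 0.40


def call_frontier_model(text: str) -> str:
    """Hypothetical frontier API call, used only on escalation."""
    return "technical"


def route_ticket(text: str) -> RoutingResult:
    label, confidence = classify_with_slm(text)
    if confidence >= CONFIDENCE_THRESHOLD:
        return RoutingResult(label, "slm")  # frontline tier handles it
    # Low confidence: escalate to the frontier model.
    return RoutingResult(call_frontier_model(text), "frontier")


print(route_ticket("I want a refund"))   # handled by the SLM tier
print(route_ticket("My app crashes"))    # escalated to the frontier tier
```

The design point is that the expensive model is invoked only for the residual of cases the SLM cannot resolve, so the blended per-request cost trends toward the SLM's rate.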
Related Articles
- "ROAI is the New ROI: Why CFOs Are Killing Your AI Pilots in 2026" — The Canon, Apr 2026
Calculate Yours
Use the interactive tool to calculate your own Small Language Model (SLM) economics.
Use the AI Unit Economics Benchmark (AUEB) →
Citation
To cite this definition:
Ewing, R. (2026). "Small Language Models (SLM)." richardewing.io.
https://www.richardewing.io/articles/frameworks/slm