What is LLM Fine-Tuning?
LLM Fine-Tuning is the process of training a pre-trained large language model on a domain-specific dataset to improve its performance on specialized tasks. Unlike prompting (which provides instructions at inference time), fine-tuning permanently modifies the model's weights.
When to fine-tune vs. prompt:
- Fine-tune when: You need consistent formatting, domain-specific terminology, or the task requires knowledge not in the base model
- Prompt when: The task is achievable with instructions and examples, or you need flexibility to change behavior quickly
- Use RAG when: The required knowledge changes frequently or is too large for fine-tuning
Cost considerations: Fine-tuning requires training compute (one-time), but the fine-tuned model may require fewer tokens per request (ongoing savings).
Why It Matters
Fine-tuning decisions directly impact AI unit economics. A fine-tuned model can achieve higher accuracy with fewer tokens (reducing the Cost of Predictivity), but the upfront training cost must be amortized across usage.
The AUEB calculator at richardewing.io/tools/aueb helps teams model the break-even point: how many requests does it take for fine-tuning savings to exceed the training cost?
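As a rough sketch of the break-even arithmetic (all figures below are hypothetical illustrations, not outputs of the AUEB tool):

```python
# Hypothetical figures for illustration only.
training_cost = 500.00          # one-time fine-tuning spend, in dollars
cost_per_request_base = 0.012   # prompted base model (long few-shot prompt)
cost_per_request_ft = 0.004     # fine-tuned model (short prompt, fewer tokens)

# Break-even: how many requests until per-request savings repay the training cost.
savings_per_request = cost_per_request_base - cost_per_request_ft
break_even_requests = training_cost / savings_per_request
print(f"Break-even after {break_even_requests:,.0f} requests")  # 62,500 here
```

If expected request volume is well above the break-even point, fine-tuning pays for itself; well below it, prompting the base model is cheaper overall.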
How to Measure
Compare the fine-tuned model against the prompted base model on three numbers: task accuracy, cost per request, and the break-even point implied by your expected request volume.
Frequently Asked Questions
Should we fine-tune or use RAG?
Use RAG when knowledge changes frequently. Fine-tune when you need consistent behavior and the knowledge is stable. Many production systems use both: fine-tuning for style/format and RAG for up-to-date knowledge.
Need Expert Help?
Richard Ewing is a Product Economist and AI Capital Auditor. He helps companies translate technical complexity into financial clarity.
Book Advisory Call →