Fine-tuning · Morvion Glossary

Fine-tuning continues the training of a general-purpose model on a curated, task-specific dataset, so the model's weights shift toward the language, structure, and judgments of one domain. It is the right move when prompting alone cannot get the model to where the workflow needs it. It is the wrong first move on almost every project.

When fine-tuning actually helps.

Strict output format. The base model emits JSON-ish output but breaks the schema five percent of the time. Fine-tuning on a few hundred examples can bring schema adherence above ninety-nine percent.
Brand voice. The model's output reads as generic LLM prose, in a tone that does not match the operator's. A few hundred labeled voice examples are usually enough to shift it.
Domain vocabulary. Heavy jargon, regulated phrasing, or proprietary nomenclature that is rare or absent in the base model's training data.
Latency or cost. A small fine-tuned model can match a much larger base model on the narrow task, at a fraction of the per-call cost.
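
The schema-adherence figure in the first item can be measured directly, before and after a tune. What follows is a minimal sketch, assuming a hypothetical schema with two required keys, `name` (string) and `priority` (integer). The helper names and sample outputs are illustrative and do not come from any particular library.

```python
import json

# Hypothetical schema for illustration: required key -> expected Python type.
REQUIRED_FIELDS = {"name": str, "priority": int}


def adheres_to_schema(raw_output: str) -> bool:
    """Return True if raw_output is a JSON object with exactly the required keys and types."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return False
    if not isinstance(parsed, dict) or set(parsed) != set(REQUIRED_FIELDS):
        return False
    # bool is a subclass of int in Python, so reject it explicitly.
    return all(
        isinstance(parsed[key], expected) and not isinstance(parsed[key], bool)
        for key, expected in REQUIRED_FIELDS.items()
    )


def schema_adherence_rate(outputs: list[str]) -> float:
    """Fraction of model outputs that satisfy the schema."""
    if not outputs:
        raise ValueError("need at least one output to score")
    return sum(adheres_to_schema(o) for o in outputs) / len(outputs)


# Made-up model outputs, not real measurements.
sample_outputs = [
    '{"name": "renew contract", "priority": 2}',    # valid
    '{"name": "call client", "priority": "high"}',  # wrong type for priority
    '{"name": "send invoice", "priority": 1',       # truncated JSON
    '{"name": "book venue", "priority": 3}',        # valid
]

print(schema_adherence_rate(sample_outputs))  # 0.5
```

Running the same function over the same fixture prompts, once against the base model and once against the tuned model, gives a like-for-like comparison.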

When it does not help.

Fine-tuning does not teach the model new facts; it teaches patterns. If the missing piece is retrieval (the model does not know your data), the answer is RAG, not fine-tuning. If the missing piece is reasoning capability, the answer is a larger model or chain-of-thought prompting, not fine-tuning. Teams that reach for fine-tuning first usually find they have paid for a slower, more brittle model that still has the same gap.
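
The routing described above, together with the prompt-first rule, can be written down as a small lookup. This is only a sketch of the decision order as this entry states it. The `Gap` enum and the `recommend` function are hypothetical names, and real projects often have more than one gap at once.

```python
from enum import Enum


class Gap(Enum):
    """Gap categories named in this entry; the enum is an illustrative encoding."""
    MISSING_DATA = "missing data"
    REASONING = "reasoning capability"
    OUTPUT_FORMAT = "output format"
    BRAND_VOICE = "brand voice"
    DOMAIN_VOCABULARY = "domain vocabulary"
    COST_OR_LATENCY = "cost or latency"


def recommend(gap: Gap, prompting_already_tried: bool) -> str:
    """Map a diagnosed gap to the approach this entry suggests."""
    if not prompting_already_tried:
        return "prompt first"
    if gap is Gap.MISSING_DATA:
        return "RAG"
    if gap is Gap.REASONING:
        return "larger model or chain-of-thought prompting"
    # Remaining gaps: output format, brand voice, domain vocabulary, cost or latency.
    return "fine-tune"


print(recommend(Gap.MISSING_DATA, prompting_already_tried=True))   # RAG
print(recommend(Gap.OUTPUT_FORMAT, prompting_already_tried=True))  # fine-tune
```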

LoRA and parameter-efficient methods.

Modern fine-tuning rarely updates every weight. Low-Rank Adaptation (LoRA) and related methods update a small set of adapter weights, so a single base model can host dozens of fine-tuned variants without the storage and serving overhead of full copies. For most production workflows in 2026, LoRA is the default.
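
To make the "small set of adapter weights" concrete: LoRA leaves a weight matrix W of shape d_out × d_in frozen and learns two thin matrices, B (d_out × r) and A (r × d_in), whose product is added to W's output. The sketch below uses plain Python lists rather than a training framework. The 4096 dimensions, the rank of 8, and the tiny example matrices are illustrative assumptions, and `scale` stands in for the alpha / r factor used in the original LoRA formulation.

```python
def full_update_params(d_out: int, d_in: int) -> int:
    """Trainable parameters if every entry of the weight matrix is updated."""
    return d_out * d_in


def lora_update_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for LoRA: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(x, W, A, B, scale=1.0):
    """Compute W x + scale * B (A x); W stays frozen, only A and B would be trained."""
    base = matvec(W, x)
    delta = matvec(B, matvec(A, x))
    return [b + scale * d for b, d in zip(base, delta)]


# Illustrative sizes: one 4096 x 4096 projection, rank-8 adapter.
print(full_update_params(4096, 4096))     # 16777216
print(lora_update_params(4096, 4096, 8))  # 65536

# Tiny rank-1 example. B starts at zero, so the adapter is initially a no-op.
W = [[1, 0], [0, 1]]
A = [[1, 1]]
x = [3, 4]
print(lora_forward(x, W, A, B=[[0], [0]]))         # [3.0, 4.0]
print(lora_forward(x, W, A, B=[[0.5], [-1.0]]))    # [6.5, -3.0]
```

Because only A and B differ between variants, many adapters can be stored and swapped against one copy of W, which is the hosting advantage described above.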

Fine-tuning requires evals first.

A fine-tune without a fixture set is a guess. The pre-tuning baseline, the post-tuning score, and the regression check against the baseline are how the team learns whether the tune actually moved the workflow forward.
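
The three numbers named above (baseline, post-tuning score, regression check) can be produced from per-fixture pass/fail results. The following is a minimal sketch, assuming each run is recorded as a mapping from a fixture ID to a boolean. The `TuneReport` and `compare_runs` names, the fixture IDs, and the "no regressions allowed" policy are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TuneReport:
    baseline_score: float
    tuned_score: float
    regressions: list[str]  # fixtures that passed before the tune and fail after it

    @property
    def improved(self) -> bool:
        # Assumed policy: a higher score only counts if nothing regressed.
        return self.tuned_score > self.baseline_score and not self.regressions


def compare_runs(baseline: dict[str, bool], tuned: dict[str, bool]) -> TuneReport:
    """Compare pass/fail results for the same fixtures before and after a tune."""
    if not baseline:
        raise ValueError("fixture set is empty")
    if baseline.keys() != tuned.keys():
        raise ValueError("both runs must cover the same fixtures")
    total = len(baseline)
    regressions = sorted(
        fixture_id for fixture_id, passed in baseline.items()
        if passed and not tuned[fixture_id]
    )
    return TuneReport(
        baseline_score=sum(baseline.values()) / total,
        tuned_score=sum(tuned.values()) / total,
        regressions=regressions,
    )


# Made-up results for four fixtures.
baseline_run = {"f1": True, "f2": False, "f3": False, "f4": True}
tuned_run = {"f1": True, "f2": True, "f3": True, "f4": False}

report = compare_runs(baseline_run, tuned_run)
print(report.baseline_score, report.tuned_score)  # 0.5 0.75
print(report.regressions)                         # ['f4']
print(report.improved)                            # False
```

In this example the aggregate score rises, yet the regression on `f4` blocks the tune under the assumed policy, which is the kind of signal a single headline score hides.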

Frequently asked.

What is fine-tuning? Fine-tuning is the practice of continuing the training of a pre-trained language model on a smaller, task-specific dataset to specialize it for one domain or output format. The result is a model that still understands general language but produces outputs closer to the target distribution.
When should we fine-tune versus prompt or do RAG? Prompt first. RAG second when the gap is missing data. Fine-tune only when the gap is output shape, brand voice, dense domain vocabulary, or cost and latency targets a smaller model can hit if specialized. Fine-tuning does not teach new facts; it teaches patterns.
What is LoRA fine-tuning? Low-Rank Adaptation is a parameter-efficient fine-tuning method that trains a small set of adapter weights on top of a frozen base model instead of updating all of the model's parameters. The result is a tune that typically takes hours instead of days, costs far less to train, and lets dozens of variants share one base model.
How many examples do we need to fine-tune? Between a few hundred and a few thousand high-quality, labeled examples are enough for most narrow tasks. Quality beats quantity. The fixture set used for evaluation should be held out from the training set entirely.
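
One way to keep the evaluation fixtures out of the training set is to assign each example to a split by hashing a stable ID, so the assignment does not change when the dataset is reshuffled or extended. This is a sketch under assumptions: the `id` field, the 20 percent evaluation fraction, and the function names are all hypothetical.

```python
import hashlib


def assign_split(example_id: str, eval_fraction: float = 0.2) -> str:
    """Deterministically map an example ID to 'eval' or 'train'."""
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "eval" if bucket < eval_fraction else "train"


def split_examples(examples: list[dict], eval_fraction: float = 0.2):
    """Return (train, held_out) lists; each example needs a stable 'id' field."""
    train, held_out = [], []
    for example in examples:
        if assign_split(example["id"], eval_fraction) == "eval":
            held_out.append(example)
        else:
            train.append(example)
    return train, held_out


# Made-up dataset of 200 labeled examples.
examples = [{"id": f"ex-{i}", "text": f"example {i}"} for i in range(200)]
train, held_out = split_examples(examples)

train_ids = {e["id"] for e in train}
held_out_ids = {e["id"] for e in held_out}
print(len(train) + len(held_out))          # 200
print(train_ids.isdisjoint(held_out_ids))  # True
```

Because the split depends only on the ID, an example that lands in the held-out set stays there across retraining runs and cannot drift into a later training set.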
