Fine-tuning and Adapting LLMs

Supervised Fine-tuning


Pre-training teaches a model to predict the next token. That is useful, but it does not make the model follow instructions. Supervised fine-tuning (SFT) bridges that gap by training the model on curated prompt-response pairs, teaching it to produce outputs that align with user intent.

Pre-training vs. SFT

Pre-training exposes the model to vast amounts of raw text — the objective is next-token prediction with no notion of what a "good" response looks like. SFT shifts the objective: every training example has a prompt and a desired response, and the model is trained to generate that response given the prompt.

The model weights are updated using the same cross-entropy loss as pre-training, but only computed over the response tokens — not the prompt. The prompt is used as context but does not contribute to the loss.
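A common way to implement this is to build a labels sequence alongside the input IDs, writing -100 (the value PyTorch's cross-entropy ignores) at every prompt position. A minimal sketch, using made-up token IDs rather than a real tokenizer's output:

```python
# Hypothetical token IDs for illustration only -- real IDs come from a tokenizer.
prompt_ids = [101, 2023, 2003]      # tokens of the prompt
response_ids = [2292, 1005, 1055]   # tokens of the desired response

# The model sees the full sequence as input...
input_ids = prompt_ids + response_ids

# ...but prompt positions are masked with -100 in the labels,
# so the loss is computed only over the response tokens.
labels = [-100] * len(prompt_ids) + response_ids

print(input_ids)  # [101, 2023, 2003, 2292, 1005, 1055]
print(labels)     # [-100, -100, -100, 2292, 1005, 1055]
```

The two sequences have the same length; masking changes which positions contribute to the loss, not what the model reads as context.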

Prompt-Response Format

A typical SFT dataset entry for customer support looks like this:

  • Prompt: "My internet connection is slow. Can you help me troubleshoot?"
  • Response: "Let's start by restarting your router. Unplug it, wait 30 seconds, and plug it back in. If the issue persists, let me know your router model and I can walk you through further steps."

The model sees the prompt as input context and learns to generate the response. At scale, thousands of such pairs teach the model the expected tone, format, and domain knowledge.
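Before training, each pair is usually rendered into a single text string with a fixed template. The template below is an illustrative choice, not a standard; real setups typically use the model's own chat template instead:

```python
# Sketch of turning prompt-response pairs into training text.
pairs = [
    ("My internet connection is slow. Can you help me troubleshoot?",
     "Let's start by restarting your router."),
]

def format_example(prompt, response):
    # Hypothetical instruction-style template; any consistent format works,
    # as long as the same format is used at inference time.
    return f"### Instruction:\n{prompt}\n\n### Response:\n{response}"

for p, r in pairs:
    print(format_example(p, r))
```

Consistency matters more than the particular markers: the model learns to continue whatever structure the training data uses.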

What SFT Changes

SFT does not fundamentally change the model architecture or add parameters. It adjusts the weight distribution so the model assigns higher probability to responses that match the training distribution. A well-curated SFT dataset can produce dramatic improvements in instruction-following with relatively few training steps — often just one or two epochs over a few thousand examples.

from transformers import AutoTokenizer, AutoModelForCausalLM
from torch.optim import AdamW

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = AdamW(model.parameters(), lr=2e-5)

prompt = "My internet connection is slow. Can you help me troubleshoot?"
response = "Let's start by restarting your router."

# Concatenating prompt and response — loss computed over full sequence here for simplicity
text = prompt + " " + response
inputs = tokenizer(text, return_tensors="pt")

model.train()
optimizer.zero_grad()
outputs = model(**inputs, labels=inputs["input_ids"])
loss = outputs.loss
loss.backward()
optimizer.step()

print(f"SFT loss: {loss.item():.4f}")

Run this locally to see SFT in its simplest form. In a real setup, you would mask the prompt tokens so the loss is only computed over the response.
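The masking itself comes down to PyTorch's ignore_index mechanism: positions labeled -100 are skipped by the cross-entropy loss. A toy sketch with random logits standing in for a model's forward pass (note that real causal-LM training also shifts labels one position, which is omitted here):

```python
import torch
import torch.nn.functional as F

# Toy logits: batch of 1, sequence of 5 positions, vocabulary of 10.
# In real SFT these would come from the model's forward pass.
torch.manual_seed(0)
logits = torch.randn(1, 5, 10)

# First 3 positions are the prompt, last 2 are the response.
labels = torch.tensor([[-100, -100, -100, 4, 7]])

# ignore_index=-100 makes cross-entropy skip the prompt positions,
# so only the response tokens contribute to the loss.
loss = F.cross_entropy(
    logits.view(-1, 10), labels.view(-1), ignore_index=-100
)
print(f"masked loss: {loss.item():.4f}")
```

This is the same convention Hugging Face models use internally when you pass labels containing -100, which is why the prompt-masked setup needs no custom loss function.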
