Fine-tuning and Adapting LLMs

Implementing LoRA with the PEFT Library

The peft library wraps any Hugging Face model with LoRA adapters in a few lines of code. You specify which layers to adapt and at what rank; peft injects the trainable low-rank matrices and freezes everything else.

Applying LoRA to a Pretrained Model

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "bigscience/bloom-560m"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

lora_config = LoraConfig(
    r=8,                              # Rank of the low-rank matrices
    lora_alpha=32,                    # Scaling factor for adapter outputs
    target_modules=["query_key_value"],  # Attention modules to adapt
    lora_dropout=0.05,                # Dropout on adapter layers
    bias="none",                      # No additional bias parameters
    task_type="CAUSAL_LM"
)

lora_model = get_peft_model(model, lora_config)
lora_model.print_trainable_parameters()

print_trainable_parameters() reports how many parameters are trainable versus frozen. For a 560M-parameter model with r=8, expect trainable parameters to be well under 1% of the total.

Run this locally to see the parameter breakdown and confirm that only the LoRA adapter weights are marked as trainable.
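The breakdown itself is simple arithmetic: trainable parameters divided by total parameters. The sketch below mimics that counting logic on a made-up parameter list (the names and sizes are hypothetical, chosen only to illustrate why the trainable fraction is tiny, since the frozen embedding and base weights dominate the count):

```python
def count_trainable(params):
    """params: iterable of (name, numel, requires_grad) tuples."""
    trainable = sum(n for _, n, grad in params if grad)
    total = sum(n for _, n, _ in params)
    return trainable, total, 100.0 * trainable / total

# Hypothetical parameter breakdown for a single adapted layer plus a
# frozen embedding table (sizes are illustrative, not read from a checkpoint):
params = [
    ("word_embeddings.weight",            256_000_000, False),
    ("h.0.query_key_value.weight",          3_145_728, False),
    ("h.0.query_key_value.lora_A.weight",       8_192, True),   # r x d_in
    ("h.0.query_key_value.lora_B.weight",      24_576, True),   # d_out x r
]

trainable, total, pct = count_trainable(params)
print(f"trainable: {trainable:,} || all params: {total:,} || trainable%: {pct:.4f}")
```

Only the two low-rank factors count as trainable, so the percentage collapses even though the base layer itself is large.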

Key Configuration Parameters

r sets the rank of the adapter matrices. Higher rank means more capacity but more memory. Start with r=8 and increase only if the model underfits.

lora_alpha scales the adapter output: peft multiplies the adapter contribution by lora_alpha / r, which acts roughly like a learning rate multiplier for the adapters. A common heuristic is to set lora_alpha = 2 × r.

target_modules controls which layers receive adapters. Targeting only attention projections is the most common approach; adding MLP layers increases capacity at higher cost.

lora_dropout applies dropout to adapter outputs during training, reducing overfitting on small datasets.
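The effect of r on parameter count is easy to estimate: a LoRA adapter on a d_in → d_out weight matrix adds r × (d_in + d_out) parameters. The back-of-the-envelope sketch below uses rough figures for bigscience/bloom-560m (hidden size 1024, fused QKV projection 1024 → 3072, 24 transformer blocks); these dimensions are assumptions for illustration, not values read from the checkpoint:

```python
def lora_param_count(d_in, d_out, r):
    # A LoRA adapter factors the weight update as B @ A, with
    # A of shape (r, d_in) and B of shape (d_out, r).
    return r * (d_in + d_out)

# Assumed bloom-560m dimensions: fused QKV projection 1024 -> 3072,
# repeated across 24 transformer blocks.
per_layer = lora_param_count(1024, 3072, r=8)
total = per_layer * 24
print(per_layer, total)                              # 32768 786432
print(f"{100 * total / 560_000_000:.3f}% of 560M")   # 0.140% of 560M
```

Under these assumptions, adapting only the QKV projections at r=8 trains well under 1% of the model, consistent with what print_trainable_parameters() reports.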

Training with the Wrapped Model

The lora_model is a standard PyTorch nn.Module, so you can train it with the same loop as any other model:

from torch.optim import AdamW

optimizer = AdamW(lora_model.parameters(), lr=2e-4)

lora_model.train()
inputs = tokenizer("Fine-tuning with LoRA is efficient.", return_tensors="pt")
outputs = lora_model(**inputs, labels=inputs["input_ids"])

outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()

print(f"Loss: {outputs.loss.item():.4f}")
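To see why this loop only updates the adapters, it helps to strip LoRA down to its essentials. The toy module below is a from-scratch sketch, not the peft internals: a frozen linear layer plus trainable low-rank factors A and B, scaled by alpha / r. After one backward pass, only the adapter factors carry gradients:

```python
import torch
from torch import nn

class TinyLoRALinear(nn.Module):
    """Illustrative LoRA linear layer: frozen base weight + trainable B @ A."""
    def __init__(self, d_in, d_out, r, alpha):
        super().__init__()
        self.base = nn.Linear(d_in, d_out)
        self.base.weight.requires_grad_(False)   # freeze the pretrained weight
        self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(d_out, r))  # zero init: adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)

layer = TinyLoRALinear(16, 16, r=4, alpha=8)
loss = layer(torch.randn(2, 16)).sum()
loss.backward()

# Only the adapter factors receive gradients; the base weights stay frozen.
for name, p in layer.named_parameters():
    print(name, p.grad is not None)
```

The same check works on the real lora_model: iterating named_parameters() after backward() should show gradients only on parameters whose names contain "lora".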