Implementing LoRA with the PEFT Library
The peft library wraps any Hugging Face model with LoRA adapters in a few lines of code. You define which layers to adapt and at what rank, and peft handles injecting the trainable matrices and freezing everything else.
Applying LoRA to a Pretrained Model
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model
model_name = "bigscience/bloom-560m"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
lora_config = LoraConfig(
    r=8,                                  # Rank of the low-rank matrices
    lora_alpha=32,                        # Scaling factor for adapter outputs
    target_modules=["query_key_value"],   # Attention modules to adapt
    lora_dropout=0.05,                    # Dropout on adapter layers
    bias="none",                          # No additional bias parameters
    task_type="CAUSAL_LM"
)
lora_model = get_peft_model(model, lora_config)
lora_model.print_trainable_parameters()
print_trainable_parameters() shows how many parameters are trainable vs. frozen. For a 560M model with r=8, expect trainable parameters to be well under 1% of total.
Run this locally to see the parameter breakdown and confirm that only the LoRA adapter weights are marked as trainable.
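What print_trainable_parameters() reports is just a ratio over requires_grad flags. As a minimal sketch in plain PyTorch (with a hypothetical frozen base layer and rank-8 adapter standing in for the wrapped transformer), the same count can be reproduced by hand:

```python
import torch
from torch import nn

def count_trainable(model: nn.Module) -> tuple[int, int]:
    """Return (trainable, total) parameter counts, like print_trainable_parameters()."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable, total

# Hypothetical stand-in: a frozen base projection plus a small rank-8 adapter,
# mimicking what get_peft_model injects into each targeted module.
base = nn.Linear(1024, 1024)
for p in base.parameters():
    p.requires_grad = False          # base weights are frozen
adapter_A = nn.Linear(1024, 8, bias=False)   # r=8 down-projection (trainable)
adapter_B = nn.Linear(8, 1024, bias=False)   # r=8 up-projection (trainable)
model = nn.ModuleDict({"base": base, "A": adapter_A, "B": adapter_B})

t, n = count_trainable(model)
print(f"trainable params: {t} || all params: {n} || trainable%: {100 * t / n:.4f}")
```

Even in this toy, the adapter accounts for only about 1.5% of parameters; at full model scale the ratio is far smaller.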
Key Configuration Parameters
r sets the rank of the adapter matrices. Higher rank means more capacity but more memory. Start with r=8 and increase only if the model underfits.
lora_alpha scales the adapter output – effectively a learning rate multiplier for the adapters. A common heuristic is to set lora_alpha = 2 × r.
target_modules controls which layers receive adapters. Targeting only attention projections is the most common approach; adding MLP layers increases capacity at higher cost.
lora_dropout applies dropout to adapter outputs during training, reducing overfitting on small datasets.
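The interaction between r and lora_alpha can be made concrete: the adapter's contribution to the forward pass is scaled by alpha / r, and because LoRA initializes the up-projection to zero, the adapted model starts out identical to the base model. A minimal sketch of that forward computation (toy dimensions, not the peft internals):

```python
import torch

def lora_forward(x, W, A, B, alpha, r):
    # Base projection plus the scaled low-rank update: h = x W^T + (alpha/r) * (x A^T) B^T
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

d, r, alpha = 64, 8, 32
W = torch.randn(d, d)            # frozen pretrained weight
A = torch.randn(r, d) * 0.01     # trainable down-projection
B = torch.zeros(d, r)            # trainable up-projection, zero-initialized as in LoRA
x = torch.randn(2, d)

h = lora_forward(x, W, A, B, alpha, r)
# With B at zero, the adapter is a no-op: the output equals the base projection.
assert torch.allclose(h, x @ W.T)
```

Raising lora_alpha with r fixed amplifies every adapter update by the same factor, which is why it behaves like a learning rate multiplier for the adapters.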
Training with the Wrapped Model
The lora_model is a standard nn.Module – you train it with the same loop as any other PyTorch model:
from torch.optim import AdamW
optimizer = AdamW(lora_model.parameters(), lr=2e-4)
lora_model.train()
inputs = tokenizer("Fine-tuning with LoRA is efficient.", return_tensors="pt")
outputs = lora_model(**inputs, labels=inputs["input_ids"])
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"Loss: {outputs.loss.item():.4f}")
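The single step above extends to a multi-step loop in the obvious way. As a sketch with a toy frozen base layer and low-rank adapter (hypothetical stand-ins for the wrapped transformer, using plain PyTorch), you can verify the key property: training moves only the adapter weights while the base stays byte-identical:

```python
import torch
from torch import nn
from torch.optim import AdamW

torch.manual_seed(0)

# Hypothetical stand-in for lora_model: frozen base + trainable rank-4 adapter.
base = nn.Linear(16, 16)
for p in base.parameters():
    p.requires_grad = False
A = nn.Linear(16, 4, bias=False)
B = nn.Linear(4, 16, bias=False)
nn.init.zeros_(B.weight)  # LoRA zero-initializes the up-projection

# Only adapter parameters go to the optimizer.
optimizer = AdamW(list(A.parameters()) + list(B.parameters()), lr=2e-4)

base_before = base.weight.clone()
x, y = torch.randn(8, 16), torch.randn(8, 16)
for _ in range(3):
    out = base(x) + B(A(x))               # base path + adapter path
    loss = nn.functional.mse_loss(out, y)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

assert torch.equal(base.weight, base_before)  # frozen base untouched
assert B.weight.abs().sum() > 0               # adapter weights have moved
```

The same invariant holds for the real lora_model: gradients flow only into the injected A and B matrices, which is what makes the fine-tune cheap to store and swap.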