Challenge: Fine-tune a Customer Support LLM
Task
Fine-tune a small open-source LLM for a customer support scenario using LoRA or QLoRA. Complete all steps locally.
- Choose a base model: pick a small pretrained model compatible with PEFT, such as bigscience/bloom-560m or facebook/opt-125m;
- Prepare an instruction dataset: write at least 10–20 prompt-response pairs covering real customer support interactions. Format them as JSONL with instruction and response fields;
- Apply LoRA or QLoRA: use the peft library to add adapters. Choose QLoRA if your GPU has less than 8 GB of VRAM;
- Fine-tune: implement a training loop using the SFT techniques from this course. Use gradient accumulation and a cosine schedule with warmup;
- Generate responses: after training, run both the base model and your fine-tuned model on several prompts, including examples from your dataset and new unseen queries;
- Evaluate: compare outputs side by side. Score them on helpfulness, accuracy, and tone. Note where the fine-tuned model improves and where it still falls short.
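The dataset and scheduling steps above can be sketched with the standard library alone before any model code is written. This is a minimal sketch, not a reference solution: the example pairs, the function names, and the batch settings are all illustrative assumptions. It writes and validates a JSONL file with instruction and response fields, counts optimizer steps under gradient accumulation, and computes a learning-rate multiplier with linear warmup followed by cosine decay.

```python
import json
import math
import tempfile
from pathlib import Path

# Hypothetical example pairs; replace with your own 10-20 support interactions.
PAIRS = [
    {"instruction": "How do I reset my password?",
     "response": "Open Settings, choose Security, then select Reset password."},
    {"instruction": "Where can I find my invoice?",
     "response": "Invoices are listed under Billing in your account dashboard."},
]


def write_jsonl(pairs, path):
    """Write one JSON object per line with 'instruction' and 'response' fields."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Load the file and check that every record has both fields as non-empty strings."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # skip blank lines
            record = json.loads(line)
            for key in ("instruction", "response"):
                value = record.get(key)
                if not isinstance(value, str) or not value.strip():
                    raise ValueError(f"line {line_no}: missing or empty '{key}'")
            records.append(record)
    return records


def optimizer_steps(num_examples, batch_size, accumulation_steps, epochs):
    """Optimizer updates when gradients are accumulated over several micro-batches."""
    micro_batches = math.ceil(num_examples / batch_size)
    return math.ceil(micro_batches / accumulation_steps) * epochs


def cosine_with_warmup(step, warmup_steps, total_steps):
    """Learning-rate multiplier: linear warmup from 0 to 1, then cosine decay to 0."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


if __name__ == "__main__":
    path = Path(tempfile.mkdtemp()) / "support.jsonl"
    write_jsonl(PAIRS, path)
    print(len(read_jsonl(path)), "valid records")

    # Assumed settings: 20 examples, micro-batch of 2, accumulate 4, 3 epochs.
    total = optimizer_steps(20, batch_size=2, accumulation_steps=4, epochs=3)
    for step in range(total + 1):
        print(step, round(cosine_with_warmup(step, warmup_steps=2, total_steps=total), 3))
```

In a PyTorch training loop, a multiplier like this could be passed to torch.optim.lr_scheduler.LambdaLR and stepped once per optimizer update rather than once per micro-batch. The transformers helper get_cosine_schedule_with_warmup provides a comparable schedule.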
After completing the steps, reflect on the following:
- How did the model's responses change after fine-tuning?
- What effect did dataset size have on output quality?
- What would you do differently with more data or compute – more epochs, higher rank, a larger base model?
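When weighing a higher rank, it can help to see how the adapter size grows. For each adapted linear layer, LoRA adds two matrices of shapes (rank x d_in) and (d_out x rank), so the trainable parameter count scales linearly with rank. The sketch below is only a rough aid: the layer shapes are assumptions for facebook/opt-125m (hidden size 768, 12 layers, adapters on the query and value projections only), and the helper name is hypothetical.

```python
def lora_trainable_params(rank, d_in, d_out, num_modules):
    """Parameters added by LoRA: A is (rank x d_in) and B is (d_out x rank) per module."""
    return rank * (d_in + d_out) * num_modules


if __name__ == "__main__":
    # Assumed: 12 layers, 2 adapted projections per layer, each 768 x 768.
    for rank in (4, 8, 16, 32):
        print(rank, lora_trainable_params(rank, 768, 768, num_modules=12 * 2))
```

Doubling the rank doubles the adapter parameters, but with only 10–20 training pairs a larger adapter may overfit rather than improve the responses.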