Fine-tuning and Adapting LLMs

Challenge: Fine-tune a Customer Support LLM


Task

Fine-tune a small open-source LLM for a customer support scenario using LoRA or QLoRA. Complete all steps locally.

  1. Choose a base model: pick a small pretrained model compatible with PEFT, such as bigscience/bloom-560m or facebook/opt-125m;
  2. Prepare an instruction dataset: write at least 10 prompt-response pairs, ideally 20 or more, covering realistic customer support interactions. Format them as JSONL, one object per line, with instruction and response fields;
  3. Apply LoRA or QLoRA: use the peft library to add adapters. Choose QLoRA if your GPU has less than 8GB of VRAM;
  4. Fine-tune: implement a training loop using the SFT techniques from this course. Use gradient accumulation and a cosine schedule with warmup;
  5. Generate responses: after training, run both the base model and your fine-tuned model on several prompts, including examples from your dataset and new unseen queries;
  6. Evaluate: compare outputs side by side. Score them on helpfulness, accuracy, and tone. Note where the fine-tuned model improves and where it still falls short.
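
For step 2, a minimal sketch of writing and validating the JSONL file might look like the following. Only the instruction and response field names come from the task; the helper names write_jsonl and load_jsonl and the two sample pairs are illustrative assumptions.

```python
import json

# Hypothetical sample pairs; replace with your own support interactions.
PAIRS = [
    {
        "instruction": "How do I reset my password?",
        "response": "Open Settings, choose Security, then select Reset password "
                    "and follow the link we email you.",
    },
    {
        "instruction": "My order arrived damaged. What should I do?",
        "response": "I'm sorry about that. Please reply with your order number "
                    "and a photo so we can arrange a replacement.",
    },
]

REQUIRED_FIELDS = ("instruction", "response")


def write_jsonl(pairs, path):
    """Write one JSON object per line (the JSONL convention)."""
    with open(path, "w", encoding="utf-8") as f:
        for pair in pairs:
            f.write(json.dumps(pair, ensure_ascii=False) + "\n")


def load_jsonl(path):
    """Read the file back and check that each record has non-empty string fields."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            record = json.loads(line)
            for field in REQUIRED_FIELDS:
                value = record.get(field)
                if not isinstance(value, str) or not value.strip():
                    raise ValueError(f"line {line_no}: missing or empty '{field}'")
            records.append(record)
    return records
```

Calling write_jsonl(PAIRS, "support_pairs.jsonl") and then load_jsonl on the same path (the file name is an assumption) gives a quick check that the file is well-formed before training starts.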
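
For step 4, the interaction between gradient accumulation and a cosine schedule with warmup can be sketched in plain Python. This is not the course's training loop. It only shows how many optimizer updates a run produces and what learning rate each update would see. The dataset size, batch size, accumulation factor, epoch count, 10% warmup, and peak learning rate are all assumed toy values.

```python
import math


def optimizer_steps(num_examples, batch_size, accum_steps, epochs):
    """Optimizer updates when gradients are accumulated over several micro-batches."""
    micro_batches_per_epoch = math.ceil(num_examples / batch_size)
    # One update per accum_steps micro-batches; a trailing partial group still updates.
    return math.ceil(micro_batches_per_epoch / accum_steps) * epochs


def cosine_with_warmup(step, warmup_steps, total_steps, peak_lr):
    """Learning rate at an optimizer step: linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumed toy settings: 20 examples, micro-batch of 2, accumulate 4, 10 epochs.
total = optimizer_steps(num_examples=20, batch_size=2, accum_steps=4, epochs=10)
warmup = max(1, total // 10)  # assumed 10% warmup
schedule = [cosine_with_warmup(s, warmup, total, peak_lr=2e-4) for s in range(total + 1)]

for s in (0, warmup, total // 2, total):
    print(f"step {s:>3}: lr = {schedule[s]:.2e}")
```

The schedule is indexed by optimizer step, not by micro-batch, so with accumulation the scheduler should advance only when the optimizer does. The effective batch size here is batch_size * accum_steps.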

After completing the steps, reflect on the following:

  • How did the model's responses change after fine-tuning?
  • What effect did dataset size have on output quality?
  • What would you do differently with more data or compute: more epochs, a higher rank, or a larger base model?
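
When weighing a higher rank, it may help to see how adapter size scales. LoRA adds two low-rank matrices to each adapted linear layer, A of shape (rank x d_in) and B of shape (d_out x rank), so the trainable parameter count grows linearly with rank. The sketch below uses hypothetical layer shapes chosen for illustration; they are not read from any particular checkpoint.

```python
def lora_params(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one linear layer: A (rank x d_in) + B (d_out x rank)."""
    return rank * (d_in + d_out)


def total_lora_params(d_in, d_out, rank, matrices_per_layer, num_layers):
    """Adapter parameters across all adapted matrices in the model."""
    return lora_params(d_in, d_out, rank) * matrices_per_layer * num_layers


# Hypothetical shapes for illustration only.
HIDDEN = 1024          # assumed width of each adapted square projection
LAYERS = 24            # assumed number of transformer blocks
ADAPTED_PER_LAYER = 2  # assumed: two projections adapted per block

for rank in (4, 8, 16, 32):
    n = total_lora_params(HIDDEN, HIDDEN, rank, ADAPTED_PER_LAYER, LAYERS)
    print(f"rank={rank:>2}  trainable adapter params={n:,}")
```

Doubling the rank doubles the adapter parameters. With only a few dozen training pairs, a higher rank adds capacity that the data may not be able to use, which is worth keeping in mind when answering the question above.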

Section 1. Chapter 11