Transformer Architecture

Challenge: Build and Test Your Transformer


Task

Build a transformer from scratch and train it on a synthetic sequence-to-sequence task: string reversal. The input is a sequence of random lowercase letters (e.g., "hello"), and the target is the same sequence reversed (e.g., "olleh").

Use only the components you implemented in previous chapters — scaled dot-product attention, multi-head attention, positional encoding, encoder and decoder blocks, layer normalization, and feed-forward sublayers. Do not use external transformer implementations.
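As a reminder of the kind of interface these components expose, here is a minimal NumPy sketch of scaled dot-product attention. The function name and signature are assumptions for illustration; your version from the earlier chapters may differ in shape conventions or masking style.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V.

    Q, K, V: arrays of shape (..., seq_len, d_k). mask, if given, is a
    boolean array broadcastable to the score matrix (True = keep).
    """
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # (..., seq_q, seq_k)
    if mask is not None:
        scores = np.where(mask, scores, -1e9)       # block masked positions
    # numerically stable softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V
```

The same masking hook serves both padding masks in the encoder and the causal mask in the decoder.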

Your implementation should:

  1. Generate a synthetic dataset of random lowercase strings and their reverses;
  2. Tokenize strings at the character level and build a vocabulary;
  3. Assemble a full encoder-decoder transformer from your own components;
  4. Implement a training loop with cross-entropy loss;
  5. Evaluate sequence-level accuracy on a held-out test set — the percentage of inputs where the predicted output exactly matches the reversed string.
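Steps 1, 2, and 5 are model-independent and can be sketched in plain Python. The helper names below (`make_dataset`, `build_vocab`, `encode`, `sequence_accuracy`) and the choice of three special tokens are illustrative assumptions, not part of the challenge specification.

```python
import random
import string

PAD, BOS, EOS = "<pad>", "<bos>", "<eos>"

def make_dataset(n_samples, min_len=3, max_len=10, seed=0):
    """Step 1: random lowercase strings paired with their reverses."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_samples):
        length = rng.randint(min_len, max_len)
        src = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        pairs.append((src, src[::-1]))
    return pairs

def build_vocab():
    """Step 2: character-level vocabulary, 3 special tokens + 26 letters."""
    tokens = [PAD, BOS, EOS] + list(string.ascii_lowercase)
    return {tok: idx for idx, tok in enumerate(tokens)}

def encode(s, stoi):
    """String -> list of token ids, wrapped in BOS/EOS."""
    return [stoi[BOS]] + [stoi[ch] for ch in s] + [stoi[EOS]]

def sequence_accuracy(predicted, targets):
    """Step 5: fraction of predictions that exactly match the target."""
    correct = sum(p == t for p, t in zip(predicted, targets))
    return correct / len(targets)
```

Note that sequence-level accuracy is stricter than per-token accuracy: a single wrong character makes the whole prediction count as incorrect.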

Once your model trains successfully, experiment with the following:

  • Number of encoder and decoder layers;
  • Number of attention heads;
  • d_model and d_ff values;
  • Sequence length and dataset size;
  • Learning rate and number of training epochs.
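To keep the sweep organized, it helps to hold all of these settings in a single config and vary one at a time. The values below are a hypothetical starting point, not prescribed by the challenge; the key names mirror the bullets above.

```python
# Hypothetical baseline; values are illustrative starting points.
base_config = {
    "num_layers": 2,        # encoder and decoder layers each
    "num_heads": 4,         # must evenly divide d_model
    "d_model": 64,
    "d_ff": 256,            # feed-forward hidden size (often 4 * d_model)
    "max_seq_len": 12,      # source length, excluding special tokens
    "dataset_size": 10_000,
    "lr": 3e-4,
    "epochs": 20,
}
assert base_config["d_model"] % base_config["num_heads"] == 0

def vary(config, **overrides):
    """Return a copy of the config with some settings changed."""
    out = dict(config)
    out.update(overrides)
    return out

# e.g., one run of the sweep with a deeper model
deep = vary(base_config, num_layers=4)
```

Changing one knob per run makes it much easier to attribute accuracy or stability changes to a specific hyperparameter.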

Observe how each change affects accuracy and training stability. Note any interesting behaviors: for example, at what point does the model start to generalize, and what happens when you increase the sequence length?


Section 1. Chapter 11
