Challenge: Build and Test Your Transformer
Transformer Architecture


Task

Build a transformer from scratch and train it on a synthetic sequence-to-sequence task: string reversal. The input is a sequence of random lowercase letters (e.g., "hello"), and the target is the same sequence reversed (e.g., "olleh").
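The data for this task can be generated entirely on the fly. A minimal sketch (the function name and length bounds are illustrative, not part of the task specification):

```python
import random
import string

def make_pair(min_len=3, max_len=10, rng=random):
    """Generate one (input, target) pair: a random lowercase string and its reverse."""
    length = rng.randint(min_len, max_len)
    src = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return src, src[::-1]  # the target is simply the reversed input

pair = make_pair()
```

Because the mapping is deterministic, you can generate as much training data as you like and hold out fresh samples for evaluation.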

Use only the components you implemented in previous chapters — scaled dot-product attention, multi-head attention, positional encoding, encoder and decoder blocks, layer normalization, and feed-forward sublayers. Do not use external transformer implementations.
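As a reference point for the core component, scaled dot-product attention can be sketched in plain NumPy. This is a minimal, framework-agnostic version; your own implementation from the earlier chapters may differ in framework and masking convention:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v, mask=None):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-2, -1) / np.sqrt(d_k)
    if mask is not None:
        # Positions where mask is False are blocked with a large negative score.
        scores = np.where(mask, scores, -1e9)
    # Numerically stable softmax over the last axis.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Toy check: batch of 1, sequence length 4, d_model = 4.
q = k = v = np.eye(4)[None]
out, w = scaled_dot_product_attention(q, k, v)
```

The decoder's self-attention will additionally need a causal mask so each position can only attend to earlier positions.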

Your implementation should:

  1. Generate a synthetic dataset of random lowercase strings and their reverses;
  2. Tokenize strings at the character level and build a vocabulary;
  3. Assemble a full encoder-decoder transformer from your own components;
  4. Implement a training loop with cross-entropy loss;
  5. Evaluate sequence-level accuracy on a held-out test set — the percentage of inputs where the predicted output exactly matches the reversed string.
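Steps 2 and 5 above are straightforward with a character-level vocabulary. A possible sketch, where the special-token names (`<pad>`, `<bos>`, `<eos>`) and helper names are illustrative assumptions:

```python
import string

PAD, BOS, EOS = "<pad>", "<bos>", "<eos>"

def build_vocab():
    """Map each special token and lowercase letter to an integer id."""
    tokens = [PAD, BOS, EOS] + list(string.ascii_lowercase)
    stoi = {t: i for i, t in enumerate(tokens)}
    itos = {i: t for t, i in stoi.items()}
    return stoi, itos

def encode(s, stoi, max_len):
    """Wrap a string in BOS/EOS and pad to a fixed length."""
    ids = [stoi[BOS]] + [stoi[c] for c in s] + [stoi[EOS]]
    ids += [stoi[PAD]] * (max_len - len(ids))
    return ids

def sequence_accuracy(preds, targets):
    """Fraction of examples where the predicted string matches exactly."""
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)

stoi, itos = build_vocab()
```

Note that sequence-level accuracy is strict: a single wrong character counts the whole prediction as a failure, which makes it a more informative metric here than per-token accuracy.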

Once your model trains successfully, experiment with the following:

  • Number of encoder and decoder layers;
  • Number of attention heads;
  • d_model and d_ff values;
  • Sequence length and dataset size;
  • Learning rate and number of training epochs.
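A simple grid sweep keeps these experiments organized. In the sketch below, `train_and_eval` is a hypothetical stand-in for your own training entry point, assumed to return sequence-level test accuracy for a given configuration:

```python
from itertools import product

# Illustrative grid; adjust values to whatever you want to explore.
grid = {
    "num_layers": [1, 2],
    "num_heads": [2, 4],
    "d_model": [32, 64],
}

def sweep(train_and_eval, grid):
    """Run train_and_eval on every combination in the grid and collect results."""
    results = {}
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        config = dict(zip(keys, values))
        results[tuple(values)] = train_and_eval(**config)
    return results
```

Recording every run this way makes it easy to spot which knob actually moved the accuracy.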

Observe how each change affects accuracy and training stability. Note any interesting behavior: for example, at what point does the model begin to generalize, and what happens as you increase the sequence length?


Section 1. Chapter 11
