Transformer Architecture

Challenge: Build and Test Your Transformer


Task

Build a transformer from scratch and train it on a synthetic sequence-to-sequence task: string reversal. The input is a sequence of random lowercase letters (e.g., "hello"), and the target is the same sequence reversed (e.g., "olleh").
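A minimal sketch of such a data generator, using only the standard library; the helper names and the length range are illustrative choices, not part of any required API:

```python
import random
import string

def make_pair(rng, min_len=3, max_len=10):
    # Draw a random lowercase string and pair it with its reverse.
    length = rng.randint(min_len, max_len)
    src = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    return src, src[::-1]

def make_dataset(n_pairs, seed=0):
    # A seeded Random instance keeps the dataset reproducible across runs.
    rng = random.Random(seed)
    return [make_pair(rng) for _ in range(n_pairs)]

pairs = make_dataset(1000)
```

Seeding the generator matters here: a reproducible train/test split makes accuracy comparisons across hyperparameter settings meaningful.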

Use only the components you implemented in previous chapters — scaled dot-product attention, multi-head attention, positional encoding, encoder and decoder blocks, layer normalization, and feed-forward sublayers. Do not use external transformer implementations.

Your implementation should:

  1. Generate a synthetic dataset of random lowercase strings and their reverses;
  2. Tokenize strings at the character level and build a vocabulary;
  3. Assemble a full encoder-decoder transformer from your own components;
  4. Implement a training loop with cross-entropy loss;
  5. Evaluate sequence-level accuracy on a held-out test set — the percentage of inputs where the predicted output exactly matches the reversed string.
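Steps 2 and 5 above can be sketched in plain Python. The helper names, the special tokens, and the padding scheme are illustrative assumptions; adapt them to your own components:

```python
def build_vocab(pairs):
    # Special tokens first, then every character seen in the data.
    chars = sorted({c for src, tgt in pairs for c in src + tgt})
    tokens = ["<pad>", "<sos>", "<eos>"] + chars
    return {tok: i for i, tok in enumerate(tokens)}

def encode(s, vocab, max_len):
    # <sos> c1 c2 ... <eos>, right-padded to a fixed length of max_len + 2.
    ids = [vocab["<sos>"]] + [vocab[c] for c in s] + [vocab["<eos>"]]
    return ids + [vocab["<pad>"]] * (max_len + 2 - len(ids))

def sequence_accuracy(predictions, targets):
    # Fraction of examples whose prediction matches the target exactly.
    hits = sum(p == t for p, t in zip(predictions, targets))
    return hits / len(targets)
```

Sequence-level accuracy is deliberately strict: a single wrong character counts the whole example as a failure, which makes it a much harder target than per-token accuracy.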

Once your model trains successfully, experiment with the following:

  • Number of encoder and decoder layers;
  • Number of attention heads;
  • d_model and d_ff values;
  • Sequence length and dataset size;
  • Learning rate and number of training epochs.

Observe how each change affects accuracy and training stability. Note any interesting behaviors: for example, at what point does the model start generalizing, and what happens when you increase the sequence length?
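One way to keep these experiments organized is to enumerate configurations up front; the grid values below are arbitrary examples, and each config would be passed to your own training routine (not shown here):

```python
from itertools import product

grid = {
    "num_layers": [1, 2, 4],
    "num_heads": [2, 4],
    "d_model": [64, 128],
    "lr": [1e-3, 3e-4],
}

# Cartesian product of all grid axes, one dict per run.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
```

When reading the results, vary one axis at a time where possible, so that an accuracy change can be attributed to a single hyperparameter rather than an interaction.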


Section 1. Chapter 11
