Transformer Architecture

Challenge: Build and Test Your Transformer


Task

Build a transformer from scratch and train it on a synthetic sequence-to-sequence task: English-to-reversed-English translation. The input is a sequence of random lowercase letters (e.g., "hello"), and the target is its reverse (e.g., "olleh").
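A minimal sketch of the data generation step might look like the following; the function name `make_pairs` and its parameters are illustrative, not prescribed by the task.

```python
import random
import string

def make_pairs(n_samples, min_len=3, max_len=10, seed=0):
    """Generate (string, reversed string) pairs of random lowercase letters."""
    rng = random.Random(seed)  # seeded for reproducible datasets
    pairs = []
    for _ in range(n_samples):
        length = rng.randint(min_len, max_len)
        s = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        pairs.append((s, s[::-1]))
    return pairs

for src, tgt in make_pairs(5):
    print(src, "->", tgt)
```

Seeding the generator makes train/test splits reproducible across runs, which helps when comparing hyperparameter settings later.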

Use only the components you implemented in previous chapters — scaled dot-product attention, multi-head attention, positional encoding, encoder and decoder blocks, layer normalization, and feed-forward sublayers. Do not use external transformer implementations.

Your implementation should:

  1. Generate a synthetic dataset of random lowercase strings and their reverses;
  2. Tokenize strings at the character level and build a vocabulary;
  3. Assemble a full encoder-decoder transformer from your own components;
  4. Implement a training loop with cross-entropy loss;
  5. Evaluate sequence-level accuracy on a held-out test set — the percentage of inputs where the predicted output exactly matches the reversed string.

Once your model trains successfully, experiment with the following:

  • Number of encoder and decoder layers;
  • Number of attention heads;
  • d_model and d_ff values;
  • Sequence length and dataset size;
  • Learning rate and number of training epochs.
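One way to organize these experiments is to vary a single hyperparameter at a time against a fixed baseline. The sketch below assumes you have written your own `train_and_eval(config)` function (hypothetical here) that builds the model from the config, trains it, and returns test accuracy.

```python
# Baseline config; values are illustrative starting points, not recommendations.
base = dict(n_layers=2, n_heads=4, d_model=64, d_ff=256,
            seq_len=10, n_samples=10_000, lr=3e-4, epochs=20)

def variations(base):
    """Yield configs that change one hyperparameter at a time."""
    sweeps = {
        "n_layers": [1, 2, 4],
        "n_heads": [2, 4, 8],
        "d_model": [32, 64, 128],
        "seq_len": [10, 20, 40],
    }
    for key, values in sweeps.items():
        for v in values:
            cfg = dict(base)
            cfg[key] = v
            yield key, v, cfg

for key, v, cfg in variations(base):
    print(f"{key}={v}")  # replace with: acc = train_and_eval(cfg)
```

Changing one knob at a time makes it much easier to attribute an accuracy or stability change to a specific hyperparameter.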

Observe how each change affects accuracy and training stability. Note any interesting behaviors: for example, at what point does the model start generalizing, and what happens when you increase the sequence length?


Section 1. Chapter 11
