Challenge: Build and Test Your Transformer
Task
Build a transformer from scratch and train it on a synthetic sequence-to-sequence task: string reversal. The input is a sequence of random lowercase letters (e.g., "hello"), and the target is the same sequence reversed (e.g., "olleh").
Use only the components you implemented in previous chapters — scaled dot-product attention, multi-head attention, positional encoding, encoder and decoder blocks, layer normalization, and feed-forward sublayers. Do not use external transformer implementations.
Your implementation should:
- Generate a synthetic dataset of random lowercase strings and their reverses;
- Tokenize strings at the character level and build a vocabulary;
- Assemble a full encoder-decoder transformer from your own components;
- Implement a training loop with cross-entropy loss;
- Evaluate sequence-level accuracy on a held-out test set — the percentage of inputs where the predicted output exactly matches the reversed string.
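The first two checklist items can be sketched in plain Python with no framework dependencies. This is a minimal sketch, not a prescribed design: the function names (`make_dataset`, `encode`, `decode`) and the choice of special tokens are illustrative assumptions.

```python
import random
import string

# Special tokens: <pad> for batching, <bos>/<eos> to delimit decoder sequences.
PAD, BOS, EOS = "<pad>", "<bos>", "<eos>"
# Character-level vocabulary: special tokens first, then the 26 lowercase letters.
VOCAB = [PAD, BOS, EOS] + list(string.ascii_lowercase)
STOI = {ch: i for i, ch in enumerate(VOCAB)}  # character -> index
ITOS = {i: ch for ch, i in STOI.items()}      # index -> character

def make_dataset(n_pairs, min_len=3, max_len=10, seed=0):
    """Random lowercase strings paired with their reverses."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_pairs):
        length = rng.randint(min_len, max_len)
        src = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        pairs.append((src, src[::-1]))
    return pairs

def encode(s, add_specials=False):
    """Map a string to a list of token ids, optionally wrapped in <bos>/<eos>."""
    ids = [STOI[ch] for ch in s]
    return [STOI[BOS]] + ids + [STOI[EOS]] if add_specials else ids

def decode(ids):
    """Map token ids back to a string, dropping special tokens."""
    return "".join(ITOS[i] for i in ids if ITOS[i] not in (PAD, BOS, EOS))
```

A fixed seed makes the train/test split reproducible while you tune hyperparameters; the `<bos>`-prefixed form is what the decoder consumes during teacher forcing.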
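Note that sequence-level accuracy is stricter than per-token accuracy: a single wrong character makes the whole prediction count as a miss. A minimal metric along these lines (the function name is my own) could look like:

```python
def sequence_accuracy(predictions, targets):
    """Fraction of predicted strings that exactly match their target string."""
    assert len(predictions) == len(targets), "mismatched evaluation sets"
    if not targets:
        return 0.0
    exact = sum(p == t for p, t in zip(predictions, targets))
    return exact / len(targets)

# One exact match out of two predictions:
sequence_accuracy(["olleh", "dlrow"], ["olleh", "world"])  # 0.5
```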
Once your model trains successfully, experiment with the following:
- Number of encoder and decoder layers;
- Number of attention heads;
- `d_model` and `d_ff` values;
- Sequence length and dataset size;
- Learning rate and number of training epochs.
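One way to organize these experiments is a small grid sweep. The sketch below only builds the configurations; the specific values are examples, and the commented-out `train_and_eval` call stands in for whatever training loop you wrote for the task above.

```python
from itertools import product

# Example sweep over the knobs listed above; values are illustrative only.
GRID = {
    "n_layers": [1, 2, 4],   # encoder/decoder depth
    "n_heads": [2, 4],       # attention heads (must divide d_model)
    "d_model": [32, 64],     # model width; d_ff is commonly 4 * d_model
    "lr": [1e-3, 3e-4],      # learning rate
}

def configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        cfg = dict(zip(keys, values))
        # Skip invalid combinations: heads must evenly split d_model.
        if cfg["d_model"] % cfg["n_heads"] == 0:
            yield cfg

# for cfg in configs(GRID):
#     acc = train_and_eval(cfg)  # your own training/evaluation function
#     print(cfg, acc)
```

Recording accuracy per configuration makes it much easier to see which knob actually drives the behaviors the next paragraph asks about.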
Observe how each change affects accuracy and training stability, and note any interesting behaviors: for example, at what point does the model start generalizing, and what happens as sequence length grows?