Challenge: Build and Test Your Transformer
Task
Build a transformer from scratch and train it on a synthetic sequence-to-sequence task: string reversal. The input is a sequence of random lowercase letters (e.g., "hello"), and the target is its reverse (e.g., "olleh").
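Generating the synthetic data is straightforward; here is a minimal sketch using only the standard library (the function name and defaults are illustrative, not part of the challenge spec):

```python
import random
import string

def make_dataset(n_pairs, min_len=3, max_len=10, seed=0):
    """Generate (source, target) pairs where the target is the reversed source."""
    rng = random.Random(seed)  # fixed seed keeps train/test splits reproducible
    pairs = []
    for _ in range(n_pairs):
        length = rng.randint(min_len, max_len)
        src = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        pairs.append((src, src[::-1]))
    return pairs
```

Because the target is a deterministic function of the input, any failure to reach high accuracy points at the model or training loop rather than noise in the data.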
Use only the components you implemented in previous chapters — scaled dot-product attention, multi-head attention, positional encoding, encoder and decoder blocks, layer normalization, and feed-forward sublayers. Do not use external transformer implementations.
Your implementation should:
- Generate a synthetic dataset of random lowercase strings and their reverses;
- Tokenize strings at the character level and build a vocabulary;
- Assemble a full encoder-decoder transformer from your own components;
- Implement a training loop with cross-entropy loss;
- Evaluate sequence-level accuracy on a held-out test set — the percentage of inputs where the predicted output exactly matches the reversed string.
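The tokenization and evaluation requirements above can be sketched model-agnostically; the special-token names and the `predict` callback below are assumptions for illustration, not a prescribed interface:

```python
import string

# Character-level vocabulary with special tokens (names are illustrative).
PAD, BOS, EOS = "<pad>", "<bos>", "<eos>"
VOCAB = [PAD, BOS, EOS] + list(string.ascii_lowercase)
STOI = {ch: i for i, ch in enumerate(VOCAB)}
ITOS = {i: ch for ch, i in STOI.items()}

def encode(s, max_len):
    """Map a string to a fixed-length list of token ids, padded with PAD."""
    ids = [STOI[BOS]] + [STOI[c] for c in s] + [STOI[EOS]]
    return ids + [STOI[PAD]] * (max_len - len(ids))

def decode(ids):
    """Map token ids back to a string, stopping at EOS and skipping specials."""
    chars = []
    for i in ids:
        tok = ITOS[i]
        if tok == EOS:
            break
        if tok not in (PAD, BOS):
            chars.append(tok)
    return "".join(chars)

def sequence_accuracy(predict, test_pairs):
    """Fraction of pairs whose predicted string exactly matches the target."""
    correct = sum(1 for src, tgt in test_pairs if predict(src) == tgt)
    return correct / len(test_pairs)
```

Note that sequence-level accuracy is stricter than per-token accuracy: a single wrong character anywhere in the output counts the whole sequence as incorrect.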
Once your model trains successfully, experiment with the following:
- Number of encoder and decoder layers;
- Number of attention heads;
- `d_model` and `d_ff` values;
- Sequence length and dataset size;
- Learning rate and number of training epochs.
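A reasonable way to organize these experiments is to keep a single baseline configuration and vary one knob at a time. The values below are an illustrative starting point, not prescribed by the challenge:

```python
# Illustrative baseline; tune against your own implementation.
baseline = {
    "num_encoder_layers": 2,
    "num_decoder_layers": 2,
    "num_heads": 4,
    "d_model": 64,
    "d_ff": 256,
    "max_seq_len": 12,
    "dataset_size": 10_000,
    "learning_rate": 3e-4,
    "epochs": 20,
}

# d_model must be divisible by num_heads so each head gets an equal slice
# of the model dimension (d_model // num_heads per head).
assert baseline["d_model"] % baseline["num_heads"] == 0
```

Logging accuracy for each single-knob variation against this baseline makes it easy to attribute changes in behavior to one hyperparameter at a time.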
Observe how each change affects accuracy and training stability, and note any interesting behaviors: for example, at what point does the model start to generalize, and what happens as sequence length grows?