Transformers for Natural Language Processing

Why Positional Encoding Matters


Without a way to encode the position of each token, a Transformer would perceive the input as a bag of words, losing critical information about sentence structure and meaning. For instance, the sentences "the cat chased the mouse" and "the mouse chased the cat" contain the same words but convey entirely different meanings due to word order. To address this, positional encoding is introduced to inject information about the order of tokens into the model, allowing it to distinguish between different arrangements of the same words and thus better understand the context and meaning of text.
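The order-blindness described above is easy to demonstrate: if a model only sees word counts, the two example sentences become literally identical. A minimal sketch using Python's standard library:

```python
from collections import Counter

# Two sentences with the same words but opposite meanings.
s1 = "the cat chased the mouse".split()
s2 = "the mouse chased the cat".split()

# A bag-of-words view keeps only word counts and discards order,
# so the two sentences collapse into the same representation.
print(Counter(s1) == Counter(s2))  # True — indistinguishable without position info
print(s1 == s2)                    # False — the actual sequences differ
```

Positional encoding restores exactly the information the `Counter` comparison throws away: which word occupies which slot in the sequence.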

There are multiple strategies for adding positional information to token embeddings in Transformers. The two most common are sinusoidal and learned positional encodings. Each approach has unique characteristics and trade-offs, especially when applied to various NLP tasks.

| Strategy   | Description                                               | Pros                                           | Cons                                                        |
| ---------- | --------------------------------------------------------- | ---------------------------------------------- | ----------------------------------------------------------- |
| Sinusoidal | Uses fixed sine and cosine functions to encode positions. | Requires no extra trainable parameters.        | Less flexible for dataset-specific patterns.                |
| Learned    | Learns a unique embedding vector for each position.       | Adapts more effectively to a specific dataset. | May not generalize well to sequences longer than those seen in training. |
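The sinusoidal strategy can be sketched in a few lines of NumPy. This follows the standard formulation (sine on even dimensions, cosine on odd dimensions, with frequencies scaled by powers of 10000); the function name and shapes here are illustrative, not from any particular library:

```python
import numpy as np

def sinusoidal_encoding(seq_len: int, d_model: int) -> np.ndarray:
    """Fixed sine/cosine positional encodings (d_model assumed even)."""
    positions = np.arange(seq_len)[:, np.newaxis]      # shape (seq_len, 1)
    dims = np.arange(0, d_model, 2)[np.newaxis, :]     # shape (1, d_model / 2)
    # Each even dimension pair gets its own wavelength, from 2π up to 10000·2π.
    angles = positions / np.power(10000.0, dims / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)   # even dimensions: sine
    pe[:, 1::2] = np.cos(angles)   # odd dimensions: cosine
    return pe

pe = sinusoidal_encoding(seq_len=50, d_model=128)
print(pe.shape)  # (50, 128) — one encoding vector per position, added to token embeddings
```

A learned encoding, by contrast, would simply be a trainable embedding table indexed by position (e.g. `nn.Embedding(max_len, d_model)` in PyTorch), which is where the flexibility and the length-generalization trade-off in the table above come from.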


Section 1. Chapter 6
