Learn Queries, Keys, and Values: The Building Blocks | Foundations of Attention
Attention Mechanisms Explained

Queries, Keys, and Values: The Building Blocks

To understand how attention works in modern neural networks, you need to grasp the roles of queries, keys, and values. These are not just abstract terms: they are vectors that help the model decide what information to focus on. Imagine each word or element in your input is represented as a vector in a high-dimensional space. The query is like a search vector sent from one position, asking: What should I pay attention to? The key is a vector attached to each possible source of information, signaling: Here's what I have to offer. The value is the actual content that is passed along when its key matches the query. Geometrically, you can picture the query and key as arrows in space; their alignment (how closely they point in the same direction, typically measured with a dot product) determines how much of the value flows into the result.
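The geometric picture can be made concrete with a small sketch. Assuming alignment is measured with a dot product (and, once normalized by vector length, cosine similarity), the snippet below compares one query against three hand-picked keys. The vectors and the helper names `dot` and `cosine` are illustrative assumptions, not part of any library.

```python
import math


def dot(a, b):
    """Dot product: large and positive when two vectors point the same way."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity: the dot product with vector lengths divided out."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical 2-D vectors chosen by hand for illustration.
query = [1.0, 0.0]
key_aligned = [2.0, 0.0]      # same direction as the query
key_orthogonal = [0.0, 3.0]   # at a right angle to the query
key_opposed = [-1.0, 0.0]     # opposite direction

for name, key in [
    ("aligned", key_aligned),
    ("orthogonal", key_orthogonal),
    ("opposed", key_opposed),
]:
    print(name, "dot =", dot(query, key), "cosine =", cosine(query, key))
```

The aligned key scores highest, the orthogonal key scores zero, and the opposed key scores negative, which is the ordering the arrow picture suggests.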

Note
Definition

Queries, keys, and values are all vectors, but they have distinct roles:

  • The query represents what you are searching for (from the current position or token);
  • The key encodes what each position or token offers for matching;
  • The value contains the actual information to be retrieved when the key matches the query.

Each is necessary: without queries, you would not know what you are searching for; without keys, there would be nothing to match against; without values, you would have nothing to retrieve.
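The three roles can be put together in a minimal sketch of attention for a single query. It assumes the commonly used scaled dot-product form: each score is the query-key dot product divided by the square root of the vector dimension, scores are turned into weights with a softmax, and the output is the weighted sum of the values. The function names `softmax` and `attend` and all vectors are hypothetical choices for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Return (weights, output) for one query over a list of keys and values."""
    d_k = len(query)
    # Query-key matching: one score per source position.
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    # Value retrieval: blend the values according to the weights.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output


# Hypothetical toy data: three source positions with 2-D keys and values.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]

weights, output = attend(query, keys, values)
print("weights:", weights)
print("output:", output)
```

Removing any one ingredient breaks the computation: without the query there are no scores, without the keys there is nothing to score against, and without the values the weights have nothing to blend.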

Consider a sentence where you want to determine which words are most relevant to the word "bank", such as He sat by the river bank. The model creates a query vector for bank, which is compared to the key vectors of all words in the sentence. If the key for river aligns closely with the query for bank, the model has evidence that bank refers to a riverbank, not a financial institution. The value vector for river is then weighted heavily when the representation of bank is updated. This interaction of query searching, key matching, and value passing lets the model focus on the most contextually relevant information at each step.
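As a rough illustration of the bank example, the sketch below scores a query for bank against a key for every token in a sentence that contains river. The numbers are invented by hand so that the river key aligns with the bank query; in a trained model these vectors would be learned, and nothing here reflects real model weights.

```python
import math

# Hypothetical tokens and hand-picked 2-D key vectors (not learned values).
tokens = ["He", "sat", "by", "the", "river", "bank"]
keys = [
    [0.1, 0.9],  # He
    [0.2, 0.6],  # sat
    [0.0, 0.3],  # by
    [0.0, 0.1],  # the
    [1.0, 0.1],  # river
    [0.6, 0.3],  # bank
]
query_bank = [2.0, 0.0]  # hypothetical query vector for "bank"

# Dot-product score between the query and each key.
scores = [sum(q * k for q, k in zip(query_bank, key)) for key in keys]

# Softmax converts the scores into attention weights that sum to 1.
m = max(scores)
exps = [math.exp(s - m) for s in scores]
weights = [e / sum(exps) for e in exps]

# Rank tokens by how much attention "bank" pays to them.
ranked = sorted(zip(tokens, weights), key=lambda pair: pair[1], reverse=True)
best = ranked[0][0]
for token, weight in ranked:
    print(f"{token:>5}: {weight:.3f}")
```

With these made-up vectors, river receives the largest weight, so its value vector would contribute most to the updated representation of bank.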

Which statement best describes the function of keys in the attention mechanism?

