Learn Queries, Keys, and Values: The Building Blocks | Foundations of Attention
Attention Mechanisms Explained

Queries, Keys, and Values: The Building Blocks

To understand how attention works in modern neural networks, you need to grasp the roles of queries, keys, and values. These are not just abstract terms: they are vectors that help the model decide which information to focus on. Imagine that each word or element in your input is represented as a point in a high-dimensional space. The query is a search vector sent from one position, asking: What should I pay attention to? The key is a vector attached to each possible source of information, signaling: Here's what I have to offer. The value is the actual content that is passed along, in proportion to how well its key matches the query. Geometrically, you can picture the query and key as arrows in space; their alignment (how closely they point in the same direction, commonly measured with a dot product) determines how much of the value's information flows through.
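The geometric picture can be made concrete with a small sketch. The vectors below are hypothetical 2-dimensional examples chosen by hand (real models learn vectors with many more dimensions), and the helper names `dot` and `cosine` are illustrative, not part of any library. The sketch compares one query against three keys using the dot product and cosine similarity.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine of the angle between a and b (1 = same direction, -1 = opposite)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical vectors, hand-picked for illustration only.
query = [1.0, 0.0]
key_aligned = [2.0, 0.0]      # points the same way as the query
key_orthogonal = [0.0, 3.0]   # perpendicular to the query
key_opposed = [-1.0, 0.0]     # points the opposite way

print(dot(query, key_aligned), cosine(query, key_aligned))        # 2.0 1.0
print(dot(query, key_orthogonal), cosine(query, key_orthogonal))  # 0.0 0.0
print(dot(query, key_opposed), cosine(query, key_opposed))        # -1.0 -1.0
```

The better a key lines up with the query, the larger the score, and the more that key's value will count in the result.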

Definition

Queries, keys, and values are all vectors, but they have distinct roles:

  • The query represents what you are searching for (from the current position or token);
  • The key encodes what each position or token offers for matching;
  • The value contains the actual information to be retrieved when its key matches the query.

Each is necessary: without queries, you would not know what you are searching for; without keys, there would be nothing to match against; without values, you would have nothing to retrieve.
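To see how the three roles fit together, here is a minimal sketch for a single query. It assumes scaled dot-product scoring followed by a softmax, which is the usual choice in Transformer-style attention; the text above does not fix a particular scoring function. The function names `softmax` and `attend` and the example vectors are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Single-query attention: score each key, then mix the values."""
    d = len(query)
    # Query vs. key: how well does each source match what we are looking for?
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Values: the content that is actually retrieved, weighted by the match.
    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Hypothetical toy data: two sources, the first matching the query better.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

weights, output = attend(query, keys, values)
print(weights)  # the first weight is the larger one
print(output)   # dominated by the first value
```

Removing any one ingredient breaks the function: with no query there is nothing to score against, with no keys there are no scores, and with no values there is nothing to mix.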

Consider a sentence where you want to determine which words are most relevant to the word "bank" in the phrase He sat by the bank of the river. The model creates a query vector for bank, which is compared to the key vectors of all words in the sentence. If the key for river aligns closely with the query for bank, river receives a high attention weight, which suggests that bank refers to a riverbank rather than a financial institution. The value vector for river then contributes strongly to the updated representation of bank. This dynamic interaction (query searching, key matching, and value passing) lets the model focus on the most contextually relevant information at each step.
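The bank example can be played through with toy numbers. Everything in the sketch below is an assumption made for illustration: the 2-dimensional vectors are hand-picked rather than learned (dimension 0 loosely stands for "water", dimension 1 for "money"), only a few words of the context are included, and scoring uses a scaled dot product with a softmax.

```python
import math

# Hypothetical key and value vectors for a few context words (not learned).
keys = {
    "He":    [0.0, 0.0],
    "sat":   [0.2, 0.0],
    "bank":  [0.5, 0.5],
    "river": [2.0, 0.0],
}
values = {
    "He":    [0.0, 0.0],
    "sat":   [0.1, 0.0],
    "bank":  [0.5, 0.5],
    "river": [1.0, 0.0],
}
query_bank = [1.5, 0.2]  # hypothetical query vector for "bank"

# Score every word's key against the query for "bank".
scale = math.sqrt(len(query_bank))
scores = {w: sum(q * k for q, k in zip(query_bank, key)) / scale
          for w, key in keys.items()}

# Softmax over the scores gives the attention weights.
top = max(scores.values())
exps = {w: math.exp(s - top) for w, s in scores.items()}
total = sum(exps.values())
weights = {w: e / total for w, e in exps.items()}

# The new representation of "bank" is the weighted mix of the values.
context = [sum(weights[w] * values[w][i] for w in values) for i in range(2)]

most_relevant = max(weights, key=weights.get)
print(most_relevant)  # river
print(context)        # the "water" dimension outweighs the "money" dimension
```

With these numbers river gets the largest weight, so its value pulls the representation of bank toward the riverbank sense.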
