Learn Queries, Keys, and Values: The Building Blocks | Foundations of Attention
Attention Mechanisms Explained

Queries, Keys, and Values: The Building Blocks

To understand how attention works in modern neural networks, you need to grasp the roles of queries, keys, and values. These are not just abstract terms; they are vectors that help the model decide what information to focus on. Imagine each word or element in your input is represented as a point in a high-dimensional space. The query is like a search vector sent from one position, asking: "What should I pay attention to?" The key is a vector attached to each possible source of information, signaling: "Here's what I have to offer." The value is the actual content that is passed along to the extent that its key matches the query. Geometrically, you can picture the query and key as arrows in space; their alignment (how closely they point in the same direction) determines how much information flows from the value.
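The geometric picture can be made concrete with a small sketch. The vectors below are hand-picked, hypothetical 2-d examples rather than anything a trained model produced, and `cosine_alignment` is an illustrative helper name. In practice, attention layers usually score alignment with a (scaled) dot product rather than a normalized cosine, so both are shown.

```python
import numpy as np

def cosine_alignment(query, keys):
    """Cosine similarity between one query and each row of `keys`."""
    query = np.asarray(query, dtype=float)
    keys = np.asarray(keys, dtype=float)
    dots = keys @ query  # one dot product per key
    norms = np.linalg.norm(keys, axis=1) * np.linalg.norm(query)
    return dots / norms

# Hypothetical vectors, chosen so the geometry is easy to read.
query = np.array([1.0, 0.0])
keys = np.array([
    [2.0, 0.0],   # points the same way as the query
    [0.0, 3.0],   # perpendicular to the query
    [-1.0, 0.0],  # points the opposite way
])

dot_scores = keys @ query                  # raw dot products: [2, 0, -1]
alignment = cosine_alignment(query, keys)  # direction only: [1, 0, -1]
print(dot_scores, alignment)
```

The first key is perfectly aligned with the query, the second is unrelated, and the third points away from it, so only the first would pass much of its value along.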

Note
Definition

Queries, keys, and values are all vectors, but they have distinct roles:

  • The query represents what you are searching for (from the current position or token);
  • The key encodes what each position or token offers for matching;
  • The value contains the actual information to be retrieved if the query and key are relevant.

Each is necessary: without queries, you would not know what you are searching for; without keys, there would be nothing to match against; without values, you would have nothing to retrieve.
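A minimal sketch of how the three roles fit together for a single query, assuming the common scaled dot-product formulation (scores divided by the square root of the vector dimension, then passed through a softmax). All numbers are hypothetical, and `attend` and `softmax` are illustrative helper names, not part of any specific library.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-d array."""
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    return exps / exps.sum()

def attend(query, keys, values):
    """Return (weights, output) for one query over all keys and values."""
    d = query.shape[-1]
    scores = keys @ query / np.sqrt(d)  # query-key matching
    weights = softmax(scores)           # non-negative, sums to 1
    output = weights @ values           # weighted mix of the values
    return weights, output

# Hypothetical data: three positions, 4-d queries/keys, 2-d values.
query = np.array([1.0, 0.0, 0.0, 0.0])
keys = np.array([
    [4.0, 0.0, 0.0, 0.0],  # matches the query
    [0.0, 4.0, 0.0, 0.0],  # does not match
    [0.0, 0.0, 4.0, 0.0],  # does not match
])
values = np.array([
    [10.0, 0.0],
    [0.0, 10.0],
    [5.0, 5.0],
])

weights, output = attend(query, keys, values)
print(weights, output)
```

Note that the keys are used only for matching and the values only for the content that is returned, which is why the two can have different dimensions here.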

Consider a sentence where you want to determine which words are most relevant to the word "bank" in the phrase "He sat by the river bank." The model creates a query vector for "bank", which is compared to the key vectors of all words in the sentence. If the key for "river" aligns closely with the query for "bank", "river" receives a large attention weight, which suggests that "bank" refers to a riverbank, not a financial institution. The value vector for "river" is then used to inform the representation of "bank". This dynamic interaction (query searching, key matching, and value passing) lets the model focus on the most contextually relevant information at each step.
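The "bank" example can be sketched as a toy computation, assuming the sentence contains the word "river" (here, the tokens of "He sat by the river bank"). The key and query vectors are hand-picked and hypothetical; a real model would learn them, and the weights it produces could look quite different.

```python
import numpy as np

tokens = ["He", "sat", "by", "the", "river", "bank"]

# Hypothetical 3-d key vectors, one per token (illustration only).
keys = np.array([
    [0.1, 0.0, 0.2],  # He
    [0.0, 0.3, 0.1],  # sat
    [0.2, 0.1, 0.0],  # by
    [0.1, 0.1, 0.1],  # the
    [2.0, 0.1, 0.0],  # river
    [0.5, 0.5, 0.0],  # bank
])

# Hypothetical query vector for the token "bank".
query_bank = np.array([2.0, 0.0, 0.0])

# Scaled dot-product scores, then a softmax to get attention weights.
scores = keys @ query_bank / np.sqrt(query_bank.shape[0])
weights = np.exp(scores - scores.max())
weights /= weights.sum()

for token, weight in zip(tokens, weights):
    print(f"{token:>6}: {weight:.3f}")

most_relevant = tokens[int(np.argmax(weights))]
print("Most relevant to 'bank':", most_relevant)
```

With these made-up vectors, "river" receives the largest weight, so its value vector would contribute most to the updated representation of "bank".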

