Transformer Architecture

Challenge: Implementing Scaled Dot-Product Attention

Task

You now have all the pieces to implement scaled dot-product attention from scratch. Using the formula from the previous chapter, write a function scaled_dot_product_attention that:

  1. Takes Q, K, V tensors of shape (batch_size, seq_len, d_k) as input;
  2. Accepts an optional mask tensor of shape (batch_size, seq_len_q, seq_len_k) — when provided, positions where mask == 0 should be set to -inf before softmax;
  3. Returns the output tensor and the attention weights.

Implement the function locally.
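
For reference, the three requirements above can be sketched roughly as follows. This is a minimal sketch, not the course's official solution: it assumes the formula from the previous chapter is the standard softmax(Q K^T / sqrt(d_k)) V, and it uses NumPy arrays as a stand-in for framework tensors. The example shapes and the causal mask in the usage lines are illustrative assumptions.

```python
import numpy as np


def scaled_dot_product_attention(Q, K, V, mask=None):
    """Return (output, attention_weights) for batched Q, K, V arrays."""
    d_k = Q.shape[-1]
    # Similarity scores of shape (batch_size, seq_len_q, seq_len_k), scaled by sqrt(d_k).
    scores = Q @ np.swapaxes(K, -2, -1) / np.sqrt(d_k)
    if mask is not None:
        # Positions where mask == 0 become -inf, so softmax gives them zero weight.
        scores = np.where(mask == 0, -np.inf, scores)
    # Numerically stable softmax over the key axis.
    # Caveat: a query row that is masked everywhere would produce NaN here.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights


# Illustrative usage with assumed sizes: batch_size=2, seq_len=4, d_k=8.
rng = np.random.default_rng(0)
Q = rng.normal(size=(2, 4, 8))
K = rng.normal(size=(2, 4, 8))
V = rng.normal(size=(2, 4, 8))
# Causal mask (assumption for the demo): each position attends only to itself and earlier ones.
mask = np.broadcast_to(np.tril(np.ones((4, 4))), (2, 4, 4))
output, weights = scaled_dot_product_attention(Q, K, V, mask)
print(output.shape, weights.shape)
```

Each row of the returned weights sums to 1, and masked positions receive exactly zero weight. In a framework such as PyTorch the same steps would likely map onto a matrix multiply, a masked fill, and a softmax over the last dimension.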

Section 1. Chapter 3
