Learn Challenge: Q-table Update with Q-learning | Classic RL Algorithms: Q-learning & SARSA
Hands-On Classic RL Algorithms with Python
Section 1. Chapter 3
Challenge: Q-table Update with Q-learning

Task

Challenge: Given a Q-table and a sequence of transitions, update the Q-values using the Q-learning rule.

  • For each transition in transitions, update the Q-value in q_table for the given state and action using the Q-learning update formula.
  • Each transition is a tuple: (state, action, reward, next_state).
  • Use the learning rate alpha and discount factor gamma for the update.
  • The Q-learning update formula is:
    Q[state, action] = Q[state, action] + alpha * (reward + gamma * max(Q[next_state]) - Q[state, action]).
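One way to read the update rule is sketched below. This is a minimal illustration, not the course's reference solution. It assumes q_table is a list of lists indexed as q_table[state][action] (with a NumPy array, q_table[state, action] would work the same way). The function name update_q_table and the sample values are hypothetical. Terminal states are not treated specially, since the task statement does not mention them.

```python
def update_q_table(q_table, transitions, alpha, gamma):
    """Apply the Q-learning update in place, one transition at a time."""
    for state, action, reward, next_state in transitions:
        # Best value currently estimated for the next state (greedy over actions)
        best_next = max(q_table[next_state])
        # Temporal-difference target and error for this transition
        td_target = reward + gamma * best_next
        td_error = td_target - q_table[state][action]
        # Move the current estimate a fraction alpha toward the target
        q_table[state][action] += alpha * td_error
    return q_table


# Hypothetical data: 3 states, 2 actions, all Q-values start at zero
q_table = [[0.0, 0.0] for _ in range(3)]
transitions = [(0, 1, 1.0, 1), (1, 0, 2.0, 2), (0, 1, 0.0, 1)]
update_q_table(q_table, transitions, alpha=0.5, gamma=0.9)
```

Because the updates are applied sequentially, the third transition already sees the value written by the second: with these inputs q_table[1][0] ends at 1.0 and q_table[0][1] ends close to 0.7.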

Solution
