Challenge: Q-table Update with Q-learning | Classic RL Algorithms: Q-learning & SARSA
Hands-On Classic RL Algorithms with Python
Section 1. Chapter 3
Challenge: Q-table Update with Q-learning


Task


Given a Q-table and a sequence of transitions, update the Q-values using the Q-learning rule.

  • For each transition in transitions, update the Q-value in q_table for the given state and action using the Q-learning update formula.
  • Each transition is a tuple: (state, action, reward, next_state).
  • Use the learning rate alpha and discount factor gamma for the update.
  • The Q-learning update formula is:
    Q[state, action] = Q[state, action] + alpha * (reward + gamma * max(Q[next_state]) - Q[state, action]).

Solution
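One way the update loop could look — a minimal sketch, not the course's official solution, assuming q_table is a 2-D NumPy array indexed by [state, action] and that updates happen in place:

```python
import numpy as np

def update_q_table(q_table, transitions, alpha, gamma):
    """Apply the Q-learning update to q_table in place for each transition."""
    for state, action, reward, next_state in transitions:
        # Greedy estimate of the next state's value: max over all actions.
        best_next = np.max(q_table[next_state])
        # TD error: target (reward + gamma * best_next) minus current estimate.
        td_error = reward + gamma * best_next - q_table[state, action]
        q_table[state, action] += alpha * td_error
    return q_table

# Tiny example: 2 states x 2 actions, a single transition.
q = np.zeros((2, 2))
update_q_table(q, [(0, 1, 1.0, 1)], alpha=0.5, gamma=0.9)
# Q[0, 1] becomes 0 + 0.5 * (1.0 + 0.9 * 0 - 0) = 0.5
```

Because max(Q[next_state]) uses the best next action regardless of which action the agent actually takes next, this is the off-policy Q-learning update, which is what distinguishes it from SARSA.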

