Reinforcement Learning Theory for Beginners

The Agent-Environment Loop: A Maze Analogy


At the heart of reinforcement learning is the Learning Loop: a continuous cycle in which an agent interacts with its environment, takes actions, receives feedback, and gradually learns to make better decisions. Imagine a robot placed in a maze. The robot is the agent, and the maze is its environment. At every step, the robot looks around to see where it is, decides which way to move, and then finds out what happens as a result: it may hit a wall, move closer to the exit, or find the way out. Each outcome provides feedback that helps the robot adjust its future choices. By repeating this loop many times, the robot becomes better at navigating the maze.

initialize agent_knowledge
done = False

while not done:
    state = observe_environment()                              # where am I?
    action = agent_selects_action(state)                       # which way to move?
    new_state, reward, done = environment_responds(action)     # what happened? did I reach the exit?
    agent_updates_knowledge(state, action, reward, new_state)  # remember the outcome

Here is a breakdown of each part of this loop using the maze analogy:

  • The agent observes the environment: the robot checks its current position in the maze and notes any nearby walls or open paths;
  • The agent selects an action: based on what it sees, the robot decides whether to move forward, turn left, turn right, or stay still;
  • The environment responds: after the robot moves, the maze returns the robot's new position together with a reward signal, which is negative if the robot bumps into a wall, positive if it moves closer to the exit, and largest if it finds the exit itself;
  • The agent updates its knowledge: the robot remembers what happened as a result of its action, so next time it faces a similar situation, it can make a better decision.

This loop repeats, allowing the robot to learn from its successes and mistakes, improving its ability to solve the maze.
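
To make the loop concrete, here is a minimal runnable sketch in Python. Several details are assumptions made for illustration and do not come from the lesson: the 4x4 maze layout, the reward values (-1 for hitting a wall, -0.1 per ordinary move, +10 for reaching the exit), the use of compass moves instead of forward/turn actions, and the update rule, which is tabular Q-learning, one common way to implement "the agent updates its knowledge". The small per-move cost stands in for the "closer to the exit" feedback by making shorter routes more rewarding.

```python
import random

# Hypothetical 4x4 maze: S = start, E = exit, # = wall, . = open floor.
MAZE = ["S..#",
        ".#..",
        ".#.#",
        "...E"]
START = (0, 0)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def environment_responds(state, action):
    """Return (new_state, reward, done) for taking `action` in `state`."""
    d_row, d_col = ACTIONS[action]
    row, col = state[0] + d_row, state[1] + d_col
    inside = 0 <= row < len(MAZE) and 0 <= col < len(MAZE[0])
    if not inside or MAZE[row][col] == "#":
        return state, -1.0, False        # bumped into a wall: stay put, penalty
    if MAZE[row][col] == "E":
        return (row, col), 10.0, True    # found the exit: large reward, episode ends
    return (row, col), -0.1, False       # ordinary move: small step cost (assumed)


def agent_selects_action(q_table, state, epsilon, rng):
    """Epsilon-greedy: usually pick the best-known action, sometimes explore."""
    if rng.random() < epsilon:
        return rng.choice(list(ACTIONS))
    return max(ACTIONS, key=lambda a: q_table.get((state, a), 0.0))


def agent_updates_knowledge(q_table, state, action, reward, new_state, done,
                            alpha=0.5, gamma=0.9):
    """Q-learning update (an assumed choice of learning rule)."""
    best_next = 0.0 if done else max(
        q_table.get((new_state, a), 0.0) for a in ACTIONS)
    old = q_table.get((state, action), 0.0)
    q_table[(state, action)] = old + alpha * (reward + gamma * best_next - old)


def run_episode(q_table, epsilon, rng, max_steps=200):
    """Run one pass of the agent-environment loop; return (steps, reached_exit)."""
    state, done, steps = START, False, 0
    while not done and steps < max_steps:
        action = agent_selects_action(q_table, state, epsilon, rng)
        new_state, reward, done = environment_responds(state, action)
        agent_updates_knowledge(q_table, state, action, reward, new_state, done)
        state = new_state
        steps += 1
    return steps, done


def train(episodes=300, seed=0):
    """Repeat the loop over many episodes, keeping the same knowledge table."""
    rng = random.Random(seed)
    q_table = {}
    for _ in range(episodes):
        run_episode(q_table, epsilon=0.2, rng=rng)
    return q_table


q_table = train()
# Follow the learned policy with no random exploration (it still keeps learning).
steps, reached_exit = run_episode(q_table, epsilon=0.0, rng=random.Random(0))
print("steps:", steps, "reached exit:", reached_exit)
```

The shortest route through this particular maze takes six moves, so a well-trained agent should get close to that number. Raising `epsilon` makes the robot explore more; lowering it makes the robot rely more on what it already knows.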


In the context of the agent-environment loop described in the chapter, what is the primary role of the agent?
