Ex. No: 4
Date:

Q-LEARNING
Aim:
To find the optimal path in an environment using the Q-learning algorithm.
Procedure:
1. Initialize Q-learning parameters.
2. Create and initialize the Q-table.
3. Define state transition matrix and rewards.
4. Implement Q-learning with an epsilon-greedy policy (the update rule used here is shown after this list).
5. Test the learned policy by printing the Q-table and finding the optimal path.
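The Q-value update applied in step 4 is the standard Q-learning rule, where alpha is the learning rate, gamma is the discount factor, r is the observed reward, and s' is the next state:

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))

With probability epsilon the epsilon-greedy policy picks a random action (exploration); otherwise it picks the action with the highest Q-value for the current state (exploitation).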
Program:
import numpy as np
# Define the Q-learning parameters
num_states = 6
num_actions = 2
learning_rate = 0.1
discount_factor = 0.9
num_episodes = 1000
epsilon = 0.1  # Exploration rate for the epsilon-greedy policy
# Initialize the Q-table with zeros
Q = np.zeros((num_states, num_actions))
# Define the state transition matrix and rewards
T = np.array([[-1, 2], [0, 3], [1, 4], [2, 5], [3, -1], [4, 5]])
rewards = np.array([[-1, -1], [-1, -1], [-1, -1], [-1, -1], [-1, 100], [10, 100]])
#Q-learning algorithm
for episode in range(num_episodes):
    state = 0  # Initial state is 0
    while state != 5:  # Continue until reaching the goal (state 5)
        # Choose an action epsilon-greedily (exploration vs. exploitation)
        if np.random.rand() < epsilon:
            action = np.random.choice(num_actions)
        else:
            action = np.argmax(Q[state, :])
        # Take the chosen action and observe the next state and reward
        new_state = T[state, action]
        reward = rewards[state, action]
        # Update the Q-value using the Q-learning update rule
        Q[state, action] += learning_rate * (reward + discount_factor * np.max(Q[new_state, :]) - Q[state, action])
        # Move to the next state
        state = new_state
# Print the learned Q-table
print("Learned Q-table:")
print(Q)
# Test the learned policy
state = 0
path = [state]
while state != 5:
    action = np.argmax(Q[state, :])
    new_state = T[state, action]
    path.append(new_state)
    state = new_state
print("Optimal path:", path)
Output:
Learned Q-table:
[[8.98952915e+02 6.65250406e+02]
 [3.94719551e+02 9.61253887e+00]
 [1.07808528e+02 8.48844034e+02]
 [1.99000000e-01 3.95632623e+01]
 [3.22192736e+02 9.99481917e+02]
 [8.96535389e+02 9.99953717e+02]]
Optimal path: [0, -1, 5]
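Note: the printed path contains -1 because T uses -1 to mark moves that leave the state chain, and NumPy's negative indexing silently wraps T[-1] and Q[-1] around to the last row, so the greedy rollout can step through the "state" -1. A minimal sketch of one way to keep the test rollout on valid states only is given below; this greedy_path helper and its masking logic are an illustrative assumption, not part of the original program.

import numpy as np

def greedy_path(Q, T, start=0, goal=5, max_steps=20):
    """Roll out the greedy policy, skipping actions whose transition
    is marked invalid (-1) in T. Illustrative helper only."""
    state = start
    path = [state]
    for _ in range(max_steps):  # Cap steps so a bad policy cannot loop forever
        if state == goal:
            break
        q = Q[state].astype(float)  # Copy of this state's Q-values
        q[T[state] == -1] = -np.inf  # Mask actions that exit the chain
        action = int(np.argmax(q))
        state = int(T[state, action])
        path.append(state)
    return path

With the Q-table learned above, such a masked rollout would reach the goal state 5 through valid states only.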
Result:
Thus, the optimal path in the environment was found using the Q-learning algorithm, and the program was executed successfully.