Artificial Intelligence
Lecture 14 – Reinforcement Learning
School of Information and Communication
Technology - HUST
1
Reinforcement Learning (RL)
• RL is a machine learning method that optimizes reward
• A class of tasks
• A process of trial-and-error learning
• Good actions are “rewarded”
• Bad actions are “punished”
2
Features of RL
• Learning from numerical rewards
• Interaction with the task; sequences of states,
actions and rewards
• Uncertainty and non-deterministic worlds
• Delayed consequences
• The explore/exploit dilemma
• The whole problem of goal-directed learning
3
Points of view
• From the point of view of agents
• RL is a process of trial-and-error learning
• How much reward will I get if I do this action?
• From the point of view of trainers
• RL is training by rewards and punishments
• Train computers like we train animals
4
Applications of RL
• Robot
• Animal training
• Scheduling
• Games
• Control systems
•…
5
Supervised Learning vs.
Reinforcement Learning
• Supervised learning
• Teacher: Is this an AI course or a Math course?
• Learner: Math
• Teacher: No, AI
• …
• Teacher: Is this an AI course or a Math course?
• Learner: AI
• Teacher: Yes
• Reinforcement learning
• World: You are in state 9. Choose action A or B
• Learner: A
• World: Your reward is 100
• …
• World: You are in state 15. Choose action C or D
• Learner: D
• World: Your reward is 50
6
Examples
• Chess
• Win: +1, lose: -1
• Elevator dispatching
• Reward based on the mean squared time for the elevator to arrive
(an optimization problem)
• Channel allocation for cellular phones
• Lower reward the more calls are blocked
7
Policy, Reward and Goal
• Policy
• defines the agent’s behaviour at a given time
• maps from perceptions to actions
• can be defined by: look-up table, neural net, search algorithm...
• may be stochastic
• Reward Function
• defines the goal(s) in an RL problem
• maps from states, state-action pairs, or state-action-successor-state
triplets to a numerical reward
• goal of the agent is to maximise the total reward in the long run
• the policy is altered to achieve this goal
8
Reward and Return
• The reward function indicates how good things are right now
• But the agent wants to maximize reward in the long term, i.e. over many
time steps
• We refer to long-term (multi-step) reward as return
R_t = r_{t+1} + r_{t+2} + ... + r_T
where
• T is the last time step of the world
9
Discounted Return
• The geometrically discounted model of return
R_t = r_{t+1} + γ·r_{t+2} + γ²·r_{t+3} + ... + γ^(T−t−1)·r_T,   with 0 ≤ γ ≤ 1
• γ is called the discount rate, and is used to
• Bound the infinite sum
• Favor earlier rewards, in other words to give preference to
shorter paths
10
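A minimal sketch of this computation (Python; the function name, the sample reward list, and γ = 0.9 are illustrative assumptions, not taken from the slides):

    def discounted_return(rewards, gamma):
        # rewards[0] is r_{t+1}, rewards[1] is r_{t+2}, and so on
        total = 0.0
        for k, r in enumerate(rewards):
            total += (gamma ** k) * r
        return total

    # three steps with no reward, then a terminal reward of 100
    print(discounted_return([0, 0, 0, 100], gamma=0.9))   # 72.9

Because γ < 1, a reward received k steps later is worth only γ^k as much, which is exactly why shorter paths to the goal are preferred.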
Optimal Policies
• An RL agent adapts its policy in order to increase
return
• A policy π1 is at least as good as a policy π2 if its
expected return is at least as great in each possible
initial state
• An optimal policy π* is at least as good as any other
policy
11
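Written out in symbols (a standard formulation added for reference, not copied from the slide): π1 ≥ π2 if and only if V^π1(s) ≥ V^π2(s) for every state s, where V^π(s) is the expected return from state s when following policy π; an optimal policy π* then satisfies V^π*(s) ≥ V^π(s) for every policy π and every state s.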
Policy Adaptation Methods
• Value function-based methods
• Learn a value function for the policy
• Generate a new policy from the value function
• Q-learning, Dynamic Programming
12
Value Functions
• A value function maps each state to an estimate of
return under a policy
• An action-value function maps from state-action
pairs to estimates of return
• Learning a value function is referred to as the
“prediction” problem, or “policy evaluation” in the
Dynamic Programming literature
13
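As formulas (standard definitions, added here for reference rather than taken from the slide): the state-value function is V^π(s) = E[ R_t | s_t = s, actions chosen by π ], and the action-value function is Q^π(s, a) = E[ R_t | s_t = s, a_t = a, π followed thereafter ], where R_t is the (discounted) return defined earlier.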
Q-learning
• Learns action-values Q(s,a) rather than state-values
V(s)
• Action-value learning rule (deterministic world; T(s,a) is the state
reached by taking action a in state s)
Q(s, a) = R(s, a) + γ · max_a' Q(T(s, a), a')
• Q-learning improves its action-value estimates iteratively until
they converge
14
Q-learning Algorithm
1. Algorithm Q {
2.   For each pair (s,a), initialize Q’(s,a) to zero
3.   Observe the current state s
4.   Iterate forever {
5.     Choose and execute an action a
6.     Get the immediate reward r
7.     Observe the new state s’
8.     Update Q’(s,a):  Q’(s,a) ← r + γ · max_a' Q’(s’,a’)
9.     s ← s’
10.  }
11. }
15
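A runnable sketch of the algorithm above in Python. The slide does not specify how actions are chosen, so the ε-greedy choice below, the episode structure, and the env interface (reset() returning a start state, step(state, action) returning (next_state, reward, done)) are illustrative assumptions; the update of step 8 is the part taken from the pseudocode.

    import random
    from collections import defaultdict

    def q_learning(env, actions, gamma=0.9, epsilon=0.1, episodes=500):
        Q = defaultdict(float)                    # step 2: Q'(s,a) = 0 for every pair
        for _ in range(episodes):
            s = env.reset()                       # step 3: observe the current state s
            done = False
            while not done:                       # step 4: iterate
                # step 5: choose and execute an action (epsilon-greedy is an assumption)
                if random.random() < epsilon:
                    a = random.choice(actions)
                else:
                    a = max(actions, key=lambda x: Q[(s, x)])
                s_next, r, done = env.step(s, a)  # steps 6-7: reward r, new state s'
                # step 8: Q'(s,a) <- r + gamma * max_a' Q'(s',a')
                Q[(s, a)] = r + gamma * max(Q[(s_next, x)] for x in actions)
                s = s_next                        # step 9: s <- s'
        return Q

    # Hypothetical 1-D world: states 0, 1, 2; stepping into state 2 (the goal)
    # pays 100 and ends the episode. Layout and names are illustrative only.
    class ChainEnv:
        def reset(self):
            return 0
        def step(self, state, action):
            nxt = min(2, max(0, state + action))
            return nxt, (100 if nxt == 2 else 0), nxt == 2

    Q = q_learning(ChainEnv(), actions=[-1, +1])
    print(round(Q[(1, +1)]), round(Q[(0, +1)]))   # 100 and 90, as in the grid example below

As in the pseudocode, there is no learning rate: the new estimate simply overwrites the old one, which is appropriate for the deterministic worlds used in the examples that follow; a stochastic environment would instead blend the old and new values.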
Example
• Initially: the immediate rewards of the grid world (100 for an action that
enters the goal state G, 0 for every other action)
• Initialization: all Q-values set to 0
[Figure: the grid world drawn twice, once labelled with the immediate rewards
and once with the all-zero initial Q-values]
16
Example
• Assume γ = 0.9
• Current state: s1
• Go right, reaching s2
• Reward: 0
[Figure: the grid world before and after the move; all Q-values are still 0]
17
Example
• From s2, go right into the goal state G
• Reward: 100
• Update Q(s2, right) ← 100 + 0.9 · max_a' Q(G, a') = 100
[Figure: the grid world with Q(s2, right) set to 100; all other Q-values still 0]
18
Example
• Moving right from s1 again, update Q(s1, right) ← 0 + 0.9 · max_a' Q(s2, a')
= 0.9 · 100 = 90
• The agent is now in s2
[Figure: the grid world with Q(s1, right) = 90 and Q(s2, right) = 100]
19
Example: result of Q-learning
[Figure: the converged Q-values of the grid world with γ = 0.9: actions entering G
are worth 100, and values fall by a factor of 0.9 for each extra step from G
(90, 81, 72, ...)]
20
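These converged values follow directly from the discount factor: an action whose path reaches the goal k steps later is worth γ^k of the goal reward, so with γ = 0.9 the entries are 100, 0.9 × 100 = 90, 0.9² × 100 = 81, and 0.9³ × 100 = 72.9 (shown as 72).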
Exercise
• The agent is in room C of the building
• The goal is to get out of the building
21
Modeling the problem
Immediate reward matrix R (rows: current room, columns: room moved into;
F represents the outside; only non-zero rewards are shown, every other
allowed move has reward 0)

     A    B    C    D    E    F
A
B                             100
C
D
E                             100
F                             100
22
Result
γ = 0.8

     A    B    C    D    E    F
A                        400
B                   320        500
C                   320
D         400  255        400
E   320             320        500
F         400             400  500

Divide all values by 5 to normalize (the largest value becomes 100)
Resulting paths from C: C => D => B => F
or C => D => E => F
23
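The table above can be reproduced with a few lines of Python that simply sweep the update Q(s,a) ← R(s,a) + γ · max_a' Q(s',a') over all room pairs until the values settle (Q-value iteration rather than the sampled algorithm of slide 15). The door list below is an assumption read off the non-zero entries of the result table, since the building plan itself is given only as a figure:

    # Rooms A-E plus F = outside; any move that ends up outside (in F) pays 100.
    doors = [("A", "E"), ("B", "D"), ("B", "F"), ("C", "D"), ("D", "E"), ("E", "F"), ("F", "F")]

    moves = {}                                  # moves[s] = rooms reachable from s in one step
    for x, y in doors:
        moves.setdefault(x, []).append(y)
        if x != y:
            moves.setdefault(y, []).append(x)

    gamma = 0.8
    Q = {(s, a): 0.0 for s in moves for a in moves[s]}

    for _ in range(100):                        # sweep until the values stop changing
        for (s, a) in Q:
            reward = 100 if a == "F" else 0     # reward for the room moved into
            Q[(s, a)] = reward + gamma * max(Q[(a, a2)] for a2 in moves[a])

    for pair in [("B", "F"), ("C", "D"), ("D", "C"), ("A", "E")]:
        print(pair, round(Q[pair]))             # 500, 320, 256, 400

The sweep converges to the values in the table (the table lists 255 for D => C; the exact fixed point of the update is 0.8 × 320 = 256). Dividing everything by 5 rescales the best value to 100, and reading off the best action in each room gives the paths C => D => B => F and C => D => E => F.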