RL Presentation2

PRESENTATION

• Topic: Reinforcement Learning

• Prepared and Presented by:


Shahbaz Saeed
Muzahir Mehdi
Touqeer Awan
REINFORCEMENT LEARNING

• Reinforcement Learning is a feedback-based machine learning
approach in which an agent learns which actions to perform by
observing the environment and the results of its actions.

• For each correct action, the agent gets positive feedback, and
for each incorrect action, the agent gets negative feedback or
penalty.
Elements of Reinforcement Learning

• The agent, or learner

• The environment the agent interacts with

• The policy that the agent follows to take actions

• The reward signal that the agent observes upon taking actions
REINFORCEMENT LEARNING
• The agent interacts with the environment and identifies the possible actions.

• The primary goal of an agent in reinforcement learning is to choose actions,
based on what it observes in the environment, that earn the maximum positive reward.

• In reinforcement learning, the agent learns automatically from feedback, without
any labeled data, unlike supervised learning.

• Since there is no labeled data, the agent can learn only from its own
experience.
REINFORCEMENT LEARNING
• The agent does not just search blindly; it tries to be smart about it.

• Reinforcement learning is used to solve a specific type of problem in which decision
making is sequential and the goal is long-term, such as game playing and robotics.

Why do we need reinforcement learning?


• 1. To solve complex problems in uncertain environments.
• 2. To enable agents to learn from their own experiences.
• 3. To develop agents that can adapt to new situations.
Types of Reinforcement learning
• Positive Reinforcement Learning
  - A behavior recurs because it earned positive rewards.
  - Positive rewards increase the strength and frequency of a specific behavior.
  - This encourages the agent to execute similar actions that yield the maximum reward.

• Negative Reinforcement Learning
  - Negative rewards are used as a deterrent to weaken a behavior and to avoid it.
  - Negative rewards decrease the strength and frequency of a specific behavior.


How Does Reinforcement Learning
Work?
• To understand how RL works, we need to consider two main
things:

Environment: It can be anything, such as a room, a maze, or a football ground.

Agent: An intelligent agent, such as an AI robot.


• This maze contains an S6 block, which is a wall, an S8 block,
which is a fire pit, and an S4 block, which is a diamond
block.

• The agent cannot cross the S6 block, as it is a solid wall.

• If the agent reaches the S4 block, it gets a +1 reward; if it
reaches the fire pit, it gets a -1 reward.

• It can take four actions: move up, move down, move left,
and move right.
• It is then difficult for the agent to decide
whether it should go up or down, as each block has the
same value.

• So the above approach is not suitable for guiding the agent to
the destination.

• Hence, to solve the problem, we will use the Bellman
equation, which is the main concept behind
reinforcement learning.
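
The maze rules described above can be sketched in a few lines of Python. This is only an illustration: the slides do not show the grid layout, so the 3x4 row-by-row numbering below is an assumption, as are the zero reward for ordinary moves and the names `step`, `ACTIONS`, `N_ROWS`, and `N_COLS`.

```python
# Assumed layout (not given in the slides): a 3x4 grid numbered row by row.
#   S1  S2  S3  S4
#   S5  S6  S7  S8
#   S9  S10 S11 S12
N_ROWS, N_COLS = 3, 4
WALL, FIRE, DIAMOND = 6, 8, 4  # block numbers taken from the slides

# The four actions, expressed as (row change, column change).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Return (next_state, reward, done) for one move in the maze."""
    row, col = divmod(state - 1, N_COLS)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col

    # A move that would leave the grid keeps the agent where it is.
    if not (0 <= new_row < N_ROWS and 0 <= new_col < N_COLS):
        return state, 0.0, False

    next_state = new_row * N_COLS + new_col + 1
    if next_state == WALL:
        return state, 0.0, False       # the wall cannot be crossed
    if next_state == DIAMOND:
        return next_state, 1.0, True   # +1 reward, episode ends
    if next_state == FIRE:
        return next_state, -1.0, True  # -1 reward, episode ends
    return next_state, 0.0, False      # ordinary move, no reward (assumed)
```

For example, under this assumed layout `step(3, "right")` moves the agent onto the diamond and returns a +1 reward, while `step(5, "right")` is blocked by the wall and leaves the agent in S5.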
Model-Based vs Model-Free learning algorithms

• There are two main types of Reinforcement Learning algorithms:


• 1. Model-Based Algorithms
• 2. Model-Free Algorithms
Model-Based Algorithms
• They are used in scenarios where we have complete knowledge of the
environment and how it reacts to different actions.

• In model-based reinforcement learning, the agent has access to a model of the
environment, i.e., the action required to go from one state to
another, the associated transition probabilities, and the corresponding rewards.

• This model allows the reinforcement learning agent to plan ahead before acting.

• For static (fixed) environments, model-based reinforcement learning is more
suitable.
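
As a rough sketch of what planning with a known model can look like, the Python below runs value iteration (one of the dynamic programming methods listed later in this presentation) on a tiny made-up model. The states, actions, probabilities, rewards, and the discount factor `GAMMA` are all hypothetical and chosen only for illustration.

```python
GAMMA = 0.9  # discount factor (assumed value)

# Hypothetical known model:
# MODEL[state][action] -> list of (probability, next_state, reward)
MODEL = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(0.8, "B", 0.0), (0.2, "A", 0.0)]},
    "B": {"stay": [(1.0, "B", 0.0)], "go": [(1.0, "GOAL", 1.0)]},
    "GOAL": {},  # terminal state: no actions available
}


def action_value(values, state, action):
    """Expected reward plus discounted value of the next state."""
    return sum(p * (r + GAMMA * values[s2]) for p, s2, r in MODEL[state][action])


def value_iteration(tolerance=1e-8):
    """Sweep over all states until the values stop changing."""
    values = {s: 0.0 for s in MODEL}
    while True:
        delta = 0.0
        for state, actions in MODEL.items():
            if not actions:
                continue  # terminal states keep value 0
            best = max(action_value(values, state, a) for a in actions)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tolerance:
            return values


def greedy_policy(values):
    """Pick, in every non-terminal state, the action with the highest value."""
    return {
        s: max(acts, key=lambda a: action_value(values, s, a))
        for s, acts in MODEL.items()
        if acts
    }
```

Because the model is known in advance, the agent can compute `greedy_policy(value_iteration())` without ever interacting with the environment, which is the sense in which it plans ahead.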
Model-Free Algorithms
• Model-free algorithms find the optimal policy with very limited knowledge of the
dynamics of the environment.

• They estimate the optimal policy directly from experience, i.e., from the interaction
between agent and environment, without prior knowledge of the reward function.

• Model-free reinforcement learning should be applied in scenarios involving
incomplete information about the environment.

• In the real world, we rarely have a fixed environment. Self-driving cars operate in a
dynamic environment with changing traffic conditions, route diversions, etc. In such
scenarios, model-free algorithms tend to be a better fit than techniques that need a
fixed model.
Common Mathematical and Algorithmic
Frameworks

• Markov Decision Process (MDP)


• Bellman Equations
• Dynamic Programming
• Value Iteration
• Policy Iteration
• Q-learning
Markov Decision Process (MDP)

• A Markov Decision Process (MDP) involves a decision maker, called an
agent, that interacts with the environment it is placed in.

• These interactions occur sequentially over time.

• At each time step, the agent gets some representation of the
environment state. Given this representation, the agent selects an action
to take. The environment then transitions into a new state, and
the agent is given a reward as a consequence of its previous action.
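
The interaction cycle described above (observe state, select action, receive new state and reward) can be sketched as a simple loop. The two-state weather environment, the 10% chance that an action has no effect, and the random policy below are all hypothetical stand-ins used only to show the shape of the loop.

```python
import random


def environment_step(state, action, rng):
    """Hypothetical environment: return (next_state, reward)."""
    # With probability 0.1 the chosen action has no effect (assumed noise).
    if action == "switch" and rng.random() >= 0.1:
        next_state = "sunny" if state == "rainy" else "rainy"
    else:
        next_state = state
    reward = 1.0 if next_state == "sunny" else 0.0
    return next_state, reward


def random_policy(state, rng):
    """Placeholder policy: ignores the state and picks an action at random."""
    return rng.choice(["stay", "switch"])


def run_episode(n_steps=5, seed=0):
    """Run the agent-environment loop and record every transition."""
    rng = random.Random(seed)
    state = "rainy"
    trajectory = []
    for t in range(n_steps):
        action = random_policy(state, rng)                         # agent acts
        next_state, reward = environment_step(state, action, rng)  # environment responds
        trajectory.append((t, state, action, reward, next_state))
        state = next_state                                         # move to the next time step
    return trajectory
```

Each tuple in the returned trajectory corresponds to one time step of the MDP: the state observed, the action taken, the reward received, and the state that follows.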
Bellman Equations
The value of a given state s
is determined by taking the
maximum, over the actions
available in that state, of the
value each action leads to. The
aim of the agent is to pick the
action that maximizes this
value.
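
In the usual textbook notation, the statement above can be written as follows. The symbols are not defined in the slides and are assumptions here: $R$ is the reward, $\gamma \in [0, 1)$ is a discount factor, $s'$ is the state reached from $s$, and $P(s' \mid s, a)$ is the transition probability.

```latex
% Deterministic transitions: taking action a in state s leads to state s'
V(s) = \max_{a} \bigl[ R(s, a) + \gamma \, V(s') \bigr]

% Stochastic transitions with probabilities P(s' \mid s, a)
V(s) = \max_{a} \sum_{s'} P(s' \mid s, a) \bigl[ R(s, a, s') + \gamma \, V(s') \bigr]
```

The first form fits a maze where every move has a single outcome; the second covers environments where the same action can lead to different states.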
Q-Learning

It is a value-based, model-free
approach that tells the agent
which action it should
perform. It revolves around
the notion of updating Q-values,
which represent the value
of taking action A in state S.
The value update rule is the main
aspect of the Q-learning
algorithm.
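
A minimal sketch of the tabular Q-learning update rule is shown below. The learning rate `ALPHA`, the discount factor `GAMMA`, and the function name `q_update` are assumptions for illustration; the slides do not give specific values or an implementation.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (assumed value)
GAMMA = 0.9  # discount factor (assumed value)


def q_update(q_table, state, action, reward, next_state, next_actions, done):
    """Apply one Q-learning update and return the new Q-value.

    Q(S, A) <- Q(S, A) + ALPHA * (reward + GAMMA * max_a Q(S', a) - Q(S, A))
    """
    if done or not next_actions:
        best_next = 0.0  # no future value after a terminal state
    else:
        best_next = max(q_table[(next_state, a)] for a in next_actions)
    target = reward + GAMMA * best_next
    q_table[(state, action)] += ALPHA * (target - q_table[(state, action)])
    return q_table[(state, action)]


# Q-values default to 0.0 for state-action pairs that have not been visited.
q_table = defaultdict(float)
```

Calling `q_update` after every observed transition gradually moves each Q-value toward the reward plus the discounted value of the best next action, without needing a model of the environment.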
Applications of Deep Reinforcement Learning
• Industrial Manufacturing
Reinforcement learning is very commonly applied in robotics.
• Self-driving cars
The algorithms learn to recognize pedestrians, roads, and traffic, detect street signs
in the environment, and act accordingly.
• Trading and Finance
An RL agent can decide whether to hold, buy, or sell a share; its performance is
assessed against market benchmark standards.
• Natural Language Processing
NLP tasks such as question answering, summarization, and chatbot implementation can
be done by a reinforcement learning agent.
• Healthcare
RL bots can be trained to perform surgeries and to support better diagnosis of diseases.
