Contents for RL and Search Notebooks #567

Merged
merged 2 commits, Jul 3, 2017
21 changes: 16 additions & 5 deletions rl.ipynb
@@ -20,13 +20,25 @@
"from rl import *"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## CONTENTS\n",
"\n",
"* Overview\n",
"* Passive Reinforcement Learning\n",
"* Active Reinforcement Learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"## Review\n",
"## OVERVIEW\n",
"\n",
"Before we start playing with the actual implementations let us review a couple of things about RL.\n",
"\n",
"1. Reinforcement Learning is concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. \n",
@@ -42,10 +54,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Passive Reinforcement Learning\n",
"## PASSIVE REINFORCEMENT LEARNING\n",
"\n",
"In passive Reinforcement Learning the agent follows a fixed policy and tries to learn the Reward function and the Transition model (if it is not aware of that).\n",
"\n"
"In passive Reinforcement Learning the agent follows a fixed policy and tries to learn the Reward function and the Transition model (if it is not aware of that)."
]
},
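As a rough, illustrative sketch of the passive setting described above, the following hypothetical example (a hand-made 4-state chain, not the notebook's `PassiveTDAgent`) evaluates a fixed policy with TD(0) updates:

```python
import random

# A minimal, illustrative sketch of passive RL: TD(0) evaluation of a fixed
# "always move right" policy on a hypothetical 4-state chain (state 3 is
# terminal). The transition model, rewards and learning rate are invented
# for this example.

def step(state):
    """Follow the fixed policy; the move succeeds with probability 0.8."""
    if random.random() < 0.8:
        next_state = min(state + 1, 3)
    else:
        next_state = max(state - 1, 0)
    reward = 1.0 if next_state == 3 else -0.04
    return next_state, reward

U = {s: 0.0 for s in range(4)}    # utility estimates under the fixed policy
gamma, alpha = 0.9, 0.1

for trial in range(5000):
    s = 0
    while s != 3:
        s_next, r = step(s)
        U[s] += alpha * (r + gamma * U[s_next] - U[s])   # TD(0) update
        s = s_next

print(U)   # estimated utilities of the states under the fixed policy
```

Because the policy is fixed, the agent only estimates the utilities of states under that policy; it never chooses actions itself.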
{
@@ -294,7 +305,7 @@
"collapsed": true
},
"source": [
"## Active Reinforcement Learning\n",
"## ACTIVE REINFORCEMENT LEARNING\n",
"\n",
"Unlike Passive Reinforcement Learning in Active Reinforcement Learning we are not bound by a policy pi and we need to select our actions. In other words the agent needs to learn an optimal policy. The fundamental tradeoff the agent needs to face is that of exploration vs. exploitation. "
]
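As a hedged illustration of the active setting, the sketch below runs tabular Q-learning with an epsilon-greedy action choice on the same kind of made-up 4-state chain; it is not the notebook's `QLearningAgent`, but it shows the exploration-vs-exploitation trade-off through the epsilon parameter:

```python
import random

# A rough sketch of active RL: tabular Q-learning with an epsilon-greedy
# policy on a hypothetical 4-state chain. All details are illustrative.

ACTIONS = ['left', 'right']

def step(state, action):
    move = 1 if action == 'right' else -1
    next_state = min(max(state + move, 0), 3)
    reward = 1.0 if next_state == 3 else -0.04
    return next_state, reward

Q = {(s, a): 0.0 for s in range(4) for a in ACTIONS}
gamma, alpha, epsilon = 0.9, 0.1, 0.1

for episode in range(5000):
    s = 0
    while s != 3:
        # Exploration vs. exploitation: occasionally try a random action.
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next, r = step(s, a)
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s_next

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(4)}
print(policy)   # greedy policy extracted from the learned Q-values
```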
40 changes: 27 additions & 13 deletions search.ipynb
@@ -31,7 +31,23 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Review\n",
"## CONTENTS\n",
"\n",
"* Overview\n",
"* Problem\n",
"* Search Algorithms Visualization\n",
"* Breadth-First Tree Search\n",
"* Breadth-First Search\n",
"* Uniform Cost Search\n",
"* A\\* Search\n",
"* Genetic Algorithm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## OVERVIEW\n",
"\n",
"Here, we learn about problem solving. Building goal-based agents that can plan ahead to solve problems, in particular, navigation problem/route finding problem. First, we will start the problem solving by precisely defining **problems** and their **solutions**. We will look at several general-purpose search algorithms. Broadly, search algorithms are classified into two types:\n",
"\n",
@@ -57,7 +73,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Problem\n",
"## PROBLEM\n",
"\n",
"Let's see how we define a Problem. Run the next cell to see how abstract class `Problem` is defined in the search module."
]
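As a small, purely hypothetical companion to the cell referenced above, a concrete problem written against the same kind of interface (initial state, `actions`, `result`, `goal_test`) might look like this sketch; the real abstract class is the one defined in the search module:

```python
# A hypothetical, self-contained example in the spirit of the abstract
# Problem interface: states are integers, actions add +1 or +2, and the
# goal is to reach a target number. The class and its names are only an
# illustrative sketch, not the search module's definition.

class CountToN:
    def __init__(self, initial=0, goal=10):
        self.initial = initial
        self.goal = goal

    def actions(self, state):
        """Actions applicable in this state."""
        return ['+1', '+2']

    def result(self, state, action):
        """The state that results from applying an action."""
        return state + (1 if action == '+1' else 2)

    def goal_test(self, state):
        """Check whether we have reached the goal state."""
        return state == self.goal

problem = CountToN(initial=0, goal=10)
state = problem.initial
while not problem.goal_test(state):
    state = problem.result(state, '+2' if state + 2 <= problem.goal else '+1')
print(state)   # 10
```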
@@ -184,7 +200,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Romania map visualisation\n",
"### Romania Map Visualisation\n",
"\n",
"Let's have a visualisation of Romania map [Figure 3.2] from the book and see how different searching algorithms perform / how frontier expands in each search algorithm for a simple problem named `romania_problem`."
]
@@ -420,9 +436,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Searching algorithms visualisations\n",
"## SEARCHING ALGORITHMS VISUALIZATION\n",
"\n",
"In this section, we have visualisations of the following searching algorithms:\n",
"In this section, we have visualizations of the following searching algorithms:\n",
"\n",
"1. Breadth First Tree Search - Implemented\n",
"2. Depth First Tree Search\n",
@@ -559,11 +575,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## BREADTH-FIRST TREE SEARCH\n",
"\n",
"## Breadth first tree search\n",
"\n",
"We have a working implementation in search module. But as we want to interact with the graph while it is searching, we need to modify the implementation. Here's the modified breadth first tree search.\n",
"\n"
"We have a working implementation in search module. But as we want to interact with the graph while it is searching, we need to modify the implementation. Here's the modified breadth first tree search."
]
},
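For reference, a minimal, non-animated sketch of breadth-first tree search on a tiny, hand-written fragment of the Romania map is shown below; it is not the notebook's modified implementation, which additionally records node colours for the visualisation:

```python
from collections import deque

# A minimal sketch of breadth-first tree search on a small, hand-made
# fragment of the Romania map. It expands the shallowest node first and,
# being a *tree* search, keeps no explored set.

graph = {
    'Arad': ['Zerind', 'Sibiu', 'Timisoara'],
    'Sibiu': ['Arad', 'Fagaras', 'Oradea'],
    'Fagaras': ['Sibiu', 'Bucharest'],
    'Zerind': ['Arad', 'Oradea'],
    'Timisoara': ['Arad'],
    'Oradea': ['Zerind', 'Sibiu'],
    'Bucharest': ['Fagaras'],
}

def breadth_first_tree_search(start, goal):
    frontier = deque([[start]])               # FIFO queue of paths
    while frontier:
        path = frontier.popleft()
        node = path[-1]
        if node == goal:
            return path
        for child in graph[node]:
            frontier.append(path + [child])   # no explored set: tree search
    return None

print(breadth_first_tree_search('Arad', 'Bucharest'))
```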
{
@@ -654,7 +668,7 @@
"collapsed": true
},
"source": [
"## Breadth first search\n",
"## BREADTH-FIRST SEARCH\n",
"\n",
"Let's change all the node_colors to starting position and define a different problem statement."
]
@@ -740,7 +754,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Uniform cost search\n",
"## UNIFORM COST SEARCH\n",
"\n",
"Let's change all the node_colors to starting position and define a different problem statement."
]
@@ -832,7 +846,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## A* search\n",
"## A\\* SEARCH\n",
"\n",
"Let's change all the node_colors to starting position and define a different problem statement."
]
@@ -967,7 +981,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## Genetic Algorithm\n",
"## GENETIC ALGORITHM\n",
"\n",
"Genetic algorithms (or GA) are inspired by natural evolution and are particularly useful in optimization and search problems with large state spaces.\n",
"\n",