Commit b79f68f

antmarakis authored and norvig committed

Contents for RL and Search Notebooks (aimacode#567)

* Update rl.ipynb
* Update search.ipynb
1 parent 0bb4069 commit b79f68f

File tree

2 files changed: +43 -18 lines

rl.ipynb

Lines changed: 16 additions & 5 deletions
@@ -20,13 +20,25 @@
     "from rl import *"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## CONTENTS\n",
+    "\n",
+    "* Overview\n",
+    "* Passive Reinforcement Learning\n",
+    "* Active Reinforcement Learning"
+   ]
+  },
   {
    "cell_type": "markdown",
    "metadata": {
     "collapsed": true
    },
    "source": [
-    "## Review\n",
+    "## OVERVIEW\n",
+    "\n",
     "Before we start playing with the actual implementations let us review a couple of things about RL.\n",
     "\n",
     "1. Reinforcement Learning is concerned with how software agents ought to take actions in an environment so as to maximize some notion of cumulative reward. \n",
@@ -42,10 +54,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Passive Reinforcement Learning\n",
+    "## PASSIVE REINFORCEMENT LEARNING\n",
     "\n",
-    "In passive Reinforcement Learning the agent follows a fixed policy and tries to learn the Reward function and the Transition model (if it is not aware of that).\n",
-    "\n"
+    "In passive Reinforcement Learning the agent follows a fixed policy and tries to learn the Reward function and the Transition model (if it is not aware of that)."
    ]
   },
   {
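For readers skimming the diff: the passive-learning cell above is backed by the agents in aima-python's rl module (imported by the notebook's first cell). A minimal sketch of evaluating a fixed policy with TD learning, assuming the module's PassiveTDAgent, run_single_trial, and mdp.sequential_decision_environment (the 4x3 grid world of the book); the policy dictionary below is the usual one for that world, not code from this commit:

# Sketch: learning state utilities under a fixed policy with TD updates.
# Assumes aima-python's rl and mdp modules are on the path.
from mdp import sequential_decision_environment
from rl import PassiveTDAgent, run_single_trial

# A fixed policy for the 4x3 world: actions are direction vectors,
# terminals (3, 2) and (3, 1) have no action, (1, 1) is the obstacle.
policy = {
    (0, 2): (1, 0), (1, 2): (1, 0), (2, 2): (1, 0), (3, 2): None,
    (0, 1): (0, 1), (2, 1): (0, 1), (3, 1): None,
    (0, 0): (0, 1), (1, 0): (1, 0), (2, 0): (0, 1), (3, 0): (-1, 0),
}

agent = PassiveTDAgent(policy, sequential_decision_environment,
                       alpha=lambda n: 60. / (59 + n))  # decaying learning rate
for _ in range(200):                                    # 200 trials through the MDP
    run_single_trial(agent, sequential_decision_environment)
print(agent.U)                                          # learned utility per state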
@@ -294,7 +305,7 @@
     "collapsed": true
    },
    "source": [
-    "## Active Reinforcement Learning\n",
+    "## ACTIVE REINFORCEMENT LEARNING\n",
     "\n",
     "Unlike Passive Reinforcement Learning in Active Reinforcement Learning we are not bound by a policy pi and we need to select our actions. In other words the agent needs to learn an optimal policy. The fundamental tradeoff the agent needs to face is that of exploration vs. exploitation. "
    ]
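The exploration vs. exploitation tradeoff named in that cell is typically handled by an exploratory action-selection rule such as epsilon-greedy. A self-contained toy sketch of tabular Q-learning (not the notebook's code; the chain MDP, rewards, and constants here are illustrative assumptions):

import random
from collections import defaultdict

# Illustrative toy MDP: states 0..4 on a chain, actions -1/+1,
# reward 1 for reaching state 4; epsilon-greedy exploration.
ACTIONS, GOAL, EPSILON, ALPHA, GAMMA = (-1, +1), 4, 0.1, 0.5, 0.9
Q = defaultdict(float)

def step(state, action):
    next_state = min(max(state + action, 0), GOAL)
    return next_state, (1.0 if next_state == GOAL else 0.0)

def choose_action(state):
    # Explore with probability epsilon, otherwise exploit current estimates.
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[state, a])

for _ in range(500):                        # training episodes
    state = 0
    while state != GOAL:
        action = choose_action(state)
        next_state, reward = step(state, action)
        best_next = max(Q[next_state, a] for a in ACTIONS)
        Q[state, action] += ALPHA * (reward + GAMMA * best_next - Q[state, action])
        state = next_state

# Greedy policy extracted from the learned Q-table.
print({s: max(ACTIONS, key=lambda a: Q[s, a]) for s in range(GOAL)})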

search.ipynb

Lines changed: 27 additions & 13 deletions
@@ -31,7 +31,23 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Review\n",
+    "## CONTENTS\n",
+    "\n",
+    "* Overview\n",
+    "* Problem\n",
+    "* Search Algorithms Visualization\n",
+    "* Breadth-First Tree Search\n",
+    "* Breadth-First Search\n",
+    "* Uniform Cost Search\n",
+    "* A\\* Search\n",
+    "* Genetic Algorithm"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## OVERVIEW\n",
     "\n",
     "Here, we learn about problem solving. Building goal-based agents that can plan ahead to solve problems, in particular, navigation problem/route finding problem. First, we will start the problem solving by precisely defining **problems** and their **solutions**. We will look at several general-purpose search algorithms. Broadly, search algorithms are classified into two types:\n",
     "\n",
@@ -57,7 +73,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Problem\n",
+    "## PROBLEM\n",
     "\n",
     "Let's see how we define a Problem. Run the next cell to see how abstract class `Problem` is defined in the search module."
    ]
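For readers who don't run the notebook, the cell it refers to prints roughly the following interface. This is an abridged sketch of the Problem base class in aima-python's search module; see search.py in the repository for the authoritative version:

class Problem:
    """Abstract base class for a formal search problem. Subclasses
    override actions and result, and goal_test/path_cost if needed."""

    def __init__(self, initial, goal=None):
        self.initial = initial   # the starting state
        self.goal = goal         # the goal state, if there is one

    def actions(self, state):
        """Return the actions executable in the given state."""
        raise NotImplementedError

    def result(self, state, action):
        """Return the state that results from doing action in state."""
        raise NotImplementedError

    def goal_test(self, state):
        """Default: a state is a goal if it equals self.goal."""
        return state == self.goal

    def path_cost(self, c, state1, action, state2):
        """Default: every step costs 1, added to the path cost c so far."""
        return c + 1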
@@ -184,7 +200,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "# Romania map visualisation\n",
+    "### Romania Map Visualisation\n",
     "\n",
     "Let's have a visualisation of Romania map [Figure 3.2] from the book and see how different searching algorithms perform / how frontier expands in each search algorithm for a simple problem named `romania_problem`."
    ]
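For context, the problem this cell names can be built in one line, assuming the search module's GraphProblem class and romania_map graph (both part of aima-python); the Arad-to-Bucharest choice is the book's standard example:

# romania_map is the UndirectedGraph of Figure 3.2; GraphProblem wraps
# a graph plus start and goal states as a search Problem.
from search import GraphProblem, romania_map

romania_problem = GraphProblem('Arad', 'Bucharest', romania_map)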
@@ -420,9 +436,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Searching algorithms visualisations\n",
+    "## SEARCHING ALGORITHMS VISUALIZATION\n",
     "\n",
-    "In this section, we have visualisations of the following searching algorithms:\n",
+    "In this section, we have visualizations of the following searching algorithms:\n",
     "\n",
     "1. Breadth First Tree Search - Implemented\n",
     "2. Depth First Tree Search\n",
@@ -559,11 +575,9 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
+    "## BREADTH-FIRST TREE SEARCH\n",
     "\n",
-    "## Breadth first tree search\n",
-    "\n",
-    "We have a working implementation in search module. But as we want to interact with the graph while it is searching, we need to modify the implementation. Here's the modified breadth first tree search.\n",
-    "\n"
+    "We have a working implementation in search module. But as we want to interact with the graph while it is searching, we need to modify the implementation. Here's the modified breadth first tree search."
    ]
   },
   {
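The unmodified algorithm being adapted in this cell is short enough to sketch inline. A minimal version in the spirit of the one in aima-python's search module (assuming search.Node, whose expand(problem) returns a node's children; the instrumented notebook version additionally records node colors for the visualization):

from collections import deque
from search import Node

def breadth_first_tree_search(problem):
    frontier = deque([Node(problem.initial)])  # FIFO queue of unexplored nodes
    while frontier:
        node = frontier.popleft()              # shallowest node first
        if problem.goal_test(node.state):
            return node                        # solution found
        frontier.extend(node.expand(problem))  # no explored set: a tree search
    return None                                # frontier exhausted without a goal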
@@ -654,7 +668,7 @@
     "collapsed": true
    },
    "source": [
-    "## Breadth first search\n",
+    "## BREADTH-FIRST SEARCH\n",
     "\n",
     "Let's change all the node_colors to starting position and define a different problem statement."
    ]
@@ -740,7 +754,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Uniform cost search\n",
+    "## UNIFORM COST SEARCH\n",
     "\n",
     "Let's change all the node_colors to starting position and define a different problem statement."
    ]
@@ -832,7 +846,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## A* search\n",
+    "## A\\* SEARCH\n",
     "\n",
     "Let's change all the node_colors to starting position and define a different problem statement."
    ]
@@ -967,7 +981,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "## Genetic Algorithm\n",
+    "## GENETIC ALGORITHM\n",
     "\n",
     "Genetic algorithms (or GA) are inspired by natural evolution and are particularly useful in optimization and search problems with large state spaces.\n",
     "\n",

0 commit comments
