Merge branch 'master' of https://github.com/aimacode/aima-python

SnShine · SnShine · commit 0081c5f3fcee · 2016-06-01T21:44:31.000+05:30
* 'master' of https://github.com/aimacode/aima-python:
  Added Example and Applet for Value Iteration
  Updating README's Index of Code to reflect actual implementation status of algorithms (aimacode#237)
  Removed reference to deprecated utils * imports
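
The notebook cells added to mdp.ipynb in this merge describe Value Iteration as repeated Bellman updates that stop once the utilities change by less than a tolerance tied to epsilon. The following is a minimal standalone sketch of that idea, assuming a hypothetical two-state MDP. The names `STATES`, `ACTIONS`, `REWARDS`, `TRANSITIONS`, `bellman_backup` and `value_iteration_sketch` are illustrative only and are not part of the repository's `mdp.py`.

```python
# Minimal value-iteration sketch on a hypothetical two-state MDP.
# Nothing here is taken from aima-python; every name is an assumption
# made for illustration.

GAMMA = 0.9  # discount factor

STATES = ["a", "b"]
ACTIONS = ["stay", "move"]
REWARDS = {"a": 0.0, "b": 1.0}

# TRANSITIONS[(state, action)] -> list of (probability, next_state)
TRANSITIONS = {
    ("a", "stay"): [(1.0, "a")],
    ("a", "move"): [(0.8, "b"), (0.2, "a")],
    ("b", "stay"): [(1.0, "b")],
    ("b", "move"): [(0.8, "a"), (0.2, "b")],
}


def bellman_backup(U, s):
    """One Bellman update for state s: R(s) + gamma * best expected utility."""
    best = max(
        sum(p * U[s1] for p, s1 in TRANSITIONS[(s, a)])
        for a in ACTIONS
    )
    return REWARDS[s] + GAMMA * best


def value_iteration_sketch(epsilon=1e-6):
    """Repeat Bellman updates until the largest change is small enough."""
    U = {s: 0.0 for s in STATES}
    while True:
        U_next = {s: bellman_backup(U, s) for s in STATES}
        delta = max(abs(U_next[s] - U[s]) for s in STATES)
        U = U_next
        # Stopping rule in the style of the textbook algorithm:
        # a change this small bounds the utility error by epsilon.
        if delta < epsilon * (1 - GAMMA) / GAMMA:
            return U


utilities = value_iteration_sketch()
print(utilities)
```

In this toy problem the rewarding state `b` ends up with the higher utility, and `a` inherits a discounted share of it through the `move` action, which is the "values propagate" intuition the notebook text refers to.
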
diff --git a/README.md b/README.md
@@ -1,7 +1,7 @@
 # ![](https://github.com/aimacode/aima-java/blob/gh-pages/aima3e/images/aima3e.jpg)aima-python <a href="https://travis-ci.org/aimacode/aima-python" target="_blank"><img src="https://api.travis-ci.org/aimacode/aima-python.svg?branch=master" alt="Build Status"></a><a href="http://mybinder.org/repo/aimacode/aima-python" target="_blank"><img src="http://mybinder.org/badge.svg" alt="Binder"></a>
 
 
-Python code for the book *Artificial Intelligence: A Modern Approach.* You can use this in conjunction with a course on AI, or for study on your own. We're loooking for [solid contributors](https://github.com/aimacode/aima-python/blob/master/CONTRIBUTING.md) to help.
+Python code for the book *Artificial Intelligence: A Modern Approach.* You can use this in conjunction with a course on AI, or for study on your own. We're looking for [solid contributors](https://github.com/aimacode/aima-python/blob/master/CONTRIBUTING.md) to help.
 
 ## Python 3.4
 
@@ -48,7 +48,7 @@ Here is a table of algorithms, the figure, name of the code in the book and in t
 | 4.8     | Genetic-Algorithm  | `genetic_algorithm` | [`search.py`](../master/search.py) |
 | 4.11    | And-Or-Graph-Search | `and_or_graph_search` | [`search.py`](../master/search.py)  |
 | 4.21    | Online-DFS-Agent   | `online_dfs_agent` | [`search.py`](../master/search.py) |
-| 4.24    | LRTA\*-Agent       |        |        |
+| 4.24    | LRTA\*-Agent       | `LRTAStarAgent`    | [`search.py`](../master/search.py) |
 | 5.3     | Minimax-Decision   | `minimax_decision` | [`games.py`](../master/games.py) |
 | 5.7     | Alpha-Beta-Search  | `alphabeta_search` | [`games.py`](../master/games.py) |
 | 6       | CSP                | `CSP` | [`csp.py`](../master/csp.py) |
@@ -66,7 +66,7 @@ Here is a table of algorithms, the figure, name of the code in the book and in t
 | 7.17    | DPLL-Satisfiable?  | `dpll_satisfiable` | [`logic.py`](../master/logic.py) |
 | 7.18    | WalkSAT            | `WalkSAT` | [`logic.py`](../master/logic.py) |
 | 7.20    | Hybrid-Wumpus-Agent    |         |           |
-| 7.22    | SATPlan            |          |
+| 7.22    | SATPlan            | `SAT_plan`  | [`logic.py`](../master/logic.py) |
 | 9       | Subst              | `subst` | [`logic.py`](../master/logic.py) |
 | 9.1     | Unify              | `unify` | [`logic.py`](../master/logic.py) |
 | 9.3     | FOL-FC-Ask         | `fol_fc_ask` | [`logic.py`](../master/logic.py) |
@@ -89,7 +89,7 @@ Here is a table of algorithms, the figure, name of the code in the book and in t
 | 14.13   | Prior-Sample       | `prior_sample` | [`probability.py`](../master/probability.py) |
 | 14.14   | Rejection-Sampling | `rejection_sampling` | [`probability.py`](../master/probability.py) |
 | 14.15   | Likelihood-Weighting | `likelihood_weighting` | [`probability.py`](../master/probability.py) |
-| 14.16   | Gibbs-Ask           |          |
+| 14.16   | Gibbs-Ask           | `gibbs_ask`  | [`probability.py`](../master/probability.py) |
 | 15.4    | Forward-Backward   | `forward_backward` | [`probability.py`](../master/probability.py) |
 | 15.6    | Fixed-Lag-Smoothing | `fixed_lag_smoothing` | [`probability.py`](../master/probability.py) |
 | 15.17   | Particle-Filtering | `particle_filtering` | [`probability.py`](../master/probability.py) |
@@ -99,19 +99,19 @@ Here is a table of algorithms, the figure, name of the code in the book and in t
 | 17.7    | POMDP-Value-Iteration  |           |        |
 | 18.5    | Decision-Tree-Learning | `DecisionTreeLearner` | [`learning.py`](../master/learning.py) |
 | 18.8    | Cross-Validation   | `cross_validation` | [`learning.py`](../master/learning.py) |
-| 18.11   | Decision-List-Learning |          |
-| 18.24   | Back-Prop-Learning |          |
+| 18.11   | Decision-List-Learning | `DecisionListLearner` | [`learning.py`](../master/learning.py) |
+| 18.24   | Back-Prop-Learning | `BackPropagationLearner` | [`learning.py`](../master/learning.py) |
 | 18.34   | AdaBoost           | `AdaBoost` | [`learning.py`](../master/learning.py) |
 | 19.2    | Current-Best-Learning |          |
 | 19.3    | Version-Space-Learning |          |
 | 19.8    | Minimal-Consistent-Det |          |
 | 19.12   | FOIL               |          |
-| 21.2    | Passive-ADP-Agent  | `PassiveADPAgent` | [`rl.py`](../master/rl.py) |
+| 21.2    | Passive-ADP-Agent  | 				  | 						   |
 | 21.4    | Passive-TD-Agent   | `PassiveTDAgent` | [`rl.py`](../master/rl.py) |
 | 21.8    | Q-Learning-Agent   | `QLearningAgent` | [`rl.py`](../master/rl.py) |
 | 22.1    | HITS               |         |         |
 | 23      | Chart-Parse        | `Chart` | [`nlp.py`](../master/nlp.py) |
-| 23.5    | CYK-Parse          |         |         |
+| 23.5    | CYK-Parse          | `CYK_parse` | [`nlp.py`](../master/nlp.py) |
 | 25.9    | Monte-Carlo-Localization|       |
 
 
diff --git a/mdp.ipynb b/mdp.ipynb
@@ -11,7 +11,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 1,
+   "execution_count": 172,
    "metadata": {
     "collapsed": true
    },
@@ -50,7 +50,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 2,
+   "execution_count": 173,
    "metadata": {
     "collapsed": false
    },
@@ -87,7 +87,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 3,
+   "execution_count": 174,
    "metadata": {
     "collapsed": true
    },
@@ -119,7 +119,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 4,
+   "execution_count": 175,
    "metadata": {
     "collapsed": false
    },
@@ -153,7 +153,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 5,
+   "execution_count": 176,
    "metadata": {
     "collapsed": false
    },
@@ -181,7 +181,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 6,
+   "execution_count": 177,
    "metadata": {
     "collapsed": true
    },
@@ -221,18 +221,18 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 7,
+   "execution_count": 178,
    "metadata": {
     "collapsed": false
    },
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "<mdp.GridMDP at 0x7fcb2826ba58>"
+       "<mdp.GridMDP at 0x7fbecc40ebe0>"
       ]
      },
-     "execution_count": 7,
+     "execution_count": 178,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -241,14 +241,182 @@
     "sequential_decision_environment"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "collapsed": true
+   },
+   "source": [
+    "# Value Iteration\n",
+    "\n",
+    "Now that we have looked at how to represent MDPs, let's aim at solving them. Our ultimate goal is to obtain an optimal policy. We start by looking at Value Iteration and a visualisation that should help us understand it better.\n",
+    "\n",
+    "We start by calculating the Value/Utility of each state. The Value of a state is the expected sum of discounted future rewards, given that we start in that state and follow a particular policy pi. The Value Iteration algorithm (**Fig. 17.4** in the book) relies on finding solutions of the Bellman equation. The intuition behind why Value Iteration works is that values propagate. This point will become clearer once we encounter the visualisation. For more information you can refer to **Section 17.2** of the book.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 179,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "%psource value_iteration"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "It takes two parameters as input: an MDP to solve, and epsilon, the maximum error allowed in the utility of any state. It returns a dictionary of utilities in which the keys are the states and the values are their utilities. Let us solve the **sequential_decision_environment** GridMDP.\n"
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": null,
+   "execution_count": 180,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "{(0, 0): 0.2962883154554812,\n",
+       " (0, 1): 0.3984432178350045,\n",
+       " (0, 2): 0.5093943765842497,\n",
+       " (1, 0): 0.25386699846479516,\n",
+       " (1, 2): 0.649585681261095,\n",
+       " (2, 0): 0.3447542300124158,\n",
+       " (2, 1): 0.48644001739269643,\n",
+       " (2, 2): 0.7953620878466678,\n",
+       " (3, 0): 0.12987274656746342,\n",
+       " (3, 1): -1.0,\n",
+       " (3, 2): 1.0}"
+      ]
+     },
+     "execution_count": 180,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "value_iteration(sequential_decision_environment)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To illustrate that values propagate out of states, let us create a simple visualisation. We will use a modified version of the value_iteration function that stores U over time. We will also replace the parameter epsilon with the number of iterations we want to run."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 181,
    "metadata": {
     "collapsed": true
    },
    "outputs": [],
-   "source": []
+   "source": [
+    "def value_iteration_instru(mdp, iterations=20):\n",
+    "    U_over_time = []\n",
+    "    U1 = {s: 0 for s in mdp.states}\n",
+    "    R, T, gamma = mdp.R, mdp.T, mdp.gamma\n",
+    "    for _ in range(iterations):\n",
+    "        U = U1.copy()\n",
+    "        for s in mdp.states:\n",
+    "            U1[s] = R(s) + gamma * max([sum([p * U[s1] for (p, s1) in T(s, a)])\n",
+    "                                        for a in mdp.actions(s)])\n",
+    "        U_over_time.append(U)\n",
+    "    return U_over_time"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Next, we define a function to create the visualisation from the utilities returned by **value_iteration_instru**. You need not concern yourself with the code that immediately follows, as it is simply Matplotlib used with IPython Widgets. If you are interested in reading more about these, visit [ipywidgets.readthedocs.io](http://ipywidgets.readthedocs.io)."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 182,
+   "metadata": {
+    "collapsed": true
+   },
+   "outputs": [],
+   "source": [
+    "columns = 4\n",
+    "rows = 3\n",
+    "U_over_time = value_iteration_instru(sequential_decision_environment)\n",
+    "           "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 183,
+   "metadata": {
+    "collapsed": false
+   },
+   "outputs": [],
+   "source": [
+    "%matplotlib inline\n",
+    "import matplotlib.pyplot as plt\n",
+    "\n",
+    "def plot_grid(iteration):\n",
+    "    data = U_over_time[iteration]\n",
+    "    grid = []\n",
+    "    for row in range(rows):\n",
+    "        current_row = []\n",
+    "        for column in range(columns):\n",
+    "            try:\n",
+    "                current_row.append(data[(column, row)])\n",
+    "            except KeyError:\n",
+    "                current_row.append(0)\n",
+    "        grid.append(current_row)\n",
+    "    grid.reverse() # output like book\n",
+    "    fig = plt.matshow(grid, cmap=plt.cm.bwr)\n",
+    "    plt.axis('off')\n",
+    "    fig.axes.get_xaxis().set_visible(False)\n",
+    "    fig.axes.get_yaxis().set_visible(False) "
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 184,
+   "metadata": {
+    "collapsed": false,
+    "scrolled": true
+   },
+   "outputs": [
+    {
+     "data": {
+      "image/png": "iVBORw0KGgoAAAANSUhEUgAAATgAAADtCAYAAAAr+2lCAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAAzZJREFUeJzt2rENwzAMAEExyP4r0wsE6Qwbj7uSalg9WGh29wAUfZ5e\nAOAuAgdkCRyQJXBAlsABWQIHZH3/Pc4cf0iA19s982vuggOyBA7IEjggS+CALIEDsgQOyBI4IEvg\ngCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOy\nBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4\nIEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAs\ngQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQO\nyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL\n4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIED\nsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgS\nOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CA\nLIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IE\nDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjgg\nS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyB\nA7IEDsgSOCBL4IAsgQOyBA7IEjggS+CALIEDsgQOyBI4IEvggCyBA7IEDsgSOCBL4IAsgQOyBA7I\nEjggS+CALIEDsgQOyJrdfXoHgFu44IAsgQOyBA7IEjggS+CALIEDsi6WyArVfE1QKgAAAABJRU5E\nrkJggg==\n",
+      "text/plain": [
+       "<matplotlib.figure.Figure at 0x7fbea96037f0>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "import ipywidgets as widgets\n",
+    "from IPython.display import display\n",
+    "\n",
+    "iteration_slider = widgets.IntSlider(min=0, max=15, step=1, value=0)\n",
+    "w = widgets.interactive(plot_grid, iteration=iteration_slider)\n",
+    "display(w)\n",
+    "        "
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Move the slider above to observe how the utility changes across iterations."
+   ]
   }
  ],
  "metadata": {
diff --git a/search.ipynb b/search.ipynb
@@ -56,7 +56,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The first thing we observe is '`from utils import *`'. This means that everything in utils.py is imported for use in this module. Don't worry. You don't need to read utils.py in order to understand search algorithms.\n",
+    "The search and other modules of the repository make use of several imports from the utils module. We will point the useful ones out if they are required to follow the material below. Don't worry. You don't need to read utils.py in order to understand search algorithms.\n",
     "    \n",
     "The `Problem` class is an abstract class on which we define our problems(*duh*).\n",
     "Again, if you are confused about what `abstract class` means have a look at the `Intro` notebook.\n",
@@ -285,7 +285,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.4.3"
+   "version": "3.5.1"
   }
  },
  "nbformat": 4,