|
110 | 110 | },
|
111 | 111 | {
|
112 | 112 | "cell_type": "code",
|
113 |
| - "execution_count": 2, |
| 113 | + "execution_count": null, |
114 | 114 | "metadata": {
|
115 | 115 | "collapsed": true
|
116 | 116 | },
|
|
817 | 817 | },
|
818 | 818 | {
|
819 | 819 | "cell_type": "code",
|
820 |
| - "execution_count": 23, |
| 820 | + "execution_count": null, |
821 | 821 | "metadata": {
|
822 | 822 | "collapsed": true
|
823 | 823 | },
|
824 | 824 | "outputs": [],
|
825 | 825 | "source": [
|
826 |
| - "%psource PluralityLearner" |
| 826 | + "psource(PluralityLearner)" |
827 | 827 | ]
|
828 | 828 | },
|
829 | 829 | {
|
|
909 | 909 | },
|
910 | 910 | {
|
911 | 911 | "cell_type": "code",
|
912 |
| - "execution_count": 25, |
| 912 | + "execution_count": null, |
913 | 913 | "metadata": {
|
914 | 914 | "collapsed": true
|
915 | 915 | },
|
916 | 916 | "outputs": [],
|
917 | 917 | "source": [
|
918 |
| - "%psource NearestNeighborLearner" |
| 918 | + "psource(NearestNeighborLearner)" |
919 | 919 | ]
|
920 | 920 | },
|
921 | 921 | {
|
|
991 | 991 | "\n",
|
992 | 992 | "Information Gain is the difference between the entropy of the parent and the weighted sum of the entropies of the children. The feature used for splitting is the one which provides the most information gain.\n",
|
993 | 993 | "\n",
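| + "As a quick reference, information gain can be written as follows (a standard formulation, not taken from the implementation; $S_k$ denotes the subset of examples reaching the $k$-th child after splitting on attribute $A$):\n", |
| + "\n", |
| + "$$Gain(A) = Entropy(parent) - \\sum_{k} \\frac{|S_k|}{|S|} \\, Entropy(S_k)$$\n", |
| + "\n", |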
|
| 994 | + "#### Pseudocode\n", |
| 995 | + "\n", |
| 996 | + "You can view the pseudocode by running the cell below:" |
| 997 | + ] |
| 998 | + }, |
| 999 | + { |
| 1000 | + "cell_type": "code", |
| 1001 | + "execution_count": null, |
| 1002 | + "metadata": { |
| 1003 | + "collapsed": true |
| 1004 | + }, |
| 1005 | + "outputs": [], |
| 1006 | + "source": [ |
| 1007 | + "pseudocode(\"Decision Tree Learning\")" |
| 1008 | + ] |
| 1009 | + }, |
| 1010 | + { |
| 1011 | + "cell_type": "markdown", |
| 1012 | + "metadata": {}, |
| 1013 | + "source": [ |
994 | 1014 | "### Implementation\n",
|
995 | 1015 | "The nodes of the tree constructed by our learning algorithm are stored using either `DecisionFork` or `DecisionLeaf` based on whether they are a parent node or a leaf node respectively."
|
996 | 1016 | ]
|
997 | 1017 | },
|
998 | 1018 | {
|
999 | 1019 | "cell_type": "code",
|
1000 |
| - "execution_count": 27, |
| 1020 | + "execution_count": null, |
1001 | 1021 | "metadata": {
|
1002 | 1022 | "collapsed": true
|
1003 | 1023 | },
|
1004 | 1024 | "outputs": [],
|
1005 | 1025 | "source": [
|
1006 |
| - "%psource DecisionFork" |
| 1026 | + "psource(DecisionFork)" |
1007 | 1027 | ]
|
1008 | 1028 | },
|
1009 | 1029 | {
|
|
1015 | 1035 | },
|
1016 | 1036 | {
|
1017 | 1037 | "cell_type": "code",
|
1018 |
| - "execution_count": 28, |
| 1038 | + "execution_count": null, |
1019 | 1039 | "metadata": {
|
1020 | 1040 | "collapsed": true
|
1021 | 1041 | },
|
1022 | 1042 | "outputs": [],
|
1023 | 1043 | "source": [
|
1024 |
| - "%psource DecisionLeaf" |
| 1044 | + "psource(DecisionLeaf)" |
1025 | 1045 | ]
|
1026 | 1046 | },
|
1027 | 1047 | {
|
|
1033 | 1053 | },
|
1034 | 1054 | {
|
1035 | 1055 | "cell_type": "code",
|
1036 |
| - "execution_count": 29, |
| 1056 | + "execution_count": null, |
1037 | 1057 | "metadata": {
|
1038 | 1058 | "collapsed": true
|
1039 | 1059 | },
|
1040 | 1060 | "outputs": [],
|
1041 | 1061 | "source": [
|
1042 |
| - "%psource DecisionTreeLearner" |
| 1062 | + "psource(DecisionTreeLearner)" |
1043 | 1063 | ]
|
1044 | 1064 | },
|
1045 | 1065 | {
|
|
1142 | 1162 | "source": [
|
1143 | 1163 | "### Implementation\n",
|
1144 | 1164 | "\n",
|
1145 |
| - "The implementation of the Naive Bayes Classifier is split in two; Discrete and Continuous. The user can choose between them with the argument `continuous`." |
| 1165 | + "The implementation of the Naive Bayes Classifier is split into two parts: *Learning* and *Simple*. The *learning* classifier takes a dataset as input and learns the needed distributions from it. It is itself split into two versions, one for discrete and one for continuous features. The *simple* classifier takes as input not a dataset but already-calculated distributions (a dictionary of `CountingProbDist` objects)." |
1146 | 1166 | ]
|
1147 | 1167 | },
|
1148 | 1168 | {
|
|
1237 | 1257 | },
|
1238 | 1258 | {
|
1239 | 1259 | "cell_type": "code",
|
1240 |
| - "execution_count": 32, |
| 1260 | + "execution_count": null, |
1241 | 1261 | "metadata": {
|
1242 | 1262 | "collapsed": true
|
1243 | 1263 | },
|
1244 | 1264 | "outputs": [],
|
1245 | 1265 | "source": [
|
1246 |
| - "%psource NaiveBayesDiscrete" |
| 1266 | + "psource(NaiveBayesDiscrete)" |
1247 | 1267 | ]
|
1248 | 1268 | },
|
1249 | 1269 | {
|
|
1327 | 1347 | },
|
1328 | 1348 | {
|
1329 | 1349 | "cell_type": "code",
|
1330 |
| - "execution_count": 35, |
| 1350 | + "execution_count": null, |
1331 | 1351 | "metadata": {
|
1332 | 1352 | "collapsed": true
|
1333 | 1353 | },
|
1334 | 1354 | "outputs": [],
|
1335 | 1355 | "source": [
|
1336 |
| - "%psource NaiveBayesContinuous" |
| 1356 | + "psource(NaiveBayesContinuous)" |
| 1357 | + ] |
| 1358 | + }, |
| 1359 | + { |
| 1360 | + "cell_type": "markdown", |
| 1361 | + "metadata": {}, |
| 1362 | + "source": [ |
| 1363 | + "#### Simple\n", |
| 1364 | + "\n", |
| 1365 | + "The simple classifier (chosen with the argument `simple`) does not learn from a dataset; instead, it takes as input a dictionary of already-calculated `CountingProbDist` objects and returns a predictor function. The dictionary has the following form: `(Class Name, Class Probability): CountingProbDist Object`.\n", |
| 1366 | + "\n", |
| 1367 | + "Each class has its own probability distribution. Given a list of features, the classifier calculates the probability of the input for each class and returns the class with the maximum probability. The only pre-processing needed is to create dictionaries for the distribution of classes (named `targets`) and of the attributes/features.\n", |
| 1368 | + "\n", |
| 1369 | + "The complete code for the simple classifier:" |
| 1370 | + ] |
| 1371 | + }, |
| 1372 | + { |
| 1373 | + "cell_type": "code", |
| 1374 | + "execution_count": null, |
| 1375 | + "metadata": {}, |
| 1376 | + "outputs": [], |
| 1377 | + "source": [ |
| 1378 | + "psource(NaiveBayesSimple)" |
| 1379 | + ] |
| 1380 | + }, |
| 1381 | + { |
| 1382 | + "cell_type": "markdown", |
| 1383 | + "metadata": {}, |
| 1384 | + "source": [ |
| 1385 | + "This classifier is useful when you have already calculated the distributions and need to predict future items." |
1337 | 1386 | ]
|
1338 | 1387 | },
|
1339 | 1388 | {
|
|
1385 | 1434 | "cell_type": "markdown",
|
1386 | 1435 | "metadata": {},
|
1387 | 1436 | "source": [
|
1388 |
| - "Notice how the Discrete Classifier misclassified the second item, while the Continuous one had no problem." |
| 1437 | + "Notice how the Discrete Classifier misclassified the second item, while the Continuous one had no problem.\n", |
| 1438 | + "\n", |
| 1439 | + "Let's now take a look at the simple classifier. First we will come up with a sample problem to solve. Say we are given three bags, each containing the three letters 'a', 'b' and 'c' in different quantities. We are then given a string of letters and are tasked with finding which bag the string came from.\n", |
| 1440 | + "\n", |
| 1441 | + "Since we know the probability distribution of the letters for each bag, we can use the Naive Bayes classifier to make our prediction." |
| 1442 | + ] |
| 1443 | + }, |
| 1444 | + { |
| 1445 | + "cell_type": "code", |
| 1446 | + "execution_count": 2, |
| 1447 | + "metadata": { |
| 1448 | + "collapsed": true |
| 1449 | + }, |
| 1450 | + "outputs": [], |
| 1451 | + "source": [ |
| 1452 | + "bag1 = 'a'*50 + 'b'*30 + 'c'*15\n", |
| 1453 | + "dist1 = CountingProbDist(bag1)\n", |
| 1454 | + "bag2 = 'a'*30 + 'b'*45 + 'c'*20\n", |
| 1455 | + "dist2 = CountingProbDist(bag2)\n", |
| 1456 | + "bag3 = 'a'*20 + 'b'*20 + 'c'*35\n", |
| 1457 | + "dist3 = CountingProbDist(bag3)" |
| 1458 | + ] |
| 1459 | + }, |
| 1460 | + { |
| 1461 | + "cell_type": "markdown", |
| 1462 | + "metadata": {}, |
| 1463 | + "source": [ |
| 1464 | + "Now that we have the `CountingProbDist` objects for each bag/class, we will create the dictionary. We assume the prior probabilities of picking from the first, second and third bag are 0.5, 0.3 and 0.2 respectively." |
| 1465 | + ] |
| 1466 | + }, |
| 1467 | + { |
| 1468 | + "cell_type": "code", |
| 1469 | + "execution_count": 3, |
| 1470 | + "metadata": { |
| 1471 | + "collapsed": true |
| 1472 | + }, |
| 1473 | + "outputs": [], |
| 1474 | + "source": [ |
| 1475 | + "dist = {('First', 0.5): dist1, ('Second', 0.3): dist2, ('Third', 0.2): dist3}\n", |
| 1476 | + "nBS = NaiveBayesLearner(dist, simple=True)" |
| 1477 | + ] |
| 1478 | + }, |
| 1479 | + { |
| 1480 | + "cell_type": "markdown", |
| 1481 | + "metadata": {}, |
| 1482 | + "source": [ |
| 1483 | + "Now we can start making predictions:" |
| 1484 | + ] |
| 1485 | + }, |
| 1486 | + { |
| 1487 | + "cell_type": "code", |
| 1488 | + "execution_count": 4, |
| 1489 | + "metadata": {}, |
| 1490 | + "outputs": [ |
| 1491 | + { |
| 1492 | + "name": "stdout", |
| 1493 | + "output_type": "stream", |
| 1494 | + "text": [ |
| 1495 | + "First\n", |
| 1496 | + "Second\n", |
| 1497 | + "Third\n" |
| 1498 | + ] |
| 1499 | + } |
| 1500 | + ], |
| 1501 | + "source": [ |
| 1502 | + "print(nBS('aab')) # We can handle strings\n", |
| 1503 | + "print(nBS(['b', 'b'])) # And lists!\n", |
| 1504 | + "print(nBS('ccbcc'))" |
| 1505 | + ] |
| 1506 | + }, |
| 1507 | + { |
| 1508 | + "cell_type": "markdown", |
| 1509 | + "metadata": {}, |
| 1510 | + "source": [ |
| 1511 | + "The results make intuitive sense. The first bag has a high proportion of 'a's, the second a high proportion of 'b's and the third a high proportion of 'c's, and the classifier's predictions confirm this intuition.\n", |
| 1512 | + "\n", |
| 1513 | + "Note that the simple classifier doesn't distinguish between discrete and continuous values; it just takes whatever it is given. Also, the `simple` option of `NaiveBayesLearner` overrides the `continuous` argument: `NaiveBayesLearner(d, simple=True, continuous=False)` still creates a simple classifier." |
1389 | 1514 | ]
|
1390 | 1515 | },
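| + { |
| + "cell_type": "markdown", |
| + "metadata": {}, |
| + "source": [ |
| + "As a sanity check, we can sketch the computation by hand. This assumes `CountingProbDist` supports item access for letter probabilities (e.g. `dist1['a']`); the unnormalized posterior of each bag for the string 'aab' is the bag's prior times the product of its letter probabilities, and the bag with the highest score should match the classifier's answer:" |
| + ] |
| + }, |
| + { |
| + "cell_type": "code", |
| + "execution_count": null, |
| + "metadata": {}, |
| + "outputs": [], |
| + "source": [ |
| + "# Unnormalized posterior for 'aab' under each bag: prior * product of letter probabilities\n", |
| + "scores = {'First': 0.5 * dist1['a'] * dist1['a'] * dist1['b'],\n", |
| + "          'Second': 0.3 * dist2['a'] * dist2['a'] * dist2['b'],\n", |
| + "          'Third': 0.2 * dist3['a'] * dist3['a'] * dist3['b']}\n", |
| + "print(max(scores, key=scores.get))  # should agree with nBS('aab')" |
| + ] |
| + }, |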
|
1391 | 1516 | {
|
|
1423 | 1548 | },
|
1424 | 1549 | {
|
1425 | 1550 | "cell_type": "code",
|
1426 |
| - "execution_count": 37, |
| 1551 | + "execution_count": null, |
1427 | 1552 | "metadata": {
|
1428 | 1553 | "collapsed": true
|
1429 | 1554 | },
|
1430 | 1555 | "outputs": [],
|
1431 | 1556 | "source": [
|
1432 |
| - "%psource PerceptronLearner" |
| 1557 | + "psource(PerceptronLearner)" |
1433 | 1558 | ]
|
1434 | 1559 | },
|
1435 | 1560 | {
|
|