Knowledge Notebook: Version Space Learning #598

Merged
merged 2 commits into from
Jul 30, 2017
286 changes: 233 additions & 53 deletions knowledge.ipynb
@@ -29,7 +29,8 @@
"## CONTENTS\n",
"\n",
"* Overview\n",
"* Current-Best Learning"
"* Current-Best Learning\n",
"* Version-Space Learning"
]
},
{
@@ -267,7 +268,7 @@
"name": "stdout",
"output_type": "stream",
"text": [
"[{'Species': 'Cat', 'Rain': '!No'}, {'Coat': 'Yes', 'Species': 'Dog', 'Rain': 'Yes'}, {'Coat': 'Yes', 'Species': 'Cat'}]\n"
"[{'Species': 'Cat', 'Rain': '!No'}, {'Coat': 'Yes', 'Rain': 'Yes'}, {'Coat': 'Yes'}]\n"
]
}
],
@@ -304,6 +305,27 @@
"![restaurant](images/restaurant.png)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will use the helper function `r_example` to build each example as a dictionary:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"def r_example(Alt, Bar, Fri, Hun, Pat, Price, Rain, Res, Type, Est, GOAL):\n",
" return {'Alt': Alt, 'Bar': Bar, 'Fri': Fri, 'Hun': Hun, 'Pat': Pat,\n",
" 'Price': Price, 'Rain': Rain, 'Res': Res, 'Type': Type, 'Est': Est,\n",
" 'GOAL': GOAL}"
]
},
{
"cell_type": "markdown",
"metadata": {
@@ -315,60 +337,25 @@
},
{
"cell_type": "code",
"execution_count": 6,
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"restaurant = [\n",
" {'Alt': 'Yes', 'Bar': 'No', 'Fri': 'No', 'Hun': 'Yes', 'Pat': 'Some',\n",
" 'Price': '$$$', 'Rain': 'No', 'Res': 'Yes', 'Type': 'French', 'Est': '0-10',\n",
" 'GOAL': True},\n",
"\n",
" {'Alt': 'Yes', 'Bar': 'No', 'Fri': 'No', 'Hun': 'Yes', 'Pat': 'Full',\n",
" 'Price': '$', 'Rain': 'No', 'Res': 'No', 'Type': 'Thai', 'Est': '30-60',\n",
" 'GOAL': False},\n",
"\n",
" {'Alt': 'No', 'Bar': 'Yes', 'Fri': 'No', 'Hun': 'No', 'Pat': 'Some',\n",
" 'Price': '$', 'Rain': 'No', 'Res': 'No', 'Type': 'Burger', 'Est': '0-10',\n",
" 'GOAL': True},\n",
"\n",
" {'Alt': 'Yes', 'Bar': 'No', 'Fri': 'Yes', 'Hun': 'Yes', 'Pat': 'Full',\n",
" 'Price': '$', 'Rain': 'Yes', 'Res': 'No', 'Type': 'Thai', 'Est': '10-30',\n",
" 'GOAL': True},\n",
"\n",
" {'Alt': 'Yes', 'Bar': 'No', 'Fri': 'Yes', 'Hun': 'No', 'Pat': 'Full',\n",
" 'Price': '$$$', 'Rain': 'No', 'Res': 'Yes', 'Type': 'French', 'Est': '>60',\n",
" 'GOAL': False},\n",
"\n",
" {'Alt': 'No', 'Bar': 'Yes', 'Fri': 'No', 'Hun': 'Yes', 'Pat': 'Some',\n",
" 'Price': '$$', 'Rain': 'Yes', 'Res': 'Yes', 'Type': 'Italian', 'Est': '0-10',\n",
" 'GOAL': True},\n",
"\n",
" {'Alt': 'No', 'Bar': 'Yes', 'Fri': 'No', 'Hun': 'No', 'Pat': 'None',\n",
" 'Price': '$', 'Rain': 'Yes', 'Res': 'No', 'Type': 'Burger', 'Est': '0-10',\n",
" 'GOAL': False},\n",
"\n",
" {'Alt': 'No', 'Bar': 'No', 'Fri': 'No', 'Hun': 'Yes', 'Pat': 'Some',\n",
" 'Price': '$$', 'Rain': 'Yes', 'Res': 'Yes', 'Type': 'Thai', 'Est': '0-10',\n",
" 'GOAL': True},\n",
"\n",
" {'Alt': 'No', 'Bar': 'Yes', 'Fri': 'Yes', 'Hun': 'No', 'Pat': 'Full',\n",
" 'Price': '$', 'Rain': 'Yes', 'Res': 'No', 'Type': 'Burger', 'Est': '>60',\n",
" 'GOAL': False},\n",
"\n",
" {'Alt': 'Yes', 'Bar': 'Yes', 'Fri': 'Yes', 'Hun': 'Yes', 'Pat': 'Full',\n",
" 'Price': '$$$', 'Rain': 'No', 'Res': 'Yes', 'Type': 'Italian', 'Est': '10-30',\n",
" 'GOAL': False},\n",
"\n",
" {'Alt': 'No', 'Bar': 'No', 'Fri': 'No', 'Hun': 'No', 'Pat': 'None',\n",
" 'Price': '$', 'Rain': 'No', 'Res': 'No', 'Type': 'Thai', 'Est': '0-10',\n",
" 'GOAL': False},\n",
"\n",
" {'Alt': 'Yes', 'Bar': 'Yes', 'Fri': 'Yes', 'Hun': 'Yes', 'Pat': 'Full',\n",
" 'Price': '$', 'Rain': 'No', 'Res': 'No', 'Type': 'Burger', 'Est': '30-60',\n",
" 'GOAL': True}\n",
" r_example('Yes', 'No', 'No', 'Yes', 'Some', '$$$', 'No', 'Yes', 'French', '0-10', True),\n",
" r_example('Yes', 'No', 'No', 'Yes', 'Full', '$', 'No', 'No', 'Thai', '30-60', False),\n",
" r_example('No', 'Yes', 'No', 'No', 'Some', '$', 'No', 'No', 'Burger', '0-10', True),\n",
" r_example('Yes', 'No', 'Yes', 'Yes', 'Full', '$', 'Yes', 'No', 'Thai', '10-30', True),\n",
" r_example('Yes', 'No', 'Yes', 'No', 'Full', '$$$', 'No', 'Yes', 'French', '>60', False),\n",
" r_example('No', 'Yes', 'No', 'Yes', 'Some', '$$', 'Yes', 'Yes', 'Italian', '0-10', True),\n",
" r_example('No', 'Yes', 'No', 'No', 'None', '$', 'Yes', 'No', 'Burger', '0-10', False),\n",
" r_example('No', 'No', 'No', 'Yes', 'Some', '$$', 'Yes', 'Yes', 'Thai', '0-10', True),\n",
" r_example('No', 'Yes', 'Yes', 'No', 'Full', '$', 'Yes', 'No', 'Burger', '>60', False),\n",
" r_example('Yes', 'Yes', 'Yes', 'Yes', 'Full', '$$$', 'No', 'Yes', 'Italian', '10-30', False),\n",
" r_example('No', 'No', 'No', 'No', 'None', '$', 'No', 'No', 'Thai', '0-10', False),\n",
" r_example('Yes', 'Yes', 'Yes', 'Yes', 'Full', '$', 'No', 'No', 'Burger', '30-60', True)\n",
"]"
]
},
@@ -381,7 +368,7 @@
},
{
"cell_type": "code",
"execution_count": 7,
"execution_count": 8,
"metadata": {},
"outputs": [
{
@@ -419,14 +406,14 @@
},
{
"cell_type": "code",
"execution_count": 8,
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[{'Type': '!Thai', 'Fri': '!Yes', 'Alt': 'Yes'}, {'Fri': 'No', 'Type': 'Burger', 'Pat': '!None', 'Alt': 'No'}, {'Fri': 'Yes', 'Est': '10-30', 'Pat': 'Full', 'Rain': 'Yes', 'Res': 'No', 'Bar': 'No', 'Price': '$'}, {'Fri': 'No', 'Est': '0-10', 'Pat': 'Some', 'Res': 'Yes', 'Type': 'Italian', 'Alt': 'No'}, {'Fri': 'No', 'Pat': 'Some', 'Res': 'Yes', 'Type': 'Thai', 'Hun': 'Yes', 'Alt': 'No', 'Price': '$$'}, {'Fri': 'Yes', 'Pat': 'Full', 'Rain': 'No', 'Alt': 'Yes', 'Type': 'Burger', 'Hun': 'Yes', 'Bar': 'Yes', 'Price': '$'}]\n"
"[{'Res': '!No', 'Fri': '!Yes', 'Alt': 'Yes'}, {'Bar': 'Yes', 'Fri': 'No', 'Rain': 'No', 'Hun': 'No'}, {'Bar': 'No', 'Price': '$', 'Fri': 'Yes'}, {'Res': 'Yes', 'Price': '$$', 'Rain': 'Yes', 'Alt': 'No', 'Est': '0-10', 'Fri': 'No', 'Hun': 'Yes', 'Bar': 'Yes'}, {'Fri': 'No', 'Pat': 'Some', 'Price': '$$', 'Rain': 'Yes', 'Hun': 'Yes'}, {'Est': '30-60', 'Res': 'No', 'Price': '$', 'Fri': 'Yes', 'Hun': 'Yes'}]\n"
]
}
],
@@ -440,6 +427,199 @@
"source": [
"It might be quite complicated, with many disjunctions if we are unlucky, but it will always be correct, as long as a correct hypothesis exists."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## [VERSION-SPACE LEARNING](https://github.com/aimacode/aima-pseudocode/blob/master/md/Version-Space-Learning.md)\n",
"\n",
"### Overview\n",
"\n",
"**Version-Space Learning** is a general method of learning in logic-based domains. We generate the set of all possible hypotheses in the domain and then iteratively remove the hypotheses that are inconsistent with the examples. The set of remaining hypotheses is called the **version space**. Because hypotheses are removed until only those consistent with all the examples remain, the algorithm is sometimes called the **candidate elimination** algorithm.\n",
"\n",
"After we update the set on an example, every hypothesis left in the set is consistent with that example. So, once all the examples have been processed, every remaining hypothesis is consistent with all of them. That means we can pick any hypothesis from the set and it will agree with every training example."
]
},
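{
"cell_type": "markdown",
"metadata": {},
"source": [
"The elimination loop described above can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the code in `knowledge.py`: the helper names `disjunct_matches`, `predicts` and `eliminate` are hypothetical, the toy `Sky`/`Wind` examples are made up, and we assume that a value prefixed with `!` means any value except that one, as the outputs earlier in this notebook suggest."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical helpers for illustration only (not part of knowledge.py).\n",
"def disjunct_matches(example, disjunct):\n",
"    # A disjunct is a conjunction of attribute tests; '!v' accepts any value except v.\n",
"    for attr, value in disjunct.items():\n",
"        if value.startswith('!'):\n",
"            if example[attr] == value[1:]:\n",
"                return False\n",
"        elif example[attr] != value:\n",
"            return False\n",
"    return True\n",
"\n",
"\n",
"def predicts(example, hypothesis):\n",
"    # A hypothesis is a list of disjuncts; it predicts True if any disjunct matches.\n",
"    return any(disjunct_matches(example, d) for d in hypothesis)\n",
"\n",
"\n",
"def eliminate(hypotheses, examples):\n",
"    # Keep only the hypotheses whose prediction equals the GOAL of every example.\n",
"    for e in examples:\n",
"        hypotheses = [h for h in hypotheses if predicts(e, h) == e['GOAL']]\n",
"    return hypotheses\n",
"\n",
"\n",
"# Made-up toy data: two examples and three candidate hypotheses.\n",
"toy_examples = [\n",
"    {'Sky': 'Sunny', 'Wind': 'Weak', 'GOAL': True},\n",
"    {'Sky': 'Rainy', 'Wind': 'Weak', 'GOAL': False}\n",
"]\n",
"candidates = [[{'Sky': 'Sunny'}], [{'Wind': 'Weak'}], [{'Sky': '!Rainy'}]]\n",
"\n",
"# The candidate that tests only Wind is removed by the second example.\n",
"print(eliminate(candidates, toy_examples))"
]
},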
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"### Implementation\n",
"\n",
"The set of hypotheses is represented by a list, and each hypothesis is represented by a list of dictionaries, each dictionary being one disjunct (a conjunction of attribute tests). For each of the given examples we update the version space with the function `version_space_update`. In the end, we return the version space.\n",
"\n",
"Before we can start updating the version space, we need to generate it. We do that with the `all_hypotheses` function, which builds a list of all the possible hypotheses (including hypotheses with disjunctions). The function works in three steps: first it finds the possible values for each attribute (using `values_table`), then it builds all the attribute combinations (and adds them to the hypotheses set), and finally it builds all the combinations of disjunctions (whose disjuncts are the hypotheses built from the attribute combinations).\n",
"\n",
"You can read the code for all the functions by running the cells below:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%psource version_space_learning"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%psource version_space_update"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%psource all_hypotheses"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%psource values_table"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%psource build_attr_combinations"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%psource build_h_combinations"
]
},
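{
"cell_type": "markdown",
"metadata": {},
"source": [
"The three generation steps described above (value table, attribute combinations, combinations of disjuncts) might look roughly like the sketch below. The names `possible_values`, `conjunctions` and `disjunctions`, the `max_size` cap and the toy examples are all hypothetical and are not part of `knowledge.py`. The sketch also leaves out negated values such as `'!No'`, so it produces a smaller space than `all_hypotheses` does. The cap on the number of disjuncts is there because the number of hypotheses grows very quickly with the number of conjunctions."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from itertools import combinations, product\n",
"\n",
"\n",
"# Hypothetical helpers for illustration only (not part of knowledge.py).\n",
"def possible_values(examples):\n",
"    # Step 1: map each attribute (except GOAL) to the values seen in the examples.\n",
"    values = {}\n",
"    for e in examples:\n",
"        for attr, v in e.items():\n",
"            if attr != 'GOAL':\n",
"                values.setdefault(attr, set()).add(v)\n",
"    return {attr: sorted(vs) for attr, vs in values.items()}\n",
"\n",
"\n",
"def conjunctions(values):\n",
"    # Step 2: every non-empty choice of attributes, with one value picked for each.\n",
"    attrs = sorted(values)\n",
"    result = []\n",
"    for k in range(1, len(attrs) + 1):\n",
"        for chosen in combinations(attrs, k):\n",
"            for picked in product(*(values[a] for a in chosen)):\n",
"                result.append(dict(zip(chosen, picked)))\n",
"    return result\n",
"\n",
"\n",
"def disjunctions(conjs, max_size):\n",
"    # Step 3: every hypothesis made of 1 to max_size distinct conjunctions.\n",
"    return [list(c) for k in range(1, max_size + 1)\n",
"            for c in combinations(conjs, k)]\n",
"\n",
"\n",
"# Made-up toy data with two attributes of two values each.\n",
"toy_examples = [\n",
"    {'Sky': 'Sunny', 'Wind': 'Strong', 'GOAL': True},\n",
"    {'Sky': 'Rainy', 'Wind': 'Weak', 'GOAL': False}\n",
"]\n",
"conjs = conjunctions(possible_values(toy_examples))\n",
"hypotheses = disjunctions(conjs, max_size=2)\n",
"print(len(conjs), len(hypotheses))"
]
},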
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Example\n",
"\n",
"Since the set of all possible hypotheses is enormous and would take a long time to generate, we will come up with another, even smaller domain. We will try to predict whether we will have a party, given the availability of pizza and soda:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"party = [\n",
" {'Pizza': 'Yes', 'Soda': 'No', 'GOAL': True},\n",
" {'Pizza': 'Yes', 'Soda': 'Yes', 'GOAL': True},\n",
" {'Pizza': 'No', 'Soda': 'No', 'GOAL': False}\n",
"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Even though it is obvious that no pizza means no party, we will run the algorithm and see which other hypotheses are valid."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"True\n",
"True\n",
"False\n"
]
}
],
"source": [
"V = version_space_learning(party)\n",
"for e in party:\n",
" guess = False\n",
" for h in V:\n",
" if guess_value(e, h):\n",
" guess = True\n",
" break\n",
"\n",
" print(guess)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The results are correct for the given examples. Let's take a look at the version space:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"959\n",
"[{'Pizza': 'Yes'}, {'Soda': 'Yes'}]\n",
"[{'Pizza': 'Yes'}, {'Pizza': '!No', 'Soda': 'No'}]\n",
"True\n"
]
}
],
"source": [
"print(len(V))\n",
"\n",
"print(V[5])\n",
"print(V[10])\n",
"\n",
"print([{'Pizza': 'Yes'}] in V)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"There are almost 1000 hypotheses in the set. You can see that even with just two attributes the version space is very large.\n",
"\n",
"Our initial prediction is indeed in the set of hypotheses. The two other hypotheses we printed are also consistent with the examples, since both include the \"Pizza is available\" disjunct."
]
}
],
"metadata": {