ML Lec4

The document discusses concept learning, focusing on the Find-S algorithm and the Candidate-Elimination algorithm for hypothesis generation. It explains how to approximate Boolean-valued functions from training examples and introduces the concept of version space, which contains hypotheses consistent with training data. The document also highlights the limitations of the Find-S algorithm and the importance of handling errors in training examples for effective learning.


Lecture 4

1. What is a concept learning task
2. Find-S algorithm
3. Find the version space – List-Then-Eliminate algorithm

Introduction: Concept Learning

• Much of learning involves acquiring general concepts from specific training examples.
• Each concept can be thought of as a Boolean-valued function defined over a set (e.g., a function defined over all animals, whose value is true for birds and false for other animals).
• Concept learning is approximating a Boolean-valued function from examples.

A Concept Learning Task

• What is the general concept?
[The slide's table of EnjoySport training examples is not reproduced here.]

A Concept Learning Task - cont

• Many possible representations.
• Here, h is a conjunction of constraints on attributes.
• Each constraint can be
一 a specific value (e.g., Water = Warm)
一 don't care (e.g., "Water = ?")
一 no value allowed (e.g., "Water = ø")

For example,

  Sky    AirTemp  Humid  Wind    Water  Forecst
〈Sunny  ?        ?      Strong  ?      Same〉
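The three constraint types can be captured by a small match test. This is an illustrative sketch, not code from the slides; the tuple encoding, the name `satisfies`, and the use of "0" to stand for the ø constraint are assumptions:

```python
# A hypothesis is a tuple of per-attribute constraints: a specific value,
# "?" (don't care), or "0" (standing in for ø, "no value allowed").
def satisfies(h, x):
    """Return True iff instance x satisfies every constraint in hypothesis h."""
    for constraint, value in zip(h, x):
        if constraint == "0":                      # ø: satisfied by no value
            return False
        if constraint != "?" and constraint != value:
            return False
    return True

h = ("Sunny", "?", "?", "Strong", "?", "Same")
x = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
print(satisfies(h, x))  # → True
```

A hypothesis containing "0" in any position classifies every instance as negative, which is why a single ø is enough to represent the empty concept.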
A Concept Learning Task - cont

• Given:
一 Instances X: Possible days, each described by the attributes Sky, AirTemp, Humidity, Wind, Water, Forecast
一 Target function c: EnjoySport : X → {0,1}
一 Hypotheses H: Conjunctions of literals. E.g.〈?, Cold, High, ?, ?, ?〉.
一 Training examples D: Positive and negative examples of the target function〈x1, c(x1)〉, . . . ,〈xm, c(xm)〉
• Determine: A hypothesis h in H such that h(x) = c(x) for all x in D.

The inductive learning hypothesis:

Any hypothesis found to approximate the target function well over a sufficiently large set of training examples will also approximate the target function well over other unobserved examples.

General-to-Specific Ordering of Hypotheses

Instances, Hypotheses, and More-General-Than
[The slide's diagram relating instances X to hypotheses H under the more-general-than ordering is not reproduced here.]
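The more-general-than-or-equal-to relation can be sketched for conjunctive hypotheses as follows. This is an illustrative sketch, assuming hypotheses are tuples whose entries are a specific value, "?" (don't care), or "0" (no value allowed); the function name is not from the slides:

```python
def more_general_eq(h1, h2):
    """True iff h1 is more general than or equal to h2: every instance
    that satisfies h2 also satisfies h1 (conjunctive hypotheses)."""
    if "0" in h2:              # h2 covers no instance, so any h1 covers it
        return True
    # Otherwise h1 must be at least as permissive on every attribute.
    return all(c1 == "?" or c1 == c2 for c1, c2 in zip(h1, h2))

print(more_general_eq(("Sunny", "?"), ("Sunny", "Warm")))  # → True
print(more_general_eq(("Sunny", "Warm"), ("Sunny", "?")))  # → False
```

Note the relation is a partial order: the two hypotheses in the second call are simply incomparable once their roles are swapped with, say,〈?, Warm〉against〈Sunny, ?〉.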
2. Find-S Algorithm

• Initialize h to the most specific hypothesis in H
• For each positive training instance x
一 For each attribute constraint 𝑎𝑖 in h
   If the constraint 𝑎𝑖 in h is satisfied by x
   Then do nothing
   Else replace 𝑎𝑖 in h by the next more general constraint that is satisfied by x
• Output hypothesis h

Hypothesis Space Search by Find-S
[The slide's diagram of Find-S searching from the most specific hypothesis toward more general ones (instances X, hypotheses H) is not reproduced here.]
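The steps above can be sketched in Python. This is an illustrative implementation, not the slides' own code; the tuple encoding is an assumption, and the training data is the standard EnjoySport example that the later slides refer back to:

```python
def find_s(examples, n_attrs):
    """Find-S over conjunctive hypotheses. `examples` is a list of
    (instance, label) pairs; negative examples are ignored."""
    h = ["0"] * n_attrs                  # most specific hypothesis in H
    for x, label in examples:
        if not label:                    # Find-S ignores negative examples
            continue
        for i, (ci, xi) in enumerate(zip(h, x)):
            if ci == "0":
                h[i] = xi                # first positive: adopt its value
            elif ci != "?" and ci != xi:
                h[i] = "?"               # generalize to don't-care
    return tuple(h)

# Training data from the standard EnjoySport example.
data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
print(find_s(data, 6))  # → ('Sunny', 'Warm', '?', 'Strong', '?', '?')
```

Because each positive example forces at most a minimal generalization per attribute, the output is exactly the hypothesis the next slides recall.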

Complaints about Find-S

1. Can't tell whether it has learned the concept: Although Find-S will find a hypothesis consistent with the training data, it has no way to determine whether it has found the only hypothesis in H consistent with the data, or whether there are many other consistent hypotheses as well.

2. Picks a maximally specific h (why?): In case there are multiple hypotheses consistent with the training examples, Find-S will find the most specific.

Complaints about Find-S - cont

3. Can't tell when training data is inconsistent: In some cases, the training examples will contain at least some error or noise. Such inconsistent sets of training examples can severely mislead Find-S, given the fact that it ignores negative examples.

4. Depending on H, there might be several!: If there are several maximally specific consistent hypotheses, Find-S should be extended to allow it to backtrack on its choices of how to generalize the hypothesis, to accommodate the possibility that the target concept lies along a different branch of the partial ordering than the branch it has selected.
3. Find the Version Space

A hypothesis h is consistent with a set of training examples D of target concept c if and only if h(x) = c(x) for each training example〈x, c(x)〉in D.

The version space, VS_{H,D}, with respect to hypothesis space H and training examples D, is the subset of hypotheses from H consistent with all training examples in D.

List-Then-Eliminate Algorithm

1. VersionSpace ← a list containing every hypothesis in H
2. For each training example〈x, c(x)〉, remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
3. Output the list of hypotheses in VersionSpace
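The three steps can be sketched directly, since H for conjunctive hypotheses is finite. This is an illustrative sketch (tuple encoding, "0" for ø, and all names are assumptions); a tiny two-attribute space keeps the enumeration readable:

```python
from itertools import product

def list_then_eliminate(examples, attribute_values):
    """List-Then-Eliminate: enumerate every hypothesis in H, then drop
    any hypothesis inconsistent with some training example."""
    def classifies_positive(h, x):
        # A hypothesis containing "0" classifies every instance negative.
        return "0" not in h and all(c == "?" or c == v for c, v in zip(h, x))

    # Step 1: each attribute constraint is a specific value, "?", or "0".
    choices = [list(vals) + ["?", "0"] for vals in attribute_values]
    version_space = list(product(*choices))
    # Step 2: eliminate hypotheses that misclassify some example.
    for x, label in examples:
        version_space = [h for h in version_space
                         if classifies_positive(h, x) == label]
    return version_space  # Step 3

attrs = [["Sunny", "Rainy"], ["Warm", "Cold"]]
data = [(("Sunny", "Warm"), True), (("Rainy", "Cold"), False)]
print(len(list_then_eliminate(data, attrs)))  # → 3
```

The surviving hypotheses are〈Sunny, Warm〉,〈Sunny, ?〉, and〈?, Warm〉; the brute-force enumeration in step 1 is exactly why the algorithm is impractical for realistic hypothesis spaces.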

List-Then-Eliminate Algorithm - cont

• Recall that the Find-S algorithm outputs the hypothesis
h =〈Sunny, Warm, ?, Strong, ?, ?〉

• In fact, there are six different hypotheses from H that are consistent with these training examples.

• The Candidate-Elimination algorithm represents the version space by storing only its most general members (G) and its most specific members (S).

• Given only these two sets S and G, it is possible to enumerate all members of the version space as needed by generating the hypotheses that lie between these two sets in the general-to-specific partial ordering over hypotheses.
Representing Version Spaces

The General boundary, G, of version space VS_{H,D} is the set of its maximally general members of H consistent with D.

The Specific boundary, S, of version space VS_{H,D} is the set of its maximally specific members of H consistent with D.

Candidate Elimination Algorithm

G ← maximally general hypotheses in H
S ← maximally specific hypotheses in H

For each training example d, do
• If d is a positive example
一 Remove from G any hypothesis inconsistent with d
一 For each hypothesis s in S that is not consistent with d
   * Remove s from S
   * Add to S all minimal generalizations h of s such that h is consistent with d, and some member of G is more general than h
   * Remove from S any hypothesis that is more general than another hypothesis in S

• If d is a negative example
一 Remove from S any hypothesis inconsistent with d
一 For each hypothesis g in G that is not consistent with d
   * Remove g from G
   * Add to G all minimal specializations h of g such that h is consistent with d, and some member of S is more specific than h
   * Remove from G any hypothesis that is less general than another hypothesis in G

Example Trace
[The slides' worked trace of the S and G boundaries over the EnjoySport training examples is not reproduced here.]
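Both branches of the algorithm can be sketched together for conjunctive hypotheses. This is an illustrative implementation, not the slides' code: the tuple encoding, the helper names, and the attribute domains are assumptions, while the training data and the final S and G match the standard EnjoySport example the slides trace:

```python
# Candidate-Elimination for conjunctive hypotheses (illustrative sketch).
# A hypothesis is a tuple of constraints: a value, "?" (any), or "0" (none).

def satisfies(h, x):
    return all(c == "?" or c == v for c, v in zip(h, x))

def more_general_eq(h1, h2):
    """h1 covers at least every instance h2 covers."""
    if "0" in h2:
        return True
    return all(c1 == "?" or c1 == c2 for c1, c2 in zip(h1, h2))

def min_generalize(s, x):
    """The unique minimal generalization of s that covers instance x."""
    return tuple(xi if si == "0" else (si if si == xi else "?")
                 for si, xi in zip(s, x))

def min_specializations(g, x, values):
    """Minimal specializations of g that exclude instance x."""
    return [g[:i] + (v,) + g[i + 1:]
            for i, (gi, xi) in enumerate(zip(g, x)) if gi == "?"
            for v in values[i] if v != xi]

def candidate_elimination(examples, values):
    n = len(values)
    S, G = {("0",) * n}, {("?",) * n}
    for x, label in examples:
        if label:                                  # positive example d
            G = {g for g in G if satisfies(g, x)}
            S = {min_generalize(s, x) if not satisfies(s, x) else s for s in S}
            S = {s for s in S if any(more_general_eq(g, s) for g in G)}
            S = {s for s in S                      # keep only minimal members
                 if not any(more_general_eq(s, t) and s != t for t in S)}
        else:                                      # negative example d
            S = {s for s in S if not satisfies(s, x)}
            newG = set()
            for g in G:
                if not satisfies(g, x):
                    newG.add(g)                    # g already rejects x
                else:
                    newG |= {h for h in min_specializations(g, x, values)
                             if any(more_general_eq(h, s) for s in S)}
            G = {g for g in newG                   # keep only maximal members
                 if not any(more_general_eq(h, g) and h != g for h in newG)}
    return S, G

# Attribute domains and training data of the standard EnjoySport example.
values = [["Sunny", "Cloudy", "Rainy"], ["Warm", "Cold"], ["Normal", "High"],
          ["Strong", "Weak"], ["Warm", "Cool"], ["Same", "Change"]]
data = [
    (("Sunny", "Warm", "Normal", "Strong", "Warm", "Same"), True),
    (("Sunny", "Warm", "High",   "Strong", "Warm", "Same"), True),
    (("Rainy", "Cold", "High",   "Strong", "Warm", "Change"), False),
    (("Sunny", "Warm", "High",   "Strong", "Cool", "Change"), True),
]
S, G = candidate_elimination(data, values)
print(S)  # → {('Sunny', 'Warm', '?', 'Strong', '?', '?')}
print(G)  # contains ('Sunny','?','?','?','?','?') and ('?','Warm','?','?','?','?')
```

For conjunctive hypotheses the minimal generalization in the positive branch is unique, which is why S stays a singleton here; the negative branch is where G fans out and is then pruned back to its maximally general members.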
What Next Training Example?
[The slides posing candidate query examples against the version space are not reproduced here.]
Remarks

• The version space learned by the Candidate-Elimination algorithm will converge toward the hypothesis that correctly describes the target concept, provided that
1. there are no errors in the training examples, and
2. there is some hypothesis in H that correctly describes the target concept.
