
Instance Based Learning

• k-Nearest Neighbor
• Locally weighted regression
• Radial basis functions
• Case-based reasoning
• Lazy and eager learning

Instance-Based Learning
Key idea: just store all training examples $\langle x_i, f(x_i) \rangle$

Nearest neighbor (1-Nearest Neighbor):
• Given query instance $x_q$, locate the nearest training example $x_n$, and estimate

  $\hat{f}(x_q) \leftarrow f(x_n)$

k-Nearest Neighbor:
• Given $x_q$, take a vote among its k nearest neighbors (if the target function is discrete-valued)
• Take the mean of the f values of its k nearest neighbors (if real-valued):

  $\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} f(x_i)}{k}$
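A minimal sketch of both prediction rules (an illustration, not part of the original slides), assuming NumPy arrays, Euclidean distance, and integer class labels for the discrete case:

```python
import numpy as np

def knn_predict(X_train, y_train, x_q, k=5, classify=True):
    """Predict f(x_q) from the k nearest stored training examples."""
    # distance from the query to every stored instance
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]              # indices of the k nearest neighbors
    if classify:
        # discrete-valued target: majority vote among the neighbors
        values, counts = np.unique(y_train[nearest], return_counts=True)
        return values[np.argmax(counts)]
    # real-valued target: mean of the neighbors' f values
    return y_train[nearest].mean()
```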
When to Consider Nearest Neighbor
• Instances map to points in $\mathbb{R}^n$
• Fewer than 20 attributes per instance
• Lots of training data
Advantages
• Training is very fast
• Learn complex target functions
• Do not lose information
Disadvantages
• Slow at query time
• Easily fooled by irrelevant attributes
k-NN Classification
[Figure: 5-Nearest Neighbor classification of a query point $x_q$, and the corresponding 1-NN decision surface over the instance space]
Behavior in the Limit
Define p(x) as the probability that instance x will be labeled 1 (positive) versus 0 (negative).

Nearest Neighbor:
• As the number of training examples approaches infinity, 1-NN approaches the Gibbs algorithm
  Gibbs: with probability p(x) predict 1, else 0

k-Nearest Neighbor:
• As the number of training examples approaches infinity and k grows large, k-NN approaches the Bayes optimal classifier
  Bayes optimal: if p(x) > 0.5 then predict 1, else 0
• Note: Gibbs has at most twice the expected error of the Bayes optimal classifier
Distance-Weighted k-NN
Might want to weight nearer neighbors more heavily ...

$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i}$

where

$w_i \equiv \frac{1}{d(x_q, x_i)^2}$

and $d(x_q, x_i)$ is the distance between $x_q$ and $x_i$

Note: now it makes sense to use all training examples instead of just the k nearest
→ Shepard's method
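A minimal sketch of the distance-weighted variant for a real-valued target, under the same assumptions as the k-NN sketch above; the small epsilon guarding against a zero distance is an implementation detail, not from the slides:

```python
import numpy as np

def distance_weighted_knn(X_train, y_train, x_q, k=5):
    """Weighted average of the k nearest neighbors' f values."""
    dists = np.linalg.norm(X_train - x_q, axis=1)
    nearest = np.argsort(dists)[:k]
    # w_i = 1 / d(x_q, x_i)^2; epsilon avoids division by zero when x_q equals a stored point
    w = 1.0 / (dists[nearest] ** 2 + 1e-12)
    return np.sum(w * y_train[nearest]) / np.sum(w)
```

Calling it with k = len(X_train) weights every training example, i.e. Shepard's method.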
Curse of Dimensionality
Imagine instances described by 20 attributes, but only 2 are relevant to the target function.

Curse of dimensionality: nearest neighbor is easily misled when X is high-dimensional.

One approach (a sketch follows this list):
• Stretch the jth axis by weight $z_j$, where $z_1, z_2, \ldots, z_n$ are chosen to minimize prediction error
• Use cross-validation to automatically choose the weights $z_1, z_2, \ldots, z_n$
• Note: setting $z_j$ to zero eliminates dimension j altogether
see (Moore and Lee, 1994)
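A minimal sketch of the axis-stretching idea, using leave-one-out k-NN error as the criterion and a greedy coordinate-wise search over a small weight grid; both choices, and the integer class labels, are illustrative assumptions rather than prescriptions from the slides:

```python
import numpy as np

def knn_error(X, y, z, k=5):
    """Leave-one-out error of k-NN after stretching axis j by weight z[j]."""
    Xs = X * z                                   # stretch each axis
    err = 0
    for i in range(len(X)):
        d = np.linalg.norm(Xs - Xs[i], axis=1)
        d[i] = np.inf                            # exclude the query point itself
        nearest = np.argsort(d)[:k]
        pred = np.bincount(y[nearest]).argmax()  # majority vote
        err += (pred != y[i])
    return err / len(X)

def stretch_axes(X, y, candidate_weights=(0.0, 0.5, 1.0, 2.0), k=5):
    """Greedy coordinate-wise choice of stretching weights z_1..z_n."""
    z = np.ones(X.shape[1])
    for j in range(X.shape[1]):
        scores = [knn_error(X, y, np.where(np.arange(len(z)) == j, w, z), k)
                  for w in candidate_weights]
        z[j] = candidate_weights[int(np.argmin(scores))]  # z_j = 0 drops dimension j
    return z
```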

Locally Weighted Regression
k-NN forms a local approximation to f for each query point $x_q$.
Why not form an explicit approximation $\hat{f}(x)$ for the region surrounding $x_q$?
• Fit a linear function to the k nearest neighbors
• Or fit a quadratic, etc.
• Produces a "piecewise approximation" to f

Several choices of error to minimize:
• Squared error over the k nearest neighbors

  $E_1(x_q) \equiv \frac{1}{2} \sum_{x \in k\ \text{nearest neighbors of}\ x_q} (f(x) - \hat{f}(x))^2$

• Distance-weighted squared error over the entire training set D

  $E_2(x_q) \equiv \frac{1}{2} \sum_{x \in D} (f(x) - \hat{f}(x))^2 \, K(d(x_q, x))$
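A minimal sketch of locally weighted linear regression at a query point, assuming a Gaussian kernel K over Euclidean distance and a weighted least-squares fit; the bandwidth parameter tau and the function names are illustrative:

```python
import numpy as np

def lwr_predict(X_train, y_train, x_q, tau=1.0):
    """Fit a weighted linear model around x_q and return its prediction there."""
    # Gaussian kernel weights: nearby training points count more
    d2 = np.sum((X_train - x_q) ** 2, axis=1)
    w = np.exp(-d2 / (2 * tau ** 2))
    # add a bias column so the local model is f_hat(x) = b0 + b . x
    A = np.hstack([np.ones((len(X_train), 1)), X_train])
    # weighted least squares: minimize sum_i w_i (f(x_i) - f_hat(x_i))^2
    sqrt_w = np.sqrt(w)[:, None]
    beta, *_ = np.linalg.lstsq(A * sqrt_w, y_train * sqrt_w.ravel(), rcond=None)
    return beta[0] + beta[1:] @ x_q
```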

Radial Basis Function Networks
• Global approximation to target function, in terms
of linear combination of local approximations
• Used, for example, in image classification
• A different kind of neural network
• Closely related to distance-weighted regression,
but “eager” instead of “lazy”

Radial Basis Function Networks
[Figure: an RBF network in which input attributes $a_1(x), a_2(x), \ldots, a_n(x)$ feed a layer of k kernel units, whose outputs are combined linearly with weights $w_0, w_1, \ldots, w_k$ to produce f(x)]

where $a_i(x)$ are the attributes describing instance x, and

$f(x) = w_0 + \sum_{u=1}^{k} w_u \, K_u(d(x_u, x))$

One common choice for $K_u(d(x_u, x))$ is the Gaussian

$K_u(d(x_u, x)) = e^{-\frac{1}{2\sigma_u^2} d^2(x_u, x)}$
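A minimal sketch of the forward pass of such a network, assuming Gaussian kernels centered on chosen instances $x_u$ with widths $\sigma_u$; the names are illustrative:

```python
import numpy as np

def rbf_forward(x, centers, sigmas, w0, w):
    """f(x) = w0 + sum_u w_u * exp(-d^2(x_u, x) / (2 sigma_u^2))."""
    d2 = np.sum((centers - x) ** 2, axis=1)        # squared distance to each kernel center
    activations = np.exp(-d2 / (2 * sigmas ** 2))  # one Gaussian unit per center
    return w0 + w @ activations
```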

Training RBF Networks
Q1: Which $x_u$ should be used for the kernel functions $K_u(d(x_u, x))$?
• Scatter them uniformly throughout instance space
• Or use training instances (reflects the instance distribution)

Q2: How to train the weights (assume Gaussian $K_u$)? A sketch follows this list.
• First choose the variance (and perhaps the mean) for each $K_u$
  – e.g., use EM
• Then hold the $K_u$ fixed and train the linear output layer
  – efficient methods exist to fit a linear function
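A minimal sketch of this two-stage recipe, substituting a simple center-picking and width heuristic for EM (an illustrative simplification) and linear least squares for the output layer:

```python
import numpy as np

def train_rbf(X, y, k=10, seed=0):
    """Stage 1: place/size the kernels; stage 2: fit the linear output layer."""
    rng = np.random.default_rng(seed)
    # Stage 1 (simplified): pick k training instances as kernel centers,
    # and use one shared width based on the spacing of the centers.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    pairwise = np.linalg.norm(centers[:, None] - centers[None, :], axis=2)
    sigma = pairwise[pairwise > 0].mean() / np.sqrt(2 * k)
    # Stage 2: with the kernels fixed, the output weights are a linear fit.
    d2 = np.sum((X[:, None] - centers[None, :]) ** 2, axis=2)          # (n, k)
    Phi = np.hstack([np.ones((len(X), 1)), np.exp(-d2 / (2 * sigma ** 2))])
    weights, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return centers, sigma, weights   # weights[0] is w0, weights[1:] are the w_u
```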

Case-Based Reasoning
Can apply instance-based learning even when $X \neq \mathbb{R}^n$
→ need a different "distance" metric
Case-Based Reasoning is instance-based learning applied to instances with symbolic logic descriptions:
((user-complaint error53-on-shutdown)
(cpu-model PowerPC)
(operating-system Windows)
(network-connection PCIA)
(memory 48meg)
(installed-applications Excel Netscape
VirusScan)
(disk 1Gig)
(likely-cause ???))

Case-Based Reasoning in CADET
CADET: 75 stored examples of mechanical devices
• each training example:
<qualitative function, mechanical structure>
• new query: desired function
• target value: mechanical structure for this function

Distance metric: match qualitative function descriptions

Case-Based Reasoning in CADET
A stored case: T-junction pipe
[Figure: structure diagram of a T-junction pipe with inflows $Q_1, T_1$ and $Q_2, T_2$ joining into outflow $Q_3, T_3$ (T = temperature, Q = waterflow); qualitative function graph in which $Q_1$ and $Q_2$ each influence $Q_3$ positively, and $T_1$ and $T_2$ each influence $T_3$ positively]

A problem specification: Water faucet
[Figure: structure to be determined (?); qualitative function graph relating control signals $C_c, C_h$, input flows $Q_c, Q_h$, and input temperatures $T_c, T_h$ to the mixed output flow $Q_m$ and temperature $T_m$]
Case-Based Reasoning in CADET
• Instances represented by rich structural descriptions
• Multiple cases retrieved (and combined) to form a solution to a new problem
• Tight coupling between case retrieval and problem solving
Bottom line:
• Simple matching of cases is useful for tasks such as answering help-desk queries
• Area of ongoing research

Lazy and Eager Learning
Lazy: wait for query before generalizing
• k-Nearest Neighbor, Case-Based Reasoning
Eager: generalize before seeing query
• Radial basis function networks, ID3, Backpropagation, etc.

Does it matter?
• Eager learner must create a single global approximation
• Lazy learner can create many local approximations
• If they use the same hypothesis space H, the lazy learner can effectively represent more complex functions (e.g., consider H = linear functions)

kd-trees (Moore)
• Eager version of k-Nearest Neighbor
• Idea: decrease time to find neighbors
– train by constructing a lookup (kd) tree
– recursively subdivide space
• ignore class of points
• lots of possible mechanisms: grid, maximum variance, etc.
– when looking for a nearest neighbor, search the tree
– the nearest neighbor can be found in log(n) steps
– the k nearest neighbors can be found by generalizing the process (still log(n) steps if k is constant); a sketch follows this list
• Slower training but faster classification
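A minimal sketch of building a kd tree and searching it for a nearest neighbor, splitting on the maximum-variance dimension at each level (one of the mechanisms listed above); the class and function names are illustrative:

```python
import numpy as np

class KDNode:
    def __init__(self, point, dim, left=None, right=None):
        self.point, self.dim, self.left, self.right = point, dim, left, right

def build_kd(points):
    """Recursively subdivide space, splitting on the maximum-variance dimension."""
    if len(points) == 0:
        return None
    dim = int(np.argmax(points.var(axis=0)))      # split dimension
    order = np.argsort(points[:, dim])
    mid = len(points) // 2                        # median point becomes this node
    return KDNode(points[order[mid]], dim,
                  build_kd(points[order[:mid]]),
                  build_kd(points[order[mid + 1:]]))

def nearest(node, x_q, best=None, best_d=np.inf):
    """Depth-first search with pruning: skip subtrees that cannot hold a closer point."""
    if node is None:
        return best, best_d
    d = np.linalg.norm(node.point - x_q)
    if d < best_d:
        best, best_d = node.point, d
    near, far = ((node.left, node.right) if x_q[node.dim] <= node.point[node.dim]
                 else (node.right, node.left))
    best, best_d = nearest(near, x_q, best, best_d)
    if abs(x_q[node.dim] - node.point[node.dim]) < best_d:   # far side may still be closer
        best, best_d = nearest(far, x_q, best, best_d)
    return best, best_d
```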
kd Tree

Instance Based Learning Summary
• Lazy versus Eager learning
– lazy: work done at testing time
– eager: work done at training time
– instance-based methods are sometimes lazy
• k-Nearest Neighbor (k-NN): lazy
– classify based on the k nearest neighbors
– key: determining the neighbors
– variations:
• distance-weighted combination
• locally weighted regression
– limitation: curse of dimensionality
• "stretching" dimensions
Instance Based Learning Summary
• kd-trees (eager version of k-NN)
– structure built at training time to quickly find neighbors
• Radial Basis Function (RBF) networks (eager)
– units are active in a region (sphere) of the space
– key: picking/training the kernel functions
• Case-Based Reasoning (CBR): generally lazy
– nearest neighbor when there are no continuous features
– may have other types of features:
• structural (graphs in CADET)

