Algorithms: K Nearest Neighbors
Tilani Gunawardena
Simple Analogy
• Tell me about your friends (who your neighbors are) and I will tell you who you are.
Instance-based Learning
• It's very similar to a desktop!!
KNN – Different names
• K-Nearest Neighbors
• Memory-Based Reasoning
• Example-Based Reasoning
• Instance-Based Learning
• Lazy Learning
What is KNN?
• A powerful classification algorithm used in pattern recognition.
• K nearest neighbors stores all available cases and classifies new cases based on a similarity measure (e.g., a distance function).
• One of the top data mining algorithms in use today.
• A non-parametric, lazy learning algorithm (an instance-based learning method).
KNN: Classification Approach
• An object (a new instance) is classified by a majority vote of its neighbors' classes.
• The object is assigned to the most common class among its K nearest neighbors, as measured by a distance function (a sketch of the vote follows).
Distance Measure
Figure: compute the distance between the test record and every training record, then choose the k "nearest" training records.
Distance Measure for Continuous Variables
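The formula graphic for this slide does not survive text extraction; the measures typically listed for continuous variables are the Euclidean, Manhattan, and Minkowski distances. A hedged reconstruction, not taken verbatim from the slide:

```latex
% Reconstruction: common distance measures for continuous attributes.
\begin{aligned}
\text{Euclidean:} \quad & D(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \\
\text{Manhattan:} \quad & D(X, Y) = \sum_{i=1}^{n} \lvert x_i - y_i \rvert \\
\text{Minkowski:} \quad & D(X, Y) = \Bigl( \sum_{i=1}^{n} \lvert x_i - y_i \rvert^{q} \Bigr)^{1/q}
\end{aligned}
```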
Distance Between Neighbors
• Calculate the distance between the new example (E) and all examples in the training set.
• Euclidean distance between two examples:
  – X = [x1, x2, x3, ..., xn]
  – Y = [y1, y2, y3, ..., yn]
  – The Euclidean distance between X and Y is defined as:

    $D(X, Y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$
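The same formula as a small code sketch (the sample vectors are George's and John's attribute values from the customer example below, with income in thousands):

```python
def euclidean(X, Y):
    # D(X, Y) = sqrt of the sum over i of (x_i - y_i)^2
    return sum((x - y) ** 2 for x, y in zip(X, Y)) ** 0.5

print(euclidean([35, 35, 3], [37, 50, 2]))  # ~15.17
```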
K-Nearest Neighbor Algorithm
• All instances correspond to points in an n-dimensional feature space.
• Each instance is represented by a set of numerical attributes.
• Each training example consists of a feature vector together with its associated class label.
• Classification is done by comparing the new point's feature vector with those of the K nearest points.
• Select the K examples nearest to E in the training set.
• Assign E to the most common class among its K nearest neighbors (a minimal end-to-end sketch follows the list).
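Putting the steps together, a brute-force K-NN classifier might look like this. It is an illustrative sketch, not an optimized implementation, and `knn_classify` is a made-up name:

```python
from collections import Counter

def knn_classify(training, new_point, k):
    # training: list of (feature_vector, class_label) pairs.
    # Sorting by squared distance gives the same order as Euclidean
    # distance, so the square root can be skipped.
    by_distance = sorted(
        training,
        key=lambda pair: sum((a - b) ** 2
                             for a, b in zip(pair[0], new_point)))
    k_labels = [label for _, label in by_distance[:k]]
    return Counter(k_labels).most_common(1)[0][0]
```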
3-KNN: Example (1)

Customer | Age | Income | No. credit cards | Class | Distance from John
George   | 35  | 35K    | 3                | No    | sqrt[(35-37)² + (35-50)² + (3-2)²] = 15.16
Rachel   | 22  | 50K    | 2                | Yes   | sqrt[(22-37)² + (50-50)² + (2-2)²] = 15
Steve    | 63  | 200K   | 1                | No    | sqrt[(63-37)² + (200-50)² + (1-2)²] = 152.23
Tom      | 59  | 170K   | 1                | No    | sqrt[(59-37)² + (170-50)² + (1-2)²] = 122
Anne     | 25  | 40K    | 4                | Yes   | sqrt[(25-37)² + (40-50)² + (4-2)²] = 15.74
John     | 37  | 50K    | 2                | ?     | -

With K = 3, John's nearest neighbors are Rachel (15), George (15.16), and Anne (15.74): two Yes votes against one No, so John is classified as YES.
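A quick check of this example in code (income in thousands; note the slide truncates rather than rounds some of the distances):

```python
neighbors = [("George", (35, 35, 3), "No"), ("Rachel", (22, 50, 2), "Yes"),
             ("Steve", (63, 200, 1), "No"), ("Tom", (59, 170, 1), "No"),
             ("Anne", (25, 40, 4), "Yes")]
john = (37, 50, 2)

def dist(vec):
    return sum((a - b) ** 2 for a, b in zip(vec, john)) ** 0.5

for name, vec, label in sorted(neighbors, key=lambda r: dist(r[1]))[:3]:
    print(name, label, round(dist(vec), 2))
# Rachel Yes 15.0, George No 15.17, Anne Yes 15.75 -> majority vote: Yes
```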
How to choose K?
• If K is too small, the classifier is sensitive to noise points.
• A larger K works well, but if K is too large the neighborhood may include many points from other classes.
• Rule of thumb: K < sqrt(n), where n is the number of training examples.
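In practice K is often picked by cross-validation. A hedged sketch, assuming scikit-learn is available and X, y hold the training features and labels (`best_k` is a hypothetical helper):

```python
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def best_k(X, y, k_max):
    # Score each candidate K with 5-fold cross-validation; keep the best.
    scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k),
                                 X, y, cv=5).mean()
              for k in range(1, k_max + 1)}
    return max(scores, key=scores.get)
```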
Figure: the same test record x with (a) its 1-nearest neighbor, (b) 2-nearest neighbors, and (c) 3-nearest neighbors. The K-nearest neighbors of a record x are the data points that have the k smallest distances to x.
KNN Feature Weighting
• Scale each feature by its importance for classification.
• We can use our prior knowledge about which features are more important.
• We can learn the weights w_k using cross-validation (to be covered later).
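A sketch of such a weighted distance, where each feature's squared difference is scaled by an importance weight (the weights shown are illustrative, not learned):

```python
def weighted_distance(X, Y, weights):
    # A larger w_k makes feature k count more toward the distance.
    return sum(w * (x - y) ** 2
               for w, x, y in zip(weights, X, Y)) ** 0.5

print(weighted_distance([35, 35], [37, 50], [1.0, 0.1]))  # income downweighted
```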
Feature Normalization
• The distance between neighbors can be dominated by attributes with relatively large values, e.g., the income of customers in our previous example.
• This arises when features are on different scales.
• It is important to normalize such features, e.g., by mapping values to numbers between 0 and 1.
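A small illustration of the problem and the fix, using George's and John's age and income with ranges assumed from the earlier customer table:

```python
def minmax_scale(value, lo, hi):
    # Map a raw value into [0, 1] given the feature's observed range.
    return (value - lo) / (hi - lo)

# Raw units: the income difference swamps the age difference entirely.
print(((35 - 37) ** 2 + (35_000 - 50_000) ** 2) ** 0.5)  # ~15000.0

# After scaling both features to [0, 1], age contributes again.
a = (minmax_scale(35, 22, 63), minmax_scale(35_000, 35_000, 200_000))
b = (minmax_scale(37, 22, 63), minmax_scale(50_000, 35_000, 200_000))
print(sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5)    # ~0.10
```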
Nominal/Categorical Data
• Distance works naturally with numerical attributes.
• Binary-valued categorical attributes can be encoded as 1 or 0.
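For illustration, a per-attribute overlap distance that follows this 1/0 encoding (a sketch; the attribute values are made up):

```python
def categorical_distance(X, Y):
    # Each matching attribute contributes 0, each mismatch contributes 1.
    return sum(0 if x == y else 1 for x, y in zip(X, Y))

print(categorical_distance(["red", "S", "cotton"], ["red", "M", "cotton"]))  # 1
```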
KNN Classification
Figure: scatter plot of Loan amount ($0 to $250,000) against Age (0 to 70), with each customer marked as Default or Non-Default.
KNN Classification – Distance
Age | Loan     | Default | Distance
25  | $40,000  | N       | 102000
35  | $60,000  | N       | 82000
45  | $80,000  | N       | 62000
20  | $20,000  | N       | 122000
35  | $120,000 | N       | 22000
52  | $18,000  | N       | 124000
23  | $95,000  | Y       | 47000
40  | $62,000  | Y       | 80000
60  | $100,000 | Y       | 42000
48  | $220,000 | Y       | 78000
33  | $150,000 | Y       | 8000
48  | $142,000 | ?       | -

$D = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$

With K = 1, the nearest neighbor of the unknown case (48, $142,000) is (33, $150,000) at distance 8000, so the case is classified as Default = Y.
KNN Classification – Standardized Distance
Age Loan Default Distance
0.125 0.11 N 0.7652
0.375 0.21 N 0.5200
0.625 0.31 N 0.3160
0 0.01 N 0.9245
0.375 0.50 N 0.3428
0.8 0.00 N 0.6220
0.075 0.38 Y 0.6669
0.5 0.22 Y 0.4437
1 0.41 Y 0.3650
0.7 1.00 Y 0.3861
0.325 0.65 Y 0.3771
0.7 0.61 ?
X Min
Xs
Max Min
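A sketch verifying the standardized table and the flipped prediction, with raw values taken from the previous slide and min-max ranges computed over the training rows:

```python
ages  = [25, 35, 45, 20, 35, 52, 23, 40, 60, 48, 33]
loans = [40_000, 60_000, 80_000, 20_000, 120_000, 18_000,
         95_000, 62_000, 100_000, 220_000, 150_000]
labels = ["N", "N", "N", "N", "N", "N", "Y", "Y", "Y", "Y", "Y"]

def scale(v, vals):
    return (v - min(vals)) / (max(vals) - min(vals))

qa, ql = scale(48, ages), scale(142_000, loans)
dists = [((scale(a, ages) - qa) ** 2 + (scale(l, loans) - ql) ** 2) ** 0.5
         for a, l in zip(ages, loans)]
print(round(dists[0], 4))                       # 0.7652, first table row
best = min(range(len(dists)), key=dists.__getitem__)
print(labels[best], round(dists[best], 4))      # N 0.316: prediction flips to N
```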
Strengths of KNN
• Very simple and intuitive.
• Can be applied to data from any distribution.
• Gives good classification when the number of samples is large enough.

Weaknesses of KNN
• Classifying a new example is slow: the distance from the new example to every stored example must be computed and compared.
• Choosing K may be tricky.
• Needs a large number of samples for good accuracy.