Linear Classification
Introduction to Classification using Linear Classifiers
Last modified 1/1/19
Why Start with Linear Classifiers?
• Linear classifiers are the simplest classifiers
• Simpler than decision trees
• The textbook starts with decision trees; we will use decision trees to introduce some of the more advanced concepts
• The learning method is linear regression
• We will use linear classifiers to introduce some concepts in classification
• A linear classifier also provides yet one more classification algorithm
• It also helps demonstrate how different algorithms form different types of decision boundaries
Classification: Definition
• Given a collection of records (training set)
• Each record contains a set of attributes and a class attribute
• Model the class attribute as a function of the other attributes
• Goal: previously unseen records should be assigned a class as accurately as possible (predictive accuracy)
• A test set is used to determine the accuracy of the model
• Usually the given labeled data is divided into training and test sets
• The training set is used to build the model and the test set to evaluate it (a holdout split is sketched below)
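Here is a minimal sketch of such a holdout split, assuming the labeled data is a Python list of (attributes, class) records; the 70/30 ratio is a common but illustrative choice, not one mandated by the slides:

    import random

    def train_test_split(records, test_fraction=0.3, seed=0):
        # Shuffle a copy so the split is random but reproducible
        shuffled = records[:]
        random.Random(seed).shuffle(shuffled)
        n_test = int(len(shuffled) * test_fraction)
        # Return (training set, test set)
        return shuffled[n_test:], shuffled[:n_test]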
Classification Examples
• Predicting tumor cells as benign or malignant
• Classifying credit card transactions as legitimate or fraudulent
• Classifying physical activities based on smartphone sensor data
• Categorizing news stories as finance, weather, entertainment, sports, etc.
Classification Techniques
• Decision Tree based Methods
• Memory based reasoning (Nearest Neighbor)
• Neural Networks
• Naïve Bayes
• Support Vector Machines
• Linear Regression (we start with this)
The Classification Problem
Given a collection of five instances of Katydids and five Grasshoppers, decide what type of insect the unlabeled example corresponds to. Katydid or Grasshopper?
[Figure: photographs of five Katydids, five Grasshoppers, and one unlabeled insect]
For any domain of interest, we can measure features
[Figure: insect diagram annotated with measurable features]
• Color {Green, Brown, Gray, Other}
• Has Wings?
• Abdomen Length
• Thorax Length
• Antennae Length
• Mandible Size
• Spiracle Diameter
• Leg Length
My_Collection
We can store features in a database.

Insect ID | Abdomen Length | Antennae Length | Insect Class
1         | 2.7            | 5.5             | Grasshopper
2         | 8.0            | 9.1             | Katydid
3         | 0.9            | 4.7             | Grasshopper
4         | 1.1            | 3.1             | Grasshopper
5         | 5.4            | 8.5             | Katydid
6         | 2.9            | 1.9             | Grasshopper
7         | 6.1            | 6.6             | Katydid
8         | 0.5            | 1.0             | Grasshopper
9         | 8.3            | 6.6             | Katydid
10        | 8.1            | 4.7             | Katydid

previously unseen instance = 11 | 5.1 | 7.0 | ???????

The classification problem can now be expressed as:
• Given a training database, predict the class label of a previously unseen instance
[Scatter plot: Antenna Length (y-axis, 1–10) vs. Abdomen Length (x-axis, 1–10), with Grasshoppers and Katydids plotted as separate groups]
[Same scatter plot of Antenna Length vs. Abdomen Length for Grasshoppers and Katydids]
These data objects are called…
• exemplars
• (training) examples
• instances
• tuples
We will return to the previous slide in two minutes. In the meantime, we are going to play a quick game.
Problem 1
[Figure: each example is a pair of bars; values give (left bar, right bar) heights]
Examples of class A: (3, 4), (1.5, 5), (6, 8), (2.5, 5)
Examples of class B: (5, 2.5), (5, 2), (8, 3), (4.5, 3)
Problem 1
Examples of class A: (3, 4), (1.5, 5), (6, 8), (2.5, 5)
Examples of class B: (5, 2.5), (5, 2), (8, 3), (4.5, 3)
What class is this object? (8, 1.5)
What about this one, A or B? (4.5, 7)
Problem 2
Oh! This one's hard!
Examples of class A: (4, 4), (5, 5), (6, 6), (3, 3)
Examples of class B: (5, 2.5), (2, 5), (5, 3), (2.5, 3)
What class is this object? (8, 1.5)
Problem 3
Examples of class A: (4, 4), (1, 5), (6, 3), (3, 7)
Examples of class B: (5, 6), (7, 5), (4, 8), (7, 7)
This one is really hard! What is this, A or B? (6, 6)
Why did we spend so much time with this game?
Because we wanted to show that almost all classification problems have a geometric interpretation; check out the next 3 slides…
Problem 1
[Scatter plot: Left Bar (y-axis, 1–10) vs. Right Bar (x-axis, 1–10); class A and class B examples fall on opposite sides of the diagonal]
Examples of class A: (3, 4), (1.5, 5), (6, 8), (2.5, 5)
Examples of class B: (5, 2.5), (5, 2), (8, 3), (4.5, 3)
Here is the rule again: if the left bar is smaller than the right bar, it is an A; otherwise it is a B.
Problem 2
[Scatter plot: Left Bar (y-axis, 1–10) vs. Right Bar (x-axis, 1–10); class A examples lie on the diagonal]
Examples of class A: (4, 4), (5, 5), (6, 6), (3, 3)
Examples of class B: (5, 2.5), (2, 5), (5, 3), (2.5, 3)
Let me look it up… here it is: the rule is, if the two bars are equal sizes, it is an A. Otherwise it is a B.
Problem 3
[Scatter plot: Left Bar (y-axis) vs. Right Bar (x-axis), axes 10–100]
Examples of class A: (4, 4), (1, 5), (6, 3), (3, 7)
Examples of class B: (5, 6), (7, 5), (4, 8), (7, 7)
The rule again: if the square of the sum of the two bars is less than or equal to 100, it is an A. Otherwise it is a B.
Problem 3
• An alternative rule that works on the original training data is X + Y ≤ 10 → Class A; else B
• Since bar lengths are non-negative, (X + Y)² ≤ 100 is equivalent to X + Y ≤ 10, so both rules agree on the training data (see the check below)
• Which is better?
• Ultimately, is one right and one wrong?
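A quick way to convince yourself is to test both rules on the eight training pairs from the Problem 3 slide; a minimal sketch in Python:

    # Problem 3 training data: (left bar, right bar)
    class_a = [(4, 4), (1, 5), (6, 3), (3, 7)]
    class_b = [(5, 6), (7, 5), (4, 8), (7, 7)]

    def rule_squared(x, y):
        # Original rule: (X + Y)^2 <= 100 -> class A
        return "A" if (x + y) ** 2 <= 100 else "B"

    def rule_linear(x, y):
        # Alternative rule: X + Y <= 10 -> class A
        return "A" if x + y <= 10 else "B"

    # Both rules label every training example identically
    labeled = [(p, "A") for p in class_a] + [(p, "B") for p in class_b]
    for (x, y), label in labeled:
        assert rule_squared(x, y) == rule_linear(x, y) == label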
Grasshoppers and Katydids
[Scatter plot: Antenna Length (y-axis, 1–10) vs. Abdomen Length (x-axis, 1–10) for Grasshoppers and Katydids]
previously unseen instance = 11 | 5.1 | 7.0 | ???????
• We can “project” the previously unseen instance into the same space as the database.
• We have now abstracted away the details of our particular problem. It will be much easier to talk about points in space.
[Scatter plot: Antenna Length vs. Abdomen Length with the unseen instance plotted among the Katydids and Grasshoppers]
Simple Linear Classifier
R.A. Fisher (1890–1962)
[Scatter plot: a line separating Katydids (above) from Grasshoppers (below)]
If the previously unseen instance is above the line
  then class is Katydid
  else class is Grasshopper
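The rule above is easy to state in code. A minimal sketch, assuming the separating line has already been fit and is written as antenna = m × abdomen + b; the slope m and intercept b here are illustrative placeholders, not values from the slide:

    # Hypothetical line parameters; a real fit would learn these from the data
    m, b = 1.0, 0.5

    def classify(abdomen_length, antenna_length):
        # Above the line antenna = m * abdomen + b -> Katydid, else Grasshopper
        if antenna_length > m * abdomen_length + b:
            return "Katydid"
        return "Grasshopper"

    print(classify(5.1, 7.0))  # the previously unseen instance (ID 11)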
Fitting a Model to Data
• One way to build a predictive model is to specify the structure of the model with some parameters left missing
• This is called parameter learning or parametric modeling
• Common in statistics, but it also covers data mining methods since the fields overlap
• Examples: linear regression, logistic regression, support vector machines
Linear Discriminant Functions
• The equation of a line is y = mx + b
• A classification function may look like:
  • Class +: if 1.0 × age − 1.5 × balance + 60 > 0
  • Class −: if 1.0 × age − 1.5 × balance + 60 ≤ 0
• The general form is f(x) = w₀ + w₁x₁ + w₂x₂ + …
• This is a parameterized model where the weights for each feature are the parameters
• The larger the magnitude of a weight, the more important the feature
• The separator is a line in 2D, a plane in 3D, and a hyperplane in more than 3D (a sketch of this function appears below)
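Here is a minimal sketch of such a discriminant function in Python, using the age/balance weights from the bullets above; the feature values in the example calls are made up for illustration:

    def f(age, balance):
        # f(x) = w0 + w1*x1 + w2*x2 with w0 = 60, w1 = 1.0, w2 = -1.5
        return 60 + 1.0 * age - 1.5 * balance

    def classify(age, balance):
        # Class + if f(x) > 0, class - otherwise
        return "+" if f(age, balance) > 0 else "-"

    print(classify(40, 80))  # f = 60 + 40 - 120 = -20 -> "-"
    print(classify(40, 50))  # f = 60 + 40 - 75  = 25  -> "+"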
What is the Best Separator?
[Scatter plot: several candidate separating lines drawn through the two classes]
Each separator has a different margin, which is the distance to the closest point. The orange line has the largest margin.
For support vector machines, the line/plane with the largest margin is best.
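The margin can be computed directly: the distance from a point (x, y) to the line w₁x + w₂y + w₀ = 0 is |w₁x + w₂y + w₀| / √(w₁² + w₂²). A minimal sketch; the separator coefficients are hypothetical, and the points are borrowed from the insect table for illustration:

    import math

    def margin(points, w0, w1, w2):
        # Margin = distance from the separator to the closest point
        norm = math.hypot(w1, w2)
        return min(abs(w1 * x + w2 * y + w0) / norm for x, y in points)

    # Hypothetical separator x - y + 0.5 = 0, checked against a few instances
    pts = [(2.7, 5.5), (8.0, 9.1), (0.9, 4.7), (5.4, 8.5)]
    print(margin(pts, 0.5, 1.0, -1.0))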
Scoring and Ranking Instances
• Sometimes we want to know which examples are most likely to belong to a class
• Linear discriminant functions can give us this
• Closer to the separator is less confident; further away is more confident
• In fact, the magnitude of f(x) gives us this, where larger values are more confident/likely (see the ranking sketch below)
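A minimal sketch of ranking by score, reusing the hypothetical f(age, balance) from the previous sketch; instances with the largest f(x) are ranked as most likely to be class +:

    def f(age, balance):
        return 60 + 1.0 * age - 1.5 * balance

    # Hypothetical (age, balance) instances, for illustration
    instances = [(40, 80), (40, 50), (25, 30), (60, 90)]
    # Sort by score, most confidently positive first
    ranked = sorted(instances, key=lambda inst: f(*inst), reverse=True)
    print(ranked)  # [(25, 30), (40, 50), (60, 90), (40, 80)]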
Class Probability Estimation
• Class probability estimation is also something you often want
• Often free with methods like decision trees
• More complicated with linear discriminant functions, since the distance from the separator is not a probability
• Logistic regression solves this
  • We will not go into the details in this class
  • Logistic regression determines a class probability estimate (the core idea is sketched below)
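The core idea of logistic regression, with the details omitted as in the slides, is to squash the unbounded score f(x) through the logistic (sigmoid) function so it lands in (0, 1); a minimal sketch:

    import math

    def sigmoid(z):
        # Maps any real score to (0, 1)
        return 1.0 / (1.0 + math.exp(-z))

    def prob_positive(score):
        # Interpret sigmoid(f(x)) as the estimated probability of class +
        return sigmoid(score)

    print(prob_positive(-20))  # far on the negative side -> near 0
    print(prob_positive(0))    # on the separator -> 0.5
    print(prob_positive(25))   # far on the positive side -> near 1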
Classification Accuracy
Confusion Matrix:

                               Predicted class
                        Katydid (1)    Grasshopper (0)
Actual  Katydid (1)        f11              f10
class   Grasshopper (0)    f01              f00

Accuracy = (number of correct predictions) / (total number of predictions)
         = (f11 + f00) / (f11 + f10 + f01 + f00)

Error rate = (number of wrong predictions) / (total number of predictions)
           = (f10 + f01) / (f11 + f10 + f01 + f00)
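A minimal sketch of these two formulas in Python, with the four confusion-matrix counts passed in directly; the example counts are made up:

    def accuracy(f11, f10, f01, f00):
        # Fraction of predictions that were correct
        return (f11 + f00) / (f11 + f10 + f01 + f00)

    def error_rate(f11, f10, f01, f00):
        # Fraction of predictions that were wrong; equals 1 - accuracy
        return (f10 + f01) / (f11 + f10 + f01 + f00)

    print(accuracy(40, 10, 5, 45))    # 85 / 100 = 0.85
    print(error_rate(40, 10, 5, 45))  # 15 / 100 = 0.15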
Confusion Matrix
• In a binary decision problem, a classifier labels examples as either positive or negative.
• Classifiers produce a confusion/contingency matrix, which shows four entries: TP (true positive), TN (true negative), FP (false positive), FN (false negative)

Confusion Matrix:

                      Predicted Positive (+)   Predicted Negative (−)
Actual Positive (Y)            TP                       FN
Actual Negative (N)            FP                       TN

For now you are responsible for knowing Recall and Precision (their definitions are sketched below).
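The slides do not spell the formulas out, but the standard definitions are Precision = TP / (TP + FP) and Recall = TP / (TP + FN); a minimal sketch:

    def precision(tp, fp):
        # Of everything predicted positive, what fraction was actually positive?
        return tp / (tp + fp)

    def recall(tp, fn):
        # Of everything actually positive, what fraction did we find?
        return tp / (tp + fn)

    print(precision(40, 5))  # 40 / 45 ≈ 0.889
    print(recall(40, 10))    # 40 / 50 = 0.8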
The simple linear classifier is defined for higher dimensional spaces… we can visualize it as being an n-dimensional hyperplane.
Which of the “Problems” can be solved by the Simple Linear Classifier?
[Figure: the three problem scatter plots, each with a linear separator attempted]
1) Perfect
2) Useless
3) Pretty Good
Problems that can be solved by a linear classifier are called linearly separable.
A Famous Problem
R. A. Fisher's Iris Dataset.
• 3 classes
• 50 of each class
The task is to classify Iris plants into one of 3 varieties using Petal Length and Petal Width.
[Images: Iris Setosa, Iris Versicolor, Iris Virginica]
Data: https://archive.ics.uci.edu/ml/datasets/iris
Iris Setosa, Iris Versicolor, Iris Virginica
We can generalize to N classes by fitting N−1 lines. In this case we first learn the line to discriminate between Setosa and Virginica/Versicolor, then we learn to approximately discriminate between Virginica and Versicolor.
[Scatter plot: Petal Width vs. Petal Length with the two learned lines and the Setosa, Versicolor, and Virginica clusters]
If petal width > 3.272 − (0.325 × petal length) then class = Virginica
Elseif petal width…
(a sketch of this rule appears below)
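A minimal sketch of the first learned rule in Python; the second branch is truncated on the slide ("Elseif petal width…"), so it is left as a placeholder here rather than invented:

    def classify_iris(petal_length, petal_width):
        # First learned line, taken from the slide
        if petal_width > 3.272 - 0.325 * petal_length:
            return "Virginica"
        # The slide's second rule ("Elseif petal width…") is truncated;
        # a second line would separate the remaining two classes here
        raise NotImplementedError("second discriminant line not given on the slide")

    print(classify_iris(6.0, 2.0))  # 2.0 > 3.272 - 1.95 = 1.322 -> Virginica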
How to Compare Classification Algorithms?
• What criteria do we care about? What matters?
• Performance: predictive accuracy, etc.
• Speed and scalability
  • time to construct the model
  • time to use/apply the model
• Expressive power
  • how flexible is the decision boundary
• Interpretability
  • understanding and insight provided by the model
  • ability to explain/justify the results
• Robustness
  • handling noise, missing values, irrelevant features, and streaming data