Classification:
A machine learning perspective
Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
Part of a specialization
This course is a part of the
Machine Learning Specialization
1. Foundations
2. Regression
3. Classification
4. Clustering & Retrieval
5. Recommender Systems
6. Capstone
What is the course about?
What is classification?
From features to predictions
[Diagram: Data → ML Method (Classifier) → Intelligence]
Input x: features derived from data
Learn the x → y relationship
Predict y: categorical "output", class or label
Sentiment classifier
Input x: sentence, e.g., "Easily best sushi in Seattle."
→ Sentiment Classifier →
Output y: predicted sentiment
Classifier
Input x: sentence from review
→ Classifier MODEL →
Output y: predicted class
Example multiclass classifier
Output y has more than 2 categories
Input x: webpage
→ Output y: category, e.g., Education, Finance, Technology
Spam filtering
Input x: text of email, sender, IP, …
→ Output y: spam or not spam
Image classification
Input x: image pixels
→ Output y: predicted object
Personalized medical diagnosis
Input x
→ Disease Classifier MODEL →
Output y: Healthy, Cold, Flu, Pneumonia, …
Reading your mind
Inputs x: brain region intensities
→ Output y: e.g., "Hammer" or "House"
Impact of classification
Course overview
Course philosophy: Always use case studies & …
[Diagram: Visual → Core concept → Algorithm → Implement → Practical, plus optional Advanced topics]
Overview of content
Models: Linear classifiers, Logistic regression, Decision trees, Ensembles
Algorithms: Gradient, Stochastic gradient, Recursive greedy, Boosting
Core ML: Alleviating overfitting, Handling missing data, Precision-recall, Online learning
Course outline
Overview of modules
Models:
- Linear classifiers (Module 1)
- Logistic regression (Modules 1, 2, 3)
- Decision trees (Modules 4 & 5)
- Ensembles (Module 7)

Algorithms:
- Gradient (Modules 2 & 3)
- Stochastic gradient (Module 9)
- Recursive greedy (Module 4)
- Boosting (Module 8)

Core ML:
- Alleviating overfitting (Modules 3 & 5)
- Handling missing data (Module 6)
- Precision-recall (Module 8)
- Online learning (Module 9)
Module 1: Linear classifiers
Word       Coefficient
#awesome   1.0
#awful     -1.5

Score(x) = 1.0 · #awesome – 1.5 · #awful

[Plot: decision boundary in the (#awesome, #awful) plane; Score(x) > 0 on one side, Score(x) < 0 on the other]
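As a rough illustration of the score above, here is a minimal Python sketch of a two-word linear sentiment classifier; the word counting and the `score`/`predict` helper names are illustrative assumptions, not course code.

```python
# Minimal sketch of the linear classifier above (helper names are illustrative).
coefficients = {"awesome": 1.0, "awful": -1.5}

def score(sentence):
    """Score(x) = 1.0 * #awesome - 1.5 * #awful."""
    words = sentence.lower().split()
    return sum(coef * words.count(word) for word, coef in coefficients.items())

def predict(sentence):
    """Predict +1 (positive) if Score(x) > 0, else -1 (negative)."""
    return +1 if score(sentence) > 0 else -1

print(predict("awesome awesome awful"))  # Score = 2*1.0 - 1.5 = 0.5 -> +1
```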
Module 1: Logistic regression represents probabilities
P(y = +1 | x, ŵ) = 1 / (1 + e^(-ŵᵀ h(x)))
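A minimal sketch of evaluating this probability with numpy; the coefficient and feature values are made up for illustration.

```python
import numpy as np

def predict_probability(w_hat, h_x):
    """P(y = +1 | x, w_hat) = 1 / (1 + exp(-w_hat^T h(x)))."""
    return 1.0 / (1.0 + np.exp(-np.dot(w_hat, h_x)))

w_hat = np.array([1.0, 0.5, -1.5])  # illustrative coefficients (w0, w1, w2)
h_x = np.array([1.0, 2.0, 1.0])     # features: [constant, #awesome, #awful]
print(predict_probability(w_hat, h_x))  # ~0.62, so the sentence leans positive
```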
Module 2: Learning “best” classifier
Maximize likelihood over all possible w0, w1, w2:

ℓ(w0=0, w1=1, w2=-1.5) = 10^-6
ℓ(w0=1, w1=1, w2=-1.5) = 10^-5
ℓ(w0=1, w1=0.5, w2=-1.5) = 10^-4
…

Best model with gradient ascent: highest likelihood ℓ(w),
ŵ = (w0=1, w1=0.5, w2=-1.5)

[Plot: candidate decision boundaries in the (#awesome, #awful) plane, one per coefficient setting]
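A hedged sketch of what Module 2 covers: climbing the log likelihood of logistic regression with gradient ascent. The dataset, step size, and function names below are illustrative assumptions, not the course implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_ascent(H, y, step_size=0.1, n_iters=1000):
    """H: feature matrix (one row of h(x) per example); y: labels in {+1, -1}."""
    w = np.zeros(H.shape[1])
    for _ in range(n_iters):
        probs = sigmoid(H @ w)                    # P(y = +1 | x_i, w) for each example
        errors = (y == +1).astype(float) - probs  # indicator[y_i = +1] - P(y = +1 | x_i, w)
        w += step_size * (H.T @ errors)           # step up the log-likelihood gradient
    return w

# Tiny made-up dataset: columns = [constant, #awesome, #awful]
H = np.array([[1, 2, 0], [1, 0, 2], [1, 3, 1], [1, 1, 3]], dtype=float)
y = np.array([+1, -1, +1, -1])
print(gradient_ascent(H, y))
```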
Module 3: Overfitting & regularization
[Plot: classification error vs. model complexity; training error keeps falling while true error eventually rises]

Use a regularization penalty to mitigate overfitting: maximize ℓ(w) − λ||w||₂²
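A brief sketch of how the penalty changes the update, under the same logistic regression setup as the previous sketch; the λ value and names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_gradient_step(w, H, y, step_size=0.1, l2_penalty=1.0):
    """One ascent step on ell(w) - lambda * ||w||_2^2."""
    errors = (y == +1).astype(float) - sigmoid(H @ w)
    gradient = H.T @ errors - 2.0 * l2_penalty * w  # the penalty term shrinks w toward 0
    return w + step_size * gradient
```

Larger values of λ shrink the coefficients more aggressively, trading a little training accuracy for a simpler decision boundary.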
Module 4: Decision trees
Start: Credit?
- excellent → Safe
- fair → Term?
  - 3 years → Risky
  - 5 years → Safe
- poor → Income?
  - high → Term?
    - 3 years → Risky
    - 5 years → Safe
  - low → Risky
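The same tree, written as nested conditionals in Python to make the prediction path explicit; a sketch following the slide's branch labels, with an illustrative function name.

```python
# Loan-risk decision tree from the slide as nested conditionals.
def predict_risk(credit, term, income):
    if credit == "excellent":
        return "Safe"
    if credit == "fair":
        return "Risky" if term == "3 years" else "Safe"
    # credit == "poor"
    if income == "high":
        return "Risky" if term == "3 years" else "Safe"
    return "Risky"

print(predict_risk("fair", "5 years", "low"))  # Safe
```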
Module 5: Overfitting in decision trees
[Figure: decision boundaries for decision trees of depth 1, 3, and 10, and for logistic regression with degree 1, 2, and 6 features]
Module 5: Alleviate overfitting by learning simpler trees
Occam’s Razor: “Among competing hypotheses, the one with fewest assumptions should be selected” – William of Occam, 13th century
[Diagram: simplify a complex tree into a simpler tree]
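One common way to learn simpler trees, sketched below, is early stopping on a maximum depth; this recursive splitter is an illustrative assumption (the split-selection logic is omitted), not the course's algorithm.

```python
# Hedged sketch: stop splitting early once a maximum depth is reached.
def build_tree(data, features, target, depth=0, max_depth=3):
    labels = [row[target] for row in data]
    # Early stopping: depth cap reached, node is pure, or no features left
    if depth >= max_depth or len(set(labels)) == 1 or not features:
        return {"leaf": max(set(labels), key=labels.count)}  # majority class
    feature = features[0]  # a real learner would pick the best split here
    tree = {"split_on": feature, "children": {}}
    for value in set(row[feature] for row in data):
        subset = [row for row in data if row[feature] == value]
        tree["children"][value] = build_tree(
            subset, features[1:], target, depth + 1, max_depth)
    return tree
```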
Module 6: Handling missing data
Data with missing values ("?"):

Credit     Term   Income  y
excellent  3 yrs  high    safe
fair       ?      low     risky
fair       3 yrs  high    safe
poor       5 yrs  high    risky
excellent  3 yrs  low     risky
fair       5 yrs  high    safe
poor       ?      high    risky
poor       5 yrs  low     safe
fair       ?      high    safe

[Decision tree from Module 4 with "or unknown" appended to selected branches (e.g., "fair or unknown", "5 years or unknown") so examples with missing values can still be routed to a prediction]
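A sketch of one strategy from this module: route missing values down a designated branch of the tree. Which branch absorbs the unknowns is a modeling choice; the assignments below are illustrative.

```python
# Missing values (None) are sent down a chosen "or unknown" branch.
def predict_risk_with_missing(credit, term, income):
    if credit == "excellent":
        return "Safe"
    if credit == "fair" or credit is None:   # "fair or unknown" branch
        if term == "3 years":
            return "Risky"
        return "Safe"                        # "5 years or unknown" branch
    # credit == "poor"
    if income == "high" or income is None:
        return "Risky" if term == "3 years" else "Safe"
    return "Risky"

print(predict_risk_with_missing("fair", None, "high"))  # Safe
```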
Module 7: Boosting question
“Can a set of weak learners be combined to
create a stronger learner?” Kearns and Valiant (1988)
Yes! Schapire (1990)
Boosting
Amazing impact: a simple approach, widely used in
industry, that wins most Kaggle competitions
Module 7: Boosting using AdaBoost
Income > $100K?    (Yes → Safe, No → Risky):   f1(xi) = +1
Credit history?    (Bad → Risky, Good → Safe): f2(xi) = -1
Savings > $100K?   (Yes → Safe, No → Risky):   f3(xi) = -1
Market conditions? (Bad → Risky, Good → Safe): f4(xi) = +1

Ensemble: combine votes from many simple classifiers to learn complex classifiers
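A minimal sketch of how the ensemble combines the four votes above: a weighted vote, ŷ = sign(Σ_t ŵ_t f_t(x)). The coefficient values below are made up; AdaBoost's way of learning them is what this module covers.

```python
import numpy as np

def ensemble_predict(weak_predictions, coefficients):
    """weak_predictions: list of +1/-1 votes f_t(x); coefficients: weights w_t."""
    return int(np.sign(np.dot(coefficients, weak_predictions)))

f_votes = [+1, -1, -1, +1]           # f1..f4 from the slide
w = [0.8, 0.3, 0.3, 0.6]             # illustrative (not learned) weights
print(ensemble_predict(f_votes, w))  # +1 (Safe): 0.8 - 0.3 - 0.3 + 0.6 = 0.8 > 0
```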
Module 8: Precision-recall
Goal: increase # guests by 30% with an automated,
"authentic" marketing campaign:
Reviews → Great quotes → Spokespeople
"Easily best sushi in Seattle."

Accuracy is not the most important metric here:
PRECISION: Did I (mistakenly) show a negative sentence?
RECALL: Did I fail to show a (great) positive sentence?
Module 9: Scaling to huge datasets & online learning
4.8B webpages, 500M Tweets/day, 5B views/day
Stochastic gradient: a tiny modification to gradient ascent
that is a lot faster, but finicky in practice
[Plot: average log likelihood for gradient vs. stochastic gradient; higher is better]
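A hedged sketch contrasting the two updates under the same logistic regression setup as the earlier sketches: the full-gradient step touches every example, while the stochastic step uses a single example per update (names and step sizes are illustrative).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def full_gradient_step(w, H, y, step_size=0.1):
    errors = (y == +1).astype(float) - sigmoid(H @ w)
    return w + step_size * (H.T @ errors)      # uses every example

def stochastic_gradient_step(w, h_i, y_i, step_size=0.1):
    error = float(y_i == +1) - sigmoid(h_i @ w)
    return w + step_size * (h_i * error)       # uses one example only
```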
Assumed background
Courses 1 & 2 in this ML Specialization
• Course 1: Foundations
- Overview of ML case studies
- Black-box view of ML tasks
- Programming & data
manipulation skills
• Course 2: Regression
- Data representation (input, output, features)
- Linear regression model
- Basic ML concepts:
• ML algorithm
• Gradient descent
• Overfitting
• Validation set and cross-validation
• Bias-variance tradeoff
• Regularization
Math background
• Basic calculus
- Concept of derivatives
• Basic vectors
• Basic functions
- Exponentiation e^x
- Logarithm
Programming experience
• Basic Python used
- Can be picked up along the way if you
know another programming language
Reliance on GraphLab Create
• SFrames will be used, though they are not required
- An open-source project of Dato
(the creators of GraphLab Create)
- pandas and numpy can be used instead
• Assignments will:
1. Use GraphLab Create to explore high-level concepts
2. Ask you to implement all algorithms without GraphLab Create
• Net result:
- learn how to code methods in Python
Computing needs
• Basic 64-bit desktop or laptop
• Access to internet
• Ability to:
- Install and run Python (and GraphLab Create)
- Store a few GB of data
Let’s get started!