CMPE 442 INTRODUCTION TO MACHINE LEARNING
MACHINE LEARNING
Machine learning is the field of study that gives
computers the ability to learn without being
explicitly programmed.
A computer program is said to learn from
experience E with respect to some task T and
some performance measure P, if its performance
on T, as measured by P, improves with
experience E.
MACHINE LEARNING
Example: Spam filter: given examples of spam e-mails and examples of ham (non-spam) e-mails, it learns to flag spam.
Training set: the examples that the system uses to learn.
T (task): flag spam for new e-mails
E (experience): the training data
P (performance): needs to be defined.
Ex.: the ratio of correctly classified e-mails → accuracy
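The performance measure P can be made concrete in a few lines. A minimal sketch with made-up labels (not data from the course), computing accuracy as the ratio of correctly classified e-mails:

```python
# Accuracy: the fraction of e-mails whose predicted label matches the true one.
# The labels below are made-up illustrative data.
true_labels      = ["spam", "ham", "ham", "spam", "ham"]
predicted_labels = ["spam", "ham", "spam", "spam", "ham"]

correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
accuracy = correct / len(true_labels)
print(accuracy)  # 4 of 5 predictions are correct -> 0.8
```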
EVALUATING PERFORMANCE ON A TASK
Machine learning problems don’t have a
“correct” answer.
Consider sorting problem:
Many sorting algorithms available: bubble sort, quick
sort, insertion sort ...
The performance is measured in terms of how fast
they are and how much data they can handle.
Would we compare the sorting algorithms with
respect to the correctness of the result?
An algorithm that isn’t guaranteed to produce a sorted list every time is useless as a sorting algorithm.
EVALUATING PERFORMANCE ON A TASK
There is no perfect solution in machine learning:
A perfect e-mail spam filter does not exist!
In many cases the data is “noisy”:
Examples are mislabelled
Features contain errors
Performance evaluation of learning algorithms is therefore important in machine learning.
WHY USE MACHINE LEARNING?
WHY USE MACHINE LEARNING?
Let’s write a spam filter using a traditional programming technique:
1) Study spam e-mails and identify patterns and the most frequently occurring words.
2) Write a detection algorithm.
3) Test, and repeat steps 1 and 2 until it is good enough.
WHY USE MACHINE LEARNING?
[Diagram (traditional approach): Study the problem → Write rules → Evaluate → Analyze errors → Launch!]
WHY USE MACHINE LEARNING?
[Diagram (ML approach): Study the problem → Train ML algorithm on data → Evaluate → Analyze errors → Launch!]
WHY USE MACHINE LEARNING?
Consider the example of recognizing handwritten
digits.
Each digit corresponds to a 28x28 pixel image
and so can be represented by a vector x
comprising 784 real numbers.
Goal: build a machine that will take such a vector
x as input and that will produce the identity of
the digit 0, …, 9 as the output.
WHY USE MACHINE LEARNING?
It is better to use a machine learning approach, where a large set of N digits, called the training set, is used to tune the parameters of an adaptive model.
The categories of the digits in the training set are known in advance → target vector t.
The goal is to determine a function y(x) which takes a new digit image x as input and generates an output vector y → learning/training phase.
Once the model is trained we can run it on the test
set.
The ability to categorize correctly new examples that
differ from those used for training is known as
generalization.
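The train-then-generalize workflow described above can be sketched with toy data standing in for the 784-dimensional digit vectors. The nearest-centroid model below is a hypothetical stand-in, not the specific adaptive model the slides refer to:

```python
# Toy version of the train-then-test workflow: x vectors are 2-D here
# instead of 784-D digit images; labels play the role of the target vector t.
def train(training_set):
    """Tune the model's parameters: one mean vector (centroid) per class."""
    sums, counts = {}, {}
    for x, label in training_set:
        s = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            s[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in s] for label, s in sums.items()}

def predict(model, x):
    """y(x): return the class whose centroid is closest to x."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return min(model, key=lambda label: dist2(model[label], x))

training_set = [([0.0, 0.1], 0), ([0.2, 0.0], 0), ([1.0, 0.9], 1), ([0.8, 1.1], 1)]
model = train(training_set)
# Generalization: classify a point that was never seen during training.
print(predict(model, [0.9, 1.0]))  # -> 1
```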
WHY USE MACHINE LEARNING?
For problems that are too complex for the traditional approach.
For problems that have no known algorithm.
Ex.: speech recognition
Helps humans learn: applying ML techniques to large amounts of data reveals patterns that were not immediately apparent → data mining.
SOME ML PROBLEMS
Speech Recognition
Document Classification
Face Detection and Recognition
...
TYPES OF MACHINE LEARNING SYSTEMS
Whether or not they are trained with human
supervision: supervised, unsupervised, semi-
supervised, reinforcement learning.
Instance-based versus model-based learning.
SUPERVISED LEARNING
The training data includes the desired solutions,
called labels.
Spam filter → classification
SPAM FILTERING AS A CLASSIFICATION
TASK
MACHINE LEARNING FOR SPAM
FILTERING
SUPERVISED LEARNING
The training data includes the desired solutions,
called labels.
House price prediction → regression
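As a hedged illustration of regression, here is a least-squares straight-line fit on made-up house sizes and prices (the slides do not prescribe a particular method):

```python
# Simple linear regression y = a*x + b fitted by least squares.
# Sizes (m^2) and prices are made-up illustrative numbers.
sizes  = [50.0, 70.0, 90.0, 110.0]
prices = [100.0, 140.0, 180.0, 220.0]

n = len(sizes)
mean_x = sum(sizes) / n
mean_y = sum(prices) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, prices)) \
    / sum((x - mean_x) ** 2 for x in sizes)
b = mean_y - a * mean_x

print(a * 80.0 + b)  # predicted price for an 80 m^2 house -> 160.0
```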
SUPERVISED LEARNING
Some of the most important supervised learning algorithms:
K-Nearest Neighbours
Linear Regression
Naïve Bayes
Logistic Regression
Support Vector Machines
Decision Trees and Random Forests
Neural Networks
UNSUPERVISED LEARNING
The training data is unlabelled.
The system tries to learn without anyone's
guidance.
UNSUPERVISED LEARNING
Some of the most important unsupervised learning algorithms:
Clustering
K-Means
Hierarchical Cluster Analysis
Expectation Maximization
Visualization and Dimensionality Reduction
Principal Component Analysis (PCA)
Locally-Linear Embedding (LLE)
t-Distributed Stochastic Neighbour Embedding (t-SNE)
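A minimal sketch of the K-Means idea on made-up 1-D data; real implementations add smarter initialization and convergence checks:

```python
# Minimal K-Means sketch (k = 2, 1-D points, fixed initial centroids).
# The data and starting centroids are made up for illustration.
points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
centroids = [0.0, 10.0]  # deliberately poor starting guesses

for _ in range(10):  # fixed number of assign/update iterations
    clusters = [[], []]
    for p in points:  # assignment step: each point goes to nearest centroid
        clusters[min((0, 1), key=lambda i: abs(p - centroids[i]))].append(p)
    # update step: move each centroid to the mean of its cluster
    centroids = [sum(c) / len(c) if c else centroids[i]
                 for i, c in enumerate(clusters)]

print([round(c, 6) for c in centroids])  # -> [1.0, 8.0]
```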
SUPERVISED/UNSUPERVISED LEARNING
INSTANCE-BASED VS. MODEL-BASED
LEARNING
Most ML problems are about making predictions:
Given training examples, the system needs to be
able to generalize to examples it has never seen
before
The true goal is to perform well on new instances
Two main generalization approaches:
Instance-based: the system learns the examples by heart, then generalizes to new cases using a similarity measure.
Model-based: the system generalizes from a set of examples by building a model of these examples, then uses that model to make predictions.
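Instance-based learning can be sketched as a 1-nearest-neighbour classifier: the "model" is just the stored training examples plus a similarity measure (squared Euclidean distance here; the points are made up):

```python
# Instance-based learning: store the training examples, then classify a new
# case by finding the most similar stored example.
training_set = [([1.0, 1.0], "A"), ([1.2, 0.9], "A"),
                ([5.0, 5.0], "B"), ([4.8, 5.2], "B")]

def classify(x):
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    _, label = min(training_set, key=lambda ex: dist2(ex[0], x))
    return label

print(classify([1.1, 1.0]))  # -> A
```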
INSTANCE-BASED LEARNING
MODEL-BASED LEARNING
REGRESSION PROBLEM
LINEAR REGRESSION
PROJECT PHASES
Study the data
Select a learning algorithm
Train it on the training data
Apply the model to make predictions on new
cases
MAIN CHALLENGES IN MACHINE
LEARNING
Two things that can go wrong:
Bad data
Bad algorithm
BAD DATA
Insufficient quantity of training data
It takes a lot of data for most ML algorithms to work
properly.
Non-representative training data
It is crucial that your training data is representative of the
new cases you want to generalize to.
Poor-quality data
It is better to spend time cleaning up the training data:
decide about outliers and missing features.
Irrelevant features
Feature engineering involves:
Feature selection: selecting the most useful features to train on
among existing features
Feature extraction: combining existing features to produce a
more useful one
Creating new features by gathering new data
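Feature extraction can be as simple as combining two existing features into a more informative one. A hypothetical example (price per square metre derived from price and area):

```python
# Feature extraction: combine existing features to produce a more useful one.
# Hypothetical housing records: total price and area in m^2.
records = [{"price": 200000.0, "area": 100.0},
           {"price": 150000.0, "area": 60.0}]

for r in records:
    r["price_per_m2"] = r["price"] / r["area"]  # derived feature

print([r["price_per_m2"] for r in records])  # -> [2000.0, 2500.0]
```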
BAD ALGORITHM
Overfitting the training data:
Happens when the model performs well on the
training data, but it does not generalize well.
Underfitting the training data:
Happens when the model is too simple to learn the underlying structure of the data.
BAD ALGORITHM: EXAMPLE
Simple regression problem: Suppose we observe a
real-valued input variable x and we wish to use
this observation to predict the value of a real-
valued target variable t.
The data for this example is generated from the function sin(2πx), with random noise included in the target values.
Suppose we are given a training set comprising N observations of x, written x = (x1, ..., xN), together with the corresponding target values t = (t1, ..., tN).
BAD ALGORITHM: EXAMPLE
N = 10: the input data set x is generated by choosing values of x_n, for n = 1, ..., N, spaced uniformly in the range [0, 1].
The target data set t is obtained by computing sin(2πx_n) for the corresponding x values and adding a small level of noise having a Gaussian distribution.
Goal: exploit the training set in order to
make predictions of the value of the target
variable for some new value of the input
variable.
In other words, we are trying to discover the underlying function sin(2πx).
POLYNOMIAL CURVE FITTING
Fit the data using a polynomial function of the form
y(x, w) = w0 + w1 x + w2 x^2 + ... + wM x^M
M: the order of the polynomial
w = (w0, ..., wM): the coefficients
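A sketch of least-squares polynomial curve fitting on data generated as described above (sin(2πx) plus Gaussian noise). The normal-equations solver below is for illustration only; in practice a library routine such as numpy.polyfit would be used:

```python
import math, random

random.seed(0)
N, M = 10, 3  # 10 training points, cubic polynomial

# Generate the training set: x_n uniform in [0, 1], t_n = sin(2*pi*x_n) + noise.
xs = [n / (N - 1) for n in range(N)]
ts = [math.sin(2 * math.pi * x) + random.gauss(0.0, 0.1) for x in xs]

# Least squares via the normal equations (A^T A) w = A^T t,
# where A[n][j] = x_n ** j. Solved with plain Gaussian elimination.
A = [[x ** j for j in range(M + 1)] for x in xs]
ATA = [[sum(A[n][i] * A[n][j] for n in range(N)) for j in range(M + 1)]
       for i in range(M + 1)]
ATt = [sum(A[n][i] * ts[n] for n in range(N)) for i in range(M + 1)]

for col in range(M + 1):  # forward elimination with partial pivoting
    piv = max(range(col, M + 1), key=lambda r: abs(ATA[r][col]))
    ATA[col], ATA[piv] = ATA[piv], ATA[col]
    ATt[col], ATt[piv] = ATt[piv], ATt[col]
    for row in range(col + 1, M + 1):
        f = ATA[row][col] / ATA[col][col]
        ATA[row] = [a - f * b for a, b in zip(ATA[row], ATA[col])]
        ATt[row] -= f * ATt[col]

w = [0.0] * (M + 1)
for row in range(M, -1, -1):  # back substitution
    w[row] = (ATt[row] - sum(ATA[row][j] * w[j]
                             for j in range(row + 1, M + 1))) / ATA[row][row]

def y(x):  # the fitted polynomial y(x, w)
    return sum(w_j * x ** j for j, w_j in enumerate(w))

print(y(0.25))  # close to sin(2*pi*0.25) = 1
```

Increasing M drives the training error down but, past some point, makes the fit follow the noise: the overfitting behaviour discussed earlier.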
CURVE FITTING
TESTING AND VALIDATING
Once you have a trained model, evaluate it and
fine-tune it.
Split your data into two sets: the training set and the test set.
Generalization error: the error rate on new cases, estimated by evaluating the model on the test set.
If the training error is low (makes few mistakes
on training set) but the generalization error is
high, then the model is overfitting the training
set.
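The overfitting criterion above (low training error, high generalization error) can be demonstrated with a model that memorizes its training set; the data below is made up:

```python
import random

random.seed(1)
# Made-up regression data: t = x plus Gaussian noise.
xs = [random.random() for _ in range(40)]
data = [(x, x + random.gauss(0.0, 0.05)) for x in xs]

train_set, test_set = data[:30], data[30:]  # hold out 10 cases for testing

def predict(x):
    """A model that memorizes the training set: return the target of the
    nearest training x. Its training error is zero by construction."""
    return min(train_set, key=lambda ex: abs(ex[0] - x))[1]

def mse(dataset):  # mean squared error of the model on a data set
    return sum((predict(x) - t) ** 2 for x, t in dataset) / len(dataset)

print(mse(train_set))                  # 0.0: every training case is recalled
print(mse(test_set) > mse(train_set))  # True: the generalization error is higher
```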
HOW DOES ML HELP TO SOLVE A TASK?