Machine Learning
What is Machine Learning?
• Machine learning - term coined around 1960.
• Composed of two words—
– machine corresponds to a computer, robot, or
other device,
– learning refers to an activity intended to acquire
or discover event patterns, which we humans are
good at.
Want Machines to Learn?
• Computers and robots can work 24/7 and need very little maintenance; they don't get tired, don't need breaks, call in sick, or go on strike.
• They are well suited to sophisticated problems that involve a variety of huge datasets or complex calculations.
• Machines driven by algorithms designed by humans are able to
learn latent rules and inherent patterns and to fulfill tasks desired
by humans.
• Learning machines are better suited than humans for tasks that
are routine, repetitive, or tedious.
Evolution of Machine Learning
• Manually defining, maintaining, and updating rules
becomes more and more expensive over time.
• Enumerating all possible patterns for a dynamic, real-time activity or event is not practically feasible.
• It is much easier and more efficient to develop learning
rules or algorithms which command computers to learn and
extract patterns, and to figure things out themselves from
abundant data.
Emerging School of Thought
• Active learning or human-in-the-loop – advocates
combining the efforts of machine learners and humans.
• The idea is that routine, boring tasks are more suitable for computers, while creative tasks are more suitable for humans.
• According to this philosophy, machines are able to learn by following rules (algorithms) designed by humans, and to carry out the repetitive and logical tasks desired by humans.
Overview of Machine Learning
• Machine learning, which mimics human intelligence, is a subfield of artificial intelligence.
• It’s closely related to linear algebra, probability
theory, statistics, and mathematical optimization.
• Machine learning models are built on statistics, probability theory, and linear algebra, and are then tuned using mathematical optimization.
Overview of Machine Learning
• Machine learning definition (Tom Mitchell):
A computer program is said to learn from
experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T,
as measured by P, improves with experience E.
Examples of Machine Learning
• “Is this cancer?”, “What is the market value of this
house?”, “Which of these people are good friends with
each other?”, “Will this rocket engine explode on take
off?”, “Will this person like this movie?”, “Who is
this?”, “What did you say?”, and “How do you fly this
thing?”
• Playing computer games, recognizing spoken words, driving autonomous vehicles, classifying/recognizing structures, and so on.
Overview of Machine Learning
• Input: numeric, textual, audio, or visual data.
• Machine learning model: explore and construct algorithms that learn from historical data and perform on new data; define a loss or cost function to optimize the goal of learning.
• Output: prediction or classification.
Categories of Machine Learning
• Depending on the nature of the learning data, machine learning tasks can be classified as:
– Supervised Learning
– Unsupervised Learning
– Reinforcement Learning
Supervised Learning
• The general rule of the learning goal is to map input to output.
• The learning data is labeled data – it comes with descriptions, targets, or desired outputs along with the indicative signals. The labels are usually provided by event logging systems or human experts.
• The learned rule is then used to label new data with unknown
outputs.
• Used in daily applications such as face and speech recognition, product or movie recommendations, and sales forecasting.
Supervised Learning
• We can further subdivide supervised learning
into regression and classification.
• Regression trains on and predicts a continuous-
valued response, for example predicting house
prices.
• Classification attempts to find the appropriate class label, such as analyzing positive/negative sentiment or predicting loan defaults.
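For a concrete feel of the two subtypes, here is a minimal sketch (not from the slides) assuming scikit-learn is installed; the tiny house-price and loan datasets are made up purely for illustration.

# Regression vs. classification with scikit-learn (illustrative toy data).
from sklearn.linear_model import LinearRegression, LogisticRegression

# Regression: predict a continuous value, e.g. house price from area (sq. ft.)
areas = [[1000], [1500], [2000], [2500]]
prices = [200000, 270000, 340000, 410000]
reg = LinearRegression().fit(areas, prices)
print(reg.predict([[1800]]))        # a continuous-valued prediction

# Classification: predict a discrete label, e.g. loan default (1) vs. repaid (0)
incomes = [[20000], [35000], [60000], [90000]]
defaulted = [1, 1, 0, 0]
clf = LogisticRegression().fit(incomes, defaulted)
print(clf.predict([[40000]]))       # a class-label prediction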
Unsupervised Learning
• The learning goal is to understand the data, learn from it and
then produce the results.
• The learning data is unlabeled data – it contains only indicative signals without any description attached. We need to find the structure underlying the data, to discover hidden information, or to determine how to describe the data.
• Unsupervised learning can be used to detect anomalies, such
as fraud or defective equipment, or to group customers with
similar online behaviors for a marketing campaign.
Semi-supervised Learning
• Learning data is partially labeled.
• It makes use of a typically large amount of unlabeled data for training, alongside a small amount of labeled data.
• Applied in cases where it is expensive to acquire a fully labeled dataset but more practical to label a small subset.
• For example, it often requires skilled experts to label
hyperspectral remote sensing images, and lots of field
experiments to locate oil at a particular location, while
acquiring unlabeled data is relatively easy.
Reinforcement Learning
• Learning data provides feedback so that the system
adapts to dynamic conditions in order to achieve a
certain goal.
• The system evaluates its performance based on the
feedback responses and reacts accordingly.
• The best known instances include self-driving cars and the champion Go program AlphaGo.
Machine Learning Algorithms
• Logic-based learning
– They used basic rules specified by human experts, and with these rules,
systems tried to reason using formal logic, background knowledge, and
hypotheses.
• Statistical Learning
• Artificial Neural Networks
– They imitate animal brains, and consist of interconnected neurons that are also
an imitation of biological neurons. They try to model complex relationships
between inputs and outputs and to capture patterns in data.
• Genetic Algorithms
– They mimic the biological process of evolution and try to find the optimal solutions using methods such as mutation and crossover.
Data for Machine Learning
• Good thing – we have a lot of data in the world.
• Bad thing – hard to process this data.
• Challenges – diversity and noisiness of the data.
• We humans usually process data coming in through our ears and eyes. These inputs are transformed into electrical or chemical signals.
• Computers too can process electrical signals.
Data for Machine Learning
• Data for ML is represented either as numbers,
images, or text.
• Images and text are not very convenient, so they
need to be transformed into numerical values.
Training, Testing, Validation Data Sets
• Training Data Set – Practice Questions
– learn something from them and hopefully are able to apply this
knowledge to other similar questions.
– ML models derive patterns from these.
• Testing Data Set – Actual Exams
– Models are applied to them to evaluate how well they perform.
• Validation Data Set – Mock Tests
– To assess how well we will do in the actual exams and to aid revision.
– Verify how well the models will perform in a simulated setting; we then fine-tune the models accordingly in order to achieve greater hits.
Data for Machine Learning
• The model is given example input values and example output values. Or, if we are more ambitious, we can feed the program the actual inputs and let the machine process the data further on its own, just as an autonomous car does not need a lot of human input.
Practice Questions
• Question 1 -------- Ans is option A
• Question 2 -------- Ans is option B
• Question 3 -------- Ans is option A
• Question 4 -------- Ans is option B
• Question 5 -------- Ans is option A
• Even if the questions are not related to potatoes and tomatoes, you may memorize the answers to each question verbatim.
Exam Questions
• Question 1 -------- Option ???
• Question 2 -------- Option B
• Question 3 -------- Option ???
• Question 4 -------- Option ???
• Question 5 -------- Option ???
• We will score very low on the exam questions as it is rare
that the exact same questions will occur in the actual exams.
Overfitting
• The phenomenon of memorization can cause overfitting.
• Extracting too much information from the training sets and making the model work well only on them – this is low bias in machine learning.
• This does not help us generalize to new data and derive patterns from it.
• The model as a result will perform poorly on datasets that were not seen
before – high variance in machine learning.
• This occurs when the learning rules are described based on a relatively
small number of observations, instead of the underlying relationship.
• Also when the model is made excessively complex so that it fits every
training sample, such as memorizing the answers for all questions.
Underfitting
• A model is underfit if it does not perform well on the training sets and will not perform well on the testing sets either.
• It fails to capture the underlying trend of the data.
• This may occur if we do not use enough data to train the model (like failing the exam because we did not review enough material).
• Underfitting will also result if the wrong model is fit to the data (like scoring low in any exercises or exams because we took the wrong approach and learned the material the wrong way).
• This is high bias in machine learning, although its variance is low, as performance on the training and test sets is consistent, in a bad way.
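As a rough illustration of both phenomena (not part of the slides), the following sketch fits polynomials of increasing degree to a small noisy dataset using plain NumPy; the data and degrees are arbitrary assumptions.

# Underfitting vs. overfitting on noisy quadratic data (NumPy only).
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 10)
y = 3 * x**2 + rng.normal(scale=0.05, size=x.size)   # true trend is quadratic

for degree in (1, 2, 9):             # too simple, about right, too complex
    coeffs = np.polyfit(x, y, degree)
    train_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    print(f"degree {degree}: training MSE = {train_mse:.5f}")
# Degree 1 keeps a high error even on the training data (underfitting / high bias);
# degree 9 drives the training error toward zero by memorizing the samples
# (overfitting / high variance), so it will generalize poorly.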
Avoiding Overfitting & Underfitting
• High bias results in underfitting (incorrect assumptions)
• Variance measures how sensitive the model prediction is to
variations in the datasets. High variance causes Overfitting.
• Hence, try to always make both bias and variance as low as
possible.
• In practice, there is an explicit trade-off between the two, where decreasing one increases the other. This is called the bias–variance tradeoff.
Bias-Variance Tradeoff Example
We were asked to build a model to predict the probability of a candidate being
the next president based on the phone poll data. The poll was conducted by zip
codes. We randomly choose samples from one zip code, and from these, we
estimate that there's a 61% chance the candidate will win. However, it turns out
he loses the election. Where did our model go wrong? The first thing we think of
is the small size of samples from only one zip code. It is the source of high bias,
also because people in a geographic area tend to share similar demographics.
However, it results in a low variance of estimates. So, can we fix it simply by using samples from a large number of zip codes? Yes, but this might cause an increased variance of estimates at the same time. We need to find the optimal sample size, the best number of zip codes, to achieve the lowest overall bias and variance.
Avoiding Overfitting with Cross-validation
• Cross-validation data set – mock tests.
• The validation procedure helps evaluate how the models will
generalize to independent or unseen datasets in a simulated
setting.
• The original data is partitioned into three subsets, usually
60% for the training set, 20% for the validation set, and the
rest 20% for the testing set.
• This setting suffices if we have enough training samples after the partition.
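A minimal sketch of such a 60/20/20 split, assuming scikit-learn's train_test_split is available; X and y are placeholder arrays just to keep the example self-contained.

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(100).reshape(50, 2)   # 50 dummy samples with 2 features
y = np.arange(50)                   # dummy targets

# First hold out 20% as the testing set ...
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# ... then split the remaining 80% into 60% training and 20% validation (0.25 * 0.8 = 0.2).
X_train, X_val, y_train, y_val = train_test_split(X_rest, y_rest, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))   # 30, 10, 10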
Cross-validation
Cross-validation
• Testing results from all rounds are averaged to generate a more accurate
estimate of model prediction performance.
• Cross-validation helps reduce variability and therefore limit problems like
overfitting.
Cross-validation – Exhaustive scheme
• In the exhaustive scheme, a fixed number of observations is left out in each round as testing (or validation) samples, and the remaining observations are used as training samples. This process is repeated until every possible subset of samples has been used for testing once.
• Leave-one-out cross-validation (LOOCV) leaves out exactly one observation per round; for a dataset of size n, LOOCV requires n rounds of cross-validation.
• This can become slow when n gets large.
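A short LOOCV sketch, assuming scikit-learn; the linear model and the ten-sample dataset are placeholders, and any estimator could be dropped in.

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

X = np.arange(20).reshape(10, 2)    # n = 10 samples -> 10 rounds of LOOCV
y = X.sum(axis=1)

scores = cross_val_score(LinearRegression(), X, y,
                         cv=LeaveOneOut(),
                         scoring="neg_mean_squared_error")
print(len(scores))      # one score per left-out sample
print(scores.mean())    # averaged over all n rounds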
Cross-validation – Non-Exhaustive scheme
• This scheme does not try out all possible partitions. The most widely used type of this scheme is k-fold cross-validation.
• Common values for k are 3, 5, and 10.
• The k sets of test results are averaged for the purpose of evaluation.
k-fold cross-validation
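A minimal 5-fold cross-validation sketch, assuming scikit-learn; the random data, the logistic-regression model, and k = 5 are illustrative choices.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score

rng = np.random.default_rng(0)
X = rng.random((100, 3))                 # 100 dummy samples, 3 features
y = (X[:, 0] + X[:, 1] > 1).astype(int)  # dummy binary labels

kfold = KFold(n_splits=5, shuffle=True, random_state=1)
scores = cross_val_score(LogisticRegression(), X, y, cv=kfold)
print(scores)          # one accuracy score per fold
print(scores.mean())   # averaged for the overall estimate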
Cross-validation – Holdout Method
• Randomly split the data into training and testing sets numerous times.
• Problem – some samples may never end up in the testing set, while others may be selected multiple times for the testing set.
Nested Cross-validation
• It is a combination of cross-validations and consists of the following two phases:
– The inner cross-validation is conducted to find the best fit, and is usually implemented as a k-fold cross-validation.
– The outer cross-validation is used for performance evaluation and statistical analysis.
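A compact nested cross-validation sketch, assuming scikit-learn; the SVC model, the C grid, and the fold counts are arbitrary assumptions used only to show the inner/outer structure.

import numpy as np
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.random((60, 4))
y = (X[:, 0] > 0.5).astype(int)           # dummy labels

inner_cv = KFold(n_splits=3, shuffle=True, random_state=0)   # finds the best fit
outer_cv = KFold(n_splits=5, shuffle=True, random_state=0)   # evaluates that whole procedure

search = GridSearchCV(SVC(), param_grid={"C": [0.1, 1, 10]}, cv=inner_cv)
nested_scores = cross_val_score(search, X, y, cv=outer_cv)
print(nested_scores.mean())               # performance estimate of the tuned pipeline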
Analogy for Cross-validation
A data scientist plans to take his car to work, and his goal is to arrive before 9 am
every day. He needs to decide the departure time and the route to take. He tries
out different combinations of these two parameters on some Mondays, Tuesdays,
and Wednesdays and records the arrival time for each trial. He then figures out
the best schedule and applies it every day. However, it doesn't work as well as expected. It turns out the scheduling model is overfit to the data points gathered in the first three days and may not work well on Thursdays and Fridays. A better solution would be to test the best combination of parameters derived from Mondays to Wednesdays on Thursdays and Fridays, and to similarly repeat this process based on different sets of learning days and testing days of the week. This analogized cross-validation ensures the selected schedule works for the whole week.
Avoiding Overfitting with Regularization
• Overfitting also occurs due to unnecessary complexity of the
model.
• Linear model (2 parameters) – span a 2-D space.
• Quadratic polynomial (3 parameters) – span a 3-D space.
• High order polynomial function (n parameters) – spans a much
larger space.
– Such models fit the training data more easily,
– but they generalize worse than linear models,
– and hence are more prone to overfitting.
Linear Function Vs. Polynomial Function
• So if we want models that are easy to obtain, their complexity has to be controlled.
• Regularization reduces complexity by imposing penalties on high-order polynomial terms.
• As a result, a less accurate and less strict rule is learned by the model during the training phase.
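To make the penalty idea concrete, here is a rough sketch using ridge regression on top of polynomial features (assuming scikit-learn); the sine-shaped data, degree 9, and the alpha values are illustrative assumptions.

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 15).reshape(-1, 1)
y = np.sin(2 * np.pi * x).ravel() + rng.normal(scale=0.1, size=15)

for alpha in (0.001, 1.0, 100.0):     # weak, moderate, and heavy penalty
    model = make_pipeline(PolynomialFeatures(degree=9), Ridge(alpha=alpha))
    model.fit(x, y)
    coefs = model.named_steps["ridge"].coef_
    print(f"alpha={alpha}: largest coefficient magnitude = {np.abs(coefs).max():.2f}")
# A larger alpha shrinks the high-order coefficients, giving a simpler, smoother model
# at the cost of a less exact fit to the training points.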
Regularization Example
• Equip a robotic guard dog with the ability to identify strangers and friends.
• Rules that are too complicated are unlikely to generalize well to new visitors.
• The amount of regularization should be optimal:
– Too little regularization does not have any impact.
– Too much regularization results in underfitting (the model becomes over-simplified and falls short of capturing the data).
Feature Selection & Dimensionality Reduction
• High-dimensional data is
– computationally expensive,
– prone to overfitting due to high complexity,
– impossible to visualize.
• Not all the features are useful, and some may only add randomness to the results.
• Good feature selection is important for better model construction.
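As one simple illustration (not from the slides), univariate feature selection with scikit-learn can rank features and keep only the most informative ones; the synthetic data below has 10 features of which only 2 actually drive the label.

import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 3] > 0).astype(int)    # only features 0 and 3 are informative

selector = SelectKBest(score_func=f_classif, k=2).fit(X, y)
print(selector.get_support(indices=True))  # indices of the selected features
X_reduced = selector.transform(X)          # 200 x 2 matrix for model construction
print(X_reduced.shape)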
Preprocessing, exploration and feature engineering
• A machine learning system isn't able to recognize gibberish, so we need to help it
by cleaning the input data.
• Understand the data – first scan it and/or visualize it.
• A grid of numbers is the most convenient form to process.
• Feature engineering is the process of creating or improving features.
• They are often created based on common sense, domain knowledge, or prior
experience.
• Missing values – ignore them or try imputing (arithmetic mean, median or
mode).
• Encoding (label encoding, one-hot encoding), scaling, polynomial features, power transformations, binning (a short sketch of some of these steps follows below).
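A small sketch of a few of these steps (imputing a missing value, one-hot encoding, label encoding), using pandas and scikit-learn, both assumed available; the toy DataFrame and column names are made up.

import pandas as pd
from sklearn.preprocessing import LabelEncoder

df = pd.DataFrame({
    "age":    [25, None, 40, 31],                  # contains a missing value
    "city":   ["Pune", "Delhi", "Pune", "Mumbai"],
    "bought": ["yes", "no", "yes", "no"],
})

df["age"] = df["age"].fillna(df["age"].mean())             # impute with the arithmetic mean
df = pd.get_dummies(df, columns=["city"])                  # one-hot encode a categorical column
df["bought"] = LabelEncoder().fit_transform(df["bought"])  # label-encode the target
print(df)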
Why feature scaling?
• Most ML models work with the Euclidean distance between data points.
• Raw features are often not on the same scale, so the distance between such data points can't be computed properly.
• As a result ML models will have problems in learning this type of data.
• Hence feature scaling will transform the data to the same scale.
Why feature scaling?
• If x and y are not on the same scale, then one of them will dominate the other in the distance calculation.
• E.g. – Salary is x, Age is y
• Suppose x1 = 79000, x2 = 48000, y1 = 48, y2 = 27
• The squared differences are:
– (x1 − x2)² = 31000² = 961,000,000
– (y1 − y2)² = 21² = 441
• Here y barely registers for the ML model, as its value is negligible compared to x.
• Hence y will effectively be ignored by the ML model and the result will be wrong.
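A short sketch standardizing the salary/age example above with scikit-learn's StandardScaler (assumed available), so that neither feature dominates the distance.

import numpy as np
from sklearn.preprocessing import StandardScaler

data = np.array([[79000, 48],
                 [48000, 27]], dtype=float)    # two points as [salary, age]

scaled = StandardScaler().fit_transform(data)
print(scaled)                                  # each column now has zero mean and unit variance
print(np.sum((scaled[0] - scaled[1]) ** 2))    # salary and age now contribute comparably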
Feature scaling
• Feature scaling is recommended even when the ML model is not based on Euclidean distance (e.g., decision trees).
• It will help the algorithm converge much faster.
How to load data file(s)?
• Input data sets can be in various formats (.XLS, .TXT, .CSV, .JSON).
Load data file from csv & excel
# Import the pandas library
import pandas as pd
# Read the dataset into a DataFrame using pandas
df = pd.read_csv("E:/train.csv")            # load a CSV file
df = pd.read_excel("E:/EMP.xlsx", "Data")   # load the "Data" sheet of an Excel file (this overwrites df)
df.head(3)                                  # show the first three observations
How to convert a variable to different datatype
Converting a variable from one data type to another is an important and common procedure we perform after loading the data.
string_outcome = str(numeric_input)    # converts numeric_input to a string
integer_outcome = int(string_input)    # converts string_input to an integer
float_outcome = float(string_input)    # converts string_input to a float
How to convert character date to Date
from datetime import datetime
char_date = 'Apr 1 2015 1:20 PM' #creating example character date
date_obj = datetime.strptime(char_date, '%b %d %Y %I:%M%p')
print (date_obj)
How to transpose a Data set?
To transpose Table A into Table B on the variable Product, we can use DataFrame.pivot.
# Load Sheet1 of the Excel file transpose.xlsx
df = pd.read_excel("E:/transpose.xlsx", "Sheet1")
print(df)
result = df.pivot(index='ID', columns='Product', values='Sales')
print(result)