Introduction to machine learning and regression
András Horváth
Pázmány University, 2017-2018

This presentation is available at:
http://users.itk.ppke.hu/~horan/big_data
The code is available at:
https://databricks-prod-cloudfront.cloud.databricks.com/public/4027ec902e239c93eaaa8714f173bcfc/2217267761988785/1705666210822011/2819141955166097/latest.html
Machine learning, machine intelligence

What is intelligence?
The ability to acquire and apply knowledge and skills.
Intelligence is the ability to adapt to change.

Machine learning: providing computers the ability to learn without being explicitly programmed.
It involves programming, computational statistics, mathematical optimization, image processing, natural language processing, etc.
Conquests of machine learning

1952, Arthur Samuel (IBM): the first machine learning program, playing checkers.
Arthur Samuel also coined the term "machine learning".
1997, IBM Deep Blue beats Kasparov:
First match (November 1996): Kasparov–Deep Blue 4–2
Second match (May 1997): Deep Blue–Kasparov 3½–2½
2011, IBM Watson: beats human champions in Jeopardy!
Example clues and answers:
"It's a 4-letter term for a summit; the first 3 letters mean a type of simian": Apex
"4-letter word for a vantage point or a belief": View
"Music fans wax rhapsodic about this Hungarian's 'Transcendental Etudes'": Franz Liszt
"While Maltese borrows many words from Italian, it developed from a dialect of this Semitic language": Arabic
2014, Facebook's DeepFace algorithm:
Reached 97.35% accuracy on face verification; human performance is around 97%.
2016, AlphaGo: deep learning
Beat Fan Hui 5–0 and Lee Sedol 4–1
99.8% win rate against other Go programs
Types of machine learning

Supervised learning and unsupervised learning.

Supervised:
We know the perfect, desired output on the training set.
Classification and regression.

Unsupervised:
All we have is data and no labels. We have to identify rules in the structure of the data.
Clustering and association.
Classification: a classification problem is when the output variable is a category, such as "red" or "blue", or "disease" and "no disease".

Regression: a regression problem is when the output variable is a real value, such as "dollars" or "weight".

Clustering: a clustering problem is where you want to discover the inherent groupings in the data, such as grouping customers by purchasing behavior.

Association: an association rule learning problem is where you want to discover rules that describe large portions of your data, such as "people that buy X also tend to buy Y".

Semi-supervised: some data is labeled but most of it is unlabeled, and a mixture of supervised and unsupervised techniques can be used.
Supervised classification

The most important conquests of deep learning come from supervised classification.
There have been three major enablers:
- New methods and technologies
- Powerful hardware for training (GPGPUs)
- Vast amounts of available data
Supervised regression

Our aim is to predict a target value (Y) based on our data (X).
We have to find a model that explains how Y can be derived from X: m(X) = Y.
A perfect fit is usually not possible, because our world is not perfect:
- Our model has flaws
- There is noise in our observations
We have to fit a model which minimizes the error on the dataset, for example the L1 or L2 error (see the formulas below).
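The slides do not spell the formulas out, but for a dataset of N samples the two errors are conventionally written as:

```latex
% L1 (mean absolute) and L2 (mean squared) error of a model m
\mathrm{L1}(m) = \frac{1}{N}\sum_{i=1}^{N} \bigl| m(x_i) - y_i \bigr|
\qquad
\mathrm{L2}(m) = \frac{1}{N}\sum_{i=1}^{N} \bigl( m(x_i) - y_i \bigr)^2
```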
Correlation and cause-effect relationship

A large correlation in the data does not necessarily mean that one variable is caused by the other.
One has to gather a large amount of data to rule out a coincidence by luck.
One has to understand the data.
Steps of machine learning

The same steps apply to supervised and unsupervised learning:
1. Gather data
2. Understand your data
3. Data preparation
4. Choose a model
5. Training and evaluation
6. Parameter tuning
7. Go back to step 1
Gathering data

Let's take an example dataset, the Boston house prices dataset:
- A typical regression problem, good for trying out different machine learning models
- There are many available solutions for this dataset
- A standard dataset for model evaluation, bundled with Python libraries
It is easy to load the data, as sketched below.
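A minimal loading sketch with scikit-learn. Note that `load_boston` shipped with scikit-learn when these slides were written but was removed in scikit-learn 1.2, so this assumes an older version:

```python
# Load the Boston house prices dataset with scikit-learn.
# load_boston was removed in scikit-learn 1.2; this sketch assumes
# an older version, matching the era of these slides.
from sklearn.datasets import load_boston

boston = load_boston()
X = boston.data    # 506 x 13 feature matrix
Y = boston.target  # 506 target values: median house price in $1000s
print(X.shape, Y.shape)      # (506, 13) (506,)
print(boston.feature_names)  # CRIM, ZN, INDUS, ...
```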
Understanding the data

The dataset contains 506 rows (data points) and 14 columns.
13 features describe the properties (X):
- per capita crime rate by town
- proportion of residential land zoned for lots over 25,000 sq. ft.
- proportion of non-retail business acres per town
- Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
- nitric oxides concentration (parts per 10 million)
- average number of rooms per dwelling
- proportion of owner-occupied units built prior to 1940
- weighted distances to five Boston employment centers
- index of accessibility to radial highways
- full-value property-tax rate per $10,000
- pupil-teacher ratio by town
- 1000(Bk - 0.63)^2, where Bk is the proportion of blacks by town
- % lower status of the population
The 14th column is the target value (Y).
Our aim is to predict Y from X.
Data preparation

Write out some values, as sketched below.
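One way to do this (a sketch assuming pandas is available, continuing from the loading code above):

```python
# Put the features and target into a DataFrame and inspect the raw values.
import pandas as pd

df = pd.DataFrame(X, columns=boston.feature_names)
df["PRICE"] = Y
print(df.head())      # first five rows
print(df.describe())  # per-column count, mean, std, min, max, quartiles
```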
Plot some values as histograms, as sketched below.
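A possible matplotlib sketch (the bin count is an arbitrary choice):

```python
# Histogram of every column, one subplot per feature.
import matplotlib.pyplot as plt

df.hist(bins=30, figsize=(12, 10))
plt.tight_layout()
plt.show()
```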
Plot some values related to the average price, as sketched below.
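A sketch plotting two features against the price; the choice of RM (rooms per dwelling) and LSTAT (% lower status) is only illustrative:

```python
# Scatter plots of selected features against the price.
import matplotlib.pyplot as plt

for feature in ["RM", "LSTAT"]:
    plt.scatter(df[feature], df["PRICE"], s=10)
    plt.xlabel(feature)
    plt.ylabel("PRICE ($1000s)")
    plt.show()
```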
Plot the correlation between the different features, as sketched below.
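A sketch of the correlation matrix shown as a heatmap:

```python
# Pairwise correlation between all columns, shown as a heatmap.
import matplotlib.pyplot as plt

corr = df.corr()
plt.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)
plt.colorbar()
plt.xticks(range(len(corr.columns)), corr.columns, rotation=90)
plt.yticks(range(len(corr.columns)), corr.columns)
plt.show()
```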
Evaluation of our method: the bias-variance problem

We have to simultaneously minimize two different errors:
- Bias is error from erroneous assumptions in the learning algorithm. High bias can cause an algorithm to miss the relevant relations between features and target outputs (underfitting).
- Variance is error from sensitivity to small fluctuations in the training set. High variance can cause an algorithm to model the random noise in the training data rather than the intended outputs (overfitting).
If we fit our data perfectly on the training set, the model might not be general. Our aim is to have a general model.
Let's reserve some of our data to see how general our model is (see the split sketch below).
Once we have a good model, we should re-run the training and evaluation on differently divided datasets, and the accuracy should be consistent. This is called cross-validation.
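A sketch of the hold-out split with scikit-learn (the 80/20 ratio is an arbitrary but common choice):

```python
# Reserve 20% of the data to measure how well the model generalizes.
from sklearn.model_selection import train_test_split

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42)
```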
Scaling features

The features in our data can differ by orders of magnitude.
Some algorithms can handle this fact, some cannot.
Even those that can learn these differences work faster and better with scaled data.
Let's transform all features so that each has zero mean and a variance of 1 (see the sketch below).
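With scikit-learn this is a single transformer; fitting it on the training split only avoids leaking test-set statistics:

```python
# Standardize every feature to zero mean and unit variance.
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_s = scaler.fit_transform(X_train)  # fit on training data only
X_test_s = scaler.transform(X_test)        # reuse the training statistics
```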
Detecting outliers

It really helps to have a look at the data:
- One can see that the prices are capped at $50K.
- These elements will not fit our model.
- They are outliers, and in real life it is not good to test the model on them.
In this case the values are also thresholded in the test data, so it is good if the model can learn this fact.
Linear regression

We assume that there is a correlation in our data (X and Y) which can be described as:
Y = wX + b
We want to minimize the error between the predictions wX + b and the targets Y, with respect to w and b.
There is an analytical formula to find the best w and b: we have to find the local extremum of the error function.
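The slides leave the derivation implicit; for the one-dimensional least-squares case, setting the derivatives of the squared error to zero gives the familiar closed form:

```latex
% Minimize E(w, b) = \sum_i (w x_i + b - y_i)^2 by solving
% \partial E / \partial w = 0 and \partial E / \partial b = 0:
w = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2},
\qquad
b = \bar{y} - w \bar{x}
```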
Linear regression

Fitting the model Y = wX + b in scikit-learn takes only a few lines of code, as sketched below.
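A sketch continuing from the scaled splits above:

```python
# Ordinary least squares on the scaled training data,
# evaluated on the held-out test set.
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

lin = LinearRegression().fit(X_train_s, Y_train)
pred = lin.predict(X_test_s)
print("test MSE:", mean_squared_error(Y_test, pred))
print("test R^2:", lin.score(X_test_s, Y_test))
```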
Linear regression - Lasso

One can easily overfit the data: there is noise on our input data, and some variables might be more reliable than others.
How can we select the important variables?
Lasso (Least Absolute Shrinkage and Selection Operator) penalizes large values in the regressor: for the same output accuracy, a solution where the values in w are smaller and more uniform is preferred, and the L1 penalty drives unimportant weights to zero.
The parameter of the algorithm is the penalty constant.
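A sketch with scikit-learn; the alpha value is only illustrative:

```python
# Lasso: L1-penalized linear regression. A larger alpha drives more
# coefficients exactly to zero, which acts as feature selection.
from sklearn.linear_model import Lasso

lasso = Lasso(alpha=0.1).fit(X_train_s, Y_train)
print("nonzero coefficients:", (lasso.coef_ != 0).sum())
print("test R^2:", lasso.score(X_test_s, Y_test))
```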
Linear regression – Ridge, Elastic Net

One can easily overfit the data: there is noise on our input data, and some variables might be more reliable than others.
Ridge and ElasticNet use more complex penalization on the parameters (Tikhonov regularization): Ridge penalizes the squared weights, and ElasticNet combines the L1 and L2 penalties.
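A sketch; alpha and l1_ratio are illustrative values:

```python
# Ridge (L2 penalty) and ElasticNet (a mix of L1 and L2 penalties).
from sklearn.linear_model import Ridge, ElasticNet

ridge = Ridge(alpha=1.0).fit(X_train_s, Y_train)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X_train_s, Y_train)
print("Ridge test R^2:", ridge.score(X_test_s, Y_test))
print("ElasticNet test R^2:", enet.score(X_test_s, Y_test))
```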
Non-linear regression

How could we fit a more complex model?
We could fit any model according to the previous formula:
Y = m(X)
where we can differentiate the model with respect to all the parameters, and we get an equation for every parameter.
We could fit any model if it were known; unfortunately, in practice the problem is that we do not know the model.
How could we create a general model which can approximate any possible model?
Non-linear regression

We can approximate any function with piecewise linear functions.
We can divide the problem into sub-domains depending on the input values and use separate (or combined) linear regressors to approximate the original function.
Ensemble regression

We can approximate any function with piecewise linear functions.
The approximation is a weighted sum of linear models, where the weights are determined by the values of the data (see the sketch below).
http://arogozhnikov.github.io/2016/06/24/gradient_boosting_explained.html
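The linked post explains gradient boosting; a scikit-learn sketch of such an ensemble (hyperparameters are illustrative):

```python
# Gradient boosting: a weighted sum of small regression trees,
# each one fit to the residual error of the ensemble so far.
from sklearn.ensemble import GradientBoostingRegressor

gbr = GradientBoostingRegressor(n_estimators=200, max_depth=3,
                                learning_rate=0.1)
gbr.fit(X_train_s, Y_train)
print("test R^2:", gbr.score(X_test_s, Y_test))
```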
Neural networks

A weighted combination of linear regressors is still linear regression:
C1(w1x+b1) + C2(w2x+b2) + C3(w3x+b3) + C4(w4x+b4) + C5(w5x+b5) + C6(w6x+b6)
What does a neuron do?
A neuron applies a nonlinearity f to a weighted sum of its inputs. A feed-forward neural network stacks layers of such neurons:
f(w3 f(w2 f(w1 x + b1) + b2) + b3)
How can we define this as mathematical operations? Each layer is a matrix-vector product followed by an element-wise nonlinearity, as sketched below.
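A minimal numpy sketch of the forward pass above; the layer sizes and the sigmoid nonlinearity are illustrative assumptions:

```python
# Forward pass of a 3-layer feed-forward network:
# y = f(W3 @ f(W2 @ f(W1 @ x + b1) + b2) + b3)
import numpy as np

def f(z):
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid nonlinearity

rng = np.random.default_rng(0)
sizes = [13, 32, 16, 1]              # input dim matches the 13 features
Ws = [rng.normal(0, 0.1, (m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
bs = [np.zeros(m) for m in sizes[1:]]

def forward(x):
    # apply f(W x + b) layer by layer, matching the formula above
    for W, b in zip(Ws, bs):
        x = f(W @ x + b)
    return x

print(forward(np.zeros(13)))         # prediction for a dummy input
```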
A feed-forward network is great, but how can we teach such a network?
- Backpropagation algorithm: the error gradient is propagated backwards through the layers using the chain rule
- Stochastic gradient descent: the parameters are updated with gradients computed on small random subsets of the training data
TensorFlow is here to help: it derives the gradients automatically (see the sketch below).
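A sketch of a small regression network using TensorFlow's Keras API (the original slides may have used the older graph-based API; layer sizes and optimizer settings are illustrative):

```python
# A small feed-forward regression network trained with SGD;
# TensorFlow derives the backpropagation gradients automatically.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(13,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),  # linear output for regression
])
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.01),
              loss="mse")
model.fit(X_train_s, Y_train, epochs=50, batch_size=32,
          validation_data=(X_test_s, Y_test))
```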
Regularization:
- Using batches
- Adding dropout
An example of adding dropout is sketched below.
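Dropout layers can be inserted between the dense layers; the 0.5 rate is an illustrative value:

```python
# Dropout randomly zeroes activations during training, which
# discourages co-adaptation of neurons and reduces overfitting.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu", input_shape=(13,)),
    tf.keras.layers.Dropout(0.5),  # drop 50% of activations while training
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1),
])
```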