
MACHINE LEARNING

DR. PHẠM MINH HOÀN – [email protected]


OBJECTIVES OF CHAPTER 3
• Understanding different types of data sources and how to access and manipulate them.
• Data analysis is all about extracting meaningful insights from your data.
• Data exploration is about getting familiar with your data and identifying patterns and trends.
• Data visualization is about creating visual representations of your data to communicate insights effectively.
• Most data analysis tasks involve using specialized libraries that provide functions and tools for working with data.
CONTENTS
3.1. Machine Learning models
3.2. Regression
3.3. Classification
3.4. Clustering
MACHINE LEARNING MODELS
• Machine Learning is making the computer learn from studying data and statistics.
• Machine Learning is a step in the direction of artificial intelligence (AI).
• Machine Learning is a program that analyses data and learns to predict the outcome.
REGRESSION
• The term regression is used when you try to find the relationship between variables.
• In Machine Learning, and in statistical modeling, that relationship is used to predict the outcome of future events.
REGRESSION
• Ex:
import matplotlib.pyplot as plt
from scipy import stats

x = [5,7,8,7,2,17,2,9,4,11,12,9,6]
y = [99,86,87,88,111,86,103,87,94,78,77,85,86]

# Fit a straight line y = slope*x + intercept to the data
slope, intercept, r, p, std_err = stats.linregress(x, y)

def myfunc(x):
    return slope * x + intercept

# Predicted y value for every observed x value
mymodel = list(map(myfunc, x))

plt.scatter(x, y)
plt.plot(x, mymodel)
plt.show()
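• The fitted model can also be used to predict values that are not in the data set. A minimal sketch, reusing slope, intercept, r and myfunc from the example above (the value x = 10 is chosen only for illustration):
# r measures how well x and y fit a straight line:
# 0 means no linear relationship, -1 or 1 means a perfect one.
print(r)

# Predict the y value for a new x value (here x = 10)
print(myfunc(10))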
CLUSTERING
• K-means is an unsupervised learning method for clustering data points.
• The algorithm iteratively divides the data points into K clusters by minimizing the variance within each cluster.
• First estimate the best value for K using the elbow method, then use K-means clustering to group the data points into clusters.
CLUSTERING
• Ex: Visualizing some data points.
# Import the modules
import matplotlib.pyplot as plt
# Create arrays
x = [4, 5, 10, 4, 3, 11, 14, 6, 10, 12]
y = [21, 19, 24, 17, 16, 25, 24, 22, 21, 21]
plt.scatter(x, y)
plt.show()
CLUSTERING
• Ex: Utilize the elbow method to visualize the inertia for different values of K.
# Import the modules
from sklearn.cluster import KMeans

# Turn the data into a set of (x, y) points
data = list(zip(x, y))

# Fit K-means for K = 1..10 and record the inertia of each model
inertias = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i)
    kmeans.fit(data)
    inertias.append(kmeans.inertia_)
CLUSTERING
• Ex: Utilize the elbow method to visualize the inertia for different values of K.
To find the best value for K, run K-means across the data for a range of possible values.
There are 10 data points, so the maximum number of clusters is 10 (the K value cannot exceed the number of data points). So for each value of K in range(1, 11), train a K-means model and plot the inertia at that number of clusters:
CLUSTERING
• Ex: Utilize the elbow method to visualize the inertia for different values of K.
plt.plot(range(1,11), inertias, marker='o')
plt.title('Elbow method')
plt.xlabel('Number of clusters')
plt.ylabel('Inertia')
plt.show()
CLUSTERING
• Ex: The elbow method shows that 2 is a good value for K (where the inertia starts to decrease in a more linear fashion), so we retrain and visualize the result.
kmeans = KMeans(n_clusters=2)
kmeans.fit(data)

plt.scatter(x, y, c=kmeans.labels_)
plt.show()
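• After fitting, the cluster assignment of each point and the cluster centres can be inspected on the trained model (labels_ and cluster_centers_ are standard scikit-learn attributes):
# Cluster index (0 or 1) assigned to each data point
print(kmeans.labels_)

# Coordinates of the two cluster centres
print(kmeans.cluster_centers_)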
EXAMPLE WITH REGRESSION AND CLUSTERING
• Ex: Read the data from the Sales.csv file (a sketch is given below).
• Use linear regression to show the relationship between the number of orders and the total sales amount.
• Cluster the data based on total sales amount and total number of orders with K-means.
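• A minimal sketch of this combined example, assuming Sales.csv contains columns named Orders and Sales (the actual column names in the file may differ):
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
from sklearn.cluster import KMeans

# Read the data (the column names 'Orders' and 'Sales' are assumptions)
df = pd.read_csv("Sales.csv")
x = df["Orders"]
y = df["Sales"]

# Linear regression: number of orders vs. total sales amount
slope, intercept, r, p, std_err = stats.linregress(x, y)
plt.scatter(x, y)
plt.plot(x, slope * x + intercept)
plt.show()

# K-means clustering on (orders, sales); K = 2 is chosen only for illustration
data = list(zip(x, y))
kmeans = KMeans(n_clusters=2)
kmeans.fit(data)
plt.scatter(x, y, c=kmeans.labels_)
plt.show()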
CLASSIFICATION
• A classification technique or model attempts to draw a conclusion from observed values.
• DecisionTreeClassifier is a class capable of performing multi-class classification on a dataset.
• A Decision Tree is a flow chart, and can help you make decisions based on previous experience.
CLASSIFICATION
• Example: decide whether or not to go to a comedy show, based on the following data about comedians (a code sketch follows the table).
Age Experience Rank Nationality Go
36 10 9 UK NO
42 12 4 USA NO
23 4 6 N NO
52 4 4 USA NO
43 21 8 USA YES
44 14 5 UK NO
66 3 7 N YES
35 14 9 UK YES
52 13 7 N YES
35 5 9 N YES
24 3 5 USA NO
18 3 7 UK YES
45 9 9 UK YES
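• The explanation below refers to the root node of a fitted decision tree. A minimal sketch of how such a tree could be built from the table above with scikit-learn (the numeric codes used for Nationality and Go are assumptions):
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier, plot_tree

df = pd.DataFrame({
    "Age":         [36, 42, 23, 52, 43, 44, 66, 35, 52, 35, 24, 18, 45],
    "Experience":  [10, 12,  4,  4, 21, 14,  3, 14, 13,  5,  3,  3,  9],
    "Rank":        [ 9,  4,  6,  4,  8,  5,  7,  9,  7,  9,  5,  7,  9],
    "Nationality": ["UK", "USA", "N", "USA", "USA", "UK", "N", "UK", "N", "N", "USA", "UK", "UK"],
    "Go":          ["NO", "NO", "NO", "NO", "YES", "NO", "YES", "YES", "YES", "YES", "NO", "YES", "YES"],
})

# Decision trees need numerical data, so map the text columns to numbers
# (the particular codes are assumptions, not part of the slides)
df["Nationality"] = df["Nationality"].map({"UK": 0, "USA": 1, "N": 2})
df["Go"] = df["Go"].map({"NO": 0, "YES": 1})

features = ["Age", "Experience", "Rank", "Nationality"]
X = df[features]
y = df["Go"]

# Fit the tree and draw it; the root node reports the split rule,
# gini, samples and value discussed on the next slides
dtree = DecisionTreeClassifier()
dtree.fit(X, y)
plot_tree(dtree, feature_names=features)
plt.show()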
CLASSIFICATION
• Explain:
• Rank <= 6.5 means that every comedian with a rank of 6.5 or lower will
follow the True arrow (to the left), and the rest will follow the False arrow (to
the right).
• gini = 0.497 refers to the quality of the split, and is always a number between
0.0 and 0.5, where 0.0 would mean all of the samples got the same result, and
0.5 would mean that the split is done exactly in the middle.
• samples = 13 means that there are 13 comedians left at this point in the
decision, which is all of them since this is the first step.
• value = [6, 7] means that of these 13 comedians, 6 will get a "NO", and 7 will get a "YES".
CLASSIFICATION
• Explain:
• There are many ways to split the samples; we use the GINI method in this tutorial.
• The Gini method uses this formula:
Gini = 1 - (x/n)^2 - (y/n)^2
• Where x is the number of positive answers ("YES"), n is the number of samples, and y is the number of negative answers ("NO"), which gives us this calculation:
1 - (7/13)^2 - (6/13)^2 = 0.497
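• The calculation can be checked directly in Python:
# Gini impurity of the root node: 7 positive and 6 negative answers out of 13 samples
gini = 1 - (7/13)**2 - (6/13)**2
print(round(gini, 3))  # 0.497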
SUMMARY
