02 - Decision Tree Classification On Iris Dataset

The document discusses building a decision tree classification model to predict iris flower species (Iris-setosa, Iris-versicolor, Iris-virginica) based on sepal and petal attributes. It loads the iris dataset, splits it into training and test sets, trains a decision tree classifier, evaluates its accuracy at 97.8%, and visually plots the decision tree to show how it makes predictions based on attribute thresholds. It also demonstrates predicting the species for new data points using the trained decision tree model.

Uploaded by

John Wick


Practical - 2

AIM :- Decision Tree Classification on Iris Dataset

Import Libraries

In [1]:

import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

Loading iris.csv Dataset in Pandas Dataframe

In [2]:

data = pd.read_csv("Iris.csv")
data.head(3)

Out[2]:

   Id  SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm      Species
0   1            5.1           3.5            1.4           0.2  Iris-setosa
1   2            4.9           3.0            1.4           0.2  Iris-setosa
2   3            4.7           3.2            1.3           0.2  Iris-setosa
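
If `Iris.csv` is not at hand, a frame with the same layout can be sketched from the copy of the data bundled with scikit-learn. This is a stand-in under stated assumptions, not the file loaded above: the name `data_alt` is hypothetical, the column names are copied from the preview above, and the bundled values may not match the CSV exactly.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Bundled copy of the iris data (assumption: the notebook itself reads Iris.csv).
iris = load_iris()

# Reuse the column names shown in the CSV preview.
columns = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
data_alt = pd.DataFrame(iris.data, columns=columns)

# Add a 1-based Id column and "Iris-"-prefixed labels to mirror the CSV layout.
data_alt.insert(0, "Id", range(1, len(data_alt) + 1))
data_alt["Species"] = ["Iris-" + iris.target_names[t] for t in iris.target]

print(data_alt.head(3))
```

With this frame in place of `data`, the remaining cells should run unchanged, since they only rely on the column names.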

Getting Information about data

In [3]:

data.info()

RangeIndex: 150 entries, 0 to 149
Data columns (total 6 columns):
 #   Column         Non-Null Count  Dtype
---  ------         --------------  -----
 0   Id             150 non-null    int64
 1   SepalLengthCm  150 non-null    float64
 2   SepalWidthCm   150 non-null    float64
 3   PetalLengthCm  150 non-null    float64
 4   PetalWidthCm   150 non-null    float64
 5   Species        150 non-null    object
dtypes: float64(4), int64(1), object(1)
memory usage: 7.2+ KB


X holds the four feature columns and Y holds the target, i.e. the species label

In [4]:

X = data[['SepalLengthCm','SepalWidthCm','PetalLengthCm','PetalWidthCm']].values
X[:5]

Out[4]:

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       [5. , 3.6, 1.4, 0.2]])

In [5]:

Y = data['Species']
Y[:5]

Out[5]:

0    Iris-setosa
1    Iris-setosa
2    Iris-setosa
3    Iris-setosa
4    Iris-setosa
Name: Species, dtype: object

Training Model

In [6]:

from sklearn.model_selection import train_test_split

# Hold out 30% of the rows for testing; random_state fixes the shuffle
X_trainset, X_testset, Y_trainset, Y_testset = train_test_split(
    X, Y, test_size=0.3, random_state=0)
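
As a rough check of what this split produces, the sketch below repeats the call on scikit-learn's bundled iris data (an assumption standing in for `Iris.csv`) and prints the resulting shapes. The names `X_demo`, `y_demo`, `X_tr`, `X_te`, `y_tr`, `y_te` and `y_te_strat` are hypothetical. It also shows the optional `stratify` argument, which the notebook does not use.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Stand-in for X and Y (assumption: bundled data instead of Iris.csv).
X_demo, y_demo = load_iris(return_X_y=True)

# Same arguments as the cell above: 30% held out, fixed seed.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0)
print(X_tr.shape, X_te.shape)

# Optional variation: stratify keeps the class proportions equal in both parts.
_, _, _, y_te_strat = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=0, stratify=y_demo)
print(np.bincount(y_te_strat))
```

With 150 rows and `test_size=0.3`, 45 rows go to the test set and 105 to the training set.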

In [7]:

SpeciesTree = DecisionTreeClassifier(criterion = 'entropy', max_depth = 4)
SpeciesTree

Out[7]:

DecisionTreeClassifier(criterion='entropy', max_depth=4)

In [8]:

SpeciesTree.fit(X_trainset, Y_trainset)

Out[8]:

DecisionTreeClassifier(criterion='entropy', max_depth=4)
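
With `criterion='entropy'`, splits are scored by how much they reduce Shannon entropy. The sketch below recomputes the entropy values that appear in the tree printed in Out[12] from their class counts. The helper `entropy` is illustrative and not part of scikit-learn.

```python
import math

def entropy(counts):
    """Shannon entropy in bits for a list of class counts."""
    total = sum(counts)
    # p * log2(1/p), skipping empty classes (their contribution is zero)
    return sum((c / total) * math.log2(total / c) for c in counts if c > 0)

root = entropy([50, 50, 50])    # root node: all three species
right = entropy([0, 50, 50])    # after PetalLengthCm <= 2.45 is false
left = entropy([50, 0, 0])      # pure setosa leaf

# Information gain of the root split: parent entropy minus the
# sample-weighted entropy of the two children.
gain = root - (50 / 150) * left - (100 / 150) * right

print(round(root, 3), round(right, 3), round(left, 3), round(gain, 3))
print(round(entropy([0, 49, 5]), 3))
```

The rounded values agree with the `entropy = ...` figures printed for the corresponding nodes in Out[12].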

Prediction

In [9]:

predTree = SpeciesTree.predict(X_testset)
predTree[0:5]

Out[9]:

array(['Iris-virginica', 'Iris-versicolor', 'Iris-setosa',
       'Iris-virginica', 'Iris-setosa'], dtype=object)

In [10]:

Y_testset[0:5]

Out[10]:

114     Iris-virginica
62     Iris-versicolor
33         Iris-setosa
107     Iris-virginica
7          Iris-setosa
Name: Species, dtype: object

In [11]:

from sklearn import metrics
import matplotlib.pyplot as plt
print("DecisionTrees's Accuracy: ",metrics.accuracy_score(Y_testset, predTree))

DecisionTrees's Accuracy: 0.9777777777777777
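
Accuracy is the fraction of test samples predicted correctly, so on a 45-row test set the value above corresponds to 44 correct predictions out of 45. The sketch below shows the same calculation by hand on a tiny made-up label set (the arrays `y_true` and `y_pred` are invented for illustration, not the notebook's data), alongside a confusion matrix that shows where the errors fall.

```python
import numpy as np
from sklearn import metrics

# Toy labels (assumption: invented for illustration only).
y_true = np.array(["a", "a", "b", "b", "c"])
y_pred = np.array(["a", "a", "b", "c", "c"])

# Accuracy by hand: share of positions where prediction equals truth.
manual = np.mean(y_true == y_pred)
print(manual, metrics.accuracy_score(y_true, y_pred))

# Rows are true classes, columns are predicted classes.
cm = metrics.confusion_matrix(y_true, y_pred, labels=["a", "b", "c"])
print(cm)

# The notebook's score expressed as a fraction of the 45 test rows.
print(44 / 45)
```

Calling `metrics.confusion_matrix(Y_testset, predTree)` in the notebook would show which species the single misclassified flower was confused with.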

Visualizing the Decision Tree

In [12]:

import matplotlib.pyplot as plt
from sklearn import tree

# Feature and class names for the plot labels, as plain lists
fn = data.columns[1:5].tolist()
cn = data["Species"].unique().tolist()

# Note: this refits the tree on all 150 rows, so the plot and the
# predictions that follow use the full-data model, not the 70% fit
SpeciesTree.fit(X, Y)

fig, axes = plt.subplots(nrows=1, ncols=1, figsize=(10, 10), dpi=300)
tree.plot_tree(SpeciesTree, feature_names=fn, class_names=cn, filled=True)

Out[12]:

[Text(0.5, 0.9, 'PetalLengthCm <= 2.45\nentropy = 1.585\nsamples = 150\nvalue = [50, 50, 50]\nclass = Iris-setosa'),
 Text(0.4230769230769231, 0.7, 'entropy = 0.0\nsamples = 50\nvalue = [50, 0, 0]\nclass = Iris-setosa'),
 Text(0.5769230769230769, 0.7, 'PetalWidthCm <= 1.75\nentropy = 1.0\nsamples = 100\nvalue = [0, 50, 50]\nclass = Iris-versicolor'),
 Text(0.3076923076923077, 0.5, 'PetalLengthCm <= 4.95\nentropy = 0.445\nsamples = 54\nvalue = [0, 49, 5]\nclass = Iris-versicolor'),
 Text(0.15384615384615385, 0.3, 'PetalWidthCm <= 1.65\nentropy = 0.146\nsamples = 48\nvalue = [0, 47, 1]\nclass = Iris-versicolor'),
 Text(0.07692307692307693, 0.1, 'entropy = 0.0\nsamples = 47\nvalue = [0, 47, 0]\nclass = Iris-versicolor'),
 Text(0.23076923076923078, 0.1, 'entropy = 0.0\nsamples = 1\nvalue = [0, 0, 1]\nclass = Iris-virginica'),
 Text(0.46153846153846156, 0.3, 'PetalWidthCm <= 1.55\nentropy = 0.918\nsamples = 6\nvalue = [0, 2, 4]\nclass = Iris-virginica'),
 Text(0.38461538461538464, 0.1, 'entropy = 0.0\nsamples = 3\nvalue = [0, 0, 3]\nclass = Iris-virginica'),
 Text(0.5384615384615384, 0.1, 'entropy = 0.918\nsamples = 3\nvalue = [0, 2, 1]\nclass = Iris-versicolor'),
 Text(0.8461538461538461, 0.5, 'PetalLengthCm <= 4.85\nentropy = 0.151\nsamples = 46\nvalue = [0, 1, 45]\nclass = Iris-virginica'),
 Text(0.7692307692307693, 0.3, 'SepalLengthCm <= 5.95\nentropy = 0.918\nsamples = 3\nvalue = [0, 1, 2]\nclass = Iris-virginica'),
 Text(0.6923076923076923, 0.1, 'entropy = 0.0\nsamples = 1\nvalue = [0, 1, 0]\nclass = Iris-versicolor'),
 Text(0.8461538461538461, 0.1, 'entropy = 0.0\nsamples = 2\nvalue = [0, 0, 2]\nclass = Iris-virginica'),
 Text(0.9230769230769231, 0.3, 'entropy = 0.0\nsamples = 43\nvalue = [0, 0, 43]\nclass = Iris-virginica')]
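
The list of `Text` objects above is hard to read as a tree. One alternative, sketched here under assumptions, is `sklearn.tree.export_text`, which prints the learned rules as indented text. The sketch fits a separate tree (`demo_tree`, a hypothetical name) on scikit-learn's bundled iris data rather than on `Iris.csv`, and fixes `random_state=0`, which the notebook does not do, so the exact splits may differ from the output above.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

# Bundled iris data (assumption: stands in for the Iris.csv frame).
iris = load_iris()

# Same criterion and depth limit as SpeciesTree; random_state added for repeatability.
demo_tree = DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0)
demo_tree.fit(iris.data, iris.target)

# One line per split or leaf, indented by depth.
rules = export_text(demo_tree, feature_names=list(iris.feature_names))
print(rules)
```

In the notebook itself, `export_text(SpeciesTree, feature_names=fn)` with `fn` as a plain list should give the equivalent view of the plotted tree.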
Predicting Species for Set of Values

Prediction-1

In [13]:

X_new = [[6.3,3.0,1.3,0.2]]
predTree = SpeciesTree.predict(X_new)
predTree

Out[13]:

array(['Iris-setosa'], dtype=object)

Prediction-2

In [14]:

X_new = [[5.4,2.8,2.9,1.5]]
predTree = SpeciesTree.predict(X_new)
predTree

Out[14]:

array(['Iris-versicolor'], dtype=object)

Prediction-3

In [15]:

X_new = [[5.4,2.8,2.9,0.5]]
predTree = SpeciesTree.predict(X_new)
predTree

Out[15]:

array(['Iris-versicolor'], dtype=object)
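
To see why the three predictions come out this way, the thresholds printed in Out[12] can be transcribed by hand into nested conditions. The function below is such a transcription, offered as a reading aid rather than a replacement for the model: the name `classify_by_printed_tree` is hypothetical, and the parent/child structure is inferred from the node positions and sample counts in that output.

```python
def classify_by_printed_tree(sepal_length, sepal_width, petal_length, petal_width):
    """Follow the splits listed in Out[12] for one flower."""
    # sepal_width is accepted for completeness; the printed tree never splits on it.
    if petal_length <= 2.45:
        return "Iris-setosa"
    if petal_width <= 1.75:
        if petal_length <= 4.95:
            return "Iris-versicolor" if petal_width <= 1.65 else "Iris-virginica"
        # Leaf with counts [0, 2, 1] is labelled versicolor by majority.
        return "Iris-virginica" if petal_width <= 1.55 else "Iris-versicolor"
    if petal_length <= 4.85:
        return "Iris-versicolor" if sepal_length <= 5.95 else "Iris-virginica"
    return "Iris-virginica"

# The three inputs used in Prediction-1 to Prediction-3.
for row in ([6.3, 3.0, 1.3, 0.2], [5.4, 2.8, 2.9, 1.5], [5.4, 2.8, 2.9, 0.5]):
    print(row, classify_by_printed_tree(*row))
```

The first input stops at the root because its petal length of 1.3 is below 2.45. The other two pass the root and end in the same versicolor leaf, since a petal width of 1.5 or 0.5 is below both 1.75 and 1.65 and the petal length of 2.9 is below 4.95.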
