Decision Tree Notes:
● A decision tree, as the name suggests, is a flowchart-like tree structure that makes predictions by testing conditions.
● It is an efficient algorithm that is widely used for predictive analysis.
● Its main components are internal nodes, branches, and terminal (leaf) nodes.
● Every internal node holds a “test” on an attribute, branches hold the outcomes of the test, and every leaf node represents a class label.
● It is used for both classification as well as regression. It is often termed “CART”, which stands for Classification and Regression Tree.
● Tree algorithms are often preferred for their stability and reliability.
● A decision tree makes decisions by splitting nodes into sub-nodes.
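Example: as a minimal sketch of fitting a CART classifier in code, the snippet below uses scikit-learn and its built-in iris dataset; both are our own illustrative choices, not prescribed by these notes.

```python
# Minimal sketch: fitting a CART classifier with scikit-learn.
# Library and dataset are illustrative choices, not from the notes.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# criterion="entropy" splits by information gain; "gini" (the default) also works.
clf = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```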
Last Part:
● Node splitting, or simply splitting, is the process of dividing a node into multiple
sub-nodes to create relatively pure nodes.
● There are multiple ways of doing this, which can be broadly divided into two categories based on the type of target variable:
Categorical Target Variable
● Entropy / Gini Impurity
● Information Gain
● Chi-Square
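As a small illustration of the first criterion, here is a hedged sketch of computing Gini impurity; the function name and the toy labels are our own, not from the notes.

```python
# Sketch: Gini impurity, 1 - sum(p_i^2), for a categorical target.
from collections import Counter

def gini(labels):
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

print(gini(["yes"] * 10))              # 0.0 -> completely pure node
print(gini(["yes"] * 5 + ["no"] * 5))  # 0.5 -> maximally impure binary node
```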
The algorithm can be summarized as follows (a runnable sketch appears after the steps):
1. At each stage (node), pick out the best feature as the test condition.
2. Now split the node into the possible outcomes (internal nodes).
3. Repeat the above steps until all the test conditions have been exhausted and only leaf nodes remain.
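Below is a self-contained sketch of these three steps, assuming rows stored as Python dicts and using Gini impurity (from the list above) to rank test conditions; the data layout, helper names, and toy data are all our own illustration.

```python
# Hedged sketch of the three steps above; names and data are illustrative.
from collections import Counter, defaultdict

def gini(labels):
    return 1.0 - sum((c / len(labels)) ** 2 for c in Counter(labels).values())

def split_by(rows, feature):
    groups = defaultdict(list)
    for row in rows:
        groups[row[feature]].append(row)
    return groups

def best_feature(rows, features):
    # Step 1: the best test condition leaves the lowest weighted impurity.
    def weighted_gini(f):
        return sum(len(g) / len(rows) * gini([r["label"] for r in g])
                   for g in split_by(rows, f).values())
    return min(features, key=weighted_gini)

def build_tree(rows, features):
    labels = [r["label"] for r in rows]
    if len(set(labels)) == 1 or not features:        # step 3: stop at a pure node
        return Counter(labels).most_common(1)[0][0]  # leaf node: class label
    f = best_feature(rows, features)                 # step 1: pick the test
    rest = [x for x in features if x != f]
    return {f: {value: build_tree(group, rest)       # step 2: split, then recurse
                for value, group in split_by(rows, f).items()}}

rows = [
    {"outlook": "sunny", "windy": "no",  "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "stay"},
    {"outlook": "rain",  "windy": "no",  "label": "play"},
    {"outlook": "rain",  "windy": "yes", "label": "stay"},
]
print(build_tree(rows, ["outlook", "windy"]))
# -> {'windy': {'no': 'play', 'yes': 'stay'}}
```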
When you start to implement the algorithm, the first question is: ‘How to pick the
starting test condition?’
Choosing the best feature or attribute for a split
The answer to this question lies in the values of ‘Entropy’ and ‘Information Gain’.
Let us see what they are and how they impact the creation of our decision tree.
Entropy: Entropy in a Decision Tree measures homogeneity. If the data is
completely homogeneous, the entropy is 0; if the data is evenly divided (50-50%)
between two classes, the entropy is 1.
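For reference, the standard definition (not spelled out in these notes) for a node whose classes occur with proportions $p_i$ is

$$H(S) = -\sum_{i} p_i \log_2 p_i$$

A pure node ($p_1 = 1$) gives $H = -1 \cdot \log_2 1 = 0$; a 50-50 binary node gives $H = -0.5\log_2 0.5 - 0.5\log_2 0.5 = 1$, matching the values above.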
Information Gain: Information Gain is the decrease in Entropy value when the
node is split (a worked sketch follows the bullets below).
● The attribute with the highest information gain is selected for splitting.
● Based on the computed values of Entropy and Information Gain, we
choose the best attribute at any particular step.
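As a hedged sketch of that computation, the snippet below evaluates one candidate split; the helper names and toy labels are our own illustration.

```python
# Sketch: information gain = parent entropy minus size-weighted child entropy.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    n = len(parent)
    return entropy(parent) - sum(len(c) / n * entropy(c) for c in children)

parent = ["yes"] * 5 + ["no"] * 5               # 50-50 node, entropy = 1.0
left   = ["yes"] * 4                            # pure child, entropy = 0.0
right  = ["yes"] * 1 + ["no"] * 5               # mixed child, entropy ~ 0.65
print(information_gain(parent, [left, right]))  # ~ 0.61
```

At a given node, the candidate split whose gain is higher than every alternative’s is the one chosen.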