
Understanding Decision Trees in Machine Learning
1. Introduction
Decision Trees are a type of supervised machine learning algorithm used for classification
and regression tasks. They model decisions and their possible consequences in the form of a
tree structure.

2. Structure of a Decision Tree

A decision tree is composed of the following:

- Root Node: Represents the entire dataset and the first decision to be made.

- Internal Nodes: Represent features based on which the data is split.

- Leaf Nodes: Represent the final output or decision (class or value).

- Branches: Show the outcome of a decision and connect nodes.
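To make these roles concrete, here is a minimal sketch of a hand-built tree represented as nested dicts; the features ("humidity", "wind") and thresholds are purely illustrative, not taken from any dataset.

```python
def predict(node, sample):
    """Walk from the root node to a leaf node, following branches."""
    while "leaf" not in node:                # root or internal node
        feature, threshold = node["feature"], node["threshold"]
        branch = "left" if sample[feature] <= threshold else "right"
        node = node[branch]                  # follow the branch
    return node["leaf"]                      # leaf node: the final decision

# Root node splits on humidity; one child is a leaf, the other an internal node.
tree_root = {
    "feature": "humidity", "threshold": 70,
    "left": {"leaf": "Yes"},                 # leaf node
    "right": {                               # internal node, splits on wind
        "feature": "wind", "threshold": 10,
        "left": {"leaf": "Yes"},
        "right": {"leaf": "No"},
    },
}

print(predict(tree_root, {"humidity": 65, "wind": 20}))  # → Yes
print(predict(tree_root, {"humidity": 80, "wind": 20}))  # → No
```

Each prediction is simply a path from the root to a leaf, with one branch taken per decision.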

3. Types of Decision Trees
- Classification Trees: Output is a class label (e.g., "Yes" or "No").

- Regression Trees: Output is a continuous value (e.g., house price).
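The two types can be sketched side by side with scikit-learn; the four-sample dataset below is a toy, chosen only to show the difference in output type.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[0], [1], [2], [3]]

# Classification tree: the output is a class label.
y_class = ["No", "No", "Yes", "Yes"]
clf = DecisionTreeClassifier().fit(X, y_class)
print(clf.predict([[2.5]]))   # → ['Yes']

# Regression tree: the output is a continuous value (e.g., a price).
y_value = [100.0, 150.0, 200.0, 250.0]
reg = DecisionTreeRegressor().fit(X, y_value)
print(reg.predict([[2.5]]))   # a continuous value (the mean target of a leaf)
```

The same tree-growing machinery underlies both; only the leaf outputs and the splitting criterion differ.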

4. Key Terminologies
- Gini Impurity: Measures how mixed the class labels at a node are; used in classification trees.

- Entropy & Information Gain: Entropy measures the disorder of a node; information gain is the reduction in entropy achieved by a split, used to choose the most effective feature.

- Variance Reduction: The splitting criterion used in regression trees; a good split reduces the variance of the target within each branch.

- Pruning: A technique to reduce overfitting by removing branches that add little predictive value.
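These criteria are short formulas and can be computed by hand. A small sketch, assuming labels are simple strings and a binary split into a left and right branch:

```python
from math import log2

def gini(labels):
    """Gini impurity: 1 - sum(p_i^2) over the class proportions p_i."""
    n = len(labels)
    return 1 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def entropy(labels):
    """Entropy: -sum(p_i * log2(p_i)) over the class proportions p_i."""
    n = len(labels)
    return -sum((labels.count(c) / n) * log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(parent, left, right):
    """Reduction in entropy after a split, weighted by branch sizes."""
    n = len(parent)
    return (entropy(parent)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

parent = ["Yes"] * 5 + ["No"] * 5            # perfectly mixed node
left = ["Yes"] * 4 + ["No"]                  # mostly "Yes" after the split
right = ["Yes"] + ["No"] * 4                 # mostly "No" after the split

print(round(gini(parent), 3))                          # → 0.5
print(round(information_gain(parent, left, right), 3)) # → 0.278
```

A pure node has Gini impurity and entropy of 0; the split above is chosen over alternatives because it yields the largest information gain.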

5. Advantages of Decision Trees
- Easy to interpret and visualize.

- Handle both numerical and categorical data.

- Require little data preprocessing (no feature scaling or normalization needed).


6. Disadvantages
- Prone to overfitting, especially when grown deep without pruning.

- Unstable: small changes in the data can produce a very different tree.

- Can be biased toward majority classes on imbalanced datasets.
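The overfitting point can be illustrated with a quick sketch: an unconstrained tree memorizes its training data, while limiting depth (here max_depth=2, an arbitrary choice for illustration) trades training fit for simpler structure.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hold out half the data so training and test accuracy can be compared.
X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.5, random_state=0)

# Unconstrained tree: grows until every training sample is classified.
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Depth-limited tree: a simple guard against overfitting.
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X_train, y_train)

print("deep tree, train accuracy:   ", deep.score(X_train, y_train))
print("shallow tree, test accuracy: ", shallow.score(X_test, y_test))
```

scikit-learn also supports cost-complexity pruning via the ccp_alpha parameter, which removes low-value branches after growing the tree.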

7. Popular Algorithms
- ID3 (Iterative Dichotomiser 3)

- C4.5 / C5.0

- CART (Classification and Regression Trees)

8. Use Cases
- Customer churn prediction

- Credit scoring

- Medical diagnosis

- Marketing segmentation

9. Decision Tree in Python (Example Code)

from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn import tree
import matplotlib.pyplot as plt

# Load the Iris dataset (150 samples, 4 features, 3 classes).
iris = load_iris()

# Fit a classification tree on the full dataset.
clf = DecisionTreeClassifier()
clf.fit(iris.data, iris.target)

# Visualize the fitted tree.
tree.plot_tree(clf)
plt.show()

10. Conclusion
Decision Trees are a powerful and interpretable tool in machine learning. Despite their
simplicity, they are the basis of more advanced ensemble methods like Random Forests and
Gradient Boosted Trees.
