UNIT-V: TREES
Basic Tree Concepts:
One of the disadvantages of using an array or linked list to store data is
the time necessary to search for an item. Since both arrays and linked
lists are linear structures, the time required to search a linear list is
proportional to the size of the data set. For example, if the size of the
data set is n, then the number of comparisons needed to find (or fail to
find) an item may be as bad as some multiple of n. Imagine doing the
search on a linked list (or array) with n = 10^6 nodes. Even on a machine
that can do a million comparisons per second, each search can take about
one second in the worst case, so searching for m items will take roughly
m seconds.
This is not acceptable in today's world, where the speed at which we
complete operations is extremely important. Time is money. It therefore
seems that better (more efficient) data structures are needed to store and
search data.
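The cost argument above can be made concrete with a small sketch. The function name linear_search and the sample list are illustrative assumptions, not part of the original notes; the sketch only counts comparisons to show that an unsuccessful search touches every element.

```python
def linear_search(items, target):
    """Return (index, comparisons); index is -1 when target is absent."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons


data = list(range(1_000))          # stand-in for a large linear list
print(linear_search(data, 0))      # (0, 1): found at the front
print(linear_search(data, 999))    # (999, 1000): found at the very end
print(linear_search(data, -5))     # (-1, 1000): absent, so every item is compared
```

With n items, the last two calls each need n comparisons, which is the "some multiple of n" behaviour described above.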
Hence, we can extend the concept of a linked data structure (linked list,
stack, queue) to a structure that may have multiple relations among its
nodes. Such a structure is called a tree.
A tree is a collection of nodes connected by directed (or undirected)
edges. A tree is a nonlinear data structure, in contrast to arrays, linked
lists, stacks and queues, which are linear data structures. A tree is
either empty (it has no nodes) or consists of one node called the root and
zero or more subtrees.
A tree has the following general properties:
One node is distinguished as the root;
Every node (excluding the root) is connected by a directed edge from
exactly one other node; the direction is parent -> children.
In the figure:
A is the parent of B, C, D;
B is called a child of A;
on the other hand, B is the parent of E, F, K;
the root has 3 subtrees.
Each node can have an arbitrary number of children. Nodes with no children
are called leaves, or external nodes. In the above picture, C, E, F, L, G
are leaves. Nodes which are not leaves are called internal nodes.
Internal nodes have at least one child.
Nodes with the same parent are called siblings. In the picture, B, C, D
are siblings. The depth of a node is the number of edges from the root to
the node. The depth of K is 2. The height of a node is the number of edges
from the node to the deepest leaf below it. The height of B is 2.
The height of a tree is the height of its root.
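The definitions of depth, height and leaf can be sketched in a few lines. The class TreeNode and the helper names are hypothetical, and the tree shape is an assumption reconstructed from the text (the figure is not available): A has children B, C, D; B has children E, F, K; K has child L; D has child G.

```python
class TreeNode:
    """A general tree node: a label, a parent link, and any number of children."""

    def __init__(self, label):
        self.label = label
        self.parent = None
        self.children = []

    def add_child(self, child):
        child.parent = self
        self.children.append(child)
        return child


def depth(node):
    """Number of edges from the root down to this node."""
    edges = 0
    while node.parent is not None:
        node = node.parent
        edges += 1
    return edges


def height(node):
    """Number of edges from this node down to its deepest leaf."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)


def leaves(node):
    """Labels of all leaves in the subtree rooted at node, left to right."""
    if not node.children:
        return [node.label]
    found = []
    for child in node.children:
        found.extend(leaves(child))
    return found


# Assumed shape: A -> B, C, D; B -> E, F, K; K -> L; D -> G.
nodes = {label: TreeNode(label) for label in "ABCDEFGKL"}
for parent, kids in {"A": "BCD", "B": "EFK", "K": "L", "D": "G"}.items():
    for kid in kids:
        nodes[parent].add_child(nodes[kid])

print(depth(nodes["K"]))     # 2
print(height(nodes["B"]))    # 2
print(height(nodes["A"]))    # 3, the height of the whole tree
print(leaves(nodes["A"]))    # ['E', 'F', 'L', 'C', 'G']
```

Under this assumed shape the results agree with the text: K has depth 2, B has height 2, and the leaves are C, E, F, L, G.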
What is a Tree?
Non-linear data structure
Hierarchical arrangement of data
Has components named after natural trees
root
branches
leaves
Drawn with root at the top
Components of a Tree
Node: stores a data element
Parent: the single node that directly precedes a node
all nodes have 1 parent except the root (which has 0)
Child: one or more nodes that directly follow a node
Ancestor: any node which precedes a node
itself, its parent, or an ancestor of its parent
Descendant: any node which follows a node
itself, its child, or a descendant of its child
Terminology
Root: a node with no parent
Leaf: a node with no child
Interior: a non-leaf node
Height: distance from the root to the deepest leaf
More Tree Terminology
Leaf (external) node: node with no children
Internal node: non-leaf node
Siblings: nodes which share the same parent
Subtree: a node and all its descendants
Ignoring the node's parent, this is itself a tree
Ordered tree: tree with a defined order of children
enables ordered traversal
Binary tree: ordered tree with up to two children per node
Level and Depth
Figure: a sample tree with 13 nodes (A to M), each annotated with its
degree and its level. The figure labels the following terms: degree of a
node, leaf (terminal) node, nonterminal node, parent, children, sibling,
ancestor, and level of a node. The degree of the tree is 3 and the height
of the tree is 4.
A node may contain a value or a condition, or represent a separate data structure or a
tree of its own. Each node in a tree has zero or more child nodes, which are below it in
the tree (by convention, trees grow down, not up as they do in nature). A node that has
a child is called the child's parent node (or ancestor node, or superior). A node has at
most one parent. The height of a node is the length of the longest downward path to a
leaf from that node. The height of the root is the height of the tree. The depth of a node
is the length of the path from the node to the root (i.e., its root path).
Root nodes
The topmost node in a tree is called the root node. Being the topmost node, the root
node will not have parents. It is the node at which operations on the tree commonly
begin (although some algorithms begin with the leaf nodes and work up ending at the
root). All other nodes can be reached from it by following edges or links. (In the formal
definition, each such path is also unique). In diagrams, it is typically drawn at the top. In
some trees, such as heaps, the root node has special properties. Every node in a tree
can be seen as the root node of the subtree rooted at that node.
Leaf nodes
Nodes that have no children are called leaf nodes. The nodes at the bottommost level
of the tree are always leaves, but a leaf can also appear at a higher level.
Internal nodes
An internal node or inner node is any node of a tree that has child nodes and is thus not
a leaf node.
Subtrees
A subtree is a portion of a tree data structure that can be viewed as a complete tree in
itself. Any node in a tree T, together with all the nodes below it, comprise a subtree of T.
The subtree corresponding to the root node is the entire tree; the subtree corresponding
to any other node is called a proper subtree (in analogy to the term proper subset).
Binary Trees
A special class of trees: the maximum degree of each node is 2.
Recursive definition: A binary tree is a finite set of
nodes that is either empty or consists of a root and
two disjoint binary trees called the left subtree and
the right subtree.
Any tree can be transformed into a binary tree by the
left child-right sibling representation.
A binary tree is a tree in which no node can have
more than two subtrees.
A node can have zero, one, or two subtrees. These
subtrees are known as the left subtree and the right
subtree. The figure below shows a binary tree with its
subtrees. Each subtree is itself a binary tree.
Figure: a binary tree with root A (nodes A, B, C, D, E), with its left
subtree and right subtree marked.
A binary tree in which each node has exactly zero or two
children is called a full binary tree. In a full tree, there are
no nodes with exactly one child.
A complete binary tree is a tree that is completely filled,
with the possible exception of the bottom level, which is filled
from left to right. A complete binary tree of height h has
between 2^h and 2^(h+1) - 1 nodes. Here are some examples:
Figure: examples of a full binary tree and a complete binary tree,
followed by a further binary tree example.
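The two definitions above can be checked mechanically. The following is a minimal sketch; the class Node, the helper names and the two sample trees are assumptions made for illustration. Height is counted in edges, so an empty tree is given height -1.

```python
from collections import deque


class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def height(node):
    """Edges on the longest path to a leaf; an empty tree has height -1."""
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))


def count(node):
    """Total number of nodes in the tree."""
    return 0 if node is None else 1 + count(node.left) + count(node.right)


def is_full(node):
    """True when every node has either zero or two children."""
    if node is None:
        return True
    if (node.left is None) != (node.right is None):
        return False
    return is_full(node.left) and is_full(node.right)


def is_complete(root):
    """True when a level-order scan never finds a node after a gap."""
    queue = deque([root])
    seen_gap = False
    while queue:
        node = queue.popleft()
        if node is None:
            seen_gap = True
        else:
            if seen_gap:
                return False
            queue.append(node.left)
            queue.append(node.right)
    return True


complete = Node("A", Node("B", Node("D"), Node("E")), Node("C"))
lopsided = Node("A", Node("B", None, Node("E")), Node("C"))

print(is_full(complete), is_complete(complete))    # True True
print(is_full(lopsided), is_complete(lopsided))    # False False
h, n = height(complete), count(complete)
print(2**h <= n <= 2**(h + 1) - 1)                 # True: 4 <= 5 <= 7
```

The second tree fails both tests because B has exactly one child, and that child sits to the right of an empty position on the bottom level.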
Binary Tree Traversals
A binary tree traversal requires that each node of
the tree will be processed once and only once in a
predetermined sequence.
The two general approaches to the traversal
sequence are: depth-first and breadth-first.
Depth-first traversal: The processing proceeds along
a path from the root through one child to the
most distant descendent of that first child before
processing a second child.
Breadth-first traversal: The processing proceeds
horizontally from the root to all of its children,
then to its children's children, and so forth until
all nodes have been processed.
Depth-first Traversals: Given that a binary tree consists of a root, a
left subtree, and a right subtree, we can define six different depth-first
traversal sequences. Three of these have standard names; the other three are
unnamed but can be derived easily. The standard sequences are shown in the
figure below.
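Figure: the three standard depth-first traversals, with the order in
which each part is processed:
Preorder traversal (NLR): 1. root, 2. left subtree, 3. right subtree
Inorder traversal (LNR): 1. left subtree, 2. root, 3. right subtree
Postorder traversal (LRN): 1. left subtree, 2. right subtree, 3. root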
Preorder Traversal (NLR)
In the preorder traversal, the root node is
processed first, followed by the left
subtree, and then the right subtree.
Algorithm preOrder (val root <node pointer>)
   if (root is not null)
      1. process (root)
      2. preOrder (root->leftSubtree)
      3. preOrder (root->rightSubtree)
   endif
   return
end preOrder
Figure: the binary tree used in the traversal examples, with root A,
left subtree B, and right subtree E.
First we process the root, A. After the root, we process the left
subtree. To process the left subtree, we first process its root, B,
then its left subtree and right subtree in order.
When B's left and right subtrees have been processed in
order, we are then ready to process A's right subtree, E. To
process the subtree E, we first process the root and then the
left subtree and right subtree. Because there is no left
subtree, we continue immediately with the right subtree,
which completes the tree.
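The preOrder algorithm translates almost line for line into Python. This is a sketch only: the class Node and the visit callback are assumptions, and the tree shape (A with left subtree B, whose children are C and D, and right subtree E, whose only child is a right child F) is inferred from the traversal walkthroughs in these notes.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def preorder(root, visit):
    """Process the node first, then its left subtree, then its right subtree."""
    if root is not None:
        visit(root.data)
        preorder(root.left, visit)
        preorder(root.right, visit)


# Tree shape inferred from the walkthrough; E has no left child.
tree = Node("A",
            Node("B", Node("C"), Node("D")),
            Node("E", None, Node("F")))

order = []
preorder(tree, order.append)
print(order)    # ['A', 'B', 'C', 'D', 'E', 'F']
```

Passing the processing step as a callback keeps the traversal separate from whatever "process" means for a given application.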
Inorder Traversal (LNR)
The inorder traversal processes the left subtree
first, then the root node, and finally the right
subtree. The root is processed in between the
subtrees.
Algorithm inOrder (val root <node pointer>)
   if (root is not null)
      1. inOrder (root->leftSubtree)
      2. process (root)
      3. inOrder (root->rightSubtree)
   endif
   return
end inOrder
Because the left subtree must be processed first, we trace from the root to the
leftmost leaf node before processing any nodes. After processing the left
subtree, C, we process its parent node, B. We are now ready to process the
right subtree, D. Processing D completes the processing of the root's left
subtree, and we are now ready to process the root, A, followed by its right
subtree. Because the right subtree, E, has no left child, we can process its
root immediately, followed by its right subtree, F.
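A Python sketch of the inOrder algorithm, run on the tree described in the walkthrough above, might look as follows. The class Node and the visit callback are illustrative assumptions rather than part of the original notes.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def inorder(root, visit):
    """Process the left subtree, then the node, then the right subtree."""
    if root is not None:
        inorder(root.left, visit)
        visit(root.data)
        inorder(root.right, visit)


# A has left subtree B (children C, D) and right subtree E (right child F).
tree = Node("A",
            Node("B", Node("C"), Node("D")),
            Node("E", None, Node("F")))

order = []
inorder(tree, order.append)
print(order)    # ['C', 'B', 'D', 'A', 'E', 'F']
```

The printed sequence follows the walkthrough step by step: C, then B, then D, then the root A, then E and finally F.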
Postorder Traversal (LRN)
It processes the root node after (post) the left
and right subtrees have been processed. It
starts by locating the leftmost leaf and
processing it. It then processes its right
sibling, including its subtrees (if any). Finally, it
processes the root node.
Algorithm postOrder (val root <node pointer>)
   if (root is not null)
      1. postOrder (root->leftSubtree)
      2. postOrder (root->rightSubtree)
      3. process (root)
   endif
   return
end postOrder
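The postOrder algorithm can be sketched in Python in the same way. The class Node, the visit callback and the sample tree (the one used in the other traversal examples of these notes) are assumptions made for illustration.

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def postorder(root, visit):
    """Process the left subtree, then the right subtree, then the node."""
    if root is not None:
        postorder(root.left, visit)
        postorder(root.right, visit)
        visit(root.data)


# A has left subtree B (children C, D) and right subtree E (right child F).
tree = Node("A",
            Node("B", Node("C"), Node("D")),
            Node("E", None, Node("F")))

order = []
postorder(tree, order.append)
print(order)    # ['C', 'D', 'B', 'F', 'E', 'A']
```

The leftmost leaf C comes first, its right sibling D next, and the root A is always the last node processed.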
Breadth-First Traversals
The processing proceeds horizontally from
the root to all of its children, then to its
children's children, and so forth until all
nodes have been processed. In other
words, in a breadth-first traversal, each
level is completely processed before the
next level is started. To traverse a tree in
breadth-first order, we use a queue.
Figure: breadth-first traversal of the example tree, level by level:
A, then B E, then C D F.
Algorithm breadthFirst (val root <node pointer>)
   pointer = root
   loop (pointer not null)
      process (pointer)
      if (pointer->left not null)
         enqueue (pointer->left)
      end if
      if (pointer->right not null)
         enqueue (pointer->right)
      end if
      if (not emptyQueue)
         dequeue (pointer)
      else
         pointer = null
      end if
   end loop
   return
end breadthFirst
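A queue-based breadth-first traversal can be sketched in Python with collections.deque. This version differs slightly from the pseudocode: it places the root on the queue first and loops until the queue is empty, which gives the same visiting order. The class Node, the visit callback and the sample tree are assumptions made for illustration.

```python
from collections import deque


class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def breadth_first(root, visit):
    """Process the tree level by level, left to right, using a FIFO queue."""
    if root is None:
        return
    queue = deque([root])
    while queue:
        node = queue.popleft()      # dequeue the oldest waiting node
        visit(node.data)
        if node.left is not None:
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)


# A has left subtree B (children C, D) and right subtree E (right child F).
tree = Node("A",
            Node("B", Node("C"), Node("D")),
            Node("E", None, Node("F")))

order = []
breadth_first(tree, order.append)
print(order)    # ['A', 'B', 'E', 'C', 'D', 'F']
```

Because children are appended to the back of the queue while nodes are taken from the front, every node of one level is processed before any node of the next level.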
Expression Trees
Expression trees are an application of binary trees. An
expression is a sequence of tokens that follow
prescribed rules. A token may be either an operand or
an operator. The standard arithmetic operators are: +,
-, *, /.
The properties of expression trees are:
1. Each leaf is an operand
2. The root and internal nodes are operators
3. Each subtree is a subexpression, with its root being an operator.
Expression trees operations:
The three standard depth-first traversals of a binary expression
tree represent the three expression formats: infix, postfix and
prefix. The inorder traversal produces the infix expression, the
postorder traversal produces the postfix expression and the
preorder traversal produces the prefix expression.
Infix Traversal
To print the infix expression, we must add an opening parenthesis at the beginning of
each expression and a closing parenthesis at the end of each expression. Because the root
of the tree and each of its subtrees represent a subexpression, we print the opening
parenthesis when we start a tree or subtree and the closing parenthesis when we have
processed all of its children. The pseudocode algorithm for the infix traversal of an
expression tree is:
Algorithm infix (val tree <tree pointer>)
   if (tree not empty)
      if (tree->token is an operand)
         print (tree->token)
      else
         print (open parenthesis)
         infix (tree->left)
         print (tree->token)
         infix (tree->right)
         print (close parenthesis)
      end if
   end if
   return
end infix
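The infix algorithm can be sketched in Python as a function that builds the parenthesized string instead of printing it. The class ExprNode and the sample expression a * (b + c) + d are assumptions chosen for illustration. The sketch treats any leaf as an operand, which matches the property that each leaf of an expression tree is an operand.

```python
class ExprNode:
    def __init__(self, token, left=None, right=None):
        self.token = token
        self.left = left
        self.right = right


def infix(tree):
    """Return the fully parenthesized infix form of an expression tree."""
    if tree is None:
        return ""
    if tree.left is None and tree.right is None:
        return tree.token                    # operand: no parentheses needed
    return "(" + infix(tree.left) + tree.token + infix(tree.right) + ")"


# Expression tree for a * (b + c) + d (hypothetical example).
tree = ExprNode("+",
                ExprNode("*",
                         ExprNode("a"),
                         ExprNode("+", ExprNode("b"), ExprNode("c"))),
                ExprNode("d"))

print(infix(tree))    # ((a*(b+c))+d)
```

Every operator node contributes one pair of parentheses, so the output is unambiguous even though some of the parentheses are redundant under the usual precedence rules.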
Postfix
The postfix traversal of an expression tree uses the basic
postorder traversal of any binary tree. This expression
does not require parentheses. The pseudocode algorithm
is:
algorithm postfix (val tree <tree pointer>)
if (tree not empty)
postfix (tree->left)
postfix (tree->right)
print (tree->token)
end if
return
end postfix
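A Python sketch of the postfix algorithm follows. The class ExprNode and the sample expression a * (b + c) + d are illustrative assumptions; tokens are collected into a list and joined with spaces rather than printed one by one.

```python
class ExprNode:
    def __init__(self, token, left=None, right=None):
        self.token = token
        self.left = left
        self.right = right


def postfix(tree):
    """Return the tokens of an expression tree in postorder (LRN)."""
    if tree is None:
        return []
    return postfix(tree.left) + postfix(tree.right) + [tree.token]


# Expression tree for a * (b + c) + d (hypothetical example).
tree = ExprNode("+",
                ExprNode("*",
                         ExprNode("a"),
                         ExprNode("+", ExprNode("b"), ExprNode("c"))),
                ExprNode("d"))

print(" ".join(postfix(tree)))    # a b c + * d +
```

Each operator appears immediately after the operands it applies to, which is why no parentheses are needed.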
Prefix
It uses the standard preorder tree traversal. The resulting
expression does not require parentheses. The
pseudocode algorithm is:
algorithm prefix (val tree <tree pointer>)
if (tree not empty)
print (tree->token)
prefix (tree->left)
prefix (tree->right)
end if
return
end prefix
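The prefix algorithm can be sketched in Python as follows. The class ExprNode and the sample expression a * (b + c) + d are assumptions chosen for illustration; tokens are collected into a list instead of being printed directly.

```python
class ExprNode:
    def __init__(self, token, left=None, right=None):
        self.token = token
        self.left = left
        self.right = right


def prefix(tree):
    """Return the tokens of an expression tree in preorder (NLR)."""
    if tree is None:
        return []
    return [tree.token] + prefix(tree.left) + prefix(tree.right)


# Expression tree for a * (b + c) + d (hypothetical example).
tree = ExprNode("+",
                ExprNode("*",
                         ExprNode("a"),
                         ExprNode("+", ExprNode("b"), ExprNode("c"))),
                ExprNode("d"))

print(" ".join(prefix(tree)))    # + * a + b c d
```

Here each operator appears immediately before its two operands, so the grouping is again fixed without parentheses.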
General Trees
A general tree is a tree in which each node can have an unlimited
outdegree. Each node may have as many children as is necessary to
satisfy its requirements. Although general trees have little use in
computer science, they are commonly found in user applications.
Changing General Tree to Binary Tree:
The binary format can be adopted by changing the meaning of the left
and right pointers. In a general tree, we can use two relationships:
parent to child and sibling to sibling. If the left pointer of a node
leads to its first child and the right pointer leads to its next sibling,
these two relationships let us represent any general tree as a binary tree.
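The conversion can be sketched as follows, assuming the left child-right sibling convention: the left pointer of a binary node leads to the first child and the right pointer leads to the next sibling. The class names GeneralNode and BinaryNode, the function to_binary and the sample tree are all hypothetical.

```python
class GeneralNode:
    def __init__(self, data, children=None):
        self.data = data
        self.children = children or []


class BinaryNode:
    def __init__(self, data):
        self.data = data
        self.left = None     # first child in the general tree
        self.right = None    # next sibling in the general tree


def to_binary(node):
    """Convert a general tree to its left-child, right-sibling binary form."""
    if node is None:
        return None
    binary = BinaryNode(node.data)
    previous = None
    for child in node.children:
        converted = to_binary(child)
        if previous is None:
            binary.left = converted      # parent-to-child link
        else:
            previous.right = converted   # sibling-to-sibling link
        previous = converted
    return binary


# General tree: A -> B, C, D and B -> E, F (hypothetical example).
general = GeneralNode("A", [
    GeneralNode("B", [GeneralNode("E"), GeneralNode("F")]),
    GeneralNode("C"),
    GeneralNode("D"),
])

root = to_binary(general)
print(root.left.data)                # B: first child of A
print(root.left.right.data)          # C: next sibling of B
print(root.left.right.right.data)    # D: next sibling of C
print(root.left.left.right.data)     # F: next sibling of E
```

The root of the converted tree never has a right child, because the root of a general tree has no siblings.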
Insertions into General Tree:
To insert a node into a general tree, the user must supply the parent of
the node. Given the parent, three different rules may be used: 1)
FIFO insertion, 2) LIFO insertion, and 3) key-sequenced insertion.
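The notes name the three insertion rules without defining them. The sketch below assumes the usual reading: FIFO insertion adds the new node at the end of the parent's list of children, LIFO insertion adds it at the front, and key-sequenced insertion keeps the children ordered by key. The class GeneralNode and the function names are hypothetical.

```python
import bisect


class GeneralNode:
    def __init__(self, key):
        self.key = key
        self.children = []


def insert_fifo(parent, key):
    """Assumed FIFO rule: the new node becomes the last child."""
    parent.children.append(GeneralNode(key))


def insert_lifo(parent, key):
    """Assumed LIFO rule: the new node becomes the first child."""
    parent.children.insert(0, GeneralNode(key))


def insert_key_sequenced(parent, key):
    """Assumed key-sequenced rule: children stay sorted by key."""
    keys = [child.key for child in parent.children]
    position = bisect.bisect_right(keys, key)
    parent.children.insert(position, GeneralNode(key))


def child_keys(parent):
    return [child.key for child in parent.children]


fifo, lifo, keyed = GeneralNode("root"), GeneralNode("root"), GeneralNode("root")
for key in (30, 10, 20):
    insert_fifo(fifo, key)
    insert_lifo(lifo, key)
    insert_key_sequenced(keyed, key)

print(child_keys(fifo))     # [30, 10, 20]
print(child_keys(lifo))     # [20, 10, 30]
print(child_keys(keyed))    # [10, 20, 30]
```

In all three cases the caller supplies the parent node, as the text requires; only the position of the new node among its siblings differs.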