Commit bb93abd

feat($tree): add <tree> chapter
add sections of <avl-tree>, <red-black-tree> and <heap> accompanied by overview
1 parent 5fb35e7 commit bb93abd

7 files changed

Lines changed: 220 additions & 0 deletions

File tree

images/binary_heap.png

11.6 KB

images/build_max_heap.png

76.8 KB

images/build_max_heap_1.png

1.87 KB

tree/avl-tree.md

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
1+
# Adelson-Velsky and Landis (AVL) Tree
2+
3+
An [AVL Tree][avl-tree] is a type of [balanced binary search tree](../searching/binary-search.md) and one of the earliest self-balancing structures invented for binary search. It maintains the invariant that, for every node, the absolute difference between the _height_ of its _left_ subtree and the _height_ of its _right_ subtree is no larger than 1.
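
The balance condition can be checked mechanically. The following is a minimal Python sketch rather than part of the original chapter: `Node`, `checked_height` and `is_avl_balanced` are hypothetical names introduced for illustration, and an empty subtree is assumed to have height -1. It only verifies the height invariant, not the binary search ordering of the keys.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def checked_height(node: Optional[Node]) -> Optional[int]:
    """Return the height of the subtree, or None if some node is unbalanced."""
    if node is None:
        return -1  # assumed convention: an empty subtree has height -1
    lh = checked_height(node.left)
    rh = checked_height(node.right)
    if lh is None or rh is None or abs(lh - rh) > 1:
        return None
    return max(lh, rh) + 1


def is_avl_balanced(root: Optional[Node]) -> bool:
    """True if every node's subtrees differ in height by at most 1."""
    return checked_height(root) is not None
```

A three-node tree with both children present passes the check, while a chain of three right children fails at its topmost node.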
4+
5+
## Height Guarantee
6+
7+
_Formal Claim_:
8+
9+
For every [AVL Tree][avl-tree] with &nscr; nodes, its _height_ &hscr; &les; 2 &sdot; log(n).
10+
11+
_**Proof**_:
12+
13+
Define _N<sub>h</sub>_ as the minimum number of nodes in an AVL Tree of _height_ h, then
14+
15+
_N<sub>h</sub>_ = 1 + _N<sub>h-1</sub>_ + _N<sub>h-2</sub>_, and since _N<sub>h-1</sub>_ > _N<sub>h-2</sub>_, _N<sub>h</sub>_ > 2 _N<sub>h-2</sub>_
16+
17+
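Unrolling the inequality down to the base cases _N<sub>0</sub>_ = 1 and _N<sub>1</sub>_ = 2 gives _N<sub>h</sub>_ &ges; 2<sup>h/2</sup>. Since n &ges; _N<sub>h</sub>_, n = &Omega;(2<sup>h/2</sup>) and therefore h &les; 2 &sdot; log(n).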
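
As a sanity check on the recurrence, the sketch below tabulates the minimum node counts and compares them with 2<sup>h/2</sup>. It is an illustration only; `min_nodes` is a hypothetical helper name, and logarithms are assumed to be base 2.

```python
import math


def min_nodes(h: int) -> int:
    """Minimum number of nodes in an AVL tree of height h (N_0 = 1, N_1 = 2)."""
    a, b = 1, 2  # N_0, N_1
    if h == 0:
        return a
    for _ in range(h - 1):
        a, b = b, 1 + b + a  # N_h = 1 + N_{h-1} + N_{h-2}
    return b


for h in range(30):
    n = min_nodes(h)
    assert n >= 2 ** (h / 2)      # N_h >= 2^(h/2)
    assert h <= 2 * math.log2(n)  # equivalently, h <= 2 * log2(n)
```

The first few values are 1, 2, 4, 7 and 12, which grow faster than 2<sup>h/2</sup>; this gap is what the tighter bound in the next claim exploits.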
18+
19+
_A more subtle claim_:
20+
21+
An [AVL Tree][avl-tree] has a height of at most 1.44 &sdot; log(n + 2).
22+
23+
_**Proof**_:
24+
25+
Suppose _N<sub>h</sub>_ has a lower bound of the form &Omega;(k<sup>h</sup>) for some constant k > 1, then
26+
27+
substituting _c &sdot; k<sup>h</sup>_ for _N<sub>h</sub>_ on both sides of the recurrence relation above yields:
28+
29+
c &sdot; k<sup>h</sup> &les; 1 + c &sdot; k<sup>h-1</sup> + c &sdot; k<sup>h-2</sup>
30+
31+
The goal is to find constants _c_ and _h<sub>0</sub>_ such that the inequality above holds for every _h_ > _h<sub>0</sub>_.
32+
33+
Dividing both sides by c &sdot; k<sup>h-2</sup>:
34+
35+
k<sup>2</sup> &les; k<sup>2-h</sup>/c + k + 1
36+
37+
Here the term k<sup>2-h</sup>/c becomes negligibly small as h grows, so the inequality is satisfied if k is no larger than the positive solution of the equation k<sup>2</sup> = k + 1, which is the [golden ratio](https://en.wikipedia.org/wiki/Golden_ratio) &phi; = 1.618... &ap; 1.62
38+
39+
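Hence, n = &Omega;(&phi;<sup>h</sup>) and, conversely, h = &Omicron;(log<sub>&phi;</sub>n). Since log<sub>&phi;</sub>n = log<sub>2</sub>n / log<sub>2</sub>&phi; &ap; 1.44 &sdot; log<sub>2</sub>n, this is the origin of the 1.44 &sdot; log(n + 2) bound.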
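
The role of the golden ratio can be observed numerically. The sketch below is an illustration under the same base cases (_N<sub>0</sub>_ = 1, _N<sub>1</sub>_ = 2); `min_nodes_table` and `PHI` are hypothetical names. It checks that the growth factor of consecutive minimum node counts approaches &phi; and that h stays below log<sub>&phi;</sub>(n + 2) for the sparsest trees.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, the positive root of k^2 = k + 1


def min_nodes_table(max_h: int) -> list[int]:
    """N_0 .. N_max_h from N_h = 1 + N_{h-1} + N_{h-2}, with N_0 = 1, N_1 = 2."""
    table = [1, 2]
    for h in range(2, max_h + 1):
        table.append(1 + table[h - 1] + table[h - 2])
    return table[: max_h + 1]


table = min_nodes_table(40)
ratio = table[40] / table[39]  # growth factor of the minimum node count
heights_ok = all(h < math.log(n + 2, PHI) for h, n in enumerate(table))
```

The constant 1.44 is simply 1 / log<sub>2</sub>&phi; rounded to two decimals.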
40+
41+
## Performance Comparisons
42+
43+
Compared with a [Red-Black Tree](red-black-tree.md), an [AVL Tree][avl-tree] offers better worst-case lookup performance because its height bound is tighter: roughly 1.44 &sdot; log(n + 2) versus 2 &sdot; log(n + 1).
44+
45+
The Red-Black Tree, on the other hand, needs less rebalancing work than the AVL Tree when INSERTION or DELETION operations happen.
46+
47+
In conclusion, the AVL Tree performs well in scenarios where the number of element lookups dominates the number of element updates.
48+
49+
[avl-tree]: #adelson-velsky-and-landis-avl-tree

tree/heap.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
1+
# Heap
2+
3+
A [Heap](#heap) is a data structure widely adopted in computing applications such as priority queues and heap sort<sup>[[1]](https://en.wikipedia.org/wiki/Heap_(data_structure))</sup>.
4+
5+
As a specialized [tree](overview.md) structure, a heap is either a [max-heap or a min-heap](#max-heap-and-min-heap). The term heap normally refers to the binary heap, which is conceptually a tree but can be stored either in a static array or in dynamically allocated tree nodes.
6+
7+
A static array is usually preferred because new entries are always inserted from left to right on the lowest level, which makes the heap a nearly [complete binary tree](overview.md): every level is fully filled except possibly the lowest. Figuratively,
8+
9+
<figure style="text-align:center">
10+
<img src="../images/binary_heap.png" />
11+
<figcaption>Figure 1. Binary Heap in a Static Array Implementation</figcaption>
12+
</figure>
13+
14+
New elements are appended to the end of the array, and entries are removed only at the _root_. Both operations trigger the [heap maintenance steps](#heap-property-maintenance) that restore its status as a [max-heap or min-heap](#max-heap-and-min-heap).
15+
16+
_Note: with 1-based array indices, the following index operations apply: PARENT(i) = &lfloor;i/2&rfloor;, which is the index of node i's parent; LEFT(i) = 2i, which is the index of node i's left child; RIGHT(i) = 2i + 1, which is the index of node i's right child_.
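
The index operations translate directly into integer arithmetic. The sketch below assumes 1-based indices with floor division for the parent; slot 0 of the array is left unused, and the sample keys are made up for illustration.

```python
def parent(i: int) -> int:
    return i // 2  # floor(i / 2)


def left(i: int) -> int:
    return 2 * i


def right(i: int) -> int:
    return 2 * i + 1


# Slot 0 is unused so that the root sits at index 1 (hypothetical sample keys).
heap = [None, 16, 14, 10, 8, 7, 9, 3]
```

With this layout, the node at index 2 has its children at indices 4 and 5, and both of them map back to index 2 through `parent`.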
17+
18+
## Max Heap and Min Heap
19+
20+
In a max heap, the key of every node is larger than or equal to the keys of its children, so the largest element is stored at the root. Specifically, given an array _A_ for entry storage, for every node i other than the root:
21+
22+
_A[PARENT(i)]_ &ges; _A[i]_
23+
24+
Similarly, in a min heap the key of every node is smaller than or equal to the keys of its children, so the smallest entry is stored at the root. Then,
25+
26+
_A[PARENT(i)]_ &les; _A[i]_
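
Both inequalities can be verified over a whole array in one pass. The sketch below is an illustration with a hypothetical helper name, `satisfies_heap_property`; note that it assumes 0-based Python lists, so the parent of index i is (i - 1) // 2 rather than the 1-based formula used in the text.

```python
import operator
from typing import Sequence


def satisfies_heap_property(a: Sequence[int], max_heap: bool = True) -> bool:
    """Check A[PARENT(i)] >= A[i] (max heap) or <= (min heap) for every non-root i.

    Uses 0-based indexing, so the parent of index i is (i - 1) // 2.
    """
    ordered = operator.ge if max_heap else operator.le
    return all(ordered(a[(i - 1) // 2], a[i]) for i in range(1, len(a)))
```

For example, `[16, 14, 10, 8, 7, 9, 3]` satisfies the max-heap inequality, while `[1, 2, 3]` satisfies only the min-heap one.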
27+
28+
The [Heap Sort](../sorting/heap-sort.md) algorithm uses a max heap, while a min heap is the common choice for building a [priority queue](http://pages.cs.wisc.edu/~vernon/cs367/notes/11.PRIORITY-Q.html).
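
As an illustration of the priority queue use case, Python's standard `heapq` module maintains a min heap on top of a plain list. The task names and priorities below are made up for the example; this is a usage sketch, not the implementation discussed in this chapter.

```python
import heapq

# (priority, task) pairs; a smaller number means a higher priority.
queue: list[tuple[int, str]] = []
heapq.heappush(queue, (3, "write report"))
heapq.heappush(queue, (1, "fix outage"))
heapq.heappush(queue, (2, "review patch"))

# Each pop removes the root, i.e. the entry with the smallest priority value.
served = [heapq.heappop(queue)[1] for _ in range(3)]
```

Entries leave the queue in priority order regardless of the order in which they were pushed.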
29+
30+
_Note: only the max heap is used in the discussions that follow_.
31+
32+
## Heap Property Maintenance
33+
34+
MAX-HEAPIFY is the key procedure for heap property maintenance. Given an array A and an index i, it assumes that the subtrees rooted at LEFT(i) and RIGHT(i) are both max heaps. MAX-HEAPIFY is called when A[i] may be smaller than one of its children, and it adjusts the position of A[i] in the heap so that the overall max heap property is restored.
35+
36+
It is worth noting that the MAX-HEAPIFY operation should only be performed where a single heap property violation exists. In a top-down fashion,
37+
38+
<pre>
<code>
MAX_HEAPIFY(A, i)
    l = left(i)
    r = right(i)
    if l &les; heap-size(A) and A[l] > A[i]
        largest = l
    else
        largest = i
    if r &les; heap-size(A) and A[r] > A[largest]
        largest = r
    if largest &ne; i
        swap A[i] and A[largest]
        MAX_HEAPIFY(A, largest)
</code>
</pre>
54+
55+
_Note: the recursive call is needed because, after the entry at index i is swapped with its left or right child, the subtree rooted at that child may contain a new heap property violation_.
56+
57+
The operations before each call of MAX_HEAPIFY take constant time, and the total number of calls to MAX_HEAPIFY is bounded by &Omicron;(_h_), wherein _h_ is the height of the heap. Therefore, the time complexity of the MAX_HEAPIFY operation is &Omicron;(log(n)) for an n-entry heap.
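
A runnable counterpart of the MAX_HEAPIFY pseudocode might look like the sketch below. It is an adaptation rather than a literal transcription: it assumes 0-based Python lists (children of i at 2i + 1 and 2i + 2), takes the heap size as an explicit parameter, and replaces the recursion with a loop. The sample array is made up for illustration.

```python
def max_heapify(a: list[int], i: int, heap_size: int) -> None:
    """Sift a[i] down until the subtree rooted at i is a max heap.

    Assumes 0-based indexing (children of i are 2i + 1 and 2i + 2) and that
    both child subtrees already satisfy the max-heap property.
    """
    while True:
        l, r = 2 * i + 1, 2 * i + 2
        largest = i
        if l < heap_size and a[l] > a[largest]:
            largest = l
        if r < heap_size and a[r] > a[largest]:
            largest = r
        if largest == i:
            return  # heap property holds at i, nothing left to fix
        a[i], a[largest] = a[largest], a[i]
        i = largest  # continue in the subtree that received the smaller key


data = [4, 16, 10, 14, 7, 9, 3, 2, 8, 1]  # only the root violates the property
max_heapify(data, 0, len(data))
```

Each loop iteration descends one level, so the number of iterations is bounded by the height of the heap, matching the &Omicron;(log(n)) bound above.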
58+
59+
## Build a Heap
60+
61+
Given unordered input _A_ stored in a static array, building a max heap from it involves iterative calls to the MAX_HEAPIFY operation. Specifically,
62+
63+
```
BUILD_MAX_HEAP(A)
    for i = floor(length(A)/2) downto 1
        MAX_HEAPIFY(A, i)
```
68+
69+
wherein all leaves of the heap occupy the indices from &lfloor;length(A)/2&rfloor; + 1 to length(A). The overall process works bottom-up and produces a max heap regardless of the number of heap property violations in the input.
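
The bottom-up construction can be sketched in Python as follows. The code assumes 0-based lists, so the last internal node sits at index len(a) // 2 - 1; `sift_down` is a hypothetical helper standing in for MAX_HEAPIFY, and the input values are made up for illustration.

```python
def sift_down(a: list[int], i: int, n: int) -> None:
    """0-based stand-in for MAX_HEAPIFY: move a[i] down until it dominates its children."""
    while True:
        largest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] > a[largest]:
                largest = child
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest


def build_max_heap(a: list[int]) -> None:
    """Heapify every internal node, from the last one up to the root."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)


values = [4, 1, 3, 2, 16, 9, 10, 14, 8, 7]
build_max_heap(values)
```

Leaves are skipped because a single node is trivially a max heap, which is why the loop starts at the last internal node.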
70+
71+
### Algorithm Analysis
72+
73+
At the bottom level of the heap, there are at most 2<sup>h</sup> nodes, each of which costs nothing for the heapify operation; at the level above the bottom, there are 2<sup>h-1</sup> nodes, each of which costs at most 1 swap for the heapify operation, and so on. Figuratively,
74+
75+
<figure style="text-align:center">
76+
<img src="../images/build_max_heap.png" />
77+
<figcaption>Figure 2. Build Max Heap Total Work</figcaption>
78+
</figure>
79+
80+
Then, at level j counted from the bottom, there are 2<sup>h-j</sup> nodes, each of which costs at most j swaps for the heapify operation. Summing them up,
81+
82+
<figure style="text-align:center">
83+
<img src="../images/build_max_heap_1.png" />
84+
</figure>
85+
86+
By [infinite geometric series](https://en.wikipedia.org/wiki/Geometric_series#Proof_of_convergence), the sum of j/2<sup>j</sup> converges to 2; thus,
87+
88+
&Tau;(n) &les; 2<sup>h+1</sup> = n + 1 = &Omicron;(n)
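
The bound can be checked numerically for small heights. The sketch below assumes a heap whose bottom level is full, so that n = 2<sup>h+1</sup> - 1; `total_sift_cost` is a hypothetical helper that evaluates the level-by-level sum described above.

```python
def total_sift_cost(h: int) -> int:
    """Sum over levels j (counted from the bottom) of j * 2^(h - j)."""
    return sum(j * 2 ** (h - j) for j in range(h + 1))


for h in range(20):
    n = 2 ** (h + 1) - 1  # nodes in a heap whose bottom level is full
    assert total_sift_cost(h) <= 2 ** (h + 1)
    assert 2 ** (h + 1) == n + 1
```

For these heights the sum also matches the closed form 2<sup>h+1</sup> - h - 2, which stays below n + 1.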
89+
90+
Since the operation must also access each of the inputs during heap building, the running time is at least linear, so the tight bound is &Theta;(n).
91+
92+
## Additional References
93+
94+
1. Data Structures: Heaps. https://www.youtube.com/watch?v=t0Cq6tVNRBA
95+
96+
2. Lecture Notes: Heapsort analysis. http://www.cs.umd.edu/~meesh/351/mount/lectures/lect14-heapsort-analysis-part.pdf

tree/overview.md

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
1+
# Tree
2+
3+
When talking about the [Tree](#tree) data structure, we normally omit the term _**free**_: a _free tree_ is an acyclic, connected, undirected [graph](../graph-algorithms/overview.md). An acyclic undirected graph that is possibly disconnected is a _forest_.
4+
5+
Theorem 1. Let _G = (V, E)_ be an undirected graph. The following statements are equivalent:
6+
7+
1. _G_ is a _free tree_.
8+
2. any two vertices in _G_ are connected through a unique simple path.
9+
3. _G_ is connected, but removing any edge from _E_ will result in a disconnected graph.
10+
4. _G_ is connected, acyclic and |_E_| = |_V_| - 1.
11+
5. _G_ is acyclic, but adding any single edge to _E_ will cause the graph to form a cycle.
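
One of the characterizations above, connected with |_E_| = |_V_| - 1, gives a simple test for whether a graph is a free tree. The sketch below is an illustration under stated assumptions: vertices are numbered 0 to n - 1, `is_free_tree` is a hypothetical helper name, and a graph with no vertices is treated as not being a tree.

```python
def is_free_tree(num_vertices: int, edges: list[tuple[int, int]]) -> bool:
    """Check that the graph is connected and has exactly |V| - 1 edges."""
    if num_vertices == 0 or len(edges) != num_vertices - 1:
        return False
    adjacency: list[list[int]] = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)
    # Depth-first search from vertex 0 to count the reachable vertices.
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adjacency[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == num_vertices
```

A star on four vertices passes, while a triangle plus an isolated vertex fails: it has the right number of edges but is not connected.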
12+
13+
## Rooted Tree and Ordered Tree
14+
15+
A [Rooted Tree](#rooted-tree-and-ordered-tree) is a _free tree_ in which one node is distinguished from the others; that node is called the _root_ of the tree.
16+
17+
Every node on the path from the _root_ to a node x is an _ancestor_ of x, and x is a _descendant_ of each of them; in particular, a parent is an ancestor of its children. Child nodes under the same parent are _siblings_. A node with no children is a _leaf_ (external node), and a non-leaf node is an _internal node_.
18+
19+
The length of the path from the _root_ of a tree to a node x is the _**depth**_ of x in the tree (the _**depth**_ of the _root_ is 0). The _**height**_ of a node in a tree is the number of edges on the longest path leading from that node down to a leaf. For a binary tree it can be computed recursively, taking the height of an empty subtree to be -1:
20+
21+
&hscr;(x) = max( &hscr;(_left subtree of x_), &hscr;(_right subtree of x_)) + 1
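
The recursive formula maps directly onto code. The sketch below is a minimal illustration for binary trees: `TreeNode`, `height` and `depth` are hypothetical names, and an empty subtree is assumed to have height -1 so that a leaf has height 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TreeNode:
    key: str
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def height(node: Optional[TreeNode]) -> int:
    """h(x) = max(h(left), h(right)) + 1, with h(empty) = -1."""
    if node is None:
        return -1
    return max(height(node.left), height(node.right)) + 1


def depth(root: Optional[TreeNode], key: str, current: int = 0) -> Optional[int]:
    """Depth of the node holding `key`, or None if the key is absent."""
    if root is None:
        return None
    if root.key == key:
        return current
    found = depth(root.left, key, current + 1)
    return found if found is not None else depth(root.right, key, current + 1)


tree = TreeNode("a", TreeNode("b", TreeNode("d")), TreeNode("c"))
```

In this small tree the root `a` has height 2, and the node `d` has depth 2 and height 0.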
22+
23+
An [ordered tree](#rooted-tree-and-ordered-tree) is a rooted tree in which the children of each node are ordered. For example, if a node in a tree has _k_ children, then it has a 1<sup>st</sup>, 2<sup>nd</sup>, ..., k<sup>th</sup> child.
24+
25+
_Note: most of the tree data structures we use are **rooted trees**, which simplifies the analysis._
26+
27+
## Binary and _k_-ary Tree
28+
29+
A [Binary tree](#binary-and-k-ary-tree) is defined recursively on a finite set of nodes that either contains no nodes or is composed of a _root_ node, a binary tree called its left subtree and a binary tree called its right subtree.
30+
31+
In a binary tree, replacing every missing child of a node with an empty node that has no descendants results in a structure called a _**full binary tree**_, in which every node is either a leaf or has exactly two children.
32+
33+
When nodes may have more than 2 children, a tree in which every node (including the _root_) has no more than _k_ children is termed a _k_-ary Tree.
34+
35+
A _**complete k-ary tree**_ is a tree in which all leaves have the same _depth_ and all non-leaf nodes have _k_ direct children.
36+
37+
## Table of Contents
38+
39+
* [AVL Tree](avl-tree.md) - a trivial implementation of a balanced binary tree.
40+
* [Red-Black Tree](red-black-tree.md) - a prevalent balanced binary tree implementation.
41+
* [Heap](heap.md) - a special tree structure that facilitates various applications such as [priority queue](https://en.wikipedia.org/wiki/Priority_queue) and [heap sort](../sorting/heap-sort.md).

tree/red-black-tree.md

Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
1+
# Red Black Tree
2+
3+
A [Red Black Tree][red-black-tree] is a type of [balanced binary search tree][binary-search-tree]. It stores one extra bit per node to represent the node's _color_: RED or BLACK.
4+
5+
_Formal Def_:
6+
7+
A [Red Black Tree][red-black-tree] is a type of [binary search tree][binary-search-tree] that follows the properties below:
8+
9+
1. Each node is either RED or BLACK
10+
2. The _root_ is BLACK
11+
3. All leaf nodes (NIL) are BLACK
12+
4. If a node is RED, both of its children are BLACK
13+
5. Every path from a node to its descendant NIL leaves contains the same number of BLACK nodes (_Def: the number of black nodes on any path from a chosen node &xscr; down to a NIL leaf, not counting &xscr; itself, is called the **black-height** of that node, written bh(&xscr;); the **black-height** of the whole red-black tree is that of its **root**_)
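
The properties above can be turned into a small validator. The sketch below is an illustration, not part of the original chapter: `RBNode`, `black_height` and `is_red_black` are hypothetical names, `None` stands for a black NIL leaf, and the binary search ordering of the keys is not checked.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "RED", "BLACK"


@dataclass
class RBNode:
    key: int
    color: str
    left: Optional["RBNode"] = None   # None stands for a black NIL leaf
    right: Optional["RBNode"] = None


def black_height(node: Optional[RBNode]) -> Optional[int]:
    """Return bh(node), or None if property 4 or 5 fails in this subtree."""
    if node is None:
        return 0  # a NIL leaf has no black nodes below it
    counts = []
    for child in (node.left, node.right):
        if node.color == RED and child is not None and child.color == RED:
            return None  # property 4: a RED node must have BLACK children
        bh = black_height(child)
        if bh is None:
            return None
        child_is_black = child is None or child.color == BLACK
        counts.append(bh + (1 if child_is_black else 0))
    if counts[0] != counts[1]:
        return None  # property 5: unequal black counts on the two sides
    return counts[0]


def is_red_black(root: Optional[RBNode]) -> bool:
    if root is not None and root.color != BLACK:
        return False  # property 2: the root must be BLACK
    return black_height(root) is not None
```

A black root with two red children is accepted with black-height 1, whereas a red root, a red node with a red child, or a black root with a single black child is rejected.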
14+
15+
## Height Guarantee
16+
17+
_Formal Claim_:
18+
19+
For every [red-black tree][red-black-tree] with &nscr; nodes, its height &hscr; &les; 2 &sdot; log<sub>2</sub> (n + 1).
20+
21+
_**Proof**_:
22+
23+
First, the subtree rooted at any node x contains at least 2<sup>bh(x)</sup> - 1 internal nodes.
24+
25+
By simple induction on the height of x. Base case: if the height of x is 0, x is a NIL leaf and its subtree contains 2<sup>0</sup> - 1 = 0 internal nodes. Inductive step: if x has a height of more than 0 with both child nodes, each child has a black-height of either bh(x) or bh(x) - 1, so the subtree of each child contains at least 2<sup>bh(x) - 1</sup> - 1 internal nodes. Therefore, the subtree rooted at x has at least 2 * (2<sup>bh(x) - 1</sup> - 1) + 1 = 2<sup>bh(x)</sup> - 1 internal nodes.
26+
27+
By property 4, a RED node cannot have a RED child, so for a given tree with height &hscr;, at least half of the nodes on any path leading from the root to a leaf, not counting the root, are black. In other words, bh(_root_) &ges; &hscr;/2; then
28+
29+
_n_ &ges; 2<sup>&hscr;/2</sup> - 1, &hscr; &les; 2 log(n+1)
30+
31+
Hence the guarantee holds, which completes the proof.
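
For readers who want the last step spelled out, the chain of inequalities can be written as follows. This is only a restatement of the argument above, assuming base-2 logarithms and using n for the number of internal nodes.

```latex
\begin{aligned}
n &\ge 2^{bh(\mathit{root})} - 1 \ge 2^{h/2} - 1 \\
n + 1 &\ge 2^{h/2} \\
\log_2(n + 1) &\ge h/2 \\
h &\le 2 \log_2(n + 1)
\end{aligned}
```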
32+
33+
[red-black-tree]: #red-black-tree
34+
[binary-search-tree]: ../searching/binary-search.md
