BINARY SEARCH TREES
Finding the successor of a node x
Two cases
Case 1: If node x has a right child, then x’s successor is the left-most descendant of
this right child
Case 2: If node x doesn’t have a right child, then x’s successor is x’s most recent
(lowest) ancestor whose left child is also an ancestor of x
[Figure: an example BST (nodes E, L, P, Q, X, Y) illustrating Case 2: keep going up from x until you meet a left link; that ancestor is x’s successor.]
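A minimal Python sketch of both successor cases (the Node class and field names are illustrative assumptions, not from the notes):

    class Node:
        def __init__(self, key, left=None, right=None, parent=None):
            self.key, self.left, self.right, self.parent = key, left, right, parent

    def minimum(x):
        # Left-most descendant of x
        while x.left is not None:
            x = x.left
        return x

    def successor(x):
        # Case 1: x has a right child: left-most node of the right subtree
        if x.right is not None:
            return minimum(x.right)
        # Case 2: climb until we follow a left link (x is a left child of y)
        y = x.parent
        while y is not None and x is y.right:
            x, y = y, y.parent
        return y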
Deleting a node z (three cases):
If z is a leaf (i.e., has no children), all we have to do is set the child pointer that used to refer to z (in z’s parent) to NULL
If z has only one child, we just “splice out” or bypass node z (z’s parent will now refer to
z’s [only] child, rather than z)
If z has two children, then we find its successor y, and replace z’s data with y’s data. The
remainder of z’s original right subtree becomes y’s new right subtree, and z’s original left
subtree becomes y’s new left subtree
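A recursive sketch of the three deletion cases (Python; this key-based variant is an illustrative alternative to the pointer surgery described above, with the same effect):

    class Node:
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right

    def delete(root, key):
        # Returns the root of the tree with `key` removed
        if root is None:
            return None
        if key < root.key:
            root.left = delete(root.left, key)
        elif key > root.key:
            root.right = delete(root.right, key)
        elif root.left is None:          # zero or one child: splice z out
            return root.right
        elif root.right is None:
            return root.left
        else:                            # two children: copy successor y's data in,
            y = root.right               # then remove y from the right subtree
            while y.left is not None:
                y = y.left
            root.key = y.key
            root.right = delete(root.right, y.key)
        return root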
INSERTION SORT
• Good for sorting small lists of keys
Best case: the array is already sorted – O(n)
Worst case: the array is in reverse order – O(n²)
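A minimal sketch of insertion sort (Python):

    def insertion_sort(a):
        # Maintain a sorted prefix a[:i]; insert a[i] by shifting larger keys right
        for i in range(1, len(a)):
            key = a[i]
            j = i - 1
            while j >= 0 and a[j] > key:   # already sorted: no shifts, O(n) best case
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = key                 # reverse order: shift everything, O(n²) worst
        return a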
MERGESORT
Run Time: O(N lg N)
The MERGE (combine) step alone runs in Θ(N)
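A sketch of the divide-and-conquer structure (Python):

    def merge_sort(a):
        if len(a) <= 1:
            return a
        mid = len(a) // 2                  # Divide: O(1)
        left = merge_sort(a[:mid])         # Conquer: 2T(n/2)
        right = merge_sort(a[mid:])
        return merge(left, right)          # Combine: Θ(n)

    def merge(left, right):
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        out.extend(left[i:])
        out.extend(right[j:])
        return out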
HUFFMAN ENCODING
“THIS IS AN EXAMPLE OF A HUFFMAN TREE”
{T, H, I, S, space, A, N, E, X, M, P, L, O, F, U, R}
…
The original “alphabet” for this message consists of 16 characters (2⁴ = 16, so a fixed-length code needs 4 bits per character)
Character counts: space × 7; A and E × 4 each; F × 3; H, I, M, N, S, T × 2 each; L, O, P, R, U, X × 1 each:
7 + (4 × 2) + 3 + (2 × 6) + (1 × 6) = 36 characters; 36 × 4 bits = 144 bits total
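That arithmetic is easy to check in Python (the message is the one from the notes):

    from collections import Counter

    msg = "THIS IS AN EXAMPLE OF A HUFFMAN TREE"
    counts = Counter(msg)
    assert len(counts) == 16            # 16 distinct symbols, so 4 bits each fixed-length
    assert sum(counts.values()) == 36   # 36 characters in the message
    print(sum(counts.values()) * 4)     # 144 bits with a fixed 4-bit code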
HUFFMAN DECODING
We add padding bits to fill the void at the end
If we need padding bits at all, then we need somewhere
between 1 and 7 of them, if we're using 8-bit bytes (1 – 3 if
using 4-bit “bytes”)
We may not need any padding bits at all (if our encoded
message happens to be some multiple of 8 bits long – that fits
perfectly)
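A small sketch of that padding computation (Python; the function name is illustrative):

    def padding_bits(message_bits, byte_size=8):
        # 0 if the message already fills the last byte; otherwise 1..byte_size-1
        return (byte_size - message_bits % byte_size) % byte_size

    print(padding_bits(144))      # 0: 144 bits fits 8-bit bytes exactly
    print(padding_bits(145))      # 7
    print(padding_bits(145, 4))   # 3, for 4-bit “bytes”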
COUNTING SORT
If k >> n, this isn’t a good sort
Suppose we were sorting ten 32-bit unsigned integers, and we don’t
know anything about their range. With ten values, n = 10
Absent any knowledge about the range of the values, we must
assume k = 2³² (4 billion+)
The A and B arrays [1..n] would have 10 values
The C array [1..k] would have 4 billion+ values
The O(k) loops would dominate by 8 orders of magnitude!
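A sketch of counting sort (Python, zero-based arrays rather than the notes’ one-based A, B, and C):

    def counting_sort(a, k):
        # a holds integers in 0..k; stable, O(n + k) time and space
        c = [0] * (k + 1)
        for x in a:                  # count occurrences: O(n)
            c[x] += 1
        for i in range(1, k + 1):    # prefix sums give final positions: O(k)
            c[i] += c[i - 1]
        b = [0] * len(a)
        for x in reversed(a):        # place values right-to-left to keep stability: O(n)
            c[x] -= 1
            b[c[x]] = x
        return b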
RADIX SORT
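The notes leave this section as a heading; as a reminder, LSD radix sort applies a stable sort (such as counting sort) to one digit at a time, least-significant digit first. A minimal sketch, assuming nonnegative base-10 keys:

    def radix_sort(a, base=10):
        if not a:
            return a
        digits = len(str(max(a)))                # d passes, one per digit
        for d in range(digits):
            buckets = [[] for _ in range(base)]  # stable distribution by digit d
            for x in a:
                buckets[(x // base ** d) % base].append(x)
            a = [x for b in buckets for x in b]
        return a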
QUICKSORT
PARTITION(A, p, r)    // pivot x = A[p], the first element
x = A[p]
i = p
for j = p + 1 to r
    if A[j] ≤ x
        i = i + 1
        exchange A[i] with A[j]
exchange A[i] with A[p]
return i
1. Best / Average case (random data): O(n lg n)
2. Worst case (already sorted data): O(n²)
1 2 3 4 5 6 7 8 9 10 (already sorted data)
This would be partitioned into an empty lower “half” and nearly everything else in the “high” half
Each partitioning would make n – 1 comparisons, but would not divide the list in half!
n passes of n – 1 comparisons each → n² behavior!
3. Divide-and-conquer approach (recursive)
4. Can use lots of stack space (recursive calls)
5. Not good when there are a significant number of duplicates
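A sketch of quicksort built on that first-element PARTITION (Python, zero-based):

    def partition(a, p, r):
        x = a[p]                         # pivot: the first element
        i = p
        for j in range(p + 1, r + 1):
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i], a[p] = a[p], a[i]          # put the pivot between the two halves
        return i

    def quicksort(a, p=0, r=None):
        if r is None:
            r = len(a) - 1
        if p < r:                        # divide and conquer, recursively
            q = partition(a, p, r)
            quicksort(a, p, q - 1)
            quicksort(a, q + 1, r)
        return a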
HEAPSORT (combines the best of Merge Sort, O(n lg n), with the best of Insertion Sort, an in-place sort)
1. A nearly-full binary tree
The tree is full, except for the right part of the lowest level
2. Not a Binary Search Tree (BST)
We don’t go down through the tree comparing keys to find
information
The BST property does NOT hold on a heap
3. We get the subscripts for storing the heap in a one-based array
4. The root is stored in A[1]. Heaps are not typically stored as zero-based arrays
5. Subscripts: Left child = 2i; Right child = 2i + 1; Parent = ⌊i/2⌋; therefore, no pointers are needed
6. Max-Heap property:
Every node i other than the root satisfies A[PARENT(i)] ≥ A[i]
The value at a node is at least as large as its child(ren)’s
Applying this property along every path, the largest value is always at the root, A[1]
MAX-HEAPIFY, which restores this property at a node, runs in O(lg n) time
7. Height of a Heap
Height of a node = Number of edges on the longest downward path from that node to
the leaf
Height of the entire heap = Height of the root
Height of the heap = Θ (lg n)
MAX-HEAPIFY = basic operation on a heap (cost proportional to at most the tree height)
Find the largest of A[i] and its children (if either child doesn’t exist, don’t consider it); largest will be either i, l, or r
If largest is i (not one of i’s children), then we’re done. Otherwise (if largest is one of i’s children), swap A[i] with that child (A[largest]), and then (recursively) MAX-HEAPIFY from that child down
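A sketch of MAX-HEAPIFY and the heapsort built on it (Python, using the notes’ one-based subscripts; index 0 is an unused placeholder):

    def max_heapify(A, i, heap_size):
        # The children's subtrees are max-heaps; float A[i] down to restore the property
        l, r = 2 * i, 2 * i + 1
        largest = i
        if l <= heap_size and A[l] > A[largest]:
            largest = l
        if r <= heap_size and A[r] > A[largest]:
            largest = r
        if largest != i:
            A[i], A[largest] = A[largest], A[i]
            max_heapify(A, largest, heap_size)

    def heapsort(A):
        n = len(A) - 1                        # data lives in A[1..n]
        for i in range(n // 2, 0, -1):        # BUILD-MAX-HEAP
            max_heapify(A, i, n)
        for i in range(n, 1, -1):
            A[1], A[i] = A[i], A[1]           # move the current max to the end
            max_heapify(A, 1, i - 1)
        return A

    print(heapsort([None, 4, 1, 9, 7, 3]))    # [None, 1, 3, 4, 7, 9]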
8. A max-priority queue supports the following operations:
INSERT(S, x) – inserts the element x into the set S. Runs in O(lg n): everything is Θ(1) except that the new key may have to move up the full height of the heap
MAXIMUM(S) – returns the element of S with the largest key (the highest-priority item in the queue). Runs in Θ(1)
EXTRACT-MAX(S) – removes and returns the element of S with the largest key. All else is Θ(1); only the call to MAX-HEAPIFY is Θ(lg n)
INCREASE-KEY(S, x, k) – increases the value of element x’s key to the new value k, which is assumed to be at least as large as x’s current key value (i.e., raise x’s priority to k, maintaining the priority queue’s structure). Runs in O(lg n), since the heap height is Θ(lg n) and the new value may move up as far as the root
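In practice, Python’s heapq module (a min-heap) can stand in for a max-priority queue by negating keys; a minimal sketch:

    import heapq

    pq = []                          # max-priority queue via negated keys
    for key in (4, 1, 3, 9, 7):
        heapq.heappush(pq, -key)     # INSERT: O(lg n)
    print(-pq[0])                    # MAXIMUM: 9, Θ(1)
    print(-heapq.heappop(pq))        # EXTRACT-MAX: 9, O(lg n)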
APPLICATIONS OF TREES
HUFFMAN ENCODING
Motivation: compression
The contents of a node are only meaningful at a leaf; internal nodes are of little
consequence
The path to each leaf is of great importance
Path used to encode (and compress) data
Allows recoding a fixed "alphabet" with variable-length bit strings
BINARY SEARCH TREE
For data lookup
The data stored in every node is important; the paths are of no consequence
An in-order traversal will output the values in increasing (or decreasing) order
Can support the (7) dynamic set operations:
• SEARCH: O(h) operations, where h is the height of the tree
• INSERT: requires a SEARCH (O(h)), plus a couple of pointer changes to attach the
new node, so it’s O(h)
• DELETE: requires Search (O(h)), and may require Successor (O(h)), and a constant
number of pointer changes, so it, too, is O(h)
• PREDECESSOR and SUCCESSOR: require SEARCH (O(h)) and may require MIN /
MAX (O(h)), so they’re O(h)
• MINIMUM and MAXIMUM: require walking from the root to the left/right-most leaf
(which depends on the height of the tree), so they’re O(h)
The height of a tree containing a given number of nodes (n) depends on how well-
balanced the tree is. If the tree is complete, lg(n); if it degenerates into a linked list, n.
Completely full trees will have 2^h – 1 nodes: 2^(h–1) leaves and 2^(h–1) – 1 internal nodes.
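A quick illustration of the in-order property (a minimal sketch; assumes nodes with key/left/right fields):

    def inorder(node):
        # Yields a BST's keys in increasing order
        if node is not None:
            yield from inorder(node.left)
            yield node.key
            yield from inorder(node.right)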
AVL TREES (h ≤ ~1.4404 lg N)
1. The tree is not maintained with perfect balance; however, it does stay “nearly balanced” or
“balanced enough” (−1 ≤ BF ≤ +1).
2. BF = height of the node’s LEFT subtree - height of the node’s RIGHT subtree
3. Properties of AVL Trees
• Time to insert: O(h)
• Time to search: O(h)
• Time to find successor / predecessor: O(h)
• Time to delete: O(h)
• Time to find min / max: O(h)
RED-BLACK TREES
1. It’s still a BST, and we keep it “mostly balanced”.
2. Uses a single bit for “color” (every node is either “red” or “black”) and some very
special rules.
3. The leaves are always empty – our data resides in the internal nodes.
Furthermore, it simplifies some of the coding (and saves space) if all leaves are the same
empty node, which Cormen calls T.NIL. The root’s parent is also T.NIL.
4. The “Ground Rules”:
i. Every node is colored either “Red” or “Black”
ii. The root is (always) black
iii. Every leaf, T.NIL, is always black
iv. If a node is red, then both of its children are black (hence, there can be no two
consecutive red nodes on any path from root to leaf)
v. For every node, the number of black nodes along every path from that node down to the leaves is the same
5. The height (h) at a node is the number of edges (links) between the node and the
leaves along the longest path
6. The black height (bh) at a node X is the number of black nodes (including T.NIL)
along the path from X to the leaves, not counting X.
7. SEARCH, SUCCESSOR, PREDECESSOR, MIN, and MAX work just like any other
BST, and are all O(h) = O(lg n)
8. INSERT and DELETE – not so easy
• O(lg n) time to run RBT-INSERT up to the call of RBT-INSERT-FIXUP
• Within RBT-INSERT-FIXUP :
• Each iteration takes O(1) time
• Each iteration is either the last one or it moves z up 2 levels
• O(lg n) levels. O(lg n) time
• Also note that there are at most 2 rotations overall
• Thus, insertion into a red-black tree takes O(lg n) time
RBT Insertion – The Three cases:
Case 1: z is red, z.p is red, & z’s uncle y is red
• Re-color z’s parent, z’s uncle, and z’s grandparent (no rotation)
• Move z up two levels (z = z’s grandparent) and try again, as long as z’s parent is red
Case 2: z and z.p are both red, z’s uncle y is black, z is a right child
• Set z = z’s parent and do a left rotation around it, making z a left child. Takes us immediately to…
Case 3: z and z.p are both red, z’s uncle y is black, z is a left child
• Recolor z’s parent and grandparent
• Do a right rotation around z’s grandparent
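For reference, the rotation primitive those cases rely on; a sketch of LEFT-ROTATE (Python; field names follow Cormen, with the T.NIL sentinel simplified to None here):

    def left_rotate(T, x):
        # Pivot the edge between x and its right child y; BST order is preserved
        y = x.right
        x.right = y.left             # y's left subtree becomes x's right subtree
        if y.left is not None:
            y.left.parent = x
        y.parent = x.parent          # link x's old parent to y
        if x.parent is None:
            T.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        y.left = x                   # put x on y's left
        x.parent = y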
GRAPHS
Notation: A graph G has sets of both vertices and edges. The edges connect the vertices.
Any vertex can have any number (zero or more) of edges connecting it to other vertices
Vertices are sometimes referred to as nodes
In an undirected graph, the edges are bidirectional.
In a directed graph, every edge has a direction, indicated by an arrow; a two-way
(bidirectional) connection between two vertices is represented by a pair of edges, one in
each direction, and is therefore counted as two edges.
The degree of a node is the number of edges connected to it.
In Euler’s graph, Vertex W is of degree 5; nodes N, E, and S are of degree 3 (Note: All
four nodes have odd degrees)
Euler proved that for there to be a path that crosses every bridge exactly once, there
must be no more than two vertices of odd degree
If there are two vertices of odd degree, then they must be the starting AND ending
points of the path (an odd-degree vertex can only be an endpoint)
If there are more than two nodes of odd degree, then such a path does not exist
(Note: the number of odd-degree vertices in any graph is even, so a graph cannot have exactly one)
In a directed graph, we also refer to the in-degree (the number of edges coming into
the node) and the out-degree (outbound edges) of a node
The degree of a node in a directed graph is the sum of the two
The number of edges is denoted by |E|, and the number of vertices is |V|.
The search time might be Θ(E+V), which means Θ(|E|+ |V|).
The vertex set of a graph G is G.V, and the edge set of a graph G is G.E
Given graph G = (V, E):
There are two common ways to represent a graph for algorithms:
Adjacency lists – a listing, for each vertex, of the edges from that vertex to each adjacent vertex
Better for sparse graphs, where there are not a large number of edges (it’s
more compact)
Adjacency matrix – a |V| x |V| matrix that has a 1 where there’s an edge between a
pair of vertices, a 0 if not
Preferred for graphs where there are many edges (dense graphs, where |E| → |V|²)
Adjacency List
An array Adj of |V| lists, one per vertex
Directed and undirected graphs: Vertex u’s list, Adj[u], has all vertices v such that there’s
an edge (u, v) ∈ E
If edges have weights, we can put the weights in the lists: weight function w : E → ℝ
Disadvantage: to determine whether there is an edge from one vertex to another, we have
to search through the first vertex’s list.
Advantage: Less space to store than a matrix
Directed graph: There is a total of |E| items across all of the adjacency lists
Undirected graph: There are 2|E| items (each edge appears in two lists)
Adjacency Matrix
Undirected graph: the matrix is symmetric about its diagonal, i.e., the adjacency matrix
equals its transpose (A = Aᵀ)
A fully-connected or complete graph is one in which every vertex has an edge to every other
vertex.
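A small sketch of both representations (Python; the four-vertex edge list is an invented example):

    edges = [(0, 1), (0, 2), (1, 2), (2, 3)]    # hypothetical undirected graph, |V| = 4
    V = 4

    # Adjacency list: an array Adj of |V| lists (compact; good for sparse graphs)
    adj = [[] for _ in range(V)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)             # undirected: each edge stored twice (2|E| items)

    # Adjacency matrix: |V| x |V|; 1 where an edge exists (good for dense graphs)
    mat = [[0] * V for _ in range(V)]
    for u, v in edges:
        mat[u][v] = mat[v][u] = 1    # symmetric: A equals its transpose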
Breadth-First Search (BFS)
• A shortest-path algorithm that works on unweighted graphs (graphs in which all edges are
assumed to have a weight of 1)
Initialization: every vertex u gets u.d = ∞ and u.π = NIL; the source gets s.d = 0 (s.π stays NIL).
When exploring edge (u, v) discovers a white vertex v, we set v.d = u.d + 1 and v.π = u.
[Figures: BFS traces on an undirected and on a directed example graph; the directed version differs only by one added instruction, highlighted in red on the original slide.]
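A sketch of BFS with distances and predecessors (Python; adjacency-list input, a deque as the FIFO queue):

    from collections import deque
    from math import inf

    def bfs(adj, s):
        d = [inf] * len(adj)         # u.d: shortest-path distance (all edges weight 1)
        pi = [None] * len(adj)       # u.π: predecessor
        d[s] = 0
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if d[v] == inf:      # still "white": not yet discovered
                    d[v] = d[u] + 1
                    pi[v] = u
                    q.append(v)
        return d, pi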
Depth-First Search (DFS)
Goes “deeper” before it goes “wider”
Doesn’t start from a particular vertex (but still hits all vertices and edges)
Each vertex v has two timestamps – one, v.d, is when we discover the vertex; the other,
v.f, is when we are finished with the vertex
To accomplish this, each vertex is one of three colors:
• White – we haven’t discovered that vertex yet (time < v.d)
• Gray – vertex has been discovered, but we're not finished exploring beyond it (v.d <
time < v.f)
• Black – we're finished with that vertex (and everything beyond it) (time > v.f)
Run Time: Θ(V + E)
Note: In the boxed graph, initially D = gray; D does not have any adjacent white vertex, so
color it black, increment the time from 4 to 5, and pop back up the recursion (the same applies to F).
Each edge (u, v) can be classified by the color of v when the edge is first explored:
• If v is white: (u, v) is a tree edge
• If v is gray: (u, v) is a back edge
• If v is black: (u, v) is either a forward or a cross edge
If u.d < v.d then (u, v) is a forward edge
If u.d > v.d then (u, v) is a cross edge
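A sketch of DFS with the two timestamps and the three colors (Python):

    def dfs(adj):
        n = len(adj)
        color = ["white"] * n
        d = [0] * n                   # v.d: discovery times
        f = [0] * n                   # v.f: finish times
        time = 0

        def visit(u):
            nonlocal time
            time += 1
            d[u] = time
            color[u] = "gray"         # discovered, not yet finished
            for v in adj[u]:
                if color[v] == "white":
                    visit(v)          # (u, v) is a tree edge
            color[u] = "black"        # finished with u and everything beyond it
            time += 1
            f[u] = time

        for u in range(n):            # no particular start vertex: cover everything
            if color[u] == "white":
                visit(u)
        return d, f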
Topological Sort
Directed, Acyclic Graph (must be acyclic to do a linear ordering)
Linear ordering (a listing) of its vertices V such that for every edge (u, v), u appears
(somewhere) before v in the ordered list
Partial ordering – there’s not one single sequence in which all of the events have to
occur, but some events have to precede some others
Not unique
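Topological order can be read off the DFS finish times in reverse; a minimal sketch (assumes the input really is a DAG):

    def topological_sort(adj):
        n = len(adj)
        visited = [False] * n
        order = []

        def visit(u):
            visited[u] = True
            for v in adj[u]:
                if not visited[v]:
                    visit(v)
            order.append(u)           # record u when we finish it

        for u in range(n):
            if not visited[u]:
                visit(u)
        return order[::-1]            # reverse finish order = topological order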
Greedy Algorithm
Locally-optimized decisions
i.e., they make the most advantageous choice at any given point, without regard for
how that choice plays out later in the problem
As such, the most appealing choice at the moment may turn out, in the long run, to be a
poor choice
Greedy algorithms never back up and “change their mind” based on what they later learn
The Huffman tree-building algorithm is an example of a greedy algorithm
It always picks the two remaining items with the lowest weights to select next for
merging into a new subtree
In the case of Huffman, the greedy approach does happen to build the optimal
encoding tree (as Huffman proved)
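A sketch of that greedy loop using a min-heap (Python; the nested-tuple tree representation is an illustrative choice):

    import heapq

    def build_huffman(freqs):
        # freqs: {symbol: weight}; returns the root of the encoding tree
        heap = [(w, i, sym) for i, (sym, w) in enumerate(freqs.items())]
        heapq.heapify(heap)
        count = len(heap)
        while len(heap) > 1:
            w1, _, left = heapq.heappop(heap)     # the two lowest-weight items...
            w2, _, right = heapq.heappop(heap)
            heapq.heappush(heap, (w1 + w2, count, (left, right)))  # ...merge into a subtree
            count += 1                # unique counter keeps heap entries comparable
        return heap[0][2]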
Example: Count out 36 cents in change
Greedily take the largest coin that still fits: 25 (QUARTER) + 10 (DIME) + 1 (PENNY) = 36 cents; we’re done with 3 coins.
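The same idea in a few lines (Python; US coin denominations assumed):

    def greedy_change(amount, coins=(25, 10, 5, 1)):
        used = []
        for c in coins:               # most advantageous (largest) coin first
            while amount >= c:
                amount -= c
                used.append(c)
        return used

    print(greedy_change(36))          # [25, 10, 1]: done with 3 coins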
SPANNING TREES
Given a connected, undirected graph G = (V, E), and a weight w(u, v) for each edge (u, v) ∈ E
We want to find a subset of E, which we’ll call T (i.e., T ⊆ E), such that the total weight
w(T) = Σ (u,v)∈T w(u, v) is minimized
Since T is acyclic, undirected, and connects all of the vertices, it must form a (single) tree,
which “spans” the graph, making it a spanning tree (a spanning tree can be obtained directly
from the DFS algorithm)
A tree T that minimizes the total weight is called the minimum spanning tree, more precisely
the “minimum-weight spanning tree”.
Greedy algorithms are not always guaranteed to find the optimal solutions, but for
minimum spanning tree problems, they do always find an optimal solution
Note: The MST may not be unique, but there is a single minimum weight, which is 37 for this
example. The graph has no spanning tree with a weight < 37.
Minimum Spanning Tree (MST) Properties
It has |V| – 1 edges
It has no cycles
It might not be unique for a given graph
i.e., more than one tree with the minimum weight may exist
We can use the loop invariant (A is a subset of some MST) to show that this generic
algorithm works
Initialization: The empty set trivially satisfies the loop invariant
Maintenance: Since we add only safe edges, A remains a subset of some MST
Termination: All edges added to A are in an MST, so when we stop, A is a spanning
tree that is also an MST
Kruskal’s Algorithm
In Kruskal’s algorithm, the set A is a forest (multiple trees). The safe edge we add always
connects two distinct trees.
Kruskal’s algorithm depends on the existence of a few set-based (disjoint-set) functions:
MAKE-SET(x) creates a set containing the single item x
FIND-SET(x) looks through the sets it is maintaining and determines which set x
belongs to
UNION(u, v) merges two sets (the one containing u and the one containing v) into a single
set (represented here by u’s set)
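A compact sketch of Kruskal’s algorithm with those three operations (Python; path compression included, union by rank omitted for brevity):

    def kruskal(n, edges):
        # n vertices 0..n-1; edges: list of (weight, u, v); returns the MST edge list
        parent = list(range(n))              # MAKE-SET for every vertex

        def find(x):                         # FIND-SET, with path compression
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x

        mst = []
        for w, u, v in sorted(edges):        # consider edges by increasing weight
            ru, rv = find(u), find(v)
            if ru != rv:                     # edge connects two distinct trees: safe
                parent[rv] = ru              # UNION
                mst.append((u, v, w))
        return mst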
Negative-Weight Edges
If we follow edge (d, c), that takes the distance from s to c to 8 via the path p = ⟨s, c, d, c⟩,
with weights 5 + 6 – 3 = 8, which is not a new minimum, so we don’t consider it.
This cycle has a net weight of 6 – 3 = +3, so following the cycle only lengthens the path.
If we follow (s, e) and then (e, f), we have a total distance from s to f of 5. But if we
follow (f, e), then the distance from s to e drops to –1. Following (e, f) and then (f, e)
again will drop the distance further to –4. We’re looking for the shortest-distance
path, so following the cycle, which has a net weight of 3 – 6 = –3, does minimize the
total. But we get stuck in it, driving the total path weight to –∞. Thus the shortest-
distance path from s to e, f, and g is –∞.
Despite the fact that {h, i, j} contains a negative-weight cycle (net weight: –3), the
fact that nodes h, i, and j are all unreachable from s means that the distance from s to
each of {h, i, j} is +∞.
Cycles
Clearly, reachable, negative-weight cycles can’t be part of a shortest-length path
Similarly, positive-weight cycles would not be taken, as following the cycle would only
raise the path’s length above the minimum
Zero-weight cycles can be removed without changing the net path length, so they
effectively don’t raise or lower the total
Therefore, we restrict our shortest-path discussion to paths with no cycles
In the BELLMAN-FORD algorithm, edges can be relaxed many times
DIJKSTRA’s algorithm relaxes each edge exactly once
BELLMAN-FORD Algorithm
Limitation: when a negative-weight cycle is reachable, no shortest paths exist; BELLMAN-FORD detects this case and reports failure.
Run Time: O(E × (V-1)) = O(VE)
Relax all of the edges V – 1 times (for the example graph, 5 – 1 = 4 passes)
Relaxing edge (u, v): check whether v.d > u.d + w(u, v); if so, reduce v.d to u.d + w(u, v)
After the V – 1 passes, check every edge (u, v) once more: if some v.d can still be reduced
(i.e., v.d > u.d + w(u, v) still holds somewhere), the graph has a negative-weight cycle and
BELLMAN-FORD reports failure; on success, d[v] ≤ d[u] + w(u, v) for every edge
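A sketch in Python (edge-list input; returns None when a reachable negative-weight cycle is detected):

    from math import inf

    def bellman_ford(n, edges, s):
        # n vertices; edges: list of (u, v, w); s: source vertex
        d = [inf] * n
        d[s] = 0
        for _ in range(n - 1):               # relax every edge V - 1 times
            for u, v, w in edges:
                if d[u] + w < d[v]:
                    d[v] = d[u] + w
        for u, v, w in edges:                # one extra pass: anything still relaxable?
            if d[u] + w < d[v]:
                return None                  # negative-weight cycle reachable from s
        return d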
DIJKSTRA’s Algorithm
Edge weights must be ≥ 0 (nonnegative)
Weighted version of BFS
Instead of a FIFO queue, uses a priority queue
Keys are shortest-path weights (v.d)
Two sets of vertices:
S: vertices whose final shortest-path weights have been determined, and
Q: a priority queue containing {V – S} (all of the vertices in the graph except those in
S)
The running time depends on implementation of priority queue
If a binary heap, each operation takes O(lg V) time, and there are O(E) of them, so O(E lg V)
If a Fibonacci heap: O(V lg V + E)
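A sketch using heapq as the priority queue (Python; stale-entry skipping stands in for DECREASE-KEY):

    import heapq
    from math import inf

    def dijkstra(adj, s):
        # adj[u]: list of (v, w) with w >= 0; s: source vertex
        d = [inf] * len(adj)
        d[s] = 0
        pq = [(0, s)]                        # keys are shortest-path estimates v.d
        while pq:
            du, u = heapq.heappop(pq)
            if du > d[u]:                    # stale entry: u was already finalized
                continue
            for v, w in adj[u]:
                if du + w < d[v]:            # relax edge (u, v)
                    d[v] = du + w
                    heapq.heappush(pq, (d[v], v))
        return d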
All Pairs Shortest Paths
All-pairs shortest paths (i.e., the shortest path from every vertex to every other vertex)
We could use the single-source algorithms |V| times
If there are no negative weights, we can use Dijkstra’s algorithm
If there are negative weights, we can’t use Dijkstra’s algorithm at all
If we use BELLMAN-FORD, the run time is O(V²E). If the graph is dense, then |E|
approaches |V|², so the run time becomes O(V⁴)
Floyd-Warshall Algorithm
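The notes leave this as a heading; for reference, a minimal sketch of the Θ(n³) triple loop (distance-matrix input, ∞ where there is no edge):

    from math import inf

    def floyd_warshall(dist):
        # dist: n x n matrix; dist[i][j] = w(i, j), 0 on the diagonal, inf if no edge
        n = len(dist)
        d = [row[:] for row in dist]         # work on a copy
        for k in range(n):                   # allow vertex k as an intermediate
            for i in range(n):
                for j in range(n):
                    if d[i][k] + d[k][j] < d[i][j]:
                        d[i][j] = d[i][k] + d[k][j]
        return d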
B-Tree
B-Trees raise the branching factor, lowering the height.
When we have t keys, we have t + 1 children
To search the tree:
• Values < k1 follow c1
• Values between k1 and k2 follow c2
• Values between k2 and k3 follow c3
• Values > k3 follow c4
Most of the nodes have more than one key. Number of children per node = the number of
keys in the node + 1. E.g., a node consisting of 2 keys (D and H) has 3 children.
Insert the new key into an existing leaf node, rather than appending a new leaf node (as
we do with BST/ AVL/ RBT)
If that node is currently full, the node has to be split into two nodes, both of which
are guaranteed to have some new empty space, into which the new key is
guaranteed to fit
B-TREE-SPLIT-CHILD splits the node y along its median, so half the keys can go in
one node, the other half in the other node
The median key is denoted y.key_t
The two nodes become the children of the old node y’s parent, and y’s original median
key y.keyt is moved up (“promoted”) into y’s parent
There is a possibility that y’s parent is full, so the tree-split routine has to be run
again, possibly all the way back up the tree to (and including) the root
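A sketch of the search rule (Python; the node layout is an assumption for illustration: a sorted node.keys list, a node.children list with one more entry than keys, and a node.leaf flag):

    def btree_search(node, k):
        # Find the child interval for k: keys k1..kn partition values among c1..c(n+1)
        i = 0
        while i < len(node.keys) and k > node.keys[i]:
            i += 1
        if i < len(node.keys) and k == node.keys[i]:
            return (node, i)                 # k found in this node
        if node.leaf:
            return None                      # nowhere further to descend
        return btree_search(node.children[i], k)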
TIME COMPLEXITY
1. INSERTION SORT
Best Case: O(N)
Worst Case: O(N²)
2. MERGE: Θ(N)
3. MERGE SORT: O(N lg N)
Divide: Just compute q as the average of p and r ⇒ D(n) = O(1)
Conquer: Recursively solve 2 subproblems, each of size n/2 ⇒ 2T(n/2)
Combine: MERGE on an n-element subarray takes O(n) time ⇒ C(n) = O(n)
4. HUFFMAN TREE: O(N log N)
5. COUNTING SORT: O(2N + 2k) = O(N + k), which is O(N) when k = O(N)
6. RADIX SORT: O(d × N), which is O(N) when the number of digits d is constant
7. QUICKSORT
Average Case: O (N lg N)
Worst Case (Already Sorted): O(N²)
8. HEAPSORT: O(n lg n)
Height of Heap: Θ (lg n)
MAX-HEAPIFY: Θ (lg n)
HEAP-MAXIMUM: Θ(1)
HEAP-EXTRACT-MAX: All else is Θ(1), except MAX-HEAPIFY
HEAP-INCREASE-KEY: O(lg n), may move new value up heap to root;
heap height is Θ(lg n)
MAX-HEAP-INSERT: O(lg n)
9. AVL Trees
Height of tree (h): O(lg n), since h ≤ ~1.4404 lg N; unlike a plain BST (height lg n if
complete, n if degenerated into a linked list), an AVL tree cannot degenerate
insert: O(h)
search: O(h)
find successor / predecessor: O(h)
delete: O(h)
find min / max: O(h)
10. RED-BLACK TREE
INSERT and DELETE: O(lg n)
11. BREADTH FIRST SEARCH (BFS): O(V+E)
O(V) because every vertex enqueued at most once
O(E) because every vertex dequeued at most once and we examine
edge (u, v) only when u is dequeued
Therefore, every edge examined at most once if directed, at most
twice if undirected
Either way, that’s still O(E)
12. DEPTH FIRST SEARCH (DFS): Θ(V+E)
Θ, not just O, since it is guaranteed to examine every vertex and
every edge
13. TOPOLOGICAL SORT ON A DAG (Directed Acyclic Graph): O(V + E)
14. KRUSKAL Algorithm: O(E lg V)
15. INITIALIZE-SINGLE-SOURCE (used by both the BELLMAN-FORD and DIJKSTRA algorithms): Θ(V)
16. BELLMAN-FORD Algorithm
Single-source: O(V × E)
Run once per source (|V| times) for all-pairs: O(V²E)
If the graph is dense, then |E| approaches |V|², so the run time becomes O(V⁴)
17. DIJKSTRA’s Algorithm
Binary heap: O(E lg V) [Each operation takes O(lg V) time, and there
are E of them]
Fibonacci heap: O(V lg V + E)
18. Floyd-Warshall Algorithm: Θ(n³)
19. B-Tree
Search, insert, delete: O(log N)