WWW.KVRSOFTWARES.BLOGSPOT.
COM
Chapter 12
Digital Search Structures
Digital Search Trees
Binary Tries and Patricia
Multiway Tries
C-C Tsai P.1
Digital Search Tree
A digital search tree is a binary tree in which
each node contains one element.
Assume every key has the same fixed number of bits.
If the tree is not empty:
Root contains one dictionary pair (any pair).
All remaining pairs whose key begins with a 0
are in the left subtree.
All remaining pairs whose key begins with a 1
are in the right subtree.
Left and right subtrees are digital search trees on
the remaining bits.
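The definition above can be sketched in Python as follows. This is a minimal illustration, not the lecture's own code: keys are assumed to be equal-length strings of '0' and '1', and the names DSTNode and DigitalSearchTree are hypothetical.

```python
class DSTNode:
    def __init__(self, key, value):
        self.key, self.value = key, value
        self.left = self.right = None


class DigitalSearchTree:
    """Assumes distinct keys given as equal-length strings of '0'/'1'."""

    def __init__(self):
        self.root = None

    def search(self, key):
        node, depth = self.root, 0
        while node is not None:
            if node.key == key:          # one key comparison per level
                return node.value
            # Bit number depth+1 decides the subtree at this level.
            node = node.left if key[depth] == "0" else node.right
            depth += 1
        return None

    def insert(self, key, value):
        if self.root is None:
            self.root = DSTNode(key, value)
            return
        node, depth = self.root, 0
        while True:
            if node.key == key:          # key already present: update it
                node.value = value
                return
            side = "left" if key[depth] == "0" else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, DSTNode(key, value))
                return
            node, depth = child, depth + 1


tree = DigitalSearchTree()
for k in ["0110", "0010", "1001", "1011", "0000"]:
    tree.insert(k, k)
```

Each level costs one full key comparison, which is why the operations become expensive when keys are long.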
Example of Digital Search Tree
Start with an empty digital search tree and insert a
pair whose key is 0110.
0110
Now, insert a pair whose key is 0010.
0110
0010
Now, insert a pair whose key is 1001.
0110
0010 1001
Example
Now, insert a pair whose key is 1011.
Before:
        0110
       /    \
    0010    1001
After:
        0110
       /    \
    0010    1001
            /
         1011
Now, insert a pair whose key is 0000.
        0110
       /    \
    0010    1001
    /       /
 0000    1011
Search/Insert/Delete
Complexity of each operation is O(#bits in a key).
#key comparisons = O(height).
Expensive when keys are very long.
0110
0010 1001
0000 1011
Applications of Digital Search Trees
Analog of radix sort to searching.
Keys are binary bit strings.
Fixed length – 0110, 0010, 1010, 1011.
Variable length – 01, 00, 101, 1011.
Application – IP routing, packet
classification, firewalls.
IPv4 – 32 bit IP address.
IPv6 – 128 bit IP address.
Binary Trie
The name trie comes from information retrieval.
At most one key comparison per operation
(search/insert/delete).
A binary trie (pronounced "try") is a binary tree
that has two kinds of nodes: branch nodes
and element nodes. For fixed length keys:
Branch nodes: Left and right child pointers. No
data field(s).
Element nodes: No child pointers. Data field to
hold dictionary pair.
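The two node kinds and the fixed length search and insert can be sketched as below. This is an illustrative sketch under stated assumptions: keys are distinct equal-length strings of '0' and '1', and the names Branch, Element, search and insert are hypothetical.

```python
class Branch:
    """Branch node: two child pointers, no data."""
    def __init__(self):
        self.child = [None, None]   # child[0] = left (bit 0), child[1] = right (bit 1)


class Element:
    """Element node: one dictionary pair, no children."""
    def __init__(self, key, value):
        self.key, self.value = key, value


def search(root, key):
    node, depth = root, 0
    while isinstance(node, Branch):          # branching uses bits only
        node = node.child[int(key[depth])]
        depth += 1
    # The single key comparison of the search.
    if node is not None and node.key == key:
        return node.value
    return None


def insert(node, key, value, depth=0):
    """Insert into the subtrie rooted at node; return the new subtrie root."""
    if node is None:                         # empty slot: no key comparison needed
        return Element(key, value)
    if isinstance(node, Element):
        if node.key == key:                  # same key: update the pair
            node.value = value
            return node
        # Different key: push the old element one level down and retry.
        branch = Branch()
        branch.child[int(node.key[depth])] = node
        return insert(branch, key, value, depth)
    bit = int(key[depth])
    node.child[bit] = insert(node.child[bit], key, value, depth + 1)
    return node


root = None
for k in ["0001", "0011", "1000", "1001", "1100"]:
    root = insert(root, k, k)
```

An insert that lands in an empty child slot makes no key comparison, while an insert that meets an element node makes one and then adds branch nodes until the two keys separate.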
Example of Binary Trie
(0 = left child, 1 = right child)
root
  0: branch
    0: branch
      0: 0001
      1: 0011
  1: branch
    0: branch
      0: branch
        0: 1000
        1: 1001
    1: 1100
At most one key comparison for a search.
Variable Key Length
Each branch node has left and right child fields.
Each branch node also has left and right pair fields.
Left pair is pair whose key terminates at root of
left subtree or the single pair that might otherwise
be in the left subtree.
Right pair is pair whose key terminates at root of
right subtree or the single pair that might
otherwise be in the right subtree.
Field is null otherwise.
Example of Variable Key Length
0 null
0 1
00 01100 10 11111
0 0
0000 001 1000 101
1
00100 001100
At most one key comparison for a search.
Fixed Length Insert
Insert 0111. 0 1
0 0 1
1
1100
0 1 0111 0
0001 0011
0 1
1000 1001
Zero compares.
Fixed Length Insert
Now, Insert 1101 0 1
0 0 1
1
1100
0 1 0111 0
0001 0011
0 1
1000 1001
Fixed Length Insert
1100
Insert 1101 0 1
0 0 1
1
0 1 0111 0 0
0001 0011
0 1 0
1000 1001
Fixed Length Insert
Inserted 1101. 0 1
0 0 1
1
0 1 0111 0 0
0001 0011
0 1 0 1
1000 1001 1100 1101
One compare.
Fixed Length Delete
Now, Delete 0111. 0 1
0 0 1
1
0 1 0111 0 0
0001 0011
0 1 0 1
1000 1001 1100 1101
Fixed Length Delete
0 1
0 0 1
0 1 0 0
0001 0011
0 1 0 1
1000 1001 1100 1101
Delete 0111. One compare.
Fixed Length Delete
Now, Delete 1100. 0 1
0 0 1
0 1 0 0
0001 0011
0 1 0 1
1000 1001 1100 1101
Fixed Length Delete
0 1
0 0 1
0 1 0 0
0001 0011
0 1 1
1000 1001 1101
Delete 1100.
Fixed Length Delete
1101
0 1
0 0 1
0 1 0 0
0001 0011
0 1
1000 1001
Delete 1100.
Fixed Length Delete
1101
0 1
0 0 1
0 1 0
0001 0011
0 1
1000 1001
Delete 1100.
Fixed Length Delete
0 1
0 0 1
1101
0 1 0
0001 0011
0 1
1000 1001
Delete 1100. One compare.
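The delete steps traced on these slides can be sketched as follows. This is a hedged illustration, not the lecture's own code: the node classes and the insert helper are hypothetical names restated here so the sketch runs on its own, and keys are assumed to be distinct equal-length bit strings.

```python
class Branch:
    def __init__(self):
        self.child = [None, None]   # child[0] = left, child[1] = right


class Element:
    def __init__(self, key, value):
        self.key, self.value = key, value


def insert(node, key, value, depth=0):
    """Helper used only to build the example trie."""
    if node is None:
        return Element(key, value)
    if isinstance(node, Element):
        if node.key == key:
            node.value = value
            return node
        branch = Branch()
        branch.child[int(node.key[depth])] = node
        return insert(branch, key, value, depth)
    bit = int(key[depth])
    node.child[bit] = insert(node.child[bit], key, value, depth + 1)
    return node


def delete(node, key, depth=0):
    """Delete key from the subtrie rooted at node; return the new subtrie root."""
    if node is None:
        return None
    if isinstance(node, Element):
        # The one key comparison: drop the element only if the key matches.
        return None if node.key == key else node
    bit = int(key[depth])
    node.child[bit] = delete(node.child[bit], key, depth + 1)
    kids = [c for c in node.child if c is not None]
    if not kids:
        return None                 # branch node became empty
    if len(kids) == 1 and isinstance(kids[0], Element):
        return kids[0]              # a lone element moves up one level
    return node                     # branch is still needed


root = None
for k in ["0001", "0011", "0111", "1000", "1001", "1100", "1101"]:
    root = insert(root, k, k)
root = delete(root, "0111")
root = delete(root, "1100")         # 1101 moves up two levels
```

A branch node whose only child is another branch node is kept, which matches the trie shown after deleting 0111.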
Fixed Length Join(S,m,B)
Insert m into B to get B’.
S empty => B’ is answer; done.
S is element node => insert S element
into B’; done;
B’ is element node => insert B’ element
into S; done;
If you get to this step, the roots of S and
B’ are branch nodes.
Fixed Length Join(S,m,B)
J(X,Y) = join of X and Y, where all keys in X < all keys in Y.
Below, (x, y) denotes a branch node with left subtrie x and right subtrie y.
Case 1: S has empty right subtree.
S = (a, empty), B’ = (b, c) => J(S,B’) = (J(a,b), c).
Case 2: S has nonempty right subtree.
Left subtree of B’ must be empty, because all keys
in B’ > all keys in S.
S = (a, b), B’ = (empty, c) => J(S,B’) = (a, J(b,c)).
Complexity = O(height).
Compressed Binary Tries
No branch node whose degree is 1.
Add a bit# field to each branch node.
bit# tells you which bit of the key to use
to decide whether to move to the left or
right subtrie.
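A search in a compressed binary trie reads only the bits named by the bit# fields, and an insert adds one branch node at the first bit where the new key differs from the key reached by the search. The sketch below is illustrative only: bit positions are numbered from 1 at the left as on the slides, keys are assumed to be distinct equal-length bit strings, and the names Branch, Element, search and insert are hypothetical.

```python
class Element:
    def __init__(self, key, value):
        self.key, self.value = key, value


class Branch:
    def __init__(self, bit, left, right):
        self.bit, self.left, self.right = bit, left, right   # bit is 1-based


def _step(node, key):
    """Child selected by the bit that this branch node tests."""
    return node.left if key[node.bit - 1] == "0" else node.right


def search(root, key):
    node = root
    while isinstance(node, Branch):
        node = _step(node, key)
    return node.value if node is not None and node.key == key else None


def insert(root, key, value):
    """Return the root of the trie after inserting (key, value)."""
    if root is None:
        return Element(key, value)
    node = root
    while isinstance(node, Branch):          # 1. follow key to some element
        node = _step(node, key)
    if node.key == key:
        node.value = value
        return root
    # 2. first bit (1-based) where the new key and the found key differ
    j = next(i for i in range(len(key)) if key[i] != node.key[i]) + 1
    # 3. descend again, stopping where a branch on bit j belongs
    parent, cur = None, root
    while isinstance(cur, Branch) and cur.bit < j:
        parent, cur = cur, _step(cur, key)
    new = Element(key, value)
    branch = Branch(j, new, cur) if key[j - 1] == "0" else Branch(j, cur, new)
    if parent is None:
        return branch
    if key[parent.bit - 1] == "0":
        parent.left = branch
    else:
        parent.right = branch
    return root


root = None
for k in ["0001", "0011", "1000", "1001", "1100", "1101"]:
    root = insert(root, k, k)
```

With n elements this structure has n - 1 branch nodes, because every branch node has exactly two nonempty subtries.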
Example: Binary Trie
1
0 1
2
0 0 1
3
0 1 0 0
0001 4 4
0011
0 1 0 1
1000 1001 1100 1101
bit# field shown in black outside branch node.
Example: Compressed Binary Trie
0 1
1
3 2
0 1
0 1
0001 0011
4 4
0 1 0 1
1000 1001 1100 1101
bit# field shown in black outside branch node.
#branch nodes = n – 1.
Insert
0 1
1
3 2
0 1
0 1
0001 0011
4 4
0 1 0 1
1000 1001 1100 1101
Now, Insert 0010.
Example: After Inserting 0010
0 1
1
3 2
0 1
0 1
0001 4
0 1 4 4
0010 0011 0 1 0 1
1000 1001 1100 1101
Now, Insert 0100.
Example: Insert 0100
1
0 1
2 2
0 1 0 1
3
0100 4 4
0 1
0 1 0 1
0001 4 1000 1001 1100 1101
0 1
0010 0011
Delete
0 1
1
2 2
0 1 0 1
3
0100 4 4
0 1
0 1 0 1
0001 4 1000 1001 1100 1101
0 1
0010 0011 Now, Delete 0010.
Example: After Deleting 0010
0 1
1
2 2
0 1 0 1
3
0100 4 4
0 1
0 1 0 1
0001 0011 1000 1001 1100 1101
Now, Delete 1001.
Example: After Deleting 1001
0 1
1
2 2
0 1 0 1
3
0100
1000 4
0 1
0 1
0001 0011
1100 1101
Patricia
Practical Algorithm To Retrieve Information
Coded In Alphanumeric.
All nodes in Patricia structure are of the same
data type (binary tries use branch and element
nodes).
Pointers to only one kind of node.
Simpler storage management.
Uses a header node whose bit# is 0.
Remaining nodes define a trie structure that is the
left subtree of the header node. (right subtree is
not used)
Trie structure is the same as that for the
compressed binary trie.
Node Structure
bit# LC Pair RC
bit# = bit used for branching
LC = left child pointer
Pair = dictionary pair
RC = right child pointer
Compressed Binary Trie To Patricia
0 1
1
3 2
0 1
0 1
0001 0011
4 4
0 1 0 1
1000 1001 1100 1101
Move each element into an ancestor or header node.
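Example: Compressed Binary Trie To Patricia
node    bit#  Pair  LC              RC
header  0     0001  A               (not used)
A       1     1101  B               C
B       3     0011  header (back)   B (self)
C       2     1001  D               E
D       4     1000  D (self)        C (back)
E       4     1100  E (self)        A (back)
Node names A to E are labels for this listing only.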
Insert
Insert 0000101 0
0000101
0
Insert 0000000 0000101
5
0000000
Insert
0
Now, Insert 0000010 0000101
5
0000000
0
Inserted 0000010 0000101
5
0000000
6
0000010
Insert
0
Insert 0001000 0000101 0
0000101
5
0000000 4
0001000
6
0000010 5
0000000
6
0000010
Insert
0
Insert 0000100 0000101
4
0001000
5
0000000
6
0000010
Insert
0
Insert 0001010 0000101
4
0001000
5
0000000
6 7
0000010 0000100
Insert
0
Inserted 0001010 0000101
4
0001000
5 6
0000000 0001010
6 7
0000010 0000100
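The insert sequence traced on these slides can be reproduced with the sketch below. It is an illustrative implementation under stated assumptions, not the lecture's own code: keys are distinct equal-length bit strings, bits are numbered from 1 at the left, a search stops when a pointer leads to a node whose bit# is not larger than the current one (a back or self pointer), and the names Node, Patricia and bit are hypothetical.

```python
class Node:
    def __init__(self, bit, key, value):
        self.bit, self.key, self.value = bit, key, value
        self.left = self.right = None


def bit(key, i):
    """Bit i of key, numbered from 1 at the left."""
    return int(key[i - 1])


class Patricia:
    def __init__(self):
        self.header = None          # header node has bit# 0; only its left pointer is used

    def _find(self, key):
        """Follow key's bits until a pointer leads upward or to the same node."""
        parent, node = self.header, self.header.left
        while node.bit > parent.bit:
            parent, node = node, (node.right if bit(key, node.bit) else node.left)
        return node

    def search(self, key):
        if self.header is None:
            return None
        node = self._find(key)
        return node.value if node.key == key else None   # single key comparison

    def insert(self, key, value):
        if self.header is None:
            self.header = Node(0, key, value)
            self.header.left = self.header                # self pointer
            return
        found = self._find(key)
        if found.key == key:
            found.value = value
            return
        # First bit where the new key differs from the key reached by the search.
        j = next(i for i in range(1, len(key) + 1) if bit(key, i) != bit(found.key, i))
        # Descend again, stopping before the first node whose bit# is >= j.
        parent, node = self.header, self.header.left
        while node.bit > parent.bit and node.bit < j:
            parent, node = node, (node.right if bit(key, node.bit) else node.left)
        new = Node(j, key, value)
        if bit(key, j):
            new.left, new.right = node, new               # self pointer on the 1 side
        else:
            new.left, new.right = new, node               # self pointer on the 0 side
        if parent is self.header or bit(key, parent.bit) == 0:
            parent.left = new
        else:
            parent.right = new


trie = Patricia()
for k in ["0000101", "0000000", "0000010", "0001000", "0000100", "0001010"]:
    trie.insert(k, k)
```

Because every node stores both a bit# and a pair, only one node type is needed, which is the storage management advantage noted on the Patricia slide.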
Delete
Let p be the node that contains the
dictionary pair that is to be deleted.
Case 1: p has one self pointer.
Case 2: p has no self pointer.
p Has One Self Pointer
p = header => trie is now empty.
Set trie pointer to null.
p != header => remove node p and update
pointer to p.
p p
0001000 0000000
p Has No Self Pointer
Let q be the node that has a back pointer to p.
Node q was determined during the search for
the pair with the delete key k.
p
0001000
Blue pointer could
be red or black.
q
y
p Has No Self Pointer
Use the key y in node q to find the unique node
r that has a back pointer to node q.
p
0001000
q
y
r
z
p Has No Self Pointer
Copy the pair whose key is y to node p.
p
0001000
y
q
y
r
z
p Has No Self Pointer
Change back pointer to q in node r to point to
node p.
p
0001000
y
q
y
r
z
p Has No Self Pointer
Change forward pointer to q from parent(q) to
child of q.
p
0001000
y
q
y
r
z
Node q has now been removed from the trie.
Multiway Tries
Key = Social Security Number.
441-12-1135
9 decimal digits.
10-way trie (order 10 trie).
0 1 2 3 4 5 6 7 8 9
Height <= 10.
Social Security Trie
10-way trie
Height <= 10.
Search: <= 9 branches on digits plus 1 compare.
100-way trie
441-12-1135
Height <= 6.
Search: <= 5 branches on digits plus 1 compare.
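The branch counts on this slide follow from how many digits each level consumes. The short sketch below splits the sample key into the digit groups a search would branch on; the function name branch_digits is hypothetical.

```python
def branch_digits(key, digits_per_level):
    """Digit groups used for branching, one group per trie level."""
    digits = key.replace("-", "")
    return [digits[i:i + digits_per_level]
            for i in range(0, len(digits), digits_per_level)]


ten_way = branch_digits("441-12-1135", 1)       # one digit per level
hundred_way = branch_digits("441-12-1135", 2)   # two digits per level
print(len(ten_way), ten_way)
print(len(hundred_way), hundred_way)
```

Nine digits give at most 9 branches in the 10-way trie and at most 5 in the 100-way trie (the last group holds a single digit), and the element node adds one more level in each case.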
Social Security AVL & Red-Black
With up to n = 10^9 nine-digit keys:
Red-black tree
Height <= 2 log_2(10^9) ~ 60.
Search: <= 60 compares of 9 digit numbers.
AVL tree
Height <= 1.44 log_2(10^9) ~ 43.
Search: <= 43 compares of 9 digit numbers.
Best binary tree.
Height = log_2(10^9) ~ 30.
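The height bounds can be checked numerically. The snippet below assumes n = 10^9 keys, one for every possible 9 digit number; the slide figures are rounded approximations of these values.

```python
import math

n = 10 ** 9                      # number of distinct 9 digit keys (assumed)
best = math.log2(n)              # perfectly balanced binary tree
bounds = {
    "best binary tree": best,
    "AVL tree": 1.44 * best,
    "red-black tree": 2 * best,
}
for name, height in bounds.items():
    print(f"{name}: height <= {height:.1f}")
```

Every one of those levels costs a comparison of two 9 digit numbers, whereas the trie search makes a single key comparison after its digit branches.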
Compressed Social Security Trie
char# = character/digit used for branching.
Equivalent to bit# field of compressed binary trie.
#ptr = # of nonnull pointers in the node.
0 1 2 3 4 5 6 7 8 9
char# #ptr
Insert
Insert 012345678. 012345678
Insert 015234567. 3 2 5
012345678 015234567
3: The 3rd digit is used for branching
Null pointer fields not shown.
Insert
Insert 015231671. 3 2 5
012345678 015234567
Insert
Insert 079864231. 3 2 5
6 1 4
012345678
015231671 015234567
Insert
Insert 012345618. 2 1 7
3 2 5
079864231
6 1 4
012345678
015231671 015234567
Insert
Insert 011917352. 2 1 7
3 2 5
079864231
1 4
8 1 7 6
012345678
012345618 015231671 015234567
Insert
2 1 7
3 1 2 5
079864231
011917352
1 4
8 1 7 6
012345678
012345618 015231671 015234567
Delete
Delete 011917352. 2 1 7
3 1 2 5
079864231
011917352
1 4
8 1 7 6
012345678
012345618 015231671 015234567
Delete
Delete 012345678. 2 1 7
3 2 5
079864231
1 4
8 1 7 6
012345678
012345618 015231671 015234567
Delete
Delete 015231671. 2 1 7
3 2 5
079864231
1 4
6
012345618
015231671 015234567
Delete
2 1 7
3 2 5
079864231
012345618 015234567
Variable Length Keys
Problem arises only when one key is a (proper) prefix of
another.
Insert 0123 3 2 5
1 4 6
012345678
015231671 015234567
Variable Length Keys
Add a special end of key character (#) to each key to
eliminate this problem.
Insert 0123 3 2 5
1 4 6
012345678
015231671 015234567
Variable Length Keys
Insert 0123 3 2 5
4 # 1 4 6
5
012345678 0123 015231671 015234567
End of key character (#) not shown.
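The effect of the end of key character can be checked with a few lines of Python. This is a small illustration, assuming '#' never occurs inside a key; the helper name has_prefix_conflict is hypothetical.

```python
def has_prefix_conflict(keys):
    """True if some key is a proper prefix of another key."""
    return any(a != b and b.startswith(a) for a in keys for b in keys)


keys = ["0123", "012345678", "015231671", "015234567"]
terminated = [k + "#" for k in keys]    # append the end of key character

print(has_prefix_conflict(keys))        # 0123 is a proper prefix of 012345678
print(has_prefix_conflict(terminated))  # 0123# is not a prefix of 012345678#
```

Once no key is a proper prefix of another, every key ends at its own element node, and '#' simply acts as one more branching character.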
Tries With Edge Information
Add a new field (element) to each branch
node.
New field points to any one of the element
nodes in the subtree.
Use this pointer on way down to figure out
skipped-over characters.
Example
3 2 5
5 4 # 1 4
6
012345678 0123 015231671 015234567
element field shown in blue.
Trie Characteristics
Expected height of an order m trie is ~log_m n.
Limit height to h (say 6). Level h branch nodes point
to buckets that employ some other search structure
for all keys in subtrie.
Switch from trie scheme to simple array when
number of pairs in subtrie becomes <= s (say s=6).
Expected # of branch nodes for an order m trie when n is
large and m and s are small is n/(s ln m).
Sample digits from right to left (instead of from left
to right) or using a pseudorandom number
generator so as to reduce trie height.
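To get a feel for the two estimates on this slide, they can be evaluated for sample values. The numbers n = 10^6, m = 10 and s = 6 below are assumptions chosen only for illustration, and the function names are hypothetical.

```python
import math


def expected_height(n, m):
    """Approximate expected height of an order m trie: log_m n."""
    return math.log(n) / math.log(m)


def expected_branch_nodes(n, m, s):
    """Approximate expected number of branch nodes: n / (s ln m)."""
    return n / (s * math.log(m))


n, m, s = 10 ** 6, 10, 6
print(f"height ~ {expected_height(n, m):.1f}")
print(f"branch nodes ~ {expected_branch_nodes(n, m, s):.0f}")
```

Both estimates assume that n is large and that m and s are small, as stated above.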
Multibit Tries
Variant of binary trie in which the number of bits
(stride) used for branching may vary from node to
node.
Proposed for Internet router applications.
Variable length prefixes.
Longest prefix match.
Limit height by choosing node strides.
Root stride = 32 => height = 1.
Strides of 16, 8, and 8 for levels 1, 2, and 3 => only 3
levels.
Multibit Trie Example
S =1 0 null
S =2 0 1 S =1
000 001 010 011 10 11
0 1
00 01 10 11
Multibit Tries
Node whose stride is s uses s bits for branching.
Node has 2^s children and 2^s element/prefix fields.
Prefixes that end at a node are stored in that node.
Short prefixes are expanded to length represented by
node.
When root stride is 3, prefixes whose length is < 3 are
expanded to length 3.
P = 00* expands to P0 = 000* and P1 = 001*.
If Q = 000* already exists P0 is eliminated because Q
represents a longer match for any destination.
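Prefix expansion for a single node can be sketched as follows. This is an illustration under stated assumptions: prefixes are written as bit strings without the trailing '*', only prefixes no longer than the stride are handled, and the names expand and fill_node are hypothetical.

```python
from itertools import product


def expand(prefix, length):
    """All bit strings of the given length that start with prefix."""
    pad = length - len(prefix)
    return [prefix + "".join(bits) for bits in product("01", repeat=pad)]


def fill_node(prefixes, stride):
    """Map each stride-bit slot to the longest original prefix that covers it."""
    slots = {}
    # Shorter prefixes first, so that longer ones overwrite them.
    for p in sorted(prefixes, key=len):
        for slot in expand(p, stride):
            slots[slot] = p
    return slots


print(expand("00", 3))
slots = fill_node(["00", "000"], 3)
print(slots)
```

In this run the slot 000 keeps Q = 000* and only the slot 001 records P = 00*, which corresponds to eliminating P0 in favour of the longer match.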