Collision Resolution
What are collisions?
• A collision occurs when two different keys map to the same hash value.
• Both data items would then have to be stored in the same location, which is not possible.
• Such cases in hashing are called collisions.
What happens because of it?
• Because the same address cannot be used twice, a new location has to be found for the second key.
• This undermines the purpose of hashing, which is to reach a key's location directly.
• It slows down operations such as insertions and lookups.
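A minimal sketch of a collision, assuming the division-method hash (key mod table size) that the worked examples in these slides use. The function name `hash_index` and the key pair 72 and 92 are chosen here for illustration only.

```python
def hash_index(key: int, table_size: int = 10) -> int:
    """Division-method hash: the remainder of key divided by table_size."""
    return key % table_size


# Two different keys that land on the same slot of a size-10 table.
first_key, second_key = 72, 92
collides = hash_index(first_key) == hash_index(second_key)
print(hash_index(first_key), hash_index(second_key), collides)
```

Both keys map to index 2, so only one of them can occupy that slot directly; the other needs a collision resolution strategy.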
Collision Resolution (classification diagram)
• Open Addressing: Linear Probing, Quadratic Probing, Double Hashing
• Chaining: Linked List
Open Addressing
• Once a collision takes place, open addressing (also called closed hashing)
computes new positions using a probe sequence, and the record
is stored in the first free position found.
• When a key is mapped to a particular memory location:
• If the location is free, the data value can be stored in it.
• If the location already holds data, other slots are examined systematically in the forward direction to find a free slot.
• If no free slot is found, we have overflow.
Linear Probing
• The simplest approach to resolving a collision.
• The hash value is calculated as: $h(k, i) = (h'(k) + i) \bmod n$
• Where,
• $h'(k) = k \bmod n$ is the original hash value of key $k$
• $i$ is the probe number, which varies from 0 to n-1
• n is the size of the array.
• Working:
• Initially, the value is calculated as $h(k, 0) = (h'(k) + 0) \bmod n$
• This yields the original hash value. If the slot at that location is not empty,
• Then, $h(k, 1) = (h'(k) + 1) \bmod n$ is tried
• This proceeds until i becomes n-1
• If i reaches n-1 and no empty slot has been found,
• This implies overflow
Consider a hash table of size 10. Using linear probing, insert the
keys 72, 27, 36, 24, 63, 81, 92, and 101 into the table.
• 72: (72 mod 10 + 0) mod 10 = 2
• 27: (27 mod 10 + 0) mod 10 = 7
• 36: (36 mod 10 + 0) mod 10 = 6
• 24: (24 mod 10 + 0) mod 10 = 4
• 63: (63 mod 10 + 0) mod 10 = 3
• 81: (81 mod 10 + 0) mod 10 = 1
• 92:
• (92 mod 10 + 0) mod 10 = 2, occupied
• (92 mod 10 + 1) mod 10 = 3, occupied
• (92 mod 10 + 2) mod 10 = 4, occupied
• (92 mod 10 + 3) mod 10 = 5, free, so 92 is stored at index 5
• 101 will be stored at index 8
Resulting table:
Index: 0   1   2   3   4   5   6   7   8    9
Key:   -   81  72  63  24  92  36  27  101  -
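The insertion steps above can be sketched in code. This is an illustrative version, assuming integer keys, `None` as the marker for an empty slot, and an `OverflowError` to signal overflow; the function name `linear_probe_insert` is not from the slides.

```python
from typing import List, Optional


def linear_probe_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key using h(k, i) = (k mod n + i) mod n and return its slot."""
    n = len(table)
    home = key % n
    for i in range(n):  # probe number i runs from 0 to n-1
        slot = (home + i) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    # Every slot was examined and none was free.
    raise OverflowError("hash table is full")


table: List[Optional[int]] = [None] * 10
placed = {k: linear_probe_insert(table, k)
          for k in [72, 27, 36, 24, 63, 81, 92, 101]}
print(placed)
print(table)
```

Key 92 needs four probes (slots 2, 3, 4, 5) and key 101 needs eight (slots 1 through 8), which matches the worked example.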
Linear Probing
Pros and Cons
• Easy to implement
• Computationally cheap
• We may need additional markers for deletion and searching
• To avoid mistaking a deleted slot for an empty slot
Primary Clustering
• As seen, if the slot to be occupied by a key is taken, then the next vacant slot is allotted
• Subsequent collisions with keys that hash to the same location or nearby locations will also be placed using linear probing, potentially filling up consecutive slots.
• This results in densely populated slots in a particular region, called clusters.
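The need for deletion markers can be illustrated with a short sketch. It assumes unique integer keys, `None` for a never-used slot, and a sentinel object (often called a tombstone) for a deleted slot; the names `DELETED`, `insert`, `find_slot`, and `delete` are hypothetical.

```python
EMPTY = None
DELETED = object()  # marker (tombstone) left behind by a deletion


def insert(table, key):
    """Linear-probe insert; a deleted slot may be reused. Assumes unique keys."""
    n = len(table)
    for i in range(n):
        slot = (key % n + i) % n
        if table[slot] is EMPTY or table[slot] is DELETED:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


def find_slot(table, key):
    """Return the slot holding key, or None if the key is absent."""
    n = len(table)
    for i in range(n):
        slot = (key % n + i) % n
        if table[slot] is EMPTY:
            return None  # a never-used slot ends the search
        if table[slot] is not DELETED and table[slot] == key:
            return slot
        # A DELETED slot is skipped, so the search continues past it.
    return None


def delete(table, key):
    """Replace key with the DELETED marker; return True if it was present."""
    slot = find_slot(table, key)
    if slot is None:
        return False
    table[slot] = DELETED
    return True


table = [EMPTY] * 10
insert(table, 72)   # slot 2
insert(table, 92)   # collides at 2, stored in slot 3
delete(table, 72)   # slot 2 now holds the marker
print(find_slot(table, 92))
```

If slot 2 were simply reset to empty, the search for 92 would stop at slot 2 and wrongly report the key as missing.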
Quadratic Probing
• The hash value is calculated as: $h(k, i) = (h'(k) + c_1 i + c_2 i^2) \bmod n$
• Where,
• $h'(k) = k \bmod n$
• $i$ is the probe number, which varies from 0 to n-1
• $c_1$ and $c_2$ are constants, with $c_2 \neq 0$, and
• n is the size of the array.
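Using quadratic probing, insert the keys 72, 27, 36, 24, 63,
81, and 101 into the table. Consider c1=1 and c2=3
• 72: (72 mod 10 + 1*0 + 3*0) mod 10 = 2
• 27: (27 mod 10 + 1*0 + 3*0) mod 10 = 7
• 36: (36 mod 10 + 1*0 + 3*0) mod 10 = 6
• 24: (24 mod 10 + 1*0 + 3*0) mod 10 = 4
• 63: (63 mod 10 + 1*0 + 3*0) mod 10 = 3
• 81: (81 mod 10 + 1*0 + 3*0) mod 10 = 1
• 101:
• (101 mod 10 + 1*0 + 3*0) mod 10 = 1, occupied
• (101 mod 10 + 1*1 + 3*1) mod 10 = 5, free, so 101 is stored at index 5
• Note: 92 is left out of this example. With c1=1 and c2=3 it probes only slots 2, 6, and 4, all of which are already occupied.
Resulting table:
Index: 0   1   2   3   4   5    6   7   8   9
Key:   -   81  72  63  24  101  36  27  -   -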
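A sketch of quadratic probing with the constants from this example (c1=1, c2=3) and a table of size 10. The function name `quadratic_probe_insert` and the choice to return `None` when no free slot is reached are assumptions made for illustration.

```python
from typing import List, Optional


def quadratic_probe_insert(table: List[Optional[int]], key: int,
                           c1: int = 1, c2: int = 3) -> Optional[int]:
    """Insert key using h(k, i) = (k mod n + c1*i + c2*i*i) mod n.

    Returns the slot used, or None if no probe reaches a free slot.
    """
    n = len(table)
    home = key % n
    for i in range(n):
        slot = (home + c1 * i + c2 * i * i) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    return None


table: List[Optional[int]] = [None] * 10
placed = {k: quadratic_probe_insert(table, k)
          for k in [72, 27, 36, 24, 63, 81, 101]}
print(placed)

# With these constants the probe offsets modulo 10 are only 0, 4, and 2,
# so a key with home slot 2 can reach just slots 2, 6, and 4.
result_for_92 = quadratic_probe_insert(table, 92)
print(result_for_92)
```

Key 101 lands in slot 5 on its second probe. Key 92 is not placed even though slots 0, 8, and 9 are free, which shows that a quadratic probe sequence does not necessarily visit every slot.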
Quadratic Probing
• This resolves primary clustering
• Finding an empty slot may need more computation than with
linear probing
• This introduces secondary clustering.
• When two keys collide at the same initial slot,
• they follow the same probe sequence.
Double hashing
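• When a collision occurs, we use a second, independent hash function
to determine the next slot.
• The hash value is calculated as: $h(k, i) = (h_1(k) + i \cdot h_2(k)) \bmod n$
• Where $h_1(k) = k \bmod n$ is the primary hash function
• i is the probe number, initially set to 0.
• Where $h_2(k) = k \bmod l$ is the secondary hash function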
Using double hashing, insert the keys 72, 27, 36, 24, 63, 81,
92 into the table. Consider l=8
• 72: (72%10 + 0*(72%8))%10 = 2
• 27: (27%10 + 0*(27%8))%10 = 7
• 36: 6
• 24: 4
• 63: 3
• 81: 1
• 92:
• (92%10 + 0*(92%8))%10 = 2, occupied
• (92%10 + 1*(92%8))%10 = 6, occupied
• (92%10 + 2*(92%8))%10 = 0, free, so 92 is stored at index 0
Resulting table:
Index: 0   1   2   3   4   5   6   7   8   9
Key:   92  81  72  63  24  -   36  27  -   -
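A sketch of double hashing for this example, assuming a table of size 10, a primary hash of k mod 10, and a secondary hash of k mod 8. The function name `double_hash_insert`, the parameter name `second_modulus` (standing for l), and the `None` return value on failure are assumptions.

```python
from typing import List, Optional


def double_hash_insert(table: List[Optional[int]], key: int,
                       second_modulus: int = 8) -> Optional[int]:
    """Insert key using h(k, i) = (h1(k) + i * h2(k)) mod n.

    h1(k) = k mod n, h2(k) = k mod second_modulus.
    Returns the slot used, or None if no probe reaches a free slot.
    """
    n = len(table)
    h1 = key % n
    h2 = key % second_modulus
    for i in range(n):
        slot = (h1 + i * h2) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    return None


table: List[Optional[int]] = [None] * 10
placed = {k: double_hash_insert(table, k)
          for k in [72, 27, 36, 24, 63, 81, 92]}
print(placed)

# Key 32 has h1 = 2 and h2 = 0, so every probe stays on slot 2.
stuck = double_hash_insert(table, 32)
print(stuck)
```

Key 92 is stored at index 0 after probing slots 2 and 6. The last call shows a caveat of this particular secondary function: when k mod l is 0 the step size is 0, so the probe never leaves the home slot.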
Rehashing
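• When the table is nearly full,
• the chances of collision increase.
• In such cases, the size of the hash table is increased (for example, doubled or tripled).
• This technique moves the data to the new hash table, recomputing each key's position for the new size.
[Figure: a hash table holding the keys 81, 72, 63, 24, 82, 36, 27, and 46. Move them to a new hash table of size 20.]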
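A sketch of rehashing the keys from this example into a table of size 20. It assumes the old table has size 10 and that both tables use the division-method hash with linear probing; the slide does not state either, and the function names are hypothetical.

```python
from typing import List, Optional


def linear_probe_insert(table: List[Optional[int]], key: int) -> int:
    """Insert key with linear probing and return the slot used."""
    n = len(table)
    for i in range(n):
        slot = (key % n + i) % n
        if table[slot] is None:
            table[slot] = key
            return slot
    raise OverflowError("hash table is full")


def rehash(old_table: List[Optional[int]], new_size: int) -> List[Optional[int]]:
    """Build a larger table and re-insert every key from the old one."""
    new_table: List[Optional[int]] = [None] * new_size
    for key in old_table:
        if key is not None:
            # The slot is recomputed because key mod n changes with n.
            linear_probe_insert(new_table, key)
    return new_table


old_table: List[Optional[int]] = [None] * 10
for k in [81, 72, 63, 24, 82, 36, 27, 46]:
    linear_probe_insert(old_table, k)

new_table = rehash(old_table, 20)
print(old_table)
print(new_table)
```

In the size-10 table, 82 and 46 are displaced by collisions (to slots 5 and 8 under the stated assumptions). In the size-20 table every key lands on its home slot, for example 72 at index 12 and 82 at index 2.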
Chaining
• Each element of the array is a linked list
• If two keys result in the same index,
• then both of them are added to the linked list at that index.
• The new element is added at the beginning of the list.
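Chaining as described above can be sketched with a small singly linked list per slot. The class names `Node` and `ChainedHashTable`, the division-method hash, and the keys used in the demonstration are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    next: Optional["Node"] = None


class ChainedHashTable:
    def __init__(self, size: int = 10) -> None:
        # Each array element is the head of a linked list (None when empty).
        self.buckets: List[Optional[Node]] = [None] * size

    def insert(self, key: int) -> None:
        index = key % len(self.buckets)
        # The new node becomes the head; the old chain hangs off it.
        self.buckets[index] = Node(key, self.buckets[index])

    def contains(self, key: int) -> bool:
        node = self.buckets[key % len(self.buckets)]
        while node is not None:
            if node.key == key:
                return True
            node = node.next
        return False

    def chain(self, index: int) -> List[int]:
        """Return the keys stored at index, from head to tail."""
        keys = []
        node = self.buckets[index]
        while node is not None:
            keys.append(node.key)
            node = node.next
        return keys


chained = ChainedHashTable(10)
for k in [72, 92, 12]:
    chained.insert(k)
print(chained.chain(2))
```

All three keys hash to index 2, and because each insertion goes to the beginning of the list, the most recently inserted key is found first.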