Hashing

Hashing is a search technique that allows for constant time complexity in searching elements, unlike linear and binary searches which depend on the number of elements. It utilizes a hash table to store data indexed by a hash key generated from a hash function, and handles collisions through methods like separate chaining and open addressing. Various hash functions, such as division, mid-square, folding, and multiplication methods, are used to map keys to indices in the hash table.

Uploaded by

hp1509032014


Hashing

In search techniques such as linear search and binary search, the time required to
search for an element depends on the total number of elements in the array or list.
As the number of elements increases, the search time also increases: linearly for
linear search and logarithmically for binary search.

Hashing is another approach, in which the time required to search for an element
does not depend on the number of elements. With a hash-based data structure, an
element can be found in constant time on average. Hashing is an effective way to
reduce the number of comparisons needed to search for an element in a data structure.

Hashing is the process of indexing and retrieving an element (data) in a data
structure so that the element can be found quickly using its hash key.

Here, the hash key is a value that gives the index at which the actual data is
likely to be stored in the data structure.

In this data structure, we use a concept called a hash table to store data. All data
values are inserted into the hash table based on their hash key values.

The hash key value maps the data to an index in the hash table, and the hash
key is generated for each piece of data using a hash function. That means every entry in
the hash table is placed according to the key value generated by the hash function.

A hash table is defined as follows...

A hash table is an array that maps a key (data) into the data structure
with the help of a hash function, such that insertion, deletion and search
operations can be performed in constant time (i.e. O(1)) on average.

Hash tables are used to perform insertion, deletion and search very quickly.
Every hash table makes use of a function, which we'll call the hash function,
to map the data into the hash table.

A hash function is defined as follows...

A hash function is a function that takes a piece of data (i.e. a key) as input and
outputs an integer (i.e. a hash value) that maps the data to a particular index
in the hash table.

Basic concept of hashing and hash table is shown in the following figure...

What is Collision?
Since a hash function maps a key, which may be a large integer or a string, to a
small number, it is possible for two keys to produce the same value. The situation where a
newly inserted key maps to an already occupied slot in the hash table is called a
collision, and it must be handled using some collision handling technique.

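Example: keys 24, 19, 32, 44 with a hash table of size 6 (indices 0 to 5) and
h(k) = k mod 6.

h(24) = 24 % 6 = 0
h(19) = 19 % 6 = 1
h(32) = 32 % 6 = 2
h(44) = 44 % 6 = 2 (collision: slot 2 already holds 32)

Index  Key
0      24
1      19
2      32
3
4
5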
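
The collision in this example can be reproduced with a few lines of Python. This is only a minimal sketch: the function names are made up for illustration, and no collision handling is attempted, so a colliding key is simply reported.

```python
def division_hash(key, table_size):
    """Map an integer key to a slot index using key mod table_size."""
    return key % table_size


def insert_without_collision_handling(keys, table_size):
    """Place each key in its hashed slot and report keys whose slot is taken."""
    table = [None] * table_size
    collisions = []
    for key in keys:
        index = division_hash(key, table_size)
        if table[index] is None:
            table[index] = key
        else:
            # The slot already holds another key: this is a collision.
            collisions.append((key, index, table[index]))
    return table, collisions


table, collisions = insert_without_collision_handling([24, 19, 32, 44], 6)
print(table)       # [24, 19, 32, None, None, None]
print(collisions)  # [(44, 2, 32)]
```

Each reported tuple is (new key, slot index, key already in that slot).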

How to handle Collisions?

There are mainly two methods to handle collision:

1) Separate Chaining
2) Open Addressing

1. Separate Chaining:
The idea is to make each cell of the hash table point to a linked list of records that have
the same hash function value.

Let us consider a simple hash function, “key mod 7”, and the sequence of keys 50,
700, 76, 85, 92, 73, 101.

Index  Chain
0      700
1      50 -> 85 -> 92
2
3      73 -> 101
4
5
6      76
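
The chaining example above could be sketched in Python as follows. The class name ChainedHashTable is hypothetical, and Python lists stand in for the linked lists described in the text, which is a simplification rather than a faithful linked-list implementation.

```python
class ChainedHashTable:
    """Separate chaining: each slot holds a chain of keys with the same hash."""

    def __init__(self, size=7):
        self.size = size
        # One independent (initially empty) chain per slot.
        self.buckets = [[] for _ in range(size)]

    def _index(self, key):
        return key % self.size

    def insert(self, key):
        bucket = self.buckets[self._index(key)]
        if key not in bucket:
            bucket.append(key)  # new keys go to the end of the chain

    def search(self, key):
        return key in self.buckets[self._index(key)]

    def delete(self, key):
        bucket = self.buckets[self._index(key)]
        if key in bucket:
            bucket.remove(key)
            return True
        return False


table = ChainedHashTable(7)
for key in [50, 700, 76, 85, 92, 73, 101]:
    table.insert(key)
print(table.buckets)
# [[700], [50, 85, 92], [], [73, 101], [], [], [76]]
```

Searching only scans the one chain selected by the hash, which is why long chains degrade performance.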

Advantages:
1) Simple to implement.
2) The hash table never fills up; we can always add more elements to a chain.
3) Less sensitive to the hash function and load factor.
4) Mostly used when it is unknown how many keys will be inserted or deleted, and
how frequently.

Disadvantages:
1) Cache performance of chaining is poor because keys are stored in linked lists.
Open addressing provides better cache performance as everything is stored in the same
table.
2) Wastage of space (some slots of the hash table are never used).
3) If a chain becomes long, search time can become O(n) in the worst case.
4) Uses extra space for links.

2. Open Addressing

Like separate chaining, open addressing is a method for handling collisions. In
open addressing, all elements are stored in the hash table itself. So at any point, the
size of the table must be greater than or equal to the total number of keys (note that we
can increase the table size by copying the old data if needed).

Insert(k): Keep probing until an empty slot is found. Once an empty slot is found,
insert k.

Search(k): Keep probing until the slot's key equals k or an empty
slot is reached.

Delete(k): If we simply delete a key, a later search may fail. So slots of deleted keys
are marked specially as “deleted”.
Insert can place an item in a deleted slot, but search doesn't stop at a deleted slot.
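
The three operations can be sketched as below. This is an illustrative sketch, not a reference implementation: the class name, the EMPTY and DELETED marker objects, and the choice of the simplest probe order (try the next slot, wrapping around) with "key mod size" as the hash are all assumptions made for the example.

```python
EMPTY = object()    # slot has never held a key
DELETED = object()  # slot held a key that was removed (the "deleted" mark)


class OpenAddressingTable:
    def __init__(self, size=7):
        self.size = size
        self.slots = [EMPTY] * size

    def _probe(self, key):
        """Yield slot indices one after another, starting at key mod size."""
        start = key % self.size
        for i in range(self.size):
            yield (start + i) % self.size

    def search(self, key):
        """Return the slot index of key, or None if it is not in the table."""
        for index in self._probe(key):
            slot = self.slots[index]
            if slot is EMPTY:
                return None  # a never-used slot ends the search
            if slot is not DELETED and slot == key:
                return index
            # Occupied by another key, or marked deleted: keep probing.
        return None

    def insert(self, key):
        """Store key in the first empty or deleted slot; False if table is full."""
        if self.search(key) is not None:
            return True  # already present
        for index in self._probe(key):
            if self.slots[index] is EMPTY or self.slots[index] is DELETED:
                self.slots[index] = key
                return True
        return False

    def delete(self, key):
        index = self.search(key)
        if index is None:
            return False
        self.slots[index] = DELETED  # mark, do not empty, so later searches work
        return True


table = OpenAddressingTable(7)
for key in [50, 700, 76, 85, 92]:
    table.insert(key)
table.delete(85)         # slot 2 is now marked deleted
print(table.search(92))  # 3
```

If slot 2 were simply emptied instead of marked, the search for 92 would stop at slot 2 and wrongly report that 92 is missing.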

Open addressing is done in the following ways:

a) Linear Probing: In linear probing, we linearly probe for the next slot. The
typical gap between two probes is 1, as in the example below.
Let hash(x) be the slot index computed using the hash function and S be the table size.

If slot hash(x) % S is full, then we try (hash(x) + 1) % S

If (hash(x) + 1) % S is also full, then we try (hash(x) + 2) % S

If (hash(x) + 2) % S is also full, then we try (hash(x) + 3) % S

..................................................

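Let us consider a simple hash function, “key mod 7”, and the sequence of keys 50,
700, 76, 85, 92, 73, 101.

H(k) = k mod 7
H'(k, i) = (H(k) + i) mod 7

For example, 85 % 7 = 1, but slot 1 already holds 50, so 85 is placed in the next
free slot, (1 + 1) mod 7 = 2.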

Clustering: The main problem with linear probing is clustering. Many consecutive
elements form groups, and it then takes longer to find a free slot or to search for an
element.

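Example: keys 43, 135, 72, 23, 99, 19, 82 with a table of size 10.

h(k) = k mod 10
h'(k, i) = (h(k) + i) mod 10

43 % 10 = 3 -> slot 3
135 % 10 = 5 -> slot 5
72 % 10 = 2 -> slot 2
23 % 10 = 3 -> slot 3 is full, (3 + 1) mod 10 = 4 -> slot 4
99 % 10 = 9 -> slot 9
19 % 10 = 9 -> slot 9 is full, (9 + 1) mod 10 = 0 -> slot 0
82 % 10 = 2 -> slots 2, 3, 4 and 5 are full, (2 + 4) mod 10 = 6 -> slot 6

Index  Key
0      19
1
2      72
3      43
4      23
5      135
6      82
7
8
9      99

In the worst case, a search or insertion takes O(n) time.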
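
The same trace can be checked with a short script. It is a sketch under the stated formula h'(k, i) = (k mod 10 + i) mod 10; the function name and the returned probe count are additions for illustration.

```python
def linear_probe_insert(table, key):
    """Insert key with linear probing; return (slot index, probes used)."""
    size = len(table)
    home = key % size
    for i in range(size):
        index = (home + i) % size
        if table[index] is None:
            table[index] = key
            return index, i + 1
    raise RuntimeError("hash table is full")


table = [None] * 10
placements = {}
for key in [43, 135, 72, 23, 99, 19, 82]:
    placements[key] = linear_probe_insert(table, key)

print(table)
# [19, None, 72, 43, 23, 135, 82, None, None, 99]
print(placements[82])  # (6, 5)
```

Key 82 needs five probes because slots 2 to 5 have merged into one cluster, which is the clustering effect described above.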

b) Quadratic Probing: We look for the i^2'th slot in the i'th iteration.

Let hash(x) be the slot index computed using the hash function.

If slot hash(x) % S is full, then we try (hash(x) + 1*1) % S

If (hash(x) + 1*1) % S is also full, then we try (hash(x) + 2*2) % S

If (hash(x) + 2*2) % S is also full, then we try (hash(x) + 3*3) % S

Example:

H(k) = k mod 10

H'(k, i) = (H(k) + i^2) mod 10

Keys: 42, 16, 91, 33, 18, 27, 36, 62
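
The notes do not show the resulting table for these keys, so the following sketch works it out. The function name is hypothetical, and giving up after S probes (returning None) is an assumption made so the example terminates.

```python
def quadratic_probe_insert(table, key):
    """Insert key using (h(k) + i*i) mod S; return the slot index or None."""
    size = len(table)
    home = key % size
    for i in range(size):
        index = (home + i * i) % size
        if table[index] is None:
            table[index] = key
            return index
    return None  # no free slot was reached within S probes


table = [None] * 10
result = {key: quadratic_probe_insert(table, key)
          for key in [42, 16, 91, 33, 18, 27, 36, 62]}
print(result[36])  # 0
print(result[62])  # None
print(table)       # [36, 91, 42, 33, None, None, 16, 27, 18, None]
```

Key 36 lands in slot 0 on its third probe. Key 62 cannot be placed even though slots 4, 5 and 9 are free, because with S = 10 the offsets i*i mod 10 only take the values 0, 1, 4, 5, 6 and 9, so only some slots are ever probed. This is one reason a prime table size is usually preferred.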

c) Double Hashing: We use another hash function hash2(x) and look for the
i*hash2(x)'th slot in the i'th iteration.

Let hash(x) be the slot index computed using the hash function.

If slot hash(x) % S is full, then we try (hash(x) + 1*hash2(x)) % S

If (hash(x) + 1*hash2(x)) % S is also full, then we try (hash(x) + 2*hash2(x)) % S

If (hash(x) + 2*hash2(x)) % S is also full, then we try (hash(x) + 3*hash2(x)) % S
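
A possible sketch of double hashing follows. The notes do not define hash2, so the choice hash2(x) = 5 - (x mod 5) is an assumption (a commonly used form with a prime smaller than the table size), as are the function name, the table size of 7 and the reuse of the earlier key sequence.

```python
def double_hash_insert(table, key, prime=5):
    """Insert key using (hash(x) + i * hash2(x)) mod S; return index or None."""
    size = len(table)
    home = key % size                # hash(x)
    step = prime - (key % prime)     # hash2(x): always in 1..prime, never 0
    for i in range(size):
        index = (home + i * step) % size
        if table[index] is None:
            table[index] = key
            return index
    return None


table = [None] * 7
for key in [50, 700, 76, 85, 92, 73, 101]:
    double_hash_insert(table, key)
print(table)  # [700, 50, 101, 92, 85, 73, 76]
```

Keys 85 and 92 both start at slot 1 but use different step sizes (5 and 3), so they follow different probe sequences. The step must never be 0, otherwise the probe would stay on the same slot.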

Comparison of above three:

- Linear probing has the best cache performance, but suffers from clustering. One
more advantage of linear probing is that it is easy to compute.

- Quadratic probing lies between the two in terms of cache performance and
clustering.

- Double hashing has poor cache performance but no clustering. Double
hashing requires more computation time as two hash functions need to be
computed.

Open Addressing vs. Separate Chaining

Advantages of Chaining:
1) Chaining is Simpler to implement.
2) In chaining, Hash table never fills up, we can always add more elements to
chain. In open addressing, table may become full.
3) Chaining is Less sensitive to the hash function or load factors.
4) Chaining is mostly used when it is unknown how many and how frequently keys
may be inserted or deleted.
5) Open addressing requires extra care to avoid clustering and to keep the load factor low.

Advantages of Open Addressing

1) Cache performance of chaining is not good as keys are stored using linked list.
Open addressing provides better cache performance as everything is stored in same
table.
2) Wastage of Space (Some Parts of hash table in chaining are never used). In
Open addressing, a slot can be used even if an input doesn’t map to it.
3) Chaining uses extra space for links.

Rehashing:

Rehashing is a technique in which the table is resized, i.e. the size of the table is
doubled by creating a new table, and the existing keys are reinserted using the new
table size.

It is preferable for the new table size to be a prime number.

Rehashing is performed:

- When the table is completely full.
- When an insertion fails due to overflow.

Example:

Keys: 37, 90, 55, 22, 17, 49, 87

Table size = 10

H(key) = key mod table size
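
The notes stop before showing the rehashed table, so here is one way the example could be worked through. Everything beyond the keys, the table size and the hash function is an assumption: linear probing is used to resolve collisions, the new size is the smallest prime that is at least double the old size, and the helper names are invented for the sketch.

```python
def is_prime(n):
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Return the smallest prime that is >= n."""
    while not is_prime(n):
        n += 1
    return n


def insert(table, key):
    """Linear-probing insert; True on success, False if the table is full."""
    size = len(table)
    for i in range(size):
        index = (key % size + i) % size
        if table[index] is None:
            table[index] = key
            return True
    return False


def rehash(old_table):
    """Build a table of prime size >= 2 * old size and reinsert every key."""
    new_table = [None] * next_prime(2 * len(old_table))
    for key in old_table:
        if key is not None:
            insert(new_table, key)  # the index is recomputed with the new size
    return new_table


table = [None] * 10
for key in [37, 90, 55, 22, 17, 49, 87]:
    insert(table, key)
print(table)  # [90, 87, 22, None, None, 55, None, 37, 17, 49]

bigger = rehash(table)
print(len(bigger))  # 23
```

Keys cannot simply be copied to the same positions, because H(key) depends on the table size. In the new table of size 23, for instance, 87 moves from slot 1 to slot 87 mod 23 = 18.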

Types of Hash functions

1. Division Method.

2. Mid Square Method.

3. Folding Method.

4. Multiplication Method.

1. Division Method:

This is the simplest method to generate a hash value. The hash
function divides the key k by M and uses the remainder obtained.

Formula:

h(k) = k mod M

Here,
k is the key value, and
M is the size of the hash table.

It is best if M is a prime number, as that helps distribute the keys more
uniformly. The hash function depends only on the remainder of a
division.

Example:

k = 12345
M = 95
h(12345) = 12345 mod 95
= 90

k = 1276
M = 11
h(1276) = 1276 mod 11
= 0

So 1276 is stored at index 0 of a table with indices 0 to 10.

Keys 54, 72, 89, 37 with table size M = 10:

h(54) = 54 % 10 = 4
h(72) = 72 % 10 = 2
h(89) = 89 % 10 = 9
h(37) = 37 % 10 = 7

Index  Key
0
1
2      72
3
4      54
5
6
7      37
8
9      89

Advantages:

1. This method is quite good for any value of M.

2. The division method is very fast since it requires only a single division
operation.

Disadvantages:

1. This method can lead to poor performance since consecutive keys map to
consecutive hash values in the hash table.

2. Sometimes extra care should be taken to choose the value of M.

2. Mid Square Method:

The mid-square method is a very good hashing method. It involves two steps to
compute the hash value:

1. Square the value of the key k, i.e. compute k^2.

2. Extract the middle r digits as the hash value.

Formula:

h(k) = middle r digits of (k x k)

Here,
k is the key value.

The value of r can be decided based on the size of the table.

Example:

k = 60
k x k = 60 x 60
= 3600
h(60) = 60 (the middle two digits of 3600)

The hash value obtained is 60
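
A small sketch of the mid-square computation is given below. The function name is hypothetical, and the rule for picking the "middle" when the digits cannot be centred exactly (lean towards the left) is an assumption, since the notes do not specify it.

```python
def mid_square_hash(key, r):
    """Square the key and return the middle r digits of the square as an int."""
    digits = str(key * key)
    if r >= len(digits):
        return int(digits)  # the square has at most r digits: use all of it
    start = (len(digits) - r) // 2  # leans left when centring is not exact
    return int(digits[start:start + r])


print(mid_square_hash(60, 2))  # 60
```

Here 60 x 60 = 3600, and the two middle digits "60" give the hash value 60, matching the example.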

3. Digit Folding Method:

This method involves two steps:

1. Divide the key value k into a number of parts, i.e. k1, k2, k3, …, kn, where
each part has the same number of digits except for the last part, which can have
fewer digits than the other parts.

2. Add the individual parts. The hash value is obtained by ignoring the last
carry, if any.

Formula:

k = k1, k2, k3, k4, …, kn

s = k1 + k2 + k3 + k4 + … + kn
h(k) = s

Here,
s is obtained by adding the parts of the key k.

Example:

k = 12345
k1 = 12, k2 = 34, k3 = 5
s = k1 + k2 + k3
= 12 + 34 + 5
= 51
h(K) = 51
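
The folding steps can be sketched as follows. The function name and the part_size parameter are illustrative, and "ignoring the last carry" is interpreted here as keeping only the lowest part_size digits of the sum, which is an assumption about what the notes intend.

```python
def digit_folding_hash(key, part_size, drop_carry=True):
    """Split the key's digits into parts, add the parts, optionally drop the carry."""
    digits = str(key)
    parts = [int(digits[i:i + part_size])
             for i in range(0, len(digits), part_size)]
    total = sum(parts)
    if drop_carry:
        # Keep only the lowest part_size digits of the sum.
        total %= 10 ** part_size
    return total


print(digit_folding_hash(12345, 2))  # 51
print(digit_folding_hash(9999, 2))   # 98
```

For 12345 the parts are 12, 34 and 5 and no carry occurs. For 9999 the sum is 99 + 99 = 198, and dropping the carry leaves 98.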

4. Multiplication Method

This method involves the following steps:

1. Choose a constant value A such that 0 < A < 1.

2. Multiply the key value with A.

3. Extract the fractional part of kA.

4. Multiply the result of the above step by the size of the hash table i.e. M.

5. The resulting hash value is obtained by taking the floor of the result obtained
in step 4.

Formula:

h(k) = floor(M (kA mod 1))

Here,
M is the size of the hash table.
k is the key value.
A is a constant value.

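Example:

k = 12345
Donald Knuth suggested using A = 0.6180339887
M = 100

kA = 12345 x 0.6180339887 ≈ 7629.6296
kA mod 1 ≈ 0.6296
h(12345) = floor(100 x 0.6296) = floor(62.96) = 62

Example:

Let key = 107, assume M = 50

A = 0.6180339887

kA = 107 x 0.6180339887 ≈ 66.1296
kA mod 1 ≈ 0.1296
h(107) = floor(50 x 0.1296) = floor(6.48) = 6

107 will be placed at index 6 in the hash table.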
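
The five steps translate almost directly into code. In this sketch the function and constant names are hypothetical, and A is computed as (sqrt(5) - 1) / 2, which is the value that the constant above approximates.

```python
import math

KNUTH_A = (math.sqrt(5) - 1) / 2  # about 0.6180339887


def multiplication_hash(key, table_size, a=KNUTH_A):
    """Return floor(M * (k*A mod 1)) for an integer key k."""
    fractional = (key * a) % 1.0          # fractional part of k*A
    return math.floor(table_size * fractional)


print(multiplication_hash(107, 50))     # 6
print(multiplication_hash(12345, 100))  # 62
```

Unlike the division method, the table size M does not need to be prime here, because the spreading of keys comes from the multiplication by A.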

Extraction

In this method, some digits are extracted from the key to form the address
in the hash table.

For example:

Suppose the first, third and fourth digits from the left are selected for the hash key.

Key: 497824

Extracted digits: 478, so the key can be stored at location 478 in a hash table of size 1000.
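
A minimal sketch of digit extraction, assuming positions are counted from the left starting at 1 and that the key has enough digits; the function name and the positions parameter are illustrative.

```python
def extraction_hash(key, positions=(1, 3, 4)):
    """Build the hash from the digits at the given 1-based positions (from the left)."""
    digits = str(key)
    return int("".join(digits[p - 1] for p in positions))


print(extraction_hash(497824))  # 478
```

With three extracted digits the result always lies between 0 and 999, which fits a table of size 1000.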

A further mid-square example: for k = 3111, k x k = 9678321, and the middle three
digits give H(3111) = 783.
