Thanks to visit codestin.com
Credit goes to www.scribd.com

0% found this document useful (0 votes)
13 views4 pages

Hashing in DBMS

Hashing in DBMS is a technique for directly locating data on disk without using index structures, allowing for faster searches using hashed keys. There are two main types of hashing: static, where bucket addresses remain constant, and dynamic, which allows for dynamic addition and removal of data buckets. Key concepts include data buckets, hash functions, and collision resolution methods like rehashing and chaining.

Uploaded by

Ivy Waithaka
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
13 views4 pages

Hashing in DBMS

Hashing in DBMS is a technique for directly locating data on disk without using index structures, allowing for faster searches using hashed keys. There are two main types of hashing: static, where bucket addresses remain constant, and dynamic, which allows for dynamic addition and removal of data buckets. Key concepts include data buckets, hash functions, and collision resolution methods like rehashing and chaining.

Uploaded by

Ivy Waithaka
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as DOCX, PDF, TXT or read online on Scribd
You are on page 1/ 4

Hashing in DBMS

In DBMS, hashing is a technique to directly search the location of desired data on the disk
without using index structure. Hashing method is used to index and retrieve items in a database
as it is faster to search that specific item using the shorter hashed key instead of using its original
value. Data is stored in the form of data blocks whose address is generated by applying a hash
function in the memory location where these records are stored known as a data block or data
bucket.

Why Hashing?
Here, are the situations in the DBMS where you need to apply the Hashing method:

● For a huge database structure, it’s tough to search all the index values through all its level
and then you need to reach the destination data block to get the desired data.
● Hashing method is used to index and retrieve items in a database as it is faster to search
that specific item using the shorter hashed key instead of using its original value.
● Hashing is an ideal method to calculate the direct location of a data record on the disk
without using index structure.
● It is also a helpful technique for implementing dictionaries.

Important Terminologies in Hashing


Here, are important terminologies which are used in Hashing:

● Data bucket – Data buckets are memory locations where the records are stored. It is also
known as Unit Of Storage.
● Key: A DBMS key is an attribute or set of an attribute which helps you to identify a
row(tuple) in a relation(table). This allows you to find the relationship between two
tables.
● Hash function: A hash function, is a mapping function which maps all the set of search
keys to the address where actual records are placed.
● Linear Probing – Linear probing is a fixed interval between probes. In this method, the
next available data block is used to enter the new record, instead of overwriting on the
older record.
● Quadratic probing– It helps you to determine the new bucket address. It helps you to
add Interval between probes by adding the consecutive output of quadratic polynomial to
starting value given by the original computation.
● Hash index – It is an address of the data block. A hash function could be a simple
mathematical function to even a complex mathematical function.
● Double Hashing –Double hashing is a computer programming method used in hash
tables to resolve the issues of has a collision.
● Bucket Overflow: The condition of bucket-overflow is called collision. This is a fatal
stage for any static has to function.
Types of Hashing Techniques
There are mainly two types of SQL hashing methods/techniques:

1. Static Hashing
2. Dynamic Hashing

Static Hashing
In the static hashing, the resultant data bucket address will always remain the same.

Therefore, if you generate an address for say Student_ID = 10 using hashing function mod(3),
the resultant bucket address will always be 1. So, you will not see any change in the bucket
address.

Therefore, in this static hashing method, the number of data buckets in memory always remains
constant.
Static Hash Functions

● Inserting a record: When a new record requires to be inserted into the table, you can generate
an address for the new record using its hash key. When the address is generated, the record is
automatically stored in that location.
● Searching: When you need to retrieve the record, the same hash function should be helpful to
retrieve the address of the bucket where data should be stored.
● Delete a record: Using the hash function, you can first fetch the record which is you wants to
delete. Then you can remove the records for that address in memory.

Static hashing is further divided into

1. Open hashing
2. Close hashing.

Open Hashing
In Open hashing method, Instead of overwriting older one the next available data block is used to
enter the new record, This method is also known as linear probing.
For example, A2 is a new record which you wants to insert. The hash function generates address
as 222. But it is already occupied by some other value. That’s why the system looks for the next
data bucket 501 and assigns A2 to it.
How Open Hash Works

Close Hashing
In the close hashing method, when buckets are full, a new bucket is allocated for the same hash
and result are linked after the previous one.

Dynamic Hashing
Dynamic hashing offers a mechanism in which data buckets are added and removed dynamically
and on demand. In this hashing, the hash function helps you to create a large number of values.

Difference between Ordered Indexing and Hashing


Below are the key differences between Indexing and Hashing
Parameters Order Indexing Hashing

Addresses in the memory are sorted


Storing of Addresses are always generated using
according to a key value called the primary
address a hash function on the key value.
key

Performance of hashing will be best


It can decrease when the data increases in
when there is a constant addition and
the hash file. As it stores the data in a sorted
deletion of data. However, when the
Performance form when there is any
database is huge, then hash file
(insert/delete/update) operation performed
organization and its maintenance will
which decreases its performance.
be costlier.

This is an ideal method when you


Preferred for range retrieval of data- which
want to retrieve a particular record
means whenever there is retrieval data for a
Use for based on the search key. However, it
particular range, this method is an ideal
will only perform well when the hash
option.
function is on the search key.
There will be many unused data blocks
In static and dynamic hashing
Memory because of the delete/update operation.
methods, memory is always managed.
managemen These data blocks can’t be released for re-
Bucket overflow is also handled
t use. That’s why regular maintenance of the
perfectly to extend static hashing.
memory is required.

Collision.
Hash collision is a state when the resultant hashes from two or more data in the data set, wrongly
map the same place in the hash table.

Hashing Collision.
There are two technique which you can use to avoid a hash collision:

1. Rehashing: This method, invokes a secondary hash function, which is applied continuously until
an empty slot is found, where a record should be placed.
2. Chaining: Chaining method builds a Linked list of items whose key hashes to the same value.
This method requires an extra link field to each table position.

Summary:
● In DBMS, hashing is a technique to directly search the location of desired data on the disk
without using index structure.
● Hashing method is used to index and retrieve items in a database as it is faster to search that
specific item using the shorter hashed key instead of using its original value.
● Data bucket, Key , Hash function, Linear Probing, Quadratic probing , Hash index, Double
Hashing, Bucket Overflow are important terminologies used in hashing
● Two types of hashing methods are 1) static hashing 2) dynamic hashing
● In the static hashing, the resultant data bucket address will always remain the same.
● Dynamic hashing offers a mechanism in which data buckets are added and removed dynamically
and on demand.
● In order Indexing addresses in the memory are sorted according to a critical value while in
hashing addresses are always generated using a hash function on the key value.
● Hash collision is a state when the resultant hashes from two or more data in the data set,
wrongly map the same place in the hash table.
● Rehashing and chaining are two methods which help you to avoid hashing collision.

You might also like