HASH TABLES
What are they
A data structure that stores key-value pairs and provides efficient
insertion, deletion and lookup operations
Its main purpose is to find an item almost immediately, whether or not the
data is sorted, without the need to compare it with other items in the data
set. Hash tables are often used to implement a dictionary data structure
A hashing function is used to calculate the position of an item in a hash
table. It is applied to an item to determine a hash value (its position in the
table)
Eg a common example is adding up the ASCII values of the characters in the
item and calculating the modulus of that total (with the size of the hash
table) to determine its position
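A minimal sketch of the ASCII-sum example above. The function name
ascii_sum_hash and the table size of 11 are made up for illustration and do
not come from these notes.

```python
def ascii_sum_hash(key: str, table_size: int) -> int:
    """Add up the character codes of the key, then take the modulus."""
    total = sum(ord(ch) for ch in key)   # 'c' + 'a' + 't' = 99 + 97 + 116 = 312
    return total % table_size            # 312 % 11 = 4

print(ascii_sum_hash("cat", 11))  # 4
print(ascii_sum_hash("act", 11))  # 4 (same letters, same total, same position)
```

Because the sum ignores letter order, anagrams always land on the same
position, which is one reason this simple function produces collisions.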
-a hash table needs to be at least large enough to store all the data items
but is normally significantly larger to minimise the chance of two items
having the same hash value, known as a collision
A GOOD HASHING FUNCTION SHOULD: be calculated quickly, result in as
few collisions as possible, use as little memory as possible
TO RESOLVE COLLISIONS
--place the item in the next available empty position, known as open
addressing
--to find the item later, the hashing function delivers the starting position
from which a linear search can be applied until the item is found, known as
linear probing
The above process results in clustering: several occupied positions build up
around a common collision value
A major disadvantage of this is that it prevents other items from being
placed in their correct positions because those positions are already taken
--the process of finding an alternative position is known as rehashing
--another alternative is to use a two dimensional hash table (each position
holds a list of items) so more than one item can be placed in the same
position, known as chaining
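The two approaches above can be compared with a rough sketch. The function
names, the table size of 11 and the wrap-around at the end of the table are
assumptions for illustration. The ASCII-sum hash is reused so that "cat",
"act" and "tac" all collide.

```python
def ascii_sum_hash(key, table_size):
    return sum(ord(ch) for ch in key) % table_size

def insert_linear_probing(table, key):
    """Open addressing: start at the hash position, step on until a slot is free."""
    size = len(table)
    start = ascii_sum_hash(key, size)
    for step in range(size):
        pos = (start + step) % size      # assumption: wrap around at the end
        if table[pos] is None:
            table[pos] = key
            return pos
    raise OverflowError("hash table is full")

def insert_chaining(table, key):
    """Chaining: each position holds a list, so colliding keys share a position."""
    pos = ascii_sum_hash(key, len(table))
    table[pos].append(key)
    return pos

keys = ("cat", "act", "tac")             # all three hash to position 4

probe_table = [None] * 11
probe_positions = [insert_linear_probing(probe_table, k) for k in keys]

chain_table = [[] for _ in range(11)]
chain_positions = [insert_chaining(chain_table, k) for k in keys]

print(probe_positions)  # [4, 5, 6]
print(chain_positions)  # [4, 4, 4]
```

With linear probing the colliding items spill into positions 5 and 6, which
is the clustering described above: an item whose own hash value is 5 or 6
would now find its position taken. With chaining they all stay at position 4.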
TYPICAL USES (when an item in a large data set needs to be found quickly)
eg file systems linking file names
ADDING
1)apply the hashing function to calculate the position where the value
should be placed
2)if this location is empty, insert the item and stop
3)BUT if the position is not empty, check the first position in the overflow
table; if it is empty, insert the item and stop
4)if not, keep incrementing through the overflow table until a free space is
found or the overflow table is full
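The adding steps above can be sketched as follows. This assumes the main
table and the overflow table are two fixed-size lists with None meaning
empty. The names add_item and ascii_sum_hash and the table sizes are
illustrative only.

```python
EMPTY = None

def ascii_sum_hash(key, table_size):
    return sum(ord(ch) for ch in key) % table_size

def add_item(main_table, overflow_table, key):
    """Return where the key was stored, or None if there is no free space."""
    pos = ascii_sum_hash(key, len(main_table))   # step 1: calculate the position
    if main_table[pos] is EMPTY:                 # step 2: position is free
        main_table[pos] = key
        return ("main", pos)
    for i in range(len(overflow_table)):         # steps 3 and 4: scan the overflow
        if overflow_table[i] is EMPTY:
            overflow_table[i] = key
            return ("overflow", i)
    return None                                  # overflow table is full

main_table = [EMPTY] * 11
overflow_table = [EMPTY] * 3
print(add_item(main_table, overflow_table, "cat"))  # ('main', 4)
print(add_item(main_table, overflow_table, "act"))  # ('overflow', 0)
```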
DELETING
1)generate the hash value using the function
2)if the calculated position stores the item, delete it and stop
3)if the calculated position does not contain the item to be deleted, check
the first position in the overflow table; if found, delete it and stop
4)if not, keep incrementing until the item is discovered or the end is reached
Keep in mind deleting here means the address is marked as available for
later use, so the old contents will be overwritten later
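A sketch of the deleting steps above, under the same assumptions: two
fixed-size lists, with None used as the "available" mark. The name
delete_item is hypothetical.

```python
EMPTY = None  # an address holding EMPTY is marked as available for later use

def ascii_sum_hash(key, table_size):
    return sum(ord(ch) for ch in key) % table_size

def delete_item(main_table, overflow_table, key):
    """Mark the key's address as available; return True if the key was found."""
    pos = ascii_sum_hash(key, len(main_table))   # step 1: generate the hash value
    if main_table[pos] == key:                   # step 2: found in the main table
        main_table[pos] = EMPTY
        return True
    for i in range(len(overflow_table)):         # steps 3 and 4: scan the overflow
        if overflow_table[i] == key:
            overflow_table[i] = EMPTY
            return True
    return False                                 # end reached without finding it

main_table = [EMPTY] * 11
main_table[4] = "cat"
overflow_table = ["act", EMPTY, EMPTY]

print(delete_item(main_table, overflow_table, "act"))  # True
print(delete_item(main_table, overflow_table, "dog"))  # False
```

The overflow scan deliberately checks every position rather than stopping at
the first empty one, since an earlier deletion may have left a gap before the
item being searched for.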
RETRIEVING
1)generate a hash value and check that address in the hash table
2)if that item is the one to be found, return it
3)if not, search the overflow table until it is found or the end is reached;
if it is still not found, output not found
Retrieval is simple as we can immediately land on the position where the
item should be HOWEVER if collisions occur a linear search will need to be
performed
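The retrieving steps above can be sketched like this. It assumes each
occupied address stores a (key, value) pair and that None means empty. The
name retrieve_item and the sample data are made up for illustration.

```python
def ascii_sum_hash(key, table_size):
    return sum(ord(ch) for ch in key) % table_size

def retrieve_item(main_table, overflow_table, key):
    """Return the value stored for key, or None if it is not found."""
    pos = ascii_sum_hash(key, len(main_table))   # step 1: go straight to the address
    entry = main_table[pos]
    if entry is not None and entry[0] == key:    # step 2: it is the item we want
        return entry[1]
    for entry in overflow_table:                 # step 3: linear search of overflow
        if entry is not None and entry[0] == key:
            return entry[1]
    return None

main_table = [None] * 11
main_table[4] = ("cat", "animal")                # "cat" hashes to position 4
overflow_table = [("act", "verb"), None, None]   # "act" collided with "cat"

for key in ("cat", "act", "tac"):
    value = retrieve_item(main_table, overflow_table, key)
    print(key, "->", value if value is not None else "not found")
```

Looking up "cat" takes a single check, while "act" and "tac" both fall
through to the linear search of the overflow table.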