Data Structures

Linked lists

• A linked list is a data structure in which the objects are arranged in a


linear order.
• Linked list vs. array
• Array: the linear order is determined by the array indices
• Linked list: the order is determined by a pointer in each object
• Each element 𝑥 of a doubly linked list 𝐿 is an object with an attribute key and two
pointer attributes: 𝑛𝑒𝑥𝑡 and 𝑝𝑟𝑒𝑣
• The object may also contain satellite data.
Linked list
• Given an element 𝑥 in the list, 𝑥. 𝑛𝑒𝑥𝑡 points to its successor in the
linked list, and 𝑥. 𝑝𝑟𝑒𝑣 points to its predecessor.
• If 𝑥. 𝑝𝑟𝑒𝑣 = NIL, then the element 𝑥 has no predecessor and is
therefore the first element of the list
• If 𝑥. 𝑛𝑒𝑥𝑡 = NIL, then the element 𝑥 has no successor and is therefore
the last element of the list.
• If a list is singly linked then we drop the 𝑝𝑟𝑒𝑣 pointer.
• If a list is sorted, the linear order of the list corresponds to the linear
order of the keys stored in the elements of the list.
List-search
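The LIST-SEARCH pseudocode did not survive in this copy; the following is a minimal Python sketch of the idea, using the key/next/prev attribute names from above (the Node class itself is illustrative, not from the slides):

```python
class Node:
    """A doubly-linked-list element with a key and next/prev pointers."""
    def __init__(self, key):
        self.key = key
        self.next = None
        self.prev = None

def list_search(head, k):
    """Walk the list from the head, following next pointers, until a
    node with key k is found or the list ends (return None)."""
    x = head
    while x is not None and x.key != k:
        x = x.next
    return x
```

A search may scan the entire list, so it takes Θ(n) time in the worst case.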
List-insert

• Given an element 𝑥 whose key attribute has already been set, the LIST-INSERT procedure
“splices” 𝑥 onto the front of the linked list.
• Our attribute notation can cascade, so that 𝐿. ℎ𝑒𝑎𝑑. 𝑝𝑟𝑒𝑣 denotes the 𝑝𝑟𝑒𝑣 attribute of the
object that 𝐿. ℎ𝑒𝑎𝑑 points to.
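A minimal Python sketch of LIST-INSERT under the same attribute names (the Node and LinkedList classes are illustrative):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None
        self.prev = None

class LinkedList:
    def __init__(self):
        self.head = None

def list_insert(L, x):
    """Splice x onto the front of list L in O(1) time."""
    x.next = L.head
    if L.head is not None:
        L.head.prev = x   # cascaded attribute access: L.head.prev
    L.head = x
    x.prev = None
```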
List-delete
• The procedure LIST-DELETE removes an element 𝑥 from a linked list 𝐿.
• It must be given a pointer to 𝑥.
• It then splices 𝑥 out of the list by updating the pointers.
• If we wish to delete an element with a
given key, we must first call LIST-SEARCH.
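A minimal Python sketch of LIST-DELETE (again with illustrative Node and LinkedList classes):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None
        self.prev = None

class LinkedList:
    def __init__(self):
        self.head = None

def list_delete(L, x):
    """Splice x out of L by rewiring its neighbours' pointers.
    O(1) once a pointer to x is given; finding x by key first
    requires a LIST-SEARCH."""
    if x.prev is not None:
        x.prev.next = x.next
    else:
        L.head = x.next      # x was the first element
    if x.next is not None:
        x.next.prev = x.prev
```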
Sentinels
• The code for LIST-DELETE would be simpler if we could ignore the
boundary conditions at the head and tail of the list.
• A sentinel is a dummy object that allows us to simplify boundary
conditions.
• For the list 𝐿 we provide a sentinel 𝐿. 𝑛𝑖𝑙 that represents NIL but has all the
attributes of the other objects in the list.
• Wherever we have a reference to NIL in list code, we replace it by a
reference to the sentinel 𝐿. 𝑛𝑖𝑙
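A sketch of the sentinel version: the list becomes circular, L.nil sits between head and tail, and delete needs no boundary checks (the class names are illustrative):

```python
class Node:
    def __init__(self, key=None):
        self.key = key
        self.next = None
        self.prev = None

class CircularList:
    """Circular doubly linked list with a sentinel: L.nil stands in for NIL."""
    def __init__(self):
        self.nil = Node()          # dummy object with the same attributes
        self.nil.next = self.nil
        self.nil.prev = self.nil

def list_insert(L, x):
    """Splice x in at the front, i.e. just after the sentinel."""
    x.next = L.nil.next
    L.nil.next.prev = x
    L.nil.next = x
    x.prev = L.nil

def list_delete(L, x):
    # No boundary checks: the sentinel guarantees x.prev and x.next
    # are always real objects.
    x.prev.next = x.next
    x.next.prev = x.prev
```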
Hash tables
• Insert, search, and delete are the dictionary operations.
• Hash table is an effective data structure for implementing dictionaries.
• Although searching for an element in a hash table can take as long as
searching for an element in a linked list – Θ(𝑛) time in the worst case
– in practice, hashing performs extremely well.
• Under reasonable assumptions, the average time to search for an
element in a hash table is 𝑂(1).
Direct-address tables
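The body of this slide is missing here. As a brief sketch of the standard idea: when keys are drawn from a small universe U = {0, 1, …, m − 1}, a direct-address table T stores the element with key k directly in slot T[k], so insert, search, and delete each take O(1) time (representing elements as (key, data) pairs is an illustrative choice):

```python
def direct_address_table(m):
    """One slot per possible key in the universe {0, ..., m-1}."""
    return [None] * m

def da_insert(T, x):
    T[x[0]] = x        # x is a (key, data) pair; slot index is the key itself

def da_search(T, k):
    return T[k]        # constant time: no hashing, no collisions

def da_delete(T, x):
    T[x[0]] = None
```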
Hash tables
• A hash function ℎ computes the slot from the key 𝑘
• A hash function maps the universe 𝑈 of keys into the slots of a hash table
𝑇[0 . . 𝑚 − 1]:
ℎ: 𝑈 → {0, 1, … , 𝑚 − 1}
where the size 𝑚 of the hash table is typically much less than |𝑈|.
• We say that an element with key 𝑘 hashes to slot ℎ(𝑘).
• We also say that ℎ(𝑘) is the hash value of key 𝑘.
Resolving collisions
• If two keys have the same hash, we call this situation a collision.
• Collision resolution by chaining
• In chaining, we place all the elements that hash to the same slot into the same
linked list. Slot 𝑗 contains a pointer to the head of the list of all stored elements
that hash to 𝑗. If there are no such elements, slot 𝑗 contains NIL.
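The chaining scheme can be sketched as follows (Python lists stand in for the per-slot linked lists, and the modular hash and default table size are illustrative choices):

```python
class ChainedHashTable:
    """Hash table with collision resolution by chaining."""
    def __init__(self, m=8):
        self.m = m
        self.slots = [[] for _ in range(m)]   # slot j: chain of (key, value)

    def _h(self, k):
        return hash(k) % self.m

    def insert(self, k, v):
        # Place the element in the chain of the slot it hashes to.
        self.slots[self._h(k)].append((k, v))

    def search(self, k):
        # Worst case: scan the whole chain, Theta(n) if all keys collide.
        for key, val in self.slots[self._h(k)]:
            if key == k:
                return val
        return None

    def delete(self, k):
        j = self._h(k)
        self.slots[j] = [(key, v) for key, v in self.slots[j] if key != k]
```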
Analysis of hashing with chaining
• In a hash table in which collisions are resolved by chaining, an
unsuccessful search takes average-case time Θ(1 + 𝛼), under the
assumption of simple uniform hashing.
• In a hash table in which collisions are resolved by chaining, a
successful search takes average-case time Θ(1 + 𝛼), under the
assumption of simple uniform hashing.
Analysis of hashing with chaining
• Given a hash table 𝑇 with 𝑚 slots that stores 𝑛 elements, we define the
load factor 𝛼 for 𝑇 as 𝑛/𝑚, that is the average number of elements
stored in a chain.
• Our analysis will be in terms of 𝛼, which can be less than, equal to, or
greater than 1.
• Simple Uniform Hashing:
• It is assumed that any given element is equally likely to hash into any of the 𝑚
slots, independently of where any other element has hashed to.
• For 𝑗 = 0, 1, … , 𝑚 − 1, let us denote the length of the list 𝑇[𝑗] by 𝑛𝑗 , so that
𝑛 = 𝑛0 + 𝑛1 + ⋯ + 𝑛𝑚−1 . The expected value of 𝑛𝑗 is E[𝑛𝑗] = 𝛼 = 𝑛/𝑚.
Hash functions
• The division method
• ℎ(𝑘) = 𝑘 mod 𝑚
• Avoid taking 𝑚 to be a power of 2
• A prime number not too close to a power of 2 is often a good choice.

• The multiplication method:


• First multiply 𝑘 by a constant 𝐴 in the range 0 < 𝐴 < 1
• Extract the fractional part of the number 𝑘𝐴
• Multiply this value by 𝑚 and take the floor of the result
ℎ(𝑘) = ⌊𝑚 (𝑘𝐴 mod 1)⌋
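Both methods can be sketched as follows (the constant A = (√5 − 1)/2 is Knuth's commonly suggested choice, not something fixed by the method):

```python
import math

def h_division(k, m):
    """Division method: h(k) = k mod m.
    Choose m prime and not too close to a power of 2."""
    return k % m

def h_multiplication(k, m, A=(math.sqrt(5) - 1) / 2):
    """Multiplication method: h(k) = floor(m * (k*A mod 1)),
    for a constant 0 < A < 1."""
    frac = (k * A) % 1.0         # fractional part of k*A
    return math.floor(m * frac)
```

Unlike the division method, the multiplication method works fine with m a power of 2, since the value of m is not critical to how the bits of k are mixed.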
Hashing with chaining
• In a hash table in which collisions are resolved by chaining, an
unsuccessful search takes average-case time Θ(1 + 𝛼), under the
assumption of simple uniform hashing.
• Under the assumption of simple uniform hashing, any key 𝑘 not
already stored in the table is equally likely to hash to any of the 𝑚
slots. The expected time to search unsuccessfully for a key 𝑘 is the
expected time to search to the end of list 𝑇[ℎ(𝑘)], which has expected
length E[𝑛ℎ(𝑘)] = 𝛼. Thus, the expected number of elements
examined in an unsuccessful search is 𝛼, and the total time required
(including the time for computing ℎ(𝑘)) is Θ(1 + 𝛼).
Hashing with chaining
• In a hash table in which collisions are resolved by chaining, a
successful search takes average-case time Θ(1 + 𝛼), under the
assumption of simple uniform hashing.
• Let 𝑥𝑖 denote the 𝑖th element added to 𝑥’s list after 𝑥 was added to the list, for
𝑖 = 1, 2, … , 𝑛, and let 𝑘𝑖 = 𝑥𝑖 . 𝑘𝑒𝑦. For keys 𝑘𝑖 and 𝑘𝑗 , define the indicator random
variable 𝑋𝑖𝑗 = I{ℎ(𝑘𝑖) = ℎ(𝑘𝑗)}. So E[𝑋𝑖𝑗] = 1/𝑚.
• The expected number of elements examined in a successful search is:
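Completing the computation (a sketch of the standard chaining analysis; the summation form is reconstructed, not taken from the slide):

```latex
\mathrm{E}\!\left[\frac{1}{n}\sum_{i=1}^{n}\left(1+\sum_{j=i+1}^{n}X_{ij}\right)\right]
  = 1+\frac{1}{n}\sum_{i=1}^{n}\sum_{j=i+1}^{n}\frac{1}{m}
  = 1+\frac{1}{nm}\cdot\frac{n(n-1)}{2}
  = 1+\frac{\alpha}{2}-\frac{\alpha}{2n}
  = \Theta(1+\alpha).
```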
Open addressing
• In open addressing, all elements occupy the hash table itself. That is,
each table entry contains either an element of the dynamic set or NIL.
• When searching for an element, we systematically examine table slots
until we find the desired element or have ascertained that the element
is not in the table.
• No lists/elements are stored outside the table, unlike in chaining.
• In open addressing, the hash table can “fill up” so that no further
insertions can be made.
• The load factor 𝛼 = 𝑛/𝑚 can never exceed 1.
Open Addressing
• Advantages
• It avoids pointers altogether.
• We compute the sequence of slots to be examined.
• The extra memory freed by not storing pointers provides the hash table with a
larger number of slots for the same amount of memory, potentially yielding
fewer collisions and faster retrieval.
Probing
• To perform insertion using open addressing, we successively examine,
or probe the hash table until we find an empty slot in which to put the
key.
• The sequence of positions probed depends on the key being inserted.
• To determine which slots to probe, we extend the hash function to
include the probe number (starting from 0) as a second input. Thus the
hash function becomes
ℎ: 𝑈 × {0, 1, … , 𝑚 − 1} → {0, 1, … , 𝑚 − 1}
• The probe sequence: ⟨ℎ(𝑘, 0), ℎ(𝑘, 1), … , ℎ(𝑘, 𝑚 − 1)⟩
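The probing scheme can be sketched as follows; here h is any extended hash function h(k, i), and deletion is omitted since open addressing needs a special DELETED marker for it:

```python
def probe_sequence(h, k, m):
    """Yield h(k, 0), h(k, 1), ..., h(k, m-1)."""
    for i in range(m):
        yield h(k, i)

def hash_insert(T, k, h):
    """Probe until an empty (None) slot is found; error if the table is full."""
    for j in probe_sequence(h, k, len(T)):
        if T[j] is None:
            T[j] = k
            return j
    raise OverflowError("hash table overflow")

def hash_search(T, k, h):
    """Probe the same sequence as insertion; stop on finding k,
    or on an empty slot (which proves k is absent)."""
    for j in probe_sequence(h, k, len(T)):
        if T[j] == k:
            return j
        if T[j] is None:
            return None
    return None
```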
Linear Probing
• Given an ordinary hash function ℎ′: 𝑈 → {0, 1, … , 𝑚 − 1}, which we
refer to as an auxiliary hash function, the method of linear probing
uses the hash function ℎ(𝑘, 𝑖) = (ℎ′(𝑘) + 𝑖) mod 𝑚 for 𝑖 =
0, 1, … , 𝑚 − 1.
• Given key 𝑘, we first probe 𝑇[ℎ′(𝑘)], i.e., the slot given by the
auxiliary hash function.
• We next probe slot 𝑇[ℎ′(𝑘) + 1], and so on up to slot 𝑇[𝑚 − 1]. Then
we wrap around to slots 𝑇[0], 𝑇[1], … until we finally probe slot
𝑇[ℎ′(𝑘) − 1].
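A small illustration of the wrap-around order (m = 8 and h′(k) = k mod 8 are illustrative choices):

```python
def linear_probe_hash(h_aux, m):
    """Return h(k, i) = (h'(k) + i) mod m for an auxiliary hash h'."""
    return lambda k, i: (h_aux(k) + i) % m

# Probe order for a key whose auxiliary hash is slot 6 (14 mod 8 == 6):
h = linear_probe_hash(lambda k: k % 8, 8)
order = [h(14, i) for i in range(8)]   # wraps around past slot 7 to slot 0
```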
Linear Probing
• Linear probing is easy to implement, but it suffers from a problem
known as primary clustering.
• Long runs of occupied slots build up, increasing the average search
time.
• Clusters arise because an empty slot preceded by 𝑖 full slots gets filled
next with probability (𝑖 + 1)/𝑚.
• Long runs of occupied slots tend to get longer and the average search
time increases.
Quadratic Probing
• Quadratic probing uses a hash function of the form
ℎ(𝑘, 𝑖) = (ℎ′(𝑘) + 𝑐1𝑖 + 𝑐2𝑖²) mod 𝑚
where ℎ′ is an auxiliary hash function, 𝑐1 and 𝑐2 are positive auxiliary
constants, and 𝑖 = 0, 1, … , 𝑚 − 1.
• Quadratic probing suffers from a milder form of clustering, called secondary
clustering: two keys with the same initial probe position have identical probe
sequences.
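A small sketch of secondary clustering: two distinct keys with the same auxiliary hash value trace exactly the same probe sequence (c1 = c2 = 1 and the modular h′ are illustrative choices):

```python
def quadratic_probe(h_aux, m, c1=1, c2=1):
    """Return h(k, i) = (h'(k) + c1*i + c2*i^2) mod m."""
    return lambda k, i: (h_aux(k) + c1 * i + c2 * i * i) % m

h = quadratic_probe(lambda k: k % 8, 8)
seq_a = [h(3, i) for i in range(8)]
seq_b = [h(11, i) for i in range(8)]   # 11 mod 8 == 3 mod 8: same sequence
```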
Double hashing
• Double hashing offers one of the best methods available for open
addressing because the permutations produced have many of the
characteristics of randomly chosen permutations. Double hashing
uses a hash function of the form
ℎ(𝑘, 𝑖) = (ℎ1(𝑘) + 𝑖ℎ2(𝑘)) mod 𝑚
where both ℎ1 and ℎ2 are auxiliary hash functions.
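A sketch of double hashing; m = 13 and the particular h1, h2 below are illustrative choices. For the probe sequence to visit every slot, h2(k) must be relatively prime to m, e.g. by taking m prime and 1 ≤ h2(k) < m:

```python
def double_hash(h1, h2, m):
    """Return h(k, i) = (h1(k) + i * h2(k)) mod m."""
    return lambda k, i: (h1(k) + i * h2(k)) % m

m = 13                                           # prime table size
h = double_hash(lambda k: k % m,                 # h1: initial slot
                lambda k: 1 + (k % (m - 1)),     # h2: step, never 0
                m)
# Because gcd(h2(k), m) == 1, the probe sequence is a permutation of the slots:
seq = [h(20, i) for i in range(m)]
```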
Perfect hashing
