Unit 2 Notes
Syllabus: Sorting and Searching. Brute force approach: General method - Sorting (bubble, selection, insertion) - Searching (Sequential/Linear). Divide and Conquer approach: General method - Sorting (merge, quick) - Searching (Binary Search).
Sorting Approach
Systematic arrangement of data is called SORTING. For example, in a telephone directory the surnames are arranged in increasing or decreasing order. Sorting is a technique by which the elements are arranged in some particular order. Usually the sorting order is of two types -
Ascending order: It is the sorting order in which the elements are arranged from low value to high value. In other words the elements are in increasing order. For example: 10, 50, 40, 20, 30 can be arranged in ascending order after applying some sorting technique as 10, 20, 30, 40, 50.
Descending order: It is the sorting order in which the elements are arranged from high value to low value. In other words the elements are in decreasing order. It is the reverse of ascending order. For example: 10, 50, 40, 20, 30 can be arranged in descending order after applying some sorting technique as 50, 40, 30, 20, 10.
While sorting the elements, we always consider a specific order and expect our data
to be arranged in that order.
1. Sorting is useful in database applications for arranging the data in the desired order.
2. In dictionary-like applications the data is arranged in sorted order.
3. For searching an element in a list of elements, sorting is required.
4. For checking the uniqueness of elements, sorting is required.
5. For finding the closest pair from a list of elements, sorting is required.
Internal sorting - In internal sorting the data resides in the main memory of the computer. It is used to sort a small amount of data. This type of sorting is faster in comparison to external sorting and requires less memory. Various internal sorting techniques are bubble sort, insertion sort and selection sort.
External sorting - In external sorting the data resides on secondary storage because it is too large to fit in main memory; large volumes of data are sorted using this technique. The external merge sort is a technique in which the data is loaded in intermediate files. Each intermediate file is sorted independently and then combined or merged to get the sorted data.
For example: Consider that there are 10,000 records that have to be sorted. Clearly we need to apply an external sorting method. Suppose main memory has a capacity to store 500 records in blocks, with each block size of 100 records. The 5 sorted blocks (i.e. 500 records) are stored in an intermediate file. This process is repeated 20 times to get all the records sorted in chunks. In the second step, we start merging pairs of intermediate files in main memory to get the output file.
Stable sorting - A sorting technique is called stable if it preserves the relative order of elements with equal keys. For example, if the strings "mango", "apple", "guava" are sorted by length, all are of length 5; hence they are arranged in the order in which they appear in the input list. Sorting techniques such as merge sort and radix sort are examples of stable sorting algorithms.
Offline Algorithm: The offline algorithms are a kind of algorithm in which the entire list of elements is available at the time of processing. For example, selection sort is a sorting technique in which the entire list is scanned to find the minimum element each time. Thus the entire list is available at the time of processing in case of selection sort.
Sorting Techniques
Sorting is an important activity: every time we insert or delete data in a sorted collection, the remaining data must be kept in sorted order. Various algorithms that are developed for sorting are as follows -
1. Insertion Sort 2. Bubble sort 3. Selection Sort 4. Merge Sort
5. Quick Sort 6. Radix Sort 7. Shell Sort 8. Bucket Sort
9. Heap Sort
Bubble Sorting
This is the simplest kind of sorting method. In the bubble sort procedure we perform several iterations, called passes. First the elements are stored in an array. In each pass we compare element a[0] of the array with the next element a[1]. If a[0] is greater than a[1] we interchange the values; if not, the values remain the same and we move to the next step and compare a[1] with a[2]. Similarly, if a[1] is greater than a[2] we interchange the values, otherwise not, and move to the next step. In this way, when we reach the last position the largest element has come to the last position of the array and the 1st pass is over. Now the list is sorted up to some extent. The passes from a[0] to a[n-1] are repeated in the same manner until all elements are sorted.
Example: Consider 5 unsorted elements: 45, -40, 190, 99, 11.
Compare a[2] and a[3], i.e. compare 190 and 99. Since a[2] is greater than a[3], interchange them; therefore a[2] = 99 and a[3] = 190. Next compare a[3] and a[4], i.e. compare 190 and 11. Since a[3] is greater than a[4], interchange them; therefore a[3] = 11 and a[4] = 190.
After first pass the array will hold the elements which are sorted to some extent.
Pass 2:
Pass 3:
Next compare a[2] and a[3]: no interchange. Then compare a[3] and a[4]: no interchange. This is the end of pass 3. The process is continued till pass 4. As the elements are already sorted, no interchange happens in pass 4, so the diagram is not shown here. Finally, at the end of the last pass the array holds all the elements in sorted order. Since the comparison positions look like bubbles, it is called bubble sort.
Algorithm:
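A minimal C sketch of the bubble sort procedure described above; the function name and signature are illustrative assumptions, not the notes' original algorithm figure.

/* Bubble sort: repeatedly compare adjacent elements and interchange
   them if they are out of order; after each pass the largest remaining
   element settles at the end of the array. */
void bubble_sort(int a[], int n) {
    for (int pass = 0; pass < n - 1; pass++) {       /* n-1 passes in total */
        for (int i = 0; i < n - 1 - pass; i++) {     /* compare a[i] with a[i+1] */
            if (a[i] > a[i + 1]) {                   /* interchange if out of order */
                int tmp = a[i];
                a[i] = a[i + 1];
                a[i + 1] = tmp;
            }
        }
    }
}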
Selection Sorting
Scan the array to find its smallest element and swap it with the first element. Then, starting with the second element, scan the remaining list to find the smallest element and swap it with the second element. Then, starting from the third element, the remaining list is scanned in order to find the next smallest element. Continuing in this fashion we can sort the entire list.
Generally, on pass i (0 <= i <= n-2), the smallest element is searched among the last n-i elements and is swapped with A[i]. The list gets sorted after n-1 passes.
Example: Consider the elements 70, 30, 20, 50, 60, 10, 40. We can store these elements in array A as:
1st Pass:
2nd Pass:
3rd Pass:
4th Pass:
5th Pass:
6th Pass:
Swap A[i] with the smallest element. The array then becomes -
The above algorithm can be analysed mathematically. We will apply the general plan for non-recursive mathematical analysis.
1. The input size is n, i.e. the total number of elements in the list.
2. In the algorithm the basic operation is the key comparison A[j] < A[min].
3. This basic operation depends only on the array size n. Hence we can find the sum as
C(n) = Σ (i = 0 to n-2) Σ (j = i+1 to n-1) 1 = Σ (i = 0 to n-2) (n - 1 - i) = n(n-1)/2
Thus the time complexity of selection sort is Θ(n^2) for all inputs, but the total number of key swaps is only Θ(n).
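The following is a minimal C sketch of the selection sort procedure analysed above; the function name and signature are illustrative assumptions.

/* Selection sort: on pass i, find the smallest element among
   A[i..n-1] and swap it with A[i]; one swap per pass. */
void selection_sort(int A[], int n) {
    for (int i = 0; i <= n - 2; i++) {        /* passes 0 .. n-2 */
        int min = i;                          /* index of smallest found so far */
        for (int j = i + 1; j <= n - 1; j++)
            if (A[j] < A[min])                /* basic operation: key comparison */
                min = j;
        int tmp = A[i];                       /* swap A[i] with the smallest */
        A[i] = A[min];
        A[min] = tmp;
    }
}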
Insertion Sorting
In this method the elements are inserted at their appropriate place, hence the name insertion sort. Consider the following example to understand insertion sort.
For example: Consider a list of elements as 30, 70, 20, 50, 40, 10, 60.
Step 1
Step 2
Step 3
Step 4
Step 5
Step 6
Similarly, now compare 10 with the elements of the sorted zone, i.e. 20, 30, 40, 50, 70, and insert it at the desired position with respect to them.
Step 7
Step 8
The worst case occurs for insertion sort when the list of elements is in descending order and we have to sort it in ascending order. In such a case the total number of key comparisons will be
(n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = O(n^2)
Hence the worst case time complexity is O(n^2).
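A minimal C sketch of the insertion sort procedure traced above; the function name and signature are illustrative assumptions.

/* Insertion sort: take each element in turn and insert it at its
   proper place within the already sorted zone a[0..i-1]. */
void insertion_sort(int a[], int n) {
    for (int i = 1; i < n; i++) {
        int key = a[i];               /* element to insert into the sorted zone */
        int j = i - 1;
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];          /* shift larger elements one place right */
            j--;
        }
        a[j + 1] = key;               /* insert at the desired position */
    }
}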
Linear Searching
In linear search, we access each element of an array one by one sequentially and see whether it is the desired element or not. A search is unsuccessful if all the elements are accessed and the desired element is not found. In the worst case we have to scan all n elements; in the average case we may have to scan about half of the array (n/2). Therefore, linear search can be defined as the technique which traverses the array sequentially to locate the given item.
Efficiency of sequential searching - The time taken, or the number of comparisons made, in searching for a record in a search table determines the efficiency of the technique. If the desired record is present in the first position of the search table, then only one comparison is made. If the desired record is the last one, then n comparisons have to be made. If the record is present somewhere in the search table, on an average the number of comparisons will be (n+1)/2. The worst-case efficiency of this technique is O(n), where O stands for the order of execution.
Here a is a linear array with n elements and item is a given item of information. This algorithm finds the location loc of item in a, or sets loc = 0 if the search is unsuccessful.
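A minimal C sketch of this sequential search, assuming the notes' convention that loc is a 1-based position and loc = 0 means the search is unsuccessful.

/* Linear search: traverse a[0..n-1] sequentially and return the
   1-based location of item, or 0 when item is absent. */
int linear_search(int a[], int n, int item) {
    for (int i = 0; i < n; i++)
        if (a[i] == item)
            return i + 1;   /* loc: 1-based position, per the notes' convention */
    return 0;               /* loc = 0: search unsuccessful */
}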
When the element being searched for is found, the algorithm returns the index of the matched element.
Binary Search
Binary search is an extremely efficient algorithm. This search technique searches for the given item in the minimum possible number of comparisons. To do the binary search, first we have to sort the array elements. The logic behind this technique is given below:
First compare the item with the middle element of the array. If it matches, the search ends. If the item is less than the middle element, continue the search in the left half; if it is greater, continue in the right half. Repeat the same steps until the element is found or the search area is exhausted.
In this algorithm every time we reduce the search area, so the number of comparisons keeps on decreasing. In the worst case the number of comparisons is at most log2(N + 1). So it is an efficient algorithm when compared to linear search, but the array has to be sorted before doing binary search.
Here a is a sorted array with lower bound LB and upper bound UB, and item is a given item of information. The variables beg, end and mid denote, respectively, the beginning, end and middle locations of a segment of elements of a. This algorithm finds the location loc of item in a, or sets loc = NULL.
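An iterative C sketch of this procedure, following the notes' beg/end/mid convention; here the return value -1 stands in for loc = NULL, and the names are illustrative assumptions.

/* Iterative binary search over the sorted array a[0..n-1]. */
int binary_search(int a[], int n, int item) {
    int beg = 0, end = n - 1;
    while (beg <= end) {
        int mid = (beg + end) / 2;      /* mid = (beg + end)/2 */
        if (a[mid] == item)
            return mid;                 /* found: loc = mid */
        else if (item < a[mid])
            end = mid - 1;              /* search the left half */
        else
            beg = mid + 1;              /* search the right half */
    }
    return -1;                          /* unsuccessful search (loc = NULL) */
}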
Binary search can be implemented by two methods:
1. Iterative method (as sketched above)
2. Recursive method
The recursive method of binary search follows the divide and conquer approach, as shown below.
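A recursive C sketch of the same search, written in the divide and conquer style; the function name is an illustrative assumption.

/* Recursive binary search on the segment a[beg..end]. */
int binary_search_rec(int a[], int beg, int end, int item) {
    if (beg > end)
        return -1;                       /* stopping condition: empty segment */
    int mid = (beg + end) / 2;
    if (a[mid] == item)
        return mid;                      /* found: loc = mid */
    else if (item < a[mid])
        return binary_search_rec(a, beg, mid - 1, item);   /* left half */
    else
        return binary_search_rec(a, mid + 1, end, item);   /* right half */
}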
Let the elements of the array be given and the element to search be K = 56.
We have to use the following formula to calculate the mid of the array -
mid = (beg + end)/2
So, in the given array, beg = 0, end = 8, and mid = (0 + 8)/2 = 4. So, 4 is the mid of the array.
Now the element to search is found, so the algorithm will return the index of the matched element.
Divide and Conquer Strategy
There are two fundamentals of the Divide & Conquer strategy:
1. Relational Formula
2. Stopping Condition
1. Relational Formula: It is the formula that we generate from the given technique. After generation of the formula, we apply the D&C strategy, i.e. we break the problem recursively and solve the broken subproblems.
2. Stopping Condition: When we break the problem using the Divide & Conquer strategy, we need to know how long to keep applying divide and conquer. The condition at which we need to stop the recursion steps of D&C is called the stopping condition.
Some standard algorithms that follow the Divide and Conquer approach are binary search, merge sort, quick sort, and finding the maximum and minimum element in an array.
Time Complexity: The time complexity of the divide and conquer algorithm to find the maximum and minimum element in an array is O(n). Each call divides its array in half, so the recursion tree has log(n) levels; in each call we make a constant number of comparisons to combine the maximum and minimum of the two halves. Since the recursion tree contains O(n) calls in total, the total time complexity is O(n).
Space Complexity: The space complexity of the divide and conquer algorithm to
find the maximum and minimum element in an array is O(log(n)). This is because
we are using recursion to divide the array into smaller parts, and each recursive
call takes up space on the call stack. The maximum depth of the recursion tree is
log(n), which is the number of times we can divide the array in half. Therefore, the
space complexity is O(log(n)).
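A C sketch of the max-min divide and conquer scheme analysed above; the struct and function names are illustrative assumptions.

/* Find both the minimum and maximum of a[low..high] by divide and conquer:
   split the segment in half, solve each half, and combine with two comparisons. */
struct MinMax { int min; int max; };

struct MinMax min_max(int a[], int low, int high) {
    struct MinMax result, left, right;
    if (low == high) {                        /* stopping condition: one element */
        result.min = result.max = a[low];
        return result;
    }
    if (high == low + 1) {                    /* two elements: one comparison */
        if (a[low] < a[high]) { result.min = a[low];  result.max = a[high]; }
        else                  { result.min = a[high]; result.max = a[low];  }
        return result;
    }
    int mid = (low + high) / 2;               /* divide */
    left  = min_max(a, low, mid);             /* conquer left half */
    right = min_max(a, mid + 1, high);        /* conquer right half */
    result.min = (left.min < right.min) ? left.min : right.min;   /* combine */
    result.max = (left.max > right.max) ? left.max : right.max;
    return result;
}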
Divide and Conquer (D & C) vs Dynamic Programming (DP)
Both paradigms (D & C and DP) divide the given problem into subproblems and
solve subproblems. How do choose one of them for a given problem? Divide and
Conquer should be used when the same subproblems are not evaluated many
times. Otherwise Dynamic Programming or Memoization should be used. For
example, Quicksort is a Divide and Conquer algorithm, we never evaluate the
same subproblems again. On the other hand, for calculating the nth Fibonacci
number, Dynamic Programming should be preferred.
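An illustrative C contrast (the function names are assumptions): the plain recursive Fibonacci re-evaluates the same subproblems exponentially many times, while the memoized version computes each value only once.

/* Divide and conquer with overlapping subproblems: exponential time. */
long fib_naive(int n) {
    return n < 2 ? n : fib_naive(n - 1) + fib_naive(n - 2);
}

/* Dynamic programming (memoization): each subproblem solved once, O(n) time.
   Assumes moderate n (table size 100); memo[i] == 0 means "not computed yet". */
long memo[100];
long fib_memo(int n) {
    if (n < 2) return n;
    if (memo[n] == 0)
        memo[n] = fib_memo(n - 1) + fib_memo(n - 2);
    return memo[n];
}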
Merge Sorting
The merge sort is a sorting algorithm that uses the divide and conquer strategy. In this method the division is dynamically carried out. Merge sort on an input array with n elements consists of three steps:
1. Divide: Partition the array into two sublists s1 and s2 with about n/2 elements each.
2. Conquer: Sort sublist s1 and sublist s2 recursively.
3. Combine: Merge s1 and s2 into a single sorted list.
Merge sort keeps on dividing the list into equal halves until it can no more be divided. By definition, if there is only one element in the list, it is sorted. Then merge sort combines the smaller sorted lists, keeping the new list sorted too.
Step 1 − if there is only one element in the list, consider it already sorted.
Step 2 − divide the list recursively into two halves until it can no more be divided.
Step 3 − merge the smaller lists into a new list in sorted order.
We know that merge sort first divides the whole array iteratively into equal halves until the atomic values are achieved. We see here that an array of 8 items is divided into two arrays of size 4.
This does not change the sequence of appearance of items in the original. Now we divide these two arrays into halves.
We further divide these arrays and we achieve atomic values which can no more be divided.
Now, we combine them in exactly the same manner as they were broken down. We first compare the element of each list and then combine them into another list in a sorted manner. We see that 14 and 33 are in sorted positions. We compare 27 and 10, and in the target list of 2 values we put 10 first, followed by 27. We change the order of 19 and 35, whereas 42 and 44 are placed sequentially.
In the next iteration of the combining phase, we compare lists of two data values and merge them into a list of four data values, placing all in sorted order.
After the final merging, the list should look like this -
Analysis: In the merge sort algorithm two recursive calls are made. Each recursive call focuses on n/2 elements of the list. After the two recursive calls, one call is made to combine the two sublists, i.e. to merge all n elements. We can write the recurrence as
T(n) = 2T(n/2) + cn, which solves to T(n) = O(n log2 n)
The average and worst case time complexity of merge sort is O(n log2 n).
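A compact C sketch of the merge sort described above; the function names and the temporary buffer are illustrative assumptions.

/* Merge the two sorted halves a[l..m] and a[m+1..r] using a buffer. */
void merge(int a[], int l, int m, int r) {
    int tmp[r - l + 1];
    int i = l, j = m + 1, k = 0;
    while (i <= m && j <= r)                 /* pick the smaller head element */
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= m) tmp[k++] = a[i++];        /* copy any leftovers from the left */
    while (j <= r) tmp[k++] = a[j++];        /* copy any leftovers from the right */
    for (k = 0; k < r - l + 1; k++)
        a[l + k] = tmp[k];
}

void merge_sort(int a[], int l, int r) {
    if (l >= r) return;                      /* one element: already sorted */
    int m = (l + r) / 2;                     /* divide */
    merge_sort(a, l, m);                     /* conquer left half */
    merge_sort(a, m + 1, r);                 /* conquer right half */
    merge(a, l, m, r);                       /* combine */
}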
Example 2: Consider the elements 70, 20, 30, 40, 10, 50, 60.
Quick Sorting
Quick sort is a sorting algorithm that uses the divide and conquer strategy. In this method the division is dynamically carried out. The three steps of quick sort are as follows:
1. Divide: Split the array into two subarrays such that each element in the left subarray is less than or equal to the pivot element and each element in the right subarray is greater than the pivot element. The splitting of the array into two subarrays is based on the pivot element: all the elements that are less than the pivot should be in the left subarray and all the elements that are greater than the pivot should be in the right subarray.
2. Conquer: Recursively sort the two subarrays.
3. Combine: Combine all the sorted elements in a group to form a list of sorted elements.
In merge sort the division of the array is based on the positions of the array elements, but in quick sort this division is based on the actual values of the elements. Consider an array a[i] where i ranges from 0 to n-1; then we can formalize the division of the array elements as
a[0], ..., a[j-1] <= a[j] <= a[j+1], ..., a[n-1]
where a[j] is the pivot element in its final position.
Example - Let us understand this algorithm with the help of an example.
50 30 10 90 80 20 40 70
Step 1:
We will now split the array into two parts. The left sublist will contain the elements less than the pivot (i.e. 50) and the right sublist contains the elements greater than the pivot.
Step 2:
We will increment i. While A[i] <= Pivot, we will continue to increment it, until the element pointed to by i is greater than A[Low].
Step 3:
Step 4:
Step 5:
As A[j] > Pivot (i.e. 70 > 50), we will decrement j. We will continue to decrement j until the element pointed to by j is less than A[Low].
Step 6:
Now we cannot decrement j because 40 < 50. Hence we will swap A[i] and A[j], i.e. 90 and 40.
Step 7:
As A[i] is less than A[Low] and A[j] is greater than A[Low], we will continue incrementing i and decrementing j until the false conditions are obtained.
Step 8:
Step 9:
As A[i] < A[Low] and A[j] > A[Low], we will continue incrementing i and decrementing
j.
Step 10:
As A[j] < A[Low] and j has crossed i, that is j < i, we will swap A[Low] and A[j].
Step 11:
We will now start sorting the left sublist, assuming the first element of the left sublist as the pivot element. Thus the new pivot = 20.
Step 12:
Now we will set the i and j pointers and then start comparing A[i] with A[Low], i.e. A[Pivot]. Similarly, A[j] is compared with A[Pivot].
Step 13:
As A[i] > A[Pivot], stop incrementing i. Now as A[j] > A[Pivot], decrement j.
Step 14:
Now j cannot be decremented because 10 < 20. Hence we will swap A[i] and A[j].
Step 15:
Step 16:
Step 17:
As A[j] < A[Low] we cannot decrement j now. We will now swap A[Low] and A[j], as j has crossed i (i > j).
Step 18:
As there is only one element in the left sublist, we will now sort the right sublist.
Step 19:
As the left sublist is sorted completely, we will sort the right sublist, assuming the first element of the right sublist as the pivot.
Step 20:
As A[i] > A[Pivot], we will stop incrementing i. Similarly, as A[j] < A[Pivot], we stop decrementing j. Swap A[i] and A[j].
Step 21:
Step 22:
Step 23:
Step 24:
The left sublist now contains 70 and the right sublist contains only 90. We cannot further subdivide the list. Hence the sorted list is -
6.1. Continue the process till the left pointer i and the right pointer j cross each other.
Step 7 − When i and j cross each other, the pivot is swapped with the element at position j.
7.1. The pivot's position is now fixed, and the array is divided into two parts.
Step 8 − Repeat steps 1 to 7 till both parts of the array are sorted, then exit.
The partition function is called to arrange the elements such that all the elements that are less than the pivot are on the left side of the pivot and all the elements that are greater than the pivot are on the right side of the pivot. In other words, the pivot occupies its proper position and the partitioned list is obtained in an ordered manner.
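A C sketch of the quick sort scheme traced in the steps above, with the first element as the pivot; the function names are illustrative assumptions. Note that partition returns the pivot's final position j, matching Step 7 above.

void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Partition a[low..high] around the pivot a[low]: i moves right past
   elements <= pivot, j moves left past elements > pivot; out-of-place
   pairs are exchanged until i and j cross, then the pivot goes to j. */
int partition(int a[], int low, int high) {
    int pivot = a[low];
    int i = low + 1, j = high;
    while (i <= j) {
        while (i <= high && a[i] <= pivot) i++;   /* advance i past small elements */
        while (a[j] > pivot) j--;                 /* move j past large elements */
        if (i < j) swap(&a[i], &a[j]);            /* exchange the out-of-place pair */
    }
    swap(&a[low], &a[j]);                         /* pivot to its fixed position */
    return j;
}

void quick_sort(int a[], int low, int high) {
    if (low < high) {
        int p = partition(a, low, high);
        quick_sort(a, low, p - 1);                /* sort the left sublist */
        quick_sort(a, p + 1, high);               /* sort the right sublist */
    }
}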
Analysis: When the pivot is chosen such that the array gets divided at the mid point, we get the best case. The best case time complexity of quick sort is O(n log2 n). The worst case for quick sort occurs when the pivot is the minimum or maximum of all the elements in the list; in that case one sublist is empty at every step. This ultimately results in O(n^2) time complexity. When the array elements are randomly distributed, we get the average case time complexity, which is O(n log2 n).