
DS Unit-II Notes

The document provides an overview of arrays as linear data structures, detailing their definition, features, advantages, and disadvantages. It explains the operations associated with arrays, including insertion, deletion, searching, and updating, along with examples in Python. Additionally, it discusses the concept of Abstract Data Types (ADT) and the storage representation of arrays, including sparse matrices and their search methods.

Uploaded by

balajishridharla

SRTCT’S

SUMAN RAMESH TULSIANI TECHNICAL CAMPUS – FACULTY OF ENGINEERING,


KHAMSHET
An ISO 9001:2015 Certified Institute
DEPARTMENT OF COMPUTER ENGINEERING

Unit-II : - Linear Data Structures, searching and sorting

Overview of Array (Data Structures)

Definition of Array:

An array is a linear data structure that stores a collection of elements of the same
data type in contiguous memory locations.
Each element in the array is accessed using an index (starting from 0).

Features of Arrays:

Feature             Description
Fixed Size          Size of the array is defined at the time of declaration and cannot be changed later.
Same Data Type      All elements must be of the same type (e.g., all integers, all floats).
Indexed Access      Elements can be accessed directly using their index (e.g., arr[2] gets the 3rd element).
Contiguous Storage  All elements are stored next to each other in memory.


1 Prepared by: Ms. Ashwini Bhosale



Advantages:

1. Fast Access: O(1) time to access any element using index.


2. Ease of Use: Simple and easy to implement.
3. Efficient in Iteration: Arrays work well with loops.

Disadvantages:

1. Fixed Size: Cannot grow or shrink during execution.


2. Insertion/Deletion is Costly: Requires shifting of elements.
3. Wastage of Memory: If array is partially filled.
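The cost of insertion can be seen in a minimal sketch. It assumes a fixed-capacity Python list in which unused slots hold None; the function name insert_with_shift and its parameters are illustrative, not part of the notes. Every element after the insertion point moves one slot to the right, which is why insertion takes O(n) time.

```python
def insert_with_shift(data, length, index, value):
    """Insert value at index in a fixed-capacity list, shifting later items right.

    data   -- list whose first `length` slots are in use (the rest hold None)
    length -- number of slots currently in use
    Returns the new number of used slots.
    """
    if length >= len(data):
        raise OverflowError("array is full")
    if not 0 <= index <= length:
        raise IndexError("index out of range")
    # Walk backwards from the first free slot, moving each element one step right
    for i in range(length, index, -1):
        data[i] = data[i - 1]
    data[index] = value
    return length + 1


data = [10, 20, 30, 40, None]   # capacity 5, 4 slots used
length = insert_with_shift(data, 4, 1, 15)
print(data, length)             # [10, 15, 20, 30, 40] 5
```

Deletion works the same way in reverse: every element after the removed one shifts one slot to the left.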

Types of Arrays:

Type Description
1D Array A single row or list of elements.
2D Array Table or matrix format (rows × columns).
Multidimensional Array Arrays of arrays (more than 2 dimensions).

Example in Python:

arr = [10, 20, 30, 40, 50]


print(arr[2]) # Output: 30

Applications of Arrays:

1. Used in matrices and tables.


2. Search and sort algorithms.
3. Implementation of other data structures like stacks, queues, etc.
4. Storing real-time sensor data, student marks, etc.

What is an Abstract Data Type (ADT)?

An Abstract Data Type (ADT) is a logical model or conceptual representation of
a data structure that defines what operations can be performed on the data, without
specifying how these operations will be implemented.


Array as an ADT

An array ADT defines a collection of elements (of the same data type) and supports
a set of operations such as:

Operation Description
Insert(i, value) Insert value at index i
Delete(i) Delete value at index i
Get(i) Get value at index i
Update(i, value) Update value at index i
Length() Return the number of elements

These are logical operations — the implementation (in C, Python, Java, etc.) may
vary.

Real-World Analogy

Think of an array as a row of mailboxes, each with a unique index number.

1. You can put something in a specific box (Insert).


2. You can remove it (Delete).
3. You can check what’s inside (Get).
4. You can replace the item (Update).

Array ADT Operations – Example in Python

class ArrayADT:
    def __init__(self, size):
        self.data = [None] * size  # Fixed-size array; None marks an empty slot
        self.length = 0            # Number of occupied slots

    def insert(self, index, value):
        # Simplified insert: stores value in the slot without shifting elements
        if 0 <= index < len(self.data):
            if self.data[index] is None:
                self.length += 1   # Count the slot only if it was empty
            self.data[index] = value

    def delete(self, index):
        # Clear the slot; adjust the count only if it held a value
        if 0 <= index < len(self.data) and self.data[index] is not None:
            self.data[index] = None
            self.length -= 1

    def get(self, index):
        # Returns None when the index is out of range
        if 0 <= index < len(self.data):
            return self.data[index]

    def update(self, index, value):
        if 0 <= index < len(self.data):
            self.data[index] = value

    def display(self):
        return self.data

# Usage
arr = ArrayADT(5)
arr.insert(0, 10)
arr.insert(1, 20)
arr.insert(2, 30)
print(arr.display())  # [10, 20, 30, None, None]
arr.update(1, 25)
print(arr.get(1))     # 25
arr.delete(2)
print(arr.display())  # [10, 25, None, None, None]

Why Array is an ADT?

Because it:

• Defines behavior through operations.
• Hides internal implementation (e.g., using lists, pointers, or memory blocks).
• Focuses on what the data structure does, not how.

Operations on Array :

An array is a linear data structure that allows random access to elements using
indexes. We can perform several basic operations on arrays like inserting, deleting,
traversing, searching, and updating.

Basic Operations on Array

Operation Description

Traversal Visiting all elements one by one
Insertion Adding a new element at a given position
Deletion Removing an element from a specific position
Searching Finding the location of a specific element
Updation Changing the value of an existing element

1. Traversal (Visiting each element)

arr = [10, 20, 30, 40]
for i in arr:
    print(i)

Output:

10
20
30
40

2. Insertion (Add new element)

Case 1: At specific index

arr = [10, 20, 30, 40]
arr.insert(2, 25)  # insert 25 at index 2
print(arr)

Output: [10, 20, 25, 30, 40]

Case 2: At end:

arr.append(50)
print(arr)

Output: [10, 20, 25, 30, 40, 50]

3. Deletion (Remove an element)

Case 1: By index

arr = [10, 20, 30, 40]
arr.pop(2)  # remove element at index 2
print(arr)

Output: [10, 20, 40]

Case 2: By value

arr.remove(20)  # remove the value 20
print(arr)

Output: [10, 40]

4. Searching (Find an element)

arr = [10, 20, 30, 40]
target = 30
found = False
for i in range(len(arr)):
    if arr[i] == target:
        print(f"Element found at index {i}")
        found = True
        break
if not found:
    print("Element not found")

Output: Element found at index 2

5. Updating (Modify value at specific index)

arr = [10, 20, 30, 40]
arr[1] = 25  # change value at index 1 to 25
print(arr)

Output: [10, 25, 30, 40]

Summary Table:

Operation  Python Example               Time Complexity
Traverse   for i in arr:                O(n)
Insert     arr.insert(i, val)           O(n)
Delete     arr.pop(i) / remove          O(n)
Search     linear scan / binary search  Linear: O(n), Binary: O(log n) on sorted data
Update     arr[i] = val                 O(1)
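The O(log n) entry for binary search applies only to sorted data. As a brief sketch, Python's standard bisect module can locate a value in a sorted list; the wrapper name binary_search and the convention of returning -1 for a missing value are assumptions made for this illustration.

```python
from bisect import bisect_left


def binary_search(sorted_arr, target):
    """Return the index of target in sorted_arr, or -1 if it is absent."""
    # bisect_left returns the leftmost position where target could be inserted
    i = bisect_left(sorted_arr, target)
    if i < len(sorted_arr) and sorted_arr[i] == target:
        return i
    return -1


arr = [10, 20, 30, 40]
print(binary_search(arr, 30))  # 2
print(binary_search(arr, 35))  # -1
```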


Storage Representation in Array (with Example)

What is Storage Representation of Array?

Storage representation refers to how array elements are stored in memory. Arrays
use contiguous memory locations, meaning all elements are stored next to each
other in RAM.

Each element is accessed using an index, and the address of any element can be
calculated using a formula.

Memory Layout of Array

Let's say we have an integer array:

int arr[5] = {10, 20, 30, 40, 50};

Assume each integer takes 4 bytes, and the base address (starting address) of the array
is 1000.

Index Element Address


0 10 1000
1 20 1004
2 30 1008
3 40 1012
4 50 1016

Address Calculation Formula

For an array arr, the address of the element at index i is calculated as:

LOC(arr[i]) = Base Address + i × Size of Data Type

Example:

Base address = 1000


i=3
Size of int = 4 bytes
Address of arr[3] = 1000 + 3 × 4 = 1012
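The formula can be checked with a few lines of Python. This is only an arithmetic illustration: the function name element_address is made up for the sketch, and the base address 1000 and element size 4 are the assumed values from the example above, not real memory addresses.

```python
def element_address(base, index, elem_size):
    """LOC(arr[i]) = Base Address + i * Size of Data Type."""
    return base + index * elem_size


# Reproduce the address column for int arr[5] with base 1000 and 4-byte ints
for i in range(5):
    print(i, element_address(1000, i, 4))

print(element_address(1000, 3, 4))  # 1012
```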


Why is this Useful?

• Fast access: Arrays allow constant-time access (O(1)) using this formula.
• Used in low-level programming to manage memory directly.

Storage Representation in 2D Array

A 2D array is stored in memory in either row-major or column-major order.

Example:

int arr[2][3] = {
{1, 2, 3},
{4, 5, 6}
};

Row-Major Order (default in C; NumPy arrays in Python also default to it):


Stored as: 1, 2, 3, 4, 5, 6

Memory layout (Assume base = 1000, int = 4 bytes):

Element Address
arr[0][0] = 1 1000
arr[0][1] = 2 1004
arr[0][2] = 3 1008
arr[1][0] = 4 1012
arr[1][1] = 5 1016
arr[1][2] = 6 1020
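The addresses in this table follow from extending the 1D formula: in row-major order, element arr[i][j] sits (i × number of columns + j) elements past the base. The sketch below assumes the same base address (1000) and element size (4 bytes); the function names are illustrative. A column-major variant is included for comparison.

```python
def row_major_address(base, i, j, n_cols, elem_size):
    """Address of arr[i][j] when rows are stored one after another."""
    return base + (i * n_cols + j) * elem_size


def column_major_address(base, i, j, n_rows, elem_size):
    """Address of arr[i][j] when columns are stored one after another."""
    return base + (j * n_rows + i) * elem_size


# int arr[2][3], base = 1000, int = 4 bytes
print(row_major_address(1000, 1, 1, 3, 4))     # 1016, matches the table
print(column_major_address(1000, 1, 1, 2, 4))  # 1012
```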

Types of Array (with Examples)

Arrays are classified based on their dimensions — the number of indexes required to
access elements.

1. One-Dimensional Array (1D Array)

A 1D array is a list of elements that can be accessed using a single index.

Example:

arr = [10, 20, 30, 40]
print(arr[2])  # Output: 30


Memory Layout:

Index:  0   1   2   3
Value: 10  20  30  40

2. Two-Dimensional Array (2D Array)

A 2D array is like a table or matrix with rows and columns. You need two indexes
to access elements: arr[row][column].

Example:

arr = [[1, 2, 3],
       [4, 5, 6]]
print(arr[1][2])  # Output: 6

Memory Layout (Row-major order):

Row 0: [1 2 3]
Row 1: [4 5 6]

3. Multi-Dimensional Array (3D and higher)

A multidimensional array is an array of arrays — more than 2 dimensions.

Example (3D Array):

arr = [
[ [1, 2], [3, 4] ],
[ [5, 6], [7, 8] ]
]
print(arr[1][0][1]) # Output: 6

Here, arr[1][0][1] means: Second block → First row → Second element

4. Dynamic Array

A dynamic array is one that can grow or shrink in size during runtime. In Python,
lists act as dynamic arrays.

Example:

arr = [10, 20]
arr.append(30)  # Add element at runtime
print(arr)      # Output: [10, 20, 30]
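How a dynamic array grows can be sketched on top of a fixed-size block. The class below is a teaching illustration, not how Python's list is actually implemented: the name DynamicArray, the starting capacity of 2, and the doubling strategy are all assumptions. When the block is full, a larger block is allocated and the elements are copied over.

```python
class DynamicArray:
    def __init__(self):
        self.capacity = 2                    # Assumed starting capacity
        self.length = 0                      # Number of elements in use
        self.data = [None] * self.capacity   # Underlying fixed-size block

    def append(self, value):
        # Grow first if the underlying block is full
        if self.length == self.capacity:
            self._resize(2 * self.capacity)
        self.data[self.length] = value
        self.length += 1

    def _resize(self, new_capacity):
        # Allocate a bigger block and copy the existing elements into it
        new_data = [None] * new_capacity
        for i in range(self.length):
            new_data[i] = self.data[i]
        self.data = new_data
        self.capacity = new_capacity

    def to_list(self):
        return self.data[:self.length]


da = DynamicArray()
for v in (10, 20, 30):
    da.append(v)
print(da.to_list(), da.capacity)  # [10, 20, 30] 4
```

Doubling keeps the average cost of append at O(1), because a copy of n elements happens only after n cheap appends.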

Comparison Table

Type Dimensions Indexing Example Access


1D Array 1 arr[i] arr[2] → 3rd element
2D Array 2 arr[i][j] arr[1][2]
Multi-Dimensional 3 or more arr[i][j][k]... arr[1][0][1]
Dynamic Array Varies arr.append(val) arr.append(40)

Applications of Different Arrays

Type Used In
1D Lists, stacks, queues, temperature data
2D Matrices, game boards, image pixels
3D+ Scientific simulations, 3D graphics
Dynamic Real-time data storage, dynamic inputs

Sparse Matrix Representation using 2D Array and Searching: Sequential (Linear) Search

What is a Sparse Matrix?

A sparse matrix is a matrix in which most of the elements are zero.


Instead of storing all elements (including zeros), we only store non-zero elements to
save memory.

Common Representations of Sparse Matrix:

Triplet (Coordinate List) Format – stores only non-zero values and their row &
column indices:

(row, col, value)

Example Matrix:

[0 0 3]
[0 0 0]
[4 0 0]

Sparse Representation (Triplet Format):

sparse = [ (0, 2, 3), (2, 0, 4) ]

Here,

(0, 2, 3) → value 3 at row 0, col 2

(2, 0, 4) → value 4 at row 2, col 0
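Building the triplet list from an ordinary 2D list can be sketched as follows. The function name to_triplets is illustrative; it scans the matrix row by row and keeps only the non-zero entries.

```python
def to_triplets(matrix):
    """Convert a dense 2D list into a list of (row, col, value) triplets."""
    triplets = []
    for r, row in enumerate(matrix):
        for c, value in enumerate(row):
            if value != 0:          # Skip zeros; only non-zero entries are stored
                triplets.append((r, c, value))
    return triplets


matrix = [
    [0, 0, 3],
    [0, 0, 0],
    [4, 0, 0],
]
sparse = to_triplets(matrix)
print(sparse)  # [(0, 2, 3), (2, 0, 4)]
```

Here 9 stored cells shrink to 2 triplets. The saving grows with matrix size as long as most entries are zero.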

Searching in Sparse Matrix using Sequential (Linear) Search

We want to find whether a target value exists in the sparse matrix by checking each
non-zero entry one by one.

Python Example:

# Sparse representation: (row, col, value)
sparse = [ (0, 2, 3), (2, 0, 4) ]
target = 4
found = False
for row, col, value in sparse:
    if value == target:
        print(f"Found {target} at position ({row}, {col})")
        found = True
        break
if not found:
    print("Value not found")

Output: Found 4 at position (2, 0)

Step-by-Step Explanation:

1. You define the sparse matrix as a list of triplets.


2. Set the target value you want to search.

3. Use a loop to linearly scan each triplet.
4. If the value matches the target, print its position and stop.
5. If no match is found, print “not found”.

Time Complexity:

Best Case: O(1) → if the value is found at the first entry

Worst Case: O(n) → if the value is last or not present

(n = number of non-zero elements)

Advantages of Using Sparse + Linear Search:

1. Saves space (only non-zero values stored).


2. Simple to implement.
3. Efficient when the number of non-zero values is small.

Summary Table:

Concept Details
Matrix Type Sparse Matrix (mostly 0s)
Storage Method Triplet (row, col, value)
Search Method Sequential / Linear Search
Time Complexity O(n) where n = non-zero terms

Binary Search in Sparse Matrix?

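Binary Search:

1. Requires the data to be sorted.
2. Works in O(log n) time (faster than linear search).

So, before binary searching in a sparse matrix, we must sort the triplet list by the key
we search on (either by value or by position). Sorting costs O(n log n), so it pays off
when the same list is searched many times.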
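Sorting the triplet list by the chosen key can be sketched with Python's built-in sorted. The sample triplets below are made up for illustration (they are not the matrix used in the examples that follow), chosen so that the two orderings differ.

```python
# Hypothetical triplets (row, col, value) for illustration
sparse = [(0, 2, 9), (2, 0, 4), (2, 2, 3)]

# Sort by value: use the third field of each triplet as the key
by_value = sorted(sparse, key=lambda t: t[2])
print(by_value)     # [(2, 2, 3), (2, 0, 4), (0, 2, 9)]

# Sort by position: tuples compare by row first, then by column
by_position = sorted(sparse, key=lambda t: (t[0], t[1]))
print(by_position)  # [(0, 2, 9), (2, 0, 4), (2, 2, 3)]
```

Searching by value needs the first ordering; looking up a given (row, col) needs the second.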

Example 1: Search by Value (Sorted by Value)

Sparse Matrix (original):

[0 0 3]
[0 0 0]
[4 0 9]

Sparse Representation (sorted by value):

sparse = [ (0, 2, 3), (2, 0, 4), (2, 2, 9) ]

Binary Search Code to Find a Value

Python Example:

def binary_search_sparse(sparse, target):
    low = 0
    high = len(sparse) - 1

    while low <= high:
        mid = (low + high) // 2
        row, col, val = sparse[mid]

        if val == target:
            return (row, col)
        elif val < target:
            low = mid + 1   # Target can only be in the right half
        else:
            high = mid - 1  # Target can only be in the left half

    return None

# Example sparse matrix (sorted by value)
sparse = [ (0, 2, 3), (2, 0, 4), (2, 2, 9) ]
target = 4

result = binary_search_sparse(sparse, target)
if result is not None:
    print(f"Found {target} at position {result}")
else:
    print("Value not found")

Output:

Found 4 at position (2, 0)

Step-by-Step Execution:

1. Sparse array is sorted by value.
2. Binary search starts with mid-point.
3. If value == target, return position.
4. If value < target, search right half.
5. Else, search left half.

Time Complexity

1. Binary Search: O(log n)


where n = number of non-zero elements
2. Better than Linear Search when data is sorted

Summary Table

Feature Details
Input Format Triplet list: (row, col, value)
Requirement Must be sorted (by value)
Search Method Binary Search
Time Complexity O(log n)
Example Found Output (2, 0) for value 4

Notes:

1. Binary Search cannot be applied to a 2D array directly unless the data is sorted
row-wise or value-wise.
2. It works best when the data is already sorted, since no extra sorting step is needed.

What is Fibonacci Search?

Fibonacci Search is a search algorithm (like binary search) used on sorted arrays.
It uses Fibonacci numbers to divide the search range.

Note: The data must be sorted for Fibonacci Search.

Simple Example

Original Matrix:

[0 0 5]
[0 0 0]
[3 0 9]

Sparse Representation (sorted by value):

sparse = [
(2, 0, 3),
(0, 2, 5),
(2, 2, 9)
]

Now we want to search for value 5 using Fibonacci Search.

Python Code (Simple)

def fibonacci_search_sparse(sparse, target):
    n = len(sparse)
    fib2 = 0  # (m-2)'th Fibonacci
    fib1 = 1  # (m-1)'th Fibonacci
    fibM = fib1 + fib2  # m'th Fibonacci

    # Find the smallest Fibonacci number ≥ n
    while fibM < n:
        fib2 = fib1
        fib1 = fibM
        fibM = fib1 + fib2

    offset = -1  # Everything up to this index has been ruled out

    while fibM > 1:
        i = min(offset + fib2, n - 1)
        row, col, value = sparse[i]

        if value == target:
            return (row, col)
        elif value < target:
            # Discard the left part: move one Fibonacci step down
            fibM = fib1
            fib1 = fib2
            fib2 = fibM - fib1
            offset = i
        else:
            # Discard the right part: move two Fibonacci steps down
            fibM = fib2
            fib1 = fib1 - fib2
            fib2 = fibM - fib1

    # Check last possible element
    if fib1 and offset + 1 < n and sparse[offset + 1][2] == target:
        row, col, _ = sparse[offset + 1]
        return (row, col)

    return None

# Example Usage
sparse = [(2, 0, 3), (0, 2, 5), (2, 2, 9)]  # Sorted by value
target = 5

result = fibonacci_search_sparse(sparse, target)
if result is not None:
    print(f"Found {target} at position {result}")
else:
    print("Not found")

Output:

Found 5 at position (0, 2)

Step-by-Step Summary:

Step Action
1 Sparse matrix stored as a list of (row, col, value)
2 The list is sorted by value: 3, 5, 9
3 Fibonacci search is applied to search for target 5
4 It returns the position (0, 2) where 5 is stored

Why Use Fibonacci Search?

1. Works faster than linear search on sorted data (O(log n) vs. O(n))
2. Requires only addition and subtraction (no division)
3. Suited to sparse matrices whose triplet list is already sorted

Summary Table:

Feature Description

Matrix Type Sparse Matrix
Representation Triplet Format (row, col, val)
Sorted By Value
Search Used Fibonacci Search
Time O(log n)
Output Returns position (row, col)

What is Index Sequential Search?

Index Sequential Search is a combination of:

• Indexing (like binary search on blocks)
• Sequential Search within a block

Steps:

1. Divide data into blocks.


2. Create an index table that stores the first element of each block.
3. Search index table to find the right block.
4. Sequentially search inside the identified block.

It is faster than linear search, but simpler than binary/fibonacci search.

Simple Example

Original Matrix:

[0 0 5]
[0 0 0]
[3 0 9]

Sparse Representation:

sparse = [
(2, 0, 3),
(0, 2, 5),
(2, 2, 9)
]

Assume values are sorted by value:


Sorted sparse: [(2, 0, 3), (0, 2, 5), (2, 2, 9)]


Step 1: Divide into Blocks

Divide the sparse list into blocks of 2:

• Block 0: (2, 0, 3), (0, 2, 5)
• Block 1: (2, 2, 9)

Step 2: Create Index Table

index_table = [
(0, 3), # Block 0 starts at index 0, min value = 3
(2, 9) # Block 1 starts at index 2, min value = 9
]
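Rather than writing the index table by hand, it can be derived from the sorted triplet list. The helper below is a sketch: the name build_index_table and the block_size parameter are assumptions, and it records for each block the starting index and the value of its first (smallest) triplet.

```python
def build_index_table(sparse, block_size):
    """Return [(start_index, min_value), ...] for a triplet list sorted by value."""
    return [
        (start, sparse[start][2])   # First triplet of a block holds its minimum
        for start in range(0, len(sparse), block_size)
    ]


sparse = [(2, 0, 3), (0, 2, 5), (2, 2, 9)]   # Sorted by value
index_table = build_index_table(sparse, 2)
print(index_table)  # [(0, 3), (2, 9)]
```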

Step 3: Index Sequential Search Code

def index_sequential_search(sparse, index_table, target):
    block = -1  # Position in index_table of the candidate block

    # Step 1: Search in index table for the last block whose
    # minimum value does not exceed the target
    for i, (start_idx, min_val) in enumerate(index_table):
        if min_val > target:
            break
        block = i

    # If no block found
    if block == -1:
        return None

    # Step 2: Sequential search in block
    block_start = index_table[block][0]
    if block + 1 < len(index_table):
        block_end = index_table[block + 1][0]  # Next block's start
    else:
        block_end = len(sparse)                # Last block runs to the end

    for i in range(block_start, block_end):
        row, col, value = sparse[i]
        if value == target:
            return (row, col)

    return None


Using the Code:

sparse = [(2, 0, 3), (0, 2, 5), (2, 2, 9)]
index_table = [(0, 3), (2, 9)]  # (start_index, min_value)
target = 5

result = index_sequential_search(sparse, index_table, target)
if result is not None:
    print(f"Found {target} at position {result}")
else:
    print("Not found")

Output: Found 5 at position (0, 2)

Summary of Steps:

Step Description
1 Sort sparse list by value
2 Divide into blocks
3 Create index table (first value of each block)
4 Search index table to find target's block
5 Do linear search inside that block

Advantages of Index Sequential Search:

1. Faster than linear search


2. Easy to implement
3. Suitable for medium-size sorted data

Summary Table

Feature Description
Matrix Type Sparse Matrix
Representation Triplet Format (row, col, value)
Search Method Index Sequential Search
Sorting Required Yes (sorted by value)
Time Complexity O(√n) approx. (when the block size is about √n)
Example Output (0, 2) for value 5


What is Internal and External Sorting?

1. Internal Sorting

Definition:
Sorting is called internal when all data that needs to be sorted fits into the main
memory (RAM).

Example of Internal Sorting

You have an array of 5 integers:

arr = [5, 2, 9, 1, 6]

You can apply Bubble Sort, Insertion Sort, Quick Sort, etc., directly in memory.

Common Internal Sorting Algorithms:

Algorithm Characteristics
Bubble Sort Simple, stable, slow (O(n²))
Insertion Sort Fast for nearly sorted data
Quick Sort Fastest average case (O(n log n))
Merge Sort Stable and efficient

Internal Sort Characteristics:

1. Works with small to medium datasets
2. Fast as it uses RAM
3. Easy to implement

2. External Sorting

Definition:
Sorting is called external when the dataset is too large to fit into memory, and
sorting is done using secondary storage (disk).

Example of External Sorting

Suppose you have a file of 1 billion records (more than RAM capacity).
You break the file into chunks, sort each chunk in memory, and then merge all
chunks.

Common algorithm: External Merge Sort

Steps in External Merge Sort:

1. Divide: Break large file into smaller chunks that fit in memory.
2. Sort: Sort each chunk using internal sort.
3. Merge: Use k-way merging to merge all sorted chunks.
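The three steps above can be sketched in miniature. This is an illustration under stated assumptions, not a production implementation: the function name external_sort, the chunk_size parameter, and the one-integer-per-line temporary files are choices made for the sketch. The k-way merge uses the standard library's heapq.merge, and the result is returned as a list for brevity, whereas a real external sort would write the merged output back to disk.

```python
import heapq
import os
import tempfile


def external_sort(numbers, chunk_size):
    """Sort integers using sorted runs stored in temporary files."""
    run_paths = []

    def write_run(chunk):
        # Sort: one chunk is sorted in memory, then written out as a run file
        chunk.sort()
        fd, path = tempfile.mkstemp(text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{x}\n" for x in chunk)
        run_paths.append(path)

    # Divide: collect at most chunk_size items at a time
    chunk = []
    for x in numbers:
        chunk.append(x)
        if len(chunk) == chunk_size:
            write_run(chunk)
            chunk = []
    if chunk:
        write_run(chunk)

    # Merge: k-way merge that reads each run file line by line
    files = [open(p) for p in run_paths]
    try:
        runs = [map(int, f) for f in files]
        return list(heapq.merge(*runs))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)


print(external_sort([5, 2, 9, 1, 6, 3, 8], chunk_size=3))  # [1, 2, 3, 5, 6, 8, 9]
```

Only one chunk is held in memory during the sort phase, and the merge phase keeps just one pending value per run.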

Real-life Example:

Sorting a huge log file on a web server, which is 20 GB in size but the
RAM is only 4 GB.

Comparison Table

Feature          Internal Sorting                   External Sorting
Data Size        Fits in memory (RAM)               Too large for memory
Storage Used     Main memory (RAM)                  Secondary memory (Disk)
Speed            Faster                             Slower due to disk access
Algorithms Used  Bubble, Quick, Merge, Insertion    External Merge Sort, Polyphase merge
Use Case         Sorting arrays, lists in programs  Large files, databases, big data

Summary :

Term Meaning
Internal Sort Sorts data that fits in RAM
External Sort Sorts huge data using disk + RAM
Key Algorithm Quick, Merge, Insertion (internal), External Merge (external)
Real-world Usage Internal: Arrays in Python/Java. External: Logs, DB records

Bubble Sort Explained with Example

What is Bubble Sort?

Bubble Sort is a simple comparison-based sorting algorithm.


It works by repeatedly swapping adjacent elements if they are in the wrong order.

Each pass "bubbles" the largest unsorted element to the end.

Working Principle:

1. Compare adjacent elements.


2. Swap them if they are in the wrong order.
3. Repeat for all elements until the array is sorted.

Step-by-Step Example

Let’s sort the list:

arr = [5, 2, 4, 1]

Pass 1:

• Compare 5 & 2 → Swap → [2, 5, 4, 1]
• Compare 5 & 4 → Swap → [2, 4, 5, 1]
• Compare 5 & 1 → Swap → [2, 4, 1, 5]

Largest element (5) is now at the end.


Pass 2:

• Compare 2 & 4 → OK
• Compare 4 & 1 → Swap → [2, 1, 4, 5]

Pass 3:

• Compare 2 & 1 → Swap → [1, 2, 4, 5]

Sorted in 3 passes!

Final Output: [1, 2, 4, 5]

Visualization:

Pass Array State Notes


1 [2, 4, 1, 5] 5 bubbled to last position
2 [2, 1, 4, 5] 4 reached correct position
3 [1, 2, 4, 5] Final sorted order

Time Complexity:

Case Time Complexity


Best Case O(n) (when already sorted, using the early-exit check)
Average O(n²)
Worst Case O(n²)

Space Complexity: O(1) (in-place)

Features of Bubble Sort:

Feature Details
Stable ✅ Yes (preserves order)
Adaptive ✅ Yes (optimized version)
Easy to implement ✅ Beginner-friendly

Python Code:

def bubble_sort(arr):
    n = len(arr)
    for i in range(n):
        swapped = False
        # After pass i, the last i elements are already in place
        for j in range(0, n - 1 - i):
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
                swapped = True
        if not swapped:
            break  # Optimization: stop if already sorted
    return arr

print(bubble_sort([5, 2, 4, 1]))

Output: [1, 2, 4, 5]

Insertion Sort

Like sorting playing cards in your hand.


You insert each new element into its correct position among the sorted ones.

Example:

Sort [5, 2, 4, 1]

Step-by-Step:

• Start from the 2nd element.
• Compare 2 with 5 → 2 < 5 → Insert before
  [2, 5, 4, 1]
• Compare 4 with 5 → 4 < 5 → Insert before
  Compare 4 with 2 → 4 > 2 → Done
  [2, 4, 5, 1]
• Compare 1 with 5 → Insert before
  Compare 1 with 4 → Insert before
  Compare 1 with 2 → Insert before
  [1, 2, 4, 5]

Final Output: [1, 2, 4, 5]

Time Complexity:

Best: O(n) (already sorted; close to this when nearly sorted)

Worst: O(n²)

Space: O(1)

Stable: ✅ Yes
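The walkthrough above corresponds to the following sketch. It sorts the list in place; the variable name key for the element being inserted is a common convention rather than something taken from the notes.

```python
def insertion_sort(arr):
    for i in range(1, len(arr)):      # Start from the 2nd element
        key = arr[i]                  # Element to insert into the sorted prefix
        j = i - 1
        # Shift larger elements of the sorted prefix one slot to the right
        while j >= 0 and arr[j] > key:
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key              # Drop the element into its correct position
    return arr


print(insertion_sort([5, 2, 4, 1]))  # [1, 2, 4, 5]
```

Because the inner loop stops at the first element that is not larger than key, equal elements keep their original order, which is what makes the sort stable.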

Selection Sort :

Find the minimum element and place it at the beginning. Repeat for remaining array.

Example:

Sort [5, 2, 4, 1]

Step-by-Step:

• Find min in [5,2,4,1] → 1 → Swap with 5
  [1, 2, 4, 5]
• Find min in [2,4,5] → 2 → Already correct
• Find min in [4,5] → 4 → Already correct

Final Output: [1, 2, 4, 5]

Time Complexity:

• Best/Worst: O(n²)
• Space: O(1)
• Stable: No
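A minimal in-place sketch of the procedure follows. The name min_idx is illustrative; it tracks the position of the smallest element found in the unsorted part.

```python
def selection_sort(arr):
    n = len(arr)
    for i in range(n - 1):
        min_idx = i
        # Find the minimum of the unsorted part arr[i:]
        for j in range(i + 1, n):
            if arr[j] < arr[min_idx]:
                min_idx = j
        # Place it at the beginning of the unsorted part
        if min_idx != i:
            arr[i], arr[min_idx] = arr[min_idx], arr[i]
    return arr


print(selection_sort([5, 2, 4, 1]))  # [1, 2, 4, 5]
```

The inner loop always scans the whole unsorted part, so the running time is O(n²) even when the input is already sorted.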


Quick Sort

Pick a pivot, place elements < pivot to the left, > pivot to the right.
Recursively sort left and right parts.

Example:

Sort [5, 2, 4, 1]
Let pivot = 5

Step-by-Step:

1. Partition:

 [2, 4, 1] < 5
 [5]
 No elements > 5
→ [2, 4, 1] + [5]

2. Now sort [2, 4, 1], pivot = 2

 [1] < 2, [4] > 2


→ [1, 2, 4]

3. Combine: [1, 2, 4, 5]

Final Output: [1, 2, 4, 5]

Time Complexity:

1. Best/Average: O(n log n)


2. Worst: O(n²) (bad pivot)
3. Space: O(log n)
4. Stable: ❌ No
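The example can be mirrored in a short sketch that, like the walkthrough, picks the first element as the pivot. Note the assumption: this version builds new lists for the two parts, which keeps the code close to the steps shown but uses O(n) extra space; the O(log n) space figure above applies to in-place partitioning.

```python
def quick_sort(arr):
    if len(arr) <= 1:
        return arr                                   # Nothing left to partition
    pivot = arr[0]                                   # First element as pivot
    smaller = [x for x in arr[1:] if x < pivot]      # Goes to the left
    equal = [x for x in arr if x == pivot]           # Pivot and any duplicates
    larger = [x for x in arr[1:] if x > pivot]       # Goes to the right
    return quick_sort(smaller) + equal + quick_sort(larger)


print(quick_sort([5, 2, 4, 1]))  # [1, 2, 4, 5]
```

With a first-element pivot, an already sorted input produces the worst case, since one side of every partition is empty.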

Merge Sort:

Divide the array into halves, sort each half, then merge them.

Example:

Sort [5, 2, 4, 1]

Step-by-Step:

• Split → [5, 2] and [4, 1]
• Split again → [5], [2] and [4], [1]
• Merge sorted pairs: [2, 5] and [1, 4]
• Merge final: [1, 2, 4, 5]

Final Output: [1, 2, 4, 5]

Time Complexity:

1. Best/Worst/Average: O(n log n)


2. Space: O(n)
3. Stable: ✅ Yes
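The split-and-merge steps can be sketched as two small functions. The helper name merge is an assumption; it combines two sorted lists into one, and taking from the left list on ties is what keeps the sort stable.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:       # <= keeps equal elements in original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])           # At most one of these has items left
    merged.extend(right[j:])
    return merged


def merge_sort(arr):
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])      # Sort each half recursively
    right = merge_sort(arr[mid:])
    return merge(left, right)


print(merge_sort([5, 2, 4, 1]))  # [1, 2, 4, 5]
```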

Summary Comparison Table

Algorithm       Time (Best)  Time (Worst)  Space     Stable  Use When...
Insertion Sort  O(n)         O(n²)         O(1)      ✅ Yes   List is nearly sorted or small
Selection Sort  O(n²)        O(n²)         O(1)      ❌ No    Simple but not efficient
Quick Sort      O(n log n)   O(n²)         O(log n)  ❌ No    Fast, best general-purpose sort
Merge Sort      O(n log n)   O(n log n)    O(n)      ✅ Yes   Requires stability & consistent speed
