Module 1
Basic Concepts of Data Structures
Badharudheen P
Assistant Professor,
Dept. of CSE, MESCE, Kuttippuram
System Life Cycle
The system life cycle is a series of stages that are
worked through during the development of a new
information system.
Systems: Large Scale Computer Programs
The systems development life cycle concept applies
to a range of hardware and software configurations
A system can be composed of hardware only,
software only, or a combination of both.
Phases of System Life Cycle
Requirements
Analysis
Design
Refinement and coding
Verification
Requirement Phase
All projects begin with a set of specifications that
defines the purpose of the program.
Requirements describe the information that the
programmers are given (input) and the results
(output) that must be produced.
Method:
Meet with the end users or clients and describe exactly
what the output should look like.
Analysis Phase
The problem is broken down into manageable pieces.
There are two approaches to analysis:
Bottom-up and Top-down.
The bottom-up approach is an older, unstructured
strategy.
The top-down approach is a structured strategy that
divides the program into manageable segments.
Several alternate solutions to the problem are
developed and compared during this phase
Design Phase
The designer approaches the system from the
perspectives of both data objects and the operations.
The first perspective leads to the creation of abstract
data types, while the second requires the specification
of algorithms.
Ex: Designing a scheduling system for a university
Data objects: students, courses, professors, etc.
Operations: insert, remove, search, etc.
Design Phase
The abstract data types and algorithm specifications
are language independent.
We must specify the information required for each
data object and ignore coding details.
Example: A Student object should include a name, phone
number, social security number, etc., as in the sketch below.
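A minimal sketch in C (the field names and sizes are assumptions for illustration):

struct Student {
    char name[50];    /* student name            */
    char phone[15];   /* phone number            */
    char ssn[12];     /* social security number  */
};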
Refinement and Coding Phase
We choose representations for data objects and write
algorithms and then programs for each operation on
them.
The representation of the data objects can determine
the efficiency of the algorithms related to them.
The choice of programming language depends on:
The nature of the problem
The programming languages available on your computer
The limitations of your computer.
Verification Phase
This phase consists of
Developing correctness proofs for the program
using mathematical models
Testing the program with a variety of input data
Testing can be done only after coding.
Testing requires a set of test data.
Test data should be chosen carefully so that it covers
all possible scenarios.
Removing errors
The task of removing the defects, or bugs, is called
debugging.
Algorithms
An algorithm is a finite set of instructions to
accomplish a particular task.
All algorithms must satisfy the following criteria:
Input: Data given to the algorithm.
Output: At least one result should be produced.
Definiteness: Each instruction should be clear and unambiguous.
Finiteness: For all cases, the algorithm should
terminate after a finite number of steps.
We can use a natural language like English or a
graphical representation called a flowchart.
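For instance, an algorithm to find the largest of ‘n’ numbers can be written in English-like steps:
Step 1: Read the n numbers (input).
Step 2: Set max to the first number.
Step 3: Compare max with each remaining number; if a number is larger, set max to it.
Step 4: After all numbers are examined, output max and stop (output, finiteness).
Each step is clear (definiteness), so all the criteria are satisfied.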
Performance Analysis
An algorithm is said to be efficient and fast if it
takes less time to execute and consumes less memory.
Performance is analyzed based on 2 criteria
Space Complexity
Time Complexity
Space Complexity
The space complexity of a program is the
amount of memory it needs to run to completion.
The space needed by a program consists of:
Data space: Space needed to store all constants, variable
values, dynamically allocated space, etc.
Instruction space: Space needed to store the executable
version of the program; this space is fixed.
Environment stack space: This space is needed to store
the information to resume the suspended (partially
completed) functions.
Space Complexity - Example
int sum(int A[], int n)
{
    int sum = 0, i;
    for (i = 0; i < n; i++)
        sum = sum + A[i];
    return sum;
}

Space needed: n = 1 word, sum = 1 word, i = 1 word, array A = n words
Total space complexity S(P) = (n + 3) words
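A minimal usage sketch (assuming the sum function above is in the same file):

#include <stdio.h>

int sum(int A[], int n);              /* the function defined above */

int main(void)
{
    int A[] = {1, 2, 3, 4, 5};
    printf("sum = %d\n", sum(A, 5));  /* prints sum = 15 */
    return 0;
}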
Time Complexity
The time complexity of an algorithm or a program is
the amount of time it needs to run to completion.
T(P) = C + Tp
Here C is the compile time and Tp is the run time.
Time Complexity
The method is to identify the key operations and
count such operations performed.
A key operation is an operation that takes maximum
time among all possible operations in the algorithm.
The time complexity can now be expressed as
function of number of key operations performed.
The running time of an algorithm also depends on the
input data.
Time Complexity
For calculating the time complexity, we use a method
called Frequency Count: ie, counting the number of
steps
Comments – 0 steps
Assignment statement – 1 step
Conditional statement – 1 step
Loop condition for ‘n’ iterations – n + 1 steps
Body of the loop – n steps
Return statement – 1 step
Time Complexity - Example
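As an illustrative sketch (reconstructing the kind of example this slide shows), count the steps of the array-sum function from the space complexity example:

int sum(int A[], int n)          /* steps             */
{
    int sum = 0, i;              /* assignment: 1     */
    for (i = 0; i < n; i++)      /* condition: n + 1  */
        sum = sum + A[i];        /* body: n           */
    return sum;                  /* return: 1         */
}

Total step count: f(n) = 1 + (n + 1) + n + 1 = 2n + 3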
Time Complexity
There are three cases:
Best case:
The amount of time on the best possible input data.
The input is the one for which the algorithm runs fastest.
Provides a lower bound on the running time.
Average case:
On typical (or average) input data.
Provides a prediction of the running time.
Worst case:
On the worst possible input data.
Provides an upper bound on the running time.
Analysis Function
The running time is expressed as a function f(n) of the
input size ‘n’.
The various functions are compared with respect to
their running times.
The analysis is independent of the machine and
the programming style.
Frequency Count
Otherwise known as step count.
Frequency count method can be used to analyze a
program.
It is the number of times the statement is executed in
the program.
Here we assume that every statement takes the same
constant amount of time for its execution.
Hence the time complexity can be determined by
summing the frequency counts of all the statements of
that program.
Frequency Count
For example:
(a) If the statement x++ is not contained within any
loop, its frequency count is 1.
(b) If the same statement is placed inside a loop that runs
‘n’ times, it is executed n times.
(c) If it is placed inside two nested loops that each run
‘n’ times, it is executed n² times.
From these frequency counts we can analyze the
program, as the sketch below shows.
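A sketch in C of the three cases (n and x are illustrative names):

#include <stdio.h>

int main(void)
{
    int n = 10, x = 0, i, j;

    x++;                          /* (a) not in a loop: executed 1 time         */

    for (i = 0; i < n; i++)
        x++;                      /* (b) inside one loop: executed n times      */

    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            x++;                  /* (c) inside nested loops: executed n² times */

    printf("%d\n", x);            /* 1 + n + n² = 111 for n = 10                */
    return 0;
}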
Frequency Count
For the sum example above, f(n) = 2n + 3, so the time complexity is O(n)
Frequency Count
To convert a function to O(something):
Take the term with the highest degree in the function.
Discard the coefficient.
For example: f(n) = 2n² + 3n + 5
Its time complexity is: O(n²)
Asymptotic Notation
To solve a problem, we may have many solutions
(algorithms).
We have to find which algorithm is good in terms of
time and memory.
Asymptotic notations are the notations used to
represent the time complexity of an algorithm in
terms of a function.
The functions, in increasing order of growth, can be:
1 < log n < √n < n < n log n < n² < n³ < ….. < nⁿ
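To see how these grow, take n = 1024 as an illustration (logarithms to base 2):
log n = 10, √n = 32, n = 1024, n log n = 10,240, n² ≈ 10⁶, and n³ ≈ 10⁹.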
Asymptotic Notation
Notations used are:
O: Big-oh Notation
Upper bound of a function
Ω: Big-omega Notation
Lower bound of a function
Ө: Big-theta Notation
Tight bound of a function
O: Big-oh Notation
It is used to represent the worst case complexity.
Definition: The function f(n) = O(g(n)) iff Ǝ
positive constants c and n0 such that
f(n) ≤ c * g(n) ∀ n ≥ n0
[Graph: f(n) and c·g(n) plotted against n; c·g(n) lies above f(n) for all n ≥ n0]
O: Big-oh Notation
Example: f(n) = 2n + 3, g(n) = n
=> 2n + 3 ≤ c * g(n)
2n + 3 ≤ 5 * n c=5, for all n≥1
=> f(n) = O(g(n))
Note: You can take any value for ‘c’, but make sure that the
right-hand side is greater than or equal to the left-hand side
from some starting value of ‘n’ onward.
Or you can write it as 2n + 3 ≤ 2n + 3n
=> 2n + 3 ≤ 5n for all n ≥ 1
Or you can write it as 2n + 3 ≤ 2n² + 3n²
=> 2n + 3 ≤ 5n² for all n ≥ 1
O: Big-oh Notation
Example: f(n) = 3n + 2, g(n) = n
=> 3n + 2 ≤ c * g(n)
3n + 2 ≤ 4 * n c=4 for all n ≥ 2
Hence f(n) = O(g(n))
Note: If ‘n’ bounds the function f(n), then n², n³, etc. will
also bound it. So always choose the least upper bound.
So we can say that f(n) = O(n), f(n) = O(n²)
But f(n) ≠ O(log n)
Ω: Big-omega Notation
The function f(n) = Ω(g(n)) iff Ǝ positive
constants c and n0 such that
f(n) ≥ c * g(n) ∀ n ≥ n0
[Graph: f(n) and c·g(n) plotted against n; f(n) lies above c·g(n) for all n ≥ n0]
Ω: Big-omega Notation
Example: f(n) = 2n + 3, g(n) = n
=> 2n + 3 ≥ c * g(n)
2n + 3 ≥ 1 * n c=1, for all n≥1
Hence f(n) = Ω(g(n))
Note: You can take any value for ‘c’, but make sure that the
right-hand side is less than or equal to the left-hand side
from some starting value of ‘n’ onward.
Or you can write it as 2n + 3 ≥ log n for all n ≥ 1
Ө: Big-theta Notation
The function f(n) = Ө(g(n)) iff Ǝ positive
constants c1, c2 and n0 such that
c1 * g(n) ≤ f(n) ≤ c2 * g(n) ∀ n ≥ n0
[Graph: f(n) lies between c1·g(n) and c2·g(n) for all n ≥ n0]
Ө: Big-theta Notation
Example: f(n) = 2n + 3, g(n) = n
Already we found that
1 * n ≤ 2n + 3 ≤ 5 * n for all n≥1
Hence f(n) = Ө(g(n))
This gives a tight bound on the function f(n)
=> f(n) = Ө(n)
f(n) ≠ Ө(log n)
f(n) ≠ Ө(n²)
Time Complexity of Linear Search
For linear search, we count the number of comparisons
performed. In the best case the desired item is found at
the first position, taking one comparison; in the worst
case all n items must be compared. Hence the worst-case
time complexity is O(n), as in the sketch below.
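A minimal linear search sketch in C (the function and parameter names are illustrative):

/* Returns the index of key in A[0..n-1], or -1 if not found. */
int linear_search(int A[], int n, int key)
{
    int i;
    for (i = 0; i < n; i++)
        if (A[i] == key)          /* key operation: one comparison per item */
            return i;             /* best case: found at i = 0              */
    return -1;                    /* worst case: n comparisons, O(n)        */
}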
Time Complexity of Binary Search
In binary search, each comparison eliminates about
half of the items from the list. For a list with n items,
about n/2 items remain after the first comparison, n/4
after the second, and so on; after k comparisons about
n/2^k items remain. The search stops when just one item
is left, i.e., when n/2^k = 1, which gives k = log₂ n.
Hence binary search takes O(log n) comparisons in the
worst case, as in the sketch below.
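A binary search sketch in C on a sorted (ascending) array, with illustrative names:

/* Returns the index of key in sorted A[0..n-1], or -1 if not found. */
int binary_search(int A[], int n, int key)
{
    int low = 0, high = n - 1;
    while (low <= high)
    {
        int mid = low + (high - low) / 2;   /* midpoint, written to avoid overflow */
        if (A[mid] == key)
            return mid;
        else if (A[mid] < key)
            low = mid + 1;                  /* discard the left half  */
        else
            high = mid - 1;                 /* discard the right half */
    }
    return -1;                              /* about log2(n) comparisons: O(log n) */
}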
Performance Comparisons
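For a list of n items, the two searching methods compare as follows:
Algorithm – Best case – Worst case
Linear search – O(1) – O(n)
Binary search – O(1) – O(log n)
Binary search is much faster for large n, but it requires the list to be kept sorted.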