DS Unit-1
What is an algorithm?
• An algorithm is a step by step procedure to solve a problem.
• In normal language, the algorithm is defined as a sequence of statements which are used
to perform a task. In computer science, an algorithm can be defined as follows...
An algorithm is a sequence of unambiguous instructions used for
solving a problem, which can be implemented (as a program) on a
computer.
• Algorithms are used to convert our problem solution into step by step statements.
• These statements can be converted into computer programming instructions which form
a program.
• This program is executed by a computer to produce a solution.
• Here, the program takes required data as input, processes data according to the program
instructions and finally produces a result as shown in the following picture.
Specifications of Algorithms
Every algorithm must satisfy the following specifications...
1. Input - Every algorithm must take zero or more input values from an external source.
2. Output - Every algorithm must produce an output as result.
3. Definiteness - Every statement/instruction in an algorithm must be clear and
unambiguous (only one interpretation).
4. Finiteness - For all different cases, the algorithm must produce result within a finite
number of steps.
5. Effectiveness - Every instruction must be basic enough to be carried out and it also must
be feasible.
Example for an Algorithm
Let us consider the following problem for finding the largest value in a given list of values.
➢ Problem Statement : Find the largest number in the given list of numbers.
➢ Input : A list of positive integer numbers. (List must contain at least one number).
➢ Output : The largest number in the given list of positive integer numbers.
Consider the given list of numbers as 'L' (input), and the largest number as 'max' (Output).
Algorithm
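int findMax(const int L[], int listSize)
{
    int max = L[0], i;   /* the list contains at least one number */
    for(i = 1; i < listSize; i++)
    {
        if(L[i] > max)   /* found a larger value */
            max = L[i];
    }
    return max;
}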
Two Types of Recursion
1. Direct Recursion
A function calls itself directly.
A function that calls itself is known as a direct recursive function (or simply a recursive function).
2. Indirect Recursion
A function calls another function, and that function calls the original function back.
A function that calls another function, which in turn calls the first function back, is known as an indirect recursive function.
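The two forms can be sketched in C as follows. This is an illustrative sketch rather than part of the original notes; the names factorial, is_even and is_odd are hypothetical and chosen only for the example.

```c
#include <stdbool.h>

/* Direct recursion: factorial() calls itself. */
unsigned long factorial(unsigned int n)
{
    if (n <= 1)                 /* base case stops the recursion */
        return 1;
    return n * factorial(n - 1);
}

/* Indirect recursion: is_even() calls is_odd(),
   and is_odd() calls is_even() back. */
bool is_odd(unsigned int n);    /* forward declaration */

bool is_even(unsigned int n)
{
    if (n == 0)
        return true;
    return is_odd(n - 1);
}

bool is_odd(unsigned int n)
{
    if (n == 0)
        return false;
    return is_even(n - 1);
}
```

In both cases a base case (n <= 1 or n == 0) is what guarantees that the chain of calls ends.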
Definition:
Performance Analysis of an algorithm is the process of evaluating how efficiently an algorithm
uses computer resources, mainly:
• Time (how fast it runs)
• Space (how much memory it uses)
Why is it Important?
When we have multiple algorithms for the same problem, we use performance analysis to:
• Compare them,
• And select the one that is best suited for our needs.
Just like choosing the best way to travel from one city to another (flight, train, bike), we choose
the best algorithm based on performance.
• Is it easy to understand or implement?
We mainly focus on:
1. Time Complexity:
o How much time the algorithm takes to complete its task.
o It depends on the size of input (e.g., 10 items vs 1000 items).
2. Space Complexity:
o How much memory the algorithm needs while running.
o It includes memory for variables, data structures, etc.
Formal Definitions:
• Performance of an algorithm is a way of making judgments about the quality of algorithms.
• It is the process of predicting the amount of time and space required by an algorithm to
solve a problem.
Generally, when a program is under execution it uses the computer memory for THREE reasons.
They are as follows...
1. Instruction Space: It is the amount of memory used to store the compiled version of the instructions.
2. Environmental Stack: It is the amount of memory used to store information about partially executed functions at the time of a function call.
3. Data Space: It is the amount of memory used to store all the variables and constants.
Note - When we want to perform analysis of an algorithm based on its Space complexity, we consider only Data Space and ignore Instruction Space as well as Environmental Stack.
That means we calculate only the memory required to store Variables, Constants, Structures, etc.
To calculate the space complexity, we must know the memory required to store values of different data types (this depends on the compiler). For example, a 16-bit C compiler typically requires the following...
1. 2 bytes to store Integer value.
2. 4 bytes to store Floating Point value.
3. 1 byte to store Character value.
4. 8 bytes to store Double value.
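These sizes are not fixed by the C language; they vary from compiler to compiler. A minimal sketch for checking them on your own compiler, using the standard sizeof operator (the function name print_type_sizes is hypothetical):

```c
#include <stdio.h>

/* Prints the storage size, in bytes, of the basic C data types
   as defined by the compiler used to build this program. */
void print_type_sizes(void)
{
    printf("char   : %zu byte(s)\n", sizeof(char));
    printf("int    : %zu byte(s)\n", sizeof(int));
    printf("float  : %zu byte(s)\n", sizeof(float));
    printf("double : %zu byte(s)\n", sizeof(double));
}
```

Only sizeof(char) == 1 is guaranteed by the C standard. Many current compilers report 4 bytes for int, so the 2-byte figure used in these notes should be read as an assumption for the calculations that follow.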
Consider the following piece of code...
Example 1
int square(int a)
{
    return a*a;
}
In the above piece of code, it requires 2 bytes of memory to store variable 'a' and another 2 bytes of memory is used for the return value.
That means, in total it requires 4 bytes of memory to complete its execution. These 4 bytes of memory are fixed for any input value of 'a'. This space complexity is said to be Constant Space Complexity.
If any algorithm requires a fixed amount of space for all input values then that space complexity is said to be Constant Space Complexity.
Consider the following piece of code...
Example 2
int sum(int A[ ], int n)
{
    int sum = 0, i;
    for(i = 0; i < n; i++)
        sum = sum + A[i];
    return sum;
}
In the above piece of code it requires
'n*2' bytes of memory to store array variable 'A[ ]'
2 bytes of memory for integer parameter 'n'
4 bytes of memory for local integer variables 'sum' and 'i' (2 bytes each)
2 bytes of memory for return value.
That means, in total it requires '2n+8' bytes of memory to complete its execution. Here, the total amount of memory required depends on the value of 'n'. As the value of 'n' increases, the space required also increases proportionately. This type of space complexity is said to be Linear Space Complexity.
If the amount of space required by an algorithm increases in proportion to the size of the input, then that space complexity is said to be Linear Space Complexity.
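The byte count for the array-sum example can be written as a small helper. This is a sketch of the accounting used in these notes, not a measurement of real memory use; sum_data_space and its int_size parameter are hypothetical names, with int_size = 2 matching the 2-byte integer assumed above.

```c
#include <stddef.h>

/* Data space, in bytes, attributed to sum(A, n) by the accounting above:
   n array elements + parameter n + locals sum and i + return value.
   int_size is the number of bytes per int (2 in these notes). */
size_t sum_data_space(size_t n, size_t int_size)
{
    size_t array_bytes  = n * int_size;   /* A[ ]            */
    size_t param_bytes  = int_size;       /* n               */
    size_t local_bytes  = 2 * int_size;   /* sum and i       */
    size_t return_bytes = int_size;       /* returned value  */

    return array_bytes + param_bytes + local_bytes + return_bytes;
}
```

With int_size = 2, sum_data_space(10, 2) gives 2*10 + 8 = 28 bytes and sum_data_space(100, 2) gives 208 bytes: the constant part stays at 8 bytes while the array part grows with n.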
To calculate the time complexity of an algorithm, we need to define a model machine. Let us assume a machine with the following configuration...
1. It is a Single processor machine
2. It is a 32 bit Operating System machine
3. It performs sequential execution
4. It requires 1 unit of time for Arithmetic and Logical operations
5. It requires 1 unit of time for Assignment and Return value
6. It requires 1 unit of time for Read and Write operations
Now, we calculate the time complexity of following example code by using the above-defined
model machine...
Consider the following piece of code...
Example 1
int sum(int a, int b)
{
    return a+b;
}
In the above sample code, it requires 1 unit of time to calculate a+b and 1 unit of time to return the value. That means, in total it takes 2 units of time to complete its execution, and this does not change based on the input values of a and b. For all input values, it requires the same amount of time, i.e. 2 units.
If any program requires a fixed amount of time for all input values then its time complexity is
said to be Constant Time Complexity.
Consider the following piece of code...
Example 2
int sum(int A[], int n)
{
    int sum = 0, i;
    for(i = 0; i < n; i++)
        sum = sum + A[i];
    return sum;
}
For the above code, time complexity can be calculated as follows...
In the above calculation:
Cost is the amount of computer time required for a single operation in each line.
Repetition is the number of times each operation is repeated.
Total is the amount of computer time required by each operation over all its repetitions.
So the above code requires '4n+4' units of computer time to complete the task. Here the exact time is not fixed; it changes with the value of n. If we increase n, the time required also increases linearly.
In total it takes '4n+4' units of time to complete its execution, so it has Linear Time Complexity.
If the amount of time required by an algorithm increases linearly with the size of the input, then that time complexity is said to be Linear Time Complexity.
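The '4n+4' total can be checked by counting operations under the model machine defined earlier. The breakdown below is an assumption consistent with that machine (1 unit per assignment, comparison, arithmetic operation and return), and sum_time_units is a hypothetical helper name.

```c
/* Counts model-machine time units spent by sum(A, n). */
long sum_time_units(const int A[], int n)
{
    long units = 0;
    int sum, i;

    sum = 0;
    units += 1;                 /* assignment sum = 0              */

    i = 0;
    units += 1;                 /* loop initialisation i = 0       */

    for (;;) {
        units += 1;             /* comparison i < n, n + 1 times   */
        if (!(i < n))
            break;

        sum = sum + A[i];
        units += 2;             /* one addition + one assignment   */

        i++;
        units += 1;             /* increment of i, n times         */
    }

    units += 1;                 /* return                          */
    (void)sum;                  /* only the unit count is reported */
    return units;
}
```

Adding the parts gives 1 + 1 + (n + 1) + 2n + n + 1 = 4n + 4, so an array of 5 elements costs 24 units and an empty array costs 4.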
Asymptotic Notations
Asymptotic notations are used to represent the complexities of algorithms for asymptotic analysis. These notations are mathematical tools to represent the complexities. There are three notations that are commonly used.
Big Oh Notation
Big-Oh (O) notation gives an upper bound for a function f(n) to within a constant factor.
Big-Omega (Ω) notation gives a lower bound for a function f(n) to within a constant factor.
We write f(n) = Ω(g(n)) if there are positive constants n0 and c such that, to the right of n0, f(n) always lies on or above c*g(n).
Ω(g(n)) = { f(n) : there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0 }
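As a worked example (not part of the original notes), take f(n) = 4n + 4, the running time of the array-sum code, and g(n) = n. The constants c and n0 below are one valid choice among many; the upper bound uses the usual Big-Oh condition f(n) ≤ c*g(n) for all n ≥ n0.

```latex
\begin{align*}
\text{Lower bound } (c = 4,\ n_0 = 1):\quad
  & 0 \le 4n \le 4n + 4 \ \text{ for all } n \ge 1
  \;\Rightarrow\; 4n + 4 = \Omega(n) \\
\text{Upper bound } (c = 5,\ n_0 = 4):\quad
  & 0 \le 4n + 4 \le 5n \ \text{ for all } n \ge 4
  \;\Rightarrow\; 4n + 4 = O(n)
\end{align*}
```

The upper bound holds from n0 = 4 because 4n + 4 ≤ 5n is equivalent to n ≥ 4.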
Whenever we want to work with a large amount of data, then organizing that data is very
important. If that data is not organized effectively, it is very difficult to perform any task on that
data. If it is organized effectively then any operation can be performed easily on that data.
A data structure can be defined as follows...
Data structure is a method of organizing a large amount of data more efficiently so that
any operation on that data becomes easy
Linear Data Structures
Example
1. Arrays
2. List (Linked List)
3. Stack
4. Queue
If a data structure organizes the data in a non-sequential order (for example, as a hierarchy or a network), then that data structure is called a Non-Linear Data Structure.
Example
1. Tree
2. Graph
3. Dictionaries
4. Heaps
5. Tries, etc.
A data type tells the computer what kind of data you are working with in a
program.
Examples:
An Abstract Data Type (ADT) is a special kind of data type where you can
use it and perform operations, but you don’t need to know how it works
inside.
It hides the internal working details (that’s why it’s called abstract), but lets you
use it easily.
Examples of ADT:
• Stack
• Queue
• List
These are all made using basic data types, but you use them with specific
operations.
Summary:
• ADTs make programming easier by hiding complexity.
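A minimal sketch of a stack ADT in C, assuming an array-based implementation with a fixed capacity. The names Stack, STACK_CAPACITY, stack_init, stack_is_empty, stack_push and stack_pop are hypothetical. Code that uses the stack only calls these operations and never touches items or top directly.

```c
#include <stdbool.h>

#define STACK_CAPACITY 100

typedef struct {
    int items[STACK_CAPACITY];
    int top;                    /* index of the top element, -1 when empty */
} Stack;

void stack_init(Stack *s)
{
    s->top = -1;
}

bool stack_is_empty(const Stack *s)
{
    return s->top == -1;
}

/* Pushes value; returns false if the stack is full. */
bool stack_push(Stack *s, int value)
{
    if (s->top == STACK_CAPACITY - 1)
        return false;
    s->items[++s->top] = value;
    return true;
}

/* Pops the top element into *out; returns false if the stack is empty. */
bool stack_pop(Stack *s, int *out)
{
    if (stack_is_empty(s))
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Because callers depend only on the four operations, the array could later be replaced by a linked list without changing any code that uses the stack.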
What is an Array?
Whenever we want to work with a large number of data values, we would need that many different variables. As the number of variables increases, the complexity of the program also increases and programmers get confused by the variable names. There may be situations in which we need to work with a large number of similar data values. To make this work easier, the C programming language provides a concept called "Array".
An array is a variable which can store multiple values of the same data type at a time.
"A collection of similar data items stored in contiguous memory locations under a single name".
int a, b, c;
Here, the compiler allocates 2 bytes of memory with the name 'a', another 2 bytes
of memory with the name 'b' and 2 more bytes with the name 'c'. These three memory
locations may or may not be in sequence. Each of these
individual variables stores only one value at a time.
int a[3];
Here, the compiler allocates a total of 6 bytes of contiguous memory with the
single name 'a', and allows three different integer values (each in 2
bytes of memory) to be stored at a time. The memory is organized as follows...
That means all three memory locations are named 'a'. But how can we
refer to individual elements? The answer is that the
compiler not only allocates memory, but also assigns a numerical value to each
individual element of the array. This numerical value is called the "index". Index
values for the above example are as follows...
arrayName[indexValue]
For the above example, the individual elements can be referred to as follows...
a[0], a[1], a[2]
If we want to assign a value to any of these memory locations (array elements),
we can assign it as follows...
a[1] = 100;
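A short sketch of index-based access in C. The function name fill_and_sum and the values 50 and 150 are hypothetical; only a[1] = 100 comes from the statement above.

```c
/* Stores three values in a[0], a[1], a[2] and returns their sum. */
int fill_and_sum(void)
{
    int a[3];           /* three ints in contiguous memory, indices 0..2 */
    int i, total = 0;

    a[0] = 50;          /* first element                       */
    a[1] = 100;         /* second element, as in the statement */
    a[2] = 150;         /* third (last) element                */

    for (i = 0; i < 3; i++)
        total += a[i];  /* a[i] selects the element at index i */

    return total;
}
```

Valid indices run from 0 to 2 here; C does not check array bounds, so writing to a[3] would be an error that the compiler may not report.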
Types of Arrays
In the C programming language, arrays are classified into two types. They are as
follows...
1. Single Dimensional Array
2. Multi Dimensional Array
We use the following general syntax for declaring a single dimensional array...
Example Code
int rollNumbers[60];
The above declaration of a single dimensional array reserves 60 contiguous
memory locations of 2 bytes each with the name rollNumbers and tells the
compiler to allow only integer values in those memory locations.
We use the following general syntax for declaring and initializing a single
dimensional array with size and initial values.
Example Code
We can also use the following general syntax to initialize a single dimensional
array without specifying the size and with initial values...
The array must be initialized if it is created without specifying any size. In this
case, the size of the array is decided based on the number of values initialized.
Example Code
In the above example declaration, the size of the array 'marks' is 6 and the size of
the array 'studentName' is 16. This is because, in the case of a character array, the
compiler stores one extra character, the null character '\0', at the end.
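The declaration forms described above can be sketched as follows. The original example code is not reproduced here, so the initial values, the array primes and the string "Alexander Smith" are hypothetical, chosen so that marks has 6 elements and studentName occupies 16 bytes.

```c
#include <stddef.h>

int rollNumbers[60];                        /* size given, no initial values */
int primes[5] = {2, 3, 5, 7, 11};           /* size and initial values given */
int marks[] = {89, 90, 76, 78, 98, 86};     /* size omitted: 6 values, so 6  */
char studentName[] = "Alexander Smith";     /* 15 characters + '\0' = 16     */

/* Number of elements in marks, computed by the compiler. */
size_t marks_count(void)
{
    return sizeof(marks) / sizeof(marks[0]);
}

/* Number of bytes in studentName, including the terminating '\0'. */
size_t student_name_size(void)
{
    return sizeof(studentName);
}
```

The sizeof(array) / sizeof(array[0]) idiom avoids hard-coding the element count when the size is left for the compiler to decide.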
of memory allocation. The index value of a single dimensional array starts with
zero (0) for the first element and is incremented by one for each subsequent element. The index
value of an array element is also called its subscript.
arrayName [ indexValue ]
Example Code
marks [2] = 99 ;
In the above statement, the third element of the 'marks' array is assigned the
value '99'.
The most popular and commonly used multi dimensional array is the two dimensional
array. 2-D arrays are used to store data in the form of a table. We also use
2-D arrays to represent mathematical matrices.
We use the following general syntax for declaring a two dimensional array...
Example Code
We use the following general syntax for declaring and initializing a two
dimensional array with a specific number of rows and columns with initial values.
Example Code
int matrix_A[2][3] = {
    {1, 2, 3},
    {4, 5, 6}
};
We use the following general syntax to access the individual elements of a two-
dimensional array...
Example Code
matrix_A [0][1] = 10 ;
In the above statement, the element with row index 0 and column index 1
of the matrix_A array is assigned the value 10.
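A short sketch of declaring, updating and traversing a 2-D array in C. The function name matrix_demo and the macros ROWS and COLS are hypothetical; the initial values and the assignment follow the statements above.

```c
#define ROWS 2
#define COLS 3

/* Updates matrix_A[0][1] and returns the sum of all six elements. */
int matrix_demo(void)
{
    int matrix_A[ROWS][COLS] = {
        {1, 2, 3},
        {4, 5, 6}
    };
    int row, col, total = 0;

    matrix_A[0][1] = 10;    /* row index 0, column index 1 */

    for (row = 0; row < ROWS; row++)
        for (col = 0; col < COLS; col++)
            total += matrix_A[row][col];

    return total;
}
```

After the assignment the first row is {1, 10, 3}, so the function returns 1 + 10 + 3 + 4 + 5 + 6 = 29.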