INTRODUCTION TO DATA STRUCTURES
BASIC TERMINOLOGY
Data: Collection of raw facts. Data may be a single value or it may be a set
of values.
Information: Meaningful or Processed data is called Information.
Record is a collection of related data items.
File is a collection of logically related records.
Entity: A person, place, thing, event or concept about which information is
recorded. An entity has certain attributes or properties which may be
assigned values.
Attributes give the characteristics of the entity.
Entity set: Entities with similar attributes form an entity set, e.g. bank
account holders: all people who have an account at a bank.
Range: The set of all possible values that could be assigned to a particular
attribute.
DATA STRUCTURES
A logical or mathematical model of a particular organization of data is called
a data structure.
A data structure (DS) is a collection of data organized in a particular way.
It describes how the data is organized in memory.
Data structures are the building blocks of a program.
The choice of a particular data structure depends on two considerations:
The data structure must be rich enough in structure to reflect the
relationships existing between the data.
The structure should be simple enough that the data can be processed
effectively whenever required.
DATA STRUCTURE = ORGANIZED DATA + ALLOWED OPERATIONS
PROGRAM = ALGORITHM + DATA STRUCTURE
CLASSIFICATION OF DATA STRUCTURES
Data structures are normally divided into two broad categories:
Primitive data structures
Basic data structures that are directly operated upon by machine
instructions.
Available in most programming languages as built-in types.
E.g. int, float, char, pointer
Non-primitive data structures
These data structures group homogeneous or heterogeneous data elements
that are stored together.
TYPES OF DATA STRUCTURE
NON-PRIMITIVE DATA STRUCTURES
These are further classified as:
Linear data structure
A data structure is said to be linear if its elements form a sequence,
e.g. arrays, linked lists, stacks, queues.
Non-linear data structure
A data structure is said to be non-linear if its elements do not form a
sequence; it represents hierarchical or network relationships between
elements, e.g. trees, graphs.
DATA STRUCTURE OPERATIONS
The choice of data structure depends on the
frequency with which specific operations are
performed.
Operations that can be performed are:
Traversing
Searching
Insertion
Deletion
Sorting
Merging
DATA STRUCTURE OPERATIONS (CONTD..)
Traversing
Accessing each record exactly once so that certain items in the
record may be processed.
Searching
Finding the location of the record with a given key value, or finding
the location of all records satisfying one or more conditions.
Insertion
Adding a new record to the structure.
Deletion
Removing a record from a structure.
Sorting
Arranging the records in some logical order, e.g. numerically or alphabetically.
Merging
Combining the records in two different sorted files into a single
sorted file.
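As an illustration of the merging operation, here is a minimal C sketch that merges two sorted integer arrays into one sorted array. The function name merge_sorted and the use of plain int arrays in place of files of records are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Merges the sorted arrays a[0..na-1] and b[0..nb-1] into out,
   which must have room for na + nb elements. */
void merge_sorted(const int a[], int na, const int b[], int nb, int out[])
{
    assert(na >= 0 && nb >= 0 && out != NULL);

    int i = 0, j = 0, k = 0;

    /* Repeatedly copy the smaller of the two front elements. */
    while (i < na && j < nb) {
        if (a[i] <= b[j])
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }

    /* One input is exhausted; copy what is left of the other. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each element of both inputs is visited once, so the merge is also an instance of traversing.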
DATA TYPES
Each variable in C has an associated data type.
Each data type requires a different amount of memory; the exact sizes
depend on the compiler and platform.
Some commonly used basic data types are:
int
Used to store an integer
Requires at least 2 bytes of memory (2 bytes on older 16-bit compilers,
typically 4 bytes on current systems)
char
Stores a single character
Requires one byte of memory
float
Used to store decimal numbers with single precision (typically 4 bytes)
double
Used to store decimal numbers with double precision (typically 8 bytes)
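Because the sizes of the basic types depend on the compiler and platform, it can help to query them with the sizeof operator. The following is a small sketch; the function name print_basic_type_sizes is made up for the example.

```c
#include <assert.h>
#include <stdio.h>

/* Prints the storage size, in bytes, of the basic C types on the
   current compiler and platform. */
void print_basic_type_sizes(void)
{
    /* sizeof(char) is 1 by definition; the other sizes are
       implementation-defined. */
    assert(sizeof(char) == 1);

    printf("char   : %zu byte(s)\n", sizeof(char));
    printf("int    : %zu byte(s)\n", sizeof(int));
    printf("float  : %zu byte(s)\n", sizeof(float));
    printf("double : %zu byte(s)\n", sizeof(double));
}
```

The printed values may differ between machines, which is why no fixed output is shown here.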
ABSTRACT DATA TYPES
A useful tool for specifying the logical properties of a data
type is the abstract data type (ADT).
It is a way of looking at a data structure that focuses on what it
does and ignores how it does its job.
Abstract means considered apart from the detailed
specification or implementation.
A data type is a collection of values and a set of operations
on those values.
ABSTRACT DATA TYPES (CONTD..)
An ADT can be a structure considered without regard to its
implementation
OR
It can be thought of as a description of the data in
the structure together with a list of operations that can be
performed on that data
e.g. a stack.h header file that provides the operations
on a stack
SPECIFYING AN ADT
An ADT consists of two parts:
a value definition and
an operator definition
The value definition defines the collection of values for the
ADT and consists of two parts:
1. a definition clause
2. a condition clause
Each operator is defined as an abstract function with three parts:
1. a header
2. optional preconditions
3. postconditions
SPECIFYING AN ADT (EXAMPLE)
To illustrate the concept of an ADT and our specification
method, consider the ADT RATIONAL, which corresponds to the
mathematical concept of a rational number.
/* value definition */
abstract typedef <integer, integer> RATIONAL;
Condition RATIONAL[1] <> 0;
/* operator definition */
abstract RATIONAL mult (a, b) /* written a*b */
RATIONAL a,b;
Postcondition mult[0] == a[0]*b[0];
mult[1] == a[1]*b[1];
Similarly, the definitions of the other operators follow here.
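One possible C realization of the RATIONAL specification is sketched below. The type name Rational and the functions rational_make and rational_mult are assumptions for illustration; the ADT itself does not prescribe any particular implementation.

```c
#include <assert.h>

/* Value definition: a pair <numerator, denominator>. */
typedef struct {
    int num;   /* corresponds to RATIONAL[0] */
    int den;   /* corresponds to RATIONAL[1], must be non-zero */
} Rational;

/* Builds a Rational; the assert enforces the condition clause
   RATIONAL[1] <> 0. */
Rational rational_make(int num, int den)
{
    assert(den != 0);
    Rational r = { num, den };
    return r;
}

/* Operator mult. Postcondition: result.num == a.num * b.num and
   result.den == a.den * b.den. */
Rational rational_mult(Rational a, Rational b)
{
    return rational_make(a.num * b.num, a.den * b.den);
}
```

For example, multiplying 1/2 by 3/4 yields the pair <3, 8>. This sketch does not reduce results to lowest terms and does not guard against integer overflow.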
REAL LIFE APPLICATIONS
1. 2-D array for record keeping
2. Stack of plates at a function or party
3. Queue for a bus at a bus stand
4. Queue for buying tickets at a ticket counter
5. Tree for representing family relations
6. Graph for travelling between a number of cities
ARRAYS
An array is a finite, ordered collection of homogeneous data items.
Finite: It contains a fixed, countable number of elements.
Ordered: The elements are stored sequentially, in consecutive
memory locations.
Homogeneous: All elements have the same data type.
An array can contain one type of data only: all integers,
all floating-point numbers or all characters.
The elements of an array are referenced by an index set
consisting of n consecutive numbers.
The number n of elements is called the length or size of the array.
Length = UB - LB + 1
Where,
UB – largest index, called the upper bound
LB – smallest index, called the lower bound
Length = UB when LB = 1
The elements of an array A may be denoted by:
Subscript notation
A_1, A_2, A_3, ..., A_n
Parenthesis notation
A(1), A(2), ..., A(N)
Bracket notation
A[1], A[2], A[3], ..., A[N]
The number K in A[K] is called a subscript or index.
A[K] is called a subscripted variable.
Representation of Array
[Figure: example of an array representation]
TYPES OF ARRAY
One-Dimensional: An array that has only one dimension, such as a single
row or column, and holds a finite number of data items of the same
type, e.g. int A[5].
Two-Dimensional: An array that has two dimensions, rows and columns,
e.g. int A[2][4].
Multi-Dimensional: An array with more than one dimension, e.g. a 3D
array, which is a collection of 2D arrays (an array of arrays of
arrays): int A[2][4][5].
A 3D array is specified using three subscripts: block size (total
number of 2D arrays), row size and column size.
ONE-DIMENSIONAL ARRAYS
A one-dimensional array is one in which only one
subscript is needed to specify a particular
element of the array.
A one-dimensional array can be declared as:
data_type var_name[expression];
where data_type is the type of the elements to be stored in
the array,
var_name specifies the name of the array, and
expression specifies the number of values to
be stored in the array (its size).
Example: int num[5];
The array is named num and stores five integer values:
num[0] = 22, num[1] = 1, num[2] = 30, num[3] = 9, num[4] = 3
Index   0   1   2   3   4
num    22   1  30   9   3
The size of an array can be determined as:
Size = UB – LB + 1
For the array num, the size is:
UB = 4
LB = 0
Size = 4 – 0 + 1 = 5
Size of an array in bytes (i.e. the amount of
memory the array occupies):
Size in bytes = size of array x size of base
type
E.g. if we have the array
int num[5];
then size in bytes = 5 x 2 = 10 bytes (assuming a 2-byte int;
5 x 4 = 20 bytes where int is 4 bytes)
Similarly, if we have the array
float num[5];
then size in bytes = 5 x 4 = 20 bytes (assuming a 4-byte float)
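The size formula can be checked against the compiler with sizeof. The sketch below assumes nothing about the size of int; the helper array_size_in_bytes and the macro ARRAY_LENGTH are illustrative names, not part of any standard header.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes = number of elements x size of base type. */
size_t array_size_in_bytes(size_t element_count, size_t element_size)
{
    assert(element_size > 0);
    return element_count * element_size;
}

/* Number of elements of a true array (not a pointer):
   total bytes divided by the bytes of one element. */
#define ARRAY_LENGTH(a) (sizeof(a) / sizeof((a)[0]))
```

For int num[5], array_size_in_bytes(5, sizeof(int)) equals sizeof(num) on any platform, and ARRAY_LENGTH(num) gives 5.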
REPRESENTATION OF ARRAY IN MEMORY
Let LA be a linear array in memory.
LOC(LA[K]) = address of the element LA[K] of the array LA
The computer keeps track only of the address of the first
element of LA, called the base address and denoted
Base(LA).
To calculate the address of any element LA[K], the
formula is:
LOC(LA[K]) = Base(LA) + w(K - lower bound)
where w is the number of words per memory cell for LA.
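The address formula can be written as a small C function. This is a sketch that treats addresses as plain integers; the name linear_loc and the sample values in the note that follows are assumptions for illustration.

```c
#include <assert.h>

/* LOC(LA[K]) = Base(LA) + w * (K - LB)
   base: address of the first element
   w:    words (or bytes) per element
   k:    index of the wanted element
   lb:   lower bound of the index set */
long linear_loc(long base, long w, long k, long lb)
{
    assert(w > 0 && k >= lb);
    return base + w * (k - lb);
}
```

For instance, with Base(LA) = 200, w = 4 and LB = 0, the element LA[3] is at 200 + 4 x 3 = 212.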
OPERATIONS ON ARRAYS
Traversing
Accessing or processing (visiting) each
element of the array exactly once
Insertion
Inserting an element into the array
Deletion
Deleting an element from the array
Searching
Searching for an element in the array
Sorting
Arranging the elements of the array in order
ALGORITHM: TRAVERSING A LINEAR ARRAY
LA is a linear array with lower bound LB and
upper bound UB. This algorithm traverses LA
applying an operation PROCESS to each element
of LA.
Alternate algorithm
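A minimal C sketch of the traversal described above, first with an explicit counter and a while loop and then in the alternate for-loop form. PROCESS is modelled as a function pointer; the names traverse_while, traverse_for and double_element are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Traverses la[lb..ub] (inclusive bounds), applying process to each
   element exactly once. Counter-and-while form. */
void traverse_while(int la[], int lb, int ub, void (*process)(int *))
{
    assert(la != NULL && process != NULL);

    int k = lb;                /* initialize counter */
    while (k <= ub) {
        process(&la[k]);       /* visit element */
        k = k + 1;             /* increase counter */
    }
}

/* Alternate form: the same traversal written as a for loop. */
void traverse_for(int la[], int lb, int ub, void (*process)(int *))
{
    assert(la != NULL && process != NULL);

    for (int k = lb; k <= ub; k++)
        process(&la[k]);
}

/* Example PROCESS operation: doubles the element in place. */
void double_element(int *x)
{
    *x *= 2;
}
```

Both functions visit the elements in the same order; they differ only in how the loop is written.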
INSERTION INTO LINEAR ARRAY
(Inserting into a Linear Array) INSERT(LA, N, K, ITEM)
Here LA is a linear array with N elements and K is a positive integer. This
algorithm inserts an element ITEM into the Kth position in LA.
Steps are:
1. [Initialize counter] Set J = N
2. Repeat steps 3 and 4 while J >= K
3. [Move Jth element downward] Set LA[J+1] = LA[J]
4. [Decrease counter] Set J = J - 1
[End of step 2 loop]
5. [Insert element] Set LA[K] = ITEM
6. [Reset N] Set N = N + 1
7. Exit
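The INSERT steps can be sketched in C as follows. C arrays are indexed from 0, so this version takes a 0-based position k rather than the 1-based K of the algorithm; the function name insert_item and the explicit capacity parameter are assumptions added for safety.

```c
#include <assert.h>

/* Inserts item at 0-based position k of la, which currently holds
   *n elements and has room for capacity elements in total. */
void insert_item(int la[], int *n, int capacity, int k, int item)
{
    assert(*n < capacity);        /* room for one more element */
    assert(k >= 0 && k <= *n);    /* valid insertion position */

    int j = *n - 1;               /* index of the last element */
    while (j >= k) {
        la[j + 1] = la[j];        /* move element one place down */
        j = j - 1;                /* decrease counter */
    }
    la[k] = item;                 /* insert element */
    *n = *n + 1;                  /* reset the element count */
}
```

Starting the shift from the last element avoids overwriting values that have not yet been moved.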
DELETION FROM LINEAR ARRAY
TWO-DIMENSIONAL ARRAYS
A two-dimensional m × n array A is a collection of m·n
data elements.
Each element is specified by a pair of integers (such as J,
K), called subscripts, such that
1 ≤ J ≤ m and 1 ≤ K ≤ n
The element with subscripts J and K is denoted by
A_{J,K} or A[J,K]
Two-dimensional arrays are also called matrix arrays.
TWO-DIMENSIONAL ARRAY (TABLE)
        Col 0    Col 1    Col 2    Col 3    Col 4
Row 0   a[0][0]  a[0][1]  a[0][2]  a[0][3]  a[0][4]
Row 1   a[1][0]  a[1][1]  a[1][2]  a[1][3]  a[1][4]
Row 2   a[2][0]  a[2][1]  a[2][2]  a[2][3]  a[2][4]
a[3][5]; // a table with 3 rows, 5 columns
The first dimension denotes the row index, which
starts at 0.
The second dimension denotes the column index,
which also starts at 0.
TWO-DIMENSIONAL ARRAY
[Figure: a two-dimensional array A with 3 rows and 4 columns, columns numbered 1 to 4]
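REPRESENTATION OF 2-D ARRAY IN MEMORY
A 2D array is stored in memory as a linear sequence, either row by row
(row-major order) or column by column (column-major order).
ADDRESS CALCULATION IN 2D ARRAY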
The following formulas locate the address of a particular element.
Let A be a 2D array of size M × N. LOC(A[J,K]) is computed as
follows:
Row-major order
LOC(A[J,K]) = Base(A) + w[N(J-1) + (K-1)]
where Base(A) = address of the first element of A, w = number of words
per memory cell and N = total number of columns in the array
Column-major order
LOC(A[J,K]) = Base(A) + w[M(K-1) + (J-1)]
where Base(A) = address of the first element of A, w = number of words
per memory cell and M = total number of rows in the array
EXAMPLE
Consider the 25 x 4 matrix array SCORE.
Suppose Base(SCORE) = 200 and w = 4.
Calculate the address of SCORE[12,3],
i.e. the element in the 12th row and 3rd column,
using row-major order and column-major
order.
ANSWER
M = 25, N = 4, J = 12, K = 3, w = 4, Base(SCORE) = 200
USING ROW-MAJOR ORDER
LOC(SCORE[12,3]) = Base(SCORE) + w[N(J-1) + (K-1)]
= 200 + 4[4(12-1) + (3-1)]
= 200 + 4[4(11) + 2]
= 200 + 4[44 + 2]
= 200 + 4[46]
= 200 + 184
= 384
USING COLUMN-MAJOR ORDER
LOC(SCORE[12,3]) = Base(SCORE) + w[M(K-1) + (J-1)]
= 200 + 4[25(3-1) + (12-1)]
= 200 + 4[25(2) + 11]
= 200 + 4[50 + 11]
= 200 + 4[61]
= 200 + 244
= 444
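The two address formulas can be checked with a short C sketch. Addresses are treated as plain integers and subscripts are 1-based, as in the SCORE example; the function names loc_row_major and loc_col_major are assumptions made for illustration.

```c
#include <assert.h>

/* Row-major: LOC(A[J,K]) = Base(A) + w[N(J-1) + (K-1)],
   where n_cols is N, the number of columns. */
long loc_row_major(long base, long w, long n_cols, long j, long k)
{
    assert(w > 0 && j >= 1 && k >= 1 && k <= n_cols);
    return base + w * (n_cols * (j - 1) + (k - 1));
}

/* Column-major: LOC(A[J,K]) = Base(A) + w[M(K-1) + (J-1)],
   where m_rows is M, the number of rows. */
long loc_col_major(long base, long w, long m_rows, long j, long k)
{
    assert(w > 0 && k >= 1 && j >= 1 && j <= m_rows);
    return base + w * (m_rows * (k - 1) + (j - 1));
}
```

With Base = 200, w = 4, M = 25, N = 4, J = 12 and K = 3, loc_row_major returns 384 and loc_col_major returns 444, matching the hand calculation.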
MULTIDIMENSIONAL ARRAY
An array can be extended to any number
of dimensions. E.g. a 3D array may be defined
as:
int C[2][4][3];
Multidimensional arrays are also called arrays of
arrays.
Suppose C is a 3D (2 x 4 x 3) array. Then C
contains 2 x 4 x 3 = 24 elements. Each element is
located by three subscripts, which select the page,
the row and the column; as declared above, C consists
of 2 pages, each a 2D array of 4 rows and 3 columns.
MULTI-DIMENSIONAL ARRAYS
The length L_i of dimension i of C is the
number of elements in its index set, and
L_i can be calculated as
L_i = upper bound – lower bound + 1
For a given subscript K_i, the effective
index E_i of L_i is the number of indices
preceding K_i in the index set, and E_i can
be calculated from
E_i = K_i – lower bound
Then the address LOC(C[K_1, K_2, ..., K_N]) of
an arbitrary element of C can be obtained
from the formula
Column-major
Base(C) + w[((...((E_N L_{N-1} + E_{N-1}) L_{N-2} + E_{N-2}) L_{N-3} + ... + E_3) L_2 + E_2) L_1 + E_1]
Row-major
Base(C) + w[(...((E_1 L_2 + E_2) L_3 + E_3) L_4 + ... + E_{N-1}) L_N + E_N]
according to whether C is stored in column-major or
row-major order. Base(C) denotes the address of
the first element of C and w denotes the
number of words per memory location.
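The nested products in the two formulas can be evaluated with a simple loop. The sketch below takes the lengths and the effective indices as arrays (len[i] and e[i], stored 0-based, stand for the length and effective index of dimension i+1); the function names are assumptions for illustration.

```c
#include <assert.h>

/* Row-major: offset = (...((E1*L2 + E2)*L3 + E3)...)*LN + EN */
long loc_row_major_nd(long base, long w, int n,
                      const long len[], const long e[])
{
    assert(n >= 1 && w > 0);

    long offset = 0;
    for (int i = 0; i < n; i++) {
        assert(e[i] >= 0 && e[i] < len[i]);
        offset = offset * len[i] + e[i];
    }
    return base + w * offset;
}

/* Column-major: offset = (...((EN*L(N-1) + E(N-1))*L(N-2) + ...)*L1 + E1 */
long loc_col_major_nd(long base, long w, int n,
                      const long len[], const long e[])
{
    assert(n >= 1 && w > 0);

    long offset = 0;
    for (int i = n - 1; i >= 0; i--) {
        assert(e[i] >= 0 && e[i] < len[i]);
        offset = offset * len[i] + e[i];
    }
    return base + w * offset;
}
```

For N = 2 these reduce to the 2D formulas: with lengths {25, 4}, effective indices {11, 2}, base 200 and w = 4, the results are 384 (row-major) and 444 (column-major).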
ADVANTAGES OF ARRAY
An array represents multiple data items of the same type under
a single name.
It can be used to implement other data structures such as linked
lists, stacks, queues, trees and graphs.
Two-dimensional arrays are used to represent matrices.
Many databases include one-dimensional arrays whose
elements are records.
DISADVANTAGES OF ARRAY
We must know in advance how many elements
are to be stored in the array.
An array is a static structure, i.e. it is of fixed
size. The memory allocated to the array cannot be
increased or decreased.
Because the size is fixed, if we allocate more memory than
required, the extra memory space is wasted.
The elements of an array are stored in consecutive memory
locations, so insertion and deletion require shifting
elements and are time consuming.
STRUCTURE AND UNION
RECORDS
A record is a collection of related data items, each of
which is called a field.
A record is a collection of non-homogeneous data, i.e. the
data items in a record may have different types, whereas
an array is a collection of homogeneous data.
In the table below, each column (SID, Name, Add, Class, Branch)
is a field and each row is a record.
SID   Name    Add       Class     Branch
101   Amit    Delhi     B.Tech.   IT
102   Sumit   Panipat   B.Tech.   CSE
....  ....    ....      ....      ....
107   Anil    Karnal    MCA       -
STRUCTURE
A structure is a collection of data items that
may be of different data types.
The data items of a structure are called fields (members).
Each field has an identifier (i.e. a field name)
and a data type.
The general format of a structure is
struct structure_name
{
    data_type member_1;
    data_type member_2;
    ...
    data_type member_n;
};
where struct is a keyword and structure_name is the tag,
i.e. the name given to the structure
data type.
EXAMPLE
struct Student
{
    int roll;
    char name[20];
    int age;
};
DECLARING VARIABLES OF THE STRUCTURE
Once a structure is defined, you can
declare variables of that type.
Syntax
struct structure_name var1, var2, var3, ..., varn;
Example
struct Student S1, S2;
This declares the variables S1 and S2 of type struct Student.
ACCESSING A STRUCTURE
An individual member of a
structure is accessed by writing the
structure variable name followed
by the dot operator and the member
name.
Syntax
structure_variable.member_name
Example
S1.roll
S1.name
S1.age
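Putting the definition, declaration and member access together, a minimal sketch might look like the following. The helper make_student and the sample values used in the note are assumptions for illustration.

```c
#include <assert.h>
#include <string.h>

struct Student {
    int  roll;
    char name[20];
    int  age;
};

/* Fills in a Student using the dot operator on each member.
   The name is truncated if it does not fit into 19 characters. */
struct Student make_student(int roll, const char *name, int age)
{
    assert(name != NULL);

    struct Student s;
    s.roll = roll;
    strncpy(s.name, name, sizeof s.name - 1);
    s.name[sizeof s.name - 1] = '\0';   /* always terminate the string */
    s.age = age;
    return s;
}
```

After struct Student S1 = make_student(101, "Amit", 15); the members can be read back as S1.roll, S1.name and S1.age.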
ARRAY OF STRUCTURES
An array of structures can be
declared just like an ordinary array.
However, the structure has to be
defined before an array of its type
is declared.
struct Student
{
    int roll;
    char name[20];
    int age;
};
struct Student S[5];
An individual member of a structure in an
array of structures is accessed by writing
the array name, followed by the
subscript, followed by the dot operator and
the member name.
Syntax
array_name[subscript].member_name
Example
S[1].roll = 101;
S[1].age = 15;
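An array of structures is typically processed with a loop over the subscript. The sketch below searches such an array by roll number; the function name find_by_roll is an assumption made for the example.

```c
#include <assert.h>

struct Student {
    int  roll;
    char name[20];
    int  age;
};

/* Returns the index of the first student whose roll number matches,
   or -1 if there is no such student. */
int find_by_roll(const struct Student s[], int count, int roll)
{
    assert(count >= 0);

    for (int i = 0; i < count; i++) {
        if (s[i].roll == roll)   /* subscript, then dot, then member */
            return i;
    }
    return -1;
}
```

With the assignments from the example (S[1].roll = 101), find_by_roll(S, 5, 101) returns 1, provided the other elements have been initialized with different roll numbers.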
UNION
A union, like a structure, contains
members whose individual data
types may differ from one another.
Example
union Sample
{
    int x;
    float y;
    char z;
};
union Sample S;
DIFFERENCE BETWEEN STRUCTURE AND UNION
The members of a union share the same storage area
within the computer's memory, whereas
each member of a structure is
assigned its own unique storage area.
As a result, a union can hold a value for only one
of its members at a time.
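The difference in storage can be observed with sizeof and member addresses. The sketch below declares a structure and a union with the same three members; the type names SampleStruct and SampleUnion and the function name are assumptions for illustration, and the exact sizes depend on the compiler.

```c
#include <assert.h>
#include <stddef.h>

struct SampleStruct { int x; float y; char z; };
union  SampleUnion  { int x; float y; char z; };

/* Returns 1 if all members of the union start at the same address,
   i.e. they share one storage area. */
int union_members_share_storage(void)
{
    union SampleUnion u;

    /* The union must be large enough for its largest member. */
    assert(sizeof u >= sizeof u.y);

    return (void *)&u.x == (void *)&u.y &&
           (void *)&u.y == (void *)&u.z;
}
```

The structure needs room for all three members (plus any padding), while the union only needs room for its largest member, so sizeof(struct SampleStruct) is normally larger than sizeof(union SampleUnion).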