Data Structure
Data structures play a crucial role in the design and maintenance of an application program.
When we study data structures and algorithms, we are concerned with space (RAM) and
processing (CPU) efficiency. Both processing power and memory are finite, so programs
should be designed to use as little space and processing time as possible. RAM is
essentially a sequence of memory cells. Data structures are a key component of computer
science and are widely used in areas such as artificial intelligence and operating systems
to organize information in the digital space. Observe problems in depth and you can offer
the world solutions that no one has given before. Data structures and algorithms help us
understand the nature of a problem at a deeper level, and that understanding allows us to
write efficient computer programs.
The choice of a good data structure makes it possible to perform a variety of critical
operations effectively. An efficient data structure also uses minimal memory space and
execution time. A data structure is not only used for organising data; it is also used for
processing, retrieving, and storing data. Different basic and advanced types of data
structures are used in almost every program or software system, so we must have a good
knowledge of data structures.
Merging: This is the process of combining the records of two sorted files into a single
sorted file. Likewise, two lists of sorted data items can be combined to form a single list
of sorted data items.
Traversing: This means accessing each data item exactly once so that it can be
processed, for example, to print the names of all the students in a class.
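The merging operation can be sketched in C as follows. This is a minimal illustration, assuming the sorted data is held in two ascending int arrays rather than files; the function name merge_sorted and its parameters are hypothetical. It also shows traversal, since each input item is visited exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Merge two ascending arrays a (length na) and b (length nb) into out,
   which must have room for na + nb items. Each input item is visited once. */
void merge_sorted(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    assert(out != NULL);
    size_t i = 0, j = 0, k = 0;

    /* Repeatedly copy the smaller of the two front items. */
    while (i < na && j < nb) {
        if (a[i] <= b[j])
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }
    /* One input is exhausted; copy whatever remains of the other. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Because both inputs are already sorted, the merged result is sorted without any further sorting step.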
Data structures provide an easy way of organising, retrieving, managing, and storing
data.
Here is a list of the reasons data structures are needed.
They give different levels of organization to data.
They determine how data can be stored and accessed at the element level.
They provide operations on groups of data, such as adding an item or looking up the
highest-priority item.
They provide a means to manage huge amounts of data efficiently.
They provide fast searching and sorting of data.
They save storage memory space.
They give easy access to large databases.
Importance of Data Structure
Data structures help in the efficient storage of data on a storage device.
Data structures make it convenient to retrieve data from a storage
device.
Data structures provide effective and efficient processing of small as well as large
amounts of data.
Using a proper data structure can save the programmer a lot of time, and save
processing time during operations such as storage, retrieval, or processing of data.
Manipulation of large amounts of data can be carried out easily with a
good data structure approach.
Robustness: Generally, all computer programmers wish to produce software that
generates correct output for every possible input provided to it and executes
efficiently on all hardware platforms. This kind of robust software must be able to
manage both valid and invalid inputs. (Think of robustness like building a reliable ferry. You
want the software or computer program to work correctly no matter what inputs (like numbers,
words, etc.) you give it. It should handle both good and bad inputs without breaking or crashing.)
Adaptability: Software projects such as word processors, Web browsers, and
Internet search engines involve large software systems that must work correctly and
efficiently for many years. Moreover, software evolves due to ever-changing market
conditions or emerging technologies. (Imagine you have a toolbox with different tools
like hammers, screwdrivers, and pliers. Adaptability means you can adjust how you use these tools
or add new tools to fit new tasks or projects. Like software, a toolbox that’s adaptable stays
useful for a long time by adjusting to new needs or technologies.)
These are further divided into linear and non-linear data structures based on the structure
and arrangement of data.
Linear Data Structure
A data structure that maintains a linear relationship among its elements is called a linear
data structure. Here, the data is arranged in a linear fashion, although the arrangement in
memory may not be sequential.
Examples: arrays, linked lists, stacks, queues.
Array
An array, in general, refers to an orderly arrangement of data elements. An array is a type of
data structure that stores data elements in adjacent memory locations. It is considered a linear
data structure that stores elements of the same data type. Hence, it is also called a linear
homogeneous data structure.
When we declare an array, we can assign initial values to each of its elements by
enclosing the values in braces { }.
int Num [5] = { 26, 7, 67, 50, 66 };
This declaration will create an array as shown below:
Index   0    1    2    3    4
Num     26   7    67   50   66
Array
The number of values inside the braces { } must not exceed the number of elements that we
declare for the array inside the square brackets [ ]. In the example of array Num, we have
declared 5 elements, and in the list of initial values within braces { } we have specified 5
values, one for each element. After this declaration, array Num holds five integers, as
we have provided 5 initialization values. If fewer values are given, the remaining elements
are set to zero.
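As a small sketch of how such an array is used once declared, the functions below read its elements by index. The names array_sum and array_max_index are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return the sum of the n elements of arr, visiting indices 0 .. n-1. */
int array_sum(const int *arr, size_t n)
{
    assert(arr != NULL || n == 0);
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += arr[i];
    return total;
}

/* Return the index of the largest element; n must be at least 1. */
size_t array_max_index(const int *arr, size_t n)
{
    assert(arr != NULL && n > 0);
    size_t best = 0;
    for (size_t i = 1; i < n; i++)
        if (arr[i] > arr[best])
            best = i;
    return best;
}
```

With the array Num declared above, array_sum(Num, 5) returns 216 and array_max_index(Num, 5) returns 2, the index of the value 67.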
Arrays can be classified as one-dimensional array, two-dimensional array or
multidimensional array.
One-dimensional Array: It has only one row of elements, stored in consecutive, ascending
storage locations.
Two-dimensional Array: It consists of multiple rows and columns of data elements. It is
also called a matrix.
Multidimensional Array: A multidimensional array can be defined as an array of arrays.
Multidimensional arrays are not limited to two indices or two dimensions. They can
include as many indices as required.
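A two-dimensional array can be sketched in C as an array of rows. The example below assumes a fixed 2 x 3 matrix; the macros ROWS and COLS and the function names are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ROWS 2
#define COLS 3

/* Add up every element of a ROWS x COLS matrix, visiting it row by row. */
int matrix_sum(int m[ROWS][COLS])
{
    assert(m != NULL);
    int total = 0;
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            total += m[r][c];
    return total;
}

/* Write the transpose of in (ROWS x COLS) into out (COLS x ROWS). */
void matrix_transpose(int in[ROWS][COLS], int out[COLS][ROWS])
{
    assert(in != NULL && out != NULL);
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            out[c][r] = in[r][c];
}
```

Each element is addressed with two indices, m[r][c], one for the row and one for the column. An array with more dimensions simply adds more indices.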
Limitations:
Arrays are of fixed size.
Data elements are stored in contiguous memory locations, and a large enough
contiguous block may not always be available.
Insertion and deletion of elements can be costly because other elements must be
shifted from their positions.
However, these limitations can be overcome by using linked lists.
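The cost of shifting can be seen in a short sketch. It assumes a fixed-capacity int array whose current item count is tracked separately; array_insert and array_delete are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Insert value at index pos. arr holds *n items and has room for capacity
   items. Items at pos and after are shifted one slot to the right.
   Returns false if the array is full or pos is out of range. */
bool array_insert(int *arr, size_t *n, size_t capacity, size_t pos, int value)
{
    assert(arr != NULL && n != NULL);
    if (*n >= capacity || pos > *n)
        return false;
    for (size_t i = *n; i > pos; i--)
        arr[i] = arr[i - 1];
    arr[pos] = value;
    (*n)++;
    return true;
}

/* Delete the item at index pos by shifting later items one slot left.
   Returns false if pos is out of range. */
bool array_delete(int *arr, size_t *n, size_t pos)
{
    assert(arr != NULL && n != NULL);
    if (pos >= *n)
        return false;
    for (size_t i = pos; i + 1 < *n; i++)
        arr[i] = arr[i + 1];
    (*n)--;
    return true;
}
```

Inserting or deleting near the front moves almost every element, and an insertion fails outright once the fixed capacity is reached.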
Applications:
Storing a list of data elements belonging to the same data type
Auxiliary storage for other data structures
Storage of binary tree elements of fixed count
Storage of matrices
Linked List
A linked list is a data structure in which each data element contains a pointer or link to
the next element in the list. In a linked list, insertion and deletion of a data element
is possible at any position in the list. Also, the data elements need not be stored in
consecutive locations; space for each data item is allocated in its
own block of memory.
Thus, a linked list is considered a chain of data elements or records called nodes. Each
node in the list contains an information field and a pointer field. The information field
contains the actual data, and the pointer field contains the address of the next node in
the list.
A Linked List
The figure above represents a linked list with 4 nodes. Each node has two parts. The left part
of the node is the information part, which contains an entire record of data items,
and the right part is the pointer to the next node. The pointer of the last node
is a null pointer.
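A node with an information field and a pointer field might be sketched in C as below. This is an illustrative singly linked list of ints, not a definitive implementation; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int info;           /* information field: the actual data */
    struct node *next;  /* pointer field: address of the next node, or NULL */
};

/* Insert a new node at the front. Returns the new head, or the old head
   unchanged if memory allocation fails. */
struct node *list_push_front(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;
    n->info = value;
    n->next = head;
    return n;
}

/* Count the nodes by following next pointers until the null pointer. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Return the data in the first node; the list must not be empty. */
int list_front(const struct node *head)
{
    assert(head != NULL);
    return head->info;
}

/* Remove the first node whose info equals value, if there is one.
   Returns the (possibly changed) head. No other node is moved. */
struct node *list_remove(struct node *head, int value)
{
    struct node **link = &head;
    while (*link != NULL && (*link)->info != value)
        link = &(*link)->next;
    if (*link != NULL) {
        struct node *victim = *link;
        *link = victim->next;
        free(victim);
    }
    return head;
}

/* Release every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

Unlike an array, insertion and removal here only change pointers; no elements are shifted.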
Stacks
A stack is a linear data structure in which insertion and deletion of elements are done at
only one end, which is known as the top of the stack. A stack is called a last-in, first-out
(LIFO) structure because the last element added to the stack is the first element
deleted from the stack.
A stack is characterized by only two fundamental operations, push and pop. These
are the terms associated with a stack:
1. Push: the term used for inserting an element into the stack. The push operation adds
an item to the top of the stack, hiding any items already on the stack, or initializing
the stack if it is empty.
2. Pop: the term used for deleting an element from the stack. The pop operation removes
the item at the top of the stack and returns its value to the caller. A pop either
reveals previously concealed items or results in an empty stack.
A Stack
In the computer’s memory, stacks can be implemented using arrays or linked lists. Figure
above is a schematic diagram of a stack. Here, element FF is the top of the stack and
element AA is the bottom of the stack. Elements are added to the stack from the top.
Since it follows LIFO pattern, EE cannot be deleted before FF is deleted, and similarly
DD cannot be deleted before EE is deleted and so on.
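An array-based stack can be sketched as follows. The capacity of 16 and all names here are assumptions made for the example; a linked list could be used instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STACK_CAPACITY 16

struct stack {
    int items[STACK_CAPACITY];
    int top;   /* index of the top item, or -1 when the stack is empty */
};

void stack_init(struct stack *s)
{
    assert(s != NULL);
    s->top = -1;
}

bool stack_is_empty(const struct stack *s)
{
    assert(s != NULL);
    return s->top < 0;
}

/* Push: place value on top. Returns false if the stack is full. */
bool stack_push(struct stack *s, int value)
{
    assert(s != NULL);
    if (s->top + 1 >= STACK_CAPACITY)
        return false;
    s->items[++s->top] = value;
    return true;
}

/* Pop: remove the top item and store it in *out.
   Returns false if the stack is empty. */
bool stack_pop(struct stack *s, int *out)
{
    assert(s != NULL && out != NULL);
    if (s->top < 0)
        return false;
    *out = s->items[s->top--];
    return true;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1, which is the LIFO order described above.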
Applications:
Temporary storage structure for recursive operations
Auxiliary storage structure for nested operations, function calls,
deferred/postponed functions
Manage function calls
Evaluation of arithmetic expressions in various programming languages
Conversion of infix expressions into postfix expressions
Checking syntax of expressions in a programming environment
Matching of parentheses
String reversal
Solutions to problems based on backtracking
Depth-first search in graph and tree traversal
Operating System functions
UNDO and REDO functions in an editor.
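One of these applications, matching of parentheses, can be sketched with a small stack of characters. The function name brackets_balanced and the nesting limit MAX_DEPTH are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 64

/* Return true if every (, [ and { in text is closed by the matching
   bracket in the right order. Other characters are ignored. Nesting
   deeper than MAX_DEPTH is reported as unbalanced. */
bool brackets_balanced(const char *text)
{
    assert(text != NULL);
    char stack[MAX_DEPTH];
    size_t top = 0;   /* number of open brackets currently on the stack */

    for (const char *p = text; *p != '\0'; p++) {
        char c = *p;
        if (c == '(' || c == '[' || c == '{') {
            if (top == MAX_DEPTH)
                return false;
            stack[top++] = c;              /* push the opening bracket */
        } else if (c == ')' || c == ']' || c == '}') {
            if (top == 0)
                return false;              /* closing bracket with nothing open */
            char open = stack[--top];      /* pop the most recent opener */
            if ((c == ')' && open != '(') ||
                (c == ']' && open != '[') ||
                (c == '}' && open != '{'))
                return false;
        }
    }
    return top == 0;   /* balanced only if nothing is left open */
}
```

The most recently opened bracket must be the first one closed, which is why a LIFO structure fits this problem.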
Queues
A queue is a first-in, first-out (FIFO) data structure in which the element that is inserted
first is the first one to be taken out. The elements in a queue are added at one end called
the rear and removed from the other end called the front. Like stacks, queues can be
implemented by using either arrays or linked lists.
For example, when students wait in a queue to enter a bus, the first person in the queue is the
first person to enter the bus. An example of a queue in computing occurs in a time-sharing
system, in which programs with the same priority form a queue while waiting to be
executed.
More generally, a queue models any line of consumers for a resource where the
consumer that came first is served first. The difference between stacks and queues is in
removal: in a stack we remove the most recently added item; in a queue, we
remove the least recently added item.
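A queue built on an array can be sketched as a circular buffer. The capacity of 8 and the names below are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8

struct queue {
    int items[QUEUE_CAPACITY];
    int front;   /* index of the oldest item */
    int count;   /* number of items currently stored */
};

void queue_init(struct queue *q)
{
    assert(q != NULL);
    q->front = 0;
    q->count = 0;
}

/* Enqueue: add value at the rear. Returns false if the queue is full. */
bool queue_enqueue(struct queue *q, int value)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAPACITY)
        return false;
    int rear = (q->front + q->count) % QUEUE_CAPACITY;  /* wrap around */
    q->items[rear] = value;
    q->count++;
    return true;
}

/* Dequeue: remove the front item and store it in *out.
   Returns false if the queue is empty. */
bool queue_dequeue(struct queue *q, int *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return false;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QUEUE_CAPACITY;
    q->count--;
    return true;
}
```

Items leave in the same order they arrived. The modulo arithmetic lets the rear wrap to the start of the array, so freed slots at the front are reused.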
Figure below shows a queue with 4 elements, where 55 is the front element and 65 is the
rear element. Elements can be added from the rear and deleted from the front.
A Queue
Applications:
Breadth-first search in graphs
Operating system buffers such as a print buffer queue, or a keyboard buffer queue that
stores the keys pressed by users
Job scheduling, CPU scheduling, disk scheduling
Priority queues are used in file downloading operations in a browser
Data transfer between peripheral devices and the CPU
Interrupts generated by user applications for the CPU
Customer calls waiting to be handled in a BPO (call centre)
Operation
The basic operations on a queue are enqueue and dequeue. A deque (double-ended queue)
allows both operations at either end.
Enqueue means to add an item of data awaiting processing to a queue of such items.
Dequeue means to remove an item from a queue.
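A double-ended queue, in which items can be added and removed at both ends, might be sketched as a circular array. The capacity and all names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEQUE_CAPACITY 8

struct deque {
    int items[DEQUE_CAPACITY];
    int front;   /* index of the first item */
    int count;   /* number of stored items */
};

void deque_init(struct deque *d)
{
    assert(d != NULL);
    d->front = 0;
    d->count = 0;
}

/* Add value at the rear. Returns false if the deque is full. */
bool deque_push_back(struct deque *d, int value)
{
    assert(d != NULL);
    if (d->count == DEQUE_CAPACITY)
        return false;
    d->items[(d->front + d->count) % DEQUE_CAPACITY] = value;
    d->count++;
    return true;
}

/* Add value at the front. Returns false if the deque is full. */
bool deque_push_front(struct deque *d, int value)
{
    assert(d != NULL);
    if (d->count == DEQUE_CAPACITY)
        return false;
    d->front = (d->front + DEQUE_CAPACITY - 1) % DEQUE_CAPACITY;
    d->items[d->front] = value;
    d->count++;
    return true;
}

/* Remove the front item into *out. Returns false if the deque is empty. */
bool deque_pop_front(struct deque *d, int *out)
{
    assert(d != NULL && out != NULL);
    if (d->count == 0)
        return false;
    *out = d->items[d->front];
    d->front = (d->front + 1) % DEQUE_CAPACITY;
    d->count--;
    return true;
}

/* Remove the rear item into *out. Returns false if the deque is empty. */
bool deque_pop_back(struct deque *d, int *out)
{
    assert(d != NULL && out != NULL);
    if (d->count == 0)
        return false;
    *out = d->items[(d->front + d->count - 1) % DEQUE_CAPACITY];
    d->count--;
    return true;
}
```

Using only deque_push_back and deque_pop_front gives ordinary queue behaviour; using only deque_push_back and deque_pop_back gives stack behaviour.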
Trees
A tree is a non-linear data structure in which data is organized in branches. In some trees,
such as binary search trees, the data elements are arranged in sorted order. A tree imposes
a hierarchical structure on the data elements.
The figure below represents a tree which consists of 8 nodes. The root of the tree is the
node 60 at the top. Nodes 29 and 44 are the successors of node 60. Nodes 6, 4, 12
and 67 are terminal nodes, as they do not have any successors.
A Tree
A tree is a non-linear data structure widely used to represent data containing a
hierarchical relationship with a set of linked nodes. It is named a tree structure because
the classic representation resembles a tree, even though the chart is generally upside
down compared to an actual tree, with the root at the top and the leaves at the bottom.
A tree can be defined recursively as a collection of nodes (starting at a root node) where each
node is a data structure consisting of a value, together with a list of nodes (the children),
with the constraint that no node is duplicated.
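This definition can be sketched in C with a node that holds a value and a list of children. The example assumes the children are kept as a chain of siblings (a first-child / next-sibling layout), which is one common choice among several; all names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct tree_node {
    int value;
    struct tree_node *first_child;   /* leftmost child, or NULL for a leaf */
    struct tree_node *next_sibling;  /* next child of the same parent, or NULL */
};

/* Create a node with no children. Returns NULL if allocation fails. */
struct tree_node *tree_new(int value)
{
    struct tree_node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->value = value;
        n->first_child = NULL;
        n->next_sibling = NULL;
    }
    return n;
}

/* Attach child as the last child of parent. */
void tree_add_child(struct tree_node *parent, struct tree_node *child)
{
    assert(parent != NULL && child != NULL);
    struct tree_node **link = &parent->first_child;
    while (*link != NULL)
        link = &(*link)->next_sibling;
    *link = child;
}

/* Count the nodes in the subtree rooted at root: the node itself plus
   the sizes of its children's subtrees. */
size_t tree_size(const struct tree_node *root)
{
    if (root == NULL)
        return 0;
    size_t total = 1;
    for (const struct tree_node *c = root->first_child; c != NULL; c = c->next_sibling)
        total += tree_size(c);
    return total;
}

/* Release a node together with its children and following siblings. */
void tree_free(struct tree_node *root)
{
    if (root == NULL)
        return;
    tree_free(root->first_child);
    tree_free(root->next_sibling);
    free(root);
}
```

The recursive shape of tree_size mirrors the recursive definition: every child is itself the root of a smaller tree.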
Tree terminologies
As in a general graph, the points at which lines come together are called nodes. Much of
the remaining terminology of a tree is borrowed from two sources: the natural tree
and the family tree. From the natural tree, the topmost node of the tree is called the root,
the bottommost nodes are called leaves, and the lines connecting the nodes are called
branches. From the family tree, a node is said to be the parent of the nodes immediately
below it, which are said to be its children. Children of the same parent are said to be
siblings. A node's descendants are its children, its children's children, and so on. A node
of a tree together with all its descendants is itself a tree, which is said to be a subtree
of the original tree.
Parent node: A parent node is the immediate predecessor of a node. Example: A is the parent
of B, C, D, E, F, G; B is the parent of H, I, J; E is the parent of K, L; H is the parent of
N, O.
Leaf: A node which does not have any child is called a leaf. Example: C, D, F, G, I, J, N, O,
P, Q are leaf nodes.
Siblings: Nodes with the same parent are called siblings. Example: B, C, D, E, F, G are
siblings; H, I, J are siblings; N, O are siblings; K, L are siblings.
Height: Height is also known as the depth of the tree. The height of a tree is equal to one
more than the largest level number of the tree. Example: counting the root node A as ONE,
nodes B, C, D, E, F, G come at TWO, nodes H, I, J, K, L, M at THREE, and nodes N, O, P, Q at
FOUR, so the height of the above tree is 4.
Root: The root is a special node in a tree. The entire tree originates from it. It does not
have a parent. Example: node A.
Child node: All immediate successors of a node are its children. Example: H, I, J are
children of B.
Edge: An edge is a connection between one node and another. It is a line between two nodes,
or between a node and a leaf. Example: the line between A and B is an edge.
Level: The level of any node is defined as the distance of that node from the root. Example:
the level of root node A is zero; nodes B, C, D, E, F, G are at level 1; nodes H, I, J, K, L,
M are at level 2; nodes N, O, P, Q are at level 3.
Degree: The number of children of a node is called its degree. Example: the degree of node A
is 6 and the degree of node B is 3. The degree of nodes E and H is 2. The degree of nodes K
and L is 1.
Path / Traversing: A path is a sequence of successive edges from a source node to a
destination node. Example: A – B – H – N is the path from node A to node N.
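Several of these terms translate directly into small functions. The sketch below assumes a node stores a label plus links to its first child and next sibling; the struct and function names are hypothetical. Height is counted as in the definition above, one more than the largest level number, so a single node has height 1.

```c
#include <assert.h>
#include <stddef.h>

struct tnode {
    char label;
    struct tnode *first_child;   /* leftmost child, or NULL for a leaf */
    struct tnode *next_sibling;  /* next child of the same parent, or NULL */
};

/* Degree: the number of children of node. */
size_t tnode_degree(const struct tnode *node)
{
    assert(node != NULL);
    size_t degree = 0;
    for (const struct tnode *c = node->first_child; c != NULL; c = c->next_sibling)
        degree++;
    return degree;
}

/* Number of leaf nodes in the subtree rooted at node. */
size_t tnode_leaves(const struct tnode *node)
{
    if (node == NULL)
        return 0;
    if (node->first_child == NULL)
        return 1;                       /* no children: this node is a leaf */
    size_t total = 0;
    for (const struct tnode *c = node->first_child; c != NULL; c = c->next_sibling)
        total += tnode_leaves(c);
    return total;
}

/* Height of the subtree rooted at node, counted in levels starting at 1. */
size_t tnode_height(const struct tnode *node)
{
    if (node == NULL)
        return 0;
    size_t tallest = 0;
    for (const struct tnode *c = node->first_child; c != NULL; c = c->next_sibling) {
        size_t h = tnode_height(c);
        if (h > tallest)
            tallest = h;
    }
    return 1 + tallest;
}
```

For a root with degree 0 the functions report one leaf and height 1; each additional level below the root adds one to the height.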
Advantage: Provides quick search, insert, and delete operations
Graphs
A graph is also a non-linear data structure. In a tree data structure, all data elements are
stored in a definite hierarchical structure; in other words, each node except the root has
only one parent node. In a graph, by contrast, each data element is called a vertex and may
be connected to many other vertices through connections called edges.
Thus, a graph is considered a mathematical structure composed of a set of
vertices and a set of edges. The figure below shows a graph with six vertices A, B, C, D, E, F
and seven edges [A, B], [A, C], [A, D], [B, C], [C, F], [D, F] and [D, E].
Graph
Graphs can be used to model a wide variety of real-world systems, such as social networks,
transportation networks, and computer networks.
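The six-vertex graph described above can be stored as an adjacency matrix. This sketch assumes the edges are undirected, which the text does not state explicitly; the enum constants and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NUM_VERTICES 6

/* Vertices A..F of the example graph, mapped to indices 0..5. */
enum vertex { VA, VB, VC, VD, VE, VF };

struct graph {
    bool adjacent[NUM_VERTICES][NUM_VERTICES];  /* true if an edge joins u and v */
};

/* Start with no edges. */
void graph_init(struct graph *g)
{
    assert(g != NULL);
    memset(g->adjacent, 0, sizeof g->adjacent);
}

/* Add an undirected edge between u and v (assumption: edges have no direction). */
void graph_add_edge(struct graph *g, int u, int v)
{
    assert(g != NULL);
    assert(u >= 0 && u < NUM_VERTICES && v >= 0 && v < NUM_VERTICES);
    g->adjacent[u][v] = true;
    g->adjacent[v][u] = true;
}

/* Number of edges that touch vertex v. */
int graph_degree(const struct graph *g, int v)
{
    assert(g != NULL && v >= 0 && v < NUM_VERTICES);
    int degree = 0;
    for (int u = 0; u < NUM_VERTICES; u++)
        if (g->adjacent[v][u])
            degree++;
    return degree;
}

/* Build the graph with the seven edges listed in the text. */
void graph_build_example(struct graph *g)
{
    static const int edges[7][2] = {
        {VA, VB}, {VA, VC}, {VA, VD}, {VB, VC}, {VC, VF}, {VD, VF}, {VD, VE}
    };
    graph_init(g);
    for (int i = 0; i < 7; i++)
        graph_add_edge(g, edges[i][0], edges[i][1]);
}
```

In this graph vertex A touches three edges and vertex E only one. Unlike a tree, a vertex such as F is reachable along more than one route.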
ALGORITHM
An algorithm is a step-by-step procedure that defines a set of instructions to be executed
in a certain order to get the desired output. Algorithms are generally created independently
of the underlying language, i.e. an algorithm can be implemented in more than one
programming language.
Put another way, an algorithm is a series of steps, or a methodology, for obtaining the
solution to a well-defined problem.
Properties of an algorithm:
It is written in simple English.
Each step of an algorithm is unique and should be self-explanatory, i.e. it should be
unambiguous and precise.
An algorithm takes zero or more well-defined inputs.
An algorithm must have at least one output.
An algorithm has a finite number of steps, i.e. it must have an end point.
From the data structure point of view, following are some important categories of
algorithms −
Search − Algorithm to search an item in a data structure.
Sort − Algorithm to sort items in a certain order.
Insert − Algorithm to insert item in a data structure.
Update − Algorithm to update an existing item in a data structure.
Delete − Algorithm to delete an existing item from a data structure.
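Two of these categories, search and sort, can be sketched on a plain int array. Linear search and insertion sort are used here only because they are short; they are illustrative choices, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Search: return the index of the first item equal to key,
   or n if the key is not present. */
size_t linear_search(const int *items, size_t n, int key)
{
    assert(items != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (items[i] == key)
            return i;
    return n;
}

/* Sort: arrange items in ascending order using insertion sort.
   Each item is moved left until the items before it are no larger. */
void insertion_sort(int *items, size_t n)
{
    assert(items != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        int key = items[i];
        size_t j = i;
        while (j > 0 && items[j - 1] > key) {
            items[j] = items[j - 1];   /* shift the larger item right */
            j--;
        }
        items[j] = key;
    }
}
```

Insert, update and delete follow the same pattern: each is a short, well-defined procedure applied to the chosen data structure.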
Characteristics of an Algorithm
Not all procedures can be called algorithms. An algorithm should have the following
characteristics −
I. Clear and Unambiguous: The algorithm should be clear and unambiguous. Each of
its steps should be clear in all aspects and must lead to only one meaning.
II. Well-Defined Inputs: If an algorithm takes inputs, they should be
well-defined.
III. Well-Defined Outputs: The algorithm must clearly define what output will be
yielded, and the output should be well-defined as well.
IV. Finiteness: The algorithm must be finite, i.e. it should not end up in an infinite
loop or similar.
V. Feasible: The algorithm must be simple, generic and practical, such that it can be
executed with the available resources. It must not depend on some future
technology.
VI. Language Independent: The algorithm must be language independent,
i.e. it must consist of plain instructions that can be implemented in any language, and
yet the output will be the same, as expected.
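These characteristics can be checked against a concrete case. Euclid's algorithm for the greatest common divisor is used below purely as a familiar illustration; it is not taken from the text, and the function name gcd is hypothetical.

```c
#include <assert.h>

/* Greatest common divisor by Euclid's algorithm.
   Well-defined inputs: two non-negative integers, not both zero.
   Well-defined output: the largest integer that divides both.
   Finiteness: b becomes strictly smaller on every pass, so the loop ends. */
unsigned gcd(unsigned a, unsigned b)
{
    assert(a != 0 || b != 0);
    while (b != 0) {
        unsigned remainder = a % b;   /* one unambiguous step */
        a = b;
        b = remainder;
    }
    return a;
}
```

The same steps could be written in any programming language and would give the same result, for example 6 for the inputs 48 and 18.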
Different approaches to designing an algorithm
1. Top-Down Approach: A top-down approach starts by identifying the major components
of a system or program, decomposing them into their lower-level components, and iterating
until the desired level of module complexity is achieved. We start with the topmost
module and incrementally add the modules that it calls.
2. Bottom-Up Approach: A bottom-up approach starts with designing the most basic or
primitive components and proceeds to higher-level components. Starting from the very bottom,
operations that provide a layer of abstraction are implemented.
We design an algorithm to get a solution to a given problem. A problem can be solved in
more than one way.
Hence, many solution algorithms can be derived for a given problem. The next step is to
analyse the proposed solution algorithms and implement the most suitable one.
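As an illustration of one problem with more than one solution, consider checking whether a value is present in a sorted array. The two functions below are a sketch under that assumption (the array must be in ascending order for the second one); their names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Solution 1: examine the items one by one. Takes up to n comparisons. */
bool contains_linear(const int *items, size_t n, int key)
{
    assert(items != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (items[i] == key)
            return true;
    return false;
}

/* Solution 2: binary search. Requires items to be in ascending order.
   Each comparison discards half of the remaining range, so about
   log2(n) comparisons are needed. */
bool contains_binary(const int *items, size_t n, int key)
{
    assert(items != NULL || n == 0);
    size_t lo = 0, hi = n;             /* search the half-open range [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (items[mid] == key)
            return true;
        if (items[mid] < key)
            lo = mid + 1;              /* key can only be to the right */
        else
            hi = mid;                  /* key can only be to the left */
    }
    return false;
}
```

Both functions give the same answers on sorted input. Analysing them shows that the second needs far fewer comparisons on large arrays, while the first also works when the data is not sorted.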