Data Structures
WHAT THE COURSE IS ABOUT
Data structures is concerned with the
representation and manipulation of data.
All programs manipulate data.
So, all programs represent data in some way.
Data manipulation requires an algorithm.
We shall study ways to represent data and
algorithms to manipulate these representations.
The study of data structures is fundamental to
Computer Science & Engineering.
PREREQUISITE
C
Enumerated data type, void data type
typedef statement
Control statements
Use of memory by a program
Specification of pointers
Memory management functions
Problems with pointers
Various aspects of user defined functions
WHAT IS DATA STRUCTURE
“A Conceptual and concrete way to
organize data for efficient storage
and manipulation”
WHAT IS DATA STRUCTURE
A data structure is a logical and
mathematical model of a particular
organization of data.
The choice of a particular data structure
depends on the following considerations:
1. It must be able to represent the inherent
relationships of the data in the real world.
2. It must be simple enough that the data can
be processed efficiently as and when necessary.
OVERVIEW OF DATA
STRUCTURE
Basic Terms related to data organization
Data type
Meaning of data structure
Factors that influence the choice of data
structure
Different data structures
Various operations performed on data
structures
BASIC TERMS RELATED TO
DATA ORGANIZATION
Data:
Values or set of values.
Eg. Observation of experiment, marks obtained by student.
Data item:
A data item refers to a single unit of values.
Eg. Roll no., name, etc.
Entity:
Something that has certain attributes or properties which may be
assigned values.
Eg. Student is an entity
BASIC TERMS RELATED TO
DATA ORGANIZATION
Entity set:
Collection of similar entities.
Eg. Students of a class
Record:
Collection of related data items.
Eg. Rollno, Dob, gender, class of a particular student.
File:
Collection of related records.
Eg. A file containing records of all students in a class
BASIC TERMS RELATED TO
DATA ORGANIZATION
Key:
A key is a data item in a record that takes
unique values and can be used to
distinguish a record from other records.
Information:
Meaningful data: it conveys some meaning and
hence can be used for decision making.
DATA TYPE
A data type is a collection of values and a
set of operations that act on those values.
Classification of data type:
1. Primitive data type
2. Abstract data type
3. Polymorphic data type
PRIMITIVE DATA TYPE
A data type that is predefined by the language. It is also known as
a built-in data type.
Eg. C has the built-in data types int, long int,
float, double, char.
ABSTRACT DATA TYPE
In computing, an abstract data type (ADT) is a
specification of a set of data and the set of
operations that can be performed on the data. Such
a data type is abstract in the sense that it is
independent of various concrete implementations.
The main contribution of abstract data type
theory is that it
◦ (1) formalizes the definition of a type (which was only
intuitively hinted at in procedural programming),
◦ (2) does so on the basis of the information hiding principle, and
◦ (3) does so in a way that the formalization can be explicitly
represented in programming language notations and
semantics.
This advance in computer science theory (motivated by
software engineering challenges in procedural programming)
led to the emergence of the languages and methodological
principles of object-oriented programming.
POLYMORPHIC DATA TYPE
Very often in programs, a generic operation must be
performed on data of different types. For example, in our
bubble sort algorithm for the payroll records, when
elements were found out of order in the id[] array, we
needed to swap the integer elements in that array as well
as the float elements in the hrs[] and rate[] arrays. If we
decided to implement this swapping operation as a
function, we would need to write two functions: one to
swap integers, and another to swap floating point values;
even though the algorithm for swapping is the same in
both cases.
The C language provides a mechanism which allows us
to write a single swapping function which can be used on
any data type. This mechanism is called a polymorphic
data type, i.e. a data type which can be converted to
any distinct data type as required. In C it is provided by
the generic pointer type void *.
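The single swapping function described above could be sketched as follows. This is only an illustration: the name generic_swap, the size parameter and the 64-byte temporary buffer are assumptions made for the sketch, not part of the original payroll program.

```c
#include <stddef.h>
#include <string.h>

/* Swap two objects of `size` bytes each through generic void pointers.
   Returns 0 on success, or -1 if size exceeds the temporary buffer
   (the 64-byte limit is an arbitrary choice for this sketch). */
int generic_swap(void *a, void *b, size_t size)
{
    unsigned char tmp[64];

    if (size > sizeof tmp)
        return -1;
    if (a == b)
        return 0;              /* nothing to do; memcpy regions must not overlap */
    memcpy(tmp, a, size);      /* tmp = *a */
    memcpy(a, b, size);        /* *a  = *b */
    memcpy(b, tmp, size);      /* *b  = tmp */
    return 0;
}
```

The same function can then swap two int elements with generic_swap(&id[i], &id[j], sizeof id[i]) and two float elements with generic_swap(&hrs[i], &hrs[j], sizeof hrs[i]).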
THE STUDY OF DATA
STRUCTURE INCLUDES:
Logical description of data structure
Implementation of data structure
Quantitative analysis of data structure; this
includes the amount of memory and processing
time required
TYPES OF DATA STRUCTURES
1. Linear data structure
2. Non linear data structure
LINEAR DATA STRUCTURE
A data structure whose elements form a
sequence: every element in the
structure, except the first and the last, has a
unique predecessor and a unique successor.
Eg. Array, linked list, stack and queue.
NON LINEAR DATA STRUCTURE
A data structure whose elements do not
form a sequence: an element need not have a
unique predecessor or a unique successor.
Eg. Trees, graphs
ARRAYS
Collection of homogeneous data elements.
Arrays can be:
1. One dimensional
2. Two dimensional
3. Multi dimensional
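A minimal sketch of one dimensional and two dimensional arrays in C is given here. The names sum_1d, sum_2d, ROWS and COLS are chosen only for this illustration.

```c
#include <stddef.h>

#define ROWS 2
#define COLS 3

/* Sum the n elements of a one dimensional array. */
int sum_1d(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Sum every element of a two dimensional array with ROWS rows and COLS columns. */
int sum_2d(int m[ROWS][COLS])
{
    int total = 0;
    for (size_t r = 0; r < ROWS; r++)
        for (size_t c = 0; c < COLS; c++)
            total += m[r][c];    /* element in row r, column c */
    return total;
}
```

All elements have the same type (int here), and each one is reached by its index or pair of indices.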
LINKED LIST
Linked list
◦ Linear collection of self-referential structures,
called nodes
◦ Connected by pointer links
◦ Accessed via a pointer to the first node of the list
◦ Subsequent nodes are accessed via the link-pointer
member of the current node
◦ Link pointer in the last node is set to null to mark the
list’s end
Use a linked list instead of an array when
◦ You have an unpredictable number of data elements
◦ Your list must be kept in sorted order as elements are inserted
The Linked List data structure
(Figure: an array holding A, B, C at indices [0], [1], [2], compared with a linked list of nodes A, B, C joined by links.)
Linked lists are unbounded
(maximum number of items limited only by memory)
LINKED LIST
Types of linked lists:
◦ Singly linked list
Begins with a pointer to the first node
Terminates with a null pointer
Only traversed in one direction
◦ Circular, singly linked
Pointer in the last node points back to the first node
◦ Doubly linked list
Two “start pointers” – first element and last element
Each node has a forward pointer and a backward pointer
Allows traversals both forwards and backwards
◦ Circular, doubly linked list
Forward pointer of the last node points to the first node and backward
pointer of the first node points to the last node
LINKED LISTS (VARIATIONS)
Basic elements:
◦ Head node
◦ Node, with a data field and a pointer field
Simplest form: Linear-Singly-linked
(Figure: Head -> A -> B -> C)
LINKED LISTS (VARIATIONS)
Circular-linked Lists
(Figure: Head -> A -> B -> C, with C linked back to A)
◦ The last node points to the first node of the list
Strengths
◦ Able to traverse the list starting from any point
◦ Allows quick access to the first and last records through a single
pointer
Weakness
◦ Insertion is a bit more complicated; pointers need careful setting
for an empty or one-node list
LINKED LISTS (VARIATIONS)
Doubly-linked Lists
(Figure: Head -> A <-> B <-> C)
◦ Each inner node points to both its successor and its
predecessor
Strengths
◦ Able to traverse the list in any direction
◦ Can insert or delete a node very quickly given only that
node’s address
Weakness
◦ Requires extra memory and handling for additional pointers
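The strength "delete a node given only that node's address" can be sketched as below. The type dnode and the functions dl_push_front and dl_remove are hypothetical names used only for this illustration.

```c
#include <stdlib.h>

typedef struct dnode {
    int info;
    struct dnode *prev;    /* backward pointer */
    struct dnode *next;    /* forward pointer */
} dnode;

/* Insert a new node at the front of the list.
   Returns the new node, or NULL if allocation fails. */
dnode *dl_push_front(dnode **head, int info)
{
    dnode *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->info = info;
    n->prev = NULL;
    n->next = *head;
    if (*head != NULL)
        (*head)->prev = n;
    *head = n;
    return n;
}

/* Unlink and free `node`. No traversal is needed: only the
   neighbours (or the head pointer) are updated. */
void dl_remove(dnode **head, dnode *node)
{
    if (node->prev != NULL)
        node->prev->next = node->next;
    else
        *head = node->next;          /* node was the first node */
    if (node->next != NULL)
        node->next->prev = node->prev;
    free(node);
}
```

In a singly linked list the same deletion would first need a traversal to find the predecessor; here the prev pointer supplies it directly, at the cost of the extra pointer per node.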
LINKED LISTS (VARIATIONS)
Putting it together…
Circular-doubly-linked lists!
(Figure: Head -> A <-> B <-> C, with C and A also linked to each other)
STACK
Stack
◦ New nodes can be added and removed only at the top
◦ Similar to a pile of dishes
◦ Last-in, first-out (LIFO)
◦ Bottom of the stack is indicated by a link member set to NULL
◦ Constrained version of a linked list
push
◦ Adds a new node to the top of the stack
pop
◦ Removes a node from the top
◦ Stores the popped value
◦ Returns true if pop was successful
DATA STRUCTURES --
STACKS
A stack is a list in which insertion and
deletion take place at the same end
◦ This end is called top
◦ The other end is called bottom
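A minimal sketch of push and pop on a linked stack follows. The names stacknode, push and pop, and the choice of int data, are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct stacknode {
    int info;
    struct stacknode *next;    /* NULL marks the bottom of the stack */
} stacknode;

/* Add a new node to the top of the stack. Returns false if allocation fails. */
bool push(stacknode **top, int value)
{
    stacknode *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->info = value;
    n->next = *top;
    *top = n;
    return true;
}

/* Remove the top node and store its value in *value.
   Returns true if the pop was successful, false if the stack was empty. */
bool pop(stacknode **top, int *value)
{
    stacknode *old = *top;
    if (old == NULL)
        return false;
    *value = old->info;
    *top = old->next;
    free(old);
    return true;
}
```

Pushing 1, 2, 3 and then popping three times yields 3, 2, 1, which is the last-in, first-out order.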
QUEUES
Queue
◦ Similar to a supermarket checkout line
◦ First-in, first-out (FIFO)
◦ Nodes are removed only from the head
◦ Nodes are inserted only at the tail
Insert and remove operations
◦ Enqueue (insert) and dequeue (remove)
The Queue Operations
A queue is like a line of people waiting for a bank
teller. The queue has a front and a rear.
(Figure: a line of people waiting for a teller, with the front and rear of the queue marked.)
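The enqueue and dequeue operations can be sketched with a linked queue as below. The types queuenode and queue and the choice of int data are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

typedef struct queuenode {
    int info;
    struct queuenode *next;
} queuenode;

typedef struct {
    queuenode *front;    /* nodes are removed here */
    queuenode *rear;     /* nodes are inserted here */
} queue;

/* Insert a value at the rear. Returns false if allocation fails. */
bool enqueue(queue *q, int value)
{
    queuenode *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    n->info = value;
    n->next = NULL;
    if (q->rear == NULL)
        q->front = n;          /* queue was empty */
    else
        q->rear->next = n;
    q->rear = n;
    return true;
}

/* Remove the value at the front. Returns false if the queue is empty. */
bool dequeue(queue *q, int *value)
{
    queuenode *old = q->front;
    if (old == NULL)
        return false;
    *value = old->info;
    q->front = old->next;
    if (q->front == NULL)
        q->rear = NULL;        /* queue became empty */
    free(old);
    return true;
}
```

Enqueuing 1, 2, 3 and then dequeuing three times yields 1, 2, 3, which is the first-in, first-out order.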
TREES
Tree nodes contain two or more links
◦ The nodes of the singly linked structures discussed so far
contain only one
Binary trees
◦ All nodes contain two links
None, one, or both of which may be NULL
◦ The root node is the first node in a tree.
◦ Each link in the root node refers to a child
◦ A node with no children is called a leaf node
TREE TERMINOLOGY
There is a unique path from the root to each node.
The root is at level 0; a child is at level(parent) + 1.
The depth/height of a tree is the length of the
longest path from the root to a leaf.
(Figure: a tree with its levels numbered 0 to 4.)
BINARY TREE
Each node has two successors
◦ one called the left child
◦ one called the right child
◦ left child and/or right child may be empty
A binary tree is either
- empty or
- consists of a root and two
binary trees, one called
the left subtree and one
called the right subtree
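The recursive definition above maps directly onto recursive functions. The sketch below assumes a node type tnode with int data, and counts height in edges, so that an empty tree has height -1 and a single node has height 0; other conventions exist.

```c
#include <stddef.h>

typedef struct tnode {
    int info;
    struct tnode *left;     /* left subtree, or NULL if empty */
    struct tnode *right;    /* right subtree, or NULL if empty */
} tnode;

/* Height in edges of the longest path from the root to a leaf.
   By the convention assumed here, an empty tree has height -1. */
int tree_height(const tnode *root)
{
    if (root == NULL)
        return -1;
    int lh = tree_height(root->left);
    int rh = tree_height(root->right);
    return 1 + (lh > rh ? lh : rh);
}

/* Number of nodes: the root plus the nodes of both subtrees. */
size_t tree_count(const tnode *root)
{
    if (root == NULL)
        return 0;
    return 1 + tree_count(root->left) + tree_count(root->right);
}
```

Each function handles the empty tree first and then combines the results for the left and right subtrees, mirroring the two cases of the definition.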
GRAPHS
G = (V, E)
a vertex may have:
0 or more predecessors
0 or more successors
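One possible representation of a directed graph is an adjacency matrix, sketched below. The representation, the fixed vertex count NV and the function names are assumptions made for this illustration; adjacency lists are another common choice.

```c
#define NV 4    /* number of vertices, fixed for this sketch */

/* adj[i][j] is nonzero when there is an edge from vertex i to vertex j. */

/* Number of successors of v: edges leaving v. */
int out_degree(int adj[NV][NV], int v)
{
    int count = 0;
    for (int j = 0; j < NV; j++)
        if (adj[v][j])
            count++;
    return count;
}

/* Number of predecessors of v: edges entering v. */
int in_degree(int adj[NV][NV], int v)
{
    int count = 0;
    for (int i = 0; i < NV; i++)
        if (adj[i][v])
            count++;
    return count;
}
```

With edges 0 -> 1, 0 -> 2 and 1 -> 2, vertex 2 has two predecessors and no successors, and vertex 3 has neither.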
Abstract containers:
sequence/linear (1 to 1): first, ..., ith, ..., last
hierarchical (1 to many)
graph (many to many)
set
HEAPS
A heap is a binary tree that satisfies the following
properties:
Shape property
Order property
The shape property states that a heap is a complete or
nearly complete binary tree.
The order property states that either:
1. The element at every node is less than or equal to
each of its children (min heap), or
2. The element at every node is greater than or equal to
each of its children (max heap).
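The order property of a min heap can be checked as sketched below. The sketch assumes the usual array storage of a complete binary tree, in which the children of the element at index i are at indices 2i+1 and 2i+2; this storage scheme and the name is_min_heap are assumptions of the illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true if a[0..n-1], read as a complete binary tree stored
   level by level, satisfies the min heap order property. */
bool is_min_heap(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        size_t parent = (i - 1) / 2;
        if (a[parent] > a[i])    /* a parent must not exceed its child */
            return false;
    }
    return true;
}
```

Storing the tree level by level in an array also guarantees the shape property, since the array has no gaps.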
HASH TABLES
There are many applications that require a
dynamic structure that supports only insert,
search and delete operations. These
operations are commonly known as
dictionary operations. A hash table is an
effective data structure for implementing
dictionaries.
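The three dictionary operations can be sketched with a small hash table that resolves collisions by chaining. The bucket count NBUCKETS, the modulo hash function, the int keys and all names below are assumptions made for this illustration.

```c
#include <stdbool.h>
#include <stdlib.h>

#define NBUCKETS 7    /* arbitrary small size for this sketch */

typedef struct hnode {
    int key;
    struct hnode *next;
} hnode;

typedef struct {
    hnode *bucket[NBUCKETS];    /* each bucket is a linked list of keys */
} hash_table;

static size_t hash(int key)
{
    return (size_t)((unsigned)key % NBUCKETS);
}

/* Search: true if key is present. */
bool ht_search(const hash_table *t, int key)
{
    for (const hnode *p = t->bucket[hash(key)]; p != NULL; p = p->next)
        if (p->key == key)
            return true;
    return false;
}

/* Insert: false if key is already present or allocation fails. */
bool ht_insert(hash_table *t, int key)
{
    if (ht_search(t, key))
        return false;
    hnode *n = malloc(sizeof *n);
    if (n == NULL)
        return false;
    size_t h = hash(key);
    n->key = key;
    n->next = t->bucket[h];
    t->bucket[h] = n;
    return true;
}

/* Delete: false if key was not present. */
bool ht_delete(hash_table *t, int key)
{
    hnode **pp = &t->bucket[hash(key)];
    while (*pp != NULL) {
        if ((*pp)->key == key) {
            hnode *dead = *pp;
            *pp = dead->next;    /* unlink from the chain */
            free(dead);
            return true;
        }
        pp = &(*pp)->next;
    }
    return false;
}
```

A table must start with every bucket set to NULL, for example hash_table t = {{NULL}};. Keys that hash to the same bucket, such as 3, 10 and 17 here, simply share one chain.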
COMMON OPERATIONS ON
DATA STRUCTURE
1. Traversal:-Accessing each element exactly
once in order to process it.
2. Searching:-Finding the location of a given
element.
3. Insertion:-Adding the new element to the
structure.
4. Deletion:-Removing an existing element from
the structure.
5. Sorting:-Arranging the elements in logical
order.
6. Merging:-Combining the elements of two
similar sorted structures into a single structure.
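As an illustration of traversal and merging, the sketch below merges two sorted int arrays into one. The function name merge_sorted and the use of arrays are assumptions of the example; the caller must supply an output array with room for na + nb elements.

```c
#include <stddef.h>

/* Merge sorted arrays a[0..na-1] and b[0..nb-1] into out[0..na+nb-1],
   keeping the result sorted. Each input is traversed exactly once. */
void merge_sorted(const int *a, size_t na,
                  const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;

    while (i < na && j < nb)                 /* take the smaller front element */
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    while (i < na)                           /* copy what is left of a */
        out[k++] = a[i++];
    while (j < nb)                           /* copy what is left of b */
        out[k++] = b[j++];
}
```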
ENUMERATED DATA TYPE
Variables of an enumerated data type enhance the
readability of the program.
enum boolean {false, true};
Here boolean is called the tag name of the user
defined data type. We can then declare a variable
of this type as follows:
enum boolean flag;
We can then assign the value false or true to the variable
flag, and we can also compare the value of flag
with these values.
void DATA TYPE
Also known as the empty data type, void is useful in many situations:
1. void functionname(int x, int y)
{
}
functionname() does not return any value.
2. int functionname(void)
{
}
functionname() does not take any argument.
3. int main(void)
{
void *ptr;
int x=5;
ptr=&x;
printf("value pointed to by ptr is now %d",*(int*)ptr);
return 0;
}
A void pointer cannot be dereferenced directly without a type cast. This is
because the compiler cannot determine the size of the value the
pointer points to.
REDEFINING DATA TYPES
Using the typedef statement we can declare
variables of user defined data types in the same way as
variables of built-in data types, by giving the user
defined data type a name of our own.
typedef enum{false, true} boolean;
The word boolean becomes the name of the newly
defined data type.
boolean flag;
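A small sketch combining typedef, an enumerated type and a switch statement is given below. It uses a hypothetical type color rather than boolean, because newer C standards (C23) treat false and true as keywords, so they may not be usable as enumerator names with every compiler.

```c
/* Enumerators are numbered from 0 unless stated otherwise:
   RED == 0, GREEN == 1, BLUE == 2. */
typedef enum { RED, GREEN, BLUE } color;

/* Return a printable name for an enumerated value. */
const char *color_name(color c)
{
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    return "unknown";    /* value outside the enumeration */
}
```

A variable is then declared as color c = GREEN; with no enum keyword needed, and it can be compared with the enumerators, as in if (c == GREEN).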
CONTROL STATEMENTS
1. Decision making statements
if statement
if-else statement
switch statement
2. Looping Statements
for statement
while statement
do while statement
3. Jumping statements
break statement
continue statement
goto statement
MEMORY USE IN C
High memory
STACK   Initialized and uninitialized local variables
HEAP    Memory allocated with the malloc(), calloc(), and realloc() functions
BSS     Uninitialized static variables
CONST   Read-only variables
DATA    Initialized & uninitialized global variables & initialized static variables
TEXT    Program code
Low memory
POINTER
A pointer is a variable which contains
the address of another variable.
(Figure: after px=&x, the pointer px holds the address of x, and x holds the value of x.)
DECLARATION OF POINTER
“*” is used to declare and dereference
pointers.
data_type *ptvar;
int x;
int *ip = &x;
/* declare ip to be a pointer to an integer and make it point to x */
*ip=5;
/* assign 5 to the integer to which ip
points */
(Figure: ip holds the address of an integer that now contains 5.)
POINTER OPERATOR
Two operators:
&   address of
*   value at address
x=8;    /* let x be at address 100; x contains 8 */
ip=&x;  /* ip (at address 200) contains 100 */
a=*ip;  /* a (at address 250) contains 8 */
ASSIGNMENT IN POINTER
Given
int x;
double y;
int *a,*b;
double *c;
a=&x; /* a now points to x*/
b=a; /*b now points to the same variable as a
points */
c=&y; /* c points to y */
POINTER TO A POINTER
A variable that holds the address of another
variable, which in turn holds the address of
a third variable, is
known as a pointer to a pointer.
A pointer to a pointer is declared as
data_type **ptr;
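A common use of a pointer to a pointer is to let a function change the caller's pointer. The function name redirect below is a hypothetical name for this sketch.

```c
/* Make the caller's pointer (reached through pp) point at target. */
void redirect(int **pp, int *target)
{
    *pp = target;    /* *pp is the caller's int * variable */
}
```

Given int x = 1, y = 2; int *p = &x; int **pp = &p; the expression **pp reads x. After redirect(pp, &y) the pointer p points at y, so **pp reads y instead.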
DYNAMIC MEMORY
MANAGEMENT
Memory management functions
Function name   Description
malloc          Allocates memory from the heap
calloc          Allocates memory from the heap and initializes the allocated memory to zeros
realloc         Resizes an existing block, copying the contents to a new location if necessary
free            Deallocates a block allocated by the malloc, calloc and realloc functions
DYNAMIC MEMORY
ALLOCATION
Dynamic memory allocation
◦ Obtain and release memory during execution
malloc
◦ Takes number of bytes to allocate
Use sizeof to determine the size of an object
◦ Returns pointer of type void *
A void * pointer may be assigned to any pointer
If no memory available, returns NULL
◦ Example
newPtr = malloc( sizeof( struct node ) );
free
◦ Deallocates memory allocated by malloc
◦ Takes a pointer as an argument
◦ free ( newPtr );
calloc()
The calloc() function dynamically allocates
memory and automatically initializes the
memory to zeroes. Example
newPtr = calloc( 5, sizeof( struct node ) );
realloc()
The realloc() function changes the size of
memory previously allocated dynamically
with the malloc(), calloc() or realloc()
functions.
The prototype of the realloc() function is
void *realloc(void *block, size_t size);
It takes two arguments: the first argument is a
pointer to the original object and the second
argument is the new size of the object.
free()
The free() function deallocates a memory
block previously allocated with malloc(),
calloc(), or realloc() functions. Prototype
for free function is
void free(void *block);
It takes one argument that specifies the
pointer to the allocated block.
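The sketch below uses calloc(), realloc() and free() together. The function name make_and_grow and its parameters are hypothetical. It stores the result of realloc() in a second pointer first, so that the original block is not lost if the resize fails.

```c
#include <stdlib.h>

/* Allocate n zeroed ints, then grow the block to 2*n ints and set the
   new elements to `fill`. Assumes n > 0. Returns NULL on allocation
   failure; on success the caller must free() the returned block. */
int *make_and_grow(size_t n, int fill)
{
    int *a = calloc(n, sizeof *a);          /* n ints, all zero */
    if (a == NULL)
        return NULL;

    int *bigger = realloc(a, 2 * n * sizeof *a);
    if (bigger == NULL) {
        free(a);                            /* old block is still valid: release it */
        return NULL;
    }
    for (size_t i = n; i < 2 * n; i++)      /* the added part is uninitialized */
        bigger[i] = fill;
    return bigger;
}
```

Writing a = realloc(a, ...) directly would overwrite the only pointer to the block when realloc() returns NULL, which causes a memory leak.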
DEBUGGING POINTERS
Pointers can be the source of mysterious
and catastrophic program bugs.
Common bugs related to
pointers and memory management are
1. Dangling pointer
2. Null pointer assignment
3. Memory leak
4. Allocation failure
STRUCTURES
A structure is a collection of data elements,
called fields, which may be of different
types.
Individual elements of a structure variable
are accessed using the dot operator (.); if a
pointer is used to point to a structure
variable, then the arrow operator (->) is used.
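The two access operators can be sketched as follows. The structure student and its fields rollno and marks, and the function add_bonus, are assumptions made for this illustration.

```c
struct student {
    int rollno;
    float marks;
};

/* The function receives a pointer to a structure variable,
   so the arrow operator is used to reach its fields. */
void add_bonus(struct student *s, float bonus)
{
    s->marks += bonus;    /* same as (*s).marks += bonus */
}
```

With a structure variable struct student s = {7, 60.0f}; the fields are read as s.rollno and s.marks, and the call add_bonus(&s, 5.0f) updates s.marks through the pointer.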
STRUCTURE
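Example of a complex data structure and the
corresponding self-referential structure used to
represent it.
(Figure: head points to a chain of nodes holding 1200, 1201 and 1202; each node has an information field and a next pointer field, and the last next pointer is marked X, i.e. NULL.)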
STRUCTURE
typedef struct nodetype{
int info;
struct nodetype *next;
} node;
node * head;
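Using the node type declared above, a singly linked list can be built and traversed as sketched below. The functions insert_front, list_length and free_list are hypothetical helpers added for this illustration; the typedef is repeated so the sketch is self-contained.

```c
#include <stdlib.h>

typedef struct nodetype {
    int info;
    struct nodetype *next;
} node;

/* Insert a new node before the current first node.
   Returns 1 on success, 0 if allocation fails. */
int insert_front(node **head, int info)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->info = info;
    n->next = *head;    /* old first node becomes the second */
    *head = n;
    return 1;
}

/* Traverse the list, following next pointers until NULL. */
int list_length(const node *head)
{
    int count = 0;
    for (const node *p = head; p != NULL; p = p->next)
        count++;
    return count;
}

/* Release every node of the list. */
void free_list(node *head)
{
    while (head != NULL) {
        node *next = head->next;    /* save the link before freeing */
        free(head);
        head = next;
    }
}
```

Starting from node *head = NULL; three calls inserting 3, 2 and 1 produce the list 1, 2, 3, whose last next pointer is NULL.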