Data Structures
Data Structure:
1. Data may be organized in many different ways; the logical or mathematical model of a
particular organization of data is called a Data Structure.
3. A Data Structure is a specialized format for organizing and storing data. General data
structure types include the array, the file, the record, the table, the tree, and so on. Any
data structure is designed to organize data to suit a specific purpose so that it can be
accessed and worked with in appropriate ways. In computer programming, a data
structure may be selected or designed to store data for the purpose of working on it with
various algorithms.
The choice of a particular data model depends on two considerations. First, it must be rich
enough in structure to mirror the actual relationships of the data in the real world. On
the other hand, the structure should be simple enough that one can effectively process the data
when necessary.
A Primitive Data Structure is a basic data structure that can be operated on directly by
machine instructions.
e.g. int, float, double, character, pointer, Boolean.
A data structure whose elements are arranged linearly in memory is known as a
Linear Data Structure.
e.g. Array, Stack, Queue, Linked List
A list which displays the relationship of adjacency between elements is said to be linear.
The data appearing in our data structures are processed by means of certain operations. In
fact, the particular data structure that one chooses for a given situation depends largely on the
frequency with which specific operations are performed.
(1) Traversing: Accessing each record exactly once so that certain items in the record may
be processed. (This accessing and processing is sometimes called "visiting"
the record.)
(2) Searching: Finding the location of the record with a given key value, or finding the locations of
all records that satisfy one or more conditions.
(3) Inserting: Adding a new record to the structure.
(4) Deleting: Removing a record from the structure.
Sometimes two or more of the operations may be used in a given situation; e.g., we may want to
delete the record with a given key, which may mean we first need to search for the location of the
record. The following two operations, which are used in special situations, are also considered:
(1) Sorting: Arranging the records in some logical order (e.g., alphabetically according to some
NAME key, or in numerical order according to some NUMBER key, such as social
security number or account number)
(2) Merging: Combining the records in two different sorted files into a single sorted file
Other operations, e.g., copying and concatenation, are also used.
EXAMPLE :
An organization contains a membership file in which each record contains the following data for a
given member:
(a) Suppose the organization wants to announce a meeting through a mailing. Then one
would traverse the file to obtain Name and Address for each member.
(b) Suppose one wants to find the names of all members living in a certain area. Again one
would traverse the file to obtain the data.
(c) Suppose one wants to obtain Address for a given Name. Then one would search the file for
the record containing Name.
(d) Suppose a new person joins the organization. Then one would insert his or her record into
the file.
(e) Suppose a member dies. Then one would delete his or her record from the file.
(f) Suppose a member has moved and has a new address and telephone number. Given the
name of the member, one would first need to search for the record in the file. Then one would
perform the "update"- i.e., change items in the record with the new data.
(g) Suppose one wants to find the number of members 65 or older. Again one would traverse
the file, counting such members.
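The operations in this example can be sketched in Python. This is only an illustration: the field names (name, address, phone, age), the sample records and the function names are assumptions, not part of the original membership file.

```python
# Hypothetical membership file; field names and sample records are illustrative only.
members = [
    {"name": "Adams", "address": "12 Oak St", "phone": "555-0101", "age": 70},
    {"name": "Baker", "address": "9 Elm St", "phone": "555-0102", "age": 41},
    {"name": "Clark", "address": "3 Pine St", "phone": "555-0103", "age": 66},
]

def mailing_list(file):
    """Traverse: visit every record once and collect name and address."""
    return [(r["name"], r["address"]) for r in file]

def find(file, name):
    """Search: return the position of the record with the given name, or -1."""
    for i, r in enumerate(file):
        if r["name"] == name:
            return i
    return -1

def update_contact(file, name, address, phone):
    """Update: search for the record first, then change its items."""
    i = find(file, name)
    if i == -1:
        return False
    file[i]["address"] = address
    file[i]["phone"] = phone
    return True

def count_seniors(file, min_age=65):
    """Traverse: count the members who are min_age or older."""
    return sum(1 for r in file if r["age"] >= min_age)
```

Inserting a new member corresponds to members.append(record), and deleting a member corresponds to a search followed by del members[i].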
Concept of Array
Different operations performed using Arrays
This lesson discusses a very common linear structure called an array. Since arrays are
usually easy to traverse, search and sort, they are frequently used to store relatively permanent
collections of data.
LINEAR ARRAY:
A linear array is a list of a finite number n of homogeneous data elements (i.e., data elements of the
same type) such that:
(a) The elements of the array are referenced respectively by an index set consisting of n consecutive
numbers.
(b) The elements of the array are stored respectively in successive memory locations. The
number n of elements is called the length or size of the array. If not explicitly stated, we will
assume the index set consists of the integers 1, 2, . . . , n. In general, the length or the number of
data elements of the array can be obtained from the index set by the formula
Length = UB - LB + 1
where UB is the largest index, called the upper bound, and LB is the smallest index, called the lower
bound, of the array. Note that Length = UB when LB = 1.
The elements of an array A may be denoted by the subscript notation
A1, A2, A3, . . . , An
or by the bracket notation A[1], A[2], A[3], . . . , A[N].
We will usually use the subscript notation or the bracket notation. Regardless of the notation,
the number K in A[K] is called a subscript or an index and A[K] is called a subscripted
variable.
Note that subscripts allow any element of A to be referenced by its relative position in A.
EXAMPLE
(a) Let DATA be a 6-element linear array such that
DATA[1] = 247 DATA[2] = 56 DATA[3] = 429 DATA[4] = 135 DATA[5] = 87 DATA[6] = 156
(b) An automobile company uses an array AUTO to record the number of automobiles sold
each year from 1932 through 1984. Rather than beginning the index set with 1, it is more useful to
begin the index set with 1932 so that
AUTO[K] = number of automobiles sold in the year K
Then LB = 1932 is the lower bound and UB = 1984 is the upper bound of AUTO.
Length = UB - LB + 1 = 1984 - 1932 + 1 = 53
That is, AUTO contains 53 elements and its index set consists of all integers from 1932 through
1984.
Each programming language has its own rules for declaring arrays. Each such declaration must give,
implicitly or explicitly, three items of information: (1) the name of the array, (2) the data type of the
array and (3) the index set of the array.
Let LA be a linear array in the memory of the computer. Recall that the memory of the computer is
simply a sequence of addressed locations as pictured in the figure below. Let us use the notation
LOC(LA[K]) = address of the element LA[K] of the array LA
The address of any element of LA can be computed from the address of the first element of LA,
denoted Base(LA), by the formula
LOC(LA[K]) = Base(LA) + w(K - lower bound)
where w is the number of words per memory cell for the array LA. Observe that the time
to calculate LOC(LA[K]) is essentially the same for any value of K. Furthermore, given any
subscript K, one can locate and access the content of LA[K] without scanning any other element
of LA.
[Fig: Computer memory, a sequence of addressed locations 1000, 1001, 1002, 1003, 1004, . . .]
EXAMPLE
Consider the array AUTO in the previous example, which records the number of automobiles sold
each year from 1932 through 1984. Suppose AUTO appears in memory as pictured in the figure
below. That is, Base(AUTO) = 200, and w = 4 words per memory cell for AUTO. Then
LOC(AUTO[1965]) = Base(AUTO) + w(1965 - lower bound) = 200 + 4(1965 - 1932) = 332
Thus, the contents of this element can be obtained without scanning any other element in array
AUTO.
[Fig: Array AUTO in memory, beginning at address 200 with four words per element]
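The two formulas used for AUTO can be checked with a short Python sketch. The function names length and loc are illustrative choices, not part of the original notation.

```python
def length(lb, ub):
    """Number of elements in an array with lower bound lb and upper bound ub."""
    return ub - lb + 1

def loc(base, w, k, lb):
    """Address of the element with subscript k: Base + w * (k - lower bound)."""
    return base + w * (k - lb)

# Values taken from the AUTO example: index set 1932..1984, Base = 200, w = 4.
auto_length = length(1932, 1984)        # 53
auto_1965 = loc(200, 4, 1965, 1932)     # 332
```

The address costs one subtraction, one multiplication and one addition regardless of K, which is why the access time does not depend on the subscript.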
Remark: A collection A of data elements is said to be indexed if any element of A, which we shall
call AK, can be located and processed in a time that is independent of K. The above discussion
indicates that linear arrays can be indexed. This is a very important property of linear arrays. In
fact, linked lists, which are covered in the next section, do not have this property.
Let A be a collection of data elements stored in the memory of the computer. Suppose we want
to print the contents of each element of A or suppose we want to count the number of elements of A
with a given property. This can be accomplished by traversing A, that is, by accessing and processing
(frequently called visiting) each element of A exactly once.
The following algorithm traverses a linear array LA. The simplicity of the algorithm comes from the fact
that LA is a linear structure. Other linear structures, such as linked lists, can also be easily traversed.
On the other hand, the traversal of nonlinear structures, such as trees and graphs, is considerably
more complicated.
1. [Initialize counter]
   Set K := LB
2. Repeat Steps 3 and 4 while K ≤ UB
3. [Visit element]
   Apply PROCESS to LA[K]
4. [Increase counter]
   Set K := K + 1
   [End of Step 2 loop]
5. Exit
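The traversal steps above can be sketched in Python. Because Python lists always start at index 0, this sketch assumes the element with subscript K is stored at la[K - lb]; the function name traverse and the parameter process are illustrative.

```python
def traverse(la, lb, ub, process):
    """Apply process to each element LA[LB], ..., LA[UB] exactly once.

    Assumption: the element with subscript K is stored at la[K - lb].
    """
    k = lb                      # Step 1: initialize counter
    while k <= ub:              # Step 2: repeat while K <= UB
        process(la[k - lb])     # Step 3: visit element
        k += 1                  # Step 4: increase counter

# Example use: count the elements of DATA that exceed 100.
data = [247, 56, 429, 135, 87, 156]
big = []
traverse(data, 1, 6, lambda x: big.append(x) if x > 100 else None)
```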
Let A be a collection of data elements in the memory of the computer. "Inserting" refers to the
operation of adding another element to the collection A, and "deleting" refers to the operation of
removing one of the elements from A. This section discusses inserting and deleting when A is a linear
array.
Inserting an element at the "end" of a linear array can be easily done provided the memory space
allocated for the array is large enough to accommodate the additional element. On the other hand,
suppose we need to insert an element in the middle of the array. Then, on the average, half of the
elements must be moved downward to new locations to accommodate the new element and keep the
order of the other elements.
Similarly, deleting an element at the "end" of an array presents no difficulties, but deleting an element
somewhere in the middle of the array would require that each subsequent element be moved one
location upward in order to "fill up" the array.
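Deletion from the middle of an array can be sketched as follows. This is a minimal illustration, assuming a fixed-size Python list la whose first n slots are in use and a 1-based position k as in the text; the name delete_at is hypothetical.

```python
def delete_at(la, n, k):
    """Delete the Kth element (1-based) of la, which holds n elements.

    Returns the deleted item and the new number of elements.
    """
    if not 1 <= k <= n:
        raise IndexError("position out of range")
    item = la[k - 1]
    # Move each subsequent element one location upward: LA[J] := LA[J + 1]
    for j in range(k, n):
        la[j - 1] = la[j]
    la[n - 1] = None            # the last slot is now unused
    return item, n - 1
```

Deleting the last element (k = n) moves nothing, while deleting the first element moves all of the remaining n - 1 elements.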
EXAMPLE
Suppose TEST has been declared to be a 5-element array but data have been recorded only for
TEST[1], TEST[2] and TEST[3]. If X is the value of the next test, then one simply assigns
TEST[4]:= X
to add X to the list. Similarly, if Y is the value of the subsequent test, then we simply assign
TEST[5]:= Y
to add Y to the list. Now, however, we cannot add any new test scores to the list.
EXAMPLE
Suppose NAME is an 8-element linear array, and suppose five names are in the array, as
in Fig. (a). Observe that the names are listed alphabetically, and suppose we want to keep the array
names alphabetical at all times. Suppose Ford is added to the array. Then Johnson, Smith and Wagner
must each be moved downward one location, as in Fig. (b). Next suppose Taylor is added to the array;
then Wagner must be moved, as in Fig. (c). Last, suppose Davis is removed from the array. Then the five
names Ford, Johnson, Smith, Taylor and Wagner must each be moved upward one location, as in Fig.
(d). Clearly such movement of data would be very expensive if thousands of names were in the
array.
The following algorithm inserts a data element ITEM into the Kth position in a linear array LA
with N elements. The first four steps create space in LA by moving downward one location each element
from the Kth position on. We emphasize that these elements are moved in reverse order, i.e., first LA[N],
then LA[N - 1], . . . , and last LA[K]; otherwise data might be erased.
In more detail, we first set J:= N and then, using J as a counter, decrease J each time the loop is
executed until J reaches K. The next step, Step 5, inserts ITEM into the array in the space just created.
Before the exit from the algorithm, the number N of elements in LA is increased by 1 to account for the
new element.
Course: BCA Sub:Data Structure SEM- III
1. [Initialize counter]
   Set J := N
2. Repeat Steps 3 and 4 while J ≥ K
3. [Move Jth element downward]
   Set LA[J + 1] := LA[J]
4. [Decrease counter]
   Set J := J - 1
   [End of Step 2 loop]
5. [Insert element]
   Set LA[K] := ITEM
6. [Reset N]
   Set N := N + 1
7. Exit
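The insertion steps above can be sketched in Python. The sketch assumes a fixed-size list la whose first n slots are in use, with K given as a 1-based position, so LA[J] in the algorithm corresponds to la[j - 1] here. The overflow check and the name insert_at are additions for illustration.

```python
def insert_at(la, n, k, item):
    """Insert item at the Kth position (1-based) of la, which holds n elements.

    Returns the new number of elements.
    """
    if n >= len(la):
        raise OverflowError("no room left in the array")
    if not 1 <= k <= n + 1:
        raise IndexError("position out of range")
    j = n                       # Step 1: initialize counter
    while j >= k:               # Step 2: repeat while J >= K
        la[j] = la[j - 1]       # Step 3: LA[J + 1] := LA[J]
        j -= 1                  # Step 4: decrease counter
    la[k - 1] = item            # Step 5: insert element
    return n + 1                # Step 6: reset N
```

Moving the elements from the last one backward matters: copying forward from position K would overwrite LA[K + 1] before it had been moved.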
STACK:-
• What is a Stack?
• What operations can be performed on it?
• Applications
The linear lists and linear arrays allowed one to insert and delete elements at any place in the
list: at the beginning, at the end, or in the middle. There are certain frequent situations in
computer science when one wants to restrict insertions and deletions so that they can take
place only at the beginning or the end of the list, not in the middle. Two of the data structures
that are useful in such situations are stacks and queues.
A stack is a linear structure in which items may be added or removed only at one end.
Figure below pictures three everyday examples of such a structure: a stack of dishes, a stack of
pennies and a stack of folded towels. Observe that an item may be added or removed only from
the top of any of the stacks. This means, in particular, that the last item to be added to a stack
is the first item to be removed. Accordingly, stacks are also called last-in first-out (LIFO) lists.
Other names used for stacks are "piles" and "push-down lists." Although the stack may seem
to be a very restricted type of data structure, it has many important applications in computer
science.
Figure- Stack
Definitions:
A Stack is a list of elements in which an element may be inserted or deleted only at one end,
called “top” of the stack. This means, in particular, that elements are removed from a stack in
the reverse order of that in which they were inserted into the stack. It is also called Last In
First Out (LIFO).
Special terminology is used for four basic operations associated with stacks:
(a) "PUSH" is the term used to insert an element into a stack.
(d) “CHANGE” is used to change the value of Ith element from the top of the stack.
EXAMPLE:
Suppose the following 6 elements are pushed, in order, onto an empty stack:
AAA, BBB, CCC, DDD, EEE, FFF
Figure below shows three ways of picturing such a stack. For notational convenience, we will
frequently designate the stack by writing:
STACK: AAA, BBB, CCC, DDD, EEE, FFF
The implication is that the right-most element is the top element. We emphasize that, regardless of the
way a stack is described, its underlying property is that insertions and deletions can occur only at the
top of the stack. This means EEE cannot be deleted before FFF is deleted, DDD cannot be deleted before
EEE and FFF are deleted, and so on. Consequently, the elements may be popped from the stack only in
the reverse order of that in which they were pushed onto the stack.
Postponed Decisions
Stacks are frequently used to indicate the order of the processing of data when certain steps
of the processing must be postponed until other conditions are fulfilled. This is illustrated as follows.
Suppose that while processing some project A we are required to move on to project B, whose
completion is required in order to complete project A. Then we place the folder containing the data of A
onto a stack, as pictured in Fig. (a), and begin to process B. However, suppose that while processing B
we are led to project C, for the same reason. Then we place B on the stack above A, as pictured in Fig.
(b), and begin to process C. Furthermore, suppose that while processing C we are likewise led to project
D. Then we place C on the stack above B, as pictured in Fig. (c), and begin to process D.
On the other hand, suppose we are able to complete the processing of project D. Then the only project
we may continue to process is project C, which is on top of the stack. Hence we remove folder C from
the stack, leaving the stack as pictured in Fig. (d), and continue to process C. Similarly, after completing
the processing of C, we remove folder B from the stack, leaving the stack as pictured in Fig. (e), and
continue to process B. Finally, after completing the processing of B, we remove the last folder, A, from
the stack, leaving the empty stack pictured in Fig. (f), and continue the processing of our original
project A.
Observe that, at each stage of the above processing, the stack automatically maintains the order
that is required to complete the processing. An important example of such a processing in computer
science is where A is a main program and B, C and D are subprograms called in the order given.
Stacks may be represented in the computer in various ways, usually by means of a one-way list or a
linear array. Unless otherwise stated or implied, each of our stacks will be maintained by a linear array
STACK; a pointer variable TOP, which contains the location of the top element of the stack; and a
variable MAXSTK which gives the maximum number of elements that can be held by the stack. The
condition TOP = 0 or TOP = NULL will indicate that the stack is empty.
The figure pictures such an array representation of a stack. (For notational convenience, the
array is drawn horizontally rather than vertically.) Since TOP = 3, the stack has three elements, XXX,
YYY and ZZZ; and since MAXSTK = 8, there is room for 5 more items in the stack.
The operation of adding (pushing) an item onto a stack and the operation of removing (popping) an item
from a stack may be implemented, respectively, by the following procedures, called PUSH and POP. In
executing the procedure PUSH, one must first test whether there is room in the stack for the new item; if
not, then we have the condition known as overflow.
Analogously, in executing the procedure POP, one must first test whether there is an element in the
stack to be deleted; if not, then we have the condition known as underflow.
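The PUSH and POP procedures described above can be sketched in Python. This is an illustrative version, not the original procedures: the class name ArrayStack is an assumption, slot 0 of the array is left unused so that TOP = 0 can mean "empty", and overflow and underflow are reported by raising exceptions.

```python
class ArrayStack:
    """Stack kept in a linear array with a TOP pointer and a MAXSTK limit."""

    def __init__(self, maxstk):
        self.maxstk = maxstk
        self.stack = [None] * (maxstk + 1)   # slot 0 is unused
        self.top = 0                         # TOP = 0 means the stack is empty

    def push(self, item):
        if self.top == self.maxstk:          # no room for the new item
            raise OverflowError("stack overflow")
        self.top += 1
        self.stack[self.top] = item

    def pop(self):
        if self.top == 0:                    # nothing to delete
            raise IndexError("stack underflow")
        item = self.stack[self.top]
        self.top -= 1
        return item

# The stack of the figure: XXX, YYY, ZZZ with MAXSTK = 8.
s = ArrayStack(8)
for name in ("XXX", "YYY", "ZZZ"):
    s.push(name)
```

With this stack, s.push("WWW") sets TOP to 4, and a pop on the original three-element stack returns ZZZ and leaves TOP = 2.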
EXAMPLE
(a) Consider the stack in previous figure. We simulate the operation PUSH (STACK, WWW):
(b) Consider again the same stack. This time we simulate the operation POP (STACK, ITEM):
Observe that STACK[TOP] = STACK[2] = YYY is now the top element in the stack.
Applications of Stacks
• Polish Notations
• Recursion
POLISH NOTATION
For most common arithmetic operations, the operator symbol is placed between its two operands.
For example,
(A+B)*C and A + (B * C)
Polish notation, named after the Polish mathematician Jan Lukasiewicz, refers to the notation in
which the operator symbol is placed before its two operands. For example,
The fundamental property of Polish notation is that the order in which the operations are to be performed
is completely determined by the positions of the operators and operands in the expression. Accordingly,
one never needs parentheses when writing expressions in Polish notation.
Reverse Polish notation (postfix or suffix) refers to the analogous notation in which the operator
symbol is placed after its two operands:
Again, one never needs parentheses to determine the order of the operations in any arithmetic expression
written in reverse Polish notation. This notation is frequently called postfix (or suffix) notation, whereas
prefix notation is the term used for Polish notation, discussed in the preceding paragraph.
The computer usually evaluates an arithmetic expression written in infix notation in two steps. First, it
converts the expression to postfix notation, and then it evaluates the postfix expression. In each step,
the stack is the main tool that is used to accomplish the given task.
We note that, when Step 5 is executed, there should be only one number on STACK.
Example:
Consider the following arithmetic expression P written in postfix notation:
P : 5, 6, 2, +, *, 12, 4, /, -, )
(Commas are used to separate the elements of P so that 5, 6, 2 is not interpreted
as the number 562.)
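The evaluation of P with a stack can be sketched in Python. This is a minimal illustration of the usual stack method, assuming only the four binary operators +, -, *, / and treating the final right parenthesis as a sentinel that ends the scan; the names eval_postfix and OPS are illustrative.

```python
import operator

# Binary operators supported by this sketch.
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of string tokens."""
    stack = []
    for tok in tokens:
        if tok == ")":                  # sentinel: end of the expression
            break
        if tok in OPS:
            a = stack.pop()             # top element
            b = stack.pop()             # next-to-top element
            stack.append(OPS[tok](b, a))
        else:
            stack.append(float(tok))    # operand: push it onto the stack
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]

p = ["5", "6", "2", "+", "*", "12", "4", "/", "-", ")"]
value = eval_postfix(p)                 # 5 * (6 + 2) - 12 / 4 = 37.0
```

Note the order of the operands: the next-to-top element is the left operand, which matters for subtraction and division.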
We also assume that operators on the same level, including exponentiations, are performed
from left to right unless otherwise indicated by parentheses. (This is not standard, since
expressions may contain unary operators and some languages perform the exponentiations
from right to left. However, these assumptions simplify our algorithm.)
The following algorithm transforms the infix expression Q into its equivalent postfix
expression P. The algorithm uses a stack to temporarily hold operators and left parentheses.
The postfix expression P will be constructed from left to right using the operands from Q and
the operators, which are removed from STACK. We begin by pushing a left parenthesis onto
STACK and adding a right parenthesis at the end of Q. The algorithm is completed when
STACK is empty.
Algorithm: POLISH(Q, P)
Suppose Q is an arithmetic expression written in infix notation. This
algorithm finds the equivalent postfix expression P.
Q: A + (B * C - (D / E ↑ F) * G) * H
P: A B C * D E F ↑ / G * - H * +
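The transformation described above can be sketched in Python. This is an illustrative version under stated assumptions, not the original algorithm text: operands are single letters, the expression is well formed, and, as assumed in this section, operators of the same level (including ↑) are taken from left to right. The names to_postfix and PRECEDENCE are hypothetical.

```python
# Three levels of precedence; "^" is accepted as an alternative to "↑".
PRECEDENCE = {"↑": 3, "^": 3, "*": 2, "/": 2, "+": 1, "-": 1}

def to_postfix(q):
    """Convert a well-formed infix expression q into postfix notation."""
    stack = ["("]                               # push a left parenthesis onto STACK
    p = []
    tokens = list(q.replace(" ", "")) + [")"]   # add a right parenthesis to the end of Q
    for tok in tokens:
        if tok == "(":
            stack.append(tok)
        elif tok == ")":
            # Pop operators to P until the matching left parenthesis.
            while stack[-1] != "(":
                p.append(stack.pop())
            stack.pop()                         # discard the left parenthesis
        elif tok in PRECEDENCE:
            # Pop operators of the same or higher precedence, then push tok.
            while stack[-1] != "(" and PRECEDENCE[stack[-1]] >= PRECEDENCE[tok]:
                p.append(stack.pop())
            stack.append(tok)
        else:
            p.append(tok)                       # operand goes straight to P
    return " ".join(p)

result = to_postfix("A+(B*C-(D/E↑F)*G)*H")
# result == "A B C * D E F ↑ / G * - H * +"
```

When the final right parenthesis has been processed, STACK is empty, which is the completion condition stated above.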
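RECURSION:
A procedure is said to be recursive if it calls itself, directly or indirectly. So that the procedure
does not continue to run indefinitely, it must have the following two properties: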
(1) There must be certain criteria, called base criteria, for which the procedure does not
call itself.
(2) Each time the procedure does call itself (directly or indirectly), it must be closer to the
base criteria.
A recursive procedure with these two properties is said to be well-defined.
Similarly, a function is said to be recursively defined if the function definition refers to
itself. Again, in order for the definition not to be circular, it must have the following two
properties:
(1) There must be certain arguments, called base values, for which the function does not
refer to itself.
(2) Each time the function does refer to itself, the argument of the function must be closer
to a base value.
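The factorial function is a common illustration of these two properties; it is used here only as an example and the function name is an assumption. The base value is n = 0, and each recursive reference uses the argument n - 1, which is closer to 0.

```python
def factorial(n):
    """Recursively defined factorial for a non-negative integer n."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return 1                     # base value: no reference to itself
    return n * factorial(n - 1)      # the argument n - 1 is closer to the base value
```

For example, factorial(4) is computed as 4 * factorial(3), and the chain of calls ends when the argument reaches 0.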
QUEUE
• What is a Queue?
• What operations can be performed on it?
A queue is a linear list in which items may be added only at one end and items may be
removed only at the other end.
The name "queue" likely comes from the everyday use of the term. Consider a queue of
people waiting at a bus stop, as pictured in the figure below. Each new person who comes takes
his or her place at the end of the line, and when the bus comes, the people at the front of
the line board first. Clearly, the first person in the line is the first person to leave. Thus
queues are also called first-in first-out (FIFO) lists.
A queue is a linear list of elements in which deletions can take place only at one end,
called the front, and insertions can take place only at the other end, called the rear.
The terms "front" and "rear" are used in describing a linear list only when it is
implemented as a queue.
Queues are also called first-in first-out (FIFO) lists, since the first element in a queue
will be the first element out of the queue. In other words, the order in which elements
enter a queue is the order in which they leave. This contrasts with stacks, which are last-
in first-out (LIFO) lists.
Representation of Queues
Queues may be represented in the computer in various ways, usually by means of one-
way lists or linear arrays. Unless otherwise stated or implied, each of our queues will be
maintained by a linear array QUEUE and two pointer variables: FRONT, containing the
location of the front element of the queue; and REAR, containing the location of the rear
element of the queue. The condition FRONT = NULL will indicate that the queue is empty.
Figure below indicates the way elements will be deleted from the queue and the way new
elements will be added to the queue. Observe that whenever an element is deleted from
the queue, the value of FRONT is increased by 1; this can be implemented by the
assignment
FRONT := FRONT + 1
Similarly, whenever an element is added to the queue, the value of REAR is increased by
1; this can be implemented by the assignment
REAR := REAR + 1
This means that after N insertions, the rear element of the queue will occupy QUEUE[N]
or, in other words, eventually the queue will occupy the last part of the array. This occurs
even though the queue itself may not contain many elements.
Suppose we want to insert an element ITEM into a queue at the time the queue does
occupy the last part of the array, i.e., when REAR = N. One way to do this is to simply
move the entire queue to the beginning of the array, changing FRONT and REAR
accordingly, and then inserting ITEM as above. This procedure may be very expensive.
The procedure we adopt is to assume that the array
DEQUE:
A deque (pronounced either "deck" or "dequeue") is a linear list in which elements can
be added or removed at either end but not in the middle. The term deque is a
contraction of the name double-ended queue.
There are various ways of representing a deque in a computer. Unless otherwise
stated or implied, we will assume our deque is maintained by a circular array DEQUE
with pointers LEFT and RIGHT, which point to the two ends of the deque. We assume
that the elements extend from the left end to the right end in the array. The term
"circular" comes from the fact that we assume that DEQUE[1] comes after DEQUE[N] in
the array. Figure below pictures two deques, each with 4 elements maintained in an array
with N = 8 memory locations. The condition LEFT = NULL will be used to indicate that a
deque is empty.
The procedures which insert and delete elements in deques, and the variations on
those procedures, are given as supplementary problems. As with queues, a complication
may arise (a) when there is overflow, that is, when an element is to be inserted into a
deque which is already full, or (b) when there is underflow, that is, when an element is to
be deleted from a deque which is empty. The procedures must consider these possibilities.
Circular Queue:
As the above discussion shows, a linear queue frequently runs into the problem of overflow:
once REAR reaches the end of the array, a new element cannot be inserted even though
locations at the beginning of the array may have been freed by deletions. This problem can be
solved by using a circular queue instead of a linear queue.
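A circular queue can be sketched in Python as follows. This is an illustration under stated assumptions: the array uses 0-based positions, None plays the role of NULL for FRONT and REAR, and the class and method names are hypothetical.

```python
class CircularQueue:
    """Queue kept in a circular array of n locations."""

    def __init__(self, n):
        self.n = n
        self.queue = [None] * n
        self.front = None                    # None plays the role of NULL
        self.rear = None

    def insert(self, item):
        # Overflow only when the location after REAR is FRONT, i.e. all n slots are used.
        if self.front is not None and (self.rear + 1) % self.n == self.front:
            raise OverflowError("queue overflow")
        if self.front is None:               # queue was empty
            self.front = self.rear = 0
        else:
            self.rear = (self.rear + 1) % self.n   # wrap around past the last slot
        self.queue[self.rear] = item

    def delete(self):
        if self.front is None:
            raise IndexError("queue underflow")
        item = self.queue[self.front]
        if self.front == self.rear:          # the queue had only one element
            self.front = self.rear = None
        else:
            self.front = (self.front + 1) % self.n
        return item
```

After three insertions and one deletion in a queue of size 3, a fourth insertion succeeds because REAR wraps around to the freed first location, which a linear queue would report as overflow.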
PRIORITY QUEUE:
A priority queue is a collection of elements such that each element has been assigned a
priority and such that the order in which elements are deleted and processed comes from
the following rules:
(1) An element of higher priority is processed before any element of lower priority.
(2) Two elements with the same priority are processed according to the order in which
they were added to the queue.
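The two rules can be sketched in Python with the standard heapq module. This is one possible representation, not one prescribed by these notes. It assumes that a smaller priority number means a higher priority, and it uses an insertion counter to break ties so that elements of equal priority leave in the order in which they were added; the class and method names are illustrative.

```python
import heapq
import itertools

class PriorityQueue:
    """Priority queue in which a smaller number means a higher priority (assumption)."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()    # records the order of insertion

    def insert(self, item, priority):
        # The counter breaks ties, so rule (2) holds for equal priorities.
        heapq.heappush(self._heap, (priority, next(self._counter), item))

    def delete(self):
        if not self._heap:
            raise IndexError("priority queue is empty")
        _priority, _order, item = heapq.heappop(self._heap)
        return item

pq = PriorityQueue()
for name, prio in (("A", 2), ("B", 1), ("C", 2), ("D", 1)):
    pq.insert(name, prio)
order = [pq.delete() for _ in range(4)]      # B, D, A, C
```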