Data Structures and
Algorithms
--- An introduction
ME CSE I semester
Aug - Dec 2008
--- Mrs. R. Kanchana
Objectives
• To introduce you to a systematic study of
algorithms and data structures.
• The two guiding principles of the course are:
abstraction and formal analysis.
• Abstraction: We focus on topics that are broadly
applicable to a variety of problems.
• Analysis: We want a formal way to compare two
objects (data structures or algorithms).
• In particular, we will focus on correctness for every
input ("always correct"-ness), and on worst-case bounds
on time and memory (space).
18-Aug-08 CS1602 Data Structures and Algorithms 2
ME I sem Aug - Dec 2008
Course Outline
• C++ Review
• Algorithms and Analysis
• Lists, Stacks, Queues
• Trees
• Hashing
• Sorting, Searching
• Graphs
• Memory Management
• Design Techniques
Lecture Format
• Feel free to interrupt to ask questions
• Lectures:
– Slides are available at least one day before the lecture
– It is important to attend the lectures (not all material is
covered in the slides)
– If you miss any lectures, learn from your friends
• Tutorials
– Supplement the lectures
– Some important exercises
• Programming and homework assignments
– More rigorous problems to consolidate your knowledge
Assignments
• Written homework / Tutorials
– Due date and time specified
– Have to come prepared with solutions before the
tutorial class
• Programming assignments
– Deadlines and instructions must be followed strictly
– Have to come with the working program so that
demo can be given during the Lab session
Late Policy
• For written assignments, 20% will be
deducted for submissions that are one day late.
Assignments more than one day late will not be
accepted.
• For programming assignments, you are
allowed to submit ONE assignment late
(up to 1 week) among the three
assignments.
Plagiarism Policy
• 1st Time: both get 0
• 2nd Time: -20 marks in the model exam
• 3rd Time: Fail in the Internal marks
You are encouraged to collaborate in study groups.
However, you may not copy other students’ solutions
or code, either directly or with slight changes.
Foundations of Algorithm
Analysis and Data Structures.
• Analysis:
– How to predict an algorithm’s performance
– How well an algorithm scales up
– How to compare different algorithms for a problem
• Data Structures
– How to efficiently store, access, manage data
– Data structures affect an algorithm’s performance
Algorithm Analysis
• Space complexity
– How much space is required
• Time complexity
– How much time does it take to run the algorithm
• Often, we deal with estimates!
Space Complexity
• Space complexity = The amount of memory
required by an algorithm to run to completion
– [Running out of memory is a frequently encountered cause of crashes:
with “memory leaks”, for example, the amount of memory required grows
larger than the memory available on a given system]
• Some algorithms may be more efficient if the data is
completely loaded into memory
– Need to look also at system limitations
– E.g. Classify 2GB of text into various categories [politics,
tourism, sport, natural disasters, etc.] – can I afford to load
the entire collection?
Space Complexity (cont’d)
1. Fixed part: The size required to store certain
data/variables, that is independent of the size of
the problem:
- e.g. name of the data collection
- same size for classifying 2GB or 1MB of texts
2. Variable part: Space needed by variables,
whose size is dependent on the size of the
problem:
- e.g. actual text
- load 2GB of text VS. load 1MB of text
Space Complexity (cont’d)
• S(P) = c + S(instance characteristics)
c = constant
• Example:
float sum (float* a, int n)
{ float s = 0;
  for (int i = 0; i < n; i++) { s += a[i]; }
  return s;
}
Space? One word each for n, for a [only the address of the
array is passed, the array is not copied!], for i, and for s: constant space!
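For contrast, here is a minimal sketch of a version whose space does depend on the instance. The recursive function rsum is a hypothetical name, not part of the slides. Each pending call keeps its own a, n, and return address on the call stack, and n + 1 calls are active at the deepest point. So S(instance characteristics) grows linearly with n, assuming the compiler does not optimize the recursion away.

```cpp
#include <cassert>

// Iterative sum as on the slide: constant extra space (n, a, i, s).
float sum(const float* a, int n) {
    assert(n >= 0);
    float s = 0;
    for (int i = 0; i < n; i++) { s += a[i]; }
    return s;
}

// Hypothetical recursive variant: one stack frame per element,
// so the extra space grows linearly with n.
float rsum(const float* a, int n) {
    assert(n >= 0);
    if (n == 0) return 0;              // base case: empty prefix
    return rsum(a, n - 1) + a[n - 1];  // sum of the first n-1 elements, plus the last
}
```

Both functions return the same value; they differ only in how much memory they need while running.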
Time Complexity
• Often more important than space complexity
– space available (for computer programs!) tends to be larger
and larger
– time is still a problem for all of us
• 3-4GHz processors on the market
– still …
– researchers estimate that computing various
transformations of a single DNA chain for a single
protein on a 1 THz (terahertz) computer would take about 1 year to
run to completion
• An algorithm’s running time is an important issue
Running Time
• Problem: prefix averages
– Given an array X
– Compute the array A such that A[i] is the average of elements
X[0] … X[i], for i=0..n-1
• Sol 1
– At each step i, compute the element A[i] by traversing the
array X from X[0] to X[i], determining the sum of these elements
and then their average
• Sol 2
– At each step i, update a running sum of the elements of the array X
– Compute the element A[i] as sum/(i+1)
Big question: Which solution to choose?
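The two solutions can be sketched as follows. This is an illustration under assumptions, not the course's reference code: the function names and the use of std::vector<double> are chosen here. Sol 1 re-adds X[0] … X[i] for every i, about n*n/2 additions in total, while Sol 2 performs n additions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sol 1: recompute the sum of X[0..i] from scratch for every i.
// The inner loop runs i + 1 times, so the total is 1 + 2 + ... + n additions.
std::vector<double> prefixAverages1(const std::vector<double>& X) {
    std::vector<double> A(X.size());
    for (std::size_t i = 0; i < X.size(); i++) {
        double sum = 0;
        for (std::size_t j = 0; j <= i; j++) sum += X[j];
        A[i] = sum / (i + 1);
    }
    return A;
}

// Sol 2: keep a running sum, one addition per element.
std::vector<double> prefixAverages2(const std::vector<double>& X) {
    std::vector<double> A(X.size());
    double sum = 0;
    for (std::size_t i = 0; i < X.size(); i++) {
        sum += X[i];
        A[i] = sum / (i + 1);
    }
    return A;
}

// Sanity check: both solutions add the elements in the same order,
// so they must produce identical arrays.
void checkAgreement(const std::vector<double>& X) {
    assert(prefixAverages1(X) == prefixAverages2(X));
}
```

Both functions compute the same array A; only the amount of work differs, which is what the question on the slide is about.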
Running time
[Figure: plot of running time (vertical axis, 1 ms to 5 ms) against input (horizontal axis, inputs A to G), with the worst case, the best case, and “average-case?” marked]
Suppose the program includes an if-then statement that may or may
not execute: the running time then varies with the input.
Typically, algorithms are measured by their worst case.
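As a small illustration of variable running time, here is a hedged sketch using linear search. The example and the name comparisonsToFind are assumptions, not taken from the slides. The if inside the loop may end the loop early, so the number of comparisons depends on where, or whether, the key occurs.

```cpp
#include <cassert>
#include <vector>

// Returns how many element comparisons a linear search performs for `key`.
int comparisonsToFind(const std::vector<int>& data, int key) {
    int comparisons = 0;
    for (int value : data) {
        comparisons++;
        if (value == key) break;   // may or may not execute on a given input
    }
    return comparisons;
}

// Best and worst case on a 4-element array.
void demoCases() {
    const std::vector<int> data = {7, 3, 9, 5};
    assert(comparisonsToFind(data, 7) == 1);   // best case: key is first
    assert(comparisonsToFind(data, 5) == 4);   // worst case: key is last
    assert(comparisonsToFind(data, 42) == 4);  // worst case: key is absent
}
```

The worst case, n comparisons for n elements, is the bound that holds for every input.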
Experimental Approach
• Write a program that implements the
algorithm
• Run the program with data sets of varying
size.
• Determine the actual running time using a
system call to measure time (e.g. system("date")
before and after the run).
• Problems?
Experimental Approach (cont’d)
• It is necessary to implement and test the
algorithm in order to determine its running
time.
• Experiments can be done only on a limited
set of inputs, and may not be indicative of
the running time for other inputs.
• The same hardware and software should
be used in order to compare two
algorithms: a condition that is very hard to
achieve!
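A minimal sketch of the experimental approach in C++, using std::chrono::steady_clock from the standard library rather than an external command. The workload sumFirst and the helper elapsedMs are hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical workload: store 0..n-1 in a vector and add them up.
long long sumFirst(int n) {
    assert(n >= 0);
    std::vector<long long> v(n);
    for (int i = 0; i < n; i++) v[i] = i;
    long long s = 0;
    for (long long x : v) s += x;
    return s;
}

// Wall-clock time of one call to sumFirst(n), in milliseconds.
// The computed sum is returned through `result` so the work is not discarded.
double elapsedMs(int n, long long& result) {
    auto start = std::chrono::steady_clock::now();
    result = sumFirst(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling elapsedMs for data sets of increasing size gives the data points. The measured values depend on the machine, the compiler settings, and the system load, which is exactly the difficulty listed above.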
Theoretical Approach
• Based on high-level description of the
algorithms, rather than language
dependent implementations
• Makes possible an evaluation of the
algorithms that is independent of the
hardware and software environments
→ Generality
ADT
• ADT = Abstract Data Type
• A logical view of the data objects together
with specifications of the operations
required to create and manipulate them.
• Describe an algorithm – pseudo-code
• Describe a data structure – ADT
What is a data type?
• A set of objects, each called an instance of the data
type. Some objects are sufficiently important to be
provided with a special name.
• A set of operations. Operations can be realized via
operators, functions, procedures, methods, and special
syntax (depending on the implementing language)
• Each object must have some representation (not
necessarily known to the user of the data type)
• Each operation must have some implementation (also
not necessarily known to the user of the data type)
What is a representation?
• A specific encoding of an instance
• This encoding MUST be known to
implementors of the data type but NEED
NOT be known to users of the data type
• Terminology: "we implement data types
using data structures"
Two varieties of data types
• Opaque data types in which the
representation is not known to the user.
• Transparent data types in which the
representation is profitably known to the
user, i.e. the encoding is directly
accessible and/or modifiable by the user.
• Which one do you think is better?
• What are the means provided by C++ for
creating opaque data types?
Why are opaque data types better?
• Representation can be changed without
affecting user
• Forces the program designer to consider
the operations more carefully
• Encapsulates the operations
• Allows less restrictive designs which are
easier to extend and modify
• Design always done with the expectation
that the data type will be placed in a library
of types available to all.
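In C++, one common means of creating an opaque data type is a class with private data members. The sketch below is a hypothetical IntStack, chosen here only for illustration. It stores its elements in a std::vector, but users can reach them only through the public operations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Opaque data type: users see the operations, not the representation.
class IntStack {
public:
    bool empty() const { return items.empty(); }
    std::size_t size() const { return items.size(); }
    void push(int x) { items.push_back(x); }
    int top() const { assert(!items.empty()); return items.back(); }
    void pop() { assert(!items.empty()); items.pop_back(); }
private:
    // Representation: hidden from users. It could be replaced by a linked
    // list or a fixed array without changing any code that uses IntStack.
    std::vector<int> items;
};
```

Because the vector is private, swapping the representation later affects only the bodies of the five operations, not the programs that call them.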
How to design a data type
Step 1: Specification
• Make a list of the operations (just their
names) you think you will need. Review
and refine the list.
• Decide on any constants which may be
required.
• Describe the parameters of the operations
in detail.
• Describe the semantics of the operations
(what they do) as precisely as possible.
How to design a data type
Step 2: Application
• Develop a real or imaginary application to
test the specification.
• Missing or incomplete operations are
found as a side-effect of trying to use the
specification.
How to design a data type
Step 3: Implementation
• Decide on a suitable representation.
• Implement the operations.
• Test, debug, and revise.
Example - ADT Integer
Name of ADT: Integer
Operation | Description | C/C++
Create | Defines an identifier with an undefined value | int id1;
Assign | Assigns the value of one integer identifier or value to another integer identifier | id1 = id2;
isEqual | Returns true if the values associated with two integer identifiers are the same | id1 == id2;
Example – ADT Integer
Operation | Description | C/C++
LessThan | Returns true if the value of the first integer identifier is less than the value of the second integer identifier | id1 < id2
Negative | Returns the negative of the integer value | -id1
Sum | Returns the sum of two integer values | id1 + id2
Example – ADT Integer
Operation Signatures
Create: Identifier → Integer
Assign: Integer → Identifier
IsEqual: (Integer, Integer) → Boolean
LessThan: (Integer, Integer) → Boolean
Negative: Integer → Integer
Sum: (Integer, Integer) → Integer
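The operations and signatures listed above could be rendered as a C++ class along the following lines. This is only a sketch: the class Integer, its member names, and the choice to initialise a newly created value to 0 are assumptions made here (the ADT itself leaves the initial value undefined).

```cpp
#include <cassert>

// Hypothetical C++ rendering of the Integer ADT; names mirror the operations above.
class Integer {
public:
    Integer() : value(0) {}                 // Create (0 is an assumed default)
    explicit Integer(int v) : value(v) {}   // Create from a literal value
    void assign(const Integer& other) { value = other.value; }                  // Assign
    bool isEqual(const Integer& other) const { return value == other.value; }   // IsEqual
    bool lessThan(const Integer& other) const { return value < other.value; }   // LessThan
    Integer negative() const { return Integer(-value); }                        // Negative
    Integer sum(const Integer& other) const { return Integer(value + other.value); }  // Sum
private:
    int value;  // representation, hidden from users
};

// Small usage check of the operations.
void demoInteger() {
    Integer a(3), b(4), c;
    c.assign(a.sum(b));                     // c = 3 + 4
    assert(c.isEqual(Integer(7)));
    assert(a.lessThan(b));
    assert(a.negative().isEqual(Integer(-3)));
}
```

Each member function corresponds to one signature: the object it is called on is the first Integer argument, and the return type is the result type.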
More examples
• We’ll see more examples throughout the
course
– Stack
– Queue
– Tree
– And more