Quantitative Methods for
Decision Making
Lecture 1
Dr. Akhter
5th edition
Marking Scheme
Midterm 30%
Final Exam 40%
Quizzes 15% (mean of best five quizzes each of 15 points)
Assignments 15% (mean of best 7 assignments each of 15 points)
Book
Introductory STATISTICS
9TH EDITION
ISBN-13: 978-0-321-69122-4
ISBN-10: 0-321-69122-9
Neil A. Weiss
Addison-Wesley
Topics
Gathering information and its Presentation
Measures of central tendency
Measures of Dispersion
Probability Concepts
Random & Non Random Variables
Some Special Distributions
The Normal distribution
Fitting of a distribution
Sampling distributions
Topics
Estimation Theory
Mathematical Models
Regression & Correlation
Decision Theory (p-value approach)
Decision based on risk
Experimental Designs
Case studies related to the CRD and RBD using some
industrial and financial data sets
Setting up ANOVA tables and Decision Making
Computer Support producing group research
Statistics
Statistics (as subject)
Science of collecting and analyzing data for the purpose
of drawing conclusions and making decisions
Provides data collection methods to reduce biases, and
analysis methods to identify patterns and draw
inference from noisy data
Statistics (facts and figures)
Aggregate of numerical facts: Statistics of scores,
statistics of marks, statistics of wages etc.
Statistic (constant)
A characteristic of a sample
Important terms
Population: homogeneous, heterogeneous, finite, infinite, hypothetical, existent
Census: complete enumeration
Sampling frame (or frame): a complete list of all elements in our population
Sampling, sample, random sample
Parameter: characteristic of a population
Statistic: characteristic of a sample
Statistical Methods
Statistical methods divide into two branches: descriptive statistics and inferential statistics.
Descriptive statistics
Descriptive statistics consists of methods for organizing,
displaying, and describing data by using tables, graphs, and
summary measures.
Descriptive statistics is concerned with exploring, visualising, and
summarizing data but without fitting the data to any models.
This kind of analysis is used to explore the data in the initial stages
of data analysis.
Since no models are involved, it cannot be used to test hypotheses
or to make testable predictions.
Nevertheless, it is a very important part of analysis that can reveal
many interesting features in the data.
Inferential statistics
Inferential statistics involves identifying a suitable model. The data
are then fitted to the model to obtain optimal estimates of the model's
parameters.
The model then undergoes validation by testing either
predictions or hypotheses of the model.
Models based on a unique sample of data can be used to infer
generalities about features of the whole population.
Using Statistics (Two Categories)
Descriptive Statistics
Collect
Organize
Summarize
Display
Analyze
Inferential Statistics
Predict and forecast
values of population
parameters
Test hypotheses about
values of population
parameters
Make decisions
Types of Data - Two Types
Qualitative (categorical or nominal): color, gender, nationality
Quantitative (measurable or countable): temperatures, salaries, number of points scored on a 100-point exam
Data
Collection of facts and figures
May be qualitative or quantitative
May be discrete or continuous
May be in ungrouped or grouped form
Data
Qualitative
Quantitative: discrete or continuous
Samples and Populations
A population consists of the set of all
measurements in which the investigator
is interested.
A sample is a subset of the measurements
selected from the population.
A census is a complete enumeration of
every item in a population.
Simple Random Sample
Sampling from the population is often
done randomly, such that every possible
sample of equal size (n) will have an
equal chance of being selected.
A sample selected in this way is called a
simple random sample or just a random
sample.
A random sample allows chance to
determine its elements.
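A minimal Python sketch of drawing a simple random sample. The sampling frame of 64 numbered units, the sample size of 5, and the seed are illustrative assumptions, not values from the lecture. The standard-library call `random.Random.sample` draws without replacement, so every subset of size n has the same chance of being selected.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements so that every subset of size n is equally likely."""
    rng = random.Random(seed)            # private generator keeps the draw reproducible
    return rng.sample(list(population), n)

# Hypothetical sampling frame: units numbered 1 to 64.
frame = range(1, 65)
sample = simple_random_sample(frame, n=5, seed=1)
print(sample)
```

Changing the seed changes which units are drawn; chance alone determines the elements of the sample.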
Sampling Techniques
Random Sampling
Stratified Sampling
Cluster Sampling
Systematic Sampling
Judgment Sampling
Quota Sampling
Parameter and Statistic
Parameter: a population constant, e.g. $\mu$, $\sigma^2$, $\rho$, $\pi$
Statistic: a sample constant, e.g. $\bar{x}$, $s^2$, $r$, $p$
Samples and Populations
Population (N)
Sample (n)
Why Sample?
Census of a population may be:
Impossible
Impractical
Too costly
Subscript Notation
In $X_i$, X is the list name and i is the subscript.
Double Subscript
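In $X_{ij}$, the first subscript i indexes the row and the second subscript j indexes the column:
$$\begin{matrix} X_{11} & X_{12} & X_{13} \\ X_{21} & X_{22} & X_{23} \\ X_{31} & X_{32} & X_{33} \end{matrix}$$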
Summation Notations
$$\sum_{i=1}^{N} X_i$$
Here $i$ is the summation index, $1$ is the start value, and $N$ is the stop value.
Sigma Notation
Suppose our list has just 5 numbers, and
they are 1, 3, 2, 5, 6.
$$\sum_{i=1}^{5} X_i^2 = 1^2 + 3^2 + 2^2 + 5^2 + 6^2 = 75$$
$$\left(\sum_{i=1}^{5} X_i\right)^2 = (1 + 3 + 2 + 5 + 6)^2 = 17^2 = 289$$
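The difference between the sum of squares and the square of the sum can be checked with a short Python sketch on the same five numbers. The variable names are illustrative only.

```python
values = [1, 3, 2, 5, 6]

sum_of_squares = sum(x ** 2 for x in values)   # square each value, then add
square_of_sum = sum(values) ** 2               # add the values, then square the total

print(sum_of_squares)  # 75
print(square_of_sum)   # 289
```

The two results differ because squaring and summing do not commute.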
Properties of Sigma
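$$\sum_{i=1}^{N} a = Na$$
$$\sum_{i=x}^{y} a = (y - x + 1)a$$
$$\sum_{i=1}^{N} aX_i = a\sum_{i=1}^{N} X_i$$
$$\sum_{i=1}^{N} (X_i + Y_i) = \sum_{i=1}^{N} X_i + \sum_{i=1}^{N} Y_i$$
$$\sum_{i=1}^{N} (X_i - Y_i) = \sum_{i=1}^{N} X_i - \sum_{i=1}^{N} Y_i$$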
Properties of Sigma
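Show that
$$\sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$$
where $\bar{x}$ is the arithmetic mean of the data, which is
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad\text{or}\quad \sum_{i=1}^{n} x_i = n\bar{x}$$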
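A numeric check is not a proof, but it helps confirm the identity before deriving it. The sketch below assumes the identity reads: the sum of squared deviations from the mean equals the sum of squares minus n times the squared mean. It reuses the list 1, 3, 2, 5, 6 from the sigma notation example, and uses exact fractions to avoid rounding error.

```python
from fractions import Fraction

values = [1, 3, 2, 5, 6]
n = len(values)
mean = Fraction(sum(values), n)                      # exact arithmetic mean, 17/5

lhs = sum((x - mean) ** 2 for x in values)           # sum of squared deviations
rhs = sum(x ** 2 for x in values) - n * mean ** 2    # sum of squares minus n * mean^2

print(lhs, rhs)  # 86/5 86/5
```

The algebraic argument expands the square and substitutes the sum of the values by n times the mean.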
Sigma Notation
Commonly used Greek Letters
Expand $\sum_{i=1}^{N} 2X_i$
Exercise
In a survey it was found that 64 families bought milk in the
following quantities (liters) in a particular month:
19 07 28 13 21 22 17 24 36 31
09 22 12 39 19 14 23 06 24 16 18
20 25 28 18 10 24 20 21 10 07 18
20 14 24 25 34 22 05 33 23 26 29
11 26 11 37 30 13 08 15 22 21 32
17 16 23 12 09 15 27 17 21 16
(a) Construct a frequency distribution using 5 intervals
(b) Construct a histogram, polygon, and frequency curve
(c) Construct cumulative frequency (c.f.) distributions and draw ogives
(d) Construct relative, cumulative relative, and percentage relative distributions
Arithmetic Mean
The central value
Grouped data, ungrouped data
Unweighted, weighted
Combined arithmetic mean
Assumed mean, trimmed mean
Median
The middle observation in ordered data
Ungrouped data (even or odd number of observations)
Grouped data
Graphical method of finding median
Mode
The most frequent observation
Ungrouped data
Grouped data
Graphical method of finding mode
Relationship between mean, median, and mode
Quartiles
Quartiles are the percentage points that break down
the ordered data set into quarters.
The first quartile is the 25th percentile. It is the point
below which lie 1/4 of the data.
The second quartile is the 50th percentile. It is the
point below which lie 1/2 of the data. This is also
called the median.
The third quartile is the 75th percentile. It is the
point below which lie 3/4 of the data.
Quartiles and Interquartile Range
The first quartile, Q1, (25th percentile) is
often called the lower quartile.
The second quartile, Q2, (50th
percentile) is often called median or the
middle quartile.
The third quartile, Q3, (75th percentile)
is often called the upper quartile.
The interquartile range is the difference
between the third and the first quartiles: IQR = Q3 - Q1.
Example : Finding Quartiles
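Position of the P-th percentile in the sorted data: (n+1)P/100
Sales: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17
Sorted sales: 6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24
Quartiles to locate: first quartile (P = 25), median (P = 50), third quartile (P = 75)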
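A Python sketch of the (n+1)P/100 position rule applied to the twenty sales figures. Linear interpolation between the two neighbouring sorted values is an assumption, chosen to match the lecture's median calculation, and the handling of positions that fall outside the data is likewise an assumption. The function name `percentile` is illustrative.

```python
def percentile(data, p):
    """Percentile by the (n+1)P/100 position rule with linear interpolation."""
    xs = sorted(data)
    n = len(xs)
    pos = (n + 1) * p / 100          # 1-based position in the sorted data
    k = int(pos)                     # whole part of the position
    frac = pos - k                   # fractional part of the position
    if k < 1:                        # position falls before the first value
        return xs[0]
    if k >= n:                       # position falls at or beyond the last value
        return xs[-1]
    return xs[k - 1] + frac * (xs[k] - xs[k - 1])

sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
q1, median, q3 = (percentile(sales, p) for p in (25, 50, 75))
print(q1, median, q3)  # 13.25 16.0 18.75
```

For example, the first quartile sits at position (20+1)(25)/100 = 5.25, a quarter of the way from the 5th sorted value (13) to the 6th (14). Other software may use a different quartile convention and give slightly different values.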
Summary Measures: Population Parameters
Sample Statistics
Measures of Central
Tendency
Median
Mode
Mean
Measures of Variability
Range
Interquartile range
Variance
Standard Deviation
Other summary
measures:
Skewness
Kurtosis
Measures of Central Tendency
or Location
Median
Middle value when
sorted in order of
magnitude
50th percentile
Mode
Most frequently occurring value
Mean
Average
Example: Median (data from the previous example)
Sales: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17
Sorted sales: 6, 9, 10, 12, 13, 14, 14, 15, 16, 16, 16, 17, 17, 18, 18, 19, 20, 21, 22, 24
Median = 50th percentile
Position: (20+1)(50)/100 = 10.5
Median = 16 + (0.5)(0) = 16
Median
The median is the middle
value of data sorted in
order of magnitude. It is
the 50th percentile.
Example: Mode (data from Example 1-2)
Value:      6  9 10 12 13 14 15 16 17 18 19 20 21 22 24
Frequency:  1  1  1  1  1  2  1  3  2  2  1  1  1  1  1
Mode = 16
The mode is the most frequently occurring value. It
is the value with the highest frequency.
Arithmetic Mean or Average
The mean of a set of observations is their average: the sum of the observed values divided by the
number of observations.
Population mean: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
Sample mean: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Example: Mean
Sales: 9, 6, 12, 10, 13, 15, 16, 14, 14, 16, 17, 16, 24, 21, 22, 18, 19, 18, 20, 17
$$\sum_{i=1}^{20} x_i = 317$$
$$\bar{x} = \frac{317}{20} = 15.85$$
Example - Mode
[Dot plot of the sales data, values 6 to 24]
Mean = 15.85
Median and Mode = 16
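The three measures can be reproduced with Python's standard `statistics` module. This is a sketch on the twenty sales figures; the variable names are illustrative.

```python
import statistics

sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]

mean = statistics.mean(sales)       # sum of the values (317) divided by their count (20)
median = statistics.median(sales)   # average of the 10th and 11th sorted values
mode = statistics.mode(sales)       # most frequently occurring value

print(mean, median, mode)  # 15.85 16.0 16
```

The mean lies slightly below the median and mode here because the two smallest values, 6 and 9, pull it down.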
Group Data and the Histogram
Dividing data into groups or classes or
intervals
Groups should be:
Mutually exclusive
Not overlapping - every observation is assigned to
only one group
Exhaustive
Every observation is assigned to a group
Equal-width (if possible)
First or last group may be open-ended
Frequency Distribution
Table with two columns listing:
Each and every group or class or interval of values
Associated frequency of each group
Number of observations assigned to each group
Sum of frequencies is number of observations
N for population
n for sample
Class midpoint is the middle value of a group or
class or interval
Relative frequency is the proportion of total
observations in each class
Sum of relative frequencies = 1
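A Python sketch of building a frequency distribution with equal-width classes. It uses the twenty sales figures from the quartile example; the class limits (starting at 5, width 5, four classes) are assumptions chosen for illustration, and the function name is hypothetical.

```python
def frequency_distribution(data, start, width, classes):
    """Count observations in equal-width classes [lower, lower + width)."""
    counts = [0] * classes
    for x in data:
        index = int((x - start) // width)        # the single class that contains x
        if not 0 <= index < classes:
            raise ValueError(f"{x} falls outside the classes")  # classes must be exhaustive
        counts[index] += 1
    table = []
    for k, f in enumerate(counts):
        lower = start + k * width
        # (lower limit, upper limit, class midpoint, frequency, relative frequency)
        table.append((lower, lower + width, lower + width / 2, f, f / len(data)))
    return table

sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
table = frequency_distribution(sales, start=5, width=5, classes=4)
for lower, upper, midpoint, f, rel in table:
    print(f"{lower} to less than {upper}: midpoint {midpoint}, f = {f}, relative = {rel:.2f}")
```

Half-open classes make the groups mutually exclusive, and the range check enforces that they are exhaustive.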
Example: Frequency Distribution
Spending Class ($), x     Frequency (number of customers), f(x)     Relative Frequency, f(x)/n
0 to less than 100                          30                                 0.163
100 to less than 200                        38                                 0.207
200 to less than 300                        50                                 0.272
300 to less than 400                        31                                 0.168
400 to less than 500                        22                                 0.120
500 to less than 600                        13                                 0.070
Total                                      184                                 1.000
Example of relative frequency: 30/184 = 0.163
Sum of relative frequencies = 1
Cumulative Frequency Distribution
Spending Class ($), x     Cumulative Frequency, F(x)     Cumulative Relative Frequency, F(x)/n
0 to less than 100                     30                              0.163
100 to less than 200                   68                              0.370
200 to less than 300                  118                              0.641
300 to less than 400                  149                              0.810
400 to less than 500                  171                              0.929
500 to less than 600                  184                              1.000
The cumulative frequency of each group is the sum of the
frequencies of that and all preceding groups.
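The running totals can be sketched in Python with `itertools.accumulate`, using the six class frequencies from the spending example. The variable names are illustrative.

```python
from itertools import accumulate

classes = ["0 to less than 100", "100 to less than 200", "200 to less than 300",
           "300 to less than 400", "400 to less than 500", "500 to less than 600"]
frequencies = [30, 38, 50, 31, 22, 13]

n = sum(frequencies)                          # 184 customers in total
cumulative = list(accumulate(frequencies))    # F(x): frequency of this and all preceding classes

for label, f, F in zip(classes, frequencies, cumulative):
    # columns: class, f(x), f(x)/n, F(x), F(x)/n
    print(f"{label:<22}{f:>5}{f / n:>8.3f}{F:>6}{F / n:>8.3f}")
```

Rounded on its own, 13/184 prints as 0.071 where the table shows 0.070, presumably so that the rounded column totals exactly 1.000.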
Histogram
A histogram is a chart made of bars of
different heights.
Widths and locations of bars correspond to
widths and locations of data groupings
Heights of bars correspond to frequencies or
relative frequencies of data groupings
Histogram Example
Frequency Histogram
Histogram Example
Relative Frequency Histogram
Skewness and Kurtosis
Skewness
Measure of asymmetry of a frequency distribution
Skewed to left
Symmetric or unskewed
Skewed to right
Kurtosis
Measure of flatness or peakedness of a frequency
distribution
Platykurtic (relatively flat)
Mesokurtic (normal)
Leptokurtic (relatively peaked)
Skewness figures: skewed to left, symmetric, skewed to right
Kurtosis figures: platykurtic (flat distribution), mesokurtic (not too flat and not too peaked), leptokurtic (peaked distribution)
Methods of Displaying Data
Pie Charts
Categories represented as percentages of total
Bar Graphs
Heights of rectangles represent group frequencies
Frequency Polygons
Height of line represents frequency
Ogives
Height of line represents cumulative frequency
Time Plots
Represents values over time
Pie Chart
Bar Chart
Fig. 1-11 Airline Operating Expenses and Revenues
[Bar chart of average revenues and average expenses by airline: American, Continental, Delta, Northwest, Southwest, United, USAir]
Frequency Polygon and Ogive
[Relative frequency polygon: relative frequency (0.0 to 0.3) against sales (0 to 50)]
[Ogive: cumulative relative frequency (0.0 to 1.0) against sales (0 to 50)]
Time Plot
[Time plot: Monthly Steel Production (Problem 1-46); vertical axis in millions of tons, horizontal axis by month]
1-9 Exploratory Data Analysis - EDA
Techniques to determine relationships and trends,
identify outliers and influential observations, and
quickly describe or summarize data sets.
Stem-and-Leaf Displays
Quick-and-dirty listing of all observations
Conveys some of the same information as a histogram
Box Plots
Median
Lower and upper quartiles
Maximum and minimum
Example: Stem-and-Leaf Display
Construct a stem-and-leaf display of the following data:
11, 12, 12, 13, 15, 15, 15, 16, 17, 20, 21, 21,
21, 22, 22, 22, 23, 24, 26, 27, 27, 27, 28, 29, 29,
30, 31, 32, 34, 35, 37, 41, 41, 42, 45, 47, 50, 52, 53, 56, 62
1 | 122355567
2 | 0111222346777899
3 | 012457
4 | 11257
5 | 0236
6 | 02
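A Python sketch of how a stem-and-leaf display can be generated. To keep it short it is applied to the twenty sales figures from the quartile example rather than to the data above, and it assumes non-negative whole numbers with the tens digit as stem and the units digit as leaf. The function name is illustrative.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Return display rows with tens digits as stems and units digits as leaves."""
    leaves = defaultdict(list)
    for x in sorted(data):                  # sorting puts the leaves in ascending order
        leaves[x // 10].append(x % 10)
    rows = []
    for stem in range(min(leaves), max(leaves) + 1):   # keep empty stems in the display
        row = "".join(str(digit) for digit in leaves.get(stem, []))
        rows.append(f"{stem} | {row}")
    return rows

sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
rows = stem_and_leaf(sales)
print("\n".join(rows))
```

Each row's length is the frequency of that stem, which is why the display conveys much of the same information as a histogram while still listing every observation.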
Box Plot
Elements of a Box Plot
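Median, Q1, Q3
Interquartile range: IQR = Q3 - Q1
Inner fences: Q1-1.5(IQR) and Q3+1.5(IQR)
Outer fences: Q1-3(IQR) and Q3+3(IQR)
Whiskers: from the smallest data point not below the inner fence to the largest data point not exceeding the inner fence
Suspected outlier: a point between an inner fence and an outer fence
Outlier: a point beyond an outer fence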
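A Python sketch of the fence calculations. It assumes the usual reading of the diagram: points between the inner and outer fences are suspected outliers, and points beyond the outer fences are outliers. The quartiles 13.25 and 18.75 are those given by the (n+1)P/100 rule with linear interpolation for the twenty sales figures; the function and key names are illustrative.

```python
def box_plot_fences(data, q1, q3):
    """Compute fences, whisker ends, and outlier lists for a box plot."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    inside = [x for x in data if inner[0] <= x <= inner[1]]
    return {
        "iqr": iqr,
        "inner": inner,
        "outer": outer,
        # whiskers end at the most extreme points that stay within the inner fences
        "whiskers": (min(inside), max(inside)),
        "suspected_outliers": [x for x in data
                               if outer[0] <= x < inner[0] or inner[1] < x <= outer[1]],
        "outliers": [x for x in data if x < outer[0] or x > outer[1]],
    }

sales = [9, 6, 12, 10, 13, 15, 16, 14, 14, 16,
         17, 16, 24, 21, 22, 18, 19, 18, 20, 17]
result = box_plot_fences(sales, q1=13.25, q3=18.75)
print(result["inner"], result["outer"], result["whiskers"])
```

For these data the inner fences are 5.0 and 27.0, so every observation lies inside them and the whiskers run from 6 to 24.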
Example: Box and Whisker Plots
Order numbers
3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7
First, order your numbers from least to
greatest:
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Median
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Then find the median (from the ordered list):
Cross off one number from each side until you reach
the middle number (or numbers).
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Median (continued):
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
If there are two numbers in the middle,
Add those 2 middle numbers together:
6 + 7 = 13
Then divide by 2:
13 ÷ 2 = 6.5
The median is 6.5.
Quartiles (page 1)
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Then split the numbers on left and right sides
of the median:
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Quartiles (page 2)
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Find the median for each half:
1, 2, 3, 4, 5, 6, 6 | 7, 8, 9, 10, 11, 13, 14
Left median = 4
Right median = 10
Quartiles (page 3)
1, 2, 3, 4, 5, 6, 6 | 7, 8, 9, 10, 11, 13, 14
Left median = 4
Right median = 10
The left median is called the LOWER
QUARTILE.
The right median is called the UPPER
QUARTILE.
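The split-at-the-median steps can be sketched in Python as follows. Leaving the middle value out of both halves when the number of observations is odd is an assumption, since the example has an even count; the function name is illustrative.

```python
import statistics

def quartiles_by_halves(data):
    """Lower quartile, median, and upper quartile via medians of the two halves."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]                                           # values left of the median
    upper = xs[half:] if len(xs) % 2 == 0 else xs[half + 1:]    # skip the middle value when n is odd
    return statistics.median(lower), statistics.median(xs), statistics.median(upper)

numbers = [3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7]
lower_q, median, upper_q = quartiles_by_halves(numbers)
print(lower_q, median, upper_q)  # 4 6.5 10
```

This method can give slightly different quartiles from a percentile position rule such as (n+1)P/100, so it helps to state which convention is in use.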
Number line
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Draw a number line from the smallest to the
largest number without skipping any numbers.
Quartiles on number line
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Put circles at the LOWER and UPPER
Quartiles.
Box on Quartiles on number line
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Draw a box connecting the circles at the
LOWER and UPPER Quartiles.
Median on number line
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Put a circle at the median (6.5).
Median on number line
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Draw a line connecting the median to the box.
Low and high numbers
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Put circles at the high and low points.
Low and high numbers
1, 2, 3, 4, 5, 6, 6, 7, 8, 9, 10, 11, 13, 14
Draw lines that connect the high and low
points to the box.
Box and Whisker Plot
3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7
Here is the completed Box and Whisker Plot!
Example: Box Plot
Histogram
Histograms
Frequency Polygons & the Ogive
Two Frequency Polygons
Pie Chart
Bar Chart
Box Plot
Box Plot Compare Two Data Sets
Time Plot
Time Plot
Testing Normality
Check the normality of the following data
3, 5, 4, 2, 1, 6, 8, 11, 14, 13, 6, 9, 10, 7
Table of normal scores
Questions?