
UNIVERSITY OF ST. LA SALLE
Yu An Log College of Business and Accountancy

BSTAT – BUSINESS STATISTICS

First Semester, AY 2020 – 2021

HANDOUTS 3

MEASURES OF CENTRAL TENDENCY & VARIABILITY

❖ Recall: Statistics involves a body of techniques and procedures dealing with the collection,
organization, analysis, interpretation, and presentation of information that can be stated
numerically.

Summarizing data involves using statistical tools and procedures appropriate for answering a research
problem or objective.

The following terms need to be differentiated:

Measure – a numerical representation of a particular characteristic (variable of the study) of the group
being studied

Parameter – A measure calculated from the population; usually represented by letters of the Greek
alphabet
Statistic – A measure calculated from the sample; usually represented by letters of the English alphabet

Summaries of QUALITATIVE DATA:

• Qualitative data are summarized using the following measures:

➢ proportions (also called relative frequencies)

➢ percentages

For example, the variable sex is coded as

M – 0
F – 1

Remark: Since “sex” is a qualitative variable and the codes 0 and 1 represent nominal data, the
codes are labels, not numbers with values. It is therefore not correct to apply arithmetic
operations such as addition and division to get an “average sex”; such a value makes no sense
for a qualitative variable. Instead, use the proportion (or percentage) of males (or females)
in the group.

Say, “Two out of 10 students are male,” or “Twenty percent of the students are male.”
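As a rough illustration of the remark above, the following Python sketch summarizes a coded qualitative variable by proportion and percentage instead of an average. The ten codes are hypothetical, chosen only to match the "two out of 10" statement; they are not data from the handout.

```python
from collections import Counter

# Hypothetical coded data: 0 = male, 1 = female (2 males out of 10 students)
sex_codes = [0, 1, 1, 1, 0, 1, 1, 1, 1, 1]
labels = {0: "M", 1: "F"}

# Count how many times each category occurs
counts = Counter(labels[code] for code in sex_codes)
n = len(sex_codes)

# Proportion = count / n; percentage = proportion x 100
proportions = {category: count / n for category, count in counts.items()}
percentages = {category: 100 * p for category, p in proportions.items()}

print(proportions)
print(percentages)
```

The codes are used only as labels for counting; no arithmetic is done on the 0s and 1s themselves.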

Summaries of QUANTITATIVE DATA:

• Quantitative data are usually summarized in terms of the center and spread of the distribution.
• The center of the distribution can be identified using an appropriate measure of central
tendency or location.

LEONARES, S. R. 1
MEASURES OF CENTRAL TENDENCY OR LOCATION (AVERAGES)

A measure of central tendency or location is

• a representative value of the data set
• the value around which most of the data points are found

(ARITHMETIC) MEAN

• computed by summing all the data values in the sample or population and dividing the sum by
the number of observations (usually referred to as the “average”)
• Most important measure representing the center of the distribution if the distribution is
symmetric
• data must be at least interval
• Most stable measure of location, especially for large data sets
• When n is small, the mean is very sensitive to extreme values
• Differentiate between the population and sample means by their symbols:

Population Mean: $\mu = \dfrac{\sum x_i}{N}$, where $x_i$ is the ith score or observation, and N is the number
of observations in the population (the parameter is $\mu$, the Greek letter “mu”)

Sample Mean: $\bar{x} = \dfrac{\sum x_i}{n}$, where $x_i$ is the ith score or observation, and n is the number of
observations in the sample (the statistic is $\bar{x}$, read as “x-bar”)

➢ Why differentiate between $\mu$ and $\bar{x}$? If the research procedure is a population study, then
a population symbol (parameter) must be used; if it is a sample study, then a sample
symbol (statistic) must be used. This will be a very important distinction in inferential
statistics.
➢ That is why it is important to determine at the beginning of the research process whether you
will be doing a population or a sample study, since it has a bearing on the
notations/symbols used for parameters or statistics.

Example 1: During a particular summer month, the eight salespeople in an appliance store sold the
following number of central air-conditioning units: 8, 11, 5, 14, 8, 11, 16, 11. Considering this month as
the statistical population of interest, the mean number of units sold is

$$\mu = \frac{\sum x_i}{N} = \frac{84}{8} = 10.5 \text{ central a/c units}$$

Why $\mu$? Because the problem stated that the month should be considered as the statistical population of
interest.
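The computation in Example 1 can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative.

```python
# Units sold by the eight salespeople (Example 1), treated as a population
units_sold = [8, 11, 5, 14, 8, 11, 16, 11]

N = len(units_sold)       # number of observations in the population
total = sum(units_sold)   # sum of all x_i
mu = total / N            # population mean

print(f"sum = {total}, N = {N}, mean = {mu} central a/c units")
```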

WEIGHTED MEAN

• Also called the weighted average

• an arithmetic mean in which each value is weighted according to its importance in the overall
group
• formulas for the population and sample weighted means are identical:

$$\mu_w \text{ or } \bar{X}_w = \frac{\sum wX}{\sum w}$$

• Operationally, each value in the group (X) is multiplied by the appropriate weight factor (w), and
the products are then summed and divided by the sum of the weights.

Example 2: In a multiproduct company, the profit margins for the company’s four product lines during
the past fiscal year were: line A, 4.2 percent; line B, 5.5 percent; line C, 7.4 percent; and line D, 10.1
percent.

The unweighted mean profit margin is

$$\mu = \frac{\sum x}{N} = \frac{27.2}{4} = 6.80\%$$

However, unless the four product lines are equal in sales, this unweighted average does not describe the
company’s overall profit margin. Given the sales totals in the following table, which are not all equal,
the weighted mean correctly describes the overall average.

Product Line    Profit Margin, X (%)    Sales, in Php (w)    wX

A               4.2                     30,000,000           126,000,000
B               5.5                     20,000,000           110,000,000
C               7.4                      5,000,000            37,000,000
D               10.1                     3,000,000            30,300,000
Total                                   58,000,000           303,300,000

Hence, the weighted mean profit margin is

$$\mu_w = \frac{\sum wX}{\sum w} = \frac{303{,}300{,}000}{58{,}000{,}000} \approx 5.23\%$$

Remark: The weighted mean is used in computing for final grades when the number of units of
the subjects are not equal. Each grade is multiplied by the number of units of the
subject, and the sum of the (grades x no. of units) is divided by the total number of
units taken.
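The weighted mean of Example 2 can be reproduced with a short Python sketch. The dictionary names are illustrative; the margins and sales figures are those of the example.

```python
# Profit margins (%) and sales (Php) for the four product lines in Example 2
margins = {"A": 4.2, "B": 5.5, "C": 7.4, "D": 10.1}
sales = {"A": 30_000_000, "B": 20_000_000, "C": 5_000_000, "D": 3_000_000}

# Unweighted mean: every product line counts equally
unweighted_mean = sum(margins.values()) / len(margins)

# Weighted mean: sum of (weight x value) divided by the sum of the weights
weighted_sum = sum(sales[line] * margins[line] for line in margins)
weighted_mean = weighted_sum / sum(sales.values())

print(f"unweighted mean = {unweighted_mean:.2f}%")
print(f"weighted mean   = {weighted_mean:.2f}%")
```

The same pattern applies to the final-grade remark above: use the grades as the values and the number of units of each subject as the weights.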

MEDIAN

• Center of an array (arrangement of the data from lowest to highest)

• Divides the array into two equal parts
• Useful for summarizing skewed distributions because it is not sensitive to extreme values
• Equal to the mean for symmetric distributions
• Data must be at least ordinal
• If N (or n) is odd, the median is the middle number of the array
• If N (or n) is even, the median is the mean of the two middle values

➢ Population Median: $\tilde{\mu}$ (read “mu-tilde”)

➢ Sample Median: $\tilde{x}$ (read “x-tilde”)

Example 3: The eight salespeople described in Example 1 sold the following number of central air-
conditioning units, in ascending order: 5, 8, 8, 11, 11, 11, 14, 16. Find the median.

Array: 5, 8, 8, 11, 11, 11, 14, 16

$$\tilde{\mu} = \frac{11 + 11}{2} = 11 \text{ central a/c units}$$
Since the number of data values is even (N = 8), then the value of the median is the mean of the two
middle values, which are the fourth and fifth values in the ordered group. Both these values equal “11”
in this case, so adding the two 11’s and dividing by 2 gives the median which is equal to 11. Note that
there is an equal number of data points below and above the median (5, 8, 8, 11 are below; 11, 11, 14,
16 are above).

Example 4: The reaction times for a random sample of 9 subjects to a stimulant were recorded as 2.5,
3.6, 3.1, 4.3, 2.9, 2.3, 2.6, 4.1, and 3.4 seconds. Calculate the median.

First form the array: 2.3, 2.5, 2.6, 2.9, 3.1, 3.4, 3.6, 4.1, 4.3

Since there are 9 data values (odd), then there will only be one middle value.

2.3, 2.5, 2.6, 2.9, 3.1, 3.4, 3.6, 4.1, 4.3 (the fifth value, 3.1, is the middle value)

$\tilde{x}$ = 3.1 seconds

Why $\tilde{x}$? Because the problem specifically identifies the group as a random sample.

NOTE: When the problem does not specifically indicate whether the group involved is a sample
or population, treat the data set as a sample.
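Examples 3 and 4 can be verified with Python's standard `statistics` module, which sorts the data and handles the odd and even cases in the way described above. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import median

# Example 3: even number of values (N = 8), median is the mean of the two middle values
units_sold = [8, 11, 5, 14, 8, 11, 16, 11]
# Example 4: odd number of values (n = 9), median is the single middle value
reaction_times = [2.5, 3.6, 3.1, 4.3, 2.9, 2.3, 2.6, 4.1, 3.4]

print(sorted(units_sold), "-> median =", median(units_sold))
print(sorted(reaction_times), "-> median =", median(reaction_times))
```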

Recall Example 1:

During a particular summer month, the eight salespeople in an appliance store sold the following number
of central air-conditioning units: 8, 11, 5, 14, 8, 11, 16, 11. Considering this month as the statistical
population of interest,
a. the mean number of units sold is

$$\mu = \frac{\sum x_i}{N} = \frac{84}{8} = 10.5 \text{ central a/c units}$$

b. the median value from Example 3 is

$$\tilde{\mu} = \frac{11 + 11}{2} = 11 \text{ central a/c units}$$

Dot plot: The mean and median are relatively close to each other.

[Dot plot of the eight values on a horizontal axis from 5 to 16: one dot at 5, two at 8, three at 11, one at 14, and one at 16]

➢ The mean and the median would both be considered good representatives of the
data set since they are located in the center of the distribution (where the points are).

➢ What if, instead of 16, the highest value is 160?

• Then the last point of the dot plot would be very far from the rest of the points (an extremely high
value); such a value is called an outlier.

Solution with the outlier, 160:

Array: 5, 8, 8, 11, 11, 11, 14, 160

Then:

$$\mu = \frac{\sum x_i}{N} = \frac{228}{8} = 28.5 \text{ central a/c units}$$

$$\tilde{\mu} = \frac{11 + 11}{2} = 11 \text{ central a/c units}$$

➢ The resulting value of the mean is not found at the center of where the points are
(28.5 is far from the majority of the points), while the median remains the same.

➢ The value of the mean is affected by extreme values in the distribution; hence, it
is a poor representative of the distribution when the shape is skewed. That is why one
condition for its use as a representative value is that the shape must be symmetric.

➢ On the other hand, the median has not changed, because only the middle value (if n is
odd) or the mean of the two middle values (if n is even) is used; the extreme value is not
used in determining the median. Therefore, the median is a better representative value if
the shape of the distribution is skewed.
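The effect of the outlier can be seen by computing both measures for the two versions of the data set. The sketch below uses Python's `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

original = [5, 8, 8, 11, 11, 11, 14, 16]
with_outlier = [5, 8, 8, 11, 11, 11, 14, 160]  # highest value replaced by 160

for label, data in (("original", original), ("with outlier", with_outlier)):
    # The mean uses every value; the median uses only the middle of the array
    print(f"{label}: mean = {mean(data)}, median = {median(data)}")
```

Only the mean moves (from 10.5 to 28.5); the median stays at 11 in both cases.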

MODE

• Value in the data set which has the highest frequency (occurs most often)
• Can be applied to any measurement level
• May not exist (the data set may not have a mode if all the values occur with the same frequency)
• May not be unique, if it exists (a data set may have more than one value with the same
highest frequency)
• Related to the concept of a peak or peaks in the frequency distribution
➢ Unimodal – one peak
➢ Bimodal – two peaks, etc.

➢ Population Mode: Mo
➢ Sample Mode: mo

Example 5: The eight salespeople described in Example 1 sold the following number of central air-
conditioning units: 8, 11, 5, 14, 8, 11, 16, and 11. Find the mode.

Mo =11 central air-conditioning units
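The properties of the mode listed above (it may not exist, and it may not be unique) can be sketched in Python. The helper `modes` is a hypothetical function written for this illustration, not a library routine; it follows the handout's convention that a data set has no mode when all values occur with the same frequency. The second and third data sets are made up.

```python
from collections import Counter

def modes(values):
    """Return the list of modes; an empty list means the mode does not exist."""
    counts = Counter(values)
    highest = max(counts.values())
    # All distinct values equally frequent: no mode (handout's convention)
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    # Otherwise report every value that reaches the highest frequency
    return sorted(v for v, c in counts.items() if c == highest)

print(modes([8, 11, 5, 14, 8, 11, 16, 11]))  # Example 5 data: one mode
print(modes([1, 1, 2, 2, 3]))                # hypothetical bimodal set
print(modes([4, 7, 9, 12]))                  # hypothetical set with no mode
```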

Example 6: The reaction times for a random sample of 9 subjects to a stimulant were recorded as 2.5,
3.6, 3.1, 4.3, 2.9, 2.3, 2.6, 4.1, and 3.4 seconds. Find the mode.

➔ Since all values occur only once (they have the same frequency), this distribution has
no mode; we say that the mode does not exist.
➔ This is different from saying that the mode is 0 (why?)

RELATIONSHIP BETWEEN THE MEAN AND THE MEDIAN:

Note that the shape of the distribution is important in choosing the most appropriate measure of central
tendency (and in other measures and tests as well). When there is no graph to base it on, the shape can
be determined by comparing the mean and median values:

a. symmetric distribution: mean = median

b. positively skewed distribution: mean > median
c. negatively skewed distribution: mean < median

Notes: 1. Since the mode does not always exist, it is just the mean and the median that are compared.
2. A positively skewed distribution indicates that the values mostly cluster on the lower half of
the distribution but there are few extremely high values. When the mean is computed, these
high values influence the value of the computed mean and pull its value away from the center
towards where the extremely high values are. On the other hand, the median is not affected
by extreme values, so it stays closer to where most of the values are. That is why, for a
positively skewed distribution, the median is a better representative value than the mean.
3. A negatively skewed distribution has the majority of the data clustering on the upper half of the
distribution, but there are a few extremely low values. For the same reason as in the positively
skewed distribution, the mean is pulled towards where the few extremely low values are.
The median is the better representative value compared to the mean.
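The comparison rule above can be written as a small Python function. This is a sketch only: the function name `shape` and the tolerance parameter `tol` are assumptions added for illustration (real data rarely give a mean exactly equal to the median), and the third data set is made up.

```python
from statistics import mean, median

def shape(values, tol=0.0):
    """Classify the shape of a distribution by comparing its mean and median."""
    m, md = mean(values), median(values)
    if abs(m - md) <= tol:
        return "symmetric"            # mean = median
    if m > md:
        return "positively skewed"    # mean > median
    return "negatively skewed"        # mean < median

print(shape([5, 8, 8, 11, 11, 11, 14, 160]))   # a few extremely high values
print(shape([3, 4, 5, 6, 8, 9, 10, 12, 15]))   # mean and median both equal 8
print(shape([1, 10, 11, 12, 12]))              # hypothetical set with a very low value
```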

READ: https://www.khanacademy.org/math/ap-statistics/quantitative-data-ap/describing-comparing-distributions/v/classifying-distributions

EXERCISES: Show complete solutions. For each item, identify the following needed information:
a. determine whether the data set constitutes a population or sample.
b. identify the variable of the problem (label this as X)

Example for #1:

a. sample (no mention of whether population or sample)
b. X : score in an achievement test in Mathematics (note: this is always stated in the
singular form)

1. The following are scores of 50 high school students in a 150-item achievement test in Mathematics.

112 107 97 69 72 115 81 102 91 76

73 73 86 76 92 95 106 80 81 141
126 124 127 118 128 84 75 98 113 119
82 83 134 132 104 68 95 106 115 98
92 92 100 96 108 100 119 106 94 85

a. Find the mean, median, and mode.

b. What is the shape of the distribution?

2. According to a survey, the average person spends 45 minutes a day listening to recorded music. The
following data were obtained for the number of minutes spent listening to recorded music for a
sample of 30 individuals.
88.3 4.3 4.6 7.0 9.2
0.0 99.2 34.9 81.7 0.0
85.4 0.0 17.5 45.0 53.3
29.1 28.8 0.0 98.9 64.5
4.4 67.9 94.2 7.6 56.6
52.9 145.6 70.4 65.1 63.6

a. Compute the mean.
Do these data appear to be consistent with the average reported by the survey? Explain
your answer.

b. Compute the median.

Between the mean and the median, which measure do you think is more appropriate to use
for this data set? Why?

3. During a 30-day period, the daily numbers of cars rented from a car rental company are as follows:

7 10 6 7 9 4 7 9 9 8
5 5 7 8 4 6 9 7 12 7
9 10 4 7 5 9 8 9 5 7

a. Find the mean, median, and mode.

b. If the break-even point for the company is 8 cars per day, is the company doing well? Explain.

4. Find the preferred measure of central location for the sample whose observations 18, 10, 11, 98, 22,
15, 11, 25, and 17 represent the number of automobiles sold during this past month by 9 different
automobile agencies. Justify your choice.

5. For a sample of 15 students at an elementary-school snack bar, the following sales amounts arranged
in ascending order of magnitude are observed: Php10, 10, 25, 25, 27, 30, 33, 35, 40, 43, 45, 45, 50, 55,
60.
a. Determine the mean, median, and mode for these sales amounts.

b. How would you describe the distribution from the standpoint of skewness?

6. The following table shows the percentage of defective items in an assembly department. Determine
the overall percentage defective of all items assembled during the sampled week.

Shift    Percentage defective    Number of items, in thousands

1        1.1                     210
2        1.5                     120
3        2.3                      50

7. The average IQ of 10 students in a mathematics course is 114. If 9 of the students have IQs of 101,
125, 118, 128, 106, 115, 99, 118, and 109, what must be the other IQ?

8. What is the average for a student who received grades of 85, 76, and 82 on 3 tests and a 79 on the
final examination in a certain course if the final examination counts three times as much as each of
the 3 tests?

INTRODUCTION TO VARIABILITY

Consider the following two sets, A and B (male and female), of the number of bottles of soft drink consumed in a week:

A 3 4 5 6 8 9 10 12 15
B 3 7 7 7 8 8 8 9 15

Fill the table with the needed information:

     n    $\bar{x}$    $\tilde{x}$
A
B

Describe the two sets with respect to the two measures: _______________________________________
_____________________________________________________________________________________

Remarks:
• The measures of central location do not give an adequate description of a distribution if the
purpose is to differentiate between two data sets (here, the two sets have the same mean
and median)

     n    $\bar{x}$      $\tilde{x}$
A    9    8 bottles    8 bottles
B    9    8 bottles    8 bottles

• The two measures do not describe how the observations spread out from the average

Consider the dot plot of the two sets (Set B above the line; Set A below):

[Dot plot on a horizontal axis from 3 to 15. Set B, above the line: one dot at 3, three each at 7 and 8, one at 9, and one at 15. Set A, below the line: one dot each at 3, 4, 5, 6, 8, 9, 10, 12, and 15]

• The dot plot shows that the points of B are more closely clustered about the center, while the
points of A are more scattered, yet the two sets have the same mean and median

• Therefore, there is a need to use a measure that will differentiate between the two distributions
in terms of how they are scattered/dispersed

MEASURES OF VARIATION

• Also called measures of dispersion or variability

• Numerically describe the degree of dispersion, scatter, or spread of scores in a distribution.

RANGE

• difference in value between the highest (maximum) and the lowest (minimum) observation
• can be computed very quickly
• but not very useful because it considers only the extremes
• does not take into consideration the bulk of the observations.
• The range is used when:
1. the data are too scant or too scattered to justify the computation of a more precise measure
of variability.
2. a knowledge of extreme scores or a total spread is all that is wanted.

Example: Find the range of the two sets given above.

$R_A$ = 15 – 3 = 12 bottles

$R_B$ = 15 – 3 = 12 bottles

➢ this example shows an instance wherein the range values are not able to differentiate between set
A and set B, although the dot plots present different “stories”
➢ there is a need to have a measure that will be able to truly distinguish between the two sets

STANDARD DEVIATION

• most important and most commonly used measure of variation, together with the mean as a
measure of central tendency
• a measure of variability that is based on the difference between the value of each observation ($x_i$)
and the mean
• difference between each $x_i$ and the mean is called a deviation about the mean

➢ deviation = $x_i$ – mean (the mean is $\mu$ if the data set constitutes a population; otherwise it is $\bar{x}$)

The standard deviation is used when:

• the statistic having the greatest stability is desired.
• the mean is the preferred measure of central tendency.

Definitional formula for the population standard deviation: $\sigma = \sqrt{\dfrac{\sum (x_i - \mu)^2}{N}}$

Definitional formula for the sample standard deviation: $s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n - 1}}$

Raw score formula for the population standard deviation: $\sigma = \sqrt{\dfrac{N \sum x_i^2 - (\sum x_i)^2}{N^2}}$

Raw score formula for the sample standard deviation: $s = \sqrt{\dfrac{n \sum x_i^2 - (\sum x_i)^2}{n(n - 1)}}$

Calculation of the Variance and Standard Deviation: Raw Score Method

Remark: It would be good for you to have a scientific calculator with an SD mode so that you only
have to learn how to key in the data. Your calculator will generate the values of the measures that you
would like to solve for. Since different models work differently, search for a YouTube tutorial for the
particular calculator model that you have.

Example:

Given the following sample data set (xi) where X : score in a quiz ( n = 10):

$x_i$        $x_i^2$
32 pts     (32 pts)(32 pts) = 1,024 pts²
71 pts     (71 pts)(71 pts) = 5,041 pts²
64 pts     (64 pts)(64 pts) = 4,096 pts²
50 pts     (50 pts)(50 pts) = 2,500 pts²
48 pts     (48 pts)(48 pts) = 2,304 pts²
63 pts     (63 pts)(63 pts) = 3,969 pts²
38 pts     (38 pts)(38 pts) = 1,444 pts²
41 pts     (41 pts)(41 pts) = 1,681 pts²
47 pts     (47 pts)(47 pts) = 2,209 pts²
52 pts     (52 pts)(52 pts) = 2,704 pts²
Sum of the column:  $\sum x_i = 506$ pts     $\sum x_i^2 = 26{,}972$ pts²

$$s^2 = \frac{n \sum x_i^2 - (\sum x_i)^2}{n(n - 1)} = \frac{10(26{,}972 \text{ pts}^2) - (506 \text{ pts})^2}{10(9)} = \frac{269{,}720 \text{ pts}^2 - 256{,}036 \text{ pts}^2}{90} = \frac{13{,}684 \text{ pts}^2}{90} = 152.04 \text{ pts}^2$$

➢ Since it makes no sense to have a measure in squared units of the original unit
of measurement (e.g., pts²), the unit has to be reverted to the original unit
(pts), which is done by extracting the square root of the variance.

Therefore, the standard deviation is $s = \sqrt{152.04 \text{ pts}^2} \approx 12.3$ pts
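For those who prefer to check the quiz-score example on a computer instead of a calculator, here is a minimal Python sketch. It computes the sample variance with both the raw score formula and the definitional formula to show that they agree; the variable names are illustrative.

```python
from math import sqrt

scores = [32, 71, 64, 50, 48, 63, 38, 41, 47, 52]  # quiz scores, n = 10
n = len(scores)

# Raw score method: only the two column sums are needed
sum_x = sum(scores)
sum_x2 = sum(x * x for x in scores)
var_raw = (n * sum_x2 - sum_x ** 2) / (n * (n - 1))

# Definitional method: squared deviations about the sample mean
x_bar = sum_x / n
var_def = sum((x - x_bar) ** 2 for x in scores) / (n - 1)

s = sqrt(var_raw)  # back to the original unit (pts)

print(f"sum x = {sum_x}, sum x^2 = {sum_x2}")
print(f"variance (raw score)    = {var_raw:.2f} pts^2")
print(f"variance (definitional) = {var_def:.2f} pts^2")
print(f"standard deviation      = {s:.1f} pts")
```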

Example: Solve for the standard deviations of the two sets of data (A and B) given in the Introduction to Variability.

            A                  B
        x       x²         x       x²
1       3        9         3        9
2       4       16         7       49
3       5       25         7       49
4       6       36         7       49
5       8       64         8       64
6       9       81         8       64
7      10      100         8       64
8      12      144         9       81
9      15      225        15      225
Sum    72      700        72      654

Set A: $\sum x_i = 72$, $\sum x_i^2 = 700$; Set B: $\sum x_i = 72$, $\sum x_i^2 = 654$

NOTE: If creating a table helps you, you may do so; otherwise, presenting the solution in terms
of summations (as below) without the table will suffice.

Set A: $\sum x = 72$, $\sum x^2 = 700$

$$s_A^2 = \frac{9(700) - (72)^2}{9(8)} = \frac{6300 - 5184}{72} = \frac{1116}{72} = 15.5$$

$$s_A = \sqrt{15.5} \approx 3.9 \text{ bottles}$$

Set B: $\sum x = 72$, $\sum x^2 = 654$

$$s_B^2 = \frac{9(654) - (72)^2}{9(8)} = \frac{5886 - 5184}{72} = \frac{702}{72} = 9.75$$

$$s_B = \sqrt{9.75} \approx 3.1 \text{ bottles}$$
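The comparison of sets A and B can be summarized in one short Python sketch using the standard `statistics` module (`stdev` uses the sample formula with n − 1 in the denominator). The variable names are illustrative.

```python
from statistics import mean, median, stdev

set_a = [3, 4, 5, 6, 8, 9, 10, 12, 15]
set_b = [3, 7, 7, 7, 8, 8, 8, 9, 15]

for name, data in (("A", set_a), ("B", set_b)):
    data_range = max(data) - min(data)   # highest minus lowest observation
    s = stdev(data)                      # sample standard deviation
    print(f"Set {name}: mean = {mean(data)}, median = {median(data)}, "
          f"range = {data_range}, s = {s:.1f}")
```

The mean, median, and range are identical for the two sets; only the standard deviation shows that A is more scattered than B.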

VARIANCE
• square of the standard deviation:
population variance: $\sigma^2$
sample variance: $s^2$
• of little use in descriptive statistics because its calculated value is expressed in square units of
measurement

WATCH:
Statistics Fundamentals: The Mean, Variance and Standard Deviation.
https://www.youtube.com/watch?v=SzZ6GpcfoQY

Range, variance and standard deviation as measures of dispersion | Khan Academy.

https://www.youtube.com/watch?v=E4HAYd0QnRc&list=TLPQMDMwNTIwMjC21B29Fxwxjg&index=1

Mean, variance, and standard deviation (raw data).
https://www.youtube.com/watch?v=75-DpMsd-7w

APPLICATIONS OF THE STANDARD DEVIATION

A. COEFFICIENT OF VARIATION

• a relative measure of variation (comparing one relative to the other)

• expresses the standard deviation as a percentage of the mean
• expressed in percent, it can be used to compare the variability of two or more distributions when
o observations are expressed in different units of measurement, or
o the data sets being compared have different means
• the greater the CV, the greater the variability
• formula:

$$CV = \frac{s}{\bar{x}} \times 100\%$$

Note: terms used interchangeably: more uniform, more homogeneous, more compact, less dispersed,
less scattered, less variable, less heterogeneous, less varied

Remark:
In the investing world, the coefficient of variation allows you to determine how much volatility (risk) you
are assuming in comparison to the amount of return you can expect from your investment. In simple
language, the lower the ratio of standard deviation to mean return, the better your risk-return tradeoff.

Example: Consider two investment proposals, A and B, with the following data (the values used in the computations below):

Proposal    Mean return    Standard deviation
A           $230           $107.70
B           $250           $208.57

The coefficient of variation for each proposal is:

For A: $107.70/$230 x 100% = 47%

For B: $208.57/$250 x 100% = 83%

➔ Therefore, because the coefficient of variation is a relative measure of risk, B is considered riskier
than A. Although B has a greater mean ($250) than A ($230), B is considered the riskier investment
since it is more volatile: within one standard deviation of the mean, your earnings with B range from
$41.43 to $458.57, while for A they range from $122.30 to $337.70 (the greater the CV, the more
variable the data of the group).

Example: The weights of 10 boxes of a certain brand of cereal have a mean content of 278.0 grams with
a standard deviation of 9.6 grams. If these boxes were purchased at 10 different stores and the mean
price per box is PhP64.50 with a standard deviation of PhP4.50, can you conclude that the weights are
relatively more homogeneous than the prices?
$$CV_w = \frac{9.6 \text{ grams}}{278.0 \text{ grams}} \times 100\% = 3.45\%$$

$$CV_p = \frac{\text{PhP}4.50}{\text{PhP}64.50} \times 100\% = 6.98\%$$

➔ Yes, the weights are relatively more homogeneous than the prices, because the CV for the
weights is less than the CV for the prices.
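The cereal example can be checked with a small Python sketch. The function name `coefficient_of_variation` is illustrative, not a library routine.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100

cv_weight = coefficient_of_variation(9.6, 278.0)   # grams
cv_price = coefficient_of_variation(4.50, 64.50)   # PhP

print(f"CV of weights = {cv_weight:.2f}%")
print(f"CV of prices  = {cv_price:.2f}%")
print("weights more homogeneous:", cv_weight < cv_price)
```

Because the units cancel in the ratio, grams and pesos can be compared directly through their CVs.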

B. STANDARD SCORE

• also called the z-score; transformed raw score

• a measure of relative position
• usually used to compare observations in two or more different distributions of raw scores which have
different means and/or different standard deviations.
• no unit of measurement
• the mean of standard scores is zero
• a positive standard score indicates that the transformed raw score is above or higher than the mean,
while a negative standard score shows that the given raw score is below or lower than the mean.
• formula for transforming a raw score to a standard score, represented by z, is

$$z = \frac{x - \bar{x}}{s}$$

Example: Ruben got a final grade of 85 in both English and Physics. The mean final grades of his class in
these two courses are 80 in English and 75 in Physics with standard deviations of 12 and 10, respectively.
In which subject was his academic performance better in relation to his class?

Subject    Ruben’s final grade (x)    Class Mean    Class Std. Dev.
English    85                         80            12
Physics    85                         75            10

$$z_E = \frac{85 - 80}{12} \approx 0.42 \qquad z_P = \frac{85 - 75}{10} = 1.00$$

⇒ Ruben had a better academic performance in Physics than in English in relation to
each of his classes
➢ it is wrong to compare the grades of the two subjects per se since they are not comparable
(how can English be directly compared to Physics?)
➢ rather, the student’s performance has to be rated in relation to the others in the same class
➢ the grades need to be standardized to remove the differences in class means and standard deviations

Example: Different typing skills are required for secretaries depending on whether one is working in a law
office, an accounting firm, or for a mathematical research group at a major university. In order to evaluate
candidates for these positions, an employment agency administers three distinct standardized typing
samples. A time penalty has been incorporated into the scoring of each sample based on the number of
typing errors. The mean and standard deviation for each test, together with the score achieved by a recent
applicant, are given in the following table. Determine which office this particular applicant should be
assigned to.

Sample        Applicant’s score ($x_i$)    Mean ($\bar{x}$)    Standard deviation (s)
Law           141 sec                    180 sec           30 sec
Accounting    7 min                      10 min            2 min
Scientific    33 min                     26 min            5 min

X: applicant’s score in terms of the time it takes to finish a particular manuscript
(the shorter the time, the better the performance)

$$z_L = \frac{141 \text{ sec} - 180 \text{ sec}}{30 \text{ sec}} = -1.30$$

$$z_A = \frac{7 \text{ min} - 10 \text{ min}}{2 \text{ min}} = -1.50$$

$$z_S = \frac{33 \text{ min} - 26 \text{ min}}{5 \text{ min}} = 1.40$$

➔ Since a secretary is supposed to type speedily and accurately, a lower z-score is desired. This
particular applicant should be assigned to the accounting firm.
➢ there is no need to convert to the same units, since the units in the numerator cancel with those
in the denominator; z has no unit of measurement
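Both standard-score examples above can be reproduced with a short Python sketch. The function name `z_score` and the dictionary layout are illustrative choices.

```python
def z_score(x, mean, std_dev):
    """Transform a raw score into a standard score (number of SDs from the mean)."""
    return (x - mean) / std_dev

# Ruben's grades: same raw grade, different class means and standard deviations
z_english = z_score(85, 80, 12)   # 5/12, about 0.42
z_physics = z_score(85, 75, 10)
print(f"English z = {z_english:.2f}, Physics z = {z_physics:.2f}")

# Typing samples: (applicant's time, mean, standard deviation); units cancel within each row
typing = {"Law": (141, 180, 30), "Accounting": (7, 10, 2), "Scientific": (33, 26, 5)}
z_typing = {office: z_score(*row) for office, row in typing.items()}

# Shorter time is better, so the lowest z-score marks the best relative performance
best_office = min(z_typing, key=z_typing.get)
print(z_typing, "->", best_office)
```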

C. PEARSONIAN SKEWNESS

• measure of relative asymmetry
• compares the shapes of two or more distributions, even when
o the means are different
o the standard deviations are different
• no unit of measurement

• Formula: $SK = \dfrac{3(\text{mean} - \text{median})}{\text{standard deviation}}$

➢ if SK > 0 => positively skewed
➢ if SK < 0 => negatively skewed
➢ if SK = 0 => symmetric

➢ Rule of thumb (Bulmer, 1979): If SK is

• less than −1 or greater than +1, the distribution is highly skewed.
• between −1 and −½ or between +½ and +1, the distribution is moderately skewed.
• between −½ and +½, the distribution is approximately symmetric.
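As an illustration of the formula, the sketch below computes the Pearsonian skewness for two data sets from earlier in this handout: the reaction times of Example 4 and set A from the introduction to variability. The function name `pearson_skewness` is illustrative, and the sample standard deviation (`statistics.stdev`) is assumed for both sets.

```python
from statistics import mean, median, stdev

def pearson_skewness(values):
    """Pearsonian skewness: 3(mean - median) / standard deviation (sample s assumed)."""
    return 3 * (mean(values) - median(values)) / stdev(values)

reaction_times = [2.5, 3.6, 3.1, 4.3, 2.9, 2.3, 2.6, 4.1, 3.4]  # mean 3.2, median 3.1
set_a = [3, 4, 5, 6, 8, 9, 10, 12, 15]                           # mean 8, median 8

sk_reaction = pearson_skewness(reaction_times)
sk_a = pearson_skewness(set_a)

print(f"SK (reaction times) = {sk_reaction:.2f}")
print(f"SK (set A)          = {sk_a:.2f}")
```

The reaction times give a small positive SK (inside the −½ to +½ band of the rule of thumb), and set A gives SK = 0 because its mean and median coincide.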

D. EMPIRICAL RULE

• When the data are believed to approximate a bell-shaped distribution, the empirical rule can be
used to determine the approximate percentage of data values that lie within a specified number of
standard deviations of the mean, that is,
o Approximately 68% of the data values will be within 1 standard deviation of the mean:
($\mu \pm 1\sigma$) = ($\mu - 1\sigma$, $\mu + 1\sigma$).

o Approximately 95% of the data values will be within 2 standard deviations of the mean:
($\mu \pm 2\sigma$) = ($\mu - 2\sigma$, $\mu + 2\sigma$).

o Approximately 99.7% of the data values will be within 3 standard deviations of the mean:
($\mu \pm 3\sigma$) = ($\mu - 3\sigma$, $\mu + 3\sigma$).

[Figure: bell-shaped (normal) curve with the percentages of area marked, including 0.15% in each tail beyond 3 standard deviations from the mean]

Remarks on the bell-shaped curve (also called the normal curve):

1. the horizontal axis can extend much lower than $\mu - 4\sigma$ and much higher than $\mu + 4\sigma$.
2. the total area under the curve and above the horizontal axis is 1 or 100%
3. since the curve is symmetric, the percentages between points on the x-axis that are equally distant
from the mean are equal (see the figure above)
4. 0.15% (on the left of the figure) is the area from $\mu - 3\sigma$ and below; 0.15% (on the right of
the figure) is the area from $\mu + 3\sigma$ and above.
Example: Liquid detergent cartons are filled automatically on a production line. Filling weights
frequently have a bell-shaped distribution. If the mean filling weight is 16.00 ounces and the standard
deviation is 0.25 ounces, use the empirical rule to draw conclusions about the distribution of filling
weights.

$\mu$ = 16.00 oz ; $\sigma$ = 0.25 oz

$\mu \pm 1\sigma$: 16.00 ± 0.25 ⇒ (16.00 - 0.25, 16.00 + 0.25) ⇒ (15.75, 16.25)
➢ Approximately 68% of the liquid detergent cartons have filling weights between
15.75 oz and 16.25 oz

$\mu \pm 2\sigma$: 16.00 ± (2)0.25 ⇒ 16.00 ± 0.50 ⇒ (15.50, 16.50)
➢ Approximately 95% of the liquid detergent cartons have filling weights
between 15.50 oz and 16.50 oz

$\mu \pm 3\sigma$: 16.00 ± (3)0.25 ⇒ 16.00 ± 0.75 ⇒ (15.25, 16.75)
➢ Approximately 99.7% of the liquid detergent cartons have filling weights
between 15.25 oz and 16.75 oz
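The three intervals of the detergent example can be generated with a short Python sketch. The function name `empirical_rule_intervals` is illustrative; the percentages are the approximate values stated by the empirical rule and apply only to bell-shaped data.

```python
def empirical_rule_intervals(mu, sigma):
    """Return {k: (lower, upper, approx. percent)} for k = 1, 2, 3 standard deviations."""
    coverage = {1: 68.0, 2: 95.0, 3: 99.7}  # approximate percentages for bell-shaped data
    return {k: (mu - k * sigma, mu + k * sigma, pct) for k, pct in coverage.items()}

intervals = empirical_rule_intervals(mu=16.00, sigma=0.25)

for k, (lower, upper, pct) in intervals.items():
    print(f"about {pct}% of cartons weigh between {lower:.2f} oz and {upper:.2f} oz "
          f"(mean +/- {k} SD)")
```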

EXERCISES

1. A goal of management is to help their company earn as much as possible relative to the capital
invested. One measure of success is return on equity – the ratio of net income to stockholder’s
equity. Shown here are return on equity percentages for 25 companies. Find the range, variance,
and standard deviation.
9.0 19.6 22.9 41.6 11.4
15.8 52.7 17.3 12.3 5.1
17.3 31.1 9.6 8.6 11.2
12.8 12.2 14.5 9.2 16.6
5.0 30.3 14.7 19.2 6.2

2. During a 30-day period, the daily numbers of cars rented from a car rental company are as follows:
7 10 6 7 9 4 7 9 9 8
5 5 7 8 4 6 9 7 12 7
9 10 4 7 5 9 8 9 5 7
Find the range, variance, and standard deviation.

3. A manufacturing firm regularly places orders with two different suppliers, A and B. The following
data are the number of days required to fill orders for these suppliers.
Supplier A: 11 10 9 10 11 11 10 11 10 10
Supplier B: 8 10 13 7 10 11 10 7 15 12
Determine which supplier provides the more consistent and reliable delivery times. Use the
range and standard deviation. Since you are comparing the two, why just use the standard
deviation and not compute for the coefficient of variation?

4. A production department uses a sampling procedure to test the quality of newly produced items.
The department employs the following decision rule at an inspection station: If a sample of 14
items has a variance of more than .005, the production line must be shut down for repairs.
Suppose the following data have been collected:
3.43 3.45 3.43 3.48 3.52 3.50 3.39
3.48 3.41 3.38 3.49 3.45 3.51 3.50
Should the production line be shut down? Why or why not?

5. Two friends want to take a summer holiday before going to college in the autumn. They are looking
for somewhere with plenty of clubs where they can party all night. Unfortunately, they have left it
rather late to book and there are only two resorts, Medlena and Bistry, available within their
budget. When they ask about the ages of the holiday-makers at these resorts, their travel agent
says the only thing he can tell them is that the mean age of people going to Medlena is 19,
whereas the mean age of visitors to Bistry is 22. Just as they are about to book holidays in Medlena
because it seems to attract the sort of young crowd they want to be with, the travel agent says,
‘I’ve got some more figures: the standard deviation of the ages of visitors to Medlena is 8 and the
standard deviation of the ages of visitors to Bistry is 2.’ Should they change their minds on the
basis of this new information, and if so, why?

6. Many national academic achievement and aptitude tests, such as the SAT, report standardized
test scores with the mean for the normative group used to establish scoring standards converted
to 500 with a standard deviation of 100. Suppose that the distribution of scores for such a test is
known to be approximately normally distributed. Determine the approximate percentage of
reported scores that would be
a. between 400 and 600
b. between 500 and 700
c. greater than 700
d. less than 200
Hint: Draw the bell-shaped curve and substitute the values of μ and σ on the horizontal axis.
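
The approximate percentages in item 6 come from the bell-shaped curve with mean 500 and standard deviation 100. As a cross-check, the sketch below computes the corresponding areas from an exact normal model using NormalDist from Python's standard library (Python 3.8 or later); the variable names are illustrative only.

```python
from statistics import NormalDist

# Normal model for the reported scores: mean 500, standard deviation 100.
scores = NormalDist(mu=500, sigma=100)


def pct(probability):
    """Convert a probability to a percentage rounded to two decimals."""
    return round(100 * probability, 2)


between_400_600 = pct(scores.cdf(600) - scores.cdf(400))  # within 1 sd of the mean
between_500_700 = pct(scores.cdf(700) - scores.cdf(500))  # mean up to 2 sd above
above_700 = pct(1 - scores.cdf(700))                      # more than 2 sd above
below_200 = pct(scores.cdf(200))                          # more than 3 sd below

print(between_400_600, between_500_700, above_700, below_200)
```

The exact areas are about 68.3%, 47.7%, 2.3%, and 0.13%, which agree closely with the rounded figures obtained from the bell-shaped curve (68%, 47.5%, 2.5%, and 0.15%).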

7. An SAT test taker (refer to #6) got a score of 625. What is his standard score?

8. The same student (in #7) got the same score (625) on a different test, which has a mean of 450
and a standard deviation of 150. On which test did this student fare better?
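
Items 7 and 8 both rely on the standard score, z = (x - mean) / standard deviation. A minimal sketch of that calculation follows; the function name z_score is a hypothetical helper, not something defined in the handout.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


# Item 7: score of 625 on the test with mean 500 and standard deviation 100.
z_sat = z_score(625, 500, 100)

# Item 8: the same raw score on a test with mean 450 and standard deviation 150.
z_other = z_score(625, 450, 150)

print(f"z on the first test  = {z_sat:.2f}")
print(f"z on the second test = {z_other:.2f}")
```

The larger z-score marks the test on which the same raw score of 625 stands further above its group mean, which is the basis for the comparison asked for in item 8.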

