Statistics and Data Analysis for Nursing Research
Second Edition

Chapter 3
Central Tendency, Variability, and Relative Standing

Statistics and Data Analysis for Nursing Research, Second Edition, by Denise F. Polit
Copyright ©2010 by Pearson Education, Inc., Upper Saddle River, New Jersey 07458. All rights reserved.
Characteristics of a Data Distribution

Shape (Chapter 2)
Central tendency
Variability
Both central tendency and variability can be
expressed by indexes that are descriptive
statistics

Central Tendency
Indexes of central tendency provide a single
number to characterize a distribution

Measures of central tendency come from the
center of the distribution of data values,
indicating what is “typical,” and where data
values tend to cluster

Popularly called an “average”

Central Tendency Indexes

Three alternative indexes:

The mode
The median
The mean

The Mode
The mode is the score value with the highest frequency; the most “popular” score
Age: 26 27 27 28 29 30 31
Mode = 27
[Figure: histogram of AGE titled “The mode” (Mean = 28.3, Std. Dev = 1.80, N = 7)]

The Mode: Advantages

Can be used with data measured on any
measurement level (including nominal level)
Easy to “compute”
Reflects an actual value in the distribution,
so it is easy to understand

The Mode: Disadvantages

Ignores most information in the distribution

Tends to be unstable (i.e., value varies a lot
from one sample to the next)

Some distributions may not have a mode (e.g.,
10, 10, 11, 11, 12, 12)
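
As a rough illustration of the mode and of the tied case above, the sketch below uses `multimode` from Python's standard `statistics` module (Python 3.8 or later is assumed). The variable names are illustrative only. For the tied data, every value is returned, which is another way of seeing that no single value stands out as the mode.

```python
from statistics import multimode

ages = [26, 27, 27, 28, 29, 30, 31]   # ages from the earlier slide
tied = [10, 10, 11, 11, 12, 12]       # example in which every value ties

age_modes = multimode(ages)    # all values sharing the highest frequency
tied_modes = multimode(tied)   # every value appears twice, so all are returned

print(age_modes)    # [27]
print(tied_modes)   # [10, 11, 12]
```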

The Median
The median is the score that divides the distribution into two equal halves
50% are below the median, 50% above
Age: 26 27 27 28 29 30 31
Median (Mdn) = 28
[Figure: histogram of AGE titled “The median” (Mean = 28.3, Std. Dev = 1.80, N = 7)]

The Median: Advantages
Not influenced by outliers

Particularly good index of what is “typical”
when distribution is skewed

Easy to “compute”

Appropriate when data are ordinal level
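
The first advantage can be checked with a small sketch. The second list is a hypothetical variant of the age data in which the oldest person's age is replaced by 81; it is not from the textbook. The median stays put while the mean shifts.

```python
from statistics import mean, median

ages = [26, 27, 27, 28, 29, 30, 31]               # ages from the earlier slide
ages_with_outlier = [26, 27, 27, 28, 29, 30, 81]  # hypothetical: 31 replaced by 81

median_plain = median(ages)
median_outlier = median(ages_with_outlier)
mean_plain = round(mean(ages), 1)
mean_outlier = round(mean(ages_with_outlier), 1)

print(median_plain, median_outlier)  # 28 28
print(mean_plain, mean_outlier)      # 28.3 35.4
```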

The Median: Disadvantages

Does not take actual data values into
account; it is only an index of position

Value of median not necessarily an actual
data value, so it is more difficult to
understand than mode

The Mean
The mean is the arithmetic average
Data values are summed and divided by N
Age: 26 27 27 28 29 30 31
Mean = 28.3
[Figure: histogram of AGE titled “The mean” (Mean = 28.3, Std. Dev = 1.80, N = 7)]

The Mean (cont’d)

Most frequently used measure of central
tendency; usually preferred for interval- and
ratio-level data
Equation:
M = ΣX ÷ N
Where:
M = sample mean
Σ = the sum of
X = actual data values
N = number of people
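
Applied to the seven ages used on the earlier slides, the equation works out as in this minimal sketch (variable names are illustrative):

```python
ages = [26, 27, 27, 28, 29, 30, 31]  # X: the actual data values

total = sum(ages)   # ΣX
n = len(ages)       # N
m = total / n       # M = ΣX ÷ N

print(total, n, round(m, 1))  # 198 7 28.3
```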

The Mean: Advantages

The balance point in the distribution:
Sum of deviations above the mean always
exactly balances those below it

Does not ignore any information

The most stable index of central tendency


Many inferential statistics are based on the
mean

The Mean: Disadvantages

Sensitive to outliers

Gives a distorted view of what is “typical”
when data are skewed

Value of mean is often not an actual data
value

The Mean: Symbols

Sample means:
In reports, usually symbolized as M
In statistical formulas, usually symbolized as
x̄ (pronounced “X bar”)
Population means:
The Greek letter μ (mu)

Central Tendency in Normal
Distributions

In a normal
distribution, all
three indexes
coincide

Central Tendency in Skewed
Distributions

In a skewed distribution, the mean is pulled
“off center” in the direction of the skew
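
A small sketch of this pull, using hypothetical lengths of hospital stay (the data are invented for illustration and do not come from the textbook). One long stay creates a positive skew, and the mean moves toward it while the mode and median stay near the bulk of the data.

```python
from statistics import mean, median, mode

# Hypothetical lengths of stay in days; the value 28 creates a positive skew
stays = [2, 2, 3, 3, 3, 4, 4, 5, 6, 28]

stay_mode = mode(stays)
stay_median = median(stays)
stay_mean = mean(stays)

print(stay_mode, stay_median, stay_mean)  # 3 3.5 6
```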

Variability

Variability concerns how spread out or
dispersed data values in a distribution are

Two distributions with the same mean could
have different dispersion

Variability (cont’d)

High variability: A
heterogeneous
distribution (A)

Low variability: A
homogeneous
distribution (B)

Indexes of Variability

Range

Interquartile range

Standard deviation

Variance

The Range

Range: The difference between the highest
and lowest value in the distribution
Weights (pounds):

110 120 130 140 150 150 160 170 180 190

The range here is 80 (190 – 110)
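
The same arithmetic as a short sketch (the variable names are illustrative):

```python
weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]  # pounds

value_range = max(weights) - min(weights)  # highest minus lowest value

print(value_range)  # 80
```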

The Range: Advantages

Easy to compute

Readily understood

Communicates information of interest to
readers of a report

The Range: Disadvantages

Depends on only two scores; does not take all
information into account

Sensitive to outliers

Tends to be unstable; fluctuates from sample
to sample

Influenced by sample size

The Interquartile Range

Interquartile range (IQR): Based on quartiles
Lower quartile (Q1): Point below which 25% of
scores lie
Upper quartile (Q3): Point below which 75% of
scores lie
IQR = Q3 - Q1
IQR is the range of scores within which the middle
50% of scores lie

The Interquartile Range (cont’d)

IQR Example: Weights (pounds):

110 120 130 140 150 150 160 170 180 190
The IQR is 45.0 (172.5 – 127.5)
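
Quartiles can be computed under several conventions, and different software may give slightly different values. As a sketch, the `"exclusive"` method of `statistics.quantiles` in Python's standard library (Python 3.8 or later assumed) happens to reproduce the figures on this slide; whether it matches the textbook's method in general is an assumption.

```python
from statistics import quantiles

weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]  # pounds

# n=4 cuts the distribution into quarters and returns Q1, Q2 (median), Q3
q1, q2, q3 = quantiles(weights, n=4, method="exclusive")
iqr = q3 - q1

print(q1, q3, iqr)  # 127.5 172.5 45.0
```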

The Interquartile Range: Advantages

Reduces influence of outliers and extreme
scores in expressing variability
Uses more information than the range
Important in evaluating outliers
Appropriate as index of variability with
ordinal measures

The Interquartile Range: Disadvantages

Is not particularly easy to compute

Is not well understood

Does not take all values into account

The Standard Deviation

Standard deviation (SD): An index that conveys
how much, on average, scores in a distribution
vary

SDs are based on deviation scores (x),
calculated by subtracting the mean from each
person’s original score
x = X - M

The Standard Deviation (cont’d)

Standard deviation example: Weights (pounds):

110 120 130 140 150 150 160 170 180 190

In this distribution, M = 150
For the first person, x = -40
For the last person, x = +40

Standard Deviation (cont’d)

The sum of all deviation scores in a
distribution always = 0

Thus, to compute SDs, deviation scores
must be squared (x²) before being summed

SD equation:
SD = √(Σx² ÷ (N - 1))
Standard Deviation (cont’d)

Weights (pounds):

110 120 130 140 150 150 160 170 180 190
Deviation scores (x) for M = 150:
-40 -30 -20 -10 0 0 10 20 30 40

Squared deviation scores (x²):
1600 900 400 100 0 0 100 400 900 1600

Sum of squared deviation scores:
1600+900+400+100+0+0+100+400+900+1600 = 6000

SD = √(6000 ÷ (N - 1))
SD = √(6000 ÷ 9) = 25.82
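
The worked example above can be retraced step by step. This is a minimal sketch with illustrative variable names; it also compares the hand calculation with `statistics.stdev`, which likewise divides by N - 1.

```python
from math import sqrt
from statistics import mean, stdev

weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]  # pounds

m = mean(weights)                              # M = 150
deviations = [x - m for x in weights]          # x = X - M
sum_of_deviations = sum(deviations)            # always 0
sum_sq = sum(d ** 2 for d in deviations)       # Σx² = 6000
sd = sqrt(sum_sq / (len(weights) - 1))         # √(Σx² ÷ (N - 1))

print(sum_of_deviations, sum_sq, round(sd, 2))  # 0 6000 25.82
print(round(stdev(weights), 2))                 # 25.82
```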
Standard Deviation Interpretation

Provides a “standard”: the SD indicates the
average amount of deviation of scores from
the mean
Tells you how wrong, on average, the mean
is as a summary of the overall distribution
An SD provides valuable information when
the distribution is normal:
There are approximately three SDs above and
below the mean in a normal distribution
Standard Deviation Interpretation
(cont’d)

In a normal distribution, a fixed percentage
of cases lie within certain distances from the
mean:

SDs and Individual Scores

A person who scores one SD below the mean
has a higher score than 16% of the cases
(2.3% + 13.6%)
A person who scores one SD above the mean
has a higher score than 84% of the cases
(50.0% + 34.1%)
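
These percentages follow from the standard normal curve. As a sketch, `NormalDist` from Python's standard `statistics` module (Python 3.8 or later assumed) gives the proportion of cases below a given number of SDs, and the proportion within 1, 2, and 3 SDs of the mean. Variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1

below_minus_1 = z.cdf(-1.0)  # proportion scoring below one SD under the mean
below_plus_1 = z.cdf(1.0)    # proportion scoring below one SD over the mean

# Proportion of cases within k SDs on either side of the mean
within = {k: z.cdf(k) - z.cdf(-k) for k in (1, 2, 3)}

print(round(below_minus_1 * 100), round(below_plus_1 * 100))  # 16 84
print({k: round(p, 3) for k, p in within.items()})
```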

Standard Deviation: Advantages

Takes all data into account in describing
variability
Is more stable as a measure of variability than
the range or IQR
Lends itself to computation of other measures
often used in inferential statistics
Is helpful in interpreting individual scores
when data are distributed approximately
normally
Standard Deviation: Disadvantages

Can be influenced by extreme scores

Not as “intuitive” or as easy to interpret as
the range

Variance

An important variability concept in inferential
statistics, but not used descriptively
The variance = SD²
In earlier example, SD² = 25.82² = 666.67
Not easily interpreted because it is not in
units of original data; it is in units squared
(here, pounds squared)
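
A quick check of the relationship between the two indexes, using the weights from the earlier example. This is a sketch only; `statistics.variance` uses the same N - 1 denominator as the SD equation.

```python
from statistics import stdev, variance

weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]  # pounds

var = variance(weights)   # Σx² ÷ (N - 1) = 6000 ÷ 9, in pounds squared
sd = stdev(weights)       # square root of the variance, in pounds

print(round(var, 2), round(sd, 2))  # 666.67 25.82
```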

Measurement Scales and Descriptive
Statistics

Relative Standing

Central tendency and variability indexes
describe a distribution
There are also descriptive statistics to
describe individual scores (i.e., their relative
standing or position in a distribution):
Percentile ranks
Standard scores

Percentiles

A percentile is one one-hundredth of a
distribution
Quartiles divide a distribution into quarters
Deciles divide a distribution into tenths
Each percentile, quartile, etc. can be
determined in relation to a score in a
distribution

Percentile Rank

A percentile rank is the location of a given
score in the distribution; it communicates
what percentage of cases fall at or below
that value

Score → What percentile rank?
Percentile → What score?
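
The “at or below” definition translates directly into a count. The helper below, `percentile_rank`, is a hypothetical name introduced for illustration, and the weights are reused from the earlier variability examples.

```python
def percentile_rank(scores, value):
    """Percentage of cases that fall at or below the given value."""
    at_or_below = sum(1 for s in scores if s <= value)
    return 100 * at_or_below / len(scores)

weights = [110, 120, 130, 140, 150, 150, 160, 170, 180, 190]  # pounds

rank_150 = percentile_rank(weights, 150)  # 6 of 10 cases are at or below 150
rank_190 = percentile_rank(weights, 190)  # every case is at or below the maximum

print(rank_150, rank_190)  # 60.0 100.0
```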

Percentiles and Outliers

Outliers are often defined in relation to
percentiles

There are:
Mild outliers
Extreme outliers

Outliers: Formal Definition

A mild outlier is a score that lies between 1.5
and 3.0 times the IQR below Q1 or above Q3

An extreme outlier is a score that lies more
than 3.0 times the IQR below Q1 or above Q3

Box Plots

A box plot (or box-and-whiskers plot) is a
graphic depiction of a distribution that
shows the median, the IQR, and the outer
limits of values not considered outliers
Outlying cases can be shown on the box plot,
with identifying information (e.g., an ID
number)

Box Plots (cont’d)

Bottom of “box” shows Q1
Top of “box” shows Q3
Horizontal line in box shows median
“Whiskers” show outer limits of what is
NOT an outlier
In SPSS, a circle O indicates value and ID of a
mild outlier
An asterisk * is for an extreme outlier

Box Plot Illustration

Textbook Heart Rate Data:
Q1 = 62
Q2 = 66 = Median
Q3 = 68
“Whiskers” limits: 53, 77
Mild outliers:
50 (#106), 45 (#105)
Extreme outliers:
40 (#104), 90 (#103),
95 (#102), 100 (#101)
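
The whisker limits and the outlier labels in this illustration can be reproduced from Q1 and Q3 alone. In the sketch below, `classify` is a hypothetical helper, and treating a score that falls exactly on a limit as a non-outlier is an assumption rather than something the slides specify.

```python
q1, q3 = 62, 68        # quartiles from the heart rate illustration
iqr = q3 - q1          # 6

mild_low, mild_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr        # whisker limits
extreme_low, extreme_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr  # extreme limits

def classify(score):
    """Label a score as an extreme outlier, a mild outlier, or neither."""
    if score < extreme_low or score > extreme_high:
        return "extreme"
    if score < mild_low or score > mild_high:
        return "mild"
    return "not an outlier"

print(mild_low, mild_high)  # 53.0 77.0
for score in (50, 45, 40, 90, 95, 100):
    print(score, classify(score))
```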

Box Plots Versus Histograms

Outliers can be seen in histograms, but box
plots give more useful information about
degree of extremity and ID numbers

Standard Scores

Standard scores: another index of relative
standing helpful in interpreting raw scores

A standard score (also called a z score) is a
score expressed in standard deviation units,
in relative distance from the mean

Standard Scores (cont’d)

Standard score equation:
z = (X – M) ÷ SD

That is, the mean is subtracted from an
individual score, then divided by the SD

For example:
M = 100, SD = 25, X = 125, z = 1.0
M = 100, SD = 25, X = 50, z = -2.0
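
The two examples can be verified with a one-line function. `z_score` is an illustrative name, not something defined in the source.

```python
def z_score(x, m, sd):
    """Standard score: distance from the mean in SD units."""
    return (x - m) / sd

z_high = z_score(125, m=100, sd=25)
z_low = z_score(50, m=100, sd=25)

print(z_high, z_low)  # 1.0 -2.0
```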
Standard Scores (cont’d)

Standard scores have a mean of 0.0 and an SD of
1.0

But z scores can be transformed mathematically to
have any mean and SD
Most typical:
Mean = 500, SD = 100 (e.g., GRE, SAT)
Mean = 100, SD = 15 (e.g., IQ tests)
Mean = 50, SD = 10 (called T scores)
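
The transformation is a linear rescaling: multiply the z score by the new SD and add the new mean. The sketch below applies it to the first two scales listed; `rescale` is a hypothetical helper name.

```python
def rescale(z, new_mean, new_sd):
    """Convert a z score to a scale with the chosen mean and SD."""
    return new_mean + z * new_sd

gre_style = rescale(1.0, new_mean=500, new_sd=100)  # one SD above the mean
iq_style = rescale(-2.0, new_mean=100, new_sd=15)   # two SDs below the mean

print(gre_style, iq_style)  # 600.0 70.0
```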
Uses of Descriptive Statistics

Indexes of central tendency and variability
are used to:
Understand data, get a “big picture”
Evaluate outliers and need for strategies to
address problems (e.g., using a trimmed mean
that recalculates the mean after deleting a fixed
percentage, such as 5%, from either end)
Describe research participants (e.g., their age,
education, length of illness)
Answer descriptive questions
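
The trimmed mean mentioned above can be sketched as follows. The function name `trimmed_mean`, the rounding-down rule for how many cases to drop, and the heart rate values are all assumptions made for illustration; statistical packages may handle fractional case counts differently.

```python
from statistics import mean

def trimmed_mean(values, percent=5):
    """Mean after dropping `percent`% of cases from each end (rounded down)."""
    ordered = sorted(values)
    k = len(ordered) * percent // 100        # number of cases to drop per end
    return mean(ordered[k:len(ordered) - k])

# Hypothetical heart rates (20 cases) with one high outlier
heart_rates = [58, 60, 61, 62, 62, 63, 64, 65, 65, 66,
               66, 66, 67, 67, 68, 68, 69, 70, 72, 100]

plain = round(mean(heart_rates), 2)
trimmed = round(trimmed_mean(heart_rates), 2)  # drops 58 and 100

print(plain, trimmed)  # 66.95 65.61
```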

Descriptive Statistics in SPSS

Can be obtained through Analyze →
Descriptive Statistics and are obtained in
three programs within that broad umbrella
(each has slightly different options):

Frequencies → Statistics
Descriptives → Options
Explore → Statistics

Descriptive Statistics in SPSS
Frequencies

Percentile values
Central tendency
Dispersion (variability)
Skewness and Kurtosis

Descriptive Statistics in SPSS
Descriptives

Mean (no median)
Dispersion (variability)
Skewness and Kurtosis
No percentiles
BUT has good display options

Descriptive Statistics in SPSS
Descriptives (cont’d)

Another important
feature: The ability
to create standard
scores

Descriptive Statistics in SPSS Explore

Can request both
statistical description
and graphical
description (plots)
Select options with
pushbuttons

Descriptive Statistics in SPSS Explore
(cont’d)

Statistical options:
Full descriptive
statistics
Outliers
Percentiles

Descriptive Statistics in SPSS Explore
(cont’d)

Important graphic
option: Box-and-whiskers
plots
