Descriptive Statistics Part 1

The document provides an overview of descriptive statistics, focusing on measures of central tendency (mean, median, mode) and measures of variability (range, variance, standard deviation, coefficient of variation). It explains how to organize and summarize raw data to convey meaningful information. The document includes examples and properties of each statistical measure discussed.


DESCRIPTIVE STATISTICS

PART 1
Abdulwali Sabo Abdulrahman (MSc Medical Statistics)
Department of Community Medicine
OUTLINE

1. INTRODUCTION
2. MEASURES OF CENTRAL TENDENCY
3. MEASURES OF VARIABILITY

1. INTRODUCTION

• When measurements of a random variable are taken on the entities of a population or sample, the resulting values are made available to the researcher or statistician as a mass of unordered data.

• Measurements that have not been organized, summarized, or otherwise manipulated are called raw data.

• Unless the number of observations is extremely small, it is unlikely that these raw data will impart much information until they have been put into some kind of order.
2. MEASURES OF CENTRAL TENDENCY

• Measures of central tendency convey information regarding the average value of a set of values. They give us information about the center of the distribution.

• The three most commonly used measures of central tendency are the mean, the median, and the mode.

• These measures give us the ability to summarize the data by means of a single number called a descriptive measure.
2.1 The Mean
• The most familiar measure of central tendency is the mean. It is the descriptive measure most people have in mind when they speak of the “average.”
• The mean is obtained by adding all the values in a population or sample and dividing by the number of values that are added.

• For example: Porcellini et al. studied 13 HIV-positive patients who were treated with highly active antiretroviral therapy (HAART) for at least 6 months. The CD4 T cell counts (× 10^6/L) at baseline for the 13 subjects are listed below.
• 230, 205, 313, 207, 227, 245, 173, 58, 103, 181, 105, 301, 169.
2.1 Properties of the Mean

1. Uniqueness: For a given set of data, there is one and only one mean.

2. Simplicity: The mean is easily understood and easy to compute.

3. It is affected by extreme values.


2.2 The Median

• The median is the value that divides the sorted data set into two equal parts, such that the number of values equal to or greater than the median is equal to the number of values equal to or less than the median.
• If the number of values is odd, the median is the middle value when all values have been arranged in order of magnitude. When the number of values is even, there is no single middle value; instead there are two middle values, and the median is taken to be their mean.
• Median = the value at position (n + 1)/2 in the ordered data. When n is even, this position falls halfway between the two middle observations, which are then averaged.
• For example, find the median age of the participants presented below: 59, 61, 50, 57, 64, 65, 66, 38, 43, 57.
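The median of the ten ages can be worked through as follows. This is a minimal Python sketch; the function name `median` and its layout are illustrative assumptions, not part of the original slides.

```python
ages = [59, 61, 50, 57, 64, 65, 66, 38, 43, 57]

def median(values):
    """Return the middle value, or the mean of the two middle values."""
    ordered = sorted(values)      # arrange in order of magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd n: a single middle value exists
        return ordered[mid]
    # even n: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2

print(sorted(ages))
print(median(ages))
```

Here n = 10, so (n + 1)/2 = 5.5 and the median lies between the 5th and 6th ordered values, 57 and 59, giving a median age of 58.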
2.2 Properties of the median

1. Uniqueness. As is true with the mean, there is only one median for a given set of data.

2. Simplicity. The median is easy to calculate.

3. It is not as drastically affected by extreme values as is the mean.
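The different sensitivity of the mean and the median to extreme values can be sketched with the age data used in these slides. The added value of 200 is a hypothetical extreme observation (for instance a recording error) introduced only for this illustration; the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median

ages = [59, 61, 50, 57, 64, 65, 66, 38, 43, 57]

# Hypothetical extreme value, added only to illustrate the property.
ages_with_extreme = ages + [200]

print(mean(ages), median(ages))
print(mean(ages_with_extreme), median(ages_with_extreme))
```

With the original ten ages the mean is 56 and the median is 58. After the extreme value is added, the mean rises to about 69.1 while the median moves only to 59.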


2.3 The Mode
• The mode of a set of values is that value that occurs most frequently. If all the values are different there is no mode; on the other hand, a set of values may have two modes (bimodal). It’s also possible to find data sets with more than two modes (multimodal).
• The mode may be used for describing qualitative data.
• For example, suppose the patients seen in a mental health clinic during a given year received one of the following diagnoses: mental retardation, organic brain syndrome, psychosis, neurosis, and personality disorder. The diagnosis occurring most frequently in the group of patients would be called the modal diagnosis.
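The definition of the mode can be sketched in a few lines of Python. The helper `modes` is an illustrative name, and the list of diagnoses is hypothetical data invented for this example; only the ages come from the slides.

```python
from collections import Counter

def modes(values):
    """Return every most frequent value; an empty list if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1 and len(counts) > 1:
        return []  # all values are different: no mode
    return [value for value, count in counts.items() if count == highest]

ages = [59, 61, 50, 57, 64, 65, 66, 38, 43, 57]

# Hypothetical diagnoses for six patients (not from the slides).
diagnoses = ["psychosis", "neurosis", "psychosis",
             "personality disorder", "neurosis", "psychosis"]

print(modes(ages))             # 57 is the only repeated age
print(modes(diagnoses))        # modal diagnosis in the hypothetical list
print(modes([1, 1, 2, 2, 3]))  # two modes: a bimodal set
print(modes([1, 2, 3]))        # all different: no mode
```

Returning a list keeps the bimodal and multimodal cases explicit, and the same function works for qualitative data such as diagnoses.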
3. MEASURES OF VARIABILITY
• A measure of variability conveys information regarding the amount of dispersion present in a set of data.
• If all the values are the same, there is no dispersion; if they are not all the same, dispersion is present in the data. The amount of dispersion may be small when the values, though different, are close together.
• The figure below shows the frequency polygons for two populations that have equal means but different amounts of variability. Population B, which is more variable than population A, is more spread out. If the values are widely scattered, the dispersion is greater.
• Other terms used synonymously with dispersion include variation, spread, and scatter.
[Figure: Two frequency distributions with equal means but different amounts of dispersion.]
3.1 The Range
• The range is the difference between the largest and smallest values in a set of observations.
• For example, we wish to compute the range of the ages of the subjects in the sample below:
59, 61, 50, 57, 64, 65, 66, 38, 43, 57.
The oldest of the sample = 66.
The youngest of the sample = 38.
Range = 66 – 38 = 28.
The usefulness of the range is limited. The fact that it takes into account only two values causes it to be a poor measure of dispersion.
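The range calculation, and the reason it is a weak measure, can be sketched as follows. The second sample, `clustered`, is hypothetical data made up for this illustration; it shares the same extremes as the ages above but is otherwise far less scattered.

```python
ages = [59, 61, 50, 57, 64, 65, 66, 38, 43, 57]

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

# Hypothetical second sample (not from the slides): same extremes,
# but every other value is identical.
clustered = [38, 52, 52, 52, 52, 52, 52, 52, 52, 66]

print(value_range(ages))       # 66 - 38
print(value_range(clustered))  # same range despite the different spread
```

Both samples have a range of 28, even though their values are distributed very differently between the two extremes.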
3.2 The Variance

• The variance is a measure of dispersion relative to the scatter of the values about their mean.
• In computing the variance of a sample of values, we subtract the mean from each of the values, square the resulting differences, and then add up the squared differences. This sum of squared differences is divided by the sample size minus 1 to obtain the sample variance.
• Letting $s^2$ stand for the sample variance, the procedure may be written in notational form as follows:

$$ s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1} $$
• The reason for dividing by n – 1 rather than n, as we might have expected, is the theoretical consideration referred to as degrees of freedom.

• Example: Compute the variance of the ages of the subjects below.

• 43, 66, 61, 64, 65, 38, 59, 57, 57, 50.
3.3 The Standard Deviation

• The variance represents squared units and, therefore, is not an appropriate measure of dispersion when we wish to express this concept in terms of the original units.
• To obtain a measure of dispersion in original units, we merely take the square root of the variance. The result is called the standard deviation.
• In general, the standard deviation of a sample is given by:

$$ s = \sqrt{s^2} = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}} $$
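For the ten ages used in the variance example, the standard deviation is the square root of the sample variance. The sketch below computes it both by hand and with `statistics.stdev` from Python's standard library, which also divides by n – 1; the variable names are illustrative.

```python
import math
from statistics import stdev

ages = [43, 66, 61, 64, 65, 38, 59, 57, 57, 50]

n = len(ages)
mean_age = sum(ages) / n
sample_variance = sum((x - mean_age) ** 2 for x in ages) / (n - 1)

# Square root of the variance returns the measure to the original units.
sd_by_hand = math.sqrt(sample_variance)
sd_library = stdev(ages)

print(round(sd_by_hand, 2), round(sd_library, 2))
```

Both approaches give a standard deviation of about 9.49 years, the square root of the variance of 90.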
3.4 The Coefficient of Variation

• When we want to compare the dispersion in two sets of data, comparing the two standard deviations may lead to erroneous conclusions.
• It may be that the two variables involved are measured in different units. For example, we may wish to know, for a certain population, whether serum cholesterol levels, measured in milligrams per 100 ml, are more variable than body weight, measured in pounds.
• Furthermore, even when the same unit of measurement is used, the two means may be quite different.
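• The coefficient of variation expresses the standard deviation as a percentage of the mean. The formula is expressed as

$$ \text{C.V.} = \frac{s}{\bar{x}} \times 100\% $$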

• Suppose two samples of students yield the following results:

                      Sample 1      Sample 2
Age                   25 years      18 years
Mean weight           145 pounds    80 pounds
Standard deviation    10 pounds     10 pounds

• We wish to know which is more variable, the weights of the 25-year-olds or the weights of the 18-year-olds.
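The comparison can be worked through by expressing each standard deviation as a percentage of its mean. The function name `coefficient_of_variation` is an illustrative choice for this Python sketch; the figures are those given for the two samples.

```python
def coefficient_of_variation(sd, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return 100 * sd / mean

cv_sample_1 = coefficient_of_variation(sd=10, mean=145)  # 25-year-olds
cv_sample_2 = coefficient_of_variation(sd=10, mean=80)   # 18-year-olds

print(f"Sample 1: {cv_sample_1:.1f}%")
print(f"Sample 2: {cv_sample_2:.1f}%")
```

Although both samples have the same standard deviation of 10 pounds, the coefficient of variation is about 6.9% for the 25-year-olds and 12.5% for the 18-year-olds, so the weights of the 18-year-olds are more variable relative to their mean.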
THANK YOU FOR YOUR AUDIENCE
