CHAPTER 1
INTRODUCTION
Prem Mann, Introductory Statistics, 7/E
Copyright © 2010 John Wiley & Sons. All rights reserved
WHAT IS STATISTICS?
In its common usage, statistics refers to numerical facts.
Definition
Statistics is a group of methods used to
collect, analyze, present, and interpret
data and to make decisions.
TYPES OF STATISTICS
Descriptive statistics consists of methods for organizing,
displaying, and describing data by using tables, graphs,
and summary measures. These are statistics that summarize a
sample of numerical data in terms of averages and other
measures for the purpose of description.
Descriptive statistics, as opposed to inferential statistics,
are not concerned with the theory and methodology for
drawing inferences that extend beyond the particular set
of data examined.
Thus, a teacher who gives an exam to a class of, say,
35 students is interested in descriptive statistics to
assess the performance of the class: What was the class
average, the median grade, the standard deviation,
etc.? The teacher is not interested in making any
inferences about some larger population.
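The teacher's summary can be sketched in a few lines of Python. The scores below are hypothetical values made up for illustration, and the list is shorter than a 35-student class to keep it readable. Because the class is treated as the whole group of interest, the sketch assumes the population form of the standard deviation (`statistics.pstdev`).

```python
import statistics

# Hypothetical exam scores (illustrative values only, not from the text).
scores = [72, 85, 90, 66, 78, 85, 94, 58, 81, 77]

# Descriptive statistics: summarize only the data at hand.
class_mean = statistics.mean(scores)
class_median = statistics.median(scores)
# Population standard deviation, since the class is the entire group of interest.
class_stdev = statistics.pstdev(scores)

print("mean:", class_mean)
print("median:", class_median)
print("standard deviation:", round(class_stdev, 2))
```

No inference is involved here: every number describes these students and nobody else.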
TYPES OF STATISTICS
Example of inferential statistics from quality control:
GE manufactures LED bulbs and wants to know how
many are defective. Suppose one million bulbs a year
are produced in its new plant in Staten Island. The
company might sample, say, 500 bulbs to estimate the
proportion of defectives.
N = 1,000,000 and n = 500
If 5 out of 500 bulbs tested are defective, the sample
proportion of defectives will be 1% (5/500). This statistic
may be used to estimate the true proportion of defective
bulbs (the population proportion).
In this case, the sample proportion is used to make
inferences about the population proportion.
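The arithmetic in the bulb example can be written out as a small sketch. The variable names are assumptions chosen for readability. The projected count of defective bulbs is only a point estimate based on the sample proportion, not a known population value.

```python
# Figures from the bulb example.
population_size = 1_000_000  # N
sample_size = 500            # n
defective_in_sample = 5

# Sample proportion of defectives (a statistic).
sample_proportion = defective_in_sample / sample_size

# Point estimate of the number of defective bulbs in the whole population,
# assuming the sample is representative.
estimated_defective = round(sample_proportion * population_size)

print(f"sample proportion: {sample_proportion:.2%}")
print("estimated defective bulbs per year:", estimated_defective)
```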
Key Terms
Data: Any observations that have been collected
Key Terms
Population: A population consists of all
elements – individuals, items, or objects – whose
characteristics are being studied. The population
that is being studied is also called the target
population.
Or
The entire category under consideration, that is, the
complete set of elements being studied. The
population size is usually denoted by a capital N.
Examples: every lawyer in the United States;
all single women in the United States.
Key Terms
Sample: A portion of the population selected for study is referred
to as a sample.
Or
That portion of the population that is available, or is to be made
available, for analysis. A good sample is representative of the
population. We will learn about probability samples and how they
help ensure that a sample is representative. The
sample size is denoted by a lowercase n.
If a company manufactures one million laptops, it might take a
sample of, say, 500 of them to test quality. The population size is
N = 1,000,000 and the sample size is n = 500.
Census: A survey that includes every member of the population is called
a census. The technique of collecting information from a portion of the
population is called a sample survey.
Figure 1.1 Population and Sample
Applications
Explain whether each of the following constitutes data collected
from a population or a sample.
a. Opinions on a certain issue obtained from all adults living in a city.
b. The price of a gallon of regular unleaded gasoline on a given day
at each of 28 gas stations in the Miami, Florida, metropolitan area.
c. Credit card debts of 100 families selected from a given city.
d. The percentage of all U.S. registered voters in each state who
voted in the 2012 Presidential election.
e. The number of left-handed students in each of 50 classes selected
from a given university.
Key Terms
POPULATION VERSUS SAMPLE
A sample that represents the characteristics of
the population as closely as possible is called a
representative sample.
A sample drawn in such a way that each
element of the population has a chance of being
selected is called a random sample. If all
samples of the same size selected from a
population have the same chance of being
selected, we call it simple random sampling.
Such a sample is called a simple random
sample.
Sample with replacement
Sample without replacement
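The difference between the two sampling schemes can be illustrated with Python's standard `random` module. This is a minimal sketch: the population of 20 numbered elements and the seed are assumptions made up for the example. `sample` draws without replacement (no element can appear twice), while `choices` draws with replacement (an element can be selected again).

```python
import random

rng = random.Random(42)          # fixed seed so the run is repeatable
population = list(range(1, 21))  # hypothetical elements numbered 1..20

# Without replacement: each element can be selected at most once.
# Every subset of size 5 is equally likely, i.e. a simple random sample.
without_replacement = rng.sample(population, k=5)

# With replacement: a selected element is "put back" and may be drawn again.
with_replacement = rng.choices(population, k=5)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```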
BASIC TERMS
An element or member of a sample or
population is a specific subject or object (for
example, a person, firm, item, state, or country)
about which the information is collected.
A variable is a characteristic under study that
assumes different values for different elements. In
contrast to a variable, the value of a constant is
fixed.
The value of a variable for an element is called an
observation or measurement.
A data set is a collection of observations on one
or more variables.
Table 1.1 Charitable Givings of Six Retailers in 2007
Applications of Basic Terms
Country          Number of Billionaires
United States    413
China            115
Russia           101
India            55
Germany          52
Britain          32
Brazil           30
Japan            26
Refer to the given example.
a. What is the variable for this data set?
b. How many observations are in this data set?
c. How many elements does this data set contain?
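One way to see how the basic terms map onto this data set is to store it in a small Python structure. This is only a sketch: the dictionary layout and the names are assumptions. The keys play the role of elements, the single variable is the number of billionaires, and each stored value is one observation.

```python
# The data set from the table: one variable recorded for each element.
billionaires = {
    "United States": 413,
    "China": 115,
    "Russia": 101,
    "India": 55,
    "Germany": 52,
    "Britain": 32,
    "Brazil": 30,
    "Japan": 26,
}

variable = "number of billionaires"         # the characteristic under study
elements = list(billionaires)               # the countries
observations = list(billionaires.values())  # one value of the variable per element

print("variable:", variable)
print("number of elements:", len(elements))
print("number of observations:", len(observations))
```

With one variable, the number of observations equals the number of elements.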
TYPES OF VARIABLES
Quantitative Variables
Discrete Variables
Continuous Variables
Qualitative or Categorical Variables
TYPES OF VARIABLES
Qualitative variables: A variable that
cannot assume a numerical value but can
be classified into two or more nonnumeric
categories is called a qualitative or
categorical variable. Such variables result in
categorical (non-numeric) responses and are
also called nominal or categorical data.
Example: Gender (male, female)
TYPES OF VARIABLES
Quantitative variables: A variable that can be
measured numerically is called a quantitative
variable.
The data collected on a quantitative variable are
called quantitative data.
Quantitative variables result in numerical
responses and may be either
Discrete variables
Continuous variables
Quantitative Variables
Discrete variables A variable whose values
are countable is called a discrete variable.
In other words, a discrete variable can
assume only certain values with no
intermediate values.
Example: How many courses have you
taken at this College? ____
Quantitative Variables
Continuous variables A variable that can assume
any numerical value over a certain interval or intervals
is called a continuous variable.
Example: How much do you weigh? ____
One way to determine whether data are continuous is
to ask yourself whether you can meaningfully add several
decimal places to the answer.
For example, you may report your weight as 150 pounds but
actually weigh 150.23568924567 pounds.
On the other hand, if you have 2 children, you cannot
have 2.3217638 children.
Figure 1.2 Types of Variables
Applications
Measurement Scales
Nominal
• “Nominal” scales could simply be called
“labels.”
• A good way to remember all of this is that
“nominal” sounds a lot like “name” and
nominal scales are kind of like “names” or
labels.
Measurement Scales
Ordinal
• It reports the ranking and ordering of the
data without establishing the size of the
differences between them.
• “Ordinal” is easy to remember because it
sounds like “order,” and that’s the key to
remember with ordinal scales: it is the
order that matters, but that’s all you really
get from these.
Measurement Scales
Interval
• Interval scales are numeric scales in which
we know both the order and the exact
differences between the values.
• Here’s the problem with interval scales:
they don’t have a “true zero.”
• For example, there is no such thing as “no
temperature,” at least not with Celsius.
Measurement Scales
Ratio
• A ratio scale allows researchers to
compare intervals or differences, and it
has a true zero point (origin).
• What is your height in feet and inches?
1. Less than 5 feet
2. 5 feet 1 inch – 5 feet 5 inches
3. 5 feet 6 inches – 6 feet
4. More than 6 feet
Measurement Scales
Scale      Property                                         Examples
Nominal    Data may only be classified                      Eye color, hair color, gender
Ordinal    Data are ranked                                  Level of knowledge about SPSS
Interval   True zero point does not exist                   Temperature, shoe size, IQ scores
Ratio      Meaningful zero point and ratio between values   Height, weight, distance
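The practical difference between interval and ratio scales can be checked numerically. In the sketch below the temperatures and heights are made-up values, and the Kelvin conversion is brought in only as an illustration (it is not part of the slides). On an interval scale such as Celsius, the ratio of two values changes when the zero point moves, so "twice as hot" is not meaningful. On a ratio scale such as height, the ratio survives a change of units.

```python
def celsius_to_kelvin(celsius):
    """Shift the zero point from the Celsius origin to absolute zero."""
    return celsius + 273.15

# Interval scale: differences are meaningful, ratios are not.
warm_c, cool_c = 20.0, 10.0
ratio_celsius = warm_c / cool_c  # 2.0, but only because of where Celsius puts zero
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)  # about 1.035

# Ratio scale: a true zero makes ratios independent of the unit.
tall_cm, short_cm = 180.0, 90.0
ratio_cm = tall_cm / short_cm
ratio_inches = (tall_cm / 2.54) / (short_cm / 2.54)

print("Celsius ratio:", ratio_celsius, "Kelvin ratio:", round(ratio_kelvin, 3))
print("height ratio in cm:", ratio_cm, "in inches:", round(ratio_inches, 3))
```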
Cross-Section Data
Definition
Data collected on different elements at
the same point in time or for the same
period of time are called cross-section
data.
Table 1.2 Charitable Givings of Six Retailers in 2007
Time-Series Data
Definition
Data collected on the same element for
the same variable at different points in
time or for different periods of time are
called time-series data.
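The two kinds of data differ in what varies across the records. The sketch below uses entirely hypothetical names and numbers (they are not the values from the tables in the text): the cross-section holds one variable for several elements in a single year, while the time series holds one variable for a single element across several years.

```python
# Cross-section data: several elements, same period (hypothetical values,
# donations in millions of dollars for one year).
cross_section = {"Retailer A": 12.0, "Retailer B": 7.5, "Retailer C": 20.0}

# Time-series data: one element, same variable, different periods
# (hypothetical yearly counts).
time_series = {2005: 100, 2006: 104, 2007: 110}

# A typical cross-section question compares elements with each other.
largest_donor = max(cross_section, key=cross_section.get)

# A typical time-series question compares one period with the previous one.
years = sorted(time_series)
yearly_change = {
    later: time_series[later] - time_series[earlier]
    for earlier, later in zip(years, years[1:])
}

print("largest donor:", largest_donor)
print("year-over-year change:", yearly_change)
```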
Table 1.3 Number of Movie Screens
Application
SOURCES OF DATA
Data may be obtained from
Internal Sources
External Sources
Surveys and Experiments
Primary vs. Secondary Data
Primary data. This is data that has been
compiled by the researcher using such techniques
as surveys, experiments, depth interviews,
observation, and focus groups.
Types of surveys. A lot of data is obtained
using surveys. Each survey type has advantages
and disadvantages.
Mail: lowest rate of response; usually the lowest cost
Personally administered: can “probe”; most costly;
interviewer effects (the interviewer might influence the
response)
Telephone: fastest
Web: fast and inexpensive
Primary vs. Secondary Data
Secondary data. This is data that has been
compiled or published elsewhere, e.g.,
census data.
The trick is to find data that is useful. The data was
probably collected for some purpose other than
helping to solve the researcher’s problem at hand.
Advantages: It can be gathered quickly and
inexpensively. It enables researchers to build on
past research.
Problems: Data may be outdated. Variation in
definition of terms. Different units of measurement.
May not be accurate (e.g., census undercount).
SUMMATION NOTATION
A sample of prices of five literary books:
$75, $80, $35, $97, and $88
The variable price of a book: x
Price of the first book = x1 = $75
Price of the second book = x2 = $80
…
Adding the prices of all five books gives
75+80+35+97+88 = x1+x2+x3+x4+x5 = Σx
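The summation above corresponds directly to adding up a list. As a small sketch (the variable names are assumptions), note that Python lists are zero-indexed, so the first observation x1 is stored at index 0.

```python
# Prices (in dollars) of the five books from the text.
prices = [75, 80, 35, 97, 88]

# x1 and x2 are the first and second observations of the variable x.
x1 = prices[0]
x2 = prices[1]

# Sigma x: the sum of all observations of x.
sum_x = sum(prices)

print("x1 =", x1, "x2 =", x2)
print("sum of x =", sum_x)
```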
Example 1-1
Annual salaries (in thousands of dollars)
of four workers are 75, 90, 125, and 61,
respectively. Find
(a) ∑x (b) (∑x)² (c) ∑x²
Example 1-1: Solution
(a) ∑x = x1 + x2 + x3 + x4
= 75 + 90 + 125 + 61
= 351 = $351,000
(b) (∑x)² = (351)² = 123,201
(c) ∑x² = (75)² + (90)² + (125)² + (61)²
= 5,625 + 8,100 + 15,625 + 3,721
= 33,071
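The three results can be verified with a short script. This is a sketch with assumed variable names; its main point is that the square of the sum and the sum of the squares are different quantities, which is a common source of mistakes.

```python
# Annual salaries in thousands of dollars, from Example 1-1.
salaries = [75, 90, 125, 61]

# (a) Sum of the values.
sum_x = sum(salaries)

# (b) Square of the sum: add first, then square.
square_of_sum = sum_x ** 2

# (c) Sum of the squares: square each value first, then add.
sum_of_squares = sum(x ** 2 for x in salaries)

print("(a)", sum_x)
print("(b)", square_of_sum)
print("(c)", sum_of_squares)
```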
Example 1-2
The following table lists four pairs of m and
f values:
Compute the following:
(a) Σm (b) Σf² (c) Σmf (d) Σm²f
Example 1-2: Solution
Table 1.4
(a) (b) (c) (d)
Self Review Test