Chapter 01: Data and Statistics
By: Mr. Sokmeng Soeun
STAT 201 BUSINESS STATISTICS 1
OUTLINE:
1.1. APPLICATIONS IN BUSINESS AND ECONOMICS
1.2. DATA
1.3. DATA SOURCES
1.4. DESCRIPTIVE STATISTICS
1.5. STATISTICAL INFERENCE
1.6. COMPUTER AND STATISTICAL ANALYSIS
INTRODUCTION
We frequently see the following types of questions and statements in everyday life or in
newspapers:
• What was the GDP of Cambodia in 2022?
  -> 29.96 billion USD
• What were Cambodia's annual GDP growth rates over the five years up to 2022?
  -> 7.47%, 7.05%, -3.1%, 3.03%, and 5.15%, respectively
• The average one-way travel time to work is 25.3 minutes (U.S. Census
Bureau, March 2009).
• The total number of students in this class is ??? Female ???
• The average age of the students in this class is ???
• What is the population of Phnom Penh City?
Statistics is defined as the art and science of collecting, analyzing,
presenting, and interpreting data.
1.1. APPLICATIONS IN
BUSINESS AND ECONOMICS
In this section, we provide examples that illustrate some of the uses
of statistics in business and economics.
• Accounting
• Finance
• Marketing
• Production
• Economics
1.2. DATA
Data are the facts and figures collected, analyzed, and summarized for
presentation and interpretation.
All the data collected in a particular study are referred to as the data set for the
study.
Table 1.1 shows a data set containing information for 25 mutual funds that are
part of the Morningstar Funds500 for 2008.
Elements, Variables and Observations
Elements are the entities on which data are collected. For the data set in Table
1.1 each individual mutual fund is an element: the element names appear in the
first column. With 25 mutual funds, the data set contains 25 elements.
A variable is a characteristic of interest for the elements.
An observation is the set of measurements obtained for a particular element.
The data set in Table 1.1 includes the following five variables:
• Fund Type: The type of mutual fund, labeled DE (Domestic Equity), IE (International
Equity), and FI (Fixed Income)
• Net Asset Value ($): The closing price per share on December 31, 2007
• 5-Year Average Return (%): The average annual return for the fund over the past 5 years
• Expense Ratio: The percentage of assets deducted each fiscal year for fund expenses
• Morningstar Rank: The overall risk-adjusted star rating for each fund; Morningstar ranks go
from a low of 1-Star to a high of 5-Stars
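The relationship between elements, variables, and observations can be sketched in a few lines of Python. This is only an illustration: the fund names and all values below are hypothetical stand-ins, not rows from Table 1.1, and the names `data_set`, `variables`, and `observation` are chosen here for convenience.

```python
# Hypothetical rows in the spirit of Table 1.1. The fund names and values
# are invented for illustration and are NOT the Morningstar data.
variables = [
    "Fund Type",
    "Net Asset Value ($)",
    "5-Year Average Return (%)",
    "Expense Ratio (%)",
    "Morningstar Rank",
]

# Each key is an element; each list holds one measurement per variable.
data_set = {
    "Fund A": ["DE", 14.37, 10.7, 0.67, "2-Star"],
    "Fund B": ["IE", 21.05, 15.2, 1.12, "4-Star"],
    "Fund C": ["FI", 10.73, 4.3, 0.45, "3-Star"],
}

elements = list(data_set)            # the entities on which data are collected
n_elements = len(elements)           # 3 elements in this toy data set
n_variables = len(variables)         # 5 variables

# An observation is the set of measurements for one element.
observation = dict(zip(variables, data_set["Fund B"]))

# Total number of data values = number of elements x number of variables.
n_data_values = n_elements * n_variables

print(observation)
print(n_elements, n_variables, n_data_values)
```

With 25 elements and five variables, as in Table 1.1, the same arithmetic gives 25 observations and 125 data values.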
Scales of Measurement
Nominal scale is the scale of measurement for a variable when the data are labels or names used to
identify an attribute of an element. Nominal data may be nonnumeric or numeric.
Ordinal scale is the scale of measurement for a variable if the data exhibit the properties of nominal data
and the order or rank of the data is meaningful. Ordinal data may be nonnumeric or numeric.
Interval scale is the scale of measurement for a variable if the data demonstrate the properties of ordinal
data and the interval between values is expressed in terms of a fixed unit of measure. Interval data are
always numeric.
Ratio scale is the scale of measurement for a variable if the data demonstrate all the properties of interval
data and the ratio of two values is meaningful. Ratio data are always numeric.
Categorical and Quantitative Data
Categorical data are labels or names used to identify an attribute of each element. Categorical data
use either the nominal or ordinal scale of measurement and may be nonnumeric or numeric.
Quantitative data are numeric values that indicate how much or how many of something.
Quantitative data are obtained using either the interval or ratio scale of measurement.
Categorical variable is a variable with categorical data.
Quantitative variable is a variable with quantitative data.
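The rule that links the scale of measurement to the type of data can be written as a small lookup. This is a sketch under assumptions: the scale assigned to each variable below is one plausible reading of the definitions above, and "Temperature (°C)" is an extra example that is not part of Table 1.1.

```python
# Nominal and ordinal scales give categorical data;
# interval and ratio scales give quantitative data.
SCALE_TO_DATA_TYPE = {
    "nominal": "categorical",
    "ordinal": "categorical",
    "interval": "quantitative",
    "ratio": "quantitative",
}

# Assumed scale for each variable (an interpretation, not stated in the slides).
variable_scales = {
    "Fund Type": "nominal",              # labels: DE, IE, FI
    "Morningstar Rank": "ordinal",       # 1-Star to 5-Stars, order is meaningful
    "Temperature (°C)": "interval",      # extra example, fixed unit but no true zero
    "Net Asset Value ($)": "ratio",      # ratios of prices are meaningful
}


def data_type(variable):
    """Return 'categorical' or 'quantitative' for a known variable."""
    return SCALE_TO_DATA_TYPE[variable_scales[variable]]


for name in variable_scales:
    print(f"{name}: {variable_scales[name]} scale -> {data_type(name)} data")
```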
Cross-Sectional and Time Series Data
Cross-sectional data are data collected at the same or approximately the same point in time.
Time series data are data collected over several time periods.
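The difference between the two kinds of data set can be illustrated with a short sketch. The time series reuses the GDP growth figures quoted in the introduction, assuming they refer to the years 2018 through 2022 in order. The cross-sectional values are hypothetical.

```python
# Time series: ONE variable observed over SEVERAL time periods.
# Years are an assumption: "the last 5 years up to 2022" read as 2018-2022.
gdp_growth_pct = {2018: 7.47, 2019: 7.05, 2020: -3.1, 2021: 3.03, 2022: 5.15}

# Cross-sectional: SEVERAL elements observed at the SAME point in time.
# Hypothetical net asset values for three funds on a single date.
nav_on_one_date = {"Fund A": 14.37, "Fund B": 21.05, "Fund C": 10.73}

# Because a time series is ordered in time, period-to-period change is meaningful.
years = sorted(gdp_growth_pct)
change_in_growth = {
    year: round(gdp_growth_pct[year] - gdp_growth_pct[year - 1], 2)
    for year in years[1:]
}

# A cross-section has no time order; comparisons are between elements instead.
highest_nav_fund = max(nav_on_one_date, key=nav_on_one_date.get)

print(change_in_growth)
print(highest_nav_fund)
```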
What about panel data?
1.3. DATA SOURCES
Existing Sources:
• Company databases
• Specialized organizations (e.g., Bloomberg, World Bank)
• Industry associations and special interest organizations (e.g., Cambodian Rice Federation: http://www.crf.org.kh/)
• The Internet
• Government agencies (e.g., Cambodian National Institute of Statistics: https://www.nis.gov.kh/index.php/km/)
Source: http://nis.gov.kh/nis/cpi/2021/PP_CPI%20summary%20table%20Sep%202021.htm
Statistical Studies:
Sometimes the data needed for a particular application are not available through existing sources. In such
cases, the data can often be obtained by conducting a statistical study. Statistical studies can be classified
as either experimental or observational.
In an experimental study, a variable of interest is first identified. Then one or more other variables are
identified and controlled so that data can be obtained about how they influence the variable of interest. For
example, a pharmaceutical firm might be interested in conducting an experiment to learn about how a new
drug affects blood pressure. Blood pressure is the variable of interest in the study. The dosage level of the
new drug is another variable that is hoped to have a causal effect on blood pressure. To obtain data about
the effect of the new drug, researchers select a sample of individuals. The dosage level of the new drug is
controlled, as different groups of individuals are given different dosage levels. Before and after data on
blood pressure are collected for each group. Statistical analysis of the experimental data can help
determine how the new drug affects blood pressure.
Nonexperimental, or observational, statistical studies make no attempt to control the variables of interest.
A survey is perhaps the most common type of observational study. For instance, in a personal interview
survey, research questions are first identified. Then a questionnaire is designed and administered to a
sample of individuals. Some restaurants use observational studies to obtain data about customer opinions
on the quality of food, quality of service, atmosphere, and so on. A customer opinion questionnaire used
by Chops City Grill in Naples, Florida, is shown in Figure 1.4. Note that the customers who fill out the
questionnaire are asked to provide ratings for 12 variables, including overall experience, greeting by
hostess, manager (table visit), overall service, and so on. The response categories of excellent, good,
average, fair, and poor provide categorical data that enable Chops City Grill management to maintain high
standards for the restaurant’s food and service.
1.3. DATA SOURCES
Data Acquisition Errors
1.4. DESCRIPTIVE STATISTICS
Most of the statistical information in newspapers, magazines, company reports, and other publications
consists of data that are summarized and presented in a form that is easy for the reader to understand. Such
summaries of data, which may be tabular, graphical, or numerical, are referred to as descriptive statistics.
Refer again to the data set in Table 1.1 showing data on 25 mutual funds. Methods of descriptive statistics
can be used to provide summaries of the information in this data set.
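As a rough illustration of tabular and numerical summaries, the sketch below builds a frequency table for a categorical variable and an average for a quantitative variable. The six funds and their values are hypothetical, not the 25 funds in Table 1.1.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample of six funds (NOT the data in Table 1.1).
fund_types = ["DE", "DE", "IE", "FI", "DE", "IE"]
five_year_returns_pct = [10.0, 12.5, 15.0, 4.5, 8.0, 16.0]

# Tabular summary: frequency and percent frequency of each fund type.
frequency = Counter(fund_types)
percent_frequency = {
    fund_type: 100 * count / len(fund_types)
    for fund_type, count in frequency.items()
}

# Numerical summary: the average 5-year return.
average_return_pct = mean(five_year_returns_pct)

for fund_type, count in frequency.items():
    print(f"{fund_type}: {count} funds ({percent_frequency[fund_type]:.1f}%)")
print(f"Average 5-year return: {average_return_pct:.2f}%")
```

A graphical summary, such as a bar chart of the frequencies, would present the same table visually.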
1.5. STATISTICAL INFERENCE
Many situations require information about a large group of elements (individuals, companies, voters,
households, products, customers, and so on). But, because of time, cost, and other considerations, data can
be collected from only a small portion of the group. The larger group of elements in a particular study is
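called the population, and the smaller group is called the sample. Formally, we use the following
definitions.
• Population: the set of all elements of interest in a particular study.
• Sample: a subset of the population.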
The process of conducting a survey to collect data for the entire population is called a census. The process
of conducting a survey to collect data for a sample is called a sample survey. As one of its major
contributions, statistics uses data from a sample to make estimates and test hypotheses about the
characteristics of a population through a process referred to as statistical inference.
As an example of statistical inference, let us consider the study conducted by Norris Electronics. Norris
manufactures a high-intensity lightbulb used in a variety of electrical products. In an attempt to increase the
useful life of the lightbulb, the product design group developed a new lightbulb filament. In this case, the
population is defined as all lightbulbs that could be produced with the new filament. To evaluate the
advantages of the new filament, 200 bulbs with the new filament were manufactured and tested. Data
collected from this sample showed the number of hours each lightbulb operated before filament burnout.
See Table 1.5.
Suppose Norris wants to use the sample data to make an inference about the average hours of useful life for
the population of all lightbulbs that could be produced with the new filament. Adding the 200 values in
Table 1.5 and dividing the total by 200 provides the sample average lifetime for the lightbulbs: 76 hours.
We can use this sample result to estimate that the average lifetime for the lightbulbs in the population is 76
hours. Figure 1.7 provides a graphical summary of the statistical inference process for Norris Electronics.
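The sample-average calculation described above can be sketched as follows. Table 1.5 is not reproduced here, so the five lifetimes below are hypothetical values chosen so that their average also comes to 76 hours. The real sample has 200 values.

```python
from statistics import mean

# Hypothetical stand-in for the 200 lifetimes in Table 1.5 (hours until burnout).
sample_lifetimes_hours = [70, 82, 76, 74, 78]

# Sample average = sum of the sample values / number of sample values.
sample_mean_hours = sum(sample_lifetimes_hours) / len(sample_lifetimes_hours)

# The statistics module gives the same result.
assert sample_mean_hours == mean(sample_lifetimes_hours)

# The sample mean serves as the point estimate of the population mean.
print(f"Estimated average lifetime: {sample_mean_hours:.0f} hours")
```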
Whenever statisticians use a sample to estimate a population characteristic of interest, they usually provide
a statement of the quality, or precision, associated with the estimate.
For the Norris example, the statistician might state that the point estimate of the average lifetime for the
population of new lightbulbs is 76 hours with a margin of error of 4 hours. Thus, an interval estimate of the
average lifetime for all lightbulbs produced with the new filament is 72 hours to 80 hours. The statistician
can also state how confident he or she is that the interval from 72 hours to 80 hours contains the population
average.
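The interval estimate above is simply the point estimate plus and minus the margin of error. A minimal sketch follows; the function name `interval_estimate` is chosen here for illustration, and the sketch does not show how the margin of error itself is computed.

```python
def interval_estimate(point_estimate, margin_of_error):
    """Return (lower, upper) = point estimate -/+ margin of error."""
    return (point_estimate - margin_of_error, point_estimate + margin_of_error)


# Norris Electronics: point estimate 76 hours, margin of error 4 hours.
lower, upper = interval_estimate(76, 4)
print(f"Interval estimate: {lower} hours to {upper} hours")
```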
1.6. COMPUTER AND
STATISTICAL ANALYSIS
Statisticians frequently use computer software to perform the statistical computations required with large
amounts of data. For example, computing the average lifetime for the 200 lightbulbs in the Norris
Electronics example (see Table 1.5) would be quite tedious without a computer. To facilitate computer
usage, many of the data sets in this book are available on the website that accompanies the text. The data
files may be downloaded in either Minitab or Excel formats.