INFERENTIAL STATISTICS
ESTIMATION
CHE HOANG THAI, M.D
University of Medicine Pham Ngoc Thach
LEARNING OBJECTIVES
At the end of this presentation, the learner will be able to:
• Distinguish between a sample and a population
• Define the concepts of estimate, point estimate, and interval
estimate
• Explain the concept of a confidence interval and interpret it
• Calculate confidence intervals for a mean and a proportion
DEFINITION OF STATISTICS
STATISTICS
a field of study concerned with
• the collection, organization, summarization, and analysis of data
• the drawing of inferences about a body of data when only a part of the data is observed
Statistics is a tool for creating new understanding
from a set of numbers
DEFINITION OF BIOSTATISTICS
Biostatistics: application of statistics in biological sciences and
medicine (biology, biomedical sciences, medicine, public health…)
DESCRIPTIVE STATISTICS
INFERENTIAL STATISTICS
DESCRIPTIVE STATISTICS
Ways to condense and organize information into a set of
measures that enhance the understanding of complex data
INFERENTIAL STATISTICS
Ways to reach a conclusion about a population on the basis of
the information contained in a sample that has been drawn from
that population
DEFINITION OF STATISTICS
STATISTICS
a field of study concerned with
• DESCRIPTIVE STATISTICS: the collection, organization, summarization, and analysis of data
• INFERENTIAL STATISTICS: the drawing of inferences about a body of data when only a part of the data is observed
Population of Massachusetts in 2010:
6,547,629 persons
Variable: diastolic blood pressure
POPULATION
Population is the collection of all subjects of interest
Descriptive measure for population: parameter
SAMPLE
Sample is a subset of the population of interest
Descriptive measure for sample: sample statistic or statistic
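The distinction between a parameter and a statistic can be illustrated with a small simulation. The sketch below is only an illustration: the population is synthetic (100,000 made-up diastolic blood pressure values with an assumed mean of 80 mmHg and SD of 10 mmHg), and the helper name `draw_sample` is hypothetical.

```python
import random
import statistics


def draw_sample(population, n, seed=0):
    """Draw a simple random sample of size n without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


# ASSUMPTION: a synthetic population, not real data.
rng = random.Random(42)
population = [rng.gauss(80, 10) for _ in range(100_000)]

# Parameter: descriptive measure computed on the whole population.
parameter = statistics.fmean(population)

# Statistic: the same measure computed on a sample drawn from that population.
sample = draw_sample(population, 50)
statistic = statistics.fmean(sample)

print(f"population mean (parameter): {parameter:.2f}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The two values are close but not identical; a different sample would give a different statistic, while the parameter stays fixed.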
FINITE AND INFINITE POPULATION
Finite population: a fixed number of values
Infinite population: an endless succession of values
ESTIMATE
Estimate: a value calculated from sample data (a statistic) and used to
approximate the corresponding value in the population (a parameter)
POINT AND INTERVAL ESTIMATE
Point estimate: a single numerical value used to estimate the
corresponding population parameter
Interval estimate: two numerical values defining a range of values used to
estimate the corresponding population parameter
STANDARD DEVIATION (SD)
Standard deviation: a measure of the variability of individual observations
within a single sample or population
STANDARD ERROR (SE)
Standard error: the standard deviation of a sample statistic rather than of
individual values. It measures how much sample means or proportions vary
across multiple random samples of the same size taken from the same
population
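The idea of a standard error can be checked by simulation: draw many samples of the same size, compute each sample mean, and take the standard deviation of those means. The sketch below assumes a normally distributed variable with made-up values (mean 80, SD 10, sample size 25); the function name `simulate_sample_means` is hypothetical. For a mean, the simulated value should be close to σ/√n.

```python
import math
import random
import statistics


def simulate_sample_means(mu, sigma, n, n_samples, seed=0):
    """Return the means of n_samples random samples, each of size n."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


# ASSUMPTION: illustrative population values, not taken from the slides.
mu, sigma, n = 80, 10, 25
means = simulate_sample_means(mu, sigma, n, n_samples=2000)

empirical_se = statistics.stdev(means)   # SD of the sample means
theoretical_se = sigma / math.sqrt(n)    # sigma / sqrt(n) = 2.0 here

print(f"empirical SE:   {empirical_se:.3f}")
print(f"theoretical SE: {theoretical_se:.3f}")
```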
STANDARD ERROR (SE)
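Standard error of a mean: SE(x̄) = σ / √n (the sample SD s replaces σ when σ is unknown)
Standard error of a proportion: SE(p̂) = √( p(1 − p) / n ), estimated by replacing p with p̂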
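CONFIDENCE INTERVAL (CI)
Confidence interval of mean: x̄ ± z_(1−α/2) × σ / √n
Confidence interval of proportion: p̂ ± z_(1−α/2) × √( p̂(1 − p̂) / n )
where z_(1−α/2) is the standard normal critical value (about 1.96 for 95% confidence)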
INTERPRET CONFIDENCE INTERVAL
If 100 random samples of size n were taken from the same population,
and a 95% confidence interval were computed from each of these 100
samples, about 95 of the 100 intervals would be expected to contain the
true mean μ within their endpoints
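This repeated-sampling interpretation can be sketched as a simulation. The code below assumes a normal population with a known SD and made-up values (μ = 80, σ = 10, n = 30); the function name `coverage` is hypothetical. It builds many 95% intervals and reports the fraction that contain the true mean, which should be close to 0.95 rather than exactly 0.95.

```python
import math
import random
import statistics
from statistics import NormalDist


def coverage(mu, sigma, n, n_intervals, confidence=0.95, seed=0):
    """Fraction of z-based confidence intervals that contain the true mean."""
    rng = random.Random(seed)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(n_intervals):
        xbar = statistics.fmean([rng.gauss(mu, sigma) for _ in range(n)])
        if xbar - margin <= mu <= xbar + margin:
            hits += 1
    return hits / n_intervals


# ASSUMPTION: illustrative population values, not taken from the slides.
rate = coverage(mu=80, sigma=10, n=30, n_intervals=10_000)
print(f"fraction of intervals containing mu: {rate:.3f}")
```

Any single interval either contains μ or does not; the 95% describes the long-run behaviour of the procedure.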
EXAMPLES
Suppose a researcher, interested in obtaining an estimate of the average level of
some enzyme in a certain human population, takes a sample of 10 individuals,
determines the level of the enzyme in each, and computes a sample mean of x̄ = 22.
Suppose further it is known that the variable of interest is approximately normally
distributed with a variance of 45. We wish to estimate μ.
EXAMPLES
• An approximate 95 percent confidence interval for μ is given by:
x̄ ± 1.96 × σ / √n = 22 ± 1.96 × √(45 / 10) ≈ 22 ± 4.16, that is, about (17.84, 26.16)
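The enzyme example can be worked through numerically. This is a minimal sketch using the values stated in the example (x̄ = 22, variance 45, n = 10) and the z-based interval x̄ ± z × σ/√n with the exact 97.5th normal percentile; variable names are illustrative.

```python
import math
from statistics import NormalDist

xbar = 22.0       # sample mean from the example
variance = 45.0   # known population variance from the example
n = 10            # sample size

se = math.sqrt(variance / n)       # standard error of the mean
z = NormalDist().inv_cdf(0.975)    # about 1.96 for a 95% interval
lower, upper = xbar - z * se, xbar + z * se

print(f"SE = {se:.4f}, 95% CI = ({lower:.2f}, {upper:.2f})")
# prints: SE = 2.1213, 95% CI = (17.84, 26.16)
```

Using the rounded multiplier 2 instead of 1.96 gives a slightly wider interval, roughly (17.76, 26.24).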
EXAMPLES
The Pew Internet and American Life Project (A-13) reported in 2003 that 18
percent of Internet users have used it to search for information regarding
experimental treatments or medicines. The sample consisted of 1220 adult
Internet users, and information was collected from telephone interviews. We
wish to construct a 95 percent confidence interval for the proportion of
Internet users in the sampled population who have searched for information
on experimental treatments or medicines.
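The interval for this example can be computed as follows. This is a minimal sketch using the stated values (p̂ = 0.18, n = 1220) and the normal-approximation interval p̂ ± z × √(p̂(1 − p̂)/n); the check that n·p̂ and n·(1 − p̂) are both at least 5 is a common rule of thumb added here as an assumption, not something stated in the example.

```python
import math
from statistics import NormalDist

p_hat = 0.18   # sample proportion from the example
n = 1220       # sample size from the example

# Rule-of-thumb check for the normal approximation (an assumption here).
assert n * p_hat >= 5 and n * (1 - p_hat) >= 5

se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of the proportion
z = NormalDist().inv_cdf(0.975)           # about 1.96 for a 95% interval
lower, upper = p_hat - z * se, p_hat + z * se

print(f"SE = {se:.4f}, 95% CI = ({lower:.3f}, {upper:.3f})")
# prints: SE = 0.0110, 95% CI = (0.158, 0.202)
```

In words: the proportion of Internet users in the sampled population who have searched for this information is estimated to lie between about 15.8 and 20.2 percent.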
THANK YOU