CHAPTER 5: Estimation
5.1 Point and interval estimation
5.2 Confidence interval
5.3 Confidence interval for the mean in single population
5.4 Confidence interval for the difference between two population means
5.5 Confidence interval for a single population variance and the ratio of two variances
One aspect of inferential statistics is estimation, which is the process of estimating the
value of an unknown population parameter from a sample. These values are only estimates of the
true parameters and are derived from data collected from samples.
Estimator
An estimator is a sample statistic, computed from sample data, that is used to estimate
the unknown population parameter.
Estimate
An estimate is a specific value of a sample statistic that is assigned to the
unknown population parameter.
Estimator and Estimate
If, from sample data, the mean body temperature is $\bar{x}$, then we may estimate the
true population mean $\mu$ by this value. Thus, the sample statistic used to estimate the true
population parameter is the estimator and the assigned value is the estimate. Here, the sample
mean $\bar{X}$ is called the estimator, while its observed value $\bar{x}$ is the estimate.
There are two types of estimation: (a) the point estimate, and (b) the confidence interval, as
shown in Figure 5.1.
[Figure 5.1: Estimation, divided into Point estimate and Confidence Interval]
5.1 The Point Estimation
A point estimator, generally designated by the symbol $\hat{\theta}$, is the rule or formula that
tells us how to use the observations in a sample to compute a single number (a point) which
serves as a point estimate of the value of $\theta$.
Point Estimate
A point estimate is a specific numerical value that is used to
estimate an unknown population parameter.
Example 5.1
a) State how to obtain the point estimate of the population parameter $\theta$.
b) Find the point estimate from the sample data below to estimate the population mean, $\mu$.
3.55
3.61
3.47
3.48
3.80
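Solution:
a) We would like to estimate the value of $\theta$. With consideration of a random sample of
size $n$, we obtain the statistic $\hat{\Theta}$. The value $\hat{\theta}$ of the statistic
$\hat{\Theta}$, computed from this sample of size $n$, is a point estimate of the population
parameter $\theta$, i.e. $\theta \approx \hat{\theta}$.
b) From (a), the point estimate of the population mean $\mu$ is
$\bar{x} = \frac{3.55 + 3.61 + 3.47 + 3.48 + 3.80}{5} = \frac{17.91}{5} = 3.582$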
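The arithmetic in part (b) can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and are not part of the original example.

```python
from statistics import mean

# Sample data from Example 5.1(b)
sample = [3.55, 3.61, 3.47, 3.48, 3.80]

# The sample mean serves as the point estimate of the population mean
point_estimate = mean(sample)
print(round(point_estimate, 3))
```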
5.1.1 The Properties of a Point Estimator
Since a point estimator is calculated from a sample, it possesses a sampling distribution.
The sampling distribution of a point estimator completely describes its properties. A good
estimator satisfies the three properties below:
Three Properties of a Good Estimator
1. The estimator $\hat{\theta}$ should be an unbiased estimator.
That is, the expected value or the mean of the estimates obtained from samples of
a given size is equal to the parameter $\theta$ being estimated, i.e. $E(\hat{\theta}) = \theta$.
2. The estimator should be consistent.
For a consistent estimator, as the sample size increases, the value of the estimator
approaches the value of the parameter being estimated.
3. The estimator should be a relatively efficient estimator.
That is, of all the statistics that can be used to estimate a parameter, the relatively
efficient estimator has the smallest variance.
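Example 5.2:
If X has the binomial distribution with the parameters n and p, show that the sample proportion,
$\hat{p} = X/n$, is an unbiased estimator of p.
Solution:
Since $E(X) = np$, it follows that
$E(\hat{p}) = E\left(\frac{X}{n}\right) = \frac{1}{n}E(X) = \frac{1}{n}(np) = p$.
Hence $\hat{p}$ is an unbiased estimator of p.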
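The result can also be checked numerically by summing $x/n$ over the exact binomial probabilities. The sketch below is illustrative only: the function name and the values of n and p are assumptions made for the demonstration, not part of the example.

```python
from math import comb

def expected_sample_proportion(n: int, p: float) -> float:
    """Return E[X/n] for X ~ Binomial(n, p) by summing over the exact pmf."""
    return sum(
        (k / n) * comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(n + 1)
    )

# Hypothetical values of n and p, chosen only for illustration
print(expected_sample_proportion(10, 0.3))
```

Up to floating-point rounding, the printed value agrees with p, as the algebraic argument predicts.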
5.2 The Confidence Interval
To know how close the point estimate is to the population mean or
variance, statisticians prefer another type of estimate, called an interval estimate.
Interval estimate
An interval estimate of a parameter is an interval or range of
values that is used to estimate the true parameter.
Example 5.3
If the average age of all students in a college is estimated to be 22.3 years with an error of
0.4 year, write an interval estimate for the average age.
Solution:
The interval estimate for the average age of all the students in the college might be
$21.9 < \mu < 22.7$ years or $22.3 \pm 0.4$ years.
A degree of confidence (usually a percentage) can be assigned before making an
interval estimate. For instance, one may wish to be 95% confident that the interval contains
the true population mean. If one desires to be more confident, such as 98% or 99% confident,
then the interval must be wider. Such an interval estimate is called a confidence
interval.
Confidence Level
The confidence level of an interval estimate of a parameter is the
probability that the interval estimate will contain the true parameter.
Confidence Interval
A confidence interval is a specific interval estimate of a parameter that is determined by
using data collected from a sample, together with the specified confidence level of the estimate.
Confidence Interval for Population Mean
Confidence Interval for Sample
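Example 5.4
With 95% confidence, state the interval for (a) the population mean, and (b) the sample mean, where
the standard deviation is $\sigma$ and the sample size is n.
Solution:
Since we are 95% confident, then with $z_{0.025} = 1.96$:
a) The interval for the population mean, $\mu$, is
$\bar{x} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}$
b) The interval for a specific selected sample mean, $\bar{x}$, that falls within the range of
$\mu \pm 1.96\frac{\sigma}{\sqrt{n}}$ is
$\mu - 1.96\frac{\sigma}{\sqrt{n}} < \bar{x} < \mu + 1.96\frac{\sigma}{\sqrt{n}}$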
Referring to Example 5.4, the value used for the 95% confidence interval, 1.96, is obtained
from a statistical table. It is based on the standard normal distribution. Since other confidence
levels may be used in estimation, the symbol $z_{\alpha/2}$ (read z sub alpha over two) is used
in the general formula for confidence intervals. The Greek letter $\alpha$ (alpha) is the total
area in both tails of the standard normal distribution curve, while $\alpha/2$ is the area in
each one of the tails.
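As a cross-check on the statistical table, the critical value can be computed from the standard normal distribution. The sketch below uses `NormalDist` from Python's standard library; the function name `z_critical` is an illustrative choice.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    # inv_cdf returns the point with the stated area to its LEFT,
    # so an upper-tail area of alpha/2 corresponds to 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_critical(level), 3))
```

For a 95% confidence level this reproduces the tabulated 1.96 to two decimal places.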
Figure 5.2: $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
The probability of selecting a random sample that will produce an
interval containing $\mu$ is $1 - \alpha$. The interval computed from the selected
sample is called a $(1 - \alpha)100\%$ confidence interval.
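Confidence Coefficient
The value $1 - \alpha$ is called the confidence coefficient or the degree of
confidence.
Confidence Limit
The endpoints of the interval, $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ and
$\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$, are called the lower and
upper confidence limits, respectively.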
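According to the central limit theorem, we can expect the sampling distribution of $\bar{X}$
to be approximately normally distributed with mean $\mu_{\bar{X}} = \mu$ and standard deviation
$\sigma_{\bar{X}} = \sigma/\sqrt{n}$. From Figure 5.2,
$P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$
where
$Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
As a result, this gives the $(1 - \alpha)100\%$ confidence interval of $\mu$ with $\sigma$ known,
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$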
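One way to see what the confidence level means is to simulate many samples and count how often the interval $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ contains $\mu$. The sketch below is illustrative only: the population values, the number of trials, and the function name are all assumptions made for the demonstration.

```python
import random
from math import sqrt
from statistics import NormalDist

def coverage(mu, sigma, n, confidence, trials, seed=0):
    """Fraction of simulated intervals x_bar +/- z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from a normal population and average it
        x_bar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if x_bar - half_width < mu < x_bar + half_width:
            hits += 1
    return hits / trials

# Hypothetical population values, chosen only for illustration
print(coverage(mu=50.0, sigma=10.0, n=30, confidence=0.95, trials=5000))
```

The printed fraction should be close to 0.95, although it varies slightly with the seed and the number of trials.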
In interval estimation, some error exists in the interval estimate. That error is
called the maximum error of estimate (referred to in this chapter as the standard error). It is
taken into account during the estimation. If the error is large, the estimate may be far from the
true parameter and is therefore inaccurate. In practice, we want to make the error as small as
possible when estimating the true parameter. Thus, the $(1 - \alpha)100\%$ confidence interval
provides an estimate of the accuracy of the point estimate.
From Figure 5.3, we can see that if $\mu$ is the center of the interval,
then $\bar{x}$ estimates $\mu$ without error. However, $\bar{x}$ will usually not be exactly
equal to $\mu$. Hence, the point estimate is in error. We are $(1 - \alpha)100\%$
confident that the error will not exceed $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$.
Figure 5.3: Error of Estimating $\mu$ by $\bar{x}$
Standard error
Assume that the lifetime of a car battery population has a true mean of $\mu$ years, which is
unknown to others. A quality engineer has sampled the lifetimes of the battery
population and obtained a sample mean of $\bar{x}$ years. The sample mean $\bar{x}$ is
used to estimate the true mean of the battery population, but $\bar{x} \ne \mu$. Therefore an
error exists, which is $|\bar{x} - \mu|$.
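Standard error and Sample size
a) Error, e = absolute value of the difference between $\bar{x}$ and $\mu$, i.e.
$e = |\bar{x} - \mu|$, which at the $(1 - \alpha)100\%$ confidence level is at most
$e = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
[In case $\sigma$ is unknown and sampling is from a normal distribution, s replaces $\sigma$.]
b) Sample size, $n = \left(\frac{z_{\alpha/2}\,\sigma}{e}\right)^2$
From the formula above, we may conclude that there is an inverse relationship
between the standard error and the sample size: the standard error decreases as the sample
size increases, and conversely.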
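The two formulas can be sketched in Python as follows. The function names and the numerical inputs are hypothetical and are used only to illustrate the calculation; the sample size is rounded up to the next whole number because a fractional observation cannot be collected.

```python
from math import ceil, sqrt
from statistics import NormalDist

def max_error(sigma: float, n: int, confidence: float) -> float:
    """Maximum error of estimate e = z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / sqrt(n)

def required_sample_size(sigma: float, e: float, confidence: float) -> int:
    """Smallest whole n with z_{alpha/2} * sigma / sqrt(n) <= e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma / e) ** 2)

# Hypothetical inputs: sigma = 100, desired error e = 20, 95% confidence
print(required_sample_size(sigma=100, e=20, confidence=0.95))

# The error shrinks as n grows (inverse relationship)
for n in (25, 100, 400):
    print(n, round(max_error(sigma=100, n=n, confidence=0.95), 2))
```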
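Statement of Confidence Interval
1. $P(-z_{\alpha/2} < Z < z_{\alpha/2}) = 1 - \alpha$: Probability of selecting a random sample that will produce an interval containing $\mu$
2. $\alpha$ (alpha): Total area in both tails of the standard normal distribution curve, also called the level of significance
3. $(1 - \alpha)$: Confidence coefficient, degree of confidence, or confidence level
4. Interval $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$: $(1 - \alpha)100\%$ confidence interval
5. Endpoint $\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$: Lower confidence limit
6. Endpoint $\bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$: Upper confidence limit
7. Standard error $z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$: Maximum difference between the point estimate and the true parameter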
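5.3 The Confidence Interval for the Mean in Single Population
Case 1: Variance, $\sigma^2$, known, for any sample size.
$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Case 2: Variance, $\sigma^2$, unknown and sample size $n \ge 30$.
(If $\sigma$ is unknown then replace $\sigma$ with s.)
$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$
Case 3: Variance, $\sigma^2$, unknown and sample size $n < 30$.
$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$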
Example 5.4
A study was carried out to estimate the average life of a large shipment of light bulbs. Previous
studies indicated that the standard deviation is known to be 100 hours. A random sample of 50
light bulbs was selected and indicated that the sample average life was 350 hours. Construct a
95% confidence interval estimate of the true average life for light bulbs in this shipment.
Solution:
Write down the information: $\sigma = 100$, $n = 50$, $\bar{x} = 350$
Confidence level 0.95 for $\mu$: $1 - \alpha = 0.95$, $\alpha = 0.05$, $\alpha/2 = 0.025$
Determine the case: $\sigma$ is known.
By case 1: $\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
From table, $z_{0.025} = 1.96$
$350 \pm 1.96\left(\frac{100}{\sqrt{50}}\right) = 350 \pm 27.72$, so $322.28 < \mu < 377.72$
The conclusion is that we are 95% confident that the average life of light bulbs is between 322.28
and 377.72 hours.
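The limits quoted in the conclusion can be reproduced with a short Python sketch of the known-variance interval. The function name `z_interval` is an illustrative choice; the inputs are the values stated in the example.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar: float, sigma: float, n: int, confidence: float):
    """Two-sided interval x_bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Light bulb example: sigma = 100 hours, n = 50, sample mean = 350 hours
lower, upper = z_interval(x_bar=350, sigma=100, n=50, confidence=0.95)
print(round(lower, 2), round(upper, 2))
```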
Example 5.5
The brightness of a television picture can be evaluated by measuring the amount of current
required to achieve a particular brightness level. A random sample of 10 tubes indicated a
sample mean of $\bar{x}$ microamps and a sample standard deviation of $s$ microamps.
Find (in microamps) a 99% confidence interval estimate for the mean current required to achieve a
particular brightness level.
Solution:
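Since $n = 10$ is small and $\sigma$ is unknown, the small-sample $t$ interval is the natural choice here, assuming the tube currents are approximately normally distributed. A sketch of the setup in terms of the sample mean $\bar{x}$ and sample standard deviation $s$, with $t_{0.005,\,9} = 3.250$ taken from a standard $t$ table:

```latex
\bar{x} - 3.250\,\frac{s}{\sqrt{10}} \;<\; \mu \;<\; \bar{x} + 3.250\,\frac{s}{\sqrt{10}}
```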
Example 5.6:
Taking a random sample of 35 individuals waiting to be served by the teller, we find that the
mean waiting time was 22.0 min and the standard deviation was 0.8 min. Construct a 90%
confidence interval estimate of the mean waiting time for all individuals waiting in the service line.
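A possible solution can be sketched in Python. It assumes that a sample of 35 counts as large, so the sample standard deviation s stands in for $\sigma$ and the normal critical value is used; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_interval(x_bar: float, s: float, n: int, confidence: float):
    """Interval x_bar +/- z_{alpha/2} * s / sqrt(n), with s replacing sigma."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * s / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Waiting-time example: n = 35, sample mean = 22.0 min, s = 0.8 min, 90% confidence
lower, upper = large_sample_interval(x_bar=22.0, s=0.8, n=35, confidence=0.90)
print(round(lower, 2), round(upper, 2))
```

Under these assumptions the interval runs from roughly 21.78 to 22.22 minutes.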
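5.4 The Confidence Interval for the Difference Between Two Population Means
Case 1: Variances, $\sigma_1^2$ and $\sigma_2^2$, known, for any sample sizes.
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Case 2: Variances, $\sigma_1^2$ and $\sigma_2^2$, unknown and large sample sizes ($n_1, n_2 \ge 30$).
Two different situations must be treated.
In Case 2 (a), $\sigma_1^2 \ne \sigma_2^2$:
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
In Case 2 (b), $\sigma_1^2 = \sigma_2^2$:
$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Where
$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Case 3: Variances, $\sigma_1^2$ and $\sigma_2^2$, unknown and small sample sizes ($n_1, n_2 < 30$).
As in Case 2, two different situations must be treated.
In Case 3 (a), $\sigma_1^2 \ne \sigma_2^2$:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
With $t_{\alpha/2,\,v}$ taken from the t table, and v is the degrees of freedom obtained from
$v = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$
In Case 3 (b), $\sigma_1^2 = \sigma_2^2$:
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2,\,v}\, S_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Where
$S_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
The degrees of freedom, $v = n_1 + n_2 - 2$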
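The two auxiliary quantities used when the variances are unknown, the pooled standard deviation and the approximate degrees of freedom for unequal variances, are easy to get wrong by hand. The sketch below assumes the standard textbook forms of these formulas; the function names and the numerical inputs are hypothetical.

```python
from math import sqrt

def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled standard deviation S_p, used when the two variances are assumed equal."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))

def unequal_variance_df(s1: float, n1: int, s2: float, n2: int) -> float:
    """Approximate degrees of freedom v, used when the variances are assumed unequal."""
    a, b = s1**2 / n1, s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))

def difference_interval(x1: float, x2: float, std_error: float, critical: float):
    """(x1 - x2) +/- critical * std_error; the critical value comes from a z or t table."""
    diff = x1 - x2
    return diff - critical * std_error, diff + critical * std_error

# Hypothetical summary statistics, chosen only for illustration
s1, n1, s2, n2 = 2.0, 10, 3.0, 12
print(round(pooled_sd(s1, n1, s2, n2), 4))
print(round(unequal_variance_df(s1, n1, s2, n2), 2))
```

When the value of v is not a whole number, it is commonly rounded down before the t table is consulted.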
Example 5.7
The breaking strength of yarn supplied by two manufacturers is being investigated. From
previous studies, we noted that $\sigma_1$ and $\sigma_2$ are known. A random sample of 20 test
specimens from each manufacturer results in sample means $\bar{x}_1$ and $\bar{x}_2$,
respectively. Find a 90% confidence interval for the difference in the mean breaking strength.
Example 5.8:
A study was conducted to compare the starting salaries for university graduates majoring in
education and engineering. A random sample of 50 recent university graduates in each major
was selected and the following information was obtained.
Major          Mean       SD
Education      RM 2000    RM 100
Engineering    RM 2900    RM 150
Construct a 99% confidence interval for the difference in the mean starting salaries for the two
majors.
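One way to approach this example is sketched below. It rests on assumptions that the example itself does not state: both samples (n = 50) are treated as large, the two population variances are not assumed equal, and the difference is taken as engineering minus education. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n_edu = n_eng = 50
mean_edu, sd_edu = 2000, 100   # RM
mean_eng, sd_eng = 2900, 150   # RM

# 99% confidence: alpha = 0.01, so use z_{0.005}
z = NormalDist().inv_cdf(1 - 0.01 / 2)

# Standard error of the difference, with sample SDs replacing sigma (large samples)
std_error = sqrt(sd_eng**2 / n_eng + sd_edu**2 / n_edu)

diff = mean_eng - mean_edu
lower, upper = diff - z * std_error, diff + z * std_error
print(round(lower, 2), round(upper, 2))
```

Under these assumptions the interval is roughly RM 834 to RM 966. Pooling the variances instead would give a slightly different standard error.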
5.5 Confidence Interval for Single Population Variance, $\sigma^2$, and Ratio of Two Variances, $\sigma_1^2$ and $\sigma_2^2$
In the previous sections, confidence intervals were constructed for a single mean and for the
difference between two means. In this section, we construct confidence intervals for
population variances.
There are two types of confidence interval for the population variance, as shown in
Figure 5.4.
[Figure 5.4: Types of Confidence Interval of Population Variance. Single variance: use the Chi-Square distribution. Ratio of two variances: use the F-distribution.]
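Formula 1: The Confidence Interval of the Single Variance, $\sigma^2$
$\frac{(n - 1)s^2}{\chi^2_{\alpha/2,\,v}} < \sigma^2 < \frac{(n - 1)s^2}{\chi^2_{1-\alpha/2,\,v}}$
Where $v = n - 1$ degrees of freedom, and $\chi^2_{\alpha/2,\,v}$ denotes the chi-square value with an area of $\alpha/2$ to its right.
Formula 2: The Confidence Interval for the Ratio of Two Variances, $\sigma_1^2/\sigma_2^2$
$\frac{s_1^2}{s_2^2}\cdot\frac{1}{f_{\alpha/2}(v_1, v_2)} < \frac{\sigma_1^2}{\sigma_2^2} < \frac{s_1^2}{s_2^2}\cdot f_{\alpha/2}(v_2, v_1)$
Where $v_1 = n_1 - 1$ and $v_2 = n_2 - 1$ degrees of freedom.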
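The single-variance interval can be sketched in Python by passing in the two chi-square critical values read from a table. The function name and the sample standard deviation below are hypothetical; the critical values 32.852 and 8.907 are the tabulated chi-square values for 19 degrees of freedom with right-tail areas 0.025 and 0.975.

```python
from math import sqrt

def variance_interval(s: float, n: int, chi2_upper: float, chi2_lower: float):
    """Interval for sigma^2: (n-1)s^2/chi2_upper < sigma^2 < (n-1)s^2/chi2_lower.

    chi2_upper is the table value with area alpha/2 to its right (the larger value);
    chi2_lower is the table value with area 1 - alpha/2 to its right (the smaller value).
    """
    numerator = (n - 1) * s**2
    return numerator / chi2_upper, numerator / chi2_lower

# Hypothetical sample: n = 20, s = 1.6; 95% confidence with v = 19 degrees of freedom
var_low, var_high = variance_interval(s=1.6, n=20, chi2_upper=32.852, chi2_lower=8.907)
print(round(var_low, 3), round(var_high, 3))

# Taking square roots gives the interval for the standard deviation
print(round(sqrt(var_low), 3), round(sqrt(var_high), 3))
```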
Example 5.9:
A sample of 20 cigarettes has a standard deviation $s$. Find the 95% confidence interval
for the variance and standard deviation of the nicotine content of the cigarettes manufactured.
Solution:
Example 5.10:
The data below were obtained from two normal populations. Find the 98% confidence interval for
$\sigma_1^2/\sigma_2^2$.
Solution: