Introduction to
Hypothesis Testing
Definition
Hypothesis testing or significance testing is a method for testing a claim
or hypothesis about a parameter in a population, using data measured
in a sample.
In this method, we test some hypothesis by determining the likelihood
that a sample statistic could have been selected, if the hypothesis
regarding the population parameter were true.
“The main goal in many research studies is to check whether the data
collected support certain statements or predictions”
Hypothesis testing is the method of testing whether claims regarding a
population are likely to be true.
Types of hypothesis
Hypotheses are broadly classified into two types.
A) Null Hypothesis: “The null hypothesis (H0), stated as the null, is a
statement about a population parameter, such as the population mean, that
is assumed to be true.
The null hypothesis is a starting point. We will test whether the value stated
in the null hypothesis is likely to be true.”
B) Alternative Hypothesis: “An alternative hypothesis (H1) is a statement that
directly contradicts a null hypothesis by stating that the actual value of
a population parameter is less than, greater than, or not equal to the value
stated in the null hypothesis.
The alternative hypothesis states what we think is wrong about the null
hypothesis.”
Note: H0 will ALWAYS have an equal sign (and possibly a less than or greater
than symbol, depending on the alternative hypothesis).
How to differentiate Tests
Examples: State the H0 and H1 for
each case
A researcher thinks that if expectant mothers use vitamins, the birth
weight of the babies will increase. The average birth weight of the
population is 8.6 pounds.
An engineer hypothesizes that the mean number of defects can be
decreased in a manufacturing process of compact disks by using robots
instead of humans for certain tasks. The mean number of defective
disks per 1000 is 18.
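As a worked reading of the two cases above, the hypotheses could be written as follows. This is a sketch that assumes μ denotes the population mean in each case (mean birth weight in pounds, and mean number of defective disks per 1000); the research claim goes in H1 in both.

```latex
\begin{align*}
\text{Vitamins and birth weight:}\quad & H_0\colon \mu = 8.6, & H_1\colon \mu > 8.6 \\
\text{Robots and defective disks:}\quad & H_0\colon \mu = 18, & H_1\colon \mu < 18
\end{align*}
```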
Important on Null Hypothesis
When a researcher conducts a study, he or she is generally looking for
evidence to support a claim of some type of difference.
In this case, the claim should be stated as the alternative hypothesis.
Because of this, the alternative hypothesis is sometimes called the
research hypothesis.
Important Definitions
Statistical Test – uses the data obtained from a sample to make a
decision about whether the null hypothesis should be rejected.
Test Value (test statistic) – the numerical value obtained from a
statistical test.
Type 1 and Type 2 Errors
When we make a conclusion from a statistical test there are two types of
errors that we could make. They are called: Type I and Type II Errors.
Type I error – reject H0 when H0 is true.
Type II error – do not reject H0 when H0 is false
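Results of a statistical test:
                      H0 is true          H0 is false
Reject H0             Type I error        Correct decision
Do not reject H0      Correct decision    Type II error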
Example: Decision Errors in a Legal Trial. What are H0 and H1?
H0: Defendant is innocent.
H1: Defendant is not innocent, i.e., guilty
If you are the defendant, which is the worse error? Why?
The decision of the jury does not prove that the defendant did or did not
commit the crime.
The decision is based on the evidence presented.
If the evidence is strong enough the defendant will be convicted in most
cases, if it is weak the defendant will be acquitted.
So the decision to reject the null hypothesis does not prove anything.
The question is: how large a difference is enough to say we have enough
evidence to reject the null hypothesis?
Significance level – the maximum probability of committing a Type I
error. This probability is symbolized by α:
P(reject H0 | H0 is true) = α
Critical or Rejection Region – the range of values for the test value that
indicate a significant difference and that the null hypothesis should be
rejected.
Non-critical or Non-rejection Region – the range of values for the test
value that indicates that the difference was probably due to chance and
that the null hypothesis should not be rejected.
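The link between the significance level and the Type I error can be checked with a small simulation. The sketch below is illustrative only: the function name, the choice of a left-tailed z test, the N(0, 1) population, and the sample size and trial count are all assumptions made for the demonstration. It draws many samples for which H0 is really true and counts how often the test value lands in the critical region.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def type_one_error_rate(alpha=0.05, n=30, trials=2000, seed=1):
    """Estimate how often a left-tailed z test rejects H0 when H0 is true."""
    rng = random.Random(seed)
    mu0, sigma = 0.0, 1.0                    # assumed population; H0: mu = mu0 is true
    critical = NormalDist().inv_cdf(alpha)   # boundary of the left-tail critical region
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (mean(sample) - mu0) / (sigma / sqrt(n))
        if z < critical:                     # test value falls in the critical region
            rejections += 1
    return rejections / trials


rate = type_one_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to the chosen alpha, which is what "maximum probability of committing a Type I error" means in practice.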
Some Important Definitions
Critical Value (CV) – separates the critical region from the non-critical
region, i.e., when we should reject H0 from when we should not reject
H0.
The location of the critical value depends on the inequality sign of the
alternative hypothesis.
Depending on the distribution of the test value, you will use different tables to
find the critical value.
Importance on test
One-tailed test – indicates that the null hypothesis should be rejected
when the test value is in the critical region on one side.
Left-tailed test – when the critical region is on the left side of the distribution
of the test value.
Right-tailed test – when the critical region is on the right side of the
distribution of the test value.
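When the test value follows the standard normal distribution, the critical value for a one-tailed test can be computed instead of read from a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` and its `tail` argument are hypothetical, introduced only for illustration.

```python
from statistics import NormalDist


def z_critical(alpha, tail):
    """Return the z critical value for a one-tailed test at level alpha."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "left":
        # Area alpha lies to the left of the critical value.
        return std_normal.inv_cdf(alpha)
    if tail == "right":
        # Area alpha lies to the right of the critical value.
        return std_normal.inv_cdf(1 - alpha)
    raise ValueError("tail must be 'left' or 'right'")


left_cv = z_critical(0.05, "left")
right_cv = z_critical(0.05, "right")
print(f"left-tailed:  reject H0 when z < {left_cv:.3f}")
print(f"right-tailed: reject H0 when z > {right_cv:.3f}")
```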
Hypothesis Test Procedure
(Traditional Method)
Step 1 State the hypotheses and identify the claim.
Step 2 Find the critical value(s) from the appropriate table.
Step 3 Compute the test value.
Step 4 Make the decision to reject or not reject the null hypothesis.
Step 5 Summarize the results.
Hypothesis Flow
Test 1
MOTIVATING SCENARIO: It has been reported that the average credit
card debt for college seniors is $3262.
The student senate at a large university feels that their seniors have a
debt much less than this, so it conducts a study of 50 randomly selected
seniors and finds that the average debt is $2995, and the population
standard deviation is $1100.
Can we support the student senate’s claim using the data collected?
How….the z Test for a Mean
A statistical test uses the data obtained from a sample to make a
decision about whether the null hypothesis should be rejected.
The numerical value obtained from a statistical test is called the test
value.
You will notice that our statistical tests will resemble the general formula
for a z-score:
Test Value = (observed value – expected value) / standard error
The z test for Means
The z test is a statistical test for the mean of a population.
It can be used when n ≥ 30, or when the population is normally
distributed and σ is known.
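The formula for the z-test is:
z = (x̄ − μ) / (σ / √n)
where x̄ is the sample mean, μ is the hypothesized population mean,
σ is the population standard deviation, and n is the sample size.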
Example: It has been reported that the average credit card debt for
college seniors is $3262. The student senate at a large university feels
that their seniors have a debt much less than this, so it conducts a study
of 50 randomly selected seniors and finds that the average debt is
$2995, and the population standard deviation is $1100.
Let’s conduct the test with a Type I error rate of α = 0.05.
Step 1: State the hypotheses and identify the claim.
H0: μ = 3262
H1: μ < 3262 (claim)
Step 2: Find the critical value(s) from the appropriate table.
Left-tailed test, α = 0.05: the critical value of Z will be negative and have
probability 0.05 underneath it (to its left), so Z = −1.645.
Step 3: Compute the test value.
Z = (2995 − 3262) / (1100 / √50) = −1.716341
Step 4: Make the decision to reject or not reject the null hypothesis.
Since this is a left-tailed test, our rejection region consists of values of Z that
are smaller than our critical value of Z = −1.645. Since our test value
(−1.716341) is less than our critical value (−1.645), we reject the null
hypothesis.
Step 5: Summarize the results. We have evidence to support the student
senate’s claim that the university’s seniors have credit card debt that is less
than the reported average debt. This is based on a Type I error rate of 0.05:
if the null hypothesis were true, a test like this would falsely reject it 5%
of the time.
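The five steps of the credit card debt example can be sketched in a few lines of Python. The numbers come from the example; the variable names are illustrative, and the critical value is computed with the standard library's `statistics.NormalDist` rather than looked up in a table.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example.
sample_mean = 2995.0   # average debt in the sample of seniors
mu0 = 3262.0           # reported average debt (value stated in H0)
sigma = 1100.0         # population standard deviation
n = 50                 # sample size
alpha = 0.05           # significance level

# Step 2: left-tailed critical value (area alpha to its left).
critical_value = NormalDist().inv_cdf(alpha)

# Step 3: test value = (observed - expected) / standard error.
standard_error = sigma / sqrt(n)
z = (sample_mean - mu0) / standard_error

# Step 4: reject H0 when the test value falls in the left-tail critical region.
reject_h0 = z < critical_value

print(f"critical value = {critical_value:.3f}")
print(f"test value z   = {z:.6f}")
print("decision:", "reject H0" if reject_h0 else "do not reject H0")
```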
IMPORTANT NOTE:
When the null hypothesis is not rejected, we do not accept it as
true. There is merely not enough evidence to say that it is false.
We conclude the alternative hypothesis (when we reject the null)
because the data clearly support that conclusion.
P-Value Method for Hypothesis Testing
We often test hypotheses at common levels of significance (α = 0.05, or
0.01). Recall that the choice of alpha depends on the seriousness of the
Type I error. There is another approach that utilizes a P-value.
The P-Value (or probability value) is the probability of getting a sample
statistic (such as the mean) or a more extreme sample statistic in the
direction of the alternative hypothesis when the null hypothesis is true.
The P-value is the actual area under the standard normal distribution
curve of the test value or a more extreme value (further in the tail).
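For the left-tailed credit card debt example, the P-value is the area under the standard normal curve to the left of the test value. A minimal sketch, assuming the same numbers as that example and using the standard library's `statistics.NormalDist`:

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05

# Test value from the credit card debt example.
z = (2995 - 3262) / (1100 / sqrt(50))

# Left-tailed test: "more extreme" means further into the left tail,
# so the P-value is the cumulative area up to z.
p_value = NormalDist().cdf(z)

print(f"z = {z:.4f}, P-value = {p_value:.4f}")
print("decision:", "reject H0" if p_value < alpha else "do not reject H0")
```

For a right-tailed test the corresponding area would be `1 - NormalDist().cdf(z)`. Comparing the P-value with alpha leads to the same decision as comparing the test value with the critical value.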
t Test for a Mean
When a population is normally or approximately normally distributed,
but the population standard deviation is unknown, the z test is
inappropriate for testing hypotheses involving means.
Instead we will use the t test when sigma is unknown and the
distribution of the variable is approximately normal.
The one-sample t test is a statistical test for the mean of a population
and is used when the population is normally or approximately normally
distributed and σ is unknown.
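The formula for the test value of the one-sample t test is:
t = (x̄ − μ) / (s / √n)
where s is the sample standard deviation. The degrees of freedom are
d.f. = n − 1.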
T-Test Example:
Example: Find the critical t value for α = 0.01 with a sample size of 13
for a left-tailed test.
Left-tailed means the critical t value will be negative.
n = 13 means the degrees of freedom are n − 1 = 12.
The critical value is −2.681.
Example:
We wish to test whether mean normal body temperature is less than 98.6
degrees. In a random sample of n = 18 individuals, the sample mean
was found to be 98.217 and the sample standard deviation was 0.684. Assume
the population is normally distributed. Use α = 0.05.
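Step 1: State the hypotheses and identify the claim.
H0: μ = 98.6
H1: μ < 98.6 (claim)
Step 2: Find the critical value(s) from the appropriate table.
Left-tailed test, α = 0.05, d.f. = n − 1 = 17, so the critical value is
t = −1.740.
Step 3: Compute the test value.
t = (98.217 − 98.6) / (0.684 / √18) ≈ −2.376
Step 4: Make the decision to reject or not reject the null hypothesis.
Since our p-value (between 0.01 and 0.025 from the t table with 17 d.f.) is
less than our α = 0.05, we reject the null hypothesis. The same conclusion
is reached by looking at the critical value: our test value (−2.376) is
smaller than the critical value of −1.740.
You only need to do it one way. The decision will always match.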
Step 5 : Summarize the results. We have enough evidence to support
the claim that average body temperature is less than 98.6 degrees.
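The body temperature example can be reproduced with a short calculation. Python's standard library has no t distribution, so this sketch hard-codes the tabulated critical value for 17 degrees of freedom; that lookup, and the variable names, are assumptions of the sketch rather than part of the original slides.

```python
from math import sqrt

# Values from the example.
sample_mean = 98.217
mu0 = 98.6             # value stated in H0
s = 0.684              # sample standard deviation
n = 18

df = n - 1             # degrees of freedom

# Test value for the one-sample t test.
t = (sample_mean - mu0) / (s / sqrt(n))

# Tabulated critical value for a left-tailed test, alpha = 0.05, d.f. = 17
# (taken from a t table, not computed here).
t_critical = -1.740

reject_h0 = t < t_critical

print(f"d.f. = {df}, t = {t:.3f}, critical value = {t_critical}")
print("decision:", "reject H0" if reject_h0 else "do not reject H0")
```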
Setting up a Test for a Population
Proportion
Hypothesis Testing
Why do we do Hypothesis Tests?
Problem statements
C.S. Mott Children’s Hospital Poll: C.S. Mott Children’s Hospital conducted
a national poll on an issue in children’s health, sleep habits. We will be
looking at an example about lack of sleep in teens.
Research Question
In previous years, 52% of parents believed that electronics and
social media were the cause of their teenager’s lack of sleep. Do
more parents today believe that their teenager’s lack of sleep is
caused by electronics and social media?
Population - Parents with a teenager (age 13-18)
Parameter of Interest - p
Test for a significant increase in the proportion of parents with a teenager
who believe that electronics and social media is the cause for lack of sleep
H0 : p = 0.52
Ha : p > 0.52
Where p is the population proportion of parents with a teenager who believe that
electronics and social media is the cause of their teenager’s lack of sleep
𝛂 = 0.05
Survey Results:
A random sample of 1018 parents with a teenager was taken, and 56%
said they believe electronics and social media were the cause of their
teenager’s lack of sleep.
Assumptions :
We need a random sample of parents
We also need a large enough sample size to ensure our distribution of
sample proportions is normal
Testing a One Population Proportion
Test Statistic = (Best estimate − Hypothesized estimate) / Standard error of
estimate
z = (0.56 − 0.52) / √(0.52 × 0.48 / 1018) ≈ 2.555
p-value = 0.0053 < α = 0.05
Reject the null hypothesis (H0 : p = 0.52).
There is sufficient evidence to conclude that the population proportion of
parents with a teenager who believe that electronics and social media is
the cause for lack of sleep is greater than 52%.
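The one-proportion test above can be sketched as follows. This assumes the usual large-sample z test in which the standard error is computed from the hypothesized proportion (0.52); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p_hat = 0.56    # sample proportion (best estimate)
p0 = 0.52       # hypothesized proportion under H0
n = 1018        # sample size
alpha = 0.05

# Large-sample check: expected successes and failures under H0.
large_enough = n * p0 >= 10 and n * (1 - p0) >= 10

# Standard error of the sample proportion, computed under H0.
standard_error = sqrt(p0 * (1 - p0) / n)

# Test statistic = (best estimate - hypothesized estimate) / standard error.
z = (p_hat - p0) / standard_error

# Right-tailed alternative (Ha: p > 0.52): area to the right of z.
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.3f}, p-value = {p_value:.4f}")
print("decision:", "reject H0" if p_value < alpha else "fail to reject H0")
```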
Setting Up a Test of Difference in
Population Proportions
C.S. Mott Children’s Hospital Poll:
C.S. Mott Children’s Hospital conducted a national poll on an issue in
children’s health, water safety. We will be looking at an example about
swimming lessons.
Research Question
Is there a significant difference between the population proportions of
parents of black children and parents of Hispanic children who report
that their child has had some swimming lessons?
Populations - All parents of black children age 6-18 and all parents of
Hispanic children age 6-18
Parameter of Interest - p1 - p2
Test for a significant difference in the population proportions of
parents reporting that their child has had swimming lessons at
the 10% significance level
Hypotheses
H0 : p1 - p2 = 0
Ha : p1 - p2 ≠ 0
𝛂 = 0.10
Survey Results
A sample of 247 parents of black children age 6 -18 was taken with 91
saying that their child has had some swimming lessons.
A sample of 308 parents of Hispanic children age 6 -18 was taken with
120 saying that their child has had some swimming lessons.
We need to assume that we have two independent random
samples.
Best Estimate of the Parameter
p̂1 = 91/247 ≈ 0.37
p̂2 = 120/308 ≈ 0.39
p̂1 - p̂2 = 0.37 - 0.39 = -0.02
Test Statistic = (Best estimate − Hypothesized estimate) / Standard error of
estimate
Decision & Conclusion
p-val = 0.63 > 0.10 = α → fail to reject null hypothesis → don’t have
evidence against equal population proportions
Formally, based on our sample and our p-value, we fail to reject the
null hypothesis. We conclude that there is no significant difference
between the population proportions of parents of black children and
parents of Hispanic children who report that their child has had
swimming lessons.
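The two-proportion comparison can be sketched in the same way. The slides do not say which standard error was used, so this sketch assumes the common pooled-proportion standard error under H0: p1 − p2 = 0; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Survey counts from the example.
x1, n1 = 91, 247     # parents of black children: yes responses, sample size
x2, n2 = 120, 308    # parents of Hispanic children: yes responses, sample size
alpha = 0.10

p1_hat = x1 / n1
p2_hat = x2 / n2
difference = p1_hat - p2_hat           # best estimate of p1 - p2

# Pooled proportion under H0 (assumed choice of standard error).
p_pooled = (x1 + x2) / (n1 + n2)
standard_error = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

# Test statistic = (best estimate - hypothesized value 0) / standard error.
z = (difference - 0) / standard_error

# Two-sided alternative: double the area in the tail beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"difference = {difference:.4f}, z = {z:.3f}, p-value = {p_value:.3f}")
print("decision:", "reject H0" if p_value < alpha else "fail to reject H0")
```

With unrounded proportions this gives a p-value of roughly 0.61, slightly below the 0.63 quoted above; the gap is consistent with the slides rounding the difference to −0.02 before dividing, though that is an inference. The decision is the same either way.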
Testing Hypotheses about a
Population Mean
Research Question:
“Is the average cartwheel distance (in inches) for adults more
than 80 inches?”
Population: All adults
Parameter of Interest: population mean cartwheel distance μ
Perform a one-sample test regarding the value for the mean cartwheel
distance for the population of all such adults.
Step 1:
Define the Null and Alternative
Null: Population mean CW distance (μ) is 80 inches
Alternative: Population mean is greater than (>) 80 inches
Significance Level = 5%
More compact notation:
H0 : μ = 80
Ha : μ > 80
where μ represents the population mean cartwheel distance (inches) for
all adults
Step 2: Examine Results, Check
Assumptions, Summarize Data via Test
Statistic
Test Statistic Interpretation
Step 3: Determine P-Value
P-value = 0.21
If the population mean CW distance really was 80 inches,
then observing a sample mean of 82.48 inches
(i.e. a t statistic of 0.82) or larger would be fairly likely.
Step 4: Make a Decision about the
Null
Since our P-value is much bigger than the 0.05 significance level, we have
only weak evidence against the null → we fail to reject the null!
Based on the estimated mean (82.48 inches), we cannot support the claim
that the population mean CW distance is greater than 80 inches.
Testing a Population Mean Difference
20 homes are remodeling their kitchens, and each requests cabinet quotes
from 2 suppliers. Is there an average difference in cabinet quotes from
these two suppliers?
Variable: Difference in cabinet quotes (Supplier A – Supplier B)
Research Question
Is there an average difference between the cabinet quotes from
the suppliers?
Hypotheses
H0 : μd = 0
Ha : μd ≠ 0
𝛂 = 0.05
Graph
Decision & Conclusion
p-val = 0.014 < 0.05 = α → reject null hypothesis → have evidence against
the mean difference in cabinet quotes being 0
Formally, based on our sample and our p-value, we reject the null
hypothesis.
We conclude that the mean difference in cabinet quote prices
(Supplier A minus Supplier B) is significantly different from 0.
Thank You