Statistics Assignment 2 (Team 3) - 1


CHI-SQUARE TEST FOR GOODNESS OF FIT
TEAM MEMBERS:
SATHYA K
SHABNA J
SHAKIL MOHAMED KHAN R
SHOBIKA B
SIREEN N
SIVASANGARI B
SRI VISHVA BALAJI S
SUBASHCHANDRAN V
SUJAN S
SUNDAR A
01
Introduction
The Chi-square test for goodness of fit
determines whether observed categorical
data are consistent with a specific
distribution or theoretical model. It is
one of the oldest tests in statistical
theory.
02
History
❖ Karl Pearson (1900): The Chi-square test was introduced by
British statistician Karl Pearson in 1900, making it one of the
earliest statistical tests for comparing observed data with
expected frequencies. Pearson's original goal was to assess
whether categorical data followed a specific theoretical
distribution, such as the normal distribution.
❖ He proposed the Chi-square (χ²) statistic as:
$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$
❖ Development for Categorical Data: Pearson's chi-square test was
primarily designed for categorical data, where the goal is to test if
observed frequencies in different categories match expected
frequencies under a hypothesized distribution, such as a uniform
distribution or a theoretical model like Mendelian inheritance in
genetics.
❖ Extension to Degrees of Freedom: The Chi-square test depends on the
concept of degrees of freedom (df), the number of independent values in
the final calculation that are free to vary. Pearson's original treatment
did not reduce df when parameters were estimated from the data; R.A. Fisher
corrected this in the 1920s, which shaped how the test is applied across
different scenarios.
❖ R.A. Fisher's Contributions: Ronald A. Fisher, another key figure in
statistics, made several significant contributions to the Chi-square test
and its applications. He helped formalize the use of the test in
hypothesis testing, particularly in analyzing the goodness of fit for
statistical models.
❖ Neyman-Pearson Framework (1930s): Jerzy Neyman and Egon
Pearson (Karl Pearson's son) developed the Neyman-Pearson
framework in the 1930s, which laid the foundation for modern
hypothesis testing. While their work was more focused on the
development of the likelihood ratio test, it complemented the Chi-
square test by offering formalized approaches to deciding when to
reject null hypotheses.
❖ Applications in Genetics: The Chi-square test found important
early applications in genetics. One of the first practical uses was in
testing Mendelian inheritance ratios in genetic experiments, where
researchers could compare observed offspring ratios to expected
ratios under Mendel's laws.
❖ Use in Modern Statistics: Today, the Chi-square test for goodness
of fit is widely used in many fields, including social sciences,
economics, marketing, and biology, for hypothesis testing and
model validation. Its simplicity and versatility make it a foundational
tool in statistics.
03
Objectives
❖ Test the Fit of a Distribution: The primary goal is to check if the
sample data follows a specific distribution (e.g., uniform, normal, or
any other hypothesized distribution).

❖ Compare Observed and Expected Frequencies: It helps in
comparing the observed frequencies (from the actual data) with the
expected frequencies (calculated under the assumption of the null
hypothesis) to see if there is a significant difference.
❖ Assess Hypotheses:
➢ Null hypothesis (H₀): The data fits the expected distribution.
➢ Alternative hypothesis (H₁): The data does not fit the expected
distribution.

❖ Quantify the Deviation: The test quantifies the difference between
observed and expected values, allowing researchers to determine
whether these differences are large enough to suggest that the
data does not come from the expected distribution.
04
Application
❖ Genetics (Testing Mendelian Ratios): In genetic experiments, the Chi-square
test is used to determine whether the observed proportions of traits
in offspring (e.g., dominant vs. recessive) follow the expected Mendelian
inheritance ratios.
➢ Example: Testing if a population of plants follows the expected 3:1 ratio for
dominant and recessive traits.
❖ Marketing (Customer Preferences): Businesses use the Chi-square test to
see if observed customer preferences (e.g., product choices or purchase
behavior) match expected market trends.
➢ Example: A company may compare the distribution of preferences among
different product categories with the expected preferences based on prior
market research.
❖ Social Sciences (Survey Analysis): The test is applied to assess whether the
observed distribution of responses (e.g., satisfaction levels, voting
preferences) fits expected proportions.
➢ Example: In a survey about political affiliation, researchers may test whether
the distribution of respondents' affiliations matches the national averages.
❖ Education (Grading Distributions): Schools may use the Chi-square test to
compare observed grade distributions (e.g., As, Bs, Cs) in a class to an
expected distribution, such as a bell curve.
➢ Example: Testing if the grades from a particular exam fit a normal
distribution, indicating whether the exam was appropriately challenging.
❖ Manufacturing and Quality Control (Defect Testing): In quality control,
companies use the test to determine whether the frequency of defects in
manufactured products follows an expected distribution.
➢ Example: A factory might test if the number of defective items in different
batches follows an expected pattern of defects, or if certain batches have
unexpectedly high defect rates.
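As a rough illustration of the genetics application above, the sketch below turns a hypothesized 3:1 ratio into expected counts and computes the Chi-square statistic. The offspring counts (290 dominant, 110 recessive) and the helper names `expected_from_ratio` and `chi_square_statistic` are hypothetical, chosen only for illustration; the critical value 3.841 is the standard table value for df = 1 at α = 0.05.

```python
def expected_from_ratio(total, ratio):
    """Split `total` observations across categories in proportion to `ratio`."""
    parts = sum(ratio)
    return [total * r / parts for r in ratio]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts, not taken from any real experiment.
observed = [290, 110]  # dominant, recessive
expected = expected_from_ratio(sum(observed), [3, 1])  # 300 and 100
chi2 = chi_square_statistic(observed, expected)

# Two categories give df = 2 - 1 = 1; table value at alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841
reject = chi2 > CRITICAL_DF1_ALPHA_05
print(f"chi-square = {chi2:.3f}, reject H0: {reject}")
```

With these made-up counts the statistic is about 1.333, below 3.841, so the 3:1 hypothesis would not be rejected.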
05
Example Problem
❖ Purpose:
The main goal of the Chi-square test for goodness of fit is to
assess whether a sample data set matches an expected
distribution. It is commonly used in situations like:
➢ Testing whether a die is fair (each face has an equal
probability).
➢ Checking whether the observed genetic traits follow
Mendelian ratios.
➢ Verifying if survey responses match expected population
proportions.
❖ Hypotheses:
➢ Null hypothesis (H₀): The observed frequencies follow
the expected distribution.
➢ Alternative hypothesis (H₁): The observed frequencies
do not follow the expected distribution.

❖ Formula:
➢ The Chi-square statistic (χ²) is calculated as:
$\chi^2 = \sum_{i} \frac{(O_i - E_i)^2}{E_i}$
➢ Where:
• $O_i$ = Observed frequency for category i
• $E_i$ = Expected frequency for category i
• The sum is taken over all categories.
❖ Steps to Conduct the Test:
➢ Determine Observed and Expected Frequencies:
a. Collect the observed frequencies for each category.
b. Calculate the expected frequencies based on the
hypothesized distribution.
➢ Calculate the Chi-square Statistic:
Apply the formula by summing the squared differences
between observed and expected values, divided by the
expected values.
➢ Degrees of Freedom (df):
When the expected frequencies are fully specified by the
hypothesis (no parameters estimated from the data), the
degrees of freedom are:
$df = k - 1$
Where k is the number of categories or groups.
➢ Determine the Critical Value:
Use the Chi-square distribution table to find the critical
value for the chosen significance level
(e.g., α = 0.05) with the appropriate degrees of
freedom.
➢ Make a Decision:
Compare the calculated Chi-square statistic to the
critical value from the Chi-square distribution table.
a) If χ² exceeds the critical value, reject the null
hypothesis (indicating that the observed data does
not fit the expected distribution).
b) If χ² is less than or equal to the critical value, fail to
reject the null hypothesis.
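The steps above can be sketched as a single function. This is a minimal illustration, not a full implementation: the function name `goodness_of_fit`, the three-category data, and the small lookup table are assumptions made for the example. The table holds the standard α = 0.05 critical values for df = 1 to 5 only, so larger problems would need a complete table or a statistics library.

```python
# Upper-tail critical values of the chi-square distribution at alpha = 0.05,
# copied from a standard table for df = 1..5 only.
CRITICAL_VALUES_ALPHA_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def goodness_of_fit(observed, expected):
    """Follow the steps above; return (statistic, df, critical value, reject?)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected need the same number of categories")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    # Step: chi-square statistic, sum of (O - E)^2 / E.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Step: degrees of freedom, k - 1 (no parameters estimated from the data).
    df = len(observed) - 1
    # Step: critical value (raises KeyError if df is outside the small table).
    critical = CRITICAL_VALUES_ALPHA_05[df]
    # Step: decision, reject H0 only if the statistic exceeds the critical value.
    return chi2, df, critical, chi2 > critical


# Hypothetical data: 90 observations, equal expected share per category.
chi2, df, critical, reject = goodness_of_fit([45, 25, 20], [30, 30, 30])
print(chi2, df, critical, reject)
```

For this made-up data the statistic is 350/30 (about 11.67) with df = 2, which exceeds 5.991, so the null hypothesis would be rejected.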
❖ Example:
Suppose you roll a six-sided die 60 times and observe the following frequencies
for each face:

Face    Observed Frequency
1       8
2       10
3       12
4       9
5       11
6       10
➢ You expect the die to be fair, so the expected frequency for each face
is E = 60/6 = 10.

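➢ Now, apply the Chi-square formula:
▪ $\chi^2 = \frac{(8-10)^2}{10} + \frac{(10-10)^2}{10} + \frac{(12-10)^2}{10} + \frac{(9-10)^2}{10} + \frac{(11-10)^2}{10} + \frac{(10-10)^2}{10}$
▪ $\chi^2 = \frac{4}{10} + 0 + \frac{4}{10} + \frac{1}{10} + \frac{1}{10} + 0 = 1$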

➢ For this test, df = 6 − 1 = 5. Using a Chi-square table
for df = 5 at α = 0.05, the critical value is 11.07. Since 1 < 11.07, we
fail to reject the null hypothesis: the observed rolls are consistent with a fair die.
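The arithmetic in the die example can be checked with a short script. The sketch below uses Python's `fractions.Fraction` so that each term is exact rather than a rounded float; the variable names are illustrative, and the critical value 11.07 is the table value quoted in the example.

```python
from fractions import Fraction

# Observed counts from the example: face -> number of times rolled.
observed = {1: 8, 2: 10, 3: 12, 4: 9, 5: 11, 6: 10}

rolls = sum(observed.values())              # 60 rolls in total
expected = Fraction(rolls, len(observed))   # 10 per face under a fair die

# Per-face contribution (O - E)^2 / E, kept as exact fractions.
contributions = {
    face: (count - expected) ** 2 / expected
    for face, count in observed.items()
}
chi2 = sum(contributions.values())

df = len(observed) - 1                      # 6 - 1 = 5
CRITICAL_DF5_ALPHA_05 = 11.07               # table value used in the example
reject = chi2 > CRITICAL_DF5_ALPHA_05

print(f"chi-square = {chi2}, df = {df}, reject H0: {reject}")
```

Only faces 1, 3, 4, and 5 contribute to the sum (4/10, 4/10, 1/10, and 1/10), which matches the hand calculation of χ² = 1.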
❖ Applications:
➢ Genetics: Testing the fit of observed genetic trait distributions to expected
Mendelian ratios.
➢ Marketing: Assessing whether the distribution of customer preferences
matches expectations.
➢ Survey analysis: Verifying if the distribution of responses matches a
hypothesized distribution.

The Chi-square test for goodness of fit is a versatile and widely used tool in
statistics, providing insights into whether empirical data align with theoretical
models.
06
Conclusion
The chi-square test for goodness of fit is a valuable
statistical tool used to assess whether an observed
frequency distribution conforms to an expected
theoretical distribution. It helps determine whether
the differences between observed and expected values
are small enough to attribute to random variation or
are statistically significant. By comparing the
chi-square statistic to a critical value from the
chi-square distribution, researchers can decide
whether to reject or fail to reject the null
hypothesis, which states that the observed data
follows the expected distribution.
Thank You