
Lecture Material 7

Uploaded by

Ali Naseer

Lecture_material_7

March 7, 2024

1 Analysis of variance (ANOVA)


Analysis of Variance (ANOVA) is a statistical method used to analyze the differences among group
means in a sample.
It extends the two-sample t-test and is particularly useful when comparing the means of more than
two groups.
ANOVA tests the null hypothesis that all group means are equal against the alternative hypothesis
that at least one group mean is different.
Key Concepts of ANOVA
Between-Groups Variability: ANOVA assesses the variance between different groups and compares it
to the variance within each group.
F-Statistic: ANOVA produces an F-statistic, which is the ratio of between-group variance to
within-group variance. A large F-statistic, relative to the F distribution expected under the null
hypothesis, suggests that at least one group mean differs from the others.
Post Hoc Tests: If ANOVA indicates significant differences, post hoc tests (e.g., Tukey's HSD or
pairwise t-tests with a Bonferroni correction) may be used to identify which specific group means
are different.
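
To make the ratio of between-group to within-group variance concrete, here is a minimal sketch that computes the F-statistic by hand and cross-checks it against `scipy.stats.f_oneway`. The three groups are hypothetical values invented for illustration, not data from this lecture.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for three groups (illustrative values only)
groups = [
    np.array([4.9, 5.1, 5.0, 4.8, 5.2]),
    np.array([5.9, 6.1, 6.0, 5.8, 6.3]),
    np.array([6.4, 6.6, 6.5, 6.9, 6.3]),
]

k = len(groups)                         # number of groups
n_total = sum(len(g) for g in groups)   # total number of observations
grand_mean = np.concatenate(groups).mean()

# Between-group sum of squares: spread of the group means around the grand mean
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Within-group sum of squares: spread of observations around their own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Mean squares divide each sum of squares by its degrees of freedom
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)

f_stat = ms_between / ms_within
# The upper-tail probability of the F distribution gives the p-value
p_value = stats.f.sf(f_stat, k - 1, n_total - k)

# Cross-check against SciPy's built-in one-way ANOVA
f_check, p_check = stats.f_oneway(*groups)
print(f"F = {f_stat:.3f}, p = {p_value:.3g}")
```

The hand-computed values should agree with SciPy's result up to floating-point error, which shows that the F-statistic is simply the between-group mean square divided by the within-group mean square.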

[ ]: import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

[ ]: kashti=sns.load_dataset('titanic')

[ ]: kashti.head(5)

[ ]: sns.boxplot(x='sex',y='age',data=kashti)

[ ]: sns.boxplot(x='class',y='age',data=kashti)

[ ]: phool=sns.load_dataset('iris')

[ ]: phool.head(5)

[ ]: phool.columns

[ ]: phool.describe()

[ ]: sns.boxplot(x='species',y='sepal_length',data=phool)

One-Way ANOVA: Used when there is one independent variable (factor), typically with three or more
levels (groups). Tests whether there are any statistically significant differences between the
means of the groups.
Hypotheses:
Null Hypothesis (H0): The means of all groups are equal.
Alternative Hypothesis (H1): At least one group mean is different.
Example: Testing if there is a significant difference in test scores among students who used three
different teaching methods.

[ ]: import statsmodels.api as sm
from statsmodels.formula.api import ols

[ ]: # One-way ANOVA: fit a linear model with species as the only factor
mod = ols('sepal_length ~ species', data=phool).fit()
mod

[ ]: # Type 2 sums of squares: the effect of one factor is adjusted for the effect of the other factors
aov_table = sm.stats.anova_lm(mod, typ=2)
print(aov_table)
# .iloc[0] selects the first row (species) by position
if aov_table['PR(>F)'].iloc[0] < 0.05:
    print('Reject null hypothesis')

Two-Way ANOVA: Used when there are two independent variables (factors) acting on one dependent
variable. It can assess the main effect of each independent variable and their interaction effect.
Hypotheses:
Null Hypothesis (H0): Neither factor nor their interaction changes the group means. In practice
each main effect and the interaction is tested separately, with its own F-statistic.
Alternative Hypothesis (H1): At least one of the factors, or their interaction, changes the group
means.
Example: Testing if there is an interaction between two factors (e.g., treatment and gender) on a
response variable (e.g., blood pressure).

2 Pairwise comparison
[ ]: pair_t=mod.t_test_pairwise('species', method='bonferroni')
pair_t.result_frame
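
The idea behind the Bonferroni method can be sketched by hand: run a t-test for every pair of groups, then multiply each raw p-value by the number of comparisons (capped at 1). The sketch below uses `scipy.stats.ttest_ind` on hypothetical samples, so its numbers will not match `t_test_pairwise`, which works from the fitted model's pooled residual variance.

```python
from itertools import combinations

import numpy as np
from scipy import stats

# Hypothetical samples for three groups (illustrative values only)
samples = {
    'a': np.array([4.9, 5.1, 5.0, 4.8, 5.2]),
    'b': np.array([5.9, 6.1, 6.0, 5.8, 6.3]),
    'c': np.array([6.0, 6.2, 5.9, 6.1, 6.4]),
}

pairs = list(combinations(samples, 2))   # every unordered pair of group names
n_tests = len(pairs)                     # number of comparisons being made

adjusted = {}
for g1, g2 in pairs:
    # Ordinary two-sample t-test for this pair
    _, p_raw = stats.ttest_ind(samples[g1], samples[g2])
    # Bonferroni: multiply by the number of comparisons, capped at 1
    adjusted[(g1, g2)] = min(1.0, p_raw * n_tests)

for pair, p_adj in adjusted.items():
    verdict = 'different' if p_adj < 0.05 else 'not distinguishable'
    print(pair, round(p_adj, 4), verdict)
```

The correction keeps the overall chance of a false positive across all comparisons at or below the chosen significance level, at the cost of making each individual comparison more conservative.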

3 Tukey HSD test


[ ]: # pip install pingouin

[ ]: import pingouin as pg
# First develop anova table
aov=pg.anova(dv='sepal_length', between='species', data=phool, detailed=True)
print(aov)
# The first row of the table holds the test for the species factor
if aov['p-unc'].iloc[0] < 0.05:
    print('Reject null hypothesis')
else:
    print('Fail to reject null hypothesis')

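[ ]: # Tukey HSD test
pt=pg.pairwise_tukey(dv='sepal_length', between='species', data=phool)
print(pt)
# Each row compares one pair of groups, so check every pair rather than only the first row
for _, row in pt.iterrows():
    if row['p-tukey'] < 0.05:
        print(f"{row['A']} vs {row['B']}: reject null hypothesis")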

4 MANOVA
Multivariate Analysis of Variance (MANOVA) is a statistical technique used to simultaneously
analyze differences in the mean values of two or more dependent variables between multiple groups.
It extends Analysis of Variance (ANOVA), which handles a single dependent variable, to the case
where several dependent variables are measured on each observation.
MANOVA allows researchers to determine whether there are statistically significant differences in
mean vectors (patterns of means across variables) among groups.
Hypotheses in MANOVA:
Null Hypothesis (H0): The mean vectors of all groups are equal for the set of dependent variables.
Alternative Hypothesis (H1): At least one group's mean vector differs from the others.
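
One of the statistics that MANOVA software commonly reports for this hypothesis is Wilks' lambda. As a minimal sketch of what "differences in mean vectors" means numerically, the block below computes it by hand with NumPy. The group names, mean vectors and sample sizes are assumptions chosen for illustration.

```python
import numpy as np

# Synthetic data: 3 groups, 20 observations each, 2 dependent variables.
# The mean vectors below are assumed values, not estimates from real data.
rng = np.random.default_rng(1)
group_means = {'g1': [5.0, 3.4], 'g2': [5.9, 2.8], 'g3': [6.6, 3.0]}
data = {name: rng.normal(loc=mu, scale=0.4, size=(20, 2))
        for name, mu in group_means.items()}

all_obs = np.vstack(list(data.values()))
grand_mean = all_obs.mean(axis=0)

# W: within-group sums of squares and cross-products (SSCP) matrix
W = sum((x - x.mean(axis=0)).T @ (x - x.mean(axis=0)) for x in data.values())

# B: between-group SSCP matrix, built from each group's mean vector
B = sum(len(x) * np.outer(x.mean(axis=0) - grand_mean,
                          x.mean(axis=0) - grand_mean)
        for x in data.values())

# Wilks' lambda = det(W) / det(W + B); values near 0 indicate
# mean vectors that are well separated relative to within-group scatter
wilks_lambda = np.linalg.det(W) / np.linalg.det(W + B)
print(f"Wilks' lambda = {wilks_lambda:.4f}")
```

This mirrors the univariate F-statistic: the matrices `W` and `B` play the roles of the within-group and between-group sums of squares, and determinants replace the scalar ratio.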

[ ]: import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from statsmodels.multivariate.manova import MANOVA

[ ]: df=sns.load_dataset('iris')

[ ]: df.head(5)

[ ]: Manova=MANOVA.from_formula('sepal_length + sepal_width + petal_length + petal_width ~ species', data=df)

[ ]: print(Manova.mv_test())
