Excel Normality Tests
Kolmogorov-Smirnov, Anderson-Darling, and Shapiro-Wilk Tests For the Two-Sample Pooled t-Test
The following five normality evaluations will be performed on the sample data here:
An Excel histogram of the sample data will be created.
A normal probability plot of the sample data will be created in Excel.
The Kolmogorov-Smirnov test for normality of the sample data will be performed in Excel.
The Anderson-Darling test for normality of the sample data will be performed in Excel.
The Shapiro-Wilk test for normality of the sample data will be performed in Excel.
The quickest way to evaluate normality of a sample is to construct an Excel histogram
from the sample data.
Histogram in Excel
To create a histogram for each sample group in Excel, fill in the Excel Histogram dialogue box with that group's input range and bin range.
Both sample groups appear to be distributed reasonably closely to the bell-shaped normal
distribution. It should be noted that bin size in an Excel histogram is manually set by the
user. This arbitrary setting of the bin sizes can have a significant influence on the shape of
the histogram’s output. Different bin sizes could result in an output that would not appear
bell-shaped at all. What is actually set by the user in an Excel histogram is the upper
boundary of each bin.
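To illustrate how strongly the choice of bin boundaries can change a histogram's shape, the following minimal Python sketch counts the same sample with two different sets of bin edges. The battery-lifetime values are hypothetical placeholders, not the article's actual data.

```python
import numpy as np

# Hypothetical battery-lifetime values (placeholders, not the article's data)
lifetimes = [41.2, 40.8, 42.1, 39.9, 41.5, 40.3, 42.0, 41.1,
             40.6, 41.8, 39.7, 40.9, 41.4, 42.3, 40.1, 41.0]

# Excel's Histogram tool groups values by the upper boundary of each bin.
# Two different choices of boundaries produce different-looking histograms
# from exactly the same data.
narrow_edges = [39.5, 40.0, 40.5, 41.0, 41.5, 42.0, 42.5]
wide_edges = [39.5, 40.5, 41.5, 42.5]

for label, edges in (("narrow bins", narrow_edges), ("wide bins", wide_edges)):
    counts, _ = np.histogram(lifetimes, bins=edges)
    print(label, counts)
```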
Normal Probability Plot in Excel
Another way to graphically evaluate the normality of each data sample is to create a normal probability plot for each sample group. This can also be implemented in Excel.
Normal probability plots for both sample groups show that the data appear to be very close to normally distributed. The actual sample data (red) closely match the values the samples would have if they were perfectly normally distributed (blue) and never go beyond the 95 percent confidence interval boundaries (green).
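For readers working outside Excel, a comparable normal probability plot can be produced with scipy and matplotlib. This is only a sketch under assumptions: the battery-lifetime values below are hypothetical placeholders, and scipy's probplot draws a fitted line rather than the 95 percent confidence bands described above.

```python
import matplotlib.pyplot as plt
from scipy.stats import probplot

# Hypothetical battery-lifetime values (placeholders, not the article's data)
brand_a = [41.2, 40.8, 42.1, 39.9, 41.5, 40.3, 42.0, 41.1,
           40.6, 41.8, 39.7, 40.9, 41.4, 42.3, 40.1, 41.0]

# probplot sorts the data and plots each value against the quantile it would
# have if the sample were perfectly normally distributed; points lying close
# to the fitted straight line suggest approximate normality.
fig, ax = plt.subplots()
probplot(brand_a, dist="norm", plot=ax)
ax.set_title("Normal probability plot - Brand A (hypothetical data)")
plt.show()
```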
Kolmogorov-Smirnov Test For Normality in Excel
The Kolmogorov-Smirnov Test is a hypothesis test that is widely used to determine
whether a data sample is normally distributed. The Kolmogorov-Smirnov Test calculates the distance between the empirical Cumulative Distribution Function (CDF) at each data point and what the CDF of that data point would be if the sample were perfectly normally
distributed. The Null Hypothesis of the Kolmogorov-Smirnov Test states that the
distribution of actual data points matches the distribution that is being tested. In this case
the data sample is being compared to the normal distribution.
The largest distance between the actual and expected CDF of any data point is compared to the Kolmogorov-Smirnov Critical Value for a specific sample size and Alpha. If this largest
distance exceeds the Critical Value, the Null Hypothesis is rejected and the data sample is
determined to have a different distribution than the tested distribution. If the largest
distance does not exceed the Critical Value, we cannot reject the Null Hypothesis, which
states that the sample has the same distribution as the tested distribution.
F(Xk) = CDF(Xk) for normal distribution
F(Xk) = NORM.DIST(Xk, Sample Mean, Sample Stan. Dev., TRUE)
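The same spreadsheet logic can be mirrored in a few lines of Python. This is a sketch under assumptions: the data below are hypothetical placeholders, and because the mean and standard deviation are estimated from the sample (just as NORM.DIST is used above), the calculation is the Lilliefors variant of the Kolmogorov-Smirnov test.

```python
import numpy as np
from scipy.stats import norm

def ks_max_difference(sample):
    """Largest gap between the empirical CDF and the normal CDF fitted
    with the sample mean and sample standard deviation (mirrors the
    spreadsheet's NORM.DIST-based calculation)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = len(x)
    fitted_cdf = norm.cdf(x, loc=x.mean(), scale=x.std(ddof=1))
    ecdf_above = np.arange(1, n + 1) / n   # empirical CDF at each point
    ecdf_below = np.arange(0, n) / n       # empirical CDF just below each point
    return max(np.max(np.abs(fitted_cdf - ecdf_above)),
               np.max(np.abs(fitted_cdf - ecdf_below)))

# Hypothetical battery-lifetime values (placeholders, not the article's data)
brand_a = [41.2, 40.8, 42.1, 39.9, 41.5, 40.3, 42.0, 41.1,
           40.6, 41.8, 39.7, 40.9, 41.4, 42.3, 40.1, 41.0]
print(round(ks_max_difference(brand_a), 4))
```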
Variable 1 - Brand A Battery Lifetimes
0.0885 = Max Difference Between Actual and Expected CDF
16 = n = Number of Data Points
0.05 = α
Variable 2 - Brand B Battery Lifetimes
0.1007 = Max Difference Between Actual and Expected CDF
17 = n = Number of Data Points
0.05 = α
The Null Hypothesis Stating That the Data Are Normally Distributed Cannot Be Rejected
The Null Hypothesis for the Kolmogorov-Smirnov Test for Normality, which states that the sample data are normally distributed, is rejected only if the maximum difference between the expected and actual CDF of any of the data points exceeds the Critical Value for the given n and α. That is not the case here.
The Max Difference Between the Actual and Expected CDF for Variable 1 (0.0885) and for Variable 2 (0.1007) are well below the Kolmogorov-Smirnov Critical Values at α = 0.05 for the given sample sizes (approximately 0.33 for n = 16 and 0.32 for n = 17), so the Null Hypothesis of the Kolmogorov-Smirnov Test cannot be rejected for either of the two sample groups.
Anderson-Darling Test For Normality in Excel
The Anderson-Darling Test is a hypothesis test that is widely used to determine whether a
data sample is normally distributed. The Anderson-Darling Test calculates a test statistic
based upon the actual value of each data point and the Cumulative Distribution Function
(CDF) of each data point if the sample were perfectly normally distributed.
The Anderson-Darling Test is considered to be slightly more powerful than the
Kolmogorov-Smirnov test for the following two reasons:
The Kolmogorov-Smirnov test is distribution-free, i.e., its critical values are the same for all distributions tested. The Anderson-Darling test requires critical values calculated for each tested distribution and is therefore more sensitive to the specific distribution.
The Anderson-Darling test gives more weight to values in the outer tails than the Kolmogorov-Smirnov test. The K-S test is less sensitive to aberrations in the outer values than the A-D test.
If the test statistic exceeds the Anderson-Darling Critical Value for a given Alpha, the Null
Hypothesis is rejected and the data sample is determined to have a different distribution
than the tested distribution. If the test statistic does not exceed the Critical Value, we
cannot reject the Null Hypothesis, which states that the sample has the same distribution
as the tested distribution.
F(Xk) = CDF(Xk) for normal distribution
F(Xk) = NORM.DIST(Xk, Sample Mean, Sample Stan. Dev., TRUE)
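As a cross-check outside Excel, scipy provides the same test directly. Note the difference in convention: scipy reports the unadjusted A² statistic and scales the critical values for sample size, rather than adjusting the statistic to A* and comparing it to the fixed critical values listed below. The data are hypothetical placeholders, not the article's sample.

```python
from scipy.stats import anderson

# Hypothetical battery-lifetime values (placeholders, not the article's data)
brand_a = [41.2, 40.8, 42.1, 39.9, 41.5, 40.3, 42.0, 41.1,
           40.6, 41.8, 39.7, 40.9, 41.4, 42.3, 40.1, 41.0]

result = anderson(brand_a, dist="norm")
print("A-squared statistic:", round(result.statistic, 3))

# scipy adjusts the critical values for sample size instead of adjusting the
# statistic, so the statistic is compared to these values directly.
for sig, crit in zip(result.significance_level, result.critical_values):
    reject = result.statistic > crit
    print(f"alpha = {sig:4.1f}%  critical value = {crit:.3f}  reject normality: {reject}")
```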
Variable 1 – Brand A Battery Lifetimes
Adjusted Test Statistic A* = 0.174
Variable 2 - Brand B Battery Lifetimes
Adjusted Test Statistic A* = 0.227
Reject the Null Hypothesis of the Anderson-Darling Test, which states that the data are normally distributed, if any of the following are true:
A* > 0.576 When Level of Significance (α) = 0.15
A* > 0.656 When Level of Significance (α) = 0.10
A* > 0.787 When Level of Significance (α) = 0.05
A* > 1.092 When Level of Significance (α) = 0.01
The Null Hypothesis Stating That the Data Are Normally Distributed Cannot Be Rejected
The Null Hypothesis for the Anderson-Darling Test for Normality, which states that the sample data are normally distributed, is rejected if the Adjusted Test Statistic (A*) exceeds the Critical Value for the given α.
The Adjusted Test Statistics (A*) for Variable 1 (0.174) and for Variable 2 (0.227) are well below the Anderson-Darling Critical Value for α = 0.05 (0.787), so the Null Hypothesis of the Anderson-Darling Test cannot be rejected for either of the two sample groups.
Shapiro-Wilk Test For Normality in Excel
The Shapiro-Wilk Test is a hypothesis test that is widely used to determine whether a data
sample is normally distributed. A test statistic W is calculated. If this test statistic is less
than a critical value of W for a given level of significance (alpha) and sample size, the Null
Hypothesis which states that the sample is normally distributed is rejected.
The Shapiro-Wilk Test is a robust normality test and is widely used because of its slightly superior performance compared with other normality tests, especially with small sample sizes. Superior performance means that it correctly rejects the Null Hypothesis of normality when the data are actually not normally distributed a slightly higher percentage of the time than most other normality tests, particularly at small sample sizes.
The Shapiro-Wilk normality test is generally regarded as being slightly more powerful than
the Anderson-Darling normality test, which in turn is regarded as being slightly more
powerful than the Kolmogorov-Smirnov normality test.
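Outside Excel, scipy's implementation returns the W statistic together with a p-value, so the decision rule becomes a p-value comparison rather than a lookup of W Critical. A minimal sketch, using hypothetical placeholder data rather than the article's sample:

```python
from scipy.stats import shapiro

# Hypothetical battery-lifetime values (placeholders, not the article's data)
brand_a = [41.2, 40.8, 42.1, 39.9, 41.5, 40.3, 42.0, 41.1,
           40.6, 41.8, 39.7, 40.9, 41.4, 42.3, 40.1, 41.0]

w_stat, p_value = shapiro(brand_a)
print(f"W = {w_stat:.6f}, p-value = {p_value:.4f}")

# Reject the null hypothesis of normality when the p-value falls below alpha.
alpha = 0.05
print("Reject normality at alpha = 0.05:", p_value < alpha)
```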
Variable 1 – Brand A Battery Life
0.972027 = Test Statistic W
0.887 = W Critical for the following n and Alpha
16 = n = Number of Data Points
0.05 = α
The Null Hypothesis Stating That the Data Are Normally Distributed Cannot Be Rejected
Test Statistic W (0.972027) is larger than W Critical (0.887). The Null Hypothesis therefore cannot be rejected. There is not enough evidence to state that the data are not normally distributed at a confidence level of 95 percent.
Variable 2 – Brand B Battery Life
0.971481 = Test Statistic W
0.892 = W Critical for the following n and Alpha
17 = n = Number of Data Points
0.05 = α
The Null Hypothesis Stating That the Data Are Normally Distributed Cannot Be Rejected
Test Statistic W (0.971481) is larger than W Critical (0.892). The Null Hypothesis therefore cannot be rejected. There is not enough evidence to state that the data are not normally distributed at a confidence level of 95 percent.
Correctable Reasons That Normal Data Can Appear Non-Normal
If a normality test indicates that data are not normally distributed, it is a good idea to do
a quick evaluation of whether any of the following factors have caused normally-
distributed data to appear to be non-normally-distributed:
1) Outliers
– Too many outliers can easily skew normally-distributed data. An outlier can often be
removed if a specific cause of its extreme value can be identified. Some outliers are
expected in normally-distributed data.
2) Data Has Been Affected by More Than One Process
– Variations to a process such as shift changes or operator changes can change the
distribution of data. Multiple modal values in the data are common indicators that this
might be occurring. The effects of different inputs must be identified and eliminated from
the data.
3) Not Enough Data
– Normally-distributed data will often not assume the appearance of normality until at
least 25 data points have been sampled.
4) Measuring Devices Have Poor Resolution
– Sometimes (but not always) this problem can be solved by using a larger sample size.
5) Data Approaching Zero or a Natural Limit
– If a large number of data values approach a limit such as zero, calculations using very
small values might skew computations of important values such as the mean. A simple
solution might be to raise all the values by a certain amount.
6) Only a Subset of a Process’ Output Is Being Analyzed
– If only a subset of data from an entire process is being used, a representative sample is not being collected. Normally-distributed results would not appear normally distributed if a representative sample of the entire process is not collected.
When Data Are Not Normally Distributed
When normality of data cannot be confirmed for a small sample, it is necessary to
substitute a nonparametric test for a t-Test. Nonparametric tests do not have the same
normality requirement that the t-Test does. The most common nonparametric test that
can be substituted for the two-independent-sample t-Test when data normality cannot be
confirmed is the Mann-Whitney U Test.
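A minimal sketch of that substitution in Python, assuming hypothetical battery-lifetime samples (not the article's data) and a one-sided alternative that Brand A lifetimes tend to exceed Brand B lifetimes:

```python
from scipy.stats import mannwhitneyu

# Hypothetical battery-lifetime samples (placeholders, not the article's data)
brand_a = [41.2, 40.8, 42.1, 39.9, 41.5, 40.3, 42.0, 41.1,
           40.6, 41.8, 39.7, 40.9, 41.4, 42.3, 40.1, 41.0]
brand_b = [40.1, 39.8, 41.0, 39.5, 40.6, 39.9, 41.2, 40.3,
           39.6, 40.8, 39.2, 40.0, 40.5, 41.1, 39.7, 40.2, 40.4]

# One-sided alternative: Brand A lifetimes are stochastically greater than Brand B
u_stat, p_value = mannwhitneyu(brand_a, brand_b, alternative="greater")
print(f"U = {u_stat}, p-value = {p_value:.4f}")
print("Difference detected at alpha = 0.05:", p_value < 0.05)
```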
The Mann-Whitney U Test is performed on the data in this example in a blog article following this one. Nonparametric tests are generally less powerful (less able to detect a difference) than parametric tests. The parametric two-independent-sample, one-tailed t-Test performed here does detect a difference at alpha = 0.05, while the nonparametric Mann-Whitney U Test conducted on the same data at the end of this section does not detect a difference at alpha = 0.05.
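For comparison, the pooled (equal-variance), one-tailed t-Test referred to above can be run in recent scipy versions as shown below; the samples are hypothetical placeholders, so the printed result will not reproduce the article's outcome.

```python
from scipy.stats import ttest_ind

# Hypothetical battery-lifetime samples (placeholders, not the article's data)
brand_a = [41.2, 40.8, 42.1, 39.9, 41.5, 40.3, 42.0, 41.1,
           40.6, 41.8, 39.7, 40.9, 41.4, 42.3, 40.1, 41.0]
brand_b = [40.1, 39.8, 41.0, 39.5, 40.6, 39.9, 41.2, 40.3,
           39.6, 40.8, 39.2, 40.0, 40.5, 41.1, 39.7, 40.2, 40.4]

# equal_var=True gives the pooled two-independent-sample t-Test;
# alternative="greater" makes it one-tailed (Brand A mean > Brand B mean).
t_stat, p_value = ttest_ind(brand_a, brand_b, equal_var=True, alternative="greater")
print(f"t = {t_stat:.3f}, p-value = {p_value:.4f}")
print("Difference detected at alpha = 0.05:", p_value < 0.05)
```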