community project
encouraging academics to share statistics support resources
All stcp resources are released under a Creative Commons licence
Statistical Methods
9. Nonparametric Testing
Based on materials provided by Coventry University and
Loughborough University under a National HE STEM
Programme Practice Transfer Adopters grant
Peter Samuels, Birmingham City University
Reviewer: Ellen Marshall, University of Sheffield
Overview of workshop
What are nonparametric methods?
When should you use them?
Overview of nonparametric methods
Comparing two groups:
Mann-Whitney U test
Cross-tabulating two variables:
Chi-squared test of association
Fisher’s exact test
Parametric v. nonparametric methods
Theoretical statistical distributions are constructed
using parameters:
These parameters determine the location and
shape of the frequency distribution
For example, the mean and standard deviation of
the normal distribution – see Workshop 6
Methods that assume observations come from a
certain distribution are called parametric methods
Methods that make no distributional assumptions
about observations are called nonparametric
methods
When should nonparametric tests be used?
When scale-based data does not satisfy the
assumptions of the appropriate test or the
conditions of its robust use
For testing hypotheses with categorical
(nominal/ordinal) data
The advantage of nonparametric tests is that
we do not have to test distributional assumptions
(such as normality) or robustness conditions
Limitations
Nonparametric methods are generally less powerful
than parametric methods when the parametric
assumptions hold
For example, for normally distributed data, a
two-sample t-test can detect a smaller real difference
than the corresponding nonparametric test
Application of nonparametric methods is difficult
or impossible for complex data structures
Nonparametric methods mainly involve
hypothesis testing: fewer descriptive statistics can
be calculated, for example, only the maximum,
minimum, median and quartiles for scale or
ordinal data
Ranks
Many nonparametric methods replace
scale/ordinal observations with ranks:
Observation  170  112   29  125  224   78
Rank           5    3    1    4    6    2
Nonparametric test statistics are then based
on ordering the data and working with the
ranks
SPSS takes care of all the details
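The ranking step can be illustrated outside SPSS. The following is a minimal sketch in plain Python, not the routine SPSS uses internally. The function name `rank_observations` is made up for this example, and giving tied values the average of their positions is an assumption, chosen because it is the usual convention for rank-based tests.

```python
def rank_observations(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    # Indices of the observations, ordered from smallest to largest value
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the end of the run of tied values starting at this position
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = average_rank
        pos = end + 1
    return ranks


# The six observations from the slide
observations = [170, 112, 29, 125, 224, 78]
ranks = rank_observations(observations)
print(ranks)  # [5.0, 3.0, 1.0, 4.0, 6.0, 2.0]
```

The ranks agree with the table above. With ordinal data such as the ADL codes used later, ties are common, which is why the sketch handles them explicitly.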
Nonparametric statistics in SPSS
Chi-squared and Spearman’s
correlation under Crosstabs
Normality tests under Explore
Alternative K Independent
samples tests under One-
Way ANOVA
Two choices with other
nonparametric tests – new or
legacy:
New algorithms are better
and output is more ‘helpful’
Legacy versions are less
strict with data types
Nonparametric tests in SPSS
One sample: Tests whether a sample of data follows a
particular distribution
2 independent samples: Compares two groups of
cases (like an independent samples t-test – see
Workshop 8)
K independent samples: Compares two or more groups
of cases (like a one-way ANOVA – see Workshop 10)
2 related samples: Compares two paired groups of
cases (like a paired-samples t-test – see Workshop 8)
K related samples: Compares two or more related
groups of cases (like repeated measures ANOVA)
2 ordinal/scale variables: Spearman correlation and
chi-squared test for association, and Fisher’s exact test
Example 1: Female stroke patients
100 female stroke patients were randomly assigned to
one of two groups:
A standard physical therapy group (Control)
A group with the standard therapy plus emotional therapy
(Treatment)
Three months later the patients were evaluated on their
ability to perform common Activities of Daily Life (ADL):
Code  Travel                           Cooking                              Housekeeping
0     Same as before illness           Plans and prepares meals             As before
1     Gets out if someone else drives  Some cooking but less than before    Does at least half usual
2     Gets out in wheelchair           Gets food out if prepared by others  Occasional dusting of small jobs
3     Home or hospital bound           Does nothing for meals               No longer keeps house
4     Bed-ridden                       Never did any                        Never did any
Research question
Does the additional therapy have an effect on
any of these three measures of activities of daily
life?
Null hypotheses:
The distribution of values for each measure does
not depend on the group
Alternative hypotheses:
The distribution of values for each measure does
depend on the group
Step 1: descriptive statistics
Open the file ADL.sav
associated with this
presentation
Select Analyze –
Descriptive Statistics –
Explore…
Select the three measures
in the Dependent List and
Group as the Factor List
Select Plots… and choose
None under boxplots and
Histogram under descriptive
Distribution of Travel by Group
Percentage frequency of
Treatment group is
relatively higher for lower
values of Travel
Thus the Treatment
appears to be having
a positive effect
Distribution of Cooking by Group
Difference in percentage
frequencies bigger than
for Travel
Thus the Treatment
appears to be having a
bigger positive effect
Distribution of Housekeeping by Group
Difference in percentage
frequencies even bigger
than for Cooking
Thus the Treatment
appears to be having an
even bigger positive effect
Step 2: which test?
The data is ordinal so we need to use a nonparametric
test
There are 2 independent groups
The descriptive statistics suggest we should compare
higher/lower ordinal values for the groups rather than
general differences in shape
The appropriate test is therefore the Mann-Whitney U
test
Note: If there had been other kinds of shape difference
we should have used the chi-squared test
Note: The category 4 data should be removed from each
variable before it is tested because it corresponds to ‘not
applicable’
Mann-Whitney U test
A non-parametric test of two independent samples
of ordinal or scale-based data
Generally needs at least 5 data categories for the
ordinal variable (here we only have 4 when the last
category is removed, which is a bit of a problem)
Alternative to an independent samples t-test for
scale-based data if the test assumptions or
robustness assumptions are not met
Samples can be different sizes
Null hypothesis: Values of Travel for the Control
group are equally likely to be higher or lower than
the values of Travel for the Treatment group
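The null hypothesis above can be read as a statement about pairs of patients, one from each group. The following is a minimal sketch of the U statistic in plain Python, assuming ties are counted as half a pair. The function name `mann_whitney_u` and the Travel codes are hypothetical: they are not taken from ADL.sav. The sketch does not compute the p-value or the Z value that SPSS reports.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b).

    U_a counts the pairs in which the value from sample_a is higher than
    the value from sample_b, with each tied pair counted as one half.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # Every pair is counted once in either U_a or U_b
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical Travel codes (0 to 3), for illustration only
control = [1, 2, 2, 3, 3, 1, 2]
treatment = [0, 1, 1, 2, 0, 1]

u_control, u_treatment = mann_whitney_u(control, treatment)
print(u_control, u_treatment)  # 35.5 6.5
```

Under the null hypothesis both values would be expected to be close to half the number of pairs (21 here). In this made-up example the Control values are higher in most pairs, which corresponds to worse Travel scores for the Control group.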
Step 3: remove the cases with Travel = 4
Select Data – Select
Cases
Click on the If
condition is satisfied
radio button
Select the If… button
Select Travel then
click on '<' and 4
Select Continue etc.
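The effect of the Select Cases condition Travel < 4 can be sketched in plain Python as a simple filter. The rows below are hypothetical stand-ins for the contents of ADL.sav, and `select_cases` is a name made up for this example.

```python
# Hypothetical cases for illustration; the real data are in ADL.sav
cases = [
    {"Group": 0, "Travel": 2},
    {"Group": 0, "Travel": 4},
    {"Group": 1, "Travel": 0},
    {"Group": 1, "Travel": 4},
    {"Group": 1, "Travel": 1},
]


def select_cases(rows, variable, not_applicable_code=4):
    """Keep only the rows whose value for `variable` is below the 'not applicable' code."""
    return [row for row in rows if row[variable] < not_applicable_code]


selected = select_cases(cases, "Travel")
print(len(selected))  # 3
```

The same filter would be applied with "Cooking" or "Housekeeping" as the variable before testing those measures.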
Step 4: run the test
Select Analyze – Nonparametric Tests – Legacy
Dialogs – 2 Independent Samples…
Note: we cannot use the new version of this
test with an ordinal data type for the test
variable
Select Travel for the Test Variable List and
Group for the Grouping Variable
Click on Define Groups and select 0 for Group 1
and 1 for Group 2
The Mann-Whitney U test is selected by default
Step 5: interpret the output
Significance value of test is 0.032
We reject the null hypothesis and
conclude there is evidence that treatment
is having an effect. The effect is also
positive (negative Z value).
Repeat the same analysis for Cooking
and Housekeeping (first select the
appropriate cases)
Same result obtained for Cooking, p-
value is slightly smaller (so evidence is
slightly stronger but still at 0.05 level)
Even stronger result obtained for
Housekeeping – significant at 0.01 –
strong evidence that treatment is having
an effect
Example 2: Smartphone purchasing survey
92 people were asked:
Q1: What is your gender?
Male
Female
Q2: What is your age?
This is recorded in the following categories:
17-24, 25-29, 30-39 and 40+
Q3: On a scale of 0 to 10 how important do you consider
brand when purchasing a smartphone?
(where 0 = extremely unimportant and 10 = extremely
important)
Research questions
1. Is the importance of brand when purchasing a
smartphone gender related?
2. Is the importance of brand when purchasing a
smartphone age related?
Null hypotheses:
1. Brand importance is not gender related
2. Brand importance is not age related
Step 1: descriptive statistics
Open the file Smartphone.sav
associated with this presentation
Select Analyze – Descriptive
Statistics – Explore…
Select Brand in the Dependent List
and Gender in the Factor List
Select Plots… and choose None
under boxplots and Histogram under
descriptive
Double click on each graph
Double click on the values on the
horizontal axis in the chart editor
Select the Scale tab
Change the Minimum to 0, the
Maximum to 10 and the Major
increment to 1
Distribution of Brand by Gender
Both distributions are clearly
not normal, with different
sample sizes, so a
nonparametric test is needed
Shape of data different for
Brand = 8, 9 and 10, but
not in the same direction
A nonparametric test of association
The data consist of counts of subjects with
particular profiles
Profiles are formed by scale and categorical variables
Often referred to as a contingency table
The test is whether there is an association
between the categorical variables
Equivalent to dropping pebbles at random in a
grid where the row and column totals are already
known
Chi-squared (χ²) test
Works with:
2-way tables with known row and column totals, or
Measuring a sequence of observed values against
expected values (not covered here)
Based on calculating expected values for the table
frequencies and comparing these with the observed
values
Only valid when most (≥80%) of the expected values
are sufficiently large (≥5) and none has expected
value <1
Null hypothesis: the observed values are randomly
distributed based on the expected values (i.e. there is
no association between the two variables)
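The expected values and the validity rule above can be sketched in plain Python. This is an illustrative sketch rather than the SPSS procedure: the function names and the 2x2 tables are hypothetical, and no p-value is computed. Each expected value is taken as row total × column total / grand total.

```python
def expected_frequencies(observed):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared_statistic(observed):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_frequencies(observed)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


def expected_counts_are_valid(observed):
    """Rule from the slide: at least 80% of expected values >= 5 and none < 1."""
    cells = [e for row in expected_frequencies(observed) for e in row]
    share_large = sum(e >= 5 for e in cells) / len(cells)
    return share_large >= 0.8 and min(cells) >= 1


# Hypothetical 2x2 table of counts, for illustration only
table = [[10, 20], [20, 10]]
print(expected_frequencies(table))       # [[15.0, 15.0], [15.0, 15.0]]
print(chi_squared_statistic(table))      # about 6.67
print(expected_counts_are_valid(table))  # True

# A sparse hypothetical table that fails the rule (one expected value is below 1)
sparse = [[1, 2], [3, 30]]
print(expected_counts_are_valid(sparse))  # False
```

The statistic is compared with a chi-squared distribution with (rows − 1) × (columns − 1) degrees of freedom, which SPSS does when it reports the significance value.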
Chi-squared test in SPSS
Select Analyze – Descriptive Statistics – Crosstabs…
Select Gender for the rows and Brand for the columns
Select Statistics… then Chi-square then Continue
Select Cells… then Expected then Continue
Analysis is invalid because 15 out of 22 cells (way more than 20%) have an
expected frequency less than 5
We need to combine some of the columns together and retest
Recode Brand into a new variable
Select Transform –
Recode into Different
Variables…
Select Brand from the list
Enter Brand2 as the
output variable name
and press Change
Select Old and New
Values…
Under Old Value, select
Range and enter the
values 0 and 5
Under New Value, enter
1 and select Add
Repeat with the range 6 through 7 going to 2
Recode the value 8 as 3, 9 as 4, and 10 as 5
On the Variable View, change the number of
decimal places of Brand2 to 0 and its data type
to Ordinal
Add values to Brand2 to explain these settings
Re-run the chi-squared test by changing the
variable from Brand to Brand2 and keeping all
the other options the same
Analysis still not valid as 3 out of 10 cells (>20%) still have expected
frequency <5
Result not significant as P-value for chi-squared >0.05
Recoding again and re-running would probably not improve the result
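The recoding of Brand into Brand2 described in the steps above can be written as a small mapping. This is a sketch in plain Python, assuming Brand takes integer values from 0 to 10 as stated in the survey question. The function name `recode_brand` is made up for this example.

```python
def recode_brand(brand):
    """Map a Brand score (0 to 10) to the five Brand2 categories from the slides."""
    if 0 <= brand <= 5:
        return 1
    if 6 <= brand <= 7:
        return 2
    if brand == 8:
        return 3
    if brand == 9:
        return 4
    if brand == 10:
        return 5
    raise ValueError(f"Brand score out of range: {brand}")


# Recode every possible score to check the mapping
brand2_values = [recode_brand(score) for score in range(11)]
print(brand2_values)  # [1, 1, 1, 1, 1, 1, 2, 2, 3, 4, 5]
```

Merging the low scores reduces the table from 22 cells to 10, which raises the expected frequencies in the merged columns.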
Fisher’s exact test
Applies to 2x2 contingency tables
Works with smaller samples than chi-squared:
Sample size > 40: chi-squared can be used
Sample size between 20 and 40 and the smallest
expected frequency ≥ 5: chi-squared can be used
Otherwise Fisher’s exact test must be used
A one-sided test with a 2-sided normal
approximation
Provided automatically by SPSS when you
cross-tabulate and select chi-squared
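For a 2x2 table with fixed row and column totals, the exact probabilities come from the hypergeometric distribution. The following is a minimal sketch in plain Python, not the SPSS implementation. The function name and the counts are hypothetical. The two-sided value uses one common definition (the sum of the probabilities of all tables no more likely than the observed one), which is an assumption here and may not match every package.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Return (one_sided, two_sided) p-values for the table [[a, b], [c, d]].

    one_sided: probability of a top-left count of `a` or more.
    two_sided: total probability of all tables no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    one_sided = sum(prob(x) for x in range(a, highest + 1))
    # Small tolerance so that tables exactly as likely as the observed one are included
    two_sided = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return one_sided, two_sided


# Hypothetical small table, for illustration only
one_sided, two_sided = fisher_exact_2x2(3, 1, 1, 3)
print(round(one_sided, 4), round(two_sided, 4))  # 0.2429 0.4857
```

Because every possible table is enumerated, the result does not depend on the expected frequencies being large, which is why the test suits small samples.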
Go back to the first data set
Select Transform – Recode into Different
Variables to recode Housekeeping into
Housekeeping2 with:
0, 1 and 2 recoded as 1
3 recoded as 2
(We are leaving out 4)
Change Housekeeping2 so that it has zero
decimal places and the values are as above
Select Analyze – Descriptive Statistics –
Crosstabs and choose Group and
Housekeeping2 and the chi-squared test
There are 82 valid cases so chi-squared would be valid here
Chi-squared P-value is 0.011 – slightly weaker than the Mann-Whitney
U test result (0.007)
Fisher’s exact P-value is 0.010 for one-sided (H0: Housekeeping is
the same or worse) and 0.014 for 2-sided
Activities
1. Run a chi-squared test with the three measures of ADL
against the treatment Group with the first data set,
recoding each measure into fewer categories if
necessary. Are the results more or less significant?
Explain.
2. Run a Mann-Whitney U test of Brand against Gender
with the second data set. Are the results more or less
significant? Explain.
3. Describe Brand against AgeCategory (e.g. a boxplot or
multiple histograms), decide on the best way to test this
association, carry out the test, ensuring it is valid,
interpret your findings, repeating if necessary.
Recap
We have covered:
What are nonparametric methods?
When should you use them?
Overview of nonparametric methods
Comparing two groups:
Mann-Whitney U test
Cross-tabulating two variables:
Chi-squared test of association
Fisher’s exact test
Bibliography
Field, A. (2013) Discovering Statistics using SPSS: (And sex and drugs and rock 'n' roll), 4th ed., London: SAGE, Chapter 6 (for Mann-Whitney U test) and Chapter 18 (for Chi-squared and Fisher’s exact tests).
Pallant, J. (2010) SPSS Survival Manual: A step by step guide to data analysis using SPSS, 5th ed., Maidenhead: Open University Press, Chapter 16.
statstutor (n.d.) Calculating Expected Frequencies in Two Way Tables resources. Available at: http://www.statstutor.ac.uk/topics/chi-squared-tests-of-association/calculating-expected-frequencies-in-two-way/ [Accessed 8/01/14].
statstutor (n.d.) Chi-Squared Tests for Two-Way (Contingency) Tables resources. Available at: http://www.statstutor.ac.uk/topics/chi-squared-tests-of-association/chi-squared-tests-for-two-way-tables/ [Accessed 8/01/14].
statstutor (n.d.) Wilcoxon Mann-Whitney Test resources. Available at: http://www.statstutor.ac.uk/topics/nonparametric-methods/wilcoxon-mann-whitney-test/ [Accessed 8/01/14].