Statistics June Notes
Statistical Inference
• In statistical inference, we test hypotheses about the population using
sample statistics, which estimate parameters in the population.
• Statistical inference assumes that the sample is selected randomly from the
population.
The Sampling Distribution
• Every sample statistic has a sampling distribution. Sampling distributions
can thus be created for any type of statistic.
• For RDA IIA, only the sampling distribution of the mean is considered.
Obtaining a Sampling Distribution
1) Choose a sample size (n)
2) Randomly select a sample of that size from the population
3) Calculate the statistic of interest for that sample and write down the answer.
4) Repeat this process an infinite number of times (or at least many times)
5) Create a plot or graph using all of the answers you obtained.
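A minimal Python sketch of these five steps. The parent population, sample size, and number of repetitions below are illustrative assumptions, not values from the notes:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# An illustrative (non-normal) parent population of 100 000 values.
population = rng.exponential(scale=50, size=100_000)

n = 100               # step 1: choose a sample size
repetitions = 10_000  # step 4: repeat the process many times

# Steps 2-4: repeatedly draw a random sample and record its mean.
sample_means = [rng.choice(population, size=n).mean()
                for _ in range(repetitions)]

# Step 5: plot the distribution of all the recorded sample means.
plt.hist(sample_means, bins=50)
plt.xlabel("sample mean")
plt.show()

print("mean of the sample means:", np.mean(sample_means))  # ~ population mean
print("population mean:         ", population.mean())
```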
Example
• A researcher is interested in studying levels of burnout in South African
employees working in the finance sector. She is granted access to all the
employees working at a large national financial institution (total = 35 000) – this
group forms the accessible population for the study.
o The researcher begins by randomly selecting 1000 employees – she then
administers a test of burnout to this group and obtains an average level of
burnout for this sample (x̄₁ = 82.4).
o She then returns these 1000 employees to the population and randomly
selects another sample of 1000 employees (NB: this could include
employees who were selected for the first sample). She administers the
test and obtains a sample average for this new group of 1000 employees
(x̄₂ = 81.7).
o She then returns these 1000 employees to the population and randomly
selects another sample of 1000 employees. She administers the test and
obtains a sample average for this new group of 1000 employees (x̄₃ = 81.9).
o She continues with this process until she has collected a large number of
sample averages for different randomly selected samples of 1000
employees at a time.
o As there are now a lot of different sample averages with different values,
the sample average i.e. the sample mean (x̄) becomes a variable. Like any
other variable, it is now possible to plot a distribution of the sample
averages / means (i.e. a distribution of x̄) and to compute the overall mean
and standard deviation for this distribution.
The Sampling Distribution of the Mean
• The sampling distribution of the mean is the theoretical distribution of
sample means obtained through repeated sampling (i.e. sampling an
infinite number of times) from the population for a given sample size. The
average (mean) of all of these sample means is a very accurate estimate of the
true population mean.
• The sampling distribution of the mean can thus be described as: a distribution
of the sample mean (x̄) for all the possible random samples of a certain
size n, where the mean is 𝜇 and the standard deviation (standard error) is 𝜎/√n,
i.e. x̄ ~ N(𝜇, 𝜎²/n)
• For the sampling distribution of the mean, the sample mean (x̄) is distributed
normally with a mean of 𝜇 and a variance of 𝜎²/n. The standard error is 𝜎/√n.
• The standard error reflects the variability we would expect to find in the
values of the sample mean over repeated trials i.e. the standard error of the
mean describes the degree to which the computed means will differ from one
another when calculated from different samples of the same size taken from
the same population.
• The average of all means (called the expected value) will be equal to the
population mean, assuming infinite samples.
The Central Limit Theorem
• Given a population with mean 𝜇 and variance 𝜎², the sampling distribution of
the mean will have a mean equal to 𝜇 and a variance equal to 𝜎²/n.
• The sampling distribution of the mean will approach the normal distribution as
the sample size (n) increases.
• Therefore, if many samples of size n are taken from a population with mean
𝜇 and variance 𝜎², the distribution of the sample mean (x̄) will be normal
with mean 𝜇 and variance 𝜎²/n.
• NB: The original distribution i.e. the parent population, of X does not
necessarily have to be normally distributed.
• These facts are known collectively as the Central Limit Theorem (CLT). CLT
allows us to make inferences about population means using the normal
distribution regardless of the shape of the distribution of the population being
sampled (i.e. the parent distribution).
• If the parent population is normally distributed, any sample size will be
sufficient for CLT to apply.
• However, if the parent population is anything other than normal it is important
that the size n is large enough (n ≥ 30).
• The Central Limit Theorem thus essentially states that:
o The mean of the sampling distribution of the mean is equal to the mean of
the population (i.e. 𝜇)
o The standard error of the mean is equal to the standard deviation of the
population divided by the square root of n (i.e. 𝜎/√n)
• We can therefore use CLT to find the distribution of the sample mean for
variable X using the following rules:
o If X ~ N(𝜇, 𝜎²) then x̄ ~ N(𝜇, 𝜎²/n)
o If X follows any other distribution with mean 𝜇 and variance 𝜎² (i.e.
X ~ ?(𝜇, 𝜎²)), then x̄ ~ N(𝜇, 𝜎²/n), provided n ≥ 30
o In the same way that it is possible to transform x-scores for a given variable
X to standardized z-scores if the variable X is distributed normally, it is also
possible to transform sample mean scores (¯𝑥) to standardized z-scores if
the sampling distribution of the mean is distributed normally.
o This implies that we can use the tables of the standard normal distribution
to find probabilities associated with the distribution of the mean x̄ i.e. the
sampling distribution of the mean.
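As a rough illustration of this last point, the sketch below standardizes a sample mean and reads a probability off the standard normal distribution; the values of 𝜇, 𝜎, n, and x̄ are assumptions chosen purely for illustration:

```python
from math import sqrt
from scipy.stats import norm

# Illustrative values (assumptions, not from the notes).
mu, sigma, n = 100, 15, 25   # population mean, population sd, sample size
x_bar = 106                  # observed sample mean

# Standardize the sample mean: z = (x_bar - mu) / (sigma / sqrt(n)).
se = sigma / sqrt(n)
z = (x_bar - mu) / se

# Probability of a sample mean of 106 or more, from the standard normal.
print(f"z = {z:.2f}, P(x-bar >= {x_bar}) = {1 - norm.cdf(z):.4f}")
# z = 2.00, P(x-bar >= 106) = 0.0228
```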
CHAPTER 7: INTRODUCTION TO HYPOTHESIS TESTING
Hypothesis Testing
• The reason we obtain sampling distributions is because we want to draw
inferences about the population based on the sample. There are two ways in
which statistical inference is carried out:
i. estimation of population parameters.
ii. hypothesis testing
• Hypothesis testing is one of the most important tools for the application of
statistics to real-world problems. It can be defined as a process of making
decisions concerning populations on the basis of sample information by
using statistical tests.
• In statistics, these decisions are made in particular about the value of a
parameter. For the better part of RDA IIA, we will make such decisions about
the value of the population mean.
The Logic of Hypothesis Testing
• Consider the statement: “Every cow has only one head.” This statement is
either true or false.
• To show that the statement is true, you would need to check every cow in
the world to ensure that it only has one head (this is difficult, if not
impossible). However, to show that the statement is false, you only need to
find one cow that has more than one head.
• This logic comes from Fisher’s philosophical argument which states that
we can never prove something to be true, but we can prove something to
be false. We follow the same logic (approach) in hypothesis testing.
Applied Example
• Suppose that the behavioural problem score of 6-year-old children (X) has a
normal distribution with a mean of 50 (i.e. 𝜇 = 50) and a known variance.
Suppose we randomly draw a sample of five children and find that their mean
behavioural score is 56 (i.e. x̄ = 56).
• We want to test the hypothesis that this sample mean might have arisen if we
selected our sample from a population of 6-year-old children.
• This hypothesis can only be tested if we have some idea of the probability
(chance) of obtaining a sample of five 6-year-old children with a mean of 56
(which is quite extreme) if we actually sampled observations from a population
in which the mean for the population of 6-year-old children is 50.
• For hypothesis testing about the mean, the general idea is to draw a sample,
calculate the mean for the sample, and then determine the likelihood (chance/
probability) of observing the calculated mean if the population mean is
assumed to be accurate (i.e. under the null hypothesis).
• If, however, we found that the probability was 0.002 this would imply that if we
actually sampled from a population with a population mean of 50, the
probability of obtaining a sample of five children with a sample mean of 56 is
0.2%. This is an extremely unlikely event, and we can therefore reasonably
conclude that this sample could not have come from a population of 6-year-
old children with a population mean of 50; it must have come from some other
population whose mean is higher than 50.
• The logic underpinning hypothesis testing is that we cannot know for certain
what happens in the population; however, we can make an assumption about
the population and then gather empirical evidence from a sample. We can then
use this empirical evidence to test how likely it is that we would obtain these
results if the assumption about the population were true. If this is likely,
then the assumption about the population could be true; however, if the
results obtained are too unlikely (i.e. the probability of obtaining them is
very low), then we have to recognise that the assumption about the population
must be wrong.
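To make this logic concrete, the following sketch computes the probability for the behavioural-score example above; the notes state that the population variance is known but never give it, so σ = 10 is an assumption made here only for illustration:

```python
from math import sqrt
from scipy.stats import norm

# Behavioural-problem example: H0 assumes the population mean is 50.
# sigma = 10 is an assumed value, used purely for illustration.
mu0, sigma, n = 50, 10, 5
x_bar = 56

se = sigma / sqrt(n)          # standard error of the mean
z = (x_bar - mu0) / se        # how extreme is x-bar under H0?
p = 1 - norm.cdf(z)           # P(sample mean >= 56 | mu = 50)

print(f"z = {z:.2f}, p = {p:.3f}")
# With this assumed sigma the p-value is not extreme; a very small p
# (e.g. the 0.002 discussed above) would lead us to reject H0.
```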
Steps in Hypothesis Testing
1) Select the appropriate hypothesis test for the scenario*
2) State the null hypothesis (H0)
3) State the alternate hypothesis (H1)
4) Set a suitable level of significance (i.e. α = …)
5) Calculate the test statistic and the p-value (read from SPSS)
6) Compare the p-value to the level of significance and decide (Reject H0 or Fail
to Reject H0)
7) State the conclusion in English.
The Null Hypothesis
• The null hypothesis (H0) must be defined when we perform hypothesis
testing.
• This hypothesis is the neutral hypothesis and contains what is assumed to
be true to begin with (e.g. the population mean of behavioural problem scores
for 6-year-old children is 50)
• The null hypothesis is always a strict equality i.e. it will always contain an
equal sign in its expression (e.g. H0: 𝜇 = 50)
• Since we always assume the null hypothesis to be true, we will always
perform the test under the null hypothesis. We then use the information from
the sample to assess whether there is enough evidence to say that the null
hypothesis is false or whether there is insufficient evidence to say that the null
hypothesis is false (i.e. to support the alternate hypothesis). This evidence
takes the form of a probability.
• NB: note that there is no way to establish whether the null hypothesis is
actually true or not – we can only state that it is too unlikely to be true or that
there is insufficient evidence to state that it is not true i.e. Reject H0 or Fail to
Reject H0.
The Alternate Hypothesis
• The alternate hypothesis (H1) must be defined when we perform hypothesis
testing.
• This hypothesis contains the statement to be proved i.e. what it is we are
interested in showing (e.g. the population mean of behavioural problem scores
for 6-year-old children is greater than 50).
• The alternate hypothesis can take on one of three forms:
i. the one-sided test to the left (<).
ii. the one-sided test to the right (>).
iii. and the two-sided test (≠).
• A one-tailed (directional) test is a test that rejects extreme outcomes in one
specified tail of the distribution (either the left or the right tail) whereas a
two-tailed (non-directional) test rejects extreme outcomes in either tail of the
distribution.
• The form chosen for the alternate hypothesis will depend on what it is that we
wish to prove with our hypothesis test (e.g. if the intention is to test whether
the population mean is greater than 50, then the alternate hypothesis will be:
H1: 𝜇 > 50. If, however, the intention is to test whether the population mean is
different from 50, then the alternate hypothesis will be H1: 𝜇 ≠ 50).
• The form of the alternate hypothesis determines the tail/s in which the
rejection area/s will lie i.e. right tail only, left tail only, or both tails. To identify
the correct alternate hypothesis, it is necessary to interpret the question
carefully.
The Level of Significance
• The level of significance (also referred to as α) represents the level against
which the p-value is compared i.e. the amount of error that is considered
acceptable.
• This is based on the logic that if the probability of observing a certain mean
value under the null hypothesis is small, we can conclude that the null
hypothesis is false in favour of the alternate hypothesis. However, in making
this decision, there is always the possibility that we are making an error. One
such error could be that we rejected the null hypothesis even though it was
actually true.
The Test Statistic
• The test statistic is used to test the hypothesis. It is evaluated under the
assumption that the null hypothesis (H0) is true.
• The formulation of the test statistic (i.e. the way it is calculated) depends on
what is being tested.
The p-value
• The p-value is the probability of getting a test statistic equal to or more
extreme than the sample result under the assumption that the null hypothesis
is true i.e. it indicates how far out in the tail the observed value is.
• The p-value is often referred to as the calculated level of significance and it
represents the smallest level at which the null hypothesis can be rejected.
• If the p-value is small, it means that the test statistic lies far out in the tail of
its distribution, so it is likely to exceed the critical value calculated from the
specified α. We are therefore more likely to reject the null hypothesis (H0).
Deciding
• A decision is made by comparing the p-value to the level of significance (i.e. α).
• If the p-value is less than the level of significance (i.e. p-value < α) then the
decision is to ‘Reject the null hypothesis (H0) at the α% level of significance’.
• If the p-value is equal to or greater than the level of significance (i.e. p-value ≥
α) then the decision is to ‘Fail to reject the null hypothesis (H0) at the α% level
of significance’.
• This is equivalent to looking at whether the value of the test statistic falls
within the rejection region or not.
• Note: the decision is always made in terms of the null hypothesis.
• Example: Assume α = 0.05 (i.e. the level of significance is 0.05).
• If the p-value obtained is 0.02 then the p-value is smaller than the level of
significance (α) and we would reject the null hypothesis i.e. Reject H0 at α =
0.05.
• If the p-value obtained is 0.32 then the p-value is larger than the level of
significance (α) and we would fail to reject the null hypothesis i.e. Fail to reject
H0 at α = 0.05.
• Example: Assume α = 0.01
• If the p-value is 0.02, then we would fail to reject the null hypothesis i.e. Fail to
reject H0 at α = 0.01.
• If the p-value is 0.003, then we would reject the null hypothesis i.e. Reject H0
at α = 0.01.
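The decision rule can be captured in a small helper function; this sketch simply reproduces the four examples above:

```python
def decide(p_value: float, alpha: float = 0.05) -> str:
    """Compare the p-value to the level of significance and decide."""
    if p_value < alpha:
        return f"Reject H0 at alpha = {alpha}"
    return f"Fail to reject H0 at alpha = {alpha}"

print(decide(0.02, 0.05))   # Reject H0 at alpha = 0.05
print(decide(0.32, 0.05))   # Fail to reject H0 at alpha = 0.05
print(decide(0.02, 0.01))   # Fail to reject H0 at alpha = 0.01
print(decide(0.003, 0.01))  # Reject H0 at alpha = 0.01
```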
Drawing conclusions in English
• It is important that we relate the results back to the original problem statement
for the hypothesis test i.e. that the results are interpreted (framed) in relation
to the alternate hypothesis (H1)
• This is done by stating whether there is sufficient evidence to support the
alternate hypothesis (if the null hypothesis is rejected) or whether there is
insufficient evidence to support the alternate hypothesis (if the decision is to
fail to reject the null hypothesis)
• Example: Assume the decision was ‘Reject H0 at α = 0.05’. The English
interpretation for this would be ‘There is sufficient evidence at the 0.05 level of
significance to assume that … [the alternate hypothesis]’
• Example: Assume the decision was ‘Fail to reject H0 at α = 0.05’. The English
interpretation for this would be ‘There is insufficient evidence at the 0.05 level
of significance to assume that … [the alternate
hypothesis]’
Error in Hypothesis Testing
• All decisions we make in hypothesis tests are subject to error.
• There are two errors that we commonly make when making decisions, namely
Type I error (α) and Type II error (β).
Type I and Type II Error
• A Type I error occurs when the null hypothesis is rejected incorrectly i.e. when
we reject H0 even though it is true.
• α therefore represents the probability of rejecting the null hypothesis (H0)
given that it is true i.e. α represents the probability of making a Type I error. α is
also referred to as the rejection level or level of significance for a hypothesis
test.
• A Type II error occurs when the null hypothesis is incorrectly retained i.e.
when we fail to reject H0 even though it is actually false.
• β therefore represents the probability of failing to reject the null hypothesis
(H0) given that it is false i.e. β represents the probability of making a Type II
error.
• There is therefore a reciprocal relationship between α and β – the larger α is,
the smaller β is and vice-versa. The level of significance, or α, is determined by
the researcher before running a hypothesis test; and it is therefore very
important that the researcher chooses the level of α carefully as this has
implications for β as well.
• Decisions regarding the size of α depend on the seriousness of the
implications of making either a Type I and/or a Type II error for the particular
study. For example, if it is important to avoid a Type I error then the researcher
may choose to use a very stringent (small) level of significance (α).
• If the level of significance (α) is not specified, it is common practice to use a
default of 5% (α = 0.05).
Power
• The power of a test is the probability of rejecting H0 when it is actually false.
• Since the probability of failing to reject a false H0 is β, the power of a test is (1–
β).
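A hedged sketch of how power could be computed for a one-sided z-test to the right; all of the numbers (the means under H0 and H1, 𝜎, and n) are assumptions for illustration:

```python
from math import sqrt
from scipy.stats import norm

# Assumed values: one-sided z-test to the right.
mu0, mu1 = 50, 55    # mean under H0 and the true mean under H1
sigma, n = 10, 25
alpha = 0.05

se = sigma / sqrt(n)
# Reject H0 whenever the sample mean exceeds this critical value.
crit = norm.ppf(1 - alpha, loc=mu0, scale=se)

beta = norm.cdf(crit, loc=mu1, scale=se)  # P(fail to reject H0 | H1 true)
power = 1 - beta
print(f"critical x-bar = {crit:.2f}, beta = {beta:.3f}, power = {power:.3f}")
# critical x-bar = 53.29, beta = 0.196, power = 0.804
```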
CHAPTER 8: SELECTING THE APPROPRIATE HYPOTHESIS TEST
Selecting the Appropriate Test
• When faced with a particular problem statement we do not design the problem
around a statistical method – instead, we use the appropriate method/s
available to us to solve the problem.
• It is therefore extremely important to be able to correctly identify the test/s we
should use in order to effectively solve a particular problem scenario.
• A summary of all of the available statistical methods and the circumstances
in which they should be applied, commonly referred to as a ‘decision tree’, is
very useful for identifying the appropriate test. This information can, however,
also be summarised in other ways e.g. verbally or using a mind-map.
Decision Tree Step 1: Interpret the Research Scenario
• Identify the research question/s and the variable/s being measured.
• Identify the roles of the variables if appropriate (especially the IV and DV if
applicable)
• Identify the operationalisation and type for each variable (esp. whether the
variable is categorical or continuous and/or the scale of measure)
Decision Tree Step 2: Association/ Relationship or Inference/ Group
Comparison
• Decide whether the focus of the research scenario is to compare groups
(samples) to one another, or whether the focus is to establish the nature of the
relationship/s (association/s) between the variables (this includes prediction).
Decision Tree Step 3a: Association/ Relationship
• If the focus is association/ relationship, identify the scales of measure for the
variables, especially whether the variables are categorical or numerical.
• If both variables are categorical, use a Chi-squared Test of Association.
• If both variables are numerical (continuous), identify whether the focus is to
establish the strength and direction (nature) of the relationship or whether the
focus is to model the relationship i.e. use one variable to predict the other.
• If the focus is to establish the nature of the relationship (strength and
direction), use Pearson’s correlation/s.
• If the focus is to model the relationship i.e. to establish if one variable predicts
the other, use simple regression.
Decision Tree Step 3b: Inference/ Group Comparison
• If the focus is inference/ group comparison, identify the number of groups
(samples).
• If there is one group, identify whether it is one group compared to a population
or one group measured more than once (repeated measures).
• If it is one group compared to a population, identify whether a z-test or a
one-sample t-test is appropriate (depending on whether the population
variance is known).
• If it is one group measured more than once, refer to matched tests.
• If there are two groups, identify whether the two groups are matched or
independent.
• If the groups are matched, identify whether a parametric matched/ paired
samples t-test or a non-parametric Wilcoxon’s matched pairs signed-rank test
is appropriate.
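The decision tree above can also be summarised as a small function. This is only a rough sketch of the branches described in these notes, and the parameter names are invented for illustration:

```python
def choose_test(focus: str, both_categorical: bool = False, model: bool = False,
                groups: int = 1, matched: bool = False,
                sigma_known: bool = False) -> str:
    """Rough sketch of the decision tree (parametric branch only)."""
    if focus == "association":
        if both_categorical:
            return "Chi-squared Test of Association"
        return "Simple regression" if model else "Pearson's correlation"
    # Otherwise the focus is inference / group comparison.
    if matched:                      # one group measured twice, or linked pairs
        return "Matched pairs (paired samples) t-test"
    if groups == 1:                  # one group compared to a population
        return "z-test" if sigma_known else "One-sample t-test"
    if groups == 2:
        return "Independent samples t-test"
    return "One-way ANOVA"           # three or more independent groups

print(choose_test("comparison", groups=2))                # Independent samples t-test
print(choose_test("association", both_categorical=True))  # Chi-squared Test of Association
```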
Parametric assumptions
• Random sampling
• Independent sampling
• At least an interval scale of measure for the DV
• Normal distribution of the data
• Homogeneity of variance (between groups)
Exercise
• For each of the following scenarios, identify the appropriate statistical test to
address the given research problem. Justify your answer.
o A social psychologist needs to test if a class of nine-year-old children has a
mean IQ that is greater than 100. There are 46 children in the class with a
mean IQ of 104 and a standard deviation of 15.
o A psychologist compared the duration of REM sleep of subjects in three
different conditions. Each group consists of 10 subjects.
o A researcher is interested in studying the effects of alcohol on males
weighing between 75 and 80 kilograms. A sample of 35 men performed a
reaction time test before and after drinking three beers.
o A researcher would like to test if there is a relationship between religious
affiliation and belief in life after death (Yes/ No).
o A researcher would like to test if there is a difference between the
pulse-rate for smokers and the pulse-rate for non-smokers. A sample of
smokers and a sample of non-smokers were included in the study.
o A social researcher is researching whether divorce rate and the proportion
of children born out of wedlock are related. She obtains data from 25
districts.
CHAPTER 9: COMPARISON OF GROUPS (PARAMETRIC)
Formal Hypothesis Testing
o Choose the appropriate type of hypothesis test*
o Set up a null hypothesis (H0)
▪ The null hypothesis always takes the form: H0: … = …
o Set up an alternate hypothesis (H1)
▪ The alternate hypothesis always takes the form: H1: …. < / > / ≠ …..
▪ The choice of which sign to use (less than <; greater than >; or not equal
to ≠) depends on what is being tested.
o Set up a suitable alpha-level (α)
▪ The Alpha-level, or level of significance, represents the point beyond
which the null hypothesis is rejected.
▪ Alpha is usually set at 5% or 0.05 unless otherwise specified.
o Calculate the test statistic and the p-value.
▪ These will be generated by SPSS.
o Compare the p-value to Alpha and decide.
▪ If the p-value is less than Alpha, then one would reject the null
hypothesis and the results would be considered significant.
▪ If the p-value is greater than Alpha, then one would fail to reject the
null hypothesis and the results would be considered not significant.
o Translate the results into ‘English.’
▪ If the decision is to ‘reject the null hypothesis’: ‘Therefore there appears
to be sufficient evidence at the …. level of significance to conclude that/
believe that….’
▪ If the decision is to ‘fail to reject the null hypothesis’: ‘Therefore there
does not appear to be sufficient evidence at the …. level of significance
to conclude that/believe that….’
Single Sample Tests
• Both the z-test and the one-sample t-test compare a single group to the
population i.e. these tests have hypotheses about the mean of only one
sample.
• This indicates that when we test if the population mean is different from a
specified value, the information used to provide evidence to test against is
taken from a single sample.
• If the population variance (σ²) or standard deviation (σ) is known, then a z-test
is used.
• If the population variance (σ²) or standard deviation (σ) is unknown, then a
one-sample t-test is used. The one-sample t-test is based on the t-distribution.
This distribution is also symmetric and bell-shaped (like the normal
distribution), but it has ‘heavier’ tails.
• For both tests, the null hypothesis always takes the form of: H0: µ = … [given
value].
• For both tests, the alternate hypothesis always takes the form of H1: µ [< / >/ ≠]
… [given value].
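The notes use SPSS, but the two single-sample tests can be sketched in Python as follows; the sample scores and the ‘known’ σ are assumptions made here for illustration:

```python
import numpy as np
from math import sqrt
from scipy import stats

scores = np.array([52, 48, 61, 55, 50, 58, 47, 60])  # assumed sample
mu0 = 50                                             # value under H0

# z-test: population sigma known (assumed to be 6 here for illustration).
sigma = 6
z = (scores.mean() - mu0) / (sigma / sqrt(len(scores)))
p_z = 2 * (1 - stats.norm.cdf(abs(z)))   # two-sided p-value

# One-sample t-test: sigma unknown, so it is estimated from the sample.
t, p_t = stats.ttest_1samp(scores, popmean=mu0)

print(f"z-test:       z = {z:.2f}, p = {p_z:.3f}")
print(f"one-sample t: t = {t:.2f}, p = {p_t:.3f}")
```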
Example: Z-test
Question: Were levels of sensation-seeking in the original sample
significantly different from sensation-seeking levels in the general
population?
• Step 1:
o There is only one variable being measured in the sample, sensation-
seeking.
o Sensation-seeking is a continuous score that is interval (measured on a
psychometric scale)
o The question requires a comparison of groups.
o There is only one group (one sample)
o The group is being compared to the population (for the Brief
Sensation-Seeking Scale, the average score in an adult population is
assumed to be 22 with a standard deviation of 3.2)
o The standard deviation (𝜎) and variance (𝜎²) for the population for
sensation-seeking are assumed to be known (artificial)
o Assume parametric assumptions are met*
o Therefore, one would run a z-test.
• Step 2: H0: 𝜇 = 22
• Step 3: H1: 𝜇 ≠ 22
• Step 4: 𝛼 = 0.05
• Step 5:
o SPSS does not run z-tests directly, but online calculators or manual
calculation can be used.
o The results that would be generated using these methods are:
o Test statistic: z = 7.38; p-value: 0.000 (i.e. p < 0.001)
• Step 6: Based on the p-value, reject the null hypothesis (H0) at 𝛼 = 0.05
• Step 7: Therefore, there is sufficient evidence at the 0.05 level of significance
to believe that sensation-seeking levels in the sample are significantly
different to those in the general population.
Matched Pairs (Paired Samples) t-test
• The matched pairs (paired samples) t-test is used to test whether there are
significant mean differences between two sets of related scores e.g.
comparing measurements obtained from the same people on the same
variable at two different times or measurements obtained on the same
variable from two groups where specific individuals in each group can be
linked to one another.
• Every observation must consist of paired/ matched information, therefore the
sample sizes for the two sets of measurements must be equal and both sets of
measurements must relate to the same variable e.g. pre-test and post-test
scores for the same group of participants, or married couples’ ratings of a
couples’ therapy programme.
• The test requires the creation of a new variable of differences (d) – these are
calculated by subtracting one of the values for the paired subjects from the
other to create a set of difference scores.
• The matched-pairs t-test establishes whether the average of these difference
scores is zero or not.
• The null hypothesis always takes the form of: H0: µd = 0
• The alternate hypothesis always takes the form of H1: µd [< / > / ≠] 0.
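A minimal sketch of the matched pairs t-test on assumed pre/post data, showing that the test amounts to a one-sample test on the difference scores:

```python
import numpy as np
from scipy import stats

# Assumed pre/post scores for the same eight participants.
pre = np.array([14, 18, 11, 20, 16, 15, 19, 13])
post = np.array([11, 16, 10, 17, 15, 12, 18, 11])

# The test works on the difference scores d and asks whether their
# mean differs from zero (H0: mu_d = 0).
d = pre - post
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))

# Equivalent built-in paired test.
t, p = stats.ttest_rel(pre, post)
print(f"t = {t:.2f} (by hand: {t_manual:.2f}), p = {p:.4f}")
```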
Example:
Question: Did risky driving behaviour change for the follow-up sample (i.e.
the 70 participants) after they participated in the one-day workshop?
• Step 1:
• There is only one variable being measured in the sample, risky driving
behaviour, but the other variable is the time at which the measurement was
taken (before the workshop and after the workshop).
• Risky driving behaviour is a continuous score that is interval (measured on a
psychometric scale); before and after the workshop is categorical and
nominal.
• The question requires a comparison of groups.
• There is one group (one sample) being compared to itself [this can be seen as
two sets of measurements]
• The group is being measured more than once (repeated measures) therefore a
matched test needs to be used [special note: this can be seen as two linked
groups]
• Assume parametric assumptions are met*
• Therefore, one would run a matched (paired sample) t-test.
• Special note: if parametric assumptions were not met, a Wilcoxon’s matched
pairs signed-rank test could be used instead.
• Step 2: H0: 𝜇d = 0
• Step 3: H1: 𝜇d ≠ 0
• Step 4: 𝛼 = 0.05
• Step 5:
• Step 6: Based on the p-value, the null hypothesis would be rejected at 𝛼 = 0.05
• Step 7: Therefore, there is sufficient evidence at the 0.05 level of significance
to believe that risky driving behaviour for the follow-up sample changed after
they participated in the one-day workshop.
Independent Samples t-test
• The independent samples t-test is used to test whether there are significant
mean differences between two independent groups/ samples e.g. comparing
average scores between a control and experimental group or between contrast
groups.
• The focus is to test for differences between the groups with respect to the
variable of interest e.g. comparing average stress scores for Type A and Type B
personalities.
• Note: the two samples (groups) can have a different number of observations,
although in general similar sample sizes are preferable
• Certain assumptions are made about the data in order to run an independent
samples t-test:
o The two samples are drawn independently from one another i.e.
membership in one sample does not influence membership in the other
sample.
o The two samples are drawn from normal populations.
o The two populations have equal variance (homogeneity of variance)
• The null hypothesis always takes the form of: H0: µ1 = µ2 (i.e. µ1 - µ2 = 0; there
is no difference between the means)
• The alternate hypothesis always takes the form of H1: µ1 [< / > / ≠] µ2 (this
implies that the two samples were drawn from different populations)
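A minimal sketch using assumed data for two independent groups; Levene’s test is included as one common way to check the homogeneity-of-variance assumption:

```python
import numpy as np
from scipy import stats

# Assumed stress scores for two independent groups.
type_a = np.array([24, 29, 31, 27, 33, 26, 30])
type_b = np.array([22, 25, 21, 26, 23, 24, 20])

# Levene's test is one common check of the equal-variance assumption.
_, p_levene = stats.levene(type_a, type_b)

# equal_var=True applies the homogeneity-of-variance assumption.
t, p = stats.ttest_ind(type_a, type_b, equal_var=True)

print(f"Levene p = {p_levene:.3f}")
print(f"t = {t:.2f}, p = {p:.4f}")
```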
Example:
Question: Was there a difference in levels of risky driving behaviour between
those under the age of 25 and those who are 25 and over in the original
sample?
• Step 1:
o There are two variables being considered – levels of risky driving behaviour,
which is the DV, and age, which is the IV.
o Risky driving behaviour is a continuous score that is interval (measured on
a psychometric scale); and age is a categorical and ordinal variable (‘under
the age of 25’ and ‘25 and over’)
o The question requires a comparison of groups.
o There are two groups.
o The groups are independent.
o Assume parametric assumptions are met*
o Therefore, one would run an independent samples t-test.
o Special note: if parametric assumptions were not met, a Mann-Whitney U
test could be used instead.
• Step 2: H0: 𝜇1 = 𝜇2 – in this case 𝜇(under 25) = 𝜇(25 and over)
• Step 3: H1: 𝜇1 ≠ 𝜇2 – in this case 𝜇(under 25) ≠ 𝜇(25 and over)
• Step 4: 𝛼 = 0.05
• Step 5:
• Step 6:
o Based on the p-value, we would fail to reject the null hypothesis at 𝛼 = 0.05
• Step 7:
o Therefore, there is insufficient evidence at the 0.05 level of significance to
believe that levels of risky driving behaviour differ based on age (whether
one is under the age of 25 or 25 and over).
One-Way ANOVA
• The one-way ANOVA (Analysis of Variance) is used to test whether there are
significant mean differences between three or more independent groups/
samples. There is no restriction on the number of groups i.e. there is no
restriction on the number of means that can be compared.
• The test draws inferences about the population means using differences
between the sample means (as was the case for the other hypothesis tests).
• The focus is to test for differences in the means of the variable of interest
between three or more groups e.g. comparing competitiveness scores
between rugby players, soccer players, netball players, volleyball players,
and swimmers.
• The one-way ANOVA is based on the F-distribution.
• Certain assumptions are made about the data in order to run a one-way
ANOVA:
o The data for the k different groups are independent random samples from k
different populations.
o The observations in each group are drawn independently from one another.
o Each of the k populations follows a normal distribution.
o Equality of variance (homogeneity of variance) i.e. the population variances
of the variable of interest for the k groups are equal.
• Note: the ANOVA is relatively robust, so it can still be used when the
assumptions of normality and homogeneity of variance are only moderately
violated
• The null hypothesis always takes the form of: H0: All of the means are equal.
This can be written statistically as H0: µi = µj for all pairs i and j.
• The alternate hypothesis always takes the form of H1: At least one pair of
means is not equal. This can be written statistically as H1: µi ≠ µj for at least
one pair i and j.
Sources of Variation in the One-Way ANOVA
• ANOVA tests the hypothesis of equal means by comparing two estimates of
variance.
• One estimate is based on the variance of the group means, while the other is
based on the variance within the groups. These estimates are referred to as the
mean square between groups (MS between) and the mean square within
groups (MS within).
• To find the two mean squares we must first calculate the sum of squares (SS),
defined as the sum of squared deviations around some point, usually the
mean or a predicted value.
• There are two sources of variation among a set of subjects, namely treatment
effects and error. Treatment effects are variation that is due to the fact that the
different subjects belong to different groups. Error is variation that is due to
other factors not related to differences between the groups.
• Together the two sources of variation form the total variation among a set of
subjects – this is measured by the total sum of squares: SS total = SS
between + SS within.
• The mean square estimates are calculated from the sums of squares, where
each mean square is the sum of squares divided by its degrees of freedom.
• The test statistic is calculated as the ratio of the MS between and the MS
within.
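A sketch of these calculations on assumed data for three groups, computing the F-statistic by hand from the sums of squares and checking it against the built-in one-way ANOVA:

```python
import numpy as np
from scipy import stats

# Assumed scores for k = 3 independent groups of 4 subjects each.
groups = [np.array([12, 15, 11, 14]),
          np.array([18, 20, 17, 19]),
          np.array([13, 16, 14, 15])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()
k, N = len(groups), len(all_scores)

# SS between: variation of the group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# SS within: variation of the scores around their own group means (error).
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)   # df between = k - 1
ms_within = ss_within / (N - k)     # df within  = N - k
f_manual = ms_between / ms_within

f, p = stats.f_oneway(*groups)      # built-in one-way ANOVA
print(f"F = {f:.2f} (by hand: {f_manual:.2f}), p = {p:.4f}")
```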
Interpreting the One-Way ANOVA
• If the null hypothesis is true, then both MS between and MS within estimate
the same quantity. Therefore, the F-statistic will be equal to one or very close
to one since both mean squares will have (approximately) the same value.
• If the null hypothesis is false, it is likely that MS between is a larger value than
the MS within. As a result, the ratio between the two will be an F-statistic
that is much greater than one. We will therefore reject the null hypothesis
and conclude that at least one of the k means is different from the others (i.e.
conclude in favour of the alternate hypothesis).
• Note: if we reject the null hypothesis the ANOVA does not provide any
information to tell us which means are different from one another. The only
valid inference that we can make is that at least one population mean is
different from at least one other population mean.
• Further analyses, called post-hoc tests or multiple comparisons, can be
performed to identify precisely where the differences are and the direction of
the differences (these tests are not taught as part of RDA IIA).
Example:
Question: Was there a difference in levels of risky driving behaviour between
Honours, Masters, and PhD students in the original sample?
• Step 1:
o There are two variables being considered – levels of risky driving behaviour,
which is the DV, and year of study, which is the IV.
o Risky driving behaviour is a continuous score that is interval (measured on
a psychometric scale); and year of study is a categorical and ordinal
variable (Honours, Masters, PhD)
o The question requires a comparison of groups.
o There are three groups.
o Assume parametric assumptions are met*
o Therefore, one would run an ANOVA.
o Special note: if parametric assumptions were not met, a Kruskal-Wallis
test could be used instead.
• Step 2: H0: All of the means are equal.
• Step 3: H1: At least one pair of means is not equal.
• Step 4: 𝛼 = 0.05
• Step 5: Year of study (YoS) coding: 1 = Honours; 2 = Masters; 3 = PhD
• Step 6: Based on the p-value, we would fail to reject the null hypothesis at 𝛼
= 0.05
• Step 7: Therefore, there is insufficient evidence at the 0.05 level of significance
to believe that levels of risky driving behaviour differ based on year of study
(Honours, Masters, or PhD).
Non-parametric Tests
• All of the hypothesis tests described above (z-test; one-sample t-test;
matched pairs t-test; independent samples t-test; one-way ANOVA) are
parametric.
• Parametric tests make assumptions about the distribution of data in order to
estimate population parameters.
• Non-parametric tests (distribution-free tests) do not rely on any assumptions
regarding the underlying distribution of the data.
CHAPTER 10: RELATIONSHIPS/ASSOCIATION
Associations between Variables
• Many quantitative studies focus on the nature of the relationships or
associations that exist between variables.
• The Chi-squared Test of Association can be used to establish the nature of the
association between two or more categorical variables.
• Correlation and regression are two statistical techniques that help to establish
the nature of the linear relationship between two or more continuous
variables.
The Chi-Squared Test of Association
• The Chi-squared Test of Association is a non-parametric test that is used to
test whether a significant association exists between two or more categorical
variables.
• It also tests whether there is independence between the categorical variables.
• In order to perform the test, there must be at least two categorical variables
and each variable must have at least two levels.
• The Chi-squared Test of Association uses a cross-tabulation of the variables of
interest – this table is called a contingency table (sometimes also referred to
as the table of observed values). Each cell in the table (excluding the totals)
shows the frequency of occurrence at one level of the first variable and one
level of the second variable. The sum of all the frequencies in all of the cells
must be equal to the valid sample size.
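A minimal sketch of the test on an assumed contingency table (the frequencies are invented for illustration; the row/column structure mirrors the gender-by-impulsivity example that follows):

```python
import numpy as np
from scipy.stats import chi2_contingency

# Assumed contingency table of observed frequencies:
# rows = gender (male, female); columns = impulsivity (low, average, high).
observed = np.array([[20, 35, 25],
                     [30, 40, 10]])

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-squared = {chi2:.2f}, df = {dof}, p = {p:.4f}")
print("expected frequencies under independence:")
print(expected.round(1))
```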
Example:
Question: Was there a significant association between impulsivity rankings
and gender in the original sample?
• Step 1:
o There are two variables being considered – impulsivity rankings and gender.
o Impulsivity rankings is a categorical and ordinal variable (low; average; high)
and gender is a categorical and nominal variable (male; female)
o The question is one of association and both variables are categorical.
• Therefore, one would run a Chi-squared test of association (non-parametric)
• Step 2: H0: There is no relationship/ association between impulsivity rankings
and gender.
• Step 3: H1: There is a relationship/ association between impulsivity rankings
and gender.
• Step 4: 𝛼 = 0.05
• Step 5:
• Step 6: Based on the p-value, we would fail to reject the null hypothesis at 𝛼
= 0.05
• Step 7: Therefore, there is insufficient evidence at the 0.05 level of significance
to believe that impulsivity rankings and gender are related/ associated.
Scatter Diagrams
• Scatter diagrams (also called scatter plots or scattergrams) provide a graphic
representation of the linear relationship between two continuous variables
where the individual data points for the two variables are plotted in
two-dimensional space.
• Traditionally the predictor (independent) variable is plotted on the x-axis and
the criterion (dependent) variable is plotted on the y-axis.
• If the focus of the study is to predict one variable on the basis of the other,
then it is easy to identify the predictor and the criterion variables - the criterion
or dependent variable is the variable that is predicted; and the predictor or
independent variable is the one that is used to predict the criterion. For
example, if a researcher wishes to establish if age predicts ability level, age
would be the predictor (independent) variable and ability level would be the
criterion (dependent) variable.
• If the focus of the study is to establish the nature of the association between
the two variables (strength and direction of the relationship), then it does not
matter which of the variables plays which role. For example, if a researcher is
interested in establishing the nature of the relationship between stress and
anxiety, then stress could be either the predictor or the criterion variable, and
anxiety could be either the predictor or the criterion variable (the labelling is
irrelevant).
Linear Relationships between Variables
• The linear relationship between two continuous variables can be visualised
using a scatterplot. Scatterplots can also show non-linear relationships or
indicate that there is no relationship between variables – these types of
relationships are not focussed on in RDA IIA.
• If the values of the two variables are linearly related, a straight line can be used
to summarise the data i.e. when we plot the data points on a coordinate
system and then draw a line through the data, all the values appear around the
line segment and follow the direction of the line.
• If the data are not linearly related, this does not mean that they are not related
– the relationship may be non-linear in nature i.e. the line that best describes
the data is not a straight line.
Covariance
• The covariance is a number reflecting the degree to which two variables vary
together.
• For example, if high scores on one variable tend to be paired with high scores
on the other variable, the covariance will be large and positive.
• If high scores on one variable tend to be paired with low scores on the other
variable, the covariance will be large but negative.
• A covariance near zero implies that high scores on one variable are paired with
both high and low scores on the other variable.
Pearson’s Correlations
• The Pearson’s Product Moment Correlation Coefficient estimates the strength
and direction of the linear relationship between two numerical variables.
• It is referred to as ‘correlation’ and is represented by the letter r. The value of
the correlation will always be a number between -1 and +1.
• Correlations are calculated based on covariance and factors that affect this
calculation include restriction of the range and non-linearity.
• The correlation coefficient (r) is calculated by dividing the covariance by the
product of the standard deviations of the two variables i.e. r = cov(X, Y) /
(sX × sY). Correlation and covariance are therefore related.
• Understanding how to interpret the strength and direction of a correlation
coefficient is critical in many areas of psychology.
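A short sketch of this relationship on assumed paired scores, computing r by hand as the covariance divided by the product of the standard deviations and checking it against the built-in function:

```python
import numpy as np
from scipy.stats import pearsonr

# Assumed paired scores, e.g. stress (x) and anxiety (y).
x = np.array([10, 12, 15, 18, 20, 23, 25])
y = np.array([8, 11, 13, 17, 19, 24, 26])

# Covariance, then r = covariance / (sd_x * sd_y).
cov_xy = np.cov(x, y, ddof=1)[0, 1]
r_manual = cov_xy / (x.std(ddof=1) * y.std(ddof=1))

r, p = pearsonr(x, y)   # built-in correlation with a significance test
print(f"cov = {cov_xy:.2f}, r = {r:.3f} (by hand: {r_manual:.3f}), p = {p:.4f}")
```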
Example:
Question: What was the nature of the relationships between levels of
sensation-seeking, levels of impulsivity, and levels of risky driving behaviour
in the original sample?
• Step 1:
o There are three variables being considered – levels of sensation-seeking,
levels of impulsivity, and levels of risky driving behaviour.
o All three variables are continuous scores that are interval in nature
(measured on psychometric scales)
o The focus of the question is to establish the nature of the relationships
between the variables i.e. to establish (quantify) the strength, direction, and
significance of each bivariate correlation.
• Step 2: H0: ρ = 0
• Step 3: H1: ρ ≠ 0
• Step 4: 𝛼 = 0.05
• Step 5: Each correlation coefficient between a pair of variables is tested for
significance against a population correlation of zero; for the interpretation,
the interest is in whether each correlation is significant or not (based on the
p-value) as well as the strength and direction of the relationship between the
variables.
• Step 6 and Step 7:
o Based on the p-values above:
▪ Fail to reject H0 - the relationship between driving behaviour (DBQ) and
impulsivity (BIS) would be non-significant. Because the relationship is
non-significant, the correlation would not be interpreted any further.
▪ Fail to reject H0 - the relationship between driving behaviour (DBQ) and
sensation-seeking (BSS) would be non-significant. Because the
relationship is non-significant, the correlation would not be interpreted
any further.
▪ Fail to reject H0 - the relationship between impulsivity (BIS) and
sensation-seeking (BSS) would be non-significant. Because the
relationship is non-significant, the correlation would not be interpreted
any further.
Simple Regression
• The purpose of simple regression is to create an equation that will allow us to
predict the value of one variable using the values of another variable i.e. we are
interested in deriving an equation that explains how differences in one variable
relate to differences in another variable.
• For RDA IIA, regression will be restricted to cases where the best fitting line
through the scatterplot is a straight line i.e. the course will only cover linear
regression. RDA IIA will also only focus on simple linear regression, where a
single predictor (independent) variable is used to predict a criterion
(dependent) variable.
• NB: For regression analysis it is very important to distinguish between the
predictor (independent) variable and the criterion (dependent) variable – these
cannot be used interchangeably.
The Regression Line
• The scatterplot gives a good indication of what the linear relationship between
the two variables looks like i.e. whether the direction of the linear relationship
is positive or negative.
• If the straight line is drawn such that it passes through the middle of all the
data points, this line will represent ‘the line of best fit.’ The equation for this line
is determined through the regression of Y on X.
• The equation for the regression line is represented statistically as: ŷ = a + bx
o ŷ represents the predicted value of Y i.e. the value of the criterion
(dependent) variable that is estimated from the regression line
o a represents the intercept i.e. the predicted value of Y when X = 0
o b represents the slope (gradient) of the regression line i.e. the amount of
difference in Y associated with a one-unit difference in X. Ideally it is the
amount of change in Y for a one-unit change in X.
o x represents the value of the predictor (independent) variable
• Note: the y-hat symbol (ŷ) is used in the equation instead of Y to indicate that
the values obtained are predicted values, not actual values.
• Note: the direction of the slope (b) will follow the direction of the association
between the two variables (r) i.e.
o if the correlation between the two variables (r) is positive, then the slope (b)
will be positive as well.
o if the correlation between the two variables (r) is negative, then the slope (b)
will be negative as well.
The Line of Best Fit
• The regression line is also called the line of best fit because it is the line for
which the combined squared deviation between all points and the line is a
minimum i.e. if we draw vertical lines from each data point to the plotted
regression line, calculate the length of each of these distances, and then
square these values to remove the negative signs, the sum of these squared
values will be the lowest possible combined distance from the plotted line.
• The vertical distance from each data point to the plotted line of regression is
called the error of prediction – this is also referred to as the residual. The
residual is calculated based on the difference between the observed and
predicted values of Y.
• ŷ = a + bx therefore gives the best prediction that can be made of Y. The error
associated with this prediction will be a function of the deviations of Y about
the predicted point ŷ – this is called the standard error of the estimate.
Calculating and Using the Regression Line
• To find the equation for the regression line, we need to identify estimated
values for b and a. This can be done through a process called the ordinary
least squares (OLS) method.
• Once we have the least squares estimates for b and a, we can use the
regression equation to predict a value of the criterion (dependent) variable
using the observed predictor (independent) variable. This is done by
substituting an actual value for X and then calculating the value of ŷ by
solving the equation.
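A minimal sketch of the ordinary least squares estimates on assumed data: b is estimated as cov(X, Y) / var(X) and a as the mean of Y minus b times the mean of X, after which the line can be used for prediction:

```python
import numpy as np

# Assumed data: X = sensation-seeking, Y = risky driving behaviour.
x = np.array([18, 21, 24, 26, 30, 33, 35])
y = np.array([40, 44, 47, 52, 55, 61, 63])

# Ordinary least squares estimates:
#   b = cov(X, Y) / var(X);   a = mean(Y) - b * mean(X)
b = np.cov(x, y, ddof=1)[0, 1] / x.var(ddof=1)
a = y.mean() - b * x.mean()
print(f"regression line: y-hat = {a:.2f} + {b:.2f}x")

# Prediction by substituting a value of X into the equation
# (28 lies inside the observed range 18-35, so this is interpolation).
x_new = 28
print(f"predicted y for x = {x_new}: {a + b * x_new:.1f}")
```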
Interpolation and Extrapolation
• When a value for X is substituted into the regression equation to predict Y, it is
very important to check that the value falls within the range of values X that
were used to calculate the regression line in the first instance.
• Valid values of X are those that lie within the range of observed values of
variable X – substituting these into the regression line equation is called
interpolation.
• If a value for X that falls outside the range of observed values of variable X is
substituted into the regression line equation, this is called extrapolation.
Extrapolation is not ideal as the way in which the regression line might stay the
same or deviate beyond the range is not clear (no data points exist to estimate
from) – for this reason, it is recommended that extrapolation be avoided.
The Coefficient of Determination
• The Coefficient of Determination is the square of the correlation coefficient
(i.e. the correlation coefficient multiplied by itself) – it is represented as r².
• It provides an estimate of the proportion of variation in the criterion
(dependent) variable that is explained by the predictor (independent) variable
in the regression model.
• i.e. it is an estimate of how much of the variability in the criterion (dependent)
variable comes from (is due to) the predictor (independent) variable.
• For example, for a correlation of r = 0.985, the coefficient of determination is
r² = 0.97. This means that approximately 97% of the variation in Y is explained
by the variability in X.
• This also means that 3% (0.03) of the variation in Y cannot be accounted for
when using the regression equation to predict the values of Y. This unexplained
variation is due to factors other than those used in the linear regression model,
such as other variables or even external information that was not measured.
Example:
Question: Did level of sensation-seeking predict risky driving behaviour in the
original sample?
• Step 1:
o There are two variables being considered – levels of risky driving behaviour,
which is the variable to be predicted (i.e. the DV or criterion) and level of
sensation-seeking, which is the variable to be used to predict (i.e. the IV or
predictor)
o Both variables are continuous scores that are interval in nature (measured
on psychometric scales)
o The focus of the question is to model the relationship between the variables
i.e. establish the extent to which the one variable (the predictor) predicts the
other (the criterion).
• Step 2: H0: β = 0
• Step 3: H1: β ≠ 0
• Step 4: 𝛼 = 0.05
• Step 5: Regression analysis involves calculating the parameters (the slope and
the intercept) for a straight line that can be used to predict one variable (the
criterion) on the basis of the other (the predictor). The equation for this straight
line is ŷ = a + bx, where a represents the intercept and b represents the slope.
In the regression analysis, the slope for each predictor is tested for
significance, in order to indicate whether it is a significant predictor of the
criterion or not. The significance of the overall model is also tested, using an
ANOVA technique.
• Step 6: Based on the p-value, we would fail to reject the null hypothesis at 𝛼
= 0.05
• Step 7: Therefore, there is insufficient evidence at the 0.05 level of significance
to believe that sensation-seeking predicts risky driving behaviour.