Statistical Tools and
Techniques in Research
SHERWIN E. BALBUENA
Statistics
• a branch of mathematics
dealing with the collection,
analysis, interpretation, and
presentation of masses of
numerical data (Merriam-
Webster, n.d.)
• Merriam-Webster. (n.d.). Statistics. In Merriam-
Webster.com dictionary. Retrieved May 27, 2021, from
https://www.merriam-webster.com/dictionary/statistics
9/3/20XX Presentation Title 2
Statistics
the practice or science of collecting and analyzing numerical data in large quantities, especially for the purpose of inferring proportions in a whole from those in a representative sample.
6/2/2021 Statistical Tools and Techniques in Research 3
Statistical Tools
Statistical tools involved in carrying out a study include planning, designing, collecting data, analyzing, drawing meaningful interpretations, and reporting the research findings.
Statistical Techniques
• statistical methods
• quantitative methods
• either graphical, numerical, or tabular
• either parametric or nonparametric methods
Research and
Statistics
How are they related?
The Scientific
Method
Research
Designs
Levels of
Measurement of
Data Gathered
from Research
Level of Measurement: Example (Grade)
• Nominal: passed/failed
• Ordinal: 1.0, 1.25, 2.0, 3.0, 5.0
• Interval: Z-score = 1.96
• Ratio: 95%
Are Likert Scales Ordinal?
• Individual Likert item responses are ordinal.
• The sum of ordinal responses can be treated as interval if the test or questionnaire is unidimensional or reliable.
• Reliable Likert-type instruments have Cronbach’s alpha coefficients of at least 0.70.
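The reliability check mentioned above can be sketched in Python. This is a minimal illustration that assumes the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the function name `cronbach_alpha` and the response data are hypothetical and not from the slides.

```python
import statistics


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(item_scores[0])                      # number of items
    columns = list(zip(*item_scores))            # one tuple of scores per item
    item_variances = [statistics.variance(col) for col in columns]
    totals = [sum(row) for row in item_scores]   # summed score per respondent
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: 6 respondents x 4 Likert items (1 to 5).
responses = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
    [1, 2, 1, 2],
]
alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha = {alpha:.2f}")
print("Meets the 0.70 threshold" if alpha >= 0.70 else "Below the 0.70 threshold")
```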
Variables Used
in Research
Probability
Sampling
Techniques
Random Sampling in Excel
1. Add a new column within the spreadsheet and name it Random_number.
2. In the first cell underneath your heading row, type “=RAND()”.
3. Press “Enter,” and a random number will appear in the cell.
4. Copy and paste the first cell into the other cells in this column.
5. Once each row contains a random number, sort the records by the Random_number column.
6. Choose the top n entries. Those will be the random n out of N entries.
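The same procedure can be sketched outside Excel. The Python example below mirrors the steps above: attach a random number to each record, sort by it, and keep the top n. The function name `sample_by_random_key`, the `seed` parameter, and the respondent list are illustrative assumptions.

```python
import random


def sample_by_random_key(records, n, seed=None):
    """Pick n records by sorting on a random key, like the RAND() column."""
    rng = random.Random(seed)                          # seed only for repeatability
    keyed = [(rng.random(), record) for record in records]
    keyed.sort(key=lambda pair: pair[0])               # sort by the random number
    return [record for _, record in keyed[:n]]         # keep the top n entries


# Hypothetical sampling frame of N = 20 respondents.
population = [f"Respondent {i}" for i in range(1, 21)]
chosen = sample_by_random_key(population, n=5, seed=42)
print(chosen)
```

Python's built-in `random.sample(population, 5)` gives an equivalent simple random sample in one call; the longer version is shown only because it follows the spreadsheet steps one by one.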
Nonprobability
Sampling
Techniques
Sample Size
Determination
Cochran’s
Formula
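The slide names Cochran's formula without writing it out. As a sketch, the commonly cited form is n0 = z² p (1 - p) / e², with the finite population correction n = n0 / (1 + (n0 - 1) / N). The function name, the default values (z = 1.96, p = 0.5, e = 0.05), and the population size of 1000 are assumptions chosen for illustration.

```python
import math


def cochran_sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """Sample size from Cochran's formula, rounded up to a whole respondent.

    z: z-score for the confidence level (1.96 for 95%)
    p: estimated proportion with the attribute (0.5 is the most conservative)
    e: margin of error
    population: population size N; if given, apply the finite correction
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


n_infinite = cochran_sample_size()               # large or unknown population
n_finite = cochran_sample_size(population=1000)  # hypothetical N = 1000
print(n_infinite, n_finite)
```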
Distributions
of Data
Statistical Significance
A result is statistically significant:
• if the test statistic exceeds the critical value at a given level of significance (5%, 1%) and degrees of freedom, or
• if the p-value is less than the set level of significance (* p < 0.05, ** p < 0.01).
Hypothesis Testing
Five Steps in Hypothesis Testing:
• Specify the Null Hypothesis.
• Specify the Alternative Hypothesis.
• Set the Significance Level (α).
• Calculate the Test Statistic and Corresponding P-Value.
• Draw a Conclusion.
Hypothesis Testing: Example
Two-Sample t-Test
• H0: μ1 = μ2
• H1: μ1 ≠ μ2
• α = 0.05
• t = -2.5, p = 0.0254
• Reject H0 in favor of H1.
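The last step of the example, drawing a conclusion by comparing the p-value with α, can be sketched as a small Python function. The function name `decide` is hypothetical; the p-value and α are the ones from the example above.

```python
def decide(p_value, alpha=0.05):
    """Return the decision for a test at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    # Reject H0 only when the p-value is below the significance level.
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


decision = decide(0.0254, alpha=0.05)
print(decision)
```

The function only applies the decision rule; the test statistic and p-value themselves would come from statistical software.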
Descriptive Statistics
• Measures of Central Tendency (or Location, Position)
• Measures of Dispersion (or Variation, Spread)
• Measures of Shape
Measures of Central Tendency
• Mean
• Median
• Mode
The Mean
• Referred to as the average
• Add all entries; divide the sum by the number of entries
• Excel formula: =AVERAGE
When to Use the Mean?
• When the data distribution is continuous and symmetric
• When data are normally distributed
• No outliers
Outliers
The Median
• Arrange the data from lowest to highest; find the middle of the data
• Excel formula: =MEDIAN
When to Use the Median?
• When your data set is skewed
• When you are dealing with ordinal data
The Mode
• The most frequently occurring categories or values
• Excel formula (for unimodal): =MODE.SNGL
• Excel formula (for multimodal): =MODE.MULT
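For readers working outside Excel, the three measures of central tendency can be sketched with Python's standard `statistics` module. The score lists are made-up illustrations; `multimode` plays a role similar to =MODE.MULT by returning every most-frequent value.

```python
import statistics

# Hypothetical test scores.
scores = [85, 90, 78, 90, 88, 70, 90]

mean_score = statistics.mean(scores)      # sum of entries / number of entries
median_score = statistics.median(scores)  # middle value after sorting
mode_score = statistics.mode(scores)      # single most frequent value

# A data set with two modes.
modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean_score, median_score, mode_score, modes)
```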
When to Use the Mode?
• When dealing with nominal data
Other Stats
• Frequency = ∑xi
• Percentage = 100*∑xi/n
• Proportion = ∑xi/n
where xi = 1 if present, 0 if absent
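A short Python sketch of these three formulas, using a made-up list of 0/1 indicators (1 if the attribute is present, 0 if absent):

```python
# Hypothetical indicator data for n = 10 respondents.
present = [1, 0, 1, 1, 0, 1, 0, 1, 1, 1]

n = len(present)
frequency = sum(present)            # sum of x_i
proportion = frequency / n          # sum of x_i divided by n
percentage = 100 * frequency / n    # 100 * sum of x_i divided by n

print(frequency, proportion, percentage)
```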
Graphical Techniques
• Bar graphs
• Pie charts
• Histograms
• Line graphs
• Boxplots
• Stem-and-leaf plots
• Scatterplots
Bar Charts
• Used when data are nominal or ordinal
• The heights or lengths are frequencies/percentages
Pie Charts
• Used when you are trying to compare parts of a whole
• When the data are nominal or ordinal
Histograms
• Used when you want to see the shape of the data's distribution
• When the data are interval/ratio
Line Graphs
• Used to track changes over short and long periods of time
• Also used to compare changes over the same period of time for more than one group
Boxplots (Box-and-Whisker Plots)
• Used to show the shape of the distribution, its central value, and its variability
• Used when the data are at least at the interval level
Stem-and-Leaf Plot
• Used to classify either discrete or continuous variables
• Looks something like a bar graph
Scatterplots
• When you have two variables (X and Y) that pair well together
• X and Y variables are at least at the interval/ratio level
Measures of Dispersion
• Range
• Interquartile Range
• Variance
• Standard Deviation
Dispersion,
Variability,
Spread
The Range
• Range(X) = Max(X) – Min(X)
• Difference between the highest score and the lowest score
The
Interquartile
Range
Variance
and
Standard
Deviation
Excel formulas:
=STDEV.P (population standard deviation)
=STDEV.S (sample standard deviation)
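As a sketch, the measures of dispersion can also be computed with Python's standard `statistics` module. The scores are made up. `pstdev` divides by n, like =STDEV.P, and `stdev` divides by n - 1, like =STDEV.S. The quartile method used by `statistics.quantiles` is its default ("exclusive"), which is an assumption here; other software may use a different quartile rule and give a slightly different interquartile range.

```python
import statistics

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(scores) - min(scores)           # highest minus lowest score
q1, q2, q3 = statistics.quantiles(scores, n=4)   # quartile cut points
iqr = q3 - q1                                    # interquartile range

population_variance = statistics.pvariance(scores)  # divides by n
sample_variance = statistics.variance(scores)       # divides by n - 1
population_sd = statistics.pstdev(scores)           # like =STDEV.P
sample_sd = statistics.stdev(scores)                # like =STDEV.S

print(data_range, iqr, population_sd, sample_sd)
```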
When to use SD?
• In conjunction with a mean
• To summarize continuous data
• When data are normal
• When data have no outliers
Measures of Shape
• Skewness
• Kurtosis
Skewness
(Symmetry)
Excel function:
=SKEW
Kurtosis
(Peakedness)
Excel function:
=KURT
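The two shape measures can be sketched in plain Python. The formulas below are the bias-adjusted sample skewness and sample excess kurtosis, which to my understanding are the ones Excel documents for =SKEW and =KURT; treat that correspondence as an assumption. The function names and data are illustrative.

```python
import statistics


def sample_skewness(data):
    """Adjusted sample skewness (needs at least 3 values)."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation
    cubed = sum(((x - mean) / s) ** 3 for x in data)
    return n / ((n - 1) * (n - 2)) * cubed


def sample_excess_kurtosis(data):
    """Adjusted sample excess kurtosis (needs at least 4 values)."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    fourth = sum(((x - mean) / s) ** 4 for x in data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth - correction


symmetric = [1, 2, 3, 4, 5]       # symmetric data: skewness near 0
right_skewed = [1, 1, 1, 2, 10]   # long right tail: positive skewness

print(sample_skewness(symmetric), sample_skewness(right_skewed))
print(sample_excess_kurtosis(symmetric))
```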
Normal Distribution
Properties:
1. Mean = Median = Mode
2. Symmetric about the center
3. Area below the mean = Area above the mean
4. Total area under the curve = 1
Tests of Normality
• Kolmogorov-Smirnov
• Shapiro-Wilk
• Normal QQ-plots
• Boxplots
Kolmogorov-
Smirnov Test
H0: There is no difference between
the observed and theoretical
distribution.
Kolmogorov-
Smirnov Test
Examples of outputs
p-values above 0.05 mean the null hypothesis is not rejected, so the data can be treated as normally distributed.
Shapiro-Wilk
Test
H0: The data are normally
distributed.
Shapiro-Wilk
Test
Examples of outputs
p-values above 0.05 mean the null hypothesis of normality is not rejected, so the data can be treated as normally distributed.
GRAPHICAL TECHNIQUE
Normal
QQ-plots
If the points fall approximately along a straight line, the distribution is approximately normal.
GRAPHICAL TECHNIQUE
Boxplot
When the median is in the
middle of the box, and the
whiskers are about the
same on both sides of the
box, then the distribution is
symmetric.
Tests of
Homogeneity
of Variances
Levene’s Test
Bartlett’s Test
Levene’s Test
Null hypothesis: The variances are equal across all samples. In more formal terms, that's written as:
H0: σ1² = σ2² = … = σk²
Levene’s Test
Sample
Outputs
Bartlett’s
Test
Bartlett’s Test
Sample
Outputs
Some Parametric Statistics
• Independent samples t-test
• Paired samples t-test
• Analysis of variance
• Pearson product-moment correlation
• Linear regression
Independent Samples t-Test
• Compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different.
Independent Samples t-Test: Assumptions
• One independent, categorical variable that has two levels/groups.
• One continuous dependent variable.
• Normality of data within each group
• No significant outliers in the two groups
• Random sampling from the population
• Homogeneity of group variances
Independent Samples t-Test: Applications
• Comparing the math abilities of male and female students
• Comparing the reading performances of the control group and the experimental group
Independent Samples t-Test: Sample Outputs
• H0: μ1 = μ2
• t = -1.99, p-value = 0.055
How to Activate Analysis ToolPak in Excel
Please visit this link:
https://support.microsoft.com/en-us/office/use-the-analysis-toolpak-to-perform-complex-data-analysis-6c67ccf0-f4a9-487c-8dec-bdb5a2cefab6
Paired t-Test
Compares the means of two
measurements taken from the
same individual, object, or
related units
Paired t-Test: Assumptions
• Your dependent variable should be measured on a continuous scale.
• Your independent variable should consist of two categorical, "related groups" or "matched pairs".
• There should be no significant outliers in the differences between the two related groups.
• The distribution of the differences in the dependent variable between the two related groups should be approximately normally distributed.
Paired t-Test: Applications
• Comparing the pretest and posttest scores after using an intervention
• Comparing the math anxiety scores before and after conducting a non-traditional teaching strategy
Paired t-Test: Sample Outputs
• H0: µd = 0
• t = -6.53, p-value < 0.001 (displayed as 0.000)
Analysis of Variance (ANOVA)
• Compares the means of three or more categorical groups to establish whether there is a difference between them.
• H0: μ1 = μ2 = μ3 = . . . = μk
Assumptions of ANOVA
• Interval/ratio level dependent variable
• Independent variable should consist of two or more categorical, independent groups
• Independence of observations
• No significant outliers
• Normally distributed data in each group
• Homogeneity of variances
What if normality is violated?
• Data may be transformed using any of the following techniques:
  • Logarithmic (=LN or =LOG)
  • Square root (=SQRT)
  • Cube root (=[cell]^(1/3))
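The three transformations can be sketched in Python with made-up positive values. The last lines illustrate why the parentheses around 1/3 matter: without them the exponent is applied first and the result is simply divided by 3, which is also how Excel evaluates ^ before /.

```python
import math

# Hypothetical positive, right-skewed values.
values = [1, 8, 27, 64]

log_transformed = [math.log(v) for v in values]    # natural log, like =LN
sqrt_transformed = [math.sqrt(v) for v in values]  # like =SQRT
cube_roots = [v ** (1 / 3) for v in values]        # like =[cell]^(1/3)

# Operator precedence pitfall: exponentiation binds tighter than division.
wrong = 27 ** 1 / 3      # evaluates as (27 ** 1) / 3
right = 27 ** (1 / 3)    # the actual cube root

print(cube_roots, wrong, right)
```

Logarithms and square roots require positive (or non-negative) data, so values at or below zero would need a shift before transforming.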
Applications of ANOVA
• Determining the effect of a nominal/ordinal-level variable on an interval/ratio-level variable
• In experiments involving two or more experimental groups and one or more factors
• Comparing the average NAT scores of 5 schools in the division
One-Way ANOVA: Sample Output
• H0: μA = μB = μC
• F(2, 41) = 1.11, p = 0.34

Anova: Single Factor

SUMMARY
Groups     Count   Sum   Average    Variance
Method A   16      172   10.75      10.6
Method B   13      143   11         4.666667
Method C   15      181   12.06667   4.066667

ANOVA
Source of Variation   SS         df   MS        F          P-value    F crit
Between Groups        14.79394   2    7.39697   1.115258   0.337571   3.225684
Within Groups         271.9333   41   6.63252
Total                 286.7273   43
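As a check on how the ANOVA table follows from the SUMMARY block, the sketch below recomputes the sums of squares and the F statistic in Python from the counts, sums, and variances shown above. Variable names are illustrative. The p-value is not recomputed because it needs the F distribution, which is not in Python's standard library.

```python
# (count, sum, variance) per group, copied from the SUMMARY table.
groups = {
    "Method A": (16, 172, 10.6),
    "Method B": (13, 143, 4.666667),
    "Method C": (15, 181, 4.066667),
}

total_n = sum(count for count, _, _ in groups.values())
grand_mean = sum(total for _, total, _ in groups.values()) / total_n

# Between groups: weighted squared distance of each group mean from the grand mean.
ss_between = sum(
    count * (total / count - grand_mean) ** 2
    for count, total, _ in groups.values()
)
# Within groups: each sample variance times its degrees of freedom.
ss_within = sum((count - 1) * variance for count, _, variance in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)
ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_statistic = ms_between / ms_within

print(f"F({df_between}, {df_within}) = {f_statistic:.2f}")
```

Because F (about 1.12) is below the tabulated F crit (3.225684), the output is consistent with not rejecting H0 at the 5% level.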
What to do if ANOVA is significant?
• If the p-value is significant at a specified alpha, proceed to post-hoc tests.
• Examples of post-hoc tests:
  • Fisher’s Least Significant Difference (LSD)
  • Tukey’s Honestly Significant Difference (HSD)
  • Scheffe’s Test
  • Duncan’s new multiple range test (DMRT)
Pearson Product-Moment Correlation (r)
A measure of the strength of a linear association between two variables.
Assumptions of Pearson’s r
• The two variables, X and Y, are of interval/ratio level
• X and Y are paired
• Linear relationship between X and Y
• Bivariate normal distribution
• Homoscedasticity
• No univariate or multivariate outliers
LINEAR
BIVARIATE
NORMAL
Strength of
Linear
Correlation
What does a correlation coefficient mean?
• Negative: As X increases, Y decreases. Or as X decreases, Y increases.
• Positive: As X increases, so does Y. Or as X decreases, Y also decreases.
• But it does not mean that X affects Y, nor that Y affects X.
Significance
of Pearson’s r
Pearson’s r in
Excel
Excel function:
=CORREL
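For comparison with =CORREL, here is a minimal Python sketch of Pearson's r using the usual definition: the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations. The function name `pearson_r` and the paired data are illustrative assumptions.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation for two paired lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and have at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations.
hours_studied = [1, 2, 3, 4, 5]
exam_score = [52, 58, 61, 70, 74]
r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.3f}")
```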
Nonparametric Alternatives
• Mann-Whitney U Test
• Wilcoxon Signed-Rank Test
• Kruskal-Wallis H Test
• Spearman’s Rank Correlation
Mann-Whitney U Test
• Used when the assumptions of the independent samples t-test are not met
Mann-
Whitney U
Test Sample
Outputs
Wilcoxon Signed-Rank Test
• Used when the assumptions of the paired t-test are not met
Wilcoxon
Signed-Rank
Test Sample
Outputs
Kruskal-Wallis H Test
• Used when the assumptions of ANOVA are not met
Kruskal-
Wallis H Test
Sample
Outputs
Spearman’s Rank Correlation (Rho)
• Used when the assumptions of Pearson’s r are not met by the data
Spearman’s
Rho Sample
Outputs
Statistical
Tests
Appropriate
to Your
Research
Design
Online Calculators
• Wilcoxon signed-rank test
https://www.socscistatistics.com/tests/signedranks/default2.aspx
• Mann-Whitney U test
https://www.socscistatistics.com/tests/mannwhitney/default2.aspx
• Kruskal-Wallis test
https://www.socscistatistics.com/tests/kruskal/default.aspx
• Levene’s test (Homogeneity of variance test)
https://www.socscistatistics.com/tests/levene/default.aspx
• Shapiro-Wilk (Normality test)
http://www.statskingdom.com/320ShapiroWilk.html
Open-source Software
• https://www.rstudio.com/products/rstudio/download/
• https://www.jamovi.org/download.html
• https://www.blueskystatistics.com/Articles.asp?ID=301
“Facts are stubborn
things, but statistics
are pliable.”
Mark Twain
Sherwin E. Balbuena
Thank you
[email protected] Phone: +63 909 522 6069