MODULE I
THE STUDY OF STATISTICS
Lesson 1  Basic Concepts on Statistics
Lesson 2  Determining Sample Size
Lesson 3  Tools in Gathering Data
Lesson 4  Criteria for Data Gathering
Lesson 5  Organization and Presentation of Data
INTRODUCTION
This module is the introductory part of Advanced Statistics. It
involves lessons on basic concepts on Statistics, functions and types as well as
determining sample size. It also includes organization and presentation of data.
OBJECTIVES
After studying the module, you should be able to:
1. Determine the functions of statistics and cite concrete examples of its
uses in the different fields;
2. Differentiate the branches of statistics and the types of tests;
3. Differentiate and provide examples of the scales of measurement,
data and sources of data;
4. Determine the most appropriate way of selecting a sample and
collecting data in a particular study;
5. Identify the advantages and disadvantages of each form of presenting
data;
6. Recognize the uses of different forms of presenting data;
7. Organize collected data and present them in an appropriate form;
and
8. Construct graphs and charts.
SEME 112 - Advanced Statistics Module I
DIRECTIONS/ MODULE ORGANIZER:
There are 5 lessons in the module. Read each lesson carefully then
answer the exercises/activities to find out how much you have benefited from
it. Work on these exercises carefully and submit your output to your teacher.
In case you encounter difficulty, discuss this with your teacher during
the face-to-face meeting.
Good luck and enjoy reading!!!
Lesson 1
BASIC CONCEPTS ON STATISTICS
Statistics
is a branch of mathematics that deals with the processes of gathering,
describing, organizing, analyzing and interpreting numerical or statistical data
as well as with drawing valid conclusions and making reasonable decisions on
the basis of such analysis.
FUNCTIONS OF STATISTICS
1. To describe a group in terms of what is average or typical
Ex. What is the average salary of DepEd teachers?
2. To describe a group in terms of its dispersion or variability
Ex. Are the IQs of college students in DMMMSU varied?
3. To determine the existence of a relationship between or among two or
more variables
Ex. Is the salary of employees correlated with their manifestation of work
ethics?
4. To compare the scores of two or more groups on a variable
Ex. Is there a significant difference in the level of financial management
skills of teachers when grouped according to highest educational
attainment?
5. To determine the probability of occurrence of an event or observation
Ex. What is the probability that a person who is in close contact with a
COVID-19-positive individual will be infected with the virus?
6. To estimate the value of a population parameter on the basis of an
observed statistic
Ex. Based on a sample of 20 packs, can it be concluded that a company's
claim that its chocolate products weigh 1.5 kg is true?
Examples of the Functions of Statistics in Various Fields
Field | Concrete Use
1. Medicine | Trend of COVID-19 positive cases in the different regions
2. Sports | Number of wins and losses
3. Education | Trend in enrolment, number of graduates
4. Land Transportation Office | Registration of cars
5. Department of Trade and Industry | Number of micro enterprises in the province
6. Local Government Unit | Profile of recipients of the Social Amelioration Program
7. Psychology | Attitudinal patterns, causes and effects of misbehavior
8. Business and Economics | Sales, price indices, revenues, costs, inventories
9. Research | To test claims or inferences about a group of people or events
Branches/Fields of Statistics
1. Descriptive Statistics. This type of statistics is used to describe a group of
individuals or describe the data that have been collected. In short, this type of
statistics is devoted to summarization and description of data sets.
Statistical tools used: frequencies, percentage distribution, measures of
central tendency, graphs, skewness and kurtosis, measures of variability, degree
of relationships of group characteristics
Statistics - numerical indices calculated from a sample drawn from a
population.
Parameters - numerical indices calculated from the entire population.
2. Inferential Statistics. This type of statistics is used when one makes a
decision, estimate, prediction or generalization about a population based on a
sample. In inferential statistics, testing for significant differences and for
independence between two or more variables is given emphasis. A hypothesis
about the population is made and is then rejected or accepted
depending on the result of a test based on available samples.
Some tools are: Normal Distribution (area under the curve), Sampling
Distribution (sample size, standard scores), Probability Distribution, Hypothesis
Testing
One group: Chi-square; Z
Two groups: t; Z; Chi-square; Mann-Whitney Test; McNemar Test of Change;
Wilcoxon Test; Sign Test; Median Test
Three or more groups: Analysis of Variance; Kruskal-Wallis Test; Cochran’s
Test; Chi-square
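A minimal Python sketch of the distinction drawn above between parameters (computed from the entire population) and statistics (computed from a sample). The ages are hypothetical values invented for illustration, and taking every other element as the sample is only a convenience here, not a recommended sampling method.

```python
import statistics

# Hypothetical population: ages of all 8 members of a small group.
population = [18, 21, 25, 19, 22, 24, 20, 23]
# A sample drawn from it (every other member, for illustration only).
sample = population[::2]

# Parameters: numerical indices computed from the entire population.
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics: numerical indices computed from the sample.
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # sample standard deviation

print(f"parameter mean = {mu}, statistic mean = {x_bar}")
print(f"parameter sd = {sigma:.2f}, statistic sd = {s:.2f}")
```

Descriptive statistics stops at summaries like these; inferential statistics would go on to use the sample values to draw conclusions about the population values.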
KINDS OF TESTS IN STATISTICS
a. Parametric test
- a test of significance appropriate when the data represent an interval
or ratio scale of measurement
- it is stronger; the sample size is large (n > 30)
- the distribution is normal
- sampling is done at random
b. Non-parametric test
- a test of significance appropriate when the data represent an ordinal or
nominal scale
- the sample size is small
- it is distribution-free
- the samples are not randomized (purposive)
Constants and Variables
Constants - fundamental quantities that do not change in
value. Ex. Fixed costs and acceleration due to gravity
Variables - quantities that may take any one of a specified set of values.
Qualitative (categorical) variables - non-measurable
characteristics that cannot assume a numerical value but can be classified into
two or more categories. Ex. Sex (male or female), opinion on an issue (for,
against, undecided), smoking habits (always, often, seldom, very seldom,
never).
Data obtained about a qualitative variable are
called qualitative data.
Quantitative (numerical) variables - quantities that can be
counted, measured with the use of a measuring
device, or calculated using a mathematical formula.
Data involving quantitative variables are called
quantitative data.
Discrete variables - actual values obtained by counting. Ex.
Number of students, number of vehicular accidents
Continuous variables - values obtained by measurement,
usually with units, such as height, weight, and time in minutes
- also obtained by evaluating values using a formula, such as
profits, IQ and final grades
Sources of Data
Data - facts concerning things, such as the status in life of people or
defectiveness.
1. Primary - first-hand information from an eye or ear witness of a past event
2. Secondary - information furnished by a person who was not a direct observer
or participant of the event
3. Documentary data - data obtained from records of offices, hospitals, etc.
1.7 Scales of Measurement
1. Nominal
- involves naming or labeling; that is, placing cases into categories and
counting their frequency of occurrence
- distinguishes responses into attributes or categories
Ex. religion, gender (real dichotomy), nationality, aggression (either
active or passive: artificial dichotomy)
2. Ordinal
- distinguishes among categories arranged in rank order, grouped
according to ranks or ranges
Ex. military rank; comparing and rank-ordering
socio-economic status (high, middle, low); state of happiness (very happy, not
so happy, unhappy, very unhappy); rank in an oratorical contest (it cannot be
concluded that the 1st placer is twice as good as the 2nd placer).
3. Interval
- expressed in terms of numbers, and the differences between successive
numbers are consistently the same
- not only tells about the ordering of categories but also indicates the exact
distance between them (Ex. score in an exam, IQ, performance)
- has an arbitrary zero: a zero score does not mean that the examinee has no
knowledge of the subject at all
4. Ratio
- like interval measurements, also expressed in numbers, and the
differences between any two successive numbers are consistent
- has a true zero, meaning measurement starts at zero
Ex. number of children, height, speed, capacity, years of experience
For a statistical technique to be more manageable, an interval or ratio
variable may be converted to an ordinal variable, for example, length of service:
Length of Service | Rank
40 years and above | 1
30-39 years | 2
20-29 years | 3
10-19 years | 4
Below 10 years | 5
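The conversion in the table above can be sketched in Python as follows. The function name `service_rank` is hypothetical, and the cut-offs are read directly from the length-of-service table; the sample years passed to it are invented for illustration.

```python
def service_rank(years: float) -> int:
    """Convert years of service (ratio scale) to the ordinal rank in the table."""
    if years >= 40:
        return 1
    if years >= 30:
        return 2
    if years >= 20:
        return 3
    if years >= 10:
        return 4
    return 5  # below 10 years


# Hypothetical employees' years of service.
for years in (42, 35, 20, 12, 3):
    print(years, "years -> rank", service_rank(years))
```

Note that the conversion loses information: two employees with 30 and 39 years of service receive the same rank, so the result can only be treated as ordinal data.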
THINK!
Answer the following exercises.
Exercise 1. Name other fields or even agencies of the government and identify
specific uses of Statistics in those fields or agencies.
Exercise 2. Categorize each of the following according to the level of
measurement.
1. sex
2. religious affiliation
3. no. of immediate family members
4. highest level of educational attainment
5. monthly income
6. social class you belong
7. the region where you live
8. math grade
9. first place, second place, third place in a lantern contest
10. rating of a teacher in the licensure exams
Lesson 2
Determining Sample Size
Population - consists of all elements considered in a study. It is a universal set. Ex.
Students in DMMMSU
Finite - can be counted
Infinite - cannot be counted
Sample - a representative group taken from a population
Why take a sample instead of studying the entire population?
1. It is very expensive to study the entire population.
2. It is time consuming.
3. Sampling enables the researcher to make inferences or
generalizations.
Let N be the population size, and let the margin of error e denote the
allowed probability of committing an error in selecting a small representative
of the population.
The sample size n can be obtained by using Slovin’s Formula:
\[ n = \frac{N}{1 + N e^{2}} \]
where: n = sample size
N = population size
e = desired margin of error
The margin of error, e, could range between 1% (.01) and 10% (.10),
depending on the desire of the researcher. However, the researcher should be
aware of the Law of Large Numbers, which states, “The larger the size of the
sample, the more certain we can be that the sample mean is a good
estimate of the population mean. The larger the size of the sample, the closer
its characteristics would be to the characteristics of the entire population.” (It
can be noted that the higher the margin of error, the smaller the computed
sample size.)
In social research, 5% (.05) is usually used, while in medical research
studies, 1% (.01) is used.
Lynch Formula
\[ n = \frac{N Z^{2}\, p(1-p)}{N d^{2} + Z^{2}\, p(1-p)} \]
where: n = sample size
N = population size
Z = the standard value (2.58) at the 1% level of probability with
0.99 reliability (1.96 for 5%)
d = margin of error (.05)
p = largest possible proportion (0.50) for getting the correct
number of sample from the population
Solve for the sample size for N = 3590 using Slovin’s formula and the Lynch
formula. Compare the results.
Using Slovin’s formula,
\[ n = \frac{3590}{1 + 3590(.05)^{2}} = \frac{3590}{9.975} \]
n = 360
Using the Lynch formula,
\[ n = \frac{3590(1.96)^{2}(.5)(1-.5)}{3590(.05)^{2} + (1.96)^{2}(.5)(1-.5)} = \frac{3447.836}{9.9354} \]
n = 347
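The two computations above can be checked with a short Python sketch. The function names `slovin` and `lynch` are hypothetical, and the parameter name `d` for the Lynch margin of error is an assumption made for this example; the default values (Z = 1.96, p = 0.50, d = .05) are the ones used in the worked example.

```python
def slovin(N: int, e: float) -> float:
    """Slovin's formula: n = N / (1 + N e^2)."""
    return N / (1 + N * e**2)


def lynch(N: int, Z: float = 1.96, p: float = 0.50, d: float = 0.05) -> float:
    """Lynch formula: n = N Z^2 p(1-p) / (N d^2 + Z^2 p(1-p))."""
    q = Z**2 * p * (1 - p)
    return N * q / (N * d**2 + q)


N = 3590
print("Slovin:", round(slovin(N, e=0.05)))
print("Lynch: ", round(lynch(N)))
```

The results are rounded to the nearest whole respondent, as in the worked example; a researcher who prefers never to fall below the computed size could round up instead.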
Using stratified random sampling, get the sample size for each group of
respondents using Slovin’s and the Lynch formula.
a. Using Slovin's Formula, the multiplier is 360/3590, which is about 10%.
DMMMSU Community
Group | Population Size per Group | Sample Size per Group
Administrators | 40 | 4
Teachers | 385 | 39
Staff/Personnel | 65 | 7
Students | 3100 | 310
Total | 3590 | 360
b. Using the Lynch Formula, the multiplier is 347/3590, which is about 9.7%.
1. 40 x 347/3590 = 4
2. 385 x 347/3590 = 37
DMMMSU Community
Group | Population Size per Group | Sample Size per Group
Administrators | 40 | 4
Teachers | 385 | 37
Staff/Personnel | 65 | 6
Students | 3100 | 300
Total | 3590 | 347
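A minimal Python sketch of the proportional allocation used in the tables above: each group's population is multiplied by n/N and rounded to the nearest whole number. The function name `allocate` and the dictionary layout are assumptions made for this illustration.

```python
def allocate(strata: dict[str, int], n: int) -> dict[str, int]:
    """Distribute a total sample size n across strata in proportion to their sizes."""
    N = sum(strata.values())
    return {group: round(size * n / N) for group, size in strata.items()}


dmmmsu = {
    "Administrators": 40,
    "Teachers": 385,
    "Staff/Personnel": 65,
    "Students": 3100,
}

for group, size in allocate(dmmmsu, n=347).items():
    print(f"{group:<16} {size}")
```

With n = 347 the rounded shares add up exactly to 347. With n = 360, simple rounding gives a total of 361, so one group has to be adjusted by hand to keep the total at 360, as the first table does.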
SAMPLING TECHNIQUE
Questions such as “Which TV network is the most popular among the
people in town?” or “Who will probably be the next president of the country?”
require gathering of information from a number of respondents in a population.
Complete enumeration, or the so-called census taking, is a vital tool if the
information gathered will be used for administrative purposes and if it is of
local or national concern.
Sample surveys are preferred because of material constraints such as money,
time and effort.
Sampling Technique - the process of selecting a part of the population to
represent the population
1. Probability sampling (also known as random sampling) - every member of
the population has an equal chance of being selected for the sample;
also called fair sampling
a. Lottery/fishbowl technique - the name of each member of the
population is written on a piece of paper, then n out of the N
pieces of paper are drawn for the sample.
b. Table of random numbers - a computer-generated number
representation. Point to an entry in the table, then proceed in any
direction (vertically, horizontally, or diagonally) until n distinct
numbers representing the numerically coded elements in the
population are obtained.
c. Systematic sampling - this method takes every kth element in the
population (e.g., arranged alphabetically or by age, experience or
position).
By systematic sampling, every kth employee from the listed
order will be included in the sample. If N is known, the value of k can be
computed as
\[ k = \frac{N}{n} \]
where N is the population size and n is the sample size.
d. Stratified - the group is divided based on homogeneity and samples
are selected from each stratum. (When the population can be
partitioned into several strata or subgroups, it may be wiser to
employ the stratified technique to ensure that each
group is represented in the sample.)
1. Simple stratified random sampling - the same number of
respondents is taken from each stratum.
Suppose a population of students taking History, of size N =
800, can be grouped according to year level. Fifty students
will be taken randomly from each of the four groups,
which comprises a sample of 200 students.
2. Stratified proportional random sampling - the sample is taken from
the strata proportionally.
e. Multi-stage - this technique uses several stages in getting the sample
from the population. However, the selection is still done at random.
Ex. A researcher needs one barangay in the Philippines. Using the lottery
method, he can first pick from the different regions, then from the provinces
in the region picked, then towns, then barangays.
2. Non-probability sampling (selective or non-random sampling) - not all
members are given an equal chance of being selected; also called biased
sampling
a. Purposive (judgement) sampling - representative samples are
deliberately chosen based on judgement or criteria. (Ex. A study on the
type of credit card plan availed of by customers. In determining the
sample, the researcher may consider only the people who seem to
have white-collar jobs based on attire.)
b. Quota sampling - the choice of the number of persons or elements to
be included in a sample is done at the researcher’s own convenience
or preference.
c. Cluster sampling - sometimes referred to as an area sample because it
is usually applied on a geographical basis.
d. Accidental/incidental sampling - the design is applied to those
samples which are taken because they are the most available (Ex. An
interviewer can simply choose to ask those people around him or in a
coffee shop where he is taking a break.)
e. Convenience sampling - this utilizes the easiest way of reaching
the subject.
(Ex. Opinions of TV viewers and listeners concerning a controversial
issue: get responses and comments from those who will call.)
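Two of the probability sampling techniques described in this lesson, the lottery technique and systematic sampling with k = N/n, can be sketched in Python as follows. The function names and the employee codes are hypothetical; starting the systematic sample at a random position within the first k elements is an assumption of this sketch, and k is taken as the whole-number part of N/n.

```python
import random


def lottery_sample(population, n, seed=None):
    """Lottery/fishbowl technique: draw n distinct members at random."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=None):
    """Systematic sampling: take every kth member of an ordered list, k = N // n."""
    k = len(population) // n
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random starting point among the first k
    return [population[start + i * k] for i in range(n)]


# Hypothetical list of 100 employees arranged in some fixed order.
employees = [f"E{i:03d}" for i in range(1, 101)]

print(lottery_sample(employees, 10, seed=1))
print(systematic_sample(employees, 10, seed=1))
```

Passing a seed only makes the draw repeatable for checking; leaving it out gives a different sample on each run.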
THINK!
Solve for the sample size using Slovin’s and the Lynch formula by
filling in the corresponding boxes.
1. Given the population data below, solve for the sample size and show the
distribution to each of the sectors.
Sector | N | n
Faculty | 250 |
Administration | 60 |
Office Personnel | 180 |
Maintenance | 150 |
Students | 5000 |
2.
School | Teachers: Population | Teachers: Sample | Students: Population | Students: Sample | Total: Population | Total: Sample
UNP | 32 | | 320 | | |
ISPSC | 35 | | 114 | | |
NUPSC | 25 | | 118 | | |
DMMMSU NLUC | 62 | | 188 | | |
DMMMSU MLUC | 72 | | 289 | | |
DMMMSU SLUC | 72 | | 460 | | |
PSU Lingayen | 52 | | 306 | | |
PSU Bayambang | 100 | | 350 | | |
PSU Asingan | 23 | | 127 | | |
Total | 473 | | 2,272 | | |
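3.
School | Teachers: Population | Teachers: Sample | Students: Population | Students: Sample | Total: Population | Total: Sample
UNP | 32 | | 320 | | |
ISPSC | 35 | | 114 | | |
NUPSC | 25 | | 118 | | |
DMMMSU NLUC | 62 | | 188 | | |
DMMMSU MLUC | 72 | | 289 | | |
DMMMSU SLUC | 72 | | 460 | | |
PSU Lingayen | 52 | | 306 | | |
PSU Bayambang | 100 | | 350 | | |
PSU Asingan | 23 | | 127 | | |
Total | | | | | |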
Lesson 3
Tools in Gathering Data
TOOLS IN GATHERING DATA: Advantages and Disadvantages
1. Direct or Interview - a purposeful face-to-face interaction between two
persons: the interviewer, who asks questions to gather
information, and the interviewee or respondent, who
supplies the information asked for. It can be a tape-recorded or a written
interview.
Advantages: Precise and consistent answers are obtained by rephrasing
or recasting the questions, especially for illiterate respondents or children.
Follow-up questions can be raised for clarification.
Disadvantages: It consumes money, time, and effort, and it is
applicable only to small populations, except when conducting a census.
Steps in the interview
1. Planning Step
- selection of the universe and locale of the study
- selection of the respondents by any valid sampling method
- selection of the type of interview
- preparation of the instrument (questions to be asked)
2. Selecting a place for the interview.
3. Establishing rapport.
4. Carrying out the interview.
5. Recording the interview.
6. Closing the interview.
What to Avoid in Interviews
1. Avoid exerting undue pressure upon a respondent to make him participate in
an interview.
2. Avoid disagreeing or arguing with or contradicting the respondent.
3. Avoid unduly pressing the respondent to make a reply.
4. Avoid using language beyond the respondent's ability to
understand.
5. Avoid talking about irrelevant matters.
6. Avoid placing the interviewee in embarrassing situations.
7. Avoid appearing too high above the respondent in education, knowledge,
and social status.
8. Avoid interviewing the respondent at an unholy hour.
2. Indirect or Questionnaire - an alternative to the interview method; a
paper-and-pencil data gathering method. Written responses are obtained by
distributing questionnaires (a list of questions intended to elicit answers to a
given problem, which must be in a logical order and not too personal) to the
respondents by mail, online or by hand.
Advantages: It consumes less time, money and effort.
Disadvantages: Many responses may not be consistent because of poor
construction of the questionnaire. The meaning of the questions may vary from
one person to another. Inconsistent responses can no longer be modified;
hence, the number of valid respondents is reduced.
Guidelines:
a. Make all directions clear.
b. Use correct grammar.
c. Make all questions unequivocal.
d. Avoid asking biased questions.
e. Objectify the responses.
f. Relate all questions to the topic under study.
g. Create categories or classes for approximate answers.
h. Group the questions in logical sequence.
i. Create a sufficient number of response categories.
j. Word carefully or avoid questions that deal with confidential or
embarrassing information.
k. Explain and illustrate difficult questions.
l. State all questions affirmatively.
m. Make as many questions as would supply adequate information for
the study.
n. Add a catch-all word or phrase to the options of multiple-response
questions.
o. Place all spaces for replies at the left side.
p. Make the respondents anonymous.
3. Observation - a scientific method of investigation that makes possible the use
of all the senses to measure or obtain outcomes or responses from the object of
study. Data which cannot be gathered using the other tools can be gathered
using observation.
Ex. Teaching performance of Mathematics teachers.
- a means of gathering information for research; it may be defined as perceiving
data through the senses: sight, hearing, taste, touch, and smell. It is widely used in
studying behavior.
Advantages: The observation method is usually applied to respondents who
cannot be asked or need not speak, especially when the behaviours of
persons, the culture of an organization, or the performance outcomes of
employees or students are to be considered.
Disadvantages: Subjectivity of the information sought cannot be avoided.
Making Observation More Valid and Reliable
1. Use observation where and when other data gathering devices cannot be
used.
2. Use appropriate observation forms.
3. Record immediately.
4. Be as objective as possible.
5. Base evaluation on several observations.
4. Registration Method / Documents or Records - enforced by private
organizations or government agencies for recording purposes. It is a process of
listing down items of the same kind in some systematic manner for record
purposes.
Ex. From the Philippine Statistics Authority, data such as population, deaths, etc.
can be gathered.
- registered matter may be classified alphabetically, chronologically,
quantitatively, qualitatively or otherwise.
Advantages: Organized data are available from different
institutions and agencies, which can serve as a ready reference for future study
or for personal claims of people’s records.
Disadvantages: Sometimes agencies have a poor management and
information system. Sometimes, the process or system of registration is not
implemented well. A rigid protocol must be followed to secure data from the
different records.
5. Test - a tool used to obtain data about a specific trait or characteristic. It is
a device or technique used to measure the performance, skill level, or
knowledge of a learner on a specific subject.
- a specific type of measuring instrument whose general characteristic is that it
forces responses from a person, and the responses are considered to be
indicative of the person’s skill, knowledge, attitudes, etc.
Classification
A. According to Standardization
1. Standard test - prepared by specialists; norms are established
2. Non-standard test - prepared by teachers to measure the achievement of
their students
B. According to Function
1. Psychological test, such as intelligence test, aptitude test, personality test, and
vocational and professional interest inventory
Advantages: The data gathered are a measure of competence and, hence, not
a perception. This is an objective method of obtaining data so long as the
test used has undergone validity and reliability testing.
Disadvantages: Sometimes the data obtained are not valid because of a poorly
constructed test. Respondents may hesitate to take the test.
6. Experiment - used when the objective is to determine the cause and
effect of a certain phenomenon under some controlled conditions.
Advantages: There is objectivity of information since a scientific method
of inquiry is used. An equal number of respondents with relatively similar
characteristics are examined to obtain the different effects of something
applied to the experimental group.
Disadvantages: It is very difficult to find respondents with almost similar
characteristics. The whole method must be repeated if the desired outcome is
not reached.
THINK!
Exercise
1. Identify 20 different data and determine which data
gathering tool is most appropriate. Justify your answer.
2. Name 20 government agencies and determine what data
can be obtained from each agency.
Lesson 4
CRITERIA FOR DATA
GATHERING TOOLS
These criteria are applicable for questionnaires and tests. Before they
are used to gather data, they should be subjected to validity and reliability
tests.
1. Validity - the extent to which the procedure actually accomplishes what it seeks
to accomplish, or measures what it intends to measure. Experts are supposed
to evaluate the test or questionnaire.
a. Face validity - the construction, arrangement of items and overall
presentation are good
b. Content validity - relevance of the test items; item analysis
determines which items are too easy and which are too difficult
2. Reliability - refers to the degree of consistency, accuracy, stability,
repeatability or precision
Methods: split-half method, test-retest method, parallel-form method,
internal consistency method, Kuder-Richardson 20 and 21
3. Sensitivity - able to detect changes
4. Specificity - gives only one answer
5. Positive predictive value - notes change and improvement
6. Appropriateness - respondents can meet the demands of the instrument
7. Objectivity - free from any influence of the examiner
Reliability
-means the extent to which a test is dependable, self-consistent and
stable. In other words, the test agrees with itself.
-refers to the consistency of the scores obtained-how consistent they are
for each individual from one administration of an instrument to another and
from one set of items to another
Tools for reliability
1. Test-retest method
The same measuring instrument is administered twice to the same group
of subjects. The scores of the first and second administrations of the test are
correlated using a correlation coefficient.
The disadvantages are:
1. When the time interval is short, memory effects may operate. The subjects
may recall their previous responses, which tends to make the correlation of the
test high.
2. When the interval is long, factors such as unlearning and forgetting, among
others, may occur and may result in a low correlation of the test.
3. Regardless of the time interval separating the two administrations, other
varying environmental conditions such as noise, temperature, lighting and
other factors may affect the correlation of the test.
The Spearman rank correlation coefficient, or Spearman rho, may be used to
correlate the scores in this method. The formula is
\[ r_s = 1 - \frac{6\sum D^{2}}{N^{3} - N} \]
where: \sum D^{2} = the sum of the squared differences between ranks
N = the total number of cases
Steps:
1. Rank the scores separately for the two administrations, giving the highest
score a rank of 1.
2. Obtain the difference between the two sets of ranks.
3. Square each difference.
4. Solve for the rank order correlation coefficient.
For example, 7 students in second year high school are used as a pilot sample to
test the reliability of an achievement test in Biology. Determine the reliability
coefficient given their scores in the two administrations of the test.
Illustrative Example:
Student | Test X | Rx | Test Y | Ry | D
1 | 18 | 1 | 24 | 4 | 3
2 | 17 | 2 | 28 | 2 | 0
3 | 14 | 3 | 30 | 1 | 2
4 | 13 | 4 | 2? | 3 | 1
5 | 12 | 5 | 2? | 5 | 0
6 | 10 | 6 | 18 | 6 | 0
7 | 8 | 7 | 15 | 7 | 0
Using the formula:
\[ r_s = 1 - \frac{6(14)}{7(49 - 1)} \]
r_s = 0.75, interpreted as high reliability. (In research, the reliability coefficient
should be at least .70 for it to be acceptable.)
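The computation above can be reproduced with a short Python sketch that follows the listed steps (difference of ranks, squaring, then the formula). The function name `spearman_rho` is hypothetical; the two rank lists are the Rx and Ry columns of the illustrative example, and the sketch assumes the ranks have already been assigned.

```python
def spearman_rho(rank_x, rank_y):
    """Spearman rank correlation: 1 - 6*sum(D^2) / (N^3 - N)."""
    n = len(rank_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n**3 - n)


# Ranks from the illustrative example (first and second administrations).
rx = [1, 2, 3, 4, 5, 6, 7]
ry = [4, 2, 1, 3, 5, 6, 7]

print(spearman_rho(rx, ry))  # 0.75
```

Here the squared differences sum to 14 and N^3 - N = 7(49 - 1) = 336, which gives 1 - 84/336 = 0.75.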
How to interpret the coefficient of reliability?
Computed Value | Interpretation
0.00 - ... | negligible
... - .25 | low
.26 - ... | moderate
... - .75 | high
.76 - 1.00 | very high
2. Split-half Method
The test in this method may be administered once, but the test items
are divided into two halves. The common procedure is to divide the test into
odd and even items. The two halves of the test must be similar but not
identical in content, difficulty, means and standard deviations. Each student
obtains two scores, one on the odd and the other on the even items, in one
test. The scores obtained in the two halves are correlated. The result is the
reliability coefficient for a half test. Since the reliability holds only for a half
test, the reliability coefficient for the whole test may be estimated by using
the Spearman-Brown formula. This formula is:
\[ r_{wt} = \frac{2 r_{ht}}{1 + r_{ht}} \]
where:
r_{wt} = reliability of the whole test
r_{ht} = correlation coefficient between the odd and even
scores, which is also called the reliability of half the test
Illustrative Example:
Given the scores in the odd- and even-numbered items, determine if the test is reliable:
Student | Score (40) | Even (20) | Odd (20)
A | 40 | 20 | 20
B | 28 | 15 | 13
C | 35 | 19 | 16
D | 38 | 18 | 20
E | 22 | 10 | 12
F | 30 | 12 | 18
G | 35 | 16 | 19
H | 33 | 16 | 17
I | 31 | 12 | 19
J | 28 | 14 | 14
From this information, it is possible to calculate the correlation using the Pearson
Product-Moment Correlation Coefficient, a statistical measure of the degree of
relationship between the two halves.
Pearson Product-Moment Correlation Coefficient
\[ r_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^{2} - (\sum X)^{2}]\,[N\sum Y^{2} - (\sum Y)^{2}]}} \]
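Using the data above, assume that the X values are the scores in the
even-numbered items and the Y values are the scores in the odd-numbered items.
Step 1. Complete the columns for XY, X^2, and Y^2.
Step 2. Get the summation of each column.
X | Y | XY | X^2 | Y^2
20 | 20 | 400 | 400 | 400
15 | 13 | 195 | 225 | 169
19 | 16 | 304 | 361 | 256
18 | 20 | 360 | 324 | 400
10 | 12 | 120 | 100 | 144
12 | 18 | 216 | 144 | 324
16 | 19 | 304 | 256 | 361
16 | 17 | 272 | 256 | 289
12 | 19 | 228 | 144 | 361
14 | 14 | 196 | 196 | 196
Sum X = 152 | Sum Y = 168 | Sum XY = 2595 | Sum X^2 = 2406 | Sum Y^2 = 2900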
Step 3. Using the formula
\[ r_{xy} = \frac{N\sum XY - (\sum X)(\sum Y)}{\sqrt{[N\sum X^{2} - (\sum X)^{2}]\,[N\sum Y^{2} - (\sum Y)^{2}]}} \]
where N = 10, compute the reliability of half the test:
\[ r_{xy} = \frac{10(2595) - (152)(168)}{\sqrt{[10(2406) - (152)^{2}]\,[10(2900) - (168)^{2}]}} \]
r_{xy} = .48 (this is the reliability of half the test, r_{ht})
Step 4. To get the reliability of the whole test, use the formula
\[ r_{wt} = \frac{2 r_{ht}}{1 + r_{ht}} = \frac{2(.48)}{1 + .48} \]
r_{wt} = .65 (This is interpreted as high reliability; however, it may not be sufficient
for research. Therefore, there is a need to improve the test items.)
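The split-half computation above can be checked with a minimal Python sketch: the Pearson product-moment coefficient between the even and odd halves, followed by the Spearman-Brown step. The function names `pearson_r` and `spearman_brown` are hypothetical; the two score lists are the even-item and odd-item scores of students A to J in the illustrative example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation from raw sums."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x**2) * (n * sum_y2 - sum_y**2))
    return numerator / denominator


def spearman_brown(r_half):
    """Estimate whole-test reliability from half-test reliability."""
    return 2 * r_half / (1 + r_half)


even = [20, 15, 19, 18, 10, 12, 16, 16, 12, 14]  # X: even-numbered items
odd = [20, 13, 16, 20, 12, 18, 19, 17, 19, 14]   # Y: odd-numbered items

r_half = pearson_r(even, odd)
print(round(r_half, 2))                  # 0.48
print(round(spearman_brown(r_half), 2))  # 0.65
```

Working from the raw sums mirrors the hand computation (Sum X = 152, Sum Y = 168, Sum XY = 2595, Sum X^2 = 2406, Sum Y^2 = 2900), so each intermediate value can be compared with the table.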
Other statistical tools to compute the reliability of tests and questionnaires are
KR 21, KR 20, Cronbach alpha, etc. Technology can also be used to determine
reliability. Use the reliability calculator created by Del Siegle
([email protected]). For a rating scale, just input the ratings given; for a test,
encode 1 if the answer to the item is correct and 0 if the answer is wrong.
3. Kuder-Richardson 21 Formula
\[ r = \frac{n\sigma^{2} - M(n - M)}{(n - 1)\sigma^{2}} \]
where: r = reliability of the whole test
n = product of the number of items in the questionnaire and the highest
scale
\sigma^{2} = variance
\[ \sigma^{2} = \frac{\sum (x - M)^{2}}{N} \]
where: x = total score of each respondent
M = mean score of the respondents
\[ M = \frac{\sum x}{N} \]
where: \sum x is the sum of all the scores
N is the number of respondents
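A minimal Python sketch of the Kuder-Richardson 21 computation as defined above. The function name `kr21` is hypothetical, and the scores are invented for illustration: they assume a 20-item test scored 1 for a correct answer and 0 for a wrong one, so that n = 20 x 1 = 20.

```python
def kr21(scores, n):
    """KR-21 reliability: (n*var - M*(n - M)) / ((n - 1)*var)."""
    N = len(scores)                                   # number of respondents
    M = sum(scores) / N                               # mean score
    variance = sum((x - M) ** 2 for x in scores) / N  # variance with divisor N
    return (n * variance - M * (n - M)) / ((n - 1) * variance)


# Hypothetical total scores of 10 respondents on a 20-item test.
scores = [12, 15, 18, 9, 14, 16, 11, 17, 13, 15]

print(round(kr21(scores, n=20), 2))
```

For these hypothetical scores the mean is 14 and the variance is 7, so the formula reduces to (140 - 84) / 133.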
THINK!
Solve for the reliability using the appropriate tool.
1. Given the scores below in the even- and odd-numbered items, determine
the reliability of the given test using the appropriate method.
Odd Items | Even Items
8 | 6
9 | 7
10 | 6
6 | 8
7 | 7
5 | 6
6 | 8
7 | 3
2. Given the scores of 13 students in the odd- and even-numbered items,
determine the reliability of the test.
Odd: 50 50 48 4? 45 44 44 43 42 42 41 40 40
Even: 36 34 44 50 32 28 42 36 28 40 50 38 35
3. Given the scores in the first administration and second administration of the
same test, find the coefficient of reliability.
First administration: 82 86 75 74 68 ... 99 48
Second administration: 8? 87 76 77 70 71 66 7? 99 50
Lesson 5 Optional
Organization and
Presentation of Data
The data collected from primary and secondary sources are
still considered raw data. They require manual tallying and classifying of
responses. After tallying, an appropriate form of organization and
presentation is used to arrive at a meaningful interpretation of the data.
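The tallying step described above can be sketched in Python with `collections.Counter`, which counts how many times each response occurs. The responses are hypothetical raw data invented for illustration, using the opinion categories (for, against, undecided) mentioned in Lesson 1.

```python
from collections import Counter

# Hypothetical raw responses to an opinion question.
responses = ["for", "against", "for", "undecided", "for", "against"]

# Tally the responses into a frequency distribution.
tally = Counter(responses)
total = sum(tally.values())

# Present the tally in tabular form: category, frequency, percentage.
print(f"{'Opinion':<10} {'f':>3} {'%':>7}")
for category, freq in tally.most_common():
    print(f"{category:<10} {freq:>3} {freq / total:>7.1%}")
```

The resulting frequency and percentage columns are the starting point for any of the forms of presentation discussed in this lesson.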
1.1 Forms of Presentation of Data
A. Textual- This form of data presentation combines text and numerical
facts in a statistical report. It can be narrative or in enumerative form.
B. Tabular - This presentation of data makes use of statistical tables. Tables
are constructed so that relationships can be seen right away and comparisons can be
made. Each class is assigned to a particular
Advantages of Tabular Presentation
1. It is brief and concise.
2. It provides the reader a good grasp of the meaning of quantitative
relationship indicated in the report.
3. The whole story is revealed without the necessity of mixing texts
with figures.
4. The presentation is systematic with the use of columns and rows
making the comparison easier.
C. Graphical Presentation - This makes use of graphs. This form is the most
effective means of organizing and presenting statistical data because the
important relationships are brought out more clearly and creatively in
visually solid and colourful figures.
1.2 Different Kinds of Graphs/Charts
1. Line Graph- It shows relationships between two sets of quantities.
This is done by plotting the points of the X set of quantities along the
horizontal axis against the Y set of quantities along the vertical axis
in a rectangular coordinate plane. The plotted points are then
connected by line segments, which form the line graph. It is
used to predict growth trends over a longer period of time.
Sample Line Graph
[Figure: line graph of the number of daily confirmed COVID-19 cases in India, Singapore, Indonesia, Philippines, Japan, Malaysia, Greater China, South Korea, Thailand, Vietnam, and Taiwan, with each peak marked]
2. Bar Graph- This consists of bars or rectangles of equal widths, drawn
either vertically or horizontally, segmented or non-segmented. Two
or more sets of information can be compared by showing them in multiple
bar graphs, each of which is shaded with a different color to
distinguish it from the others.
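The idea of equal-width bars whose lengths follow the data can be sketched in plain text. This is only an illustration: the function `text_bar_graph`, its `scale` parameter, and the enrolment figures are hypothetical and not taken from the module.

```python
def text_bar_graph(data, scale=1):
    """Build a horizontal bar graph as text; one '#' stands for `scale` units."""
    width = max(len(label) for label in data)
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / scale)
        lines.append(f"{label:<{width}} | {bar} {value}")
    return "\n".join(lines)


# Hypothetical enrolment per grade level.
enrolment = {"Grade 7": 40, "Grade 8": 35, "Grade 9": 30}
print(text_bar_graph(enrolment, scale=5))
# Grade 7 | ######## 40
# Grade 8 | ####### 35
# Grade 9 | ###### 30
```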
[Figure: bar graph, "Coronavirus outbreak in Southeast Asia", showing cumulative reported cases and daily increase in cases from February to April]
3. Circle Graph or Pie Chart- It represents relationships of the different
components of a single total as revealed in the sectors of a circle.
The angles or sizes of the sectors are proportional to the percentage
components of the data, which total 100%.
[Figure: circle graph with sectors NCR, Luzon*, Visayas, Mindanao, Repatriate, and No Province]
Distribution of all Covid-19 cases in the Philippines. NCR accounts for 54.5% of
all cases, while the rest of Luzon accounts for 12.8%. Visayas accounts for
16.8% of cases while Mindanao accounts for 2.8% of all cases. There are 1,105
cases (4.9%) classified as repatriates, while 1,855 cases or 8.3% are currently
uncategorized (i.e. it is not indicated the region of residence of the Covid-19
case, and it is not indicated if the case is a repatriate).
Source: https://www.up.edu.ph/covid-19-forecasts-in-the-philippines-ncr-and-cebu-as-of-june-8-2020/
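The percentages quoted in the caption above can be converted into sector angles, since each sector takes its share of the full 360 degrees. The sketch below is a minimal illustration. The function name `sector_angles` is hypothetical, and because the published percentages add up to 100.1% (presumably from rounding), each share is divided by the actual total rather than by 100.

```python
def sector_angles(shares):
    """Convert component shares into circle-graph angles that sum to 360 degrees."""
    total = sum(shares.values())
    return {label: value / total * 360 for label, value in shares.items()}


# Percentages as quoted in the caption of the sample circle graph.
case_shares = {
    "NCR": 54.5,
    "Rest of Luzon": 12.8,
    "Visayas": 16.8,
    "Mindanao": 2.8,
    "Repatriates": 4.9,
    "Uncategorized": 8.3,
}

for label, angle in sector_angles(case_shares).items():
    print(f"{label:<15}{angle:6.1f} degrees")
```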
Figure 5. Favourite movie genres in Mrs. Smyth's Film class
[Figure: circle graph with sectors Comedy, Action, Romance, Drama, Horror, Foreign, and Science fiction]
Source: https://slideplayer.com/slide/5781935/
4. Picture Graph or Pictogram- It is a visual presentation of statistical
quantities by means of drawing pictures or symbols related to the
subject under study. Sizes and magnitudes of drawn pictures should
be clear enough to depict differences.
Pictograph
Figure 1. Number of students who like chocolate chip cookies best
5. Map Graph or Cartogram- It is used to present geographical data. This
kind of graph is always accompanied by a legend which tells us the
meaning of lines, colors, or other symbols used and positioned in a
map.
Source: https://ourworldindata.org/world-population-growth
Source: https://stories.thinkingmachin.es/philippine-languages/
6. Scatter Point Diagram- It is a graphical device to show the
relationship between two quantitative variables.
[Figure: Scatter Plot - Positive Correlation; horizontal axis: Calories Consumed; vertical axis: Weight gained]
Source: https://www.qimacros.com/scatter-plot-excel/scatter-plot-examples/
THINK!
Activities:
Read and interpret the contents of the sample graphs above.
Research other sample graphs and give a brief interpretation
of each.
Construct the most appropriate graph for each data set. Describe and
interpret the data using graphs.
1. Monthly budget of a family with an income of P23,000 per month
Category        Amount
Food            P8,000
Shelter         P5,000
Education       P4,000
Clothing        P1,000
Medical Care    P2,000
Savings         P1,000
Miscellaneous   P2,000
2. Periodic grades of a fourth-year high school student in three subjects
Subject        Grading Period
               First   Second   Third   Fourth
English        80      4        85      90
Mathematics    82      85       86      89
Science        78      80       2       82
MODULE SUMMARY
In Module I, you have learned about the study of statistics. The five
lessons are basic concepts on statistics, determining sample size, tools in
gathering data, criteria for data gathering, and organization and presentation
of data.
There are five lessons in Module I. Lesson 1 consists of the meaning,
functions, and branches/fields of Statistics as well as the kinds of tests in
statistics. Statistics describes a group in terms of what is average and in
terms of dispersion. It also determines the existence of relationships and
differences. It is functional in all fields and in all agencies of the
government. The two types of Statistics are descriptive and inferential, and
the kinds of tests are parametric and nonparametric. The sources of data are
primary (a direct witness to the event), secondary (information furnished by a
person who was not a direct observer of or participant in the event), and
documentary (data obtained from records of offices, hospitals, etc.). The
scales of measurement are nominal, ordinal, interval and ratio.
Lesson 2 deals with determining sample size, which includes the meaning of
population and sample and the computation of sample size using the Slovin and
Lynch formulas. It also deals with sampling techniques.
Lesson 3 covers the tools in gathering data and their advantages and
disadvantages. These tools are the interview, questionnaire, documents or
records, observation, test and experiment.
Lesson 4 considers the criteria for data gathering such as validity and
reliability. Computation of reliability coefficients using the various tools is
also considered. The tools for reliability are the test-retest method, the
split-half method, Kuder-Richardson, Cronbach's alpha and others. The
reliability calculator can also be used to compute reliability.