Statistics Lecture

STATISTICS

Introduction

Statistics has proved to be a very powerful tool in almost all fields of work.

- It is found in research, education, business, politics, psychology, and even in simple events that need analysis.

- It is very useful in recording facts about people, objects, and events, and in making predictions and decisions based on the available data.

- It plays a great role as a tool in gathering opinions from a survey.

- In business, a company selling a certain brand of product can innovate in promotion, product quality, or price once a survey shows negative feedback from consumers.

- It demands answers to questions that were formulated from an existing situation or environment.

- The validity, reliability, and accuracy of these answers depend on the proper conduct of the five methods in statistics: 1. collection, 2. organization, 3. presentation, 4. analysis, and 5. interpretation of data.

Origin and development of statistics

Statistics can be traced back to Biblical times in ancient Egypt, Babylon, and Rome.

- As early as 3,500 years before the birth of Christ, it was used in Egypt to record the number of sheep or cattle, the amount of grain produced, and the number of people living in a particular city.

- 3800 B.C.: the Babylonian government used it to measure the number of men under a king’s rule and the vast territory that he occupied.

- 700 B.C.: the Roman Empire used statistics by conducting registrations to record the population for the purpose of collecting taxes.

In modern times, statistical methods are

- used to record and predict birth and death rates,

- employment and inflation rates,

- sports achievements and other economic and social trends

- used to assess opinions from polls and unlock secret codes from a game of chance.

Modern statistics is said to have begun with John Graunt (1620 – 1674), an English
tradesman.

- He published records called “bills of mortality” that included information about the numbers and causes of deaths in the city of London.

- He analyzed more than 50 years of data and created the first mortality table, which shows how long a person may be expected to live after reaching a certain age.

Carl Friedrich Gauss (1777 – 1855), a brilliant German mathematician, made predictions about the positions of the planets in our solar system.

Adolphe Quetelet (1796 – 1874), a Belgian astronomer, developed the idea of the “average man” from his studies of the Belgian census. He is called the “Father of Modern Statistics”.

Karl Pearson (1857 – 1936), an English mathematician, made important links between probability and statistics.

Sir Ronald Aylmer Fisher, a British statistician, developed the F-test in inferential statistics. His tool was very useful in testing improvements of production from agricultural experiments and in improving the precision of results from medical, biological, and industrial experimentation.

George Gallup (1901 – 1984) was instrumental in making statistical polling a common tool in political campaigns.

In this age of information technology, many computer programs such as Microstat, Soritec Sampler, SPSS, and others are available on diskettes or websites and do more than the manual calculations in statistics.

People working in government agencies, laboratories, media, and business generally use these electronic tools to easily access data, improve graphics, and obtain ready-made analyses and interpretations of the data.

Statistics - is an art and science that deals with the collection, organization, creative presentation,
analysis, and interpretation of data.

Uses of Statistics

1. Education – assess students’ performance and correlate factors affecting teaching and
learning processes to improve quality of education.

2. Psychology – to determine attitudinal patterns, the causes and effects of misbehavior.

3. Business and economics – to analyze a wide range of data like sales, outputs, price
indices, revenues, costs, inventories, accounts, etc.

4. Research and experimentation – to validate or test a claim or inference about a group of people or objects, or a series of events.

Fields of Statistics

1. Descriptive statistics – concerned with the methods of collecting, organizing, and presenting data appropriately and creatively to describe or assess group characteristics.

- is the method of collecting and presenting data. It includes the computation of measures
of central tendency, measures of central location, likewise the measures of dispersion or
variability. It also includes the construction of tables and graphs.

2. Inferential statistics – concerned with inferring or drawing conclusions about the population based on selected elements of that population.
- requires a higher degree of critical judgment and advanced mathematical models, using the different statistical tools, both parametric and nonparametric tests. It is concerned with the analysis and interpretation of data in order to draw conclusions and generalizations from organized data. It also includes testing for significant relationships between the dependent and the independent variables, as well as for significant differences between and among independent samples.
In research, the Descriptive Normative Approach is concerned with the percentage
distribution of respondents, average or typical characteristics of the group, the homogeneity or
heterogeneity of characteristics and degree of relationships of group characteristics.

Most common statistical tool under descriptive statistics

1. Measures of location – (mean, median, mode, quartiles, deciles, percentiles)

2. Measures of variability – (range, variance, standard deviation, coefficient of variability)

3. Measures of shape – (skewness and kurtosis)

Most common statistical tool under inferential statistics

1. Normal distribution – (area under the curve)

2. Sampling distribution – (sample size, standard scores)

3. Probability distribution – (a priori, a posteriori, binomial, Bernoulli, geometric, hypergeometric)

4. Estimation – (confidence interval, test of significance, alpha/beta errors)

5. Hypothesis testing – (Z-test, T-test, Chi-Square test, F – test or Analysis of variance)

Constant and Variables

Constants – refers to the fundamental quantities that do not change in value. E.g. fixed costs and
acceleration

Variables – quantities that may take any one of a specified set of values. They are classified as

1. Qualitative (categorical) and 2. Quantitative (numerical) variables.

1. Qualitative variables – nonmeasurable characteristics that cannot assume a numerical value but can be classified into two or more categories.

E.g. Gender – a dichotomous variable, since an individual may take one of two values (male or female)

Smoking habits – (Always/Very often, Often, Seldom, Very seldom, or Never)

2. Quantitative variables – quantities that can be counted, measured with a measuring device, or calculated with a mathematical formula.

Classification of quantitative variables/types of measurement

1. Discrete – consists of variates (actual values) usually obtained by counting.

E.g. number of students enrolled in a semester
- Discontinuous or discrete data are measurements expressed in whole units, such as counts of people, objects, cars passing by, houses, students, workers, and so on.

2. Continuous – obtained by measurement, usually with units, such as height in meters, weight in kilograms, and time in minutes.
- These are measures expressed in units like feet, pounds, kilos, minutes, and meters. These kinds of data can be measured to varying degrees of precision.

Cause and effect – in a cause-and-effect relationship, the cause is called the independent (exogenous) variable and the effect is called the dependent (endogenous) variable.

Data and information

The process of using statistics always begins with a question. “Who will probably become the next
president?”
When a question like this has been asked, the next step is to collect information about the subject. The kind of information we get is called data, and the people who collect, organize, and analyze the data are called researchers.

Data – facts concerning things, such as the status in life of people, the defectiveness of objects, or the effect of an event on society.

Information – a set of data that has been processed and presented in a form suitable for human interpretation, usually for the purpose of revealing trends or patterns about the population.

Sources of Data

1. Primary source – first-hand information, usually obtained by means of personal interview and actual observation.

2. Secondary source – information taken from others’ works, news reports, readings, and records such as those kept by the NSO.

Scales of measuring data

A good questionnaire should contain questions that are arranged in logical order and as
much as possible in a checklist type.

Classification of scales of measurement

1. Nominal scale – classifies objects or people’s responses so that all of those in a single category are equal with respect to some attribute; each category is then coded numerically. E.g. marital status: single – 1, married – 2, separated – 3, widowed – 4.

2. Ordinal scale – classifies objects or individuals’ responses according to degree or level; each level is then coded numerically. E.g. customer service: excellent – 1, very satisfactory – 2, satisfactory – 3, fair – 4, poor/needs improvement – 5.

3. Interval scale – refers to quantitative measurements in which lower and upper limits are adopted to classify the relative order and differences of item numbers or actual scores. E.g. households’ socioeconomic status is classified based on the income level and age bracket they belong to.

4. Ratio scale – takes into account the interval size and ratio of two related quantities,
which are usually based on a standard measurement. E.g. weights, time, height, rate of change in
production.

Methods of collecting data: its advantages and disadvantages

1. Direct or interview method – a person-to-person interaction between an interviewer and an interviewee. Tape-recorded or written interviews help the researcher obtain exact information from the interviewee.

Advantage – precise and consistent answers can be obtained.

Disadvantage – it consumes time, money, and effort, and it is applicable only to a small population.

2. Indirect or questionnaire method – an alternative to the interview method. Written responses are obtained by distributing questionnaires to the respondents by mail or by hand.

Advantage – less time, money, and effort are consumed.

Disadvantage – many responses may be inconsistent due to poor construction of the questionnaire.

3. Registration method – enforced by private organizations or government agencies for recording purposes.

Advantage – organized data from an institution can serve as ready references for future studies, personal claims, or people’s records.

Disadvantage – problems arise when an agency does not have an MIS or when the system or process of registration is not implemented well.

4. Observation method – a scientific method of investigation that makes possible use of all the senses to measure or obtain outcomes/responses from the object of study.

Advantage – can be applied to respondents that cannot be asked or need not speak, and to the study of the culture of an organization.

Disadvantage – subjectivity of the information sought cannot be avoided.

Population and sample

Population – a finite or infinite collection of objects, events, or individuals with a specified class or characteristics under consideration, such as students in a certain school.

Slovin’s formula for determining the sample size

$$n = \frac{N}{1 + N e^{2}}$$

where n = sample size, N = population size, and e = margin of error.
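The sample-size formula above can be tried out with a few lines of Python. This is only a minimal sketch: the function name `slovin_sample_size`, the choice to round up to a whole respondent, and the figures N = 800 and e = 0.05 are assumptions made for illustration, not part of the lecture.

```python
import math


def slovin_sample_size(population_size: int, margin_of_error: float) -> int:
    """Return n = N / (1 + N * e**2), rounded up to a whole respondent."""
    if population_size <= 0 or not 0 < margin_of_error < 1:
        raise ValueError("need N > 0 and 0 < e < 1")
    raw = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(raw)


# Hypothetical example: a school of 800 students and a 5% margin of error.
n = slovin_sample_size(800, 0.05)
print(n)  # 800 / (1 + 800 * 0.0025) = 266.67, rounded up to 267
```

Rounding up rather than down is a design choice here: it keeps the sample from falling below the size the formula asks for.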

Law of Large Numbers

- “The larger the size of the sample, the more certain we can be that the sample mean will be a good estimate of the population mean.” The larger the size of the sample, the closer its characteristics will be to the characteristics of the entire population.

Who will probably become the next president?

Complete enumeration or census taking – its results are used as benchmarks or reference points for current statistics and as the sampling frame for most current sample surveys.

Random and Non – random sampling

Random sampling – the most commonly used sampling technique, in which each member of the population is given an equal chance of being selected in the sample.
- also called fair sampling

Non-random sampling – a method of collecting a small portion of the population in which not all members of the population are given the chance to be included in the sample.
- also called biased sampling

Properties of Random Sampling

1. Equiprobability – means that each member of the population has an equal chance of being selected and included in the sample.

2. Independence – means that the chance of one member being drawn does not affect the chance of any other member. E.g. in conducting a study on the product preference of customers, the choice of one member of the family cannot be assumed to be the choice of the entire family.

Two kinds of random sampling

1. Restricted random sampling – involves certain restrictions intended to improve the validity of the sampling. This design is applicable only when the population being investigated requires homogeneity.

2. Unrestricted random sampling – considered the best random sampling design because no restrictions are imposed and every member of the population has an equal chance of being included in the sample.

Sampling Techniques

a. Random sampling techniques

1. Lottery or fishbowl sampling – done by simply writing the names or numbers of all the
members of the population in small rolled pieces of paper which are later placed in a container.
This is usually done in a lottery.

2. Sampling with the use of Table of random numbers – if the population is large, a more
practical procedure is the use of Table of Random Numbers which contains rows and columns of
digits randomly ordered by a computer.

3. Systematic sampling – done by taking every kth element in the population. It applies to a group of individuals arranged in a waiting line or in some methodical manner. The sampling interval is

$$k = \frac{N}{n}$$

where N is the population size and n is the sample size.

4. Stratified random sampling – when the population can be partitioned into several strata or subgroups, it may be wiser to employ the stratified technique to ensure that each group is represented in the sample. Random samples are selected from each stratum.
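Systematic sampling, described in the list above, can be sketched in Python as follows. The function name `systematic_sample`, the random starting point within the first k elements, and the numbered list of 800 students are assumptions made for illustration; the notes only state that every kth element is taken with k = N/n.

```python
import random


def systematic_sample(population, sample_size, rng=random):
    """Take every k-th element, k = N // n, from a random start among the first k."""
    k = len(population) // sample_size
    if k < 1:
        raise ValueError("sample_size cannot exceed the population size")
    start = rng.randrange(k)  # assumed: random start so early elements are not favored
    return population[start::k][:sample_size]


# Hypothetical population: students numbered 1 to 800, sample of 200, so k = 4.
students = list(range(1, 801))
sample = systematic_sample(students, 200, random.Random(0))
print(len(sample), sample[:5])
```

Passing a seeded `random.Random` makes the draw repeatable, which is convenient when a sample has to be documented.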

b. Kinds of stratified random sampling

1. Simple stratified random sampling – when the population is grouped into more or less homogeneous classes, that is, different groups with a relatively common characteristic, each can be sampled independently by taking an equal number of elements from each stratum.

               Population    Sample
Fourth yr         185          50
Third yr          200          50
Second yr         215          50
First yr          200          50
Total           N = 800      n = 200

2. Stratified proportional random sampling – used when the proportions of the subgroups in the population are grossly unequal. The researcher may wish to maintain these proportions in the sample with the use of the stratified proportional technique.

               Population    Proportion    Sample
                N = 800          %         n = 200
Fourth yr         120           15            30
Third yr          200           25            50
Second yr         220           27.5          55
First yr          260           32.5          65
Total             800          100           200
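The proportional allocation in the table above can be reproduced with a short Python sketch. The function name `proportional_allocation` is hypothetical, and simple rounding is an assumption: with other figures the rounded counts may not add up exactly to the target sample size and would need a manual adjustment.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Give each stratum a share of the sample equal to its share of the population."""
    total = sum(strata_sizes.values())
    return {
        name: round(sample_size * size / total)
        for name, size in strata_sizes.items()
    }


# Year-level populations from the table above (N = 800, n = 200).
strata = {"Fourth yr": 120, "Third yr": 200, "Second yr": 220, "First yr": 260}
allocation = proportional_allocation(strata, 200)
print(allocation)
```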

Multi-stage and multiple sampling – uses several stages or phases in getting the sample from the population. However, selection of the sample is still done at random.

- This method is an extension or a multiple application of the stratified random sampling technique. The number of stages depends on the size of the population and the sample size needed in the survey.

Non- random sampling techniques

1. Judgment or purposive sampling – also referred to as non-random or non-probability sampling. It plays a major role in the selection of a particular item and/or in making decisions in cases of incomplete responses or observations. It is usually based on certain criteria laid down by the researcher or his adviser.

2. Quota sampling – this is a quick and inexpensive method to operate since the choice of
the number of persons or elements to be included in a sample is done at the researcher’s own
convenience or preference and is not predetermined by some carefully operated randomizing plan.

3. Cluster sampling – referred to as area sampling because it is usually applied on a geographical basis. The population is grouped into clusters or small units, e.g. blocks or districts in a city or municipality.

4. Incidental sampling- applied to those samples which are taken because they are the most
available. The investigator simply takes the nearest individuals as subjects of the study until it
reaches the desired size.

5. Convenience sampling – widely used in television and radio programs to find out
opinions of TV viewers and listeners regarding a controversial issue.

Organization and presentation of data

Forms of presentation of data

1. Textual – this form of presentation combines text and numerical facts in a statistical
report.
2. Tabular – this form of presentation is better than the textual form because it provides numerical facts in a more concise and systematic manner. Statistical tables are constructed to facilitate the analysis of relationships.

Advantages of tabular presentation


1. It is brief; it reduces the matter to the minimum.

2. It provides the reader a good grasp of the meaning of the quantitative relationship
indicated in the report.

3. It tells the whole story without the necessity of mixing textual matter with figures.

4. The systematic arrangement of columns and rows makes them easily read and readily
understood.
5. The column and rows make comparison easier.

3. Graphical presentation – the most effective means of organizing and presenting statistical data because the important relationships are brought out more clearly and creatively in solid, colorful figures.

Different kinds of graphs/charts

1. Line graph- it shows relationships between two sets of quantities. This is done by
plotting point of X set of quantities along the horizontal axis against the Y set of quantities along
the vertical axis in a Cartesian coordinate plane. It is often used to predict growth trends for a
longer period of time.

2. Bar graph – consists of bars or rectangles of equal widths, either drawn vertically or
horizontally, segmented or non-segmented.

3. Circle graph or pie chart – represents relationships of the different components of a single total as revealed in the sectors of a circle. The angles or sizes of the sectors should be proportional to the percentage components of the data, which give a total of 100%.

4. Picture graph or pictogram – a visual presentation of statistical quantities by means of drawing pictures or symbols related to the subject under study.

5. Map graph or cartogram – one of the best ways to present geographical data. This kind
of graph is always accompanied by a legend which tells us the meaning of the lines, colors, or
other symbols used and partitioned in a map.

6. Scatter point diagram – a graphical device to show the degree of relationship between
two quantitative variables. The plotted points for every pair of X and Y set of quantities are not
connected by line segments but are simply scattered on the Cartesian coordinate plane.

Frequency Distribution

Frequency distribution – a tabulation or grouping of data into appropriate categories, showing the number of observations in each group or category.

E.g.    5   13    6   13   10
        8   12   15   10   12
       11   15   12    7   15

The numbers shown above are called raw data.

Parts of Frequency Table

1. Class Limits / integral limits – groupings or categories defined by lower and upper
limits.

E.g. 16 – 20
21 – 25
26 – 30
Lower class limits are the smallest numbers that belong to the different classes.
Upper class limits are the highest numbers that belong to the different classes.

2. Class size – the width of each class interval.

L.L.   U.L.
16     20 }  class size = 5
21     25 }

3. Class boundaries / real limits – the numbers used to separate classes without the gaps created by class limits. The number to be added or subtracted is half the difference between the upper limit of one class and the lower limit of the next class.

E.g.       C.L.               C.B.
       L.L.   U.L.      L.C.B.   U.C.B.
       16     20        15.5     20.5
       21     25        20.5     25.5
       26     30        25.5     30.5
       31     35        30.5     35.5

4. Class marks – the midpoints of the classes. They are found by adding the lower and upper limits and then dividing by 2.

E.g.   C.L.        Class mark (x)
       16 – 20          18
       21 – 25          23
       26 – 30          28
       31 – 35          33
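The class boundaries and class marks listed above follow directly from the class limits, as this small Python sketch shows. The helper names `class_boundaries` and `class_mark` and the `gap` parameter (the distance between one class's upper limit and the next class's lower limit, 1 for whole-number limits) are assumptions introduced for illustration.

```python
def class_boundaries(lower, upper, gap=1):
    """Widen the class limits by half the gap between adjacent classes."""
    half = gap / 2
    return lower - half, upper + half


def class_mark(lower, upper):
    """Midpoint of a class: (lower limit + upper limit) / 2."""
    return (lower + upper) / 2


# Class limits from the example tables above.
classes = [(16, 20), (21, 25), (26, 30), (31, 35)]
for low, high in classes:
    print((low, high), class_boundaries(low, high), class_mark(low, high))
```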

Steps in constructing a frequency distribution table.

1 . Find the range of the values.

Range = highest value – lowest value


E.g. R = 47 – 12
R = 35
2. Find the class interval – divide the range by 10 and by 20 so that the number of classes is not less than 10 and not more than 20, provided that the classes cover the total range of the observations. To illustrate, the range 35 divided by 10 equals 3.5, and divided by 20 equals 1.75. A class interval of 3, which lies between these two values, gives 13 classes. The rule says that we should prefer not fewer than 10 and not more than 20 classes, and the ideal number of classes is 12 – 14.

3. Set up the classes – add c/2, where c is the class interval, to the highest score to get the upper limit of the highest class, and subtract c/2 from the highest score to get its lower limit. For instance, the highest score 47 plus 3/2 = 1.5 equals 48.5, and 47 minus 1.5 equals 45.5. The highest class therefore runs from 45.5 to 48.5. Limits set this way are called real limits or exact limits, and are sometimes spoken of as class boundaries. Once the highest class is set, subtract the class interval of 3 to get each next lower class until you reach the lowest score.
There are two ways of setting classes, namely real limits and integral limits. The latter are obtained by adding 0.5 to the lower real limit of a class interval and subtracting 0.5 from the upper real limit. For instance, the highest class is 45.5 to 48.5 in real limits and 46 to 48 in integral limits.
E.g. Setting of classes in real and integral limits
Real limits / C.B.     Integral limits / C.L.
45.5 – 48.5            46 – 48
42.5 – 45.5            43 – 45
39.5 – 42.5            40 – 42
36.5 – 39.5            37 – 39
33.5 – 36.5            34 – 36
30.5 – 33.5            31 – 33

4. Tally the scores – having adopted a set of classes, we are ready to tally. Locate each score within its proper class and tally it. After tallying, count the number of tallies in each class and write it in the frequency (f) column.

The tally should be carefully checked to see that the sum equals the total number of scores in the sample. If the sum differs from the sample size, the tallying should be repeated. At the bottom of column 4, the symbol N or ∑f, in which ∑ (capital Greek sigma) stands for the “sum of”, equals 35, the total number of cases (N).

Frequency distribution of the Fish 311 that were listed below.

Real limits      Integral limits    Tally       Frequency
45.5 – 48.5      46 – 48            |           1
42.5 – 45.5      43 – 45            |           1
39.5 – 42.5      40 – 42            ||          2
36.5 – 39.5      37 – 39            |||         3
33.5 – 36.5      34 – 36            |||         3
30.5 – 33.5      31 – 33            ||||        4
27.5 – 30.5      28 – 30            ||||| ||    7
24.5 – 27.5      25 – 27            ||||        4
21.5 – 24.5      22 – 24            |||         3
18.5 – 21.5      19 – 21            ||          2
15.5 – 18.5      16 – 18            ||          2
12.5 – 15.5      13 – 15            |           1
9.5 – 12.5       10 – 12            |           1
                                                35 (N or ∑f)
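The tallying step can be illustrated with a short Python sketch. It uses the 15 raw scores shown earlier under Frequency Distribution; the function name `frequency_table` and the four classes of width 3 are hypothetical choices made only to keep the example small.

```python
def frequency_table(scores, class_limits):
    """Count how many scores fall inside each (lower, upper) integral class limit."""
    table = {limits: 0 for limits in class_limits}
    for score in scores:
        for lower, upper in class_limits:
            if lower <= score <= upper:
                table[(lower, upper)] += 1
                break
        else:
            # No class contained the score, so the classes do not cover the range.
            raise ValueError(f"{score} is outside every class")
    return table


# Raw data from the earlier example; the classes below are assumed for illustration.
scores = [5, 13, 6, 13, 10, 8, 12, 15, 10, 12, 11, 15, 12, 7, 15]
limits = [(5, 7), (8, 10), (11, 13), (14, 16)]
table = frequency_table(scores, limits)
print(table, sum(table.values()))
```

Checking that the frequencies add up to the number of scores is the same check the notes recommend after tallying by hand.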

Cumulative frequency distribution

The “less than” cumulative frequency distribution (<cf) is obtained by adding the frequencies successively from the lowest to the highest class interval, while the “more than” cumulative frequency distribution (>cf) is obtained by adding the frequencies from the highest class interval down to the lowest.

Class interval     f     <cf    >cf
2 – 4              3      3     60
5 – 7              9     12     57
8 – 10            14     26     48
11 – 13           18     44     34
14 – 16           10     54     16
17 – 19            6     60      6
               n = 60
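Both cumulative columns in the table above are running totals, which Python's `itertools.accumulate` computes directly. This is a minimal sketch using the table's frequencies; the variable names are assumptions.

```python
from itertools import accumulate

# Frequencies listed from the lowest class (2 - 4) to the highest (17 - 19).
frequencies = [3, 9, 14, 18, 10, 6]

# "Less than" cf: running total from the lowest class upward.
less_than_cf = list(accumulate(frequencies))

# "More than" cf: running total from the highest class downward,
# then flipped back so it lines up with the original class order.
more_than_cf = list(accumulate(reversed(frequencies)))[::-1]

print(less_than_cf)
print(more_than_cf)
```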
Relative frequency distribution
The relative frequency of a class is its frequency divided by the total frequency of all classes, and is generally expressed as a percentage.

$$\text{Relative frequency} = \frac{\text{frequency of the class interval}}{\text{total number of observations}}$$

Class interval     f     rf       rf (%)    <rf (%)    >rf (%)
2 – 4              3     0.05      5          5        100
5 – 7              9     0.15     15         20         95
8 – 10            14     0.233    23.3       43.3       80
11 – 13           18     0.30     30         73.3       56.7
14 – 16           10     0.167    16.7       90         26.7
17 – 19            6     0.10     10        100         10
               n = 60
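The relative and cumulative relative frequencies can be checked with a few lines of Python. This is a sketch using the frequencies 3, 9, 14, 18, 10, 6 from the table; the variable names and the rounding to one decimal place are assumptions.

```python
from itertools import accumulate

frequencies = [3, 9, 14, 18, 10, 6]  # lowest class first
n = sum(frequencies)

# Relative frequency of each class, as a percentage.
rf_percent = [round(100 * f / n, 1) for f in frequencies]

# Cumulative percentages are computed from cumulative counts to avoid
# piling up rounding error from the individual percentages.
less_counts = list(accumulate(frequencies))
more_counts = list(accumulate(reversed(frequencies)))[::-1]
less_rf_percent = [round(100 * c / n, 1) for c in less_counts]
more_rf_percent = [round(100 * c / n, 1) for c in more_counts]

print(n, rf_percent)
print(less_rf_percent)
print(more_rf_percent)
```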

Critical Values of the t Distribution (t-Test)

(v = degrees of freedom; two-tailed level of significance)

V 0.05 0.01
1 12.706 63.657
2 4.303 9.925
3 3.182 5.841
4 2.776 4.604
5 2.571 4.032

6 2.447 3.707
7 2.365 3.499
8 2.306 3.355
9 2.262 3.250
10 2.228 3.169

11 2.201 3.106
12 2.179 3.055
13 2.160 3.012
14 2.145 2.977
15 2.131 2.947

16 2.120 2.921
17 2.110 2.898
18 2.101 2.878
19 2.093 2.861
20 2.086 2.845

21 2.080 2.831
22 2.074 2.819
23 2.069 2.807
24 2.064 2.797
25 2.060 2.787

26 2.056 2.779
27 2.052 2.771
28 2.048 2.763
29 2.045 2.756
30 2.042 2.750
∞ 1.960 2.576
THE Z –TEST

The tabular values of the z-test at the .01 and .05 levels of significance

Test              Level of Significance
                  .01         .05
One – tailed      ±2.33       ±1.645
Two – tailed      ±2.575      ±1.96
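The tabular z values above are quantiles of the standard normal distribution, so they can be recomputed with Python's standard library (`statistics.NormalDist`, available from Python 3.8). The helper name `z_critical` is hypothetical; the sketch is only a cross-check of the table, not a replacement for it.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed=True):
    """Positive critical z value for a given level of significance."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail)


for alpha in (0.01, 0.05):
    print(alpha,
          round(z_critical(alpha, two_tailed=False), 3),
          round(z_critical(alpha, two_tailed=True), 3))
```

The computed two-tailed value at .01 is about 2.576, which printed tables often round to 2.575 or 2.58.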

CRITICAL VALUE OF THE F- DISTRIBUTION

f0.05 (v1 v2)

V1
V2 1 2 3 4 5 6 7 8 9
1 161.4 199.5 215.7 224.6 230.2 234.0 236.8 238.9 240.5
2 18.51 19.00 19.16 19.25 19.30 19.33 19.35 19.37 19.38
3 10.13 9.55 9.28 9.12 9.01 8.94 8.89 8.85 8.81
4 7.71 6.94 6.59 6.39 6.26 6.16 6.09 6.04 6.00

5 6.61 5.79 5.41 5.19 5.05 4.95 4.88 4.82 4.77


6 5.99 5.14 4.76 4.53 4.39 4.28 4.21 4.15 4.10
7 5.59 4.74 4.35 4.12 3.97 3.87 3.79 3.73 3.68
8 5.32 4.46 4.07 3.84 3.69 3.58 3.50 3.44 3.39
9 5.12 4.26 3.86 3.63 3.48 3.37 3.29 3.23 3.18

10 4.96 4.10 3.71 3.48 3.33 3.22 3.14 3.07 3.02


11 4.84 3.98 3.59 3.36 3.20 3.09 3.01 2.95 2.90
12 4.75 3.89 3.49 3.26 3.11 3.00 2.91 2.85 2.80
13 4.67 3.81 3.41 3.18 3.03 2.92 2.83 2.77 2.71
14 4.60 3.74 3.34 3.11 2.96 2.85 2.76 2.70 2.65

15 4.54 3.68 3.29 3.06 2.90 2.79 2.71 2.64 2.59


16 4.49 3.63 3.24 3.01 2.85 2.74 2.66 2.59 2.54
17 4.45 3.59 3.20 2.96 2.81 2.70 2.61 2.55 2.49
18 4.41 3.55 3.16 2.93 2.77 2.66 2.58 2.51 2.46
19 4.38 3.52 3.13 2.90 2.74 2.63 2.54 2.48 2.42

20 4.35 3.49 3.10 2.87 2.71 2.60 2.51 2.45 2.39


21 4.32 3.47 3.07 2.84 2.68 2.57 2.49 2.42 2.37
22 4.30 3.44 3.05 2.82 2.66 2.55 2.46 2.40 2.34
23 4.28 3.42 3.03 2.80 2.64 2.53 2.44 2.37 2.32
24 4.26 3.40 3.01 2.78 2.62 2.51 2.42 2.36 2.30

25 4.24 3.39 2.99 2.76 2.60 2.49 2.40 2.34 2.28


26 4.23 3.37 2.98 2.74 2.59 2.47 2.39 2.32 2.27
27 4.21 3.35 2.96 2.73 2.57 2.46 2.37 2.31 2.25
28 4.20 3.34 2.95 2.71 2.56 2.45 2.36 2.29 2.24
29 4.18 3.33 2.93 2.70 2.55 2.43 2.35 2.28 2.22

30 4.17 3.32 2.92 2.69 2.53 2.42 2.33 2.27 2.21


40 4.08 3.23 2.84 2.61 2.45 2.34 2.25 2.18 2.12
60 4.00 3.15 2.76 2.53 2.37 2.25 2.17 2.10 2.04
120 3.92 3.07 2.68 2.45 2.29 2.17 2.09 2.02 1.96
∞ 3.84 3.00 2.60 2.37 2.21 2.10 2.01 1.94 1.88
Reproduced from Table 18 of Biometrika Tables for Statisticians, Vol. 1, by permission of E.S. Pearson and the Biometrika Trustees

Measures of central tendency – descriptive measures that are used to indicate where the center, the middle property, or the most typical value of a set of data lies.
- any measure indicating the center of a set of data arranged in increasing or decreasing order of magnitude.

Population mean – the average of all the values in the population. It is the sum of all the population values divided by the population size N.

$$\mu = \frac{\sum x}{N}$$

where: µ = population mean
       Σx = sum of the population values
       N = population size

Eg. The number of faculty members in 10 different colleges are: 16, 25, 40, 24, 15, 20, 50, 15, 35,
and 20 . Treating the data as a population, find the population mean of faculty members for the 10
colleges.
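The faculty example above can be worked out in a couple of lines of Python. This is a minimal sketch; the variable names are assumptions.

```python
# Number of faculty members in the 10 colleges, treated as a population.
faculty = [16, 25, 40, 24, 15, 20, 50, 15, 35, 20]

total = sum(faculty)   # sum of the population values
N = len(faculty)       # population size
mu = total / N         # population mean

print(total, N, mu)    # 260 10 26.0
```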

Sample mean – the average of the values in the sample. It is the sum of the sample values divided by the sample size n.

Formula:

$$\bar{x} = \frac{\sum x}{n}$$

E.g. The following are the ages of a sample of 9 children in an urban area: 9, 8, 1, 3, 4, 5, 6, 7, 2

The Median for Ungrouped data


Median – the value found at the middle when the data are arranged in an array from the highest to the lowest or from the lowest to the highest. If there are two middle values, their average is taken.

Example: 9, 8, 7, 6, 5, 4, 3, 2, 1

Solution: Md = 5. The middle value is 5.
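The median example above can be verified with Python's standard `statistics.median`, which sorts the data and averages the two middle values when the count is even. The second list, `even_case`, is a hypothetical data set added only to show the two-middle-values rule.

```python
from statistics import median

values = [9, 8, 7, 6, 5, 4, 3, 2, 1]
md = median(values)          # odd count: the single middle value
print(md)                    # 5

even_case = [1, 2, 3, 4]     # hypothetical data with two middle values
md_even = median(even_case)  # average of 2 and 3
print(md_even)               # 2.5
```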

The Mode for Ungrouped data


Mode – is the value which occurs most often or with greatest frequency

The mean for grouped data


A frequency distribution of the scores in Statistics 01 of 34 BSED students

X            f      d      fd
35 – 39      3     +3      +9
30 – 34      5     +2     +10
25 – 29      8     +1      +8
20 – 24     10      0       0
15 – 19      4     -1      -4
10 – 14      2     -2      -4
5 – 9        2     -3      -6
         n = 34         ∑fd = +13

Steps in getting the sample mean or average
1. Select the assumed mean from any of the 7 class intervals by adding its lower and upper limits and dividing their sum by two (2).
   Am = (20 + 24) / 2 = 44 / 2 = 22
2. Construct the 3rd column (d) for positive and negative deviations.
3. Put zero in column d at the class where you selected the assumed mean, then +1, +2, +3, … above zero and -1, -2, -3, … below zero.
4. Multiply the frequency (f) by the deviation (d), considering the signs, and write the product in column fd.
5. Get ∑fd algebraically.

Solution:

$$\bar{x} = A_m + \left(\frac{\sum fd}{n}\right) i = 22 + \left(\frac{13}{34}\right)(5) \approx 23.91$$

where i = 5 is the class size.
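The assumed-mean steps above can be sketched in Python. The function name `assumed_mean_method` and its argument layout are assumptions; the sketch uses the standard coded-deviation formula, mean = Am + (∑fd / n) × i, with the class size i taken from the integer class limits.

```python
def assumed_mean_method(class_limits, frequencies, assumed_index):
    """Grouped mean via coded deviations; classes are listed highest first."""
    lower, upper = class_limits[assumed_index]
    assumed_mean = (lower + upper) / 2
    width = upper - lower + 1  # class size for whole-number class limits
    n = sum(frequencies)
    # Rows above the assumed class get d = +1, +2, ...; rows below get -1, -2, ...
    sum_fd = sum(f * (assumed_index - i) for i, f in enumerate(frequencies))
    return assumed_mean + (sum_fd / n) * width


limits = [(35, 39), (30, 34), (25, 29), (20, 24), (15, 19), (10, 14), (5, 9)]
freqs = [3, 5, 8, 10, 4, 2, 2]
mean = assumed_mean_method(limits, freqs, assumed_index=3)  # Am = 22
print(round(mean, 2))  # 23.91
```

Choosing a different assumed class changes ∑fd but not the final mean, which is a quick way to check the arithmetic.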

Midpoint method
A frequency distribution of the scores in Statistics 01 of 34 BSED students

Scores       f     Midpoint x’     fx’
35 – 39      3        37           111
30 – 34      5        32           160
25 – 29      8        27           216
20 – 24     10        22           220
15 – 19      4        17            68
10 – 14      2        12            24
5 – 9        2         7            14
         n = 34                ∑fx’ = 813

Steps in solving for the sample mean using the midpoint method.
1. Get the midpoint of every class interval by adding the lower and upper limits and then dividing the sum by 2. Place the results in column x’ (midpoint).
2. Multiply the values of f and x’ and place the products under column fx’.
3. Find ∑fx’ by adding the values in column fx’, then use the formula

$$\bar{x} = \frac{\sum fx'}{n}$$

Solution:

$$\bar{x} = \frac{813}{34} \approx 23.91$$
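The midpoint method can be checked with a short Python sketch that rebuilds the x’ and fx’ columns from the class limits and frequencies in the table. The variable names are assumptions.

```python
limits = [(35, 39), (30, 34), (25, 29), (20, 24), (15, 19), (10, 14), (5, 9)]
freqs = [3, 5, 8, 10, 4, 2, 2]

# Step 1: midpoint of each class.
midpoints = [(low + high) / 2 for low, high in limits]

# Step 2: f times midpoint for each class.
fx = [f * x for f, x in zip(freqs, midpoints)]

# Step 3: sum of fx divided by n.
n = sum(freqs)
sum_fx = sum(fx)
mean = sum_fx / n

print(midpoints)
print(sum_fx, n, round(mean, 2))  # 813.0 34 23.91
```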

The median for grouped data

A frequency distribution of the scores in Statistics 01 of 34 BSED students

Scores (X) f
35 – 39 3
30 – 34 5
25 – 29 8
20 – 24 10
15 – 19 4
10 – 14 2
5–9 2
n= 34
Steps in solving for the median of grouped data:

1. Construct the cumulative frequency (F) column by copying the frequency of the lowest class, which is 2.
2. Add the frequencies going up: 2 + 2 = 4, 4 + 4 = 8, 8 + 10 = 18, 18 + 8 = 26, 26 + 5 = 31, and 31 + 3 = 34.
3. Get half of n: n/2 = 34/2 = 17.
4. Find the median class, the first class whose cumulative frequency reaches n/2; here it is 20 – 24. The F used in the formula is the cumulative frequency just below the median class and should not exceed n/2, so F = 8. The small f is the frequency of the median class, one step higher than F, so f = 10.
5. L is the true lower limit of the median class. Subtract 0.5 from 20, so 20 - 0.5 = 19.5. The class size is i = 5.

L = 19.5
n/2 = 34/2 = 17
F = 8
f = 10
i = 5

$$Md = L + \left(\frac{n/2 - F}{f}\right) i = 19.5 + \left(\frac{17 - 8}{10}\right)(5) = 24$$

Scores (X)     f      F
35 – 39        3     34
30 – 34        5     31
25 – 29        8     26
20 – 24       10     18
15 – 19        4      8
10 – 14        2      4
5 – 9          2      2
           n = 34
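The grouped-median steps can be sketched in Python. The function name `grouped_median` is hypothetical, and the sketch assumes the standard interpolation formula Md = L + ((n/2 − F) / f) × i with whole-number class limits, so the true lower limit is the lower class limit minus 0.5.

```python
def grouped_median(class_limits, frequencies):
    """Median of grouped data; classes must be listed from lowest to highest."""
    n = sum(frequencies)
    half = n / 2
    cumulative = 0  # F: cumulative frequency below the current class
    for (lower, upper), f in zip(class_limits, frequencies):
        if f > 0 and cumulative + f >= half:
            true_lower = lower - 0.5   # L, the true lower limit
            width = upper - lower + 1  # i, the class size
            return true_lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must contain at least one observation")


limits = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
freqs = [2, 2, 4, 10, 8, 5, 3]
md = grouped_median(limits, freqs)
print(md)  # 19.5 + (17 - 8) / 10 * 5 = 24.0
```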

The mode for grouped data

Substitute the values of the mean and the median into the formula

$$Mo = 3\,Md - 2\,\bar{x}$$

A frequency distribution of the scores in educational research of 40 BSED students

Scores (X)   f

54 – 56 3
51 – 53 2
48 – 50 1
45 – 47 5
42 – 44 6
39 – 41 8
36 – 38 4
33 – 35 6
30 – 32 2
27 – 29 3
