Basic Statistics (Pg 88 – 108)
Outline
Introduction
1. Data Collection
I. Definition of data
II. Different types of data
III. Approaches in Data Collection (Census vs Sampling)
IV. Different Methods of Data Collection
2. Organize Data (Tabulate)
I. Frequency Distribution Table
II. Cumulative Frequency Distribution Table.
3. Analyze (Tabulated) Data
Introduction
Why collect data? Why statistics?
▪ Statistics has always been a priority of governments, going back to
historical times.
▪ Records of births, deaths, marriages, income and taxes, enrolments, etc. were,
and still are, common needs of the state.
▪ The civilized world today has expanded the need to collect data to cover
development and planning purposes.
▪ Population explosion is a major factor driving the need to research and
collect data in every aspect of life. This is why statistics is used in so many
fields of study.
Eg. Market research, advertising, economics, insurance, medical
research/health, education, politics, law and many other fields.
1. Collecting Data
Definition
Data: A collection of related facts gathered using one of the 4 data collection
methods, which are covered later.
(Data: plural; Datum: singular)
People collect data on particular objects/persons, or on groups of
objects/persons that share a characteristic.
Two approaches/procedures
CENSUS Vs SAMPLE
CENSUS
A census counts every individual element in the population.
Eg. The PNG National Census; all UPNG HECAS students in 2017; all Milne Bay full-time students
living on campus, etc.
❖ Very costly and time-consuming.
SAMPLE/SAMPLING (the sample must be representative: not too small and not biased)
Take a sample, count the elements in the sample, and then estimate for the
entire population.
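The difference between the two approaches can be sketched in a few lines of Python. This is only an illustration under made-up assumptions: the population of 1000 students, the 300 who live on campus, the sample size of 100 and the random seed are all hypothetical and do not come from the course notes.

```python
import random

# Hypothetical population (assumption for illustration only):
# 1 = student lives on campus, 0 = student does not.
N = 1000
population = [1] * 300 + [0] * 700

# Census: count every individual element in the population.
census_count = sum(population)

# Sample: count only n randomly chosen elements, then scale up
# to estimate the figure for the entire population.
rng = random.Random(2017)  # fixed seed so the run is repeatable
n = 100
sample = rng.sample(population, n)
sample_count = sum(sample)
estimate = sample_count * N / n

print("Census count:   ", census_count)
print("Sample estimate:", estimate)
```

The census gives the exact count but touches all 1000 elements. The sample touches only 100, and its estimate will usually be close to, but not exactly, the census figure.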
Whether taking a Census or a Sample, there are 4 methods of collecting data:
1. Interview
2. Questionnaire
3. Observation
4. Records Review (Archives/Hospitals/Churches/Registry, etc.)
❖ Each one has its own advantages and disadvantages
Two Types of Data
1. Quantitative Data
These data can be expressed as numbers.
a) Discrete: Counting (whole) numbers. Eg. 20 people, etc.
b) Continuous: Measured values that can include decimals. Eg. Weight or height
of people, 70 kg or 1.57 m, etc.
2. Qualitative Data
These data cannot be expressed as numbers.
Eg. Name, Address, Gender, Color, Laws, Acts, Policies.
2. Organizing Data (Tabulate Data)
a. Frequency Distribution Table (FDT)
❑ After collecting the raw data, we tabulate them.
❑ The FDT shows how many times each score occurs.
▪ Eg. An FDT of student ages would give the head count of UPNG students
at each age, from 17 (min) to 50 (max), enrolled this year.
▪ Go through Example 1. Fire Engine Problem (Page )
Example 1
Over a period of 70 days, a fire brigade kept records of the number of times the fire engine was called out
each day. The record is given in Table 1.
Number of times fire engine was called out each day.
0 4 3 1 2 1 2 5 1 3
3 0 0 7 1 6 1 1 4 2
1 5 1 4 4 0 2 1 3 0
2 3 2 2 0 1 3 1 6 1
2 0 1 4 3 1 5 2 3 1
1 2 1 3 3 2 4 0 1 5
4 2 2 1 5 1 3 1 2 3
Table 1: Raw data
To make sense of the raw data in Table 1, we present them in the form of a
Frequency Distribution table.
(a) The frequency distribution below shows tally marks and the corresponding frequencies.
Number of calls per day, x    Tally                        Frequency, f
0                             ///// ///                     8
1                             ///// ///// ///// ///// /    21
2                             ///// ///// ////             14
3                             ///// ///// //               12
4                             ///// //                      7
5                             /////                         5
6                             //                            2
7                             /                             1
Total Frequency (total number of days) = 70
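The tallying can also be checked by computer. The following is a minimal Python sketch, not part of the course notes, that counts the raw data of Table 1 with the standard library's `collections.Counter`; the variable names `calls` and `freq` are chosen here for illustration.

```python
from collections import Counter

# Raw data from Table 1: number of call-outs on each of the 70 days.
calls = [
    0, 4, 3, 1, 2, 1, 2, 5, 1, 3,
    3, 0, 0, 7, 1, 6, 1, 1, 4, 2,
    1, 5, 1, 4, 4, 0, 2, 1, 3, 0,
    2, 3, 2, 2, 0, 1, 3, 1, 6, 1,
    2, 0, 1, 4, 3, 1, 5, 2, 3, 1,
    1, 2, 1, 3, 3, 2, 4, 0, 1, 5,
    4, 2, 2, 1, 5, 1, 3, 1, 2, 3,
]

# Count how many days each value of x occurred (the frequency f).
freq = Counter(calls)

print(f"{'x':>3} {'f':>3}")
for x in sorted(freq):
    print(f"{x:>3} {freq[x]:>3}")
print("Total frequency =", sum(freq.values()))
```

The printed frequencies should match the Frequency column of the table, and the total should equal the number of days recorded.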
Conclusion
From the table we can quickly see what happened. For instance:
o There were 8 days on which NO calls were made;
o There were 21 days on which exactly 1 call was made, etc.
b. Cumulative Frequency Distribution Table (CFDT)
A Cumulative Frequency Distribution table shows, for each upper class boundary,
the number of observations that fall below that boundary. These running totals
are called the Cumulative Frequencies.
The cumulative frequencies are often expressed as percentages.
b) The data in Example 1 can be presented in a Cumulative Frequency Distribution table,
giving:
Number of calls per day, x    Cumulative Frequency    Percentage Cumulative Frequency (%)
<1                             8                       11.4
<2                            29                       41.4
<3                            43                       61.4
<4                            55                       78.6
<5                            62                       88.6
<6                            67                       95.7
<7                            69                       98.6
<8                            70                      100.0
Table 3: Cumulative Frequency Distribution
To fill in the Cumulative Frequency column, add each frequency to the running total of
the frequencies before it; the result is the new cumulative frequency. The last
cumulative frequency should equal the total frequency. The Percentage Cumulative
Frequency is the cumulative frequency expressed as a percentage of the total
frequency.
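The running-total rule can be sketched in Python as follows. This is an illustrative sketch only: it starts from the frequencies of Example 1 and uses the standard library's `itertools.accumulate`; the names `freqs`, `cum_freqs` and `pct` are chosen here and are not from the course notes.

```python
from itertools import accumulate

# Frequencies from Example 1 for x = 0, 1, ..., 7 calls per day.
values = [0, 1, 2, 3, 4, 5, 6, 7]
freqs = [8, 21, 14, 12, 7, 5, 2, 1]

# Running total: each entry is the previous cumulative frequency
# plus the next frequency.
cum_freqs = list(accumulate(freqs))

# The last cumulative frequency is the total frequency.
total = cum_freqs[-1]

# Cumulative frequency as a percentage of the total, to 1 decimal place.
pct = [round(100 * c / total, 1) for c in cum_freqs]

# The row for value x is labelled "<x+1" (the upper class boundary).
for x, c, p in zip(values, cum_freqs, pct):
    print(f"<{x + 1:<3} {c:>3} {p:>6.1f}")
```

Each printed row corresponds to one row of the Cumulative Frequency Distribution table, with the final row reaching the total frequency and 100%.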
Conclusion
Eg. On what percentage of the days were fewer than 6 calls made? Ans: 95.7% (67 of the 70 days).
The CFDT shows the accumulated frequency/count, and then presents the accumulated
frequency as a percentage of the total frequency.
Graphical Representation
Organizing collected raw data into a simple table such as the Frequency Distribution
table or the Cumulative Frequency Distribution table helps a lot in conveying information
about the underlying population. However, a graph of the same data conveys the
information more quickly, at a glance.
Here we use a common graph type, the histogram, to represent the
information in the Frequency Distribution and the Cumulative Frequency Distribution
tables above.
[Figure: Frequency Histogram and Frequency Distribution Graph. Horizontal axis: number of calls per day (0 to 7). Vertical axis: frequency (0 to 25).]
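As a rough stand-in for a drawn histogram, the frequencies of Example 1 can be printed as a text bar chart. The sketch below is an illustration only, not the figure from the notes: the helper name `text_histogram` is hypothetical, and the bars run horizontally (one `#` per day) rather than vertically as in a conventional histogram.

```python
# Frequencies from Example 1 for x = 0, 1, ..., 7 calls per day.
values = [0, 1, 2, 3, 4, 5, 6, 7]
freqs = [8, 21, 14, 12, 7, 5, 2, 1]


def text_histogram(values, freqs):
    """Return one line per value, with a bar of '#' as long as its frequency."""
    return [f"{x} | {'#' * f} {f}" for x, f in zip(values, freqs)]


for line in text_histogram(values, freqs):
    print(line)
```

The longest bar belongs to x = 1, which shows at a glance that days with exactly one call were the most common.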
Cumulative Frequency Distribution & Cumulative Frequency Polygon or an Ogive
Graphing the Cumulative Frequency Distribution table