Statistics Topic 1 Tutorial 1 Representation of Data

This document covers the representation of statistical data, including methods for presenting and interpreting data such as frequency distribution tables, stem-and-leaf diagrams, and box-and-whisker plots. It emphasizes understanding measures of central tendency and variation, and how to use cumulative frequency graphs for estimating statistical values. Additionally, it provides examples of data grouping and interpretation, highlighting the importance of class boundaries and modal classes.

Uploaded by

michellchakaza
STATISTICS TOPIC 1: REPRESENTATION OF DATA


PAPER 5 (S1) TUTORIAL 1
9079/5

In this topic learners should be able to:

• Select a suitable way of presenting statistical data, and discuss the advantages and/or
disadvantages that a particular representation may have
• Draw and interpret stem-and-leaf diagrams, back-to-back stem plots, box-and-whisker plots,
histograms and cumulative frequency graphs
• Understand and use different measures of central tendency (mean, median and mode) and
variation (range, interquartile range, standard deviation)
• Use cumulative frequency graphs to estimate medians, quartiles, percentiles, and the proportion
of a distribution above (or below) a given value or between two values
• Calculate and use the mean and standard deviation of a set of data (including grouped data),
either from the data itself or from given totals $\sum x$, $\sum x^2$, or coded totals $\sum (x - a)$ or $\sum (x - a)^2$,
and use such totals to solve problems which may involve up to two sets of data.

Introduction

Numerical information obtained from experiments or surveys is called data. If the information is not
ordered, it is referred to as raw data. There are two main types of data: discrete data, which
takes integer values (for example, counts), and continuous data, which is obtained by measuring quantities such
as time, length or temperature. In the latter case the recorded values are approximations. In this topic we are
going to look at ways of representing and interpreting such numerical information.

E .NYANDOROH 0772241993

1 Frequency distribution tables to represent raw data.

Ex 1 The number of children in each family was recorded for 20 pupils, as follows:

4 2 3 0 4 1 2 3 6 5 0 1 4 3 2 5 3 2 3 3

This is discrete raw data. A concise way of representing the above data is to construct a
frequency distribution table.

Number of children   0   1   2   3   4   5   6
Frequency            2   2   4   6   3   2   1

Advantage: the data is now ordered and the mode (the value that occurs most often) is clear: 3 children.
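The tally in Ex 1 can also be checked by machine. The following is a minimal Python sketch, not part of the original tutorial; the variable names (`children`, `frequency`, `table`, `mode`) are illustrative assumptions, and `collections.Counter` is simply one convenient way of counting.

```python
from collections import Counter

# Raw data from Ex 1: number of children in the family of each of 20 pupils
children = [4, 2, 3, 0, 4, 1, 2, 3, 6, 5, 0, 1, 4, 3, 2, 5, 3, 2, 3, 3]

# Count how many times each value occurs
frequency = Counter(children)

# Frequency distribution table as (value, frequency) pairs in ascending order
table = sorted(frequency.items())

# The mode is the value with the largest frequency
# (if several values tied, max() would return only one of them)
mode = max(frequency, key=frequency.get)

print("Number of children   Frequency")
for value, count in table:
    print(f"{value:>18}   {count:>9}")
print("Mode:", mode)
```

The printed table should agree with the hand-built one above, and the frequencies should add up to 20, the number of pupils.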

Ex 2 The marks of 20 students in a test were recorded as follows (raw data):

84 17 38 45 47 53 76 54 75 22

66 65 55 54 44 51 39 19 51 72

Here the data need to be grouped into classes or intervals (preferably of the same width).

Lowest mark = 17 and highest mark = 84.

Choose intervals 10 − 19, 20 − 29, 30 − 39, 40 − 49, 50 − 59, 60 − 69, 70 − 79, 80 − 89.

The numbers 10, 20, 30, 40, 50, 60, 70, 80, 90 in this case are called class boundaries (c.b).

The class/interval 10 − 19 can be represented in a different way as 10 ≤ Mark < 20, where 10 is the
lower class boundary (l.c.b) and 20 is the upper class boundary (u.c.b).

NB: The upper class boundary (u.c.b) of one class is the lower class boundary (l.c.b)
of the next class. The class width = u.c.b − l.c.b is the same for all the classes above: 10
in this case.

Data is represented by a grouped frequency distribution table (see table below).


Mark        10 − 19   20 − 29   30 − 39   40 − 49   50 − 59   60 − 69   70 − 79   80 − 89
Frequency      2         1         2         3         6         2         3         1

Advantages: The data is more concise.

The modal class is clear: 50 − 59.

Disadvantage: The original data is lost, e.g. the actual mark of the one student in the interval 20 − 29 is
not known.
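The grouping in Ex 2 can be reproduced with a short Python sketch. This is an illustration only, under the assumption that every class has width 10 and starts at a multiple of 10, as in the intervals chosen above; the names `CLASS_WIDTH`, `grouped` and `modal_lower` are hypothetical.

```python
# Raw marks from Ex 2
marks = [84, 17, 38, 45, 47, 53, 76, 54, 75, 22,
         66, 65, 55, 54, 44, 51, 39, 19, 51, 72]

CLASS_WIDTH = 10   # assumed: equal-width classes 10-19, 20-29, ..., 80-89
FIRST_LOWER = 10   # lower limit of the first class
LAST_LOWER = 80    # lower limit of the last class

# Start every class at frequency 0 so that empty classes would still appear
grouped = {lower: 0 for lower in range(FIRST_LOWER, LAST_LOWER + 1, CLASS_WIDTH)}

for mark in marks:
    # Integer division finds the lower limit of the class containing the mark
    lower = (mark // CLASS_WIDTH) * CLASS_WIDTH
    grouped[lower] += 1

# The modal class is the class with the highest frequency
modal_lower = max(grouped, key=grouped.get)

for lower, count in grouped.items():
    print(f"{lower} - {lower + CLASS_WIDTH - 1}: {count}")
print(f"Modal class: {modal_lower} - {modal_lower + CLASS_WIDTH - 1}")
```

Note that `grouped` keeps only the counts, which mirrors the disadvantage stated above: the individual marks cannot be recovered from the table.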
NB: Suppose the marks of the 20 students were rounded to the nearest whole number, i.e. the marks are
continuous data. Then the classes may also be given in a different way, such as:

9.5 ≤ M < 19.5, 19.5 ≤ M < 29.5, 29.5 ≤ M < 39.5, ..., 79.5 ≤ M < 89.5

or

10−, 20−, 30−, 40−, 50−, 60−, 70−, 80−
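For rounded marks, the boundaries 9.5, 19.5, ... lie halfway between the upper limit of one class and the lower limit of the next. A minimal Python sketch of that conversion follows; it assumes the classes are written as whole-number limits such as 10 − 19, and the names `class_limits`, `gap` and `class_boundaries` are illustrative.

```python
# Classes written with whole-number limits, as for marks rounded to the nearest integer
class_limits = [(10, 19), (20, 29), (30, 39), (40, 49),
                (50, 59), (60, 69), (70, 79), (80, 89)]

# Gap between the upper limit of one class and the lower limit of the next (1 here)
gap = class_limits[1][0] - class_limits[0][1]

# Move each limit outwards by half the gap to obtain the class boundaries
class_boundaries = [(lower - gap / 2, upper + gap / 2)
                    for lower, upper in class_limits]

# Class width = u.c.b - l.c.b
class_widths = [ucb - lcb for lcb, ucb in class_boundaries]

for (lcb, ucb), width in zip(class_boundaries, class_widths):
    print(f"{lcb} <= M < {ucb}   (width {width})")
```

Each upper class boundary produced this way equals the lower class boundary of the next class, and every width is still 10.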

Ways of interpreting data – Median and quartiles.

Once the data is ordered, important information can be extracted from it, such as the median and quartiles.

For data containing n observations arranged in order, the median is the middle number, i.e. a
number 50% of the way through the distribution:

$Q_2 = \left(\frac{n+1}{2}\right)^{\text{th}}$ value (term).

The lower quartile, $Q_1$, is a value $\frac{1}{4}$ of the way through a distribution:

$Q_1 = \left(\frac{n+1}{4}\right)^{\text{th}}$ term.

The upper quartile, $Q_3$, is a value $\frac{3}{4}$ of the way through a distribution:

$Q_3 = \frac{3}{4}(n+1)^{\text{th}}$ term.


The interquartile range $= Q_3 - Q_1$.
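As an illustration, the position formulas above can be applied to the Ex 1 data (n = 20). The Python sketch below is not from the tutorial. The tutorial does not say what to do when a position such as 5.25 falls between two observations, so the sketch assumes linear interpolation between the neighbouring values. The function names `value_at_position` and `quartiles` are hypothetical.

```python
def value_at_position(ordered, position):
    """Return the value at a 1-based position in ordered data.

    Assumption: a fractional position is resolved by linear
    interpolation between the two neighbouring observations.
    """
    whole = int(position)            # whole part of the position
    fraction = position - whole      # fractional part of the position
    lower_value = ordered[whole - 1]
    if fraction == 0:
        return lower_value
    upper_value = ordered[whole]
    return lower_value + fraction * (upper_value - lower_value)


def quartiles(data):
    """Return (Q1, Q2, Q3) using the (n + 1) position formulas."""
    ordered = sorted(data)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least 3 observations for these positions")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q2 = value_at_position(ordered, (n + 1) / 2)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q2, q3


# Ex 1 data: number of children in each family
children = [4, 2, 3, 0, 4, 1, 2, 3, 6, 5, 0, 1, 4, 3, 2, 5, 3, 2, 3, 3]

q1, q2, q3 = quartiles(children)
iqr = q3 - q1
print("Q1 =", q1, " median =", q2, " Q3 =", q3, " IQR =", iqr)
```

Here the positions are 5.25, 10.5 and 15.75. In the ordered data the 5th and 6th values are both 2, the 10th and 11th are both 3, and the 15th and 16th are both 4, so the quartiles are 2, 3 and 4 whichever interpolation convention is used, and the interquartile range is 2.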

Example 3

(i) Write down the class boundaries and give the class width of each class [2]

(ii) What is the modal class? [1]

Solution

(i) Class boundaries: 3000, 8000, 13000, 23000, 43000, 83000

Class widths: 5000, 5000, 10000, 20000, 40000

Remember that the u.c.b of a class is the l.c.b of the next class.

(ii) Modal class = 3000 − 7000

(iii) Class containing the median: 8000 − 12000.

Class containing the lower quartile: 3000 − 7000.
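The class widths in part (i) are the differences between consecutive class boundaries. A minimal Python sketch of that step is shown below, using only the boundaries quoted in the solution; the names `boundaries` and `widths` are illustrative.

```python
# Class boundaries quoted in the solution to Example 3
boundaries = [3000, 8000, 13000, 23000, 43000, 83000]

# Each class runs from one boundary to the next, so
# class width = u.c.b - l.c.b = next boundary - current boundary
widths = [upper - lower for lower, upper in zip(boundaries, boundaries[1:])]

for lower, upper, width in zip(boundaries, boundaries[1:], widths):
    print(f"{lower} to {upper}: width {width}")
```

Six boundaries give five classes, and the widths are not all equal in this example.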

Summary

In cases where the number of intervals (or classes) is not specified, form at least five classes in
your grouped frequency distribution.

Understand the terms mode, median, lower quartile, upper quartile and interquartile range, for
they will be referred to frequently in the next tutorial.
