16 > Probability and Statistics
SECTION A Data representation
By the end of this section you will be able to:
> distinguish between continuous and discrete data
> construct frequency distributions
> draw a histogram and bar chart
> plot a frequency polygon
A1 Discrete and continuous data
The following are examples of discrete data:
> Number of people in a room
> Number of rejects on an assembly line
> Shoe size of children
What do you think is the definition of ‘discrete data’?
Data which can only take certain values. The number of people in a room can only be
1, 2, 3, ... and not 1.23, 1.57, 10.11, ...
The following are examples of continuous data:
> Weights of people
> Output voltage of an analogue system
> Loads on a beam
What do you think is the definition of ‘continuous data’?
Data which can take any value between two end points.
The weights of people can be 60.28 kg, 70.1 kg, ...
A2 Frequency distribution
What does the term ‘frequency’ mean?
It is the number of times a particular value occurs in some data. The combination of
particular values and their frequencies is called a frequency distribution.
One way of representing the distribution of data is by a frequency table. This is a table that
summarizes the data in some order.
Example 1
The number of rejects, in the last 30 days, from an assembly line has yielded the
following data:
360087 4930 36 35
400 4200 4437 33 42.
370 4l aaa 30 36
37373630 44 31
3000 «424d 44 39 42.
Construct a frequency distribution table.
Solution
Remember that the frequency of a value is the number of times it occurs in the data. For
example, there are 30 rejects on 4 days. We say the frequency of 30 is 4. We can
summarize the above data as detailed in Table 1.
Number of rejects    Frequency f
30                   4
31
33
35
36
37
39                   1
40
41
42
44
49
TABLE 1
The representation of data is a lot clearer in the table.
We can check that all the data has been placed in the table. How?
The sum of the frequencies should add up to 30 because there are 30 data values, that is
Σf = 30. [Remember Σ means ‘sum of’.] This is no guarantee that our frequency
distribution is correct but it is a good guide.
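The tally and the Σf check can be sketched in a few lines of Python. This is only an illustration: the list `rejects` holds hypothetical values, not the data of Example 1, and the function name `frequency_distribution` is an assumption made for this sketch.

```python
from collections import Counter

# Hypothetical reject counts (illustrative only, not the data of Example 1).
rejects = [30, 36, 42, 30, 44, 36, 37, 30, 42, 36]


def frequency_distribution(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


table = frequency_distribution(rejects)
for value, freq in table:
    print(f"{value:>3}  {freq}")

# Check: the frequencies should add up to the number of data values.
total = sum(freq for _, freq in table)
print("sum of frequencies:", total, "number of values:", len(rejects))
```

As in the text, a matching total does not prove the table is right, but a mismatch shows that a value has been missed or counted twice.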
This table is the frequency distribution for what sort of data?
Discrete data (number of rejects).
We can use a similar idea to form a frequency distribution for continuous data. One way of
representing continuous data is to group it into particular ‘classes’ or ‘intervals’, as the next
example shows. This is particularly useful for a large amount of data.
Example 2a
The diameters, in mm, of 20 pipes are as follows:
40.6 40.7 40.9 41.0 4d
a4 als aL 412 412
41.9 41.3 41.4 41.6 41.8
41.6 41.2 40.5 40.8 41.9
Form a frequency distribution table, by grouping the data into five classes.
Solution
How do we form a frequency distribution for this data? We can group the data into
classes, but classes of what size?
That depends on the data. The smallest value is 40.5 and the largest value is 41.9. If we
use classes of size 0.3, then we will get five classes. Let's form the frequency distribution
table for classes of size 0.3 (Table 2).
Diameter d (mm)        Frequency
40.45 ≤ d < 40.75      4
40.75
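Grouping continuous data into equal classes can be sketched as follows. The class start (40.45), the class width (0.3) and the number of classes (five) follow Example 2a, but the list `diameters` is hypothetical and the function name `group_into_classes` is an assumption made for this illustration.

```python
import math

# Hypothetical pipe diameters in mm (illustrative only, not the data of Example 2a).
diameters = [40.5, 40.6, 40.7, 40.9, 41.1, 41.2, 41.2, 41.3, 41.6, 41.9]


def group_into_classes(values, lower, width, n_classes):
    """Count the values in each class [lower + k*width, lower + (k+1)*width)."""
    counts = [0] * n_classes
    for x in values:
        k = math.floor((x - lower) / width)  # index of the class containing x
        if not 0 <= k < n_classes:
            raise ValueError(f"{x} lies outside the classes")
        counts[k] += 1
    return counts


counts = group_into_classes(diameters, lower=40.45, width=0.3, n_classes=5)
for k, freq in enumerate(counts):
    start = 40.45 + k * 0.3
    print(f"{start:.2f} to {start + 0.3:.2f}: {freq}")
```

Starting the classes at 40.45 rather than 40.5 means that no value recorded to one decimal place can fall exactly on a class boundary.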
A3_ Histogram
A histogram is a graphical representation of a frequency distribution,
Example 2b
Considering the data of Example 2a, draw a histogram for this data.
Solution
The frequency is plotted along the vertical axis and the grouped diameter of pipes along
the horizontal axis.
Figure 1 shows a histogram of the data contained in Table 2.
[Fig. 1: histogram of the pipe data; vertical axis: Frequency, horizontal axis: Diameter]
The break symbol on the horizontal axis of Fig. 1 means that there is no data before the
specified value, in this case 40.45.
This is one of the simplest histograms to draw because it has equal class widths and so
the height represents the frequency.
When either axis does not start at zero (in this case the horizontal axis starts at 40.45), it is
normally abbreviated by omitting a section of the scale, indicated by ~.
In a histogram the area of the rectangle is proportional to the frequency.
Only in the case of equal class width do we have the height of each rectangle representing
the frequency.
For histograms with unequal class widths we need to be careful, as Example 3a, below,
shows.
Example 3a
The resistances of 100 resistors are given in Table 3a.
Resistance R (kΩ)    Frequency
1.1 ≤ R < 1.2
1.2 ≤ R < 1.4        20
1.4 ≤ R < 1.5        8
1.5 ≤ R < 1.6        12
1.6 ≤ R < 1.8        18
1.8 ≤ R < 2.0        2
2.0 ≤ R < 2.3
TABLE 3a
Draw a histogram for this data.
Solution
Remember that the frequency is proportional to the area. We need to choose a
standard class width. By looking down the left-hand column of the table, we find the
class widths are of sizes 0.1, 0.2, 0.1, 0.1, 0.2, 0.2 and 0.3. Which class width would you
choose to be standard?
It really doesn't make much difference, but the most suitable seems to be 0.1 because
there are three intervals with this width and it keeps the arithmetic easy, that is
0.1 × 2 = 0.2, 0.1 × 3 = 0.3.
If we choose our standard width = 0.1 then the second interval is twice the
standard width and so we halve the frequency height. Similarly for the last interval,
we take 1/3 of the frequency height (Table 3b). This new figure is known as the standard
frequency. In general, standard frequency = frequency ÷ (class width ÷ standard width).
Resistance R (kΩ)    Class width (standard width, SW)    Frequency    Standard frequency
TABLE 3b
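The scaling to a standard width can be sketched as below. The class widths are those listed in the solution (0.1, 0.2, 0.1, 0.1, 0.2, 0.2 and 0.3) with a standard width of 0.1; the frequencies are hypothetical values chosen to add up to 100, and the function name `standard_frequencies` is an assumption made for this illustration. Exact fractions are used so that 0.3 ÷ 0.1 is exactly 3.

```python
from fractions import Fraction

# Class widths from the solution of Example 3a, written as exact fractions.
widths = [Fraction(w) for w in ("0.1", "0.2", "0.1", "0.1", "0.2", "0.2", "0.3")]
standard_width = Fraction("0.1")

# Hypothetical frequencies (illustrative only), chosen to add up to 100.
frequencies = [10, 20, 8, 12, 18, 14, 18]


def standard_frequencies(freqs, class_widths, sw):
    """Divide each frequency by the number of standard widths in its class."""
    return [f / (w / sw) for f, w in zip(freqs, class_widths)]


heights = standard_frequencies(frequencies, widths, standard_width)
for w, f, h in zip(widths, frequencies, heights):
    print(f"width {float(w):.1f}  frequency {f:>2}  standard frequency {float(h):.1f}")
```

Each rectangle is then drawn with its height equal to the standard frequency, so that its area (height × number of standard widths) recovers the original frequency.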
A4 Frequency polygons
Another graphical representation of a frequency distribution is a frequency polygon. There
are two ways of constructing a frequency polygon:
1 Draw a histogram and join the midpoints of the tops of the rectangles.
2 Plot the standard frequency on the vertical axis against the midpoint of the interval.
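Method 2 can be sketched as a small function that pairs each class midpoint with the height to plot. The class boundaries follow Example 2a (equal widths, so the height is simply the frequency); the frequencies are hypothetical and the function name `polygon_points` is an assumption made for this illustration.

```python
def polygon_points(edges, heights):
    """Pair the midpoint of each class with the height plotted for that class."""
    if len(edges) != len(heights) + 1:
        raise ValueError("need one more class boundary than heights")
    points = []
    for left, right, h in zip(edges, edges[1:], heights):
        midpoint = round((left + right) / 2, 2)  # rounding tidies floating-point noise
        points.append((midpoint, h))
    return points


# Class boundaries of Example 2a with hypothetical frequencies (illustrative only).
edges = [40.45, 40.75, 41.05, 41.35, 41.65, 41.95]
frequencies = [3, 1, 4, 1, 1]

points = polygon_points(edges, frequencies)
for midpoint, height in points:
    print(midpoint, height)
```

Joining these points with straight lines gives the frequency polygon. For unequal class widths, the standard frequencies would be passed in place of the raw frequencies.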
Example 3b
Plot the frequency polygon for the data of Example 3a.
Solution
Which method should we use?
Method 1, because we have already drawn a histogram for the data.
Figure 2b shows the frequency polygon of resistance values.
[Fig. 2b: frequency polygon of the resistance values; vertical axis: Standard frequency, horizontal axis: Resistance]
A5 Bar charts
Another graphical way to represent data is to plot a bar chart. A bar chart consists of bars
which can be drawn vertically or horizontally, and the height or length of these bars gives
the frequency. We will confine ourselves to vertical bars.
You will find it easier to plot a bar chart using appropriate software.
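As a rough idea of what such software does, the sketch below prints a text bar chart using only the Python standard library. It is an illustration under stated assumptions: the registration numbers are hypothetical, the function name `text_bar_chart` is invented for this sketch, and the bars are drawn horizontally because that is the natural direction for printed text, whereas the examples in this section use vertical bars.

```python
def text_bar_chart(data, scale):
    """Return one line per category; each '#' stands for `scale` units."""
    lines = []
    for label, value in data.items():
        bar = "#" * round(value / scale)
        lines.append(f"{label} | {bar} {value}")
    return lines


# Hypothetical yearly registrations (illustrative only).
registrations = {"2001": 300, "2002": 450, "2003": 200}

chart = text_bar_chart(registrations, scale=50)
for line in chart:
    print(line)
```

The length of each bar is proportional to the value it represents, which is the defining property of a bar chart.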
Example 4
Table 4 shows the number of new registrations with the Engineering Council at the end
of each year. Draw a vertical bar chart to represent
a the number of CEng registrations against the year of entry
b the number of CEng, IEng and EngTech registrations against the year of entry.
TABLE 4 Number of new registrations with the Engineering Council
Year    CEng    IEng    EngTech
2000    5096    1708    683
2001    4932    1362    392
2002    5180    789     574
2003    4504    599     466
2004    4518    484     758
2005    5906    532     1880
2006    5563    498     944
2007    3489    586     839
2008    3439    498     1343
2009    3750    547     1314
Solution
a The number of CEng registrations is given in the second column. We plot a
series of vertical bars of the same width with the year plotted horizontally and the
number of CEng registrations (numbers in the second column of Table 4) vertically.
This is illustrated in Fig. 3a:
[Fig. 3a: The number of new registrations for CEng; vertical axis: Number registered, horizontal axis: Year, 2000 to 2009]
Note that the vertical axis starts at just above 3000 because all the entries in the CEng
column of Table 4 are above 3000. We could start at zero, but it would be more difficult
to visualize the difference between the various years.
b We can also plot three bars for each year showing each of the categories CEng, IEng
and EngTech as illustrated in Fig. 3b:
[Fig. 3b: The number of new engineers registered; vertical axis: Number registered, horizontal axis: Year, 2000 to 2009; legend: CEng, IEng, EngTech]
Note that this time the vertical axis starts at zero to enable the smaller quantities of
IEng and EngTech registrations to be shown.
SUMMARY
Discrete data can only take certain values while continuous data can take any value
between two end points.
A frequency table is one way of representing the distribution of data.
A histogram is a graphical representation of a frequency distribution. The frequency is
proportional to the area.
A frequency polygon is another graphical representation of a frequency distribution.
Another graphical representation of data is a bar chart.
Exercise 16(a)
1 State whether the following are discrete or continuous data:
a The weights of people
b Marks in an examination
c The number of motors in Date
d The lifetimes of light bulbs
e The resistance value of variable resistor
2 The temperatures, to the nearest degree Celsius, for the last 30 days are as
follows:
22 23
214
23s
16 18
23
19
23
18
26
7
20
16
7
au
22
22
17 20
19 21
20 20
24 23
21 20
Construct a frequency distribution.
3 The heights, in m, of 40 students are shown below:
1.68 1.67 1.53 1.70 1.69 1.71 1.71 1.80
1.81 1.85 1.76 1.66 1.91 1.95 1.87 1.80
1.82 1.76 1.84 1.55 1.61 1.85 1.93 1.88
1.95 1.87 1.97 1.56 1.99 1.83 1.74 1.83
1.86 1.88 1.93 1.64 1.89 1.90 1.72 1.95
Construct a frequency distribution with an equal class width of 0.1.
4 Draw a histogram for question 3 with the
same class width.
5 Draw a frequency polygon for question 3.
6 The table below shows the time taken, in
is, for 105 op-amps to become fully
operational:
Time taken t (ans)
lost
mst
30st
40st
455
sost
ssst
yo=t
<20
30
<40
<45
50
9 The table below shows the number of new registrations with the
Engineering Council at the end of each year. Draw a vertical bar chart to
represent the number of registrations of CEng, IEng and EngTech against
each year on the same graph. Comment on your graph and data.
Number of new registrations with the Engineering Council
Year    CEng    IEng    EngTech
1984    3911    2391    1002
1985    5002    2774    1337
1986    5960    2682    1039
1987    6022    3066    1130
1988                    1189
1989    4746    2335    1210
1990    9207    2559    1315
1991    5413    2634    1185
1992    5588    2128    1184
1993    6189    2050    1190
1994    5721    1556    1237
1995    5376    1433    1146
1996    5485    1579    1082
1997    5641    1595    903
1998    4792    1484    789
1999    5187    1562    916
SECTION B Data summaries
By the end of this section you will be able to:
> evaluate the mean
> understand what standard deviation means
> evaluate the standard deviation
> derive properties of mean and standard deviation
> evaluate the mean and standard deviation of data in a frequency distribution
B1 Averages
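The sample mean, or average, of n numbers $x_1, x_2, x_3, \ldots, x_n$, denoted by $\bar{x}$, is given by
\[
\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{\sum x_i}{n}
\qquad \left( = \frac{\text{sum of observations}}{\text{number of observations}} \right)
\]
The notation $\sum x_i$ means sum $x_i$ from 1 to n, that is $x_1 + x_2 + x_3 + \cdots + x_n$.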
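The definition translates directly into code. The sketch below is a minimal illustration: the function name `sample_mean` and the list `observations` are assumptions made for this example, and the data values are hypothetical.

```python
def sample_mean(xs):
    """Return the sum of the observations divided by the number of observations."""
    if not xs:
        raise ValueError("the mean of no observations is undefined")
    return sum(xs) / len(xs)


# Hypothetical observations (illustrative only).
observations = [2, 4, 6, 8]
mean = sample_mean(observations)
print(mean)
```

Here the sum of the observations is 20 and there are 4 of them, so the mean is 20 ÷ 4 = 5.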