Data Representation & Presentation
Structure
Objectives
Introduction
Stages of Statistical Inquiry
Arrangement of Data
  3.3.1 Simple Array
  3.3.2 Frequency Array or Discrete Frequency Distribution
  3.3.3 Continuous or Grouped Frequency Distribution
  3.3.4 Various forms of Frequency Distributions
Tabulation of Data
  3.4.1 Meaning and Types of Tables
  3.4.2 Parts of a Table
  3.4.3 Importance of Tables
Let Us Sum Up
Key Words
Some Useful Books
Answers or Hints to Check Your Progress Exercises
3.0 OBJECTIVES
On going through this Unit, you will be able to explain: stages of statistical inquiry after data have been collected; methods of organizing (classification and arrangement) and condensing statistical data;
concepts of frequency distribution and its various types; and different methods of presentation of statistical data such as tables, graphs, diagrams, pictograms, etc.
3.1 INTRODUCTION
In the preceding Unit, we discussed the methods of collection of data either by a statistical survey (or inquiry) or from some secondary source. Data collected from a census or a sample inquiry, that is, from a primary source, are always a hotchpotch and in rudimentary form. To start with, they are contained in hundreds and thousands of questionnaires. To make head or tail of them, they must be organised (i.e., classified and arranged) and condensed or summarised. For this purpose we can use various methods, like preparing master sheets in which the information is recorded directly from the questionnaires. From these sheets small summary tables can be prepared manually. Nowadays computers can be used to organise and condense data more swiftly and efficiently. Software is also available that helps us to construct various types of graphs and diagrams.

Data can be summarised numerically also. Here we use summary measures like measures of central tendency (such as Arithmetic, Geometric and Harmonic Means, Mode and Median); measures of dispersion (such as Range, Quartile Deviation, Mean Deviation, and Standard Deviation); measures of association in bivariate analysis (such as Correlation and Regression), Index Numbers, etc. In this Unit we plan to discuss how data can be summarised using tables and graphs. Numerical summarisation will be discussed in subsequent Blocks (2, 3 and 4).

It must be kept in mind that a good summarisation and presentation of data is not undertaken for its own sake. It is not an end in itself. In fact, it sets the stage for useful analysis and interpretation of data. Again, a good presentation helps us to highlight significant facts and their comparisons. Figures can be made to speak out, thereby making possible their intelligent use.
In Unit 2, a questionnaire was prepared on Family Planning. Suppose this questionnaire was used to collect information from 50 families of C-III Block of XYZ Colony, New Delhi. Let us assume that it produced the following types of information, as given in Tables 3.1 and 3.2. Can we make any head or tail out of it?
Table 3.1 Number of Children per family in C-III Block, XYZ Colony, New Delhi
Table 3.2 Monthly Income of 50 families of C-III Block, XYZ Colony, New Delhi
547 622 691 684 567 586 680 578 583 578
As pointed out earlier, to make any head or tail out of the mass of raw data, such as presented above, we have to classify and arrange it. This can be done either by forming a simple array or a frequency array (discrete frequency distribution) or a continuous frequency distribution. Sub-sections 3.3.1, 3.3.2 and 3.3.3 attempt to explain this aspect.
After the arrangement of data in ascending order, as in Table 3.3, the raw data make some sense. The conclusions that can be drawn from this arrangement of data (see Table 3.3) are that five families have no children, twelve families have one child each, fourteen have two children each, ten families have three children each, six families have four children each and three families have five children each.
Table 3.4
Number of Children :   0    1    2    3    4    5   Total
Number of Families :   5   12   14   10    6    3    50
When the number of observations is large enough, the counting process is often undertaken by the use of tally bars. In this method, all possible values of the variable are written in a column. For every observation, a tally bar, denoted by ( | ), is noted against its corresponding value. Every fifth repetition is marked by a stroke across the previous four bars. In this way, we get blocks of five, which simplify counting at the end. Thus a number or an observation repeated fourteen times will be marked as two crossed blocks of five followed by four single bars. Note that after representing each observation by a bar on the tally sheet, the observation should be ticked (✓) or crossed (✗) so that it is not counted twice. The data of Table 3.1 are rewritten in the form of a frequency distribution as shown in Table 3.5 below:
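Table 3.5
No. of Children   Tally Sheet          Frequency
0                 ||||/                     5
1                 ||||/ ||||/ ||           12
2                 ||||/ ||||/ ||||         14
3                 ||||/ ||||/              10
4                 ||||/ |                   6
5                 |||                       3
Total                                      50

Table 3.6
Monthly Income (Rs.)   Tally Sheet        Frequency
500 - 550              ||||/                   5
550 - 600              ||||/ |                 6
600 - 650              ||||/ ||||/            10
650 - 700              ||||/ ||||/ ||         12
700 - 750              ||||/ ||||              9
750 - 800              ||||/                   5
800 - 850              |||                     3
Total                                         50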
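The tally-bar counting described above can be sketched in Python. The raw observations of Table 3.1 are not reproduced here, so the list below is rebuilt from the frequencies stated in the text (an assumption about the data, not the original order of the questionnaires). The helper `tally_marks` and the notation `||||/` for a crossed block of five are illustrative choices.

```python
from collections import Counter

# Observations rebuilt from the frequencies stated in the text
# (5 families with 0 children, 12 with 1 child, and so on).
stated = {0: 5, 1: 12, 2: 14, 3: 10, 4: 6, 5: 3}
children = [value for value, count in stated.items() for _ in range(count)]


def tally_marks(count):
    """Render a count as crossed blocks of five plus the remaining single bars."""
    blocks, rest = divmod(count, 5)
    return " ".join(["||||/"] * blocks + (["|" * rest] if rest else []))


# Counter does the work of the tally sheet: one count per distinct value.
frequency = Counter(children)

for value in sorted(frequency):
    print(f"{value:>2}  {tally_marks(frequency[value]):<18} {frequency[value]:>3}")
print("Total", sum(frequency.values()))
```

Counting with `Counter` replaces the manual ticking of each questionnaire; the tally column is printed only to mirror the hand-made sheet.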
In Table 3.6 we have completed an exercise where the variable "income of the family" has been grouped in order to reduce it to a manageable form called grouped data or Continuous Frequency Distribution. However, prior to the construction of any grouped frequency distribution, it is very important to find answers to the following questions:
1) What should be the number of class intervals?
2) What should be the width of each class interval?
3) How will the class limits be designated?

1) What should be the number of class intervals? Though there is no hard and fast rule regarding the number of classes to be formed, their number should be neither too small nor too large. If the number of classes is too small, i.e., the width of each class is large, there is a likelihood of greater loss of information due to grouping. On the other hand, if the number of classes is very large, the distribution may appear to be too fragmented and may not reveal any pattern of behaviour of the variable. Based on experience, it has been observed that the number of classes should not be less than 5 or 6 and, in any case, there should not be more than 20 classes. Usually the formula to determine the number of classes is given by
Number of classes = 1 + 3.322 × log₁₀ N,
where N is the total number of observations.
In our example of raw data on incomes of 50 families, the number of classes can be calculated as under:
Number of classes = 1 + 3.322 × log₁₀ 50
                  = 1 + 3.322 × 1.6990
                  = 1 + 5.644
                  = 6.644 ≈ 7.
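The calculation above can be checked with a few lines of Python. This is a minimal sketch of the formula (often called Sturges' rule); the function name `number_of_classes` is illustrative, and rounding up to the next whole number is an assumption that matches the result of 7 obtained in the text.

```python
import math


def number_of_classes(n):
    """Number of classes = 1 + 3.322 * log10(n), rounded up to a whole number."""
    return math.ceil(1 + 3.322 * math.log10(n))


# 50 families, as in the income example.
print(number_of_classes(50))
```

The result is only a starting point: the text's guideline of at least 5 or 6 and at most 20 classes still applies.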
2) What should be the width of each class interval? As far as possible, all the class intervals should be of equal width. However, when a frequency distribution based on equal class intervals does not reveal a regular pattern of behaviour of observations, it might become necessary to re-group the observations into class intervals of unequal width. By a regular pattern of behaviour we mean that there are no classes, with the possible exclusion of extreme classes, where there are nil or very few observations while there is a concentration of observations in their adjoining classes. The approximate width of a class can be determined by the following formula:
Width of a Class = (Largest Observation - Smallest Observation) ÷ Number of Class Intervals
However, the final decision, regarding width of class intervals, should also take into account the following points.
i) As far as possible, the width should be a multiple of 5, because it is easy to grasp numbers like 5, 10, 15, etc.
ii) It should be convenient to find the mid-value of a class.
iii) The observations in a class should be uniformly distributed.
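The width formula and point (i) above can be combined in a short Python sketch. The function name `class_width` is illustrative, and the extremes of Rs. 500 and Rs. 850 are assumptions taken from the class limits of the income example, not the actual smallest and largest observations.

```python
import math


def class_width(largest, smallest, classes, multiple=5):
    """Approximate class width, rounded up to the next multiple of `multiple`."""
    raw = (largest - smallest) / classes
    return math.ceil(raw / multiple) * multiple


# Assumed extremes of 500 and 850 with 7 classes give a width of 50.
print(class_width(850, 500, 7))

# A narrower hypothetical range shows the rounding: 144 / 7 = 20.57 becomes 25.
print(class_width(691, 547, 7))
```

Rounding up rather than to the nearest multiple is a design choice: it guarantees that the chosen number of classes still covers the whole range.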
3) How will the class limits be designated? The smallest and the largest observations of a class interval are known as class limits. These are also termed the lower and upper limits of a class, respectively. Since the mid-value of a class, which is used to compute the mean, standard deviation, etc., is obtained from the class limits, it is necessary to define these limits in an unambiguous manner. The following points should be kept in mind while defining class limits:
a) It is not necessary that the lower limit of the first class be exactly equal to the smallest observation of the data. In fact it can be less than or equal to the smallest observation. Similarly, the upper limit of the last class may be greater than or equal to the largest observation of the data.
b) It is convenient to have the lower limit of a class either equal to zero or some multiple of 5 or 10.
c) The chosen class limits should be such that the observations in a class are uniformly distributed.
ii) Inclusive Method: In this method, all the observations with magnitude greater than or equal to the lower limit but less than or equal to the upper limit of a class are included in it. Now observe Table 3.8. An income of Rs. 549 is included in the class 500 to 549, so that an income of Rs. 550 automatically goes to the next class of 550 to 599. Since the upper limit of one class is not equal to the lower limit of the following class, this saves us from the confusion of whether Rs. 550 goes to the (500 to 549) or the (550 to 599) class.
Table 3.8 Inclusive Class Intervals
Monthly Income (Rs.)   Frequency
500 - 549                  5
550 - 599                  6
600 - 649                 10
650 - 699                 12
700 - 749                  9
750 - 799                  5
800 - 849                  3
Total                     50
The choice between exclusive and inclusive methods depends upon whether we are dealing with a continuous variable like income, height, weight, etc. or a discrete variable like the number of children in a family. For a continuous variable it is desirable to construct the frequency distribution by the exclusive method because, as we have seen earlier, it ensures continuity. For a discrete variable, like the number of children in a family or the number of students getting first division, the frequency distribution should be constructed by using inclusive type class intervals.
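The difference between the two methods shows up only at the class limits. The Python sketch below assigns an income to its class under each method; the function name `find_class` and the way the classes are generated are illustrative assumptions, with limits taken from the income example.

```python
def find_class(value, classes, method):
    """Return the (lower, upper) class containing value, or None if no class fits.

    exclusive: lower <= value <  upper  (upper limit belongs to the next class)
    inclusive: lower <= value <= upper  (both limits belong to the class)
    """
    for lower, upper in classes:
        if method == "exclusive" and lower <= value < upper:
            return (lower, upper)
        if method == "inclusive" and lower <= value <= upper:
            return (lower, upper)
    return None


# Seven classes starting at 500, 550, ..., 800.
exclusive = [(low, low + 50) for low in range(500, 850, 50)]
inclusive = [(low, low + 49) for low in range(500, 850, 50)]

print(find_class(550, exclusive, "exclusive"))  # falls in 550 - 600, not 500 - 550
print(find_class(549, inclusive, "inclusive"))  # falls in 500 - 549
print(find_class(550, inclusive, "inclusive"))  # falls in 550 - 599
```

Note that a fractional income such as 549.5 fits no inclusive class, which is one way to see why the exclusive method suits continuous variables.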
Table 3.9
Class Boundaries   Frequency
499.5 - 549.5          5
549.5 - 599.5          6
599.5 - 649.5         10
649.5 - 699.5         12
699.5 - 749.5          9
749.5 - 799.5          5
799.5 - 849.5          3
Total                 50
Mid-Value of a Class
In the exclusive type of class intervals, the mid-value or class mark of a class is defined as the arithmetic mean of its lower and upper limits. However, in the case of inclusive class intervals, there is a gap between the upper limit of a class and the lower limit of the following class. This gap is eliminated by adding half of the gap to the upper limit and subtracting half of the gap from the lower limit. The new class limits, thus obtained, are known as class boundaries. The class boundaries of the inclusive class intervals in Table 3.8 are given in Table 3.9.
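The conversion from inclusive class limits to class boundaries can be sketched as follows. The function names are illustrative, and the sketch assumes that the gap between consecutive classes is the same throughout, as it is in the income example.

```python
def class_boundaries(classes):
    """Widen each inclusive class by half the gap between consecutive classes."""
    gap = classes[1][0] - classes[0][1]   # e.g. 550 - 549 = 1
    return [(lower - gap / 2, upper + gap / 2) for lower, upper in classes]


def mid_value(lower, upper):
    """Class mark: arithmetic mean of the two limits (or boundaries)."""
    return (lower + upper) / 2


inclusive = [(low, low + 49) for low in range(500, 850, 50)]
boundaries = class_boundaries(inclusive)

for (lower, upper), (b_low, b_up) in zip(inclusive, boundaries):
    print(f"{lower} - {upper}   {b_low} - {b_up}   mid-value {mid_value(b_low, b_up)}")
```

The mid-value is the same whether it is computed from the inclusive limits or from the boundaries, because the same half-gap is subtracted at one end and added at the other.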
Open-end class intervals: Below 25, 25 - 30, 30 - 40, 40 - 50, 50 and above
Table 3.11 Class intervals of unequal width: 20 - 25, 25 - 30, 30 - 40, 40 - 55, 55 - 60
b) A Frequency Distribution with Unequal Class Width
The classes of a frequency distribution may or may not be of equal width. A frequency distribution with unequal class width is reproduced in Table 3.11. Here, the width of the 1st, 2nd and 5th classes is 5, while that of the 3rd is 10 and that of the 4th is 15. As we will see in Unit 4, the mode is not a representative value in such types of series and hence is not defined.

c) Cumulative Frequency Distribution
Suppose that, with reference to the data given in Table 3.6, we ask the following questions:
i) How many families have their monthly income less than or equal to Rs. 700?
ii) How many families have their monthly income greater than or equal to Rs. 600?
The answers to the above questions can be easily obtained by forming an appropriate cumulative frequency distribution. To answer the first question, we need to form a "less than type" cumulative frequency distribution, while a "greater than type" cumulative frequency distribution is required for answering the second question. These distributions are given in Tables 3.12 and 3.13 respectively.
Table 3.12 "Less-than type" Cumulative Frequency Distribution
Monthly Income (Rs.)   Frequencies
                       Simple   Cumulative
Less than 550             5         5
Less than 600             6        11  (5 + 6)
Less than 650            10        21
Less than 700            12        33
Less than 750             9        42
Less than 800             5        47
Less than 850             3        50
Table 3.13 "More-than type" Cumulative Frequency Distribution
Monthly Income (Rs.)   Frequencies
                       Simple   Cumulative
More than 500             5        50
More than 550             6        45
More than 600            10        39
More than 650            12        29
More than 700             9        17
More than 750             5         8
More than 800             3         3
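Both cumulative distributions are running totals of the simple frequencies, taken from opposite ends. A minimal Python sketch, using the class frequencies of the income example (the variable names are illustrative):

```python
from itertools import accumulate

lower_limits = [500, 550, 600, 650, 700, 750, 800]
upper_limits = [550, 600, 650, 700, 750, 800, 850]
frequencies = [5, 6, 10, 12, 9, 5, 3]

# "Less-than" type: add the frequencies from the first class downwards.
less_than = list(accumulate(frequencies))

# "More-than" type: add the frequencies from the last class upwards.
more_than = list(accumulate(reversed(frequencies)))[::-1]

for upper, total in zip(upper_limits, less_than):
    print(f"Less than {upper}: {total}")
for lower, total in zip(lower_limits, more_than):
    print(f"More than {lower}: {total}")
```

Reading off the results, 33 families have an income below Rs. 700 and 39 families have an income of Rs. 600 or more.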
d) Relative Frequency Distribution
So far we have expressed the frequency of a value or that of a class as the number of times an observation is repeated. We can also express these frequencies as a fraction or a percentage of the total number of observations. Such frequencies are known as relative frequencies. Table 3.14 demonstrates the construction of a relative frequency distribution.
Table 3.14 Relative Frequency Distribution of Monthly Income of 50 Families
Class       Frequency   Relative Frequency
                        As a fraction       As a percentage
500 - 549       5       5 ÷ 50 = 0.10       0.10 × 100 = 10
550 - 599       6       6 ÷ 50 = 0.12       0.12 × 100 = 12
600 - 649      10       10 ÷ 50 = 0.20      0.20 × 100 = 20
650 - 699      12       12 ÷ 50 = 0.24      0.24 × 100 = 24
700 - 749       9       9 ÷ 50 = 0.18       0.18 × 100 = 18
750 - 799       5       5 ÷ 50 = 0.10       0.10 × 100 = 10
800 - 849       3       3 ÷ 50 = 0.06       0.06 × 100 = 6
Total          50       1                   100
From the above table it is clear that the sum of the relative frequencies should be either 1 (in the case of fractions) or 100 (in the case of percentages).
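The construction of relative frequencies, and the check that they add up to 1 or 100, can be sketched in Python using the class frequencies of the income example (the variable names are illustrative):

```python
frequencies = {
    "500 - 549": 5,
    "550 - 599": 6,
    "600 - 649": 10,
    "650 - 699": 12,
    "700 - 749": 9,
    "750 - 799": 5,
    "800 - 849": 3,
}

total = sum(frequencies.values())

# Each frequency as a fraction and as a percentage of the total.
fractions = {cls: f / total for cls, f in frequencies.items()}
percentages = {cls: 100 * f / total for cls, f in frequencies.items()}

for cls in frequencies:
    print(f"{cls}  {frequencies[cls]:>3}  {fractions[cls]:.2f}  {percentages[cls]:.0f}")
print(f"Total      {total:>3}  {sum(fractions.values()):.2f}  {sum(percentages.values()):.0f}")
```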
2) Explain the following terms giving examples:
a) Ungrouped data
b) Class mark
c) Open end classes
d) Class limits
e) Class boundaries
f) Class frequencies
g) Tally bar
h) Relative frequencies
3) Build a hypothetical frequency distribution on monthly pocket money of 20 students belonging to the lower middle class of a college. Prepare a relative frequency distribution from it.
4) What points are to be kept in mind while taking decisions for preparing a frequency distribution in respect of :
5) Construct less than and more than type cumulative frequency distributions
6) Construct a relative frequency distribution for the data given in question 5.
ii) Graphic methods, which will include line graphs, histograms, frequency polygons and curves, and cumulative frequency curves.
iii) Geometric forms, pictures and statistical maps, which will include pie diagrams, bar diagrams, area and volume diagrams, etc.
Table 3.15 is based on hypothetical figures of exports and imports of country X with country B for three years 1995, 1996 and 1997.
Table 3.15 Imports and Exports of X with Country B during 1995-1997
              1995                1996                1997@
         Imports  Exports    Imports  Exports    Imports  Exports
Total      195      202        225      235        240      230
Note : @ Figures are quick estimates.
Source : Trade Bulletin, 1998, Ministry of Foreign Trade of X.
In this table it is clear that the purpose is to show the imports and exports of country X vis-a-vis the rest of the world. Note that a particular entry of the table refers to a column and a row. For example, an entry at the intersection of the second row and the fourth column indicates that in 1996 country X imported goods and services worth Rs. 60 crore from country B. This figure can then be compared with other import and export figures to seek important interpretations.
Types of Tables
Basically, we have two types of tables:
1) Reference tables or general purpose tables
2) Text tables or special purpose tables.

1) Reference tables are general purpose tables and are a store of information with the aim of presenting detailed statistical information. From these tables, we can derive our information (i.e., they serve as a secondary source). Tables presented by different government departments, ministries, Reserve Bank of India, Economic Surveys, etc. are reference tables and are prepared as routine work of these departments.
Another important example is the Population Census tables prepared by the Registrar General of India giving detailed information on the demographic features of India. Students are advised to consult the latest issue of "Economic Survey", which is issued every year along with the union budget of India. Prepare from it a table on exports and imports of India to USA, UK, Russia, Canada and Germany for three or four years.

2) Text tables are a special type of table. They are smaller in size and are prepared from the reference tables. Their aim is to analyse only a particular aspect, to bring out a specific point or to answer a particular question. For example, from the Population Census tables we may pick out information on the number of people in Bombay and Delhi who speak different languages (mother tongue), profess different religions and come from different states of India. Similarly, from various publications of Reserve Bank of India, we may be able to extract information, in tabular form, on money supply, rate of interest and bank rate for the last ten years or so.

Tables can be simple and one-way, like the tables given in Section 3.3, where we deal with only one variable, say, income. Alternatively, such a table is called a univariate frequency distribution. In addition to this, we can have two-way or multi-way tables where we deal with two or more related characteristics (for example, Table 3.15).
1) Table number is required for the identification of a table, particularly when there is more than one table in a particular analysis. Table number is always mentioned in the centre at the top.
2) Title of the table gives the indication of the type of information contained in the body of the table. It is said that the title is to the table what heading is to an essay. Next to the table number, we mention the title of the table. Its purpose is to answer questions like:
a) What is in the table?
b) Where is it in the table?
c) When did a particular information occur?
d) How has a particular information been arranged?
In respect of the sample table on exports and imports (Table 3.15), these questions will be answered as below:
a) The table contains values of exports and imports of country X.
b) Information contained in the body of the table shows exports to (sales) and imports from (purchases) four countries A, B, C and D.
c) These exports and imports occurred in 1995, 1996 and 1997.
d) Information on exports and imports has been arranged according to year and countries.

Dos and Don'ts of the Title
Don't opt for long sentences. Title should be brief and to the point. Present the title in bold letters and/or in capital letters. Expressions used should not convey more than one meaning. Avoid expressions like 'Table Presents ..........' or 'A Detailed Comparison of Data Relating to .........', etc. It should be like a telegraphic message.

3) Head note, also called prefatory note, is written just below the title. It shows contents and unit of measurement like (rupees crore) or (lakh tonnes) or (thousand bales). It should be written in brackets and should appear on the right side at the top, just below the title. However, not every table needs a head note; a table of the number of students in each class, for example, does not.
4) Stubs are used to designate rows. They appear in the left-hand column of the table. Stubs consist of two parts:
a) Stub head describes the nature of the stub entry.
b) Stub entry is the description of row entries.
5) Captions, also called box heads, designate the data presented in the columns of the table. A caption may contain more than one column head, and each column head may be sub-divided into more than one sub-head. For example, we can divide the students of a college into hostelers and non-hostelers and then again into males and females. This will help us to know the number of male hostelers in, say, first year, second year and third year.

6) Main body of the table, also called field of the table, is its most important and bulky part. It contains the relevant numerical information about which a hint is already contained in the title of the table. In our example of Table 3.15 the title amply suggests that the body of the table contains numerical information on exports and imports of country X for a period of three years.

7) Foot note is a qualifying statement put just below the table (at the bottom). Its purpose is to caution about the limitations of the data or certain omissions. For example, in Table 3.15, the foot note reads "@ figures are quick estimates". This implies that the figures for the year 1997, where the superscript '@' is given, are not final.
8) Source of data may be the last part of a table, yet it is important. It speaks about the authenticity of the data quoted. It also offers opportunity to the reader to check the data if (s)he so desires and get more of it.
Taking all these points into consideration, the format of a hypothetical table is presented below:
Table 3.16
TITLE
                                   (In Crore of Rupees)
Stub Head       Caption
Stub Entries    MAIN BODY OF THE TABLE
2) Comment on the statement: "Title is to the table what heading is to an essay".
3) Enumerate the various parts of a Statistical table.
4) Make a sketch of a two-way table to show the following information: For a college divide the students according to
a) 1st Year, 2nd Year and 3rd Year students
b) Hostelers and non-hostelers
Although there are four quadrants on a plane, in economics we usually draw our diagrams only in the first quadrant, where both the quantities measured on the X-axis and the Y-axis are positive. Economic quantities like price, quantity demanded and supplied, national income, consumption, production and a host of other such variables are non-negative (≥ 0). Let us take a demand schedule and plot it on a graph. Joining the different points, assuming continuity, will give us a line graph expressing the relation between price and quantity demanded. Such a line graph in Economics is called a demand curve. Note that price is measured on the Y-axis and quantity demanded on the X-axis. The demand curve for the data given in Table 3.17 is given in Fig. 3.1.
Table 3.17 Demand Schedule
Price of X (Rs.)   Quantity of X demanded
       5                   16
      10                   12
      15                    8
      20                    4
      25                    2

Table 3.18 Time Series Data
Year:   1990, 1991, 1992
Values: 25, 20, 40, 50, 30, 45, 60

Fig. 3.1 Demand Curve (horizontal axis: Quantity Demanded)
A line graph may be used to show changes in some economic variable, say, steel production over time. In other words, if out of the two variables, one happens to be time (months, years, etc.), we get a line graph over time, or simply a time series graph or historigram. A time series expresses the behaviour of an economic variable over time. An example of time series data is given in Table 3.18. Measuring years on the X-axis and steel production on the Y-axis, we can plot time series data on a graph, as shown in Fig. 3.2.
Fig. 3.2 (horizontal axis: Years)
Construction of Histogram
To plot a histogram of the frequency distribution given in Table 3.6 on graph paper, we mark off class intervals like 500 - 550, 550 - 600, etc. on the horizontal axis. Similarly, we mark off frequencies on the vertical axis. Since all the classes are of equal width, the height of each rectangle is taken to be equal to the frequency of the respective class. The histogram is shown in Fig. 3.3.
Fig. 3.3 Histogram
1) The widths of the various rectangles show the nature of the classes in the distribution, i.e., whether they are of equal width or not.
2) The area of a rectangle shows the proportion of the class frequency in the total.
Frequency Polygon
Frequency Polygon has been derived from the word "polygon", which means many sides. In statistics, it means a graph of a frequency distribution. A frequency polygon is obtained from a histogram by joining the mid-points of the tops of the various rectangles with straight lines, as shown in Fig. 3.4. In order that the total area under the polygon remains equal to the area under the histogram, two arbitrary classes, each with zero frequency, are added on both ends, as shown below.
Fig. 3.4 Frequency Polygon (horizontal axis: Monthly Income, classes from 450-500 to 850-900)
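The claim that adding two zero-frequency classes keeps the area under the polygon equal to the area of the histogram can be checked numerically. The following Python sketch builds the polygon's vertices from the income classes of Table 3.6; the variable and function names are illustrative.

```python
classes = [(low, low + 50) for low in range(500, 850, 50)]
frequencies = [5, 6, 10, 12, 9, 5, 3]
width = 50

# Mid-points of the tops of the rectangles.
mid_points = [(lower + upper) / 2 for lower, upper in classes]

# Add one zero-frequency class at each end (450 - 500 and 850 - 900).
vertices = [(mid_points[0] - width, 0)]
vertices += list(zip(mid_points, frequencies))
vertices.append((mid_points[-1] + width, 0))


def area_under(points):
    """Area under a broken line, summed trapezium by trapezium."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


histogram_area = width * sum(frequencies)
polygon_area = area_under(vertices)
print(histogram_area, polygon_area)
```

Without the two extra classes the polygon would start and end at the first and last mid-points, and the triangles at both ends would be lost from the area.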
Frequency Curve
If the points obtained in the case of the frequency polygon are joined with the help of a smooth curve, we get a frequency curve as shown in Fig. 3.5.
Fig. 3.5 Frequency Curve (horizontal axis: classes from 450-500 to 850-900)
The bar diagram of the above data is drawn in Fig. 3.7. To make the bar diagram beautiful we can either colour the bars or shade them in different ways. This is left to the aesthetic taste of the investigator.
Fig. 3.7 Bar Diagram (horizontal axis: Zone, with bars for North, South, East and West)
A sub-divided bar diagram is used when it is desired to represent the comparative values of different components of a phenomenon. In this diagram, the bar corresponding to each phenomenon is divided into various components. The portion of the bar occupied by each component denotes its share in the total. The sub-divisions of different bars must always be done in the same order and these should be distinguished from each other by using different colours or shades. A sub-divided bar diagram for the hypothetical data on sales of T.V. sets, given in Table 3.20, is drawn in Fig. 3.8.
Table 3.20 Zone-wise sale of T.V. sets (1995-1997)
Zone      Number of T.V. Sets sold (lakhs)
          1995    1996    1997
North
12 8 5 20 28 9 7 15 10 11 64 6 31 8 44
Total
Fig. 3.8 (horizontal axis: Years)
Year    Total Revenue    Total Cost    Profit
1990         30              25           5

Fig. 3.9 (horizontal axis: Years, 1990 to 1992; components: Total Cost and Profit)
Let us consider data on, say, average salaries of three categories of university teachers, and prepare all the three types of area diagrams.
Table 3.22 Average Salaries of University Teachers as on 1/1/1998
Class of teachers   Average Salary (Rs.)
Professors              25,000
Readers                 16,000
Lecturers                9,000
a) For drawing rectangles, a common base of, say, 100 is taken. Accordingly, the heights can be determined as:
1) Salary of Rs. 25,000 = 100 (base) × 250 (height)
2) Salary of Rs. 16,000 = 100 (base) × 160 (height)
3) Salary of Rs. 9,000 = 100 (base) × 90 (height)
Now take a scale of 2 cm = 100, so that the first rectangle has dimensions of 2 cm × 5 cm, the second one has dimensions of 2 cm × 3.2 cm and the third one has dimensions of 2 cm × 1.8 cm. After this, we are in a position to draw the rectangles as area diagrams (Fig. 3.10).
Fig. 3.10 Area Diagram (rectangles): Professors Rs. 25,000; Readers Rs. 16,000; Lecturers Rs. 9,000
b) For drawing squares, we find the square roots of the various salaries. We have,
√25000 = 158.114, √16000 = 126.491, √9000 = 94.868.
Choose a scale of 1 cm = 50, so that the first square has each side approximately equal to 3.2 cm (since 158.114 ÷ 50 ≈ 3.2), the second has a side of 2.53 cm and the third has a side of 1.9 cm. The relevant squares are drawn in Fig. 3.11.
Fig. 3.11 Average Salary of University Teachers (Rs.), shown as squares: Professors Rs. 25,000; Readers Rs. 16,000; Lecturers Rs. 9,000
c) For drawing circles we take the squares of their radii in the ratio of the areas, i.e., 25000 : 16000 : 9000 or 25 : 16 : 9. This is based on the property that the area of a circle is proportional to the square of its radius. Let r₁, r₂ and r₃ denote the radii of the three circles; then we can write r₁² : r₂² : r₃² = 25 : 16 : 9 or r₁ : r₂ : r₃ = 5 : 4 : 3. Taking 2.5 units = 1 cm, the radii of the three circles will be 2.0, 1.6 and 1.2 cm respectively. Let us draw the required circles.
Scale : 1 cm = 2.5 units.
Fig. 3.12: Area Diagram (circles): Professors Rs. 25,000; Readers Rs. 16,000; Lecturers Rs. 9,000
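The three sets of dimensions worked out above follow one pattern: choose a length that makes the area proportional to the salary, then apply the scale. A minimal Python sketch, using the salaries and scales from the text (the dictionary names are illustrative):

```python
import math

salaries = {"Professors": 25000, "Readers": 16000, "Lecturers": 9000}

# a) Rectangles: common base of 100 units; scale 2 cm = 100 units (1 cm = 50 units).
rectangles = {name: (100 / 50, (salary / 100) / 50)
              for name, salary in salaries.items()}

# b) Squares: side proportional to the square root of the salary; scale 1 cm = 50 units.
squares = {name: math.sqrt(salary) / 50 for name, salary in salaries.items()}

# c) Circles: radius proportional to the square root of the salary (in thousands);
#    scale 1 cm = 2.5 units.
circles = {name: math.sqrt(salary / 1000) / 2.5 for name, salary in salaries.items()}

for name in salaries:
    base, height = rectangles[name]
    print(f"{name:<10} rectangle {base:.1f} x {height:.1f} cm, "
          f"square side {squares[name]:.2f} cm, circle radius {circles[name]:.1f} cm")
```

The side of the first square comes out as 3.16 cm, which the text rounds to 3.2 cm.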
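Country   Exports   Percentage Share               Degree
A           300     (300 × 100) ÷ 800 = 37.50      (37.50 × 360) ÷ 100 = 135°
B           250     (250 × 100) ÷ 800 = 31.25      (31.25 × 360) ÷ 100 = 112.5°
C           150     (150 × 100) ÷ 800 = 18.75      (18.75 × 360) ÷ 100 = 67.5°
D           100     (100 × 100) ÷ 800 = 12.50      (12.50 × 360) ÷ 100 = 45°
Total       800     100                            360°
Fig. 3.13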
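The angles of a pie diagram follow from the percentage shares in two steps, as in the table above. A minimal Python sketch, assuming export values of 300, 250, 150 and 100 for four countries labelled A to D (the labels and variable names are illustrative):

```python
exports = {"A": 300, "B": 250, "C": 150, "D": 100}
total = sum(exports.values())

# Step 1: share of each country in the total, as a percentage.
share = {country: value * 100 / total for country, value in exports.items()}

# Step 2: the whole circle is 360 degrees, so each 1 per cent is 3.6 degrees.
degrees = {country: pct * 360 / 100 for country, pct in share.items()}

for country in exports:
    print(f"{country}  {exports[country]:>4}  {share[country]:6.2f}%  {degrees[country]:6.1f} degrees")
print(f"Total {total}  {sum(share.values()):.0f}%  {sum(degrees.values()):.0f} degrees")
```

Checking that the angles add up to 360 is a quick guard against arithmetic slips before the circle is drawn.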
            Income (Rs.)   Cube-root      Side of cube
Poor            216        ∛216 = 6       6 ÷ 4 = 1.5 cms.
Very Rich      3375        ∛3375 = 15     15 ÷ 4 = 3.75 cms.
Scale : 1 cm = 4 units.
4) Now draw two cubes with sides equal to 1.5 cms. and 3.75 cms. respectively.
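The sides of the two cubes can be verified in Python. This sketch assumes the incomes of Rs. 216 and Rs. 3375 and a scale of 1 cm = 4 units; the function name `cube_side` is illustrative. The cube root is rounded to six decimal places because floating-point arithmetic returns a value marginally below 6 for the cube root of 216.

```python
incomes = {"Poor": 216, "Very Rich": 3375}


def cube_side(value, units_per_cm=4):
    """Side of a cube whose volume is proportional to value, in centimetres."""
    cube_root = round(value ** (1 / 3), 6)  # guard against floating-point error
    return cube_root / units_per_cm


for label, income in incomes.items():
    print(f"{label:<10} Rs. {income:>5}  side {cube_side(income)} cm")
```

Because volume grows with the cube of the side, an income more than fifteen times larger needs a side only two and a half times longer.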
Fig. 3.14 Income Levels of Poor and Very Rich People (Rs.): Poor Rs. 216; Very Rich Rs. 3375
Fig. 3.15
Pictograms suffer from a limitation that they present only approximate values. For more accurate presentations bar diagrams are preferable.
2) Prepare a sub-divided bar chart and a pie diagram from the following data.

Expenditure on Books
Academic Year   Economics   Commerce   Maths   Languages   Total
1996 - 97          5200       10000     5000      4800     25000
1997 - 98          8000       14000     7000      6000     35000

3) Explain the following terms:
a) Line graph
b) Bar diagram
c) Sub-divided or component bar diagram
d) Multiple bar diagram
e) Area diagram
f) Volume diagram
4) Fill in the blanks with a suitable word out of those given in brackets:
a) A pie diagram is also called ........................... diagram. (bar, angular, multiple bar)
b) In the case of vertical bars, the variable is measured on the ........................... (X-axis, plane, Y-axis)
c) Bar diagrams, rectangles, squares, circles and pie charts are ........................... forms of presenting data. (geometric, arithmetic, horizontal)
d) By joining the mid-points of the top of each rectangle of a histogram, we get ........................... (an ogive, a frequency curve, a frequency polygon)
e) Graph of "more-than" cumulative frequency distribution is also called "more-than" ........................... (ogive, frequency polygon, frequency curve)
f) The caption of a table labels data presented in the ........................... of a table. (rows, columns, foot-note)

5) Are the following statements true or false? If false, what should be the correct statement?
1) A picture is worth a thousand words.
2) Squares and circles are examples of area diagrams.
3) We can have only vertical bars to present some data having one variable.
4) The graph of an ordinary frequency distribution is called ogive.
5) A time series graph is known as historigram.
6) Histogram is same as bar diagram.
Condensation of data: It is a process of classifjrlng and arranging complex and unorganised mass of data to make them fit for comparison and analysis. Array: An array is an arrangement of data in ascending or descending order. It is also called a simple array. Frequency array: It is an array or series formed by writing various possible values of the variable along with their respective Gequencies. Discrete frequency distribution: A discrete distribution or discrete series is formed where the variable can take only discrete values like 1,2,3,..... Number of children in a family, number of students in a university, etc. are examples of discrete variable. Continuous frequency distribution:A continuous frequency distribution is formed where the variable can take any value between two numbers. For example, height, weight, income and temperature. Inclusive type class interval: A class interval in which all observations lying between and including the class limits are included. Exclusive type class interval: A class interval which includes all observations that are greater than or equal to the lower limit but less than the upper limit.
Open-end class: A class in which one of the limits is not specified. kequency polygon: It is a broken line graph to represent a kequency distribution and can be obtained either from a histogram or directly from the frequency distribution. Frequency curve: It is a smoothened graph of a fiequency distribution obtained fkom fkequency polygon through h e hand tracing in such a way that the area under both of them is approximately the same. Class and class limits:It is a decided group of magnitudes having two ends called class limits or class boundaries. Class range: Also called class interval. It is the difference between two limits of a class. It is equal to upper limit minus lower limit. It is also called class width. Mid-point: Also called mid-value. It is the average value of two class limits. It falls just in the middle of a class. Relative frequency distribution:It is a fiquency distribution where the fiquency of each value is expressed as afiaction or apercentage of the total number of observations. Cumulative frequency distribution: It is obtained by successive totaling of the simple fkequencies of a discrete or continuous frequency distribution. This totaling can be done either h m above (we get "less-than" cumulative fiquency distribution) or from below (we get "more-than" cumulative frequency distribution). Ogivt?:It is the graph of cumulative fkequency. Graph of "less-than" cum~llative fkequencies gives "less-than" ogive and that of '=more-than"gives "more-than" ogive. Tabulation: It is a systematic presentation of data in rows and columns. Caption:It is a part of a table and labels data presented in the column of a table. It is also called box head. It may contain one or more than one column head. Stub: It is a part of a table. It consists of stub head and stub entries. Each stub entry labels a given data placed in the rows of the table. Both stub head and stub entries appear on the left-hand column of a table. They describe the row heads. 
Main body of the table: It is the most important part of the table and contains the numerical information that the title has already indicated. It is also called the field of the table.
Line graph: It is the locus of different points obtained with the combinations of X and Y coordinates measured on the X-axis and Y-axis respectively.
Historigram: The line graph of a time series is called a historigram (for example, steel production since 1950).
Histogram: It is a set of adjacent rectangles presented vertically with areas proportional to the frequencies.
Bar diagram: It is often defined as a set of thick lines corresponding to various values of the variable. It is different from a histogram, where the width of the rectangle is important.
one variable can be presented. A sub-divided bar diagram is used to show various components of a phenomenon.
Pie diagram: It is a circle sub-divided into components to present the proportions of the different constituent parts of a total. It is also called a pie chart.
Area diagrams: These are two dimensional diagrams. Here both the height and the base of the diagram are important. That is why they are known as area diagrams. They can be rectangles, squares or circles.
Volume diagrams: These are three dimensional diagrams. In their construction length, width and height are used. They consist of boxes, cubes, blocks, spheres and cylinders.
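Several of the frequency terms defined above (frequency array, exclusive type class interval, mid-point, relative frequency, "less-than" and "more-than" cumulative frequency) can be illustrated with a short computation. The sketch below is only an illustration in Python. The data sets, the class limits (150, 155, 160, 165, 170) and the helper names `cumulative` and `group_exclusive` are assumptions made for this example and do not come from the Unit.

```python
from collections import Counter

# Hypothetical data: number of children in ten families (a discrete variable).
children = [2, 0, 1, 3, 2, 1, 2, 4, 1, 2]

# Simple array: observations arranged in ascending order.
simple_array = sorted(children)

# Frequency array: each distinct value paired with its frequency.
frequency_array = sorted(Counter(children).items())


def cumulative(frequencies):
    """Return the running totals of a list of frequencies."""
    totals, running = [], 0
    for f in frequencies:
        running += f
        totals.append(running)
    return totals


def group_exclusive(data, lower, width, k):
    """Count observations in k exclusive-type classes of equal width.

    A value x falls in the class [a, b) when a <= x < b, so an
    observation equal to an upper limit goes to the next class.
    """
    counts = [0] * k
    for x in data:
        idx = int((x - lower) // width)
        if 0 <= idx < k:
            counts[idx] += 1
    limits = [(lower + i * width, lower + (i + 1) * width) for i in range(k)]
    return limits, counts


# Hypothetical data: heights in cm (a continuous variable).
heights = [150.0, 152.5, 155.0, 158.2, 160.0,
           161.3, 164.9, 165.0, 168.4, 169.9]

limits, freqs = group_exclusive(heights, lower=150, width=5, k=4)
n = sum(freqs)

# Mid-point: average of the two class limits.
mid_points = [(a + b) / 2 for a, b in limits]

# Relative frequency: each frequency as a fraction of the total.
relative = [f / n for f in freqs]

# "Less-than" cumulative frequencies: totalled from above, read against
# the upper limit of each class.
less_than = cumulative(freqs)

# "More-than" cumulative frequencies: totalled from below, read against
# the lower limit of each class.
more_than = cumulative(freqs[::-1])[::-1]

for (a, b), m, f, r, lt, mt in zip(limits, mid_points, freqs,
                                   relative, less_than, more_than):
    print(f"{a}-{b}  mid={m}  f={f}  rel={r:.2f}  <{b}: {lt}  >={a}: {mt}")
```

Note that the observations 155.0, 160.0 and 165.0 are counted in the class that begins at that value, which is the behaviour the definition of the exclusive type class interval describes.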
Delhi
Mansfield, E., 1991, Statistics for Business and Economics: Methods and Applications, W. W. Norton and Co.
Yule, G. U. and M. G. Kendall, 1991, An Introduction to the Theory of Statistics, Universal Books, Delhi.
2) You may give examples from your surroundings. For the exact meaning of the terms refer to Section 3.3.
3) In the text we have converted the monthly income data in Table 3.2 to a frequency distribution in Table 3.6. From this you can take a clue.
4) Refer to Sub-section 3.3.3
5) Refer to Sub-section 3.3.4(c)
6) Refer to Sub-section 3.3.4(d)
Check Your Progress 2
3) Refer to Table 3.16
4) It can be presented in more than one way. We have given one below. Try another.
Division of Students of XY College

Year            Hostelers
                Male      Female
First Year
Second Year
Third Year
Check Your Progress 3
1) a) See Sub-sections 3.5.1 and 3.5.2
   b) See Sub-sections 3.5.2 and 3.6.1
   c) See Sub-section 3.5.2
   d) See Sub-section 3.5.3
   e) See Sub-sections 3.6.2 and 3.6.3
2) Refer to Sub-sections 3.6.1 and 3.6.3
3) a) See Sub-section 3.5.1
   b) See Sub-section 3.6.1
   c) See Sub-section 3.6.1
   d) See Sub-section 3.6.1
   e) See Sub-section 3.6.2
      See Sub-section 3.6.4
4) a) area b) y-axis c) geometric d) a frequency polygon e) ogive f) columns
5) True: 1, 2, 5 False: 3, 4, 6