ANALYSIS OF DATA
Analysis of data is the most skilled task of all the stages of research. It should be done by
the researcher himself and should not be entrusted to any other person. Proper analysis of data
requires familiarity with the background of the survey and with all its stages. Both
quantitative and qualitative methods can be used. The steps envisaged in the analysis of data
vary depending on the type of the study. Part of the analysis is a matter of working out statistical
distributions, constructing diagrams and calculating simple measures like averages, measures
of dispersion, percentages, correlation etc. The analysis of data also means the verification of
hypotheses. The problems raised by the analysis of data are directly related to the complexity
of the hypothesis or hypotheses.
According to Professor Wilkinson and Bhandarkar, "Analysis of data involves a number of closely
related operations that are performed with the purpose of summarizing the collected data and
organizing these in such a manner that they will yield answers to the research questions, or
suggest hypotheses or questions if no such questions or hypotheses had initiated the study."
Some scholars are of the opinion that the processing of data is subsumed under the analysis of data.
Professor Johan Galtung made a distinction between the analysis of data and the processing of
data. He is of the opinion that processing of data refers to concentrating, recasting and dealing
with data so that they are as amenable to analysis as possible, while analysis of data refers to
seeing the data in the light of the hypotheses or research questions and the prevailing theories,
and drawing conclusions that are as amenable to theory formation as possible.
According to Francis Rummel, "The analysis and interpretation of data involve the objective material in the
possession of the researcher and his subjective reactions and desires to derive from the data the
inherent meanings in their relation to the problem."
The problem should be analyzed in detail to see what data are necessary for its solution and to
ensure that the methods used will provide definite answers. However adequate, valid
and reliable the data may be, they do not serve any worthwhile purpose unless they are carefully
edited, systematically classified and tabulated, scientifically analyzed, intelligently interpreted
and rationally concluded.
Characteristics of Analysis of Data
It is a highly skilled and technical job.
It should be carried out by the researcher himself or under his close supervision.
The researcher should also possess judgment skill and the ability to generalize, and
should be familiar with the background, objectives and hypotheses of the study.
It is through systematic analysis that the important characteristics hidden in the
data are brought out, from which valid generalizations are drawn.
It is only by organizing, analyzing and interpreting the research data that we
can know their important features, inter-relationships and cause-effect relationships.
The data to be analyzed and interpreted should i) be reproducible, ii) be readily
amenable to quantitative treatment and iii) have significance for some systematic
theory, so that they can serve as a basis for broader generalization.
The steps envisaged in the analysis of data will vary depending on the type of study.
A study that starts with a set of clearly formulated hypotheses in effect prescribes
the action to be taken. The more specific the hypotheses, the more specific is the
action, and in such studies the analysis of data is almost completely a
mechanical procedure.
The task of analysis is incomplete without interpretation. In fact, analysis of data and
interpretation of data are complementary. The end product of analysis is the setting up
of certain general conclusions, while interpretation deals with what these
conclusions really mean. Interpretation is the process of establishing relationships
between the variables expressed in the findings and explaining why such relationships
exist.
For any successful study, the task of analysis and interpretation of data should be
designed before the data are actually collected, with the exception of formulative
(exploratory) studies, where the researcher starts with no such design.
Statistics as Statistical Method
According to A.L. Bowley, "Statistics may be called the science of counting." This considers
only one aspect, i.e., the collection of data. At another place he mentioned, "Statistics may rightly
be called the science of averages," because collected data cannot be placed before the
public in their raw shape and must first be summarized. This definition is also unsatisfactory because statistical methods are
employed not only to summarize the data but also to study their other characteristics, such
as variability, skewness, correlation and regression. Bowley himself realized the weakness of
his definitions and pointed out that "statistics cannot be confined to any one science."
M.R. Spiegel defined, "Statistics is concerned with scientific methods for collecting,
organizing, summarizing, presenting and analyzing data as well as drawing valid conclusions
and making reasonable decisions." This definition takes all aspects of a statistical enquiry into
consideration, from the collection of data to their analysis and interpretation.
Seligman defined, "Statistics is the science which deals with the methods of collecting,
classifying, presenting, comparing and interpreting numerical data collected to throw light on
any sphere of enquiry."
Croxton and Cowden defined, "Statistics may be defined as the science of collection, presentation,
analysis and interpretation of numerical data."
Statistics is not merely a device for collecting numerical data but also a means of sound
techniques for their handling and analysis and for drawing valid inferences from them. The data
are collected, edited, classified, tabulated, analyzed and interpreted with the help of various
statistical techniques and tools, depending upon the nature of the investigation. These
statistical methods include measures of central tendency, dispersion, skewness, moments, kurtosis, correlation,
regression, association, analysis of time series, index numbers, interpolation etc. Thus, a basic
knowledge of statistics becomes indispensable for research workers for the systematic analysis
and the accurate and precise interpretation of data.
Types of statistical methods
Statistical methods are the principles employed for the description, analysis and interpretation
of the data. They may be classified into two categories.
Descriptive statistical methods- Those methods which are primarily employed to
describe what has been observed are called descriptive statistical methods. Their
sole purpose is to describe the behaviour of a variable; no attempt is made to
analyze or to interpret the data. The data are collected, organized and presented
either by tables or by diagrams to describe the behaviour of the data.
Analytical statistical methods- Those methods which are employed to analyze
and to interpret what has been observed are called analytical statistical methods.
These two categories of statistical methods are not mutually exclusive;
analytical statistical methods are based upon and make use of descriptive
methods for analyzing and interpreting data.
Use of Statistical Methods
Statistical methods are employed in almost every branch of knowledge. They are valuable
tools in the hands of research workers and are used to cast light upon the behaviour of the
variable or variables under study.
Statistical methods are employed to throw light upon a situation and to probe
the unknown, e.g., the effect of different kinds of fertilizers on crop yield.
Statistical methods enable us to reduce a mass of figures to something of
manageable size that can be easily understood and interpreted; e.g., the salaries of 1,000
workers in a factory cannot all be kept in mind, but their average can.
Statistical methods are employed for comparing two or more series, e.g.,
comparing the wages of labour in two factories.
Statistical methods enable us to draw inferences or conclusions from the data
about the behaviour of the variable or variables under study. By experiment we can
conclude that a particular insecticide is more effective in controlling a crop disease.
Basic Statistical Techniques
While analyzing the data, researchers usually make use of many simple statistical devices.
Therefore, knowledge of the basic statistical techniques used in the statistical analysis of the
phenomena under study is inevitable. In fact, a thorough knowledge of at least the
fundamentals of statistics is an indispensable part of the equipment of the researcher in the
field of social science.
1. Collection, classification and tabulation of data- This helps in the quantification and
objective evaluation of social phenomena. It presents facts in a proper form and reduces the
complexity of the data. The first step in a statistical enquiry is the collection of data
relating to the problem under study. When a mass of data has been gathered, it is
necessary to arrange the material in some sort of concise and logical order. This
procedure is referred to as the classification and tabulation of data, as sketched below.
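As a minimal sketch in Python, a raw mass of observations (the figures here are invented for illustration) can be classified into class intervals and tabulated as a frequency distribution:

    # Classify hypothetical farm sizes (hectares) into class
    # intervals of width 1.0 and tabulate their frequencies.
    from collections import Counter

    observations = [2.1, 4.5, 1.2, 3.8, 2.9, 4.1, 0.8, 3.3, 2.5, 1.9]

    def class_interval(x, width=1.0):
        lower = int(x // width) * width
        return (lower, lower + width)

    frequency = Counter(class_interval(x) for x in observations)
    for (lower, upper), count in sorted(frequency.items()):
        print(f"{lower:.1f} - {upper:.1f} : {count}")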
2. Diagrammatic and graphic representation- Another convincing,
appealing and easily understood method of presenting statistical data is the use of
diagrams and graphs. They also save a lot of time, as very little effort is required to
grasp them and to draw meaningful inferences from them. They highlight the salient
features of the collected data, facilitate comparisons among two or more sets of data
and enable us to study the relationships between them more readily. Graphs reveal the
trend, if any is present in the data, more vividly than tabulated numerical figures.
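For instance, a simple bar diagram can be drawn in a few lines of Python (a sketch assuming the matplotlib library is available; the yield figures are invented):

    # Present hypothetical crop-yield figures as a bar diagram
    # so that comparisons can be grasped at a glance.
    import matplotlib.pyplot as plt

    crops = ["Wheat", "Rice", "Maize", "Pulses"]
    yields = [30.2, 25.6, 28.1, 12.4]  # quintals per hectare (invented)

    plt.bar(crops, yields)
    plt.xlabel("Crop")
    plt.ylabel("Yield (quintals/ha)")
    plt.title("Average yield by crop")
    plt.show()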
3. Averages- Averages hold a very important place in all types of statistical work because
they describe the inherent characteristics of a frequency distribution in a concise
manner and help in the comparative study of different distributions. The different kinds of
averages are as under:
i) Arithmetic mean- the sum of the observations divided by their number.
ii) Median- a measure of central tendency. It is more frequently used where the
influence of extreme items is to be eliminated.
iii) Mode- whenever complete data are not available, the mode is the common form of average
to be used. It can also be estimated graphically from a histogram. It is directly
applicable to a large number of items.
iv) Geometric mean- the geometric mean is more mathematical and complicated than the
mean, median or mode. The geometric mean of n items of a series is the nth root of the
product of their values. It is mostly used in cases where the data have to be subjected to
further mathematical analysis, and is especially suitable where less importance is to be
given to large measurements. In agricultural economics it is used in finding the rate of
growth of population, the compound rate of interest, the rate of depreciation of machinery
and equipment, cost-benefit analysis etc.
v) Harmonic mean- it is the reciprocal of the arithmetic mean of the reciprocals of the
given observations. It is rigidly defined, based on all observations and amenable to
further algebraic treatment. It can be used in phenomena involving time, rate and price.
It is the most suitable average when it is desired to give greater weight to smaller
observations and less weight to the larger ones.
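All five averages can be computed with Python's standard statistics module, as the following sketch shows (the series is invented):

    # The common averages of a small invented series.
    import statistics

    data = [4, 6, 6, 8, 12, 15]

    print("Arithmetic mean:", statistics.mean(data))            # sum / n
    print("Median:", statistics.median(data))                   # middle value
    print("Mode:", statistics.mode(data))                       # most frequent value
    print("Geometric mean:", statistics.geometric_mean(data))   # nth root of product (Python 3.8+)
    print("Harmonic mean:", statistics.harmonic_mean(data))     # n / sum of reciprocals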
4. Index numbers- Index numbers are indicators which reflect the relative change in
the level of a certain phenomenon in any given period (or over a specified period of
time), called the current period, with respect to its value in some fixed period, called the
base period, selected for comparison. The technique of index numbers is used to study
all such problems as are capable of quantitative expression and change with
time. An index number measures only relative change and is also helpful in forecasting
future trends. The cost of living index and the consumer price index are familiar examples.
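As a sketch with invented prices, a simple aggregative price index expresses current-period prices as a percentage of base-period prices:

    # Simple aggregative price index:
    # (sum of current prices / sum of base prices) * 100.
    base_prices = [20.0, 35.0, 12.0, 8.0]      # base period (invented)
    current_prices = [26.0, 42.0, 15.0, 9.0]   # current period (invented)

    index = sum(current_prices) / sum(base_prices) * 100
    print(f"Price index: {index:.1f}")  # 100 means no change from the base period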
5. Variability- Another important aspect of statistics is variability. The mean, mode and
median give only one essential characteristic of a frequency distribution. It is possible
for several distributions to have the same average yet be markedly different in
variability. It is therefore very important to determine the spread of the individual
values on either side of their central tendency. The important measures of absolute
variability are i) the mean deviation and ii) the standard deviation, and for relative
variability we calculate iii) the coefficient of variation.
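These three measures can be sketched in Python as follows (population formulas; the data are invented):

    # Mean deviation, standard deviation and coefficient of variation.
    import statistics

    data = [10, 12, 15, 18, 20]
    mean = statistics.mean(data)

    mean_deviation = sum(abs(x - mean) for x in data) / len(data)
    std_dev = statistics.pstdev(data)    # population standard deviation
    cv = std_dev / mean * 100            # relative variability, in percent

    print(mean_deviation, std_dev, cv)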
6. Skewness- It is a measure that refers to the extent of symmetry or asymmetry in a
distribution. It is used to describe the shape of a distribution.
7. Kurtosis- It is a measure that indicates the degree to which the curve of a frequency
distribution is peaked or flat-topped.
8. Moments- Moments are used to describe the peculiarities of a frequency distribution. Using
moments, one can measure the central tendency of a set of observations, their
scatteredness (i.e., dispersion), and the skewness and kurtosis of the curve.
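These moment measures can be sketched together: the r-th central moment is the mean of the r-th powers of deviations from the mean, from which the usual skewness and kurtosis measures follow (the data are invented):

    # Central moments, and the moment measures of skewness and kurtosis.
    def central_moment(data, r):
        m = sum(data) / len(data)
        return sum((x - m) ** r for x in data) / len(data)

    data = [2, 3, 5, 7, 7, 8, 9]
    m2 = central_moment(data, 2)   # variance (dispersion)
    m3 = central_moment(data, 3)   # basis of skewness
    m4 = central_moment(data, 4)   # basis of kurtosis

    skewness = m3 / m2 ** 1.5      # 0 for a symmetric distribution
    kurtosis = m4 / m2 ** 2        # 3 for a normal (mesokurtic) curve
    print(skewness, kurtosis)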
9. Correlation- It is a statistical technique used for analyzing the behaviour of two or
more variables. It measures the degree and the direction of sympathetic movement in
two or more variables. The correlation coefficient ranges from -1 to +1.
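A sketch of Karl Pearson's coefficient of correlation computed from first principles (the paired data are invented):

    # Pearson's r = cov(x, y) / (sd_x * sd_y), in [-1, +1].
    def pearson_r(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
        var_x = sum((a - mx) ** 2 for a in x) / n
        var_y = sum((b - my) ** 2 for b in y) / n
        return cov / (var_x * var_y) ** 0.5

    rainfall = [50, 60, 70, 80, 90]      # invented
    yield_q = [20, 24, 27, 33, 36]       # invented
    print(pearson_r(rainfall, yield_q))  # close to +1: strong positive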
10. Association of attributes- Where the characteristics under study are qualitative
attributes (such as literacy or employment status) rather than measurable variables,
the techniques of association are used to examine whether two attributes occur
together more or less often than chance alone would suggest.
11. Regression analysis- It means the estimation or prediction of the unknown value of
one variable from the known value of another variable. There are two types of
variables in regression analysis: the variable whose value is influenced or is to be
predicted is called the dependent variable, and the variable which influences the values or
is used for prediction is called the independent variable. Regression analysis for
studying more than two variables at a time is known as multiple regression analysis. In
the words of M.M. Blair, "Regression is the measure of the average relationship between
two or more variables in terms of the original units of the data."
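A minimal least-squares sketch of simple linear regression, predicting the dependent variable from the independent one (the data are invented):

    # Fit y = a + b*x by least squares, with b = cov(x, y) / var(x).
    def fit_line(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        b = (sum((p - mx) * (q - my) for p, q in zip(x, y))
             / sum((p - mx) ** 2 for p in x))
        a = my - b * mx
        return a, b

    fertilizer = [1, 2, 3, 4, 5]       # independent variable (invented)
    yield_q = [12, 15, 19, 22, 26]     # dependent variable (invented)
    a, b = fit_line(fertilizer, yield_q)
    print(f"Predicted yield at x=6: {a + b * 6:.1f}")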
12. Analysis of time series- A time series is an arrangement of statistical data in
accordance with its time of occurrence. If the values of a phenomenon are observed at
different periods of time, the values so obtained will show appreciable variation.
Prices, production and national income are typical time series data. Time series analysis
also helps in analyzing a phenomenon in terms of the effect of various technological,
economic and other factors on its behaviour over time.
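As a sketch, a simple moving average smooths out short-term variations in a time series so that the underlying trend is easier to see (the annual figures are invented):

    # 3-period moving average of an invented production series.
    def moving_average(series, k=3):
        return [sum(series[i:i + k]) / k
                for i in range(len(series) - k + 1)]

    production = [102, 98, 110, 115, 109, 120, 126]  # invented yearly data
    print(moving_average(production))  # smoothed values reveal the trend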
13. Interpolation and extrapolation- Interpolation is a technique of estimating an unknown
figure lying within the range of the known figures, whereas extrapolation means the
estimation of a probable figure beyond that range, for example in the future.
Both are statistical techniques of arriving at unknown facts from known facts.
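A sketch of straight-line interpolation between two known points, which can also be pushed beyond the known range to extrapolate (the population figures are invented):

    # Linear interpolation/extrapolation from two known (x, y) points.
    def linear_estimate(x0, y0, x1, y1, x):
        return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

    # Known: population 50 lakh in 2001 and 62 lakh in 2011 (invented).
    print(linear_estimate(2001, 50, 2011, 62, 2006))  # interpolation -> 56.0
    print(linear_estimate(2001, 50, 2011, 62, 2016))  # extrapolation -> 68.0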
14. Probability- It means the chance of occurrence of a particular event. This method
helps not only in ascertaining the chance of occurrence of an event but also in finding
out the total effect of uncertain events when the consequences of the various occurrences
are known. It is extensively used in the quantitative analysis of economic problems. It is
an essential tool in statistical inference and forms the basis of decision theory. All
statistical laws are based on probability; therefore, statistics is called the science of
estimates and probabilities.
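A small sketch: the chance of at least one occurrence in repeated independent trials follows from the complement rule, which a quick simulation can check (the example and figures are illustrative):

    # P(at least one six in four rolls of a fair die) = 1 - (5/6)**4.
    import random

    exact = 1 - (5 / 6) ** 4
    trials = 100_000
    hits = sum(any(random.randint(1, 6) == 6 for _ in range(4))
               for _ in range(trials))
    print(exact, hits / trials)  # the two values should nearly agree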
Limitations of Statistical Methods
Statistics does not study qualitative phenomena like welfare,
poverty and utility. However, the techniques of statistical analysis can be applied to
qualitative phenomena indirectly by expressing them numerically.
Statistics does not study individuals and is confined only to those
problems where group characteristics are to be studied.
The laws of statistics are probabilistic in nature and are true only on the
average. Hence, the inferences based on them are only approximate and are not
exact like the inferences based on mathematical or scientific laws.
Statistics does not reveal the entire story. It only helps in the
analysis of certain quantitative facts and does not take into consideration all
the relevant facts for interpreting the results.
Statistics deals with figures which do not bear on their face the label
of their quality, and they can easily be distorted, manipulated or moulded by the
persons involved in using statistical methods.