MODULE I
INTRODUCTION TO STATISTICS
Statistics – column of figures, zig-zag
graphs or tables
Statistics – is a body of methods
of obtaining and analysing data in order
to base decisions on them
Statistics – refers either to quantitative
information or to a method of dealing
with quantitative information.
Statistics – is employed as a tool
The word statistics is derived from the Latin
word ‘Status’, Italian ‘Stato’ and German word
‘Statistik’ meaning Political state.
Origin Of Statistics – Governmental Records,
Mathematics.
Governmental Records
◦ Data was collected by agents of government for
governmental purpose
◦ In ancient Egypt police prepared registration
lists of all heads of families.
◦ Roman census was conducted in 435 B.C
◦ In India Kautilya’s Arthashastra contains
statistics of Agriculture, medicine etc.
◦ ‘Ain – e – Akbari’ gives an account of statistics
relating to population, production etc. during
Akbar’s rule.
Statistics was described as Science of Kings
Mathematics
Mathematical theory of probability forms
the base for Statistical methods.
The famous mathematician De Moivre
discovered the normal curve which forms an
important part of the modern statistical
theory.
The great mathematician Quetelet
discovered the fundamental principle ‘the
constancy of great numbers’ which is
fundamental to sampling
The classified facts relating to the condition of
the people in a state, especially those facts
which can be stated in numbers or in tables
of numbers or in any tabular or classified
arrangement – Webster
By Statistics we mean quantitative data
affected to a marked extent by multiplicity of
causes – Yule and Kendall
Aggregate of facts affected to a marked
extent by multiplicity of causes, numerically
expressed, enumerated or estimated
according to reasonable standards of
accuracy, collected in a systematic manner
for a predetermined purpose and placed in
relation to each other – Prof. Horace
Secrist
Statistics may be called the science of
counting – Prof. A. L. Bowley
Statistics may be defined as the science of
collection, presentation, analysis and
interpretation of numerical data
- Croxton and Cowden
Science – Systematised body of knowledge.
Art – It is an action. Application of given
methods to obtain facts, derive results and
finally to use them for devising action.
It presents facts in a definite form
It simplifies mass of figures.
It facilitates comparison.
It helps in formulating and testing
hypothesis.
It helps in prediction.
It helps in the formulation of suitable
policies.
Statistics and State
Statistics in Business and Management
- production, finance, banking, personnel
Statistics and Economics
- Planning, measures of GNP
Statistics and Physical Science
- Astronomy, physics, Geology
Statistics and Natural Sciences
- Biology, Medicine, Zoology, Botany
Statistics and Research
Statistics and other uses
- insurance, politicians
Statistical theory and electronic computers
reinforce each other
Computers can process large amounts of
data quickly and accurately
Some of the Statistical packages
- MEDCALC
- SPSS
- MINITAB
- R-Software
Statistics does not deal with isolated
measurement
Statistics deals with only quantitative
characteristics
Statistical results are true only on an average
Statistics is only a means
Statistics can be misused
Numerical information about a variable
generally found almost everywhere in
business, industries, economics and many
other areas.
Data can be obtained by
Primary source
Secondary Source
Secondary Source
◦ Data that has already been collected by others and
is then used by an investigator is called
secondary data.
◦ It can be obtained from journals, reports,
Government publications, publications of research
organisations etc.
◦ Secondary data should be examined for
Suitability
Adequacy
reliability
Primary data
Measurements recorded as a part of original study.
Collection of original data is limited by time, money
and manpower available.
Two methods of obtaining primary data
Questioning
Personal interview, mail and telephone
Observation
Types of classification
Geographical (area-wise: cities, districts etc.)
e.g. production of sugarcane, wheat etc. for various states
Chronological (on the basis of time)
e.g. sales figures of a company for various years
Qualitative (according to some attribute)
Based on attributes like blindness, sex, colour of hair
etc.
e.g. how many persons are blind in a given population
Quantitative (in terms of magnitudes)
According to characteristics that can be measured,
like height, weight etc.
e.g. number of workers of a factory according to wages (Rs)
Class limits
Class intervals
Class frequency
Class midpoint
Class width
Exclusive method
Inclusive method
Open end classes
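The class terms listed above can be illustrated with a short Python sketch. The helper names and the example classes are hypothetical and chosen only for illustration. The sketch assumes the usual conventions: under the exclusive method the upper limit of a class belongs to the next class, and under the inclusive method both limits belong to the class.

```python
# Hypothetical helpers; the classes below are illustrative only.

def midpoint(lower, upper):
    """Class midpoint: average of the lower and upper class limits."""
    return (lower + upper) / 2

def in_exclusive_class(x, lower, upper):
    """Exclusive method: the upper limit is counted in the next class."""
    return lower <= x < upper

def in_inclusive_class(x, lower, upper):
    """Inclusive method: both class limits are counted in the class."""
    return lower <= x <= upper

exclusive_classes = [(10, 20), (20, 30), (30, 40)]   # 10-20, 20-30, 30-40
inclusive_classes = [(10, 19), (20, 29), (30, 39)]   # 10-19, 20-29, 30-39

value = 20
excl = [c for c in exclusive_classes if in_exclusive_class(value, *c)]
incl = [c for c in inclusive_classes if in_inclusive_class(value, *c)]

print(excl)              # class holding 20 under the exclusive method
print(incl)              # class holding 20 under the inclusive method
print(midpoint(10, 20))  # midpoint of the class 10-20
print(20 - 10)           # class width of the exclusive class 10-20
```

With these conventions the value 20 is placed in 20-30 under the exclusive method and in 20-29 under the inclusive method.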
A frequency distribution can be represented
graphically in four ways
Histogram
Frequency Polygon
Smoothed frequency curves
Cumulative Frequency Curves or Ogives
Histogram
Graphical method of presenting frequency distribution
Variable ( class intervals) is always taken on the X- axis
and their frequencies on Y-axis.
Series of rectangles each having a class interval distance
as its width and frequency distance as its height.
Histogram is two dimensional whereas bar diagram is
one dimensional
Cannot be constructed for distribution of open end
classes.
If the distribution has unequal class intervals, suitable
adjustments are to be made on frequencies before
constructing histogram
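The notes do not say which adjustment to use for unequal class intervals. A minimal sketch, assuming the common approach of scaling each frequency by (smallest class width / class width) so that rectangle areas stay proportional to the frequencies, is given below. The class limits and frequencies are illustrative values, not data from these notes.

```python
# Illustrative (lower limit, upper limit, frequency) triples with unequal widths.
classes = [(10, 20, 12), (20, 30, 10), (30, 50, 14), (50, 80, 7), (80, 100, 8)]

# Smallest class width is used as the reference width (assumed convention).
min_width = min(upper - lower for lower, upper, _ in classes)

# Adjusted frequency = frequency * (smallest width / class width).
adjusted = [
    (lower, upper, freq * min_width / (upper - lower))
    for lower, upper, freq in classes
]

for lower, upper, height in adjusted:
    print(f"{lower}-{upper}: rectangle height {height:.2f}")
```

The adjusted values are the rectangle heights to plot on the Y-axis; classes that already have the smallest width keep their original frequency.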
Frequency Polygon
Graphical method of presenting frequency
distribution
A histogram is drawn first.
The midpoints of the top of each rectangle are
joined to obtain the frequency polygon.
Both ends of the polygon are closed by extending it to
the x-axis, where the frequency is zero.
Frequency polygons of several distributions can be
plotted on the same axes, making comparison
possible.
Value of mode can be obtained.
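The construction of the polygon can be sketched in Python as a list of points to plot. The function name and the data are hypothetical. The sketch assumes equal class widths and closes the polygon at the midpoints of the empty classes just before the first class and just after the last class.

```python
def polygon_points(lower_limits, width, freqs):
    """Return the (x, y) vertices of a frequency polygon.

    Assumes equal class widths. The polygon is closed on the x-axis at the
    midpoints of the neighbouring empty classes (frequency zero).
    """
    mids = [lower + width / 2 for lower in lower_limits]
    first = (mids[0] - width, 0)   # closing point before the first class
    last = (mids[-1] + width, 0)   # closing point after the last class
    return [first] + list(zip(mids, freqs)) + [last]

# Illustrative classes 10-20, 20-30, 30-40, 40-50 with made-up frequencies.
points = polygon_points([10, 20, 30, 40], 10, [3, 7, 5, 2])
print(points)
```

Joining these points in order with straight lines gives the polygon; the highest vertex indicates the modal class.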
Smoothed frequency curve
It is a freehand curve drawn in such a manner that
area under the curve is approximately same as that
of polygon.
The curve should look as regular as possible and all
sudden turns should be avoided.
Polygon is first drawn and then smoothing is
done.
Curve should begin and end at the x-axis.
It should be extended to the midpoints of class
intervals outside the histogram
Cumulative frequency curves or Ogives
Graph of a cumulative frequency distribution.
Two methods: the “less than” cumulative frequency
curve (less than ogive) and the “more than”
cumulative frequency curve (more than ogive).
In the less than ogive, the upper limits of the class
intervals are taken on the x-axis and the less than
cumulative frequencies on the y-axis; plotting these
points gives a rising curve.
In the more than ogive, the lower limits of the class
intervals are taken on the x-axis and the more than
cumulative frequencies on the y-axis; plotting these
points gives a declining curve.
Ogives are used to obtain proportion of cases
above or below certain value.
Ogives are also used to obtain median and other
partition values like quartiles, deciles etc.
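A minimal Python sketch of both ogives and of reading the median from the less than ogive is given below. The class boundaries and frequencies are illustrative, and the function name is hypothetical. The median is located by linear interpolation inside the class where the cumulative frequency reaches N/2, which is what reading the value off a straight-line ogive amounts to.

```python
from itertools import accumulate

# Illustrative classes 0-10, 10-20, 20-30, 30-40 with made-up frequencies.
boundaries = [0, 10, 20, 30, 40]
freqs = [4, 6, 10, 5]
total = sum(freqs)

# Less than ogive: cumulative frequency plotted against upper limits.
less_than = list(accumulate(freqs))
# More than ogive: cumulative frequency plotted against lower limits.
more_than = [total - c for c in [0] + less_than[:-1]]

def median_from_ogive(boundaries, freqs):
    """Interpolate the value at which the cumulative frequency reaches N/2."""
    half = sum(freqs) / 2
    cum_before = 0
    for lower, upper, f in zip(boundaries, boundaries[1:], freqs):
        if f > 0 and cum_before + f >= half:
            return lower + (half - cum_before) / f * (upper - lower)
        cum_before += f
    raise ValueError("empty distribution")

print(list(zip(boundaries[1:], less_than)))   # points of the less than ogive
print(list(zip(boundaries[:-1], more_than)))  # points of the more than ogive
print(median_from_ogive(boundaries, freqs))   # median of the illustrative data
```

Quartiles and deciles can be read in the same way by replacing N/2 with N/4, 3N/4, N/10 and so on.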
1. The profits (in lakhs of rupees) of 30 companies for the year
2005-2006 are given below
20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67, 18,
16, 23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29,
58, 65
Classify the above data taking a suitable class interval
Draw frequency histogram, frequency polygon, Ogive
curves.
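One possible way to start this exercise is sketched below in Python. The class width of 10 starting at 10 (classes 10-20, ..., 60-70) and the exclusive method are assumptions made for illustration; other suitable class intervals are equally valid.

```python
# Profits (in lakhs of rupees) of the 30 companies, as listed in the exercise.
profits = [20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67, 18,
           16, 23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29,
           58, 65]

# Assumed choice of classes: 10-20, 20-30, ..., 60-70 (exclusive method).
start, width, n_classes = 10, 10, 6

freqs = [0] * n_classes
for p in profits:
    index = (p - start) // width   # e.g. 20 falls in the class 20-30
    freqs[index] += 1

for i, f in enumerate(freqs):
    lower = start + i * width
    print(f"{lower}-{lower + width}: {f}")
```

The resulting class frequencies are the heights of the histogram rectangles, and their running totals give the points of the less than ogive.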
2. Form a frequency distribution taking a suitable class interval
for the following data giving the age of 52 employees in a
government agency
67, 34, 36, 48, 49, 31, 61, 34, 43, 45,
38, 32, 27, 61, 29, 47, 36, 50, 46, 30,
46, 32, 30, 33, 45, 49, 48, 41, 53, 36,
37, 37, 47, 30, 46, 50, 28, 35, 38, 36,
46, 43, 34, 62, 69, 50, 28, 44, 43, 60,
39, 35
Also Draw frequency Histogram, Frequency
Polygon, less than ogive, more than ogive
curves
Capital Range (Rs. Lakhs)   Upto 10   10-20   20-30   30-50   50-80   80-100   Above 100
No. of companies            10        12      10      14      7       8        5
Prepare frequency distribution and draw less
than ogive
Marks No. of students
Below 10 1
Below 20 8
Below 30 35
Below 40 46
Below 50 50
Prepare a frequency distribution and draw
more than ogive
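The first step of this exercise, recovering class frequencies from the "below" (less than) cumulative table, can be sketched in Python as follows. The sketch assumes classes of width 10 with the lowest class starting at 0, which the table suggests but does not state.

```python
# "Below" marks and cumulative number of students, from the table above.
upper_limits = [10, 20, 30, 40, 50]
less_than_cum = [1, 8, 35, 46, 50]

# Class frequency = difference between successive cumulative frequencies.
freqs = [less_than_cum[0]] + [
    b - a for a, b in zip(less_than_cum, less_than_cum[1:])
]

# Assumption: the lowest class starts at 0, giving classes 0-10, ..., 40-50.
lower_limits = [0] + upper_limits[:-1]

# More than cumulative frequency at each lower limit.
total = less_than_cum[-1]
more_than_cum = [total - c for c in [0] + less_than_cum[:-1]]

for lower, upper, f, m in zip(lower_limits, upper_limits, freqs, more_than_cum):
    print(f"{lower}-{upper}: frequency {f}, more than {lower}: {m}")
```

Plotting the more than cumulative frequencies against the lower limits gives the declining curve asked for.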
Income more than (Rs)   No. of persons
500 100
1000 96
1500 92
2000 59
2500 28
3000 2