
Module One – Introduction to Statistics

Meaning of Statistics
A.L. Bowley defined statistics as “The science of counting”. Later, he redefined it as
“The science of averages”.

Boddington defined statistics as “The science of estimates and probabilities”.

According to Croxton and Cowden, “Statistics is the science of collection, presentation,
analysis and interpretation of numerical data”.

Prof. Horace Secrist defined statistics as follows:
“By statistics we mean aggregate of facts, affected to a marked extent by multiplicity of
causes, numerically expressed, enumerated or estimated according to reasonable standards of
accuracy, collected in a systematic manner for a pre-determined purpose and placed in
relation to each other”.

Characteristics of statistics
The features or characteristics of statistics are as follows:
1. Statistics means an aggregate of facts.
Statistics are a group of facts. A single fact is not called statistics. A collection of
many facts is called statistics. Facts can be analysed only when there is more than
one fact.
For example – the ages of a group of persons, the heights of students in a class, the prices
of a product over a number of periods, the profits of a group of companies, and the profits
of a company over a number of periods are called statistics.

2. Statistics are affected to a marked extent by multiplicity of causes.
The facts are the result of a large number of internal and external factors.
For example
 The statistics of wheat production are based on rainfall, method of cultivation,
fertility of soil, fertilisers used, quality of seeds, usage of manure, etc.
 The statistics of the price of a product are based on demand, supply, imports,
exports, government policy, etc.

3. Statistics are numerically expressed.
Statistics are numerical statements, i.e., statements expressed in numbers. Facts or
statements not expressed in numbers cannot be called statistics.
For example – “price decreases with increasing production” is not statistics. Qualitative
expressions like good, bad, right, wrong, sad, happy, etc. are not statistics.

4. Statistics are enumerated or estimated according to reasonable standards of accuracy.
The facts should be enumerated (collected) or estimated (computed) with the required
degree of accuracy. A reasonable standard of accuracy must be maintained.
For example – in measuring the height of students, accuracy up to a centimetre is
required.
5. Statistics are collected in a systematic manner.
The data must be collected in a planned and scientific manner. Otherwise, the data
collected may be wrong and lead to wrong results or conclusions.

6. Statistics are collected for a pre-determined purpose.
The data must be collected for a definite purpose, and the purpose must be decided in
advance. Otherwise, the facts become useless and hence cannot be called statistics.

7. Statistics are placed in relation to each other.
The data are collected for the purpose of comparison, and so the data must be
homogeneous (related). When the data are not homogeneous, comparison is not
possible and they cannot be called statistics.

Functions of Statistics
The important functions of statistics are as follows:
1. It presents the data (facts) in a definite form
It presents the data or facts in a simple and definite form. Facts, statements or
results expressed in numbers are more convincing and clear than those expressed
only in qualitative terms.

2. It simplifies the complexity of data
It simplifies huge collections of numerical data so that they are understandable.
Statistical measures such as tables, graphs, diagrams, averages, dispersion,
correlation, regression, index numbers, etc. help to simplify large collections of
numerical data and make them easily understandable.

3. It facilitates comparison
After simplification of the data, statistics facilitates comparison. Statistical
measures like averages, ratios, coefficients, etc. are used for the purpose of
comparison.

4. It reduces the bulk of the data
Statistics reduces a large volume of data to a few figures such as averages, ratios,
percentages, coefficients, etc., which are easily understandable.

5. It helps in studying the relationship between different factors
Statistics enables us to observe and understand relationships between different factors
or facts.
For example – production and price, demand and supply, price and wages,
advertisement and sales, etc.

6. It indicates trends and tendencies
The future is uncertain. Forecasting refers to the process of predicting future events.
Statistics helps in forecasting trends and tendencies based on available information.
For example – forecasting population growth, demand for goods, sales of a product,
prices of shares etc.
7. It helps in the formulation of policies
Statistics helps in the formulation of policies in different fields from the data available.
For example – governments make policies like industrial policy, export–import
policy, taxation policy and monetary policy on the basis of statistical data.
Business organisations make policies in the areas of marketing, finance, human
resources, production, etc. on the basis of statistical data.

8. It increases human knowledge and experience
Statistics helps in increasing the knowledge and experience of human beings.
Statistical procedures increase knowledge, experience, thinking and reasoning
power, and help in reaching rational conclusions.

9. It helps to derive valid conclusions (inferences)
Statistics helps to draw rational and valid conclusions by collecting and analysing facts
in various fields.

10. It helps in the formulation and testing of hypotheses
Statistics is extremely useful in formulating and testing hypotheses and in
developing a theory on the basis of the results.

Limitations of statistics
The following are the important limitations of statistics:
1. Statistics does not deal with qualitative data. It deals only with quantitative data
Statistics can be applied only to quantitative data, i.e., data that can be measured in
numbers, such as price, salary, income, expenses, height, weight, etc.
Statistics cannot be applied to qualitative data, i.e., data that cannot be measured in
numbers, such as honesty, integrity, loyalty, taste, culture, friendship, wisdom, etc.

2. Statistics does not deal with individual facts
Statistics does not deal with an individual, single fact. It deals only with aggregates or
groups of facts. Statistics cannot be applied to or studied for an individual fact.
For example – the income of one employee in a company is not statistics. The
income of 50 employees in a company is statistics.

3. Statistical results or conclusions are not exact
Statistical results or conclusions are not exact. They are not absolutely true. They are
not valid always and under all conditions. Perfectly accurate results cannot be obtained
from statistics. Statistical results are true only on average.
For example – if the per capita income of India is 172000, it does not mean that every
person in India has an income of 172000.

4. Statistics can be misused


The most important limitation of statistics is that it can be misused. The misuse of
statistics is possible for various reasons, such as use by inexperienced and unskilled
persons, lack of proper knowledge, choosing favourable items in the sample,
manipulation of data, and improper collection and interpretation of data. Misuse of
statistics leads to wrong or misleading conclusions or results.
There is a saying that “statistics are like clay of which you can make a god or devil”.
5. Common person cannot handle statistics properly
Statistics can be used properly only by experts who have sufficient knowledge of
statistics. The methods of statistics are not easy to use. Therefore, a common person
cannot handle or use statistics properly.

Conclusion
Statistics has limitations, but it is still useful and helpful in studying various
problems. The only condition is that it must be used by experts with proper care and caution.

Basic concepts in statistics

Units or Individuals
The objects whose characteristics are studied in a statistical survey are called units or
individuals.

Population or Universe
The totality or collection of all units or individuals (objects) under consideration is
called population or Universe
For example – number of students in a particular school or college, number of
companies in a particular region etc.

Finite population
A population which contains a countable number of units is called a finite population.
For example – the number of students in a school or college, the number of text books in a
library, etc.

Infinite population
A population which contains an uncountable number of units is called an infinite population.
For example – the number of stars in the sky, the number of fish in the sea or ocean, etc.

Characteristics
The units or individuals (objects) to be studied have certain characteristics. These can be
quantitative or qualitative characteristics.

Quantitative characteristic
A characteristic which is numerically measurable is called a quantitative characteristic.
For example – marks of a student, attendance of a student, price of a product, height
of a person, weight of a person, etc.

Qualitative characteristic
A characteristic which is not numerically measurable is called qualitative
characteristic.
For example – taste of a fruit, skin colour of a person etc.

Variable
A quantitative characteristic which varies from unit to unit is called variable.
For example – marks, height, weight, price etc.
Attribute
A qualitative characteristic which varies from unit to unit is called attribute.
For example – region, colour, taste etc.

Discrete variable
A variable which assumes only specified values in a given range is called discrete
variable.

Continuous variable
A variable which assumes all the values in the range is called continuous variable.

Data
Data means information (facts and figures) collected from which conclusions are
obtained.

Quantitative data
Data which are expressed in numbers is called quantitative data

Qualitative data
Data which are not expressed in numbers is called qualitative data

Census survey or census enumeration
A study or investigation based on all the units or individuals of the population is
called a census survey.

Sample survey
A study or investigation based on a part of the population is called a sample
survey.

Sample
A sample is the part of the population or universe which is selected for the purpose of
study or investigation.
A sample is a representative part of the population.

Sampling
The process of extracting a sample from a population is called sampling

Primary data
Primary data are fresh data collected directly from the field. It is also called first hand
data.
Primary data are the data collected for the first time directly from the field by the
investigator.

Secondary data
Secondary data are the data which the investigator does not collect directly from the field.
They are data collected by others for some other purpose.
For example
 Journals, newspapers, periodicals
 Websites on the internet etc.
Investigator
Investigator is the person who conducts the statistical enquiry.

Enumerator
Enumerator is the person who collects the information for the investigator.

Respondents (informants)
Respondents are the persons from whom the information is collected.

Series
Series refers to an arrangement of data in a logical or specific order such as size, time
of occurrence or any other characteristics (measurable or non-measurable)

Types of series
There are three types of series
 Individual series
 Discrete series
 Continuous series

Individual series
It is a series of the values of individual units or observations.
The individual series can be arranged in two different ways
 In ascending order
Arranging the data from the smallest value to largest value.

 In descending order
Arranging the data from the largest value to the smallest value.

Discrete series
It is a series which shows the specified values of the variable and the corresponding
frequencies. Values are not repeated in a discrete series.

Continuous series
It is a series which shows the values of the variable grouped into class intervals and the
corresponding frequencies.

Frequency distribution
A systematic presentation of the values of a variable and the corresponding frequency
is called frequency distribution.

Frequency
It refers to the number of times the value of a variable is repeated in the series.

Class frequency
It refers to the number of observations relating to a particular class.

Frequency table
A tabular presentation of frequency distribution is called frequency table.
Discrete frequency distribution
A discrete frequency distribution is a presentation of specific value of a variable and
the corresponding frequency.
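
As a small illustration of building a discrete frequency distribution, the following sketch uses Python's standard library; the marks data are made up for the example:

    from collections import Counter

    # Hypothetical marks of 12 students (a discrete variable)
    marks = [5, 7, 5, 8, 7, 5, 9, 8, 7, 5, 9, 7]

    # Frequency = number of times each value is repeated in the series
    freq_dist = Counter(marks)

    # Print the discrete frequency distribution as a simple frequency table
    for value in sorted(freq_dist):
        print("Marks:", value, " Frequency:", freq_dist[value])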

Class Intervals
In a frequency distribution, if the range is vast, it is divided into sub ranges (groups)
called class intervals

Class
Each sub-range (group) is called a class.

Class limit
The lowest and the highest values taken to define the boundaries of a class are
called its class limits. The boundaries of a class interval are called class limits.

Lower limit
The lowest value of the class is called lower limit.

Upper limit
The highest value of the class is called upper limit.

Width of the class interval


The difference between the lower limit and upper limit of the class interval is called
width of the class interval.

Types of class interval


1. Exclusive class interval
2. Inclusive class interval

1. Exclusive class interval
In an exclusive class interval, the lower limit is included in the class and the upper
limit is excluded from it. The upper limit of one class interval is the same as the
lower limit of the next class interval.
For example: 0 – 10, 10 – 20, 20 – 30, 30 – 40, etc.

2. Inclusive class interval
In an inclusive class interval, both the lower limit and the upper limit are included
in the class. The upper limit of one class interval is not the same as the lower limit
of the next class interval.
For example: 0 – 9, 10 – 19, 20 – 29, 30 – 39, etc.
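
The following is a minimal sketch (Python, with made-up values) of counting observations into exclusive class intervals, where a value equal to an upper limit falls in the next class:

    # Hypothetical observations to be grouped into exclusive class intervals
    values = [3, 9, 10, 17, 20, 25, 31, 38]
    classes = [(0, 10), (10, 20), (20, 30), (30, 40)]

    for lower, upper in classes:
        # Exclusive: lower limit included, upper limit excluded (10 falls in 10-20, not 0-10)
        count = sum(1 for v in values if lower <= v < upper)
        print(lower, "-", upper, ":", count)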

Data
Data is defined as a collection of numbers, words, characters, images and other items that
can be arranged in some manner to form meaningful information.

Types of data
The data is divided into two categories based on the source from which they are obtained
1. Primary data
2. Secondary data

1. Primary Data
Primary data are the data that are collected for the first time by an investigator for a
specific purpose. Primary data are the fresh data that are directly collected from the
field. They are the first hand data

Methods of collecting primary data


a) Direct observation method
b) Personal interview method
c) Telephonic interview
d) Information through correspondents
e) Method of questionnaire
f) Method of schedule ( collection through enumerators)
g) Google form

a) Direct observation method


The investigator gets the data or information by personal observation of the units.
The investigator has to keep observing while collecting data. Most of the surveys
in various scientific, social and economic fields are done by this method.

b) Personal interview method


The investigator personally interviews (face to face) the respondents and gets the
required data or information from them. The investigator prepares a small list of
questions relating to the enquiry, and the information is collected. This method is
common in studying social and economic problems.

c) Telephonic interview
The investigator gets the data or information over the telephone. This method is
quick and gets accurate information.

d) Information through correspondents


The investigator appoints the agents or correspondents at different places. These
agents or correspondents collect required data or information in their area and
hand them over to the investigator. This method is useful when data are to be
collected regularly for a long period of time.

e) Method of questionnaire
A questionnaire is a list of questions which is to be filled in by the informants,
and their answers are the required data or information for the investigation. The
questionnaire is sent to the informants by mail or otherwise. The informants are
required to fill in the questionnaire and send it back to the investigator. Thus
the investigator obtains the required data or information.

f) Method of schedule ( collection through enumerators)


The investigator collects data or information through trained enumerators. The
enumerators contact the informants and, with the help of a schedule, collect the
required data or information. A schedule is a list of items on which the enumerators
have to collect and record information. It is filled in by the enumerators. This
method requires training of the enumerators. The reliability of the data depends
mainly on the training given to them and their integrity. The ten-yearly population
census of India is conducted by this method.
g) Google form
It is an online form which is used for primary data collection. Google Forms is a
survey administration app that is included in the Google Drive office suite and
Google Classroom, along with Google Docs, Google Sheets and Google Slides.

2. Secondary data
Secondary data are the data which the investigator does not collect directly from the
field. They are the data which he borrows from others who have collected them for some
other purpose.

Sources of secondary data


Published sources
a) Reports and publications of central and state government departments
b) Reports and publications of international bodies such as the UNO, IMF, World Bank,
etc.
c) Publications of banks, research institutions, administrative office etc.
d) Journals, Magazines and newspapers
e) Websites of various organizations on the internet.
Unpublished sources
a) Records maintained at government offices, municipal offices, panchayat offices,
b) Records maintained by research institutions, research scholars etc.

Difference between primary data and secondary data

1. Definition
Primary data are those that are collected for the first time. Secondary data are those that
have already been collected by some other person.
2. Originality
Primary data are original because they are collected by the investigator for the first time.
Secondary data are not original because they are collected by some other person for their
own purpose.
3. Nature of data
Primary data are in the form of raw material. Secondary data are in finished form.
4. Reliability and suitability
Primary data are more reliable and suitable because they are collected for a particular
purpose. Secondary data are less reliable because they are collected by some other person
for their own purpose, which may not match the present purpose.
5. Time and money
Collecting primary data is quite expensive because it requires both time and money.
Secondary data are economical because they require less time and money.
6. Precaution and editing
Precaution and editing are not required for primary data because they are collected for a
particular purpose. Precaution and editing are required for secondary data because they
are collected by some other person for their own purpose.
7. Process
Collecting primary data involves an elaborate process. Collecting secondary data does not
involve much process; they are obtained quickly and easily.
8. Sources
The sources of primary data are surveys, experiments, observations, personal interviews,
etc. The sources of secondary data are websites, government publications, journals,
articles, etc.

Classification
Classification is a systematic grouping of units according to their common characteristics.
Each of these groups is called class.

Functions or objectives of classification


1. It reduces the bulk of the data
2. It simplifies the data and makes data more comprehensible.
3. It eliminates unnecessary details
4. It facilitates comparison of characteristics
5. It enables us to understand the information and helps in drawing inferences or
conclusions.
6. It renders the data ready for further statistical analysis.

Types of classification
There are four types of classification. They are
1. Quantitative classification
2. Qualitative classification
3. Spatial classification
4. Temporal classification

1. Quantitative classification
Classification of units on the basis of quantitative characteristics (variable) such as
age, height, weight, income, etc. is quantitative classification.
For example
Weight:          40–50   50–60   60–70   70–80   80–90   90–100
No. of persons:     50     200     260     360      90       40

2. Qualitative classification
Classification of units on the basis of qualitative characteristics (attribute) such as
gender, literacy, colour, taste etc., is qualitative classification.

3. Spatial classification ( Geographical classification )


Classification of units on the basis of locality, country, city, village etc is called
spatial classification.

4. Temporal classification ( chronological classification)


Classification of units on the basis of time is called temporal classification.

Other types of classification


1. Simple or one way classification
2. Manifold classification.

1. Simple or one way classification


Classification of units on the basis of a single characteristic is called simple or one
way classification.

2. Manifold classification
Classification of units on the basis of two or more characteristics is called manifold
classification.

Tabulation
Tabulation is a process of systematic arrangement of data in the rows and columns of a
table. It is a neat form of presentation of classified data.

Objectives of Tabulation
1. To present the data in a simple and understandable manner.
2. To facilitate comparison of the data
3. To give an identity to the data
4. To facilitate quick location of required data.

General format of a table


Table No.
Title of the Table
Headnote:
+--------------+--------------------+-------+
| Stub heading | Caption headings   | Total |
+--------------+--------------------+-------+
| Stub entries | Body of the table  |       |
+--------------+--------------------+-------+
Footnote:                             Source:

1. Table Number
A number should be given for each table when there are large numbers of tables. It is
for identification and future reference.

2. Title
A title should be given to the table. The title should describe the content of the table.
It should be clear and brief.

3. Headnote
It is a brief note that applies to all or a major part of the data in the table.
For example: units of measurement such as in lakhs, in crores, in kilograms, in rupees, in
millions, etc.
4. Stub
It refers to the row headings. It explains what the row represents.

5. Captions.
It refers to the column headings. It explains what the column represents.

6. Body of the table


The body of the table contains numerical data. It is the most important part of a table.

7. Footnote
It is a brief and precise explanation clarifying any point about the table.
For example – abbreviation etc.

8. Source
It indicates the source from where the data is collected.

Diagrams and Graphs


Diagrams
A diagram is a visual form of presentation of statistical data which shows the basic facts
and relationships.

Types of diagram
1. Simple bar diagram
2. Component bar diagram
3. Percentage bar diagram
4. Multiple bar diagram
5. Pictogram
6. Pie diagram or pie chart

1. Simple bar diagram


A simple bar diagram expresses only one variable at a time. It consists of bars of
equal width but varying length. The length (height) of each bar represents the
measurement or magnitude of the variable. All bars stand on the same baseline, the
space between consecutive bars is equal, and the bars are of equal width. Such a
diagram gives a better look to the data and facilitates comparison.
2. Component bar diagram (sub divided bar diagram)
Component bar diagrams are drawn when the data have items whose magnitudes have two or
more components.
Here the items are represented by rectangular bars of equal width and of height
proportional to the magnitude. The bars are then divided so that the sub-divisions in
height represent the components.
3. Percentage bar diagram
Percentage bar diagrams are drawn to represent items whose magnitudes have two or
more components and when comparison of these components as percentages is
required.
 Here the components are expressed as percentages of the corresponding totals.
 The totals are represented by bars of equal width and height equal to hundred
each.
 These bars are divided according to the percentage components.
 The different sub-divisions are shaded properly and an index which describes
the shades is provided.

4. Multiple bar diagram


Multiple bar diagram is used to represent two or more comparable values or variables
at a time. It is simply an extension of simple bar diagram. Here set of rectangular bars
of equal width with height proportional to the values or variables are drawn. The bars
are placed side by side or together adjacent to one another. The diagram is shaded
properly and an index is provided.

5. Pictogram
Pictograms are diagrammatic representation of statistical data using pictures of
resemblance. These are very useful in attracting attention. They are easily understood.

6. Pie chart or pie diagram


A pie chart divides a circle into sectors to represent its components. The area of each
sector is proportional to the magnitude of the corresponding component. This is done by
computing the angle for each sector.
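
As a rough illustration of how the sector angles are computed (a Python sketch; the component figures are hypothetical), each component's share of the total is multiplied by 360 degrees:

    # Hypothetical monthly expenditure components
    components = {"Food": 4000, "Rent": 3000, "Travel": 1000, "Other": 2000}
    total = sum(components.values())

    for name, value in components.items():
        # The angle (and hence the area) of each sector is proportional to the component
        angle = value / total * 360
        print(name, round(angle, 1), "degrees")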

Graph
A graph is a mathematical diagram which shows the relationship between two or more sets of
numbers or measurements.
A graph is a pictorial representation or diagram that represents data or values in an organised
manner.
Types of Graph
1. Histogram
2. Frequency polygon
3. Frequency curve
4. Ogives ( cumulative frequency curve)

1. Histogram
A histogram is drawn for a continuous frequency distribution. A histogram is a set of
adjacent rectangles whose widths correspond to the widths of the class intervals and
whose heights are proportional to the frequencies. Class intervals are taken on the X axis
and frequencies on the Y axis. The graph formed by the series of rectangles adjacent to
one another is the histogram.
2. Frequency polygon
A frequency polygon is a graph that displays the frequencies of data values as points
connected by straight lines. It is obtained by plotting midvalues of class interval (or
midpoints) on the x-axis and their corresponding frequencies on the y-axis.

3. Frequency curve
A frequency curve is a smoothed line graph that displays the frequencies of data values
as points connected by a smooth line. It is obtained by plotting the mid-values of the class
intervals (midpoints) on the x-axis and their corresponding frequencies on the y-axis.

4. Ogives ( cumulative frequency curves)


An ogive is a smooth graph in which cumulative frequencies are plotted against the variable.
There are two types of ogives.
a) Less than ogives ( less than cumulative frequency curves)
b) More than ogives ( more than cumulative frequency curves)
a) Less than ogives
Here the variable is taken along the X axis. The less than cumulative frequencies
are plotted against the respective upper class limits. Then these points are joined by
a smooth curve. The resulting graph is the less than ogive.

b) More than ogives
Here the variable is taken along the X axis. The more than cumulative frequencies
are plotted against the respective lower class limits. Then these points are joined by
a smooth curve. The resulting graph is the more than ogive.
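
A minimal sketch (Python, using a made-up frequency table) of the cumulative frequencies from which the two ogives are plotted:

    # Hypothetical continuous frequency distribution
    class_limits = [(0, 10), (10, 20), (20, 30), (30, 40)]
    frequencies = [5, 12, 8, 5]

    # Less than ogive: cumulative frequency plotted against the upper class limit
    running = 0
    for (lower, upper), f in zip(class_limits, frequencies):
        running += f
        print("Less than", upper, ":", running)

    # More than ogive: cumulative frequency plotted against the lower class limit
    remaining = sum(frequencies)
    for (lower, upper), f in zip(class_limits, frequencies):
        print("More than", lower, ":", remaining)
        remaining -= f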

General rules for constructing diagrams and graph


1. Every diagram should have a suitable title, written above it.
2. Proper scale should be selected
3. It should not be overloaded with more information
4. Suitable shades, colours, crossings should be used to indicate different parts.
5. An index indicating shades, colours, crossing etc. should be shown clearly.
6. It should be complete in all respects
7. It should be simple and self-explanatory.

Importance or Need or Uses or Advantages of Diagrams and Graphs


1. They are attractive and impressive and hence used in advertisement, research, articles,
project report for the presentation of statistical data
2. They give a bird’s eye view of the entire data at a glance.
3. They can be easily understood by common people
4. They facilitate comparison of various characteristics
5. They can be remembered for a long period of time.

Disadvantages of Diagrams and Graph


1. They are visual aids. They cannot be considered as alternatives for numerical data
2. Diagrams and graphs are not as accurate as tabular data. Only tabular data can be used
for further analysis
3. Observers can be misled easily by misrepresentation of diagram and graphs. It is
possible to create wrong impressions using diagrams and graphs.

Difference between Diagram and Graph


1. A diagram can be drawn on plain paper, whereas a graph is drawn on graph paper.
2. A diagram cannot establish a mathematical relationship between two variables, whereas a
graph can establish such a relationship.
3. A diagram is suitable for showing categorical and geographical data, whereas a graph is
suitable for showing time series and frequency distributions.
4. A diagram requires drawing skill, whereas a graph requires mathematical skill.
5. A diagram does not give accurate information, whereas a graph gives accurate information.

Measures of Central Tendency or Measure of Averages

Central Tendency
The property of concentration of the observations around a central value is called central
tendency. The central value around which there is concentration is called measure of central
tendency.

Objectives of Averaging
1. To present the entire data in a single value that describes the characteristics of the
entire data
2. To facilitate comparison of data
3. To facilitate further statistical treatment of data
4. To provide data for decision making

Requisites or characteristics of good average

1. It should be easy to understand


It should not be complex. Even a non-statistical person (layman) should be in a
position to understand it easily

2. It should be simple to calculate


Its computation should not involve too many calculations. At the same time, the average
calculated should be adequate and accurate, and representative of all the values.

3. It should be based on all the observations


An ideal average should take into consideration all the values or observations which
are being studied. Otherwise, the average may show wrong results.
4. It should not be affected by extreme values.
An ideal average should be based on all the values or observations. It should not be
unduly affected by extreme values.

5. It should be rigidly defined.


It should have mathematical formula so that it should produce same answer each time
and by every person. It should not depend on personal bias of the investigator.

6. It should be capable of further statistical treatment


A good average should lead to further statistical calculations and interpretations.

7. It should have sampling stability


An average should not be affected by sampling fluctuations. The average obtained
from different samples of the same mass should not vary too much from one another.

Measures of central tendency


The following are the important measures of central tendency which are commonly used in
practice.
1. Arithmetic mean
2. Median
3. Mode

1. Arithmetic mean
Arithmetic mean is the quotient obtained by dividing the sum of the observations by
the number of observations.

Merits of arithmetic mean


a) It is simple to understand
b) It is easy to compute
c) It is rigidly defined as it is based on mathematical formula
d) It is based on all the observations.
e) It lends itself to subsequent algebraic treatment
f) It acts as centre of gravity balancing the value on either side of it.
g) It is a calculated value and not a positional value.

Demerits of arithmetic mean


a) It is very much affected by extreme values
b) It is not useful in case of open-end classes
c) It may lead to wrong conclusions if the details of the data are not available
d) It is not always reliable

2. Median
Median of a set of values is the middle most value when they are arranged in the
ascending order of magnitude.

Merits of median
a) It is easy to understand and calculate
b) It is useful in the case of open-end classes
c) It is not influenced by the magnitude of extreme deviation from it
d) It is most appropriate average of qualitative data
e) It indicates the value of the middle item in the distribution

Demerits of median
a) It is necessary to arrange the data for calculating median
b) It is a positional average, therefore each and every observation is not considered
c) It may not always be representative of the observations as it ignores the extreme
values
d) It is erratic if the number of items is small
e) It is not capable of further algebraic treatment as it is not based on mathematical
property.

Mode
Mode is the value which has the highest frequency, i.e., the most frequently occurring
value. It is the value which is repeated the maximum number of times.

Merits of mode
a) It is simple to calculate. In most cases, it is located by inspection
b) It is not unduly affected by extreme values.
c) It can be determined even in open-end classes
d) It is used to determine average of qualitative data
e) It can also be determined graphically

Demerits of mode
a) It cannot always be determined.
b) It is not capable of algebraic treatment.
c) It is not based on each and every item of the series
d) It is not rigidly defined. There are several formulae for determining the mode, which
give different answers.
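
The three averages described above can be computed directly; the following is a minimal sketch using Python's standard statistics module on made-up marks:

    import statistics

    # Hypothetical marks of nine students
    marks = [40, 55, 55, 60, 62, 65, 70, 55, 80]

    # Arithmetic mean: sum of the observations divided by their number
    print("Mean:", statistics.mean(marks))
    # Median: middle value after arranging the marks in ascending order
    print("Median:", statistics.median(marks))
    # Mode: the most frequently occurring value
    print("Mode:", statistics.mode(marks))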

Measure of Dispersion or Variation

The measurement of the deviation of values from the average is called a measure of
dispersion or variation.

Objectives of measuring dispersion or variation


1. To determine the dependability of an average
It reveals the extent to which an average is representative of all the values. If the
variation is small, the average is reliable. If the variation is too much, the average is
not reliable.

2. To serve as a basis for control of the variability


It reveals the cause and effect relationship and thus acts as a basis for control. For example,
variation in blood pressure, body temperature, etc. acts as a basis for reasoning and
control.
3. To compare two or more series with regard to their variability
It enables a comparative study of two or more sets of data with regard to their degree of
consistency, uniformity, reliability, etc. A low degree of variation means more
consistency, and a high degree of variation means lack of uniformity.

4. To facilitate the use of other statistical techniques


Measure of dispersion is essential and used for computing other statistical measures
like correlation, regression analysis, testing hypothesis, time series etc.

Characteristics of good or ideal measure of dispersion


1. It should be simple to understand
2. It should be easy to calculate.
3. It should be rigidly defined
4. It should be based on all the values.
5. It should be suitable for further algebraic and arithmetic treatment.
6. It should have sampling stability
7. It should not be unduly affected by extreme values.

Various measures of dispersion


1. Range
2. Quartile deviation
3. Mean deviation
4. Standard deviation
5. Coefficient of variation
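
A minimal sketch (Python, made-up data) of some of these measures; which exact formulas to use (for example, population versus sample standard deviation) is an assumption here, since the text gives no formulas:

    import statistics

    # Hypothetical daily sales figures
    sales = [52, 48, 60, 45, 55, 50, 58]

    range_value = max(sales) - min(sales)       # Range = largest value - smallest value
    sd = statistics.pstdev(sales)               # Population standard deviation
    cv = sd / statistics.mean(sales) * 100      # Coefficient of variation, in per cent

    print("Range:", range_value)
    print("Standard deviation:", round(sd, 2))
    print("Coefficient of variation:", round(cv, 2), "%")
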
Skewness
In a frequency distribution, the spread of the values may or may not be symmetrical
around the centre. If the values are not distributed symmetrically around the centre, the
distribution is said to be skewed.
Skewness means asymmetry, non-symmetry or lack of symmetry (balance of equal spread)
in the shape of the frequency distribution. A frequency distribution is asymmetrical when
its left side and right side are not mirror images.
Skewness is the study of the concentration of frequencies in a frequency distribution.
It measures how the values are spread around the average, i.e., whether most of the values
lie around the average, below the average, or above the average.

Symmetrical distribution
In a symmetrical distribution the values of mean, median and mode coincide. The spread of
the frequencies (observations) is the same on both sides of the centre point of the curve.

Asymmetrical distribution.
A distribution which is not symmetrical is called asymmetrical or skewed distribution. Such a
distribution can be positively skewed distribution or negatively skewed distribution.

(Figures: positively skewed distribution and negatively skewed distribution)


Types of skewness
1. Zero skewness
2. Positive skewness
3. Negative skewness

1. Zero skewness
It means the majority of the data points or values are concentrated around the average. Its
left and right sides are mirror images. The frequency increases slowly and in the same
proportion and, after reaching the highest point, decreases slowly in the same proportion.
It is a perfect bell-shaped curve. The values of the mean, median and mode are equal.
Mean = Median = Mode

2. Positive skewness
It means the majority of the data points or values are concentrated on the left side of the
distribution, i.e., below the mean. The frequency increases quickly and, after reaching the
highest point, decreases slowly. The distribution has a longer tail on the right side. The
mean is the highest, the mode is the lowest, and the median lies between the mean and
the mode.
Mean > Median > Mode
3. Negative skewness
It means the majority of the data points or values are concentrated on the right side of the
distribution, i.e., above the mean. The frequency increases slowly and, after reaching the
highest point, decreases quickly. The distribution has a longer tail on the left side. The
mode is the highest, the mean is the lowest, and the median lies between the mean and
the mode.

Mean < Median < Mode
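
A small illustration (Python standard library; the data are made up) of checking the direction of skewness by comparing the mean, median and mode as described above:

    import statistics

    # Hypothetical data with a long tail on the right (positively skewed)
    values = [2, 3, 3, 3, 4, 4, 5, 6, 9, 15]

    mean = statistics.mean(values)      # 5.4
    median = statistics.median(values)  # 4.0
    mode = statistics.mode(values)      # 3

    # For a positively skewed distribution: Mean > Median > Mode
    print(mean, median, mode)
    print("Positively skewed:", mean > median > mode)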

Kurtosis
Kurtosis is a statistical measure that describes the shape of a frequency distribution. It
provides information about the tails and the peak of the frequency distribution compared
to the normal distribution. The tails are the values at the extremes, and the peak is formed
by the values around the average.
Kurtosis refers to the extent of the presence of extreme values (tails) in the frequency
distribution. It tells us how the data points or values are spread around the tails of the
frequency distribution. Kurtosis is a measure of whether the data are heavy-tailed or
light-tailed.

Types of kurtosis
1. Leptokurtic (K > 3)
2. Platykurtic (K < 3)
3. Mesokurtic (K = 3)

1. Leptokurtic (K > 3)
A leptokurtic distribution has longer tails and a higher peak than a mesokurtic (normal)
distribution. More values are located at the tails (extremes) and fewer values around the
mean (average). The tails are long and the peak is high.

2. Platykurtic (K < 3)
A platykurtic distribution has shorter tails and a flatter peak than a mesokurtic (normal)
distribution. Very few values are located at the tails (extremes) and more values around
the mean (average). The tails are short and the peak is flat.

3. Mesokurtic (K = 3)
A mesokurtic distribution lies between the leptokurtic and platykurtic distributions; its
tails and peak are moderate. The distribution is symmetrical, so both extreme ends are
similar. It is the same as the normal distribution.
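
A minimal sketch (Python; it assumes numpy and scipy are available, and the samples are randomly generated for illustration) of computing the kurtosis value K that the classification above compares with 3:

    import numpy as np
    from scipy.stats import kurtosis

    rng = np.random.default_rng(0)
    normal_sample = rng.normal(size=10000)            # roughly mesokurtic
    heavy_tailed = rng.standard_t(df=3, size=10000)   # leptokurtic (heavier tails)

    # fisher=False gives Pearson's kurtosis, for which the normal distribution has K = 3
    print("Normal sample K:", round(kurtosis(normal_sample, fisher=False), 2))
    print("Heavy-tailed sample K:", round(kurtosis(heavy_tailed, fisher=False), 2))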

Role of statistics in Managerial Decision Making

1. Data analysis and interpretation


Statistics helps managers in summarizing, organizing and interpreting numerical
data. Descriptive statistics like the mean, median, mode, quartile deviation, mean
deviation, standard deviation, variation and correlation help in summarizing data.
Inferential statistics like hypothesis testing and regression help managers to draw
conclusions and make predictions based on the data.

2. Quantitative decision making


Statistics helps managers to take decisions under uncertainty and risk. Statistics
provides tools such as probability theory and decision theory, which help to quantify
uncertainty and risk and to assess different outcomes. This enables managers to take
decisions based on probabilities.

3. Risk Management
Statistics helps managers to assess and manage risks and take business decisions.
Statistics provides tools such as risk analysis, Monte Carlo simulation, sensitivity
analysis and decision trees, which help managers to evaluate various risks, simulate
various scenarios and take decisions that minimize risks and maximize opportunities.
4. Forecasting and prediction
Statistics helps managers to forecast future trends and outcomes based on historical
data. Statistics provides tools such as time series analysis, regression analysis and
forecasting models, which help in predicting future trends, demand patterns and business
conditions. This information is essential for planning, budgeting, resource allocation and
setting realistic goals.

5. Performance measurement and evaluation


Statistics helps managers to monitor and evaluate performance. Statistics provides tools
such as variance analysis, control charts and statistical process control, which help
managers to assess whether actual performance meets targets or benchmarks. By
comparing actual performance with targets or benchmarks, managers can identify areas
of improvement and implement corrective actions.

6. Quality control and process improvement


Statistics helps managers in quality control and process improvement in
manufacturing and service industries. Statistics provides tools such as Six Sigma,
control charts and statistical process control, which help managers to monitor the
production process, identify variations or defects and implement corrective measures to
improve product quality and operational efficiency. It therefore helps managers in
reducing defects, optimizing processes and ensuring consistent quality.

7. Market research and customer analysis


Statistics helps managers in understanding consumer behaviour, preferences and market
trends. Statistics provides tools such as surveys, sampling methods, regression
analysis, cluster analysis, factor analysis and conjoint analysis, which help managers in
understanding consumer behaviour, preferences and market trends. This information
helps managers to take decisions relating to product development, pricing strategies,
marketing strategies and customer service enhancements.

8. Resource allocation and optimization


Statistics helps managers in optimizing resource allocation. Statistics provides tools such
as linear programming, queuing theory and inventory management models, which help
managers in allocating resources such as manpower, capital, materials, etc. efficiently so
as to maximize productivity and minimize costs.

9. Strategic planning
Statistics helps managers in strategic planning initiatives. Statistics provides details
relating to market conditions, competitive dynamics and industry trends. Statistical
analysis helps managers to identify opportunities and market potential and to formulate
strategies to achieve the goals and objectives of the organization.

10. Decision support system


Statistics helps managers through decision support systems (DSS). Statistical models,
algorithms and data visualization are integrated into decision support systems, which
provide managers with quantitative data and analysis. A DSS helps in evaluating
alternatives, assessing risks and identifying optimal solutions based on quantitative data
and analysis.
11. Benchmarking and comparison
Statistics helps managers in benchmarking by comparing performance metrics with
industry standards or competitors. This comparative analysis helps managers in
setting realistic goals and in identifying strengths, weaknesses and opportunities for
improvement.

12. Communication and presentation


Statistics helps managers in communicating data insights and findings through
diagrams, graphs and dashboards. Clear and concise presentation of statistical
findings helps managers communicate information to stakeholders and facilitates
consensus building and decision making.

Explain various statistical tools and techniques in managerial decision making

1. Descriptive statistics
a) Descriptive statistics summarize and describe the features of a data set.
b) Measures of central tendency such as the mean, median and mode help the manager
to understand the average characteristics of a group or the most typical values.
c) Measures of dispersion such as the range, quartile deviation, mean deviation, standard
deviation and coefficient of variation show the variability in the data. They help the
manager to assess the consistency and reliability of processes or outcomes.
d) Managers use descriptive statistics such as mean, median, mode, mean deviation,
standard deviation, coefficient of variation to understand historical performance,
assess current situations and identify trends or patterns.

2. Diagrams and Graphs


a) Diagrams and graphs are used in statistics to show various aspects of data, trends
and relationships and to communicate data effectively.
b) Managers use diagrams and graphs like the simple bar diagram, multiple bar diagram,
pie chart, line graph, histogram, etc. to show various aspects of data, trends and
relationships.
c) Presentation of data through diagrams and graphs helps in communicating findings
to stakeholders, executives and team members. This promotes clarity and consensus
and supports proper decisions.

3. Tabulation
a) Tabulation is a process of systematic arrangement of data in the rows and columns
of a table. It is a neat form of presentation of classified data.
b) Tabulation reduces large volumes of data into a concise and structured form. This
enables managers to grasp the information quickly and efficiently.
c) Managers can compare different variables and trends with the help of tabulated data.
This comparative analysis helps managers to understand the data effectively
and take decisions.
d) Presentation of data through tables helps in communicating findings to stakeholders,
executives and team members. This promotes clarity and consensus and supports
proper decisions.

4. Correlation analysis
a) Correlation refers to the relationship or association between two or more
variables.
b) Correlation allows managers to identify relationships between different factors or
variables that may influence business outcome.
c) Correlation allows managers to make predictions about future outcomes based on
historical data by examining the strength and direction of correlation between
variables
d) Correlation helps managers to allocate resources more efficiently.

5. Regression analysis
a) Regression analysis examines the relationship between a dependent variable and one
or more independent variables and makes predictions based on these relationships. It
includes linear regression and multiple regression.
b) Linear regression establishes the relationship between one independent variable
and the dependent variable and predicts the dependent variable from changes in the
independent variable. For example, predicting sales based on advertising spend (a
sketch follows this list).
c) Multiple regression establishes the relationship between several independent
variables or factors and their impact on the dependent variable. It helps managers in
understanding the combined effect of various independent variables or factors on
the dependent variable or outcome.
d) Managers use regression analysis to understand the impact of independent
variables on the dependent variable. It helps in forecasting future trends, optimizing
resource allocation and decision making.
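
A minimal linear regression sketch (Python with numpy; the advertising and sales figures are made up) for the example of predicting sales from advertising spend:

    import numpy as np

    # Hypothetical advertising spend (independent variable) and sales (dependent variable)
    advertising = np.array([10, 15, 20, 25, 30], dtype=float)
    sales = np.array([25, 32, 41, 48, 55], dtype=float)

    # Fit a straight line: sales = slope * advertising + intercept
    slope, intercept = np.polyfit(advertising, sales, 1)
    print("slope:", round(slope, 2), " intercept:", round(intercept, 2))

    # Predict sales for a new (hypothetical) advertising spend of 35
    predicted = slope * 35 + intercept
    print("Predicted sales at advertising = 35:", round(predicted, 1))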

6. Time series analysis


a) Time series analysis studies data or values collected over time to identify trends,
seasonal variation, cyclical variations and irregular variations.
b) Managers use time series analysis for forecasting and for taking decisions in resource
allocation, financial planning, workforce planning, inventory management, etc.

7. Quality control tools


a) Quality control techniques like control charts, Six Sigma, process capability
analysis and statistical process control monitor and control processes to ensure
consistent product or service quality.
b) Managers utilize control charts, process capability analysis and statistical
process control to identify deviations from quality standards, identify the causes of
defects and implement corrective actions. This helps in improving operational
efficiency, reducing costs, minimizing defects and maintaining quality standards.

8. Decision analysis
a) Decision analysis supports decision-making processes under uncertainty
and risk by using techniques like decision trees, sensitivity analysis and scenario
analysis.
b) Managers use various techniques like decision trees, Monte Carlo simulations,
sensitivity analysis and scenario analysis in evaluating alternative courses of action,
assessing risks and choosing the optimal strategy or best course of action that
maximizes profitability.

9. Inferential Statistics
a) Inferential statistics involve making inferences or conclusions about a population
based on sample data, using techniques like hypothesis testing and confidence
intervals.
b) Hypothesis testing consists of a null hypothesis and an alternative hypothesis, which
are assumptions or statements about the characteristics of the population.
c) There are various tests of hypotheses, like the Z-test, t-test, chi-square test,
ANOVA, etc., chosen depending on the nature of the data and the hypothesis; the
result is interpreted either to accept or to reject the null hypothesis.
d) A confidence interval or level is specified. It is the range of values within which a
population parameter is estimated to lie at a certain level of confidence, usually at
the 95% or 99% confidence level.
e) A significance level (α) is specified. It is the probability of rejecting the null hypothesis
when it is actually true. It is usually set at the 5% or 1% significance level.
f) Managers use inferential statistics to draw conclusions about a population based
on sample data, make decisions with a stated level of certainty and lead the organization
to success (a sketch of a simple hypothesis test follows this list).
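
A minimal sketch of a simple hypothesis test (Python; it assumes scipy is installed and uses made-up sample data) comparing the p-value with a 5% significance level:

    from scipy import stats

    # Hypothetical sample of delivery times in days; null hypothesis: population mean = 5
    sample = [5.2, 4.8, 5.6, 5.9, 5.1, 5.4, 6.0, 5.3]

    t_stat, p_value = stats.ttest_1samp(sample, popmean=5)
    alpha = 0.05  # 5% significance level

    print("t statistic:", round(t_stat, 3), " p-value:", round(p_value, 3))
    if p_value < alpha:
        print("Reject the null hypothesis")
    else:
        print("Fail to reject the null hypothesis")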

10. Optimization techniques


Optimization techniques such as linear programming, integer programming, network
models, etc. are used to find the best solution from a set of alternatives. Optimization
techniques help managers allocate resources efficiently and improve decision making,
leading to maximization of profit or minimization of cost.
