MAR GREGORIOS COLLEGE
OF ARTS & SCIENCE
Block No.8, College Road, Mogappair West, Chennai — 37
Affiliated to the University of Madras
Approved by the Government of Tamil Nadu
An ISO 9001:2015 Certified Institution
DEPARTMENT OF COMMERCE
SUBJECT NAME: BUSINESS STATISTICS
SUBJECT CODE: CDZ3A
SEMESTER: III
PREPARED BY: PROF. B. HARISWARAN
> To facilitate the understanding of the relevance and need of Statistics in the current
scenario.
> To customize the importance of Business Statistics for Commerce students.
UNIT-I Introduction
Meaning and Definition of Statistics - Collection and Tabulation of Statistical Data - Presentation of
Statistical Data - Graphs and Diagrams
UNIT-II
Measures of Central Tendency - Arithmetic Mean, Median, Mode, Harmonic Mean and Geometric
Mean. Measures of Variation - Standard Deviation - Mean Deviation - Quartile Deviation - Skewness
and Kurtosis - Lorenz Curve
UNIT-III
Simple Correlation - Scatter Diagram - Karl Pearson's Correlation - Spearman's Rank Correlation -
Regression Meaning - Linear Regression.
UNIT-IV Time Series
Analysis of Time Series - Causes of variation in Time Series Data - Components of Time Series;
Additive and multiplicative models - Determination of Trend by Semi average, Moving average and
Least squares (Linear, Second degree and Exponential) Methods - Computation of Seasonal indices by
Simple average, Ratio-to-moving average, Ratio-to-Trend and Link relative methods
UNIT-V Index Numbers
Meaning and Types of Index numbers - Problems in Construction of Index numbers - Methods of
Construction of Price and Quantity indices - Tests of adequacy - Errors in Index numbers - Chain
Base Index numbers - Base shifting - splicing - deflating - Consumer Price index and its uses - Statistical
Quality Control
INTRODUCTION
The word 'Statistics' is derived from the Latin term 'Status', the Italian term 'Statista', the German
term 'Statistik' or the French term 'Statistique', each of which means a political state. The term
statistics was applied to mean the facts and figures which were needed by the state in respect of
the divisions of the state, their respective population, birth rate, income and the like.
Statistics Meaning:
The term 'Statistics' is used to convey two different things. In the plural use, statistics means
some systematic collection of numerical data about some particular topic.
In the singular use, it means the science of statistics. In general practice, statistics is
used to mean both the science of statistics and the statistical data collected on numerical variables.
Statistics - Definition
"Statistics are numerical statements of facts in any department of enquiry placed in relation to
each other". - A.L. Bowley
"Statistics may be defined as the science of collection, presentation, analysis and interpretation of
numerical data". - Croxton and Cowden
Characteristics of statistics
Aggregate of facts
Statistical enquiry is to get information from a mass of observations with regard to the
group behavior of individual items. For example, the aggregate of figures related to production,
sale and profit over different times is called statistics.
Numerically expressed
Numerical expression of the observed facts in terms of quantitative standards or particular
scores could be regarded as statistics.
Estimated
The numerical data pertaining to any field of enquiry can be obtained either by enumeration
or by estimation. Enumeration is used for a small field of enquiry while estimation is used for a wide
and large field of enquiry.
Standard of Accuracy
In case of enumeration and estimation, it is essential to fix the desired standard of
accuracy beforehand.
Predetermined purpose
The purpose of enquiry is specifically stated, and then the data should be collected in a
systematic manner through some suitable plan, so as to make the figures free from bias and
errors.
Comparability
The ultimate aim of statistical data is comparative or relative study. Therefore, the data
should be homogeneous to allow valid comparison.
Objectives of statistics
To throw light on the unknown by means of facts and figures
To enable comparison to be made between past and present
To throw light on the reasons for changes, effects of changes and plans for the future
To handle, analyze and draw valid inferences from data
To help in drawing conclusions from facts affected by a multiplicity of causes
Importance / Scope of Statistics / Application of Statistics in various fields
In States
Statistics was regarded as the "Science of Kings". It supplies the essential information to
run the government. Policies are adopted by the government with the help of statistics.
In Economics
In economics, problems are studied by the use of statistical methods. Economic laws are
based on the study of collected statistical data, and the laws of economics refer to statistics to prove
their accuracy. Statistics in economics has given birth to a new discipline called econometrics.
In Business
In competitive business, business people face problems such as shortage of stock, overstocking,
economic crisis etc., which can be solved through statistical analysis. To a great extent,
statistics helps businessmen maximize their profit.
In Education
Statistics is widely used in education for research purposes. It is used to test past
knowledge and evolve new knowledge.
In Astronomy
Astronomers study eclipses and astronomical issues by applying statistics. They rely on
estimation in many cases, and the estimates are corrected with the help of statistics.
In Accounting and Auditing
In accounting, correlation analysis between profit and sales is widely used. In auditing,
sampling techniques are commonly followed.
In Banking
In this era of fast-developing technology, the banking sector needs a lot of information about
present and future business development.
In Investment Decision
Statistics helps an investor in selecting securities which are safe and which yield a good return and
an appreciation in the market price.
In Insurance
Statistics is extensively used in the field of insurance. Actuarial statistics is a must for
the insurance company to fix the premium rates, which are based on mortality tables.
Market researchers largely depend upon statistical methods in drawing conclusions.
In Management
Statistical tools are used widely by business enterprises for the promotion of new
business.
It also helps in the assessment of the quantum of product to be manufactured, the amount of
raw material and labor needed, marketing avenues for the product, the competitive products in
the market and so on.
In Industry
In industry, statistics is used in quality control through control charts, which have their basis in
the theory of probability, the normal distribution and inspection plans, which are based on sampling
techniques.
In Medical Sciences
In medical sciences, tests of significance such as Student's t-test are carried out to test the efficacy
of a new drug or injection for controlling and curing specific ailments. Comparative study of the
effectiveness of different medicines made by different concerns can
also be made by the statistical techniques of t and F tests of significance.
In War
The theory of decision functions propounded by A. Wald can be of great assistance to
military and technical personnel in planning maximum destruction with minimum effort. Moreover, the
statistical data obtained in the post-war period reveal some useful information for planning future
military strategies.
Functions of Statistics
It presents facts in a definite numerical form
It simplifies the complexity of the data
It provides a technique of comparison
It helps in formulating and testing hypotheses
It helps in forecasting future trends and tendencies
It studies relationships
It helps the government
Limitations of statistics
Statistics cannot be applied to individual items
Statistics studies qualitative phenomena only in indirect form
Statistical laws are not exact
Statistical results are uncertain
Statistics is not simple
Statistical data may be incomparable
Statistics is liable to be misused
Collection of Data Meaning
Data collection means the assembling, for the purpose of a particular investigation, of
entirely new data, presumably not available in published sources
Data: meaning
Data refer to the facts, figures or information collected for a specific purpose
Types of Data
Primary data and Secondary data
Choice between Primary and Secondary Data
Nature and scope of the enquiry
Availability of financial resources
Availability of time
Degree of accuracy desired
Collecting agency
Primary Data
Primary data are new and original in nature; they are firsthand information generated to
achieve the purpose of the research
Advantages of Primary data
> First-hand and new information
> More reliable
> Formulated in such a manner as best suits the purpose
Methods of collection of Primary Data
Experimental Method
Here the researcher examines the truth contained in his hypotheses by conducting
experiments, through which the data are collected.
Survey Method
Under this method, data can be collected by any one or more of the following ways:
A) Observation method
This method refers to the collection of information by way of the investigator's own observation, without interviewing the respondents.
B) Interview Method
In the interview method, a list of questions relating to the proposed study is prepared
and the answers to these questions are obtained from the respondents.
C) Mailed Questionnaire method
Under this method, the questionnaire is sent to the respondents with a covering letter asking them to
fill up the questionnaire and send it back within a specified time.
D) Through Schedules
Under this method, enumerators are appointed and trained. They take the
questionnaire to the respondents and fill in the answers obtained from the
respondents.
Secondary data
Secondary data are not new and original in nature; they are obtained from published
or unpublished sources
Sources of Secondary Data
> Published Sources
> Unpublished Sources
Classification and tabulation
Meaning
Classification is the process of arranging the data under various understandable
homogeneous groups for the purpose of convenient interpretation. The grouping of data is
made on the basis of common characteristics
Definition
The process of grouping a large number of individual facts or observations on the basis
of similarity among the items is called classification. - Stockton and Clark
Characteristics of classification
> All facts can be arranged into homogeneous groups
> Classification may be according to their resemblances and affinities
> Classification may be made on either actuality or notionality
> It gives expression to the unity of attributes
> It should be flexible to accommodate adjustment
Objectives of classification
To facilitate comparison
To study the relationship
To trace the location of important facts at a glance
To eliminate unnecessary details
To effect statistical treatment of the collected data
To facilitate easy interpretation
Significance of classification
> It is helpful to tabulation
> It leads to a valid result
> It makes interpretation clear and meaningful
Types of Classification
Geographical Classification
In this type, the data are classified on the basis of geographical or locational differences among
various items, such as states, districts, cities, regions and the like
Chronological Classification
Under this type, data are classified on the basis of differences in time or period, such as rainfall
for 12 months.
Qualitative Classification
In this classification, data are classified on the basis of some attributes or qualitative phenomena
such as religion, sex, marital status, literacy, occupation and the like.
Quantitative Classification
Under this type, data are classified according to some quantitative phenomena capable of
quantitative measurement, such as age, experience, income, prices, production, sales and the like
Frequency Distribution
Frequency distribution is the process or method of simplifying a mass of data into a grouped
form of classes, where the number of items in each class is recorded
a. Univariate Frequency Distribution
b. Bivariate Frequency Distribution
a. Univariate Frequency distribution
It is a one-way frequency distribution of a single variable and is further classified into
> Individual Observation
> Discrete Frequency Distribution
> Continuous Frequency Distribution
b. Bivariate Frequency Distribution
Bivariate Frequency Distribution is a two way Frequency distribution, where two
variables are measured in the same set of items through cross distribution
Tabulation
Tabulation is a systematic arrangement of raw data in a compact form of horizontal
rows and vertical columns
Uses of tables
> It simplifies the presentation
> It facilitates comparison
> It is easier to find the required information
> It reflects the trends and tendencies
Parts of tables
Table Number
Title of the table
Head Note
Caption
Body of the table
Source Note
Foot Note
Diagrammatic and Graphic Presentation
Diagrams and Graphs
Diagrams and graphs are easy methods of understanding data as they are a visual form
of presentation of statistical data.
Diagrams are attractive and useful to find out the result. Data should be simplified before
presenting in the diagram. Two or more sets of data can be compared with the help of diagrams.
Diagrams provide more information than the table.
Methods of Diagrams
Points, lines, bars, squares, rectangles, circles, cubes and so on.
Types of Charts
Charts, pictures, maps and the like
Advantages of diagrams
Visual form of presentation
Provide attractive and impressive view
Save time and labour
Make comparison easy
Useful for prediction
Provide more information
Limitations of Diagrams and Charts
Further analysis is not possible
They show only approximate values
All details cannot be presented diagrammatically or graphically
Construction of diagrams and graphs requires some skill
They are complementary to the table but not an alternative to it
Types of Diagrams
One dimensional diagram
Two dimensional diagram
Three dimensional diagram
Pictogram
Cartograms
Bar Diagrams
Bar is a thick wide line. Statistical data presented in the form of bars is called a bar
diagram. The simple bar diagram is commonly used in business
Types of bar diagram
> Simple bar diagram
> Percentage bar diagram
> Bilateral deviation bar diagram
> Multiple bar diagram
> Sub-divided bar diagram
MEASURES OF CENTRAL TENDENCY
Average Meaning
Average is a single value that represents a group of values
Definition
An average is a value which is typical or representative of a set of data
Characteristics of a Good Average
> It should be clearly and unambiguously defined so that it leads to one and only one
interpretation by different persons
> It should be easy to understand and simple to compute and should not involve heavy
arithmetical calculations
> It should be based on all the items of the given set of data
> It should be suitable for further algebraic or mathematical treatment and capable of being
used in further statistical computations
Uses of Average
> It is useful to describe the distribution in a concise manner
> It is useful to compare different distributions
> It is useful to compute various statistical measures such as dispersion, skewness, kurtosis
and so on
Functions of an Average
> To facilitate quick understanding of complex data
> To facilitate comparison
> It establishes mathematical relationship
> Capable of further statistical computation
Types of Average
> Mathematical Average
> Location Average
> Commercial Average
Objectives of an Average
> To get a single value that describes the features of the entire group
> To provide ground for better comparison
> To provide ground for further statistical computation and analysis
Arithmetic Mean
The arithmetic mean of a series of items is the sum of the values of all items divided by
their total number. It is a mathematical average and it is the most popular measure of central
tendency
Merits of Arithmetic Mean
> Easy to calculate and understand
> It is a perfect average, affected by the value of every item in the series
> It is a calculated value and not based on position in the series
> It is determined by a rigid formula. Hence, everyone who computes the average gets
the same answer
> It is used in further calculation
> It gives a good base for comparison
Demerits of Arithmetic Mean
> The mean is unduly affected by the extreme items
> It is unreliable. It may lead to a false conclusion
> It is not useful for the study of qualities
> It cannot be located by the graphic method
Arithmetic Mean: Individual Series
Find out the mean from the following data
Roll No | 1 | 2 | 3
Marks | 21 | 30 | 28
Solution:
Roll No
Marks (X)
2
30
28
26
34
9
15
17
$\sum X = 300$
Formula: $\bar{X} = \sum X / N$; $\bar{X} = 300/10 = 30$
The mean marks = 30
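Discrete Series
Calculate the arithmetic mean for the wages of workers in a factory
Wages in Rs. | 4 | 6 | 8 | 10 | 15 | 16
Workers | 5 | 15 | 6 | 7 | 8 | 2
Solution
Wages in Rs. (x) | Workers (f) | fx
4 | 5 | 4 x 5 = 20
6 | 15 | 6 x 15 = 90
8 | 6 | 8 x 6 = 48
10 | 7 | 10 x 7 = 70
15 | 8 | 15 x 8 = 120
16 | 2 | 16 x 2 = 32
Total | N = 43 | $\sum fx = 380$
$\bar{X} = \sum fx / N = 380/43 = 8.837$
The average wage of workers = Rs. 8.84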
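The discrete-series calculation above can be checked with a short Python sketch. The function name `discrete_mean` is only an illustrative choice; the wages and worker counts are the ones used in the worked example.

```python
def discrete_mean(values, frequencies):
    """Arithmetic mean of a discrete series: sum(f * x) / sum(f)."""
    total_fx = sum(f * x for x, f in zip(values, frequencies))
    n = sum(frequencies)
    return total_fx / n


wages = [4, 6, 8, 10, 15, 16]      # x: wages in Rs.
workers = [5, 15, 6, 7, 8, 2]      # f: number of workers

mean_wage = discrete_mean(wages, workers)
print(round(mean_wage, 2))  # 8.84
```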
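Continuous Series
Calculate Arithmetic Mean
Class Intervals | 0-10 | 10-20 | 20-30 | 30-40 | 40-50
Frequency | 6 | 5 | 8 | 15 | 7
Solution
Class Intervals | Mid-point (m) | Frequency (f) | fm
0-10 | 5 | 6 | 30
10-20 | 15 | 5 | 75
20-30 | 25 | 8 | 200
30-40 | 35 | 15 | 525
40-50 | 45 | 7 | 315
Total | | N = 41 | $\sum fm = 1145$
Arithmetic Mean: $\bar{X} = \sum fm / N = 1145/41 = 27.93$
The arithmetic mean = 27.93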
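For a continuous series the same idea applies with class mid-points in place of the values. A minimal sketch, assuming the classes are given as (lower, upper) pairs; `grouped_mean` is a hypothetical helper name, and the data are those of the example above.

```python
def grouped_mean(class_intervals, frequencies):
    """Arithmetic mean of a continuous series using class mid-points."""
    midpoints = [(lower + upper) / 2 for lower, upper in class_intervals]
    total_fm = sum(f * m for m, f in zip(midpoints, frequencies))
    return total_fm / sum(frequencies)


intervals = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50)]
freqs = [6, 5, 8, 15, 7]

print(round(grouped_mean(intervals, freqs), 2))  # 27.93
```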
Median
Median is the value of the middle item of a series arranged in ascending or descending
order of magnitude. Hence it is the "middle most" or "most central" value of a set of numbers. It
divides the series into two equal parts, one part containing values greater than and the other
values less than the median.
Meaning
The median is that value of the variable which divides the group into two equal parts,
one part comprising all values greater and the other all values less than the median.
Merits of Median:
It is easy to compute and understand
It eliminates the effect of extreme items
The value of median can be located graphically
Demerits of Median
For calculating the median, it is necessary to arrange the data; other averages do not need any
arrangement
It is affected more by fluctuations of sampling than the arithmetic mean.
It is not based on all the items of the series
Individual Series
Arrange the data in either ascending or descending order
Median = size of the $\frac{N+1}{2}$th item
Find out the median from the following
37 58 6l a2
Data arranged in ascending order
38
42
37
58
62
65
66
R
80,
Median = size of the $\frac{N+1}{2}$th item
= size of the $\frac{9+1}{2}$th item
= 10/2 = 5th item
Median = 62
Discrete Series
Compute the median for the following distribution of weekly wages of 65 employees of the
XYZ company
Weekly wages in Rs. | 55 | 65 | 75 | 85 | 95 | 105 | 115
Number of employees | 8 | 10 | 16 | 14 | 10 | 5 | 2
Solution
Weekly wages in Rs. | No. of employees (f) | Cumulative frequency (cf)
55 | 8 | 8
65 | 10 | 18
75 | 16 | 34
85 | 14 | 48
95 | 10 | 58
105 | 5 | 63
115 | 2 | 65
Median = size of the $\frac{N+1}{2}$th item
= size of the $\frac{65+1}{2}$th item
= 33rd item, which falls within the cumulative frequency 34
The wage corresponding to cf 34 is 75
Median weekly wages = Rs. 75
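The cumulative-frequency lookup used in this solution can be sketched in Python as follows. `discrete_median` is an illustrative name, and the sketch follows the textbook rule of taking the first value whose cumulative frequency reaches the (N + 1)/2 position; it does not average two middle values when that position falls exactly between two different items.

```python
from itertools import accumulate


def discrete_median(values, frequencies):
    """Median of a discrete series, located through cumulative frequencies."""
    pairs = sorted(zip(values, frequencies))      # arrange in ascending order
    n = sum(f for _, f in pairs)
    if n == 0:
        raise ValueError("the series has no observations")
    position = (n + 1) / 2                        # (N + 1)/2 th item
    cumulative = accumulate(f for _, f in pairs)  # running totals of f
    for (value, _), cf in zip(pairs, cumulative):
        if cf >= position:                        # first cf that covers the position
            return value


weekly_wages = [55, 65, 75, 85, 95, 105, 115]
employees = [8, 10, 16, 14, 10, 5, 2]

print(discrete_median(weekly_wages, employees))  # 75
```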
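Continuous Series
Calculate the median from the following data
Marks | 0-20 | 20-40 | 40-60 | 60-80 | 80-100
No. of students | 5 | 15 | 30 | 8 | 2
Solution
Marks | No. of students (f) | Cumulative frequency (cf)
0-20 | 5 | 5
20-40 | 15 | 20
40-60 | 30 | 50
60-80 | 8 | 58
80-100 | 2 | 60
Median = size of the $\frac{N}{2}$th item = size of the $\frac{60}{2}$th item = 30th item, which lies in the class 40-60
$\text{Median} = L + \frac{N/2 - cf}{f} \times i = 40 + \frac{30 - 20}{30} \times 20 = 46.67$
where L is the lower limit of the median class, cf the cumulative frequency of the preceding class, f the frequency of the median class and i the class width.
Median marks = 46.67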
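The interpolation step for a continuous series can be sketched as below. This is a minimal illustration using the usual textbook formula L + ((N/2 - cf) / f) x i; the function name `grouped_median` and the (lower, upper) representation of classes are assumptions made for the example.

```python
def grouped_median(class_intervals, frequencies):
    """Median of a continuous series by interpolation inside the median class."""
    n = sum(frequencies)
    half = n / 2                          # N/2 th item
    cf_before = 0                         # cumulative frequency of preceding classes
    for (lower, upper), f in zip(class_intervals, frequencies):
        if f > 0 and cf_before + f >= half:
            width = upper - lower         # class width i
            return lower + (half - cf_before) / f * width
        cf_before += f
    raise ValueError("the series has no observations")


marks = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
students = [5, 15, 30, 8, 2]

print(round(grouped_median(marks, students), 2))  # 46.67
```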
Mode
Mode is the value of the variable which occurs the greatest number of
times, or most frequently, in a distribution. Mode is the value which occurs with the greatest
frequency in a series
Types of mode
I. Uni-modal
If there is only one mode in a series, it is called uni-modal
II. Bi-modal
If there are two modes in the series, it is called bi-modal
III. Tri-modal
If there are three modes in the series, it is called tri-modal.
IV. Multimodal
If there are more than three modes in the series, it is called multimodal.
Relationship among mean, median and mode
The three averages are identical when the distribution is symmetrical. In an asymmetrical distribution,
the values of mean, median and mode are not equal. For a moderately asymmetrical distribution, the
following empirical relationships hold approximately:
Mean - Median = 1/3 (Mean - Mode)
Mode = 3 Median - 2 Mean
Median = Mode + 2/3 (Mean - Mode)
Individual Series
Calculate the mode from the following data of the marks obtained by 10 students
Serial No | Marks obtained
Solution
Among the marks obtained by the 10 students, 77 is repeated three times.
Therefore the modal mark is 77
Discrete Series
Calculate the mode from the following data of the wages of workers of an establishment. Find
the modal wages
[Daily 10 fiz is
Iwagesin
Solution Grouping Table
Daily Frequency of Wages
Wages Earners
is 3 a
Rs.
Analysis Table:T
1 3 5 4 1
From the analysis table it is known that size 10 has been repeated the maximum number of
times, so the modal wage is Rs. 10
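Continuous series
Find out the mode from the following series
x | 0-5 | 5-10 | 10-15 | 15-20
frequency | 1 | 2 | 5 | 14
The modal value lies in the class 15-20, as it has the highest frequency
$\text{Mode } (Z) = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$
where L is the lower limit of the modal class, f1 its frequency, f0 and f2 the frequencies of the preceding and succeeding classes, and i the class width.
$\text{Mode } (Z) = 15 + \frac{14 - 5}{2(14) - 5 - 10} \times 5$
= 15 + 9/13 x 5 = 15 + 45/13
= 15 + 3.46
Mode = 18.46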
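The substitution above can be reproduced with a small Python sketch. The helper name `grouped_mode` is hypothetical, and the inputs (L = 15, i = 5, f1 = 14, f0 = 5, f2 = 10) are the figures that appear in the worked calculation.

```python
def grouped_mode(lower_limit, class_width, f1, f0, f2):
    """Mode of a continuous series: L + (f1 - f0) / (2*f1 - f0 - f2) * i."""
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula is undefined when 2*f1 - f0 - f2 is zero")
    return lower_limit + (f1 - f0) / denominator * class_width


mode = grouped_mode(lower_limit=15, class_width=5, f1=14, f0=5, f2=10)
print(round(mode, 2))  # 18.46
```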
Geometric Mean
Merits of Geometric Mean
> Every item in the distribution is included in the calculation
> It can be calculated with mathematical exactness, provided that all the quantities are
greater than zero and positive
> Large items have less effect on it than on the arithmetic average.
> It is amenable to further algebraic manipulation
Demerits of Geometric Mean
> It is very difficult to calculate
> It is impossible to use it when any item is zero or negative
> The value of the geometric mean may not correspond with any actual value in the
distribution
Uses of Geometric Mean
> This average is often used to construct index numbers, where we are chiefly concerned
with relative changes over a period of time
> It is the only useful average that can be employed to indicate rates of change
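Individual series
$\text{G.M.} = \text{Antilog of } \frac{\sum \log X}{N}$
Calculate Geometric Mean
50 72
log X
1.6990
1.8573
1.7324
1.9138
1.9685
$\sum \log X = 9.1710$
$\text{G.M.} = \text{Antilog of } \frac{\sum \log X}{N} = \text{Antilog of } \frac{9.1710}{5}$
= Antilog of 1.8342 = 68.26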
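The log-and-antilog procedure can be sketched in Python as below. Only the first two items (50 and 72) are legible in the data above; the remaining three values (54, 82, 93) are assumed here purely for illustration, chosen because their logarithms add up to about 9.1710. `geometric_mean` is a hypothetical helper name.

```python
import math


def geometric_mean(values):
    """Geometric mean as the antilog of the average of log10 values."""
    if any(x <= 0 for x in values):
        raise ValueError("geometric mean needs strictly positive values")
    mean_log = sum(math.log10(x) for x in values) / len(values)
    return 10 ** mean_log                 # antilog


# 50 and 72 come from the notes; 54, 82 and 93 are assumed for illustration.
items = [50, 72, 54, 82, 93]
gm = geometric_mean(items)
print(round(gm, 1))
```

The result is close to the 68.26 obtained with four-figure log tables; small differences in the second decimal place come from rounding in the tables.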
Discrete Series
Calculate Geometric mean from the following data
Size of Item | 120 | 125 | 130
Frequency
Solution:
f log x
4.1584
6.2907
6.3417
2.1303
4.2670
14.9793
8.5720
4.2922
17.3384
N = 32, $\sum f \log x = 68.3700$
$\text{G.M.} = \text{Antilog of } \frac{\sum f \log x}{N} = \text{Antilog of } \frac{68.3700}{32}$
= Antilog of 2.1366 = 137
Therefore, G.M. = 137
Continuous series
Calculate Geometric mean from the following data
Yield of wheat | 7.5-10.5 | 10.5-13.5 | 13.5-16.5 | 16.5-19.5 | 19.5-22.5 | 22.5-25.5 | 25.5-28.5
No. of farms
Solution:
Yield of wheat | f log m
7.5-10.5 | 4.7710
10.5-13.5 | 9.7128
13.5-16.5 | 22.3459
16.5-19.5 | 28.8719
19.5-22.5 | 9.2554
22.5-25.5 | 5.5208
25.5-28.5 | 1.4314
$\sum f \log m = 81.9092$
$\text{G.M.} = \text{Antilog of } \frac{\sum f \log m}{N}$
= Antilog of (81.9092/68) = Antilog of 1.204547
= 16.02
G.M. = 16.02
Harmonic Mean
Meaning
Harmonic Mean is the reciprocal of the arithmetic average of the reciprocals of the values of
the various items of a variable
Merits of Harmonic Mean
It utilizes all values of a variable
It gives greater importance to small values
It is amenable to further algebraic manipulation
It provides more consistent results in problems relating to time and rates than similar
averages
Demerits of Harmonic Mean
> It is not very easy to understand
> The method of calculation is difficult
> The presence of both positive and negative items in a series makes it impossible to
compute its value. The same difficulty is felt if one or more items are zero
> It is only a summary figure and may not be the actual item in the series.
$\text{H.M.} = \frac{N}{\sum (1/x)}$
Find out the Harmonic mean
Family
Income
Solution
Computation of Harmonic Mean
Family | 1/x
0.01176
0.01429
0.10000
0.01333
0.00200
0.12500
0.02381
0.00400
0.02500
0.02778
$\sum (1/x) = 0.34697$
$\text{H.M.} = \frac{N}{\sum (1/x)} = 10/0.34697 = 28.82$
H.M. = 28.82
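A short Python sketch of the same computation follows. The income figures are not legible in the table above, so the list below is an assumption: it contains the ten values whose reciprocals match the 1/x column (for example 1/85 = 0.01176 and 1/8 = 0.12500). `harmonic_mean` is an illustrative name, and this sketch only accepts positive values.

```python
def harmonic_mean(values):
    """Harmonic mean: N divided by the sum of reciprocals."""
    if any(x <= 0 for x in values):
        raise ValueError("this sketch only handles strictly positive values")
    return len(values) / sum(1 / x for x in values)


# Assumed incomes, recovered from the reciprocals listed in the solution.
incomes = [85, 70, 10, 75, 500, 8, 42, 250, 40, 36]

print(round(harmonic_mean(incomes), 2))  # 28.82
```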
Discrete Series
Size of Item
Frequency
Solution:
Size of Item (x) | 1/x | f(1/x)
6 | 0.1667 | 0.6668
7 | 0.1429 | 0.8574
8 | 0.1250 | 1.1250
9 | 0.1111 | 0.5555
10 | 0.1000 | 0.2000
11 | 0.0909 | 0.7272
$\sum f(1/x) = 4.1319$
$\text{H.M.} = \frac{N}{\sum f(1/x)}$
Continuous Series
Compute Harmonic Mean
Size | 0-10
Frequency | 5
Solution
Size | f(1/m)
0-10 | 1.00000
10-20 | 0.53336
20-30 | 0.48000
30-40 | 0.17142
40-50 | 0.08888
$\sum f(1/m) = 2.27366$
$\text{H.M.} = \frac{N}{\sum f(1/m)} = \frac{35}{2.27366} = 15.393682$
Measures of Variation or Dispersion
Meaning
Dispersion is the study of scatter around an average
Definition
Dispersion is the measure of the variation of the items - A.L. Bowley
Dispersion is a measure of the extent to which the individual items vary - L.R. Connor
Importance of measuring variation or dispersion
> Testing the reliability of the measures of central tendency
> Comparing two or more series on the basis of their variability
> Enabling control of the variability
> Serving as a basis for further statistical analysis
Characteristics of a Measure of Variation
> It should be easy to understand and simple to calculate
> It should be rigidly defined
> It should be based on all observations and it should not be affected by extreme
observations
> It should be amenable to further algebraic treatment
> It should have sampling stability
Methods of Measuring Dispersion
Range
Inter Quartile range
Quartile Deviation
Mean Deviation
Standard Deviation
Lorenz Curve
Range
Range is the difference between the largest and the smallest value in the distribution. It
is the simplest and crudest measure of dispersion.
Uses of Range
> It is used in industries for the statistical quality control of the manufactured product
> It is used to study variations in stocks, shares and other commodities
> It facilitates the use of other statistical measures
Advantages of Range
> It is the simplest method of studying variation
> It is easy to understand and the easiest to compute
> It takes minimum time to calculate
> It is accurate
Disadvantages of Range
> Range is completely dependent on the two extreme values
> It is subject to fluctuations of considerable magnitude from sample to sample
> It is not suitable for mathematical treatment
> It cannot be applied to open-end classes
> Range cannot tell us anything about the character of the distribution
Quartile deviation
Quartile deviation is an absolute measure of dispersion. It is calculated as the
difference between the upper quartile and the lower quartile divided by 2.
In a series, there are three quartiles, which divide it into four equal parts. By eliminating the
lowest (25%) items and the
highest (25%) items of a series, we can obtain a measure of dispersion and can find out half the
distance between the first and the third quartiles.
$\text{Quartile Deviation (Q.D.)} = \frac{Q_3 - Q_1}{2}$
$\text{Co-efficient of Q.D.} = \frac{Q_3 - Q_1}{Q_3 + Q_1}$
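As an illustration of the two formulas, the sketch below finds Q1 and Q3 for an individual series and then the quartile deviation and its coefficient. It assumes the common textbook convention that Q1 is the (N + 1)/4th item and Q3 the 3(N + 1)/4th item, with linear interpolation for fractional positions; the data values and function names are hypothetical.

```python
def item_at(sorted_values, position):
    """Value of the position-th item (1-based), interpolating between neighbours."""
    whole = int(position)
    fraction = position - whole
    value = sorted_values[whole - 1]
    if fraction == 0:
        return value
    return value + fraction * (sorted_values[whole] - value)


def quartile_deviation(values):
    """Return (Q.D., coefficient of Q.D.) using the (N + 1)/4 position rule."""
    data = sorted(values)
    n = len(data)
    if n < 3:
        raise ValueError("need at least three observations")
    q1 = item_at(data, (n + 1) / 4)
    q3 = item_at(data, 3 * (n + 1) / 4)
    return (q3 - q1) / 2, (q3 - q1) / (q3 + q1)


# Hypothetical observations, used only to show the arithmetic.
observations = [20, 28, 40, 12, 30, 15, 50]
qd, coefficient = quartile_deviation(observations)
print(qd)                     # 12.5  (Q1 = 15, Q3 = 40)
print(round(coefficient, 4))  # 0.4545
```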
Merits of Quartile Deviation
> It is simple to calculate and easy to understand
> Risk of extreme item variation is eliminated, as it depends upon the central 50 per cent of
items
> It can be applied to open-end classes
Demerits of Quartile Deviation
> Items below Q1 and above Q3 are ignored.
> It is not capable of further mathematical treatment
> It is affected much by the fluctuations of sampling
> It is not calculated from a computed average, but from a positional average.
Mean deviation
Mean deviation is the average difference between the items in a distribution and
the mean, median or mode of that series, counting all such deviations as positive. The mean
deviation is also known as the average deviation
$\text{Mean Deviation} = \frac{\sum |D|}{N}$
Co-efficient of Mean Deviation (M.D.) = M.D. / Mean or Median
Merits of Mean Deviation
> It is clear and easy to understand
> It is based on each and every item of the data
> It can be calculated from any measure of
central tendency and as such is flexible too.
Demerits of Mean Deviation
It is not suitable for further mathematical processing
It is rarely used in sociological studies
It is mathematically unsound and illogical, because the signs are ignored in the
calculation of mean deviation
Standard deviation
Standard deviation is the square root of the mean of the squared deviations from the
arithmetic mean. So, it is also known as Root Mean Square Deviation or an Average of Second
Order. Standard deviation is denoted by the small Greek letter 'σ'. The concept of standard
deviation was introduced by Karl Pearson in 1893.
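The definition can be turned into a few lines of Python. This sketch divides by N (the whole-series form described above, not the N - 1 sample form) and checks the result against `statistics.pstdev` from the standard library; the data values are hypothetical.

```python
import math
import statistics


def standard_deviation(values):
    """Square root of the mean of squared deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)


# Hypothetical series with mean 5 and squared deviations summing to 32.
series = [2, 4, 4, 4, 5, 5, 7, 9]
sigma = standard_deviation(series)
print(sigma)  # 2.0

# The standard library's population standard deviation agrees.
print(statistics.pstdev(series))
```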
Uses of Standard deviation
It is used in statistics because it possesses most of the characteristics of an ideal
measure of dispersion.
It is widely used in sampling theory and by biologists.
It is applied in the co-efficient of correlation and in the study of symmetrical frequency
distributions
Advantages of standard deviation
It is rigidly defined and determinate
It is based on all the observations of a series
It is less affected by fluctuations of sampling and hence stable
It is amenable to algebraic treatment and is less affected by fluctuations of sampling
than most other measures of dispersion
The standard deviation is more appropriate mathematically than the mean deviation,
since the negative signs are removed by squaring the deviations rather than by ignoring them
SKEWNESS
Introduction
The term 'Skewness' refers to lack of symmetry, that is, when a distribution is not
symmetrical it is called a skewed distribution. If the curve is normal, or the data are distributed
symmetrically or uniformly, the spread will be the same on both sides of the centre point and the
mean, median and mode will all have the same value.
Definition
Skewness or asymmetry is the attribute of a frequency distribution that extends further on
one side of the class with the highest frequency than on the other - Simpson and Kafka
When a series is not symmetrical it is said to be asymmetrical or skewed - Croxton and Cowden
Skewness of a Distribution
When a distribution is not symmetrical it is called a skewed distribution.
The analysis of the presence of skewness in a distribution implies two main tasks. They are
determination of the sign of skewness and testing of skewness.
Absolute measures of skewness
i) Karl Pearson's Coefficient of Skewness
ii) Bowley's Coefficient of Skewness
iii) Kelly's Coefficient of Skewness
iv) Measure of Skewness based on moments
Karl Pearson's Co-efficient of Skewness
This method is based upon the difference between mean and mode, and the difference is
divided by the standard deviation to give a relative measure.
Bowley's Coefficient of Skewness
Bowley's measure is based on quartiles. In a symmetrical distribution, the first and third
quartiles are equidistant from the median.
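In symbols, the two coefficients described above are usually written as follows. These are the standard textbook forms, given here as a reference sketch; the notation Sk_P and Sk_B is chosen for this illustration.

```latex
% Karl Pearson's coefficient: (mean - mode) relative to the standard deviation
Sk_P = \frac{\text{Mean} - \text{Mode}}{\sigma}

% Bowley's coefficient: based on the distances of Q_1 and Q_3 from the median
Sk_B = \frac{(Q_3 - \text{Median}) - (\text{Median} - Q_1)}{Q_3 - Q_1}
     = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}
```

In a symmetrical distribution the mean equals the mode and the quartiles are equidistant from the median, so both coefficients are zero.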
Objectives of Skewness
I) To find out the direction and extent of asymmetry in a series.
II) To compare two or more series with regard to skewness.
III) To study the nature of variation of the items about the central value.
Graphic method of dispersion
Lorenz Curve
Lorenz Curve is a device used to show the measurement of economic inequalities, as in
the distribution of income and wealth. It can also be used in business to study the disparities of
distribution of profit, wages, turnover, production and the like.
Correlation and Regression Analysis
Meaning:
Correlation is the study of the natural relationship between two or more variables. Hence,
the detection and analysis of correlation between two statistical variables requires a relationship of some sort
which associates the observations in pairs, each of which is a value of the two variables
Definition
The relationship that exists between two variables - Smith
Correlation analysis deals with the association between two or more variables.
Uses of Correlation
I) Correlation is very useful in physical and social sciences, business and economics
II) Correlation analysis is very useful in economics to study the relationship
between price and demand
III) It is also useful in business to estimate costs, value, price and other related
variables
IV) Correlation is the basis of the concept of regression
V) Correlation analysis helps in calculating the sampling error.
‘Types of Correlation
Positive correlation
Negative Correlation
Simple Correlation
Multiple Correlations
Partial Correlation
Linear Correlation
Non =Linear Correlation
Positive Correlation
Correlation is said to be positive when the values of two variables move in the
same direction, so that an increase in the value of one variable is accompanied by an
increase in the value of the other variable, or a decrease in the value of one variable is
followed by a decrease in the value of the other variable.
Negative Correlation
Correlation is said to be negative when the values of two variables move in
opposite directions, so that an increase in the value of one variable is followed by a
decrease in the value of the other and vice versa.
Simple Correlation
When only two variables are studied, it is said to be simple correlation.
Multiple Correlation
When more than two variables are studied simultaneously, the correlation is said to
be multiple.
Partial Correlation
The partial correlation coefficient provides a measure of the relationship between a
dependent variable and a particular independent variable when all other variables
involved are kept constant. When the analysis is limited to yield and rainfall alone, it
becomes a problem relating to simple correlation.
Linear Correlation
The correlation is said to be linear if the amount of change in one variable
tends to bear a constant ratio to the amount of change in the other.
Non-Linear Correlation
The correlation is non-linear if the amount of change in one variable does not
bear a constant ratio to the amount of change in the other related variable.
Methods of studying correlation
Graphical method
> Scatter diagram
> Simple graph method
Mathematical Methods
> Karl Pearson's Co-efficient of correlation
> Spearman's Rank Correlation coefficient
> Concurrent deviation method
> Method of least squares
Scatter diagram method
It is a method of studying correlation between two related variables. The two
variables X and Y are taken on the X and Y axes of a graph paper. For each pair
of X and Y values we mark a dot, and we get as many points as the number of
observations.
Graphical method
In this method curves are drawn for the separate series on a graph paper. By
examining the direction and closeness of the two curves we can infer whether or not the
variables are related. If both curves move in the same direction, correlation is
said to be positive. On the contrary, if the curves move in opposite directions, correlation
is said to be negative.
Karl Pearson’s Co-efficient of correlation
Karl Pearson, a great statistician, introduced a mathematical method for
measuring the magnitude of the relationship between two variables. This method, known
as the Pearson coefficient of correlation, is widely used. It is denoted by the symbol "r".
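As an illustration, r can be computed from deviations taken about the means, assuming the standard product-moment form: the sum of the products of the deviations divided by the square root of the product of the two sums of squared deviations. The function name and the price and demand figures are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Karl Pearson's coefficient of correlation for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]  # deviations of X from its mean
    dy = [y - mean_y for y in ys]  # deviations of Y from its mean
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    if sum_dx2 == 0 or sum_dy2 == 0:
        raise ValueError("r is undefined when a variable does not vary")
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)


# Hypothetical data: demand falls steadily as price rises.
price = [1, 2, 3, 4, 5]
demand = [10, 8, 6, 4, 2]
print(round(pearson_r(price, demand), 4))
```

The value of r always lies between -1 and +1; the sign gives the direction of the relationship and the size gives its closeness.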
Spearman’s Rank Correlation Co-efficient
In 1904, the famous British psychologist Charles Edward Spearman devised
the method of rank correlation. Rank correlation is applicable to
individual observations. This measure is useful in dealing with qualitative
characteristics. The result obtained by the ranking method is only approximate.
Regression Analysis
Meaning
The statistical method employed to estimate the unknown value of one
variable from the known value of a related variable is called regression.
Definition
Regression is the measure of the average relationship between two or more
variables in terms of the original units of the data. - Blair
Regression analysis: meaning
Regression analysis is a statistical device with which we estimate or predict the
unknown values of one variable from the known values of another variable.
Regression analysis: definition
One of the most frequently used techniques in economics and business
research, to find a relation between two or more variables that are related causally,
is regression analysis.
- Taro Yamane
Uses of regression analysis
> It is useful to estimate the relationship between two variables
> It is useful for the prediction of unknown values
> It is widely used in social sciences like economics, and in the natural and physical sciences
> It is useful to forecast the business situation
> It is useful to calculate the correlation co-efficient and the co-efficient of determination
Methods of studying Regression
> Graphic method
> Algebraic method
Graphic method
Under this method, dots representing pairs of values of the given variables are
plotted on a graph paper. When the variables have a linear relationship, the independent
variable is taken on the X axis and the dependent variable on the Y axis. The regression line of
X on Y provides the most probable value of X when the exact value of Y is known, and the
regression line of Y on X provides the most probable value of Y when the exact value of X
is known. Thus we get two regression lines.
Regression lines
I) Regression of X on Y
II) Regression of Y on X
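The two regression lines can also be obtained algebraically. The sketch below assumes the usual least-squares form, in which the line of Y on X is Y - mean(Y) = byx (X - mean(X)) and the line of X on Y is X - mean(X) = bxy (Y - mean(Y)). The function name and the data are hypothetical.

```python
def regression_coefficients(xs, ys):
    """Return (mean_x, mean_y, b_yx, b_xy) for paired observations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_dxdy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_dx2 = sum((x - mean_x) ** 2 for x in xs)
    sum_dy2 = sum((y - mean_y) ** 2 for y in ys)
    if sum_dx2 == 0 or sum_dy2 == 0:
        raise ValueError("a variable with no variation has no regression line")
    b_yx = sum_dxdy / sum_dx2  # regression coefficient of Y on X
    b_xy = sum_dxdy / sum_dy2  # regression coefficient of X on Y
    return mean_x, mean_y, b_yx, b_xy


# Hypothetical paired observations.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
mean_x, mean_y, b_yx, b_xy = regression_coefficients(xs, ys)

# Regression of Y on X: estimate Y when X = 6.
print(mean_y + b_yx * (6 - mean_x))
# Regression of X on Y: estimate X when Y = 3.
print(mean_x + b_xy * (3 - mean_y))
# The product of the two coefficients equals r squared.
print(b_yx * b_xy)
```

Both lines pass through the point of means, and the product of the two regression coefficients gives the co-efficient of determination.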
Time Series Analysis
> Time series analysis is the analysis that identifies the different components, such
as trend, seasonal, cyclical and irregular, in given time series data.
Definition
A time series is a set of observations arranged in chronological order. - Morris Hamburg
Requirements of a time series
Data must be available for a long period of time.
Data must consist of a homogeneous set of values belonging to different time periods. The
time gap between the variables or composite of variables must be, as far as possible, equal.
Causes of variation in Time Series Data
+ Seasons
+ Social customs, festivals etc.
+ The four phases of business: prosperity, decline, depression, recovery
+ Natural calamities: earthquake, epidemic, flood, drought etc.
+ Political movements/changes, war etc.
Components of Time series; Additive and multiplicative models
Components of Time Series
1. Secular Trend
2. Seasonal Variation
3. Cyclical Variation
4. Irregular Variation
1. Secular Trend
A secular trend or long-term trend refers to the movements of the series
reflecting continuous growth or decline over a long period of time. There are many
types of trend. Some trends rise upward and some fall downward.
2. Seasonal Variation
Seasonal variation is the periodic movement in business activity within the year
that recurs periodically year after year.
Generally, seasonal variations appear at weekly, monthly or quarterly intervals.
3. Cyclical Variation
Cyclical variations are up and down movements that differ from seasonal fluctuations in that they
extend over a longer period of time, usually two or more years. Business time series are
influenced by the wave-like changes of prosperity and depression.
4. Irregular Variation
Irregular variations or random variations constitute one of the four components of a
time series. They correspond to movements that appear irregularly and generally
during short periods. Irregular variations do not follow a particular model and are not
predictable.
Mathematical Model for a Time Series
In classical analysis, it is assumed that some type of relationship exists among the
four components of a time series.
Additive Model
According to this model, the time series is expressed as
Y = T + S + C + I
where
Y = the value of the original time series
T = Trend value
S = Seasonal variation
C = Cyclical variation
I = Irregular fluctuation
Multiplicative Model
According to this model, the time series is expressed as
Y = T × S × C × I
Determination of Trend
Measurements of Trends
Following are the methods by which we can measure the trend:
(i) Freehand or Graphic Method,
(ii) Method of Semi-Averages.
(iii) Method of Moving Averages.
(iv) Method of Least Squares.
Free hand Graphic Method
In this method we plot the original data on a graph and carefully draw a smooth
curve which shows the direction of the trend. Time is taken on the
horizontal axis (X) and the value of the variable on the vertical axis (Y).
Semi-Average Method
In this method the original data are divided into two equal parts and averages are
calculated for both parts. These averages are called semi-averages. The trend line is drawn
with the help of the semi-averages.
Fit a trend line by the method of semi-averages for the given data.
Year        2000  2001  2002  2003  2004  2005  2006
Production  105   115   120   100   110   125   135
Solution:
Since the number of years is odd (seven), we leave out the middle year's
production value and obtain the averages of the first three years and the last three years.
Year   Production   Semi-average
2000   105
2001   115          (105 + 115 + 120) / 3 = 113.33
2002   120
2003   100          (left out)
2004   110
2005   125          (110 + 125 + 135) / 3 = 123.33
2006   135
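The same calculation can be sketched in Python. The function name is hypothetical; the production figures are those of the example, and the middle value is skipped when the number of items is odd.

```python
def semi_averages(values):
    """Average of the first half and of the last half of a series.

    With an odd number of items the middle item is left out.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    first_part = values[:half]
    second_part = values[n - half:]
    return sum(first_part) / half, sum(second_part) / half


production = [105, 115, 120, 100, 110, 125, 135]  # years 2000 to 2006
first_avg, second_avg = semi_averages(production)
print(round(first_avg, 2), round(second_avg, 2))  # 113.33 123.33
```

The two averages are plotted against the middle year of each part (2001 and 2005 here) and joined by a straight line to give the trend.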
Moving average method
In this method, the average value of a number of years or months or weeks is
calculated and placed at the centre of the time span; it is the normal
or trend value for the middle period.
Calculate three-yearly moving averages of number of students studying in a higher
secondary school in a particular village from the following data.
Number of students
332
317
387
392
402
405
410
427
435
438
Solution:
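Each three-yearly moving average is the mean of three consecutive values, written against the middle year of the three. A minimal sketch follows, assuming the student numbers exactly as transcribed above (the years are not given in the notes); the function name is hypothetical.

```python
def moving_average(values, span=3):
    """Simple moving averages over `span` consecutive values."""
    if span < 1 or span > len(values):
        raise ValueError("span must be between 1 and the number of values")
    return [
        sum(values[i:i + span]) / span
        for i in range(len(values) - span + 1)
    ]


students = [332, 317, 387, 392, 402, 405, 410, 427, 435, 438]

# With span 3 the first average belongs to the second observation, and so on.
for middle, average in enumerate(moving_average(students), start=1):
    print(students[middle], round(average, 2))
```

Ten observations give eight three-yearly averages; the first and the last year have no trend value.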
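Computation of three-yearly moving averages: each average of three consecutive values is written against the middle year, so the first and last years receive no trend value.
Method of Least Squares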
The line of best fit is a line from which the sum of the deviations of the various points is zero. This is the
best method for obtaining the trend values. It gives a convenient basis for calculating the line of best fit
for the time series. It is a mathematical method for measuring trend. Further, the sum of the squares of
these deviations is the least when compared with other fitting methods.
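Fit a straight-line trend by the method of least squares and obtain the trend values.
Solution:
Computation of trend values by the method of least squares (odd number of years)
Year   Y    X = Year - 2003   X²   XY
2000   …    -3                9    …
2001   …    -2                4    …
2002   …    -1                1    …
2003   …     0                0    0
2004   …     1                1    …
2005   50    2                4    100
2006   46    3                9    138
\[ N = 7,\quad \sum Y = 316,\quad \sum X = 0,\quad \sum X^2 = 28,\quad \sum XY = 29 \]
\[ a = \frac{\sum Y}{N} = \frac{316}{7} = 45.143, \qquad b = \frac{\sum XY}{\sum X^2} = \frac{29}{28} = 1.036 \]
Therefore, the required equation of the straight-line trend is given by
Y = a + bX
Y = 45.143 + 1.036 (x - 2003)
The trend values can be obtained as follows.
When X = 2000,
Yt = 45.143 + 1.036 (2000 - 2003) = 42.035
When X = 2001,
Yt = 45.143 + 1.036 (2001 - 2003) = 43.071
Similarly, the other values can be obtained.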
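The remaining trend values can be generated from the totals of the example. The sketch below assumes X is measured from the middle year 2003, so that the sum of X is zero and a and b reduce to simple ratios of the totals. The function name is hypothetical, and because the coefficients are not rounded before use, results may differ from the hand-worked figures in the third decimal place.

```python
def trend_coefficients(n, sum_y, sum_xy, sum_x2):
    """a and b of Y = a + bX when X is measured from the middle year (sum of X = 0)."""
    a = sum_y / n
    b = sum_xy / sum_x2
    return a, b


# Totals taken from the worked example.
a, b = trend_coefficients(n=7, sum_y=316, sum_xy=29, sum_x2=28)

# Trend value for every year from 2000 to 2006.
trend = {year: a + b * (year - 2003) for year in range(2000, 2007)}
for year, value in trend.items():
    print(year, round(value, 3))
```

Since the deviations X add up to zero, the trend values add up to the total of the original series (316).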
Computation of Seasonal Indices by Simple Average
Seasonal variations can be measured by the method of simple averages. The data
should be available season-wise, for example by weeks, months or quarters.
Method of Simple Averages
This is the simplest and easiest method for studying seasonal variations.
Ratio-to-moving-average method
The index obtained by this method ordinarily does not fluctuate so much as the index based
on straight-line trends. This is because the 12-month moving average follows the cyclical
course of the actual data quite closely.
Ratio-to-trend Method
* The ratio-to-trend method is similar to the ratio-to-moving-average method.
* The only difference is the way of obtaining the trend values.
* In the ratio-to-moving-average method the trend values are obtained by the method of
moving averages, whereas in the ratio-to-trend method they are obtained by the method of least squares.
Link relatives method:
Link relatives are calculated by dividing the figure of each season by the figure
of the immediately preceding season and multiplying it by 100. These percentages are called link
relatives since they link each month (or quarter or other time period) to the preceding one.
Meaning of Index Number
An index number is a specialized average designed to measure the change in a group of
related variables over a period of time. It was first constructed in the year
Concept
In its simplest form, an index number is a ratio of two numbers expressed as a percentage.
Definition
Index numbers are devices for measuring differences in the magnitude of a group of related
variables.
- Croxton and Cowden
Types of Index Numbers
> Price Index
> Quantity Index
> Value Index
Problems in Construction of Index numbers
Simple Aggregative Method:
In this method, the index number is equal to the sum of the prices for the year for which the index
number is to be found, divided by the sum of the actual prices for the base year and multiplied by 100.
\[ P_{01} = \frac{\sum P_1}{\sum P_0} \times 100 \]
Where $P_{01}$ stands for the index number,
$\sum P_1$ stands for the sum of the prices for the year for which the index number is to be found, and
$\sum P_0$ stands for the sum of the prices for the base year.
Commodity   Prices in base year 1980 (in Rs.)   Prices in current year 1988 (in Rs.)
            P0                                  P1
10 20
25
60
\[ \text{Index Number } (P_{01}) = \frac{\sum P_1}{\sum P_0} \times 100 = 161.11 \]
Weighted Aggregative Method:
In this method, different weights are assigned to the items according to their relative
importance. The weights used are quantity weights. Many formulae have been developed to
estimate index numbers on the basis of quantity weights.
(i) Laspeyre's Formula. In this formula, the quantities of the base year are accepted as weights.
\[ P_{01} = \frac{\sum P_1 q_0}{\sum P_0 q_0} \times 100 \]
Where $P_1$ is the price in the current year; $P_0$ is the price in the base year; and $q_0$ is the quantity
in the base year.
(ii) Paasche's Formula. In this formula, the quantities of the current year are accepted as weights.
\[ P_{01} = \frac{\sum P_1 q_1}{\sum P_0 q_1} \times 100 \]
Where $q_1$ is the quantity in the current year.
(iii) Dorbish and Bowley's Formula. Dorbish and Bowley's formula for estimating the weighted index
number is as follows:
\[ P_{01} = \frac{\dfrac{\sum P_1 q_0}{\sum P_0 q_0} + \dfrac{\sum P_1 q_1}{\sum P_0 q_1}}{2} \times 100 \quad \text{or} \quad P_{01} = \frac{L + P}{2} \]
Where L is Laspeyre's index and P is Paasche's index.
(iv) Fisher's Ideal Formula. In this formula, the geometric mean of the two indices (i.e., Laspeyre's index
and Paasche's index) is taken:
\[ P_{01} = \sqrt{\frac{\sum P_1 q_0}{\sum P_0 q_0} \times \frac{\sum P_1 q_1}{\sum P_0 q_1}} \times 100 \quad \text{or} \quad P_{01} = \sqrt{L \times P} \]
Where L is Laspeyre's index and P is Paasche's index.
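The four weighted formulas can be compared side by side on one small data set. The sketch below is only illustrative: the function names, prices and quantities are hypothetical, and L and P are taken as indices already expressed as percentages.

```python
import math


def _sum_products(prices, quantities):
    """Sum of price x quantity over all commodities."""
    return sum(p * q for p, q in zip(prices, quantities))


def laspeyres(p0, p1, q0):
    """Base-year quantities as weights."""
    return _sum_products(p1, q0) / _sum_products(p0, q0) * 100


def paasche(p0, p1, q1):
    """Current-year quantities as weights."""
    return _sum_products(p1, q1) / _sum_products(p0, q1) * 100


def dorbish_bowley(l_index, p_index):
    """Arithmetic mean of the Laspeyres and Paasche indices."""
    return (l_index + p_index) / 2


def fisher(l_index, p_index):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return math.sqrt(l_index * p_index)


# Hypothetical prices and quantities of three commodities.
p0, q0 = [2, 5, 4], [10, 4, 5]   # base year
p1, q1 = [3, 6, 5], [8, 5, 6]    # current year

L = laspeyres(p0, p1, q0)
P = paasche(p0, p1, q1)
print(round(L, 2), round(P, 2))
print(round(dorbish_bowley(L, P), 2), round(fisher(L, P), 2))
```

Both combined indices lie between the Laspeyres and Paasche values, with Fisher's never above Dorbish and Bowley's, because a geometric mean cannot exceed the arithmetic mean.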
(i) Laspeyre's Formula:
\[ P_{01} = \frac{\sum P_1 q_0}{\sum P_0 q_0} \times 100 = 166.08 \]
Test of adequacy for an Index Number
* Index numbers are studied to know the relative changes in price and quantity for
any two years compared. The two tests of adequacy are:
* Time Reversal Test
* Factor Reversal Test
* The criterion for a good index number is to satisfy the above two tests.
* Fisher's index number formula satisfies both tests.
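The two reversal tests usually applied to an index number (time reversal and factor reversal) can be checked numerically. The sketch below assumes their standard statements for indices expressed as ratios: P01 x P10 = 1, and P01 x Q01 = sum of P1q1 / sum of P0q0. The function name and the data are hypothetical.

```python
import math


def _sum_products(a, b):
    """Sum of the products of corresponding items."""
    return sum(x * y for x, y in zip(a, b))


def fisher_ratio(p0, q0, p1, q1):
    """Fisher's ideal price index as a ratio (not multiplied by 100)."""
    laspeyres = _sum_products(p1, q0) / _sum_products(p0, q0)
    paasche = _sum_products(p1, q1) / _sum_products(p0, q1)
    return math.sqrt(laspeyres * paasche)


# Hypothetical prices and quantities of three commodities.
p0, q0 = [2, 5, 4], [10, 4, 5]   # base year
p1, q1 = [3, 6, 5], [8, 5, 6]    # current year

p01 = fisher_ratio(p0, q0, p1, q1)
p10 = fisher_ratio(p1, q1, p0, q0)   # base and current years interchanged
q01 = fisher_ratio(q0, p0, q1, p1)   # prices and quantities interchanged
value_ratio = _sum_products(p1, q1) / _sum_products(p0, q0)

print(math.isclose(p01 * p10, 1.0))          # time reversal test
print(math.isclose(p01 * q01, value_ratio))  # factor reversal test
```

Interchanging the arguments is all that is needed to obtain the reversed index and the quantity index from the same function.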
Chain Index Numbers
Under this method, we first express the figures for each year as a percentage of the
preceding year. These are known as link relatives. We then chain them together by
successive multiplication to form a chain index.
Steps in the construction of Chain Index Numbers
1. Calculate the link relatives by expressing the figures as a percentage of the preceding year.
Thus,
Link relative of current year = (price of current year / price of previous year) × 100.
2. Calculate the chain index by applying the following formula:
Chain index = (current year link relative × previous year chain index) / 100
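The two steps can be sketched as follows, taking the chain index of a year as its link relative multiplied by the chain index of the previous year and divided by 100, with the first year set to 100. The function names and the prices are hypothetical.

```python
def link_relatives(prices):
    """Each year's price as a percentage of the preceding year's price."""
    return [prices[i] / prices[i - 1] * 100 for i in range(1, len(prices))]


def chain_indices(prices):
    """Chain index numbers with the first year taken as 100."""
    chain = [100.0]
    for relative in link_relatives(prices):
        chain.append(relative * chain[-1] / 100)
    return chain


# Hypothetical prices of one commodity over four years.
prices = [50, 60, 66, 99]
print([round(r, 2) for r in link_relatives(prices)])
print([round(c, 2) for c in chain_indices(prices)])
```

For a single commodity the chain index of any year equals that year's price as a percentage of the first year's price, which gives a quick check on the arithmetic.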
Base Shifting
For a variety of reasons, it frequently becomes necessary to change the reference base of
an index number series from one period to another without returning to the original raw data and
recomputing the entire series. This change of reference base period is usually referred to as
"shifting the base".
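A common way to shift the base, stated here as an assumption since the notes give no formula, is to divide every index number in the series by the index number of the new base year and multiply by 100. The function name and the series are hypothetical.

```python
def shift_base(indices, new_base_position):
    """Re-express a series of index numbers with a new base year (= 100)."""
    base_value = indices[new_base_position]
    if base_value == 0:
        raise ValueError("the index of the new base year must not be zero")
    return [value / base_value * 100 for value in indices]


# Hypothetical series with the first year as the original base.
old_series = [100, 110, 125, 150]

# Shift the base to the third year (position 2).
print([round(v, 2) for v in shift_base(old_series, 2)])
```

The ratios between any two years are unchanged by the shift; only the year that reads 100 is different.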
Splicing
The process of combining two or more index number series covering different bases into a
single series is called splicing. For example, two series A and B of the index
numbers of a commodity, taking 1991 and 1994 as the base years, may be spliced into one continuous series.
Deflating
Deflating means making allowances for the effect of changing price levels. The process
of adjusting a series of salaries, wages or incomes according to current price changes to find out
the level of real salary, wages or income is called deflating.
Consumer Price Index
The Consumer Price Index (CPI) measures the average price change of a set of consumer
goods and services. CPIs can be calculated for single items or for a predetermined group of items.
All of these items are defined as "household goods and services."
Uses of Consumer Price Index
The consumer price index is mainly used to measure inflation over a given period of
time. It can also be used to determine the cost of living.
The CPI is also used to determine the efficacy of economic policies. Inflation indicates the
health (or lack thereof) of an economy, so tracking it and responding to it appropriately is
important for policymakers. When inflation sharply increases or decreases, the CPI provides
economists and policymakers with insight into how a government's economic policy affects the
market.
Statistical quality control
Statistical quality control is the use of statistical methods in monitoring and maintaining the quality of
products and services. One method, referred to as acceptance sampling, can be used when a
decision must be made to accept or reject a group of parts or items based on the quality found in
a sample.