NUMERICAL METHODS AND STATISTICAL ANALYSIS
BSDS-304
M.Sc. (IT) Final Year
MIT - 12
Advisory Committee
1. Dr. Jayant Sonwalkar, Hon’ble Vice Chancellor, Madhya Pradesh Bhoj (Open) University, Bhopal (M.P.)
2. Dr. L.S. Solanki, Registrar, Madhya Pradesh Bhoj (Open) University, Bhopal (M.P.)
3. Dr. Kishor John, Director, Madhya Pradesh Bhoj (Open) University, Bhopal (M.P.)
4. Dr. Sharad Gangele, Professor, R.K.D.F. University, Bhopal (M.P.)
5. Dr. Romsha Sharma, Professor, Sri Sathya Sai College for Women, Bhopal (M.P.)
6. Dr. K. Mani Kandan Nair, Department of Computer Science, Makhanlal Chaturvedi National University of Journalism and Communication, Bhopal (M.P.)
COURSE WRITERS
Dr. N. Dutta, Professor (Mathematics), Head, Department of Basic Sciences & Humanities, Heritage Institute of
Technology, Kolkata
Units (1, 2.0-2.3, 2.6-2.10, 3)
Manisha Pant, Former Lecturer, Department of Mathematics, H.N.B. Garhwal University, Srinagar
(A Central University), U.K.
Units (2.4, 2.5.2, 5.5.3, 4.2, 4.3, 4.6.5, 5.2, 5.6)
C. R. Kothari, Ex-Associate Professor, Department of Economic Administration & Financial Management, University of
Rajasthan
Units (2.5-2.5.1, 4.7-4.7.5)
J. S. Chandan, Retd. Professor, Medgar Evers College, City University of New York
Units (4.0-4.1, 4.3.1-4.6.4, 4.8-4.12, 5.0-5.1, 5.3-5.5, 5.7-5.11)
Copyright © Reserved, Madhya Pradesh Bhoj (Open) University, Bhopal
All rights reserved. No part of this publication which is material protected by this copyright notice
may be reproduced or transmitted or utilized or stored in any form or by any means now known or
hereinafter invented, electronic, digital or mechanical, including photocopying, scanning, recording
or by any information storage or retrieval system, without prior written permission from the Registrar,
Madhya Pradesh Bhoj (Open) University, Bhopal
Information contained in this book has been published by VIKAS® Publishing House Pvt. Ltd. and has
been obtained by its Authors from sources believed to be reliable and are correct to the best of their
knowledge. However, the Madhya Pradesh Bhoj (Open) University, Bhopal, Publisher and its Authors
shall in no event be liable for any errors, omissions or damages arising out of use of this information
and specifically disclaim any implied warranties of merchantability or fitness for any particular use.
Published by Registrar, MP Bhoj (Open) University, Bhopal in 2020
UNIT - I: Representation of Numbers (Pages 3-38)
Introduction, Limitation of Number Representation, Arithmetic Rules for Floating Point Numbers, Errors in Numbers, Measurement of Errors, Solving Equations, Introduction, Bisection Method, Regula Falsi Method, Secant Method, Convergence of the Iterative Methods.
UNIT - II: Interpolation and Curve Fitting (Pages 39-113)
Interpolation, Introduction, Lagrange Interpolation, Finite Differences, Truncation Error in Interpolation, Curve Fitting, Introduction, Linear Regression, Polynomial Regression, Fitting Exponential and Trigonometric Functions.
UNIT - III: Numerical Differentiation and Integration (Pages 115-184)
Numerical Differentiation and Integration, Introduction, Numerical Differentiation Formulae, Numerical Integration Formulae, Simpson's Rule, Errors in Integration Formulae, Gaussian Quadrature Formulae, Comparison of Integration Formulae, Solving Numerical Differential Equations, Introduction, Euler's Method, Taylor Series Method, Runge-Kutta Method, Higher Order Differential Equations.
UNIT - IV
Introduction to Statistical Computation, History of Statistics, Meaning and Unit-4: Statistical Computation and
scope of Statistics, Various measures of Average, Median, Mode, Geometric Probability Distributiona
Mean, Harmonic Mean, Measures of Dispersion, Range, Standard (Pages 185-290)
Deviation, Probability Distributions, Introduction, Counting Techniques,
Probability, Axiomatic or Modern Approach to Probability, Theorems on
Probability, Probability Distribution of a Random Variable, Mean and
Variance of a Random Variable, Standard Probability Distributions, Binomial
Distribution, Hyper geometric Distribution Geometrical Distribution,
Uniform Distribution (Discrete Random Variable), Poisson Distribution,
Exponential Distribution, Uniform Distribution (Continuous Variable),
Normal Distribution
UNIT - V: Estimation and Hypothesis Testing (Pages 291-328)
Estimation, Sampling Theory, Parameter and Statistic, Sampling Distribution of Sample Mean, Sampling Distribution of the Number of Successes, The Student's Distribution, Theory of Estimation, Point Estimation, Interval Estimation, Hypothesis Testing, Test of Hypothesis, Test of Hypothesis Concerning Mean, Test of Hypothesis Concerning Proportion, Test of Hypothesis Concerning Standard Deviation.
CONTENTS
INTRODUCTION
Numerical methods and statistical analysis together constitute the study of algorithms
for finding solutions to problems of continuous mathematics, and the mathematical
science pertaining to the collection, analysis, interpretation or explanation, and
presentation of data, which can be categorized as Inferential Statistics and
Descriptive Statistics.
Numerical method helps in obtaining approximate solutions while maintaining
reasonable bounds on errors. Although numerical analysis has applications in all
fields of engineering and the physical sciences, in the 21st century the life sciences
and the arts have also adopted elements of scientific computation. Ordinary
differential equations are used for calculating the movement of heavenly bodies,
i.e., planets, stars and galaxies. Numerical methods are also used in optimization
problems arising in portfolio management, and in solving the stochastic differential
equations that model problems in medicine and biology. Airlines use sophisticated
optimization algorithms to finalize ticket prices, airplane and crew assignments and fuel needs.
Insurance companies too use numerical programs for actuarial analysis. The basic
aim of numerical analysis is to design and analyse techniques for computing
approximate yet accurate solutions to problems that are hard to solve exactly. In numerical analysis,
two methods are involved, namely direct and iterative methods. Direct methods
compute the solution to a problem in a finite number of steps whereas iterative
methods start from an initial guess to form successive approximations that converge
to the exact solution only in the limit. Iterative methods are more common than
direct methods in numerical analysis. The study of errors is an important part of
numerical analysis. There are different methods to detect and fix errors that occur
in the solution of any problem. Round-off errors occur because it is not possible to
represent all real numbers exactly on a machine with finite memory. Truncation
errors arise when an iterative method is terminated or a mathematical
procedure is approximated, so that the approximate solution differs from the exact
solution.
Statistical analysis is very important for taking decisions and is widely used
by academic institutions, natural and social sciences departments, governments
and business organizations. The word ‘Statistics’ is derived from the Latin word
‘Status’ which means a political state or government. It was originally applied in
connection with kings and monarchs collecting data on their citizenry that pertained
to state wealth, collection of taxes, study of population, and so on. In the beginning
of the Indian, Greek and Egyptian civilizations, data was collected for the purpose
of planning and organizing civilian and military projects. Proper records of such
vital events as births and deaths have been kept since the Middle Ages. By the end
of the 19th century, the field of statistics extended from simple data collection and
record keeping to interpretation of data and drawing useful conclusions from it.
Statistics can be called a science that deals with numbers or figures describing the
state of affairs of various situations with which we are generally and specifically
concerned. To a layman, it often means columns of figures, or perhaps tables,
graphs and charts relating to population, national income, expenditures, production,
consumption, supply, demand, sales, imports, exports, births, deaths and accidents.
Similarly, statistical records kept at universities may reflect the number of students,
the percentage of female and male students, the number of divisions and courses
in each division, the number of professors, the tuition received, the expenditures
incurred, and so on. Hence, the subject of statistics deals primarily with numerical
data gathered from surveys or collected using various statistical methods.
This book is divided into five units. It is designed to be a comprehensive
and easily accessible text covering the limitations of number
representation, measurement of errors, solving equations, Regula Falsi method,
secant method, interpolation, Lagrange interpolation, curve fitting, regression,
numerical differentiation, Simpson’s rule, Gaussian quadrature formulae, solving
numerical differential equations, Euler’s method, Taylor series method, Runge-
Kutta method, history of statistics, various measures of statistical computation,
probability distribution, standard probability distribution, sampling theory, point
estimation and test of hypothesis.
The book follows the Self-Instructional Mode (SIM) wherein each unit
begins with an ‘Introduction’ to the topic. The ‘Objectives’ are then outlined before
going on to the presentation of the detailed content in a simple and structured
format. ‘Check Your Progress’ questions are provided at regular intervals to test
the student’s understanding of the subject. ‘Answers to Check Your Progress
Questions’, a ‘Summary’, a list of ‘Key Terms’, and a set of ‘Self-Assessment
Questions and Exercises’ are provided at the end of each unit for effective
recapitulation.
UNIT 1 REPRESENTATION
OF NUMBERS
Structure
1.0 Introduction
1.1 Objectives
1.2 Introduction to Numerical Computing
1.3 Limitations of Number Representations
1.3.1 Arithmetic Rules for Floating Point Numbers
1.4 Errors in Numbers and Measurement of Errors
1.4.1 Generation and Propagation of Round-Off Error
1.4.2 Round-Off Errors in Arithmetic Operations
1.4.3 Errors in Evaluation of Functions
1.4.4 Characteristics of Numerical Computation
1.4.5 Computational Algorithms
1.5 Solving Equation
1.5.1 Bisection Method and Convergence of the Iterative Method
1.5.2 Newton-Raphson Method
1.5.3 Secant Method
1.5.4 Regula-Falsi Method
1.5.5 Descartes' Rule
1.6 Answers to ‘Check Your Progress’
1.7 Summary
1.8 Key Terms
1.9 Self-Assessment Questions and Exercises
1.10 Further Reading
1.0 INTRODUCTION
The use of computers to solve problems involving real numbers is referred to as
‘Numerical Calculations’. Only finitely many real numbers can be represented by a
finite string of digits, and most scientific computers limit the number of digits used
to represent a single number to a fixed amount. Numerical error is the combined
effect of two kinds of error in a calculation. The first is caused by the finite precision
of computations involving floating point or integer values. The second, usually called
truncation error, is the difference between the exact mathematical solution and the
approximate solution obtained when simplifications are made to the mathematical
equations to make them more amenable to calculation. The number of significant
figures in a measurement, such as 2.531, is equal to the number of digits that are
known with some degree of confidence (2, 5 and 3) plus the last digit (1), which is
an estimate or approximation. Zeros within a number are always significant. Zeros
that do nothing but set the decimal point are not significant. Trailing zeros that are
not needed to hold the decimal point are significant. A round-off error, also called
rounding error, is the difference between the calculated approximation of a number
and its exact mathematical value. Numerical analysis specifically tries to estimate
this error when using approximation equations and/or algorithms, especially when
using finitely many digits to represent real numbers.
In root finding and curve fitting, a root-finding algorithm is a numerical
method, or algorithm, for finding a value x such that f(x) = 0, for a given function f.
Such an x is called a root of the function f. Generally speaking, algorithms for solving
problems numerically can be divided into two main groups: direct methods and
iterative methods. Direct methods are those which can be completed in a
predetermined finite number of steps. Iterative methods are methods which
converge to the solution over time. These algorithms run until some convergence
criterion is met. When choosing which method to use, one important consideration
is how quickly the algorithm converges to the solution, i.e., the method’s
convergence rate.
In this unit, you will learn about the limitations of number representation,
arithmetic rules for floating point numbers, errors in numbers and measurement of
errors, solving equation, bisection method and convergence of the iterative method,
secant method and Regula-Falsi method.
1.1 OBJECTIVES
After going through this unit, you will be able to:
• Understand the limitations of number representation
• Explain the arithmetic rules for floating point numbers
• Analyse the errors in numbers and the measurement of errors
• Solve equations numerically
• Discuss the bisection method and the convergence of the iterative method
• Elaborate on the secant method
• Explain the Regula-Falsi method
The mantissa has a 0 in the leftmost position to denote a plus. Here, the
mantissa is considered to be a fixed point fraction. This representation is equivalent
to the number expressed as a fraction multiplied by a power of 10, that is,
0.6132789 × 10^(+4). Because of this analogy, the mantissa is sometimes called the
fraction part.
Consider, for example, a computer that assumes integer representation for the
mantissa and radix 8 for the numbers. The octal number +36.754 = 36754 × 8^(−3) in its
floating point representation will look like this:

  sign             sign
   0    36754       1    03
      mantissa        exponent
When this number is represented in a register in its binary-coded form, the
actual value of the register becomes 0 011 110 111 101 100 and 1 000 011.
Most computers and all electronic calculators have a built-in capacity to
perform floating-point arithmetic operations.
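The round-off behaviour described above is easy to observe directly. The following Python sketch (an illustration added here, not part of the original text) shows that a decimal fraction such as 0.1 has no exact binary floating-point representation, and prints the machine epsilon of double precision:

```python
import sys

# 0.1 cannot be stored exactly in binary floating point, so ten additions
# of 0.1 do not give exactly 1.0:
total = sum(0.1 for _ in range(10))
print(total == 1.0)              # False on IEEE-754 machines
print(abs(total - 1.0) < 1e-9)   # True: the error is tiny but nonzero

# Machine epsilon: the gap between 1.0 and the next representable number.
print(sys.float_info.epsilon)    # about 2.22e-16 in double precision
```

The accumulated error here is a propagated round-off error of exactly the kind analysed in Section 1.4.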
Example 1.1: Determine the number of bits required to represent in floating point
notation the exponent for decimal numbers in the range of 10^(±86).
Solution: Let n be the required number of bits, so that 2^n = 10^86. Taking logarithms,

  n log10 2 = 86
  n = 86 / log10 2 = 86 / 0.3010 ≈ 285.7

Therefore, 10^86 ≈ 2^285.7.
Thus, the propagated round-off error in the sum of two approximate numbers
(having round-off errors ε1 and ε2) is equal to the sum of the round-off errors in
the individual numbers.
The multiplication of two approximate numbers has the propagated round-off
error given by,

  xT yT = (x + ε1)(y + ε2) = xy + ε1 y + ε2 x + ε1 ε2

Since the product ε1 ε2 is a small quantity of higher order than ε1 or ε2, we may
take the propagated round-off error as ε1 y + ε2 x, and the relative propagated
error is given by,

  (ε1 y + ε2 x)/(xy) = ε1/x + ε2/y

This is equal to the sum of the relative errors in the numbers x and y.
Similarly, for division we get the relative propagated error as,

  [(xT/yT) − (x/y)] / (x/y) ≈ ε1/x − ε2/y

Thus, the relative error in division is equal to the difference of the relative
errors in the numbers.
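These first-order rules can be checked numerically. The Python sketch below (an illustration added here; the sample values x, y, ε1, ε2 are borrowed from Example 1.8 further on) compares the exact relative errors of a product and a quotient with the sum and difference of the component relative errors:

```python
# Check that, to first order, the relative error of a product is the SUM of
# the factors' relative errors, and that of a quotient is their DIFFERENCE.
x, y = 13.24, 14.32          # true values
e1, e2 = 0.004, 0.002        # round-off errors
xT, yT = x + e1, y + e2      # stored approximate values

rel_product = abs(xT * yT - x * y) / (x * y)
rel_quotient = abs(xT / yT - x / y) / (x / y)

# Agreement up to second-order terms like e1*e2/(x*y):
assert abs(rel_product - (e1 / x + e2 / y)) < 1e-6
assert abs(rel_quotient - (e1 / x - e2 / y)) < 1e-6
```

The residual differences are of the order of ε1ε2/(xy), confirming that the neglected terms are second-order small.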
1.4.3 Errors in Evaluation of Functions
The propagated error in the evaluation of a function f (x) of a single variable x
having a round-off error ε is given by,

  f (x + ε) − f (x) ≈ ε f ′(x)

In the evaluation of a function of several variables x1, x2, ..., xn, the propagated
round-off error is given by

  Σ (i = 1 to n) εi ∂f/∂xi

where ε1, ε2, ..., εn are the round-off errors in x1, x2, ..., xn, respectively.
Significance Errors
Table 1.1 Computed Value of f(x) upto Six Decimal Places

Table 1.1 shows that the error in the computed value becomes more serious
for smaller values of x. The inaccuracy is due to the loss of significant digits
during the subtraction of the nearly equal numbers 1 and cos x. It may be noted
that the correct values of f (x) can be computed by avoiding the division of a
small difference by a small number, by rewriting f (x) as given below.

  f (x) = [(1 − cos x)/x²] × [(1 + cos x)/(1 + cos x)]

i.e.,  f (x) = sin²x / [x²(1 + cos x)]
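The two forms can be compared in a few lines of Python (an added illustration; f here is the function discussed above). For small x the direct formula loses all significant digits, while the rewritten form stays close to the true limit 1/2:

```python
import math

def f_naive(x):
    # direct evaluation: 1 - cos x suffers catastrophic cancellation
    return (1.0 - math.cos(x)) / (x * x)

def f_stable(x):
    # rewritten form: f(x) = sin^2 x / (x^2 (1 + cos x))
    return math.sin(x) ** 2 / (x * x * (1.0 + math.cos(x)))

x = 1.0e-8
print(f_naive(x))    # far from 0.5: the significant digits are lost
print(f_stable(x))   # approximately 0.5
```

At x = 10^(-8), cos x rounds to exactly 1.0 in double precision, so the direct form returns 0 while the stable form remains accurate.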
Example 1.3: Round the number x = 2.2554 to three significant figures. Find the
absolute error and the relative error.
Solution: The rounded-off number is 2.25.
The absolute error is 0.0054.
The relative error is 0.0054/2.2554 ≈ 0.0024.
For example, the relative error when 22/7 is approximated by 3.14 is,
  (22/7 − 3.14)/(22/7) ≈ 0.00090.
Example 1.7: Given f (x, y, z) = 5xy²/z², find the relative maximum error in the
evaluation of f (x, y, z) at x = y = z = 1, if x, y, z have absolute errors
Δx = Δy = Δz = 0.1.
Solution: The value of f (x, y, z) at x = y = z = 1 is 5. The maximum absolute error
in the evaluation of f (x, y, z) is,

  (Δf)max = |∂f/∂x| Δx + |∂f/∂y| Δy + |∂f/∂z| Δz
          = (5y²/z²) Δx + (10xy/z²) Δy + (10xy²/z³) Δz

At x = y = z = 1, (Δf)max = (5 + 10 + 10) × 0.1 = 2.5, and the maximum relative
error is,

  (ER)max = (25 × 0.1)/5 = 0.5
Example 1.8: Find the relative propagated error in the evaluation of x + y where
x = 13.24 and y = 14.32 have round-off errors ε1 = 0.004 and ε2 = 0.002
respectively.
Solution: Here, x + y = 27.56 and ε1 + ε2 = 0.006.
Thus, the required relative error = 0.006/27.56 ≈ 0.0002177.
Example 1.9: Find the relative percentage error in the evaluation of u = xy with
x = 5.43, y = 3.82 having round-off errors 0.01 in both x and y.
Solution: Now, xy = 5.43 × 3.82 ≈ 20.74.
The relative error in x is 0.01/5.43 ≈ 0.0018.
The relative error in y is 0.01/3.82 ≈ 0.0026.
Thus, the relative propagated error in xy = 0.0018 + 0.0026 = 0.0044.
The percentage relative error = 0.44 per cent.
Example 1.10: Given u = xy + yz + zx, find the estimate of relative percentage
error in the evaluation of u for x = 2.104, y = 1.935 and z = 0.845, the
approximate values being correct to the last digit.
Solution: Here, u = x (y + z) + yz = 2.104 (1.935 + 0.845) + 1.935 × 0.845
  = 5.849 + 1.635 = 7.484
Error, Δu = (y + z) Δx + (z + x) Δy + (x + y) Δz
  ≤ 2(x + y + z) × 0.0005, since Δx = Δy = Δz = 0.0005
  = 2 × 4.884 × 0.0005 ≈ 0.0049
Hence, the relative percentage error = (0.0049/7.484) × 100 ≈ 0.065 per cent.
Example 1.11: The diameter of a circle measured to within 1 mm is d = 0.842 m.
Compute the area of the circle and give the estimated relative error in the computed
result.
Solution: The area of the circle A is given by the formula, A = πd²/4.

  Δ(a·b) = a·Δb + b·Δa
         = 0.01 × 3.82 + 0.01 × 5.43, since Δa = Δb = 0.01
         = 0.0925
where a1, b1, c1, a2, b2, c2 are real constants. The solution of the equations is
given by cross multiplication as,

  x = (b2 c1 − b1 c2)/(a1 b2 − a2 b1),  y = (c2 a1 − c1 a2)/(a1 b2 − a2 b1)

It may be noted that if a1 b2 − a2 b1 = 0, then the solution does not exist. This
aspect has to be kept in mind while writing the algorithm as given below.
Further, if b² ≥ 4ac, the roots are real, otherwise they are complex conjugates.
This aspect is to be considered while writing an algorithm.
Algorithm: Computation of roots of a quadratic equation.
Step 1: Read a, b, c
Step 2: Compute d = b² − 4ac
Step 3: Check if d ≥ 0, go to Step 4 else go to Step 8
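The algorithm can be sketched in Python as follows (a hedged illustration written for this text: the function name and the complex-arithmetic branch are choices made here, and the remaining steps of the original algorithm are not shown in this excerpt):

```python
import math

def quadratic_roots(a, b, c):
    """Roots of a x^2 + b x + c = 0, a != 0 (sketch of the algorithm above)."""
    d = b * b - 4.0 * a * c          # Step 2: the discriminant
    if d >= 0:                       # Step 3: real roots when d >= 0
        sq = math.sqrt(d)
        return (-b + sq) / (2 * a), (-b - sq) / (2 * a)
    sq = math.sqrt(-d)               # otherwise complex conjugate roots
    return complex(-b, sq) / (2 * a), complex(-b, -sq) / (2 * a)

print(quadratic_roots(1, -3, 2))     # roots of x^2 - 3x + 2: (2.0, 1.0)
```

Testing d explicitly before taking the square root mirrors the branch in Step 3 and avoids a domain error for complex-root cases.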
1.5 SOLVING EQUATION
In this section, we consider numerical methods for computing the roots of an
equation of the form,

  f (x) = 0    (1.1)

where f (x) is a reasonably well-behaved function of a real variable x. The function
may be in algebraic or polynomial form given by,

  f (x) = an x^n + an−1 x^(n−1) + ... + a1 x + a0    (1.2)

It may also be an expression containing transcendental functions such as
cos x, sin x, e^x, etc. First, we would discuss methods to find the isolated real roots
of a single equation. Later, we would discuss methods to find the isolated roots of
a system of equations, particularly of two real variables x and y, given by

  f (x, y) = 0,  g (x, y) = 0    (1.3)
A root of an equation is usually computed in two stages. First, we find the
location of a root in the form of a crude approximation of the root. Next we use
an iterative technique for computing a better value of the root to a desired accuracy
in successive approximations/computations. This is done by using an iterative
function.
Fig. 1.1 Graph of y = x² − 2x − 1
In some cases, where it is complicated to draw the graph of y = f (x), we may
rewrite the equation f (x) = 0 as f1(x) = f2(x), where the graphs of y = f1(x) and
y = f2(x) are standard curves. Then we find the x-coordinate(s) of the point(s) of
intersection of the curves y = f1(x) and y = f2(x), which is the crude approximation
of the root(s).
For example, consider the equation

  x³ − 15.2x − 13.2 = 0

This can be rewritten as,

  x³ = 15.2x + 13.2

where it is easy to draw the graphs of y = x³ and y = 15.2x + 13.2. Then, the
abscissa of the point(s) of intersection can be taken as the crude approximation(s)
of the root(s).

[Figure: graphs of y = x³ and y = 15.2x + 13.2]
Example 1.14: Find the location of the root of the equation x log10 x = 1.
Solution: The equation can be rewritten as log10 x = 1/x.
Now the curves y = log10 x and y = 1/x can be easily drawn, as shown in the
figure below.

[Figure: graphs of y = 1/x and y = log10 x, intersecting between x = 2 and x = 3]

The point of intersection of the curves has its x-coordinate approximately equal
to 2.5. Thus, the location of the root is 2.5.
Tabulation Method: In the tabulation method, a table of values of f (x) is made
for values of x in a particular range. Then, we look for the change in sign in the
values of f (x) for two consecutive values of x. We conclude that a real root lies
between these values of x. This is true if we make use of the following theorem on
continuous functions.
Theorem 1.1: If f (x) is continuous in an interval (a, b), and f (a) and f(b) are of
opposite signs, then there exists at least one real root of f (x) = 0, between a and
b.
Consider for example, the equation f (x) = x³ − 8x + 5 = 0.
Constructing the following table of x and f (x),

  x       −4   −3   −2   −1    0    1    2    3
  f (x)  −27    2   13   12    5   −2   −3    8

we observe that there is a change in sign of f (x) in each of the sub-intervals
(−4, −3), (0, 1) and (2, 3). Thus we can take the crude approximations for the
three real roots as −3.2, 0.2 and 2.2.
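The tabulation method is mechanical enough to automate. This short Python sketch (added here for illustration) scans consecutive integer arguments of f(x) = x³ − 8x + 5 and reports each pair between which f changes sign:

```python
def f(x):
    return x ** 3 - 8 * x + 5

# Look for sign changes between consecutive integer arguments in [-4, 3]:
brackets = [(x, x + 1) for x in range(-4, 3) if f(x) * f(x + 1) < 0]
print(brackets)   # [(-4, -3), (0, 1), (2, 3)]
```

Each reported pair is an interval that, by Theorem 1.1, contains at least one real root.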
Fig. 1.3 Graph of the Bisection Method showing Two Initial Estimates
xa and xb Bracketing the Root
The method is applicable when we wish to solve the equation f(x) = 0 for the real
variable x, where f is a continuous function defined on an interval [a, b] and f(a)
and f(b) have opposite signs.
The bisection method involves successive reduction of the interval in which
an isolated root of an equation lies. This method is based upon an important theorem
on continuous functions as stated below.
Theorem 1.2: If a function f (x) is continuous in the closed interval [a, b], and
f (a) and f (b) are of opposite signs, i.e., f (a) f (b) < 0, then there exists at least
one real root of f (x) = 0 between a and b.
The bisection method starts with two guess values x0 and x1 such that
f(x0) . f(x1) < 0. The interval [x0, x1] is bisected at the point x2 = (x0 + x1)/2
and we compute f(x2). If f(x2) = 0, then x2 is a root. Otherwise, we check
whether f(x0) . f(x2) < 0 or f(x1) . f(x2) < 0. If f(x0) . f(x2) < 0, then the root
lies in the interval (x0, x2); otherwise f(x1) . f(x2) < 0 and the root lies in the
interval (x2, x1).
The sub-interval in which the root lies is again bisected and the above process
is repeated until the length of the sub-interval is less than the desired accuracy.
The bisection method is also termed as bracketing method, since the method
successively reduces the gap between the two ends of an interval surrounding the
real root, i.e., brackets the real root.
The algorithm given below clearly shows the steps to be followed in finding a
real root of an equation, by bisection method to the desired accuracy.
Algorithm: Finding root using bisection method.
Step 1: Define the equation, f (x) = 0
Step 2: Read epsilon, the desired accuracy
Step 3: Read two initial values x0 and x1 which bracket the desired root
Step 4: Compute y0 = f (x0)
Step 5: Compute y1 = f (x1)
Step 6: Check if y0 y1 < 0, then go to Step 7
        else go to Step 3
Step 7: Compute x2 = (x0 + x1)/2
Step 8: Compute y2 = f (x2)
[Flowchart omitted: after computing x2 = (x0 + x1)/2 and y2 = f(x2), set
x0 = x2 if y0 y2 > 0, otherwise set x1 = x2; repeat while |(x1 − x0)/x0| > epsilon,
then print the root x2.]
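The algorithm and flowchart condense into a short Python function (a sketch written for this text; the stopping rule on the relative interval width follows the epsilon test above). Applied to x³ − 9x + 1 = 0, it reproduces the root found in Example 1.15 below:

```python
def bisection(f, x0, x1, epsilon=1e-6):
    """Bisection method: f(x0) and f(x1) must bracket a root."""
    y0, y1 = f(x0), f(x1)
    if y0 * y1 > 0:
        raise ValueError("f(x0) and f(x1) must have opposite signs")
    while abs((x1 - x0) / x1) > epsilon:
        x2 = (x0 + x1) / 2.0
        y2 = f(x2)
        if y2 == 0:
            return x2
        if y0 * y2 > 0:          # root is not in (x0, x2): move x0 up
            x0, y0 = x2, y2
        else:                    # root lies in (x0, x2): move x1 down
            x1, y1 = x2, y2
    return (x0 + x1) / 2.0

root = bisection(lambda x: x**3 - 9*x + 1, 0.0, 1.0)
print(round(root, 2))   # 0.11
```

Each pass halves the bracketing interval, so the error shrinks by a factor of two per iteration, which is the characteristic (linear) convergence rate of bisection.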
Example 1.15: Find the location of the smallest positive root of the equation
x³ − 9x + 1 = 0 and compute it by bisection method, correct to two decimal
places.
Solution: To find the location of the smallest positive root we tabulate the function
f (x) = x³ − 9x + 1 below.

  x       0    1    2    3
  f (x)   1   −7   −9    1

We observe that the smallest positive root lies in the interval [0, 1]. The
computed values for the successive steps of the bisection method are given in the
table.

  n    x0         x1       x2          f (x2)
  1    0          1        0.5         −3.37
  2    0          0.5      0.25        −1.23
  3    0          0.25     0.125       −0.123
  4    0          0.125    0.0625       0.437
  5    0.0625     0.125    0.09375      0.155
  6    0.09375    0.125    0.109375     0.016933
  7    0.109375   0.125    0.11718     −0.053
From the above results, we conclude that the smallest root correct to two
decimal places is 0.11.
Simple Iteration Method: A root of an equation f (x) = 0 is determined using
the method of simple iteration by successively computing better and better
approximations of the root, by first rewriting the equation in the form,

  x = g(x)    (1.4)

Then, we form the sequence {xn} starting from the guess value x0 of the root
and computing successively,

  x1 = g(x0), x2 = g(x1), ..., xn = g(xn−1)

In general, the above sequence may converge to the root as n → ∞, or it
may diverge. If the sequence diverges, we shall discard it and consider another
form x = h(x), by rewriting f (x) = 0. It is always possible to get a convergent
sequence since there are different ways of rewriting f (x) = 0 in the form x = g(x).
However, instead of starting computation of the sequence, we shall first test whether
the form of g(x) can give a convergent sequence or not. We give below a theorem
which can be used to test for convergence.
Theorem 1.3: If the function g(x) is continuous in the interval [a, b] which contains
a root of the equation f (x) = 0, rewritten as x = g(x), and |g′(x)| ≤ l < 1 in this
interval, then for any choice of x0 ∈ [a, b], the sequence {xn} determined by the
iterations,

  xk+1 = g(xk), for k = 0, 1, 2, ...    (1.5)

converges to the root of f (x) = 0.
Proof: Since x = ξ is a root of the equation x = g(x), we have

  ξ = g(ξ)    (1.6)

The first iteration gives x1 = g(x0)    (1.7)
Subtracting Equation (1.7) from Equation (1.6), we get

  ξ − x1 = g(ξ) − g(x0)

Applying the mean value theorem, we can write

  ξ − x1 = (ξ − x0) g′(s0),  x0 < s0 < ξ    (1.8)

Similarly, we can derive

  ξ − x2 = (ξ − x1) g′(s1),  x1 < s1 < ξ    (1.9)
  ....
  ξ − xn+1 = (ξ − xn) g′(sn),  xn < sn < ξ    (1.10)

Here |g′(1)| < 1. Hence, this form would give a convergent sequence of
iterations.
Example 1.17: Compute the real root of the equation x³ + x² − 1 = 0, correct to
five significant digits, by iteration method.
Solution: The equation has a real root between 0 and 1 since f (x) = x³ + x² − 1
has opposite signs at 0 and 1. For using iteration, we first rewrite the equation in
the following different forms:

  (i) x = 1/x² − 1   (ii) x = √(1/x − x)   (iii) x = 1/√(x + 1)

For the form (i), g(x) = 1/x² − 1, g′(x) = −2/x³ and for x in (0, 1), |g′(x)| > 1.
So, this form is not suitable.
For the form (ii), g(x) = √(1/x − x), g′(x) = −(1/x² + 1)/[2√(1/x − x)] and
|g′(x)| > 1 for all x in (0, 1).
Finally, for the form (iii), g(x) = 1/√(x + 1), g′(x) = −1/[2(x + 1)^(3/2)] and
|g′(x)| < 1 for x in (0, 1).
Thus this form can be used to form a convergent sequence for finding the root.
We start the iteration x = 1/√(1 + x) with x0 = 1. The results of successive
iterations are,

  x1 = 0.70711  x2 = 0.76537  x3 = 0.75236  x4 = 0.75541
  x5 = 0.75476  x6 = 0.75490  x7 = 0.75488  x8 = 0.75488
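The convergent scheme of form (iii) can be run directly. Below is a small fixed-point iteration helper in Python (an added sketch; the tolerance and iteration cap are arbitrary choices made here):

```python
import math

def fixed_point(g, x0, tol=1e-6, max_iter=100):
    """Iterate x_{k+1} = g(x_k) until successive values agree within tol."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge")

# Example 1.17: root of x^3 + x^2 - 1 = 0 via the form x = 1/sqrt(1 + x)
root = fixed_point(lambda x: 1.0 / math.sqrt(1.0 + x), 1.0)
print(round(root, 5))   # 0.75488
```

Because |g′| ≈ 0.2 near the root, each iteration reduces the error by roughly a factor of five, matching the tabulated iterates above.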
The equation x³ − 9x + 1 = 0, which has a root between 2 and 4, can be
rewritten in the forms:

  (i) x = (x³ + 1)/9   (ii) x = 9/x − 1/x²   (iii) x = √(9 − 1/x)

In case of (i), g′(x) = x²/3 and for x in [2, 4], |g′(x)| > 1. Hence it will not give
rise to a convergent sequence.
In case of (ii), g′(x) = −9/x² + 2/x³ and, near the root, |g′(x)| < 1.
In case of (iii), g′(x) = 1/[2x² √(9 − 1/x)] and |g′(x)| < 1.
Thus, the forms (ii) and (iii) would give convergent sequences for finding the
root in [2, 3].
We start the iterations taking x0 = 2 in the iteration scheme (iii). The results
for successive iterations are,

  x0 = 2.0   x1 = 2.91548   x2 = 2.94228   x3 = 2.94281   x4 = 2.94282

Thus, the root can be taken as 2.94281, correct to four decimal places.
1.5.2 Newton-Raphson Method
Newton-Raphson method is a widely used numerical method for finding a root of
an equation f (x) = 0, to the desired accuracy. It is an iterative method which has
a faster rate of convergence and is very useful when the expression for the derivative
f ′(x) is not complicated. The Newton-Raphson method, also called Newton’s
method, is a root finding algorithm that uses the first few terms of the Taylor series
of a function f(x) in the neighborhood of a suspected root. In the Newton-Raphson
method, we start with an initial guess x1 at the root; the next guess x2 is
the intersection of the tangent from the point [x1, f(x1)] to the x-axis. The next
guess x3 is the intersection of the tangent from the point [x2, f(x2)] to the x-axis as
shown in Figure 1.4.
[Fig. 1.4: the tangents at (x1, f(x1)) and (x2, f(x2)) cut the x-axis at x2 and x3.]
To derive the formula for this method, we consider a Taylor’s series expansion of
f (x0 + h), x0 being an initial guess of a root of f (x) = 0 and h a small correction to
the root.
  f (x0 + h) = f (x0) + h f ′(x0) + (h²/2!) f ″(x0) + ...
Assuming h to be small, we equate f (x0 + h) to 0 by neglecting square and
higher powers of h.

  f (x0) + h f ′(x0) = 0
or,  h = − f (x0)/f ′(x0)

Thus, we can write an improved value of the root as,

  x1 = x0 + h
i.e.,  x1 = x0 − f (x0)/f ′(x0)

The successive approximations are given by,

  x2 = x1 − f (x1)/f ′(x1)
  x3 = x2 − f (x2)/f ′(x2)
  ... ... ...
  xn+1 = xn − f (xn)/f ′(xn)    (1.13)
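Formula (1.13) translates directly into code. The following Python sketch (an added illustration; f and its derivative are supplied by the caller, and the tolerance is a choice made here) repeats the iteration until successive approximations agree:

```python
def newton_raphson(f, df, x0, tol=1e-10, max_iter=50):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("did not converge")

# Root of x^3 + x^2 - 1 = 0 (compare Example 1.17), starting from x0 = 1:
root = newton_raphson(lambda x: x**3 + x**2 - 1,
                      lambda x: 3*x**2 + 2*x,
                      1.0)
print(round(root, 5))   # 0.75488
```

Note how few iterations this takes compared with the simple iteration of Example 1.17: the quadratic convergence of Newton's method roughly doubles the number of correct digits per step.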
  x       0    1    2    3    4
  f (x)   4   13   12    1   28
We have,  xn+1 = xn − (xn² − a)/(2xn)

or, for the kth root of a in general,

  xn+1 = (1/k) [(k − 1) xn + a/xn^(k−1)], for n = 0, 1, 2, ...

Now, for evaluating ³√2, we take x0 = 1.25 and use the iterative formula,

  xn+1 = (1/3) [2xn + 2/xn²]

We have,  x1 = (1/3) [2 × 1.25 + 2/(1.25)²] ≈ 1.26
  x2 = 1.259921,  x3 = 1.259921
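The general kth-root formula above is Newton's method applied to f(x) = x^k − a; as a Python sketch (added illustration, with tolerance and iteration cap chosen here):

```python
def kth_root(a, k, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = ((k-1) x_n + a / x_n^(k-1)) / k."""
    x = x0
    for _ in range(max_iter):
        x_new = ((k - 1) * x + a / x ** (k - 1)) / k
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("did not converge")

print(round(kth_root(2.0, 3, 1.25), 6))   # cube root of 2: 1.259921
```

With k = 2 this reduces to the familiar square-root iteration xn+1 = (xn + a/xn)/2.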
  ξ = xn − f (xn)/f ′(xn) − (1/2)(ξ − xn)² f ″(xn)/f ′(xn) − ...

so that,

  ξ − xn+1 = −(1/2)(ξ − xn)² f ″(xn)/f ′(xn)

The iteration converges if

  |f (x) f ″(x)| / [f ′(x)]² < 1, in the interval near the root.
Next, we compute f (x2) and determine the interval in which the root lies in the
following manner. If (a) f (x2) and f (x1) are of opposite signs, then the root lies in
(x2, x1). Otherwise if (b) f (x0) and f (x2) are of opposite signs, then the root lies in
(x0, x2). The next approximation is determined by replacing x0 with x2 in the first
case and x1 with x2 in the second case.
The aforesaid process is repeated until the root is computed to the desired
accuracy ε, i.e., until the condition

  |(xk+1 − xk)/xk| < ε

is satisfied.
Regula-Falsi method can be geometrically interpreted by the following
Figure 1.6.
[Fig. 1.6: the chord joining (x0, f(x0)) and (x1, f(x1)) cuts the x-axis at the next
approximation x2.]
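The interval-update logic described above can be sketched as follows. Note that the secant-style formula used for x2 is the standard false-position update; the portion of the text deriving it is not reproduced in this excerpt, so it should be read as an assumption of this sketch:

```python
def regula_falsi(f, x0, x1, epsilon=1e-8, max_iter=200):
    """False position: keep the root bracketed while refining x2."""
    f0, f1 = f(x0), f(x1)
    if f0 * f1 > 0:
        raise ValueError("the root must be bracketed by x0 and x1")
    x2 = x0
    for _ in range(max_iter):
        x_prev = x2
        x2 = x0 - f0 * (x1 - x0) / (f1 - f0)   # standard false-position point
        f2 = f(x2)
        if f2 * f1 < 0:          # root lies in (x2, x1): replace x0
            x0, f0 = x2, f2
        else:                    # root lies in (x0, x2): replace x1
            x1, f1 = x2, f2
        if x2 != 0 and abs((x2 - x_prev) / x2) < epsilon:
            return x2
    raise RuntimeError("did not converge")

# Smallest positive root of x^3 - 9x + 1 = 0 (compare Example 1.15):
root = regula_falsi(lambda x: x**3 - 9*x + 1, 0.0, 1.0)
print(round(root, 5))   # 0.11126
```

Unlike the secant method, the bracketing update here guarantees that the root always stays between x0 and x1.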
roots is three or one. Thus, it must have a real root. In fact, every polynomial
equation of odd degree has a real root.

We can also use Descartes' rule to determine the number of negative roots by
finding the number of changes of sign in pn(−x). For the above equation, pn(−x)
has two changes of sign. Thus, it has either two negative real roots or none.
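Counting sign changes, as Descartes' rule requires for both pn(x) and pn(−x), is mechanical; a small sketch using a hypothetical polynomial of my own (not the one discussed above):

```python
def sign_changes(coeffs):
    """Number of sign changes in a coefficient sequence, zeros ignored."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def negated_argument(coeffs):
    """Coefficients of p(-x), given those of p(x) from the highest degree down."""
    n = len(coeffs) - 1
    return [c if (n - i) % 2 == 0 else -c for i, c in enumerate(coeffs)]

# hypothetical example: p(x) = x^3 - 2x^2 - 5x + 6 = (x - 1)(x + 2)(x - 3)
p = [1, -2, -5, 6]
pos_bound = sign_changes(p)                    # bound on the number of positive roots
neg_bound = sign_changes(negated_argument(p))  # bound on the number of negative roots
```

Here pos_bound comes out as 2 and neg_bound as 1, and this polynomial indeed has two positive roots and one negative root.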
1.7 SUMMARY
Numerical methods are methods used for solving problems through numerical
calculations, providing a table of numbers and/or graphical representations
or figures. Numerical methods emphasize how the algorithms are implemented.

To perform a numerical calculation with real numbers, we first approximate them
by a representation involving a finite number of significant digits. If the numbers
to be represented are very large or very small, then they are written in
floating point notation.
The Institute of Electrical and Electronics Engineers (IEEE) has published a standard for binary floating point arithmetic, known as the IEEE 754 standard.
Short-Answer Questions
1. What are floating point numbers?
Interpolation and Curve Fitting
2.0 INTRODUCTION
Interpolation is the process of defining a function that takes on specified values at
specified points. Polynomial interpolation is the best known one-dimensional
interpolation method. Its advantages lie in its simplicity of realization and the good
quality of interpolants obtained from it. You will learn about the various interpolation
methods, namely Lagrange’s interpolation, Newton’s forward and backward
difference interpolation formulae, iterative linear interpolation and inverse
interpolation.
Curve fitting is the process of constructing a curve, or mathematical function,
which has the best fit to a series of data points, possibly subject to constraints.
In mathematics, the trigonometric functions, also called the circular functions,
are functions of an angle. They relate the angles of a triangle to the lengths of its
sides. The most familiar trigonometric functions are the sine, cosine and tangent. In
the context of the standard unit circle (a circle with radius 1 unit), where a triangle
is formed by a ray originating at the origin and making some angle with the x-axis,
the sine of the angle gives the length of the y-component (the opposite to the angle,
or the rise) of the triangle, the cosine gives the length of the x-component (the
adjacent of the angle, or the run), and the tangent function gives the slope
(y-component divided by the x-component). Trigonometric functions are commonly
defined as ratios of two sides of a right triangle containing the angle, and can
equivalently be defined as the lengths of various line segments from a unit circle.
Regression analysis is the mathematical process of using observations to
find the line of best fit through the data in order to make estimates and predictions
about the behaviour of variables. This technique is used to determine the statistical
relationship between two or more variables and to make a prediction of one variable
on the basis of one or more other variables.

In this unit, you will learn about interpolation, curve fitting, trigonometric
functions and regression.
2.1 OBJECTIVES
After going through this unit, you will be able to:
Describe the method of iterative linear interpolation
Understand polynomial interpolation
Explain the importance of Lagrange’s interpolation
Perform interpolation of equally spaced tabular values
Explain finite, forward and backward differences
Evaluate interpolation using symbolic, shift and central difference operators
Know differences of polynomials
Define Newton’s forward and backward interpolation formulae
Explain extrapolation and inverse interpolation
Understand the concept of curve fitting
Explain the various trigonometric functions
Discuss regression analysis in detail
2.2 INTERPOLATION
The problem of interpolation is a very fundamental problem in numerical analysis.
The term interpolation literally means reading between the lines. In numerical analysis,
interpolation means computing the value of a function f (x) in between values of x
in a table of values. It can be stated explicitly as ‘given a set of (n + 1) values y0,
y1, y2,..., yn for x = x0, x1, x2, ..., xn respectively. The problem of interpolation is
to compute the value of the function y = f (x) for some non-tabular value of x.’
The computation is often made by finding a polynomial called interpolating
polynomial of degree less than or equal to n such that the value of the polynomial
is equal to the value of the function at each of the tabulated points. Thus if,
    φ(x) = a0 + a1x + a2x² + ... + anxⁿ                              (2.1)

is the interpolating polynomial of degree n, then

    φ(xi) = yi,  for i = 0, 1, 2, ..., n                             (2.2)
It is true that, in general, it is difficult to guess the type of function to
approximate f (x). In case of periodic functions, the approximation can be made
by a finite series of trigonometric functions. Polynomial interpolation is a very useful
method for functional approximation. The interpolating polynomial is also useful as
a basis to develop methods for other problems such as numerical differentiation,
numerical integration and solution of initial and boundary value problems associated
with differential equations.
The following theorem, developed by Weierstrass, gives the justification for
approximation of the unknown function by a polynomial.
Theorem 2.1: Every function which is continuous in an interval (a, b) can be
represented in that interval by a polynomial to any desired accuracy. In other
words, it is possible to determine a polynomial P(x) such that |f (x) − P(x)| < ε,
for every x in the interval (a, b), where ε is any prescribed small quantity.

Geometrically, it may be interpreted that the graph of the polynomial y = P(x) is
confined to the region bounded by the curves y = f (x) − ε and y = f (x) + ε for
all values of x within (a, b), however small ε may be.
This form of p01(x) is easy to visualize and is convenient for desk computation.
Thus, the linear interpolating polynomial through the pair of points (x0, f0) and
(xj, fj) can be easily written as,

    p0j(x) = 1/(xj − x0) · | f0   x0 − x |
                           | fj   xj − x | ,   for j = 1, 2, ..., n          (2.4)

Now, consider the polynomial denoted by p01j(x) and defined by,

    p01j(x) = 1/(xj − x1) · | p01(x)   x1 − x |
                            | p0j(x)   xj − x | ,   for j = 2, 3, ..., n     (2.5)

The polynomial p01j(x) interpolates f (x) at the points x0, x1, xj (j > 1) and is a
polynomial of degree 2. It can be easily verified that,

    p01j(x0) = f0,  p01j(x1) = f1  and  p01j(xj) = fj,  because p01(x0) = f0 = p0j(x0), etc.

    p012j(x) = 1/(xj − x2) · | p012(x)   x2 − x |
                             | p01j(x)   xj − x | ,   for j = 3, 4, ..., n   (2.6)
    xk     fk     p0j    p01j   ...    xj − x
    x0     f0                         x0 − x
    x1     f1     p01                 x1 − x
    x2     f2     p02    p012        x2 − x
    x3     f3     p03    p013        x3 − x
    ...    ...    ...    ...    ...   ...
    xj     fj     p0j    p01j        xj − x
    ...    ...    ...    ...    ...   ...
    xn     fn     p0n    p01n        xn − x
Solution: Here, x = 2.12. The following table gives the successive iterative linear
interpolation results. The details of the calculations are shown below the table.

    xj     s(xj)     p0j       p01j      p012j     xj − x
    2.0    0.7909                                  −0.12
    2.1    0.7875    0.78682                       −0.02
    2.2    0.7796    0.78412   0.78628              0.08
    2.3    0.7673    0.78146   0.78628   0.78628    0.18
    p01 = 1/(2.1 − 2.0) · | 0.7909   −0.12 |
                          | 0.7875   −0.02 |  = 0.78682

    p02 = 1/(2.2 − 2.0) · | 0.7909   −0.12 |
                          | 0.7796    0.08 |  = 0.78412

    p03 = 1/(2.3 − 2.0) · | 0.7909   −0.12 |
                          | 0.7673    0.18 |  = 0.78146

    p012 = 1/(2.2 − 2.1) · | 0.78682   −0.02 |
                           | 0.78412    0.08 |  = 0.78628

    p013 = 1/(2.3 − 2.1) · | 0.78682   −0.02 |
                           | 0.78146    0.18 |  = 0.78628

    p0123 = 1/(2.3 − 2.2) · | 0.78628   0.08 |
                            | 0.78628   0.18 |  = 0.78628
The boldfaced results in the table give the value of the interpolation at x =
2.12. The result 0.78682 is the value obtained by linear interpolation. The result
0.78628 is obtained by quadratic as well as by cubic interpolation. We conclude
that there is no improvement in the third degree polynomial over that of the second
degree.
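The whole Aitken table above can be reproduced programmatically; a minimal sketch (the function name is mine), checked against the data of this example:

```python
def aitken(xs, fs, x):
    """Aitken's scheme of iterated linear interpolation, Equations (2.4)-(2.6)."""
    p = list(fs)
    for k in range(1, len(xs)):
        for j in range(k, len(xs)):
            # p[k-1] interpolates x_0..x_{k-1}; p[j] interpolates x_0..x_{k-2}, x_j
            p[j] = (p[k - 1] * (xs[j] - x) - p[j] * (xs[k - 1] - x)) / (xs[j] - xs[k - 1])
    return p[-1]

val = aitken([2.0, 2.1, 2.2, 2.3], [0.7909, 0.7875, 0.7796, 0.7673], 2.12)
```

The intermediate values agree with the table, and val reproduces 0.78628 to five decimals.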
Notes: 1. Unlike Lagrange's method, it is not necessary to decide beforehand the
degree of the interpolating polynomial to be used.

2. The approximation by a higher degree interpolating polynomial may
not always lead to a better result. In fact, it may be even worse in some
cases.
Consider the function f (x) = 4ˣ.

We form the finite difference table with values for x = 0 to 4.

    x    f (x)    Δf (x)    Δ²f (x)    Δ³f (x)    Δ⁴f (x)
    0      1
                     3
    1      4                   9
                    12                    27
    2     16                  36                     81
                    48                   108
    3     64                 144
                   192
    4    256

    u = (x − x0)/h = x,

    φ(x) = 1 + 3x + (9/2) x(x − 1) + (27/6) x(x − 1)(x − 2) + (81/24) x(x − 1)(x − 2)(x − 3)
Now, consider values of φ(x) at x = 0.5 by taking successively higher and
higher degree polynomials.

Thus,

    x    y    Δy    Δ²y
    1    1
               0
    2    1            2
               2
    3    3            2
               4
    4    7

Since the differences of second order are constant, the interpolating polynomial
is of degree two. Using Newton's forward difference interpolation, we get

    y = y0 + u Δy0 + [u(u − 1)/2!] Δ²y0,

Here, x0 = 1, u = x − 1.

Thus,  y = 1 + (x − 1) × 0 + [(x − 1)(x − 2)/2] × 2 = x² − 3x + 3.
Example 2.3: Compute the value of f(7.5) by using suitable interpolation on the
following table of data.
    x       3    4     5     6     7     8
    f (x)  28   65   126   217   344   513
Solution: The data is equally spaced. Thus for computing f(7.5), we use Newton’s
backward difference interpolation. For this, we first form the finite difference table
as shown below.
    x    f (x)    Δf (x)    Δ²f (x)    Δ³f (x)
    3     28
                     37
    4     65                   24
                     61                     6
    5    126                   30
                     91                     6
    6    217                   36
                    127                     6
    7    344                   42
                    169
    8    513
The differences of order three are constant and hence we use Newton's
backward difference interpolating polynomial of degree three.

    f (x) = yn + v ∇yn + [v(v + 1)/2!] ∇²yn + [v(v + 1)(v + 2)/3!] ∇³yn,

    v = (x − xn)/h,  for x = 7.5, xn = 8

    v = (7.5 − 8)/1 = −0.5

    f (7.5) = 513 + (−0.5)(169) + [(−0.5)(0.5)/2](42) + [(−0.5)(0.5)(1.5)/6](6)
            = 513 − 84.5 − 5.25 − 0.375
            = 422.875
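The backward difference formula just applied can be sketched in Python (the function name and table layout are my own); it reproduces f (7.5) = 422.875 on this data.

```python
from math import factorial

def newton_backward(xs, ys, x):
    """Newton's backward difference interpolation for equally spaced xs."""
    h = xs[1] - xs[0]
    n = len(ys)
    diffs = [list(ys)]
    for _ in range(1, n):
        prev = diffs[-1]
        diffs.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    v = (x - xs[-1]) / h
    total, coeff = 0.0, 1.0
    for k in range(n):
        # last entry of the k-th difference row is the backward difference at x_n
        total += coeff * diffs[k][-1] / factorial(k)
        coeff *= (v + k)          # builds v(v+1)...(v+k)
    return total

val = newton_backward([3, 4, 5, 6, 7, 8], [28, 65, 126, 217, 344, 513], 7.5)
```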
Example 2.4: Determine the interpolating polynomial for the following data:

    x       2    4    6    8   10
    f (x)   5   10   17   29   50
Solution: The data being equally spaced, we use Newton’s forward difference
interpolation for computing f(0.23), and for computing f(0.29), we use Newton’s
backward difference interpolation. We first form the finite difference table,
    x       f (x)     Δf (x)    Δ²f (x)
    0.20    1.6596
                        102
    0.22    1.6698                  4
                        106
    0.24    1.6804                  2
                        108
    0.26    1.6912                  4
                        112
    0.28    1.7024                  3
                        115
    0.30    1.7139

(The differences are in units of the fourth decimal place.)
We observe that differences of order higher than two would be irregular. Hence,
we use a second degree interpolating polynomial. For computing f (0.23), we take
x0 = 0.22 so that

    u = (x − x0)/h = (0.23 − 0.22)/0.02 = 0.5

Using Newton's forward difference interpolation, we compute

    f (0.23) = 1.6698 + 0.5 × 0.0106 + [(0.5)(0.5 − 1.0)/2] × 0.0002
             = 1.6698 + 0.0053 − 0.000025
             = 1.675075
             ≈ 1.6751
Again for computing f (0.29), we take xn = 0.30,

so that  v = (x − xn)/h = (0.29 − 0.30)/0.02 = −0.5

Using Newton's backward difference interpolation we evaluate,

    f (0.29) = 1.7139 + (−0.5) × 0.0115 + [(−0.5)(−0.5 + 1.0)/2] × 0.0003
             = 1.7139 − 0.00575 − 0.00004
             = 1.70811
             ≈ 1.7081
Example 2.7: Compute values of ex at x = 0.02 and at x = 0.38 using suitable
interpolation formula on the table of data given below.
    x     0.0      0.1      0.2      0.3      0.4
    eˣ    1.0000   1.1052   1.2214   1.3499   1.4918
Solution: The data is equally spaced. We have to use Newton’s forward difference
interpolation formula for computing ex at x = 0.02, and for computing ex at
x = 0.38, we have to use Newton’s backward difference interpolation formula.
We first form the finite difference table (differences in units of the fourth decimal place).

    x      y = eˣ    Δy     Δ²y    Δ³y    Δ⁴y
    0.0    1.0000
                     1052
    0.1    1.1052            110
                     1162            13
    0.2    1.2214            123            −2
                     1285            11
    0.3    1.3499            134
                     1419
    0.4    1.4918
    u = (x − x0)/h = (0.02 − 0.0)/0.1 = 0.2

For x = 0.38, we take xn = 0.4, so that v = (0.38 − 0.4)/0.1 = −0.2.

By Newton's backward difference interpolation formula, we have

    e^0.38 = 1.4918 + (−0.2)(0.1419) + [(−0.2)(0.8)/2](0.0134)
             + [(−0.2)(0.8)(1.8)/6](0.0011) + [(−0.2)(0.8)(1.8)(2.8)/24](−0.0002)
           = 1.4918 − 0.02838 − 0.00107 − 0.00005 + 0.00001
           = 1.46231 ≈ 1.4623
2.2.2 Lagrange’s Interpolation Interpolation and
Curve Fitting
Lagrange’s interpolation is useful for unequally spaced tabulated values. Let y = f
(x) be a real valued function defined in an interval (a, b) and let y0, y1,..., yn be the
(n + 1) known values of y at x0, x1,...,xn, respectively. The polynomial (x), NOTES
which interpolates f (x), is of degree less than or equal to n. Thus,
    φ(xi) = yi,  for i = 0, 1, 2, ..., n                             (2.7)

The polynomial φ(x) is assumed to be of the form,

    φ(x) = Σ_{i=0}^{n} li(x) yi                                      (2.8)

where each li(x) is a polynomial of degree n in x and is called a Lagrangian
function.

Now, φ(x) satisfies Equation (2.7) if each li(x) satisfies,

    li(xj) = 0 when i ≠ j
           = 1 when i = j                                            (2.9)

Equation (2.9) suggests that li(x) vanishes at the n points x0, x1, ..., xi−1,
xi+1, ..., xn. Thus, we can write,

    li(x) = ci (x − x0)(x − x1) ... (x − xi−1)(x − xi+1) ... (x − xn)

where ci is a constant determined by li(xi) = 1,

i.e.,  ci (xi − x0)(xi − x1) ... (xi − xi−1)(xi − xi+1) ... (xi − xn) = 1

Thus,  li(x) = [(x − x0)(x − x1) ... (x − xi−1)(x − xi+1) ... (x − xn)] /
               [(xi − x0)(xi − x1) ... (xi − xi−1)(xi − xi+1) ... (xi − xn)],
       for i = 0, 1, 2, ..., n                                       (2.10)
Equations (2.8) and (2.10) together give Lagrange’s interpolating polynomial.
Algorithm: To compute f (x) by Lagrange's interpolation.
Step 1: Read n [n being the number of values]
Step 2: Read values of xi, fi for i = 1, 2, ..., n
Step 3: Set sum = 0, i = 1
Step 4: Read x [x being the interpolating point]
Step 5: Set j = 1, product = 1
Step 6: If j ≠ i, compute product = product × (x − xj)/(xi − xj)
Step 7: Set j = j + 1
Step 8: If j > n, go to Step 9, else go to Step 6
Step 9: Compute sum = sum + product × fi
Step 10: Set i = i + 1
Step 11: If i > n, go to Step 12, else go to Step 5
Step 12: Write x, sum
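The algorithm translates directly into Python; a sketch (the data of Example 2.8 below is used only as a check):

```python
def lagrange(xs, ys, x):
    """Lagrange interpolation, Equations (2.8) and (2.10)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        li = 1.0
        for j, xj in enumerate(xs):
            if j != i:                      # Step 6 of the algorithm
                li *= (x - xj) / (xi - xj)
        total += li * yi                    # Step 9
    return total

f04 = lagrange([0.3, 0.5, 0.6], [0.61, 0.69, 0.72], 0.4)   # data of Example 2.8
```

For this data the computed value is f (0.4) ≈ 0.6533.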
Example 2.8: Compute f (0.4) for the table below by Lagrange's interpolation.

    x       0.3    0.5    0.6
    f (x)   0.61   0.69   0.72
where

    l0(x) = [(x − 0)(x − 1)(x − 2)] / [(−1 − 0)(−1 − 1)(−1 − 2)] = −(1/6) x(x − 1)(x − 2)

    l1(x) = [(x + 1)(x − 1)(x − 2)] / [(0 + 1)(0 − 1)(0 − 2)] = (1/2)(x + 1)(x − 1)(x − 2)

    l2(x) = [(x + 1)(x − 0)(x − 2)] / [(1 + 1)(1 − 0)(1 − 2)] = −(1/2)(x + 1) x (x − 2)

    l3(x) = [(x + 1)(x − 0)(x − 1)] / [(2 + 1)(2 − 0)(2 − 1)] = (1/6)(x + 1) x (x − 1)

    f (x) = −(1/6) x(x − 1)(x − 2) × 1 + (1/2)(x + 1)(x − 1)(x − 2) × 1
            − (1/2)(x + 1) x (x − 2) × 1 + (1/6)(x + 1) x (x − 1) × 3

          = (1/6)(2x³ − 2x + 6)

          = (1/3)(x³ − x + 3)
Example 2.11: Evaluate the values of f (2) and f (6.3) using Lagrange's interpolation
formula for the table of values given below.

Since the computed result cannot be more accurate than the data, the final
result is rounded off to the same number of decimals as the data. In some cases,
a higher degree interpolating polynomial may not lead to better results.
2.2.3 Finite Difference for Interpolation
For interpolation of an unknown function when the tabular values of the argument
x are equally spaced, we have two important interpolation formulae, viz.,
(i) Newton's forward difference interpolation formula
(ii) Newton's backward difference interpolation formula

We will first discuss the finite differences which are used in evaluating the
above two formulae.
Finite Differences
Let us assume that values of a function y = f (x) are known for a set of equally
spaced values of x given by {x0, x1,..., xn}, such that the spacing between any
two consecutive values is equal. Thus, x1 = x0 + h, x2 = x1 + h,..., xn = xn–1 + h,
so that xi = x0 + ih for i = 1, 2, ...,n. We consider two types of differences known
as forward differences and backward differences of various orders. These
differences can be tabulated in a finite difference table as explained in the subsequent
sections.
Forward Differences
Let y0, y1, ..., yn be the values of a function y = f (x) at the equally spaced values
x = x0, x1, ..., xn. The differences between consecutive values of y, given by
y1 − y0, y2 − y1, ..., yn − yn−1, are called the first order forward differences of the
function y = f (x) at the points x0, x1, ..., xn−1. These differences are denoted by,

    Δy0 = y1 − y0,  Δy1 = y2 − y1, ...,  Δyn−1 = yn − yn−1          (2.11)

where Δ is termed the forward difference operator, defined by,

    Δf (x) = f (x + h) − f (x)                                       (2.12)

Thus, Δyi = yi+1 − yi, for i = 0, 1, 2, ..., n − 1, are the first order forward
differences at xi.
The differences of these first order forward differences are called the second
order forward differences.
Thus,  Δ²yi = Δ(Δyi) = Δyi+1 − Δyi,  for i = 0, 1, 2, ..., n − 2     (2.13)

Evidently,

    Δ²y0 = Δy1 − Δy0 = (y2 − y1) − (y1 − y0) = y2 − 2y1 + y0

And,  Δ²yi = yi+2 − 2yi+1 + yi                                       (2.14)

Similarly, for the third order,

    Δ³yi = yi+3 − 3yi+2 + 3yi+1 − yi                                 (2.15)
Finally, we can define the nth order forward difference by,

    Δⁿy0 = yn − n yn−1 + [n(n − 1)/2!] yn−2 − ... + (−1)ⁿ y0          (2.16)

The coefficients in the above equation are the coefficients of the binomial expansion
of (1 − x)ⁿ.
The forward differences of various orders for a table of values of a function
y = f (x), are usually computed and represented in a diagonal difference table. A
diagonal difference table for a table of values of y = f (x), for six points x0, x1, x2,
x3, x4, x5 is shown here.
Diagonal difference table for y = f (x):

    i    xi    yi    Δyi    Δ²yi    Δ³yi    Δ⁴yi    Δ⁵yi
    0    x0    y0
                     Δy0
    1    x1    y1            Δ²y0
                     Δy1             Δ³y0
    2    x2    y2            Δ²y1            Δ⁴y0
                     Δy2             Δ³y1            Δ⁵y0
    3    x3    y3            Δ²y2            Δ⁴y1
                     Δy3             Δ³y2
    4    x4    y4            Δ²y3
                     Δy4
    5    x5    y5
The entries in any column of the differences are computed as the differences
of the entries of the previous column and one placed in between them. The upper
data in a column is subtracted from the lower data to compute the forward
differences. We notice that the forward differences of various orders with respect
to yi are along the forward diagonal through it. Thus Δy0, Δ²y0, Δ³y0, Δ⁴y0 and
Δ⁵y0 lie along the top forward diagonal through y0. Consider the following example.
Example 2.12: Given the table of values of y = f (x),
    x    1    3    5    7    9
    y    8   12   21   36   62
form the diagonal difference table and find the values of Δf (5), Δ²f (3), Δ³f (1).
Solution: The diagonal difference table is,
    i    xi    yi    Δyi    Δ²yi    Δ³yi    Δ⁴yi
    0    1      8
                      4
    1    3     12             5
                      9               1
    2    5     21             6               4
                     15               5
    3    7     36            11
                     26
    4    9     62
From the table, we find that Δf (5) = 15, the entry along the diagonal through
the entry 21 of f (5). Similarly, Δ²f (3) = 6, the entry along the diagonal through
f (3). Finally, Δ³f (1) = 1.
Backward Differences
The backward differences of various orders for a table of values of a function
y = f (x) are defined in a manner similar to the forward differences. The backward
difference operator ∇ is defined by ∇f (x) = f (x) − f (x − h).

Thus,  ∇yk = yk − yk−1,  for k = 1, 2, ..., n
i.e.,  ∇y1 = y1 − y0,  ∇y2 = y2 − y1, ...,  ∇yn = yn − yn−1          (2.17)

The backward differences of second order are defined by,

    ∇²yk = ∇yk − ∇yk−1 = yk − 2yk−1 + yk−2

Hence,  ∇²y2 = y2 − 2y1 + y0,  and  ∇²yn = yn − 2yn−1 + yn−2         (2.18)

Higher order backward differences can be defined in a similar manner.

Thus,  ∇³yn = yn − 3yn−1 + 3yn−2 − yn−3, etc.                        (2.19)
Finally,

    ∇ⁿyn = yn − n yn−1 + [n(n − 1)/2!] yn−2 − ... + (−1)ⁿ y0          (2.20)
The backward differences of various orders can be computed and placed in a
diagonal difference table. The backward differences at a point are then found
along the backward diagonal through the point. The following table shows the
backward differences entries.
Diagonal difference table of backward differences:

    i    xi    yi    ∇yi    ∇²yi    ∇³yi    ∇⁴yi    ∇⁵yi
    0    x0    y0
                     ∇y1
    1    x1    y1            ∇²y2
                     ∇y2             ∇³y3
    2    x2    y2            ∇²y3            ∇⁴y4
                     ∇y3             ∇³y4            ∇⁵y5
    3    x3    y3            ∇²y4            ∇⁴y5
                     ∇y4             ∇³y5
    4    x4    y4            ∇²y5
                     ∇y5
    5    x5    y5
The entries along a column in the table are computed (as discussed in the previous
example) as the differences of the entries in the previous column and are placed in
between. We notice that the backward differences of various orders with respect
to yi are along the backward diagonal through it. Thus, ∇y5, ∇²y5, ∇³y5, ∇⁴y5 and
∇⁵y5 lie along the lowest backward diagonal through y5.
We may note that the data entries of the backward difference table in any
column are the same as those of the forward difference table, but the differences
are for different reference points.
Specifically, if we compare the columns of first order differences we can see
that,

    Δy0 = ∇y1,  Δy1 = ∇y2, ...,  Δyn−1 = ∇yn

Similarly,  Δ²y0 = ∇²y2,  Δ²y1 = ∇²y3, ...,  Δ²yn−2 = ∇²yn

In general,  Δᵏyi = ∇ᵏyi+k.  Conversely,  ∇ᵏyi = Δᵏyi−k.
Example 2.13: Given the following table of values of y = f (x):
    x    1    3    5    7    9
    y    8   12   21   36   62
    xi    yi    ∇yi    ∇²yi    ∇³yi    ∇⁴yi
    1      8
                  4
    3     12             5
                  9               1
    5     21             6               4
                 15               5
    7     36            11
                 26
    9     62

From the table, we can easily find ∇y(7) = 15, ∇²y(9) = 11, ∇³y(9) = 5.
    Δy(x) = y(x + h) − y(x)
          = Ey(x) − y(x)
          = (E − 1) y(x)

This leads to the operator relation,

    Δ = E − 1,  or  E = 1 + Δ                                        (2.27)
Similarly, for the second order forward difference, we have

    Δ²y(x) = Δy(x + h) − Δy(x)
           = y(x + 2h) − 2y(x + h) + y(x)
           = E²y(x) − 2Ey(x) + y(x)
           = (E² − 2E + 1) y(x)

For the backward differences,

    ∇²f (x) = ∇f (x) − ∇f (x − h)
            = [f (x) − f (x − h)] − [f (x − h) − f (x − 2h)]
            = f (x) − 2f (x − h) + f (x − 2h)
            = f (x) − 2E⁻¹ f (x) + E⁻² f (x)
            = (1 − 2E⁻¹ + E⁻²) f (x)
            = (1 − E⁻¹)² f (x)

In general,  ∇ᵐ = (1 − E⁻¹)ᵐ                                          (2.30)
Relations between the operators E, D and Δ

We have by Taylor's theorem,

    f (x + h) = f (x) + h f ′(x) + (h²/2!) f ″(x) + ...

Thus,  Ef (x) = f (x) + hD f (x) + (h²D²/2!) f (x) + ...,  where D ≡ d/dx

or,  (1 + Δ) f (x) = [1 + hD + (h²D²/2!) + ...] f (x)
                   = e^(hD) f (x)

Thus,  e^(hD) = 1 + Δ = E                                             (2.31)

Also,  hD = log (1 + Δ)

or,  hD = Δ − Δ²/2 + Δ³/3 − Δ⁴/4 + ...

    D = (1/h) [ Δ − Δ²/2 + Δ³/3 − Δ⁴/4 + ... ]
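The truncated series for D can be checked numerically: applying the first four forward-difference terms should reproduce a derivative to high accuracy. A sketch, with the test function and step size chosen by me for illustration:

```python
import math

def forward_diffs(f, x, h, order):
    """Forward differences Δf(x), Δ²f(x), ... up to the given order."""
    vals = [f(x + i * h) for i in range(order + 1)]
    out = []
    for _ in range(order):
        vals = [vals[i + 1] - vals[i] for i in range(len(vals) - 1)]
        out.append(vals[0])       # leading entry is the difference at x
    return out

h = 0.1
d = forward_diffs(math.sin, 0.5, h, 4)
# hD = Δ - Δ²/2 + Δ³/3 - Δ⁴/4 + ...  =>  f'(x) ≈ (Δ - Δ²/2 + Δ³/3 - Δ⁴/4) f(x) / h
deriv = (d[0] - d[1] / 2 + d[2] / 3 - d[3] / 4) / h
```

The result agrees with cos(0.5) to about five decimal places even with the coarse step h = 0.1.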
The central difference operator δ is defined by,

    δyn = E^(1/2) yn − E^(−1/2) yn = yn+1/2 − yn−1/2

i.e.,  δ = E^(1/2) − E^(−1/2)

Thus,  δ²yn = (E^(1/2) − E^(−1/2))² yn = yn+1 − 2yn + yn−1
            = (E − 2 + E^(−1)) yn

    δ² = E + E^(−1) − 2                                               (2.32)

Even though the central difference operator uses fractional arguments, it is still
widely used. It is related to the averaging operator μ, defined by,

    μ = (1/2) (E^(1/2) + E^(−1/2))                                    (2.33)

Squaring,  μ² = (1/4) (E + E^(−1) + 2) = (1/4) (δ² + 2 + 2)

    μ² = 1 + (1/4) δ²                                                 (2.34)

It may be noted that,  δy1/2 = y1 − y0 = Δy0
Also,  δE^(1/2) y1 = δ y3/2 = y2 − y1 = Δy1

    δE^(1/2) = Δ                                                      (2.35)

Further,

    δ³yn = δ(δ²yn) = δ²(δyn) = δ²(yn+1/2 − yn−1/2)
         = δ²yn+1/2 − δ²yn−1/2 = δ(yn+1 − 2yn + yn−1)
Thus,  ∇ = (E − 1)/E,  or  ∇E = E − 1

Hence proved.

(ii) From Equation (1), we have E = 1 + Δ                             (3)
and from Equation (2) we get E⁻¹ = 1 − ∇                              (4)

Combining Equations (3) and (4), we get (1 + Δ)(1 − ∇) = 1.
Example 2.15: If fi is the value of f (x) at xi, where xi = x0 + ih, for i = 1, 2, ...,
prove that,

    fi = Eⁱ f0 = Σ_{j=0}^{i} C(i, j) Δʲ f0

where C(i, j) denotes the binomial coefficient.

Solution: By Taylor's theorem,

    Ef (x) = f (x) + h f ′(x) + (h²/2!) f ″(x) + ...
           = f (x) + hD f (x) + (h²D²/2!) f (x) + ...,  where D ≡ d/dx

    (1 + Δ) f (x) = [1 + hD + (h²D²/2!) + ...] f (x),  since E = 1 + Δ
                  = e^(hD) f (x)

Thus,  1 + Δ = e^(hD)

    fi = Eⁱ f (x0) = (1 + Δ)ⁱ f (x0),  since E = 1 + Δ

    fi = Σ_{j=0}^{i} C(i, j) Δʲ f0,  using the binomial expansion.

Hence proved.
Example 2.16: Compute the following differences:
(i) Δⁿeˣ  (ii) Δⁿxⁿ

Solution:
(i) We have,  Δeˣ = e^(x+h) − eˣ = eˣ (e^h − 1)

Thus by induction,  Δⁿeˣ = (e^h − 1)ⁿ eˣ.

(ii) We have,

    Δ(xⁿ) = (x + h)ⁿ − xⁿ
          = n h x^(n−1) + [n(n − 1)/2!] h² x^(n−2) + ... + hⁿ
(ii) Δ{log f (x)} = log [1 + Δf (x)/f (x)]

Solution:
(i) We have,

    Δ[f (x)/g(x)] = f (x + h)/g(x + h) − f (x)/g(x)
                  = [f (x + h) g(x) − f (x) g(x + h)] / [g(x + h) g(x)]
                  = [f (x + h) g(x) − f (x) g(x) + f (x) g(x) − f (x) g(x + h)] / [g(x + h) g(x)]
                  = [g(x){f (x + h) − f (x)} − f (x){g(x + h) − g(x)}] / [g(x) g(x + h)]
                  = [g(x) Δf (x) − f (x) Δg(x)] / [g(x) g(x + h)]

(ii) We have,

    Δ{log f (x)} = log{f (x + h)} − log{f (x)}
                 = log [f (x + h)/f (x)]
                 = log [{f (x) + Δf (x)}/f (x)]
                 = log [1 + Δf (x)/f (x)]
Example 2.18: Compute the horizontal difference table for the following data
and hence, write down the values of Δf (4), Δ²f (3) and Δ³f (5).

    x       1    2    3     4     5
    f (x)   3   18   83   258   627
Solution: The horizontal difference table for the given data is as follows:

    x    f (x)    Δf (x)    Δ²f (x)    Δ³f (x)    Δ⁴f (x)
    1      3
    2     18        15
    3     83        65         50
    4    258       175        110         60
    5    627       369        194         84         24

From the table we read the required values and get the following result:

    Δf (4) = 175,  Δ²f (3) = 50,  Δ³f (5) = 84
Example 2.19: Form the difference table of f (x) on the basis of the following
table and show that the third differences are constant. Hence, conclude about the
degree of the interpolating polynomial.
    x       0    1    2    3    4
    f (x)   5    6   13   32   69
Solution: The difference table is as follows:

    x    f (x)    Δf (x)    Δ²f (x)    Δ³f (x)
    0      5
                     1
    1      6                   6
                     7                    6
    2     13                  12
                    19                    6
    3     32                  18
                    37
    4     69

It is clear from the above table that the third differences are constant and
hence, the degree of the interpolating polynomial is three.
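Difference tables like the two above are easily generated by machine; a minimal sketch, checked against the data of Example 2.19:

```python
def difference_table(ys):
    """Successive rows of forward differences of a list of equally spaced values."""
    rows = [list(ys)]
    while len(rows[-1]) > 1:
        prev = rows[-1]
        rows.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return rows

rows = difference_table([5, 6, 13, 32, 69])   # data of Example 2.19
```

Here rows[3] comes out as [6, 6] and rows[4] as [0], confirming that the third differences are constant.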
or,  b1 = (yn − yn−1)/h = ∇yn/h                                       (2.42)
Again, φ(xn−2) = yn−2 gives

    yn−2 = b0 + b1 (xn−2 − xn) + b2 (xn−2 − xn)(xn−2 − xn−1)

or,  yn−2 = yn + (∇yn/h)(−2h) + b2 (−2h)(−h)

    b2 = (yn − 2yn−1 + yn−2)/(2h²) = ∇²yn/(2! h²)                     (2.43)
By induction, or by proceeding as mentioned earlier, we have

    b3 = ∇³yn/(3! h³),  b4 = ∇⁴yn/(4! h⁴), ...,  bn = ∇ⁿyn/(n! hⁿ)     (2.44)

Substituting the expressions for bi in Equation (2.39), we get

    φ(x) = yn + (x − xn) ∇yn/h + (x − xn)(x − xn−1) ∇²yn/(2! h²) + ...
           + (x − xn)(x − xn−1) ... (x − x1) ∇ⁿyn/(n! hⁿ)              (2.45)
This formula is known as Newton’s backward difference interpolation formula.
It uses the backward differences along the backward diagonal in the difference
table.
Introducing a new variable v = (x − xn)/h,

we have,  (x − xn−1)/h = [x − (xn − h)]/h = v + 1.

Similarly,  (x − xn−2)/h = v + 2, ...,  (x − x1)/h = v + n − 1.

Thus, the interpolating polynomial in Equation (2.45) may be rewritten as,

    φ(x) = yn + v ∇yn + [v(v + 1)/2!] ∇²yn + ... + [v(v + 1) ... (v + n − 1)/n!] ∇ⁿyn   (2.46)

This formula is generally used for interpolation at a point near the end of a
table.
The error in the given interpolation formula may be written as,

    E(x) = f (x) − φ(x)
         = (x − xn)(x − xn−1) ... (x − x1)(x − x0) f⁽ⁿ⁺¹⁾(ξ)/(n + 1)!,  where x0 < ξ < xn
         = v(v + 1)(v + 2) ... (v + n) h^(n+1) y⁽ⁿ⁺¹⁾(ξ)/(n + 1)!
2.2.10 Extrapolation
The interpolating polynomials are usually used for finding values of the tabulated
function y = f(x) for a value of x within the table. But, they can also be used in
some cases for finding values of f(x) for values of x near to the end points x0 or xn
outside the interval [x0, xn]. This process of finding values of f(x) at points beyond
the interval is termed as extrapolation. We can use Newton’s forward difference
interpolation for points near the beginning value x0. Similarly, for points near the
end value xn, we use Newton’s backward difference interpolation formula.
Example 2.20: With the help of appropriate interpolation formulae, find from the
following data the weight of a baby at the age of two years and of ten years:

    Age x            3    5    7    9
    Weight y (kg)    5    8   12   17
Solution: Since the values of x are equidistant, we form the finite difference table
for using Newton's forward difference interpolation formula to compute the weight
of the baby at the required ages.

    x    y    Δy    Δ²y
    3    5
               3
    5    8            1
               4
    7   12            1
               5
    9   17
Taking x = 2,  u = (x − x0)/h = (2 − 3)/2 = −0.5.

Newton's forward difference interpolation gives,

    y at x = 2,  y(2) = 5 + (−0.5) × 3 + [(−0.5)(−0.5 − 1)/2] × 1
                      = 5 − 1.5 + 0.38 = 3.88 ≈ 3.9 kg.
Similarly, for computing the weight of the baby at the age of ten years, we use
Newton's backward difference interpolation given by,

    v = (x − xn)/h = (10 − 9)/2 = 0.5

    y at x = 10,  y(10) = 17 + 0.5 × 5 + [(0.5)(0.5 + 1)/2] × 1
                        = 17 + 2.5 + 0.38 = 19.88 kg.
unique, if y = f (x) is a single valued function of x and dy/dx exists and does not
vanish in the neighbourhood of the point where inverse interpolation is desired.
When the values of x are unequally spaced, we can apply Lagrange's
interpolation or iterative linear interpolation simply by interchanging the roles of x
and y. Thus, Lagrange's formula for inverse interpolation can be written as,

    x = Σ_{i=0}^{n} li(y) xi

where  li(y) = Π_{j=0, j≠i}^{n} [(y − yj)/(yi − yj)]

When the x values are equally spaced, we can apply the method of successive
approximation as described below.
Consider Newton’s formula for forward difference interpolation given by, Interpolation and
Curve Fitting
u (u 1) 2 u (u 1)(u 2) 3
y y 0 u y 0 y0 y0 ...
2! 3!
Retaining only two terms on the RHS, we can write the first approximation, NOTES
1
u (1) ( y y0 )
y0
1 u (1) (u (1) 1) 2
u ( 2) ( y y 0 ) y0
y 0 2
1 u ( 2) (u ( 2) 1) 2 u ( 2) (u ( 2) 1)(u ( 2) 2) 3
u (3) y y 0 y 0 y0
yo 2 6
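The successive-approximation scheme above is a fixed-point loop; a sketch in Python. The test table (y = x² at x = 1.0, 1.1, 1.2, 1.3, so Δy0 = 0.21, Δ²y0 = 0.02, Δ³y0 = 0) is my own illustration, not from the text.

```python
def inverse_interp(y, y0, d1, d2, d3, iters=10):
    """Solve for u in Newton's forward formula by successive approximation.
    d1, d2, d3 are the forward differences Δy0, Δ²y0, Δ³y0."""
    u = (y - y0) / d1                      # first approximation u(1)
    for _ in range(iters):
        # each new u feeds back into the correction terms
        u = ((y - y0) - u * (u - 1) / 2 * d2
             - u * (u - 1) * (u - 2) / 6 * d3) / d1
    return u

# find x with x**2 = 1.3 from the hypothetical table described above
u = inverse_interp(1.3, 1.0, 0.21, 0.02, 0.0)
x = 1.0 + 0.1 * u
```

Since quadratic interpolation reproduces x² exactly, the iteration converges to x = √1.3.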
    x    1     3     4
    y    3    12    19
Solution: We first form the Divided Difference (DD) table as given below.

    x    f (x)    1st DD    2nd DD
    4     −43
                     42
    7      83                  16
                    122
    9     327
Newton’s divided difference interpolation formula is, Interpolation and
Curve Fitting
f ( x) f ( x0 ) ( x x0 ) f ( x0 , x1 ) ( x x0 ) ( x x1 ) f x0 , x1 , x2
f ( x) 43 ( x 4) 42 ( x 4) ( x 7) 16
16 x 2 134 x 237
    x       y = log₁₀ x    Δy     Δ²y
    1000    3.00000
                           432
    1010    3.00432                 −4
                           428
    1020    3.00860                 −4
                           424
    1030    3.01284                 −5
                           419
    1040    3.01703

(The differences are in units of the fifth decimal place.)
We observe that the differences of second order are nearly constant. Thus,
the degree of the interpolating polynomial is 2 and it is given by,

    y = y0 + u Δy0 + [u(u − 1)/2] Δ²y0,  where u = (x − x0)/h

For x = 1001, we take x0 = 1000.

    u = (1001 − 1000)/10 = 0.1

    log₁₀ 1001 = 3.00000 + 0.1 × 0.00432 + [(0.1)(0.1 − 1)/2] × (−0.00004)
               = 3.000434 ≈ 3.00043
Example 2.25: Determine the interpolating polynomial for the following data table
using both forward and backward difference interpolating formulae. Comment on
the result.
    x       0     1     2      3      4
    f (x)   1.0   8.5   36.0   95.5   199.0
Solution: Since the data points are equally spaced, we construct Newton's
forward difference interpolating polynomial, for which we first form the finite
difference table as given below:

    x      f (x)    Δf (x)    Δ²f (x)    Δ³f (x)
    0       1.0
                      7.5
    1.0     8.5                 20.0
                     27.5                  12.0
    2.0    36.0                 32.0
                     59.5                  12.0
    3.0    95.5                 44.0
                    103.5
    4.0   199.0
Since the differences of order 3 are constant, we construct the third degree
Newton's forward difference interpolating polynomial given by,

    f (x) = 1.0 + x × 7.5 + [x(x − 1)/2] × 20 + [x(x − 1)(x − 2)/6] × 12

since x0 = 0, h = 1.0 and u = (x − x0)/h = x. On simplification, this gives
f (x) = 1.0 + 1.5x + 4x² + 2x³.

Similarly, Newton's backward difference interpolating polynomial is,

    f (x) = 199 + (x − 4) × 103.5 + [(x − 4)(x − 3)/2] × 44 + [(x − 4)(x − 3)(x − 2)/6] × 12
          = 1.0 + 1.5x + 4x² + 2x³,  on simplification.

Both formulae yield the same polynomial, as they must, since the interpolating
polynomial through a given set of points is unique.
Example 2.26: Using Newton's divided difference interpolation, compute f (8)
from the following table:

    x       4     5     7    10    11    13
    f (x)  48   100   294   900  1210  2028
Solution: We first form the divided difference table. Since the 3rd order divided
differences are all equal, the higher order divided differences vanish. We have
Newton's divided difference interpolation given by,

    f (x) = f0 + (x − x0) f [x0, x1] + (x − x0)(x − x1) f [x0, x1, x2]
            + (x − x0)(x − x1)(x − x2) f [x0, x1, x2, x3]

For x = 8, we take x0 = 4,

    f (8) = 48 + (8 − 4) × 52 + (8 − 4)(8 − 5) × 15 + (8 − 4)(8 − 5)(8 − 7) × 1
          = 48 + 208 + 180 + 12 = 448
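The divided-difference computation generalizes to any number of points; a sketch (function name mine), checked against this example:

```python
def divided_diff_interp(xs, fs, x):
    """Evaluate Newton's divided difference interpolating polynomial at x."""
    n = len(xs)
    coef = list(fs)
    # build f[x0], f[x0,x1], f[x0,x1,x2], ... in place
    for k in range(1, n):
        for i in range(n - 1, k - 1, -1):
            coef[i] = (coef[i] - coef[i - 1]) / (xs[i] - xs[i - k])
    total, prod = 0.0, 1.0
    for k in range(n):
        total += coef[k] * prod
        prod *= (x - xs[k])
    return total

f8 = divided_diff_interp([4, 5, 7, 10, 11, 13], [48, 100, 294, 900, 1210, 2028], 8)
```

This reproduces f (8) = 448.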
Example 2.27: Using inverse interpolation, find the zero of f (x) given by the
following tabular values.

Solution: Interchanging the roles of x and y in Lagrange's formula and putting
y = 0, the computation gives

    P3(0) = 0.475

Thus, the zero of f (x) is 0.475, which is approximately equal to 0.48, since the
accuracy of the result depends on the accuracy of the data, i.e., on its significant
digits.
    S = Σ_{i=1}^{n} [g(xi) − yi]²                                     (2.47)

The function g(x) may involve some parameters α, β, γ. In order to determine
these parameters, we have to form the necessary conditions for S to be minimum,
which are

    ∂S/∂α = 0,  ∂S/∂β = 0,  ∂S/∂γ = 0                                 (2.48)

These equations are called normal equations, solving which we get the
parameters for the best approximate function g(x).
and,  ∂S/∂b = 0,  i.e.,  Σ_{i=1}^{n} xi (a + b xi − yi) = 0           (2.51)

Solving,

    b = (n S11 − S1 S01)/(n S2 − S1²),  a = (S01 − b S1)/n

where S1 = Σ xi, S2 = Σ xi², S01 = Σ yi and S11 = Σ xi yi.
Algorithm: Fitting a straight line y = a + bx.
Step 1: Read n [n being the number of data points]
Step 2: Initialize: sum x = 0, sum x2 = 0, sum y = 0, sum xy = 0
Step 3: For j = 1 to n compute
Begin
Read data xj, yj
Compute sum x = sum x + xj
Compute sum x2 = sum x2 + xj × xj
Compute sum y = sum y + yj
Compute sum xy = sum xy + xj × yj
End
Step 4: Compute b = (n × sum xy − sum x × sum y)/(n × sum x2 − (sum x)²)
Step 5: Compute x bar = sum x/n
Step 6: Compute y bar = sum y/n
Step 7: Compute a = y bar − b × x bar
Step 8: Write a, b
Step 9: For j = 1 to n
Begin
Compute y estimate = a + b × xj
Write xj, yj, y estimate
End
Step 10: Stop
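The same algorithm in Python (the function name is illustrative); applying it to the data of Example 2.28 below gives a = 15.448 and b = −0.429.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a + b*x via the closed-form normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)   # Step 4
    a = sy / n - b * sx / n                         # a = y bar - b * x bar
    return a, b

# data of Example 2.28
a, b = fit_line([4, 6, 8, 10, 12], [13.72, 12.90, 12.01, 11.14, 10.31])
```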
Curve Fitting by a Quadratic (A Parabola): Let g(x) = a + bx + cx² be the
approximating quadratic to fit a set of data (xi, yi), i = 1, 2, ..., n. Here the
parameters are to be determined by the method of least squares, i.e., by minimizing
the sum of the squares of the deviations given by,
n
S ( a bxi cxi2 yi ) 2
i 1
(2.52)
S S S
Thus the normal equations, 0, 0, 0, are as follows:
a b c
(2.53)
n
(a bx cx
i 1
i
2
i yi ) 0
n
x (a bx
i 1
i i cxi2 yi ) 0
n
x
i 1
2
i (a bxi cxi2 yi ) 0. (2.54)
na s1b s2 c s01 0
s1a s2b s3c s11 0
s2 a s3b s4 c s21 0 (2.55)
n n n n
It is clear that the normal equations form a system of linear equations in the
unknown parameters a, b, c. The computation of the coefficients of the normal
equations can be made in a tabular form for desk computations as shown below.
xi 4 6 8 10 12
y1 13.72 12.90 12.01 11.14 10.31
NOTES
Solution: Let y = a + bx, be the straight line which fits the data. We have the
S S
normal equations 0, 0 for determining a and b, where
a b
5
S ( y a bx )
i 1
i i
2
.
5 5
Thus, y
i 1
i na b xi 0
i 1
5 5 5
x y a xi b xi 0
2
and , i i
i 1 I 1 i 1
xi 60 61 62 63 64
yi 40 40 48 52 55
Solution: Let the straight line fitting the data be y = a + bx. The data values being
large, we can use a change in variable by substituting u = x – 62 and v = y – 48.
Let v = A + B u, be a straight line fitting the transformed data, where the
normal equations for A and B are,
5 5
i 1
vi 5 A B u
i 1
i
5 5 5
76
Self - Learning
Material
i 1
u i vi A
i 1
ui B u
i 1
2
i
The computation of the various sums are given in the table below, Interpolation and
Curve Fitting
xi yi ui vi ui vi ui2
60 40 –2 –8 16 4
61 42 1 6 6 1 NOTES
62 48 0 0 0 0
63 52 1 4 4 1
64 55 2 7 14 4
Sum 0 3 40 10
n n
and, z px , where x xi n , z log yi n (2.62)
i 1 i 1
Fig. 2.2  [Revolving line OP making angle θ with OX; M is the foot of the perpendicular from P on OX]
Remarks:
1. It follows from the definition that
sec θ = 1/cos θ, cosec θ = 1/sin θ, cot θ = 1/tan θ,
tan θ = sin θ/cos θ, cot θ = cos θ/sin θ.
2. Trigonometrical ratios are the same for the same angle. For, let P′ be any point on the revolving line OP. Draw P′M′ ⊥ OX. Then triangles OPM and OP′M′ are similar, so MP/OP = M′P′/OP′, i.e., each of these ratios is sin θ.
Fig. 2.3  [Similar triangles OPM and OP′M′ formed by points P and P′ on the revolving line]
For Any Angle θ
1. sin²θ + cos²θ = 1
2. sec²θ = 1 + tan²θ
3. cosec²θ = 1 + cot²θ
Proof. Let the revolving line OP start from OX and trace out an angle θ in the anti-clockwise direction. From P draw PM ⊥ OX. (Produce OX, if necessary.) (Refer Figure 2.2).
Then ∠XOP = θ.
(1) sin θ = MP/OP, cos θ = OM/OP
Then sin²θ + cos²θ = [(MP)² + (OM)²]/(OP)² = (OP)²/(OP)² = 1.
(2) sec θ = OP/OM, tan θ = MP/OM
Then 1 + tan²θ = 1 + (MP)²/(OM)² = [(OM)² + (MP)²]/(OM)² = (OP)²/(OM)² = (OP/OM)² = (sec θ)² = sec²θ.
(3) cot θ = OM/MP, cosec θ = OP/MP
Then 1 + cot²θ = 1 + (OM)²/(MP)² = [(MP)² + (OM)²]/(MP)² = (OP)²/(MP)² = (OP/MP)² = (cosec θ)² = cosec²θ.
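The three identities above can be checked numerically at any angle; a minimal sketch (the variable names are our own):

```python
# A quick numerical check of the three identities at an arbitrary angle.
import math

theta = 0.7  # any angle (in radians) at which all six ratios are defined

sin_t, cos_t, tan_t = math.sin(theta), math.cos(theta), math.tan(theta)
sec_t, cosec_t, cot_t = 1 / cos_t, 1 / sin_t, 1 / tan_t

assert math.isclose(sin_t ** 2 + cos_t ** 2, 1.0)
assert math.isclose(sec_t ** 2, 1 + tan_t ** 2)
assert math.isclose(cosec_t ** 2, 1 + cot_t ** 2)
print("all three identities hold at theta =", theta)
```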
Signs of Trigonometrical Ratios
Consider four lines OX, OX′, OY, OY′ at right angles to each other. Let a revolving line OP start from OX in the anticlockwise direction. From P draw PM ⊥ OX or OX′. We have the following convention of signs regarding the sides of △OPM.
1. OM is positive, if it is along OX.
2. OM is negative, if it is along OX′.
3. MP is negative, if it is along OY′.
4. MP is positive, if it is along OY.
5. OP is regarded always positive.
Fig. 2.4  [The four quadrants formed by X′OX and Y′OY, with the position of P shown in each quadrant]
First Quadrant: If the revolving line OP is in the first quadrant, then all the sides of the triangle OPM are positive. Therefore, all the trigonometrical ratios are positive in the first quadrant.
Second Quadrant: If the revolving line OP is in the second quadrant, then OM is negative and the other two sides of △OPM are positive. Therefore, ratios involving OM will be negative. So, cosine, secant, tangent and cotangent of an angle in the second quadrant are negative, while sine and cosecant of an angle in the second quadrant are positive.
Third Quadrant: If the revolving line is in the third quadrant, then the sides OM and MP are both negative. Since OP is always positive, ratios involving exactly one of OM and MP will be negative. So, sine, cosine, cosecant and secant of an angle in the third quadrant are negative. Since tangent and cotangent of any angle involve the ratio of MP and OM, whose negative signs cancel, these will be positive. So, tangent and cotangent of an angle in the third quadrant are positive.
Fourth Quadrant: If the revolving line OP is in the fourth quadrant, then MP is
negative and the other two sides of OPM are positive. Therefore, ratios involving
MP will be negative and others positive. So, sine, cosecant, tangent and cotangent
of an angle in the fourth quadrant are negative while cosine and secant of an angle
in the fourth quadrant are positive.
Limits to the Value of Trigonometrical Ratios
We know that sin²θ + cos²θ = 1 for any angle θ. sin²θ and cos²θ, being perfect squares, will be positive. Again, neither of them can be greater than 1, because then the other would have to be negative.
Thus sin²θ ≤ 1, cos²θ ≤ 1.
∴ sin θ and cos θ cannot be numerically greater than 1.
Similarly, cosec θ = 1/sin θ and sec θ = 1/cos θ cannot be numerically less than 1.
There is no restriction on tan θ and cot θ. They can have any value.
Example 2.30: Prove that sin⁶θ + cos⁶θ = 1 − 3 sin²θ cos²θ.
Solution: Here LHS = sin⁶θ + cos⁶θ
= (sin²θ)³ + (cos²θ)³
= (sin²θ + cos²θ)(sin⁴θ − sin²θ cos²θ + cos⁴θ)
= 1 · (sin⁴θ − sin²θ cos²θ + cos⁴θ)
= [(sin²θ + cos²θ)² − 3 sin²θ cos²θ]
= 1 − 3 sin²θ cos²θ = RHS.
Example 2.31: Prove that √[(1 + cos θ)/(1 − cos θ)] = cosec θ + cot θ, provided cos θ ≠ 1.
Solution: LHS = √[(1 + cos θ)/(1 − cos θ)]
= √[(1 + cos θ)(1 + cos θ)/((1 − cos θ)(1 + cos θ))] = √[(1 + cos θ)²/(1 − cos²θ)]
= (1 + cos θ)/sin θ = 1/sin θ + cos θ/sin θ = cosec θ + cot θ.
Example 2.32: Prove that (1 + cot θ − cosec θ)(1 + tan θ + sec θ) = 2.
Solution: LHS = (1 + cot θ − cosec θ)(1 + tan θ + sec θ)
= (1 + cos θ/sin θ − 1/sin θ)(1 + sin θ/cos θ + 1/cos θ)
= [(sin θ + cos θ − 1)(cos θ + sin θ + 1)]/(sin θ cos θ)
= [(sin θ + cos θ)² − 1]/(sin θ cos θ)
= [sin²θ + cos²θ + 2 sin θ cos θ − 1]/(sin θ cos θ)
= (2 sin θ cos θ)/(sin θ cos θ) = 2 = RHS.
Example 2.34: Which of the six trigonometrical ratios are positive for (i) 960º (ii) −560º?
Solution: (i) 960º = 720º + 240º.
Therefore, the revolving line starting from OX will make two complete revolutions in the anticlockwise direction and further trace out an angle of 240º in the same direction. Thus, it will be in the third quadrant. So, the tangent and cotangent are positive and the rest of the trigonometrical ratios will be negative.
(ii) – 560º = – 360º – 200º.
Therefore, the revolving line, after making one complete revolution in the clockwise direction, will trace out further an angle of 200º in the same direction.
Thus, it will be in the second quadrant. So, only sine and cosecant are positive.
Example 2.35: In what quadrants can θ lie if sec θ = −7/6?
Solution: As sec θ is negative in the second and third quadrants, θ can lie in the second or third quadrant only.
Example 2.36: If sin θ = 12/13, determine the other trigonometrical ratios of θ.
Solution: cos²θ = 1 − sin²θ = 1 − 144/169 = (169 − 144)/169 = 25/169.
∴ cos θ = ±5/13.
So tan θ = sin θ/cos θ = ±12/5,
cosec θ = 13/12, sec θ = ±13/5, cot θ = ±5/12.
Example 2.37: Express all the trigonometrical ratios of θ in terms of sin θ.
Solution: Let sin θ = k.
Then, cos²θ = 1 − sin²θ = 1 − k²  ⇒  cos θ = ±√(1 − k²)
tan θ = sin θ/cos θ = ±k/√(1 − k²)
cot θ = cos θ/sin θ = ±√(1 − k²)/k
sec θ = 1/cos θ = ±1/√(1 − k²)
cosec θ = 1/sin θ = 1/k.
Example 2.38: Prove that sin θ = a + 1/a is impossible, if a is real.
Solution: sin θ = a + 1/a  ⇒  sin θ = (a² + 1)/a
⇒ a² − a sin θ + 1 = 0
⇒ a = [sin θ ± √(sin²θ − 4)]/2
For a to be real, the expression under the radical sign must be positive or zero,
i.e., sin²θ − 4 ≥ 0
or sin²θ ≥ 4  ⇒  sin θ is numerically greater than or equal to 2, which is impossible.
Thus, if a is real, sin θ = a + 1/a is impossible.
Example 2.39: Determine the quadrant in which θ must lie if cot θ is positive and cosec θ is negative.
Solution: cot θ is positive ⇒ θ lies in the first or third quadrant.
cosec θ is negative ⇒ θ lies in the third or fourth quadrant.
In order that cot θ is positive and cosec θ is negative, θ must lie in the third quadrant.
Example 2.40: Prove that
1/(cosec θ + cot θ) − 1/sin θ = 1/sin θ − 1/(cosec θ − cot θ)
Solution: LHS = 1/(cosec θ + cot θ) − 1/sin θ
= sin θ/(1 + cos θ) − 1/sin θ
= [sin²θ − (1 + cos θ)]/[(1 + cos θ) sin θ]
= [−(1 − sin²θ) − cos θ]/[(1 + cos θ) sin θ]
= [−cos²θ − cos θ]/[(1 + cos θ) sin θ]
= −cos θ(1 + cos θ)/[(1 + cos θ) sin θ] = −cot θ
RHS = 1/sin θ − 1/(cosec θ − cot θ)
= 1/sin θ − sin θ/(1 − cos θ)
= [1 − cos θ − sin²θ]/[sin θ(1 − cos θ)]
= [cos²θ − cos θ]/[sin θ(1 − cos θ)]
= −cos θ(1 − cos θ)/[sin θ(1 − cos θ)] = −cot θ
Therefore, LHS = RHS.
Example 2.41: Prove that,
sin θ(1 + tan θ) + cos θ(1 + cot θ) = sec θ + cosec θ.
Solution: LHS = sin θ(1 + tan θ) + cos θ(1 + cot θ)
= sin θ(1 + sin θ/cos θ) + cos θ(1 + cos θ/sin θ)
= sin θ + cos θ + sin²θ/cos θ + cos²θ/sin θ
= (sin θ + cos θ) + (sin³θ + cos³θ)/(sin θ cos θ)
= (sin θ + cos θ) + (sin θ + cos θ)(1 − sin θ cos θ)/(sin θ cos θ)
= (sin θ + cos θ)[sin θ cos θ + 1 − sin θ cos θ]/(sin θ cos θ)
= (sin θ + cos θ)/(sin θ cos θ)
= 1/cos θ + 1/sin θ = sec θ + cosec θ = RHS.
Also,
tan θ [1/(sec θ − 1) + 1/(sec θ + 1)]
= tan θ · [2 sec θ/(sec²θ − 1)]
= tan θ · [2 sec θ/tan²θ]
= 2 sec θ/tan θ = 2/sin θ = 2 cosec θ.
2.5 REGRESSION
The term ‘Regression’ was first used in 1877 by Sir Francis Galton who made a
study that showed that the height of children born to tall parents will tend to move
back or ‘regress’ toward the mean height of the population. He designated the
word regression as the name of the process of predicting one variable from another
variable. He coined the term multiple regression to describe the process by which
He coined the term multiple regression to describe the process by which several variables are used to predict another. Thus, when there is a well-established relationship between variables, it is possible to make use of this relationship in making estimates and to forecast the value of one variable (the unknown or the dependent variable) on the basis of the other variable/s (the known or the independent variable/s). A banker, for example, could predict deposits on the basis of per capita income in the trading area of the bank. A marketing manager may plan his advertising expenditures on the basis of the expected effect on total sales revenue of a change in the level of advertising expenditure. Similarly, a hospital superintendent could project his need for beds on the basis of the total population. Such predictions may be made by using regression analysis. An investigator may employ regression analysis to test his theory of a cause-and-effect relationship.
All these explain that regression analysis is an extremely useful tool especially in
problems of business and industry involving predictions.
Assumptions in regression analysis
While making use of the regression techniques for making predictions, the following
are always assumed:
(i) There is an actual relationship between the dependent and independent
variables.
(ii) The values of the dependent variable are random, but the values of the independent variable are fixed quantities without error and are chosen by the experimenter.
(iii) There is a clear indication of direction of the relationship. This means that the dependent variable is a function of the independent variable. (For example, when we say that advertising has an effect on sales, we are treating sales as the dependent variable and advertising as the independent variable.)
(iv) The conditions (that existed when the relationship between the dependent
and independent variable was estimated by the regression) are the same
when the regression model is being used. In other words, it simply means
that the relationship has not changed since the regression equation was
computed.
(v) The analysis is to be used to predict values within the range (and not for
values outside the range) for which it is valid.
2.5.1 Linear Regression
In case of simple linear regression analysis, a single variable is used to predict
another variable on the assumption of linear relationship (i.e., relationship of the
type defined by Y = a + bX) between the given variables. The variable to be
predicted is called the dependent variable and the variable on which the prediction
is based is called the independent variable.
Simple linear regression model (or the Regression Line) is stated as,
Yi = a + bXi + ei
Where, Yi = The dependent variable
Xi = The independent variable
ei = Unpredictable random element (usually called residual or error term)
(i) a represents the Y-intercept, i.e., the intercept specifies the value of the dependent variable when the independent variable has a value of zero. (However, this term has practical meaning only if a zero value for the independent variable is possible.)
(ii) b is a constant, indicating the slope of the regression line. Slope of the line
indicates the amount of change in the value of the dependent variable for a
unit change in the independent variable.
If the two constants (viz., a and b) are known, the accuracy of our prediction of Y (denoted by Ŷ and read as Y-hat) depends on the magnitude of the values of ei. If, in the model, all the ei tend to have very large values, then the estimates will not be very good; but if these values are relatively small, then the predicted values (Ŷ) will tend to be close to the true values (Yi).
Estimating the intercept and slope of the regression model (or estimating
the regression equation)
The two constants or the parameters viz., ‘a’ and ‘b’ in the regression model for
the entire population or universe are generally unknown and as such are estimated
from sample information. The following are the two methods used for estimation:
(i) Scatter diagram method
(ii) Least squares method
1. Scatter diagram method
This method makes use of the Scatter diagram, also known as Dot diagram. A scatter diagram is a diagram representing two series, with the known variable, i.e., the independent variable, plotted on the X-axis and the variable to be estimated, i.e., the dependent variable, plotted on the Y-axis on a graph paper (Refer Figure 2.5), to get the information illustrated in Table 2.1:
Table 2.1 Table Derived from Scatter Diagram
The scatter diagram by itself is not sufficient for predicting values of the dependent
variable. Some formal expression of the relationship between the two variables is
necessary for predictive purposes. For the purpose, one may simply take a ruler
and draw a straight line through the points in the scatter diagram and this way can
determine the intercept and the slope of the said line and then the line can be
defined as Ŷ = a + bXi, with the help of which we can predict Y for a given value of
X. However, there are shortcomings in this approach. For example, if five different
persons draw such a straight line in the same scatter diagram, it is possible that
there may be five different estimates of a and b, especially when the dots are more
dispersed in the diagram. Hence, the estimates cannot be worked out only through
this approach. A more systematic and statistical method is required to estimate the
constants of the predictive equation. The least squares method is used to draw the
best fit line.
2. Least square method
The least squares method of fitting a line (the line of best fit or the regression line)
through the scatter diagram is a method which minimizes the sum of the squared
vertical deviations from the fitted line. In other words, the line to be fitted will pass
through the points of the scatter diagram in such a way that the sum of the squares
of the vertical deviations of these points from the line will be a minimum.
The meaning of the least squares criterion can be easily understood through Figure 2.6, where the scatter diagram shown earlier in Figure 2.5 has been reproduced along with a line which represents the least squares line fitted to the data.
Fig. 2.6 Scatter Diagram, Regression Line and Short Vertical Lines Representing 'e'
As shown in Figure 2.6, the vertical deviations of the individual points from the line are shown as the short vertical lines joining the points to the least squares line. These deviations will be denoted by the symbol 'e'. The value of 'e' varies from one point to another. In some cases it is positive, while in others it is negative. If the line drawn happens to be the least squares line, then the value of Σei² is the least possible. It is because of this feature that the method is known as the Least Squares Method.
Why we insist on minimizing the sum of squared deviations is a question that needs explanation. If we denote the deviation of the actual value Y from the estimated value Ŷ as (Y − Ŷ) or ei, it is logical that we want Σ(Y − Ŷ), or Σei, to be as small as possible. However, merely examining Σ(Y − Ŷ), or Σei, is inappropriate, since any ei can be positive or negative. Large positive values and large negative values could cancel one another. However, large values of ei, regardless of their sign, indicate a poor prediction. Even if we ignore the signs while working out Σ|ei|, the difficulties may continue. Hence, the standard
Solution:
We are to fit a regression line Ŷ = a + bXi to the given data by the method of least squares. Accordingly, we work out the 'a' and 'b' values with the help of the normal equations as stated above and, for the purpose, work out the ΣX, ΣY, ΣXY, ΣX² values from the given sample information, as shown in the following table of summations for the regression equation.
Summations for Regression Equation
S.E. of Ŷ (or Se) = √[Σ(Y − Ŷ)²/(n − 2)] = √[Σe²/(n − 2)]
where, S.E. of Ŷ (or Se) = Standard error of the estimate
Y = Observed value of Y
Ŷ = Estimated value of Y
e = The error term = (Y − Ŷ)
n = Number of observations in the sample
Note: In the above formula, n − 2 is used instead of n because of the fact that two degrees of freedom are lost in basing the estimate on the variability of the sample observations about the line with two constants viz., 'a' and 'b', whose position is determined by those same sample observations.
The square of the Se, also known as the variance of the error term, is the NOTES
basic measure of reliability. The larger the variance, the more significant are the
magnitudes of the e’s and the less reliable is the regression analysis in predicting
the data.
Interpreting the standard error of estimate and finding the confidence
limits for the estimate in large and small samples
The larger the S.E. of estimate (SEe), the greater happens to be the dispersion,
or scattering, of given observations around the regression line. However, if the
S.E. of estimate happens to be zero, then the estimating equation is a ‘Perfect’
estimator (i.e., cent per cent correct estimator) of the dependent variable.
(i) In case of large samples, i.e., where n > 30 in a sample, it is assumed that
the observed points are normally distributed around the regression line and we
may find that,
68 per cent of all points lie within Ŷ ± 1 SEe limits.
95.5 per cent of all points lie within Ŷ ± 2 SEe limits.
99.7 per cent of all points lie within Ŷ ± 3 SEe limits.
This can be stated as,
a. The observed values of Y are normally distributed around each estimated
value of Ŷ and;
b. The variance of the distributions around each possible value of Ŷ is the
same.
(ii) In case of small samples, i.e., where n ≤ 30 in a sample, the 't' distribution is used for finding the two limits more appropriately.
This is done as follows:
Upper limit = Ŷ + ‘t’ (SEe)
Lower limit = Ŷ – ‘t’ (SEe)
Where, Ŷ = The estimated value of Y for a given value of X.
SEe = The standard error of estimate.
‘t’ = Table value of ‘t’ for given degrees of freedom for a
specified confidence level.
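The computation of Se and of the small-sample limits can be sketched as follows. The data are hypothetical, and the 't' value is simply supplied (it must be read from a t-table for n − 2 degrees of freedom at the chosen confidence level); the variable names are our own.

```python
# Sketch: standard error of the estimate S_e for a fitted line Y-hat = a + b*X,
# and the small-sample confidence limits Y-hat +/- t * S_e.
import math

X = [2.0, 4.0, 6.0, 8.0, 10.0]   # hypothetical observations
Y = [3.1, 5.2, 6.8, 9.1, 10.9]

n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
b = (sum(x * y for x, y in zip(X, Y)) - n * xbar * ybar) / \
    (sum(x * x for x in X) - n * xbar * xbar)
a = ybar - b * xbar

y_hat = [a + b * x for x in X]
se = math.sqrt(sum((y - yh) ** 2 for y, yh in zip(Y, y_hat)) / (n - 2))

t = 3.182        # 't' for n - 2 = 3 df at the 95 per cent level (from a t-table)
x0 = 7.0
y0 = a + b * x0
print(y0 - t * se, y0 + t * se)  # lower and upper limits for the estimate
```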
The regression coefficient of Y on X (or bYX) is,
b = r(σY/σX) = (ΣXY − nX̄Ȳ)/(ΣX² − nX̄²)
and a = Ȳ − bX̄, since b = r(σY/σX).
Similarly, the regression equation of X on Y is,
X̂ − X̄ = r(σX/σY)(Y − Ȳ)
and the Regression coefficient of X on Y (or bXY) = r(σX/σY) = (ΣXY − nX̄Ȳ)/(ΣY² − nȲ²)
If we are given the two regression equations as stated above, along with the values
of ‘a’ and ‘b’ constants to solve the same for finding the value of X and Y, then the
values of X and Y so obtained, are the mean values of X (i.e., X ) and the mean
value of Y (i.e., Y ).
If we are given the two regression coefficients (viz., bXY and bYX), then we can work out the value of the coefficient of correlation by just taking the square root of the product of the regression coefficients, as shown:
r = √(bYX · bXY)
= √(r(σY/σX) · r(σX/σY))
= √(r · r) = r
The (±) sign of r will be determined on the basis of the sign of the given regression
coefficients. If regression coefficients have minus sign then r will be taken with
minus (–) sign and if regression coefficients have plus sign then r will be taken with
plus (+) sign, (Remember that both regression coefficients will necessarily have
the same sign, whether it is minus or plus, for their sign is governed by the sign of
coefficient of correlation.) To understand it better, Refer Examples 2.50 and 2.51.
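This rule is easily mechanized; a minimal sketch (the function name is our own, and the sample values are those of Example 2.51):

```python
# Recover r from the two regression coefficients; the sign is taken from the
# common sign of b_yx and b_xy.
import math

def correlation_from_coefficients(b_yx, b_xy):
    if b_yx * b_xy < 0:
        raise ValueError("regression coefficients must have the same sign")
    r = math.sqrt(b_yx * b_xy)
    return -r if b_yx < 0 else r

print(correlation_from_coefficients(4 / 5, 9 / 20))  # 0.6, as in Example 2.51
```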
Example 2.50: Given is the following information:
X Y
Mean 39.5 47.5
Standard Deviation 10.8 17.8
Simple correlation coefficient between X and Y is = + 0.42.
Find the estimating equations of Y and X.
Solution:
The estimating equation of Y can be worked out as,
Ŷ − Ȳ = r(σY/σX)(Xi − X̄)
or Ŷ = r(σY/σX)(Xi − X̄) + Ȳ
= 0.42 × (17.8/10.8)(Xi − 39.5) + 47.5
= 0.69(Xi − 39.5) + 47.5
= 0.69Xi − 27.25 + 47.5
= 0.69Xi + 20.25
Similarly, the estimating equation of X can be worked out as,
X̂ − X̄ = r(σX/σY)(Yi − Ȳ)
or X̂ = r(σX/σY)(Yi − Ȳ) + X̄
= 0.42 × (10.8/17.8)(Yi − 47.5) + 39.5
= 0.26(Yi − 47.5) + 39.5
= 0.26Yi − 12.35 + 39.5
= 0.26Yi + 27.15
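The same computation can be checked in code. Here the slopes are kept at full precision, so the intercepts differ slightly from the intermediate-rounded values 20.25 and 27.15 in the worked example; the variable names are our own.

```python
# Check of Example 2.50: forming both estimating equations from the means,
# standard deviations and r.
r = 0.42
x_mean, y_mean = 39.5, 47.5
sd_x, sd_y = 10.8, 17.8

b_yx = r * sd_y / sd_x        # ~0.692, rounded to 0.69 in the text
b_xy = r * sd_x / sd_y        # ~0.255, rounded to 0.26 in the text

a_y = y_mean - b_yx * x_mean  # intercept of Y-hat = a_y + b_yx * X
a_x = x_mean - b_xy * y_mean  # intercept of X-hat = a_x + b_xy * Y
print(round(b_yx, 4), round(a_y, 4))
print(round(b_xy, 4), round(a_x, 4))
```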
Example 2.51: The following is the given data:
Variance of X = 9
Regression equations:
NOTES 4X – 5Y + 33 = 0
20X – 9Y – 107 = 0
Find: (i) Mean values of X and Y
(ii) Coefficient of Correlation between X and Y
(iii) Standard deviation of Y
Solution:
(i) For finding the mean values of X and Y, we solve the two given regression
equations for the values of X and Y as follows:
4X – 5Y + 33 = 0 ...(1)
20X – 9Y –107 = 0 ...(2)
If we multiply Equation (1) by 5, we have the following equations:
20X − 25Y = −165 ...(3)
20X − 9Y = 107 ...(2)
Subtracting Equation (2) from (3),
−16Y = −272
or Y = 17
Putting this value of Y in Equation (1), we have,
4X = −33 + 5(17)
or X = (−33 + 85)/4 = 52/4 = 13
Hence, X̄ = 13 and Ȳ = 17
(ii) For finding the coefficient of correlation, first of all we presume one of the
two given regression equations as the estimating equation of X. Let equation
4X – 5Y + 33 = 0 be the estimating equation of X, then we have,
X̂ = (5/4)Yi − 33/4
and from this we can write bXY = 5/4.
The other given equation is then taken as the estimating equation of Y and can
be written as,
Ŷ = (20/9)Xi − 107/9
and from this we can write bYX = 20/9.
If the above equations are correct, then r must be equal to,
r = √((5/4) × (20/9)) = √(25/9) = 5/3 ≈ 1.67
which is an impossible result, since r can in no case be greater than 1. Hence, we change our supposition about the estimating equations and, by reversing it, we re-write the estimating equations as,
X̂ = (9/20)Yi + 107/20
and Ŷ = (4/5)Xi + 33/5
Hence, r = √((9/20) × (4/5))
= √(9/25)
= 3/5
= 0.6
Since the regression coefficients have plus signs, we take r = +0.6
(iii) Standard deviation of Y can be calculated as follows:
Variance of X = 9, so the standard deviation of X is σX = 3
bYX = r(σY/σX), i.e., 4/5 = 0.6 × σY/3 = 0.2σY
Hence, σY = 4
Alternatively, we can work it out as,
bXY = r(σX/σY), i.e., 9/20 = 0.6 × 3/σY = 1.8/σY
Hence, σY = 4
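The whole of Example 2.51 can be re-worked in a few lines of code: the two regression equations are solved as a 2 × 2 linear system (Cramer's rule) for the means, and r and σY follow from the correctly identified coefficients. The variable names are our own.

```python
# Re-working Example 2.51: 4X - 5Y + 33 = 0 and 20X - 9Y - 107 = 0.
import math

a1, b1, c1 = 4.0, -5.0, -33.0    # 4X - 5Y = -33
a2, b2, c2 = 20.0, -9.0, 107.0   # 20X - 9Y = 107
det = a1 * b2 - a2 * b1
x_mean = (c1 * b2 - c2 * b1) / det
y_mean = (a1 * c2 - a2 * c1) / det

# Correct identification: b_xy = 9/20, b_yx = 4/5 (the other choice gives r > 1).
r = math.sqrt((9 / 20) * (4 / 5))
sigma_y = (4 / 5) / r * 3.0      # from b_yx = r * sigma_y / sigma_x, sigma_x = 3

print(x_mean, y_mean, round(r, 6), round(sigma_y, 6))  # 13.0 17.0 0.6 4.0
```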
(This can be seen by replacing x in this equation with x + 1 and subtracting the equation in x from the equation in x + 1.) For infinitesimal changes in x, the effect on y is given by the total derivative with respect to x.
The fact that the change in yield depends on x is what makes the relationship
between x and y nonlinear even though the model is linear in the parameters to be
estimated.
In general, we can model the expected value of y as an nth degree polynomial, yielding the general polynomial regression model,
y = β0 + β1x + β2x² + ... + βnxⁿ + ε
Conveniently, these models are all linear from the point of view of estimation,
since the regression function is linear in terms of the unknown parameters β0, β1,
.... therefore, for least squares analysis, the computational and inferential problems
of polynomial regression can be completely addressed using the techniques of
multiple regression. This is done by treating x, x², ... as being distinct independent variables in a multiple regression model.
Matrix form and Calculation of Estimates
The polynomial regression model
yi = β0 + β1xi + β2xi² + ... + βmxi^m + εi (i = 1, 2, ..., n)
can be expressed in matrix form as y = Xβ + ε, where the design matrix X has i-th row (1, xi, xi², ..., xi^m). The least-squares estimate is β̂ = (XᵀX)⁻¹Xᵀy. Since X is a Vandermonde matrix, the invertibility condition is guaranteed to hold if all the xi values are distinct, and this is then the unique least-squares solution.
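The matrix formulation can be sketched in pure Python: build the Vandermonde design matrix, form the normal equations (XᵀX)β = Xᵀy, and solve them by Gaussian elimination. (In production code one would prefer an orthogonal method such as QR to forming XᵀX explicitly; the function name `polyfit` is our own.)

```python
# Least-squares polynomial fit via the Vandermonde design matrix and the
# normal equations, solved by Gaussian elimination with partial pivoting.

def polyfit(xs, ys, m):
    """Return the least-squares coefficients (beta_0, ..., beta_m)."""
    X = [[x ** j for j in range(m + 1)] for x in xs]          # design matrix
    A = [[sum(X[i][p] * X[i][q] for i in range(len(xs)))      # A = X^T X
          for q in range(m + 1)] for p in range(m + 1)]
    rhs = [sum(X[i][p] * ys[i] for i in range(len(xs)))       # rhs = X^T y
           for p in range(m + 1)]
    n = m + 1
    for col in range(n):                                      # elimination
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    beta = [0.0] * n
    for r in range(n - 1, -1, -1):                            # back substitution
        beta[r] = (rhs[r] - sum(A[r][c] * beta[c]
                                for c in range(r + 1, n))) / A[r][r]
    return beta

# Data generated from y = 1 + 2x + 3x^2 is recovered (up to round-off).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
beta = polyfit(xs, [1 + 2 * x + 3 * x * x for x in xs], 2)
print([round(b, 6) for b in beta])
```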
Explanation of Polynomial Regression
Although polynomial regression is technically a special case of multiple linear
regression, the interpretation of a fitted polynomial regression model requires a
somewhat different perspective. It is often difficult to interpret the individual
coefficients in a polynomial regression fit, since the underlying monomials can be
highly correlated. For example, x and x2 have correlation around 0.97 when x is
uniformly distributed on the interval (0, 1). Although the correlation can be reduced
by using orthogonal polynomials, it is generally more informative to consider the
fitted regression function as a whole. Point-wise or simultaneous confidence bands
can then be used to provide a sense of the uncertainty in the estimate of the
regression function.
Alternative Approaches of Polynomial Regression
Polynomial regression is one example of regression analysis using basis functions
to model a functional relationship between two quantities. More specifically, it
replaces the linear basis in linear regression with a polynomial basis, such as 1, x, x², ..., x^m.
A drawback of polynomial bases is that the basis
functions are ‘Non-Local’, meaning that the fitted value of y at a given value x = x0
depends strongly on data values with x far from x0. In modern statistics, polynomial
basis-functions are used along with new basis functions, such as splines, radial basis
functions, and wavelets. These families of basis functions offer a more parsimonious
fit for many types of data.
The goal of polynomial regression is to model a non-linear relationship
between the independent and dependent variables (technically, between the
independent variable and the conditional mean of the dependent variable). This is
similar to the goal of nonparametric regression, which aims to capture non-linear
regression relationships. Therefore, non-parametric regression approaches such
as smoothing can be useful alternatives to polynomial regression. Some of these
methods make use of a localized form of classical polynomial regression. An
advantage of traditional polynomial regression is that the inferential framework of
multiple regression can be used (this also holds when using other families of basis
functions such as splines). A final alternative is to use kernelized models such as
support vector regression with a polynomial kernel. If residuals have unequal
variance, a weighted least squares estimator may be used to account for that.
2.5.3 Fitting Exponential
In this section, we consider the problem of approximating an unknown function
whose values, at a set of points, are generally known only empirically and are, thus
subject to inherent errors, which may sometimes be appreciably large in many
engineering and scientific problems. In these cases, it is required to derive a
Curve Fitting
functional relationship using certain experimentally observed data. Here the
observed data may have inherent or round-off errors, which are serious, making
polynomial interpolation for approximating the function inappropriate. In polynomial
NOTES interpolation the truncation error in the approximation is considered to be important.
But when the data contains round-off errors or inherent errors, interpolation is not
appropriate.
The subject of this section is curve fitting by least square approximation. Here
we consider a technique by which noisy function values are used to generate a
smooth approximation. This smooth approximation can then be used to approximate
the derivative more accurately than with exact polynomial interpolation.
There are situations where interpolation for approximating function may not
be efficacious procedure. Errors will arise when the function values f (xi), i = 1, 2,
…, n are observed data and not exact. In this case, if we use the polynomial
interpolation, then it would reproduce all the errors of observation. In such situations
one may take a large number of observed data, so that statistical laws in effect
cancel the errors introduced by inaccuracies in the measuring equipment. The
approximating function is then derived, such that the sum of the squared deviation
between the observed values and the estimated values are made as small as
possible.
Mathematically, the problem of curve fitting or function approximation may be
stated as follows:
To find a functional relationship y = g(x), that relates the set of observed data
values Pi(xi, yi), i = 1, 2,...,n as closely as possible, so that the graph of y = g(x)
goes near the data points Pi’s though not necessarily through all of them.
The first task in curve fitting is to select a proper form of an approximating
function g(x), containing some parameters, which are then determined by minimizing
the total squared deviation.
For example, g (x) may be a polynomial of some degree or an exponential or
logarithmic function. Thus g (x) may be any of the following:
(i) g(x) = αx + β (ii) g(x) = α + βx + γx²
(iii) g(x) = αe^(βx) (iv) g(x) = αe^(−βx)
(v) g(x) = α + β log(x)
Here α, β, γ are parameters which are to be evaluated so that the curve
y = g(x), fits the data well. A measure of how well the curve fits is called the
goodness of fit.
In the case of least square fit, the parameters are evaluated by solving a system
of normal equations, derived from the conditions to be satisfied so that the sum of
the squared deviations of the estimated values from the observed values, is minimum.
Thus, we minimize S = Σ [g(xi) − yi]², the sum being over i = 1, 2, ..., n.   (2.63)
The function g(x) may have some parameters, α, β, γ. In order to determine these parameters we have to form the necessary conditions for S to be minimum, which are
∂S/∂α = 0, ∂S/∂β = 0, ∂S/∂γ = 0   (2.64)
These equations are called normal equations, solving which we get the parameters for the best approximate function g(x).
Curve Fitting by a Straight Line: Let g(x) = αx + β, be the straight line which fits a set of observed data points (xi, yi), i = 1, 2, ..., n.
Let S be the sum of the squares of the deviations g(xi) − yi, i = 1, 2, ..., n; given by,
S = Σ (αxi + β − yi)²   (2.65)
The conditions for minimum S give the normal equations,
∂S/∂β = 0, i.e., Σ (αxi + β − yi) = 0   (2.66)
and, ∂S/∂α = 0, i.e., Σ xi(αxi + β − yi) = 0   (2.67)
where, S1 = Σxi, S01 = Σyi, S2 = Σxi², S11 = Σxiyi
Solving, α = (nS11 − S1S01)/(nS2 − S1²). Also, β = S01/n − αS1/n.
Algorithm. Fitting a straight line y = a + bx.
Step 1. Read n [n being the number of data points]
Step 2. Initialize: sum x = 0, sum x2 = 0, sum y = 0, sum xy = 0
Step 3. For j = 1 to n compute
Begin
Read data xj, yj
Compute sum x = sum x + xj
Compute sum x2 = sum x2 + xj × xj
Compute sum y = sum y + yj
Compute sum xy = sum xy + xj × yj
End
Step 4. Compute b = (n × sum xy − sum x × sum y)/(n × sum x2 − (sum x)²)
Step 5. Compute x bar = sum x/n
Step 6. Compute y bar = sum y/n
Step 7. Compute a = y bar − b × x bar
Step 8. Write a, b
Step 9. For j = 1 to n
Begin
Compute y estimate = a + b × xj
Write xj, yj, y estimate
End
Step 10. Stop
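The algorithm above can be transcribed directly, for instance in Python (the function name `fit_line` is our own); it is checked here against the worked example with x = 4, 6, 8, 10, 12 that follows.

```python
# One-pass accumulation of the four sums, then b and a, as in the algorithm.

def fit_line(points):
    n = len(points)
    sum_x = sum_x2 = sum_y = sum_xy = 0.0
    for x, y in points:                    # Step 3
        sum_x += x
        sum_x2 += x * x
        sum_y += y
        sum_xy += x * y
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)  # Step 4
    a = sum_y / n - b * (sum_x / n)        # Steps 5-7
    return a, b

data = [(4, 13.72), (6, 12.90), (8, 12.01), (10, 11.14), (12, 10.31)]
a, b = fit_line(data)
print(round(a, 3), round(b, 3))            # 15.448 -0.429
```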
Curve Fitting by a Quadratic (A Parabola): Let g(x) = a + bx + cx2, be the
approximating quadratic to fit a set of data (xi, yi), i = 1, 2, ..., n. Here the
parameters are to be determined by the method of least squares, i.e., by minimizing
the sum of the squares of the deviations given by,
S = Σ_{i=1}^{n} (a + bx_i + cx_i² − y_i)²   (2.68)

Thus the normal equations, ∂S/∂a = 0, ∂S/∂b = 0, ∂S/∂c = 0, are as follows:

Σ_{i=1}^{n} (a + bx_i + cx_i² − y_i) = 0   (2.69)

Σ_{i=1}^{n} x_i (a + bx_i + cx_i² − y_i) = 0

Σ_{i=1}^{n} x_i² (a + bx_i + cx_i² − y_i) = 0   (2.70)
where,

s_1 = Σ_{i=1}^{n} x_i, s_2 = Σ_{i=1}^{n} x_i², s_3 = Σ_{i=1}^{n} x_i³, s_4 = Σ_{i=1}^{n} x_i⁴   (2.71)

s_01 = Σ_{i=1}^{n} y_i, s_11 = Σ_{i=1}^{n} x_i y_i, s_21 = Σ_{i=1}^{n} x_i² y_i   (2.72)
It is clear that the normal equations form a system of linear equations in the unknown parameters a, b, c. The computation of the coefficients of the normal equations can be made in a tabular form for desk computations as shown below,

i     x_i   y_i   x_i²   x_i³   x_i⁴   x_i y_i   x_i² y_i
1     x_1   y_1   x_1²   x_1³   x_1⁴   x_1 y_1   x_1² y_1
2     x_2   y_2   x_2²   x_2³   x_2⁴   x_2 y_2   x_2² y_2
...   ...   ...   ...    ...    ...    ...       ...
n     x_n   y_n   x_n²   x_n³   x_n⁴   x_n y_n   x_n² y_n
sum   s_1   s_01  s_2    s_3    s_4    s_11      s_21
Example: Fit a straight line to the following data:

x_i: 4     6     8     10    12
y_i: 13.72 12.90 12.01 11.14 10.31

Solution: Let y = a + bx be the straight line which fits the data. We have to minimize

S = Σ_{i=1}^{5} (y_i − a − bx_i)²

Thus the normal equations are,

Σ_{i=1}^{5} y_i − 5a − b Σ_{i=1}^{5} x_i = 0

and Σ_{i=1}^{5} x_i y_i − a Σ_{i=1}^{5} x_i − b Σ_{i=1}^{5} x_i² = 0

The sums required are computed in the following table:
xi yi xi2 xi yi
4 13.72 16 54.88
6 12.90 36 77.40
8 12.01 64 96.08
10 11.14 100 111.40
12 10.31 144 123.72
Sum 40 60.08 360 463.48
From the table, n = 5, Σx_i = 40, Σy_i = 60.08, Σx_i² = 360, Σx_i y_i = 463.48, so that b = (5 × 463.48 − 40 × 60.08)/(5 × 360 − 40²) = −85.8/200 = −0.429 and a = 60.08/5 − (−0.429)(40/5) = 15.448. Hence the fitted line is y = 15.448 − 0.429x.
Solution: Let the straight line fitting the data be y = a + bx. The data values being
large, we can use a change in variable by substituting u = x – 62, and v = y – 48.
Let v = A + B u, be a straight line fitting the transformed data, where the
normal equations for A and B are,
Σ_{i=1}^{5} v_i = 5A + B Σ_{i=1}^{5} u_i

Σ_{i=1}^{5} u_i v_i = A Σ_{i=1}^{5} u_i + B Σ_{i=1}^{5} u_i²

The computation of the various sums is given in the table below,
x_i   y_i   u_i   v_i   u_i v_i   u_i²
60    40    –2    –8    16        4
61    42    –1    –6    6         1
62    48    0     0     0         0
63    52    1     4     4         1
64    55    2     7     14        4
Sum         0     –3    40        10
Since Σu_i = 0, the normal equations give A = Σv_i/5 = −3/5 = −0.6 and B = Σu_i v_i / Σu_i² = 40/10 = 4. Thus v = −0.6 + 4u, i.e., y − 48 = −0.6 + 4(x − 62), or y = 4x − 200.6.
For fitting an exponential curve, taking logarithms reduces the problem to a straight-line fit in z = log y. The least squares solution gives,

β = [n Σ_{i=1}^{n} x_i log y_i − (Σ_{i=1}^{n} log y_i)(Σ_{i=1}^{n} x_i)] / [n Σ_{i=1}^{n} x_i² − (Σ_{i=1}^{n} x_i)²]   (2.77)

and z̄ = α + β x̄, where x̄ = Σ x_i/n, z̄ = Σ log y_i/n   (2.78)

After computing α and β, we can determine a and b given by Equation (2.75). Finally, the exponential curve fitting the data set is given by Equation (2.73).
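Equations (2.73) and (2.75) are not reproduced in this excerpt, so the sketch below assumes the common form y = a·e^(bx); the function name is illustrative.

```python
import math

def fit_exponential(xs, ys):
    """Fit y = a * exp(b*x) by least squares on z = ln y (the log transform above).
    Assumes all y values are positive."""
    n = len(xs)
    zs = [math.log(y) for y in ys]
    sx = sum(xs)
    sx2 = sum(x * x for x in xs)
    sz = sum(zs)
    sxz = sum(x * z for x, z in zip(xs, zs))
    # Straight-line fit z = alpha + beta*x, as in (2.77)-(2.78)
    beta = (n * sxz - sx * sz) / (n * sx2 - sx ** 2)
    alpha = sz / n - beta * (sx / n)
    return math.exp(alpha), beta  # a = e^alpha, b = beta

# Data generated from y = 2*e^(0.5x) is recovered exactly.
xs = [0, 1, 2, 3]
ys = [2 * math.exp(0.5 * x) for x in xs]
a, b = fit_exponential(xs, ys)
```

Note that the line is fitted to log y, not y, so the result minimizes the squared error in the logarithms rather than in the original ordinates.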
Algorithm. To fit a straight line for a given set of data points by least square
error method.
Step 1. Read the number of data points, i.e., n
Step 2. Read values of data-points, i.e., Read (xi, yi) for i = 1, 2,..., n
Step 3 Initialize the sums to be computed for the normal equations,
i.e., sx = 0, sx2 = 0, sy = 0, syx = 0
Step 4. Compute the sums, i.e., For i = 1 to n do
Begin
sx = sx + xi
sx2 = sx2 + xi × xi
sy = sy + yi
syx = syx + xi × yi
End
Step 5. Solve the normal equations, i.e., solve for a and b of the line y = a + bx:
Compute d = n × sx2 − sx × sx
b = (n × syx − sy × sx)/d
xbar = sx/n
ybar = sy/n
a = ybar − b × xbar
Step 6. Print values of a and b
Step 7. Print a table of values of xi, yi, ypi = a + b × xi for i = 1, 2, ..., n
Step 8. Stop
Algorithm. To fit a parabola y = a + bx + cx², for a given set of data points by least square error method.
Step 1. Read n, the number of data points
Step 2. Read (xi, yi) for i = 1, 2, ..., n; the values of data points
Step 3. Initialize the sums to be computed for the normal equations,
i.e., sx = 0, sx2 = 0, sx3 = 0, sx4 = 0, sy = 0, sxy = 0, sx2y = 0
Step 4. Compute the sums, i.e., For i = 1 to n do
Begin
sx = sx + xi
x2 = xi × xi
sx2 = sx2 + x2
sx3 = sx3 + xi × x2
sx4 = sx4 + x2 × x2
sy = sy + yi
sxy = sxy + xi × yi
sx2y = sx2y + x2 × yi
End
Step 5. Form the coefficient matrix {aij} of the normal equations, i.e.,
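The normal equations formed in Steps 3–5 can be assembled and solved directly; the following Python sketch (names illustrative) applies Gaussian elimination to the resulting 3 × 3 system.

```python
def fit_parabola(xs, ys):
    """Least-squares parabola y = a + b*x + c*x^2 via the normal equations."""
    n = len(xs)
    sx = sum(xs)
    sx2 = sum(x ** 2 for x in xs)
    sx3 = sum(x ** 3 for x in xs)
    sx4 = sum(x ** 4 for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2y = sum(x * x * y for x, y in zip(xs, ys))
    # Augmented matrix of the normal equations in (a, b, c)
    A = [[n,   sx,  sx2, sy],
         [sx,  sx2, sx3, sxy],
         [sx2, sx3, sx4, sx2y]]
    # Gaussian elimination with partial pivoting
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(col + 1, 3):
            m = A[r][col] / A[col][col]
            for c in range(col, 4):
                A[r][c] -= m * A[col][c]
    coeffs = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):  # back substitution
        coeffs[r] = (A[r][3] - sum(A[r][c] * coeffs[c]
                                   for c in range(r + 1, 3))) / A[r][r]
    return tuple(coeffs)

# Data generated from y = 1 + 2x + 3x^2 is recovered exactly.
a, b, c = fit_parabola([0, 1, 2, 3], [1, 6, 17, 34])
```

For hand computation the same coefficients come straight from the desk table of sums shown earlier in this section.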
11. If the revolving line OP is in the first quadrant, then all the sides of the triangle OPM are positive. Therefore, all the trigonometrical ratios are positive in the first quadrant.
12. The following are the two methods used for estimation:
(i) Scatter diagram method
(ii) Least squares method
13. Squaring each term accomplishes two purposes, viz., (i) It magnifies
(or penalizes) the larger errors, and (ii) It cancels the effect of the positive
and negative values (since a negative error when squared becomes positive).
2.7 SUMMARY
The problem of interpolation is a very fundamental problem in numerical analysis.
In numerical analysis, interpolation means computing the value of a function
f (x) in between values of x in a table of values.
Lagrange’s interpolation is useful for unequally spaced tabulated values.
For interpolation of an unknown function when the tabular values of the
argument x are equally spaced, we have two important interpolation
formulae, viz., Newton’s forward difference interpolation formula and
Newton’s backward difference interpolation formula.
The forward difference operator is defined by Δf(x) = f(x + h) − f(x).
The backward difference operator is defined by ∇f(x) = f(x) − f(x − h).
We define different types of finite differences such as forward differences,
backward differences and central differences, and express them in terms of
operators.
The shift operator is denoted by E and is defined by E f (x) = f (x + h).
The first order difference of a polynomial of degree n is a polynomial of
degree n–1. For polynomial of degree n, all other differences having order
higher than n are zero.
Newton’s forward difference interpolation formula is generally used for
interpolating near the beginning of the table while Newton’s backward
difference formula is used for interpolating at a point near the end of a table.
In iterative linear interpolation, we successively generate interpolating
polynomials, of any degree, by iteratively using linear interpolating functions.
The process of finding values of a function at points beyond the interval is
termed as extrapolation.
Horner’s method of synthetic substitution is used for evaluating the values
of a polynomial and its derivatives for a given x.
Descartes' rule is used to determine the number of negative roots by finding the number of changes of signs in pn(–x).
By using the method of least squares, noisy function values are used to generate a smooth approximation. This smooth approximation can then be used to approximate the derivative more accurately than with exact polynomial interpolation.
The term ‘Regression’ was first used in 1877 by Sir Francis Galton, who made a study that showed that the height of children born to tall parents tends to move back or ‘regress’ toward the mean height of the population.
2.8 KEY TERMS
Interpolation: Interpolation means computing the value of a function f(x)
in between values of x in a table of values.
Extrapolation: The process of finding values of a function at points beyond
the interval is termed as extrapolation.
Newton-Raphson method: Newton-Raphson method is a widely used
numerical method for finding a root of an equation f (x) = 0, to the desired
accuracy.
First quadrant: If the revolving line OP is in the first quadrant, then all the
sides of the triangle OPM are positive. Therefore, all the trigonometrical
ratios are positive in the first quadrant.
Scatter diagram: A diagram representing two series, with the known (independent) variable plotted on the X-axis and the variable to be estimated (the dependent variable) plotted on the Y-axis of a graph paper.
2.9 SELF-ASSESSMENT QUESTIONS AND EXERCISES
Short-Answer Questions
1. What is the significance of polynomial interpolation?
2. Define the symbolic operators E and D.
3. What is the degree of the first order forward difference of a polynomial of
degree n?
4. What is the degree of the nth order forward difference of a polynomial of
degree n?
5. Write Newton’s forward and backward difference formulae.
6. State an application of iterative linear interpolation.
7. What is the advantage of extrapolation?
8. State Lagrange’s formula for inverse interpolation.
9. How many roots are there in a polynomial equation of degree n?
10. How many positive real roots are there in a polynomial equation?
11. Define the term first quadrant.
12. List the basic precautions and limitations of regression and correlation
analyses.
13. Differentiate between scatter diagram and least square method.
Long-Answer Questions
1. Use Lagrange’s interpolation formula to find the polynomials of least degree which attain the following tabular values:
(a) x: –2  1  2
    y: 25  8  15
(b) x: 0  1  2  5
    y: 2  3  12  147
(c) x: 1  2  3  4
    y: 1  1  1  5
2. Form the finite difference table for the given tabular values and find the values of:
(a) Δf(2)
(b) Δ²f(1)
(c) Δ³f(0)
(d) Δ⁴f(1)
(e) ∇f(5)
(f) ∇²f(3)
x:    0  1  2   3   4   5    6
f(x): 3  4  13  36  79  148  249
3. How are the forward and backward differences in a table related? Prove the following:
(a) ∇y_i = Δy_{i−1}
(b) ∇²y_i = Δ²y_{i−2}
(c) ∇ⁿy_i = Δⁿy_{i−n}
4. Describe Newton’s forward and backward difference formulae using
illustrations.
5. Explain iterative linear interpolation with the help of examples.
6. Illustrate inverse interpolation procedure.
7. Use the method of least squares to fit a straight line for the following data points:
x: –1  0  1  2  3  4  5  6
y: 10  9  7  5  4  3  0  –1
8. Discuss about the trigonometric function with the help of giving examples.
9. What is regression analysis? What are the assumptions in it?
10. Explain scatter diagram and the least square method in detail. Also, mention
how scatter diagram helps in studying correlation between two variables.
2.10 FURTHER READING
UNIT 3 NUMERICAL DIFFERENTIATION AND INTEGRATION
Structure
3.0 Introduction
3.1 Objectives
3.2 Numerical Differentiation Formulae
3.2.1 Differentiation Using Newton’s Forward Difference Interpolation Formula
3.2.2 Differentiation Using Newton’s Backward Difference Interpolation
Formula
3.3 Numerical Integration Formulae
3.3.1 Simpson’s One-Third Rule
3.3.2 Weddle’s Formula
3.3.3 Errors in Integration Formulae
3.3.4 Gaussian Quadrature
3.4 Solving Numerical Differential Equations
3.4.1 Taylor Series Method
3.4.2 Euler’s Method
3.4.3 Runge-Kutta Methods
3.4.4 Higher Order Differential Equations
3.5 Answers to ‘Check Your Progress’
3.6 Summary
3.7 Key Terms
3.8 Self-Assessment Questions and Exercises
3.9 Further Reading
3.0 INTRODUCTION
In numerical analysis, numerical differentiation is the process of finding the numerical
value of a derivative of a given function at a given point. It is the process of
computing the derivatives of a function f(x) when the function is not explicitly
known, but the values of the function are known only at a given set of arguments
x = x0, x1, x2,..., xn. For finding the derivatives, a suitable interpolating polynomial
is used and then its derivatives are used as the formulae for the derivatives of the
function. Thus, for computing the derivatives at a point near the beginning of an
equally spaced table, Newton’s forward difference interpolation formula is used,
whereas Newton’s backward difference interpolation formula is used for computing
the derivatives at a point near the end of the table.
Numerical integration constitutes a broad family of algorithms for calculating
the numerical value of a definite integral. The numerical computation of an integral
is sometimes called quadrature. The most straightforward numerical integration
technique uses the Newton-Cotes formulas, which approximate a function tabulated
at a sequence of regularly spaced intervals by various degree polynomials. If the
functions are known analytically instead of being tabulated at equally spaced
intervals, the best numerical method of integration is called Gaussian quadrature.
The basic problem considered by numerical integration is to compute an approximate solution to a definite integral ∫_a^b f(x) dx. If f(x) is a smooth, well-behaved function integrated over a small number of dimensions and the limits of integration are bounded, then there are many methods of approximating the integral with
arbitrary precision. Numerical integration methods can generally be described as
combining evaluations of the integrand to get an approximation to the integral. The
integrand is evaluated at a finite set of points called integration points and a weighted
sum of these values is used to approximate the integral. The integration points and
weights depend on the specific method used and the accuracy required from the
approximation. Modern numerical integration methods based on information theory
have been developed to simulate information systems such as computer controlled
systems, communication systems, and control systems.
An ordinary differential equation is a relation that contains functions of only
one independent variable and one or more of their derivatives with respect to that
variable. Ordinary differential equations are distinguished from partial differential
equations, which involve partial derivatives of functions of several variables.
Ordinary differential equations arise in many different contexts including geometry,
mechanics, astronomy and population modelling. The Picard–Lindelöf theorem,
Picard’s existence theorem or Cauchy–Lipschitz theorem is an important theorem
on existence and uniqueness of solutions to first-order equations with given initial
conditions. The Picard method is a way of approximating solutions of ordinary
differential equations. Originally, it was a way of proving the existence of solutions.
It is only by advanced symbolic computing that it has become a practical way of
approximating solutions. Euler’s method is a first-order numerical procedure for
solving ordinary differential equations with a given initial value. It is the most basic
kind of explicit method for numerical integration of ordinary differential equations
and is the simplest kind of Runge-Kutta method.
In this unit, you will learn about the numerical differentiation formulae,
Simpson’s rule, errors in integration formulae, Gaussian quadrature formulae,
solving numerical differential equation, Euler’s method, Taylor series method,
Runge-Kutta method and higher order differential equation.
3.1 OBJECTIVES
After going through this unit, you will be able to:
Describe numerical differentiation
Differentiate using Newton’s forward difference interpolation formula
Differentiate using Newton’s backward difference interpolation formula
Describe numerical integration
Identify the numerical methods for evaluating a definite integral
Know Newton-Cotes general quadrature
Understand Simpson’s one-third and three-eighth rule
Explain interval halving technique
Numerically evaluate double integrals
Define Picard’s method of successive approximation
Describe Euler’s method and Taylor series method
Explain Runge-Kutta and multistep methods
Understand predictor-corrector methods
Find numerical solution of boundary value problems
Define higher order differential equations
For a given x near the end of the table, the values of dy/dx and d²y/dx² are computed by first computing v = (x − x_n)/h and using the above formulae. At the tabulated point x_n, the derivatives are given by,

y′(x_n) = (1/h)[∇y_n + (1/2)∇²y_n + (1/3)∇³y_n + (1/4)∇⁴y_n + ...]   (3.7)

y″(x_n) = (1/h²)[∇²y_n + ∇³y_n + (11/12)∇⁴y_n + (5/6)∇⁵y_n + ...]   (3.8)
Example 3.1: Compute the values of f′(2.1), f″(2.1), f′(2.0) and f″(2.0) when f(x) is not known explicitly, but the following table of values is given:
x      f(x)
2.0 0.69315
2.2 0.78846
2.4 0.87547
Solution: Since the points are equally spaced, we form the finite difference table.
x      f(x)       Δf(x)      Δ²f(x)
2.0    0.69315
                  0.09531
2.2    0.78846               –0.00830
                  0.08701
2.4    0.87547

f′(2.0) = (1/h)[Δf_0 − (1/2)Δ²f_0]
        = (1/0.2)[0.09531 + (1/2)(0.00830)]
        = 0.09946/0.2
        = 0.4973

f″(2.0) = (1/h²)Δ²f_0 = (1/(0.2)²)(−0.00830) = −0.2075
Example 3.2: For the function f(x) whose values are given in the table below, compute the values of f′(1), f″(1), f′(5.0), f″(5.0).
x 1 2 3 4 5 6
f ( x) 7.4036 7.7815 8.1291 8.4510 8.7506 9.0309
Solution: Since f(x) is known at equally spaced points, we form the finite differ-
ence table to be used in the differentiation formulae based on Newton’s interpo-
lating polynomial.
x    f(x)      Δf         Δ²f        Δ³f       Δ⁴f       Δ⁵f
1    7.4036
               0.3779
2    7.7815               –0.0303
               0.3476                 0.0046
3    8.1291               –0.0257               –0.0012
               0.3219                 0.0034               0.0008
4    8.4510               –0.0223               –0.0004
               0.2996                 0.0030
5    8.7506               –0.0193
               0.2803
6    9.0309
To calculate f′(1) and f″(1), we use the derivative formulae based on Newton's forward difference interpolation at the tabulated point, given by

f′(x_0) = (1/h)[Δf_0 − (1/2)Δ²f_0 + (1/3)Δ³f_0 − (1/4)Δ⁴f_0 + (1/5)Δ⁵f_0]

f″(x_0) = (1/h²)[Δ²f_0 − Δ³f_0 + (11/12)Δ⁴f_0 − (5/6)Δ⁵f_0]

∴ f′(1) = (1/1)[0.3779 − (1/2)(−0.0303) + (1/3)(0.0046) − (1/4)(−0.0012) + (1/5)(0.0008)]
= 0.39504

f″(1) = −0.0303 − 0.0046 + (11/12)(−0.0012) − (5/6)(0.0008)
= −0.0367

Similarly, for evaluating f′(5.0) and f″(5.0), we use the following formulae:

f′(x_n) = (1/h)[∇f_n + (1/2)∇²f_n + (1/3)∇³f_n + (1/4)∇⁴f_n + (1/5)∇⁵f_n]

f″(x_n) = (1/h²)[∇²f_n + ∇³f_n + (11/12)∇⁴f_n + (5/6)∇⁵f_n]

∴ f′(5) = 0.2996 + (1/2)(−0.0223) + (1/3)(0.0034) + (1/4)(−0.0012)
= 0.2893

f″(5) = −0.0223 + 0.0034 + (11/12)(−0.0012)
= −0.0200
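The difference-table computations of these examples can be sketched in Python (names illustrative); applied to the data of Example 3.2 it reproduces f′(1) ≈ 0.39504.

```python
def forward_difference_table(ys):
    """Successive columns of forward differences Δ^k f of equally spaced data."""
    table = [list(ys)]
    while len(table[-1]) > 1:
        prev = table[-1]
        table.append([prev[i + 1] - prev[i] for i in range(len(prev) - 1)])
    return table

def derivative_at_start(ys, h):
    """f'(x0) = (1/h)[Δf0 - Δ²f0/2 + Δ³f0/3 - ...], the formula used above."""
    cols = forward_difference_table(ys)
    return sum((-1) ** (k + 1) * cols[k][0] / k
               for k in range(1, len(cols))) / h

# Data of Example 3.2, h = 1
fp1 = derivative_at_start([7.4036, 7.7815, 8.1291, 8.4510, 8.7506, 9.0309], 1.0)
```

The alternating signs come from differentiating Newton's forward interpolation polynomial and evaluating at u = 0.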
Example 3.3: Compute the values of y′(0), y″(0), y′(0.02) and y″(0.02) for the function y = f(x) given by the following tabular values:
x 0.0 0.05 0.10 0.15 0.20 0.25
y 0.00000 0.10017 0.20134 0.30452 0.41075 0.52110
Solution: Since the values of x for which the derivatives are to be computed lie
near the beginning of the equally spaced table, we use the differentiation formulae
based on Newton’s forward difference interpolation formula. We first form the
finite difference table.

x        y          Δy        Δ²y       Δ³y      Δ⁴y
0. 0 0.00000
0.10017
0.05 0.10017 100
0.10117 101
0.10 0.20134 201 3
0.10318 104
0.15 0.30452 305 3
0.10623 107
0.20 0.41075 412
0.11035
0.25 0.52110
y″(0) = (1/h²)[Δ²y_0 − Δ³y_0 + (11/12)Δ⁴y_0]
      = (1/(0.05)²)[0.00100 − 0.00101 + (11/12)(0.00003)]
      = 0.007
For evaluating y′(0.02) and y″(0.02), we use the following formulae, with

u = (0.02 − 0.00)/0.05 = 0.4

y′(0.02) = (1/h)[Δy_0 + ((2u − 1)/2)Δ²y_0 + ((3u² − 6u + 2)/6)Δ³y_0 + ((2u³ − 9u² + 11u − 3)/12)Δ⁴y_0]

y″(0.02) = (1/h²)[Δ²y_0 + (6(u − 1)/6)Δ³y_0 + ((6u² − 18u + 11)/12)Δ⁴y_0]

∴ y′(0.02) = (1/0.05)[0.10017 + ((2 × 0.4 − 1)/2)(0.00100) + ((3 × (0.4)² − 6 × 0.4 + 2)/6)(0.00101) + ((2 × (0.4)³ − 9 × (0.4)² + 11 × 0.4 − 3)/12)(0.00003)]
= (1/0.05)(0.10008)
= 2.0017

y″(0.02) = (1/(0.05)²)[0.00100 + (0.4 − 1)(0.00101) + ((6 × 0.16 − 18 × 0.4 + 11)/12)(0.00003)]
= (1/0.0025)(0.00041)
= 0.1624
Solution: We first form the finite difference table,
x      f(x)      Δf(x)      Δ²f(x)     Δ³f(x)
6.0    0.1750
                 0.0248
6.1    0.1998               –0.0023
                 0.0225                 –0.0003
6.2    0.2223               –0.0026
                 0.0199                  0.0001
6.3    0.2422               –0.0025
                 0.0174
6.4    0.2596
f′(x_0) = (1/h)[Δf_0 − (1/2)Δ²f_0 + (1/3)Δ³f_0]

∴ f′(6.0) = (1/0.1)[0.0248 − (1/2)(−0.0023) + (1/3)(−0.0003)]
= 10[0.0248 + 0.00115 − 0.0001]
= 0.2585
For evaluating f″(6.3), we use the formula obtained by differentiating Newton's backward difference interpolation formula. It is given by,

f″(x_n) = (1/h²)[∇²f_n + ∇³f_n]

∴ f″(6.3) = (1/(0.1)²)[−0.0026 − 0.0003] = −0.29
Example 3.5: Compute the values of y′(1.00) and y″(1.00) using suitable numerical differentiation formulae on the following table of values of x and y:
Solution: For computing the derivatives, we use the formulae derived on differ-
entiating Newton’s forward difference interpolation formula, given by
f′(x_0) = (1/h)[Δy_0 − (1/2)Δ²y_0 + (1/3)Δ³y_0 − (1/4)Δ⁴y_0 + ...]

f″(x_0) = (1/h²)[Δ²y_0 − Δ³y_0 + (11/12)Δ⁴y_0 − ...]
Now, we form the finite difference table.
x       y          Δy        Δ²y        Δ³y       Δ⁴y
1.00   1.00000
                  0.02470
1.05   1.02470               –0.00059
                  0.02411                 0.00005
1.10   1.04881               –0.00054               –0.00002
                  0.02357                 0.00003
1.15   1.07238               –0.00051
                  0.02306
1.20   1.09544
Example 3.6: Find f′(x) and f′(0.5) for the function tabulated below:

x:    0  1  2   3
f(x): 1  3  15  40

Solution: Since the values of x are equally spaced, we use Newton's forward difference interpolating polynomial for finding f′(x) and f′(0.5). We first form the
finite difference table as given below:
x f ( x ) f ( x) 2 f ( x) 3 f ( x )
0 1
2
1 3 10
12 3
2 15 13
25
3 40
Taking x_0 = 0, we have u = (x − x_0)/h = x. Thus Newton's forward difference interpolation gives,

f ≈ f_0 + uΔf_0 + (u(u − 1)/2!)Δ²f_0 + (u(u − 1)(u − 2)/3!)Δ³f_0

i.e., f(x) = 1 + 2x + 10 × x(x − 1)/2 + 3 × x(x − 1)(x − 2)/6

or, f(x) = 1 − 2x + (7/2)x² + (1/2)x³

∴ f′(x) = −2 + 7x + (3/2)x²

and, f′(0.5) = −2 + 7 × 0.5 + (3/2)(0.5)² = 1.875
Example 3.7: The population of a city is given in the following table. Find the rate of growth in population in the year 2001 and in 1995.
Year x 1961 1971 1981 1991 2001
Population y 40.62 60.80 79.95 103.56 132.65
Solution: Since the rate of growth of the population is dy/dx, we have to compute dy/dx at x = 2001 and at x = 1995. For this we consider the formula for the derivative on approximating y by Newton's backward difference interpolation, given by

dy/dx = (1/h)[∇y_n + ((2u + 1)/2)∇²y_n + ((3u² + 6u + 2)/6)∇³y_n + ((2u³ + 9u² + 11u + 3)/12)∇⁴y_n + ...]

where u = (x − x_n)/h.

For this we construct the finite difference table as given below:
For this we construct the finite difference table as given below:
x      y        ∇y       ∇²y      ∇³y      ∇⁴y
1961   40.62
               20.18
1971   60.80            –1.03
               19.15              5.49
1981   79.95             4.46              –4.47
               23.61              1.02
1991   103.56            5.48
               29.09
2001   132.65
For x = 2001, u = (x − x_n)/h = 0

(dy/dx) at 2001 = (1/10)[29.09 + (1/2)(5.48) + (1/3)(1.02) + (1/4)(−4.47)]
= 3.105

For x = 1995, u = (1995 − 1991)/10 = 0.4
f(x) ≈ φ(s) = f_0 + sΔf_0 + (s(s − 1)/2!)Δ²f_0 + ...   (3.11)

where s = (x − x_0)/h.

Replacing f(x) by φ(s) in Equation (3.9), we get

∫_{x_0}^{x_n} f(x) dx = h ∫_0^n [f_0 + sΔf_0 + (s(s − 1)/2!)Δ²f_0 + ...] ds

since when x = x_0, s = 0 and when x = x_n, s = n, and dx = h ds.

Performing the integration on the RHS we have,

∫_{x_0}^{x_n} f(x) dx = h[n f_0 + (n²/2)Δf_0 + (1/2!)(n³/3 − n²/2)Δ²f_0 + (1/3!)(n⁴/4 − n³ + n²)Δ³f_0 + (1/24)(n⁵/5 − 3n⁴/2 + 11n³/3 − 3n²)Δ⁴f_0 + ...]   (3.12)
We can derive different integration formulae by taking particular values of
n = 1, 2, 3, .... Again, on replacing the differences, the Newton-Cotes formula can
be expressed in terms of the function values at x0, x1,..., xn, as
∫_{x_0}^{x_n} f(x) dx = h Σ_{k=0}^{n} c_k f(x_k)   (3.13)
[Figure: the area under y = f(x) between x_0 and x_1 approximated by the trapezium on the chord joining (x_0, f_0) and (x_1, f_1)]
Trapezoidal Rule

For evaluating the integral ∫_{x_0}^{x_n} f(x) dx, we have to sum the integrals for each of the n sub-intervals. Thus,

∫_{x_0}^{x_n} f(x) dx = (h/2)[f_0 + 2(f_1 + f_2 + ... + f_{n-1}) + f_n]   (3.17)
This is known as the trapezoidal rule of numerical integration.

The error in the trapezoidal rule is,

E_T^n = ∫_{x_0}^{x_n} f(x) dx − (h/2)[f_0 + 2(f_1 + f_2 + ... + f_{n-1}) + f_n]
      = −(h³/12)[f″(ξ_1) + f″(ξ_2) + ... + f″(ξ_n)]

where x_0 < ξ_1 < x_1, x_1 < ξ_2 < x_2, ..., x_{n-1} < ξ_n < x_n.

Thus, we can write

E_T^n = −(h³/12)[n f″(ξ)], f″(ξ) being the mean of f″(ξ_1), f″(ξ_2), ..., f″(ξ_n)
      = −(h²/12) nh f″(ξ)

i.e., E_T^n = −((b − a)h²/12) f″(ξ), since nh = b − a, where a = x_0, b = x_n.
Algorithm: Evaluation of ∫_a^b f(x) dx by trapezoidal rule.
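The algorithm body is not reproduced in this excerpt; as a sketch, the composite rule (3.17) in Python (function name illustrative):

```python
def trapezoidal(f, a, b, n):
    """Composite trapezoidal rule: (h/2)[f0 + 2(f1 + ... + f_{n-1}) + fn]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * total

# Applied to 1/(1+x) on [0, 1] with n = 10, as in Example 3.13 later in the unit
approx = trapezoidal(lambda x: 1 / (1 + x), 0, 1, 10)  # ≈ 0.693771
```

The O(h²) error term derived above explains why halving h roughly quarters the error of this rule.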
[Figure: a parabola y = f(x) fitted through the ordinates at x_0, x_1, x_2]
∫_{x_0}^{x_{2m}} f(x) dx = ∫_{x_0}^{x_2} f(x) dx + ∫_{x_2}^{x_4} f(x) dx + ... + ∫_{x_{2m-2}}^{x_{2m}} f(x) dx

= (h/3)[(f_0 + 4f_1 + f_2) + (f_2 + 4f_3 + f_4) + (f_4 + 4f_5 + f_6) + ... + (f_{2m-2} + 4f_{2m-1} + f_{2m})]

∴ ∫_a^b f(x) dx = (h/3)[f_0 + 4(f_1 + f_3 + f_5 + ... + f_{2m-1}) + 2(f_2 + f_4 + f_6 + ... + f_{2m-2}) + f_{2m}]
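The composite one-third rule can be sketched in Python (function name illustrative); applied to f(x) = x³ − 2x² + 1 on [0, 4] with n = 4 it gives the exact value 76/3, as in Example 3.11 later in the unit.

```python
def simpson(f, a, b, n):
    """Composite Simpson's one-third rule; n must be even."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    odd = sum(f(a + i * h) for i in range(1, n, 2))   # weight 4
    even = sum(f(a + i * h) for i in range(2, n, 2))  # weight 2
    return h / 3 * (f(a) + f(b) + 4 * odd + 2 * even)

val = simpson(lambda x: x ** 3 - 2 * x ** 2 + 1, 0, 4, 4)
```

Since the rule's error involves the fourth derivative, it is exact for any cubic, which is why the example below reproduces the exact integral.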
∫_{x_0}^{x_3} f(x) dx = h ∫_0^3 [y_0 + uΔy_0 + (u(u − 1)/2)Δ²y_0 + (u(u − 1)(u − 2)/6)Δ³y_0] du

= h[3y_0 + (9/2)Δy_0 + (9/4)Δ²y_0 + (3/8)Δ³y_0]

= h[3y_0 + (9/2)(y_1 − y_0) + (9/4)(y_2 − 2y_1 + y_0) + (3/8)(y_3 − 3y_2 + 3y_1 − y_0)]

∴ ∫_{x_0}^{x_3} f(x) dx = (3h/8)(y_0 + 3y_1 + 3y_2 + y_3)
(3.21)

The truncation error in this formula is −(3h⁵/80) f⁽ⁱᵛ⁾(ξ), x_0 < ξ < x_3.

This formula is known as Simpson's three-eighth formula of numerical integration.
As in the case of Simpson’s one-third rule, we can write Simpson’s three-eighth
rule of numerical integration as,
∫_a^b f(x) dx = (3h/8)[y_0 + 3y_1 + 3y_2 + 2y_3 + 3y_4 + 3y_5 + 2y_6 + ... + 2y_{3m-3} + 3y_{3m-2} + 3y_{3m-1} + y_{3m}]   (3.22)
where h = (b–a)/(3m); for m = 1, 2,...
i.e., the interval (b–a) is divided into 3m number of sub-intervals.
The rule in Equation (3.22) can be rewritten as,
∫_a^b f(x) dx = (3h/8)[y_0 + y_{3m} + 3(y_1 + y_2 + y_4 + y_5 + ... + y_{3m-2} + y_{3m-1}) + 2(y_3 + y_6 + ... + y_{3m-3})]   (3.23)
The truncation error in Simpson's three-eighth rule is −(3h⁴/240)(b − a) f⁽ⁱᵛ⁾(ξ), x_0 < ξ < x_{3m}.
This formula takes a very simple form if the coefficient 41/140 of the last term Δ⁶y_0 is replaced by 42/140 = 3/10. The error in the formula will then have an additional term −(1/140)Δ⁶y_0. The above formula then becomes,

∫_{x_0}^{x_6} y dx = h[6y_0 + 18Δy_0 + 27Δ²y_0 + 24Δ³y_0 + (123/10)Δ⁴y_0 + (33/10)Δ⁵y_0 + (3/10)Δ⁶y_0]

i.e., ∫_{x_0}^{x_6} y dx = (3h/10)[y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + y_6]   (3.24)

On replacing the differences in terms of the y_i's, this formula is known as Weddle's formula.
The error in Weddle's formula is −(h⁷/140) y⁽ᵛⁱ⁾(ξ)   (3.25)
Weddle's rule is the composite form of Weddle's formula, used when the number of sub-intervals is a multiple of 6. One can apply Weddle's rule of numerical integration by sub-dividing the interval (b − a) into 6m sub-intervals, m being a positive integer. Weddle's rule is,
∫_a^b f(x) dx = (3h/10)[y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + 2y_6 + 5y_7 + y_8 + 6y_9 + y_10 + 5y_11 + ... + 2y_{6m-6} + 5y_{6m-5} + y_{6m-4} + 6y_{6m-3} + y_{6m-2} + 5y_{6m-1} + y_{6m}]   (3.26)

where b − a = 6mh,

i.e., ∫_a^b f(x) dx = (3h/10)[y_0 + y_{6m} + 5(y_1 + y_5 + y_7 + y_11 + ... + y_{6m-5} + y_{6m-1}) + (y_2 + y_4 + y_8 + y_10 + ... + y_{6m-4} + y_{6m-2}) + 6(y_3 + y_9 + ... + y_{6m-3}) + 2(y_6 + y_12 + ... + y_{6m-6})]
The error in Weddle's rule is given by −(h⁶/840)(b − a) y⁽ᵛⁱ⁾(ξ)   (3.27)
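Weddle's rule (3.26) can be sketched as follows; the function name is illustrative. Since the error involves y⁽ᵛⁱ⁾, the rule is exact for polynomials of degree up to five.

```python
def weddle(f, a, b, m):
    """Weddle's rule over 6*m sub-intervals: (3h/10) times a weighted sum
    with the repeating weight pattern 5,1,6,1,5 and doubled block joins."""
    n = 6 * m
    h = (b - a) / n
    ys = [f(a + i * h) for i in range(n + 1)]
    pattern = [5, 1, 6, 1, 5]  # weights of y1..y5 inside each block of six
    total = ys[0] + ys[n]
    for block in range(m):
        base = 6 * block
        for j, w in enumerate(pattern, start=1):
            total += w * ys[base + j]
        if block > 0:
            total += 2 * ys[base]  # shared ordinate between adjacent blocks
    return 3 * h / 10 * total

# Exact for x^5: the integral over [0, 1] is 1/6
val = weddle(lambda x: x ** 5, 0.0, 1.0, 1)
```

For the same number of points Weddle's rule is noticeably more accurate than the trapezoidal rule, at the cost of requiring the sub-interval count to be a multiple of six.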
Example 3.8: Compute the approximate value of ∫_0^2 x⁴ dx by taking four sub-intervals.

Example 3.9: Evaluate ∫_0^1 (4x − 3x²) dx by taking n = 10 and using the following rules: (i) Trapezoidal rule and (ii) Simpson's one-third rule. (iii) Also compare them with the exact value and find the error in each case.

Solution: We tabulate f(x) = 4x − 3x², for x = 0, 0.1, 0.2, ..., 1.0.

x:    0.0  0.1   0.2   0.3   0.4   0.5   0.6   0.7   0.8   0.9   1.0
f(x): 0.0  0.37  0.68  0.93  1.12  1.25  1.32  1.33  1.28  1.17  1.0
Example 3.10: Evaluate ∫_0^1 e^(−x²) dx by (i) Simpson's one-third rule taking 10 sub-intervals and (ii) Trapezoidal rule.

Solution: (i) We tabulate values of e^(−x²) for the 11 points x = 0, 0.1, 0.2, 0.3, ..., 1.0 as given below.

x        e^(−x²)
0.0 1.00000
0.1 0.990050
0.2 0.960789
0.3 0.913931
0.4 0.852144
0.5 0.778801
0.6 0.697676
0.7 0.612626
0.8 0.527292
0.9 0.444854
1.0 0.367879
The required sums are f_0 + f_10 = 1.367879, f_1 + f_3 + f_5 + f_7 + f_9 = 3.740262 and f_2 + f_4 + f_6 + f_8 = 3.037901.
Hence, by Simpson's one-third rule we have,

∫_0^1 e^(−x²) dx = (h/3)[f_0 + f_10 + 4(f_1 + f_3 + f_5 + f_7 + f_9) + 2(f_2 + f_4 + f_6 + f_8)]
= (0.1/3)[1.367879 + 4 × 3.740262 + 2 × 3.037901]
= (0.1/3)[1.367879 + 14.961048 + 6.075802]
= (0.1/3)(22.404729)
= 0.746824
(ii) Using trapezoidal rule, we get

∫_0^1 e^(−x²) dx = (h/2)[f_0 + f_10 + 2(f_1 + f_2 + ... + f_9)]
= (0.1/2)[1.367879 + 2 × 6.778163]
= (0.1/2)(14.924205)
= 0.746210
Example 3.11: Compute the integral I = ∫_0^4 (x³ − 2x² + 1) dx using Simpson's one-third rule taking h = 1, and show that the computed value agrees with the exact value. Give reasons for this.
Solution: The values of f (x) = x3–2x2+1 are tabulated for x = 0, 1, 2, 3, 4 as
x 0 1 2 3 4
f ( x) 1 0 1 10 33
I = (1/3)[1 + 4 × 0 + 2 × 1 + 4 × 10 + 33] = 76/3 = 25 1/3

The exact value = 4⁴/4 − 2 × 4³/3 + 4 = 25 1/3
Thus, the computed value by Simpson’s one-third rule is equal to the exact
value. This is because the error in Simpson’s one-third rule contains the fourth
order derivative and so this rule gives the exact result when the integrand is a
polynomial of degree less than or equal to three.
Example 3.12: Compute ∫_{0.1}^{0.5} e^x dx by (i) Trapezoidal rule and (ii) Simpson's one-third rule and compare the results with the exact value, by taking h = 0.1.
third rule and compare the results with the exact value, by taking h = 0.1.
Solution: We tabulate the values of f (x) = ex for x = 0.1 to 0.5 with spacing
h = 0.1.
x:          0.1     0.2     0.3     0.4     0.5
f(x) = e^x: 1.1052  1.2214  1.3498  1.4918  1.6487
I_S = (0.1/3)[1.1052 + 1.6487 + 4(1.2214 + 1.4918) + 2 × 1.3498]
= (0.1/3)[2.7539 + 4 × 2.7132 + 2.6996]
= (0.1/3)(16.3063) = 0.5435
Example 3.13: Evaluate ∫_0^1 dx/(1 + x) by (i) Trapezoidal rule and (ii) Simpson's one-third rule taking 10 sub-intervals. Hence, (iii) find log_e 2 and compare it with the exact value.

Solution: (i) Using the trapezoidal rule,

∫_0^1 dx/(1 + x) = (0.1/2)[1.500000 + 2(3.4595391 + 2.7281745)]
= (0.1/2)[1.500000 + 12.3754272] = 0.6937714
(ii) Using Simpson's one-third rule, we get

∫_0^1 dx/(1 + x) = (h/3)[f_0 + f_10 + 4(f_1 + f_3 + ... + f_9) + 2(f_2 + f_4 + ... + f_8)]
= (0.1/3)[1.500000 + 4 × 3.4595391 + 2 × 2.7281745]
= (0.1/3)[1.5 + 13.838156 + 5.456349]
= (0.1/3)(20.794505) = 0.6931501
(iii) Exact value:

∫_0^1 dx/(1 + x) = log_e 2 = 0.6931472

The trapezoidal rule gives the value of the integral with an error 0.6931472 − 0.6937714 = −0.0006242, while the error in the value by Simpson's one-third rule is −0.0000029.
Example 3.14: Compute ∫_0^{π/2} √(1 − 0.162 sin²θ) dθ by (i) Simpson's rule and (ii) Weddle's formula.

Solution: On dividing the interval into six sub-intervals, the length of each sub-interval will be h = (1/6)(π/2) = 0.26179 = 15°. For computing the integral by Weddle's formula, we tabulate f(θ) = √(1 − 0.162 sin²θ).

θ:     0°    15°      30°      45°      60°      75°      90°
f(θ):  1.0   0.99455  0.97954  0.95864  0.93728  0.92133  0.91542
max over [1, 2] of |f⁽ⁱᵛ⁾(x)| = 24

Thus, (h⁴/180) × 24 ≤ 0.5 × 10⁻³, or h < 0.102.

But h has to be so chosen that the interval [1, 2] is divided into an even number of sub-intervals. Hence we may take h = 0.1 < 0.102, for which n = 10, i.e., there will be 10 sub-intervals.
The value of the integral is,
∫_1^2 dx/x = (0.1/3)[1/1.0 + 4(1/1.1 + 1/1.3 + 1/1.5 + 1/1.7 + 1/1.9) + 2(1/1.2 + 1/1.4 + 1/1.6 + 1/1.8) + 1/2]
= (0.1/3)[1.5 + 4 × 3.4595 + 2 × 2.7282]
= (0.1/3)(20.7944)
= 0.6931, which agrees with the exact value of log_e 2.
Interval Halving Technique
When the estimation of the truncation error is cumbersome, the method of interval
halving is used to compute an integral to the desired accuracy.
In the interval halving technique, an integral is first computed for some moderate value of h. Then, it is evaluated again for spacing h/2, i.e., with double the number of subdivisions. This requires the evaluation of the integrand at the new points of subdivision only; the previously computed function values with spacing h are reused. Now the difference between the integrals I_h and I_{h/2} is used to check the accuracy
Step 8: Set h = h/2
Step 9: Set x = a + h
Step 10: Set I2 = I2 + h × f(x)
Step 11: If x < b, set x = x + 2h and go to Step 10, else go to next step
Step 12: Go to Step 7
Step 13: Write I2, h
Step 14: End
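A sketch of the interval halving idea for the trapezoidal rule, reusing previous function values and stopping when |I_h − I_{h/2}| is small (names and stopping criterion are illustrative):

```python
def trapezoid_halving(f, a, b, tol=1e-6, max_halvings=20):
    """Interval halving: start from one trapezoid, halve h until the change
    between successive estimates drops below tol. Only the new midpoints are
    evaluated at each stage; the old sum is reused via I_new = I_old/2 + ..."""
    h = b - a
    I_old = h * (f(a) + f(b)) / 2
    for _ in range(max_halvings):
        n_sub = int(round((b - a) / h))  # current number of sub-intervals
        midsum = sum(f(a + (i + 0.5) * h) for i in range(n_sub))
        I_new = I_old / 2 + (h / 2) * midsum
        if abs(I_new - I_old) < tol:
            return I_new
        I_old, h = I_new, h / 2
    return I_old

val = trapezoid_halving(lambda x: x * x, 0.0, 1.0, tol=1e-8)  # ∫ x² dx on [0,1]
```

Each halving costs only as many new evaluations as there were old sub-intervals, which is what makes the technique cheap compared with recomputing from scratch.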
I = ∬_R f(x, y) dx dy   (3.28)

where R is the rectangular region a ≤ x ≤ b, c ≤ y ≤ d. The double integral can be transformed into a repeated integral in the following form,

I = ∫_a^b dx ∫_c^d f(x, y) dy   (3.29)

Writing F(x) = ∫_c^d f(x, y) dy,   (3.30)

considered as a function of x, we have

I = ∫_a^b F(x) dx   (3.31)
Now for numerical integration, we can divide the interval [a, b] into n sub-
intervals with spacing h and then use a suitable rule of numerical integration.
Trapezoidal Rule for Double Integral
By trapezoidal rule, we can write the integral Equation (3.31) as,

∫_a^b F(x) dx = (h/2)[F_0 + F_n + 2(F_1 + F_2 + F_3 + ... + F_{n-1})]   (3.32)

where x_0 = a, x_n = b, h = (b − a)/n, and

F_i = F(x_i) = ∫_c^d f(x_i, y) dy, x_i = a + ih   (3.33)

for i = 0, 1, 2, ..., n.

Each F_i can be evaluated by trapezoidal rule. For this, the interval [c, d] may be divided into m sub-intervals.

where h = (b − a)/n, n is even, and
F_i = F(x_i) = ∫_c^d f(x_i, y) dy, x_i = a + ih, for i = 0, 1, 2, ..., n   (3.37)

and x_0 = a and x_n = b.

For evaluating I, we have to evaluate each of the (n + 1) integrals given in Equation (3.37). For evaluation of F_i, we can use Simpson's one-third rule by dividing [c, d] into m sub-intervals. F_i can be written as,

F_i = (k/3)[f(x_i, y_0) + f(x_i, y_m) + 2{f(x_i, y_2) + f(x_i, y_4) + ... + f(x_i, y_{m-2})} + 4{f(x_i, y_1) + f(x_i, y_3) + ... + f(x_i, y_{m-1})}]   (3.38)

Equation (3.38) can be written in a compact notation as,

F_i = (k/3)[f_{i0} + f_{im} + 2(f_{i2} + f_{i4} + ... + f_{i,m-2}) + 4(f_{i1} + f_{i3} + ... + f_{i,m-1})]

where f_{ij} = f(x_i, y_j), j = 0, 1, 2, ..., m.
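The repeated use of Simpson's rule in (3.37)–(3.38) amounts to a product rule over the grid; a Python sketch (names illustrative) that reproduces the result of the worked example that follows:

```python
def simpson_weights(n):
    """Simpson weights 1, 4, 2, 4, ..., 4, 1 for n (even) sub-intervals."""
    w = [4 if i % 2 else 2 for i in range(n + 1)]
    w[0] = w[n] = 1
    return w

def double_simpson(f, a, b, c, d, n, m):
    """Repeated Simpson's one-third rule over the rectangle [a,b] x [c,d]."""
    h, k = (b - a) / n, (d - c) / m
    wx, wy = simpson_weights(n), simpson_weights(m)
    total = sum(wx[i] * wy[j] * f(a + i * h, c + j * k)
                for i in range(n + 1) for j in range(m + 1))
    return h * k / 9 * total

# x² + y² over 1 ≤ x ≤ 3, 1 ≤ y ≤ 2 with h = k = 0.5 gives 40/3 ≈ 13.333
val = double_simpson(lambda x, y: x * x + y * y, 1.0, 3.0, 1.0, 2.0, 4, 2)
```

Because Simpson's rule is exact for quadratics, the product rule evaluates this particular integrand without error.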
Example 3.16: Evaluate the double integral ∬_R (x² + y²) dx dy, where R is the rectangular region 1 ≤ x ≤ 3, 1 ≤ y ≤ 2, by Simpson's one-third rule taking h = k = 0.5.
Solution: We write the integral in the form of a repeated integral,

I = ∫_1^3 dx ∫_1^2 (x² + y²) dy

Taking n = 4 sub-intervals along x, so that h = (3 − 1)/4 = 0.5,

I = ∫_1^3 F(x) dx = (0.5/3)[F_0 + F_4 + 2F_2 + 4(F_1 + F_3)]

For evaluating the F_i's, we take k = (2 − 1)/2 = 0.5 and get,
F_0 = ∫_1^2 (1 + y²) dy = (0.5/3)[1² + 1² + 4{1² + (1.5)²} + 1² + 2²] = (0.5/3) × 20

F_1 = ∫_1^2 [(1.5)² + y²] dy = (0.5/3)[(1.5)² + 1² + 4{(1.5)² + (1.5)²} + (1.5)² + 2²] = (0.5/3) × 27.50

F_2 = ∫_1^2 (2² + y²) dy = (0.5/3)[2² + 1² + 4{2² + (1.5)²} + 2² + 2²] = (0.5/3) × 38

F_3 = ∫_1^2 [(2.5)² + y²] dy = (0.5/3)[(2.5)² + 1² + 4{(2.5)² + (1.5)²} + (2.5)² + 2²] = (0.5/3) × 51.50

F_4 = ∫_1^2 (3² + y²) dy = (0.5/3)[3² + 1² + 4{3² + (1.5)²} + 3² + 2²] = (0.5/3) × 68

∴ I = (0.25/9)[20 + 68 + 2 × 38 + 4(27.50 + 51.50)]
= (0.25/9) × 480 = 13.333
Example 3.17: Compute ∬_R (x² + y²) dx dy by trapezoidal rule with h = 0.5.

Solution: I_T = ∫_1^3 F(x) dx = (0.5/2)[F_0 + F_4 + 2(F_1 + F_2 + F_3)]

where F_i = F(x_i) = ∫_1^2 (x_i² + y²) dy, x_i = 1 + 0.5i, i = 0, 1, 2, 3, 4.
Thus, F_0 = ∫_1^2 (1 + y²) dy ≈ (0.5/2) [(1 + 1²) + 2{1 + (1.5)²} + (1 + 2²)]
          = (0.5/2) × 13.50 = 3.375

F_1 = ∫_1^2 [(1.5)² + y²] dy ≈ (0.5/2) [((1.5)² + 1²) + 2{(1.5)² + (1.5)²} + ((1.5)² + 2²)]
    = (0.5/2) × 18.50 = 4.625

F_2 = ∫_1^2 [2² + y²] dy ≈ (0.5/2) [(2² + 1²) + 2{2² + (1.5)²} + (2² + 2²)]
    = (0.5/2) × 25.50 = 6.375

F_3 = ∫_1^2 [(2.5)² + y²] dy ≈ (0.5/2) [((2.5)² + 1²) + 2{(2.5)² + (1.5)²} + ((2.5)² + 2²)]
    = (0.5/2) × 34.50 = 8.625

F_4 = ∫_1^2 [3² + y²] dy ≈ (0.5/2) [(3² + 1²) + 2{3² + (1.5)²} + (3² + 2²)]
    = (0.5/2) × 45.50 = 11.375

∴ I_T ≈ (0.5/2) [3.375 + 11.375 + 2(4.625 + 6.375 + 8.625)]

      = (1/4) [14.750 + 2 × 19.625]

      = (1/4) [14.750 + 39.250] = (1/4) × 54 = 13.5
Example 3.18: Evaluate the double integral ∫_1^2 ∫_1^2 dx dy/(x + y) using the
trapezoidal rule with sub-intervals of length h = k = 0.5.

Solution: Let f(x, y) = 1/(x + y). The grid points are x = 1, 1.5, 2 and y = 1, 1.5, 2.

I ≈ (0.5 × 0.5/4) [f(1, 1) + f(2, 1) + f(1, 2) + f(2, 2) + 2{f(1.5, 1) + f(1, 1.5)
    + f(2, 1.5) + f(1.5, 2)} + 4 f(1.5, 1.5)]

  = (1/16) [(1/2 + 1/3 + 1/3 + 1/4) + 2(2/5 + 2/5 + 2/7 + 2/7) + 4 × 1/3]

  = (1/16) [1.416667 + 2.742857 + 1.333333]

  = (1/16) × 5.492857

  = 0.343304.
Example 3.19: Evaluate ∫_1^2 ∫_1^2 dx dy/(x + y) by Simpson's one-third rule,
taking sub-intervals of length h = k = 0.5.

Solution: With f(x, y) = 1/(x + y),

I ≈ (0.5 × 0.5/9) [f(1, 1) + f(2, 1) + f(1, 2) + f(2, 2) + 4{f(1, 1.5) + f(1.5, 1)
    + f(2, 1.5) + f(1.5, 2)} + 16 f(1.5, 1.5)]

  = (1/36) [(1/2 + 1/3 + 1/3 + 1/4) + 4(2/5 + 2/5 + 2/7 + 2/7) + 16/3]

  = (1/36) × 12.235714 = 0.339880
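The repeated-integral evaluations above lend themselves to a short program. The following sketch (our illustration, not part of the original text; the function names are ours) applies the composite Simpson's one-third rule in both directions and reproduces the results of Examples 3.16 and 3.19.

```python
def simpson_weights(m):
    # Composite Simpson's one-third weights for m sub-intervals (m even): 1,4,2,4,...,4,1
    w = [4 if i % 2 else 2 for i in range(m + 1)]
    w[0] = w[m] = 1
    return w

def simpson_double(f, a, b, c, d, n, m):
    """Repeated Simpson's one-third rule over [a,b] x [c,d] with n, m (even) sub-intervals."""
    h, k = (b - a) / n, (d - c) / m
    wx, wy = simpson_weights(n), simpson_weights(m)
    total = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            total += wx[i] * wy[j] * f(a + i * h, c + j * k)
    return h * k / 9.0 * total

# Example 3.16: x^2 + y^2 over 1<=x<=3, 1<=y<=2, h = k = 0.5 -> 13.333 (exact 40/3)
I1 = simpson_double(lambda x, y: x * x + y * y, 1, 3, 1, 2, 4, 2)
# Example 3.19: 1/(x + y) over 1<=x<=2, 1<=y<=2, h = k = 0.5 -> 0.339880
I2 = simpson_double(lambda x, y: 1.0 / (x + y), 1, 2, 1, 2, 2, 2)
```

Since the integrand of Example 3.16 is quadratic, Simpson's rule reproduces the exact value 40/3 there.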
3.3.4 Gaussian Quadrature
We have seen that the Newton-Cotes formula of numerical integration is of the form,

∫_a^b f(x) dx ≈ Σ_{i=0}^n c_i f(x_i)   (3.39)

where x_i = a + ih, i = 0, 1, 2, ..., n; h = (b - a)/n
This formula uses function values at equally spaced points and gives the exact
result for f (x) being a polynomial of degree less than or equal to n. Gaussian
quadrature formula is similar to Equation (3.39) given by,
∫_{-1}^1 F(u) du ≈ Σ_{i=1}^n w_i F(u_i)   (3.40)

where the w_i's and u_i's, called weights and abscissae respectively, are derived such
that the above Equation (3.40) gives the exact result for F(u) being a polynomial of
degree less than or equal to 2n - 1.
In Newton-Cotes Equation (3.39), the coefficients ci and the abscissae xi are
rational numbers but the weights wi and the abscissae ui are usually irrational
numbers. Even though Gaussian quadrature formula gives the integration of F(u)
between the limits –1 to +1, we can use it to find the integral of f (x) from a to b
by a simple transformation given by,
x = ((b - a)/2) u + (a + b)/2   (3.41)

Evidently, the limits for u become -1 to 1 corresponding to x = a to b, and writing,

f(x) = f(((b - a)/2) u + (a + b)/2) = F(u)

we have,

∫_a^b f(x) dx = ((b - a)/2) ∫_{-1}^1 F(u) du   (3.42)
It can be shown that the ui are the zeros of the Legendre polynomial Pn(u) of
degree n. These roots are real but irrational and the weights are also irrational.
Given below is a simple formulation of the relevant equations to determine ui
and w_i. Let F(u) be a polynomial of the form,

F(u) = Σ_{k=0}^{2n-1} a_k u^k   (3.43)

Then, on direct integration, since ∫_{-1}^1 u^k du vanishes for odd k and equals
2/(k + 1) for even k,

∫_{-1}^1 F(u) du = 2 [a_0 + a_2/3 + a_4/5 + ... + a_{2n-2}/(2n - 1)]   (3.45)

while Equation (3.40) gives,

∫_{-1}^1 F(u) du ≈ Σ_{i=1}^n w_i Σ_{k=0}^{2n-1} a_k u_i^k

             = Σ_{i=1}^n w_i (a_0 + a_1 u_i + a_2 u_i² + ... + a_{2n-1} u_i^{2n-1})   (3.46)
Equations (3.45) and (3.46) are assumed to be identical for all polynomials of degree
less than or equal to 2n - 1 and hence, equating the coefficients of a_k on either side,
we obtain the following 2n equations for the 2n unknowns w_1, w_2,..., w_n and
u_1, u_2,..., u_n.
Σ_{i=1}^n w_i = 2,  Σ_{i=1}^n w_i u_i = 0,  Σ_{i=1}^n w_i u_i² = 2/3, ...,  Σ_{i=1}^n w_i u_i^{2n-1} = 0   (3.47)
The solution of Equations (3.47) is quite complicated. However, the use of Legendre
polynomials makes the labour unnecessary. It can be shown that the abscissae u_i
are the zeros of the Legendre polynomial P_n(x) of degree n. The weights w_i can
then be easily determined by solving the first n of Equations (3.47). As
an illustration, we take n = 2. The four equations for u_1, u_2, w_1 and w_2 are,

w_1 + w_2 = 2
w_1 u_1 + w_2 u_2 = 0
w_1 u_1² + w_2 u_2² = 2/3
w_1 u_1³ + w_2 u_2³ = 0

Eliminating w_1, w_2 from the second and fourth equations, we get

w_1/w_2 = -u_2/u_1 = -u_2³/u_1³

so that u_2 = -u_1, and the first two equations then give w_1 = w_2 = 1. The third
equation gives 2u_1² = 2/3, so that u_1 = -1/√3, u_2 = 1/√3.

Hence, the two-point Gauss-Legendre quadrature formula is,

∫_{-1}^1 F(u) du ≈ F(-1/√3) + F(1/√3)
Table 3.1 gives the abscissae and weights of the Gauss-Legendre quadrature
for values of n from 2 to 6.

Table 3.1 Values of Weights and Abscissae for Gauss-Legendre Quadrature

It is seen that the abscissae are symmetrical with respect to the origin, and
abscissae placed symmetrically about the origin have equal weights.
Example 3.20: Compute ∫_0^2 (1 + x) dx by the Gauss two-point quadrature formula.

Solution: Substituting x = u + 1, the given integral reduces to

I = ∫_{-1}^1 (u + 2) du

Using the two-point Gauss quadrature formula, we have

I = (0.57735027 + 2) + (-0.57735027 + 2) = 4.0
As expected, the result is equal to the exact value of the integral.
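The two-point formula above, and the three-point formula used in Example 3.22 below, can be coded directly. The sketch that follows is our own illustration (the names are ours); the nodes and weights are the standard Gauss-Legendre values.

```python
import math

# Standard Gauss-Legendre abscissae and weights on [-1, 1]
GAUSS = {
    2: ([-1 / math.sqrt(3.0), 1 / math.sqrt(3.0)], [1.0, 1.0]),
    3: ([-math.sqrt(0.6), 0.0, math.sqrt(0.6)], [5 / 9, 8 / 9, 5 / 9]),
}

def gauss_legendre(f, a, b, n=2):
    """Integrate f over [a, b] via the transformation x = ((b-a)/2) u + (a+b)/2."""
    u, w = GAUSS[n]
    half, mid = (b - a) / 2.0, (a + b) / 2.0
    return half * sum(wi * f(half * ui + mid) for ui, wi in zip(u, w))

# Example 3.20: integral of (1 + x) over [0, 2]; exact value 4
I20 = gauss_legendre(lambda x: 1 + x, 0, 2, n=2)
# Example 3.22: integral of 1/(1 + x) over [0, 1]; three-point value 0.693122
I22 = gauss_legendre(lambda x: 1.0 / (1 + x), 0, 1, n=3)
```

The two-point rule is exact for polynomials of degree up to 3, which is why I20 equals the exact value 4.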
Example 3.21: Show that the composite Gauss two-point quadrature formula for
evaluating ∫_a^b f(x) dx over N sub-intervals is

∫_a^b f(x) dx ≈ (h/2) Σ_{i=0}^{N-1} [f(r_i) + f(s_i)]

where r_i = x_i + ph, s_i = x_i + (1 - p)h, p = (3 - √3)/6.

Solution: We subdivide the interval [a, b] into N sub-intervals, each of length h,
given by h = (b - a)/N. Applying the two-point Gauss formula to the sub-interval
[x_i, x_{i+1}] after transforming it to [-1, 1],

I_i = ∫_{x_i}^{x_{i+1}} f(x) dx ≈ (h/2) [f(x_i + h/2 - h/(2√3)) + f(x_i + h/2 + h/(2√3))]

    = (h/2) [f(r_i) + f(s_i)]

where r_i = x_i + ph, s_i = x_i + (1 - p)h, p = (3 - √3)/6

Hence, ∫_a^b f(x) dx = Σ_{i=0}^{N-1} I_i ≈ (h/2) Σ_{i=0}^{N-1} [f(r_i) + f(s_i)]

Note: Instead of considering the Gauss integration formula for more and more
points for better accuracy, one can use the two-point composite formula with a
larger number of sub-intervals.
Example 3.22: Evaluate the following integral by the Gauss three-point quadrature
formula:

I = ∫_0^1 dx/(1 + x)

Solution: We first transform the interval [0, 1] to the interval (-1, 1) by substituting
t = 2x - 1, so that

∫_0^1 dx/(1 + x) = ∫_{-1}^1 dt/(t + 3)

Now by the Gauss three-point quadrature we have, with F(t) = 1/(t + 3),

I ≈ (1/9) [8 F(0) + 5 F(-0.77459667) + 5 F(0.77459667)]

∴ I ≈ 0.693122

The exact value is ∫_0^1 dx/(1 + x) = ln 2 = 0.693147

∴ Error = 0.000025
Romberg’s Procedure
This procedure is used to find a better estimate of an integral using the evaluation
of the integral for two values of the width of the sub-intervals.
Let I_1 and I_2 be the values of an integral I = ∫_a^b f(x) dx, computed with two
different widths h_1 and h_2 of the sub-intervals using the trapezoidal rule. Let
E_1 and E_2 be the corresponding truncation errors. Since the error in the
trapezoidal rule is of the order of h², we can write,

I ≈ I_1 + K h_1² and I ≈ I_2 + K h_2², where K is approximately the same.

∴ I_1 + K h_1² = I_2 + K h_2²

K = (I_1 - I_2)/(h_2² - h_1²)

Thus, I ≈ I_1 + ((I_1 - I_2)/(h_2² - h_1²)) h_1² = (I_1 h_2² - I_2 h_1²)/(h_2² - h_1²)
In the Romberg procedure, we take h_2 = h_1/2 and we then have,

I ≈ (I_1 (h_1/2)² - I_2 h_1²)/((h_1/2)² - h_1²) = (4I_2 - I_1)/3

or, I ≈ I_2 + (I_2 - I_1)/3
This is known as Romberg's formula for trapezoidal integration.

The use of the Romberg procedure gives a better estimate of the integral without
any further function evaluation. Moreover, the evaluation of I_2 with h/2 reuses the
function values required in the evaluation of I_1.
Example 3.23: Evaluate I = ∫_0^1 dx/(1 + x) by the trapezoidal rule with h_1 = 0.5
and h_2 = 0.25, and then use the Romberg procedure for a better estimate of I.
Compare the result with the exact value.

Solution: We tabulate the values of x and y = 1/(1 + x) with h = 0.25.

x    0      0.25     0.5      0.75     1.0
y    1      0.8      0.6667   0.5714   0.5

I_1 = (0.5/2) [1 + 0.5 + 2 × 0.6667] = 0.7084

I_2 = (0.25/2) [1 + 0.5 + 2 × (0.8 + 0.6667 + 0.5714)] = 0.6970

By the Romberg procedure,

I ≈ I_2 + (I_2 - I_1)/3 = 0.6970 + (1/3)(-0.0114)

  = 0.6970 - 0.0038 = 0.6932

which agrees with the exact value ln 2 = 0.693147 to three decimal places.
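The Romberg step of Example 3.23 can be checked with a short program (our sketch; the names are illustrative):

```python
def trapezoid(f, a, b, n):
    # Composite trapezoidal rule with n sub-intervals of width (b - a)/n
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return h * s

def romberg_step(I1, I2):
    # Better estimate from trapezoidal values with widths h and h/2
    return I2 + (I2 - I1) / 3.0

f = lambda x: 1.0 / (1.0 + x)
I1 = trapezoid(f, 0.0, 1.0, 2)    # h1 = 0.5  -> about 0.7083
I2 = trapezoid(f, 0.0, 1.0, 4)    # h2 = 0.25 -> about 0.6970
I = romberg_step(I1, I2)          # about 0.6933, against ln 2 = 0.693147...
```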
Example 3.25: Compute the value of ∫_0^1 dx/(1 + x),

(i) By the Gauss two-point quadrature formula, and
(ii) By the Gauss three-point quadrature formula.

Solution: Substituting x = (1 + t)/2, so that dx = dt/2 and 1 + x = (3 + t)/2, we get

∫_0^1 dx/(1 + x) = ∫_{-1}^1 dt/(3 + t)

(i) By the Gauss two-point quadrature ∫_{-1}^1 F(t) dt ≈ F(-1/√3) + F(1/√3), we get

I ≈ 1/(3 - 1/√3) + 1/(3 + 1/√3) = 0.6923

(ii) By the Gauss three-point quadrature,

I ≈ 0.88888889 × (1/3) + 0.55555556 × [1/(3 - 0.77459667) + 1/(3 + 0.77459667)]

  = 0.693122
Example 3.26: Compute ∫_1^2 e^x dx by the Gauss three-point quadrature.

Solution: Substituting x = (3 + t)/2, dx = dt/2, the integral becomes

I = (1/2) ∫_{-1}^1 e^{(3+t)/2} dt

  ≈ (1/2) [0.88888889 e^{3/2} + 0.55555556 (e^{(3-0.77459667)/2} + e^{(3+0.77459667)/2})]

  = 4.67077

which agrees with the exact value e² - e = 4.67077 to five decimal places.
y'''(x_0) = f_xx(x_0, y_0) + 2 f_xy(x_0, y_0) y'(x_0) + f_yy(x_0, y_0) {y'(x_0)}² + f_y(x_0, y_0) y''(x_0)

y(x) = y(0) + x y'(0) + (x²/2) y''(0) + (x³/3!) y'''(0) + (x⁴/4!) y⁽⁴⁾(0) + (x⁵/5!) y⁽⁵⁾(0)

     = 1 + x + x²/2 + (2/3!) x³ + (3/4!) x⁴ + (6/5!) x⁵ = 1 + x + x²/2 + x³/3 + x⁴/8 + x⁵/20

∴ y(0.1) = 1 + 0.1 + 0.01/2 + 0.001/3 + 0.0001/8 + 0.00001/20 = 1.1053
For the equation y' = x² + y², y(0) = 0, successive differentiation gives:

y'' = 2x + 2yy',   y''(0) = 0
y''' = 2 + 2[yy'' + (y')²],   y'''(0) = 2
y⁽⁴⁾ = 2(yy''' + 3y'y''),   y⁽⁴⁾(0) = 0
y⁽⁵⁾ = 2[yy⁽⁴⁾ + 4y'y''' + 3(y'')²],   y⁽⁵⁾(0) = 0
y⁽⁶⁾ = 2[yy⁽⁵⁾ + 5y'y⁽⁴⁾ + 10y''y'''],   y⁽⁶⁾(0) = 0
y⁽⁷⁾ = 2[yy⁽⁶⁾ + 6y'y⁽⁵⁾ + 15y''y⁽⁴⁾ + 10(y''')²],   y⁽⁷⁾(0) = 80

The Taylor series up to two terms is y(x) ≈ (2/3!) x³ + (80/7!) x⁷ = x³/3 + x⁷/63
Example 3.29: Given xy' = x - y², y(2) = 1, evaluate y(2.1), y(2.2) and y(2.3)
correct to four decimal places using the Taylor series method.

Solution: Given xy' = x - y², i.e., y' = 1 - y²/x, and y = 1 for x = 2. To compute
y(2.1) by the Taylor series method, we first find the derivatives of y at x = 2.

y' = 1 - y²/x  ⟹  y'(2) = 1 - 1/2 = 0.5

Differentiating xy' = x - y²,

xy'' + y' = 1 - 2yy'

⟹ 2y''(2) + 0.5 = 1 - 2 × 1 × 0.5 = 0, so that y''(2) = -0.25

Differentiating again,

xy''' + 2y'' = -2(y')² - 2yy''

⟹ 2y'''(2) + 2 × (-0.25) = -2 × (0.5)² - 2 × 1 × (-0.25) = -0.5 + 0.5 = 0

Or, y'''(2) = 0.25

Differentiating once more,

xy⁽⁴⁾ + 3y''' = -6y'y'' - 2yy'''

⟹ 2y⁽⁴⁾(2) + 3 × 0.25 = -6 × 0.5 × (-0.25) - 2 × 1 × 0.25 = 0.75 - 0.5 = 0.25

∴ y⁽⁴⁾(2) = -0.25

∴ y(2.1) = y(2) + 0.1 y'(2) + ((0.1)²/2) y''(2) + ((0.1)³/3!) y'''(2) + ((0.1)⁴/4!) y⁽⁴⁾(2)

         = 1 + 0.1 × 0.5 + (0.01/2)(-0.25) + (0.001/6)(0.25) + (0.0001/24)(-0.25)

         = 1 + 0.05 - 0.00125 + 0.0000417 - 0.0000010

         = 1.0488
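The derivative recurrences of Example 3.29 translate directly into code. The following sketch (ours; the recurrences are the ones derived above for xy' = x - y²) evaluates the fourth order Taylor polynomial at x0 + h:

```python
def taylor_step(x0, y0, h):
    """One step of the 4th order Taylor series method for x y' = x - y^2."""
    d1 = 1 - y0 ** 2 / x0                             # y' = 1 - y^2/x
    d2 = (1 - 2 * y0 * d1 - d1) / x0                  # x y'' + y' = 1 - 2 y y'
    d3 = (-2 * d1 ** 2 - 2 * y0 * d2 - 2 * d2) / x0   # x y''' + 2 y'' = -2(y')^2 - 2 y y''
    d4 = (-6 * d1 * d2 - 2 * y0 * d3 - 3 * d3) / x0   # x y'''' + 3 y''' = -6 y' y'' - 2 y y'''
    return y0 + h * d1 + h**2 / 2 * d2 + h**3 / 6 * d3 + h**4 / 24 * d4

y21 = taylor_step(2.0, 1.0, 0.1)   # y(2.1) = 1.0488, as in Example 3.29
```

Further values y(2.2) and y(2.3) follow by applying taylor_step again from the point just computed.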
The integral contains the unknown function y(x), and it is not possible to
integrate it directly. In Picard's method, the first approximate solution y⁽¹⁾(x) is
obtained by replacing y(x) by y_0:

y⁽¹⁾(x) = y_0 + ∫_{x_0}^x f(x, y_0) dx   (3.53)

The second approximate solution is derived on replacing y by y⁽¹⁾(x). Thus,

y⁽²⁾(x) = y_0 + ∫_{x_0}^x f(x, y⁽¹⁾(x)) dx   (3.54)

The process can be continued, so that we have the general approximate solution
given by,

y⁽ⁿ⁾(x) = y_0 + ∫_{x_0}^x f(x, y⁽ⁿ⁻¹⁾(x)) dx, for n = 2, 3,...   (3.55)
Example 3.32: Compute y(0.25) and y(0.5) correct to three decimal places by
solving the following initial value problem by Picard's method:

dy/dx = x²/(1 + y²), y(0) = 0

Solution: We have dy/dx = x²/(1 + y²), y(0) = 0

y⁽¹⁾(x) = 0 + ∫_0^x x² dx = x³/3

y⁽²⁾(x) = ∫_0^x (x²/(1 + x⁶/9)) dx = tan⁻¹(x³/3)

For x = 0.25, y⁽¹⁾(0.25) = (0.25)³/3 = 0.0052

y⁽²⁾(0.25) = tan⁻¹((0.25)³/3) = 0.0052

∴ y(0.25) = 0.005, correct to three decimal places.

Again, for x = 0.5, y⁽¹⁾(0.5) = (0.5)³/3 = 0.041667

y⁽²⁾(0.5) = tan⁻¹((0.5)³/3) = 0.0416

Thus, correct to three decimal places, y(0.5) = 0.042.
Note: For this problem we observe that the integral for getting the third and
higher approximate solutions is either difficult or impossible to evaluate, since

y⁽³⁾(x) = ∫_0^x x²/[1 + (tan⁻¹(x³/3))²] dx

is not integrable in closed form.
Example 3.33: Use Picard's method to find two successive approximate solutions
of the initial value problem,

dy/dx = (y - x)/(y + x), y(0) = 1

Solution: The first approximation is

y⁽¹⁾(x) = 1 + ∫_0^x ((1 - x)/(1 + x)) dx = 1 - x + 2 log_e |1 + x|

The second approximation is

y⁽²⁾(x) = y_0 + ∫_0^x f(x, y⁽¹⁾(x)) dx

        = 1 + ∫_0^x (1 - 2x + 2 log_e |1 + x|)/(1 + 2 log_e |1 + x|) dx

We observe that it is not possible to obtain this integral in closed form. Thus
Picard's method is not applicable for getting further successive approximate solutions.
Integrating over the first step,

∫_{x_0}^{x_0+h} dy ≈ ∫_{x_0}^{x_0+h} f(x_0, y_0) dx

∴ y(x_0 + h) ≈ y(x_0) + h f(x_0, y_0)

The local truncation error at the kth step is

e_k = y(x_k + h) - {y_k + h f(x_k, y_k)}

    = y_k + h y'(x_k) + (h²/2) y''(x_k + θh) - y_k - h y'(x_k), 0 < θ < 1

∴ e_k = (h²/2) y''(x_k + θh), 0 < θ < 1
Note: Euler's method finds a sequence of values {y_k} of y for the sequence of
values {x_k} of x, step by step. But to get the solution up to a desired accuracy,
we have to take the step size h to be very small. Again, the method should not be
used over a large range of x about x_0, since the propagated error grows as the
integration proceeds.
Example 3.34: Solve the following differential equation by Euler's method for
x = 0.1, 0.2, 0.3, taking h = 0.1: dy/dx = x² - y, y(0) = 1. Compare the results with
the exact solution.

Solution: Given dy/dx = x² - y, with y(0) = 1.

In Euler's method one computes, in successive steps, the values of y_1, y_2, y_3,... at
x_1 = x_0 + h, x_2 = x_0 + 2h, x_3 = x_0 + 3h, using the formula,

y_{n+1} = y_n + h f(x_n, y_n), for n = 0, 1, 2,...

i.e., y_{n+1} = y_n + h (x_n² - y_n)

The analytical solution of the differential equation, written as dy/dx + y = x², is

y e^x = ∫ x² e^x dx + c

or, y e^x = x² e^x - 2x e^x + 2e^x + c

Since y = 1 for x = 0, c = -1.

∴ y = x² - 2x + 2 - e^{-x}

The following table compares the exact solution with the approximate solution
by Euler's method.

n    x_n    Approximate Solution    Exact Solution    % Error
1    0.1    0.9000                  0.9052            0.57
2    0.2    0.8110                  0.8213            1.25
3    0.3    0.7339                  0.7492            2.04
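Euler's method of Example 3.34 takes only a few lines of code. The sketch below (ours) reproduces the table, comparing each step against the exact solution y = x² - 2x + 2 - e^{-x}:

```python
import math

def euler(f, x0, y0, h, n):
    """Euler's method: y_{k+1} = y_k + h f(x_k, y_k)."""
    xs, ys = [x0], [y0]
    for _ in range(n):
        ys.append(ys[-1] + h * f(xs[-1], ys[-1]))
        xs.append(xs[-1] + h)
    return xs, ys

f = lambda x, y: x * x - y
exact = lambda x: x * x - 2 * x + 2 - math.exp(-x)
xs, ys = euler(f, 0.0, 1.0, 0.1, 3)
for x, y in zip(xs[1:], ys[1:]):
    print("x = %.1f  Euler: %.4f  exact: %.4f" % (x, y, exact(x)))
```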
Example 3.35: Compute the solution of the following initial value problem by
Euler's method, for x = 0.1 correct to four decimal places, taking h = 0.02:

dy/dx = (y - x)/(y + x), y(0) = 1

Solution:

y(0.02) = y_1 = y_0 + h f(x_0, y_0) = 1 + 0.02 × (1 - 0)/(1 + 0) = 1.0200

y(0.04) = y_2 = y_1 + h f(x_1, y_1) = 1.0200 + 0.02 × (1.0200 - 0.02)/(1.0200 + 0.02) = 1.0392

y(0.06) = y_3 = y_2 + h f(x_2, y_2) = 1.0392 + 0.02 × (1.0392 - 0.04)/(1.0392 + 0.04) = 1.0577

y(0.08) = y_4 = y_3 + h f(x_3, y_3) = 1.0577 + 0.02 × (1.0577 - 0.06)/(1.0577 + 0.06) = 1.0756

y(0.1) = y_5 = y_4 + h f(x_4, y_4) = 1.0756 + 0.02 × (1.0756 - 0.08)/(1.0756 + 0.08) = 1.0928

Hence, y(0.1) = 1.0928.
Modified Euler's Method

In order to get somewhat better accuracy, Euler's method is modified by
computing the derivative y' = f(x, y) at a point x_n as the mean of f(x_n, y_n) and
f(x_{n+1}, y⁽⁰⁾_{n+1}), where,

y⁽⁰⁾_{n+1} = y_n + h f(x_n, y_n)

y⁽¹⁾_{n+1} = y_n + (h/2) [f(x_n, y_n) + f(x_{n+1}, y⁽⁰⁾_{n+1})]   (3.59)

This modified method is known as the Euler-Cauchy method. The local truncation
error of the modified Euler's method is of the order O(h³).

Note: Modified Euler's method can be used to compute the solution up to a
desired accuracy by applying it in an iterative scheme as stated below.

Compute y⁽⁰⁾_{n+1} = y_n + h f(x_n, y_n)

Compute y⁽ᵏ⁺¹⁾_{n+1} = y_n + (h/2) [f(x_n, y_n) + f(x_{n+1}, y⁽ᵏ⁾_{n+1})], for k = 0, 1, 2,...   (3.60)

The iterations are continued until two successive approximations y⁽ᵏ⁾_{n+1} and y⁽ᵏ⁺¹⁾_{n+1}
coincide to the desired accuracy. As a rule, the iterations converge rapidly for a
sufficiently small h. If, however, after three or four iterations they still do not
give the necessary accuracy in the solution, the spacing h is decreased and the
iterations are performed again.
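The iterative scheme (3.60) can be sketched in code as follows (our illustration; the tolerance handling is an assumption, not from the text):

```python
def modified_euler(f, x0, y0, h, n, tol=1e-10, max_iter=20):
    """Euler-Cauchy method with the corrector applied iteratively at each step."""
    x, y = x0, y0
    for _ in range(n):
        y_next = y + h * f(x, y)                        # predictor y(0), Equation (3.59)
        for _ in range(max_iter):                       # corrector iterations, Equation (3.60)
            y_new = y + h / 2 * (f(x, y) + f(x + h, y_next))
            if abs(y_new - y_next) <= tol * max(1.0, abs(y_new)):
                y_next = y_new
                break
            y_next = y_new
        x, y = x + h, y_next
    return y

# Example 3.36: dy/dx = x^2 + y, y(0) = 1, h = 0.01 -> y(0.02) = 1.0202
y_002 = modified_euler(lambda x, y: x * x + y, 0.0, 1.0, 0.01, 2)
```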
Example 3.36: Use modified Euler's method to compute y(0.02) for the initial
value problem dy/dx = x² + y, with y(0) = 1, taking h = 0.01. Compare the result
with the exact solution.

Solution: Modified Euler's method consists of obtaining the solution at successive
points, x_1 = x_0 + h, x_2 = x_0 + 2h,..., x_n = x_0 + nh, by the two-stage computation
given by,

y⁽⁰⁾_{n+1} = y_n + h f(x_n, y_n)

y⁽¹⁾_{n+1} = y_n + (h/2) [f(x_n, y_n) + f(x_{n+1}, y⁽⁰⁾_{n+1})]

For the first step,

y⁽⁰⁾_1 = 1 + 0.01 × 1 = 1.01

y_1 = 1 + (0.01/2) [1 + {(0.01)² + 1.01}] = 1 + 0.005 × 2.0101 = 1.01005

For the second step,

y⁽⁰⁾_2 = 1.01005 + 0.01 × [(0.01)² + 1.01005] = 1.02015

y_2 = 1.01005 + (0.01/2) [{(0.01)² + 1.01005} + {(0.02)² + 1.02015}]

    = 1.01005 + 0.005 × 2.03070

    = 1.01005 + 0.01015 = 1.02020

∴ y_2 = y(0.02) = 1.02020

The exact solution y = 3e^x - x² - 2x - 2 gives y(0.02) = 1.02020, in agreement to
five decimal places.
For a second order initial value problem, we write dy/dx = z, so that
dz/dx = g(x, y, z), with y(x_0) = y_0 and z(x_0) = y'_0.
Example 3.37: Compute y(1.1) and y(1.2) by solving the initial value problem,

y'' + y'/x + y = 0, with y(1) = 0.77, y'(1) = -0.44

Solution: We can rewrite the problem as y' = z, z' = -z/x - y; with y(1) = 0.77 and
z(1) = -0.44.

Taking h = 0.1, we use Euler's method for the problem in the form,

y_{i+1} = y_i + h z_i

z_{i+1} = z_i + h (-z_i/x_i - y_i), i = 0, 1, 2,...

Thus, y_1 = y(1.1) and z_1 = z(1.1) are given by,

y_1 = y_0 + h z_0 = 0.77 + 0.1 × (-0.44) = 0.726

z_1 = z_0 + h (-z_0/x_0 - y_0) = -0.44 + 0.1 × (0.44 - 0.77)

    = -0.44 - 0.033 = -0.473

Similarly, y_2 = y(1.2) = y_1 + h z_1 = 0.726 - 0.1 × 0.473 = 0.679

z_2 = z(1.2) = z_1 + h (-z_1/x_1 - y_1)

    = -0.473 + 0.1 × (0.473/1.1 - 0.726)

    = -0.473 + 0.1 × (-0.296) = -0.503

Thus, y(1.1) = 0.726 and y(1.2) = 0.679.
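Euler's method for the pair y' = z, z' = g(x, y, z), as used in Example 3.37, can be sketched as follows (our code):

```python
def euler_system(f, g, x, y, z, h, n):
    # y' = f(x, y, z), z' = g(x, y, z); both updated from the old values
    for _ in range(n):
        y, z = y + h * f(x, y, z), z + h * g(x, y, z)
        x += h
    return y, z

# y'' + y'/x + y = 0 rewritten as y' = z, z' = -z/x - y, with y(1) = 0.77, z(1) = -0.44
f = lambda x, y, z: z
g = lambda x, y, z: -z / x - y
y11, z11 = euler_system(f, g, 1.0, 0.77, -0.44, 0.1, 1)   # y(1.1) = 0.726
y12, z12 = euler_system(f, g, 1.0, 0.77, -0.44, 0.1, 2)   # y(1.2) = 0.679
```

Note the simultaneous tuple assignment: both updates use the values from the previous step, as the method requires.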
Example 3.38: Using Euler's method, compute y(0.1) and y(0.2) for the initial
value problem,

y'' + xy' + y = 0, y(0) = 0, y'(0) = 1

For comparison, successive differentiation of the equation gives
y⁽ⁿ⁺²⁾(0) = -(n + 1) y⁽ⁿ⁾(0), so that y⁽²ⁿ⁾(0) = 0 and, in general,

y⁽²ⁿ⁺¹⁾(0) = -2n y⁽²ⁿ⁻¹⁾(0) = (-1)ⁿ 2ⁿ n!

Thus, y(x) = x - x³/3 + x⁵/15 - ... + (-1)ⁿ (2ⁿ n!/(2n + 1)!) x^{2n+1} + ...

This is an alternating series whose terms decrease. Using this, we form the
solution for y up to 0.2:

y(0.1) = 0.0997, y(0.2) = 0.1974
3.4.3 Runge-Kutta Methods

Runge-Kutta methods can be of different orders. They are very useful when the
method of Taylor series is not easy to apply because of the complexity of finding
higher order derivatives. Runge-Kutta methods attempt to get better accuracy
and at the same time obviate the need for computing higher order derivatives.
These methods, however, require the evaluation of the first order derivatives at
several off-step points.

Here we consider the derivation of the Runge-Kutta method of order 2.
The solution at the (n + 1)th step is assumed in the form,

y_{n+1} = y_n + a k_1 + b k_2   (3.64)

where k_1 = h f(x_n, y_n) and

k_2 = h f(x_n + αh, y_n + βk_1), for n = 0, 1, 2,...   (3.65)

The unknown parameters a, b, α and β are determined by expanding in
Taylor series and forming equations by equating coefficients of like powers of h.
We have,

y_{n+1} = y(x_n + h) = y_n + h y'(x_n) + (h²/2) y''(x_n) + (h³/6) y'''(x_n) + O(h⁴)

        = y_n + h f(x_n, y_n) + (h²/2) [f_x + f f_y]_n
          + (h³/6) [f_xx + 2f f_xy + f² f_yy + f_x f_y + f f_y²]_n + O(h⁴)   (3.66)

The subscript n indicates that the functions within brackets are to be evaluated
at (x_n, y_n).

Again, expanding k_2 by the Taylor series in two variables, we have

k_2 = h [f_n + αh (f_x)_n + βk_1 (f_y)_n + (α²h²/2)(f_xx)_n + αβ h k_1 (f_xy)_n + (β²k_1²/2)(f_yy)_n + O(h³)]

Thus, on substituting the expansion of k_2 in Equation (3.64), we get

y_{n+1} = y_n + (a + b) h f_n + bh² (α f_x + β f f_y)_n
          + bh³ ((α²/2) f_xx + αβ f f_xy + (β²/2) f² f_yy)_n + O(h⁴)

On comparing with the expansion of y_{n+1} and equating coefficients of h and h²,
we get the relations,

a + b = 1, bα = bβ = 1/2

These are three equations for the determination of the four unknown parameters.
Thus, there are many solutions. However, usually a symmetric solution is taken by
setting a = b = 1/2, and then α = β = 1.

Thus we can write the Runge-Kutta method of order 2 in the form,

y_{n+1} = y_n + (h/2) [f(x_n, y_n) + f(x_n + h, y_n + h f(x_n, y_n))], for n = 0, 1, 2,...   (3.67)
Proceeding as in second order method, Runge-Kutta method of order 4 can
be formulated. Omitting the derivation, we give below the commonly used Runge-
Kutta method of order 4.

y_{n+1} = y_n + (1/6)(k_1 + 2k_2 + 2k_3 + k_4) + O(h⁵)

k_1 = h f(x_n, y_n)

k_2 = h f(x_n + h/2, y_n + k_1/2)

k_3 = h f(x_n + h/2, y_n + k_2/2)

k_4 = h f(x_n + h, y_n + k_3)   (3.68)

The Runge-Kutta method of order 4 requires the evaluation of the first order
derivative f(x, y) at four points. The method is self-starting. The error estimate
with this method can be roughly given by,

|y(x_n) - y_n| ≈ |y_n* - y_n|/15   (3.69)

where y_n* and y_n are the approximate values computed with h/2 and h respectively
as step size, and y(x_n) is the exact solution.

Note: In particular, for the special form of differential equation y' = F(x), a function
of x alone, the Runge-Kutta method reduces to Simpson's one-third formula of
numerical integration from x_n to x_{n+1}. Then,

y_{n+1} = y_n + (h/6) [F(x_n) + 4F(x_n + h/2) + F(x_n + h)]
Runge-Kutta methods are widely used, particularly for finding starting values
at the steps x_1, x_2, x_3,..., since they do not require the evaluation of higher order
derivatives. The method is also easy to implement in a computer program.
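As the text notes, the method is easy to program. A minimal sketch (ours) of formula (3.68):

```python
def rk4(f, x0, y0, h, n):
    """Classical 4th order Runge-Kutta method for dy/dx = f(x, y)."""
    x, y = x0, y0
    for _ in range(n):
        k1 = h * f(x, y)
        k2 = h * f(x + h / 2, y + k1 / 2)
        k3 = h * f(x + h / 2, y + k2 / 2)
        k4 = h * f(x + h, y + k3)
        y += (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

# dy/dx = x + y, y(0) = 1 (Example 3.40); exact solution y = 2e^x - x - 1
y01 = rk4(lambda x, y: x + y, 0.0, 1.0, 0.1, 1)   # 1.110342
y02 = rk4(lambda x, y: x + y, 0.0, 1.0, 0.1, 2)   # 1.242805
```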
Example 3.40: Compute the values of y(0.1) and y(0.2) by the 4th order Runge-
Kutta method, correct to five significant figures, for the initial value problem,

dy/dx = x + y, y(0) = 1

Solution: We have f(x, y) = x + y, h = 0.1, x_0 = 0, y_0 = 1.

By the Runge-Kutta method,

y(0.1) = y(0) + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)

where, k_1 = h f(x_0, y_0) = 0.1 × (0 + 1) = 0.1

k_2 = h f(x_0 + h/2, y_0 + k_1/2) = 0.1 × (0.05 + 1.05) = 0.11

k_3 = h f(x_0 + h/2, y_0 + k_2/2) = 0.1 × (0.05 + 1.055) = 0.1105

k_4 = h f(x_0 + h, y_0 + k_3) = 0.1 × (0.1 + 1.1105) = 0.12105

∴ y(0.1) = 1 + (1/6) [0.1 + 2 × (0.11 + 0.1105) + 0.12105] = 1.110342

Thus, x_1 = 0.1, y_1 = 1.11034

y(0.2) = y(0.1) + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)

where, k_1 = h f(x_1, y_1) = 0.1 × (0.1 + 1.11034) = 0.121034

k_2 = h f(x_1 + h/2, y_1 + k_1/2) = 0.1 × (0.15 + 1.170857) = 0.132086

k_3 = h f(x_1 + h/2, y_1 + k_2/2) = 0.1 × (0.15 + 1.176383) = 0.132638

k_4 = h f(x_1 + h, y_1 + k_3) = 0.1 × (0.2 + 1.242978) = 0.144298

∴ y_2 = y(0.2) = 1.11034 + (1/6) [0.121034 + 2 × (0.132086 + 0.132638) + 0.144298]
               = 1.2428
Example 3.41: Use the Runge-Kutta method of order 4 to evaluate y(1.1) and
y(1.2), taking step length h = 0.1, for the initial value problem,

dy/dx = x² + y², y(1) = 0

Solution: For the initial value problem dy/dx = f(x, y), y(x_0) = y_0, the Runge-Kutta
method of order 4 is given as,

y_{n+1} = y_n + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)

where k_1 = h f(x_n, y_n)

k_2 = h f(x_n + h/2, y_n + k_1/2)

k_3 = h f(x_n + h/2, y_n + k_2/2)

k_4 = h f(x_n + h, y_n + k_3), for n = 0, 1, 2,...
For y(1.2), the computation is repeated in the same way, taking x_1 = 1.1 and the
value y_1 just obtained as the new starting values. The later steps of the
corresponding algorithm are:

Step 6: Compute y = y_0 + k_1/2
Step 7: Compute k_2 = h f(x, y)
Step 8: Compute y = y_0 + k_2/2
Step 9: Compute k_3 = h f(x, y)
Step 10: Compute x_1 = x_0 + h
Step 11: Compute y = y_0 + k_3
Step 12: Compute k_4 = h f(x_1, y)
Step 13: Compute y_1 = y_0 + (k_1 + 2(k_2 + k_3) + k_4)/6
Step 14: Write x_1, y_1
Step 15: Set x_0 = x_1
Step 16: Set y_0 = y_1
Step 17: Stop
dy/dx = f(x, y, z), dz/dx = g(x, y, z)

with y(x_0) = y_0 and z(x_0) = z_0

y_{i+1} = y_i + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)

z_{i+1} = z_i + (1/6)(l_1 + 2l_2 + 2l_3 + l_4), for i = 0, 1, 2,...   (3.70)

where k_1 = h f(x_i, y_i, z_i), l_1 = h g(x_i, y_i, z_i)

k_2 = h f(x_i + h/2, y_i + k_1/2, z_i + l_1/2), l_2 = h g(x_i + h/2, y_i + k_1/2, z_i + l_1/2)

k_3 = h f(x_i + h/2, y_i + k_2/2, z_i + l_2/2), l_3 = h g(x_i + h/2, y_i + k_2/2, z_i + l_2/2)

k_4 = h f(x_i + h, y_i + k_3, z_i + l_3), l_4 = h g(x_i + h, y_i + k_3, z_i + l_3)

with y_i ≈ y(x_i), z_i ≈ z(x_i), i = 0, 1, 2,...

The solutions for y(x) and z(x) are determined at successive step points x_1 = x_0 + h,
x_2 = x_1 + h = x_0 + 2h,..., x_N = x_0 + Nh.
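Formulae (3.70) carry over to code almost verbatim. The sketch below (ours) checks them against the second order problem of Example 3.45, y'' = xy, y(0) = 1, y'(0) = 1:

```python
def rk4_system(f, g, x, y, z, h, n):
    """Classical RK4 for the pair y' = f(x, y, z), z' = g(x, y, z)."""
    for _ in range(n):
        k1, l1 = h * f(x, y, z), h * g(x, y, z)
        k2 = h * f(x + h / 2, y + k1 / 2, z + l1 / 2)
        l2 = h * g(x + h / 2, y + k1 / 2, z + l1 / 2)
        k3 = h * f(x + h / 2, y + k2 / 2, z + l2 / 2)
        l3 = h * g(x + h / 2, y + k2 / 2, z + l2 / 2)
        k4, l4 = h * f(x + h, y + k3, z + l3), h * g(x + h, y + k3, z + l3)
        y += (k1 + 2 * k2 + 2 * k3 + k4) / 6
        z += (l1 + 2 * l2 + 2 * l3 + l4) / 6
        x += h
    return y, z

# y'' = x y as y' = z, z' = x y, with y(0) = 1, z(0) = 1, h = 0.2
y, z = rk4_system(lambda x, y, z: z, lambda x, y, z: x * y, 0.0, 1.0, 1.0, 0.2, 1)
# y(0.2) = 1.2015, y'(0.2) = 1.02268
```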
When the second order equation is y'' = g(x, y, y'), we write z = y', so that
f(x, y, z) = z and the formulae become,

y_{i+1} = y_i + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)

z_{i+1} = z_i + (1/6)(l_1 + 2l_2 + 2l_3 + l_4), for i = 0, 1, 2,...   (3.71)

where k_1 = h z_i, l_1 = h g(x_i, y_i, z_i)

k_2 = h (z_i + l_1/2), l_2 = h g(x_i + h/2, y_i + k_1/2, z_i + l_1/2)

k_3 = h (z_i + l_2/2), l_3 = h g(x_i + h/2, y_i + k_2/2, z_i + l_2/2)

k_4 = h (z_i + l_3), l_4 = h g(x_i + h, y_i + k_3, z_i + l_3)
Multistep Methods

We have seen that for finding the solution at each step, the Taylor series method
and the Runge-Kutta methods require the evaluation of several derivatives. We
shall now develop the multistep methods, which require only one derivative
evaluation per step; but unlike the self-starting Taylor series or Runge-Kutta
methods, the multistep methods make use of the solution at more than one previous
step point.

Let the values of y and y' already have been evaluated by self-starting methods
at a number of equally spaced points x_0, x_1,..., x_n. We now integrate the differential
equation,

dy/dx = f(x, y), from x_n to x_{n+1}

i.e., ∫_{x_n}^{x_{n+1}} dy = ∫_{x_n}^{x_{n+1}} f(x, y) dx

∴ y_{n+1} = y_n + ∫_{x_n}^{x_{n+1}} f(x, y(x)) dx

To evaluate the integral on the right hand side, we consider f(x, y) as a function
of x and replace it by an interpolating polynomial, i.e., a Newton's backward
difference interpolation using the (m + 1) points x_n, x_{n-1}, x_{n-2},..., x_{n-m},

p_m(x) = Σ_{k=0}^m (-1)^k C(-s, k) ∇^k f_n, where s = (x - x_n)/h

and C(-s, k) = (-s)(-s - 1)(-s - 2)...(-s - k + 1)/k!
∴ y_{n+1} = y_n + h ∫_0^1 Σ_{k=0}^m (-1)^k C(-s, k) ∇^k f_n ds

          = y_n + h [f_n + γ_1 ∇f_n + γ_2 ∇²f_n + ... + γ_m ∇^m f_n]

where γ_k = (-1)^k ∫_0^1 C(-s, k) ds, giving γ_1 = 1/2, γ_2 = 5/12, γ_3 = 3/8, ...   (3.73)

This is the Adams-Bashforth formula. Retaining differences up to ∇³ (m = 3)
gives a fourth order formula whose local truncation error is,

E = (251/720) h⁵ f^iv(ξ)   (3.74)
The fourth order Adams-Bashforth formula thus requires four starting values, i.e.,
the derivatives f_3, f_2, f_1 and f_0. This is a multistep method.
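In expanded form the fourth order Adams-Bashforth formula reads y_{n+1} = y_n + (h/24)(55f_n - 59f_{n-1} + 37f_{n-2} - 9f_{n-3}). The sketch below (ours) applies it with starting values supplied externally; for illustration these are taken from the known exact solution of y' = x + y, y(0) = 1, although in practice they would come from a self-starting method:

```python
import math

def adams_bashforth4(f, x0, ys, h, n):
    """4th order Adams-Bashforth; ys holds the four starting values y0..y3."""
    ys = list(ys)
    fs = [f(x0 + i * h, ys[i]) for i in range(4)]
    for i in range(3, n):
        # y_{n+1} = y_n + h/24 (55 f_n - 59 f_{n-1} + 37 f_{n-2} - 9 f_{n-3})
        y_next = ys[-1] + h / 24 * (55 * fs[-1] - 59 * fs[-2] + 37 * fs[-3] - 9 * fs[-4])
        ys.append(y_next)
        fs.append(f(x0 + (i + 1) * h, y_next))
    return ys

# y' = x + y, y(0) = 1; exact solution y = 2e^x - x - 1 provides starting values
start = [2 * math.exp(0.1 * i) - 0.1 * i - 1 for i in range(4)]
ys = adams_bashforth4(lambda x, y: x + y, 0.0, start, 0.1, 5)
# ys[5] approximates y(0.5) = 2e^0.5 - 1.5 = 1.79744...
```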
Predictor-Corrector Methods

These methods use a pair of multistep numerical integration formulae. The first is
the Predictor formula, which is an open-type explicit formula derived by using, in
the integral, an interpolation formula which interpolates at the points
x_n, x_{n-1},..., x_{n-m}. The second is the Corrector formula, which is obtained by
using an interpolation formula that interpolates at the points x_{n+1}, x_n,..., x_{n-p}
in the integral.
Euler's Predictor-Corrector Formula

The simplest formula of this type is the pair of formulae given by,

y⁽ᵖ⁾_{n+1} = y_n + h f(x_n, y_n)   (3.75)

y⁽ᶜ⁾_{n+1} = y_n + (h/2) [f(x_n, y_n) + f(x_{n+1}, y⁽ᵖ⁾_{n+1})]   (3.76)

In order to determine the solution of the problem up to a desired accuracy, the
corrector formula can be employed in an iterative manner as shown below:

Step 1: Compute y⁽⁰⁾_{n+1} using Equation (3.75), i.e., y⁽⁰⁾_{n+1} = y_n + h f(x_n, y_n)

Step 2: Compute y⁽ᵏ⁾_{n+1} = y_n + (h/2) [f(x_n, y_n) + f(x_{n+1}, y⁽ᵏ⁻¹⁾_{n+1})], for k = 1, 2, 3,...

The computation is continued till the condition given below is satisfied,

|y⁽ᵏ⁾_{n+1} - y⁽ᵏ⁻¹⁾_{n+1}| / |y⁽ᵏ⁾_{n+1}| < ε   (3.77)

where ε is the prescribed accuracy.

It may be noted that the accuracy achieved will depend on the step size h and on
the local error. The local errors in the predictor and corrector formulae are,

(h²/2) y''(ξ_1) and -(h³/12) y'''(ξ_2), respectively.
The predictor formula gives, y_4 = y(0.4) ≈ y_0 + (4h/3)(2y'_1 - y'_2 + 2y'_3).

∴ y_4⁽⁰⁾ = 1 + (4 × 0.1/3)(2 × 1.11053 - 1.24458 + 2 × 1.40658)

         = 1.50528

∴ y'_4 = f(x_4, y_4⁽⁰⁾) = 1 + 0.4 × 1.50528 = 1.60211

The corrector formula gives, y_4⁽¹⁾ = y_2 + (h/3)(y'_2 + 4y'_3 + y'_4).

∴ y(0.4) ≈ 1.22288 + (0.1/3)(1.24458 + 4 × 1.40658 + 1.60211)

         = 1.22288 + 0.28243

         = 1.50531
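Milne's predictor-corrector scheme used in the computation above can be sketched as follows (our code; the starting values come from RK4, and the test problem y' = 1 + xy, y(0) = 1 is the one behind the numbers in the worked computation):

```python
def milne_pc(f, x0, y0, h, n):
    """Milne's predictor-corrector; the first three steps come from RK4."""
    xs, ys = [x0], [y0]
    for _ in range(3):                           # RK4 starting values
        x, y = xs[-1], ys[-1]
        k1 = h * f(x, y)
        k2 = h * f(x + h / 2, y + k1 / 2)
        k3 = h * f(x + h / 2, y + k2 / 2)
        k4 = h * f(x + h, y + k3)
        xs.append(x + h)
        ys.append(y + (k1 + 2 * k2 + 2 * k3 + k4) / 6)
    fs = [f(x, y) for x, y in zip(xs, ys)]
    while len(ys) <= n:
        i = len(ys) - 1                          # index of the last computed value
        x_next = xs[0] + (i + 1) * h
        # predictor: y_{n+1} = y_{n-3} + 4h/3 (2 f_{n-2} - f_{n-1} + 2 f_n)
        yp = ys[i - 3] + 4 * h / 3 * (2 * fs[i - 2] - fs[i - 1] + 2 * fs[i])
        # corrector: y_{n+1} = y_{n-1} + h/3 (f_{n-1} + 4 f_n + f_{n+1})
        yc = ys[i - 1] + h / 3 * (fs[i - 1] + 4 * fs[i] + f(x_next, yp))
        xs.append(x_next)
        ys.append(yc)
        fs.append(f(x_next, yc))
    return ys[n]

y04 = milne_pc(lambda x, y: 1 + x * y, 0.0, 1.0, 0.1, 4)   # y(0.4) = 1.5053
```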
A = [ B_1   C_1   0     0    ...  0       0       0
      A_2   B_2   C_2   0    ...  0       0       0
      0     A_3   B_3   C_3  ...  0       0       0
      ...   ...   ...   ...       ...     ...     ...
      0     0     0     0    ...  A_{N-2} B_{N-2} C_{N-2}
      0     0     0     0    ...  0       A_{N-1} B_{N-1} ]   (3.97)

where, B_i = 2h²q_i - 4, i = 1, 2,..., N - 1

C_i = 2 + hp_i, i = 1, 2,..., N - 2

A_i = 2 - hp_i, i = 2, 3,..., N - 1   (3.98)

The vector b has components,

b_1 = 2r_1h² - (2 - hp_1) y_0

b_i = 2r_ih², for i = 2, 3,..., N - 2

b_{N-1} = 2r_{N-1}h² - (2 + hp_{N-1}) y_N   (3.99)

where y_0 and y_N are the given boundary values.

The system of linear equations can be directly solved using suitable methods.
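A direct solver for the tridiagonal system follows. This is our sketch: it uses the central-difference discretization written out above (the coefficients follow from multiplying the difference equation by 2h²) and the Thomas algorithm, and it is checked on y'' - y = 0, y(0) = 0, y(1) = sinh 1, whose solution is sinh x:

```python
import math

def solve_bvp(p, q, r, a, b, alpha, beta, N):
    """Finite differences for y'' + p(x) y' + q(x) y = r(x), y(a) = alpha, y(b) = beta."""
    h = (b - a) / N
    xs = [a + i * h for i in range(N + 1)]
    # (2 - h p_i) y_{i-1} + (2 h^2 q_i - 4) y_i + (2 + h p_i) y_{i+1} = 2 h^2 r_i
    A = [2 - h * p(xs[i]) for i in range(1, N)]
    B = [2 * h * h * q(xs[i]) - 4 for i in range(1, N)]
    C = [2 + h * p(xs[i]) for i in range(1, N)]
    d = [2 * h * h * r(xs[i]) for i in range(1, N)]
    d[0] -= A[0] * alpha                  # fold the boundary values into the RHS
    d[-1] -= C[-1] * beta
    for i in range(1, N - 1):             # Thomas algorithm: forward elimination
        m = A[i] / B[i - 1]
        B[i] -= m * C[i - 1]
        d[i] -= m * d[i - 1]
    y = [0.0] * (N - 1)
    y[-1] = d[-1] / B[-1]
    for i in range(N - 3, -1, -1):        # back substitution
        y[i] = (d[i] - C[i] * y[i + 1]) / B[i]
    return xs, [alpha] + y + [beta]

xs, ys = solve_bvp(lambda x: 0.0, lambda x: -1.0, lambda x: 0.0,
                   0.0, 1.0, 0.0, math.sinh(1.0), 10)
# ys[5] approximates sinh(0.5) = 0.52110
```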
Example 3.43: Compute the values of y(1.1) and y'(1.1) by solving the following
initial value problem using the Runge-Kutta method of order 4:

y'' + y'/x + y = 0, with y(1) = 0.77, y'(1) = -0.44

Solution: We first rewrite the initial value problem as a pair of first order
equations:

y' = z, z' = -z/x - y

with y(1) = 0.77 and z(1) = -0.44.

We now employ the Runge-Kutta method of order 4 with h = 0.1:

y(1.1) = y(1) + (1/6)(k_1 + 2k_2 + 2k_3 + k_4)

y'(1.1) = z(1.1) = z(1) + (1/6)(l_1 + 2l_2 + 2l_3 + l_4)

k_1 = 0.1 × (-0.44) = -0.044

l_1 = 0.1 × (0.44/1 - 0.77) = -0.033

k_2 = 0.1 × (-0.44 - 0.033/2) = -0.04565

l_2 = 0.1 × (0.4565/1.05 - (0.77 - 0.022)) = -0.031323809

k_3 = 0.1 × (-0.44 - 0.031323809/2) = -0.045566190

l_3 = 0.1 × (0.455662/1.05 - (0.77 - 0.022825)) = -0.031321128

k_4 = 0.1 × (-0.44 - 0.031321128) = -0.047132112

l_4 = 0.1 × (0.471321/1.1 - (0.77 - 0.045566)) = -0.029596005

∴ y(1.1) = 0.77 + (1/6) [-0.044 + 2 × (-0.04565) + 2 × (-0.045566190) - 0.047132112]

         = 0.77 - 0.045594 = 0.724406

y'(1.1) = -0.44 + (1/6) [-0.033 + 2 × (-0.031323809) + 2 × (-0.031321128) - 0.029596005]

        = -0.44 - 0.031314 = -0.471314

The computation for y(1.2) proceeds in the same manner, starting from x_1 = 1.1.
Example 3.44: Compute the solution of the following initial value problem for
x = 0.2, using the Taylor series method of order 4:

d²y/dx² = y + x dy/dx, y(0) = 1, y'(0) = 0

Solution: Given y'' = y + xy', we put z = y', so that

z' = y + xz, y' = z, and y(0) = 1, z(0) = 0.

We solve for y and z by the Taylor series method of order 4. For this we first
compute y''(0), y'''(0), y⁽⁴⁾(0),...

We have, y''(0) = z'(0) = y(0) + 0 × z(0) = 1

y'''(0) = z''(0) = y'(0) + z(0) + 0 × z'(0) = 0

y⁽⁴⁾(0) = z'''(0) = y''(0) + 2z'(0) + 0 × z''(0) = 3

z⁽⁴⁾(0) = 4z''(0) + 0 × z'''(0) = 0

By the Taylor series of order 4, we have

y(0 + x) = y(0) + x y'(0) + (x²/2!) y''(0) + (x³/3!) y'''(0) + (x⁴/4!) y⁽⁴⁾(0)

or, y(x) = 1 + x²/2! + 3x⁴/4!

∴ y(0.2) = 1 + (0.2)²/2 + (0.2)⁴/8 = 1.0202

Similarly, y'(0.2) = z(0.2) = 0.2 + 3 × (0.2)³/3! = 0.204
Example 3.45: Compute the solution of the following initial value problem for
x = 0.2 by the fourth order Runge-Kutta method: d²y/dx² = xy, y(0) = 1, y'(0) = 1

Solution: Given y'' = xy, we put y' = z and obtain the simultaneous first order
problem,

y' = z = f(x, y, z), say; z' = xy = g(x, y, z), say; with y(0) = 1 and z(0) = 1

We use the Runge-Kutta 4th order formulae, with h = 0.2, to compute y(0.2)
and y'(0.2), as given below.

k_1 = h f(x_0, y_0, z_0) = 0.2 × 1 = 0.2

l_1 = h g(x_0, y_0, z_0) = 0.2 × 0 = 0

k_2 = h f(x_0 + h/2, y_0 + k_1/2, z_0 + l_1/2) = 0.2 × (1 + 0) = 0.2

l_2 = h g(x_0 + h/2, y_0 + k_1/2, z_0 + l_1/2) = 0.2 × 0.1 × 1.1 = 0.022

k_3 = h f(x_0 + h/2, y_0 + k_2/2, z_0 + l_2/2) = 0.2 × 1.011 = 0.2022

l_3 = h g(x_0 + h/2, y_0 + k_2/2, z_0 + l_2/2) = 0.2 × 0.1 × 1.1 = 0.022

k_4 = h f(x_0 + h, y_0 + k_3, z_0 + l_3) = 0.2 × 1.022 = 0.2044

l_4 = h g(x_0 + h, y_0 + k_3, z_0 + l_3) = 0.2 × 0.2 × 1.2022 = 0.048088

∴ y(0.2) = 1 + (1/6)(0.2 + 2 × (0.2 + 0.2022) + 0.2044) = 1.2015

y'(0.2) = 1 + (1/6)(0 + 2 × (0.022 + 0.022) + 0.048088) = 1.02268
Check Your Progress

10. How are Euler's method and Taylor's method related?
11. Define Picard's method of successive approximation.
12. Why should we not use Euler's method for a larger range of x?
13. When are Runge-Kutta methods applied?
14. What is a predictor formula?
15. What are local errors in Milne's predictor-corrector formulae?
16. Where can the method of reduction to a pair of initial value problem be
applied?
where u = (x - x_0)/h

3. Newton's backward difference interpolation formula is,

φ(v) = y_n + v ∇y_n + (v(v + 1)/2!) ∇²y_n + (v(v + 1)(v + 2)/3!) ∇³y_n
       + (v(v + 1)(v + 2)(v + 3)/4!) ∇⁴y_n + ... + (v(v + 1)...(v + n - 1)/n!) ∇ⁿy_n

where v = (x - x_n)/h
4. The evaluation of a definite integral cannot be carried out when the integrand
f(x) is not integrable, as well as when the function is not explicitly known
but only the function values are known at a finite number of values of x.
There are two types of numerical methods for evaluating the definite integral
∫_a^b f(x) dx: Newton-Cotes quadrature and Gaussian quadrature.

5. The formula is, ∫_{x_0}^{x_1} f(x) dx ≈ (h/2) [f_0 + f_1].

6. The formula is, ∫_{x_0}^{x_2} f(x) dx ≈ (h/3) [f_0 + 4f_1 + f_2].
7. Simpson's three-eighth rule of numerical integration is,

∫_a^b f(x) dx ≈ (3h/8) [y_0 + 3y_1 + 3y_2 + 2y_3 + 3y_4 + 3y_5 + 2y_6 + ... + 2y_{3m-3}
                + 3y_{3m-2} + 3y_{3m-1} + y_{3m}]

where h = (b - a)/(3m); for m = 1, 2, ...

8. The Weddle's rule is,

∫_a^b f(x) dx ≈ (3h/10) [y_0 + 5y_1 + y_2 + 6y_3 + y_4 + 5y_5 + 2y_6 + 5y_7 + y_8
                + 6y_9 + y_10 + 5y_11 + ... + 2y_{6m-6} + 5y_{6m-5} + y_{6m-4} + 6y_{6m-3}
                + y_{6m-2} + 5y_{6m-1} + y_{6m}]

where b - a = 6mh.
9. The Romberg procedure is used to find a better estimate of an integral using
the evaluation of the integral for two values of the width of the sub-intervals.

10. If we take k = 1 in the Taylor series method, we get Euler's method,
y_1 = y_0 + h f(x_0, y_0).

11. In Picard's method the first approximate solution y⁽¹⁾(x) is obtained by
replacing y(x) by y_0. Thus, y⁽¹⁾(x) = y_0 + ∫_{x_0}^x f(x, y_0) dx. The second
approximate solution is y⁽²⁾(x) = y_0 + ∫_{x_0}^x f(x, y⁽¹⁾(x)) dx.
3.6 SUMMARY
Numerical differentiation is the process of computing the derivatives of a
function f(x) when the function is not explicitly known, but the values of the
function are known only at a given set of arguments x = x_0, x_1, x_2,..., x_n.
For finding the derivatives, we use a suitable interpolating polynomial and
then its derivatives are used as the formulae for the derivatives of the function.

For computing the derivatives at a point near the beginning of an equally
spaced table, Newton's forward difference interpolation formula is used,
whereas Newton's backward difference interpolation formula is used for
computing the derivatives at a point near the end of the table.
Let the values of an unknown function y = f(x) be known for a set of
equally spaced values x_0, x_1, …, x_n of x, where x_r = x_0 + rh. Newton's
forward difference interpolation formula is,

φ(u) = y_0 + u Δy_0 + (u(u - 1)/2!) Δ²y_0 + (u(u - 1)(u - 2)/3!) Δ³y_0 + ...

where u = (x - x_0)/h.
At the tabulated point x_0, the value of u is zero and the formulae for the
derivatives are given by,

y'(x_0) = (1/h) [Δy_0 - (1/2) Δ²y_0 + (1/3) Δ³y_0 - (1/4) Δ⁴y_0 + (1/5) Δ⁵y_0 - ...]

y''(x_0) = (1/h²) [Δ²y_0 - Δ³y_0 + (11/12) Δ⁴y_0 - (5/6) Δ⁵y_0 + ...]

For a given x near the end of the table, the values of dy/dx and d²y/dx² are
computed by first computing v = (x - x_n)/h and using the backward difference
formulae. At the tabulated point x_n, the derivatives are given by,

y'(x_n) = (1/h) [∇y_n + (1/2) ∇²y_n + (1/3) ∇³y_n + (1/4) ∇⁴y_n + ...]

y''(x_n) = (1/h²) [∇²y_n + ∇³y_n + (11/12) ∇⁴y_n + (5/6) ∇⁵y_n + ...]
For computing the derivatives at a point near the middle of the table, the
derivatives of a central difference interpolation formula are used.

If the arguments of the table are unequally spaced, then the derivatives of
the Lagrange's interpolating polynomial are used for computing the derivatives
of the function.
Numerical methods can be applied to determine the value of the integral
when the integrand is not integrable as well as when the function is not
explicitly known but only the function values are known.
The two types of numerical methods for evaluating a definite integral are
Newton-Cotes quadrature and Gaussian quadrature.
Taking n = 2 in the Newton-Cotes formula, we get Simpson’s one-third
formula of numerical integration while taking n = 3, we get Simpson’s three-
eighth formula of numerical integration.
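Both Simpson formulae are short enough to sketch in code. A minimal illustration; the check ∫₀⁴ x³ dx = 64 matches exercise 11 of this unit, and Simpson’s one-third rule is exact for cubics:

```python
def simpson_one_third(f, a, b, n):
    """Composite Simpson's 1/3 rule; n must be even."""
    assert n % 2 == 0
    h = (b - a) / n
    s = f(a) + f(b)
    s += sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return s * h / 3

def simpson_three_eighth(f, a, b, n):
    """Composite Simpson's 3/8 rule; n must be a multiple of 3."""
    assert n % 3 == 0
    h = (b - a) / n
    s = f(a) + f(b)
    s += sum((2 if i % 3 == 0 else 3) * f(a + i * h) for i in range(1, n))
    return s * 3 * h / 8

# Exact value of the integral of x^3 over [0, 4] is 64.
one_third = simpson_one_third(lambda x: x**3, 0.0, 4.0, 2)
three_eighth = simpson_three_eighth(lambda x: x**3, 0.0, 3.0, 3)  # exact: 20.25
```

Both rules integrate a cubic exactly, which is why exercise 11 asks you to "comment on the result".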
In the Newton-Cotes formula with n = 6, some minor modifications give
Weddle’s formula.
For evaluating a definite integral correct to a desired accuracy, one has to
make a suitable choice of the value of h, the length of sub-interval to be
used in the formula.
There are two ways of determining h, by considering the truncation error in
the formula to be used for numerical integration or by successive evaluation
of the integral by the technique of interval halving and comparing the results.
In the truncation error estimation method, the value of h to be used is
determined by considering the truncation error in the formula for numerical
integration.
When the estimation of the truncation error is cumbersome, the method of
interval halving is used to compute an integral to the desired accuracy.
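The interval-halving technique can be sketched in a few lines. This is a minimal sketch built on the trapezoidal rule; the choice of rule and the test integral ∫₀^π sin x dx = 2 are assumptions for illustration:

```python
import math

def trapezoid(f, a, b, n):
    """Composite trapezoidal rule with n sub-intervals of width h = (b - a)/n."""
    h = (b - a) / n
    return h * (f(a) / 2 + sum(f(a + i * h) for i in range(1, n)) + f(b) / 2)

def integrate_by_interval_halving(f, a, b, eps=1e-6):
    """Keep halving the sub-interval width h until two successive
    estimates of the integral agree to within eps."""
    n = 2
    prev = trapezoid(f, a, b, n)
    while True:
        n *= 2                 # halving h doubles the number of sub-intervals
        cur = trapezoid(f, a, b, n)
        if abs(cur - prev) < eps:
            return cur
        prev = cur

val = integrate_by_interval_halving(math.sin, 0.0, math.pi)  # exact value is 2
```

The stopping test compares two successive evaluations, exactly as described above, so no explicit truncation-error estimate is needed.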
Numerical evaluation of double integrals is done by applying trapezoidal
rule and Simpson’s one-third rule.
This procedure is used to find a better estimate of an integral using the
evaluation of the integral for two values of the width of the sub-intervals.
There are many methods available for finding a numerical solution for
differential equations.
Picard’s iteration is a method of finding solutions of a first order differential
equation when an initial condition is given.
Euler’s method is a crude but simple method for solving a first order initial
value problem.
Euler’s method is a particular case of Taylor’s series method.
Runge-Kutta methods are useful when the method of Taylor series is not
easy to apply because of the complexity of finding higher order derivatives.
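A single step of the classical fourth-order Runge-Kutta method can be sketched as follows; the test problem y′ = y, y(0) = 1 is an assumed example:

```python
import math

def rk4_step(f, x, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(x, y)."""
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6

# Assumed test problem: y' = y, y(0) = 1, so y(1) = e.
y, x, h = 1.0, 0.0, 0.1
for _ in range(10):
    y = rk4_step(lambda x, y: y, x, y, h)
    x += h
```

Note that only evaluations of f itself are required, not of its higher derivatives, which is the advantage over the Taylor series method.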
For finding the solution at each step, the Taylor series method and Runge-
Kutta methods require evaluation of several derivatives.
The multistep method requires only one derivative evaluation per step; but
unlike the self-starting Taylor series or Runge-Kutta methods, the multistep
methods make use of the solution at more than one previous step point.
These methods use a pair of multistep numerical integration formulae. The first is the
predictor formula, which is an open-type explicit formula derived by using,
in the integral, an interpolation formula which interpolates at the points
xn, xn−1, ..., xn−m. The second is the corrector formula, which is obtained by
using an interpolation formula that interpolates at the points xn+1, xn, ..., xn−p in
the integral.
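A minimal predictor-corrector sketch in the spirit just described, using the two-step Adams-Bashforth formula as the open predictor and the trapezoidal (Adams-Moulton) formula as the closed corrector. This particular pair, the Euler starting step and the test problem y′ = y are all assumptions for illustration, not necessarily the pair the text has in mind:

```python
import math

def predictor_corrector(f, x0, y0, h, n):
    """Integrate y' = f(x, y) over n steps of size h with a two-step
    Adams-Bashforth predictor and a trapezoidal corrector.  The first
    step is taken with Euler's method to start the multistep scheme."""
    xs = [x0, x0 + h]
    ys = [y0, y0 + h * f(x0, y0)]                 # self-starting first step
    for k in range(1, n):
        fk = f(xs[k], ys[k])
        fk1 = f(xs[k - 1], ys[k - 1])
        yp = ys[k] + h * (3 * fk - fk1) / 2           # predictor (open formula)
        yc = ys[k] + h * (fk + f(xs[k] + h, yp)) / 2  # corrector (closed formula)
        xs.append(xs[k] + h)
        ys.append(yc)
    return xs[-1], ys[-1]

# Assumed test problem: y' = y, y(0) = 1, integrated to x = 1 (exact value e).
x_end, y_end = predictor_corrector(lambda x, y: y, 0.0, 1.0, 0.01, 100)
```

Each step calls f for the predictor and once more for the corrector, while the value at the previous step point is reused, illustrating how multistep methods economize on derivative evaluations.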
A boundary value problem involves the solution of an ordinary differential equation
of order 2 or more, when values of the dependent variable are given at more than
one point, usually at the two ends of an interval in which the solution is required.
The methods used for solving the boundary value problem are reduction to a
pair of initial value problems and the finite difference method.
3.8 SELF-ASSESSMENT QUESTIONS AND EXERCISES
Short-Answer Questions
1. Define the term numerical differentiation.
2. Give the differentiation formula for Newton’s forward difference interpolation.
3. How can the derivative dy/dx be evaluated?
4. Give the formulae for the derivatives at the tabulated point x0 where the
value of u is zero.
5. Give the differentiation formula for Newton’s backward difference
interpolation.
6. Give Newton’s backward difference interpolation formula for an equally
spaced table of a function.
7. State Newton-Cotes formula.
8. State the trapezoidal rule.
9. What is the difference between Simpson’s one-third formula and one-third
rule?
10. What is the error in Weddle’s rule?
11. Give the truncation error in Simpson’s one-third rule.
12. Where is the interval halving technique used?
13. Name the methods used for numerical evaluation of double integrals.
14. State the Gauss quadrature formula.
15. State an application of Romberg’s procedure.
16. What are ordinary differential equations?
17. Name the methods for computing the numerical solution of differential
equations.
18. What is the significance of Runge-Kutta methods of different orders?
19. When is the multistep method used?
20. Name the predictor-corrector methods.
21. How will you find the numerical solution of boundary value problems?
Long-Answer Questions
1. Discuss numerical differentiation using Newton’s forward difference
interpolation formula and Newton’s backward difference interpolation
formula.
2. Use the following table of values to compute ∫₀³ f(x) dx:
   x:    0    1    2    3
   f(x): 1.6  3.8  8.2  15.4
3. Use suitable formulae to compute y′(1.4) and y″(1.4) for the function y =
   f(x), given by the following tabular values:
   x: 1.4     1.8     2.2     2.6     3.0
   y: 0.9854  0.9738  0.8085  0.5155  0.1411
4. Compute dy/dx and d²y/dx² for x = 1, where the function y = f(x) is given by the
   following table:
   x: 1  2  3   4   5    6
   y: 1  8  27  64  125  216
5. A rod is rotating in a plane about one of its ends. The following table gives
   the angle θ (in radians) through which the rod has turned for different values
   of time t seconds. Find its angular velocity dθ/dt and angular acceleration
   d²θ/dt² at t = 1.0.
6. Find dy/dx and d²y/dx² at x = 1 and at x = 3 for the function y = f(x), whose
   values in [1, 6] are given in the following table:
   x: 1       2       3       4       5       6
   y: 2.7183  3.3210  4.0552  4.9530  6.0496  7.3891
7. Find dy/dx and d²y/dx² at x = 0.96 and at x = 1.04 for the function y = f(x)
   given in the following table:
   x: 0.96    0.98    1.0     1.02    1.04
   y: 0.7825  0.7739  0.7651  0.7563  0.7473
8. Use suitable formulae to compute y′(1.4) and y″(1.4) for the function y = f(x),
   given by the following tabular values:
   x: 1.4     1.8     2.2     2.6     3.0
   y: 0.9854  0.9738  0.8085  0.5155  0.1411
9. Compute dy/dx and d²y/dx² for x = 1, where the function y = f(x) is given by the
   following table:
   x: 1  2  3   4   5    6
   y: 1  8  27  64  125  216
10. Compute ∫₀²⁰ f(x) dx by Simpson’s one-third rule, where:
    x:    0    5    10   15   20
    f(x): 1.0  1.6  3.8  8.2  15.4
11. Compute ∫₀⁴ x³ dx by Simpson’s one-third formula and comment on the result:
    x:  0  2  4
    x³: 0  8  64
13. Compute ∫₀² eˣ dx by Simpson’s one-third formula and compare with the exact
    value, where e⁰ = 1, e¹ = 2.72, e² = 7.39.
14. Compute an approximate value of π/4 by evaluating ∫₀¹ dx/(1 + x²) with
    Simpson’s one-third formula.
15. A rod is rotating in a plane about one of its ends. The following table gives
    the angle θ (in radians) through which the rod has turned for different values
    of time t seconds. Find its angular velocity dθ/dt and angular acceleration
    d²θ/dt² at t = 1.0.
16. Find dy/dx and d²y/dx² at x = 1 and at x = 3 for the function y = f(x), whose
    values are given in the following table:
    x: 1       2       3       4       5       6
    y: 2.7183  3.3210  4.0552  4.9530  6.0496  7.3891
17. Find dy/dx and d²y/dx² at x = 0.96 and at x = 1.04 for the function y = f(x)
    given in the following table:
    x: 0.96    0.98    1.0     1.02    1.04
    y: 0.7825  0.7739  0.7651  0.7563  0.7473
20. Evaluate ∫₀^{π/2} cos x dx, correct to three significant figures, taking five equal
    sub-intervals.
21. Compute the value of the integral ∫₀¹ x dx/(1 + x²) correct to three significant
    figures.
27. Compute the integral ∫₀^π sin x dx by (a) Trapezoidal rule and (b) Simpson’s
    one-third rule taking six sub-intervals and compare the results with the exact
    value.
28. Evaluate the following integrals by Weddle’s rule:
    (a) ∫₀¹ dx/(1 + x²), taking n = 12
    (b) ∫₀¹ (x² − 1)/(x² + 1) dx, taking n = 12
29. Compute ∫₀¹ x eˣ dx by the Gauss-Legendre two point and three point formulae,
    (b) ∫₀² eˣ cos x dx
    (c) ∫₀¹ dx/(1 + x²)
STATISTICAL COMPUTATION AND PROBABILITY DISTRIBUTION
Structure
4.0 Introduction
4.1 Objectives
4.2 History and Meaning of Statistics
4.2.1 Scope of Statistics
4.3 Various Measures of Statistical Computations
4.3.1 Average
4.3.2 Mean
4.3.3 Median
4.3.4 Mode
4.3.5 Geometric Mean
4.3.6 Harmonic Mean
4.3.7 Quartiles, Percentiles and Deciles
4.3.8 Box Plot
4.4 Measures of Dispersion
4.4.1 Range
4.4.2 Quartile Deviation
4.4.3 Mean Deviation
4.5 Standard Deviation
4.5.1 Calculation of Standard Deviation by Short-cut Method
4.5.2 Combining Standard Deviations of Two Distributions
4.5.3 Comparison of Various Measures of Dispersion
4.6 Probability
4.6.1 Probability Distribution of a Random Variable
4.6.2 Axiomatic or Modern Approach to Probability
4.6.3 Theorems on Probability
4.6.4 Counting Techniques
4.6.5 Mean and Variance of Random Variables
4.7 Standard Probability Distribution
4.7.1 Binomial Distribution
4.7.2 Poisson Distribution
4.7.3 Exponential Distribution
4.7.4 Normal Distribution
4.7.5 Uniform Distribution (Discrete Random and Continuous Variable)
4.8 Answers to ‘Check Your Progress’
4.9 Summary
4.10 Key Terms
4.11 Self-Assessment Questions and Exercises
4.12 Further Reading
4.0 INTRODUCTION
Statistics is the discipline that concerns the collection, organization, analysis,
interpretation, and presentation of data. Every day we are confronted with some
form of statistical information through different sources. All raw data cannot be
termed as statistics. Similarly, single or isolated facts or figures cannot be called
statistics as these cannot be compared or related to other figures within the same
framework. Hence, any quantitative and numerical data can be identified as statistics
when it possesses certain identifiable characteristics according to the norms of
statistics.
In statistics, the term statistical computation specifies the method through
which the quantitative data have a tendency to cluster approximately about some
value. A measure of statistical computation is any precise method of specifying this
‘Central Value’. In the simplest form, the measure of statistical computation is an
average of a set of measurements, where the word average refers to as mean,
median, mode or other measures of location. Typically the most commonly used
measures are arithmetic mean, mode and median. Dispersion is in itself a very
important property of a distribution and needs to be measured by appropriate
statistics. Hence, this unit has taken into consideration several aspects
of dispersion. It describes absolute and relative measures of dispersion. It deals
with range, the crudest measure of dispersion. It also explains quartile deviation,
mean deviation and standard deviation. The standard deviation is the most useful
measure of dispersion.
The subject of probability in itself is a cumbersome one, hence only the
basic concepts will be discussed in this unit. The word probability or chance is
very commonly used in day-to-day conversation, and terms such as possible or
probable or likely, all have similar meanings. Probability can be defined as a measure
of the likelihood that a particular event will occur. It is a numerical measure with a
value between 0 and 1 of such likelihood where the probability of zero indicates
that the given event cannot occur and the probability of one assures certainty of
such an occurrence. The probability theory helps a decision-maker to analyse a
situation and decide accordingly. We study why all these uncertainties require
knowledge of probability so that calculated risks can be taken. Since the outcomes
of most decisions cannot be accurately predicted because of the impact of many
uncontrollable and unpredictable variables, it is necessary that all the known risks
be scientifically evaluated. Probability theory, sometimes referred to as the science
of uncertainty, is very helpful in such evaluations. It helps the decision-maker with
only limited information to analyse the risks and select the strategy of minimum
risk.
The probability distribution of a discrete random variable is a list of
probabilities associated with each of its possible values. It is also sometimes called
the probability function or the probability mass function. The probability density
function of a continuous random variable is a function which can be integrated to
obtain the probability that the random variable takes a value in a given interval.
The binomial distribution is used in finite sampling problems where each observation
is one of two possible outcomes (‘Success’ or ‘Failure’). The Poisson distribution
is used for modelling rates of occurrence. The exponential distribution is used to
describe units that have a constant failure rate. The term ‘Normal Distribution’
refers to a particular way in which observations will tend to pile up around a
particular value rather than be spread evenly across a range of values, i.e., the
central limit theorem.
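The discrete distributions named in this overview have short closed forms. A minimal sketch; the parameter values n = 100, p = 0.02 are assumed for illustration, and the example also shows the Poisson approximation to the binomial that is developed later in the unit:

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable: C(n, k) p^k (1-p)^(n-k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with rate lam."""
    return exp(-lam) * lam**k / factorial(k)

# Poisson(np) approximates Binomial(n, p) when n is large and p is small.
b = binomial_pmf(2, 100, 0.02)   # assumed illustrative parameters
p = poisson_pmf(2, 2.0)          # lam = np = 2
```

For these parameters the two probabilities agree to about two decimal places, and the binomial probabilities over all k sum to one, as any probability mass function must.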
In this unit, you will learn about the history and meaning of statistics, scope
of statistics, various measures of statistical computations, measures of dispersion,
standard deviation, probability and standard probability distribution.
4.1 OBJECTIVES
After going through this unit, you will be able to:
Examine the functions and meaning of statistics
Understand the various measures of statistical data
Analyse the absolute and relative measures of dispersion
Discuss the meaning, uses and merits of range in statistical presentation
Define standard deviation
Understand the basic concept of probability
Understand random experiments
Explain about the concepts of probability distribution
Describe the Poisson distribution
Analyse Poisson distribution as an approximation of binomial distribution
Understand exponential distribution
Learn about the uniform distribution (discrete random and continuous
variable)
Table 4.1 Statistics of Students of a College where Total Number of Students is 1,000
Height    Number
Total     1,000
Table 4.2 Characteristics of Statistical Data
        Column (a)                       Column (b)
Year    Population (in lakhs)    Year    Population (in lakhs)
1951    3,569                    1911    2,490
                                 1921    2,481
                                 1931    2,755
                                 1941    3,128*
                                 1951    3,569
                                 1961    4,390
                                 1971    5,470
*After deducting estimated amount of inflation of returns in West Bengal and Punjab
(20 lakhs).
They must be affected to a marked extent by a multiplicity of causes:
The term statistical data can be used only when we cannot predict exactly the
values of the various physical quantities. This means that the numerical value of
any quantity at any particular moment is the result of the action and interaction of
a number of forces, differing amongst themselves and it is not possible to say as to
how much of it is due to any one particular cause. Thus, the volume of wheat
production is attributable to a number of factors, viz., rainfall, soil, fertility, quality
of seed, methods of cultivation, etc. All these factors acting jointly determine the
amount of the yield and it is not possible for any one to assess the individual
contribution of any one of these factors.
Statistics must be enumerated or estimated according to reasonable
standards of accuracy: This means that if aggregates of numerical facts are to be
called ‘statistics’ they must be reasonably accurate. This is necessary because
statistical data are to serve as a basis for statistical investigations. If the basis
happens to be incorrect the results are bound to be misleading. It must, however,
be clearly stated that it is not ‘mathematical accuracy’, but only ‘reasonable accuracy’,
that is necessary in statistical work. What standard of accuracy is to be regarded
as reasonable will depend upon the aims and objects of the inquiry. Where precision
is required, accuracy is necessary; where general impressions are sufficient,
appreciable errors may be tolerated. Again, whatever standard of accuracy is
once adopted, it should be uniformly maintained throughout the inquiry.
Statistics are collected in a systematic manner for a predetermined
purpose: Numerical data can be called statistics only if they have been compiled
in a properly planned manner and for a purpose about which the enumerator had
a definite idea. So long as the compiler is not clear about the object for which facts
are to be collected, he will not be able to distinguish between facts that are relevant
and those that are unnecessary; and as such the data collected will, in all probability,
be a heterogeneous mass of unconnected facts. Again, the procedure of data
collection must be properly planned, i.e., it must be decided beforehand as to
what kind of information is to be collected and the method that is to be applied in
obtaining it. This involves decisions on matters like ‘statistical unit,’ ‘standard of
accuracy,’ ‘list of questions,’ etc. Facts collected in an unsystematic manner, and
without a complete awareness of the object, will be confusing and cannot be
made the basis of valid conclusions.
Statistics should be placed in relation to each other: Numerical facts
may be placed in relation to each other either in point of time, space or condition.
The phrase ‘placed in relation to each other’ suggests that the facts should be
comparable. Facts are comparable in point of time when we have measurements
NOTES of the same object, obtained in an identical manner, for different periods. They are
said to be related in point of space or condition when we have the measurements
of the same phenomenon at different places or in different conditions, but at the
same time. Numerical facts will be comparable, if they pertain to the same inquiry
and have been compiled in a systematic manner for a predetermined purpose.
Putting all these characteristics together, Secrist has defined statistics
(numerical descriptions) as: ‘Aggregates of facts, affected to a marked extent by
multiplicity of causes, numerically expressed, enumerated or estimated, according
to reasonable standard of accuracy, collected in a systematic manner, for a
predetermined purpose, and placed in relation to each other.’
Some Other Definitions of Statistics
As numerical data
Webster has defined statistics as ‘Classified facts respecting the condition of the
people in a state especially those facts which can be stated in numbers or in tables
or in any other tabular or classified arrangement.’ No doubt, this definition was
correct at a time when statistics were collected only for purposes of internal
administration or for knowing, for purposes of war, the wealth of the State. The
scope of statistics is now considerably wider and it has almost a universal application.
Obviously, therefore, the definition is inadequate.
Bowley defines statistics as ‘numerical statements of facts in any department
of inquiry placed in relation to each other.’ This is somewhat more accurate. It
means that if numerical facts do not pertain to a department of inquiry or if such
facts are not related to each other they cannot be called statistics. This leads us to
the conclusion that ‘all statistics are numerical facts but all numerical facts are not
statistics.’ This definition is certainly better than the previous one, but it is not
comprehensive enough inasmuch as it does not give any importance either to the
nature of facts or the standard of accuracy.
As Statistical Methods
Bowley has called it ‘The science of measurement of the social organism, regarded
as a whole, in all its manifestations.’ This definition is too narrow as it confines the
scope of statistics only to human activities. Statistics in fact has a much wider
application and is not confined only to the social organism. Besides, statistics is
not only the technique of measuring but also of analysing and interpreting. Again,
statistics, strictly speaking, is not a science but a scientific method. It is a device of
inferring knowledge and not knowledge itself.
Bowley has also called statistics ‘the science of counting,’ and ‘the science
of average.’ These definitions are again incomplete in the sense that they pertain to
only a limited field. True, statistical work includes counting and averaging, but it
also includes many other processes of treating quantitative data. In fact, while
dealing with large numbers, actual count becomes illusory and only estimates are
made. Thus these definitions can also be discarded on the ground of inadequacy.
Origin of Statistics
Statistics originated from two quite dissimilar fields, viz., games of chance and
political states. These two different fields are also termed as two distinct
disciplines—one primarily analytical and the other essentially descriptive. The
former is associated with the concept of chance and probability and the latter is
concerned with the collection of data.
The theoretical development of the subject has its origin in the mid-17th century
and many mathematicians and gamblers of France, Germany and England are
credited for its development. Notable amongst them are Pascal (1623–1662),
who investigated the properties of the coefficients of binomial expansion and James
Bernoulli (1654–1705), who wrote the first treatise on the theory of probability.
As regards the descriptive side of statistics it may be stated that statistics is
as old as statecraft. Since time immemorial men must have been compiling
information about wealth and manpower for purpose of peace and war. This activity
considerably expanded at each upsurge of social and political development and
received added impetus in periods of war.
The development of statistics can be divided into the following three stages:
The empirical stage (1600): During this, the primitive stage of the subject,
numerical facts were utilized by the rulers, principally as an aid in the administration
of Government. Information was gathered about the number of people and the
amount of property held by them—the former serving the ruler as an index of
human fighting strength and the latter as an indication of actual and potential taxes.
The comparative stage (1600–1800): During this period statisticians
frequently made comparisons between nations with a view to judging their relative
strength and prosperity. In some countries enquiries were instituted to judge the
economic and social conditions of their people. Colbert introduced in France a
‘mercantile’ theory of government whose basis was essentially statistical in
character. In 1719, Frederick William I began gathering information about
population, occupation, house-taxes, city finance, etc., which helped to study the
condition of the people.
The modern stage (1800 up to date): During this period statistics is viewed
as a way of handling numerical facts rather than a mere device of collecting numerical
data. Besides, there has been a considerable extension of the field of its applicability.
It has now become a useful tool and statistical methods of analysis are now being
increasingly used in biology, psychology, education, economics and business.
Characteristics of the Mean
The arithmetic mean has some interesting properties. These are as follows:
(i) The sum of the deviations of individual values of X from the mean will always
add up to zero. This means that if we subtract all the individual values from
their mean, then some values will be negative and some will be positive, but
if all these differences are added together then the sum will be zero. In other
words, the positive deviations must balance the negative deviations. Or
symbolically,
    Σ (Xi − X̄) = 0,  i = 1, 2, ..., n
(iv) The product of the arithmetic mean and the number of values on which the
mean is based is equal to the sum of all given values. In other words, if we
replace each item in series by the mean, then the sum of these substitutions
will equal the sum of individual items. Thus, if we take random figures as an
example like 3, 5, 7, 9, and if we substitute the mean for each item 6, 6, 6,
6 then the total is 24, both in the original series and in the substitution series.
This can be shown as follows:
    Since X̄ = ΣX / N, we have N X̄ = ΣX.
For example, if we have a series of values 3, 5, 7, 9, the mean is 6. The
squared deviations will be:
    X     X − X̄         (X − X̄)²
    3     3 − 6 = −3     9
    5     5 − 6 = −1     1
    7     7 − 6 = +1     1
    9     9 − 6 = +3     9
                  Σ(X − X̄)² = 20
This property provides a test to check if the computed value is the correct
arithmetic mean.
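These properties are easy to verify numerically for the series 3, 5, 7, 9 used in the text:

```python
values = [3, 5, 7, 9]                    # the series used in the text
mean = sum(values) / len(values)         # 6

# Property (i): deviations from the mean sum to zero.
deviations = [x - mean for x in values]

# Sum of squared deviations, as tabulated in the text (equals 20).
squared_total = sum(d * d for d in deviations)
```

Trying any value other than 6 in place of the mean leaves a non-zero sum of deviations, which is the check on the computed mean mentioned above.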
Example 4.2: The mean age of a group of 100 persons (grouped in intervals
10–, 12–, ..., etc.) was found to be 32.02. Later, it was discovered that age 57
was misread as 27. Find the corrected mean.
Solution: Let the mean be denoted by X̄. Putting the given values in the
formula of the arithmetic mean, we have,
    32.02 = ΣX / 100, i.e., ΣX = 3202
    Corrected ΣX = 3202 − 27 + 57 = 3232
    Correct AM = 3232 / 100 = 32.32
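The correction in Example 4.2 can be checked in a few lines:

```python
n, wrong_mean = 100, 32.02
wrong_total = wrong_mean * n            # 3202, total implied by the wrong mean
correct_total = wrong_total - 27 + 57   # replace the misread age 27 with 57
correct_mean = correct_total / n        # 32.32
```

Only the total changes; the count of 100 persons stays the same, so the correction is a single add-and-subtract on the total.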
Example 4.3: The mean monthly salary paid to all employees in a company is
500. The monthly salaries paid to male and female employees average 520 and
420, respectively. Determine the percentage of males and females employed by
the company.
Solution: Let N1 be the number of males and N2 be the number of females
employed by the company. Also, let x1 and x2 be the monthly average salaries
paid to male and female employees and x̄ be the mean monthly salary paid to all
the employees.
    x̄ = (N1 x1 + N2 x2) / (N1 + N2)
or  500 = (520 N1 + 420 N2) / (N1 + N2), which gives 20 N1 = 80 N2
or  N1 / N2 = 80 / 20 = 4 / 1
Hence, the males and females are in the ratio of 4 : 1, i.e., 80 per cent are
males and 20 per cent are females among those employed by the company.
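The ratio found in Example 4.3 can be verified by recomputing the combined mean:

```python
# With 80% males and 20% females the overall mean of 500 is recovered.
n1, n2 = 80, 20        # numbers of male and female employees (ratio 4 : 1)
x1, x2 = 520, 420      # average monthly salaries
combined = (n1 * x1 + n2 * x2) / (n1 + n2)
```

This is the combined (weighted) mean of the two groups, weighted by group size.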
Short-Cut Methods for Calculating Mean
We can simplify the calculations of the mean by noticing that if we subtract a constant
amount A from each item X to define a new variable X′ = X − A, the mean X̄′ of
X′ differs from X̄ by A. This generally simplifies the calculations and we can then
add back the constant A, termed as the assumed mean, such that,
    X̄ = A + Σf(X′) / Σf
Table 4.3 Short-Cut Method of Calculating Mean
X    f    Deviation from Assumed Mean (13), X′   f(X′)
9    1    −4                                     −4
10   2    −3                                     −6
11   3    −2                                     −6
12   6    −1                                     −6
13   10   0                                      0
14   11   +1                                     +11
15   7    +2                                     +14
16   3    +3                                     +9
17   2    +4                                     +8
18   1    +5                                     +5
     Σf = 46                                     Σf(X′) = +47 − 22 = 25
The mean,
    X̄ = A + Σf(X′) / Σf = 13 + 25/46 = 13.54
This mean is the same as calculated in Example 4.1.
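The short-cut calculation of Table 4.3 can be verified in code:

```python
# Frequency table from Table 4.3, with assumed mean A = 13.
xs = [9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
fs = [1, 2, 3, 6, 10, 11, 7, 3, 2, 1]
A = 13

dev_total = sum(f * (x - A) for x, f in zip(xs, fs))   # Σ f(X') = 25
mean = A + dev_total / sum(fs)                         # 13 + 25/46
```

Because the deviations X′ are small integers, the hand arithmetic is far lighter than multiplying each X by its frequency directly, which is the whole point of the assumed-mean device.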
In the case of grouped frequency data, the variable X is replaced by the midvalue
m, and in the short-cut technique, we subtract a constant value A from each m, so
that the formula becomes:
    X̄ = A + Σf(m − A) / Σf
In cases where the class intervals are equal, we may further simplify the calculation
by taking out the factor i from the variable m − A, defining,
    X′ = (m − A) / i
where i is the class width. It can be verified that when X′ is so defined, the mean
of the distribution is given by,
    X̄ = A + [Σf(X′) / Σf] × i
Calculate the arithmetic mean of the two groups after the classification.
S.No. Age of Husband Age of Wife
1 28 23
2 37 30
3 42 40
4 25 26
5 29 25
6 47 41
7 37 35
8 35 25
9 23 21
10 41 38
11 27 24
12 39 34
13 23 20
14 33 31
15 36 29
16 32 35
17 22 23
18 29 27
19 38 34
20 48 47
Solution:
Calculation of Arithmetic Mean of Husbands’ Age
Class Intervals   Midvalues m   Frequency (f1)   x1′ = (m − 37)/5   f1x1′
20–24             22            3                −3                 −9
25–29             27            5                −2                 −10
30–34             32            2                −1                 −2
35–39             37            6                0                  0
40–44             42            2                +1                 +2
45–49             47            2                +2                 +4
                                Σf1 = 20                            Σf1x1′ = −15
Arithmetic mean of husbands’ age,
    x̄1 = A + (Σf1x1′ / N) × i = 37 + (−15 / 20) × 5 = 37 − 3.75 = 33.25
Calculation of Arithmetic Mean of Wives’ Age
Class Intervals   Midvalues m   Frequency (f2)   x2′ = (m − 37)/5   f2x2′
20–24             22            5                −3                 −15
25–29             27            5                −2                 −10
30–34             32            4                −1                 −4
35–39             37            3                0                  0
40–44             42            2                +1                 +2
45–49             47            1                +2                 +2
                                Σf2 = 20                            Σf2x2′ = −25
Arithmetic mean of wives’ age,
    x̄2 = A + (Σf2x2′ / N) × i = 37 + (−25 / 20) × 5 = 37 − 6.25 = 30.75
The Weighted Arithmetic Mean
In the computation of arithmetic mean we had given equal importance to each
observation in the series. This equal importance may be misleading if the individual
values constituting the series have different importance as in Example 4.5.
Example 4.5: The Raja Toy shop sells
Toy Cars at 3 each
Toy Locomotives at 5 each
Toy Aeroplanes at 7 each
Toy Double Decker at 9 each
What will be the average price of the toys sold, if the shop sells 4 toys, one of each
kind?
Solution:
    Mean price, x̄ = Σx / 4 = 24 / 4 = 6
In this case, the importance of each observation (price quotation) is equal in
as much as one toy of each variety has been sold. In the computation of the
arithmetic mean, this fact has been taken care of by including ‘once only’ the price
of each toy.
If, however, the shop sells 100 toys: 50 cars, 25 locomotives, 15 aeroplanes
and 10 double deckers, the importance of the four price quotations to the dealer
is not equal as a source of earning revenue. In fact, their respective importance is
equal to the number of units of each toy sold, i.e.,
The importance of Toy Car 50
The importance of Locomotive 25
The importance of Aeroplane 15
The importance of Double Decker 10
It may be noted that 50, 25, 15, 10 are the quantities of the various classes
of toys sold. It is for these quantities that the term ‘weights’ is used in statistical
language. Weight is represented by the symbol ‘w’, and Σw represents the sum of
the weights.
While determining the ‘Average price of toy sold’, these weights are of
great importance and are taken into account in the manner as shown,
    x̄w = (w1x1 + w2x2 + w3x3 + w4x4) / (w1 + w2 + w3 + w4) = Σwx / Σw
where w1, w2, w3, w4 are the respective weights of x1, x2, x3, x4, which in turn
represent the prices of the four varieties of toys, viz., car, locomotive, aeroplane and
double decker, respectively.
    x̄w = (50 × 3 + 25 × 5 + 15 × 7 + 10 × 9) / (50 + 25 + 15 + 10)
        = (150 + 125 + 105 + 90) / 100 = 470 / 100 = 4.70
The following table summarizes the steps taken in the computation of the
weighted arithmetic mean.
Weighted Arithmetic Mean of Toys Sold by Raja Toy Shop
    Σw = 6,000    Σwx = 38,000
    x̄w = Σwx / Σw = 38,000 / 6,000 = 6.33
These examples explain that ‘Arithmetic Means and Percentage’ are not
original data. They are derived figures and their importance is relative to the original
data from which they are obtained. This relative importance must be taken into
account by weighting while averaging them (means and percentage).
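The weighted mean of the toy prices can be computed in a few lines, alongside the unweighted mean for contrast:

```python
prices = [3, 5, 7, 9]          # toy prices: car, locomotive, aeroplane, double decker
weights = [50, 25, 15, 10]     # quantities sold, used as weights

weighted_mean = sum(w * x for w, x in zip(weights, prices)) / sum(weights)
simple_mean = sum(prices) / len(prices)
```

The unweighted mean (6) overstates the typical sale price because the cheaper toys sell in far greater numbers; the weighted mean (4.70) reflects that.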
Advantages of Mean
(i) Its concept is familiar to most people and is intuitively clear.
(ii) Every data set has a mean, which is unique and describes the entire data to
some degree. For example, when we say that the average salary of a
professor is 25,000 per month, it gives us a reasonable idea about the
salaries of professors.
(iii) It is a measure that can be easily calculated.
(iv) It includes all values of the data set in its calculation.
(v) Its value varies very little from sample to sample taken from the same
population.
(vi) It is useful for performing statistical procedures, such as computing and
comparing the means of several data sets.
Disadvantages of Mean
(i) It is affected by extreme values, and hence is not very reliable when the
data set has extreme values, especially when these extreme values lie on
one side of the ordered data. A mean of such data is not truly
representative of the data. For example, the average age of three persons
of ages 4, 6 and 80 years gives us an average of 30.
(ii) It is tedious to compute for a large data set as every point in the data set is
to be used in computations.
(iii) We are unable to compute the mean for a data set that has open-ended
classes either at the high or at the low-end of the scale.
(iv) The mean cannot be calculated for qualitative characteristics, such as beauty
or intelligence, unless these can be converted into quantitative figures such
as intelligence into IQs.
4.3.3 Median
The second measure of central tendency that has a wide usage in statistical works
is the median. Median is that value of a variable which divides the series in such a
manner that the number of items below it is equal to the number of items above it.
Half the total number of observations lie below the median, and half above it. The
median is thus a positional average.
The median of ungrouped data is found easily if the items are first arranged
in order of the magnitude. The median may then be located simply by counting,
and its value can be obtained by reading the value of the middle observations. If
we have five observations whose values are 8, 10, 1, 3 and 5, the values are first
arrayed: 1, 3, 5, 8 and 10. It is now apparent that the value of the median is 5,
since two observations are below that value and two observations are above it.
When there is an even number of cases, there is no actual middle item and the
median is taken to be the average of the values of the items lying on either side of
(N + 1)/2, where N is the total number of items. Thus, if the values of six items of
a series are 1, 2, 3, 5, 8 and 10, then the median is the value of item number
(6 + 1)/2 = 3.5, which is taken as the average of the third and the fourth items,
i.e., (3 + 5)/2 = 4.
Thus, the steps required for obtaining median are as follows:
(i) Arrange the data as an array of increasing magnitude.
(ii) Obtain the value of the (N + 1)/2th item.
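The two steps above can be sketched directly in Python (a minimal helper of our own naming); it reproduces both of the small examples in the text:

```python
# Median of ungrouped data: arrange in ascending order, take the (N+1)/2-th
# item, averaging the two middle items when N is even.
def median(values):
    arr = sorted(values)           # step (i): arrange as an array
    n = len(arr)
    mid = n // 2
    if n % 2 == 1:                 # odd N: single middle item
        return arr[mid]
    return (arr[mid - 1] + arr[mid]) / 2  # even N: average of the middle pair

print(median([8, 10, 1, 3, 5]))     # 5
print(median([1, 2, 3, 5, 8, 10]))  # 4.0
```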
Even in the case of grouped data, the procedure for obtaining median is
straightforward as long as the variable is discrete or non-continuous as is clear
from Example 4.7.
Example 4.7: Obtain the median size of shoes sold from the following data:
Number of Shoes Sold by Size in One Year
Size      Number Sold    Cumulative Frequency
5              30                 30
5½             40                 70
6              50                120
6½            150                270
7             300                570
7½            600               1170
8             950               2120
8½            820               2940
9             750               3690
9½            440               4130
10            250               4380
10½           150               4530
11             40               4570
11½            39               4609
Total        4609
Solution: The median is the value of the (N + 1)/2 th = (4609 + 1)/2 th = 2305th
item. Since the items are already arranged in ascending order (size-wise), the size
of the 2305th item is easily determined by constructing the cumulative frequency.
Thus, the median size of shoes sold is 8½, the size of the 2305th item.
In the case of grouped data with a continuous variable, the determination of
the median is a bit more involved. Consider the following table where the data relating
to the distribution of male workers by average monthly earnings is given. Clearly
the median of the 6291 workers is the earnings of the (6291 + 1)/2 = 3146th worker
arranged in ascending order of earnings.
From the cumulative frequency, it is clear that this worker has his income in
the class interval 67.5 – 72.5. However, it is impossible to determine his exact
income. We therefore resort to approximation by assuming that the 795 workers
of this class are distributed uniformly across the interval 67.5 – 72.5. The median
worker is the (3146 – 2713) = 433rd of these 795, and hence the value corresponding
to him can be approximated as,

67.5 + (433/795) × (72.5 – 67.5) = 67.5 + 2.72 = 70.22
Distribution of Male Workers by Average Monthly Earnings
Total 6291
The value of the median can thus be put in the form of the formula,

Me = l + (((N + 1)/2 – C) / f) × i

Where l is the lower limit of the median class, i its width, f its frequency, C the
cumulative frequency upto (but not including) the median class, and N is the total
number of cases.
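The formula can be sketched as a small function (names are ours); with the male-workers figures it gives a median of about 70.2:

```python
# Grouped-data median: Me = l + (((N + 1)/2 - C) / f) * i
def grouped_median(l, C, f, i, N):
    # l: lower limit of median class, C: cumulative frequency before it,
    # f: frequency of the median class, i: class width, N: total cases
    return l + (((N + 1) / 2 - C) / f) * i

# Median class 67.5-72.5, C = 2713, f = 795, N = 6291
print(round(grouped_median(67.5, 2713, 795, 5, 6291), 1))  # 70.2
```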
Finding Median by Graphical Analysis
The median can quite conveniently be determined by reference to the ogive which
plots the cumulative frequency against the variable. The value of the item below
which half the items lie, can easily be read from the ogive as is shown in
Example 4.8.
Example 4.8: Obtain the median of data given in the following table:
Earnings    Frequency    Cumulative Frequency    Cumulative Frequency
                             (Less than)             (More than)
27.5           __                 0                     6291
32.5          120               120                     6171
37.5          152               272                     6019
42.5          170               442                     5849
47.5          214               656                     5635
52.5          410              1066                     5225
57.5          429              1495                     4796
62.5          568              2063                     4228
67.5          650              2713                     3578
72.5          795              3508                     2783
77.5          915              4423                     1868
82.5          745              5168                     1123
87.5          530              5698                      593
92.5          259              5957                      334
97.5          152              6109                      182
102.5         107              6216                       75
107.5          50              6266                       25
112.5          25              6291                        0
Solution: It is clear that this is grouped data. The first class is 27.5 – 32.5, whose
frequency is 120, and the last class is 107.5 – 112.5 whose frequency is 25.
Figure 4.1 shows the ogive of the less-than cumulative frequency. The median,
the value below which N/2 = 6291/2 = 3145.5 items lie, is read off from
Figure 4.2 as about 70. More accuracy than this is unobtainable because of
the space limitation on the earnings scale.
[Graph: ‘Less than’ and ‘More than’ ogives for the number of workers (0–6291)
against earnings (27.5–112.5); the two curves intersect at the median.]
Fig. 4.1 Median Determination by Plotting Less than and More than
Cumulative Frequency
The median can also be determined by plotting both ‘Less than’ and ‘More
than’ cumulative frequency as shown in Figure 4.1. It should be obvious that the
two curves intersect at the median of the data.
[Fig. 4.2: ‘Less than’ ogive of the number of workers against earnings
(27.5–112.5), with the median marked at about 70.]
Advantages of Median
(i) Median is a positional average and hence the extreme values in the data set
do not affect it as much as they do to the mean.
(ii) Median is easy to understand and can be calculated from any kind of data,
even from grouped data with open-ended classes.
(iii) We can find the median even when our data set is qualitative and can be
arranged in the ascending or the descending order, such as average beauty
or average intelligence.
(iv) Similar to the mean, the median is also unique; there is only one median
in a given set of data.
(v) Median can be located visually when the data is in the form of ordered
data.
(vi) The sum of absolute differences of all values in the data set from the median
value is a minimum; it is smaller than the corresponding sum about any other
value, which makes the median more central in certain situations.
Disadvantages of Median
(i) The data must be arranged in order to find the median. This can be very
time consuming for a large number of elements in the data set.
(ii) The value of the median is affected more by sampling variations. Different
samples from the same population may give significantly different values of
the median.
(iii) The calculation of median in case of grouped data is based on the assumption
that the values of observations are evenly spaced over the entire class interval,
and this is usually not so.
(iv) Median is comparatively less stable than mean, particularly for small samples,
due to fluctuations in sampling.
(v) Median is not suitable for further mathematical treatment. For example, we
cannot compute the median of the combined group from the median values
NOTES
of different groups.
4.3.4 Mode
Mode is that value of the variable which occurs or repeats itself the greatest
number of times. The mode is the most ‘Fashionable’ size in the sense that it is
the most common and typical, and is defined by Zizek as ‘the value occurring
most frequently in a series (or group of items) and around which the other items
are distributed most densely’.
The mode of a distribution is the value at the point around which the items
tend to be most heavily concentrated. It is the most frequent or the most common
value, provided that a sufficiently large number of items are available, to give a
smooth distribution. It will correspond to the value of the maximum point (ordinate),
of a frequency distribution if it is an ‘ideal’ or smooth distribution. It may be regarded
as the most typical of a series of values. The modal wage, for example, is the wage
received by more individuals than any other wage. The modal ‘hat’ size is that
which is worn by more persons than any other single size.
It may be noted that the occurrence of one or a few extremely high or low
values has no effect upon the mode. If a series of data is unclassified, having
been neither arrayed nor put into a frequency distribution, the mode cannot be
readily located.
Taking first an extremely simple example, if seven men receive daily wages
of 5, 6, 7, 7, 7, 8 and 10, it is clear that the modal wage is 7 per day. If we
have a series such as 2, 3, 5, 6, 7, 10 and 11, it is apparent that there is no mode.
There are several methods of estimating the value of the mode. However, it
is seldom that the different methods of ascertaining the mode give us identical
results. Consequently, it becomes necessary to decide as to which method would
be most suitable for the purpose in hand. In order that a choice of the method may
be made, we should understand each of the methods and the differences that exist
among them.
The four important methods of estimating mode of a series are: (i) Locating
the most frequently repeated value in the array; (ii) Estimating the mode by
interpolation; (iii) Locating the mode by graphic method; and (iv) Estimating the
mode from the mean and the median. Only the last three methods are discussed in
this unit.
Estimating the Mode by Interpolation: In the case of continuous
frequency distributions, the problem of determining the value of the mode is not so
simple as it might have appeared from the foregoing description. Having located
the modal class of the data, the next problem in the case of continuous series is to
interpolate the value of the mode within this ‘modal’ class.
The interpolation is made by the use of any one of the following formulae:

(i) Mo = l1 + (f2 / (f0 + f2)) × i
(ii) Mo = l2 – (f0 / (f0 + f2)) × i
(iii) Mo = l1 + ((f1 – f0) / ((f1 – f0) + (f1 – f2))) × i

Where l1 is the lower limit of the modal class, l2 is the upper limit of the modal
class, f0 equals the frequency of the preceding class in value, f1 equals the frequency
of the modal class in value, f2 equals the frequency of the following class (class
next to modal class) in value, and i equals the interval of the modal class. Example
4.9 explains the method of estimating mode.
Example 4.9: Determine the mode for the data given in the following table:
Wage Group Frequency (f)
14 — 18 6
18 — 22 18
22 — 26 19
26 — 30 12
30 — 34 5
34 — 38 4
38 — 42 3
42 — 46 2
46 — 50 1
50 — 54 0
54 — 58 1
Solution:
In the given data, 22 – 26 is the modal class since it has the largest frequency. The
lower limit of the modal class is 22, its upper limit is 26, its frequency is 19, the
frequency of the preceding class is 18, and of the following class is 12. The class
interval is 4. Using the various methods of determining mode, we have,
(i) Mo = 22 + (12 / (18 + 12)) × 4 = 22 + 8/5 = 23.6
(ii) Mo = 26 – (18 / (18 + 12)) × 4 = 26 – 12/5 = 23.6
(iii) Mo = 22 + ((19 – 18) / ((19 – 18) + (19 – 12))) × 4 = 22 + 4/8 = 22.5
In formulae (i) and (ii), the frequency of the classes adjoining the modal
class is used to pull the estimate of the mode away from the midpoint towards
either the upper or lower class limit. In this particular case, the frequency of the
class preceding the modal class is more than the frequency of the class following
and therefore, the estimated mode is less than the midvalue of the modal class.
This seems quite logical. If the frequencies are more on one side of the modal class
than on the other it can be reasonably concluded that the items in the modal class
are concentrated more towards the class limit of the adjoining class with the larger
frequency.
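Formula (iii) above can be sketched as a small function (the name is ours); it reproduces the wage-group mode of 22.5:

```python
# Mode by interpolation, formula (iii):
#   Mo = l1 + ((f1 - f0) / ((f1 - f0) + (f1 - f2))) * i
def mode_interpolated(l1, f0, f1, f2, i):
    # l1: lower limit of modal class; f0, f1, f2: frequencies of the
    # preceding, modal and following classes; i: class interval
    return l1 + ((f1 - f0) / ((f1 - f0) + (f1 - f2))) * i

# Wage-group example: modal class 22-26, f0 = 18, f1 = 19, f2 = 12, i = 4
print(mode_interpolated(22, 18, 19, 12, 4))  # 22.5
```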
Formula (iii) is also based on a logic similar to that of formulae (i) and (ii). In
this case, to interpolate the value of the mode within the modal class, the differences
between the frequency of the modal class, and the respective frequencies of the
classes adjoining it are used. This formula usually gives better results than the
values obtained by the other two and exactly equals the result obtained by the
graphic method. Formulae (i) and (ii) give values which differ from the value obtained
by formula (iii) and are closer to the central point of the modal class. If the frequencies
of the class adjoining the modal are equal, the mode is expected to be located at
the midvalue of the modal class, but if the frequency on one of the sides is greater,
the mode will be pulled away from the central point. It will be pulled more and
more if the difference between the frequencies of the classes adjoining the modal
class is greater. In the given example, the frequency of the
modal class is 19 and that of the preceding class is 18. So, the mode should be
quite close to the lower limit of the modal class. The midpoint of the modal class is
24 and the lower limit of the modal class is 22.
Locating the Mode by the Graphic Method: The method of graphic
interpolation is shown in Figure 4.3. The upper corners of the rectangle over the
modal class have been joined by straight lines to those of the adjoining rectangles
as shown in Figure 4.3; the right corner to the corresponding one of the adjoining
rectangle on the left, etc. If a perpendicular is drawn from the point of intersection
of these lines, we have a value for the mode indicated on the base line. The
graphic approach is, in principle, similar to the arithmetic interpolation explained
earlier.
The mode may also be determined graphically from an ogive or cumulative
frequency curve. It is found by drawing a perpendicular to the base from that
point on the curve where the curve is most nearly vertical, i.e., steepest (in other
words, where it passes through the greatest distance vertically and smallest distance
horizontally). The point where it cuts the base gives us the value of the mode. How
accurately this method determines the mode is governed by: (i) The shape of the
ogive, (ii) The scale on which the curve is drawn.
If n geometric means G1, G2, ..., Gn are inserted between a and b, then
b = ar^(n+1), so that the common ratio is

r = (b/a)^(1/(n+1))

and the nth geometric mean is

Gn = ar^n = a (b/a)^(n/(n+1))
Example 4.12: Find 7 GM’s between 1 and 256.
Solution: Let G1, G2, ... G7, be 7 GM’s between 1 and 256
Then, 256 = 9th term of the GP = 1 · r^8, where r is the common ratio of the GP.
This gives r^8 = 256, i.e., r = 2.
Thus, G1 = ar = 1 × 2 = 2
G2 = ar^2 = 1 × 4 = 4
G3 = ar^3 = 1 × 8 = 8
G4 = ar^4 = 1 × 16 = 16
G5 = ar^5 = 1 × 32 = 32
G6 = ar^6 = 1 × 64 = 64
G7 = ar^7 = 1 × 128 = 128
Hence, required GM’s are 2, 4, 8, 16, 32, 64, 128.
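The same construction works for any pair of endpoints; a small sketch (our own naming) using the common ratio r = (b/a)^(1/(k+1)):

```python
# Insert k geometric means between a and b: b is the (k+2)-th term of the GP,
# so the common ratio is r = (b/a) ** (1 / (k + 1)).
def geometric_means(a, b, k):
    r = (b / a) ** (1 / (k + 1))
    return [a * r ** j for j in range(1, k + 1)]

print([round(g) for g in geometric_means(1, 256, 7)])  # [2, 4, 8, 16, 32, 64, 128]
```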
Example 4.13: Sum the series 1 + 3x + 5x^2 + 7x^3 + ... up to n terms, x ≠ 1.
Solution: Note that the nth term of this series = (2n – 1) x^(n–1).
Let Sn = 1 + 3x + 5x^2 + ... + (2n – 1) x^(n–1)
Then, xSn = x + 3x^2 + ... + (2n – 3) x^(n–1) + (2n – 1) x^n
Subtracting, we get
Sn(1 – x) = 1 + 2x + 2x^2 + ... + 2x^(n–1) – (2n – 1) x^n
          = 1 + 2x (1 – x^(n–1)) / (1 – x) – (2n – 1) x^n
          = [1 – x + 2x – 2x^n – (2n – 1) x^n (1 – x)] / (1 – x)
          = [1 + x – 2x^n – (2n – 1) x^n + (2n – 1) x^(n+1)] / (1 – x)
          = [1 + x – (2n + 1) x^n + (2n – 1) x^(n+1)] / (1 – x)

Hence, Sn = [1 + x – (2n + 1) x^n + (2n – 1) x^(n+1)] / (1 – x)^2
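A quick numerical check of this closed form against direct term-by-term summation (plain Python; function names are ours):

```python
# Direct sum of 1 + 3x + 5x^2 + ... to n terms
def series_direct(x, n):
    return sum((2 * k - 1) * x ** (k - 1) for k in range(1, n + 1))

# Closed form: (1 + x - (2n+1)x^n + (2n-1)x^(n+1)) / (1 - x)^2
def series_closed(x, n):
    return (1 + x - (2 * n + 1) * x ** n + (2 * n - 1) * x ** (n + 1)) / (1 - x) ** 2

print(abs(series_direct(0.5, 10) - series_closed(0.5, 10)) < 1e-9)  # True
```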
= 18750 × (64/125) × (16/25)
= 750 × (1024/125)
= 1024 × 6
= 6144 rupees
Example 4.18: Show that a given sum of money accumulated at 20 % per annum,
more than doubles itself in 4 years at compound interest.
Solution: Let the given sum be a rupees. After 1 year it becomes 6a/5 (it is
increased by a/5).
At the end of two years it becomes (6/5)(6a/5) = (6/5)^2 a.
Proceeding in this manner, we get that at the end of the 4th year the amount
will be (6/5)^4 a = (1296/625) a.
Now, (1296/625) a – 2a = (46/625) a > 0; since a is a positive quantity, the
amount after 4 years is more than double the original amount.
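The growth factor can be checked exactly with rational arithmetic; a short sketch using Python’s fractions module:

```python
from fractions import Fraction

# At 20% p.a. compound interest the sum grows by a factor of 6/5 each year.
factor = Fraction(6, 5) ** 4   # growth after 4 years
print(factor)       # 1296/625
print(factor > 2)   # True: the sum more than doubles
```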
4.3.6 Harmonic Mean
If a, b, c are in HP, then b is called a Harmonic Mean between a and c, written as
HM. To insert n harmonic means between a and b, let H1, H2, H3, ..., Hn be the
required harmonic means. Then
a, H1, H2, ..., Hn, b are in HP,
i.e., 1/a, 1/H1, 1/H2, ..., 1/Hn, 1/b are in AP.
Then, 1/b = (n + 2)th term of the AP = 1/a + (n + 1)d
Where d is the common difference of the AP.
This gives, d = (a – b) / ((n + 1) ab)
Now, 1/H1 = 1/a + d = 1/a + (a – b) / ((n + 1) ab)
          = (nb + b + a – b) / ((n + 1) ab) = (a + nb) / ((n + 1) ab)
So, H1 = (n + 1) ab / (a + nb)
Again, 1/H2 = 1/a + 2d = (nb + b + 2a – 2b) / ((n + 1) ab)
            = (2a – b + nb) / ((n + 1) ab)
H2 = (n + 1) ab / (2a – b + nb)
Similarly, 1/H3 = 1/a + 3d = (3a – 2b + nb) / ((n + 1) ab)
H3 = (n + 1) ab / (3a – 2b + nb), and so on,
1/Hn = 1/a + nd = (nb + b + na – nb) / ((n + 1) ab) = (na + b) / ((n + 1) ab)
Hn = (n + 1) ab / (na + b)
Example 4.19: Find the 5th term of 2, 2½, 3⅓, ...
Solution: Let the 5th term be x. Then 1/x is the 5th term of the corresponding
AP 1/2, 2/5, 3/10, ...
The common difference is d = 2/5 – 1/2 = –1/10.
Then, 1/x = 1/2 + 4d = 1/2 – 4/10 = 1/10
So, x = 10.
Example 4.20: Insert two harmonic means between 1/2 and 4/17.
Solution: Let H1, H2 be two harmonic means between 1/2 and 4/17.
Thus, 2, 1/H1, 1/H2, 17/4 are in AP. Let d be their common difference.
Then, 17/4 = 2 + 3d
3d = 9/4, so d = 3/4
Thus, 1/H1 = 2 + 3/4 = 11/4, so H1 = 4/11
1/H2 = 2 + 2 × (3/4) = 7/2, so H2 = 2/7
The required harmonic means are 4/11 and 2/7.
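Both examples follow the same recipe: take reciprocals, form an AP, and invert back. A sketch in Python (function name ours), using exact fractions:

```python
from fractions import Fraction

# Insert k harmonic means between a and b: the reciprocals form an AP.
def harmonic_means(a, b, k):
    ra, rb = 1 / Fraction(a), 1 / Fraction(b)
    d = (rb - ra) / (k + 1)                 # common difference of reciprocals
    return [1 / (ra + j * d) for j in range(1, k + 1)]

print(harmonic_means(Fraction(1, 2), Fraction(4, 17), 2))  # [Fraction(4, 11), Fraction(2, 7)]
```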
Step 6: The line from the lower quartile to the minimum can be drawn from the
lower quartile to the smallest point that is greater than L1. Similarly, the line from
the upper quartile to the maximum can be drawn to the largest point smaller than
U1.
Step 7: Points between L1 and L2 or between U1 and U2 can be drawn as small
circles. Points less than L2 or greater than U2 can be drawn as large circles.
Thus the box plot identifies the middle 50% of the data, the median and the extreme
points. A single box plot can be drawn for one set of data with no distinct groups.
Alternatively, multiple box plots can be drawn together to compare multiple data
sets or to compare groups in a single data set. For a single box plot, the width of
the box is arbitrary. For multiple box plots, the width of the box plot can be set
proportional to the number of points in the given group or sample.
Check Your Progress
1. Define statistics.
2. How does statistics classify numerical facts?
3. What is the first step in the statistical treatment of a problem?
4. What is central tendency in statistics?
5. Define the term arithmetic mean.
6. When is weighted arithmetic mean used?
7. Define the term median.
8. What is mode?
9. What are the four important methods of estimating mode of a series?
10. What are positional measures?
4.4.1 Range
The crudest measure of dispersion is the range of the distribution. The range of
any series is the difference between the highest and the lowest values in the series.
If the marks received in an examination taken by 248 students are arranged in
ascending order, then the range will be equal to the difference between the highest
and the lowest marks.
In a frequency distribution, the range is taken to be the difference between
the lower limit of the class at the lower extreme of the distribution and the upper
limit of the class at the upper extreme.
Table 4.5 Weekly Earnings of Labourers in Four Workshops of the Same Type
No. of workers
Weekly earnings
Workshop A Workshop B Workshop C Workshop D
15–16 ... ... 2 ...
17–18 ... 2 4 ...
19–20 ... 4 4 4
21–22 10 10 10 14
23–24 22 14 16 16
25–26 20 18 14 16
27–28 14 16 12 12
29–30 14 10 6 12
31–32 ... 6 6 4
33–34 ... ... 2 2
35–36 ... ... ... ...
37–38 ... ... 4 ...
Total 80 80 80 80
Mean 25.5 25.5 25.5 25.5
For Table 4.5, the relative dispersion (range divided by the mean) would be:

Workshop A = 9/25.5        Workshop C = 23/25.5
Workshop B = 15/25.5       Workshop D = 15/25.5

Expressed as a coefficient of range, (L – S)/(L + S):

Workshop C = 23/(15 + 38) = 23/53        Workshop D = 15/(19 + 34) = 15/53
Table 4.6 Distribution with the Same Number of Cases, but Different Variability
            No. of students
Class    Section A    Section B    Section C
0–10 ... ... ...
10–20 1 ... ...
20–30 12 12 19
30–40 17 20 18
40–50 29 35 16
50–60 18 25 18
60–70 16 10 18
70–80 6 8 21
80–90 11 ... ...
90–100 ... ... ...
Total 110 110 110
Range 80 60 60
The table is designed to illustrate three distributions with the same number of cases
but different variability. The removal of two extreme students from section A would
make its range equal to that of B or C.
The greater range of A is not a description of the entire group of 110 students, but
of the two most extreme students only. Further, though sections B and C have the
same range, the students in section B cluster more closely around the central
tendency of the group than they do in section C. Thus, the range fails to reveal the
greater homogeneity of B or the greater dispersion of C. Due to this defect, it is
seldom used as a measure of dispersion.
Specific uses of range
In spite of the numerous limitations of the range as a measure of dispersion, there
are the following circumstances when it is the most appropriate one:
(i) In situations where the extremes involve some hazard for which preparation
should be made, it may be more important to know the most extreme cases
to be encountered than to know anything else about the distribution. For
example, an explorer would like to know the lowest and the highest
temperatures on record in the region he is about to enter; or an engineer
would like to know the maximum rainfall during 24 hours for the construction
of a storm water drain.
(ii) In the study of prices of securities, range has a special field of activity. Thus
to highlight fluctuations in the prices of shares or bullion it is a common
practice to indicate the range over which the prices have moved during a
certain period of time. This information, besides being of use to the operators,
gives an indication of the stability of the bullion market, or that of the
investment climate.
(iii) In statistical quality control the range is used as a measure of variation. We
determine, for example, the range over which variations in quality are due to
random causes, which is made the basis for the fixation of control limits.
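A range computation is a one-liner; the sketch below (function name ours) applies it to Workshop C’s class limits, 15 and 38, together with the coefficient of range (L – S)/(L + S):

```python
# Range = L - S; coefficient of range = (L - S) / (L + S)
def dispersion_range(smallest, largest):
    return largest - smallest, (largest - smallest) / (largest + smallest)

r, coeff = dispersion_range(15, 38)  # Workshop C class limits
print(r)                # 23
print(round(coeff, 3))  # 0.434
```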
4.4.2 Quartile Deviation
Another measure of dispersion, much better than the range, is the semi-interquartile
range, usually termed as ‘Quartile Deviation’. As stated in the previous unit, quartiles
are the points which divide the array in four equal parts. More precisely, Q1 gives
the value of the item 1/4th the way up the distribution and Q3 the value of the item
3/4th the way up the distribution. Between Q1 and Q3 are included half the total
number of items. The difference between Q1 and Q3 includes only the central
items but excludes the extremes. Since under most circumstances, the central half
of the series tends to be fairly typical of all the items, the interquartile range
(Q3– Q1) affords a convenient and often a good indicator of the absolute variability.
The larger the interquartile range, the larger the variability.
Usually, one-half of the difference between Q3 and Q1 is used and to it is given the
name of quartile deviation or semi-interquartile range. The interquartile range is
divided by two for the reason that half of the interquartile range will, in a normal
distribution, be equal to the difference between the median and any quartile. This
means that 50 per cent items of a normal distribution will lie within the interval
defined by the median plus and minus the semi-interquartile range.
Symbolically:
Q.D. = (Q3 – Q1) / 2        ...(4.1)
Let us find the quartile deviations for the weekly earnings of labour in the four
workshops whose data is given in Table 4.5. The computations are as shown in
Table 4.7. As shown in the table, the Q.D. of workshop A is 2.12 and the median
value is 25.3. This means that if the distribution were symmetrical, the number of
workers whose wages vary between (25.3 – 2.1) = 23.2 and (25.3 + 2.1) = 27.4
would be just half of the total cases. The other half of the workers would be more
than 2.1 removed from the median wage. As this distribution is not symmetrical,
the distance between Q1 and the median Q2 is not the same as between Q3 and
the median. Hence the interval defined by the median plus and minus the semi-
interquartile range will not be exactly the same as given by the values of the two
quartiles. Under such conditions the range between 23.2 and 27.4 will not include
precisely 50 per cent of the workers.
If quartile deviation is to be used for comparing the variability of any two series, it
is necessary to convert the absolute measure to a coefficient of quartile deviation.
To do this the absolute measure is divided by the average size of the two quartile.
Symbolically:
Coefficient of quartile deviation = (Q3 – Q1) / (Q3 + Q1)        ...(4.2)
Applying this to our illustration of four workshops, the coefficients of Q.D. are as
given below.
Table 4.7 Calculation of Quartile Deviation

                        Workshop A                 Workshop B                 Workshop C                 Workshop D
N                           80                         80                         80                         80
Location of Q2 = N/2        40                         40                         40                         40
Q2            24.5 + ((40 – 32)/20) × 2   24.5 + ((40 – 30)/18) × 2   24.5 + ((40 – 36)/14) × 2   24.5 + ((40 – 34)/16) × 2
               = 24.5 + 0.8 = 25.3         = 24.5 + 1.11 = 25.61       = 24.5 + 0.57 = 25.07       = 24.5 + 0.75 = 25.25
Location of Q1 = N/4        20                         20                         20                         20
Q1            22.5 + ((20 – 10)/22) × 2   22.5 + ((20 – 16)/14) × 2   20.5 + ((20 – 10)/10) × 2   22.5 + ((20 – 18)/16) × 2
               = 22.5 + 0.91 = 23.41       = 22.5 + 0.57 = 23.07       = 20.5 + 2 = 22.5           = 22.5 + 0.25 = 22.75
Location of Q3 = 3N/4       60                         60                         60                         60
Q3            26.5 + ((60 – 52)/14) × 2   26.5 + ((60 – 48)/16) × 2   26.5 + ((60 – 50)/12) × 2   26.5 + ((60 – 50)/12) × 2
               = 26.5 + 1.14 = 27.64       = 26.5 + 1.5 = 28.0         = 26.5 + 1.67 = 28.17       = 26.5 + 1.67 = 28.17

Coefficient of quartile deviation (Q3 – Q1)/(Q3 + Q1):
Workshop A = (27.64 – 23.41)/(27.64 + 23.41) = 0.083
Workshop B = (28.0 – 23.07)/(28.0 + 23.07) = 0.097
Workshop C = (28.17 – 22.5)/(28.17 + 22.5) = 0.112
Workshop D = (28.17 – 22.75)/(28.17 + 22.75) = 0.106
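The quartile computations in Table 4.7 use the same interpolation rule as the grouped median; a sketch for Workshop A (function name ours):

```python
# Q = l + ((position - C) / f) * i, the usual interpolation within a class
def quartile(l, C, f, i, position):
    return l + ((position - C) / f) * i

# Workshop A: N = 80, so Q1 is at item 20 and Q3 at item 60
q1 = quartile(22.5, 10, 22, 2, 20)  # Q1 lies in class 22.5-24.5
q3 = quartile(26.5, 52, 14, 2, 60)  # Q3 lies in class 26.5-28.5
print(round((q3 - q1) / 2, 2))          # 2.12  (quartile deviation)
print(round((q3 - q1) / (q3 + q1), 3))  # 0.083 (coefficient)
```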
x     f    c.f.    |d| = |x – 9|    f|d|
6     3      3           3            9
7     6      9           2           12
8     9     18           1            9
9    13     31           0            0
10    8     39           1            8
11    5     44           2           10
12    4     48           3           12
     48                              60

Median = the size of the (48 + 1)/2 = 24.5th item, which is 9.
Therefore, deviations d are calculated from 9, i.e., |d| = |x – 9|.
Mean deviation = Σf|d| / Σf = 60/48 = 1.25
Example 4.23: Calculate the mean deviation from the following data:
Solution:
This is a frequency distribution with continuous variable. Thus, deviations are
calculated from mid-values.
x        Mid-value    f    Less-than    Deviation from    f|d|
                              c.f.       median |d|
0–10         5       18       18             19           342
10–20       15       16       34              9           144
20–30       25       15       49              1            15
30–40       35       12       61             11           132
40–50       45       10       71             21           210
50–60       55        5       76             31           155
60–70       65        2       78             41            82
70–80       75        2       80             51           102
                     80                                  1182
Median = the size of the (80/2) = 40th item
       = 20 + ((40 – 34)/15) × 10 = 24
and then, mean deviation = Σf|d| / Σf = 1182/80 = 14.775
Coefficient of mean deviation = 14.775/24 = 0.616
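The whole computation of Example 4.23 condenses to a few lines (plain Python; variable names ours):

```python
# Mean deviation about the median for the grouped data of Example 4.23
mids = [5, 15, 25, 35, 45, 55, 65, 75]   # class mid-values
freqs = [18, 16, 15, 12, 10, 5, 2, 2]    # frequencies

med = 24  # median found above by interpolation
md = sum(f * abs(m - med) for m, f in zip(mids, freqs)) / sum(freqs)
print(md)  # 14.775
```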
4.5 STANDARD DEVIATION
By far the most universally used and the most useful measure of dispersion is the
standard deviation or root mean square deviation about the mean. We have seen
that all the methods of measuring dispersion so far discussed are not universally
adopted for want of adequacy and accuracy. The range is not satisfactory as its
magnitude is determined by the most extreme cases in the entire group. Further, the
range is unstable because it is dependent on items whose size is largely a matter of
chance. The mean deviation method is also an unsatisfactory measure of scatter, as it
ignores the algebraic signs of deviation. We desire a measure of scatter which is
free from these shortcomings. To some extent standard deviation is one such
measure.
The calculation of standard deviation differs in the following respects from
that of mean deviation. First, in calculating standard deviation, the deviations are
squared. This is done so as to get rid of negative signs without committing algebraic
violence. Further, the squaring of deviations provides added weight to the extreme
items, a desirable feature for certain types of series.
Secondly, the deviations are always recorded from the arithmetic mean,
because although the sum of deviations is the minimum from the median, the sum
of squares of deviations is minimum when deviations are measured from the
arithmetic average. The deviation from the mean x̄ is represented by d.
Thus, standard deviation, (sigma) is defined as the square root of the
mean of the squares of the deviations of individual items from their arithmetic
mean.
σ = √( Σ(x – x̄)² / N )        ...(4.6)

For grouped data (discrete variables)

σ = √( Σf(x – x̄)² / Σf )        ...(4.7)

and, for grouped data (continuous variables)

σ = √( Σf(M – x̄)² / Σf )        ...(4.8)

where M is the mid-value of the class.
Example 4.25: Find the standard deviation of the data in the following distributions:
x 12 13 14 15 16 17 18 20
f 4 11 32 21 15 8 5 4
Solution:
For this discrete grouped data, we use formula (4.7). Since for the calculation of
x̄ we need Σfx, and then for σ we need Σf(x – x̄)², the calculations are
conveniently made in the following format.
x f fx d=x– x d2 fd2
12 4 48 –3 9 36
13 11 143 –2 4 44
14 32 448 –1 1 32
15 21 315 0 0 0
16 15 240 1 1 15
17 8 136 2 4 32
18 5 90 3 9 45
20 4 80 5 25 100
Here x̄ = Σfx / Σf = 1500/100 = 15

and σ = √( Σfd² / Σf ) = √(304/100) = √3.04 = 1.74
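The same computation can be sketched in Python (variable names ours); it reproduces the mean of 15 and σ of 1.74:

```python
# Standard deviation of a discrete frequency distribution (Example 4.25)
xs = [12, 13, 14, 15, 16, 17, 18, 20]
fs = [4, 11, 32, 21, 15, 8, 5, 4]

n = sum(fs)
mean = sum(f * x for x, f in zip(xs, fs)) / n                    # 1500/100 = 15
variance = sum(f * (x - mean) ** 2 for x, f in zip(xs, fs)) / n  # 304/100
print(mean)                       # 15.0
print(round(variance ** 0.5, 2))  # 1.74
```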
Example 4.26: Calculate the standard deviation of the following data.
Class    Mid-value M    f     fM    d = M – x̄    d²    fd²
1–3          2           1      2       –6        36      36
3–5          4           9     36       –4        16     144
5–7          6          25    150       –2         4     100
7–9          8          35    280        0         0       0
9–11        10          17    170        2         4      68
11–13       12          10    120        4        16     160
13–15       14           3     42        6        36     108
                       100    800                        616
x̄ = ΣfM / Σf = 800/100 = 8

σ = √( Σf(M – x̄)² / Σf ) = √( Σfd² / Σf ) = √(616/100) = √6.16 = 2.48
4.5.1 Calculation of Standard Deviation by Short-cut
Method
The three examples worked out above have one common simplifying feature,
namely, x̄ in each turned out to be an integer, thus simplifying calculations. In
most cases, it is very unlikely that it will turn out to be so. In such cases, the
calculation of d and d2 becomes quite time-consuming. Short-cut methods have
consequently been developed. These are on the same lines as those for calculation
of mean itself.
In the short-cut method, we calculate deviations x′ from an assumed mean A.
Then, for ungrouped data

σ = √( Σx′²/N – (Σx′/N)² )        ...(4.9)
and for grouped data

σ = √( Σfx′²/Σf – (Σfx′/Σf)² )        ...(4.10)
This formula is valid for both discrete and continuous variables. In case of continuous
variables, x in the equation x' = x – A stands for the mid-value of the class in
question.
Note that the second term in each of the formulae is a correction term because of
the difference in the values of A and x . When A is taken as x itself, this correction
is automatically reduced to zero.
Example 4.27: Compute the standard deviation by the short-cut method for the following data:

11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21

Solution: Let us assume that A = 15. The deviations x′ = x − 15 are −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, 6, so that

N = 11, Σx′ = 11, Σx′² = 121

σ = √( Σx′²/N − (Σx′/N)² ) = √( 121/11 − (11/11)² ) = √(11 − 1) = √10 = 3.16.
Another method

If we assume A = 0, the deviation of each item from the assumed mean is the same as the value of the item itself: 11 deviates from zero by 11, 12 by 12, and so on. We can therefore work with the values directly, without computing deviations, and the formula takes the following shape:

x      x²
11     121
12     144
13     169
14     196
15     225
16     256
17     289
18     324
19     361
20     400
21     441
176    2,926

σ = √( Σx²/N − (Σx/N)² ) = √( 2926/11 − (176/11)² ) = √(266 − 256) = √10 = 3.16
Example 4.28: Calculate the standard deviation of the following data by the short-cut method.

Person 1 2 3 4 5 6 7
Monthly income (Rupees) 300 400 420 440 460 480 580

Solution: In this data the values of the variable are large, making calculations cumbersome, so it is advantageous to take a common factor out. Thus we use x′ = (x − A)/20. The standard deviation is calculated using x′, and the true value of σ is then obtained by multiplying back by 20. The effective formula is

σ = C × √( Σx′²/N − (Σx′/N)² )

where C represents the common factor.

Using x′ = (x − 420)/20:

x      x − 420    x′     x′²
300    −120       −6     36
400    −20        −1     1
420    0          0      0
440    20         1      1
460    40         2      4
480    60         3      9
580    160        8      64
N = 7             Σx′ = −7 + 14 = 7    Σx′² = 115

σ = 20 × √( 115/7 − (7/7)² ) = 20 × √(16.43 − 1) = 78.56
Example 4.29: Calculate the standard deviation from the following data:

Size 6 9 12 15 18
Frequency 7 12 19 10 2

Solution:

x    f       Deviation from      x′ (deviation ÷     fx′          fx′²
             assumed mean 12     common factor 3)
6    7       −6                  −2                  −14          28
9    12      −3                  −1                  −12          12
12   19      0                   0                   0            0
15   10      3                   1                   10           10
18   2       6                   2                   4            8
     N = 50                                          Σfx′ = −12   Σfx′² = 58

σ = C × √( Σfx′²/Σf − (Σfx′/Σf)² ) = 3 × √( 58/50 − (12/50)² ) = 3 × √(1.1600 − 0.0576) = 3 × 1.05 = 3.15.
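The grouped-data short-cut of formula (4.10), including the common factor, translates directly into code. A sketch (illustrative; the function name is ours), checked on the data of Example 4.29:

```python
from math import sqrt

def sd_grouped_shortcut(xs, fs, A, C):
    """Formula (4.10) with step deviations x' = (x - A)/C,
    scaled back by the common factor C."""
    N = sum(fs)
    d = [(x - A) / C for x in xs]
    sfd = sum(f * di for f, di in zip(fs, d))        # sum of f x'
    sfd2 = sum(f * di**2 for f, di in zip(fs, d))    # sum of f x'^2
    return C * sqrt(sfd2 / N - (sfd / N) ** 2)

# Example 4.29: assumed mean A = 12, common factor C = 3
print(round(sd_grouped_shortcut([6, 9, 12, 15, 18], [7, 12, 19, 10, 2], 12, 3), 2))
```

The call prints 3.15, matching the hand computation.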
Example 4.30: Obtain the mean and standard deviation of the first N natural numbers, i.e., of 1, 2, 3, ..., N − 1, N.

Solution: Let x denote the variable which assumes the values of the first N natural numbers. Then

x̄ = Σx/N = (1 + 2 + ... + N)/N = N(N + 1)/(2N) = (N + 1)/2

because Σx = 1 + 2 + 3 + ... + (N − 1) + N = N(N + 1)/2.

To calculate the standard deviation σ, we use 0 as the assumed mean A. Then

σ = √( Σx²/N − (Σx/N)² )

But Σx² = 1² + 2² + 3² + ... + (N − 1)² + N² = N(N + 1)(2N + 1)/6

Therefore

σ² = N(N + 1)(2N + 1)/(6N) − N²(N + 1)²/(4N²)
   = (N + 1)/2 × [ (2N + 1)/3 − (N + 1)/2 ]
   = (N + 1)(N − 1)/12

so σ = √( (N + 1)(N − 1)/12 ).

Thus for the first 11 natural numbers,

x̄ = (11 + 1)/2 = 6

and σ = √( (11 + 1)(11 − 1)/12 ) = √10 = 3.16
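The closed form can be checked against a direct computation (an illustrative sketch; the function name is ours):

```python
from math import sqrt

def stats_first_n(N):
    """Mean and SD of 1, 2, ..., N computed directly from the data."""
    xs = range(1, N + 1)
    mean = sum(xs) / N
    sd = sqrt(sum(x * x for x in xs) / N - mean ** 2)
    return mean, sd

mean, sd = stats_first_n(11)
print(mean)              # (N + 1)/2 = 6.0
print(round(sd, 2))      # sqrt((N + 1)(N - 1)/12) = 3.16
```

The direct computation agrees with the derived formulas for every N.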
Example 4.31:
Mid- Frequency Deviation Deviation Squared
point f from class time deviation
x of assumed frequency times
mean fx' frequency
x' fx'2
0–10 5 18 –2 –36 72
10–20 15 16 –1 –16 16
–52
20–30 25 15 0 0 0
30–40 35 12 1 12 12
40–50 45 10 2 20 40
50–60 55 5 3 15 45
60–70 65 2 4 8 32
70–80 75 1 5 5 25
– 60
79 60 242
–52
fx = 8
Solution: Since the deviations are from assumed mean and expressed in terms of
class-interval units,
= x2
FG IJ
fx
2
i×
N H K
N
242 F 8 I
2
= 10 × G J
79 H 79 K
= 10 × 1.75 = 17.5.
Self - Learning
238 Material
4.5.2 Combining Standard Deviations of Two Distributions

If we are given two sets of data of N1 and N2 items with means x̄1 and x̄2 and standard deviations σ1 and σ2 respectively, we can obtain the mean x̄ and standard deviation σ of the combined distribution by the following formulae:

x̄ = (N1x̄1 + N2x̄2)/(N1 + N2)    ...(4.11)

and

σ = √( [N1σ1² + N2σ2² + N1(x̄ − x̄1)² + N2(x̄ − x̄2)²]/(N1 + N2) )    ...(4.12)
Example 4.32: The means and standard deviations of two distributions of 100 and 150 items are 50, 5 and 40, 6 respectively. Find the standard deviation of all 250 items taken together.

Solution: Combined mean

x̄ = (N1x̄1 + N2x̄2)/(N1 + N2) = (100 × 50 + 150 × 40)/(100 + 150) = 11000/250 = 44

Combined standard deviation

σ = √( [N1σ1² + N2σ2² + N1(x̄ − x̄1)² + N2(x̄ − x̄2)²]/(N1 + N2) )
  = √( [100 × 25 + 150 × 36 + 100 × (44 − 50)² + 150 × (44 − 40)²]/250 )
  = √( [2500 + 5400 + 3600 + 2400]/250 ) = √55.6 = 7.46.
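Formulae (4.11) and (4.12) translate directly into code. An illustrative sketch (the function name is ours), using the data of Example 4.32:

```python
from math import sqrt

def combine(n1, m1, s1, n2, m2, s2):
    """Mean and SD of the pooled distribution, Eqs. (4.11)-(4.12)."""
    m = (n1 * m1 + n2 * m2) / (n1 + n2)
    var = (n1 * s1**2 + n2 * s2**2
           + n1 * (m - m1)**2 + n2 * (m - m2)**2) / (n1 + n2)
    return m, sqrt(var)

m, s = combine(100, 50, 5, 150, 40, 6)
print(m, round(s, 2))    # combined mean 44.0, combined SD 7.46
```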
Example 4.33: A distribution consists of three components with 200, 250 and 300 items having means 25, 10 and 15 and standard deviations 3, 4 and 5, respectively. Find the standard deviation of the combined distribution.

Solution: In the usual notation, we are given

N1 = 200, N2 = 250, N3 = 300
x̄1 = 25, x̄2 = 10, x̄3 = 15
σ1 = 3, σ2 = 4, σ3 = 5

Equations (4.11) and (4.12) can easily be extended to a combination of three series:

x̄ = (N1x̄1 + N2x̄2 + N3x̄3)/(N1 + N2 + N3)
  = (200 × 25 + 250 × 10 + 300 × 15)/(200 + 250 + 300)
  = 12000/750 = 16

and

σ = √( [N1σ1² + N2σ2² + N3σ3² + N1(x̄ − x̄1)² + N2(x̄ − x̄2)² + N3(x̄ − x̄3)²]/(N1 + N2 + N3) )
  = √( [200 × 9 + 250 × 16 + 300 × 25 + 200 × 81 + 250 × 36 + 300 × 1]/750 )
  = √(38800/750) = √51.73 = 7.19 approximately.
4.6 PROBABILITY
4.6.3 Theorems on Probability

Addition Rule

When two events are mutually exclusive, the probability that either of the events will occur is the sum of their separate probabilities. For example, if you roll a single die, the probability that it will come up with face 5 or face 6, where event A refers to face 5 and event B refers to face 6 (both events being mutually exclusive), is given by

P[A or B] = P[A] + P[B]
or P[5 or 6] = P[5] + P[6] = 1/6 + 1/6 = 2/6 = 1/3

P[A or B] is written as P[A ∪ B] and is known as P[A union B].
However, if events A and B are not mutually exclusive, then the probability of occurrence of either event A or event B or both is equal to the probability that event A occurs, plus the probability that event B occurs, minus the probability that outcomes common to both A and B occur. Symbolically,

P[A ∪ B] = P[A] + P[B] − P[A and B]

P[A and B] can also be written as P[A ∩ B], known as P[A intersection B], or simply P[AB]. The event [A and B] consists of all outcomes which are contained in both A and B simultaneously. For example, in an experiment of drawing cards out of a pack of 52 playing cards, assume that:

Event A = An ace is drawn
Event B = A spade is drawn
Event [AB] = The ace of spades is drawn

Hence, P[A ∪ B] = P[A] + P[B] − P[AB] = 4/52 + 13/52 − 1/52 = 16/52 = 4/13

This is because there are 4 aces and 13 spades, including 1 ace of spades, out of a total of 52 cards in the pack. The logic behind subtracting P[AB] is that the ace of spades is counted twice: once in event A (4 aces) and once again in event B (13 spades including the ace).
Another example of P[A ∪ B], where events A and B are not mutually exclusive, is as follows. Suppose a survey of 100 persons revealed that 50 persons read India Today, 30 persons read Time magazine, and 10 of these 100 persons read both India Today and Time. Then:

Event [A] = 50 persons
Event [B] = 30 persons
Event [AB] = 10 persons

Since the event [AB] of 10 persons is included twice, both in event A and in event B, it must be subtracted once in order to determine the event [A ∪ B], which means that a person reads India Today or Time or both. Hence,

P[A ∪ B] = P[A] + P[B] − P[AB] = 50/100 + 30/100 − 10/100 = 70/100 = 0.7
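Both additions above can be confirmed with exact fractions. An illustrative sketch (the helper name is ours):

```python
from fractions import Fraction

def p_union(p_a, p_b, p_ab):
    """P[A or B] = P[A] + P[B] - P[AB] for events that may overlap."""
    return p_a + p_b - p_ab

# Ace-or-spade: 4 aces, 13 spades, 1 ace of spades out of 52 cards
print(p_union(Fraction(4, 52), Fraction(13, 52), Fraction(1, 52)))       # 4/13

# Magazine survey: 50 read India Today, 30 read Time, 10 read both
print(p_union(Fraction(50, 100), Fraction(30, 100), Fraction(10, 100)))  # 7/10
```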
Multiplication Rule

The multiplication rule is applied when it is necessary to compute the probability that both events A and B occur at the same time. The rule differs according to whether the two events are independent or not.

If events A and B are independent events, then the probability that both will occur is the product of their separate probabilities. This is a strict condition: events A and B are independent if, and only if,

P[AB] = P[A] × P[B] = P[A]P[B]

For example, if we toss a coin twice, the probability that the first toss results in a head and the second toss results in a tail is given by

P[HT] = P[H] × P[T] = 1/2 × 1/2 = 1/4
However, if events A and B are not independent, meaning that the probability
of occurrence of an event is dependent or conditional upon the occurrence or
non-occurrence of the other event, then the probability that they will both occur
is given by,
P[AB] = P[A] × P[B/given the outcome of A]
This relationship is written as:
P[AB] = P[A] × P[B/A] = P[A] P[B/A]
where P[B/A] means the probability of event B on the condition that event A
has occurred. As an example, assume that a bowl has 6 black balls and 4 white
balls. A ball is drawn at random from the bowl. Then a second ball is drawn
without replacement of the first ball back in the bowl. The probability of the
second ball being black or white would depend upon the result of the first draw
as to whether the first ball was black or white. The probability that both these
balls are black is given by,
P [two black balls] = P [black on 1st draw] × P [black on 2nd draw/black
on 1st draw]
= 6/10 × 5/9 = 30/90 = 1/3
This is so because, first there are 6 black balls out of a total of 10, but
if the first ball drawn is black then we are left with 5 black balls out of a total
of 9 balls.
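The two-black-balls calculation can be verified by enumerating all ordered draws without replacement (an illustrative sketch, not part of the text's method):

```python
from fractions import Fraction
from itertools import permutations

# Bowl with 6 black (B) and 4 white (W) balls; draw two without replacement
bowl = ['B'] * 6 + ['W'] * 4
draws = list(permutations(range(10), 2))    # ordered pairs of distinct balls
both_black = sum(1 for i, j in draws if bowl[i] == 'B' and bowl[j] == 'B')
print(Fraction(both_black, len(draws)))      # 30/90 = 1/3, as computed by hand
```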
4.6.4 Counting Techniques
Reverend Thomas Bayes (1702–1761) introduced his theorem on probability, which is concerned with a method for estimating the probability of the causes responsible for an observed effect. Being a religious preacher as well as a mathematician, his motivation for the theorem came from his desire to prove the existence of God by looking at the evidence of the world that God created. He was interested in drawing conclusions about causes by observing their consequences. The theorem contributes to statistical decision theory by revising prior probabilities of outcomes of events based upon the observation and analysis of additional information.

Bayes' theorem makes use of the conditional probability formula, where the condition can be described in terms of the additional information which results in the revised (posterior) probability of the outcome of an event.
Suppose that there are 50 students in our statistics class, out of which 20 are male and 30 are female. Of the 30 females, 20 are Indian students and 10 are foreign students; of the 20 males, 15 are Indian and 5 are foreign, so that of all 50 students, 35 are Indian and 15 are foreign. This data can be presented in tabular form as follows:

          Indian   Foreigner   Total
Male      15       5           20
Female    20       10          30
Total     35       15          50

In the example discussed here, there are 2 basic events, A1 (female) and A2 (male). In general, if there are n basic events A1, A2, ..., An, then Bayes' theorem can be generalized as

P(A1/B) = P(A1)P(B/A1) / [ P(A1)P(B/A1) + P(A2)P(B/A2) + ... + P(An)P(B/An) ]
Solving the case of 2 events, where B is the event that the student is Indian, we have

P(A1/B) = (30/50)(20/30) / [ (30/50)(20/30) + (20/50)(15/20) ] = (20/50)/(35/50) = 20/35 = 4/7 ≈ 0.57

This example shows that while the prior probability of picking a female student is 0.6, the posterior probability becomes 0.57 after the additional information that the student is Indian is incorporated in the problem.
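The generalized formula is a one-liner in code. An illustrative sketch using the classroom numbers above (the function name is ours):

```python
def bayes(priors, likelihoods):
    """Posterior P(A_i | B) from priors P(A_i) and likelihoods P(B | A_i)."""
    joints = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joints)                 # marginal probability of B
    return [j / total for j in joints]

# A1 = female, A2 = male; B = the student is Indian
posteriors = bayes([30/50, 20/50], [20/30, 15/20])
print(round(posteriors[0], 2))          # 0.57, down from the 0.6 prior
```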
Another example of application of Bayes’ theorem is as follows:
Example 4.34: A businessman wants to construct a hotel in New Delhi. He
generally builds three types of hotels. These are 50 rooms, 100 rooms and 150
rooms hotels, depending upon the demand for the rooms, which is a function of
the area in which the hotel is located, and the traffic flow. The demand can be
categorized as low, medium or high. Depending upon these various demands, the
businessman has made some preliminary assessment of his net profits and possible
losses (in thousands of dollars) for these various types of hotels. These pay-offs
are shown in the following table.
States of Nature: Demand for Rooms (pay-offs in thousands of dollars)

                     Low (A1)   Medium (A2)   High (A3)
50 rooms (R1)        25         35            50
100 rooms (R2)       −10        40            70
150 rooms (R3)       −30        20            100
Solution: The businessman has also assigned 'prior probabilities' to the demand for rooms. These probabilities reflect the initial judgement of the businessman, based upon his intuition and his degree of belief regarding the outcomes
Demand for rooms Probability of Demand
Low (A1) 0.2
Medium (A2) 0.5
High (A3) 0.3
Based upon these values, the expected pay-offs for various rooms can be
computed as follows,
EV (50) = ( 25 × 0.2) + (35 × 0.5) + (50 × 0.3) = 37.50
EV (100) = (–10 × 0.2) + (40 × 0.5) + (70 × 0.3) = 39.00
EV (150) = (–30 × 0.2) + (20 × 0.5) + (100 × 0.3) = 34.00
This gives us the maximum pay-off of $39,000 for building a 100 rooms
hotel.
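These expected pay-offs can be computed in a short loop. An illustrative sketch (values in thousands of dollars; the variable names are ours):

```python
priors = [0.2, 0.5, 0.3]        # P(low), P(medium), P(high) demand
payoffs = {                      # net profit for each hotel size by demand
    50:  [25, 35, 50],
    100: [-10, 40, 70],
    150: [-30, 20, 100],
}
ev = {rooms: sum(p * o for p, o in zip(priors, row))
      for rooms, row in payoffs.items()}
best = max(ev, key=ev.get)
print(ev)
print(best, round(ev[best], 2))  # the 100-room hotel maximizes expected pay-off
```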
Now the hotelier must decide whether to gather additional information
regarding the states of nature, so that these states can be predicted more accurately
than the preliminary assessment. The basis of such a decision would be the cost of obtaining additional information. If this cost is less than the increase in maximum expected profit, then such additional information is justified.
Suppose that the businessman asks a consultant to study the market and predict the states of nature more accurately. This study is going to cost the businessman $10,000.
businessman $10,000. This cost would be justified if the maximum expected profit
with the new states of nature is at least $10,000 more than the expected pay-off
with the prior probabilities. The consultant made some studies and came up with
the estimates of low demand (X1), medium demand (X2), and high demand (X3)
with a degree of reliability in these estimates. This degree of reliability is expressed
as conditional probability which is the probability that the consultant’s estimate of
low demand will be correct and the demand will be actually low. Similarly, there
will be a conditional probability of the consultant’s estimate of medium demand,
when the demand is actually low, and so on. These conditional probabilities are
expressed in Table 4.8.
Table 4.8 Conditional Probabilities

                        X1     X2     X3
States of    (A1)       0.5    0.3    0.2
Nature       (A2)       0.2    0.6    0.2
(Demand)     (A3)       0.1    0.3    0.6
The values in the preceding table are conditional probabilities and are
interpreted as follows:
The upper north-west value of 0.5 is the probability that the consultant’s
prediction will be for low demand (X1) when the demand is actually low. Similarly,
the probability is 0.3 that the consultant’s estimate will be for medium demand
(X2) when in fact the demand is low, and so on. In other words, P(X1/ A1) = 0.5
and P(X2/ A1) = 0.3. Similarly, P(X1 / A2) = 0.2 and P(X2 / A2) = 0.6, and so on.
Our objective is to obtain posterior probabilities, which are computed by taking the additional information into consideration. One way to reach this objective is first to compute the joint probability, which is the product of the prior probability and the conditional probability for each state of nature. The joint probabilities so computed are given as,

State of    Prior          Joint Probabilities
Nature      Probability    P(A_iX1)   P(A_iX2)   P(A_iX3)
A1          0.2            0.10       0.06       0.04
A2          0.5            0.10       0.30       0.10
A3          0.3            0.03       0.09       0.18
Marginal probability of Xj:  0.23     0.45       0.32
Now, the posterior probabilities for each state of nature Ai are calculated as follows:

P(Ai/Xj) = (Joint probability of Ai and Xj) / (Marginal probability of Xj)
By using this formula, the joint probabilities are converted into posterior probabilities; the computed table for these posterior probabilities is given as,

States of Nature    Posterior Probabilities
                    P(Ai/X1)   P(Ai/X2)   P(Ai/X3)
A1                  0.435      0.133      0.125
A2                  0.435      0.667      0.312
A3                  0.130      0.200      0.563
Now, we have to compute the expected pay-offs for each course of action
with the new posterior probabilities assigned to each state of nature. The net profits
for each course of action for a given state of nature is the same as before and is
restated as follows. These net profits are expressed in thousands of dollars.
Low (A1) Medium (A2) High (A3)
Number of Rooms (R1) 25 35 50
(R2) –10 40 70
(R3) –30 20 100
Let Oij be the monetary outcome of course of action (i) when (j) is the corresponding state of nature, so that in the above case O11 is the outcome of course of action R1 under state of nature A1, which in our case is $25,000; similarly, O21 is the outcome of course of action R2 under state of nature A1, which in our case is −$10,000, and so on. The expected value EV (in thousands of dollars) is calculated on the basis of the actual state of nature that prevails as well as the estimate of the state of nature as provided by the consultant. With

Course of action = Ri
Estimate of consultant = Xj
Actual state of nature = Ai, where i, j = 1, 2, 3

the expected values are calculated as follows.

(A) Course of action R1 (build 50-room hotel):

EV(R1/X1) = Σ P(Ai/X1) Oi1 = 0.435(25) + 0.435(−10) + 0.130(−30) = 10.875 − 4.35 − 3.9 = 2.625
EV(R1/X2) = Σ P(Ai/X2) Oi1 = 0.133(25) + 0.667(−10) + 0.200(−30) = 3.325 − 6.67 − 6.0 = −9.345
EV(R1/X3) = Σ P(Ai/X3) Oi1 = 0.125(25) + 0.312(−10) + 0.563(−30) = 3.125 − 3.12 − 16.89 = −16.885

(B) Course of action R2 (build 100-room hotel):

EV(R2/X1) = Σ P(Ai/X1) Oi2 = 0.435(35) + 0.435(40) + 0.130(20) = 15.225 + 17.4 + 2.6 = 35.225
EV(R2/X2) = Σ P(Ai/X2) Oi2 = 0.133(35) + 0.667(40) + 0.200(20) = 4.655 + 26.68 + 4.0 = 35.335
EV(R2/X3) = Σ P(Ai/X3) Oi2 = 0.125(35) + 0.312(40) + 0.563(20) = 4.375 + 12.48 + 11.26 = 28.115

(C) Course of action R3 (build 150-room hotel):

EV(R3/X1) = Σ P(Ai/X1) Oi3 = 0.435(50) + 0.435(70) + 0.130(100) = 21.75 + 30.45 + 13 = 65.2
EV(R3/X2) = Σ P(Ai/X2) Oi3 = 0.133(50) + 0.667(70) + 0.200(100) = 6.65 + 46.69 + 20.0 = 73.34
EV(R3/X3) = Σ P(Ai/X3) Oi3 = 0.125(50) + 0.312(70) + 0.563(100) = 6.25 + 21.84 + 56.3 = 84.39

The calculated expected values, in thousands of dollars, are presented in tabular form.

Expected Posterior Pay-offs

Outcome    EV(R1/Xi)    EV(R2/Xi)    EV(R3/Xi)
X1         2.625        35.225       65.2
X2         −9.345       35.335       73.34
X3         −16.885      28.115       84.39
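The posterior pay-off values follow mechanically from the posterior probabilities and the outcome matrix. A sketch (illustrative; it follows the text's indexing EV(Rj/Xk) = Σi P(Ai/Xk)Oij, with values in thousands of dollars):

```python
# Posterior probabilities P(Ai/Xj): rows = states A1..A3, columns = X1..X3
post = [[0.435, 0.133, 0.125],
        [0.435, 0.667, 0.312],
        [0.130, 0.200, 0.563]]

# Outcomes O[i][j]: course of action R(j+1) under state A(i+1)... transposed
# here as O[i][j] = outcome of action R(j+1) when i indexes the summation
O = [[25, 35, 50],
     [-10, 40, 70],
     [-30, 20, 100]]

# ev[k][j] = EV(R(j+1)/X(k+1)) = sum over i of P(Ai/X(k+1)) * O[i][j]
ev = [[sum(post[i][k] * O[i][j] for i in range(3)) for j in range(3)]
      for k in range(3)]
for k, row in enumerate(ev, start=1):
    print(f"X{k}:", [round(v, 3) for v in row])
```

The printed rows reproduce the Expected Posterior Pay-offs table above.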
Variance

The variance of a discrete random variable X, defined by

Var(X) = σ² = Σ (x − µ)² p(x), where µ = E(X),

measures the spread, or variability, of the distribution.

Because the variation in each variable adds to the variation of their total, variances are added for both the sum and the difference of two independent random variables. If the variables are not independent, variability in one variable is connected to variability in the other, and the preceding rule cannot be applied to compute the variance of their sum or difference.

For example, assume that variable X represents the amount of money (in dollars) spent on lunch by a group of people, and variable Y the amount of money spent on supper by the same group. Because X and Y are not independent variables, the variance of the sum X + Y cannot be calculated as the sum of the variances.
Table 4.9 Probability Distribution for Various Sales Levels

       Sales (in units)    Probability
       Xi                  pr.(Xi)
X1     50                  0.10
X2 100 0.30
X3 150 0.30
X4 200 0.15
X5 250 0.10
X6 300 0.05
Total 1.00
(Figure: probability bar chart over 0–8 successes)
Fig. 4.4 Binomial Distribution Skewed to the Right
(ii) When p is equal to 0.5, the binomial distribution is symmetrical and the
graph takes the form as shown in Figure 4.5.
(Figure: probability bar chart over 0–8 successes)
Fig. 4.5 Symmetrical Binomial Distribution
(iii) When p is larger than 0.5, the binomial distribution is skewed to the left
and the graph takes the form as shown in Figure 4.6.
(Figure: probability bar chart over 0–8 successes)
Fig. 4.6 Binomial Distribution Skewed to the Left
But if p stays constant and n increases, then as n grows the vertical lines become not only more numerous but also bunch up together to form a bell shape; that is, the binomial distribution tends to become symmetrical, and the graph takes the shape shown in Figure 4.7.
(Figure: bell-shaped probability bar chart over 0, 1, 2, ... successes)
Fig. 4.7 The Bell-Shaped Binomial Distribution
Kurtosis = 3 + (1 − 6pq)/(npq)
The Poisson probability function is

f(Xi = x) = λ^x e^(−λ)/x!,  x = 0, 1, 2, ...

where λ = average number of occurrences per specified interval (in other words, the mean of the distribution), e = 2.7183, the base of natural logarithms, and x = the number of occurrences of a given event.
P(with 3 defects, i.e., x = 3) = 2³ e^(−2)/3! = (2 × 2 × 2 × 0.13534)/(3 × 2 × 1) = 0.54136/3 = 0.18045

P(with 4 defects, i.e., x = 4) = 2⁴ e^(−2)/4! = (2 × 2 × 2 × 2 × 0.13534)/(4 × 3 × 2 × 1) = 0.27068/3 = 0.09023
When the binomial mean np is used in place of λ, the Poisson function becomes

f(Xi = x) = (np)^x e^(−np)/x!

You can see the Poisson distribution as an approximation of the binomial distribution with the help of the following example.
Example 4.37: Given the following information:

(a) There are 20 machines in a certain factory, i.e., n = 20.
(b) The probability of a machine going out of order during any day is 0.02.

What is the probability that exactly three machines will be out of order on the same day? Calculate the required probability using both the binomial and Poisson distributions and state whether the Poisson distribution is a good approximation of the binomial distribution in this case.

Solution: Probability as per the Poisson probability function, using np in place of λ (since n = 20 and p = 0.02):

f(Xi = x) = (np)^x e^(−np)/x!

where x is the number of machines going out of order on the same day.

P(Xi = 3) = (20 × 0.02)³ e^(−20 × 0.02)/3! = ((0.4)³ × 0.67032)/(3 × 2 × 1) = (0.064)(0.67032)/6 = 0.00715

Probability as per the binomial probability function,

f(Xi = r) = nCr p^r q^(n−r), where n = 20, r = 3, p = 0.02 and hence q = 0.98

f(Xi = 3) = 20C3 (0.02)³ (0.98)^17 = 0.00650

The difference between the probability of three machines going out of order on the same day calculated using the Poisson probability function and the binomial probability function is just 0.00065. The difference being very small, you can state that in the given case the Poisson distribution appears to be a good approximation of the binomial distribution.
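The comparison in Example 4.37 can be reproduced with a few lines of Python (an illustrative sketch; the helper names are ours):

```python
from math import comb, exp, factorial

def binom_pmf(n, p, r):
    """Exact binomial probability nCr p^r q^(n-r)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(lam, x):
    """Poisson probability lam^x e^(-lam) / x!."""
    return lam**x * exp(-lam) / factorial(x)

n, p, x = 20, 0.02, 3
b = binom_pmf(n, p, x)           # exact binomial value
po = poisson_pmf(n * p, x)       # Poisson approximation with lambda = np
print(round(b, 5), round(po, 5)) # the two values differ by well under 0.001
```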
Example 4.38: How would you use a Poisson distribution to find approximately the probability of exactly 5 successes in 100 trials, the probability of success in each trial being p = 0.1?

Solution: Given n = 100 and p = 0.1, so that

λ = np = 100 × 0.1 = 10

To find the required probability, the Poisson probability function can be used as an approximation to the binomial probability function:

f(Xi = x) = λ^x e^(−λ)/x! = (np)^x e^(−np)/x!

so that p(Xi = 5) = 10⁵ e^(−10)/5! ≈ 0.0378.
(Figure: normal curve with ordinates marked at µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ and µ + 3σ)
(vi) The normal distribution has only one mode since the curve has a single peak. In other words, it is always a unimodal distribution.
(vii) The maximum ordinate divides the graph of the normal curve into two equal parts.
(viii) In addition to all the above stated characteristics, the curve has the following properties:
(a) µ = x̄
(b) µ2 = σ² = Variance
(c) µ4 = 3σ⁴
(d) Moment coefficient of kurtosis = 3
Family of normal distributions or curves

You can have several normal probability distributions, but each particular normal distribution is defined by its two parameters: the mean (µ) and the standard deviation (σ). There is thus not a single normal curve but rather a family of normal curves. Figures 4.10–4.12 exhibit some of these: a curve with a small standard deviation (say σ = 1), a curve with a large standard deviation (say σ = 5), and a curve with a very large standard deviation (say σ = 10), all in a normal distribution.
(Figure: normal curve with σ = 6.45, µ = 18 (z = 0), shaded area to the right of X = 22)

Calculate Z as under:

Z = (X − µ)/σ = (22 − 18)/6.45 = 0.62

The value from the table showing the area under the normal curve for Z = 0.62 is 0.2324. This means that the area of the curve between µ = 18 and X = 22 is 0.2324. Hence, the area of the shaded portion of the curve is (0.5) − (0.2324) = 0.2676, since the area of the entire right-hand portion of the curve always happens to be 0.5. Thus, the probability that there will still be money in the savings account after 22 months is 0.2676.
(b) For finding the required probability, you are interested in the area of the portion of the normal curve shaded in the following figure:

(Figure: normal curve with σ = 6.45, µ = 18 (z = 0), shaded area to the left of X = 24)

Calculate Z as under:

Z = (24 − 18)/6.45 = 0.93

The value from the table, when Z = 0.93, is 0.3238, which is the area of the curve between µ = 18 and X = 24. The area of the entire left-hand portion of the curve is 0.5 as usual. Hence, the area of the shaded portion is (0.5) + (0.3238) = 0.8238, which is the required probability that the account will have been closed before two years, i.e., before 24 months.
Example 4.41: For a certain normal distribution of the incomes of individuals, you are given mean µ = 500 rupees and standard deviation σ = 100 rupees. Find the probability that an individual selected at random will belong to the income group:

(a) Rs 550 to Rs 650   (b) Rs 420 to Rs 570

Solution: (a) For finding the required probability, you are interested in the area of the portion of the normal curve shaded in the following figure:

(Figure: normal curve with σ = 100, µ = 500 (z = 0), shaded area between X = 550 and X = 650)

For the area of the curve between X = 550 and X = 650, make the following calculations:

Z = (550 − 500)/100 = 50/100 = 0.50

corresponding to which the area between µ = 500 and X = 550, as per the table, is 0.1915, and

Z = (650 − 500)/100 = 150/100 = 1.5

corresponding to which the area between µ = 500 and X = 650, as per the table, is 0.4332.

Hence, the area of the curve that lies between X = 550 and X = 650 is (0.4332) − (0.1915) = 0.2417.

This is the required probability that an individual selected at random will belong to the income group of Rs 550 to Rs 650.
(b) For finding the required probability, you are interested in the area of the portion of the normal curve shaded in the following figure:

(Figure: normal curve with σ = 100, µ = 500 (z = 0), shaded area between X = 420 and X = 570)

To find the area of the shaded portion, make the following calculations:

Z = (570 − 500)/100 = 0.70

corresponding to which the area between µ = 500 and X = 570, as per the table, is 0.2580, and

Z = (420 − 500)/100 = −0.80

corresponding to which the area between µ = 500 and X = 420, as per the table, is 0.2881.

Hence, the required area in the curve between X = 420 and X = 570 is (0.2580) + (0.2881) = 0.5461.

This is the required probability that an individual selected at random will belong to the income group of Rs 420 to Rs 570.
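The table lookups above can be replaced by the standard normal CDF, which is available in the standard library via `math.erf`. An illustrative sketch (the helper names are ours):

```python
from math import erf, sqrt

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def p_between(mu, sigma, lo, hi):
    """Probability that a N(mu, sigma) variable falls in [lo, hi]."""
    return phi((hi - mu) / sigma) - phi((lo - mu) / sigma)

# Example 4.41: incomes distributed N(mu = 500, sigma = 100)
print(round(p_between(500, 100, 550, 650), 4))   # ≈ 0.2417
print(round(p_between(500, 100, 420, 570), 4))   # ≈ 0.546 (table gives 0.5461)
```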
Example 4.42: A certain company manufactures a 1½-inch all-purpose rope made from imported hemp. The manager of the company knows that the average load-bearing capacity of the rope is 200 lbs. Assuming that the normal distribution applies, find the standard deviation of load-bearing capacity for the 1½-inch rope if it is given that the rope has a 0.1210 probability of breaking with a pull of 68 lbs or less.

Solution: The given information can be depicted in a normal curve as follows:

(Figure: normal curve with µ = 200 (z = 0) and X = 68; the area between them has probability (0.5) − (0.1210) = 0.3790)

If the probability of the area falling between µ = 200 and X = 68 is 0.3790 as stated above, the corresponding value of Z as per the table showing the area of the normal curve is −1.17 (the minus sign indicates that we are in the left portion of the curve). Now, to find σ, you can write:

Z = (X − µ)/σ
−1.17 = (68 − 200)/σ
−1.17σ = −132
σ = 112.8 lbs approximately

Thus, the required standard deviation is approximately 112.8 lbs.
Example 4.43: In a normal distribution, 31 per cent of the items are below 45 and 8 per cent are above 64. Find the X̄ and σ of this distribution.

Solution: You can depict the given information in a normal curve as follows:

(Figure: normal curve with 31% of the area below X = 45 and 8% above X = 64; the areas between µ and these points are 0.5 − 0.31 = 0.19 and 0.5 − 0.08 = 0.42 respectively)

If the probability of the area falling between µ and X = 45 is 0.19 as stated above, the corresponding value of Z from the table showing the area of the normal curve is −0.50. Since we are in the left portion of the curve, we can express this as

−0.50 = (45 − µ)/σ    (1)

Similarly, if the probability of the area falling between µ and X = 64 is 0.42 as stated above, the corresponding value of Z from the area table is +1.41. Since we are in the right portion of the curve, we can express this as

1.41 = (64 − µ)/σ    (2)

Rewriting Equations (1) and (2):

−0.50σ = 45 − µ    (3)
1.41σ = 64 − µ    (4)

By subtracting Equation (3) from (4):

1.91σ = 19, so σ ≈ 10

Putting σ = 10 in Equation (3):

−5 = 45 − µ, so µ = 50

Hence, X̄ (or µ) = 50 and σ = 10 for the normal distribution concerned.
p(X = 0) = 4C0 (1/2)^0 (1/2)^4 = 1/16
p(X = 1) = 4C1 (1/2) (1/2)^3 = 4/16
p(X = 2) = 4C2 (1/2)² (1/2)² = 6/16
p(X = 3) = 4C3 (1/2)^3 (1/2) = 4/16
p(X = 4) = 4C4 (1/2)^4 (1/2)^0 = 1/16

(Figure: bar chart of p(x) against x = 0, 1, 2, 3, 4, with ordinates from 1/16 to 6/16)

Σ p(x) = 1/16 + 4/16 + 6/16 + 4/16 + 1/16 = 1
For a = 1/6, p(X ≤ 3) = 0 + a + 2a + 2a² = 2a² + 3a = 5/9

and p(X > 3) = 4a² + 2a = 4/9
Discrete Distributions

There are several discrete distributions; some of them are described as follows:

(i) Uniform or Rectangular Distribution

Each possible value of the random variable x has the same probability in the uniform distribution. If x takes values x1, x2, ..., xk, then

p(xi; k) = 1/k

The numbers on a die follow the uniform distribution:

p(xi; 6) = 1/6  (here, x = 1, 2, 3, 4, 5, 6)
Bernoulli Trials

In a Bernoulli experiment, an event E either happens or does not happen (E′). Examples are getting a head on tossing a coin, getting a six on rolling a die, and so on.

The Bernoulli random variable is written

X = 1 if E occurs
X = 0 if E′ occurs

Since there are two possible values, it is a case of a discrete variable where

Probability of success = p = p(E)
Probability of failure = 1 − p = q = p(E′)

We can write:

For k = 1, f(k) = p
For k = 0, f(k) = q
For k = 0 or 1, f(k) = p^k q^(1−k)
Negative Binomial

In this distribution, the variance is larger than the mean.

Suppose the probability of success p in a series of independent Bernoulli trials remains constant, and suppose the rth success occurs after x failures, i.e., in x + r trials. Then:

(i) The probability of success on the last trial is p.
(ii) The remaining x + r − 1 trials must contain r − 1 successes, the probability of which is (x+r−1)C(r−1) p^(r−1) q^x.

The combined probability of (i) and (ii) happening together is

p(x) = p × (x+r−1)C(r−1) p^(r−1) q^x = (x+r−1)C(r−1) p^r q^x,  x = 0, 1, 2, ...

This is the negative binomial distribution. We can write it in the alternative form

p(x) = (−r)Cx p^r (−q)^x,  x = 0, 1, 2, ...

This can be summed up as follows: in an infinite series of Bernoulli trials, the probability that x + r trials will be required to get r successes is the negative binomial

p(x) = (x+r−1)C(r−1) p^r q^x,  r > 0

If r = 1, it becomes the geometric distribution. If p → 0 and r → ∞ with rp = m held constant, the negative binomial tends to the Poisson distribution.
(ii) Geometric Distribution

Suppose the probability of success p in a series of independent trials remains constant, and suppose the first success occurs after x failures, i.e., there are x failures preceding the first success. The probability of this event is given by

p(x) = q^x p,  x = 0, 1, 2, ...

This is the geometric distribution, and it can be derived from the negative binomial. If we put r = 1 in the negative binomial distribution

p(x) = (x+r−1)C(r−1) p^r q^x

we get the geometric distribution

p(x) = xC0 p¹ q^x = pq^x

Summing over all x,

Σ p(x) = p Σ q^x = p/(1 − q) = 1

Its mean and variance are

E(x) = Mean = q/p
Variance = q/p²

Refer to Example 4.45 to understand it better.
Example 4.45: Find the expectation of the number of failures preceding the first
success in an infinite series of independent trials with constant probability p of
success.
Solution:
The probability of success in,
1st trial = p (Success at once)
2nd trial = qp (One failure, then success)
3rd trial = q²p (Two failures, then success, and so on)
The expected number of failures preceding the success,
E(x) = 0 · p + 1 · qp + 2 · q²p + ...
= pq(1 + 2q + 3q² + ...)
= pq · 1/(1 – q)² = qp/p² = q/p
Since p = 1 – q.
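The result E(x) = q/p can also be checked by simulation. The sketch below is our own illustration (the choice p = 0.25, giving q/p = 3, is arbitrary):

```python
import random

random.seed(7)
p = 0.25          # constant probability of success per trial
q = 1 - p

def failures_before_first_success(p):
    """Count Bernoulli failures until the first success occurs."""
    count = 0
    while random.random() >= p:
        count += 1
    return count

n = 200_000
avg = sum(failures_before_first_success(p) for _ in range(n)) / n
print(round(avg, 2), q / p)   # the simulated mean is close to q/p = 3.0
```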
(iii) Hypergeometric Distribution
From a finite population of size N, a sample of size n is drawn without replacement.
Let there be N1 successes out of N.
The number of failures is N2 = N – N1.
The distribution of the random variable X, which is the number of successes
obtained in the discussed case, is called the hypergeometric distribution.
p(x) = C(N1, x) C(N2, n – x) / C(N, n)   (x = 0, 1, 2, ..., n)
where Σ_{x=0}^{n} p(x) = 1.
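A minimal sketch of this pmf using Python's math.comb; the lot sizes below (N = 20 items of which N1 = 3 are successes, sample n = 5) are hypothetical values of our own:

```python
from math import comb

def hypergeom_pmf(x, N, N1, n):
    """P(X = x) successes in a sample of n drawn without replacement
    from N items of which N1 are successes (N2 = N - N1 failures)."""
    return comb(N1, x) * comb(N - N1, n - x) / comb(N, n)

# x can be at most min(n, N1) = 3 here, so the four terms exhaust the support.
probs = [hypergeom_pmf(x, 20, 3, 5) for x in range(4)]
print([round(v, 4) for v in probs])  # → [0.3991, 0.4605, 0.1316, 0.0088]
assert abs(sum(probs) - 1.0) < 1e-12
```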
x      0     1     2     3     4     5
p(x)   0     a     a/2   a/2   a/4   a/4
Solution:
Since Σ p(x) = 1 and 0 ≤ a ≤ 1,
0 + a + a/2 + a/2 + a/4 + a/4 = 1
(5/2)a = 1, or a = 2/5
p(x > 4) = p(x = 5) = a/4 = 1/10
p(x ≤ 4) = 0 + a + a/2 + a/2 + a/4 = 9a/4 = 9/10
Example 4.50: A fair coin is tossed 400 times. Find the mean number of heads
and the corresponding standard deviation.
Solution:
This is a case of binomial distribution with p = q = 1/2, n = 400.
The mean number of heads is given by μ = np = 400 × 1/2 = 200,
and S.D. = √(npq) = √(400 × 1/2 × 1/2) = 10.
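The arithmetic of Example 4.50 can be reproduced in a few lines (an illustrative sketch, not from the text):

```python
from math import sqrt

n, p = 400, 0.5    # 400 tosses of a fair coin
q = 1 - p
mean = n * p               # binomial mean np
sd = sqrt(n * p * q)       # binomial standard deviation sqrt(npq)
print(mean, sd)            # → 200.0 10.0
```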
Example 4.51: A manager has thought of 4 planning strategies each of which has
an equal chance of being successful. What is the probability that at least one of his
1 3
strategies will work if he tries them in 4 situations? Here p = ,q= .
4 4
Solution:
The probability that none of the strategies will work is given by,
p(0) = C(4, 0) (1/4)^0 (3/4)^4 = (3/4)^4
The probability that at least one will work is given by 1 – (3/4)^4 = 175/256.
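Exact fractions make this computation transparent. The sketch below (our own illustration) uses Python's fractions module:

```python
from fractions import Fraction

# Probability that none of the 4 equally likely strategies succeeds
p_none = Fraction(3, 4) ** 4
p_at_least_one = 1 - p_none
print(p_none, p_at_least_one)   # → 81/256 175/256
```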
Example 4.52: For the Poisson distribution, write the probabilities of 0, 1, 2, ....
successes.
Solution:
x        p(x) = e^(–m) m^x / x!
0        p(0) = e^(–m) m^0/0! = e^(–m)
1        p(1) = e^(–m) m/1! = p(0) · m
2        p(2) = e^(–m) m²/2! = p(1) · m/2
3        p(3) = e^(–m) m³/3! = p(2) · m/3
and so on.
The total of all probabilities Σ p(x) = 1.
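The table above shows that successive Poisson probabilities obey the recurrence p(x) = p(x – 1) · m/x. A sketch (our own, with an arbitrary m = 2) that builds the table this way and checks it against the direct formula:

```python
from math import exp, factorial

def poisson_table(m, kmax):
    """Build p(0..kmax) with the recurrence p(x) = p(x-1) * m / x."""
    probs = [exp(-m)]                   # p(0) = e^(-m)
    for x in range(1, kmax + 1):
        probs.append(probs[-1] * m / x)
    return probs

table = poisson_table(2.0, 12)
# Each entry agrees with the direct formula e^(-m) m^x / x!
assert all(abs(p - exp(-2.0) * 2.0**x / factorial(x)) < 1e-12
           for x, p in enumerate(table))
print(round(sum(table), 4))  # → 1.0
```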
Example 4.53: What are the raw moments of Poisson distribution?
Solution:
First raw moment μ′₁ = m
Second raw moment μ′₂ = m² + m
Third raw moment μ′₃ = m³ + 3m² + m
(v) Continuous probability distributions
When a random variate can take any value in the given interval a ≤ x ≤ b, it is a
continuous variate and its distribution is a continuous probability distribution.
Theoretical distributions are often continuous. They are useful in practice
because they are convenient to handle mathematically. They can serve as good
approximations to discrete distributions.
The range of the variate may be finite or infinite.
A continuous random variable can take all values in a given interval. A
continuous probability distribution is represented by a smooth curve.
The total area under the curve for a probability distribution is necessarily
unity. The curve is always above the x axis because the area under the curve for
any interval represents probability and probabilities cannot be negative.
If X is a continuous variable, the probability of X falling in an interval with end
points z1, z2 may be written p(z1 ≤ X ≤ z2).
This probability corresponds to the shaded area under the curve in Figure
4.13.
[Figure 4.13: Shaded area under the curve between z1 and z2]
A function p(x) is a probability density function if,
∫_{–∞}^{∞} p(x) dx = 1 and p(x) ≥ 0 for –∞ < x < ∞,
i.e., the area under the curve p(x) is 1 and the probability of x lying between two
values a, b, i.e., p(a < x < b), is positive. The most prominent example of a
continuous probability distribution is the normal distribution.
Cumulative Probability Function (CPF)
The Cumulative Probability Function (CPF) shows the probability that x takes a
value less than or equal to, say, z and corresponds to the area under the curve up
to z:
p(x ≤ z) = ∫_{–∞}^{z} p(x) dx
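The CPF can be approximated numerically by integrating the density up to z. The sketch below (our own illustration) does this for the standard normal distribution and compares against the closed form via the error function; the step count and lower cut-off are arbitrary choices:

```python
from math import erf, exp, pi, sqrt

def normal_pdf(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def cpf(z, lower=-10.0, steps=20_000):
    """P(X <= z): trapezoidal integration of the density from 'lower' to z."""
    h = (z - lower) / steps
    total = 0.5 * (normal_pdf(lower) + normal_pdf(z))
    total += sum(normal_pdf(lower + i * h) for i in range(1, steps))
    return total * h

exact = 0.5 * (1 + erf(1 / sqrt(2)))          # closed form for z = 1
print(round(cpf(1.0), 4), round(exact, 4))    # both ≈ 0.8413
```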
4.9 SUMMARY
Statistics influence the operations of business and management in many
dimensions.
Statistical applications include the area of production, marketing, promotion
of product, financing, distribution, accounting, marketing research, manpower
planning, forecasting, research and development, and so on.
In statistics, the term central tendency specifies the method through which
the quantitative data have a tendency to cluster approximately about some
value.
A measure of central tendency is any precise method of specifying this
‘central value’. In the simplest form, the measure of central tendency is an
average of a set of measurements, where the word average refers to as
mean, median, mode or other measures of location. Typically the most
commonly used measures are arithmetic mean, mode and median.
While arithmetic mean is the most commonly used measure of central location,
mode and median are more suitable measures under certain set of conditions
and for certain types of data.
There are several commonly used measures, such as arithmetic mean, mode
and median. These values are very useful not only in presenting the overall
picture of the entire data, but also for the purpose of making comparisons
among two or more sets of data.
Arithmetic mean is also commonly known as the mean. Even though average,
in general, means measure of central tendency, when we use the word
average in our daily routine, we always mean the arithmetic average. The
term is widely used by almost everyone in daily communication.
The weighted arithmetic mean is particularly useful where we have to compute
the mean of means. If we are given two arithmetic means, one for each of
two different series, in respect of the same variable, and are required to find
the arithmetic mean of the combined series, the weighted arithmetic mean is
the only suitable method of its determination.
Median is that value of a variable which divides the series in such a manner
that the number of items below it is equal to the number of items above it.
Half the total number of observations lie below the median, and half above
it. The median is thus a positional average.
The median of ungrouped data is found easily if the items are first arranged
in order of the magnitude. The median may then be located simply by
counting, and its value can be obtained by reading the value of the middle
observations.
The median can quite conveniently be determined by reference to the ogive
which plots the cumulative frequency against the variable. The value of the
item below which half the items lie, can easily be read from the ogive.
Median is a positional average and hence the extreme values in the data set
do not affect it as much as they do the mean.
The mode of a distribution is the value at the point around which the items
tend to be most heavily concentrated. It is the most frequent or the most
common value, provided that a sufficiently large number of items are available,
to give a smooth distribution.
The measures of dispersion bring out this inequality. In engineering problems
too the variability is an important concern.
The amount of variability in dimensions of nominally identical components
is critical in determining whether or not the components of a mass-produced
item will be really interchangeable.
Probability can be defined as a measure of the likelihood that a particular
event will occur. It is a numerical measure with a value between 0 and 1 of
such a likelihood where the probability of zero indicates that the given event
cannot occur and the probability of one assures certainty of such an
occurrence.
Probability theory provides us with a mechanism for measuring and analysing
uncertainties associated with future events. Probability can be subjective or
objective.
The objective probability of an event, on the other hand, can be defined as
the relative frequency of its occurrence in the long run.
Binomial distribution is probably the best known of discrete distributions.
The normal distribution, or Z-distribution, is often used to approximate the
binomial distribution.
If the sample size is very large, the Poisson distribution is a philosophically
more correct alternative to binomial distribution than normal distribution.
One of the main differences between the Poisson distribution and the binomial
distribution is that in using the binomial distribution all eligible phenomena
are studied, whereas in the Poisson, only the cases with a particular outcome
are studied.
Exponential distribution is a very commonly used distribution in reliability
engineering. The reason for its widespread use lies in its simplicity, so much
so that it has even been employed in cases to which it does not apply directly.
Amongst all types of distributions, the normal probability distribution is by
far the most important and frequently used distribution because it fits well in
many types of problems.
4.10 KEY TERMS
Statistics: Numerical statements of facts in any department of inquiry placed
in relation to each other.
Median: Measure of central tendency and it appears in the centre of an
ordered data.
Mode: A form of average that can be defined as the most frequently
occurring value in the data.
The weighted arithmetic mean: The weighted arithmetic mean is
particularly useful where we have to compute the mean of means.
Mean deviation: The mean deviation of a series of values is the arithmetic
mean of their absolute deviations.
Standard deviation: The square root of the average of the squared
deviations from their mean of a set of observations.
Range: The difference between the maximum and minimum values of a set
of numbers. It indicates the limits within which the values fall.
Classical theory of probability: It is the theory of probability based on
the number of favourable outcomes and the number of total outcomes.
Binomial distribution: It is also called the Bernoulli process and is used to
describe a discrete random variable.
Poisson distribution: It is used to describe the empirical results of past
experiments relating to the problem and plays an important role in queuing
theory, inventory control problems and risk models.
Exponential distribution: It is a continuous probability distribution and is
used to describe the probability distribution of time between two events.
Normal distribution: It is referred to as most important and frequently
used continuous probability distribution as it fits well in many types of
problems.
Short-Answer Questions
1. State the significance of statistical methods.
2. How does statistics aid in interpreting conditions?
3. List the various characteristics of statistical data.
4. Write a short note on the origin of statistics.
5. Define the term arithmetic mean.
6. Differentiate between a mean and a mode.
7. Write three characteristics of mean.
8. What is the importance of arithmetic mean in statistics?
9. Define the term median with the help of an example.
10. Differentiate between geometric and harmonic mean.
11. Define the terms quartiles, percentiles and deciles.
12. Write the definition and formula of quartile deviation.
13. How will you calculate the mean deviation of a given data?
14. What is standard deviation? Why is it used in statistical evaluation of data?
15. Define the concept of probability.
16. What are the different theories of probability? Explain briefly.
17. Define probability distribution and probability functions.
18. What do you mean by the binomial distribution and its measures?
19. How can a binomial distribution be fitted to given data?
20. How will you define the Poisson distribution and its important measures?
21. Poisson distribution can be an approximation of binomial distribution. Explain.
22. When is the Poisson distribution used?
23. What is exponential distribution?
24. Define any six characteristics of normal distribution.
25. Write the formula for measuring the area under the curve.
26. How will you define the circumstances when the normal probability
distribution can be used?
27. What is CPF?
Long-Answer Questions
1. Give a detailed description on the various functions of statistics.
2. Describe the various features of the statistical procedure.
3. According to Bowley statistics is ‘The science of counting’. Do you agree?
Give reasons.
4. An elevator is designed to carry a maximum load of 3200 pounds. If 18
passengers are riding in the elevator with an average weight of 154 pounds,
is there any danger that the elevator might be overloaded?
5. In a car assembly plant, the cars were diagnostically checked after assembly
and before shipping them to the dealers. All such cars with any defect were
returned for extra work. The number of such defective cars returned in one
day of a 16-days period is given as follows:
30, 34, 10, 16, 28, 9, 22, 2, 6, 23, 25, 10, 15, 10, 8, 24
(i) Find the average number of defective cars returned for extra work per day.
(ii) Find the median for defective cars per day.
(iii) Find the mode for defective cars per day.
(iv) Find Q2.
(v) Find D2.
(vi) Find P70.
6. Calculate mean deviation and its coefficient about median, arithmetic mean
and mode for the following figures, and show that the mean deviation about
the median is least.
(103, 50, 68, 110, 108, 105, 174, 103, 150, 200, 225, 350, 103)
7. A group has X̄ = 10, N = 60, σ² = 4. A subgroup of this has X̄₁ = 11, N₁ = 40,
σ₁² = 2.25. Find the mean and the standard deviation of the other subgroup.
10. Find the Q.D. from the mean for the series 5, 7, 10, 12, 6.
11. (i) Calculate the mean deviation from the mean for the following data.
What light does it throw on the social conditions of the community?
Data showing differences in ages of husbands and wives.
Difference in years Frequency
0–5 499
5 – 10 705
10 – 15 507
15 – 20 281
20 – 25 109
25 – 30 52
30 – 35 164
(ii) The age distribution of 100 life insurance policy-holders is as follows:
Age No. of policy holders
17 – 19 9
20 – 25 16
26 – 35 12
36 – 40 26
41 – 50 14
51 – 55 12
56 – 60 5
61 – 70 5
12. Calculate the mean deviation from the mean and the median and their
coefficients for the following data.
Size of shoes: 3 6 11 2 4 10 5 7 8 9
No. of pairs sold: 10 15 25 6 4 3 2 8 9 4
13. Discuss briefly about the measures of dispersion with the help of giving
examples and characteristics.
14. Explain briefly about the standard deviation. Give appropriate examples.
15. A family plans to have two children. What is the probability that both children
will be boys? (List all the possibilities and then select the one which would
be two boys.)
16. A card is selected at random from an ordinary well-shuffled pack of 52
cards. What is the probability of getting,
(i) A king (ii) A spade
(iii) A king or an ace (iv) A picture card
17. A wheel of fortune has numbers 1 to 40 painted on it, each number being at
equal distance from the other so that when the wheel is rotated, there is the
same chance that the pointer will point at any of these numbers. Tickets
have been issued to contestants numbering 1 to 40. The number at which
the wheel stops after being rotated would be the winning number. What is
the probability that,
(i) Ticket number 29 wins.
(ii) One person who bought 5 tickets numbered 18 to 22 (inclusive), wins
the prize.
18. (a) Explain the meaning of the Bernoulli process pointing out its main
characteristics.
(b) Give a few examples narrating some situations wherein binomial probability
distribution can be used.
19. State the distinctive features of the binomial, Poisson and normal probability
distributions. When does a binomial distribution tend to become a normal
and a Poisson distribution? Explain.
20. Explain the circumstances when the following probability distributions are used:
(a) Binomial distribution
(b) Poisson distribution
(c) Exponential distribution
(d) Normal distribution
21. Certain articles have been produced of which 0.5 per cent are defective
and the articles are packed in cartons each containing 130 articles. What
proportion of cartons are free from defective articles? What proportion of
cartons contain two or more defective articles?
(Given e^(–0.5) = 0.6065.)
22. The following mistakes per page were observed in a book:
No. of Mistakes No. of Times the Mistake
Per Page Occurred
0 211
1 90
2 19
3 5
4 0
Total 345
Fit a Poisson distribution to the given data and test the goodness of fit.
23. In a distribution exactly normal, 7 per cent of the items are under 35 and
89 per cent are under 63. What are the mean and standard deviation of the
distribution?
24. Assume the mean height of soldiers to be 68.22 inches with a variance of
10.8 inches. How many soldiers in a regiment of 1000 would you expect to
be over six feet tall?
25. Fit a normal distribution to the following data:
Height in inches Frequency
60–62 5
63–65 18
66–68 42
69–71 27
72–74 8
26. Analyse the types of discrete distributions with the help of giving examples.
UNIT 5 ESTIMATION AND HYPOTHESIS TESTING
Structure
5.0 Introduction
5.1 Objectives
5.2 Sampling Theory
5.2.1 Parameter and Statistic
5.2.2 Sampling Distribution of Sample Mean
5.3 Sampling Distribution of the Number of Successes
5.4 The Student’s Distribution
5.5 Theory of Estimation
5.5.1 Point Estimation
5.5.2 Interval Estimation
5.6 Hypothesis Testing
5.6.1 Test of Hypothesis Concerning Mean and Proportion
5.6.2 Test of Hypothesis Concerning Standard Deviation
5.7 Answers to ‘Check Your Progress’
5.8 Summary
5.9 Key Terms
5.10 Self-Assessment Questions and Exercises
5.11 Further Reading
5.0 INTRODUCTION
In statistics, quality assurance, and survey methodology, sampling is the selection
of a subset (a statistical sample) of individuals from within a statistical population
to estimate characteristics of the whole population. Statisticians attempt to collect
samples that are representative of the population in question. Sampling has lower
costs and faster data collection than measuring the entire population and can provide
insights in cases where it is infeasible to sample an entire population. Each
observation measures one or more properties (such as weight, location, colour) of
independent objects or individuals. In survey sampling, weights can be applied to
the data to adjust for the sample design, particularly in stratified sampling. Results
from probability theory and statistical theory are employed to guide the practice.
In business and medical research, sampling is widely used for gathering information
about a population. Acceptance sampling is used to determine if a production lot
of material meets the governing specifications.
Single or isolated facts or figures cannot be called statistics as these cannot
be compared or related to other figures within the same framework. Hence, any
quantitative and numerical data can be identified as statistics when it possesses
certain identifiable characteristics as per the norms of statistics. The area of statistics
can be split up into two identifiable sub-areas. These sub-areas constitute descriptive
statistics and inferential statistics. This unit will describe some of the terms used
extensively in the field of statistics for scientific measurement. Statistical investigation
is a comprehensive process and requires systematic collection of data about some
group of people or objects, describing and organizing the data, analysing the data
with the help of various statistical methods, summarizing the analysis and using the
results for making judgements, decisions and predictions.
Estimation and A sampling distribution or finite-sample distribution is the probability
Hypothesis Testing
distribution of a given random-sample-based statistic. If an arbitrarily large number
of samples, each involving multiple observations (data points), were separately
used in order to compute one value of a statistic (such as, for example, the sample
mean or sample variance) for each sample, then the sampling distribution is the
probability distribution of the values that the statistic takes on. In many contexts,
only one sample is observed, but the sampling distribution can be found theoretically.
Sampling distributions are important in statistics because they provide a major
simplification en route to statistical inference. More specifically, they allow analytical
considerations to be based on the probability distribution of a statistic, rather than
on the joint probability distribution of all the individual sample values.
Estimation (or estimating) is the process of finding an estimate, or
approximation, which is a value that is usable for some purpose even if input data
may be incomplete, uncertain, or unstable. The value is nonetheless usable because
it is derived from the best information available. Typically, estimation involves ‘Using
the value of a statistic derived from a sample to estimate the value of a corresponding
population parameter’. The sample provides information that can be projected,
through various formal or informal processes, to determine a range most likely to
describe the missing information. An estimate that turns out to be incorrect will be
an overestimate if the estimate exceeded the actual result, and an underestimate if
the estimate fell short of the actual result.
Statistical hypothesis test is a method of statistical inference used to determine
a possible conclusion from two different, and likely conflicting, hypotheses. In a
statistical hypothesis test, a null hypothesis and an alternative hypothesis is proposed
for the probability distribution of the data. If the sample obtained has a probability
of occurrence less than the pre-specified threshold probability, the significance
level, given the null hypothesis is true, the difference between the sample and the
null hypothesis is deemed statistically significant. The hypothesis test may then
lead to the rejection of null hypothesis and acceptance of alternate hypothesis.
The process of distinguishing between the null hypothesis and the alternative
hypothesis is aided by considering Type I error and Type II error which are
controlled by the pre-specified significance level. Hypothesis tests based on
statistical significance are another way of expressing confidence intervals (more
precisely, confidence sets). In other words, every hypothesis test based on
significance can be obtained via a confidence interval, and every confidence interval
can be obtained via a hypothesis test based on significance.
In this unit, you will learn about sampling theory, the sampling distribution of
the number of successes, the Student's distribution, the theory of estimation and
hypothesis testing.
5.1 OBJECTIVES
After going through this unit, you will be able to:
Explain about the sampling theory
Discuss the methods of sampling
Learn about parameter and statistics
Explain the concept of population in statistics
Know about estimation
Understand hypothesis distribution and test of significance
5.2 SAMPLING THEORY
A universe is the complete group of items about which knowledge is sought. The
universe may be finite or infinite. Finite universe is one which has a definite and
certain number of items but when the number of items is uncertain and infinite, the
universe is said to be an infinite universe. Similarly the universe may be hypothetical
or existent. In the former case the universe in fact does not exist and we can only
imagine the items constituting it. Tossing a coin or throwing a die are examples
of hypothetical universes. Existent universe is a universe of concrete objects, i.e.,
the universe where the items constituting it really exist. On the other hand, the term
sample refers to that part of the universe which is selected for the purpose of
investigation. The theory of sampling studies the relationships that exist between
the universe and the sample or samples drawn from it.
5.2.1 Parameter and Statistic
It would be appropriate to explain the meaning of two terms viz., parameter and
statistic. All the statistical measures based on all items of the universe are termed
as parameters whereas statistical measures worked out on the basis of sample
studies are termed as sample statistics. Thus, a sample mean or a sample standard
deviation is an example of statistic whereas the universe mean or universe standard
deviation is an example of a parameter.
The main problem of sampling theory is the problem of relationship between
a parameter and a statistic. The theory of sampling is concerned with estimating
the properties of the population from those of the sample and also with gauging the
precision of the estimate. This sort of movement from particular Sample towards
general Universe is what is known as statistical induction or statistical inference. In
more clear terms, ‘From the sample we attempt to draw inferences concerning the
universe. In order to be able to follow this inductive method, we first follow a
deductive argument which is that we imagine a population or universe (finite or
infinite) and investigate the behaviour of the samples drawn from this universe
applying the laws of probability.’ The methodology dealing with all this is known as
Sampling Theory.
Objects of Sampling Theory
Sampling theory is to attain one or more of the following objectives:
(a) Statistical Estimation: Sampling theory helps in estimating unknown
population quantities or what are called parameters from a knowledge of
statistical measures based on sample studies often called as ‘Statistic’. In
other words, to obtain the estimate of parameter from statistic is the main
objective of the sampling theory. The estimate can either be a point estimate
or it may be an interval estimate. Point estimate is a single estimate
expressed in the form of a single figure but interval estimate has two limits,
the upper and lower limits. Interval estimates are often used in statistical
induction.
(b) Tests of Hypotheses or Tests of Significance: The second objective of
sampling theory is to enable us to decide whether to accept or reject
hypotheses or to determine whether observed samples differ significantly
from expected results. The sampling theory helps in determining whether
observed differences are actually due to chance or whether they are really
significant. Tests of significance are important in the theory of decisions.
(c) Statistical Inference: Sampling theory helps in making generalization about
the universe from the studies based on samples drawn from it. It also helps
in determining the accuracy of such generalizations.
5.2.2 Sampling Distribution of Sample Mean
In sampling theory we are concerned with what is known as the sampling
distribution. For this purpose we can take certain number of samples and for each
sample we can compute various statistical measures such as mean, standard
deviation etc. It is to be noted that each sample will give its own value for the
statistic under consideration. All these values of the statistic together with their
relative frequencies with which they occur, constitute the sampling distribution.
We can have sampling distribution of means or the sampling distribution of standard
deviations or the sampling distribution of any other statistical measure. The sampling
distribution tends quite closer to the normal distribution if the number of samples is
large. The significance of sampling distribution follows from the fact that the
mean of a sampling distribution is the same as the mean of the universe.
Thus, the mean of the sampling distribution can be taken as the mean of the
universe.
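The claim that the mean of the sampling distribution equals the mean of the universe can be illustrated by simulation. In the sketch below (our own; the universe values, the sample size of 30 and the 2,000 repetitions are all arbitrary choices) the average of many sample means lands very close to the universe mean:

```python
import random
import statistics

random.seed(0)
# A made-up finite universe of 10,000 measurements.
universe = [random.gauss(50, 10) for _ in range(10_000)]
universe_mean = statistics.mean(universe)

# Draw many samples and record each sample mean.
sample_means = [statistics.mean(random.sample(universe, 30))
                for _ in range(2_000)]

print(round(universe_mean, 1), round(statistics.mean(sample_means), 1))
```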
The Concept of Standard Error (or S.E.)
The standard deviation of sampling distribution of a statistic is known as its standard
error and is considered the key to sampling theory. The utility of the concept of
standard error in statistical induction arises on account of the following reasons:
(a) The standard error helps in testing whether the difference between observed
and expected frequencies could arise due to chance. The criterion usually
adopted is that if a difference is upto 3 times the S.E. then the difference is
supposed to exist as a matter of chance and if the difference is more than 3
times the S.E., chance fails to account for it, and we conclude the difference
as significant difference. This criterion is based on the fact that at x ± 3(S.E.),
the normal curve covers an area of 99.73 per cent. The product of the
critical value at certain level of significance and the S. E. is often described
as the Sampling Error at that particular level of significance. We can test the
difference at certain other levels of significance as well depending upon our
requirement.
(b) The standard error gives an idea about the reliability and precision of a
sample. If the relationship between the standard deviation and the sample
size is kept in view, one would find that the standard error is smaller than
the standard deviation. The smaller the S.E. the greater the uniformity of
the sampling distribution and hence greater is the reliability of the sample.
Conversely, the greater the S.E., the greater the difference between
observed and expected frequencies and in such a situation the unreliability
of the sample is greater. The size of S.E. depends upon the sample size;
the greater the number of items included in the sample the smaller the
error to be expected and vice versa.
(c) The standard error enables us to specify the limits, maximum and minimum,
within which the parameters of the population are expected to lie with a
specified degree of confidence. Such an interval is usually known as
confidence interval. The degree of confidence with which it can be asserted
that a particular value of the population lies within certain limits is known
as the level of confidence.
(d) Important Standard Error Formulae: The standard error of the Mean = σ/√n,
the standard error of the Standard Deviation = σ/√(2n), the standard error of
Karl Pearson's Coefficient of Correlation = (1 – r²)/√n, and so on. (A detailed
description of important standard error formulae has been given on the
pages that follow.)
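These formulae are straightforward to compute; a small sketch (our own, with made-up values σ = 12, n = 144 for the mean and standard deviation, and r = 0.6, n = 64 for the correlation):

```python
from math import sqrt

def se_mean(sigma, n):
    """Standard error of the mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

def se_sd(sigma, n):
    """Standard error of the standard deviation: sigma / sqrt(2n)."""
    return sigma / sqrt(2 * n)

def se_corr(r, n):
    """Standard error of Karl Pearson's r: (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / sqrt(n)

print(se_mean(12.0, 144),
      round(se_sd(12.0, 144), 4),
      round(se_corr(0.6, 64), 4))   # → 1.0 0.7071 0.08
```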
(e) Calculation of the Significance Ratio: The significance ratio, symbolically
described as z, t, f, etc., depending on the test we use, is often calculated
by dividing the difference between a parameter and a statistic by the standard
error concerned. Thus, in the context of the mean of a small sample when the
population variance is not known, t = (X̄ – μ)/S.E. of X̄, and in the context of
the difference between two sample means, t = (X̄₁ – X̄₂)/S.E. of (X̄₁ – X̄₂).
(All this has been fully
Population
A population, in statistical terms, is the totality of things under consideration. It is
the collection of all values of the variable that is under study. For instance, if we
are interested in knowing as to how much on an average an American bachelor
spends on his clothes per year, then all American bachelors would constitute the
population. Similarly, if we want to know the percentage of adult American travellers
who go to Europe, then only those adult Americans who travel are considered as
population.
The amount paid by parents in one year for an average Class I day-boarding
public school student can be evaluated by calculating the fee structure, such as
admission fee, day-boarding fees, tuition fees and annual charges. Thus, Class I
day-boarding students would constitute the specific population group.
Another example we can consider is the population of coal mine
workers who are suffering from pneumoconiosis throughout the country. To
evaluate this, we collect information on all cases of pneumoconiosis in different
coal mines in the country.
A summary measure that describes any given characteristic of the population
is known as a parameter. For example, the measure, the average income of
American professors, would be considered as a parameter since it describes the
characteristic of income of the population of American professors.
Sample
A sample is a portion of the total population that is considered for study and
analysis. For instance, if we want to study the income pattern of professors at City
University of New York and there are 10,000 professors, then we may take a
random sample of only 1,000 professors out of this entire population of 10,000
for the purpose of our study. Then this number of 1,000 professors constitutes a
sample. The summary measure that describes a characteristic such as average
income of this sample is known as a statistic.
Sampling is the process of selecting a sample from the population. It is
technically and economically not feasible to take the entire population for analysis.
So we must take a representative sample out of this population for the purpose of
such analysis. A sample is a part of the whole, selected in such a manner as to
represent the whole.
Random Sample
A random sample is a collection of items selected from the population in such a
manner that each item in the population has exactly the same chance of being
selected, so that the sample taken from the population would be truly representative
of the population. The degree of randomness of selection would depend upon the
process of selecting the items for the sample. A true random sample would be
free from all biases whatsoever. For example, if we want to take a random sample
of five students from a class of twenty-five students, then each one of these twenty-
five students should have the same chance of being selected into the sample. One
way to do this would be writing the names of all students on separate but small
pieces of paper, folding each piece of this paper in a similar manner, putting each
folded piece into a container, mixing them thoroughly and drawing out five pieces
of paper from this container.
Sampling without Replacement
The sample as taken in the above example is known as sampling without replacement,
as each person can only be selected once. This is because once a piece of paper is
taken out of the container, it is kept aside so that the person whose name appears on
this piece of paper has no chance of being selected again.
Sampling with Replacement
There are certain situations in which a piece of paper once selected and taken into
consideration is put back into the container in such a manner that the same person
has the same chance of being selected again as any other person. For example, if
we are randomly selecting five persons for award of prizes so that each person is
eligible for any and all prizes, then once the slip of paper is drawn out of the
container and the prize is awarded to the person whose name appears on the
paper, the same piece of paper is put back into the container and the same person
has the same chance of winning the second prize as anybody else.
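The two schemes map directly onto Python's standard library (an illustrative sketch; the student labels are hypothetical):

```python
import random

random.seed(7)
students = [f"S{i}" for i in range(1, 26)]   # a class of 25 students

# Sampling without replacement: no student can appear twice
draw_without = random.sample(students, 5)

# Sampling with replacement: the same student may be drawn again
draw_with = random.choices(students, k=5)

print(draw_without)
print(len(set(draw_without)))   # always 5 distinct students
```

`random.sample` models the slips that are kept aside once drawn, while `random.choices` models slips that are put back into the container after each draw.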
[Figure: Classification of samples into non-probability samples and probability
samples]

    σ = √(40/5) = √8 = 2.83
Now, let us assume the sample size, n = 2, and take all the possible samples of
size 2, from this population. There are 10 such possible samples. These are as
follows, along with their means.
    Sample      Values      Sample mean
    X1, X2      (2, 4)      X̄1 = 3
    X1, X3      (2, 6)      X̄2 = 4
    X1, X4      (2, 8)      X̄3 = 5
    X1, X5      (2, 10)     X̄4 = 6
    X2, X3      (4, 6)      X̄5 = 5
    X2, X4      (4, 8)      X̄6 = 6
    X2, X5      (4, 10)     X̄7 = 7
    X3, X4      (6, 8)      X̄8 = 7
    X3, X5      (6, 10)     X̄9 = 8
    X4, X5      (8, 10)     X̄10 = 9
Now, if only the first sample was taken, the average of the sample would be
3. Similarly, the average of the last sample would be 9. Both of these samples are
totally unrepresentative of the population. However, if a grand mean X̿ of the
distribution of these sample means is taken, then,

    X̿ = (ΣX̄i)/10,  i = 1 to 10
       = (3 + 4 + 5 + 6 + 5 + 6 + 7 + 7 + 8 + 9)/10
       = 60/10 = 6
This grand mean has the same value as the mean of the population. Let us
organize this distribution of sample means into a frequency distribution and
probability distribution.
    Sample mean    Freq.    Rel. freq.    Prob.
    3              1        1/10          .1
    4              1        1/10          .1
    5              2        2/10          .2
    6              2        2/10          .2
    7              2        2/10          .2
    8              1        1/10          .1
    9              1        1/10          .1
                                          1.00
This probability distribution of the sample means is referred to as ‘sampling
distribution of the mean.’
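The construction above can be reproduced in a few lines of code (an illustrative Python sketch, not part of the original text):

```python
from itertools import combinations
from collections import Counter

population = [2, 4, 6, 8, 10]   # the five population values
n = 2                           # sample size

# All possible samples of size 2, drawn without replacement
samples = list(combinations(population, n))
means = [sum(s) / n for s in samples]

print(len(samples))             # 10 possible samples
grand_mean = sum(means) / len(means)
print(grand_mean)               # 6.0 -- equal to the population mean

# Frequency / probability distribution of the sample means
for m, f in sorted(Counter(means).items()):
    print(m, f, f / len(means))
```

The printed distribution matches the table above: means 5, 6 and 7 each occur twice, the rest once.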
Sampling Distribution of the Mean
The sampling distribution of the mean can thus be defined as, ‘A probability
distribution of all possible sample means of a given size, selected from a population’.
Accordingly, the sampling distribution of the means of the ages of children as
tabulated in Example 5.1, has 3 predictable patterns. These are as follows:
(i) The mean of the sampling distribution and the mean of the population are
equal. This can be shown as follows:
    Sample mean (X̄)    Prob. P(X̄)
    3                   .1
    4                   .1
    5                   .2
    6                   .2
    7                   .2
    8                   .1
    9                   .1
                        1.00
Then,

    ΣX̄P(X̄) = (3 × .1) + (4 × .1) + (5 × .2) + (6 × .2) + (7 × .2) + (8 × .1)
              + (9 × .1) = 6

This value is the same as the mean of the original population.
(ii) The spread of the sample means in the distribution is smaller than in the
population values. For example, the spread in the distribution of sample
means above is from 3 to 9, while the spread in the population was from 2
to 10.
(iii) The shape of the sampling distribution of the means tends to be 'bell-
shaped' and approximates the normal probability distribution, even when
the population is not normally distributed. This last property leads us to
the 'Central Limit Theorem'.
Central Limit Theorem

The Central Limit Theorem states that, 'Regardless of the shape of the population,
the distribution of the sample means approaches the normal probability distribution
as the sample size increases.'
The question now is how large should the sample size be in order for the
distribution of sample means to approximate the normal distribution for any type
of population. In practice, the sample sizes of 30 or larger are considered adequate
for this purpose. This should be noted however, that the sampling : distribution
would be normally distributed, if the original population is normally distributed, no
matter what the sample size.
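A small simulation makes the theorem concrete (an illustrative Python sketch; the exponential population and the sample counts are our own choices, not from the text):

```python
import random

random.seed(1)

# A right-skewed (exponential) population with mean 1.0 -- clearly not normal
def sample_mean(n):
    return sum(random.expovariate(1.0) for _ in range(n)) / n

# Take many samples of size 30 and look at the distribution of their means
means = [sample_mean(30) for _ in range(2000)]

avg = sum(means) / len(means)
spread = (sum((m - avg) ** 2 for m in means) / len(means)) ** 0.5

print(round(avg, 2))      # close to the population mean 1.0
print(round(spread, 2))   # close to sigma / sqrt(n) = 1 / sqrt(30), about 0.18
```

Plotting a histogram of `means` would show the roughly bell-shaped curve the theorem predicts, even though the underlying population is strongly skewed.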
As we can see from our sampling distribution of the means, the grand mean
X̿ of the sample means equals μ, the population mean. However,
realistically speaking, it is not possible to take all the possible samples of size n
from the population. In practice only one sample is taken, but the discussion on
the sampling distribution is concerned with the proximity of 'a' sample mean to the
population mean.
It can be seen that the possible values of sample means tend towards the
population mean and, according to the Central Limit Theorem, the distribution of
sample means tends to be normal for sample sizes n larger than 30.
Hence, we can draw conclusions based upon our knowledge about the
characteristics of the normal distribution.
For example, in the case of the sampling distribution of the means, if we know
the grand mean X̿ of this distribution, which is equal to μ, and the standard
deviation of this distribution, known as the 'standard error of the mean' and denoted
by σx̄, then we know from the normal distribution that there is a 68.26 per cent
chance that a sample selected at random from a population will have a mean that
lies within one standard error of the mean (σx̄) of the population mean. Similarly, this
chance increases to 95.44 per cent that the sample mean will lie within two standard
errors of the mean (σx̄) of the population mean. Hence, knowing the properties
of the sampling distribution tells us how close the sample mean will be to the
true population mean.
Standard Error

Standard Error of the Mean (σx̄)

The standard error of the mean (σx̄) is a measure of dispersion of the distribution of
sample means. It is similar to the standard deviation in a frequency distribution,
and it measures the likely deviation of a sample mean from the grand mean of the
sampling distribution.

If all sample means are given, then σx̄ can be calculated as follows:
    σx̄ = √(Σ(X̄i − X̿)²/10) = √(30/10) = √3 ≈ 1.73
However, since it is not possible to take all possible samples from the
population, we must use alternative methods to compute σx̄.

The standard error of the mean can be computed from the following formula,
if the population is finite and we know the population standard deviation:

    σx̄ = (σ/√n) × √((N − n)/(N − 1))

Where,
    σ = population standard deviation
    N = population size
    n = sample size
This formula can be made simpler to use by the fact that we generally deal
with very large populations, which can be considered infinite, so that if the population
size N is very large and the sample size n is small, as for example in the case of items
tested from assembly line operations, then,

    (N − n)/(N − 1) would approach 1.

Hence,

    σx̄ = σ/√n

The factor √((N − n)/(N − 1)) is also known as the 'finite correction factor', and
should be used when the population size is finite.
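The formula, with and without the finite correction factor, can be sketched as follows (Python; the function name `se_mean` is ours):

```python
import math

def se_mean(sigma, n, N=None):
    """Standard error of the mean; the finite population
    correction factor is applied when N is given."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se

# The five-value population used earlier: sigma = 2.83, N = 5, n = 2
print(round(se_mean(2.83, 2, N=5), 2))   # 1.73
# For a very large N the correction factor approaches 1
print(round(se_mean(2.83, 2), 2))        # 2.0
```

Note that 1.73 agrees with the value obtained directly from the ten sample means above.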
As this formula suggests, σx̄ decreases as the sample size (n) increases,
meaning that the general dispersion among the sample means decreases, meaning
further that any single sample mean will become closer to the population mean as
the value of σx̄ decreases. Additionally, since according to the property of the
normal curve there is a 68.26 per cent chance of the population mean being
within one σx̄ of the sample mean, a smaller value of σx̄ will make this range
shorter, thus making the population mean closer to the sample mean
(Refer Example 5.2).
Example 5.2: The IQ scores of college students are normally distributed with the
mean of 120 and standard deviation of 10.
(a) What is the probability that the IQ score of any one student chosen at
random is between 120 and 125?
(b) If a random sample of 25 students is taken, what is the probability that the
mean of this sample will be between 120 and 125.
Solution:
(a) Using the standardized normal distribution formula,

    Z = (X − μ)/σ

with X = 125, μ = 120 and σ = 10,

    Z = (125 − 120)/10 = 5/10 = 0.5

The area for Z = 0.5 is 19.15 per cent.
This means that there is a 19.15 per cent chance that a student picked at
random will have an IQ score between 120 and 125.
(b) With the sample of 25 students, it is expected that the sample mean will be
much closer to the population mean; hence, it is highly likely that the sample
mean would be between 120 and 125.
The formula to be used in the case of the standardized normal distribution for
the sampling distribution of the means is given by,

    Z = (X̄ − μ)/σx̄,  where σx̄ = σ/√n
Hence, with X̄ = 125, μ = 120 and σ = 10,

    σx̄ = σ/√n = 10/√25 = 10/5 = 2

Then,

    Z = (X̄ − μ)/σx̄ = (125 − 120)/2 = 5/2 = 2.5
The area for Z = 2.5 is 49.38.
This shows that there is a chance of 49.38 per cent that the sample mean
will be between 120 and 125. As the sample size increases further, this chance will
also increase. It can be noted that the probability of a sample mean being between
120 and 125 is much higher than the probability of an individual student having an
IQ between 120 and 125.
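Both parts of Example 5.2 can be checked numerically; the sketch below builds the standard normal CDF from `math.erf` instead of a printed Z table:

```python
import math

def phi(z):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 120, 10

# (a) A single student: Z = (125 - 120) / 10 = 0.5
p_one = phi(0.5) - phi(0.0)
print(round(100 * p_one, 2))    # 19.15 per cent

# (b) Mean of n = 25 students: sigma_xbar = 10 / sqrt(25) = 2, so Z = 2.5
p_mean = phi(2.5) - phi(0.0)
print(round(100 * p_mean, 2))   # 49.38 per cent
```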
In estimation, the population mean μ is estimated by the sample mean:

    μ̂ = X̄ = ΣX/n

The population standard deviation is estimated by the sample standard deviation:

    s = √(Σ(X − X̄)²/(n − 1))

The population proportion is estimated by the sample proportion:

    p̂ = ps = X/n, where ps is the sample proportion.

Since,

    Z = (X̄ − μ)/σx̄

it follows that μ = X̄ − Zσx̄ or μ = X̄ + Zσx̄, so that

    X̄ − Zσx̄ ≤ μ ≤ X̄ + Zσx̄

or, writing X1 = X̄ − Zσx̄ and X2 = X̄ + Zσx̄,

    X1 ≤ μ ≤ X2

This means that the population mean is expected to lie between the values
of X1 and X2, which are both equidistant from X̄, and this distance depends upon
the value of Z, which is a function of the confidence level.
Suppose that we wanted to find out a confidence interval around the
sample mean within which the population mean is expected to lie 95 per cent of
the time. (We can never be sure that the population mean will lie in any given
interval 100 per cent of the time). This confidence interval is shown in the following
illustration:
    [Figure: Normal curve showing the 95 per cent confidence interval — 47.5
    per cent of the area on either side of X̄ between X1 and X2, and 2.5 per
    cent in each tail]
The points X1 and X2 above define the range of the confidence interval as
follows:

    X1 = X̄ − Zσx̄
    and X2 = X̄ + Zσx̄
Looking at the table of Z scores (given in the Appendix), we find that the
value of the Z score for the area 0.4750 (half of 95 per cent) is 1.96. This illustration
can be interpreted as follows:
(i) If all possible samples of size n were taken, then on the average 95 per cent
of these samples would include the population mean within the interval around
their sample means bounded by X1 and X2.
(ii) If we took a random sample of size n from a given population, the probability
is 0.95 that the population mean would lie between the interval X1 and X2
around the sample mean, as shown.
(iii) If a random sample of size n was taken from a given population, we can be
95 per cent confident in our assertion that the population mean will lie around
the sample mean in the interval bounded by values of X1 and X2 as shown.
(It is also known as the 95 per cent confidence interval.) For the 95 per cent
confidence interval, the value of the Z score as taken from the Z score table is
1.96. The value of the Z score can be found for any given level of confidence,
but generally speaking, a confidence level of 90 per cent, 95 per cent or 99
per cent is taken into consideration, for which the Z score values are 1.645,
1.96 and 2.58, respectively.
Refer to Examples 5.3 and 5.4 to understand interval estimation better.
Example 5.3: The sponsor of a television programme targeted at the children’s
market (age 4-10 years) wants to find out the average amount of time children
spend watching television. A random sample of 100 children indicated the average
time spent by these children watching television per week to be 27.2 hours. From
previous experience, the population standard deviation of the weekly extent of
television watched () is known to be 8 hours. A confidence level of 95 per cent is
considered to be adequate.
Solution:

    [Figure: Normal curve with X1 and X2 marked at 1.96σx̄ on either side of
    X̄ = 27.2]

The confidence interval is found as follows, with X̄ = 27.2, σ = 8, n = 100 and
Z = 1.96:

    σx̄ = σ/√n = 8/√100 = 8/10 = 0.8
Then,

    X1 = X̄ − Zσx̄
       = 27.2 − (1.96 × 0.8) = 27.2 − 1.568
       = 25.632

and

    X2 = X̄ + Zσx̄
       = 27.2 + (1.96 × 0.8) = 27.2 + 1.568
       = 28.768
This means that we can conclude with 95 per cent confidence that a child on an
average spends between 25.632 and 28.768 hours per week watching television. (It
should be understood that 5 per cent of the time our conclusion would still be wrong.
This means that because of the symmetry of distribution, we will be wrong 2.5 per
cent of the times because the children on an average would be watching television
more than 28.768 hours and another 2.5 per cent of the time we will be wrong in our
conclusion, because on an average, the children will be watching television less than
25.632 hours per week.)
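The interval in Example 5.3 can be reproduced directly (an illustrative Python sketch):

```python
import math

xbar, sigma, n = 27.2, 8, 100
z = 1.96                          # Z score for a 95 per cent confidence level

se = sigma / math.sqrt(n)         # standard error of the mean = 0.8
lower = xbar - z * se
upper = xbar + z * se
print(round(lower, 3), round(upper, 3))   # 25.632 28.768
```

Replacing `z` with 2.58 reproduces the 99 per cent interval of Example 5.4 (25.136 to 29.264).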
Example 5.4: Calculate the confidence interval in the previous problem, if we
want to increase our confidence level from 95 per cent to 99 per cent. Other values
remain the same.
Solution:

    [Figure: Normal curve for the 99 per cent confidence interval — 0.495 of
    the area on either side of X̄ = 27.2, with 0.005 in each tail, and X1, X2
    at 2.58σx̄ from the mean]

If we increase our confidence level to 99 per cent, then it would be natural to
assume that the range of the confidence interval would be wider, because we would
want to include more values, which may be greater than 28.768 or smaller than
25.632, within the confidence interval range. Accordingly, in this new situation,

    Z = 2.58
    σx̄ = 0.8
Then,

    X1 = X̄ − Zσx̄
       = 27.2 − (2.58 × 0.8) = 27.2 − 2.064
       = 25.136

and

    X2 = X̄ + Zσx̄
       = 27.2 + (2.58 × 0.8) = 27.2 + 2.064
       = 29.264
(The value of Z is established from the table of Z scores against the area of 0.495 or
a figure closest to it. The table shows that the area close to 0.495 is 0.4949 for
which the Z score is 2.57 or 0.4951 for which the Z score is 2.58. In practice, the Z
score of 2.58 is taken into consideration when calculating 99 per cent confidence
interval.)
The level of significance implies the probability of Type I error. A 5 per cent level
implies that the probability of committing a Type I error is 0.05. A 1 per cent level
implies 0.01 probability of committing Type I error.
Lowering the significance level and hence the probability of Type I error is good
but unfortunately, it would lead to the undesirable situation of committing Type II
error.
To Sum Up:
Type I Error: Rejecting H0 when H0 is true.
Type II Error: Accepting H0 when H0 is false.
Note: The probability of making a Type I error is the level of significance of a statistical test.
It is denoted by α.
The statistic

    z = (X̄ − μ)/SE(X̄)

is approximately normal. The rejection region for z, depending on
the desired level of significance, can be calculated.
For example, a factory produces items, each weighing 5 kg with variance 4. Can
a random sample of size 900 with mean weight 4.45 kg be justified as having been
taken from this factory?
Here,
    n = 900
    X̄ = 4.45
    μ = 5
    σ = √4 = 2

    z = (X̄ − μ)/SE(X̄) = (4.45 − 5)/(2/√900) = −0.55/(2/30) = −8.25

We have |z| > 3. The null hypothesis is rejected. The sample may not be regarded
as having been taken from this factory, at the 0.27 per cent level of significance
(corresponding to the 99.73 per cent acceptance region).
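The factory example can be verified in a couple of lines (an illustrative Python sketch):

```python
import math

mu, variance = 5, 4        # claimed weight (kg) and variance
xbar, n = 4.45, 900        # observed sample mean and sample size

sigma = math.sqrt(variance)
z = (xbar - mu) / (sigma / math.sqrt(n))
print(round(abs(z), 2))    # 8.25 -- far beyond 3, so H0 is rejected
```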
Test for equality of two proportions

If P1, P2 are the proportions of some characteristic in two samples of sizes n1, n2,
drawn from two populations, then to test the equality of the population proportions
we have H0: P1 = P2 vs H1: P1 ≠ P2.

Case (I): If H0 is true, then let P1 = P2 = p,
where p can be found from the data:

    p = (n1P1 + n2P2)/(n1 + n2)
    q = 1 − p

We write

    z = (P1 − P2)/√(pq(1/n1 + 1/n2)) ~ N(0, 1)

The usual rules for rejection or acceptance are applicable here.
Case (II): If it is assumed that the proportion under question is not the same in
the two populations from which the samples are drawn, and that P1, P2 are the true
proportions, we write (with q1 = 1 − P1, q2 = 1 − P2),

    SE(P1 − P2) = √(P1q1/n1 + P2q2/n2)

The confidence limits are then

    (P1 − P2) ± z(α/2) √(P1q1/n1 + P2q2/n2)

The 90 per cent confidence limits would be [with α = 0.1, 100(1 − α)% = 90%]:

    (P1 − P2) ± 1.645 √(P1q1/n1 + P2q2/n2)
Consider Example 5.5 to further understand the test for equality.
Example 5.5: Out of 5000 interviewees, 2400 are in favour of a proposal, and
out of another set of 2000 interviewees, 1200 are in favour. Is the difference
significant?

Solution:

Given,
    P1 = 2400/5000 = 0.48,  P2 = 1200/2000 = 0.6
    n1 = 5000,  n2 = 2000

    p = (n1P1 + n2P2)/(n1 + n2) = (2400 + 1200)/7000 = 0.514
    q = 1 − p = 0.486
    SE = √(pq(1/n1 + 1/n2)) = 0.013

    |z| = |P1 − P2|/SE = 0.12/0.013 = 9.2 > 3

Hence, the difference is highly significant.
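Example 5.5 can be checked with the pooled-proportion formula from Case (I) (an illustrative Python sketch; the unrounded ratio works out to about 9.1 — a value of 9.2 results from first rounding the standard error to 0.013):

```python
import math

p1, n1 = 2400 / 5000, 5000     # 0.48
p2, n2 = 1200 / 2000, 2000     # 0.60

# Pooled proportion under H0: P1 = P2
p = (n1 * p1 + n2 * p2) / (n1 + n2)
q = 1 - p
se = math.sqrt(p * q * (1 / n1 + 1 / n2))
z = (p1 - p2) / se

print(round(p, 3))             # 0.514
print(round(se, 3))            # 0.013
print(round(abs(z), 1))        # about 9.1, well above 3: highly significant
```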
    H0: μ1 = μ2
    H1: μ1 ≠ μ2
Rejection region

At the α level, for a two-tailed test, reject if |t| > t(α/2).
For a one-tailed test: (right) reject if t > t(α); (left) reject if t < −t(α).
At the 5 per cent level the three cases are:
    If |t| > t(0.025), reject (two-tailed)
    If t > t(0.05), reject (one-tailed, right)
    If t < −t(0.05), reject (one-tailed, left)
For proportions, the same procedure is to be followed.
Example 5.7: A firm produces tubes of diameter 2 cm. A sample of 10 tubes is
found to have a mean diameter of 2.01 cm and variance 0.004. Is the difference
significant? Given t(0.05, 9) = 2.26.
Solution:

    t = (X̄ − μ)/(s/√(n − 1))
      = (2.01 − 2)/√(0.004/9)
      = 0.01/0.021
      = 0.48

Since |t| = 0.48 < 2.26, the difference is not significant at the 5 per cent level.
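Example 5.7 can be reproduced as follows (an illustrative Python sketch; the unrounded ratio is about 0.47 — a value of 0.48 results from first rounding the denominator to 0.021):

```python
import math

mu = 2.0                        # claimed diameter in cm
xbar, variance, n = 2.01, 0.004, 10

s = math.sqrt(variance)         # sample standard deviation
t = (xbar - mu) / (s / math.sqrt(n - 1))
print(round(abs(t), 2))         # about 0.47, below t(0.05, 9) = 2.26
```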
5.9 KEY TERMS
Population (in statistics): It is a complete set of objects under study,
living or non-living.
Standard error of mean: Measures the likely deviation of a sample mean
from the grand mean of the sampling distribution.
Efficiency: An estimator is considered to be efficient if its value remains
stable from sample to sample. The best estimator would be the one which
would have the least variance from sample to sample taken randomly from
the same population. Of the three point estimators of central tendency,
namely the mean, the mode and the median, the mean is considered to be
the least variant and hence a better estimator.
Null hypothesis: A hypothesis stated in the hope of being rejected.
Short-Answer Questions
1. What is meant by statistical estimation?
2. What do you understand by computation of the standard error?
3. Describe the terms sample and sampling.
4. What are the two laws of statistics?
5. How do sampling and non-sampling errors arise?
6. What is estimation?
7. What are the characteristics of a hypothesis?
Long-Answer Questions
1. A company claims that 5% of its products are defective. In a sample of 400
items 320 are good. Test whether the claim is valid.
2. Discuss why sampling is necessary, with the help of examples.
3. Write an explanatory note on census and sampling.
4. Explain the various non-probability sampling methods.
5. Discuss the various types of probability sampling methods.
6. Briefly explain estimation with the help of examples.
7. Explain the two types of errors in statistical hypothesis.
5.11 FURTHER READING
Chance, William A. 1969. Statistical Methods for Decision Making. Illinois:
Richard D Irwin.
Chandan, J.S., Jagjit Singh and K.K. Khanna. 1995. Business Statistics. New
Delhi: Vikas Publishing House.
Elhance, D.N. 2006. Fundamentals of Statistics. Allahabad: Kitab Mahal.
Freund, J.E., and F.J. Williams. 1997. Elementary Business Statistics – The
Modern Approach. New Jersey: Prentice-Hall International.
Goon, A.M., M.K. Gupta, and B. Das Gupta. 1983. Fundamentals of Statistics.
Vols. I & II, Kolkata: The World Press Pvt. Ltd.
Gupta, S.C. 2008. Fundamentals of Business Statistics. Mumbai: Himalaya
Publishing House.
Kothari, C.R. 1984. Quantitative Techniques. New Delhi: Vikas Publishing
House.
Levin, Richard. I., and David. S. Rubin. 1997. Statistics for Management. New
Jersey: Prentice-Hall International.
Meyer, Paul L. 1970. Introductory Probability and Statistical Applications.
Massachusetts: Addison-Wesley.
Gupta, C.B. and Vijay Gupta. 2004. An Introduction to Statistical Methods,
23rd Edition. New Delhi: Vikas Publishing House Pvt. Ltd.
Hooda, R. P. 2013. Statistics for Business and Economics, 5th Edition. New
Delhi: Vikas Publishing House Pvt. Ltd.
Anderson, David R., Dennis J. Sweeney and Thomas A. Williams. Essentials of
Statistics for Business and Economics. Mumbai: Thomson Learning,
2007.
S.P. Gupta. 2021. Statistical Methods. Delhi: Sultan Chand and Sons.