Statistics
A Virtual Lecture Facilitated by
J. N. Onyeka-Ubaka (Ph.D)
+2348059839937
Specific Objectives
At the end of the course, students should be able to
• Define some basic concepts in Statistics;
• Familiarize themselves with the concepts of Random variables;
• Read Percentile Points from Multi-Level Statistical Table;
• Sketch areas under the standard Normal Distributions;
• Evaluate real life problems using concepts of Normal Distribution;
• Prove Chebyshev’s Inequalities;
• Solve simple problems using Laws of large numbers.
Normal Distribution
A normal distribution is a continuous probability distribution whose
density function traces a bell-shaped curve.
In a standard normal distribution, the mean is zero and the standard
deviation is one. It has zero skew and the kurtosis is three.
Normal distributions are symmetrical, but not all symmetrical
distributions are normal.
The area under the whole curve represents the entire population: 50%
lies above the mean and 50% below it. Because the areas under the
normal curve are tabulated (Gaussian tables), every interval on the
horizontal axis has a known probability associated with it.
Normal Distribution Cont’d
Probability under the normal distribution is an area. Specifically, we learned that if a
histogram is at least approximately bell-shaped, then:
(i) approximately 68% of the data fall within one standard deviation of the mean
(ii) approximately 95% of the data fall within two standard deviations of the mean
(iii) approximately 99.7% of the data fall within three standard deviations of the mean
Normal Distribution (Cont’d)
✓ The probability that a randomly selected data value from a normal
distribution falls within one standard deviation of the mean is
P(−1 < Z < 1) = P(Z < 1) − P(Z < −1)
= 0.8413 − 0.1587 = 0.6826
That is, we should expect 68.26% (approximately 68%) of the data
values arising from a normal population to be within one standard
deviation of the mean, that is, to fall in the interval: (μ − σ, μ + σ)
The probability that a randomly selected data value from a normal distribution falls
within two standard deviations of the mean is
P(−2 < Z < 2) = P(Z < 2) − P(Z < −2) = 0.9772 − 0.0228 = 0.9544
That is, we should expect 95.44% (approximately 95%) of the data values arising
from a normal population to be within two standard deviations of the mean, that is, to
fall in the interval: (μ − 2σ, μ + 2σ)
The probability that a randomly selected data value from a normal
distribution falls within three standard deviations of the mean is:
P(−3 < Z < 3) = P(Z < 3) − P(Z < −3) = 0.9987 − 0.0013 = 0.9974
That is, we should expect 99.74% (almost all) of the data values
arising from a normal population to be within three standard deviations
of the mean, that is, to fall in the interval: (μ − 3σ, μ + 3σ)
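The three probabilities above can also be reproduced without a table. The following is a minimal Python sketch, assuming the standard library class statistics.NormalDist is available (Python 3.8 or later); the helper name prob_within is hypothetical and not part of the lecture material.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
Z = NormalDist(mu=0.0, sigma=1.0)


def prob_within(k: float) -> float:
    """Return P(-k < Z < k) = P(Z < k) - P(Z < -k)."""
    return Z.cdf(k) - Z.cdf(-k)


for k in (1, 2, 3):
    print(f"P(-{k} < Z < {k}) = {prob_within(k):.4f}")
```

Computed to full precision, the values are approximately 0.6827, 0.9545 and 0.9973. The table-based figures 0.6826, 0.9544 and 0.9974 differ in the last digit only because each table entry is itself rounded to four decimal places.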
ASSIGNMENT
The pocket money (X) of students in a certain university is assumed to
be normally distributed with mean ₦80,000 and variance ₦36,000,000.
Find the two values of X which enclose;
(i) 68.27% of the distribution. (ii) 95.44% of the distribution.
(iii) 99.73% of the distribution.
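Problems of this kind ask for the two values that enclose a given central proportion of a normal distribution. The sketch below shows one possible way to compute such an interval, assuming Python's statistics.NormalDist. The function name central_interval and the illustrative figures (mean 100, standard deviation 15) are hypothetical and deliberately differ from the assignment data.

```python
from statistics import NormalDist


def central_interval(mu, sigma, p):
    """Return (low, high) enclosing the central proportion p of N(mu, sigma^2)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    # The upper limit leaves (1 - p) / 2 in the right tail,
    # so its standard normal quantile is at probability (1 + p) / 2.
    z = NormalDist().inv_cdf((1.0 + p) / 2.0)
    return mu - z * sigma, mu + z * sigma


# Hypothetical example: mean 100, standard deviation 15, central 95.45%.
low, high = central_interval(100.0, 15.0, 0.9545)
print(f"{low:.1f} to {high:.1f}")
```

With these illustrative numbers the interval is roughly 70 to 130, that is, the mean plus or minus two standard deviations. Note that the function takes the standard deviation, so a given variance must first be converted by taking its square root.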
Probability Density of a Normal
Distribution
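The probability density function (pdf) of a normal distribution is given
by
\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}, \qquad -\infty < x < \infty;\ \sigma > 0 \]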
If a random variable X obeys the normal probability law with mean μ
and standard deviation σ, we write in symbols X ~ N(μ, σ²).
The normal pdf is widely encountered in all branches of science,
engineering, medicine, and social and demographic studies. For example,
the masses of lecturers in a university, the intelligence quotient of
children, the heights of a growing child, the yields of agricultural
produce in a farm, and the noise voltage produced by a thermally agitated
resistor are all postulated to be approximately normal over a large
range of values.
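The normal density can be evaluated directly from its formula. The following Python sketch is illustrative only: the function name normal_pdf and the values μ = 7, σ = 6 are assumptions chosen for demonstration, and the result is compared with the standard library's statistics.NormalDist.pdf as a cross-check.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mu, sigma):
    """Evaluate f(x) = exp(-0.5 * ((x - mu) / sigma)**2) / (sigma * sqrt(2 * pi))."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Illustrative (assumed) parameters.
mu, sigma = 7.0, 6.0
reference = NormalDist(mu, sigma)

for x in (1.0, 7.0, 13.0):
    # The two columns should agree; x = 1 and x = 13 are symmetric about the mean.
    print(x, normal_pdf(x, mu, sigma), reference.pdf(x))
```

The density is largest at x = μ and takes equal values at points equidistant from the mean, which is the symmetry of the bell-shaped curve.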
Characteristics of the Normal Distribution
The normal distribution function depends on the mean μ and the
standard deviation σ.
The normal distribution curve is bell-shaped.
The curve is asymptotic to the x-axis.
The function is continuous from −∞ to +∞.
The curve is symmetrical about the vertical line through the mean.
Since a normal distribution function is a probability density function,
the total area under its curve is 1.
Properties of Normal Probability Density
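The f(x) of a normal distribution satisfies the following properties:
(i) \( \int_{-\infty}^{\infty} f(x)\,dx = 1 \)
(ii) \( \int_{-\infty}^{\infty} x f(x)\,dx = \mu \)
(iii) \( \int_{-\infty}^{\infty} (x-\mu)^{2} f(x)\,dx = \sigma^{2} \)
Show that \( \int_{-\infty}^{\infty} f(x)\,dx = 1 \).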
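Proof
Let \( z = \dfrac{x-\mu}{\sigma} \), so that \( dx = \sigma\,dz \). Then
\[ \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\frac{(x-\mu)^{2}}{\sigma^{2}}}\,dx = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}z^{2}}\,dz = I \]
Squaring, and writing the second factor with the dummy variable y,
\[ I^{2} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}z^{2}}\,dz \cdot \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac{1}{2}y^{2}}\,dy = \frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-\frac{1}{2}(z^{2}+y^{2})}\,dz\,dy \]
Let us introduce polar coordinates to evaluate this double integral:
z = r cos θ, y = r sin θ
\[ I^{2} = \frac{1}{2\pi}\int_{0}^{2\pi}\int_{0}^{\infty} r\,e^{-\frac{1}{2}r^{2}}\,dr\,d\theta = \frac{1}{2\pi}\int_{0}^{2\pi}\left[-e^{-\frac{1}{2}r^{2}}\right]_{0}^{\infty} d\theta = \frac{1}{2\pi}\int_{0}^{2\pi} d\theta = 1 \]
where dz dy = r dr dθ.
Since the integrand is positive, I > 0, and therefore I = 1. Hence,
\[ \int_{-\infty}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\frac{(x-\mu)^{2}}{\sigma^{2}}}\,dx = 1 \]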
For others, see Pages 89-91 of Probability and Distribution Theories
for Professional Competence
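As a numerical sanity check (not a proof), properties (i) to (iii) of the normal density can be approximated with a simple quadrature rule. The sketch below is a minimal illustration: the helper names normal_pdf and integrate, the parameters μ = 7 and σ = 6, and the truncation of the infinite range to μ ± 10σ are all assumptions made for this example.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(g, a, b, n=20000):
    """Approximate the integral of g over [a, b] by the composite trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h


# Assumed illustrative parameters; the tails beyond 10 sigma are negligible.
mu, sigma = 7.0, 6.0
a, b = mu - 10.0 * sigma, mu + 10.0 * sigma

area = integrate(lambda x: normal_pdf(x, mu, sigma), a, b)
mean = integrate(lambda x: x * normal_pdf(x, mu, sigma), a, b)
variance = integrate(lambda x: (x - mu) ** 2 * normal_pdf(x, mu, sigma), a, b)

print(area, mean, variance)  # close to 1, mu and sigma**2 respectively
```

The three printed numbers should be very close to 1, 7 and 36, matching the total area, the mean μ and the variance σ².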
Standard Normal Curve
Statisticians often find it convenient to work with a normal curve
with mean 0 and standard deviation 1. Such a normal distribution
curve is called a standard normal distribution curve or a standard
normal curve.
The shape of any normal curve depends on its mean and standard
deviation.
[Figure: standard normal curve centred at μ = 0, horizontal axis z]
If the probability distribution has mean 0 and standard deviation 1, we
say that the distribution has been standardized and z is called the
standardized score or z-score:
\[ z = \frac{x-\mu}{\sigma} \]
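Standardization can be illustrated with a short Python sketch. The function name z_score and the numbers used (x = 65, μ = 50, σ = 10) are hypothetical; the check against statistics.NormalDist shows that a probability for X equals the corresponding probability for its z-score under the standard normal curve.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Return the standardized score z = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical values for illustration.
x, mu, sigma = 65.0, 50.0, 10.0
z = z_score(x, mu, sigma)

# P(X <= x) under N(mu, sigma^2) equals P(Z <= z) under N(0, 1).
p_original = NormalDist(mu, sigma).cdf(x)
p_standard = NormalDist(0.0, 1.0).cdf(z)
print(z, p_original, p_standard)
```

Here z is 1.5, meaning that x lies one and a half standard deviations above the mean, and the two probabilities agree.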
Example
If X is a normal random variable with mean μ = 7 and variance σ² = 36,
find (a) P(X ≤ 8) (b) P(5 < X < 9) (c) P(|X − 5| > 2)
Solution
(a) P(X ≤ 8) = P((X − μ)/σ ≤ (8 − 7)/6) = P(Z ≤ 1/6) = P(Z ≤ 0.17) = 0.5675
(b) P(5 < X < 9) = P((5 − 7)/6 < (X − μ)/σ < (9 − 7)/6) = P(−2/6 < Z < 2/6)
= P(−0.33 < Z < 0.33)
= Φ(0.33) − Φ(−0.33)
= 0.6293 − 0.3707 = 0.2586
(c) P(|X − 5| > 2) = P(X − 5 > 2 or X − 5 < −2) = P(X > 7 or X < 3)
= P(X > 7) + P(X < 3) = P(Z > (7 − 7)/6) + P(Z < (3 − 7)/6)
= P(Z > 0) + P(Z < −0.67)
= 1 − P(Z ≤ 0) + P(Z < −0.67)
= 1 − 0.5000 + 0.2514 = 0.7514
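The worked example above can be cross-checked in software. This is a minimal sketch assuming Python's statistics.NormalDist; the variable names p_a, p_b and p_c are chosen here for illustration only.

```python
from statistics import NormalDist

# X ~ N(7, 36): variance 36 means the standard deviation is 6.
X = NormalDist(mu=7.0, sigma=6.0)

# (a) P(X <= 8)
p_a = X.cdf(8.0)

# (b) P(5 < X < 9)
p_b = X.cdf(9.0) - X.cdf(5.0)

# (c) P(|X - 5| > 2) = P(X > 7) + P(X < 3)
p_c = (1.0 - X.cdf(7.0)) + X.cdf(3.0)

print(f"(a) {p_a:.4f}  (b) {p_b:.4f}  (c) {p_c:.4f}")
```

The computed values agree with the table-based answers to about two decimal places. Small differences in the third decimal arise because the hand calculation rounds each z-score to two decimal places before reading the table.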
Exercise
The life-length of an electronic device manufactured by Company A is
normally distributed with mean 45 and standard deviation 8, while that
of a similar electronic device manufactured by Company B has mean
life-length 48 and standard deviation 4, all measurements being in hours.
Which of the electronic devices is to be preferred if it is required for:
(i) a 48-hour period;
(ii) a 52-hour period?
Give reasons for your answer.
Remarks
See Pages 100 and 101 of your Study materials for the
inter-relationship between Normal Distribution and some
other continuous distributions
Questions and Answers
Conclusion
A random variable X is said to be continuous if its set of possible
values is an entire interval of numbers – that is, if for some A < B, any
number x between A and B is possible.
Let X be a continuous random variable. Then a probability density
function (pdf) of X is a function f(x) such that for any two numbers a
and b with a ≤ b,
\[ P(a \le X \le b) = \int_{a}^{b} f(x)\,dx \]
The two conditions that must be satisfied for f(x) to be a legitimate
pdf are:
(i) f(x) ≥ 0 for all x.
(ii) \( \int_{-\infty}^{\infty} f(x)\,dx \) = area under the entire graph of f(x) = 1
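The two pdf conditions, and the rule that probability is the area between a and b, can be checked numerically for a simple candidate density. The sketch below is illustrative only: the candidate f(x) = 2x on [0, 1] (zero elsewhere) and the helper names candidate_pdf and integrate are assumptions introduced for this example and do not come from the lecture.

```python
def candidate_pdf(x):
    """Hypothetical density: f(x) = 2x for 0 <= x <= 1, and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0


def integrate(g, a, b, n=1000):
    """Approximate the integral of g over [a, b] by the composite trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h


# Condition (i): f(x) >= 0, checked on a grid from -1 to 2.
non_negative = all(candidate_pdf(-1.0 + 0.01 * i) >= 0.0 for i in range(301))

# Condition (ii): total area is 1 (the density vanishes outside [0, 1]).
total_area = integrate(candidate_pdf, 0.0, 1.0)

# P(a <= X <= b) is the area under f between a and b.
prob = integrate(candidate_pdf, 0.25, 0.5)

print(non_negative, total_area, prob)
```

For this candidate, the exact answers are a total area of 1 and P(0.25 ≤ X ≤ 0.5) = 0.5² − 0.25² = 0.1875, so the function qualifies as a legitimate pdf.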
References
Onyeka-Ubaka, J. N. (2022). Probability and Distribution Theories for
Professional Competence. First Edition, Masterprint Educational
Publishers & Printers, Ibadan.
Multi-Level Statistical Table
Thank you!