Probability for Financial Analysis

Structure
1.0 Objectives
1.1 Introduction
1.2 Sets
1.3 Probability
1.4 Random Variables
1.1 INTRODUCTION
You have already seen the basic theory of probability in the statistics course (i.e., EEC-13) as well as in quantitative methods (MEC-03). The present discussion is intended as a review of those themes. Since we will rely on probability-based approaches most of the time in our discussion of financial and insurance related themes, the topics covered below aim at providing you the basic inputs for understanding the analysis carried on in those fields. While developing the study material in this unit, we have used the notations, examples and proofs given in Bialas (2005) and Imbens (2000). For your reference, these sources will be useful.
Notation: Sets are denoted by capital letters (A, B, C, ...) while their elements are denoted by small letters (a, b, ω, ...). If x is an element of a set A, we write x ∈ A. If x is not an element of A, we write x ∉ A.
There are two sets of special importance:
i) The universe is the set containing all points under consideration and is denoted by Ω.
ii) The empty set (the set containing no elements) is denoted by ∅.
If every point of set A belongs to set B, then we say that A is a subset of B (B is a superset of A). We write A ⊂ B (equivalently, B ⊃ A).
A = B if and only if A ⊆ B and B ⊆ A.
A is a proper subset of B if A ⊂ B and A ≠ B.
Two sets A and B are disjoint (or mutually exclusive) if A ∩ B = ∅.
Cardinality of Sets
The cardinality of a set A (denoted by |A|) is the number of elements in A.
For some (but not all) experiments, we have:
i) |Ω| is finite.
ii) Each of the outcomes of the experiment is equally likely.
In such cases, it is important to be able to enumerate (i.e., count) the number of elements of subsets of Ω.
Sample spaces and events
Using set notation, we can introduce the first components of our probability model.
1.3 PROBABILITY
1.3.1 Probability Spaces
"
In the above, we have seen how to use sample spaces Ω and events A ⊂ Ω to describe an experiment. In the following, we will define what we mean by the probability of an event, P(A), and the mathematics necessary to compute P(A) for all events A ⊂ Ω.
For infinite sample spaces, for example when Ω = ℝ, it is not possible to reasonably compute P(A) for all A ⊂ Ω. In such cases, we restrict our definition of P(A) to some smaller collection of events, ℱ, that does not necessarily include every possible subset of Ω.
Measuring Sets
Let us start with some important definitions.
1) The sample space, Ω, is the set of all possible outcomes of an experiment.
2) An event is any subset of the sample space.
3) A probability measure, P(·), is a function, defined on subsets of Ω, that assigns to each event a real number P(E) such that the following three probability axioms hold:
i) P(E) ≥ 0 for all events E
ii) P(Ω) = 1
iii) If the events E and F are mutually exclusive, then P(E ∪ F) = P(E) + P(F)
The above three specifications are called the Kolmogorov axioms for probability measures. We will discuss more about probability measures in the next unit. For the purposes of the present unit, you need to remember that the probability of an event A, for every A ⊂ Ω, lies in the closed interval [0, 1], i.e., 0 ≤ P(A) ≤ 1. Therefore you can take P(·) as a function which assigns to each element of the set of events ℱ a single number from the interval [0, 1]. That is, P : ℱ → [0, 1].
In calculus, you worked with real-valued functions f(·). That is, f : ℝ → ℝ. Such functions assign to each number in ℝ a single number from ℝ. For example, f(x) = x³ maps x = 2 into f(2) = 8. In probability, the function P(·) is a function whose argument is a subset of Ω (namely, an event) rather than a number. In mathematics, we call functions that assign real numbers to subsets measures. Following such a logic, we call P(·) a probability measure.
Interpretation of Probability
Logical Probability. Take Ω as a finite set
Ω = {ω₁, ω₂, ..., ωₙ}
and suppose each of the outcomes in Ω is equally likely. That is,
P({ωᵢ}) = 1/n for i = 1, ..., n, and for any A ⊂ Ω assign P(A) = |A|/n.
For example, consider the outcome of rolling a fair six-sided die once. Then we have Ω = {ω₁, ω₂, ..., ω₆}, where ωᵢ = "the die shows the number i" for i = 1, 2, ..., 6. So we can assign, for example, P({ωᵢ}) = 1/6 for i = 1, ..., 6,
P({even number}) = P({ω₂, ω₄, ω₆}) = 3/6,
and, in general, P(A) = |A|/6 for any A ⊂ Ω.
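To see the counting rule in action, the short Python sketch below (an illustrative addition, with the fair-die example assumed) computes P(A) = |A|/|Ω| by direct enumeration:

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}          # sample space of one roll of a fair die

def prob(event):
    """Logical probability: P(A) = |A| / |Omega| for equally likely outcomes."""
    return Fraction(len(event & omega), len(omega))

even = {2, 4, 6}
print(prob({1}))        # 1/6
print(prob(even))       # 1/2 (i.e., 3/6)
```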
and we want to find the probability that the factory has at least one accident in a month. That is, we want to compute P(∪ᵢ Eᵢ).
Since Σᵢ₌₀^∞ aⁱ/i! = eᵃ for any a, we have
1.3.2 Conditional Probability
The probability of an event A given the event B, denoted P(A|B), is the value that solves the equation P(A ∩ B) = P(A|B)P(B), provided that P(B) > 0.
Example. Find the probability of two heads given that you have at least one head in two tosses of a fair coin.
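A minimal enumeration sketch of this example (an added illustration) confirms that the answer is 1/3, since P(A ∩ B) = 1/4 and P(B) = 3/4:

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))           # {HH, HT, TH, TT}, equally likely
A = {w for w in omega if w == ("H", "H")}       # two heads
B = {w for w in omega if "H" in w}              # at least one head

def p(event):
    return Fraction(len(event), len(omega))

print(p(A & B) / p(B))                          # 1/3
```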
Generalising the above result, we have for any events A₁, A₂, ..., Aₙ:
P(A₁A₂ ⋯ Aₙ) = P(A₁) P(A₂|A₁) P(A₃|A₁A₂) ⋯ P(Aₙ|A₁A₂ ⋯ Aₙ₋₁).
Remember that P(·) is a probability measure if and only if
1) P(E) ≥ 0 for all events E
2) P(Ω) = 1
3) If E ∩ F = ∅, then P(E ∪ F) = P(E) + P(F).
1.3.3 Independence
Two events A and B are independent if and only if
P(A ∩ B) = P(A) P(B).
That is, if two events A and B are independent and P(B) > 0, then P(A|B) = P(A). Essentially it says that knowing whether B occurs does not change the probability of A.
Moreover, if A and B are independent events, then
• Aᶜ and Bᶜ are independent events,
• Aᶜ and B are independent events, and
• A and Bᶜ are independent events.
Bayes' Theorem
Take a complete set of alternatives, that is, a collection of events B₁, B₂, ..., Bₙ such that
1) B₁ ∪ B₂ ∪ ... ∪ Bₙ = Ω, and
2) Bᵢ ∩ Bⱼ = ∅ for any i ≠ j.
Law of Total Probability. Given an event A and a complete set of alternatives B₁, B₂, ..., Bₙ,
P(A) = Σᵢ₌₁ⁿ P(A|Bᵢ) P(Bᵢ).
We use the definition of independence given above to get results on more than two events.
i) A finite collection of events A₁, A₂, ..., Aₙ are independent if and only if
P(A_{k₁} ∩ ... ∩ A_{kⱼ}) = P(A_{k₁}) ⋯ P(A_{kⱼ})
for all 2 ≤ j ≤ n and all 1 ≤ k₁ < ... < kⱼ ≤ n.
ii) For n = 3, the events A₁, A₂, A₃ are independent if and only if
P(A₁ ∩ A₂) = P(A₁) P(A₂),
P(A₁ ∩ A₃) = P(A₁) P(A₃),
P(A₂ ∩ A₃) = P(A₂) P(A₃),
P(A₁ ∩ A₂ ∩ A₃) = P(A₁) P(A₂) P(A₃).
All of the above must hold for A₁, A₂, A₃ to be independent. If only the first three conditions hold, then we say that A₁, A₂, A₃ are pairwise independent. We will discuss this concept in greater detail in the next unit.
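The following sketch works through a standard illustration (an addition, not an example from this unit): for two fair coin tosses, the events "first toss is heads", "second toss is heads" and "the two tosses differ" are pairwise independent but not mutually independent.

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=2))           # two fair coin tosses

def p(event):
    return Fraction(len(event), len(omega))

A1 = {w for w in omega if w[0] == "H"}          # first toss is heads
A2 = {w for w in omega if w[1] == "H"}          # second toss is heads
A3 = {w for w in omega if w[0] != w[1]}         # the two tosses differ

for X, Y in [(A1, A2), (A1, A3), (A2, A3)]:
    print(p(X & Y) == p(X) * p(Y))              # True for every pair
print(p(A1 & A2 & A3), p(A1) * p(A2) * p(A3))   # 0 versus 1/8: not mutually independent
```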
If you toss a coin twice, what would be the sample space? List all the
possible events for this experiment.
Basic Concepts
Definition. A random variable (r.v.) is a function X(·), which assigns to each element ω of Ω a real value X(ω). Thus, we write
P({ω ∈ Ω : X(ω) = a}) = P({X(ω) = a}) = P(X = a).
Examples.
1) Flip a coin n times. Here Ω = {H, T}ⁿ. The random variable X ∈ {0, 1, 2, ..., n} can be defined to be the number of heads.
2) Let Ω = ℝ, and define the two r.v.s
2) lim_{a→−∞} F_X(a) = 0
Bernoulli r.v.: X ~ Bernoulli(p), for 0 ≤ p ≤ 1, has pmf
p_X(1) = p and p_X(0) = 1 − p.
Geometric r.v.: X ~ Geom(p) has pmf
p_X(k) = p(1 − p)^(k−1), for k = 1, 2, ....
This r.v. represents, for example, the number of coin flips until the first head shows up (assuming independent coin flips).
Binomial r.v.: X ~ B(n, p), for integer n > 0 and 0 ≤ p ≤ 1, has pmf
p_X(k) = (n choose k) p^k (1 − p)^(n−k), for k = 0, 1, ..., n.
Uniform r.v.: X ~ U[a, b] has pdf
f_X(x) = 1/(b − a)  for a ≤ x ≤ b,
       = 0          otherwise.
Exponential distribution
P_Y(Y ∈ A) = P_X(g(X) ∈ A) = P({ω ∈ Ω | g(X(ω)) ∈ A}).
This is not very helpful, as we have not specified the actual experiment and the random variable. The question is how we can go from these functions for X to the corresponding functions for Y. Let us discuss the general case for such a transformation.
Result 1. Define the mapping g⁻¹(A) as
g⁻¹(A) = {x | g(x) ∈ A}.
Then, for a monotone increasing transformation Y = g(X), the cdf of Y is
F_Y(y) = P_X(X ≤ g⁻¹(y)) = F_X(g⁻¹(y)).
Then use the chain rule to take the derivative with respect to y and you get the result.
Example 1. Suppose X has a Poisson distribution with parameter λ. That is, the pmf of X is
f_X(x) = e^(−λ) λ^x / x!  for x = 0, 1, 2, ...,
       = 0                otherwise,
for some positive λ. Usually we use the Poisson distribution for counting events such as the arrival of insurance claims.
Take the distribution of Y = g(X) = 2X. Then the inverse of this transformation is X = g⁻¹(Y) = Y/2 and the pmf of Y is given as
f_Y(y) = e^(−λ) λ^(y/2) / (y/2)!  for y = 0, 2, 4, ...,
       = 0                        otherwise.
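As a numerical check (an added sketch, with an assumed value λ = 1.5), the pmf of Y = 2X simply places the Poisson probabilities on the even integers:

```python
import math

lam = 1.5                                      # assumed Poisson parameter

def pmf_x(x):
    """Poisson pmf of X."""
    return math.exp(-lam) * lam**x / math.factorial(x)

def pmf_y(y):
    """pmf of Y = 2X, obtained through the inverse map x = y/2."""
    return pmf_x(y // 2) if (y >= 0 and y % 2 == 0) else 0.0

print(pmf_y(4), pmf_x(2))                      # identical values
print(sum(pmf_y(y) for y in range(0, 200)))    # approximately 1
```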
Example 2. Let X have an exponential distribution with pdf
f_X(x) = e^(−x)  for x > 0,
       = 0       otherwise.
Then the cdf of this distribution is F_X(x) = 1 − e^(−x). When we write the pdf more generally as f_X(x) = λ exp(−λx), for positive λ, this distribution is widely used for modelling mortalities.
Uniform Distribution
Consider the distribution of Y = g(X) = 1 − e^(−X). This is a monotone transformation with inverse X = g⁻¹(Y) = −ln(1 − Y). Then,
f_Y(y) = 1  for 0 < y < 1,
       = 0  otherwise.
That is, Y has a uniform distribution on the interval [0, 1]. Now consider the distribution of the non-monotone transformation Y = g(X) = X². Because of the non-monotonicity we have to proceed on a more ad hoc basis. In this example we work directly through the cumulative distribution functions, splitting things up into intervals where the transformation is monotone. First, we calculate the cdf for X.
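Before moving on, the following simulation sketch (an added illustration, assuming X ~ Exponential(1)) checks that Y = 1 − e^(−X) is indeed approximately uniform on [0, 1]:

```python
import math
import random

random.seed(0)
n = 100_000
# X ~ Exponential(1); Y = 1 - exp(-X) should be approximately Uniform[0, 1]
y = [1 - math.exp(-random.expovariate(1.0)) for _ in range(n)]

for k in range(10):                             # empirical frequency of each decile
    share = sum(1 for v in y if k / 10 <= v < (k + 1) / 10) / n
    print(f"[{k/10:.1f}, {(k+1)/10:.1f}): {share:.3f}")   # each close to 0.100
```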
..........................................................................................
2) How do you write the PMF of a Bernoulli random variable?
..........................................................................................
3) What is PDF? Write down its properties.
..........................................................................................
4) How do you write the PDF of
(i) exponential distribution
(ii) Gaussian distribution
..........................................................................................
5) What is the meaning of transformation of a distribution function?
1.5 EXPECTATIONS
Example 1. Suppose you toss a single die. Let X be the number showing on top of the die. Then X has a distribution with pmf
f_X(x) = 1/6  for x = 1, 2, 3, 4, 5, 6,
       = 0    otherwise.
Then the expected value is
E(X) = Σ_x x f_X(x) = (1 + 2 + 3 + 4 + 5 + 6)/6 = 3.5.
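The added sketch below verifies the die expectation exactly and by simulation:

```python
import random
from fractions import Fraction

exact = sum(Fraction(x, 6) for x in range(1, 7))   # E(X) = sum of x * f_X(x)
print(exact)                                       # 7/2, i.e., 3.5

random.seed(1)
draws = [random.randint(1, 6) for _ in range(100_000)]
print(sum(draws) / len(draws))                     # simulated average, close to 3.5
```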
As we have seen above, the exponential distribution with λ = 1 can be used to fit distributions of durations, such as waiting times between insurance claim arrivals. Its expected value is
E(X) = ∫₀^∞ x λ exp(−λx) dx = 1/λ.
For the last step, we have used the following: since z = r⁻¹(r(z)), by the chain rule for differentiation we differentiate both sides with respect to z to get
1 = (dr⁻¹/dr)(r(z)) · (dr/dz)(z).
Note some special expectations. In all such cases X is a random variable with pdf/pmf equal to f_X(x):
1) Mean is another name for the expectation or expected value of X,
μ = E(X).
In particular, using M_X^(k)(t) as shorthand for the k-th derivative d^k M_X/dt^k (t), we have M_X(0) = 1. Using the moment generating function, we can derive the expectation and avoid the procedure of integration by parts. Thus, for the exponential distribution,
M_X(t) = λ/(λ − t).
However, this only works for t < λ. If you are interested only in values of t around zero, you can use this result. The k-th derivative of the mgf is
M_X^(k)(t) = k! λ/(λ − t)^(k+1),
so that E(X^k) = M_X^(k)(0) = k!/λ^k. Its mean is 1/λ, the second moment 2/λ², and the variance 1/λ².
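The added sketch below (using sympy, an assumed tool not referenced in the text) differentiates the mgf M_X(t) = λ/(λ − t) symbolically and recovers the mean, second moment and variance stated above:

```python
import sympy as sp

t, lam = sp.symbols("t lam", positive=True)
M = lam / (lam - t)                          # mgf of the exponential, valid for t < lam

mean = sp.diff(M, t, 1).subs(t, 0)           # first moment: 1/lam
second_moment = sp.diff(M, t, 2).subs(t, 0)  # second moment: 2/lam**2
variance = sp.simplify(second_moment - mean**2)

print(mean, second_moment, variance)         # 1/lam, 2/lam**2, 1/lam**2
```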
Example 4. Suppose that an experiment can have one of two outcomes, success or failure. Let p be the probability of success. Let Y be the indicator for success, equal to one if the experiment is a success and zero otherwise. This is called a Bernoulli trial. When we repeat this experiment n times, and assume that the repetitions are independent, we have the following result.
Let Yᵢ for i = 1, ..., n denote the success indicator in the i-th Bernoulli trial. Define X = Σᵢ₌₁ⁿ Yᵢ as the total number of successes. Then X has a binomial distribution. To find its pmf, consider the probability of a particular sequence of x successes and n − x failures, for example, first x ones and then n − x zeros. The probability of any one such sequence is p^x (1 − p)^(n−x). To get the probability of x successes and n − x failures, we need to count the number of such sequences. This is equal to the number of ways you can select x objects out of a set of n, which is (n choose x). Hence the pmf of X is
f_X(x) = (n choose x) p^x (1 − p)^(n−x), for x = 0, 1, ..., n.
Another approach to get the above result is to write X as the sum of independent and identically distributed random variables, X = Σᵢ₌₁ⁿ Yᵢ, with each Yᵢ a Bernoulli(p) random variable.
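A simulation sketch (an added illustration with assumed values n = 10 and p = 0.3) shows that the sum of independent Bernoulli indicators follows the binomial pmf derived above:

```python
import random
from math import comb

random.seed(2)
n, p, reps = 10, 0.3, 200_000                 # assumed illustration values
counts = [0] * (n + 1)
for _ in range(reps):
    x = sum(1 for _ in range(n) if random.random() < p)   # sum of n Bernoulli(p) trials
    counts[x] += 1

for x in range(n + 1):
    exact = comb(n, x) * p**x * (1 - p)**(n - x)           # binomial pmf
    print(x, round(counts[x] / reps, 4), round(exact, 4))
```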
Result 2. Chebyshev's Inequality. For any random variable Y with mean μ and variance σ², and for any k > 0,
P(|Y − μ| ≥ kσ) ≤ 1/k².
To see this, note that for a non-negative random variable X and any c > 0,
E(X) = Σ_x x f_X(x) ≥ Σ_{x ≥ c} x f_X(x) ≥ c · Pr(X ≥ c),
so that Pr(X ≥ c) ≤ E[X]/c. When X = (Y − μ)² and c = k²σ², we can rewrite this inequality as Chebyshev's inequality above.
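The following added sketch checks Chebyshev's bound numerically for an assumed Exponential(1) sample, for which μ = σ = 1:

```python
import random

random.seed(3)
mu = sigma = 1.0                              # mean and std. dev. of Exponential(1)
sample = [random.expovariate(1.0) for _ in range(200_000)]

for k in (1.5, 2, 3):
    freq = sum(1 for y in sample if abs(y - mu) >= k * sigma) / len(sample)
    print(k, round(freq, 4), "<=", round(1 / k**2, 4))    # observed frequency vs bound 1/k^2
```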
Note that the second summation runs only from k = 1 to r − 1, not from k = 0 to r − 1. Changing the index in the second summation from k = 1, ..., r − 1 to k = 0, ..., r − 2, we can write this as
Γ(α) = ∫₀^∞ t^(α−1) e^(−t) dt.
Its properties are:
Γ(α + 1) = α Γ(α),  Γ(1) = ∫₀^∞ e^(−t) dt = 1,  and  Γ(1/2) = √π,
so that
Γ(n) = (n − 1)!
for integer n. Take a random variable Y with an exponential distribution with arrival rate λ. Then its moment generating function is
M_Y(t) = E[exp(tY)] = λ/(λ − t) = 1/(1 − μt),
where μ = 1/λ is the mean of the exponential distribution.
Let Y₁, ..., Yₙ be independent exponential random variables with mean 1. Then the moment generating function for the gamma distribution is
= ∫ (1/√(2πσ²)) exp(−((x − μ)² − 2σ²xt)/(2σ²)) dx,
which works out to M_X(t) = exp(μt + σ²t²/2) for the normal distribution. Its mean is μ and the variance σ².
Properties of the normal distribution: Linear transformations of normal random variables are also normally distributed. Take a random variable X with a N(μ, σ²) distribution, and consider the transformation Y = a + bX. Then
Y ~ N(a + bμ, b²σ²).
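A quick simulation sketch (an added illustration with assumed values a = 2, b = −3, μ = 1, σ = 0.5) confirms the linear-transformation property:

```python
import random
import statistics

random.seed(4)
a, b, mu, sigma = 2.0, -3.0, 1.0, 0.5          # assumed illustration values
y = [a + b * random.gauss(mu, sigma) for _ in range(200_000)]

print(statistics.mean(y))                      # close to a + b*mu = -1.0
print(statistics.pstdev(y))                    # close to |b|*sigma = 1.5
```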
for −∞ < x < ∞. This distribution has no moments. The median and mode are both equal to θ. The pdf is similar to the pdf of the normal distribution, but with thicker tails. We use it for modelling variables with high kurtosis, that is, variables which infrequently take on extremely large values, such as stock prices or losses in an event like a tsunami. An interesting property of the Cauchy distribution is that if the independent random variables X₁, X₂, ..., Xₙ all have the Cauchy distribution centred around θ, then the average Σᵢ₌₁ⁿ Xᵢ/n also has that same Cauchy distribution centred around θ, with the same quantiles. In other words, laws of large numbers do not apply to distributions like the Cauchy.
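The added simulation below illustrates this failure of the law of large numbers: running averages of Cauchy draws keep jumping around, while averages of normal draws settle near the mean.

```python
import math
import random

random.seed(5)
theta = 0.0                                    # centre of the Cauchy distribution

def cauchy_draw():
    """Standard Cauchy centred at theta, drawn via the inverse-cdf method."""
    return theta + math.tan(math.pi * (random.random() - 0.5))

for n in (10, 1_000, 100_000):
    cauchy_avg = sum(cauchy_draw() for _ in range(n)) / n
    normal_avg = sum(random.gauss(0, 1) for _ in range(n)) / n
    print(n, round(cauchy_avg, 3), round(normal_avg, 3))   # Cauchy averages do not settle
```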
5) Beta Distribution. If two independent random variables Y₁ and Y₂ have Gamma distributions with parameters (α, 1) and (β, 1) respectively, then the ratio X = Y₁/(Y₁ + Y₂) has a Beta distribution with parameters α and β. Its pdf is
f_X(x) = (Γ(α + β)/(Γ(α)Γ(β))) x^(α−1) (1 − x)^(β−1)
for 0 < x < 1 and zero elsewhere. The mean and variance are α/(α + β) and αβ/((α + β)²(α + β + 1)) respectively. For modelling variables that take on values in the unit interval, we can apply this distribution.
6) Let X have a standard normal distribution (i.e., a normal distribution with mean zero and unit variance). Let W have a Chi-squared distribution with r degrees of freedom, and let X and W be independent. Then
Y = X/√(W/r)
has a t-distribution with degrees of freedom equal to r. The probability density function of Y is
f_Y(y) = (Γ((r + 1)/2) / (√(rπ) Γ(r/2))) · (1 + y²/r)^(−(r+1)/2),
for −∞ < y < ∞. The t-distribution has thicker tails than the normal distribution. As the degrees of freedom get larger, the t-distribution gets closer to the normal distribution. (With E[W/r] = 1 and V(W/r) = 2/r, as r → ∞ the mean of the denominator stays one, while its variance goes to zero.)
7) If V and W are independent Chi-squared random variables with degrees of freedom equal to r₁ and r₂ respectively, then Y = (V/r₁)/(W/r₂) has an F-distribution with degrees of freedom equal to r₁ and r₂. The probability density function is
for −∞ < x < ∞, and we refer to η as the natural parameters. Consequently we have cases like:
1) Binomial distribution, for which k = 1, and
2) Poisson distribution:
f_X(x) = 1{x ∈ {0, 1, 2, ...}} · (1/x!) · exp{−λ} · exp(x ln λ).
3) Exponential distribution:
f_X(x) = 1{x > 0} · λ exp(−xλ).
4) Normal distribution:
A distribution that does not fit in the above group is the uniform distribution:
f_X(x) = 1{a < x < b} · 1/(b − a).
In the case of the indicator function 1{a < x < b} you cannot factor the density into a function of the parameters and a function of the variable.
Joint Distributions, Conditional Distributions and Independence of Random Variables
Definition 1. An N-dimensional random variable is a function from the sample space to ℝᴺ.
Definition 2. Take (X, Y) to be a discrete bivariate vector. Then the function F_XY(x, y) from ℝ² to ℝ is defined by
f_X(x) = 1/4  if x = 0  (ω ∈ {TTT, TTH}),
       = 2/4  if x = 1  (ω ∈ {HTH, HTT, THH, THT}),
       = 1/4  if x = 2  (ω ∈ {HHT, HHH}),
       = 0    elsewhere.
Similarly we can calculate the pmf for Y without X. In the case of joint random variables, we typically refer to these single-variable probability mass functions as marginal probability mass functions. In general they can be calculated from the joint pmf as
f_X(x) = Σ_y f_XY(x, y).
Similarly,
f_Y(y) = Σ_x f_XY(x, y).
Given the joint pmf we can calculate probabilities involving both random variables. For example,
∫∫_A f_XY(x, y) dx dy = Pr({ω ∈ Ω | (X(ω), Y(ω)) ∈ A}),
for all sets A.
Example 2. Pick a point randomly in the triangle with vertices (0, 0), (0, 1) and (1, 0). Let X be the x-coordinate and Y be the y-coordinate. A reasonable choice for the joint pdf appears to be
f_XY(x, y) = c
for {(x, y) | 0 < x < 1, 0 < y < 1, x + y < 1} and zero otherwise (that is, constant on the domain). What is the appropriate value for c? Since we know that the probability of the entire sample space is equal to one, we can write
1 = ∫∫_{{(x, y) | 0 < x < 1, 0 < y < 1, x + y < 1}} f_XY(x, y) dx dy = ∫₀¹ ∫₀^(1−y) c dx dy = c/2.
Hence c = 2.
Another way of writing such integrals is
∫₀¹ ∫₀^(1−y) g(x, y) dx dy = ∫₀¹ ∫₀^(1−x) g(x, y) dy dx,
depending on the order of integration, without changing the result. In such a case the marginal pdfs are defined as
f_X(x) = ∫_(−∞)^∞ f_XY(x, y) dy = ∫₀^(1−x) 2 dy = 2 − 2x,
for 0 < x < 1 and zero otherwise. Similarly,
f_Y(y) = 2 − 2y, for 0 < y < 1, and zero otherwise.
In the joint random variable case, expectations are calculated in the same way as before, but now using double sums or integrals:
E[g(X, Y)] = ∫∫ g(x, y) f_XY(x, y) dy dx.
For the triangle example,
E(XY) = ∫₀¹ ∫₀^(1−x) x y · 2 dy dx = ∫₀¹ x (1 − x)² dx = ∫₀¹ (x³ − 2x² + x) dx
      = [x⁴/4 − (2/3)x³ + x²/2]₀¹ = 1/12.
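The added sketch below (using sympy, an assumed tool) verifies both the normalisation c = 2 and the value E(XY) = 1/12 for the triangle example:

```python
import sympy as sp

x, y = sp.symbols("x y", positive=True)
f = 2                                          # joint pdf on the triangle x + y < 1

total = sp.integrate(sp.integrate(f, (y, 0, 1 - x)), (x, 0, 1))
exy = sp.integrate(sp.integrate(x * y * f, (y, 0, 1 - x)), (x, 0, 1))

print(total)    # 1    (the density integrates to one)
print(exy)      # 1/12, matching the calculation above
```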
provided Pr(E₂) > 0.
Case of bivariate discrete random variables.
Definition 4. Given two discrete random variables X and Y with joint probability mass function f_XY(x, y), the conditional probability mass function for X given Y = y is
f_X|Y(x | y) = f_XY(x, y) / f_Y(y),
provided f_Y(y) > 0.
= 0 elsewhere.
Note that this is the same as the marginal pmf of X; this will not be true if we condition on Y = 2. In that case we condition on ω ∈ {HHH, THH}, so that
f_X|Y(x | Y = 2) = 1/2  if x = 1  (ω ∈ {THH}),
                 = 1/2  if x = 2  (ω ∈ {HHH}),
                 = 0    elsewhere.
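The counting sketch below is an added illustration consistent with the ω-sets above; it assumes that X counts heads in the first two of three fair tosses and Y counts heads in the last two.

```python
from fractions import Fraction
from itertools import product

omega = list(product("HT", repeat=3))          # three fair coin tosses
X = lambda w: w[:2].count("H")                 # heads in the first two tosses (assumed)
Y = lambda w: w[1:].count("H")                 # heads in the last two tosses (assumed)

def cond_pmf_x_given_y(y):
    """Conditional pmf of X given Y = y, by counting outcomes in the conditioning set."""
    B = [w for w in omega if Y(w) == y]
    return {x: Fraction(sum(1 for w in B if X(w) == x), len(B)) for x in range(3)}

for x, prob in cond_pmf_x_given_y(2).items():
    print(x, prob)                             # 0 -> 0, 1 -> 1/2, 2 -> 1/2, as above
```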
Definition 5. (Continuous r.v.) Given two continuous random variables X and Y with joint probability density function f_XY(x, y), the conditional probability density function for X given Y = y is obtained by conditioning on the event y < Y < y + Δ:
Pr(X ≤ x | y < Y < y + Δ) = ∫_(−∞)^x ∫_y^(y+Δ) f_XY(u, v) dv du / ∫_(−∞)^∞ ∫_y^(y+Δ) f_XY(u, v) dv du.
Now take the limit as Δ → 0. In that case both numerator and denominator go to zero, so to get the limit we have to use L'Hôpital's rule and differentiate both the numerator and denominator with respect to Δ. In that case we get
Pr(X ≤ x | Y = y) = ∫_(−∞)^x f_XY(u, y) du / f_Y(y).
Take the derivative with respect to x to get the conditional probability density function:
f_X|Y(x | y) = f_XY(x, y) / f_Y(y).
Strictly speaking, this does not condition directly on the event Y = y, which has probability zero. Its interpretation is that of the limit of the conditional distributions, conditioning on y < Y < y + Δ and taking the limit as Δ goes to zero.
Definition 6. Let (X, Y) be a bivariate random vector with pmf/pdf f_XY(x, y). Then the random variables X and Y are independent if
f_XY(x, y) = f_X(x) · f_Y(y) for all x and y.
If X and Y are independent, then we can choose g(x) = f_X(x) and h(y) = f_Y(y), and the result is trivial. Now suppose we can factor the joint density of X and Y as f_XY(x, y) = g(x) · h(y). Then the marginal density of X is
Similarly
for all −∞ < x < ∞ and −∞ < y < ∞. The indicator function 1{A} is equal to one if the argument is true and zero if it is false. The indicator function 1{x + y < 1} cannot be separated into a function of x and a function of y, and therefore X and Y are not independent.
Let there be two random variables X and Y with means μ_X and μ_Y and variances σ²_X and σ²_Y respectively. Then the covariance of X and Y is defined as
C(X, Y) = E[(X − μ_X) · (Y − μ_Y)],
and the correlation coefficient as
ρ_XY = C(X, Y) / (σ_X σ_Y).
If the two random variables are independent, the covariance is zero (note that the reverse is not necessarily true), and the variance of the sum simplifies to the sum of the variances, σ²_X + σ²_Y.
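The added sketch below works through a standard illustration of the caveat that zero covariance does not imply independence (the example itself, Y = X² with X uniform on {−1, 0, 1}, is an addition and not from this unit):

```python
from fractions import Fraction

# X uniform on {-1, 0, 1} and Y = X**2; support points each with probability 1/3
support = [(-1, 1), (0, 0), (1, 1)]
p = Fraction(1, 3)

ex = sum(x * p for x, _ in support)                     # E(X) = 0
ey = sum(y * p for _, y in support)                     # E(Y) = 2/3
cov = sum((x - ex) * (y - ey) * p for x, y in support)  # covariance C(X, Y)
print(cov)                                              # 0

# yet X and Y are dependent: P(X = 1, Y = 0) = 0 while P(X = 1) * P(Y = 0) = 1/9
print(Fraction(0), p * Fraction(1, 3))
```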
To see how tricky conditional probability density functions can be with continuous random variables, consider the Borel-Kolmogorov paradox. Here the joint density of two random variables X and Y is given, and a third variable Z is constructed so that the event Z = 0 coincides with the event Y = 1. Take the limit as z ↓ 0 and z ↑ 0 to get in both cases f_Z(0) = e^(−1).
The conditional density of X given Z = 0 is then computed for x > 0. The paradox is that Z = 0 implies, and is implied by, Y = 1, yet the conditional density of X given Z = 0 differs from that of X given Y = 1.
The above results are due to the way the limit is taken. In one case we take the limit for 1 < Y < 1 + δ. In the other case we take the limit 0 < Z < δ. The latter is the same as the limit 1 < Y < 1 + δX, which is clearly different from the first limit. For fixed δ the latter conditioning set obviously includes more large values of X.
V(X + Y) = V(X) + V(Y).
More generally, if X₁, ..., X_N are N independent random variables, we have mean
E(Σᵢ₌₁ᴺ Xᵢ) = Σᵢ₌₁ᴺ μᵢ
and variance
V(Σᵢ₌₁ᴺ Xᵢ) = Σᵢ₌₁ᴺ σᵢ².
Note that we only use the independence for the variance. Without independence the mean is unchanged, but the variance becomes
V(Σᵢ₌₁ᴺ Xᵢ) = Σᵢ₌₁ᴺ σᵢ² + Σᵢ Σ_{j≠i} C(Xᵢ, Xⱼ).
Independence ensures that all the covariance terms in the double sum are equal to zero.
Suppose all random variables are independent and have the same mean μ and variance σ². Then the sum has mean Nμ and variance Nσ², so that the average X̄_N = Σᵢ₌₁ᴺ Xᵢ/N has mean μ and variance σ²/N, and by Chebyshev's inequality
Pr(|X̄_N − μ| ≥ ε) ≤ σ²/(Nε²),
which can be made arbitrarily small for fixed ε by choosing N large enough. It would seem uncontroversial to say that in this case X̄_N converges to μ.
In other cases this result is not so clear. For example, consider the following sequence of random variables X₁, X₂, ... with the pdf of Xₙ equal to
f_Xₙ(x) = (1 − 1/n) · n/2  for −1/n < x < 1/n,
        = 1/n              for n < x < n + 1,
        = 0                elsewhere.
The mean of Xₙ is 1 + 1/(2n). The variance increases with n and approaches infinity as n goes to infinity. However, the probability that Xₙ is more than ε away from zero is at most 1/n (once 1/n < ε). Thus, Xₙ converges in probability to zero and does not converge to its asymptotic mean of 1.
This example demonstrates the need for different concepts of convergence. We consider three such concepts.
Definition 1. A sequence of random variables X₁, X₂, ... converges to μ in probability if for all ε > 0,
lim_{N→∞} Pr(|X_N − μ| > ε) = 0.
Note that in the first example the convergence is clearly in quadratic mean and in probability. The independence implies that it is also convergence almost surely. The following relations hold between the different convergence concepts:
(i) Convergence in quadratic mean implies convergence in probability.
(ii) Convergence almost surely implies convergence in probability.
(iii) Convergence in quadratic mean does not imply, and is not implied by, convergence almost surely.
Difference between convergence in quadratic mean and convergence almost surely
Consider the following sequence of random variables, defined as Xₙ(ω) for ω ∈ Ω = [0, 1], with the probability of ω lying in some interval (a, b) equal to b − a for 0 ≤ a ≤ b ≤ 1:
X₁(ω) = 1 for ω ∈ [0, 1] and zero elsewhere,
X₂(ω) = 1 for ω ∈ [0, 1/2] and zero elsewhere,
X₃(ω) = 1 for ω ∈ [1/2, 1/2 + 1/3] and zero elsewhere,
X₄(ω) = 1 for ω ∈ [1/2 + 1/3, 1] ∪ [0, 1/12] and zero elsewhere,
X₅(ω) = 1 for ω ∈ [1/12, 1/12 + 1/5] and zero elsewhere,
and so on. For Xₙ(ω) the intervals where Xₙ(ω) is equal to one have length 1/n, so that the probability of the event {Xₙ = 1} is 1/n. The intervals shift to the right till they hit 1, and then start over again at 0. Then the mean is 1/n, and the variance is 1/n − 1/n². Both of these go to zero, so that we have convergence to zero in quadratic mean and in probability.
Now consider a particular value of ω and the sequence of values X₁(ω), X₂(ω), .... This sequence does not converge. No matter how large n, the sum Σ_{i=n}^∞ 1/i diverges, implying that the sequence is always going to return to 1. Hence the probability of an ω such that the limit of Xₙ(ω) even exists, let alone equals zero, is zero, and not one as required by almost sure convergence.
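The added simulation sketch below mimics this moving-interval sequence: the probability that Xₙ = 1 is 1/n, yet for a fixed ω the sequence keeps returning to 1, so there is no pointwise (almost sure) convergence.

```python
import random

def x_n(n, omega, start):
    """Indicator of the n-th interval [start, start + 1/n), wrapped around [0, 1)."""
    a, b = start % 1.0, (start + 1.0 / n) % 1.0
    return int(a <= omega < b) if a < b else int(omega >= a or omega < b)

random.seed(6)
omega = random.random()                    # a fixed sample point in [0, 1)
start, hits_late = 0.0, 0
for n in range(1, 30_001):
    if n > 10_000 and x_n(n, omega, start):
        hits_late += 1                     # X_n(omega) still equals one for large n
    start += 1.0 / n                       # intervals of length 1/n sweep across [0, 1)
print(hits_late)                           # positive: the sequence keeps returning to 1
```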
With these convergence concepts we can formulate laws of large numbers.
Result 1. Let X₁, X₂, ... be a sequence of independent and identically distributed random variables with common mean μ and variance σ². Let X̄_N = Σᵢ₌₁ᴺ Xᵢ/N be the average up to the N-th random variable. Then
X̄_N →^(q.m.) μ.
The result implies convergence in probability as well, as we already showed using Chebyshev's inequality.
A second result produces an even stronger conclusion of almost sure convergence. Note that in this case convergence in quadratic mean does not necessarily hold, because the variance does not necessarily exist.
Result 2. Let X₁, X₂, ... be a sequence of independent and identically distributed random variables with common mean μ. Let X̄_N = Σᵢ₌₁ᴺ Xᵢ/N be the average up to the N-th random variable. Then
X̄_N →^(a.s.) μ.
Definition 4. A sequence of random variables X₁, X₂, ... converges in distribution to a random variable Y if, at all continuity points of F_Y(y),
lim_{N→∞} F_{X_N}(y) = F_Y(y).
Result 5. If a sequence of random variables X_N satisfies √N(X_N − μ) →^d N(0, σ²), then for a differentiable function g(·) with g′(μ) ≠ 0,
√N(g(X_N) − g(μ)) →^d N(0, g′(μ)² σ²).
The last result is known as the delta method.
1) i) cumulant generating function; ii) characteristic function
..........................................................................................
..........................................................................................
2) Explain the meaning of Chebyshev's inequality.
..........................................................................................
..........................................................................................
1.8 LET US SUM UP
This unit reviews the important concepts of probability which find application in actuarial problems. The basic set operations, union and intersection, are given to help describe possible outcomes in model building where a component of uncertainty forms a part. The ideas like universe, empty set and set of events that are required for a probability space have been documented. The concepts of conditional probability and independence have also been included and their use in the context of random events, discrete and continuous, has been highlighted. The uniform, exponential and Gaussian distributions have been used to explain the transformation of distribution functions. Use of the Poisson distribution to count events has been pointed out. Expectation of a random variable, formulated for discrete and continuous cases, has been examined. Use of pmf and pdf has been considered. Some special distributions, viz., Gamma, Chi-squared, normal and Cauchy, along with the exponential family of distributions, have been included. The joint and conditional distributions of random variables, their independence, and the convergence and limit laws constitute the last part of the discussion in the unit.
intuition: that conditional probability density functions are not invariant under coordinate transformations.
Characteristic Function: Function that takes the value 1 for numbers in the set and 0 for numbers not in the set, i.e., 1_A(x) = 1 if x ∈ A and 1_A(x) = 0 otherwise.
Probability Mass Function: A function f(x) is called a probability mass function (p.m.f.) given that it satisfies the following two conditions: 1) f(xᵢ) ≥ 0; 2) Σᵢ f(xᵢ) = 1.
F′(x) =
and
P(X ≤ a) ≤ exp(−at) · M_X(t),  −h < t < 0.
Hint: Use Chebyshev's inequality.
10) Let X be a discrete random variable with P(X = −1) = 1/8, P(X = 0) = 6/8 and P(X = 1) = 1/8, and P(X = c) = 0 for all other values of c. Calculate the bound on P(|X − μ_X| ≥ kσ_X) for k = 2 using Chebyshev's inequality.