Lecture Notes On Actuarial Mathematics: Jerry Alan Veeh
The objective of these notes is to present the basic aspects of the theory of
insurance, concentrating on the part of this theory related to life insurance. An
understanding of the basic principles underlying this part of the subject will form a
solid foundation for further study of the theory in a more general setting.
Throughout these notes are various exercises and problems. The reader should
attempt to work all of these.
The central theme of these notes is embodied in the question, “What is the value
today of a random sum of money which will be paid at a random time in the future?”
Such a random payment is called a contingent payment.
The other central consideration in the theory of insurance is the time value of
money. Both claims and premium payments occur at various, possibly random,
points of time in the future. Since the value of a sum of money depends on the
point in time at which the funds are available, a method of comparing the value
of sums of money which become available at different points of time is needed.
This methodology is provided by the theory of interest. The theory of interest will
be studied first in a non-random setting in which all payments are assumed to be
sure to be made. Then the theory will be developed in a random environment, and
will be seen to provide a complete framework for the understanding of contingent
payments.
§2. Elements of the Theory of Interest
A typical part of most insurance contracts is that the insured pays the insurer
a fixed premium on a periodic (usually annual or semi–annual) basis. Money has
time value, that is, $1 in hand today is more valuable than $1 to be received one year
hence. A careful analysis of insurance problems must take this effect into account.
The purpose of this section is to examine the basic aspects of the theory of interest.
A thorough understanding of the concepts discussed here is essential.
Exercise 2–1. What is the effective rate of interest corresponding to an interest rate
of 5% compounded quarterly?
It is possible that two different investment schemes with two different nominal
annual rates of interest may in fact be equivalent, that is, may have equal dollar
value at any fixed date in the future. This possibility is illustrated by means of an
example.
Example 2–1. Suppose I have the opportunity to invest $1 in Bank A which pays
5% interest compounded monthly. What interest rate does Bank B have to pay,
compounded daily, to provide an equivalent investment? At any time t in years the
amount in the two banks is given by \(\left(1 + \frac{0.05}{12}\right)^{12t}\) and \(\left(1 + \frac{i}{365}\right)^{365t}\) respectively. It
is now an easy exercise to find the nominal interest rate i which makes these two
functions equal.
Exercise 2–2. Find the interest rate i. What is the effective rate of interest?
Situations in which interest is compounded more often than annually will arise
frequently. Some notation will be needed to discuss these situations conveniently.
Denote by \(i^{(m)}\) the nominal annual interest rate compounded m times per year which
is equivalent to the interest rate i compounded annually. This means that
\[ \left(1 + \frac{i^{(m)}}{m}\right)^{m} = 1 + i. \]
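The defining relation is easy to check numerically. The following Python sketch is only an illustration; the function names `effective_from_nominal` and `nominal_from_effective` are hypothetical helpers and are not notation used in these notes.

```python
def effective_from_nominal(nominal, m):
    """Effective annual rate i equivalent to a nominal annual rate compounded m times per year."""
    return (1 + nominal / m) ** m - 1


def nominal_from_effective(i, m):
    """Nominal annual rate i^(m), compounded m times per year, equivalent to effective rate i."""
    return m * ((1 + i) ** (1 / m) - 1)


# 5% compounded quarterly, as in Exercise 2-1.
i_quarterly = effective_from_nominal(0.05, 4)
print(f"effective rate for 5% compounded quarterly: {i_quarterly:.6f}")

# Example 2-1: daily-compounded nominal rate equivalent to 5% compounded monthly.
bank_a_effective = effective_from_nominal(0.05, 12)
bank_b = nominal_from_effective(bank_a_effective, 365)
print(f"Bank B nominal rate: {bank_b:.5f}")
```

Passing through the effective annual rate is a design choice of this sketch: any two nominal rates are compared by converting both to the same annual basis.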
The converse of the problem of finding the amount after n years at compound
interest is as follows. Suppose the objective is to have an amount A n years hence.
If money can be invested at interest rate i, how much should be deposited today
in order to achieve this objective? It is readily seen that the amount required is
\(A(1 + i)^{-n}\). This quantity is called the present value of A. The factor \((1 + i)^{-1}\) is
often called the discount factor and is denoted by v.
Example 2–2. Suppose the annual interest rate is 5%. What is the present value
of a payment of $2000 payable in the year 2011? The present value (in 2001) is
\(\$2000(1 + 0.05)^{-10} = \$1227.83\).
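The computation in this example can be reproduced with a few lines of Python. This is a minimal sketch; the function name `present_value` is a hypothetical helper introduced only for illustration.

```python
def present_value(amount, i, n):
    """Present value of `amount` payable n years from now at annual effective rate i."""
    v = 1 / (1 + i)  # the discount factor
    return amount * v ** n


pv = present_value(2000, 0.05, 10)
print(f"{pv:.2f}")  # prints 1227.83
```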
The notion of present value is used to move payments of money through time in
order to simplify the analysis of a complex sequence of payments. In the simple case
of the last example the important idea is this. Suppose you were given the following
choice. You may either receive $1227.83 today or you may receive $2000 in the
year 2011. If you can earn 5% on your money (compounded annually) you should
be indifferent between these two choices. Under the assumption of an interest rate
of 5%, the payment of $2000 in 2011 can be replaced by a payment of $1227.83
today. Thus the payment of $2000 can be moved through time using the idea of
present value. A visual aid that is often used is that of a time diagram which shows
the time and amounts that are paid. Under the assumption of an interest rate of 5%,
the following two diagrams are equivalent.
[Time diagrams: a payment of $2000 in the year 2011, and the equivalent payment of $1227.83 today.]
The advantage of moving amounts of money through time is that once all
amounts are paid at the same point in time, the most favorable option is readily
apparent.
Exercise 2–6. What happens in comparing these cash flows if the interest rate is
6% rather than 5%?
In an interest payment setting, the payment of interest of i at the end of the period
is equivalent to the payment of d at the beginning of the period. Such a payment
at the beginning of a period is called a discount. What relationship between i and
d must hold for a discount payment to be equivalent to the interest payment? The
time diagram is as follows.
[Time diagrams: on the left, interest i paid at time 1; on the right, discount d paid at time 0.]
Exercise 2–7. Denote by \(d^{(m)}\) the rate of discount payable m times per year that is
equivalent to a nominal annual rate of interest i. What is the relationship between
\(d^{(m)}\) and i? Between \(d^{(m)}\) and \(i^{(m)}\)? Hint: Draw the time diagram illustrating the two
payments made at time 0 and 1/m.
Exercise 2–8. Treasury bills (United States debt obligations) pay discount rather
than interest. At a recent sale the discount rate for a 3 month bill was 5%. What is
the equivalent rate of interest?
The notation and the relationships thus far are summarized in the string of
equalities
\[ 1 + i = \left(1 + \frac{i^{(m)}}{m}\right)^{m} = \left(1 - \frac{d^{(m)}}{m}\right)^{-m} = v^{-1} = e^{\delta}. \]
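The string of equalities can be verified numerically for any particular rate. The sketch below assumes an annual effective rate of 5% and m = 4, both chosen arbitrarily; the function name `equivalent_rates` and the dictionary keys are hypothetical.

```python
import math


def equivalent_rates(i, m):
    """Rates equivalent to the annual effective interest rate i."""
    v = 1 / (1 + i)                      # discount factor
    d = i * v                            # annual rate of discount (d = iv)
    delta = math.log(1 + i)              # force of interest, from 1 + i = e^delta
    i_m = m * ((1 + i) ** (1 / m) - 1)   # nominal interest rate, m times per year
    d_m = m * (1 - v ** (1 / m))         # nominal discount rate, m times per year
    return {"v": v, "d": d, "delta": delta, "i_m": i_m, "d_m": d_m}


r = equivalent_rates(0.05, 4)

# Each expression in the string of equalities should reproduce 1 + i = 1.05.
checks = [
    (1 + r["i_m"] / 4) ** 4,
    (1 - r["d_m"] / 4) ** -4,
    1 / r["v"],
    math.exp(r["delta"]),
]
print(checks)
```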
Problems
\[ d < d^{(2)} < d^{(3)} < \cdots < \delta < \cdots < i^{(3)} < i^{(2)} < i. \]
Problem 2–3. Calculate the nominal rate of interest convertible once every 4 years
that is equivalent to a nominal rate of discount convertible quarterly.
Problem 2–4. Interest rates are not always the same throughout time. In theoretical
studies such scenarios are usually modelled by allowing the force of interest to
depend on time. Consider the situation in which $1 is invested at time 0 in an
account which pays interest at a constant force of interest δ. What is the amount
A(t) in the account at time t? What is the relationship between A′(t) and A(t)? More
generally, suppose the force of interest at time t is δ(t). Argue that A′(t) = δ(t)A(t),
and solve this equation to find an explicit formula for A(t) in terms of δ(t) alone.
Problem 2–6. Show that d = iv. Is there a similar equation involving \(d^{(m)}\) and \(i^{(m)}\)?
Solutions to Problems
Problem 2–1. An analytic argument is possible directly from the formulas.
For example, \((1 + i^{(m)}/m)^{m} = 1 + i = e^{\delta}\) so \(i^{(m)} = m(e^{\delta/m} - 1)\). Consider m as a
continuous variable and show that the right hand side is a decreasing function
of m for fixed i. Can you give a purely verbal argument? Hint: How does
an investment with nominal rate \(i^{(2)}\) compounded annually compare with an
investment at nominal rate \(i^{(2)}\) compounded twice a year?
Problem 2–2. Since \(i^{(m)} = m((1 + i)^{1/m} - 1)\) the limit can be evaluated directly
using L'Hôpital's rule, Maclaurin expansions, or the definition of derivative.
Problem 2–3. The relevant equation is \(\left(1 + 4i^{(1/4)}\right)^{1/4} = \left(1 - d^{(4)}/4\right)^{-4}\).
Problem 2–4. In the constant force setting \(A(t) = e^{\delta t}\) and \(A'(t) = \delta A(t)\). The
equation \(A'(t) = \delta(t)A(t)\) can be solved by separation of variables.
Exercise 2–2. Taking tth roots of both sides of the equation shows that t plays
no role in determining i and leads to the equation \(i = 365\left((1 + 0.05/12)^{12/365} - 1\right) \approx 0.04990\).
Exercise 2–6. The present value in this case is \(\$2000(1 + 0.06)^{-10} = \$1116.79\).
Exercise 2–8. The given information is \(d^{(4)} = 0.05\), from which i can be
obtained using the formula of the previous exercise as \(i = (1 - 0.05/4)^{-4} - 1 = 0.0516\).
§3. Annuities Certain
Example 3–1. Suppose you have the opportunity to buy an annuity, that is, for a
certain amount paid by you today you will receive monthly payments of $400, say,
for the next 20 years. How much is this annuity worth to you? Suppose that the
payments are to begin one month from today. Such an annuity is called an annuity
immediate (a truly unfortunate choice of terminology). It is useful to visualize the
cash stream represented by the annuity on a time diagram.
[Time diagram: An Annuity Immediate]
\[ \sum_{j=1}^{240} \left(1 + \frac{0.05}{12}\right)^{-j} 400. \]
This sum is simply the sum of the present value of each of the payments using the
indicated interest rate. It is easy to find this sum since it involves a very simple
geometric series.
Since expressions of this sort occur rather often, actuaries have developed some
special notation for this sum. Write \(a_{\overline{n}|}\) for the present value of an annuity which
pays $1 at the end of each period for n periods. Then
\[ a_{\overline{n}|} = \sum_{j=1}^{n} v^{j} = \frac{1 - v^{n}}{i} \]
where the last equality follows from the summation formula for a geometric series.
Copyright 2001 Jerry Alan Veeh. All rights reserved.
The interest rate per period is usually not included in this notation, but when such
information is necessary the notation is \(a_{\overline{n}|i}\). The present value of the annuity in the
previous example may thus be expressed as \(400\,a_{\overline{240}|0.05/12}\).
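The closed form for the present value of an annuity immediate can be compared with the term-by-term sum of discounted payments. The following is a sketch only; the function name `annuity_immediate` is a hypothetical helper.

```python
def annuity_immediate(n, i):
    """Present value of 1 paid at the end of each of n periods, at rate i per period."""
    v = 1 / (1 + i)
    return (1 - v ** n) / i


n, i = 240, 0.05 / 12

# Sum the present values of the 240 individual unit payments directly.
direct_sum = sum((1 + i) ** -j for j in range(1, n + 1))
print(direct_sum, annuity_immediate(n, i))

# Value of the annuity of Example 3-1: $400 per month for 20 years.
print(f"{400 * annuity_immediate(n, i):.2f}")
```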
A slightly different annuity is the annuity due which is an annuity in which the
payments are made starting immediately. The notation \(\ddot{a}_{\overline{n}|}\) denotes the present value
of an annuity which pays $1 at the beginning of each period for n periods. Clearly
\[ \ddot{a}_{\overline{n}|} = \sum_{j=0}^{n-1} v^{j} = \frac{1 - v^{n}}{d} \]
where again the last equality follows by summing the geometric series. Note that n
still refers to the number of payments. If the present time is denoted by time 0, then
for an annuity immediate the last payment is made at time n, while for an annuity
due the last payment is made at time n − 1, that is, the beginning of the nth period.
It is quite evident that \(a_{\overline{n}|} = v\,\ddot{a}_{\overline{n}|}\), and there are many other similar relationships.
The connection between an annuity due and an annuity immediate can be viewed
in the following way. In an annuity due the payment for the period is made at the
beginning of the period, whereas for an annuity immediate the payment for the
period is made at the end of the period. Clearly a payment of 1 at the end of the
period is equivalent to the payment of v = 1/ (1 + i) at the beginning of the period.
This gives an intuitive description of the equality of the previous exercise.
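The relationship between the two kinds of annuity can be checked numerically. In this sketch the values n = 10 and i = 5% are arbitrary, and both function names are hypothetical helpers.

```python
def annuity_immediate(n, i):
    """Present value of 1 paid at the end of each of n periods."""
    v = 1 / (1 + i)
    return (1 - v ** n) / i


def annuity_due(n, i):
    """Present value of 1 paid at the beginning of each of n periods."""
    v = 1 / (1 + i)
    d = i * v  # rate of discount equivalent to i
    return (1 - v ** n) / d


n, i = 10, 0.05
v = 1 / (1 + i)

# Moving each payment of the annuity due one period later multiplies its value by v.
print(annuity_immediate(n, i), v * annuity_due(n, i))
```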
Example 3–2. Suppose that the annuity is paid continuously, that is, that the annu-
itant receives money at a constant rate of σ dollars per unit time. What value of σ
makes this continuous 20 year annuity equivalent to the discrete annuity described
above? Two annuities are said to be equivalent if they have the same present value.
For the continuous annuity the present value of the σ dt dollars received in the time
interval (t, t + dt) is \(e^{-\delta t}\sigma\,dt\). The present value of this annuity is therefore
\[ \int_{0}^{20} \sigma e^{-\delta t}\,dt. \]
Exercise 3–3. Find σ . Note that you must first find δ which is equivalent to an
interest rate of 5% compounded monthly.
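One way to organize this computation is sketched below. It assumes that time is measured in years, so that δ is an annual force of interest and σ is an annual payment rate; the variable names are illustrative only.

```python
import math

i_monthly = 0.05 / 12

# Annual force of interest equivalent to 5% compounded monthly:
# e^delta must equal (1 + 0.05/12)^12.
delta = 12 * math.log(1 + i_monthly)

# Present value of the discrete annuity: 240 monthly payments of 400.
pv_discrete = 400 * (1 - (1 + i_monthly) ** -240) / i_monthly

# Present value of a continuous annuity paying at rate 1 per year for 20 years,
# that is, the integral of exp(-delta * t) dt from 0 to 20.
pv_unit_continuous = (1 - math.exp(-20 * delta)) / delta

# Equating present values determines the continuous payment rate.
sigma = pv_discrete / pv_unit_continuous
print(f"delta = {delta:.6f}, sigma = {sigma:.2f} per year")
```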
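Exercise 3–4. Show that \(\bar{a}_{\overline{n}|} = \dfrac{1 - v^{n}}{\delta}\), where \(\bar{a}_{\overline{n}|}\) denotes the present value of an annuity paid continuously at the rate of $1 per period for n periods.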
Thus far the value of an annuity has been computed at time 0. Another common
time point at which the value of an annuity consisting of n payments of 1 is
computed is time n. Denote by \(s_{\overline{n}|}\) the value of an annuity immediate at time n, that
is, immediately after the nth payment. Then \(s_{\overline{n}|} = (1 + i)^{n} a_{\overline{n}|}\) from the time diagram.
The value \(s_{\overline{n}|}\) is called the accumulated value of the annuity immediate. Similarly
\(\ddot{s}_{\overline{n}|}\) is the accumulated value of an annuity due at time n and \(\ddot{s}_{\overline{n}|} = (1 + i)^{n} \ddot{a}_{\overline{n}|}\).
Here are two examples which further develop skill in the use of these ideas.
Example 3–3. You are going to buy a house for which the purchase price is $100,000
and the downpayment is $20,000. You will finance the $80,000 by borrowing this
amount from a bank at 10% interest with a 30 year term. What is your monthly
payment? Typically such a loan is amortized, that is, you will make equal monthly
payments for the life of the loan and each payment consists partially of interest and
partially of principal. From the bank's point of view this transaction represents the
purchase by the bank of an annuity immediate. The monthly payment, p, is thus the
solution of the equation \(80000 = p\,a_{\overline{360}|0.10/12}\). In this setting the quoted interest rate
on the loan is assumed to be compounded at the same frequency as the payment
period unless stated otherwise.
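The equation for the monthly payment can be solved directly. The sketch below follows the convention just stated, so the monthly rate is taken to be 0.10/12; the function and variable names are hypothetical.

```python
def annuity_immediate(n, i):
    """Present value of 1 paid at the end of each of n periods, at rate i per period."""
    v = 1 / (1 + i)
    return (1 - v ** n) / i


loan = 80000
monthly_rate = 0.10 / 12   # quoted rate compounded at the payment frequency
n_payments = 360           # 30 years of monthly payments

# The bank pays `loan` today for an annuity immediate of `payment` per month.
payment = loan / annuity_immediate(n_payments, monthly_rate)
print(f"monthly payment: {payment:.2f}")
print(f"total of payments: {n_payments * payment:.2f}")
```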
Exercise 3–5. Find the monthly payment. What is the total amount of the payments
made?
Example 3–4. Long ago I bought a new car from a local dealer. Let us say the total
cost to me was $15,000. The dealer seemed very anxious that I finance the purchase
through him, and he presented several arguments as to why I should do so. I could
borrow the entire purchase price at 11.95% amortized over 60 months. His first
sales pitch was as follows. If I financed the car I would pay about $5000 in interest.
If I paid cash I would lose about $8000 in interest that I could earn by investing my
money in a savings account at 8.5% interest. Thus I would gain almost $3000 by
financing the car. Is this argument correct? If not, what’s wrong with it?
Exercise 3–6. Find the monthly payment on the car if it is financed through the
dealer. What is the total interest paid? Is this a relevant fact?
The dealer's second argument ran as follows. Suppose that at the end of 5
years the car is worth only half its present value, namely, $7500. Let’s analyze my
available assets under the two alternatives. If I pay cash for the car, at the end of
5 years I will have clear title to a $7500 asset. If I finance the car, at the end of 5
years I will have clear title to the car ($7500) plus the cash I did not pay originally
($15000) plus the interest on this cash ($8000) for a total of $30500. Obviously
only a fool would pass up this type of opportunity!
Exercise 3–7. What are the flaws, if any, in this second argument?
Problems
Problem 3–1. Show that \(a_{\overline{n}|} < \bar{a}_{\overline{n}|} < \ddot{a}_{\overline{n}|}\). Hint: This should be obvious from the
picture.
Problem 3–2. John borrows $1,000 from Jane at an annual effective rate of interest
i. He agrees to pay back $1,000 after six years and $1,366.87 after another 6 years.
Three years after his first payment, John repays the outstanding balance. What is
the amount of John’s second payment?
Problem 3–4. There are two common ways of analyzing loans which are amortized.
In the prospective method the loan balance at any point in time is seen to be the
present value of the remaining loan payments. Use the previous problem to show
that this statement is correct.
Problem 3–5. In the retrospective method the loan balance at any point in time is
seen to be the accumulated original loan amount less the accumulated value of the
past loan payments. Show that this formula for the loan balance is correct.
Problem 3–6. An annuity immediate pays an initial benefit of one per year, in-
creasing by 10.25% every four years. The annuity is payable for 40 years. If the
effective interest rate is 5% find an expression for the present value of this annuity.
Problem 3–7. You are given an annuity immediate paying $10 for 10 years, then
decreasing by $1 per year for nine years and paying $1 per year thereafter, forever.
If the annual effective rate of interest is 5%, find the present value of this annuity.
Problem 3–2. From Jane’s point of view the equation \(1000 = 1000(1 + i)^{-6} + 1366.87(1 + i)^{-12}\) must hold. The outstanding balance at the indicated time is
\(1366.87(1 + i)^{-3}\), which is the amount of the second payment.
Problem 3–5. Here the assertion is that \(b_k = A(1 + i)^{k} - P s_{\overline{k}|}\). Show that this
choice satisfies the required recursion and initial condition.
Problem 3–6. Each 4 year chunk is a simple annuity immediate. Taking the
present value of these chunks forms an annuity due with payments every 4 years
that are increasing.
Problem 3–7. What is the present value of an annuity immediate paying $1 per
year forever? What is the present value of such an annuity that begins payments
k years from now? The annuity described here is the difference of a few of these.
Problem 3–8. The initial monthly payment P is the solution of \(100{,}000 = P a_{\overline{360}|}\). The balance after 10 years is \(P a_{\overline{240}|}\) so the interest paid in the first 10
years is \(120P - (100{,}000 - P a_{\overline{240}|})\). To determine the number of new monthly
payments required to repay the loan the equation \(P a_{\overline{240}|} = (P + 325) a_{\overline{x}|}\) should be
solved for x. Since after x payments the loan balance is 0 the amount of interest
paid in the second stage can then be easily determined.
Problem 3–9. Since the effective rate of interest for the insurance company is
5%, the rate (0.05)(2) should be used to move the insurance company’s expenses
from July 1 to January 1.
Solutions to Exercises
Exercise 3–1. The sum is the sum of the terms of a geometric series. So
\[ \sum_{j=1}^{240} \left(1 + \frac{0.05}{12}\right)^{-j} 400 = 400\,\frac{(1 + 0.05/12)^{-1} - (1 + 0.05/12)^{-241}}{1 - (1 + 0.05/12)^{-1}} = 60{,}610.13. \]
Exercise 3–2. This follows from the formulas for the present value of the two
annuities and the fact that d = iv.
Exercise 3–5. Using the earlier formula gives \(a_{\overline{360}|0.10/12} = 113.95\) from which
p = 702.06 and the total amount of the payments is 360p = 252740.61.
Exercise 3–6. The monthly payment is \(15{,}000 / a_{\overline{60}|0.1195/12} = 333.29\) so that the
total of the payments is 19,997.27 of which 4,997.27 is interest. The total of
the interest payments is irrelevant, since the time at which the interest payment
is made is not taken into account.
2. An amortization table is a table which lists the principal and interest portions
of each payment for a loan which is being amortized. Construct an amortization
table for the loan of the previous problem. The table should have four columns:
the payment number, the principal part of that payment, the interest part of that
payment, and the loan balance immediately after that payment is made.
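A table of this kind can be generated mechanically. The sketch below uses a purely hypothetical loan of 1000 at 1% per period repaid by 4 level payments, since the loan of the previous problem is not reproduced here; the function name `amortization_table` is likewise hypothetical.

```python
def amortization_table(principal, rate, n):
    """Rows of (payment number, principal part, interest part, balance after payment)."""
    v = 1 / (1 + rate)
    payment = principal * rate / (1 - v ** n)  # level payment: principal divided by a_n
    balance = principal
    rows = []
    for k in range(1, n + 1):
        interest = balance * rate            # interest accrued on the outstanding balance
        principal_part = payment - interest  # the rest of the payment reduces the balance
        balance -= principal_part
        rows.append((k, principal_part, interest, balance))
    return rows


# Hypothetical loan: 1000 at 1% per period, 4 payments.
for k, p, i, b in amortization_table(1000.0, 0.01, 4):
    print(f"{k:>2} {p:10.2f} {i:10.2f} {b:10.2f}")
```

The final balance should be zero up to rounding, which is a useful check on any table built this way.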
§5. Brief Review of Probability Theory
Another aspect of insurance is that money is paid by the company only if some
event, which may be considered random, occurs within a specific time frame. For
example, an automobile insurance policy will experience a claim only if there is an
accident involving the insured auto. In this section a brief outline of the essential
material from the theory of probability is given. Almost all of the material presented
here should be familiar to the reader. The concepts presented here will play a crucial
role in the rest of these notes.
In practice, the sample space of the experiment fades into the background and
one simply identifies the random variables of interest. Once a random variable has
been identified, one may ask about its values and their associated probabilities. All
of the interesting probability information is bound up in the distribution function of
the random variable. The distribution function of the random variable X, denoted
\(F_X(t)\), is defined by the formula \(F_X(t) = P[X \le t]\).
Two types of random variables are quite common. A random variable X with
distribution function \(F_X\) is discrete if \(F_X\) is constant except at at most countably
many jumps. A random variable X with distribution function \(F_X\) is absolutely
continuous if \(F_X(t) = \int_{-\infty}^{t} \frac{d}{ds} F_X(s)\,ds\) holds for all real numbers t.
Exercise 5–2. Sketch the distribution function of a Bernoulli random variable with
P[X = 1] = 1/ 3.
Another useful tool is the indicator function. Suppose A is a set. The indicator
function of the set A, denoted \(1_A(t)\), is defined by the equation
\[ 1_A(t) = \begin{cases} 1 & \text{if } t \in A \\ 0 & \text{if } t \notin A. \end{cases} \]
Exercise 5–7. Verify that the density of a random variable which is exponential
with parameter λ may be written \(\lambda e^{-\lambda x} 1_{(0,\infty)}(x)\).
Example 5–3. Random variables which are neither of the discrete nor absolutely
continuous type will arise frequently. As an example, suppose that a person has a fire
insurance policy on a house. The amount of insurance is $50,000 and there is a $250
deductible. Suppose that if there is a fire the amount of damage may be represented
by a random variable D which has the uniform distribution on the interval (0, 70000).
(This assumption means that the person is underinsured.) Suppose further that in the
time period under consideration there is a probability p = 0.001 that a fire will occur.
Let F denote the random variable which is 1 if a fire occurs and is 0 otherwise. It is
easy to see that the size X of the claim to the insurer in this setting is given by
\[ X = F\left[(D - 250)\,1_{[250,50250]}(D) + 50000\,1_{(50250,\infty)}(D)\right]. \]
This random variable X is neither discrete nor absolutely continuous.
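The formula for X translates directly into a small function. This is a sketch only; the function name `claim_size` and its parameter names are hypothetical, with default values taken from the example.

```python
def claim_size(fire, damage, deductible=250.0, limit=50000.0):
    """Claim paid by the insurer: damage above the deductible, capped at the amount of insurance.

    `fire` plays the role of the indicator F and `damage` the role of D.
    """
    if not fire or damage < deductible:
        return 0.0
    return min(damage - deductible, limit)


print(claim_size(0, 30000.0))   # no fire, so no claim
print(claim_size(1, 100.0))     # damage below the deductible
print(claim_size(1, 1250.0))    # damage between 250 and 50250
print(claim_size(1, 60000.0))   # damage above 50250: the claim is capped
```

Writing the claim as a minimum is equivalent to the two indicator terms: the first term applies when D − 250 is at most 50000 and the second when it exceeds 50000.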
Exercise 5–8. Verify the correctness of the formula for X. Find the distribution
function of the random variable X.
Often only the average value of a random variable and the spread of the
values around this average are all that are needed. The expectation (or mean)
of a discrete random variable X is defined by \(E[X] = \sum_{t} t\,f_X(t)\), while the
expectation of an absolutely continuous random variable X is defined by
\(E[X] = \int_{-\infty}^{\infty} t\,f_X(t)\,dt\). Notice that in both cases the sum (or integral) involves terms of the
form (possible value of X) × (probability X takes on that value). When X is neither
discrete nor absolutely continuous, the expectation is defined by the Riemann–Stieltjes
integral \(E[X] = \int_{-\infty}^{\infty} t\,dF_X(t)\), which again has the same form.
Exercise 5–9. Find the mean of a Bernoulli random variable Z with P[Z = 1] = 1/ 3.
Exercise 5–10. Find the mean of an exponential random variable with parameter
λ = 3.
Exercise 5–11. Find the mean and variance of the random variable in the fire
insurance example given above. (The variance of a random variable X is defined
by \(\mathrm{Var}(X) = E[(X - E[X])^{2}]\) and is often computed using the alternate formula
\(\mathrm{Var}(X) = E[X^{2}] - (E[X])^{2}\).)
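The alternate formula for the variance can be applied to the fire insurance example. The sketch below is derived from the formula for X given in that example and is not taken from the solutions in these notes. It assumes that, given a fire, the claim D − 250 has density 1/70000 on (0, 50000) and takes the value 50000 whenever D exceeds 50250. All names are hypothetical.

```python
p_fire = 0.001       # probability of a fire
d_max = 70000.0      # damage D is uniform on (0, d_max)
deductible = 250.0
limit = 50000.0      # amount of insurance

# Given a fire, probability that the claim equals the policy limit (D > 50250).
p_cap = (d_max - deductible - limit) / d_max


def moment(k):
    """E[X^k] for k >= 1; the jump of the distribution at 0 contributes nothing."""
    # Integral of x^k / d_max over (0, limit): the absolutely continuous part.
    continuous = limit ** (k + 1) / ((k + 1) * d_max)
    # limit^k times the probability of the jump at the policy limit.
    atom = limit ** k * p_cap
    return p_fire * (continuous + atom)


mean = moment(1)
variance = moment(2) - mean ** 2   # Var(X) = E[X^2] - (E[X])^2
print(mean, variance)
```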
The events A and B are independent if P[A | B] = P[A]. The intuition underlying
the notion of independent events is that the occurrence of one of the events does not
alter the probability that the other event occurs.
Problem 5–1. Suppose X has the uniform distribution on the interval (0, a) where
a > 0 is given. What are the mean and variance of X?
Problem 5–3. Find the moment generating function of a Bernoulli random variable
Y for which P[Y = 1] = 1/ 4.
Problem 5–4. Find the moment generating function of a random variable Z which
has the exponential distribution with parameter λ . Use the moment generating
function to find the mean and variance of Z.
Problem 5–5. A double indemnity life insurance policy has been issued to a person
aged 30. This policy pays $100,000 in the event of non-accidental death and
$200,000 in the event of accidental death. The probability of death during the next
year is 0.002, and if death occurs there is a 70% chance that it was due to an accident.
Write a random variable X which represents the size of the claim filed in the next
year. Find the distribution function, mean, and variance of X.
Problem 5–6. In the preceding problem suppose that if death occurs the day of the
year on which it occurs is uniformly distributed. Assume also that the claim will be
paid immediately at death and the interest rate is 5%. What is the expected present
value of the size of the claim during the next year?
Problem 5–5. Let D be a random variable which is 1 if the insured dies in the
next year and 0 otherwise. Let A be a random variable which is 2 if death is due
to an accident and 1 otherwise. Then X = 100000AD.
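With X written in this form, the distribution of X has only three possible values, so the mean and variance can be found by direct enumeration. The following sketch uses the probabilities stated in the problem; the variable names are illustrative.

```python
p_death = 0.002                 # probability of death during the next year
p_accident_given_death = 0.70   # chance that a death is accidental

# (value of X, probability) for X = 100000 * A * D.
outcomes = [
    (0, 1 - p_death),                                      # D = 0
    (100000, p_death * (1 - p_accident_given_death)),      # D = 1, A = 1
    (200000, p_death * p_accident_given_death),            # D = 1, A = 2
]

mean = sum(x * p for x, p in outcomes)
second_moment = sum(x ** 2 * p for x, p in outcomes)
variance = second_moment - mean ** 2
print(mean, variance)
```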
Problem 5–7. Hint: In the usual formula for the expectation of Y write
\(i = \sum_{j=1}^{i} 1\) and then interchange the order of summation.
Problem 5–8. Use a trick like that of the previous problem. Double integrals
anyone?
Solutions to Exercises
Exercise 5–1. Using the fact that B = A ∪ (B \ A) and property (3) gives
P[B] = P[A] + P[B \ A]. By property (1), P[B \ A] ≥ 0, so the inequality
P[B] ≥ P[A] follows.
Exercise 5–2. The distribution function F(t) takes the value 0 for t < 0, the
value 2/ 3 for 0 ≤ t < 1 and the value 1 for t ≥ 1.
Exercise 5–3. The distribution function F(t) is 0 if t < 0 and \(1 - e^{-t}\) for t ≥ 0.
The density function is 0 for t < 0 and \(e^{-t}\) for t ≥ 0.
Exercise 5–4. The distribution function F(t) takes the value 0 for t < 0, the
value t for 0 ≤ t ≤ 1 and the value 1 for t > 1. The density function takes the
value 1 for 0 < t < 1 and 0 otherwise.
Exercise 5–5. The picture should have a jump and also a smoothly increasing
portion.
Exercise 5–6. This function takes the value 1 for 0 ≤ t < 1 and the value 0
otherwise.
Exercise 5–8. The distribution function F(t) takes the value 0 if t < 0, the value
(1 − 0.001) + 0.001 × (t + 250)/70000 for 0 ≤ t < 50000 (because X = 0 if either there is
no fire or the loss caused by a fire is less than 250, and otherwise X ≤ t exactly when D ≤ t + 250),
and the value 1 for t ≥ 50000.
Exercise 5–11. Notice that the loss random variable X is neither discrete nor
absolutely continuous. The distribution function of X has two jumps: one at
t = 0 of size 0.999 + 0.001 × 250/70000 and another at 50000 of size 0.001 − 0.001 ×
50250/70000. So \(E[X] = 0 \times (0.999 + 0.001 \times 250/70000) + \int_{0}^{50000} t \cdot 0.001/70000\,dt + 50000 \times (0.001 - 0.001 \times 50250/70000)\). The quantity \(E[X^{2}]\) can be computed
similarly.
§6. Laboratory 2
1. What are the possible values of the random variable D? What are the possible
values of the random variable H?
2. For each relevant value of h and d compute P[H = h, D = d]. This probability
is called the joint density of H and D and is denoted fH,D (h, d). Put your computed
values in the form of a rectangular table with rows indexed by h and columns
indexed by d.
3. Find the density and expectation of D. Find the density and expectation of H.
7. Compute E[E[H | D]]. The Theorem of Total Expectation states that for any
two random variables X and Y, E[E[Y | X]] = E[Y]. Do your computations reflect
this fact?
8. The intuition behind the conditional expectation E[H | D] is that this should
be the expected value of H computed after taking the value of D into account. Argue
that E[H | D] = D/ 2. Is this in accord with your computations?
§7. Survival Distributions
An insurance policy can embody two different types of risk. For some types
of insurance (such as life insurance) the variability in the claim is only the time at
which the claim is made, since the amount of the claim is specified by the policy.
In other types of insurance (such as auto or casualty) there is variability in both the
time and amount of the claim. The problems associated with life insurance will be
studied first, since this is both an important type of insurance and also relatively
simple in some of its aspects.
The central difficulty in issuing life insurance is that of determining the length
of the future life of the insured. Denote by X the random variable which represents
the future lifetime of a newborn. For mathematical simplicity, assume that the
distribution function of X is absolutely continuous. The survival function of X,
denoted by s(x) is defined by the formula
s(x) = P[X > x] = P[X ≥ x]
where the last equality follows from the continuity assumption. The assumption
that s(0) = 1 will always be made.
Example 7–1. In the past there has been some interest in modelling survival functions
in an analytic way. The simplest model is that due to Abraham DeMoivre. He
assumed that \(s(x) = 1 - \frac{x}{\omega}\) for 0 < x < ω where ω is the limiting age by which
all have died. The DeMoivre law is simply the assertion that X has the uniform
distribution on the interval (0, ω).
Life insurance is usually issued on a person who has already attained a certain
age x. For notational convenience denote such a life aged x by (x), and denote the
future lifetime of a life aged x by T(x). What is the survival function for (x)? From
the discussion above, the survival function for (x) is P[T(x) > t]. Some standard
notation is now introduced. Set
\[ {}_{t}p_{x} = P[T(x) > t] \]
and
\[ {}_{t}q_{x} = P[T(x) \le t]. \]
When t = 1 the prefix is omitted and one just writes \(p_x\) and \(q_x\) respectively.
Generally speaking, having observed (x) some additional information about the
survival of (x) can be inferred. For example, (x) may have just passed a physical
exam given as a requirement for obtaining life insurance. For now this type of
possibility is disregarded. Operating under this assumption
\[ {}_{t}p_{x} = P[T(x) > t] = P[X > x + t \mid X > x] = \frac{s(x + t)}{s(x)}. \]
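This formula shows that every survival probability for (x) can be computed from the survival function of a newborn. The sketch below illustrates this with the DeMoivre law, assuming a limiting age of ω = 100 chosen only for illustration; the function names are hypothetical.

```python
def s_demoivre(x, omega=100.0):
    """DeMoivre survival function s(x) = 1 - x/omega (omega = 100 is an assumed value)."""
    return max(0.0, 1.0 - x / omega)


def t_p_x(t, x, s):
    """Probability that a life aged x survives t more years: s(x + t) / s(x)."""
    return s(x + t) / s(x)


def t_q_x(t, x, s):
    """Probability that a life aged x dies within t years."""
    return 1.0 - t_p_x(t, x, s)


# A life aged 30 surviving 10 more years under the assumed DeMoivre law.
print(t_p_x(10, 30, s_demoivre), t_q_x(10, 30, s_demoivre))
```

Passing the survival function as an argument keeps the two probability functions usable with any other model of s(x).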
Exercise 7–1. Write a similar expression for \({}_{t}q_{x}\).
which represents the probability that (x) survives at least t and no more than t + u
years. Again, if u = 1 one writes \({}_{t|}q_{x}\). The relations \({}_{t|u}q_{x} = {}_{t+u}q_{x} - {}_{t}q_{x} = {}_{t}p_{x} - {}_{t+u}p_{x}\)
follow immediately from the definition.
Exercise 7–4. Compute \({}_{t}p_{x}\) for the DeMoivre law of mortality. Conclude that under
the DeMoivre law T(x) has the uniform distribution on the interval (0, ω − x).
Under the assumption that X is absolutely continuous the random variable T(x)
will be absolutely continuous as well. Indeed
\[ P[T(x) \le t] = P[x \le X \le x + t \mid X > x] = 1 - \frac{s(x + t)}{s(x)} \]
so the density of T(x) is given by
\[ f_{T(x)}(t) = \frac{-s'(x + t)}{s(x)} = \frac{f_X(x + t)}{1 - F_X(x)}. \]
Intuitively this density represents the rate of death of (x) at time t.
Exercise 7–5. Use integration by parts to show that \(E[T(x)] = \int_{0}^{\infty} {}_{t}p_{x}\,dt\). This
expectation is called the complete expectation of life and is denoted by \(\mathring{e}_x\). Show
also that \(E[T(x)^{2}] = 2\int_{0}^{\infty} t\,{}_{t}p_{x}\,dt\).
\[ \mu_x = \frac{f_X(x)}{1 - F_X(x)} = -\frac{s'(x)}{s(x)} \]
which is called the force of mortality. Intuitively the force of mortality is the
instantaneous rate of death of (x). (In component reliability theory this function is
often referred to as the hazard rate.) Integrating both sides of this equality gives the
useful relation
\[ s(x) = \exp\left(-\int_{0}^{x} \mu_t\,dt\right). \]
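This relation can be used to recover the survival function numerically from any given force of mortality. The sketch below approximates the integral with the midpoint rule. The function name `survival_from_force`, the constant force 0.02, and the limiting age 100 are all assumptions made only for illustration.

```python
import math


def survival_from_force(mu, x, steps=10000):
    """Approximate s(x) = exp(-integral of mu(t) dt from 0 to x) by the midpoint rule."""
    h = x / steps
    integral = sum(mu((k + 0.5) * h) for k in range(steps)) * h
    return math.exp(-integral)


# Constant force of mortality: s(x) should be exp(-0.02 * x).
def constant_force(t):
    return 0.02


print(survival_from_force(constant_force, 10.0), math.exp(-0.2))


# DeMoivre law with omega = 100: s(x) = 1 - x/100, so the force is 1 / (100 - t).
def demoivre_force(t):
    return 1.0 / (100.0 - t)


print(survival_from_force(demoivre_force, 40.0))  # should be close to 1 - 40/100
```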
Exercise 7–7. Derive this last expression.
Exercise 7–8. Show that \({}_{t}p_{x} = e^{-\int_{x}^{x+t} \mu_s\,ds}\).
Exercise 7–9. Show that the density of T(x) can be written \(f_{T(x)}(t) = {}_{t}p_{x}\,\mu_{x+t}\).
If the force of mortality is constant the life random variable X has an expo-
nential distribution. This is directly in accord with the “memoryless” property of
exponential random variables. This memoryless property also has the interpretation
that a used article is as good as a new one. For human lives (and most manufactured
components) this is a fairly poor assumption, at least over the long term. The force
of mortality usually is increasing, although this is not always so.
The curtate future lifetime of (x), denoted by K(x), is defined by the relation
K(x) = [T(x)]. Here [t] is the greatest integer function. Note that K(x) is a discrete
random variable with density P[K(x) = k] = P[k ≤ T(x) < k + 1]. The curtate
lifetime, K(x), represents the number of complete future years lived by (x).
Exercise 7–12. Show that the curtate expectation of life \(e_x = E[K(x)]\) is given by
the formula \(e_x = \sum_{i=0}^{\infty} {}_{i+1}p_{x}\). Hint: \(E[Y] = \sum_{i=1}^{\infty} P[Y \ge i]\).
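The formula of this exercise is convenient for computation. The sketch below evaluates it under the DeMoivre law with an assumed limiting age of ω = 100 and an integer age x, and compares the result with the complete expectation of life, which is (ω − x)/2 under this law. The function names are hypothetical.

```python
def t_p_x(t, x, omega=100.0):
    """Survival probability under DeMoivre's law with limiting age omega (assumed 100)."""
    return max(0.0, (omega - x - t) / (omega - x))


def curtate_expectation(x, omega=100):
    """Sum of (i+1)-year survival probabilities; the terms vanish once x + i + 1 >= omega."""
    return sum(t_p_x(i + 1, x, omega) for i in range(int(omega - x)))


def complete_expectation(x, omega=100.0):
    """Complete expectation of life under DeMoivre's law: (omega - x) / 2."""
    return (omega - x) / 2


# The curtate expectation counts only complete years, so it is the smaller of the two.
print(curtate_expectation(30), complete_expectation(30))
```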
Problems
Problem 7–2. Calculate \(\frac{\partial}{\partial x}\,{}_{t}p_{x}\) and \(\frac{d}{dx}\,\mathring{e}_x\).
Problem 7–3. A life aged (40) is subject to an extra risk for the next year only.
Suppose the normal probability of death is given by the life table, and that the extra
risk may be expressed by adding the function 0.03(1 − t) to the normal force of
mortality for this year. What is the probability of survival to age 41?
Problem 7–4. Suppose qx is computed using force of mortality µx , and that q′x is
computed using force of mortality 2µx . What is the relationship between qx and q′x ?
Problem 7–5. Show that the conditional distribution of K(x) given that K(x) ≥ k is
the same as the unconditional distribution of K(x + k) + k.
Problem 7–6. Show that the conditional distribution of T(x) given that T(x) ≥ t is
the same as the unconditional distribution of T(x + t) + t.
Problem 7–7. The Gompertz law of mortality is defined by the requirement that
$\mu_t = Ac^t$ for some constants A and c. What restrictions are there on A and c for this
to be a force of mortality? Write an expression for t px and e̊x under Gompertz’ law.
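Problem 7–2. $\frac{\partial}{\partial x}\,{}_tp_x = {}_tp_x(\mu_x - \mu_{x+t})$ and
$\frac{d}{dx}\,\mathring{e}_x = \int_0^\infty \frac{\partial}{\partial x}\,{}_tp_x\,dt = \mu_x\mathring{e}_x - 1$.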
Problem 7–3. If $\mu_t$ is the usual force of mortality then $p_{40} = e^{-\int_0^1 (\mu_{40+s} + 0.03(1-s))\,ds}$.
Problem 7–4. The relation $p'_x = (p_x)^2$ holds, which gives the relation
$q'_x = 1 - (1 - q_x)^2 = 2q_x - q_x^2$ for the death probabilities.
Exercise 7–2. ${}_tp_x = s(x+t)/s(x) = s(x+s+(t-s))/s(x) = (s(x+s+(t-s))/s(x+s))(s(x+s)/s(x)) = {}_{t-s}p_{x+s}\,{}_sp_x$. What does this mean in words?
Exercise 7–3. For the first one, ${}_{t|u}q_x = P[t < T(x) \le t+u] = P[x+t < X \le t+u+x \mid X > x] = (s(x+t) - s(t+u+x))/s(x) = (s(x+t) - s(x) + s(x) - s(t+u+x))/s(x) = {}_{t+u}q_x - {}_tq_x$.
The second identity follows from the fourth term by simplifying
$(s(x+t) - s(t+u+x))/s(x) = {}_tp_x - {}_{t+u}p_x$. For the last one,
${}_{t|u}q_x = P[t < T(x) \le t+u] = P[x+t < X \le t+u+x \mid X > x] = (s(x+t) - s(t+u+x))/s(x) = (s(x+t)/s(x))(s(t+x) - s(t+u+x))/s(x+t) = {}_tp_x\,{}_uq_{x+t}$.
Exercise 7–6. Under DeMoivre’s law, e̊x = (ω − x)/ 2, since T(x) is uniform on
the interval (0, ω − x).
Exercise 7–7. $\int_0^x \mu_t\,dt = \int_0^x -s'(t)/s(t)\,dt = -\ln(s(x)) + \ln(s(0)) = -\ln(s(x))$,
since s(0) = 1. Exponentiating both sides to solve for s(x) gives the result.
Exercise 7–8. From the previous exercise, $s(x+t) = e^{-\int_0^{x+t} \mu_s\,ds}$. Using this
fact, the previous exercise, and the fact that ${}_tp_x = s(x+t)/s(x)$ gives the formula.
Exercise 7–9. Since fT(x) (t) = −s′ (x + t)/ s(x) and s′ (x + t) = −s(x + t)µx+t by the
previous exercise, the result follows.
Exercise 7–10. From the earlier expression for the survival function under
DeMoivre’s law s(x) = (ω − x)/ ω , so that µx = −s′ (x)/ s(x) = 1/ (ω − x), for
0 < x < ω.
Under each of the assumptions an explicit expression for all of the survivor
functions can be found.
Exercise 8–2. Find expressions for t qx and µx+t , 0 ≤ t ≤ 1, under each of the above
3 assumptions.
Copyright 2001 Jerry Alan Veeh. All rights reserved.
§8: Life Tables 31
Having observed (x) may mean more than simply having seen a person aged x.
It may well mean that (x) has just passed a physical exam in preparation for buying
a life insurance policy. One would expect that the survival distribution of such a
person could be different from s(x). If this is believed to be the case the survival
function is actually dependent on two variables: the age at selection (application
for insurance) and the amount of time passed after the time of selection. A life
table which takes this effect into account is called a select table. A family of
survival functions indexed by both the age at selection and time are then required
and notation such as q[x]+i denotes the probability that a person dies between years
x + i and x + i + 1 given that selection occurred at age x. As one might expect it
is reasonable to suppose that after a certain period of time the effect of selection
on mortality is negligible. The length of time until the selection effect becomes
negligible is called the select period. The Society of Actuaries (based in Illinois)
uses a 15 year select period in its mortality tables. The Institute of Actuaries in
Britain uses a 2 year select period. The implication of the select period of 15 years
in computations is that for each j ≥ 0, l[x]+15+j = lx+15+j .
A life table in which the survival functions are tabulated for attained ages only
is called an aggregate table. Generally, a select life table contains a final column
which constitutes an aggregate table. The whole table is then referred to as a select
and ultimate table and the last column is usually called an ultimate table. With
these observations in mind it is easy to utilize select life tables in computations.
Exercise 8–3. You are given the following extract from a 3 year select and ultimate
mortality table.
x l[x] l[x]+1 l[x]+2 lx+3 x+3
70 7600 73
71 7984 74
72 8016 7592 75
Assume that the ultimate table follows DeMoivre’s law and that d[x] = d[x]+1 =
d[x]+2 for all x. Find 1000(2| 2 q[71] ).
Problems
Problem 8–1. Graph µx+t , 0 ≤ t ≤ 1, under each of the 3 assumptions for fractional
years.
Problem 8–3. What is the maximum difference between the values of t px for the 3
assumptions for fractional years? Express your answer in terms of px .
Problem 8–4. What is the maximum difference between the values of s(x + t) for
the 3 assumptions for fractional years? Express your answer in terms of s(x) and/or
s(x + 1).
Problem 8–5. Use the life table to compute 1/2 p20 under each of the 3 assumptions
for fractional years.
Problem 8–6. Show that under the assumption of uniform distribution of deaths in
the year of death that K(x) and T(x) − K(x) are independent and that T(x) − K(x) has
the uniform distribution on the interval (0, 1).
Exercise 8–2. Under UDD, ${}_tq_x = (s(x) - s(x+t))/s(x) = (t\,s(x) - t\,s(x+1))/s(x) = t\,q_x$
and $\mu_{x+t} = -s'(x+t)/s(x+t) = (s(x) - s(x+1))/((1-t)s(x) + t\,s(x+1)) = (1-p_x)/(1 - t + t\,p_x)$.
Under constant force, ${}_tq_x = 1 - (p_x)^t$ while $\mu_{x+t} = \mu_x = -\ln p_x$. Under Balducci,
$s(x+t) = s(x)s(x+1)/((1-t)s(x+1) + t\,s(x))$ so that $s(x+t)/s(x) = p_x/(p_x + t\,q_x)$.
Thus ${}_tq_x = t\,q_x/(p_x + t\,q_x)$. Also $\mu_{x+t} = -s'(x+t)/s(x+t) = q_x/(p_x + t\,q_x)$.
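The three expressions for the fractional year death probability can be compared numerically. The following is only a sketch: the function name t_q_x and the assumption labels are made up for illustration, and the value of q_20 is taken from the table of q_x values given with the laboratory exercises in these notes.

```python
def t_q_x(q: float, t: float, assumption: str) -> float:
    """Probability that (x) dies within t of a year, 0 <= t <= 1, given q_x."""
    p = 1.0 - q
    if assumption == "udd":        # survival function linear over the year
        return t * q
    if assumption == "constant":   # constant force: t_p_x = (p_x) ** t
        return 1.0 - p ** t
    if assumption == "balducci":   # reciprocal of the survival function linear
        return t * q / (p + t * q)
    raise ValueError(f"unknown assumption: {assumption}")

q20 = 0.000345  # q_20 from the table of q_x values in these notes
half_p20 = {name: 1.0 - t_q_x(q20, 0.5, name)
            for name in ("udd", "constant", "balducci")}
for name, value in half_p20.items():
    print(f"{name:9s} 1/2_p_20 = {value:.9f}")
```

All three assumptions agree at t = 0 and t = 1, so any differences appear only strictly inside the year.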
1. Below is a table which gives the values of qx for ages 1 through 105. Use
this table to compute the survival function at integer ages x. Make a graph of the
survival function for this range of ages.
2. At what age(s) does the maximum discrepancy occur for the use of the 3
assumptions for fractional ages? What is the amount of this discrepancy?
x qx x qx x qx
1 0.000637 36 0.000841 71 0.026627
2 0.000430 37 0.000904 72 0.029565
3 0.000357 38 0.000964 73 0.032931
4 0.000278 39 0.001021 74 0.036738
5 0.000255 40 0.001079 75 0.041002
6 0.000244 41 0.001142 76 0.045699
7 0.000234 42 0.001215 77 0.050833
8 0.000216 43 0.001299 78 0.056487
9 0.000209 44 0.001397 79 0.062777
10 0.000212 45 0.001508 80 0.069757
11 0.000219 46 0.001629 81 0.077444
12 0.000228 47 0.001762 82 0.085828
13 0.000240 48 0.001905 83 0.094904
14 0.000254 49 0.002060 84 0.104700
15 0.000269 50 0.002225 85 0.115289
16 0.000284 51 0.002401 86 0.126798
17 0.000301 52 0.002589 87 0.139353
18 0.000316 53 0.002795 88 0.153021
19 0.000331 54 0.003023 89 0.167757
20 0.000345 55 0.003283 90 0.183408
21 0.000357 56 0.003583 91 0.199769
22 0.000366 57 0.003932 92 0.216605
23 0.000373 58 0.004332 93 0.233662
24 0.000376 59 0.004784 94 0.250693
25 0.000376 60 0.005286 95 0.267491
26 0.000378 61 0.005833 96 0.283905
27 0.000382 62 0.006414 97 0.299852
28 0.000393 63 0.007014 98 0.315296
29 0.000412 64 0.007616 99 0.330207
30 0.000444 65 0.008207 100 0.344556
31 0.000499 66 0.008777 101 0.358628
32 0.000562 67 0.009318 102 0.371685
33 0.000631 68 0.009828 103 0.383040
34 0.000702 69 0.010306 104 0.392003
35 0.000773 70 0.010753 105 1.000000
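As a hedged illustration of the first laboratory question, the survival probabilities at integer ages can be built up by repeated multiplication by $1 - q_x$. The sketch below uses only the first five rows of the table, and the function name survival_from_q is made up for this example. Since the table starts at age 1, the values computed are survival probabilities for a life aged 1, not values of s(x) measured from birth.

```python
# First five rows of the q_x table above (ages 1 through 5).
q = {1: 0.000637, 2: 0.000430, 3: 0.000357, 4: 0.000278, 5: 0.000255}

def survival_from_q(q_by_age, start_age):
    """Return {age: probability that a life aged start_age survives to age}."""
    s = {start_age: 1.0}
    age = start_age
    while age in q_by_age:
        # Surviving one more year multiplies the probability by p_age = 1 - q_age.
        s[age + 1] = s[age] * (1.0 - q_by_age[age])
        age += 1
    return s

surv = survival_from_q(q, 1)
for age in sorted(surv):
    print(age, round(surv[age], 6))
```

Extending the dictionary q to all 105 rows gives the full set of values needed for the graph.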
Example 10–1. A common artificial life form is the status which is denoted n . This
is the life form which survives for exactly n time units and then dies.
Example 10–2. Another common status is the joint life status which is constructed
as follows. Given two life forms (x) and (y) the joint life status, denoted x : y, dies
exactly at the time of death of the first to die of (x) and (y).
Exercise 10–1. If (x) and (y) are independent lives, what is the survival function of
the status x : y?
Occasionally, even the order in which death occurs is important. The status
$\overset{1}{x}:\overline{n}|$ is a status which dies at the time of death of (x) if the death of (x) occurs before
time n. Otherwise, this status never dies.
Problem 10–3. If the UDD assumption is valid for (x), does UDD hold for x : n ?
1
Problem 10–4. If the UDD assumption is valid for (x), does UDD hold for x : n ?
1
Problem 10–5. If the UDD assumption is valid for (x), does UDD hold for x : n ?
Problem 10–6. If the UDD assumption is valid for each of (x) and (y) and if (x)
and (y) are independent lives, does UDD hold for x : y?
§10: Status 38
Solutions to Problems
Problem 10–1. $P[T(\overset{1}{x}:\overline{n}|) \ge t] = {}_tp_x$ for $0 \le t < n$ and $P[T(\overset{1}{x}:\overline{n}|) \ge t] = {}_np_x$
for $t \ge n$.
Solutions to Exercises
Exercise 10–1. The joint life status survives t time units if and only if both (x)
and (y) survive t time units. Using the independence gives s(t) = t px t py .
1. Plot the survival function for Gompertz law of mortality with A = 0.001 and
c = 1.06. Compute e̊x for x = 0, . . . , 100 for this same Gompertz law.
2. Suppose (20) and (30) are independent lives that follow the Gompertz law
of mortality given in the previous problem. Plot the survival for the joint life status
20 : 30. Is there a single age (x) whose survival function is the same as the survival
function of 20 : 30?
3. Plot the survival function for Makeham’s law of mortality with A = 0.003,
B = 0.001, and c = 1.06. Compute e̊x for x = 0, . . . , 100 for this same Makeham
law.
4. Suppose (20) and (30) are independent lives that follow the Makeham law
of mortality given in the previous problem. Plot the survival for the joint life status
20 : 30. Is there a single age (x) whose survival function is the same as the survival
function of 20 : 30?
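One possible starting point for the Gompertz questions is sketched below. It assumes the form $\mu_t = Ac^t$ used earlier in these notes, for which integrating the force of mortality gives ${}_tp_x = \exp(-Ac^x(c^t - 1)/\ln c)$. The function names, the integration horizon, and the step count are arbitrary choices made for this illustration.

```python
import math

A, C = 0.001, 1.06  # Gompertz constants from the laboratory problem

def gompertz_tpx(x, t, a=A, c=C):
    """t_p_x = exp(-integral of a*c**s over [x, x+t]), in closed form."""
    return math.exp(-a * c ** x * (c ** t - 1.0) / math.log(c))

def complete_expectation(x, horizon=150.0, steps=15000):
    """Trapezoid-rule approximation of the integral of t_p_x over [0, horizon]."""
    h = horizon / steps
    total = 0.5 * (gompertz_tpx(x, 0.0) + gompertz_tpx(x, horizon))
    total += sum(gompertz_tpx(x, k * h) for k in range(1, steps))
    return total * h

def joint_tp(t, x=20, y=30):
    """Survival function of the joint life status x:y for independent lives."""
    return gompertz_tpx(x, t) * gompertz_tpx(y, t)

# Under Gompertz the product of the two survival functions has the same form,
# with c**w = c**20 + c**30 defining a single equivalent age w.
w = math.log(C ** 20 + C ** 30) / math.log(C)
print(w, joint_tp(10.0), gompertz_tpx(w, 10.0))
```

The same approach should carry over to Makeham's law once the corresponding expression for ${}_tp_x$ is substituted, although a single equivalent age need not exist in that case.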
For the sake of simplicity, assume that the force of interest is constant and
known to be equal to δ . Let bt denote the benefit payable if the time of death of the
insured is t. Also simply write T = T(x) whenever clarity does not demand the full
notation. In this context the actuarial present value of the benefit is defined by the
formula
\[ E[v^{T(x)} b_{T(x)}]. \]
Intuitively, the actuarial present value of the benefit is the premium that an insurance
company with no operating expenses and no desire for profit would charge in order
to provide the benefit payment. The rationale behind this intuition is the following.
Suppose that the company sold a large number of identical policies to people having
the same survival characteristics. By the Law of Large Numbers the average cost
to the company for providing the benefits would be approximately equal to the
actuarial present value. The actuarial present value of a benefit is also called the
net single premium. The net single premium would be the idealized amount an
insured would pay as a lump sum (single premium) at the time that the policy is
issued. The case of periodic premium payments will be discussed later.
A catalog of the various standard types of life insurance policies and the standard
notation for the associated net single premium follows. In most cases the benefit
amount is assumed to be $1, and in all cases the benefit is assumed to be paid at the
time of death. Keep in mind that a fixed constant force of interest is also assumed.
Here $v = 1/(1+i) = e^{-\delta}$.
Insurances Payable at the Time of Death
Type                           Net Single Premium
n-year pure endowment          $A_{x:\overset{1}{\overline{n}|}} = {}_nE_x = E[v^n 1_{(n,\infty)}(T)]$
n-year term                    $\bar{A}^1_{x:\overline{n}|} = E[v^T 1_{[0,n]}(T)]$
whole life                     $\bar{A}_x = E[v^T]$
n-year endowment               $\bar{A}_{x:\overline{n}|} = E[v^{T \wedge n}]$
m-year deferred n-year term    ${}_{m|n}\bar{A}_x = E[v^T 1_{(m,n+m]}(T)]$
The first type of insurance is n-year pure endowment insurance which pays
the full benefit amount at the end of the nth year if the insured survives at least n
years. The notation for the net single premium for a benefit amount of 1 is $A_{x:\overset{1}{\overline{n}|}}$ (or
occasionally in this context ${}_nE_x$). The net single premium for a pure endowment is
just the actuarial present value of a lump sum payment made at a future date. This
differs from the ordinary present value simply because it also takes into account the
mortality characteristics of the recipient.
The second type of insurance is n-year term insurance. The net single premium
with a benefit of 1 payable at the time of death for an insured (x) is denoted $\bar{A}^1_{x:\overline{n}|}$.
This type of insurance provides for a benefit payment only if the insured dies within n
years of policy inception.
The third type of insurance is whole life in which the full benefit is paid no
matter when the insured dies in the future. The whole life benefit can be obtained
by taking the limit as n → ∞ in the n-year term insurance setting. The notation for
the net single premium for a benefit of 1 is $\bar{A}_x$.
Exercise 12–2. Suppose that T(x) has an exponential distribution with mean 50. If
the force of interest is 5%, find the net single premium for a whole life policy for
(x), if the benefit of $1000 is payable at the moment of death.
The fourth type of insurance, n-year endowment insurance, provides for the
payment of the full benefit at the time of death of the insured if this occurs before
time n and for the payment of the full benefit at time n otherwise. The net single
premium for a benefit of 1 is denoted $\bar{A}_{x:\overline{n}|}$.
Exercise 12–5. Use the life table to find the net single premium for a 5 year pure
endowment policy for (30) assuming an interest rate of 5%.
§12: Life Insurance 43
The m-year deferred n-year term insurance policy provides the same
benefits as n-year term insurance between times m and m + n provided the insured
lives m years.
All of the insurances discussed thus far have a fixed constant benefit. Increasing
whole life insurance provides a benefit which increases linearly in time. Similarly,
increasing and decreasing n-year term insurance provides for linearly increasing
(decreasing) benefit over the term of the insurance.
Corresponding to the insurances payable at the time of death are the same type
of policies available with the benefit being paid at the end of the year of death. The
only difference between these insurances and those already described is that these
insurances depend on the distribution of the curtate life variable K = K(x) instead
of T. The following table introduces the notation.
Insurances Payable at the End of the Year of Death
Type                           Net Single Premium
n-year term                    $A^1_{x:\overline{n}|} = E[v^{K+1} 1_{[0,n)}(K)]$
whole life                     $A_x = E[v^{K+1}]$
n-year endowment               $A_{x:\overline{n}|} = E[v^{(K+1) \wedge n}]$
m-year deferred n-year term    ${}_{m|n}A_x = E[v^{K+1} 1_{[m,n+m)}(K)]$
These policies have net single premiums which can be easily computed from
the information in the life table. The primary use for these types of policies is the
computational connection between them and the ‘continuous’ policies described
above. To illustrate the ease of computation when using a life table observe that
from the definition
\[ A_x = \sum_{k=0}^{\infty} v^{k+1}\,{}_kp_x\,q_{x+k} = \sum_{k=0}^{\infty} v^{k+1}\,\frac{d_{x+k}}{l_x}. \]
In practice, of course, the sum is finite. Similar computational formulas are readily
obtained in the other cases.
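The sum above translates directly into a short computation. The following sketch assumes the one-year death probabilities $q_x, q_{x+1}, \ldots$ are supplied as a list ending in 1; the function name whole_life_nsp and the three-year mortality pattern are hypothetical and serve only to illustrate the formula.

```python
def whole_life_nsp(q, i):
    """A_x = sum over k of v**(k+1) * k_p_x * q_{x+k}, where q[k] holds q_{x+k}.

    The list should end with the value 1.0 so that the finite sum is complete.
    """
    v = 1.0 / (1.0 + i)
    nsp, kpx = 0.0, 1.0          # kpx tracks k_p_x, starting from 0_p_x = 1
    for k, qk in enumerate(q):
        nsp += v ** (k + 1) * kpx * qk
        kpx *= 1.0 - qk          # update to (k+1)_p_x for the next term
    return nsp

# Hypothetical three-year mortality pattern, for illustration only.
toy_q = [0.10, 0.20, 1.00]
print(whole_life_nsp(toy_q, 0.05))
```

With i = 0 the function returns 1, reflecting the fact that death is certain and the benefit of 1 is not discounted.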
Exercise 12–6. Show that $\bar{A}_{x:\overset{1}{\overline{n}|}} = A_{x:\overset{1}{\overline{n}|}}$ and interpret the result verbally. How would
you compute $A_{x:\overset{1}{\overline{n}|}}$ using the life table?
Under the UDD assumption it is fairly easy to find formulas which relate the
insurances payable at the time of death to the corresponding insurance payable at
the end of the year of death. For example, in the case of a whole life policy
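\begin{align*}
\bar{A}_x &= E[e^{-\delta T(x)}] \\
&= E[e^{-\delta (T(x)-K(x)+K(x))}] \\
&= E[e^{-\delta (T(x)-K(x))}]\,E[e^{-\delta K(x)}] \\
&= \frac{1}{\delta}(1 - e^{-\delta})\,e^{\delta}\,E[e^{-\delta (K(x)+1)}] \\
&= \frac{i}{\delta} A_x
\end{align*}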
where the third equality springs from the independence of K(x) and T(x) − K(x)
under UDD, and the fourth equality comes from the fact that under UDD the
random variable T(x) − K(x) has the uniform distribution on the interval (0,1).
Exercise 12–7. Can similar relationships be established for term and endowment
policies?
Exercise 12–8. Use the life table to find the net single premium for a 5 year
endowment policy for (30) assuming an interest rate of 5%.
Exercise 12–9. An insurance which pays a benefit amount of 1 at the end of the
mth part of the year in which death occurs has net single premium denoted by $A^{(m)}_x$.
Show that under UDD $i^{(m)} A^{(m)}_x = \delta \bar{A}_x$.
One consequence of the exercise above is that only the net single premiums for
insurances payable at the end of the year of death need to be tabulated, if the UDD
assumption is made. This leads to a certain amount of computational simplicity.
Problems
Problem 12–1. Write expressions for all of the net single premiums in terms of
either integrals or sums. Hint: Recall the form of the density of T(x) and K(x).
Problem 12–2. Show that $\delta\bar{A}^1_{x:\overline{n}|} = iA^1_{x:\overline{n}|}$, but that $\delta\bar{A}_{x:\overline{n}|} \ne iA_{x:\overline{n}|}$, in general.
Problem 12–3. Use the life table and UDD assumption (if necessary) to compute
$\bar{A}_{21}$, $\bar{A}_{21:\overline{5}|}$, and $\bar{A}^1_{21:\overline{5}|}$.
Problem 12–5. Assume that DeMoivre’s law holds with ω = 100 and i = 0.10.
Find $\bar{A}_{30}$ and $A_{30}$. Which is larger? Why?
Problem 12–6. Suppose µx+t = µ and i = 0.10. Compute Ax and A1x:n . Do your
answers depend on x? Why?
Problem 12–7. Suppose $A_x = 0.25$, $A_{x+20} = 0.40$, and $A_{x:\overline{20}|} = 0.55$. Compute $A_{x:\overset{1}{\overline{20}|}}$
and $A^1_{x:\overline{20}|}$.
Problem 12–9. What change in Ax results if for some fixed n the quantity qx+n is
replaced with qx+n + c?
Problem 12–3. Use $\delta\bar{A}_{21} = iA_{21}$, $\bar{A}_{21:\overline{5}|} = \bar{A}^1_{21:\overline{5}|} + {}_5E_{21}$ and the previous problem.
Problem 12–5. Clearly $\bar{A}_{30} > A_{30}$ since the insurance is paid sooner in the
continuous case. Under DeMoivre’s law the UDD assumption is automatic and
$\bar{A}_{30} = \frac{1}{70}\int_0^{70} e^{-\delta t}\,dt$.
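The comparison in Problem 12–5 can be checked numerically. The sketch below assumes DeMoivre's law with ω = 100 and i = 0.10 as stated in the problem, so that the future lifetime of (30) is uniform on (0, 70); the variable names are chosen only for this illustration.

```python
import math

omega, x, i = 100, 30, 0.10     # values from Problem 12-5
delta = math.log(1.0 + i)
v = 1.0 / (1.0 + i)
n = omega - x                   # T(30) is uniform on (0, 70)

# Benefit paid at the moment of death: (1/n) * integral of exp(-delta*t) over (0, n).
A_bar_30 = (1.0 - math.exp(-delta * n)) / (n * delta)
# Benefit paid at the end of the year of death: (1/n) * sum of v**k for k = 1..n.
A_30 = sum(v ** k for k in range(1, n + 1)) / n

print(A_bar_30, A_30, A_bar_30 / A_30, i / delta)
```

The ratio of the two premiums agrees with i/δ, as the UDD relation for whole life policies requires.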
Problem 12–6. The answers do not depend on x since the lifetime is exponential
and therefore ageless.
Problem 12–8. Either the person dies in the first year, or doesn’t. If she doesn’t,
buy an annually increasing policy for (x + 1) and a whole life policy to make up
for the increasing part the original policy would provide.
Problem 12–9. The new benefit is the old benefit plus a pure endowment
benefit of c at time n.
Solutions to Exercises
Exercise 12–1. Since if the benefit is paid, the benefit payment occurs at time
n, ${}_nE_x = E[v^n 1_{[n,\infty)}(T(x))] = v^n P[T(x) \ge n] = v^n\,{}_np_x$.
Exercise 12–2. Under the assumptions given the net single premium is
$E[1000v^{T(x)}] = \int_0^\infty 1000e^{-0.05t}(1/50)e^{-t/50}\,dt = 285.71$.
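The value 285.71 can be confirmed in two ways. For an exponential lifetime with constant force µ the integral reduces to µ/(µ + δ) per unit of benefit, and a crude numerical integration should agree. In the sketch below the helper midpoint_integral, the truncation point 400, and the step count are arbitrary choices made for illustration.

```python
import math

mu, delta, benefit = 1.0 / 50.0, 0.05, 1000.0   # data of Exercise 12-2

# Closed form: integral of exp(-delta*t) * mu * exp(-mu*t) over (0, infinity).
closed_form = benefit * mu / (mu + delta)

def midpoint_integral(f, a, b, steps):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (k + 0.5) * h) for k in range(steps))

# The integrand is negligible beyond t = 400, so the infinite range is truncated.
numeric = midpoint_integral(
    lambda t: benefit * math.exp(-delta * t) * mu * math.exp(-mu * t),
    0.0, 400.0, 40000)

print(round(closed_form, 2), round(numeric, 2))
```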
Exercise 12–3. For the conditioning argument, break the expectation into two
pieces by writing $\bar{A}_x = E[v^T] = E[v^T 1_{[0,n]}(T)] + E[v^T 1_{(n,\infty)}(T)]$. The first expectation
is exactly $\bar{A}^1_{x:\overline{n}|}$. For the second expectation, using the Theorem of Total Expectation
gives $E[v^T 1_{(n,\infty)}(T)] = E[E[v^T 1_{(n,\infty)}(T) \mid T \ge n]]$. Now the conditional
distribution of T given that $T \ge n$ is the same as the unconditional distribution of
T(x + n) + n. Using this fact gives the conditional expectation as
$E[v^T 1_{(n,\infty)}(T) \mid T \ge n] = E[v^{T(x+n)+n} 1_{(n,\infty)}(T(x+n)+n)]\,1_{(n,\infty)}(T) = v^n \bar{A}_{x+n} 1_{(n,\infty)}(T)$.
Taking expectations gives the result. To use the time diagram, imagine that instead of buying
a whole life policy now, the insured pledges to buy an n year term policy now,
and if alive after n years, to buy a whole life policy at time n (at age x + n). This
will produce the same result. The premium for the term policy paid now is $\bar{A}^1_{x:\overline{n}|}$
and the premium for the whole life policy at time n is $\bar{A}_{x+n}$. This latter premium
is only paid if the insured survives, so the present value of this premium is the
second term in the solution.
Exercise 12–4. Using the definition and properties of expectation gives
$\bar{A}_{x:\overline{n}|} = E[v^T 1_{[0,n]}(T) + v^n 1_{(n,\infty)}(T)] = E[v^T 1_{[0,n]}(T)] + E[v^n 1_{(n,\infty)}(T)] = \bar{A}^1_{x:\overline{n}|} + A_{x:\overset{1}{\overline{n}|}}$.
Exercise 12–5. The net single premium for the pure endowment policy is
$v^5\,{}_5p_{30} = (1.05)^{-5}\,l_{35}/l_{30} = (1.05)^{-5}(95808/96477) = 0.778$.
Exercise 12–8. The net single premium for a pure endowment policy is
$v^5\,{}_5p_{30} = (1.05)^{-5}\,l_{35}/l_{30} = (1.05)^{-5}(95808/96477) = 0.778$. For the endowment
policy, the net single premium for a 5 year term policy must be added to this
amount. From the relation given earlier, $A^1_{30:\overline{5}|} = A_{30} - v^5\,{}_5p_{30}\,A_{35}$. The relationship
between insurances payable at the time of death and insurances payable at the
end of the year of death is used to complete the calculation.
Exercise 12–9. Notice that [mT(x)] is the number of full mths of a year
that (x) lives before dying. (Here [a] is the greatest integer function.) So the
number of mths of a year that pass until the benefit for the insurance is paid
is [mT(x)] + 1, that is, the benefit is paid at time ([mT(x)] + 1)/m. From here
the derivation proceeds as above.
$A^{(m)}_x = E[v^{([mT]+1)/m}] = E[v^{([m(T-K+K)]+1)/m}] = E[v^K]\,E[v^{([m(T-K)]+1)/m}]$.
Now T − K has the uniform distribution on the interval
(0, 1) under UDD, so [m(T − K)] has the uniform distribution over the integers
0, . . . , m − 1. So $E[v^{[m(T-K)]/m}] = \sum_{j=0}^{m-1} v^{j/m} \times (1/m) = (1/m)(1-v)/(1-v^{1/m})$
from the geometric series formula. Substituting this in the earlier expression
gives $A^{(m)}_x = A_x v^{-1} v^{1/m} (1/m)(1-v)/(1-v^{1/m}) = \bar{A}_x \delta/i^{(m)}$ since $i^{(m)} = m(v^{-1/m} - 1)$.
§13. Laboratory 5
2. The one step recursion formulas derived in problem 1 are especially useful
for computational purposes. The formulas are used to work backwards from large
attained ages to smaller ones, since at large attained ages everyone is dead and
the net premium for the insurance must be zero. Use the values of qx given in
Laboratory 3 and i = 5% to compute the values of Ax and Ax for x = 1 to x = 105.
Place the result of your computations into a nice table.
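A backward computation of this kind might be organized as follows. The sketch assumes that the one step recursion in question is the standard relation $A_x = vq_x + vp_xA_{x+1}$, and it uses only the last six rows (ages 100 through 105) of the table of q_x values given earlier in these notes; the function name is made up for this illustration.

```python
# q_x for ages 100 through 105, copied from the table of q_x values in these notes.
q = {100: 0.344556, 101: 0.358628, 102: 0.371685,
     103: 0.383040, 104: 0.392003, 105: 1.000000}

def insurance_by_recursion(q_by_age, i):
    """Work backwards with A_x = v*q_x + v*p_x*A_{x+1}, starting from A = 0."""
    v = 1.0 / (1.0 + i)
    A = {}
    next_A = 0.0                      # beyond the last age everyone is dead
    for age in sorted(q_by_age, reverse=True):
        next_A = v * q_by_age[age] + v * (1.0 - q_by_age[age]) * next_A
        A[age] = next_A
    return A

A = insurance_by_recursion(q, 0.05)
for age in sorted(A):
    print(age, round(A[age], 6))
```

Supplying the full table for ages 1 through 105 in place of the six rows used here produces the values requested in the laboratory.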
The basic study of life insurance concludes by developing techniques for un-
derstanding what happens when premiums are paid monthly or annually instead of
just when the insurance is issued. In the non–random setting a sequence of equal
payments made at equal intervals in time was referred to as an annuity. Here
interest centers on annuities in which the payments are made (or received) only as
long as the insured survives.
An annuity in which the payments are made for a non–random period of time
is called an annuity certain. From the earlier discussion, the present value of an
annuity immediate (payments begin one period in the future) with a payment of 1
in each period is
\[ a_{\overline{n}|} = \sum_{j=1}^{n} v^j = \frac{1 - v^n}{i} \]
while the present value of an annuity due (payments begin immediately) with a
payment of 1 in each period is
\[ \ddot{a}_{\overline{n}|} = \sum_{j=0}^{n-1} v^j = \frac{1 - v^n}{1 - v} = \frac{1 - v^n}{d}. \]
These formulas will now be adapted to the case of contingent annuities in which
payments are made for a random time interval.
Suppose that (x) wishes to buy a life insurance policy. Then (x) will pay a
premium at the beginning of each year until (x) dies. Thus the premium payments
represent a life annuity due for (x). Consider the case in which the payment amount
is 1. Since the premiums are only paid annually the term of this life annuity depends
only on the curtate life of (x). There will be a total of K(x) + 1 payments, so the
actuarial present value of the payments is äx = E[äK(x)+1 ] where the left member is
a notational convention. This formula gives
\[ \ddot{a}_x = E[\ddot{a}_{\overline{K(x)+1}|}] = E\left[\frac{1 - v^{K(x)+1}}{d}\right] = \frac{1 - A_x}{d} \]
as the relationship between this life annuity due and the net single premium for a
whole life policy. A similar analysis holds for life annuities immediate.
Exercise 14–1. Compute the actuarial present value of a life annuity immediate.
What is the connection with a whole life policy?
Exercise 14–2. A life annuity due in which payments are made m times per year
and each payment is 1/m has actuarial present value denoted by $\ddot{a}^{(m)}_x$. Show that
$A^{(m)}_x + d^{(m)}\,\ddot{a}^{(m)}_x = 1$.
§14: Life Annuities 51
Example 14–1. The Mathematical Association of America offers the following
alternative to members aged 60. You can pay the annual dues and subscription rate
of $90, or you can become a life member for a single fee of $675. Life members
are entitled to all the benefits of ordinary members, including subscriptions. Should
one become a life member? To answer this question, assume that the interest rate is
5% so that the Life Table at the end of the notes can be used. The actuarial present
value of a life annuity due of $90 per year is
\[ 90\,\frac{1 - A_{60}}{1 - v} = 90\,\frac{1 - 0.412195}{1 - 1/1.05} = 1110.95. \]
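The arithmetic of this example is easy to reproduce. The sketch below uses only the figures quoted in the example (annual dues of $90, a life membership fee of $675, i = 5%, and the whole life net single premium 0.412195 for age 60); the variable names are chosen for illustration.

```python
dues, fee = 90.0, 675.0
i = 0.05
A60 = 0.412195                 # whole life net single premium from the example
d = 1.0 - 1.0 / (1.0 + i)      # discount rate d = 1 - v
annuity_due_60 = (1.0 - A60) / d
apv_dues = dues * annuity_due_60

print(round(apv_dues, 2), apv_dues > fee)
```

Since the actuarial present value of the annual dues exceeds the single fee, life membership looks attractive on an expected value basis under these assumptions.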
Exercise 14–3. What is the probability that you will get at least your money’s worth
if you become a life member? What assumptions have you made?
Pension benefits often take the form of a life annuity immediate. Sometimes
one has the option of receiving a higher benefit, but only for a fixed number of years
or until death occurs, whichever comes first. Such an annuity is called a temporary
life annuity.
Example 14–2. Suppose a life annuity immediate pays a benefit of 1 each year
for n years or until (x) dies, whichever comes first. The symbol for the actuarial
present value of such a policy is ax:n . How does one compute the actuarial present
value of such a policy? Remember that for a life annuity immediate, payments are
made at the end of each year, provided the annuitant is alive. So there will be a
total of K(x) ∧ n payments, and $a_{x:\overline{n}|} = E[\sum_{j=1}^{K(x) \wedge n} v^j]$. A similar argument applies
in the case of an n year temporary life annuity due. In this case, payments are
made at the beginning of each of n years, provided the annuitant is alive. In this
case $\ddot{a}_{x:\overline{n}|} = E[\sum_{j=0}^{K(x) \wedge (n-1)} v^j] = E[\frac{1 - v^{(K(x)+1) \wedge n}}{d}]$ where the left member of this equality
introduces the notation.
Exercise 14–4. Show that Ax:n = 1 − d äx:n . Find a similar relationship for ax:n .
Especially in the case of pension benefits it is more realistic to assume that the
payments are made monthly. Suppose payments are made m times per year. In this
case each payment is 1/ m. One could begin from first principles (this makes a good
exercise), but instead the previously established facts for insurances together with
the relationships between insurances and annuities given above will be used. Using
the obvious notation gives
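\begin{align*}
\ddot{a}^{(m)}_x &= \frac{1 - A^{(m)}_x}{d^{(m)}} \\
&= \frac{1 - \frac{i}{i^{(m)}} A_x}{d^{(m)}} \\
&= \frac{1 - \frac{i}{i^{(m)}}(1 - d\,\ddot{a}_x)}{d^{(m)}} \\
&= \frac{id}{i^{(m)} d^{(m)}}\,\ddot{a}_x + \frac{i^{(m)} - i}{i^{(m)} d^{(m)}}
\end{align*}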
where at the second equality the UDD assumption was used.
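The last line of this derivation is convenient for computation because it needs only the annual annuity value and the interest functions. The sketch below assumes the UDD relation used above; the function names are hypothetical, and alpha and beta are named after the coefficients α(m) and β(m) that appear in the problems for this section. The annual annuity value used at the end is the one implied by the figure 0.412195 quoted in Example 14–1.

```python
def nominal_rates(i, m):
    """Return (i^(m), d^(m)) for effective annual rate i and m periods per year."""
    v = 1.0 / (1.0 + i)
    i_m = m * ((1.0 + i) ** (1.0 / m) - 1.0)   # i^(m) = m(v**(-1/m) - 1)
    d_m = m * (1.0 - v ** (1.0 / m))           # d^(m) = m(1 - v**(1/m))
    return i_m, d_m

def mthly_annuity_due(a_due, i, m):
    """Under UDD: value of the mthly life annuity due given the annual one."""
    d = i / (1.0 + i)
    i_m, d_m = nominal_rates(i, m)
    alpha = i * d / (i_m * d_m)
    beta = (i - i_m) / (i_m * d_m)
    return alpha * a_due - beta

a_due_60 = (1.0 - 0.412195) / (0.05 / 1.05)    # annual annuity due for age 60
print(mthly_annuity_due(a_due_60, 0.05, 12))
```

Taking m = 1 returns the annual value unchanged, which is a quick sanity check on the coefficients.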
Exercise 14–5. Find a similar relationship for an annuity immediate which pays
1/ m m times per year.
\[ \bar{a}_x = E[\bar{a}_{\overline{T(x)}|}] = E\left[\frac{1 - e^{-\delta T(x)}}{\delta}\right]. \]
[Figure: two time diagrams on the interval from 0 to 1, one labeled with a payment of 1 and the other with a payment rate of σ.]
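Equating the present value of the two cash streams gives $v = \sigma \int_0^1 e^{-\delta t}\,dt$ from
which $\sigma = \delta/i$ in order for the streams to be equivalent. It follows that the amount
of the final payment, made at time T(x), for the complete annuity immediate must
be
\[ e^{\delta T(x)} \int_{K(x)}^{T(x)} \frac{\delta}{i}\,e^{-\delta t}\,dt = \int_0^{T(x)-K(x)} \frac{\delta}{i}\,e^{\delta t}\,dt. \]
Also
\[ \mathring{a}_x = E\left[\int_0^{T(x)} \frac{\delta}{i}\,e^{-\delta t}\,dt\right] = \frac{\delta}{i}\,\bar{a}_x. \]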
Exercise 14–7. When payments are made on an mthly basis (each payment being
1/m) the actuarial present value of a complete annuity immediate is denoted by $\mathring{a}^{(m)}_x$.
Find a formula for the adjustment payment and the actuarial present value in this
case.
Exercise 14–8. In the case of mthly payments (each of size 1/m) find a formula for
$\ddot{a}^{\{m\}}_x$ as well as for the size of the refund payment.
In order to compare apportioned and complete annuities, let us see how pre-
miums paid by a complete annuity immediate would operate. In such a scheme,
premiums would be paid at the end of each year, except that in the year of death
a reduced premium would be paid at the time of death. When viewed from the
insurer’s viewpoint in this way it is obvious that $\ddot{a}^{\{1\}}_x > \mathring{a}_x$.
There is one other idea of importance. In the annuity certain setting one may be
interested in the accumulated value of the annuity at a certain time. For an annuity
due for a period of n years the accumulated value of the annuity at time n, denoted
by $\ddot{s}_{\overline{n}|}$, is given by $\ddot{s}_{\overline{n}|} = (1+i)^n\,\ddot{a}_{\overline{n}|} = \frac{(1+i)^n - 1}{d}$. The present value of $\ddot{s}_{\overline{n}|}$ is the same as
the present value of the annuity. Thus the cash stream represented by the annuity is
equivalent to the single payment of the amount s̈n at time n. This last notion has an
analog in the case of life annuities. In the life annuity context
\[ {}_nE_x\,\ddot{s}_{x:\overline{n}|} = \ddot{a}_{x:\overline{n}|} \]
\[ a_x < a^{(2)}_x < a^{(3)}_x < \cdots < \bar{a}_x < \cdots < \ddot{a}^{(3)}_x < \ddot{a}^{(2)}_x < \ddot{a}_x. \]
Give an example to show that without the UDD assumption the inequalities may
fail.
Problem 14–3. Show that for any m we have $\ddot{a}^{\{m\}}_x < \ddot{a}^{(m)}_x$ and that $a^{(m)}_x < \mathring{a}^{(m)}_x$.
Problem 14–4. True or false: $A^1_{x:\overline{n}|} = 1 - d\,\ddot{a}^1_{x:\overline{n}|}$. Hint: When does $\overset{1}{x}:\overline{n}|$ die?
Problem 14–6. Use the life table to calculate the actuarial present value of $1000
due in 30 years if (40) survives.
Problem 14–7. Use the life table to compute $a_{21}$ and $\ddot{a}^{\{4\}}_{21}$.
Problem 14–8. Find a general formula for m| n äx and use it together with the life
table to compute 5| 10 ä20 .
\[ \ddot{a}^{(m)}_x = \alpha(m)\,\ddot{a}_x - \beta(m). \]
Here $\alpha(m) = \frac{id}{i^{(m)} d^{(m)}}$ and $\beta(m) = \frac{i - i^{(m)}}{i^{(m)} d^{(m)}}$. The functions α(m) and β(m) defined
here are standard actuarial functions.
Problem 14–12. Use the previous problem to show that δ (Ia)x + (IA)x = ax . Here
(Ia)x is the actuarial present value of an annuity in which payments are made at rate
t at time t. Is there a similar formula in discrete time?
Problem 14–14. Show that $\ddot{a}_{x:\overline{n}|} = \ddot{a}_x - v^n\,{}_np_x\,\ddot{a}_{x+n}$ and use this to compute $\ddot{a}_{21:\overline{5}|}$.
Solutions to Problems
Problem 14–1. As the type of annuity varies from left to right, the annuitant
receives funds sooner and thus the present value is higher.
Problem 14–2. Since $\mathring{a}^{(m)}_x = (\delta/i^{(m)})\,\bar{a}_x$ the result follows from the earlier relationship
between the rates of interest. A similar argument resolves the other half of
the inequalities.
Problem 14–3. The difference between the two sides of the inequalities is the
amount of the refund (or extra) payment.
Problem 14–4. The status dies only if (x) dies before time n. The result is true.
Problem 14–11. Use integration by parts starting with the formula
$\delta(\bar{I}\bar{a})_{\overline{T}|} = \int_0^T \delta t\,e^{-\delta t}\,dt$.
Solutions to Exercises
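Exercise 14–1. In this case there are K(x) payments, so $a_x = E[a_{\overline{K(x)}|}] = E[\frac{1 - v^{K(x)}}{i}] = \frac{1 - (1+i)A_x}{i}$, since $E[v^{K(x)}] = (1+i)A_x$.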
Exercise 14–2. Here there are [mK]+1 payments, so using the geometric series
formula gives $\ddot{a}^{(m)}_x = E[\sum_{j=0}^{[mK]} (1/m)v^{j/m}] = E[(1/m)(1 - v^{([mK]+1)/m})/(1 - v^{1/m})]$.
Now $m(1 - v^{1/m}) = d^{(m)}$, which gives the result.
Exercise 14–3. To get your money’s worth, you must live long enough so that
the actuarial present value of the annual dues will exceed $675.
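Exercise 14–4. For the first one
$\ddot{a}_{x:\overline{n}|} = E[\frac{1 - v^{(K+1) \wedge n}}{d}] = E[\frac{1 - v^{(K+1) \wedge n}}{d} 1_{[0,n-1]}(K)] + E[\frac{1 - v^{(K+1) \wedge n}}{d} 1_{[n,\infty)}(K)] = E[\frac{1 - v^{K+1}}{d} 1_{[0,n-1]}(K)] + E[\frac{1 - v^n}{d} 1_{[n,\infty)}(K)] = (1/d)(1 - A_{x:\overline{n}|})$.
A similar argument, using the fact that there are K ∧ n payments, shows that
$a_{x:\overline{n}|} = (1/i)(1 - (1+i)A^1_{x:\overline{n}|} - v^n\,{}_np_x)$.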
Exercise 14–5. The argument proceeds in a similar way, beginning with the
relation $a^{(m)}_x = \frac{1 - (1 + i^{(m)}/m)A^{(m)}_x}{i^{(m)}}$.
Exercise 14–6. The first relationship follows directly from the given equation
and the fact that $\bar{A}_x = E[e^{-\delta T(x)}]$. Since $T(x:\overline{n}|) = T(x) \wedge n$ a similar argument
gives $\bar{a}_{x:\overline{n}|} = (1/\delta)(1 - \bar{A}_{x:\overline{n}|})$.
Exercise 14–7. In this case the payment rate for the corresponding continuous
annuity is $\delta/i^{(m)}$, which gives $\mathring{a}^{(m)}_x = \frac{\delta}{i^{(m)}}\,\bar{a}_x$ and the adjustment payment as
$e^{\delta T(x)} \int_{[mK(x)]/m}^{T(x)} (\delta/i^{(m)})\,e^{-\delta t}\,dt = \int_0^{T(x)-[mK(x)]/m} (\delta/i^{(m)})\,e^{\delta t}\,dt$.
Exercise 14–8. Here the rate is $\delta/d^{(m)}$ for the corresponding continuous annuity
so that $\ddot{a}^{\{m\}}_x = (\delta/d^{(m)})\,\bar{a}_x$ and the refund payment is
$e^{\delta T} \int_T^{[mK]/m+1/m} \frac{\delta}{d^{(m)}}\,e^{-\delta t}\,dt = \int_0^{[mK]/m+1/m-T} \frac{\delta}{d^{(m)}}\,e^{-\delta t}\,dt$.
§15. Laboratory 6
1. Show that $\ddot{a}_x = 1 + v\,p_x\,\ddot{a}_{x+1}$. Find a similar formula for $\ddot{a}^{(m)}_x$ and $\ddot{a}^{\{m\}}_x$.
2. The one step recursion formulas for annuities can be used just like the one
step recursions for insurances themselves. Use the qx values from Laboratory 3 and
{12}
i = 5% and compute äx , ä(12)
x , and äx for x = 1 to x = 105. Place the result of your
computations into a nice table.
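The backward use of the one step recursion $\ddot a_x = 1 + v p_x \ddot a_{x+1}$ can be sketched as follows. This is only a minimal illustration: the function name and the four mortality rates are made up for the example and are not the Laboratory 3 values, and the table is assumed to end with a death probability of 1.

```python
# Backward recursion for whole life annuities-due:
#   a_due[k] = 1 + v * p[k] * a_due[k + 1]
# The mortality rates below are hypothetical, NOT the Laboratory 3 table.

def annuity_due_values(q, i):
    """Return a list whose k-th entry is the annuity-due value at the k-th age.

    q[k] is the one-year death probability at the k-th age in the table.
    The last entry of q is assumed to be 1.0 so that the table terminates.
    """
    v = 1.0 / (1.0 + i)
    n = len(q)
    a_due = [0.0] * (n + 1)          # value beyond the end of the table is 0
    for k in range(n - 1, -1, -1):   # work backward from the oldest age
        a_due[k] = 1.0 + v * (1.0 - q[k]) * a_due[k + 1]
    return a_due[:n]

# Hypothetical toy table covering four consecutive ages.
toy_q = [0.10, 0.20, 0.50, 1.00]
toy_values = annuity_due_values(toy_q, 0.05)
for age_index, value in enumerate(toy_values):
    print(age_index, round(value, 4))
```

The same loop, with the factor and payment pattern changed, would serve for the mthly and apportionable versions once their recursions have been derived.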
The common types of insurance policies can now be realistically analyzed from
an insurer's point of view.
To develop the ideas consider the case of an insurer who wishes to sell a fully
discrete whole life policy which will be paid for by equal annual premium payments
during the life of the insured. The terminology fully discrete refers to the fact that
the benefit is to be paid at the end of the year of death and the premiums are to
be paid on a discrete basis as well. How should the insurer set the premium? A first
approximation is given by the net premium. The net premium is found by using
the equivalence principle: the premium should be set so that the net expected loss
is zero. Using the equivalence principle the net premium $P$ should satisfy
$$E[v^{K(x)+1} - P\,\ddot a_{\overline{K(x)+1}|}] = 0$$
or
$$A_x - P\,\ddot a_x = 0.$$
From here it is easy to determine the net premium, which in this case is denoted $P_x$;
explicitly, $P_x = A_x / \ddot a_x$.
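Solving the equivalence equation for the premium gives the ratio of the net single premium to the annuity value. The following sketch computes both quantities by direct summation and then forms the ratio. The mortality rates and the function name are hypothetical choices for illustration only; the life table of these notes is not reproduced here.

```python
# Net annual premium for a fully discrete whole life policy of amount 1,
# computed as (net single premium) / (annuity-due value).
# The mortality rates are hypothetical and end with q = 1.0.

def whole_life_values(q, i):
    """Return (A, a_due) at the first age of q by direct summation."""
    v = 1.0 / (1.0 + i)
    A = 0.0
    a_due = 0.0
    survival = 1.0                          # probability of surviving k years
    for k, qk in enumerate(q):
        a_due += v ** k * survival          # premium of 1 paid at time k if alive
        A += v ** (k + 1) * survival * qk   # benefit of 1 paid at time k + 1
        survival *= 1.0 - qk
    return A, a_due

toy_q = [0.10, 0.20, 0.50, 1.00]            # hypothetical table
A_x, a_due_x = whole_life_values(toy_q, 0.05)
P_x = A_x / a_due_x
print(round(A_x, 6), round(a_due_x, 6), round(P_x, 6))
```

Because the toy table terminates, the computed values also satisfy the relation between the insurance and the annuity-due, which gives a quick check on the arithmetic.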
Exercise 16–1. Use the life table to find the net premium, P30 , for (30) if i = 0.05.
The notation for other net premiums for fully discrete insurances parallels the
notation for the insurance policies themselves. For example, the net annual premium
for an n year term policy with premiums payable monthly is denoted $P^{(12)}_{\overset{1}{x}:\overline{n}|}$.
Exercise 16–3. An h payment whole life policy is one in which the premiums are
paid for h years, beginning immediately. Find a formula for ${}_hP_x$, the net annual
premium for an h payment whole life policy.
$$P\,\ddot a^{(12)}_{x:\overline{85-x}|} = 100000\, A^1_{x:\overline{65-x}|} + 75000\, {}_{65-x|10}A_x + 50000\, {}_{75-x|5}A_x + 25000\, {}_{80-x|5}A_x$$
Exercise 16–4. Compute the actual net monthly premium for (21).
The methodology for finding the net premium for other types of insurance is
exactly the same. The notation in the other cases is now briefly discussed. The most
common type of insurance policy is one issued on a semi-continuous basis. Here
the benefit is paid at the time of death, but the premiums are paid on a discrete basis.
The notation for the net annual premium in the case of a whole life policy is $P(\bar A_x)$.
The net annual premium for a semi-continuous term policy with premiums payable
mthly is $P^{(m)}(\bar A^1_{x:\overline{n}|})$. The notation for other semi-continuous policies is similar.
Exercise 16–5. What type of policy has net annual premium $P^{\{m\}}(\bar A_{x:\overline{n}|})$?
Policies issued on a fully continuous basis pay the benefit amount at the time
of death and collect premiums in the form of a continuous annuity. Obviously, such
policies are of theoretical interest only. The notation here is similar to that of the
semi-continuous case, with a bar placed over the P. Thus $\bar P(\bar A_x)$ is the premium rate
for a fully continuous whole life policy.
§16: Net Premiums 61
Problems
Problem 16–3. If $P(A_x) = 0.03$ and if interest is at the effective rate of 5%, find
$P^{\{2\}}_x$.
Problem 16–4. If ${}_{15}P_{45} = 0.038$, $P_{45:\overline{15}|} = 0.056$ and $A_{60} = 0.625$ find $P^1_{45:\overline{15}|}$.
Problem 16–5. Recall that apportionable annuities differ from annuities due only
in the fact that the apportionable annuity offers the additional benefit of a ‘premium
refund’. Let $A^{PR}_x$ denote the net single premium for this refund benefit for a
continuous whole life policy with apportioned premiums payable annually. Show
that
$$A^{PR}_x = \frac{\bar P(\bar A_x)}{\delta}\,(\bar A_x - A_x).$$
$$P(A^{PR}_x) = \frac{\bar P(\bar A_x)}{\delta\,\ddot a_x}\,(\bar A_x - A_x).$$
Problem 16–7. Use the equivalence principle to find the net annual premium for
a fully discrete 10 year term policy with benefit equal to $10,000 plus the return,
with interest, of the premiums paid. Assume that the interest rate earned on the
premiums is the same as the interest rate used in determining the premium. Use the
life table to compute the premium for this policy for (21). How does this premium
compare with $10000\,P^1_{21:\overline{10}|}$?
Problem 16–8. A level premium whole life insurance of 1, payable at the end
of the year of death, is issued to (x). A premium of G is due at the beginning
of each year provided (x) survives. Suppose $L$ denotes the insurer’s loss when
$G = P_x$, $L^*$ denotes the insurer’s loss when $G$ is chosen so that $E[L^*] = -0.20$, and
$\mathrm{Var}(L) = 0.30$. Compute $\mathrm{Var}(L^*)$.
Use the equivalence principle to find an expression for the renewal net annual
premium.
Problem 16–10. A $1000 whole life policy is issued to (50). The premiums are
payable twice a year, and are calculated on an apportionable basis. The benefit is
payable at the moment of death. Calculate the semi-annual net premium given that
$A_{50} = 0.3$, $\delta = 0.07$, and $e^{-0.035} = 0.9656$.
Problem 16–11. Polly, aged 25, wishes to provide cash for her son Tad, currently
aged 5, to go to college. Polly buys a policy which will provide a benefit in the
form of a temporary life annuity due (contingent on Tad’s survival) in the amount of
$25,000 per year for 4 years commencing on Tad’s 18th birthday. Polly will make 10
equal annual premium payments beginning today. The 10 premium payments take
the form of a temporary life annuity due (contingent on Polly’s survival). According
to the equivalence principle, what is the amount of each premium payment? Use
the life table and UDD assumption (if necessary).
Problem 16–12. Snow White, presently aged 21, wishes to provide for the welfare
of the 7 dwarfs in the event of her premature demise. She buys a whole life policy
which will pay $7,000,000 at the moment of her death. The premium payments for
the first 5 years will be $5,000 per year. According to the equivalence principle,
what should her net level annual premium payment be thereafter? Use the life table
and UDD assumption (if necessary).
Problem 16–13. The Ponce de Leon Insurance Company computes premiums for
its policies under the assumptions that i = 0.05 and µx = 0.01 for all x > 0. What
is the net annual premium for a whole life policy for (21) which pays a benefit
of $100,000 at the moment of death and has level apportioned premiums payable
annually?
Solutions to Problems
Problem 16–2. This is really a question about the present value of annuities.
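Problem 16–4. Use the two equations $P_{45:\overline{15}|} = P^1_{45:\overline{15}|} + {}_{15}E_{45}/\ddot a_{45:\overline{15}|}$ and
$$P^1_{45:\overline{15}|} = \frac{A_{45} - {}_{15}E_{45}\,A_{60}}{\ddot a_{45:\overline{15}|}} = {}_{15}P_{45} - {}_{15}E_{45}\,A_{60}/\ddot a_{45:\overline{15}|}$$
with the given information.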
Exercise 16–5. This is the premium for a continuous endowment policy with
mthly apportioned premiums.
§17. Laboratory 7
3. Suppose the benefit amount of the policy in problem 1 is 100,000 for death
in the first 30 years (until age 60), and then decreases by 5,000 per year for each of
the remaining years. What is the semi-annual premium?
A more realistic view of the insurance business includes provisions for expenses.
The profit for the company can also be included here as an expense.
The common method used for the determination of the expense loaded pre-
mium (or the gross premium) is a modification of the equivalence principle. Ac-
cording to the modified equivalence principle the gross premium G is set so that
on the policy issue date the actuarial present value of the benefit plus expenses is
equal to the actuarial present value of the premium income. The premium is usually
assumed to be constant. Under these assumptions it is fairly easy to write a formula
to determine $G$. Assume that the expenses in policy year $k$ are $e_{k-1}$ and are paid at
time $k - 1$, that is, at the beginning of the year. The actuarial present value of the
expenses is then given by
$$E\Big[\sum_{k=0}^{K(x)} v^k e_k\Big] = \sum_{k=0}^{\infty} v^k e_k\, {}_kp_x.$$
Typically expenses are dependent on the premium. Also the sales commission is
usually dependent on the policy size.
Example 18–1. Suppose that the first year expenses for a $100,000 semi-continuous
whole life policy are 20% of premiums plus a sales commission equal to 0.5% of
the policy amount, and that the expenses for subsequent years are 10% of premium
plus $5. The gross premium G for such a policy satisfies
$$100{,}000\,\bar A_x + (0.20\,G + 500) + (0.10\,G + 5)\,a_x = G\,\ddot a_x,$$
where $a_x = \ddot a_x - 1$.
An important, and realistic, feature of the above example is the large amount of
first year expense. Expenses are now examined in greater detail.
Example 18–2. Let’s look at the previous example in the case of a policy for a
person aged 21. Assume that the interest rate is 5% and that the life table applies.
Then
$$G = \frac{100{,}000\,\bar A_{21} + 495 + 5\,\ddot a_{21}}{0.9\,\ddot a_{21} - 0.1} = \$604.24.$$
From this gross premium the company must pay $500 in fixed expenses plus 20%
of the gross premium in expenses ($120.85), plus provide term insurance coverage
for the first year, for which the net single premium is $100{,}000\,\bar A^1_{21:\overline{1}|} = \$123.97$. Thus
there is a severe expected cash flow strain in the first policy year! The interested
reader may wish to examine the article “Surplus Loophole” in Forbes, September
4, 1989, pages 44-48.
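The gross premium computation of the two preceding examples can be sketched in code. The function name, argument names, and the numerical inputs for the insurance and annuity values are assumptions made for illustration; in particular the inputs below are made up and are not the life table values that lead to the $604.24 figure.

```python
# Gross premium under the modified equivalence principle for a policy whose
# expenses are a percentage of premium plus a fixed amount, with different
# first-year and renewal levels. All numerical inputs are hypothetical.

def gross_premium(benefit, A_bar, a_due,
                  pct_first, pct_renewal, fixed_first, fixed_renewal):
    """Solve  benefit*A_bar + (pct_first*G + fixed_first)
              + (pct_renewal*G + fixed_renewal)*(a_due - 1) = G*a_due  for G."""
    numerator = (benefit * A_bar
                 + (fixed_first - fixed_renewal)
                 + fixed_renewal * a_due)
    denominator = (1.0 - pct_renewal) * a_due - (pct_first - pct_renewal)
    return numerator / denominator

# Hypothetical insurance and annuity values (not taken from the life table).
A_bar_example = 0.06
a_due_example = 19.5

# Expense pattern of the example: 20% of premium plus 500 in the first year,
# 10% of premium plus 5 in each later year.
G = gross_premium(100_000, A_bar_example, a_due_example, 0.20, 0.10, 500.0, 5.0)
first_year_expense = 0.20 * G + 500.0
print(round(G, 2), round(first_year_expense, 2))
```

With the expense pattern of the example the numerator reduces to the benefit value plus 495 plus 5 times the annuity value, and the denominator to 0.9 times the annuity value minus 0.1, matching the displayed formula for G.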
Copyright 2001 Jerry Alan Veeh. All rights reserved.
§18: Insurance Models Including Expenses 67
Expenses typically consist of two parts. The first part of the expenses can be
expressed as a fraction of gross premium. These are expenses which depend on
policy amount, such as sales commission, taxes, licenses, and fees. The other part
of expenses consists of those items which are independent of policy amount such
as data processing fees, printing of actual policy documents, clerical salaries, and
mailing expenses.
$$G(b)(1 - f) = ab + c$$
where the constant $a$ captures the amount of expenses related to the amount of
benefit, $c$ captures the amount of non-benefit related expenses, and $f$ is the portion
of the premium used to cover expenses which vary with the amount of premium. It
is useful to write
$$G(b) = b\,\frac{a + c/b}{1 - f} = b\,R(b)$$
where $R(b)$ is called the premium rate for a policy of amount $b$. In the policy fee
method of premium determination $R(b)$ is taken as above and the amount $c/(1 - f)$
is called the policy fee. This results in the (theoretically) correct premium always
being charged.
A far simpler method is the single rate method in which the premium rate is
taken to be $R(\bar b)$ where $\bar b$ denotes the average benefit amount. This produces the
(theoretically) correct aggregate premium amount. This method results in the lower
policy sizes paying a lower policy fee (why?).
The band method, or quantity discount approach, generalizes the single rate
method by approximating the function R(b) by a set of straight lines. Here the policy
rates increase as the policy size decreases (why?).
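The policy fee method and the single rate method can be compared numerically. In the sketch below the expense constants, the three benefit amounts, and all function names are hypothetical choices made only to illustrate the formulas for G(b) and R(b).

```python
# Comparing the policy fee method with the single rate method.
# The constants a, c, f and the benefit amounts are hypothetical.

def premium_rate(b, a, c, f):
    """R(b) = (a + c/b) / (1 - f), the premium rate for a policy of amount b."""
    return (a + c / b) / (1.0 - f)

def policy_fee_premium(b, a, c, f):
    """Rate a/(1-f) per unit of benefit plus the flat policy fee c/(1-f)."""
    return b * a / (1.0 - f) + c / (1.0 - f)

def single_rate_premium(b, average_b, a, c, f):
    """Every policy is charged the rate R evaluated at the average benefit."""
    return b * premium_rate(average_b, a, c, f)

a, c, f = 0.01, 50.0, 0.10
benefits = [10_000.0, 50_000.0, 150_000.0]
average_benefit = sum(benefits) / len(benefits)

fee_method = [policy_fee_premium(b, a, c, f) for b in benefits]
single_rate = [single_rate_premium(b, average_benefit, a, c, f) for b in benefits]
for b, g1, g2 in zip(benefits, fee_method, single_rate):
    print(int(b), round(g1, 2), round(g2, 2))
print(round(sum(fee_method), 2), round(sum(single_rate), 2))
```

In this toy portfolio the two totals agree, while the smallest policy pays less under the single rate method than under the policy fee method, which is the effect the text asks the reader to explain.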
Problem 18–1. The expense loaded annual premium for a 35 year endowment
policy of $10,000 issued to (30) is computed under the assumptions that
(1) sales commission is 40% of the gross premium in the first year
(2) renewal commissions are 5% of the gross premium in year 2 through 10
(3) taxes are 2% of the gross premium each year
(4) per policy expenses are $12.50 per 1000 in the first year and $2.50 per 1000
thereafter
(5) i = 0.05
Problem 18–2. A semi-continuous whole life policy issued to (21) has the following
expense structure. The first year expense is 0.4% of the policy amount plus $50. The
expenses in years 2 through 10 are 0.2% of the policy amount plus $25. Expenses in
the remaining years are $25, and at the time of death there is an additional expense
of $100. Find a formula for G(b). Compute G(1) and compare it to A21 .
Problem 18–3. Your company sells supplemental retirement annuity plans. The
benefit under such a plan takes the form of an annuity immediate, payable monthly,
beginning on the annuitant’s 65th birthday. Let the amount of the monthly benefit
payment be b. The premiums for this annuity are collected via payroll deduction
at the end of each month during the annuitant’s working life. Set up expenses for
such a plan are $100. Subsequent expenses are $5 each month during the premium
collection period, $100 at the time of the first annuity payment, and $5 per month
thereafter. Find G(b) for a person buying the plan at age x. What is R(b)?
Problem 18–4. A single premium life insurance policy with benefits payable at the
end of the year of death is issued to (x). Suppose that
(1) Ax = 0.25
(2) d = 0.05
(3) Sales commission is 18% of gross premium
(4) Taxes are 2% of gross premium
(5) per policy expenses are $40 the first year and $5 per year thereafter
A realistic model for both insurance policies and the method and amount of
premium payment is now in hand. The next question is how accounting principles
are applied to the financial operations of insurance companies.
A basic review of accounting principles is given first. There are three broad
categories of items for accounting purposes: assets, liabilities, and equity. Assets
include everything which is owned by the business. Liabilities include everything
which is owed by the business. Equity consists of the difference in the value of the
assets and liabilities. Equity could be negative. In the insurance context liabilities
are referred to as reserve and equity as surplus. When an insurance policy is
issued the insurance company is accepting certain financial obligations in return for
the premium income. The basic question is how this information is reflected in the
accounting statements of the company. Some of the different accounting procedures
available will now be described. Keep in mind that this discussion only concerns
how the insurance company prepares accounting statements reflecting transactions
which have occurred. The method by which gross (or net) premiums are calculated
is not being changed!
Example 19–1. Suppose the following data for an insurance company is given.
Balance Sheet
December 31, 1989 December 31, 1990
Assets 1,725,000 —
Reserves — 1,433,000
Surplus 500,000 —
The missing entries in the tables can be filled in as follows (amounts in thousands).
Total income is 341 + 108 = 449 while total expenses are 112 + 93 = 205,
so net income (before reserve contributions) is 449 − 205 = 244. Now the reserves
at the end of 1989 are 1,725 − 500 = 1,225, so the increase in reserves must be
1,433 − 1,225 = 208. The net income is 244 − 208 = 36. Hence the 1990 surplus
is 536 and the 1990 assets are 1,969.
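The arithmetic of this example can be replayed step by step. The variable names are illustrative assumptions; the figures are the ones quoted in the text, in thousands.

```python
# Replaying the balance sheet arithmetic of the example (amounts in thousands).
income_items = [341, 108]     # the two income figures quoted in the text
expense_items = [112, 93]     # the two expense figures quoted in the text
assets_1989 = 1725
surplus_1989 = 500
reserves_1990 = 1433

net_before_reserves = sum(income_items) - sum(expense_items)
reserves_1989 = assets_1989 - surplus_1989          # assets = reserves + surplus
reserve_increase = reserves_1990 - reserves_1989
net_income = net_before_reserves - reserve_increase
surplus_1990 = surplus_1989 + net_income
assets_1990 = reserves_1990 + surplus_1990

print(reserves_1989, reserve_increase, net_income, surplus_1990, assets_1990)
```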
§19: Net Premium Reserves 72
The central question in insurance accounting is “How are liabilities measured?”
The answer to this question has some very important consequences for the operation
of the company, as well as for the financial soundness of the company. The general
equation is
Reserve at time t = Actuarial Present Value at time t of future benefits
− Actuarial Present Value at time t of future premiums.
The only accounting assumption required is one regarding the premium to be used
in this formula. Is it the net premium, gross premium, or ???
The first point of view is that liabilities are measured as the net level premium
reserves. This is the reserve computed under the accounting assumption that the
premium charged for the policy is the net level premium. To see that this might
be a reasonable approach, recall that the equivalence principle sets the premium
so that the actuarial present value of the benefit is equal to the actuarial present
value of the premiums collected. However, it is clear that after the policy is
issued the present value of the benefits and of the un-collected premiums will no
longer be equal, but will diverge in time. This is because the present value of the
unpaid benefits will be increasing in time and the present value of the uncollected
premiums will decrease in time. The discrepancy between these two amounts at any
time represents an unrealized liability to the company. To avoid a negative surplus
(technical bankruptcy), this liability must be offset in the accounting statements of
the company by a corresponding asset. Assume (for simplicity) that this asset takes
the form of cash on hand of the insurance company at that time. How does one
compute the amount of the reserve at any time t under this accounting assumption?
This computation is illustrated in the context of an example.
Example 19–2. Consider a fully discrete whole life policy issued to (x) in which
the premium is payable annually and is equal to the net premium. What is the
reserve at time k, where k is an integer? To compute the reserve simply note that
if (x) has survived until time $k$ then the (curtate) remaining life of (x) has the same
distribution as $K(x + k)$. The outstanding benefit has present value $v^{K(x+k)+1}$ while
the present value of the remaining premium income is $\ddot a_{\overline{K(x+k)+1}|}$ times the annual
premium payment. Denote by ${}_kL$ the random variable which denotes the size of the
future loss at time $k$. Then
$${}_kL = v^{K(x+k)+1} - P_x\,\ddot a_{\overline{K(x+k)+1}|}.$$
The reserve, denoted in this case by ${}_kV_x$, is the expectation of this loss variable.
Hence
$${}_kV_x = E[{}_kL] = A_{x+k} - P_x\,\ddot a_{x+k}.$$
This is called the prospective reserve formula, since it is based on a look at the
future performance of the insurance portfolio.
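A small numerical sketch of the prospective reserve formula may help. It builds the insurance and annuity values at each age by the one step recursions and then forms the reserve at each duration. The mortality rates and function names are hypothetical and are not the life table of these notes.

```python
# Prospective net level premium reserves for a fully discrete whole life
# policy of amount 1. The mortality rates are hypothetical and end with 1.0.

def insurance_and_annuity(q, i):
    """Return lists (A, a_due) of values at each age, by backward recursion."""
    v = 1.0 / (1.0 + i)
    n = len(q)
    A = [0.0] * (n + 1)
    a_due = [0.0] * (n + 1)
    for k in range(n - 1, -1, -1):
        A[k] = v * (q[k] + (1.0 - q[k]) * A[k + 1])
        a_due[k] = 1.0 + v * (1.0 - q[k]) * a_due[k + 1]
    return A[:n], a_due[:n]

def prospective_reserves(q, i):
    """Reserve at duration k: (insurance value at age x+k) - P * (annuity value)."""
    A, a_due = insurance_and_annuity(q, i)
    P = A[0] / a_due[0]                      # net level premium at issue
    return [A[k] - P * a_due[k] for k in range(len(q))]

toy_q = [0.10, 0.20, 0.50, 1.00]             # hypothetical table
reserves = prospective_reserves(toy_q, 0.05)
print([round(r, 6) for r in reserves])
```

The reserve at duration 0 comes out as zero, as the equivalence principle requires, and the later values increase toward the benefit amount.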
Certain timing assumptions regarding disbursements and receipts have been
made in the previous computation. Such assumptions are always necessary, so they
are now made explicit. Assume that a premium payment which is due at time t is
paid at time t; an endowment benefit due at time t is paid at time t; a death benefit
payment due at time t is assumed to be paid at time t−, that is, just before time t.
Interest earned for the period is received at time $t-$. Thus ${}_tV_x$ might more properly
be denoted ${}_{t-}V_x$. Also assume that the premium charged is the net level premium.
Therefore the full technical description of what has been computed is the net level
premium terminal reserve. One can also compute the net level premium initial
reserve which is the reserve computed right at time t. This initial reserve differs
from the terminal reserve by the amount of premium received at time t and the
amount of the endowment benefit paid at time t. Ordinarily one is interested only
in the terminal reserve.
In the remainder of this section methods of computing the net level premium
terminal reserve are discussed. For succinctness, the term ‘reserve’ is always taken
to mean the net level premium terminal reserve unless there is an explicit statement
to the contrary.
Exercise 19–1. Show that ${}_kV_x = 1 - \dfrac{\ddot a_{x+k}}{\ddot a_x}$. From this $\lim_{k\to\infty} {}_kV_x = 1$. Why is this
reasonable?
Exercise 19–2. Use the Life Table to compute the reserve for the first five years
after policy issue for a fully discrete whole life policy to (20). Assume the policy
amount is equal to $100,000 and the premium is the net premium.
Example 19–3. Let us examine the expected cash flow associated with a whole life
policy issued to (x). Assume the premium is the net level premium and that the
policy is fully discrete. In policy year k + 1 (that is in the time interval [k, k + 1))
there are the following expected cash flows.
This final cash on hand at time k+1− must be equal to the reserve for the policies
of the survivors. Thus
This point of view makes it easy to see that there are other ways to compute
the reserve. First the reserve may be viewed as maintaining the balance between
income and expenses. Since at time 0 the reserve is 0 (because of the equivalence
principle) the reserve can also be viewed as balancing past income and expenses.
This leads to the retrospective reserve formula
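$${}_kE_x\,{}_kV_x = P_x\,\ddot a_{x:\overline{k}|} - A^1_{x:\overline{k}|}.$$
To derive this formula, note that
$$A_x = A^1_{x:\overline{k}|} + v^k\,{}_kp_x\,A_{x+k}$$
and
$$\ddot a_x = \ddot a_{x:\overline{k}|} + v^k\,{}_kp_x\,\ddot a_{x+k}.$$
Since the reserve at time 0 is zero,
$$0 = A_x - P_x\,\ddot a_x = A^1_{x:\overline{k}|} + v^k\,{}_kp_x\,A_{x+k} - P_x\big(\ddot a_{x:\overline{k}|} + v^k\,{}_kp_x\,\ddot a_{x+k}\big)$$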
where k is an arbitrary positive integer. Rearranging terms and using the prospective
formula for the reserve given above produces the retrospective reserve formula.
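The agreement of the prospective and retrospective formulas can be checked numerically. The sketch below values every piece at issue by direct summation and then divides by the pure endowment factor. The mortality rates and the helper names are hypothetical, chosen only for illustration.

```python
# Numerical check that the prospective and retrospective reserves agree.
# Mortality rates are hypothetical and end with 1.0.

def reserve_both_ways(q, i, k):
    """Return (prospective, retrospective) reserve at integer duration k.

    k must be small enough that the probability of surviving k years is positive.
    """
    v = 1.0 / (1.0 + i)
    n = len(q)
    surv = [1.0]                              # surv[t] = t-year survival probability
    for qt in q:
        surv.append(surv[-1] * (1.0 - qt))

    def insurance(start, stop):
        # value at issue of a benefit of 1 for deaths in years start+1, ..., stop
        return sum(v ** (t + 1) * surv[t] * q[t] for t in range(start, stop))

    def annuity(start, stop):
        # value at issue of payments of 1 at times start, ..., stop - 1
        return sum(v ** t * surv[t] for t in range(start, stop))

    P = insurance(0, n) / annuity(0, n)       # net level premium
    kE = v ** k * surv[k]                     # pure endowment factor
    prospective = (insurance(k, n) - P * annuity(k, n)) / kE
    retrospective = (P * annuity(0, k) - insurance(0, k)) / kE
    return prospective, retrospective

toy_q = [0.10, 0.20, 0.50, 1.00]              # hypothetical table
for duration in (1, 2, 3):
    print(duration, reserve_both_ways(toy_q, 0.05, duration))
```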
Problem 19–2. Find a formula for the reserve at the end of 5 years for a 10 year
term policy with benefit $1 issued to (30) on a net single premium basis.
This is called the premium difference formula for reserves. Find similar formulas
for the other types of insurance.
Problem 19–6. Given that ${}_{10}V_{35} = 0.150$ and that ${}_{20}V_{35} = 0.354$ find ${}_{10}V_{45}$.
Problem 19–8. For a general fully discrete insurance let us suppose that the benefit
payable if death occurs in the time interval $(h - 1, h]$ is $b_h$ and that this benefit is
paid at time $h$, that is, at the end of the year of death. Suppose also that the premium
paid for this policy at time $h$ is $\pi_h$. Show that for $0 \le t \le 1$
Problem 19–9. In the notation of the preceding problem show that for 0 ≤ t ≤ 1
Problem 19–10. Suppose that $1000\,{}_t\bar V(\bar A_x) = 100$, $1000\,\bar P(\bar A_x) = 10.50$, and $\delta = 0.03$.
Find $\bar a_{x+t}$.
Problem 19–11. Calculate ${}_{20}V_{45}$ given that $P_{45} = 0.014$, $P_{45:\overset{1}{\overline{20}|}} = 0.022$, and
$P_{45:\overline{20}|} = 0.030$.
Problem 19–12. A fully discrete life insurance issued to (35) has a death benefit of
$2500 in year 10. Reserves are calculated at $i = 0.10$ and the net annual premium
$P$. Calculate $q_{44}$ given that ${}_9V + P = {}_{10}V = 500$.
Solutions to Problems
Problem 19–1. Use the prospective formula and $A_{x:\overline{n}|} + d\,\ddot a_{x:\overline{n}|} = 1$ to see the
formula is true. When $k = n$ the reserve is 1 by the timing assumptions.
Problem 19–3. Use the prospective formula and the premium definitions.
Problem 19–6. Use the prospective formula and the relation $A_x + d\,\ddot a_x = 1$ to
obtain ${}_kV_x = 1 - \ddot a_{x+k}/\ddot a_x$.
Problem 19–7. The prospective and retrospective formulas are
${}^{40}_{20}V(A_{20}) = A_{40} - P\,\ddot a_{40:\overline{20}|}$ and ${}^{40}_{20}V(A_{20}) = P\,\ddot a_{20:\overline{20}|} - a^1_{20:\overline{20}|}$.
Problem 19–8. The value of the reserve, given survival, plus the present value
of the benefit, given death, must equal the accumulated value of the prior reserve
and premium.
Problem 19–10. Use the prospective reserve formula and the relationship
$\bar A_x + \delta\,\bar a_x = 1$.
Exercise 19–2. The reserve amounts are easily computed using the previous
exercise as $100000\,{}_1V_{20} = 100000(1 - 19.014/19.087) = 382.46$, $100000\,{}_2V_{20} = 775.40$,
$100000\,{}_3V_{20} = 1184.06$, $100000\,{}_4V_{20} = 1613.67$, and $100000\,{}_5V_{20} = 2064.24$.
1. Return to Laboratory 7 and find the reserve at the end of each policy year
for the term policy of problem 1 at an interest rate of 5%. Begin by deriving a
recurrence relation between the reserves for successive years.
2. Find the reserve at the end of each policy year for the term policy with de-
clining benefits given in problem 3 of Laboratory 7. Begin by deriving a recurrence
relation between the reserves for successive years.
The study of the basic aspects of life insurance is now complete. Two different
but similar directions will now be followed in the ensuing sections. On the one
hand, types of insurance in which the benefit is paid contingent on the death or
survival of more than one life will be examined. On the other hand, the effects of
competing risks on the cost of insurance will be studied.
The first area of study will be insurance in which the time of the benefit payment
depends on more than one life. For convenience of speech a status will refer to
any collection of objects for which there is a definition of survival and death (or
decrement). The simplest type of status is the single life status. The single life
status (x) dies exactly when (x) does. Another simple status is the certain status $\overline{n}|$.
This status dies at the end of $n$ years. The joint life status for the $n$ lives $(x_1), \ldots, (x_n)$
is the status which survives until the first member of the group dies. This status
is denoted by $(x_1 x_2 \ldots x_n)$. The last survivor status, denoted by $(\overline{x_1 x_2 \ldots x_n})$, is the
status which survives as long as at least one member of the group survives.
When discussing a given status the question naturally arises as to how one
would issue insurance to such a status. It is easy to see that if one assumes that
the constituents of the status die independently this problem can be easily solved in
terms of what is already known.
Example 21–1. Consider a fully discrete whole life policy issued to the joint status
(xy). The net annual premium to be paid for such a policy is computed as follows.
Using the obvious notation, the premium, P, must satisfy
$$A_{xy} = P\,\ddot a_{xy}.$$
Here
$$A_{xy} = E[v^{K(x)\wedge K(y)+1}]$$
and
$$\ddot a_{xy} = \frac{1 - A_{xy}}{d}$$
which are obtained as previously.
Exercise 21–1. Obtain an expression for Axy in terms that can be computed from
the life table.
In addition to these new types of status, insurance in which the benefit is paid
only if the status fails in a certain way can be considered. For example, the benefit
for the joint life status (xy) may depend on whether (x) or (y) fails first. This is a
simple case of what is known as a contingent insurance. Again, if the lives are
assumed to fail independently it is a simple matter to reduce computations involving
contingent insurance to the cases already considered. An example of contingent
insurance is the case of term insurance. Here the status is $\overset{1}{x}:\overline{n}|$, which dies if and
only if (x) dies before time n.
§21: Multiple Lives 82
Problems
$${}_tp_{\overline{xy}} = {}_tp_{xy} + {}_tp_x(1 - {}_tp_y) + {}_tp_y(1 - {}_tp_x).$$
Problem 21–2. Suppose $\mu_x = 1/(110 - x)$ for $0 \le x < 110$. Find ${}_{10}p_{20:30}$, ${}_{10}p_{\overline{20:30}}$,
and $\mathring e_{20:30}$.
Problem 21–3. Find an expression for the actuarial present value of a deferred
annuity of $1 payable at the end of any year as long as either (20) or (25) is living
after age 50.
Problem 21–4. Find the actuarial present value of a 20 year annuity due which
provides annual payments of $50,000 while both (x) and (y) survive, reducing by
1/2 on the death of (x) and by 1/4 on the death of (y).
Problem 21–8. In a mortality table which follows Makeham’s Law you are given
$A = 0.003$ and $c^{10} = 3$. Calculate ${}_\infty q_{\overset{1}{40}:50}$ if $\mathring e_{40:50} = 17$.
Problem 21–9. If the probability that (35) will survive for 10 years is a and the
probability that (35) will die before (45) is b, what is the probability that (35) will
die within 10 years after the death of (45)? Assume the lives are independent.
Solutions to Problems
Problem 21–1. ${}_tp_{\overline{xy}} = P[[K(x) \ge t] \cup [K(y) \ge t]] = P[K(x) \ge t, K(y) \ge t]
+ P[K(x) \ge t, K(y) < t] + P[K(x) < t, K(y) \ge t]$.
Problem 21–2. From the form of the force of mortality, DeMoivre’s Law
holds.
desired probability is 1 − a/ 2 − b.
Solutions to Exercises
Exercise 21–1. Using the independence gives ${}_tp_{xy} = {}_tp_x\,{}_tp_y$, so that
$A_{xy} = E[v^{K(xy)+1}] = \sum_{k=0}^{\infty} v^{k+1}({}_kp_{xy} - {}_{k+1}p_{xy}) = \sum_{k=0}^{\infty} v^{k+1}({}_kp_x\,{}_kp_y - {}_{k+1}p_x\,{}_{k+1}p_y)$.
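The joint life computation under independence can be sketched as follows. The two mortality tables and the function names are hypothetical and serve only to illustrate the summation; each table is assumed to end with a death probability of 1.

```python
# Joint life insurance and annuity values for independent lives.
# Both mortality tables are hypothetical and end with q = 1.0.

def survival_curve(q):
    """Return [0_p, 1_p, ..., n_p] built from one-year death probabilities."""
    curve = [1.0]
    for qk in q:
        curve.append(curve[-1] * (1.0 - qk))
    return curve

def joint_life_values(q_x, q_y, i):
    """Return (A_xy, a_due_xy) using k_p_xy = k_p_x * k_p_y."""
    v = 1.0 / (1.0 + i)
    p_x = survival_curve(q_x)
    p_y = survival_curve(q_y)
    n = min(len(q_x), len(q_y))        # the joint status cannot outlive either life
    p_xy = [p_x[k] * p_y[k] for k in range(n + 1)]
    A_xy = sum(v ** (k + 1) * (p_xy[k] - p_xy[k + 1]) for k in range(n))
    a_due_xy = sum(v ** k * p_xy[k] for k in range(n))
    return A_xy, a_due_xy

toy_q_x = [0.10, 0.20, 0.50, 1.00]
toy_q_y = [0.05, 0.10, 0.30, 0.60, 1.00]
A_xy, a_due_xy = joint_life_values(toy_q_x, toy_q_y, 0.05)
P_xy = A_xy / a_due_xy                 # net annual premium for the joint status
print(round(A_xy, 6), round(a_due_xy, 6), round(P_xy, 6))
```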
1. Find an expression for the net single premium for a whole life policy issued
to $(\overset{1}{x}y)$, a status which fails when (x) dies if $T(x) < T(y)$. Use this expression and
the life table data of Laboratory 3 to compute the premium for a $1,000,000 policy
issued to $(\overset{1}{30}:40)$. Use 6% as the interest rate.
2. Find an expression for the net single premium for a whole life policy issued to
$(x\overset{2}{y})$, where the benefit is paid on the death of (y) if $T(x) < T(y)$. Use this expression
and the life table data of Laboratory 3 to compute the premium for a $1,000,000
policy issued to $(30:\overset{2}{40})$. Use 6% as the interest rate.
3. Show that if X and Y are independent random variables and one of them is
absolutely continuous then P[X = Y] = 0. Hence under the standard assumptions of
this section no two people can die simultaneously.
4. One model for joint lives which allows for simultaneous death is the common
shock model. The intuition is that the two lives behave almost independently except
for the possibility of death by a common cause. The model is as follows. Let $T^*(x)$,
$T^*(y)$, and $Z$ be independent random variables. Assume that $T^*(x)$ and $T^*(y)$ have
the distribution of the remaining lifetimes of (x) and (y) as given by the life table.
The random variable $Z$ represents the time of occurrence of the common catastrophe
which will kill any survivors. The common shock model is that the true remaining
lifetimes of (x) and (y) are given as $T(x) = \min\{T^*(x), Z\}$ and $T(y) = \min\{T^*(y), Z\}$
respectively. What is the probability that (x) and (y) die simultaneously in this
model? What is the survival function for the joint life status (xy) in this model?
Answer these two questions in general, and then in the special case in which
$T^*(x)$, $T^*(y)$, and $Z$ have exponential distributions with parameters $\mu_x$, $\mu_y$ and $\mu_z$
respectively.
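A Monte Carlo sketch can serve as a sanity check on an analytic answer to the exponential case of the common shock model. The parameter values, trial count, seed, and function name below are arbitrary assumptions made for illustration.

```python
# Simulating the common shock model with exponential lifetimes.
# Parameter values, trial count and seed are arbitrary illustrative choices.
import random

def simulate_common_shock(mu_x, mu_y, mu_z, trials, seed=0):
    """Estimate the probability that (x) and (y) die simultaneously."""
    rng = random.Random(seed)
    simultaneous = 0
    for _ in range(trials):
        t_star_x = rng.expovariate(mu_x)   # lifetime of (x) absent the shock
        t_star_y = rng.expovariate(mu_y)   # lifetime of (y) absent the shock
        z = rng.expovariate(mu_z)          # time of the common catastrophe
        t_x = min(t_star_x, z)
        t_y = min(t_star_y, z)
        if t_x == t_y:                     # both lifetimes were cut off by the shock
            simultaneous += 1
    return simultaneous / trials

estimate = simulate_common_shock(0.02, 0.03, 0.01, 20000)
print(round(estimate, 4))
```

The estimate can be compared with the closed form obtained in the exercise; agreement to within sampling error is all that should be expected.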
In contrast to the case in which a status is defined in terms of multiple lives, the
way in which a single life fails can be studied. This point of view is particularly
important in the context of the analysis of pension plans. In such a setting a person
may withdraw from the workforce (a ‘death’) due to accident, death, or retirement.
Different benefits may be payable in each of these cases. Another common type
of insurance in which a multiple decrement model is appropriate is in the double
indemnity life policy. Here the benefit is twice the face amount of the policy if
the death is accidental. In actuarial parlance the termination of a status is called a
decrement and multiple decrement models will now be developed. These models
also go by the name of competing risk models in other contexts.
To analyze the new situation, introduce the random variable J = J(x) which is a
discrete random variable denoting the cause of decrement of the status (x). Assume
that J(x) has as possible values the integers 1, . . . , m. It is clear that all of the
information of interest is contained in the joint distribution of the random variables
T(x) and J(x). Note that this joint distribution is of mixed type since (as always) T(x)
is assumed to be absolutely continuous while J(x) is discrete. The earlier notation is
modified in a fairly obvious way to take into account the new model. For example,
$${}_tq^{(j)}_x = P[0 < T(x) \le t,\ J(x) = j]$$
and
$${}_tp^{(j)}_x = P[T(x) > t,\ J(x) = j].$$
Here ${}_\infty q^{(j)}_x$ gives the marginal density of $J(x)$. To discuss the probability of death
due to all causes the superscript $(\tau)$ is used. For example,
$${}_tq^{(\tau)}_x = P[T(x) \le t] = \sum_{j=1}^{m} {}_tq^{(j)}_x$$
and similar expressions for the survival probability and the force of mortality can
be obtained. Although ${}_tq^{(\tau)}_x + {}_tp^{(\tau)}_x = 1$ a similar equation for the individual causes
of death fails unless $m = 1$. For the force of mortality
$$\mu^{(\tau)}_{x+t} = \frac{f_{T(x)}(t)}{P[T(x) > t]}$$
and
$$\mu^{(j)}_{x+t} = \frac{f_{T(x)\,J(x)}(t, j)}{P[T(x) > t]}.$$
One must be slightly careful in the use (misuse) of these formulas. In particular,
while
$${}_tp^{(\tau)}_x = \exp\Big\{-\int_0^t \mu^{(\tau)}_{x+s}\, ds\Big\}$$
There are two basic methodologies used. If a large group of people for which
extensive records are maintained is available the actual survival data with the deaths
in each year of age broken down by cause would also be known. It is then very easy
to construct the multiple decrement table. This is seldom the case.
Example 23–1. An insurance company has a thriving business issuing life insurance
to coal miners. There are three causes of decrement (death): mining accidents, lung
disease, and other causes. From the company’s vast experience with coal miners a
decrement (life) table for these three causes of decrement is available. The company
now wants to enter the life insurance business for salt miners. Here the two causes of
decrement (death) are mining accidents and other. How can the information about
mining accidents for coal miners be used to get useful information about mining
accidents for salt miners?
To see how to proceed the multiple decrement process is examined in a bit more
detail. Some auxiliary quantities are introduced. Define
$${}_tp'^{(j)}_x = \exp\Big\{-\int_0^t \mu^{(j)}_{x+s}\, ds\Big\}$$
§23: Multiple Decrement Models 88
and
$${}_tq'^{(j)}_x = 1 - {}_tp'^{(j)}_x.$$
The “probability” ${}_tq'^{(j)}_x$ is called the net probability of decrement (or absolute
rate of decrement). It is these “probabilities” that represent the death rates in the
absence of competing risks. To see why this interpretation is reasonable, note that
$$\mu^{(j)}_{x+t} = \frac{f_{T(x)\,J(x)}(t, j)}{P[T(x) > t]}
= \frac{f_{X\,J(x)}(x + t, j)/s(x)}{s(x + t)/s(x)}
= \frac{f_{X\,J(x)}(x + t, j)}{P[X > x + t]}.$$
This shows that $\mu^{(j)}_{x+t}$ represents the rate of death due to cause $j$ among those surviving
up to time $x + t$.
This shows how one can pass from the absolute rate of decrement to total survival
probabilities. Note that this relationship implies that the rates are generally larger
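than the total survival probability. Then, under the assumption of constant force of
mortality for each decrement over each year of age in the multiple decrement table,
$$\begin{aligned}
q^{(j)}_x &= \int_0^1 {}_sp^{(\tau)}_x\, \mu^{(j)}_{x+s}\, ds \\
&= \int_0^1 {}_sp^{(\tau)}_x\, \mu^{(j)}_x\, ds \\
&= \frac{\mu^{(j)}_x}{\mu^{(\tau)}_x} \int_0^1 {}_sp^{(\tau)}_x\, \mu^{(\tau)}_x\, ds \\
&= \frac{\mu^{(j)}_x}{\mu^{(\tau)}_x}\, q^{(\tau)}_x \\
&= \frac{\log p'^{(j)}_x}{\log p^{(\tau)}_x}\, q^{(\tau)}_x.
\end{aligned}$$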
This solves the problem of computing the entries in a multiple decrement table
under the stated assumption about the structure of the causes of decrement in that
table.
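As a rough numerical illustration of the constant force formula, the following
Python sketch converts net probabilities of decrement into multiple decrement
probabilities. The function name and the input rates (0.02, 0.03, and 0.20) are
illustrative assumptions, not entries from an actual table. The sketch also uses
the fact that the total survival probability is the product of the net survival
probabilities, which holds because the total force of decrement is the sum of the
forces for the individual causes.

```python
import math

def multiple_decrement_constant_force(net_q):
    """Convert net probabilities q'_x^(j) into multiple decrement probabilities q_x^(j).

    Assumes a constant force of decrement for every cause over the year, so that
    q_x^(j) = (log p'_x^(j) / log p_x^(tau)) * q_x^(tau).
    """
    net_p = [1.0 - q for q in net_q]      # p'_x^(j) = 1 - q'_x^(j)
    p_tau = math.prod(net_p)              # total survival probability p_x^(tau)
    q_tau = 1.0 - p_tau                   # total probability of decrement
    log_p_tau = math.log(p_tau)
    return [math.log(p) / log_p_tau * q_tau for p in net_p]

# Hypothetical net rates for three causes of decrement.
net_rates = [0.02, 0.03, 0.20]
table_row = multiple_decrement_constant_force(net_rates)
print([round(v, 5) for v in table_row], round(sum(table_row), 5))
```

The entries of the computed row add up to the total probability of decrement, and
each entry is smaller than the corresponding net probability, as one would expect
when the causes compete.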
Exercise 23–1. Show that the same formula results if one assumes instead that the
time of decrement due to each cause of decrement in the multiple decrement table
has the uniform distribution over the year of decrement.
Exercise 23–2. Assume that two thirds of all deaths at any age are due to accident.
What is the net single premium for (30) for a double indemnity whole life policy?
How does this premium compare with that of a conventional whole life policy?
Example 23–2. Suppose we are designing a pension plan and that there are two
causes of decrement: death and retirement. In many contexts (such as teaching) it
is reasonable to assume that retirements all occur at the end of a year, while deaths
can occur at any time. How could we construct a multiple decrement table which
reflects this assumption?
\[ {}_t q_x^{\prime(j)} = t \, q_x^{\prime(j)}. \]
Exercise 23–3. Show that under this assumption we have \( {}_t p_x^{\prime(j)} \mu_{x+t}^{(j)} = q_x^{\prime(j)} \) for
\( 0 \le t \le 1 \). Hint: Compute \( \frac{d}{dt} \, {}_t p_x^{\prime(j)} \) in two different ways.
If this uniformity assumption is made for all causes of decrement it is then easy
to construct the multiple decrement table. The computations are illustrated for the
case of 2 causes of decrement. In this setting
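\[
\begin{aligned}
q_x^{(1)} &= \int_0^1 {}_s p_x^{(\tau)} \, \mu_{x+s}^{(1)} \, ds \\
&= \int_0^1 {}_s p_x^{\prime(1)} \, {}_s p_x^{\prime(2)} \, \mu_{x+s}^{(1)} \, ds \\
&= q_x^{\prime(1)} \int_0^1 {}_s p_x^{\prime(2)} \, ds \\
&= q_x^{\prime(1)} \int_0^1 \left(1 - s \, q_x^{\prime(2)}\right) ds \\
&= q_x^{\prime(1)} \left(1 - \tfrac{1}{2} q_x^{\prime(2)}\right)
\end{aligned}
\]
with a similar formula for \( q_x^{(2)} \). It is easy to see how this procedure could be modified
for different assumptions about the decrement in each single decrement table.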
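The two cause computation can be sketched in a few lines of Python. The function
name and the sample net rates are hypothetical; the only formula used is the one
obtained above under the uniformity assumption in each single decrement table.

```python
def udd_two_cause(q1_net, q2_net):
    """Multiple decrement probabilities for two causes.

    Assumes each cause is uniformly distributed over the year in its own
    single decrement table, so q^(1) = q'^(1) * (1 - q'^(2) / 2) and symmetrically.
    """
    q1 = q1_net * (1.0 - 0.5 * q2_net)
    q2 = q2_net * (1.0 - 0.5 * q1_net)
    return q1, q2

# Hypothetical net rates for the two causes.
q1, q2 = udd_two_cause(0.02, 0.10)
total = q1 + q2
print(q1, q2, total)
```

A useful consistency check is that the two entries add up to one minus the product
of the two net survival probabilities.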
Exercise 23–4. Construct a multiple decrement table in which the first cause of
decrement is uniformly distributed and the second cause has all decrements occur
at the end of the year. The pension plan described in the example above illustrates
the utility of this technique.
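and
\[ m_x^{(j)} = \frac{\int_0^1 {}_t p_x^{(\tau)} \, \mu_{x+t}^{(j)} \, dt}{\int_0^1 {}_t p_x^{(\tau)} \, dt} \]
and
\[ m_x^{\prime(j)} = \frac{\int_0^1 {}_t p_x^{\prime(j)} \, \mu_{x+t}^{(j)} \, dt}{\int_0^1 {}_t p_x^{\prime(j)} \, dt}. \]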
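The central rate bridge is based on the following approximations. First, under the
UDD assumption in each single decrement table
\[ m_x^{\prime(j)} = \frac{q_x^{\prime(j)}}{1 - \frac{1}{2} q_x^{\prime(j)}}. \]
Second, under the UDD assumption in the multiple decrement table
\[ m_x^{(j)} = \frac{q_x^{(j)}}{1 - \frac{1}{2} q_x^{(\tau)}}. \]
Thirdly, under the constant force assumption in the multiple decrement table
\[ m_x^{(j)} = \mu_x^{(j)} = m_x^{\prime(j)}. \]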
Now assume that all of these equalities are good approximations in any case. This
assumption provides a way of connecting the single and multiple decrement tables.
There is no guarantee of the internal consistency of the quantities computed in this
way, since, in general, the three assumptions made are not consistent with each
other. The advantage of this method is that the computations are usually simpler
than for any of the ‘exact’ methods.
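One way the central rate bridge might be carried out is sketched below in Python.
The function name and input rates are hypothetical. The sketch adds one step not
spelled out above: summing the multiple decrement relation over all causes gives
the total probability of decrement as the total central rate divided by one plus
half the total central rate, which then yields each individual entry.

```python
def central_rate_bridge(net_q):
    """Approximate q_x^(j) from net probabilities q'_x^(j) via central rates.

    Step 1: m'^(j) = q'^(j) / (1 - q'^(j)/2)      (UDD in each single decrement table)
    Step 2: m^(j)  = m'^(j)                        (constant force equality, used as an approximation)
    Step 3: q^(j)  = m^(j) * (1 - q^(tau)/2)       (UDD in the multiple decrement table),
            where summing over j gives q^(tau) = m^(tau) / (1 + m^(tau)/2).
    """
    central = [q / (1.0 - 0.5 * q) for q in net_q]
    m_tau = sum(central)
    return [m / (1.0 + 0.5 * m_tau) for m in central]

# Hypothetical net rates for three causes of decrement.
bridge_row = central_rate_bridge([0.02, 0.03, 0.20])
exact_total = 1 - 0.98 * 0.97 * 0.80
print([round(v, 5) for v in bridge_row], round(sum(bridge_row), 5), round(exact_total, 5))
```

The total obtained this way differs slightly from one minus the product of the net
survival probabilities, which illustrates the lack of internal consistency
mentioned in the text.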
Exercise 23–5. Show that each of the above equalities holds under the stated
assumptions.
Problems
Problem 23–1. Assume that each decrement has a uniform distribution over each
year of age in the multiple decrement table to construct a multiple decrement table
from the following data.
Problem 23–2. Rework the preceding exercise using the central rate bridge. How
different is the multiple decrement table?
Problem 23–3. In a double decrement table where cause 1 is death and cause 2 is
withdrawal it is assumed that deaths are uniformly distributed over each year of age
while withdrawals between ages h and h + 1 occur immediately after attainment of
age h. In this table one sees that \( l_{50}^{(\tau)} = 1000 \), \( q_{50}^{(2)} = 0.24 \), and \( d_{50}^{(1)} = 0.06 \, d_{50}^{(2)} \). What
is \( q_{50}^{\prime(1)} \)? How does your answer change if all withdrawals occur at midyear? At the
end of the year?
Problem 23–4. How would you construct a multiple decrement table if you were
given \( q_x^{\prime(1)} \), \( q_x^{\prime(2)} \), and \( q_x^{(3)} \)? What assumptions would you make, and what formulas
would you use? What if you were given \( q_x^{\prime(1)} \), \( q_x^{(2)} \), and \( q_x^{(3)} \)?
Solutions to Problems
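Problem 23–1. First, \( p_{62}^{(\tau)} = (.98)(.97)(.80) \) and \( q_{62}^{(\tau)} = 1 - p_{62}^{(\tau)} \). Also
\( p_{62}^{\prime(j)} = 1 - q_{62}^{\prime(j)} \). From the relation
\( q_{62}^{(j)} = \frac{\log p_{62}^{\prime(j)}}{\log p_{62}^{(\tau)}} \, q_{62}^{(\tau)} \) the first row of the multiple
decrement table can be found.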
Problem 23–3. From the information \( d_{50}^{(2)} = 240 \) and \( d_{50}^{(1)} = 14 \). Since
withdrawals occur at the beginning of the year there are 1000 − 240 = 760 people
under observation of whom 14 die. So \( q_{50}^{\prime(1)} = 14/760 \). If withdrawals occur at
year end all 1000 had a chance to die so \( q_{50}^{\prime(1)} = 14/1000 \). With withdrawals at
midyear \( q_{50}^{(1)} = q_{50}^{\prime(1)}/2 \) so \( q_{50}^{\prime(1)} = 28/1000 \).
Problem 23–4. The central rate bridge could be used. Is there an exact method
available?
Solutions to Exercises
Exercise 23–1. The assumption is that \( {}_t q_x^{(j)} = t \, q_x^{(j)} \) for all j. Hence
\( {}_t p_x^{(\tau)} = 1 - t \, q_x^{(\tau)} \) and
\( \mu_{x+s}^{(j)} = \frac{d}{ds} \, {}_s q_x^{(j)} / {}_s p_x^{(\tau)} = q_x^{(j)} / {}_s p_x^{(\tau)} \). Substitution and integration gives
\( p_x^{\prime(j)} = e^{-\int_0^1 \mu_{x+s}^{(j)} \, ds} = (1 - q_x^{(\tau)})^{q_x^{(j)}/q_x^{(\tau)}} \). Since \( p_x^{(\tau)} = 1 - q_x^{(\tau)} \), the result follows by
substitution.
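Exercise 23–4. Since cause 1 obeys UDD, \( q_x^{(1)} = q_x^{\prime(1)} \int_0^1 {}_s p_x^{\prime(2)} \, ds \) as in the
derivation above. For cause 2, \( {}_s p_x^{\prime(2)} = 1 \) for s < 1, so \( q_x^{(1)} = q_x^{\prime(1)} \). For cause 2
proceed as in the derivation above to get
\( q_x^{(2)} = -\int_0^1 {}_s p_x^{\prime(1)} \, \frac{d}{ds} \, {}_s p_x^{\prime(2)} \, ds \). Now \( {}_s p_x^{\prime(2)} \) is
constant except for a jump of size \( -q_x^{\prime(2)} \) at s = 1. Hence the integral picks up only
the value of \( {}_s p_x^{\prime(1)} \) at s = 1, so that
\( q_x^{(2)} = q_x^{\prime(2)} \, p_x^{\prime(1)} = q_x^{\prime(2)} (1 - q_x^{\prime(1)}) \).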
Exercise 23–5. Under UDD in the single decrement table \( {}_t p_x^{\prime(j)} = 1 - t \, q_x^{\prime(j)} \) and
\( {}_t p_x^{\prime(j)} \mu_{x+t}^{(j)} = q_x^{\prime(j)} \) so
\( m_x^{\prime(j)} = \int_0^1 q_x^{\prime(j)} \, dt \Big/ \int_0^1 (1 - t \, q_x^{\prime(j)}) \, dt = q_x^{\prime(j)} / (1 - \frac{1}{2} q_x^{\prime(j)}) \). Under
UDD in the multiple decrement table \( \mu_{x+s}^{(j)} = q_x^{(j)} / {}_s p_x^{(\tau)} \) so that substitution gives
the result. Under the constant force assumption in the multiple decrement table,
\( \mu_{x+s}^{(j)} = \mu_x^{(j)} \) for all j and \( m_x^{(j)} = \mu_x^{(j)} = m_x^{\prime(j)} \) by substitution.
§24. Laboratory 10
The service table at the end of these notes contains information about a group
of workers. There are 4 causes of decrement for this population. The first cause
is death (d), the second cause is withdrawal (w) (termination of employment), the
third cause is incapacity (i), and the fourth cause is retirement (r).
1. Suppose a concerted effort by the company reduces the rate of on the job
injury (incapacity) by 1/ 3 at all ages. Recompute the entries in the service table.
The basic study of insurance is now complete. Some of the more advanced
ideas connected with insurance will now be examined. A very basic question that
arises is this. Why would an insurer assume a risk which the insured is unwilling to
assume?
As already noted above, different people may well have different utility functions.
What features should utility functions have in common? One typical assumption
is that for any individual the utility function should be non–decreasing.
This property is the mathematical expression of the fact that having more money
is ‘better.’ Furthermore one would also expect that as one’s wealth increases the
utility of an additional dollar should decrease. This expresses the notion that to
someone having only $10 the prospect of gaining an additional dollar is more attractive
than the same prospect is to someone who already has $1,000,000.
Exercise 25–1. Argue that if the utility function is sufficiently smooth then u′ (w) ≥
0 and u′′ (w) ≤ 0. Thus a utility function will be concave (down).
A real valued function u(w) is said to be a risk averse utility function if u(w) is
non–decreasing and concave down.
The reason for the terminology risk averse will be explained shortly.
Example 25–2. Suppose a person with a risk averse utility function has current
wealth w and is given a choice between two investment schemes. In the first
Copyright 2001 Jerry Alan Veeh. All rights reserved.
§25: Utility Functions 97
scheme the person will receive an amount b outright. In the second scheme the
person will receive a random amount W where E[W] = b. How does the person
decide between these two alternatives?
To answer this question, decision makers will be assumed to act according to the
Expected Utility Principle: a decision maker always chooses the available option
with the largest expected utility.
Example 25–3. In the previous example the expected utility of the first investment
scheme is E[u(w+b)] = u(w+b), while the expected utility of the second investment
scheme is E[u(w + W)].
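proof : This proof is valid under the additional assumption that the function f is twice continuously
differentiable. For any fixed x and a, the Fundamental Theorem of Calculus and an
integration by parts gives
\[
\begin{aligned}
f(x) &= f(a) + \int_a^x f'(t) \, dt \\
&= f(a) + f'(a)(x - a) + \int_a^x (x - t) f''(t) \, dt \\
&\le f(a) + f'(a)(x - a),
\end{aligned}
\]
where the last step uses the fact that \( f'' \le 0 \) for a concave function, so that the integral
term is never positive. Taking expectations of both sides of this last inequality, with the
random variable in place of x and its mean in place of a, proves the theorem.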
Exercise 25–2. Under what conditions does equality hold in Jensen’s inequality?
Continuing the example, from Jensen’s inequality and the fact that a risk averse
utility function is concave
\[ E[u(w + W)] \le u(E[w + W]) = u(w + b) \]
so that such a person will always select the sure payoff b! Would such a person buy
a lottery ticket?
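A small numerical check of this conclusion can be sketched in Python. The
logarithmic utility function, the wealth of 100, and the two point gamble are all
hypothetical choices made only for illustration; any non–decreasing concave
utility would serve.

```python
import math

def expected_utility(u, wealth, outcomes):
    """Expected utility of wealth + W, where W is discrete and given as (value, probability) pairs."""
    return sum(prob * u(wealth + value) for value, prob in outcomes)

w = 100.0                              # hypothetical current wealth
gamble = [(0.0, 0.5), (20.0, 0.5)]     # hypothetical random amount W
b = sum(value * prob for value, prob in gamble)   # sure amount with b = E[W]

sure_utility = math.log(w + b)                        # utility of taking b outright
risky_utility = expected_utility(math.log, w, gamble) # expected utility of taking W

print(sure_utility, risky_utility, sure_utility >= risky_utility)
```

Under these assumptions the sure payoff has the larger expected utility, in line
with Jensen’s inequality.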
This analysis applies to the person seeking insurance. Now analyze the position
of the insurer. Assume that the insurer has utility function \( u_i \) and current wealth
\( w_i \). Reasoning as before shows that the insurance company will offer insurance at a
premium P if
\[ u_i(w_i) \le E[u_i(w_i + P - A)]. \]
Exercise 25–3. What is the situation if both the person and the insurer have linear
utility functions?
Exercise 25–4. What happens if the insurer and the person have the same exponential
utility function \( -e^{-w/1000} \), the person’s wealth is $5,000, the insurer’s wealth is
$5,000,000 and the loss variable A has an exponential distribution with mean $500?
Exercise 25–5. What happens in the previous exercise if the utility function is
log w?
Jensen’s inequality can provide some interesting information about the conditions
under which a person will purchase insurance. Consider the largest premium
that an individual would pay for insurance. This premium P must satisfy
\[ u(w - P) = E[u(w - A)]. \]
Using Jensen’s inequality on the right member gives
\[ u(w - P) \le u(w - E[A]) \]
and since the utility function is non–decreasing, \( P \ge E[A] \). Thus a risk averse
decision maker would be willing to pay a premium greater than the pure premium
(expected loss) in order to obtain insurance. Similarly the premium charged by the
insurance company must also exceed the pure premium. These two facts reinforce
intuition and lend a certain credibility to the analysis and assumptions.
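For an exponential utility function the largest acceptable premium can be written
down explicitly: solving u(w − P) = E[u(w − A)] with u(w) = −exp(−αw) gives
P = ln(M_A(α))/α, where M_A is the moment generating function of the loss. The
Python sketch below evaluates this for an exponentially distributed loss. The
function names are hypothetical, and the parameter values (α = 1/1000 and a mean
loss of 500) are chosen only as an illustration.

```python
import math

def exponential_mgf(mean):
    """Moment generating function of an exponential distribution with the given mean."""
    def mgf(t):
        if t * mean >= 1.0:
            raise ValueError("the MGF is infinite for t >= 1/mean")
        return 1.0 / (1.0 - mean * t)
    return mgf

def max_premium_exponential_utility(alpha, mgf_loss):
    """Largest premium P with u(w - P) = E[u(w - A)] when u(w) = -exp(-alpha * w).

    The wealth w cancels from both sides, leaving P = log(M_A(alpha)) / alpha.
    """
    return math.log(mgf_loss(alpha)) / alpha

mean_loss = 500.0
premium = max_premium_exponential_utility(1.0 / 1000.0, exponential_mgf(mean_loss))
print(round(premium, 2), premium > mean_loss)
```

The computed premium exceeds the expected loss, as the argument above predicts,
and it does not depend on the wealth of the decision maker.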
where the fact that u′ is decreasing has been used. Taking expectations gives the result,
since the expectation of the last term is zero (why?).
Problems
Problem 25–1. Consider a game of chance in which a fair coin is tossed until a head
appears. Let N denote the toss on which this occurs. Find the density, expectation,
and variance of N. Suppose a prize of \( X = 2^N \) is paid when a head first appears.
What is E[X]? If your utility function is u(w) = log(w) what is the expected utility
of the prize?
Problem 25–2. In a typical state lotto game a person who buys a $1 ticket wins
$1,000,000 with probability \( 10^{-8} \). Find a possible utility function for a person who
plays the lotto. Can this person have a risk averse utility function?
Problem 25–3. Suppose a person has a utility function which is increasing and
concave up. Give an example to explain why such a person might be called a risk
lover.
Problem 25–4. Show that if the amount of random loss X has a uniform distribution
on the interval (0, 1) and if the insured has utility function u(w) = log(w) then the
maximum amount the insured will pay as a premium is
\[ w - \frac{w^w}{e \, (w - 1)^{w-1}}. \]
Problem 25–5. Repeat the previous problem if the utility function is \( u(w) = -e^{-\alpha w} \)
and the loss random variable X has a chi square distribution with n degrees of
freedom.
Problem 25–6. Use the Taylor expansion of u(w − x) about x = µ to show that the
maximum premium an insured will pay is approximately
\[ \mu - \frac{1}{2} \, \frac{u''(w - \mu)}{u'(w - \mu)} \, \sigma^2. \]
Here \( \mu \) and \( \sigma^2 \) are the mean and variance of the loss random variable.
Problem 25–7. A group medical insurance policy pays $D each time a member of
the group is hospitalized. The group consists of g distinct subgroups, which differ in
the rate of hospitalization. Suppose that the annual number of hospital admissions
for subgroup i has a Poisson distribution with parameter λi . Find an expression
for the expected claims payments in one year, and also find the distribution of the
number of admissions in one year.
Problem 25–8. Suppose the loss random variable X has an exponential distribution
with mean 10. Suppose a premium of $5 will be paid. Show that the proportional
insurance policy with benefit X/2 and the stop loss policy with benefit
\( (X - 10 \log 2) \, 1_{[10 \log 2, \infty)}(X) \) are both feasible insurance policies. Which policy would
the insured choose, and why?
Problem 25–9. True or False: A person with an exponential utility function, \( -e^{-\alpha w} \),
considers wealth irrelevant when deciding the maximum premium to pay for complete
protection against a random loss.
Problem 25–10. An insurer with wealth w insures a loss X which has the following
probability distribution:
\[ P[X = 0] = P[X = 16] = \frac{1}{2}. \]
The insurer’s utility function is u(x) = log x. The insurer is willing to pay a maximum
of 6 to a reinsurer who accepts 50 percent of the loss. Find w.
Problem 25–12. A decision maker has utility function \( u(x) = -e^{-3x} \) and initial
wealth w. The decision maker faces two random losses. The loss X has a normal
distribution with mean α and variance 4. The loss Y has a normal distribution with
mean 10 and variance 8. Determine the maximum value of α for which the decision
maker prefers X to Y.
Problem 25–13. Three insurers have identical utility functions u(w) = log w, w > 0,
and a wealth of 36, 25, and 16 respectively. All three companies insure the same
risk. In the event of a loss each insurer will pay 11. The probability of loss is
0.5. Another company offers to reinsure each company’s complete risk at the same
premium π . Each company is willing to accept the reinsurance at the premium π
if its expected utility is maximized. Determine π such that the reinsurer maximizes
its total expected profit.
Solutions to Problems
Problem 25–1. Here \( P[N = n] = 2^{-n} \) for n = 1, 2, .... Hence E[X] = +∞, and
\[ E[\log(X)] = \sum_{n=1}^{\infty} n \log(2) \, 2^{-n} \approx 1.386. \]
Problem 25–9. True. The variable w cancels from both sides of the equation.
Solutions to Exercises
Exercise 25–1. Since more wealth is better, u(w) is increasing. Thus u′ (w) ≥ 0.
Since the utility of an additional dollar decreases as wealth increases, u(w + 1) −
u(w) ≈ u′ (w) is decreasing. Thus u′′ (w) = (u′ (w))′ ≤ 0.
Exercise 25–2. Equality holds if f ′′ (x) = 0 for all x, which means f (x) is a
linear function of x.
Exercise 25–3. The person requires P ≤ E[A] while the insurer requires
P ≥ E[A] so that the only possible premium is P = E[A].
Exercise 25–4. In this case the wealth of each party is irrelevant and the only
common premium is \( P = 1000 \ln E[e^{A/1000}] = 1000 \ln 2 \approx 693.15 \), since
\( E[e^{A/1000}] = 1/(1 - 500/1000) = 2 \).
In the individual risk model the insurer’s total risk S is assumed to be expressible
in the form
\[ S = X_1 + \cdots + X_n \]
There are two major difficulties in using the individual risk model. The first is
to find at least a reasonable approximation to the probabilistic properties of the loss
random variables Xi . This can often be done using data from the past experience of
the company.
Example 26–1. For short term disability insurance the amount paid by the insurance
company can often be modeled as X = cY where c is a constant representing the
(daily) rate of disability payments and Y is the number of days a person is disabled.
One then is simply interested in modelling the random variable Y. Historical data
can be used to estimate P[Y > y]. In this context P[Y > y], which was previously
called the survival function, is referred to as the continuance function. The same
notion can be used for the daily costs of a hospitalization policy.
Example 26–2. Suppose X and Y are independent random variables each having
§26: The Individual Risk Model 105
the exponential distribution with parameter 1. By conditioning
\[
\begin{aligned}
P[X + Y \le t] &= \int_{-\infty}^{\infty} P[X + Y \le t \mid Y = y] \, f_Y(y) \, dy \\
&= \int_0^t P[X \le t - y] \, f_Y(y) \, dy \\
&= \int_0^t \left(1 - e^{-(t-y)}\right) f_Y(y) \, dy \\
&= 1 - e^{-t} - t e^{-t}
\end{aligned}
\]
for t ≥ 0.
This argument has actually shown that if X and Y are independent and absolutely
continuous then
\[ F_{X+Y}(t) = \int_{-\infty}^{\infty} F_X(t - y) \, f_Y(y) \, dy. \]
This last integral is called the convolution of the two distribution functions and is
often denoted \( F_X * F_Y \).
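The convolution formula lends itself to a quick numerical check. The Python
sketch below approximates the integral by the midpoint rule for two independent
exponential random variables with parameter 1 and compares the result with the
closed form found in the example. The function names, the step count, and the
evaluation point t = 2 are arbitrary illustrative choices.

```python
import math

def convolution_cdf(cdf_x, pdf_y, t, steps=2000):
    """Midpoint-rule approximation of F_{X+Y}(t) = integral of F_X(t - y) f_Y(y) dy.

    Assumes X and Y are independent and nonnegative, so the integrand
    vanishes outside the interval [0, t].
    """
    if t <= 0.0:
        return 0.0
    h = t / steps
    total = 0.0
    for i in range(steps):
        y = (i + 0.5) * h
        total += cdf_x(t - y) * pdf_y(y)
    return h * total

def exp_cdf(s):
    return 1.0 - math.exp(-s) if s > 0.0 else 0.0

def exp_pdf(y):
    return math.exp(-y) if y > 0.0 else 0.0

t = 2.0
numeric = convolution_cdf(exp_cdf, exp_pdf, t)
closed_form = 1.0 - math.exp(-t) - t * math.exp(-t)
print(numeric, closed_form)
```

The same routine could be reused for other nonnegative distributions by swapping
in a different distribution function and density.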
Exercise 26–1. If X and Y are absolutely continuous random variables show that
X + Y is also absolutely continuous and find a formula for the density function of
X + Y.
Exercise 26–2. Find a similar formula if X and Y are both discrete. Use this formula
to find the density of X + Y if X and Y are independent Bernoulli random variables
with the same success probability.
The importance of this theorem lies in the fact that the approximating normal
distribution does not depend on the detailed nature of the original distribution but
only on the first two moments. The accuracy of this approximation will be explored
in the exercises and laboratory.
Example 26–3. You are a claims adjuster for the Good Driver Insurance Company
of Auburn. Based on past experience the chance of one of your 1000 insureds being
involved in an accident on any given day is 0.001. Your typical claim is $500. What
is the probability that there are no claims made today? If you have $1000 cash on
hand with which to pay claims, what is the probability you will be able to pay all
of today’s claims? How much cash should you have on hand in order to have a
99% chance of being able to pay all of today’s claims? What assumptions have you
made? How reasonable are they? What does this say about the solvency of your
company?
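One possible way to approach these questions is sketched in Python below. It
rests on assumptions that the example leaves open: the insureds are taken to be
independent, each has at most one accident per day, and every claim is exactly
$500. Under these assumptions the number of claims is binomial with 1000 trials
and success probability 0.001. The helper function name is hypothetical.

```python
import math

def binomial_cdf(k, n, p):
    """P[N <= k] for a binomial random variable N with n trials and success probability p."""
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))

n_insureds, p_accident, claim_size = 1000, 0.001, 500

# Probability that no claims are made today.
p_no_claims = (1.0 - p_accident) ** n_insureds

# Probability that $1000 covers all claims, i.e. at most 2 claims occur.
p_cover_1000 = binomial_cdf(1000 // claim_size, n_insureds, p_accident)

# Smallest number of claims k with P[N <= k] >= 0.99, and the matching cash reserve.
k = 0
while binomial_cdf(k, n_insureds, p_accident) < 0.99:
    k += 1
cash_needed = k * claim_size

print(round(p_no_claims, 4), round(p_cover_1000, 4), k, cash_needed)
```

Changing any of the assumptions, for instance allowing dependent accidents on an
icy day, would change all three answers.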
Example 26–4. Suppose that a company is going to issue 1,000 fire insurance
policies each having a $250 deductible, and a policy amount of $50,000. Denote
by \( F_i \) the Bernoulli random variable which is 1 if the ith insured suffers a loss, and
by \( D_i \) the amount of damage to the ith insured’s property. Suppose \( F_i \) has success
probability 0.001 and that the actual damage \( D_i \) is uniformly distributed on the
interval (0,70000). What is the relative loading so that the premium income will
be 95% certain to cover the claims made? Using the obvious notation, the total
amount of claims made is given by the formula
\[ S = \sum_{i=1}^{1000} F_i \left[ (D_i - 250) \, 1_{[250,50250]}(D_i) + 50000 \, 1_{(50250,\infty)}(D_i) \right] \]
where the F’s and the D’s are independent (why?) and for each i the conditional
distribution of \( D_i \) given \( F_i = 1 \) is uniform on the interval (0,70000). The relative
security loading \( \theta \) is determined so that
\[ P[S \le (1 + \theta) E[S]] = 0.95. \]
Exercise 26–3. Compute E[S] and Var(S) and then use the Central Limit Theorem
to find θ . What is the probability of bankruptcy when θ = 0?
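The moment calculations for the fire insurance example can be organized as in the
Python sketch below. It assumes the policies are independent and identically
distributed and uses the normal approximation, so that the loading is the 95th
percentile of the standard normal distribution times the standard deviation of S
divided by the mean of S. All variable names are hypothetical.

```python
import math
from statistics import NormalDist

n_policies, p_loss = 1000, 0.001
deductible, policy_amount, max_damage = 250.0, 50000.0, 70000.0
cap = deductible + policy_amount   # damage level at which the full policy amount is paid

# Payment on a damaged property: 0 below the deductible, D - deductible up to the cap,
# and the policy amount above the cap, with D uniform on (0, max_damage).
first_moment = (policy_amount**2 / 2 + policy_amount * (max_damage - cap)) / max_damage
second_moment = (policy_amount**3 / 3 + policy_amount**2 * (max_damage - cap)) / max_damage

# A single policy pays only if a loss occurs (probability p_loss).
mean_x = p_loss * first_moment
var_x = p_loss * second_moment - mean_x**2

mean_s = n_policies * mean_x
var_s = n_policies * var_x          # independence across policies is assumed

z_95 = NormalDist().inv_cdf(0.95)   # roughly 1.645
theta = z_95 * math.sqrt(var_s) / mean_s
print(round(mean_s, 2), round(math.sqrt(var_s), 2), round(theta, 3))
```

With only about one expected loss among the 1,000 policies the normal
approximation is crude, so the resulting loading should be read as a rough guide.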
Example 26–5. You operate a life insurance company which has insured 2,000 30
year olds. These policies are issued in varying amounts: 1,000 people with $100,000
policies, 500 people with $500,000 policies, and 500 people with $1,000,000 policies.
The probability that any one of the policy holders will die in the next year is
0.001. Stop loss reinsurance may be purchased at the rate of 0.0015 per dollar of
coverage. How should the retention limit be set in order to minimize the probability
that the total expenses (claims plus reinsurance expense) exceed $1,000,000?
Let X, Y, and Z denote the number of policy holders in the 3 categories
dying in the next year. Then X has the binomial distribution based on 1000 trials
each with success probability 0.001, Y has the binomial distribution based on 500
trials each with success probability 0.001, and Z has the binomial distribution based
on 500 trials each with success probability 0.001. If the retention limit is set at r
then the cost C of claims and reinsurance is given by
It is then a relatively straightforward, though tedious, task to use the central limit
theorem to estimate P[C ≥ 1,000,000].
Exercise 26–4. Verify the validity of the above formula. Use the central limit
theorem to estimate P[C ≥ 1,000,000] as a function of r. Find the value(s) of r
which minimize this probability.
Problems
Problem 26–2. Find the distribution and density for the sum of three independent
random variables each uniformly distributed on the interval (0,1). Compare the
exact value of the distribution function at a few selected points (say 0.25, 1, 2.25)
with the approximation obtained from the central limit theorem.
Problem 26–3. Repeat the previous problem for 3 independent exponential random
variables each having mean 1. It may help to recall the gamma distribution here.
Problem 26–4. A company insures 1000 essentially identical cars. The probability
that any one car is in an accident in any given year is 0.001. The damage to a car
that is involved in an accident is uniformly distributed on the interval (0,15000).
What relative security loading θ should be used if the company wishes to be 99%
sure that it does not lose money?
Solutions to Problems
Problem 26–1. The amount of damage is BU where B is a Bernoulli variable
with success probability 0.001 and U has the uniform distribution.
Problem 26–4. The loss random variable is of the form \( \sum_{i=1}^{1000} B_i U_i \).
Solutions to Exercises
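Exercise 26–1. Differentiation of the general distribution function formula
above gives \( f_{X+Y}(t) = \int_{-\infty}^{\infty} f_X(t - y) f_Y(y) \, dy \).
Exercise 26–2. In the discrete case the same line of reasoning gives
\( f_{X+Y}(t) = \sum_y f_X(t - y) f_Y(y) \). Applying this in the Bernoulli case, with X and Y
counting the successes in n and m trials respectively,
\[
\begin{aligned}
f_{X+Y}(t) &= \sum_{y=0}^{m} \binom{n}{t-y} p^{t-y} (1-p)^{n-t+y} \binom{m}{y} p^{y} (1-p)^{m-y} \\
&= p^{t} (1-p)^{n+m-t} \sum_{y=0}^{m} \binom{n}{t-y} \binom{m}{y} \\
&= \binom{n+m}{t} p^{t} (1-p)^{m+n-t}.
\end{aligned}
\]
Exercise 26–3. The loading θ is chosen so that \( \theta E[S] / \sqrt{\mathrm{Var}(S)} = 1.645 \), from
the normal table. When θ = 0 the bankruptcy probability is about 1/2.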
1. Write the loss random variable S in this case. Find E[S] and Var(S).
2. Use the Central Limit Theorem to find the relative security loading θ that
should be used if the company wishes to be 99% sure that it does not lose money.
3. Consider one of the loss random variables X that occur in the expression for
S. Explain how a random number generator could be used to simulate X.
Some of the consequences of the collective risk model will now be examined.
In the collective risk model the time at which claims are made is taken into account.
Here the aggregate claims up to time t, denoted by S(t), is assumed to be given by
\[ S(t) = \sum_{k=1}^{N(t)} X_k \]
To gain familiarity with some of the ideas involved, the simpler classical gambler’s
ruin problem will be studied first.
A discrete time version of the collective risk model will be studied and some
important new concepts will be introduced.
Suppose that a gambler enters a casino with z dollars and plays a game of chance
in which the gambler wins $1 with probability p and loses $1 with probability
q = 1 − p. Suppose also that the gambler will quit playing if his fortune ever reaches
a > z and will be forced to quit by being ruined if his fortune reaches 0. The main
interest is in finding the probability that the gambler is ultimately ruined and also
the expected number of the plays in the game.
Now in the actual game being played the gambler either reaches his goal or is ruined.
Introduce a random variable, T, which marks the play of the game on which this
occurs. Technically
\[ T = \inf\Big\{k : z + \sum_{j=1}^{k} X_j = 0 \text{ or } a\Big\}. \]
Such a random variable is called a random time. Observe that for this specific
random variable the event [T ≤ k] depends only on the random variables X1 , . . . , Xk .
That is, in order to decide at time k whether or not the game has ended it is not
necessary to look into the future. Such special random times are called stopping
times. The precise definition is as follows. If X1 , X2, . . . are random variables and
T is a nonnegative integer valued random variable with the property that for each
integer k the event [T ≤ k] depends only on X1 , . . . , Xk then T is said to be a stopping
time (relative to the sequence X1 , X2 , . . .).
The random variable \( z + \sum_{j=1}^{T} X_j \) is the gambler’s fortune when he leaves the
casino, which is either a or 0. Denote by \( q_z \) the probability that the gambler leaves
the casino with 0. Then by direct computation \( E[z + \sum_{j=1}^{T} X_j] = a(1 - q_z) \). A formula
for the ruin probability \( q_z \) will be obtained by computing this same expectation in a
second way.
Each of the random variables \( X_j \) takes values 1 and −1 with equal probability,
so \( E[X_j] = 0 \). Hence for any integer k, \( E[\sum_{j=1}^{k} X_j] = 0 \) too. So it is at least plausible
that \( E[\sum_{j=1}^{T} X_j] = 0 \) as well. Using this fact, \( E[z + \sum_{j=1}^{T} X_j] = z \), and equating this
with the expression above gives \( z = a(1 - q_z) \), that is, \( q_z = 1 - z/a \).
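For the fair game, equating the two expressions z and a(1 − q_z) for the expected
final fortune gives a ruin probability of 1 − z/a. A simulation offers a quick
sanity check of this value. The Python sketch below is only an illustration: the
function name, the starting fortune z = 3, the goal a = 10, the number of trials,
and the random seed are all arbitrary choices.

```python
import random

def simulate_ruin_probability(z, a, p=0.5, trials=20000, seed=0):
    """Estimate the probability that a gambler starting with z is ruined before reaching a.

    Each play wins 1 with probability p and loses 1 otherwise; play stops
    as soon as the fortune hits 0 or a.
    """
    rng = random.Random(seed)
    ruined = 0
    for _ in range(trials):
        fortune = z
        while 0 < fortune < a:
            fortune += 1 if rng.random() < p else -1
        if fortune == 0:
            ruined += 1
    return ruined / trials

z, a = 3, 10
estimate = simulate_ruin_probability(z, a)
print(estimate, 1 - z / a)
```

The estimate fluctuates from seed to seed, but with this many trials it should
land within a few hundredths of 1 − z/a.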
There are two important technical ingredients behind this computation. The
first is the fact that T is a stopping time. The second is the fact that the gambling
game with p = q = 1/ 2 is a fair game. The notion of a fair game motivates the
definition of a martingale. Suppose M0 , M1 , M2 , . . . are random variables. The
sequence is said to be a martingale if E[Mk | Mk−1 , . . . , M0 ] = Mk−1 for all k ≥ 1. In
the gambling context, if Mk is the gambler’s fortune after k plays of a fair game then
given Mk−1 the expected fortune after one more play is still Mk−1 .
Exercise 29–1. Show that \( M_k = z + \sum_{j=1}^{k} X_j \) (with \( M_0 = z \)) is a martingale.
Example 29–1. The sequence \( M_0 = z^2 \) and \( M_k = \big(z + \sum_{j=1}^{k} X_j\big)^2 - k \) for k ≥ 1 is also
a martingale. This follows from the fact that knowing \( M_0, \ldots, M_{k-1} \) is the same as
knowing \( X_1, \ldots, X_{k-1} \) and the fact that the X’s are independent.
Uncovering the appropriate martingale is often the most difficult part of the
process. One standard method is the following. If X1 , X2 , . . . are independent and
identically distributed random variables define
\[ W_k = \frac{e^{t \sum_{j=1}^{k} X_j}}{M_{\sum_{j=1}^{k} X_j}(t)} \]
§29: Stopping Times and Martingales 115
where MX (t) is the moment generating function of X evaluated at t. For each fixed t
the sequence Wk is a martingale (here W0 = 1). This follows easily from the fact that
if X and Y are independent then MX+Y (t) = MX (t)MY (t). This martingale is called
Wald’s martingale (or the exponential martingale) for the X sequence.
Exercise 29–3. Show that {Wk : k ≥ 0} is a martingale no matter what the fixed
value of t is.
In many important cases a non-zero value of t can be found so that the denominator
of the Wald martingale is 1. Using this particular value of t then makes
application of the optional stopping theorem neat and easy.
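For the gambling game with p ≠ q the value t = ln(q/p) has this property, since
the moment generating function of a single play, p e^t + q e^{−t}, then equals
q + p = 1. The Python sketch below checks this numerically and also verifies that
the ruin probability given in the solutions to the problems satisfies the first
step recursion q_z = p q_{z+1} + q q_{z−1}. The function names and the values
p = 0.45 and a = 10 are illustrative assumptions.

```python
import math

def step_mgf(t, p):
    """Moment generating function of one play: +1 with probability p, -1 with probability 1 - p."""
    return p * math.exp(t) + (1.0 - p) * math.exp(-t)

def ruin_probability(z, a, p):
    """Ruin probability for an unfair game (p != 1/2), starting at z with goal a."""
    r = (1.0 - p) / p
    return (r**a - r**z) / (r**a - 1.0)

p, a = 0.45, 10
t_star = math.log((1.0 - p) / p)       # the non-zero t with M_X(t) = 1
mgf_at_t_star = step_mgf(t_star, p)

# Largest violation of q_z = p*q_{z+1} + (1-p)*q_{z-1} over the interior states.
max_recursion_error = max(
    abs(ruin_probability(z, a, p)
        - p * ruin_probability(z + 1, a, p)
        - (1.0 - p) * ruin_probability(z - 1, a, p))
    for z in range(1, a)
)
print(mgf_at_t_star, max_recursion_error)
```

Because the denominator of the Wald martingale is 1 for this t, the martingale
reduces to (q/p) raised to the current net winnings, which is what makes the
optional stopping computation short.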
Problems
Problem 29–1. By conditioning on the outcome of the first play of the game show
that \( q_z = p \, q_{z+1} + q \, q_{z-1} \). Show that if p = q there is a solution of this equation of the
form \( q_z = C_1 + C_2 z \) and find \( C_1 \) and \( C_2 \) by using the natural definitions \( q_0 = 1 \) and
\( q_a = 0 \). Show that if p ≠ q there is a solution of the form \( q_z = C_1 + C_2 (q/p)^z \) and find
the two constants. This provides a solution to the gambler’s ruin problem by using
difference equations instead of probabilistic reasoning.
Problem 29–2. If p ≠ q show that the choice t = ln(q/p) makes the denominator of
Wald’s martingale 1. Use this choice of t and the optional stopping theorem to find
the ruin probability in this case.
Problem 29–3. Suppose p ≠ q. Define \( M_0 = z \) and \( M_k = z + \sum_{j=1}^{k} X_j - k(p - q) \) for
k ≥ 1. Show that the sequence \( M_k \) is a martingale and use it to compute E[T] in this
case.
Solutions to Problems
Problem 29–2. \( q_z = \dfrac{(q/p)^a - (q/p)^z}{(q/p)^a - 1} \).
Problem 29–3. \( E[T] = \dfrac{z}{q - p} - \dfrac{a}{q - p} \cdot \dfrac{1 - (q/p)^z}{1 - (q/p)^a} \).
Solutions to Exercises
Exercise 29–1. Knowing \( M_0, \ldots, M_{k-1} \) is the same as knowing \( X_1, \ldots, X_{k-1} \). So
\( E[M_k \mid M_0, \ldots, M_{k-1}] = E[M_k \mid X_1, \ldots, X_{k-1}] = z + \sum_{j=1}^{k-1} X_j + E[X_k \mid X_1, \ldots, X_{k-1}] = M_{k-1} \)
since the last expectation is 0 by independence.
Exercise 29–2. First write
\( M_k = \big(z + \sum_{j=1}^{k-1} X_j + X_k\big)^2 - k = \big(z + \sum_{j=1}^{k-1} X_j\big)^2 + 2 X_k \big(z + \sum_{j=1}^{k-1} X_j\big) + X_k^2 - k \).
Take conditional expectations using the fact that \( X_k \)
is independent of the other X’s and \( E[X_k] = 0 \) and \( E[X_k^2] = 1 \) to obtain the result.
properties of the exponential gives \( e^{t \sum_{j=1}^{k} X_j} = e^{t \sum_{j=1}^{k-1} X_j} \times e^{t X_k} \). Direct computation
The ideas developed in connection with the gambler’s ruin problem will now
be used to compute the ruin probability in the collective risk model. Since the
processes are now operating in continuous time the details are more complicated
and not every step of the arguments will be fully justified.
Recall that the claims process is \( S(t) = \sum_{k=1}^{N(t)} X_k \) where \( X_1, X_2, \ldots \) are independent
identically distributed random variables representing the sizes of the respective
claims, N(t) is a stochastic process representing the number of claims up to time
t, and N and the X’s are assumed to be independent. The insurer’s surplus is
given by U(t) = u + ct − S(t). The probability of ruin with initial surplus u will
be denoted by ψ(u). As above, introduce the Wald martingale which is defined
by \( W_t = e^{\nu U(t)} / M_{U(t)}(\nu) \) and a stopping time \( T_a \) defined by
\( T_a = \inf\{s : U(s) \le 0 \text{ or } U(s) \ge a\} \)
where a is an arbitrary but fixed positive number. It is intuitively clear that \( T_a \) is a
stopping time in an appropriate sense in the new continuous time setting. Suppose
there is a number R > 0, which does not depend on t, so that \( M_{U(t)-u}(-R) = 1 \). Such
a number, if it exists, is called the adjustment coefficient. Substitute ν = −R in
Wald’s martingale, so that \( W_t = e^{Ru} e^{-RU(t)} \), and compute \( E[W_{T_a}] \) in two ways to obtain
\[
\begin{aligned}
1 &= E[W_{T_a}] \\
&= e^{Ru} E[e^{-RU(T_a)} \mid U(T_a) \le 0] \, P[U(T_a) \le 0] \\
&\quad + e^{Ru} E[e^{-RU(T_a)} \mid U(T_a) \ge a] \, (1 - P[U(T_a) \le 0]).
\end{aligned}
\]
Since this equation is valid for any fixed positive a, and since R > 0, it is possible
to take limits as a → ∞. Since \( \lim_{a \to \infty} P[U(T_a) \le 0] = \psi(u) \) and \( \lim_{a \to \infty} e^{-Ra} = 0 \) the
following result is obtained.
Theorem. Suppose that in the collective risk model the adjustment coefficient R > 0
satisfies \( M_{U(t)-u}(-R) = 1 \) for all t. Let \( T = \inf\{s : U(s) \le 0\} \) be the random time at
which ruin occurs. Then
\[ \psi(u) = \frac{e^{-Ru}}{E[e^{-RU(T)} \mid T < \infty]}. \]
The more restrictive discussion begins by examining the nature of the process
N(t), the total number of claims up to time t. It is customary to assume that this
§30: The Collective Risk Model Revisited 120
process is a Poisson process with constant intensity λ > 0. Let us recall what
this means. Suppose \( W_1, W_2, \ldots \) are independent identically distributed exponential
random variables with mean 1/λ and common density \( \lambda e^{-\lambda x} 1_{(0,\infty)}(x) \). It is common
to view the W’s as the waiting times between ‘events.’ The Poisson process
can then be viewed as the number of ‘events’ up to time t. This means that
\( N(t) = \inf\{k : \sum_{j=1}^{k+1} W_j > t\} \). It can be shown that for any fixed t the random variable
N(t) has the Poisson distribution with parameter λt and that the stochastic process
{N(t) : t ≥ 0} has independent increments, that is, whenever \( t_1 < t_2 < \ldots < t_n \) are
fixed real numbers then the random variables \( N(t_2) - N(t_1), \ldots, N(t_n) - N(t_{n-1}) \) are
independent.
In this setting a reasonably simple expression for the moment generating function
of U(t) can be obtained. This then leads to a nice equation for the adjustment
coefficient. To this end the moment generating function of S(t) is now computed.
Keep in mind that N and the X’s are independent!
\[
\begin{aligned}
M_{S(t)}(\nu) &= E\left[e^{\nu \sum_{k=1}^{N(t)} X_k}\right] \\
&= E\left[E\left[e^{\nu \sum_{k=1}^{N(t)} X_k} \,\middle|\, N(t)\right]\right] \\
&= \sum_{j=0}^{\infty} E\left[e^{\nu \sum_{k=1}^{j} X_k}\right] P[N(t) = j] \\
&= \sum_{j=0}^{\infty} \left(E[e^{\nu X}]\right)^{j} e^{-\lambda t} \frac{(\lambda t)^{j}}{j!} \\
&= e^{-\lambda t} \sum_{j=0}^{\infty} \frac{(\lambda t \, M_X(\nu))^{j}}{j!} \\
&= \exp\{\lambda t (M_X(\nu) - 1)\}.
\end{aligned}
\]
Thus
\[ M_{U(t)}(\nu) = \exp\{u\nu + ct\nu + \lambda t (M_X(-\nu) - 1)\}. \]
From this point the adjustment coefficient can be easily found to be the positive
solution of the equation \( \lambda + cR = \lambda M_X(R) \).
Exercise 30–1. Verify that the adjustment coefficient, if it exists, must satisfy this
equation.
Example 30–1. Suppose all claims are for a unit amount. Then MX (ν ) = eν so the
adjustment coefficient is the positive solution of λ + cR = λ eR . Note that there is
no solution if c ≤ λ . But in this case the ruin probability is clearly 1.
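For unit claims the equation λ + cR = λ e^R has no closed form solution, but the positive root is easy to locate numerically. The following Python sketch uses plain bisection; the function name and the bracketing strategy are illustrative assumptions, not something taken from the notes.

```python
import math

def adjustment_coefficient_unit_claims(lam, c):
    """Positive root R of lam + c*R = lam*exp(R), or None when c <= lam."""
    if c <= lam:
        return None                      # no positive solution exists

    def g(r):
        return lam * math.exp(r) - lam - c * r

    # g(0) = 0 and g attains its minimum at r = log(c/lam), where g < 0,
    # so the positive root lies to the right of that point.
    lo = math.log(c / lam)
    hi = lo + 1.0
    while g(hi) <= 0.0:                  # expand until the root is bracketed
        hi *= 2.0
    for _ in range(200):                 # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    print(adjustment_coefficient_unit_claims(1.0, 1.5))
    print(adjustment_coefficient_unit_claims(2.0, 2.0))   # None, since c <= lam
```

The same bisection idea should carry over to other claim distributions once λ e^R is replaced by λ M_X(R), provided the moment generating function is finite on the interval searched.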
Exercise 30–2. Show that if c ≤ λ E[X] the ruin probability is 1. Show that if
c > λ E[X] the adjustment coefficient always exists and hence the ruin probability
is less than 1.
The previous exercises suggest that only the case in which c > λ E[X] is of
interest. Henceforth write c = (1 + θ )λ E[X] for some θ > 0. Here θ may be
interpreted as a relative security loading.
Even more detailed information can be obtained when N(t) is a Poisson process.
To do this define a stopping time Tu = inf{s : U(s) < u} to be the first time that
the surplus falls below its initial level and denote by L1 = u − U(Tu ) the amount by
which the surplus falls below its initial level. It can be shown that
$$P[T_u < \infty, L_1 \ge y] = \frac{1}{(1+\theta)E[X]} \int_y^{\infty} (1 - F_X(x))\,dx.$$
The rather technical proof of this fact will not be given. Two of its consequences
are these. First, by taking y = 0, the probability that the surplus ever drops below its
initial level is 1/ (1 + θ ). Second, an explicit formula for the size of the drop below
the initial level is obtained as
$$P[L_1 \le y \mid T_u < \infty] = \frac{1}{E[X]} \int_0^{y} (1 - F_X(x))\,dx.$$
It is possible to evaluate this expression in certain cases.
Exercise 30–3. Derive this last expression from the above formula for P[Tu <
∞, L1 ≥ y].
Exercise 30–4. What is the conditional distribution of L1 given Tu < ∞ if the claim
size has an exponential distribution with mean 1/ δ ?
Exercise 30–5. Show that the conditional moment generating function of L1 given
Tu < ∞ is (MX (t) − 1)/ (tE[X]).
This information can also be used to study the random variable L which represents
the maximum aggregate loss and is defined by $L = \max_{t \ge 0}\{S(t) - ct\}$. Note
that P[L ≤ u] = 1 − ψ (u) from which it is immediately seen that the distribution of
L has a discontinuity at the origin of size 1 − ψ (0) = θ / (1 + θ ), and is continuous
otherwise. In fact a reasonably explicit formula for the moment generating function
of L can be obtained.
Theorem. If N(t) is a Poisson process and $L = \max_{t \ge 0}\{S(t) - ct\}$ then
$$M_L(\nu) = \frac{\theta E[X]\nu}{1 + (1+\theta)E[X]\nu - M_X(\nu)}.$$
proof : Note from above that the size of each new deficit does not depend on the initial starting
point of the surplus process. Thus
$$L = \sum_{j=1}^{D} A_j$$
where A1 , A2 , . . . are independent identically distributed random variables each having the
same distribution as the conditional distribution of L1 given Tu < ∞, and D is a random
variable independent of the A’s which counts the number of times a new deficit level is
reached. From here it is a simple matter to compute the moment generating function of
L.
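The final computation can be sketched as follows. This outline is not from the original notes. It assumes that each new deficit occurs with probability 1/(1 + θ), independently of the past, so that D has the geometric distribution $P[D = n] = \frac{\theta}{1+\theta}\big(\frac{1}{1+\theta}\big)^n$ for n ≥ 0, and it uses the moment generating function $M_A(\nu) = (M_X(\nu) - 1)/(\nu E[X])$ of the A's from Exercise 30–5. The series converges for those ν with $M_A(\nu) < 1 + \theta$.

```latex
\begin{aligned}
M_L(\nu) &= E\big[M_A(\nu)^D\big]
          = \sum_{n=0}^{\infty} \frac{\theta}{1+\theta}
            \left(\frac{M_A(\nu)}{1+\theta}\right)^{n} \\
         &= \frac{\theta}{1 + \theta - M_A(\nu)} \\
         &= \frac{\theta E[X]\nu}{(1+\theta)E[X]\nu - (M_X(\nu) - 1)}
          = \frac{\theta E[X]\nu}{1 + (1+\theta)E[X]\nu - M_X(\nu)}.
\end{aligned}
```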
This formula for the moment generating function of L can sometimes be used
to find an explicit formula for the distribution function of L, and hence ψ (u) =
1 − P[L ≤ u].
There are some other interesting consequences of the assumption that the claim
number process N(t) is a Poisson process.
First, a bit of notation. If A and B are random variables, write $A \overset{d}{=} B$ to denote
that A and B have the same distribution.
A random variable S is said to have the compound Poisson distribution with Poisson
parameter λ and mixing distribution F(x), denoted $S \overset{d}{=} CP(\lambda, F)$, if $S \overset{d}{=} \sum_{j=1}^{N} X_j$
where X1 , X2 , . . . are independent identically distributed random variables with common
distribution function F and N is a random variable which is independent of the
X’s and has a Poisson distribution with parameter λ .
Example 30–2. For each fixed t, the aggregate claims process $S(t) \overset{d}{=} CP(\lambda t, F_X)$.
Example 30–3. If $S \overset{d}{=} CP(\lambda, F)$ then the moment generating function of S is
$$M_S(\nu) = \exp\Big\{\lambda \int_{-\infty}^{\infty} (e^{u\nu} - 1)\,dF(u)\Big\}.$$
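Exercise 30–7. Suppose that $S \overset{d}{=} CP(\lambda, F)$ and $T \overset{d}{=} CP(\delta, G)$ and that S and T are
independent. Show that $S + T \overset{d}{=} CP\big(\lambda + \delta, \frac{\lambda}{\lambda+\delta} F + \frac{\delta}{\lambda+\delta} G\big)$.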
This last property is very useful in the insurance context. Because of this
property the results of the analysis of different policy types can be easily combined
into one grand analysis of the company’s prospects as a whole. A compound Poisson
distribution can also be decomposed.
Example 30–4. Suppose each claim is either for $1 or $2, each event having
probability 0.5. If the number of claims is Poisson with parameter λ then the
amount of total claims, S, is compound Poisson distributed with moment generating
function
$$M_S(\nu) = \exp\{0.5\lambda(e^{\nu} - 1) + 0.5\lambda(e^{2\nu} - 1)\}.$$
Hence $S \overset{d}{=} Y_1 + 2Y_2$ where Y1 and Y2 are independent Poisson random variables with
mean λ / 2. Thus the numbers of claims of each size are independent!
$$BX \overset{d}{=} \sum_{j=1}^{B} X_j' \approx \sum_{j=1}^{N} X_j'$$
where N has a Poisson distribution with parameter P[B = 1] and X1′ , X2′ , . . . are
independent random variables each having the same distribution as X. Thus the
distribution of BX may be approximated by the CP(P[B = 1], FX ) distribution.
Problems
Problem 30–1. Show that if X takes positive integer values and $S \overset{d}{=} CP(\lambda, F_X)$
then $x P[S = x] = \sum_{k=1}^{\infty} \lambda k P[X = k] P[S = x - k]$ for x > 0. This is called Panjer’s
recursion formula. Hint: First show, using symmetry, that E[Xj | S = x, N = n] =
x/ n for 1 ≤ j ≤ n and then write out what this means.
Problem 30–2. Suppose in the previous problem that λ = 3 and that X takes on the
values 1, 2, 3, and 4 with probabilities 0.3, 0.2, 0.1, and 0.4 respectively. Calculate
P[S = k] for 0 ≤ k ≤ 40.
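The recursion of Problem 30–1 translates directly into a short program. The Python sketch below is an illustration only; the function name and the dictionary representation of the claim distribution are assumptions made here. It starts from P[S = 0] = e^{−λ}, which holds because every claim is at least 1, and then applies the recursion to the data of Problem 30–2.

```python
import math

def panjer_compound_poisson(lam, claim_pmf, max_x):
    """Return [P[S = 0], ..., P[S = max_x]] for S ~ CP(lam, F_X).

    claim_pmf maps each positive integer claim size k to P[X = k].
    """
    probs = [math.exp(-lam)]             # S = 0 exactly when there are no claims
    for x in range(1, max_x + 1):
        # x P[S = x] = sum_k lam * k * P[X = k] * P[S = x - k]
        total = sum(k * f_k * probs[x - k]
                    for k, f_k in claim_pmf.items() if k <= x)
        probs.append(lam * total / x)
    return probs

if __name__ == "__main__":
    pmf = {1: 0.3, 2: 0.2, 3: 0.1, 4: 0.4}
    for k, p in enumerate(panjer_compound_poisson(3.0, pmf, 40)):
        print(k, p)
```

A useful sanity check is that the computed probabilities sum to nearly 1 and have mean close to λ E[X] when max_x is taken large enough.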
Problem 30–5. The compound Poisson distribution is not symmetric about its
mean, as the normal distribution is. One might therefore consider approximation of
the compound Poisson distribution by some other skewed distribution. A random
variable G is said to have the Gamma distribution with parameters α and β if G has
density function
$$f_G(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x} 1_{(0,\infty)}(x).$$
It is useful to recall the definition and basic properties of the Gamma function in this
connection. One easily computes the moments of such a random variable. In fact
the moment generating function is $M_G(\nu) = (\beta/(\beta - \nu))^{\alpha}$. The case in which β = 1/ 2
and 2α is a positive integer corresponds to the chi–square distribution with 2α
degrees of freedom. Also the distribution of the sum of n independent exponential
random variables with mean 1/ β is a gamma distribution with parameters n and β .
For approximation purposes the shifted gamma distribution is used to approximate
the compound Poisson distribution. This means that values α , β , and x are found so that
x + G has approximately the same distribution as the compound Poisson variate.
The quantities x, α , and β are found by using the method of moments. The first
three central moments of both random variables are equated, and the equations are
then solved. Show that when approximating the distribution of a compound Poisson
random variable S the method of moments leads to
$$\begin{aligned}
\alpha &= \frac{4[\mathrm{Var}(S)]^3}{[E[(S - E[S])^3]]^2} = \frac{4\lambda (E[X^2])^3}{(E[X^3])^2} \\
\beta &= \frac{2\mathrm{Var}(S)}{E[(S - E[S])^3]} = \frac{2E[X^2]}{E[X^3]} \\
x &= E[S] - \frac{2[\mathrm{Var}(S)]^2}{E[(S - E[S])^3]} = \lambda E[X] - \frac{2\lambda (E[X^2])^2}{E[X^3]}.
\end{aligned}$$
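These three formulas are simple to evaluate once the first three moments of X are known. The Python sketch below is illustrative only; the function name and the example claim distribution (the one from Problem 30–2) are choices made here. It relies on the facts that a compound Poisson variable has mean λ E[X], variance λ E[X²], and third central moment λ E[X³].

```python
def shifted_gamma_parameters(lam, m1, m2, m3):
    """Method of moments fit of x + G to S ~ CP(lam, F_X).

    m1, m2, m3 are E[X], E[X^2], E[X^3]. Returns (x, alpha, beta).
    """
    alpha = 4.0 * lam * m2 ** 3 / m3 ** 2
    beta = 2.0 * m2 / m3
    shift = lam * m1 - 2.0 * lam * m2 ** 2 / m3
    return shift, alpha, beta

if __name__ == "__main__":
    pmf = {1: 0.3, 2: 0.2, 3: 0.1, 4: 0.4}
    m1, m2, m3 = (sum(k ** r * p for k, p in pmf.items()) for r in (1, 2, 3))
    x, alpha, beta = shifted_gamma_parameters(3.0, m1, m2, m3)
    # The fitted x + G should reproduce the three moments of S:
    print(x + alpha / beta, 3.0 * m1)          # means
    print(alpha / beta ** 2, 3.0 * m2)         # variances
    print(2.0 * alpha / beta ** 3, 3.0 * m3)   # third central moments
```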
Problem 30–6. In the continuous time model, if the individual claims X have
density $f_X(x) = (3e^{-3x} + 7e^{-7x})/2$ for x > 0, find the adjustment coefficient and ψ (u).
(The density of X is a mixture of exponential densities. The value of X can be
thought of as resulting from the following 2 stage experiment. Flip a coin. If the
result is heads, take an observation on the exponential distribution with parameter
3; otherwise take an observation on the exponential distribution with parameter 7.)
Problem 30–7. In the continuous time model, if the individual claims X are discrete
with possible values 1 or 2 with probabilities 1/ 4 and 3/ 4 respectively, and if the
adjustment coefficient is ln(2), find the relative security loading.
Problem 30–8. Use integration by parts to show that the adjustment coefficient in
the continuous time model is the solution of the equation $\int_0^{\infty} e^{rx}(1 - F_X(x))\,dx = c/\lambda$.
Problem 30–9. In the continuous time model, use integration by parts to find $M_{L_1}(t)$.
Find expressions for $E[L_1]$, $E[L_1^2]$ and $\mathrm{Var}(L_1)$. Here L1 is the random variable which
is the amount by which the surplus first falls below its initial level, given that this
occurs.
Problem 30–10. Find the moment generating function of the maximum aggregate
loss random variable in the case in which all claims are of size 5.
Problem 30–11. If $\psi(u) = 0.3e^{-2u} + 0.2e^{-4u} + 0.1e^{-7u}$, what is the relative security
loading? What is the adjustment coefficient? Hint: What is the moment generating
function of the maximum aggregate loss random variable?
Problem 30–12. If L is the maximum aggregate loss random variable, find expressions
for E[L], $E[L^2]$, and Var(L).
Problem 30–13. In the compound Poisson continuous time model suppose that
λ = 3, c = 1, and X has density $f_X(x) = (e^{-3x} + 16e^{-6x})/3$ for x > 0. Find the relative
security loading, the adjustment coefficient, and an explicit formula for the ruin
probability.
Problem 30–14. In the compound Poisson continuous time model suppose that
λ = 3, c = 1, and X has density $f_X(x) = \frac{9x}{25} e^{-3x/5}$ for x > 0. Find the relative security
loading, the adjustment coefficient, and an explicit formula for the ruin probability.
Problem 30–15. In the discrete time version of the collective risk loss model,
$U(n) = u + cn - \sum_{i=1}^{n} X_i$, for n ≥ 0. If X takes the values 0 and 2 with probabilities p
and q = 1 − p respectively, find the adjustment coefficient in terms of p and q. Find
the ruin probability in terms of p and q.
Problem 30–16. The claim number random variable is sometimes assumed to have
the negative binomial distribution. A random variable N is said to have the negative
binomial distribution with parameters p and r if N counts the number of failures
before the rth success in a sequence of independent Bernoulli trials, each having
success probability p. Find the density and moment generating function of a random
variable N with the negative binomial distribution. Define the compound negative
binomial distribution and find the moment generating function, mean, and variance
of a random variable with the compound negative binomial distribution.
Problem 30–17. In the case of fire insurance the amount of damage may be quite
large. Three common assumptions are made about the nature of the loss variables in
this case. One is that X has a lognormal distribution. This means that $X \overset{d}{=} e^{Z}$ where
$Z \overset{d}{=} N(\mu, \sigma^2)$. A second possible assumption is that X has a Pareto distribution.
This means that X has a density of the form $\alpha x_0^{\alpha}/x^{\alpha+1}\, 1_{[x_0,\infty)}(x)$ for some α > 0. Note
that a Pareto distribution has very heavy tails, and the mean and/or variance may
not exist. A final assumption which is sometimes made is that the density of X is a
mixture of exponentials, that is,
for example. After an assumption is made about the nature of the underlying
distribution one may use actual data to estimate the unknown parameters. For each
of the three models find the maximum likelihood estimators and the method of
moments estimators of the unknown parameters.
Problem 30–19. One may also examine the benefits, in terms of risk reduction,
of using reinsurance. Begin by noting the possible types of reinsurance available.
First there is proportional reinsurance. Here the reinsurer agrees to pay a fraction
α , 0 ≤ α ≤ 1, of each individual claim amount. Secondly, there is stop–loss
reinsurance, in which the reinsurer pays the amount of the individual claim in
excess of the deductible amount. Finally, there is excess of loss reinsurance in
which the reinsurer pays the amount by which the claims of a portfolio of policies
exceeds the deductible amount. As an example, the effect of stop–loss reinsurance
with deductible d on an insurer’s risk will be analyzed. The amount of insurer’s
risk will be measured by the ruin probability. In fact, since the ruin probability is
so difficult to compute, the effect of reinsurance on the adjustment coefficient will
be measured. Recall that the larger the adjustment coefficient, the smaller the ruin
probability. Initially (before the purchase of reinsurance) the insurer’s surplus at
time t is
$$U(t) = u + ct - \sum_{j=1}^{N(t)} X_j$$
and the adjustment coefficient is the solution of
$$\lambda + cr = \lambda M_X(r).$$
After the purchase of stop loss reinsurance with deductible d the insurer’s surplus is
$$U'(t) = u + c't - \sum_{j=1}^{N(t)} (X_j \wedge d)$$
where c′ = c − reinsurance premium. Note that this process has the same structure
as the original one. The new adjustment coefficient is therefore the solution of
λ + c′ r = λ MX∧d (r).
where θ ′ is the reinsurer’s relative security loading. With this information the new
adjustment coefficient can be computed. Carry out these computations when λ = 2,
θ = 0.50, θ ′ = 0.25, d = 750, and X has an exponential distribution with mean 500.
Problem 30–20. Repeat the previous problem for the case of proportional reinsur-
ance.
Problem 30–21. The case of excess of loss reinsurance leads to a discrete time
model, since the reinsurance is applied to a portfolio of policies and the reinsurance
is paid annually (say). The details here are similar to those in the discussion of the
discrete time gambler’s ruin problem. Analyze the situation described in the previous
problem if the deductible for a proportional reinsurance policy is 1500 and the rest
of the assumptions are the same. Which type of reinsurance is better?
Problem 30–22. Suppose the claims follow a compound Poisson distribution with
λ = 1 and $f_X(x) = e^{-x}$ for x > 0. The security loading is θ > 0. Find the adjustment
coefficient if proportional reinsurance is purchased and the reinsurer’s security
loading is β > 0. What is the insurer’s relative security loading after buying such
reinsurance?
Problem 30–23. Suppose in the previous exercise that excess-of-loss coverage with
deductible d is available and all other conditions are the same. Find the equation
for the adjustment coefficient. What is the insurer’s relative security loading after
buying such reinsurance?
Problem 30–24. In the discrete time model, suppose the X’s have the N(10, 4)
distribution and the relative security loading is 25%. A reinsurer will reinsure a
fraction f of the total portfolio on a proportional basis for a premium which is 140%
of the expected claim amount. Find the insurer’s adjustment coefficient as a function
of f . What value of f maximizes the security of the ceding company?
Solutions to Problems
Problem 30–16. The compound negative binomial distribution is the distribution
of the random sum $\sum_{i=1}^{N} X_i$ where N and the X’s are independent, N has the
negative binomial distribution, and the X’s all have the same distribution.
Solutions to Exercises
Exercise 30–1. $M_{U(t)-u}(-R) = 1$ holds if and only if $-ctR + \lambda t(M_X(R) - 1) = 0$,
which translates into the given condition.
Exercise 30–4. Since in this case $F_X(t) = 1 - e^{-\delta t}$ for t > 0, direct substitution
gives $P[L_1 \le y \mid T_u < \infty] = 1 - e^{-\delta y}$ for y > 0.
Exercise 30–5. Given $T_u < \infty$ the density of $L_1$ is $(1 - F_X(y))/E[X]$ for y > 0. Using
integration by parts then gives the conditional moment generating function of $L_1$ as
$$\int_0^{\infty} e^{ty}\,\frac{1 - F_X(y)}{E[X]}\,dy = \left.\frac{e^{ty}(1 - F_X(y))}{tE[X]}\right|_0^{\infty} + \int_0^{\infty} \frac{e^{ty} f_X(y)}{tE[X]}\,dy = \frac{M_X(t) - 1}{tE[X]}.$$
Notice that the unconditional distribution of $L_1$ has a jump of
size θ / (1 + θ ) at the origin. The unconditional moment generating function of
$L_1$ is $\theta/(1+\theta) + (M_X(t) - 1)/((1+\theta)tE[X])$.
simplifies to the desired result using the formula of the previous exercise.
Exercise 30–7. Using the independence, $M_{S+T}(\nu) = M_S(\nu) M_T(\nu)$ and the result
follows by substitution and algebraic rearrangement.
§31. Discrete Time Markov Chains
In many situations the random variables which serve naturally as a model are
not independent. The simplest kind of dependence allows future behavior to depend
on the present situation.
Example 31–1. Patients in a nursing home fall into 3 categories, and each category
of patient has a differing expense level. Patients who can care for themselves with
minimal assistance are in the lowest expense category. Other patients require some
skilled nursing assistance on a regular basis and are in the next higher expense
category. Finally, some patients require continuous skilled nursing assistance and
are in the highest expense category. One way of modeling the level of care a
particular patient requires on a given day is as follows. Denote by Xi the level of
care this patient requires on day i. Here the value of Xi would be either 1, 2, or
3 depending on which of the 3 expense categories is appropriate for day i. It is
intuitively clear that the random variables {Xi } are not independent.
Exercise 31–1. Show that any sequence of independent discrete random variables
is a Markov chain.
Because of the simple dependence structure a vital role is played by the tran-
sition probabilities P[Xn+1 = j| Xn = i]. In principle, this probability depends not
only on the two states i and j, but also on n. A Markov chain is said to have
stationary transition probabilities if the transition probabilities P[Xn+1 = j| Xn = i]
do not depend on n. In the discussion here, the transition probabilities will always
be assumed to be stationary, and the notation Pi,j = P[Xn+1 = j| Xn = i] will be used.
$$\begin{aligned}
P[X_n = i_n, \dots, X_0 = i_0]
&= P[X_n = i_n \mid X_{n-1} = i_{n-1}, \dots, X_0 = i_0] \, P[X_{n-1} = i_{n-1}, \dots, X_0 = i_0] \\
&= P[X_n = i_n \mid X_{n-1} = i_{n-1}] \, P[X_{n-1} = i_{n-1}, \dots, X_0 = i_0] \\
&= P_{i_{n-1},i_n} P[X_{n-1} = i_{n-1}, \dots, X_0 = i_0] \\
&= \dots \\
&= P_{i_{n-1},i_n} P_{i_{n-2},i_{n-1}} \cdots P_{i_0,i_1} P[X_0 = i_0].
\end{aligned}$$
Exercise 31–2. Justify each of the steps here completely. Where was the Markov
property used?
Exercise 31–3. Show that for a Markov chain with stationary transition probabilities
P[X4 = 3, X3 ≠ 3| X2 = 3, X1 ≠ 3, X0 = 3] = P[X2 = 3, X1 ≠ 3| X0 = 3]. Generalize.
Example 31–2. In the previous nursing home example, suppose the transition
matrix is
$$P = \begin{pmatrix} 0.9 & 0.05 & 0.05 \\ 0.1 & 0.8 & 0.1 \\ 0 & 0.05 & 0.95 \end{pmatrix}.$$
Then using conditioning it is easy to compute
P[X3 = 2| X0 = 1].
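Conditioning on the intermediate states amounts to multiplying the transition matrix by itself, so the requested probability is the (1, 2) entry of the third power of P. The Python sketch below is only an illustration; the helper names are invented here, and plain nested lists are used so that no external library is needed. States 1, 2, 3 correspond to list indices 0, 1, 2.

```python
def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(p, n):
    """n-th power of the square matrix p for an integer n >= 0."""
    size = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, p)
    return result

NURSING_HOME = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.00, 0.05, 0.95],
]

if __name__ == "__main__":
    three_step = mat_pow(NURSING_HOME, 3)
    # P[X3 = 2 | X0 = 1]: row index 0 (state 1), column index 1 (state 2)
    print(three_step[0][1])
```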
Example 31–3. The gambler’s ruin problem illustrates many of the features of a
Markov chain. A gambler enters a casino with $z available for wagering and sits
down at her favorite game. On each play of the game, the gambler wins $1 with
probability p and loses $1 with probability q = 1 − p. She will happily leave the
casino if her fortune reaches $a > 0, and will definitely leave, rather unhappily,
if her fortune reaches $0. Denote by Xn the gambler’s fortune after the nth play.
Clearly {Xn } is a Markov chain with P[X0 = z] = 1. The natural state space here is
{0, 1, . . . , a}.
The induction hypothesis together with the formula for the multiplication of matrices
conclude the proof.
Using this lemma gives the following formula for the density of Xn in terms of
the density of X0 .
$$( P[X_n = 0] \;\; P[X_n = 1] \;\; \dots ) = ( P[X_0 = 0] \;\; P[X_0 = 1] \;\; \dots ) P^{n}.$$
$$\begin{aligned}
&= \lim_{n\to\infty} ( P[X_0 = 0] \;\; P[X_0 = 1] \;\; \dots ) P^{n+1} \\
&= ( P[Y = 0] \;\; P[Y = 1] \;\; \dots ) P
\end{aligned}$$
which gives a necessary condition for Y to be a distributional limit for the chain,
namely, the density of Y must be a left eigenvector of P corresponding to the
eigenvalue 1.
Example 31–4. For the nursing home chain given earlier there is a unique left
eigenvector of P corresponding to the eigenvalue 1, after normalizing so that the
sum of the coordinates is 1. That eigenvector is (0.2, 0.2, 0.6), as can be checked by
direct multiplication. Thus a patient will, in the long run, spend about 20% of the
time in each of categories 1 and 2 and about 60% of the time in category 3.
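A left eigenvector for the eigenvalue 1 can be approximated numerically by repeatedly multiplying a starting distribution by P. The Python sketch below is an illustration under stated assumptions: it uses the nursing home matrix of Example 31–2, starts from the uniform distribution, and simply iterates a fixed number of times. The function names are invented here, and the method is not guaranteed to converge for every chain. The output can be compared with the eigenvector quoted in the example.

```python
def step(dist, P):
    """One step of the chain: return the row vector dist * P."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def iterate_distribution(P, n_steps=2000):
    """Apply the transition matrix n_steps times to the uniform distribution."""
    n = len(P)
    dist = [1.0 / n] * n
    for _ in range(n_steps):
        dist = step(dist, P)
    return dist

NURSING_HOME = [
    [0.90, 0.05, 0.05],
    [0.10, 0.80, 0.10],
    [0.00, 0.05, 0.95],
]

if __name__ == "__main__":
    pi = iterate_distribution(NURSING_HOME)
    print(pi)                       # approximate left eigenvector
    print(step(pi, NURSING_HOME))   # should essentially reproduce pi
```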
Exercise 31–8. Find the left eigenvectors corresponding to the eigenvalue 1 of the
transition matrix for the gambler’s ruin chain.
Example 31–5. Consider the Markov chain with transition matrix $P = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$.
The left eigenvector of P corresponding to the eigenvalue 1 is ( 1/ 2 1/ 2 ) but the
chain clearly has no limiting distribution.
Exercise 31–9. Show that this last chain does not have a limiting distribution.
The last example shows that a Markov chain need not have a limiting distribu-
tion. Even so, this chain does spend half the time in each state, so the entries in the
left eigenvector do have an intuitive interpretation as properties of the chain. To
explore this possibility further, some terminology is introduced. Let P be the transition
probability matrix of a Markov chain X with stationary transition probabilities.
A vector $\pi = (\pi_0, \pi_1, \dots)$ is said to be a stationary distribution for the chain X if
(1) π i ≥ 0 for all i,
(2) $\sum_{i=0}^{\infty} \pi_i = 1$, and
(3) π P = π .
If a limiting distribution for the chain X exists then that limiting distribution will
be a stationary distribution. In the example above, ( 1/ 2 1/ 2 ) is a stationary
distribution even though the chain has no limiting distribution.
Checking each state to see whether that state is transient or recurrent is clearly
a difficult task with only the tools available now. Some other useful notions can
greatly simplify the job.
One key notion is that of accessibility. The state j is said to be accessible from
the state i if there is a positive probability that the chain can start in state i and reach
state j. Technically, the requirement for j to be accessible from i is that $P^n_{i,j} > 0$ for
some n ≥ 0. Two states i and j are said to communicate, denoted i↔j, if each is
accessible from the other.
Example 31–6. Consider the Markov chain X in which Xn denotes the outcome of
the nth toss of a fair coin in which 1 corresponds to a head and 0 to a tail. Clearly
0↔1.
Example 31–7. In the gambler’s ruin problem it is intuitively clear that the states
0 and a are accessible from any other state but do not communicate with any state
except themselves. The other states all communicate with each other.
Exercise 31–11. Prove that the intuition of the preceding example is correct.
Example 31–8. In the nursing home example, all states communicate with each
other.
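For a finite chain, accessibility and communication can be checked mechanically from the pattern of positive entries of P. The Python sketch below is an illustration, not part of the notes; the function names are invented here, and the gambler's ruin chain with a = 4 and p = 0.5 is an arbitrary test case. It computes which states are accessible from which by a transitive closure, counting each state as accessible from itself since n = 0 is allowed, and then groups the states that communicate.

```python
def gamblers_ruin_matrix(a, p):
    """Transition matrix on states 0..a with absorbing states 0 and a."""
    q = 1.0 - p
    P = [[0.0] * (a + 1) for _ in range(a + 1)]
    P[0][0] = 1.0
    P[a][a] = 1.0
    for k in range(1, a):
        P[k][k - 1] = q
        P[k][k + 1] = p
    return P

def accessibility(P):
    """reach[i][j] is True when j is accessible from i in n >= 0 steps."""
    n = len(P)
    reach = [[i == j or P[i][j] > 0 for j in range(n)] for i in range(n)]
    for k in range(n):                       # Warshall's transitive closure
        for i in range(n):
            for j in range(n):
                if reach[i][k] and reach[k][j]:
                    reach[i][j] = True
    return reach

def communicating_classes(P):
    """Partition the states into classes of mutually accessible states."""
    reach = accessibility(P)
    seen, classes = set(), []
    for i in range(len(P)):
        if i not in seen:
            cls = [j for j in range(len(P)) if reach[i][j] and reach[j][i]]
            seen.update(cls)
            classes.append(cls)
    return classes

if __name__ == "__main__":
    print(communicating_classes(gamblers_ruin_matrix(4, 0.5)))
```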
Exercise 31–12. For the coin tossing chain, is the state 1 recurrent?
Exercise 31–13. What are the recurrent states for the gambler’s ruin chain?
Exercise 31–14. Show that if a Markov chain is irreducible and every state is
transient then there is no stationary distribution for the chain.
If the chain is irreducible and recurrent there are two distinct possibilities.
Either the chain has no stationary distribution or the stationary distribution is given
by π i = 1/ µi,i . In this connection a recurrent class is said to be strongly ergodic (or
positive recurrent) if π i > 0 for all states in the class, and null recurrent otherwise.
Positive recurrence and null recurrence are class properties.
Return now to the question of the long term behavior of the chain. Denote by
I = ( P[X0 = 0] P[X0 = 1] . . . ) the initial distribution of the chain. If I S is a
density (the total probability may not be 1), then I S is called the steady state of the
chain. Note that by an earlier formula the steady state is approximately the same as
the density of Xn when n is large. The steady state is always a stationary distribution,
but the converse need not be true. In the nicest situations the steady state does not
depend on I . This will be the case if all of the rows of S are the same. In this case
there will be at most one steady state (or stationary) distribution for the chain.
Exercise 31–15. Show that for an irreducible recurrent chain S has identical rows
and that there is at most one stationary (=steady state) distribution.
Problem 31–3. Suppose the chain has only finitely many states all of which com-
municate with each other. Are any of the states transient?
Problem 31–4. In addition to the 3 categories of expenses in the nursing home example,
consider also the possibilities of withdrawal from the home and death. Suppose
the corresponding transition matrix is
$$P = \begin{pmatrix} 0.8 & 0.05 & 0.01 & 0.09 & 0.05 \\ 0.5 & 0.45 & 0.04 & 0.0 & 0.01 \\ 0.05 & 0.15 & 0.70 & 0.0 & 0.10 \\ 0.0 & 0.0 & 0.0 & 1.0 & 0.0 \\ 0.0 & 0.0 & 0.0 & 0.0 & 1.0 \end{pmatrix}$$
where the states are the 3 expense categories in order followed by withdrawal and
death. Find the stationary distribution(s) of the chain. Which states communicate,
which states are transient, and what are the absorption probabilities?
Solutions to Problems
Problem 31–1. Since i↔j there are integers a and b so that $P^a_{i,j} > 0$ and
$P^b_{j,i} > 0$. As shown earlier, $P^{a+n+b}_{j,j} \ge P^b_{j,i} P^n_{i,i} P^a_{i,j}$. If $P^n_{i,i} > 0$ this inequality shows
that $P^{a+b+n}_{j,j} > 0$ too and therefore d(j) divides a + b + n. But since $P^{2n}_{i,i} \ge (P^n_{i,i})^2$ a
similar inequality shows that d(j) divides a + b + 2n as well. Hence d(j) divides
a + b + 2n − (a + b + n) = n, and so d(j) ≤ d(i). Interchanging the roles of i and j
shows that d(i) ≤ d(j).
Problem 31–3. No. Since all states communicate, either all are transient or
all are recurrent. Since there are only finitely many states they can not all be
transient. Hence all states are recurrent.
Solutions to Exercises
Exercise 31–1. Because of the independence both of the conditional probabil-
ities in the definition are equal to the unconditional probability P[a < Xtn+1 < b].
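Exercise 31–5.
$$\begin{pmatrix}
1 & 0 & 0 & 0 & 0 & \dots & 0 \\
q & 0 & p & 0 & 0 & \dots & 0 \\
0 & q & 0 & p & 0 & \dots & 0 \\
0 & 0 & q & 0 & p & \dots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & 0 & 0 & \dots & 1
\end{pmatrix}.$$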
Exercise 31–6. Use induction on n. The case n = 1 is true from the definition
of stationarity. For the induction step assume the result holds when n = k. Then
$$\begin{aligned}
P[X_{k+1+m} = j \mid X_m = i] &= \sum_b P[X_{k+1+m} = j, X_{k+m} = b \mid X_m = i] \\
&= \sum_b P[X_{k+1+m} = j \mid X_{k+m} = b] \, P[X_{k+m} = b \mid X_m = i] \\
&= \sum_b P[X_{k+1} = j \mid X_k = b] \, P[X_k = b \mid X_0 = i] \\
&= P[X_{k+1} = j \mid X_0 = i],
\end{aligned}$$
as desired.
Exercise 31–7. $P[X_n = k] = \sum_i P[X_n = k \mid X_0 = i] P[X_0 = i] = \sum_i P^n_{i,k} P[X_0 = i]$
which agrees with the matrix multiplication.
Exercise 31–8. Matrix multiplication shows that the left eigenvector condition
implies that the left eigenvector x = (x0 , . . . , xa ) has coordinates that satisfy
x0 + qx1 = x0 , qx2 = x1 , pxk−1 + qxk+1 = xk for 2 ≤ k ≤ a − 2, pxa−2 = xa−1 and
pxa−1 + xa = xa . From these equations it follows that only x0 and xa can be
non-zero, and that these two values can be arbitrary. Hence all left eigenvectors
corresponding to the eigenvalue 1 are of the form (c, 0, 0, . . . , 0, 1 − c) for some
0 ≤ c ≤ 1.
Exercise 31–15. If i and j communicate then both fi∗,j and fj∗,i are positive. If
i is recurrent, then fi∗,j = 1 because in the infinitely many visits to i at least one
attempt to visit j must succeed. So fi∗,j = 1 for all i and j which makes the rows
of S identical.
§32. Continuous Time Markov Chains
The next step in the study of Markov chains retains a discrete state space but
allows time to vary continuously.
A discrete time chain always spent one time unit in each state before the next
transition. For a continuous time chain the time spent in each state before a transition
will be random. Intuitively, this is the only difference between discrete time and
continuous time Markov chains. The bulk of the work consists of verifying that this
intuition is indeed correct and in identifying the probabilistic properties of the times
between transitions for the chain.
As for discrete time chains the transition probabilities Pi,j (t) = P[Xt = j| X0 = i]
and the associated transition probability matrix P(t) = [Pi,j (t)] play an important
role. By convention, P(0) = I, the identity matrix.
Example 32–1. The Poisson process provides a typical example of the way in which
the transition probabilities are specified for continuous time Markov chains. As a
specific example let Xt denote the number of calls that have arrived at a telephone
switchboard by time t. In a very short time interval essentially either 0 or 1 call can
arrive. Denote by λ > 0 the average rate at which calls arrive. This model can be
expressed by saying that for small h > 0
(1) P[Xt+h − Xt = 1| Xt = i] ≈ λ h,
(2) P[Xt+h − Xt = 0| Xt = i] ≈ 1 − λ h, and
(3) P[Xt+h − Xt ≥ 2| Xt = i] ≈ 0
The matrix P′ (0) is called the infinitesimal generator of the transition semi-
group {P(t) : t ≥ 0}. Since typically the infinitesimal generator is specified during
model building the central question is how (or even whether) the infinitesimal
generator determines the transition semi-group of the chain.
These two equations, especially the forward equation, are very useful in appli-
cations. Unfortunately there are no simple general conditions which guarantee that
the forward equation holds.
Example 32–2. Assume for the moment that the forward equation holds for the
Poisson process. (Later this will be shown to be true.) Recall that the infinitesimal
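generator in this case is
$$A = P'(0) = \begin{pmatrix} -\lambda & \lambda & 0 & 0 & \dots \\ 0 & -\lambda & \lambda & 0 & \dots \\ 0 & 0 & -\lambda & \lambda & \dots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}.$$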
Translating the matrix form of the forward equation P′ (t) = P(t)A into statements
about the individual transition probabilities gives
P′i,j (t) = −λ Pi,j (t) + λ Pi,j−1 (t).
To compute M(t) = E[Xt | X0 = i] use the definition of expectation and this equation
to obtain
$$\begin{aligned}
M'(t) &= \sum_{j=0}^{\infty} j P'_{i,j}(t) \\
&= \sum_{j=0}^{\infty} j \big(-\lambda P_{i,j}(t) + \lambda P_{i,j-1}(t)\big) \\
&= -\lambda \sum_{j=0}^{\infty} j P_{i,j}(t) + \lambda \sum_{j=0}^{\infty} (j-1) P_{i,j-1}(t) + \lambda \\
&= \lambda.
\end{aligned}$$
So M(t) = λ t + i. Thus the forward equation can be used to find the conditional
expectation without first finding the transition matrices.
Exercise 32–3. Try the same computation using the backward equations.
In the case of chains with only a finite number of states the theory is very simple:
the transition semi-group is uniquely determined by the infinitesimal generator and
both the forward and backward equations hold.
The case in which the chain has infinitely many states is more complex. The
main result in this case is just stated here. Since the matrices involved are infinite
in size, an assumption must be made which is automatically satisfied in the case in
which there are only finitely many states. The infinitesimal generator A of the chain
is conservative if $\sum_{j=0}^{\infty} a_{i,j} = 0$ for all i.
Theorem. If P(t) is the unique solution of either the forward or the backward
equation and if $\sum_{j=0}^{\infty} P_{i,j}(t) = 1$ for all i then P(t) is the unique solution to both the
forward and the backward equation and is a transition semi-group.
Exercise 32–4. Show that if the chain has only finitely many states then the in-
finitesimal generator must be conservative.
Some examples of continuous time chains will now be given. One of the
main features is the way in which the behavior of the chain is determined from the
infinitesimal generator.
Example 32–3. Look once again at the Poisson process. Here the forward equation
is $P'_{i,j}(t) = -\lambda P_{i,j}(t) + \lambda P_{i,j-1}(t)$. This can be rewritten using an integrating factor
as $\frac{d}{dt}\big(e^{\lambda t} P_{i,j}(t)\big) = \lambda e^{\lambda t} P_{i,j-1}(t)$. Thus $P_{0,0}(t) = e^{-\lambda t}$ and by induction
$P_{0,n}(t) = (\lambda t)^n e^{-\lambda t}/n!$. Similarly, $P_{i,0}(t) = 0$ if i > 0 and by induction
$P_{i,j}(t) = (\lambda t)^{j-i} e^{-\lambda t}/(j-i)!\; 1_{[0,\infty)}(j-i)$. This solution is unique and $\sum_{j=0}^{\infty} P_{i,j}(t) = 1$ for all i. From the general
theory Pi,j (t) is therefore the unique solution to the backward equation as well and
is also the unique transition semi-group with this infinitesimal generator.
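The explicit solution can be checked numerically. The Python sketch below is illustrative only; the function names, the truncation at 200 terms, and the step size h are assumptions made here. It verifies that each row of P(t) sums to 1, that the conditional mean is i + λ t, and that a finite difference approximation to the derivative agrees with the right side of the forward equation.

```python
import math

def poisson_row(t, lam, n_terms):
    """Return [P_{i,i}(t), P_{i,i+1}(t), ..., P_{i,i+n_terms-1}(t)].

    These values depend only on j - i, not on i itself.
    """
    term = math.exp(-lam * t)            # (lam t)^0 e^{-lam t} / 0!
    row = []
    for n in range(n_terms):
        row.append(term)
        term *= lam * t / (n + 1)        # next term (lam t)^{n+1} e^{-lam t} / (n+1)!
    return row

def conditional_mean(i, t, lam, n_terms=200):
    """Truncated version of E[X_t | X_0 = i] = sum_j j P_{i,j}(t)."""
    return sum((i + n) * p for n, p in enumerate(poisson_row(t, lam, n_terms)))

def forward_equation_gap(n, t, lam, h=1e-5):
    """Central difference derivative of P_{i,i+n}(t) minus the right side
    -lam P_{i,i+n}(t) + lam P_{i,i+n-1}(t) of the forward equation."""
    derivative = (poisson_row(t + h, lam, n + 1)[n]
                  - poisson_row(t - h, lam, n + 1)[n]) / (2.0 * h)
    row = poisson_row(t, lam, n + 1)
    previous = row[n - 1] if n >= 1 else 0.0
    return derivative - (-lam * row[n] + lam * previous)

if __name__ == "__main__":
    print(sum(poisson_row(1.5, 2.0, 200)))     # close to 1
    print(conditional_mean(3, 1.5, 2.0))       # close to 3 + 2 * 1.5
    print(forward_equation_gap(2, 1.5, 2.0))   # close to 0
```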
Exercise 32–5. Fill in the details of the induction arguments above.
Example 32–4. The next example is a pure birth process. This is a variant of the
Poisson process in which the probability of additional calls depends on the number
of calls received. Specifically assume that
(1) P[Xt+h − Xt = 1| Xt = k] ≈ λk h,
(2) P[Xt+h − Xt = 0| Xt = k] ≈ 1 − λk h, and
(3) P[Xt+h − Xt ≥ 2| Xt = k] ≈ 0.
Using the forward equation and the integrating factor $e^{\lambda_j t}$ yields
$$P_{i,j}(t) = \delta_{i,j} e^{-\lambda_j t} + \lambda_{j-1} e^{-\lambda_j t} \int_0^t e^{\lambda_j s} P_{i,j-1}(s)\,ds.$$
This shows inductively that $P_{i,j}(t) \ge 0$ and that the solution is unique. The remaining
question is whether or not $\sum_{j=0}^{\infty} P_{i,j}(t) = 1$ for all i. Fix i and define $S_n(t) = \sum_{j=0}^{n} P_{i,j}(t)$.
Using the forward equation gives $S_n'(t) = -\lambda_n P_{i,n}(t)$ and so
$1 - S_n(t) = \int_0^t -S_n'(s)\, ds = \lambda_n \int_0^t P_{i,n}(s)\, ds$.
Now from the definition of $S_n(t)$, $1 - \sum_{j=0}^{\infty} P_{i,j}(t) \le 1 - S_n(t) \le 1$ so
\[ \frac{1}{\lambda_n} \Bigl( 1 - \sum_{j=0}^{\infty} P_{i,j}(t) \Bigr) \le \int_0^t P_{i,n}(s)\, ds \le \frac{1}{\lambda_n}. \]
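Summing on n gives
\[ \Bigl( \sum_{n=0}^{\infty} \frac{1}{\lambda_n} \Bigr) \Bigl( 1 - \sum_{j=0}^{\infty} P_{i,j}(t) \Bigr) \le \int_0^t \sum_{n=0}^{\infty} P_{i,n}(s)\, ds \le \sum_{n=0}^{\infty} \frac{1}{\lambda_n}. \]
The right hand part of this inequality shows that if $\sum_{n=0}^{\infty} \frac{1}{\lambda_n} < \infty$ then $\sum_{j=0}^{\infty} P_{i,j}(t)$ can
not be 1 for all t, while the left hand part of this inequality shows that if $\sum_{n=0}^{\infty} \frac{1}{\lambda_n} = \infty$
then $\sum_{j=0}^{\infty} P_{i,j}(t) = 1$ for all $t \ge 0$. Thus the pure birth process exists if and only if
$\sum_{n=0}^{\infty} \frac{1}{\lambda_n} = \infty$. Note that this condition is obviously satisfied for the Poisson process.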
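The existence criterion can be explored numerically by looking at partial sums of the reciprocal birth rates. The sketch below compares a constant rate, a linear rate, and a quadratic rate. The three rate functions and the helper name partial_sum_reciprocal_rates are illustrative assumptions, and the sums start at n = 1 only to avoid dividing by a zero rate.

```python
def partial_sum_reciprocal_rates(rate, n_terms, start=1):
    """Sum of 1/rate(n) for n = start, ..., start + n_terms - 1."""
    return sum(1.0 / rate(n) for n in range(start, start + n_terms))

def constant_rate(n):
    return 2.0            # Poisson-type rate: the series diverges

def linear_rate(n):
    return 0.5 * n        # rate proportional to n: harmonic series, diverges

def quadratic_rate(n):
    return float(n * n)   # the series converges, so the criterion fails

for n_terms in (10, 1000, 100000):
    print(n_terms,
          partial_sum_reciprocal_rates(constant_rate, n_terms),
          partial_sum_reciprocal_rates(linear_rate, n_terms),
          partial_sum_reciprocal_rates(quadratic_rate, n_terms))
```

The first two columns keep growing as more terms are added, while the third levels off, which is the case in which the row sums of P(t) fall below 1.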
Exercise 32–6. Explain intuitively what goes wrong with the pure birth process
when $\sum_{n=0}^{\infty} \frac{1}{\lambda_n} < \infty$.
Example 32–5. The Yule process is a pure birth process for which $\lambda_n = n\beta$, that
is, the birth rate is proportional to the number present. This process clearly meets
the criterion for existence which was established in the previous example. If
$M_i(t) = E[X_t \mid X_0 = i]$ then the forward equation can be used to show that $M_i'(t) = \beta M_i(t)$.
Hence $M_i(t) = i e^{\beta t}$.
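The formula for the mean of the Yule process can be checked numerically by integrating the forward equations on a truncated state space. The sketch below uses a plain Euler scheme. The truncation level, the step count, and the function name are assumptions made for illustration, and the truncation simply discards the negligible mass that would move past the largest state.

```python
import math

def yule_mean_by_forward_equation(i, beta, t, max_state=80, steps=2000):
    """Euler-integrate P'_{i,j} = -beta*j*P_{i,j} + beta*(j-1)*P_{i,j-1}
    on the states 0..max_state and return sum_j j * P_{i,j}(t)."""
    p = [0.0] * (max_state + 1)
    p[i] = 1.0                      # P_{i,j}(0) = delta_{i,j}
    h = t / steps
    for _ in range(steps):
        new_p = p[:]
        for j in range(1, max_state + 1):
            new_p[j] = p[j] + h * (-beta * j * p[j] + beta * (j - 1) * p[j - 1])
        p = new_p
    return sum(j * pj for j, pj in enumerate(p))

numeric_mean = yule_mean_by_forward_equation(2, 0.3, 1.0)
closed_form = 2 * math.exp(0.3 * 1.0)   # i * exp(beta * t)
print(numeric_mean, closed_form)
```

A smaller step size or a higher order integrator would tighten the agreement; Euler is used here only because it mirrors the differential equation most directly.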
Exercise 32–7. Fill in the details of how the forward equation is used in this
computation. What is the conditional variance of Xt ?
Example 32–6. The final example is the birth and death process. Suppose Xt
is the size of a population at time t. Then in a short time period there will be
(essentially) a single birth or a single death or neither. Formally the model is
(1) P[Xt+h − Xt = 1| Xt = k] ≈ λk h,
(2) P[Xt+h − Xt = 0| Xt = k] ≈ 1 − (λk + µk )h,
(3) P[Xt+h − Xt = −1| Xt = k] ≈ µk h, and
(4) P[| Xt+h − Xt | ≥ 2| Xt = k] ≈ 0.
Here λk is the birth rate and µk is the death rate when the population size is k.
The relationship between the infinitesimal generator and the chain itself has been
made rather precise. The question of the behavior of the chain is now considered.
\[ \pi = \lim_{t\to\infty} I P(t) = \lim_{t\to\infty} I P(t+s) = \lim_{t\to\infty} I P(t) P(s) = \pi P(s) \]
for all $s \ge 0$.
Exercise 32–9. Give an example of a chain for which there is no limiting distribu-
tion.
Once again the notion of a stationary distribution will play an important role.
In the continuous time context $\pi = (\pi_0\ \pi_1\ \ldots)$ is said to be a stationary
distribution if
(1) $\pi_i \ge 0$ for all i,
(2) $\sum_{i=0}^{\infty} \pi_i = 1$, and
(3) $\pi P(s) = \pi$ for all $s \ge 0$.
Because of the close relationship between the infinitesimal generator and the
transition semi-group it should be possible to find the stationary distribution of the
chain using only the generator.
Theorem. Suppose the infinitesimal generator A of the chain is conservative and
that $\pi_i \ge 0$ for all i and $\sum_{i=0}^{\infty} \pi_i = 1$. Then π is a stationary distribution if and only
if $\pi A = 0$.
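To illustrate the theorem on a small case, the sketch below builds the generator of a birth and death chain on finitely many states and checks that a candidate distribution satisfies π A = 0. The candidate comes from the standard detailed-balance product formula, which is an assumption brought in from outside these notes, and the rates used are arbitrary illustrative numbers.

```python
def birth_death_generator(lam, mu):
    """Generator A on states 0..N with birth rates lam[k] and death rates mu[k].
    Births out of state N and deaths out of state 0 are not allowed, and the
    diagonal is set so that every row sums to zero (A is conservative)."""
    n = len(lam)
    A = [[0.0] * n for _ in range(n)]
    for k in range(n):
        if k + 1 < n:
            A[k][k + 1] = lam[k]
        if k >= 1:
            A[k][k - 1] = mu[k]
        A[k][k] = -sum(A[k])
    return A

def stationary_by_detailed_balance(lam, mu):
    """Candidate pi with pi_k proportional to prod_{j=1}^{k} lam[j-1] / mu[j]."""
    weights = [1.0]
    for k in range(1, len(lam)):
        weights.append(weights[-1] * lam[k - 1] / mu[k])
    total = sum(weights)
    return [w / total for w in weights]

def row_vector_times_matrix(pi, A):
    n = len(pi)
    return [sum(pi[i] * A[i][j] for i in range(n)) for j in range(n)]

lam = [1.0, 0.8, 0.5, 0.0]   # illustrative birth rates for states 0..3
mu = [0.0, 0.6, 0.9, 1.2]    # illustrative death rates for states 0..3
A = birth_death_generator(lam, mu)
pi = stationary_by_detailed_balance(lam, mu)
residual = row_vector_times_matrix(pi, A)
print(pi, residual)
```

The residual vector should vanish up to rounding, which is the condition π A = 0 in the theorem.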
For discrete time chains the stationary distribution could be interpreted as the
long run fraction of the time that the chain spent in each state. A similar interpretation
can be made in the continuous time case too.
Define T0 = 0 and inductively set Tn = inf{t ≥ Tn−1 : Xt ≠ XTn−1 }. The T’s are
the times of changes of state for the chain.
To study the behavior of the chain let t > 0 and i ≠ j be states. Then
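\[
\begin{aligned}
P[T_1 > t,\ X_{T_1} = j \mid X_0 = i]
&= \lim_{n\to\infty} \sum_{l=0}^{\infty} P[X_{k/2^n} = i,\ 1 \le k \le [2^n t] + l,\ X_{([2^n t]+l+1)/2^n} = j \mid X_0 = i] \\
&= \lim_{n\to\infty} \sum_{l=0}^{\infty} P_{i,i}(1/2^n)^{[2^n t]+l}\, P_{i,j}(1/2^n) \\
&= \lim_{n\to\infty} \frac{P_{i,j}(1/2^n)}{1/2^n} \Bigl( P_{i,i}(1/2^n)^{2^n} \Bigr)^{[2^n t]/2^n} \sum_{l=0}^{\infty} P_{i,i}(1/2^n)^{l}\, \frac{1}{2^n} \\
&= a_{i,j}\, e^{a_{i,i} t} \lim_{n\to\infty} \frac{1/2^n}{1 - P_{i,i}(1/2^n)} \\
&= \frac{-a_{i,j}}{a_{i,i}}\, e^{a_{i,i} t}.
\end{aligned}
\]
(Recall that $a_{i,i} \le 0$.) Hence $P[X_{T_1} = j \mid X_0 = i] = -a_{i,j}/a_{i,i}$
and $P[T_1 > t \mid X_0 = i] = e^{a_{i,i} t}$ for
t > 0.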
This computation means that the time spent in state i until a transition occurs has
an exponential distribution with mean −1/ ai,i and that the probability upon leaving
state i of entering state j is −ai,j / ai,i .
If interest is only in the states that are visited by the chain and not in the time
spent in each state one may as well study the embedded discrete time chain with
transition matrix P = [−ai,j / ai,i (1 − δi,j )]. Note that if ai,i = 0 the corresponding
transition probability in the embedded chain is Pi,i = 1 since state i is obviously an
absorbing state for the continuous time chain.
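The passage from a generator to its embedded chain and mean holding times can be sketched in a few lines of Python. The three-state generator below is a made-up illustration, and the function names are assumptions rather than anything defined in these notes.

```python
def embedded_chain(A):
    """Jump matrix with entries -a_{i,j}/a_{i,i} for j != i.
    A row with a_{i,i} = 0 is absorbing and gets P_{i,i} = 1."""
    n = len(A)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        if A[i][i] == 0:
            P[i][i] = 1.0
        else:
            for j in range(n):
                if j != i:
                    P[i][j] = -A[i][j] / A[i][i]
    return P

def mean_holding_times(A):
    """Mean -1/a_{i,i} of the exponential time spent in each state
    (infinite for an absorbing state)."""
    return [float("inf") if row[i] == 0 else -1.0 / row[i]
            for i, row in enumerate(A)]

# Hypothetical conservative generator with state 2 absorbing.
A = [[-3.0, 2.0, 1.0],
     [1.0, -1.0, 0.0],
     [0.0, 0.0, 0.0]]

P = embedded_chain(A)
holding = mean_holding_times(A)
print(P, holding)
```

Each row of the resulting matrix sums to 1 because the generator is conservative, so the off-diagonal entries of row i add up to $-a_{i,i}$.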
Example 32–7. For the Poisson process the embedded chain has transition matrix
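\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & 0 & \cdots \\
0 & 0 & 1 & 0 & 0 & \cdots \\
0 & 0 & 0 & 1 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}.
\]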
Hence all states are transient and there is no stationary distribution.
Example 32–8. For the birth and death process the embedded chain has transition
matrix
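\[
P = \begin{pmatrix}
0 & 1 & 0 & 0 & \cdots \\
\frac{\mu_1}{\mu_1+\lambda_1} & 0 & \frac{\lambda_1}{\mu_1+\lambda_1} & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\]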
if λ0 > 0 so this chain behaves in a manner similar to the gambler’s ruin chain.
In particular if λ0 = 0 and 0 < λi for i > 0 then finding the absorption probability
for the birth and death chain is exactly the same problem as finding the absorption
probability for the gambler’s ruin chain.
Exercise 32–12. How does the first row of P differ from that shown when λ0 = 0?
For a continuous time Markov chain with conservative generator A the stationary
distributions π are the solutions of π A = 0. Also −1/ ai,i is the mean of the
exponentially distributed time that the chain spends in state i prior to a transition
and −ai,j / ai,i is the probability that when a transition occurs from state i the chain
will move to state j ≠ i. The probability of a transition from state i to itself is 0 unless
ai,i = 0 in which case i is an absorbing state.
One consequence of the Markov property is that the time spent in each state is
exponential. In many models this is not realistic because the exponential distribution
is memoryless.
Problems
Problem 32–1. Consider a continuous time version of the nursing home chain in
which the states are (1) minimal care, (2) some skilled assistance required, (3)
continuous skilled assistance required, and (4) dead. Suppose the infinitesimal
generator A of the chain has entries a1,2 = 0.12, a1,3 = 0.05, a1,4 = 0.08, a2,1 = 0.05,
a2,3 = 0.07, a2,4 = 0.12, a3,4 = 0.20 and all other non-diagonal entries are zero.
What is the matrix A? What are the stationary distributions of the chain? What is
the expected time until absorption?
Solutions to Problems
Problem 32–1.
\[
A = \begin{pmatrix}
-0.25 & 0.12 & 0.05 & 0.08 \\
0.05 & -0.24 & 0.07 & 0.12 \\
0 & 0 & -0.20 & 0.20 \\
0 & 0 & 0 & 0
\end{pmatrix},
\]
from which the stationary distribution is easily found to be $(0\ 0\ 0\ 1)$.
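The expected time until absorption asked for in Problem 32–1 can be obtained by a first-step argument, which is an assumption here since the notes do not spell out a method: if $\tau_i$ is the expected time to reach the dead state from state i, then $\sum_j a_{i,j} \tau_j = -1$ for each of the three transient states, with $\tau_4 = 0$. The Python sketch below solves this small linear system by Gaussian elimination; the helper name is illustrative.

```python
def solve_linear_system(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[k]] for k, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

# Generator from Problem 32-1; state 4 (dead) is absorbing.
A = [[-0.25, 0.12, 0.05, 0.08],
     [0.05, -0.24, 0.07, 0.12],
     [0.0, 0.0, -0.20, 0.20],
     [0.0, 0.0, 0.0, 0.0]]

Q = [row[:3] for row in A[:3]]          # block of A on the transient states
tau = solve_linear_system(Q, [-1.0, -1.0, -1.0])
print(tau)
```

Under this approach the expected times to absorption work out to 77/9, 200/27, and 5 time units starting from states 1, 2, and 3 respectively.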
Solutions to Exercises
Exercise 32–1. Hint: Look at the proof of the parallel result for discrete time
chains.
Exercise 32–2. If there are infinitely many states there could be problems since
these ‘derivations’ involve passing a limit through the infinite sum occurring in
the matrix multiplication.
Exercise 32–3. The backward equation gives $P'_{i,j}(t) = \lambda P_{i+1,j}(t) - \lambda P_{i,j}(t)$ which
does not produce a differential equation for M(t).
Exercise 32–4. Since there are only finitely many states, the derivative of
the sum is the sum of the derivatives, so the equation $\sum_j P_{i,j}(t) = 1$ can be
differentiated to show that the generator is conservative.
Exercise 32–5. For i = 0 the inductive step is $\frac{d}{dt}\bigl(e^{\lambda t} P_{0,j}(t)\bigr) = \lambda^j t^{j-1} / (j-1)!$ from which
integration gives $e^{\lambda t} P_{0,j}(t) = (\lambda t)^j / j!$, as desired.
Exercise 32–6. If the process is currently in state k the expected waiting time
until a birth is 1/ λk . If the sum is finite then there is a finite expected waiting
time for infinitely many births to occur, that is, the population size will become
infinite in a finite amount of time.
Exercise 32–7. The forward equation gives $P'_{i,j}(t) = -\beta j P_{i,j}(t) + \beta (j-1) P_{i,j-1}(t)$.
Now multiply both sides of this equation by j and sum on j, writing $j = (j-1) + 1$
in the second term, to obtain the equation.
The discrete time and state space model will be constructed to model the motion
of a particle in one dimension. Let Sn denote the position of the particle at time n and
assume S0 = 0. The change in position of the particle at time i will be denoted Di .
The random variables D1 , D2 , . . . will be assumed to be independent and identically
distributed random variables each taking the values 1 and −1 with equal probability.
Hence $S_n = \sum_{i=1}^{n} D_i$ and $\{S_n : n \ge 0\}$ is nothing more than a random walk model.
proof : Denote by $P_n$ the number of displacements $D_1, \ldots, D_n$ which are positive and let $N_n = n - P_n$
denote the number of negative displacements. The D's generate a path from the origin to
(n, x) if and only if $n = P_n + N_n$ and $x = P_n - N_n$. This shows that there are $\binom{n}{P_n}$ such paths.
From the two equations $P_n = (n + x)/2$ and the result follows.
The reflection principle is another path inspired device. The notation corre-
sponds to that of the picture below, which also provides the proof.
Reflection Principle. The number of paths from A to B which touch or cross the
level T is equal to the number of all paths from A′ to B.
[Figure: a path from A to B which touches the level T; vertical axis $S_n$, horizontal axis n.]
Copyright 2001 Jerry Alan Veeh. All rights reserved.
§33: Introduction to Brownian Motion 152
As an application consider an election in which candidate A wins with a votes
compared to the b < a votes for candidate B. If the ballots are counted one at a time,
what is the probability that candidate A is always ahead in the count?
Ballot Theorem. Suppose n and x are positive integers. The number of paths from
the origin to (n, x) which do not touch the n axis is $\frac{x}{n} \binom{n}{(n+x)/2}$.
proof : The number of paths meeting the requirement is the same as the number of paths from
(1, 1) to (n, x) which do not touch or cross the n axis. Now the total number of paths from
(1, 1) to (n, x) is the same as the total number of paths from (0, 0) to (n − 1, x − 1), which is
$\binom{n-1}{(n-1+x-1)/2}$ by the proposition. The number of paths from (1, 1) to (n, x) which touch or cross
the n axis can be computed using the reflection principle. By the reflection principle this
is the same as the number of paths from (1, −1) to (n, x) which in turn is the same as the
number of paths from (0, 0) to (n − 1, x + 1). Again using the proposition gives this number
as $\binom{n-1}{(n+x)/2}$. Subtraction gives the desired result.
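The count in the Ballot Theorem can be confirmed by brute force for small n. The Python sketch below enumerates all paths of ±1 steps and compares the number that stay strictly above the n axis after the origin with the formula. The function names are illustrative, and the enumeration is only practical for small n because there are $2^n$ paths.

```python
from fractions import Fraction
from itertools import product
from math import comb

def count_paths_avoiding_axis(n, x):
    """Brute-force count of +-1 paths from (0, 0) to (n, x) with S_k > 0
    for every 1 <= k <= n."""
    count = 0
    for steps in product((1, -1), repeat=n):
        position, stays_positive = 0, True
        for d in steps:
            position += d
            if position <= 0:
                stays_positive = False
                break
        if stays_positive and position == x:
            count += 1
    return count

def ballot_count(n, x):
    """(x / n) * C(n, (n + x) / 2), for n and x of the same parity."""
    return x * comb(n, (n + x) // 2) // n

for n in range(1, 9):
    for x in range(1, n + 1):
        if (n + x) % 2 == 0:
            assert count_paths_avoiding_axis(n, x) == ballot_count(n, x)

# Election with a = 70 and b = 30 votes: n = a + b steps ending at x = a - b.
always_ahead = Fraction(ballot_count(100, 40), comb(100, 70))
print(always_ahead)
```

The final ratio divides the number of favorable counting orders by the total number of paths ending at (a + b, a − b), which is the probability that candidate A leads throughout the count.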
Exercise 33–1. Use the Ballot Theorem to show that the probability that candidate
A is always ahead in the tally is (a − b)/ (a + b). Thus if A wins 70 votes to 30 the
probability that A was always ahead is 0.40.
One of the first applications of the Brownian motion model was to describe the
displacement of a small particle suspended in fluid. As a first approximation to the
behavior of the particle assume that the particle moves only because of collisions
with other particles. Assume also that collisions are as likely to come from the
left as from the right. To simplify even further suppose that each collision causes a
displacement of magnitude ∆ and that the time between collisions is τ. If the particle
begins at the origin at time 0 the position $X_t$ of the particle at time t is then $\Delta S_{[t/\tau]}$ in the
notation above. The idea is to see what happens as both ∆ and τ go to 0. This passage
to the limit makes the approximate model become more realistic. Note that $E[X_t] = 0$
and $\mathrm{Var}(X_t) \approx \Delta^2 t/\tau$. To avoid disaster some relationship between $\Delta^2$ and τ must be
maintained in the passage to the limit. The assumption made here is that $\Delta^2/\tau = \sigma^2$
for some σ > 0. After passing to the limit under this assumption the Central Limit
Theorem shows that $X_t \overset{d}{=} N(0, \sigma^2 t)$. The process $X_t$ obtained in this way is also a
process with independent increments. This means that if $t_1 < t_2 < \ldots < t_n$ then
the random variables $X_{t_n} - X_{t_{n-1}}, \ldots, X_{t_2} - X_{t_1}$ are independent. The increments are
also stationary which means that for t > s, $X_t - X_s \overset{d}{=} X_{t-s}$. Intuition further suggests
that the sample paths of the process Xt should be continuous functions. This means
that for each point ω in the underlying probability space the function of t defined by
t → Xt (ω ) should be continuous. This intuitive motivation leads to the following
more formal definition. A stochastic process $\{X_t : t \ge 0\}$ is said to be a Brownian
motion process (or Wiener process) with parameter $\sigma^2$ if
(1) $X_t$ is a process with stationary independent increments,
(2) $X_t \overset{d}{=} N(0, \sigma^2 t)$ for each t > 0, and
(3) $X_0 = 0$.
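The approximation argument behind this definition can be illustrated numerically. The Python sketch below computes the exact distribution function of the scaled random walk, assuming the position after n collisions is ∆ times the walk with $\Delta^2/\tau = \sigma^2$ and $\tau = t/n$, and compares it with the $N(0, \sigma^2 t)$ distribution function. The function names and the choice n = 1000 are illustrative assumptions.

```python
import math

def scaled_walk_cdf(x, t, sigma, n):
    """P[X_t <= x] where X_t = Delta * S_n, tau = t / n is the time between
    collisions and Delta = sigma * sqrt(tau)."""
    tau = t / n
    delta = sigma * math.sqrt(tau)
    favourable = 0
    for k in range(n + 1):              # k = number of +1 displacements
        if delta * (2 * k - n) <= x:
            favourable += math.comb(n, k)
    return favourable / 2 ** n

def normal_cdf(x, variance):
    """Distribution function of N(0, variance)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * variance)))

walk_value = scaled_walk_cdf(0.5, 1.0, 1.0, 1000)
limit_value = normal_cdf(0.5, 1.0)
print(walk_value, limit_value)
```

The two values differ by at most the size of one jump of the lattice distribution, and the gap shrinks as n grows.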
The important but difficult to prove fact about Brownian motion is that the
sample paths of the process can be assumed to be continuous. This result can be
intuitively deduced from the approximation argument given earlier. The sample
paths can also be shown to be nowhere differentiable. Again this fact is intuitively
apparent from the earlier construction since the sample paths of the approximating
sums have a lot of corners.
For some purposes the notion of a Brownian motion starting from x will be
useful. Such a process is a Brownian motion as above except for the fact that X0 = x
is assumed.
Some of the basic properties of Brownian motion are derived here in an intuitive
way. The continuous time analog of the reflection principle plays an important role.
Example 33–1. Suppose a > 0 and denote by Ta the first time at which the standard
Brownian motion Xt takes the value a. Then Ta = inf{s > 0 : Xs = a}. The
random variable Ta is well defined because of the continuity of the sample paths
of the process X. Let t > 0 be fixed. Now by the Theorem of Total Probability
P[Xt ≥ a] = P[Xt ≥ a| Ta ≤ t] P[Ta ≤ t] + P[Xt ≥ a| Ta > t] P[Ta > t]. Clearly
P[Xt ≥ a| Ta > t] = 0. If Ta ≤ t the independent increment property of X suggests
that after time $T_a$ the Brownian motion restarts as though from scratch except that
the initial value is a rather than 0. Hence $P[X_t \ge a \mid T_a \le t] = P[X_t \le a \mid T_a \le t] = 1/2$.
So $P[T_a \le t] = 2 P[X_t \ge a] = \frac{2}{\sqrt{2\pi}} \int_{a/\sqrt{t}}^{\infty} e^{-x^2/2}\, dx$. From this formula $P[T_a < \infty] = 1$
but $E[T_a] = \infty$.
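The first passage formula is easy to evaluate with the complementary error function, since $2 P[X_t \ge a]$ equals $\mathrm{erfc}(a/\sqrt{2t})$ for standard Brownian motion. The Python sketch below does this and also checks numerically that differentiating in t gives the density $a e^{-a^2/2t}/\sqrt{2\pi t^3}$. The function names are illustrative assumptions.

```python
import math

def first_passage_cdf(a, t):
    """P[T_a <= t] = 2 P[X_t >= a] for standard Brownian motion, a > 0, t > 0."""
    return math.erfc(a / math.sqrt(2.0 * t))

def first_passage_density(a, t):
    """Density of T_a obtained by differentiating first_passage_cdf in t."""
    return a * math.exp(-a * a / (2.0 * t)) / math.sqrt(2.0 * math.pi * t ** 3)

a = 1.0
for t in (0.1, 1.0, 10.0, 1000.0, 1e6):
    print(t, first_passage_cdf(a, t))

# Central finite difference of the distribution function at t = 1.
h = 1e-5
slope = (first_passage_cdf(a, 1.0 + h) - first_passage_cdf(a, 1.0 - h)) / (2 * h)
print(slope, first_passage_density(a, 1.0))
```

The printed probabilities approach 1 as t grows, but only slowly, which is consistent with $T_a$ being finite with probability 1 while having infinite mean.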
Example 33–2. Again let a > 0 and fix t > 0. Then $P[\max_{0 \le s \le t} X_s \ge a] = P[T_a \le t]$.
Example 33–3. Suppose 0 < t1 < t2 . What is the probability that Xt is 0 at least
once in the interval $(t_1, t_2)$? Let Z denote the event that $X_t$ is zero for at least one
value of t in $(t_1, t_2)$. Then
\[
\begin{aligned}
P[Z] &= E[P[Z \mid X_{t_1}]] \\
&= \int_{-\infty}^{\infty} P[T_{|x|} \le t_2 - t_1]\, dF_{X_{t_1}}(x) \\
&= \int_{-\infty}^{\infty} P[T_{|x|} \le t_2 - t_1]\, \frac{1}{\sqrt{2\pi t_1}}\, e^{-x^2/2t_1}\, dx.
\end{aligned}
\]
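The last integral can be evaluated numerically for a standard Brownian motion. The Python sketch below uses a midpoint rule and compares the result with the known closed form $(2/\pi) \arccos \sqrt{t_1/t_2}$, which is quoted here as an outside fact rather than derived in the text. The grid size, the integration range, and the function name are assumptions.

```python
import math

def prob_zero_in_interval(t1, t2, grid=20000, width=10.0):
    """Midpoint-rule value of the integral of P[T_|x| <= t2 - t1] against
    the N(0, t1) density, truncated to +-width standard deviations."""
    s = t2 - t1
    half_range = width * math.sqrt(t1)
    h = 2.0 * half_range / grid
    total = 0.0
    for k in range(grid):
        x = -half_range + (k + 0.5) * h
        hit = math.erfc(abs(x) / math.sqrt(2.0 * s))   # P[T_|x| <= s]
        density = math.exp(-x * x / (2.0 * t1)) / math.sqrt(2.0 * math.pi * t1)
        total += hit * density * h
    return total

def arccos_formula(t1, t2):
    return 2.0 / math.pi * math.acos(math.sqrt(t1 / t2))

print(prob_zero_in_interval(1.0, 2.0), arccos_formula(1.0, 2.0))
print(prob_zero_in_interval(1.0, 4.0), arccos_formula(1.0, 4.0))
```

With an even number of grid cells the kink of the integrand at x = 0 falls on a cell boundary, which keeps the midpoint rule accurate.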
Martingales can also be exploited to obtain some results about Brownian motion.
The verification that $\{X_t : t \ge 0\}$, $\{X_t^2 - t : t \ge 0\}$ and $\{\exp\{\lambda X_t - \lambda^2 t/2\} : t \ge 0\}$
are martingales is quite straightforward.
Example 33–4. Suppose a < 0 < b and let T = inf{s : Xs = a or Xs = b}. Applying
the Optional Stopping Theorem gives 0 = E[X0 ] = E[XT ] = a P[XT = a] + b P[XT =
b]. Solving gives P[XT = a] = b/ (b − a) and P[XT = b] = −a/ (b − a).
Exercise 33–8. Verify this computation. Why does the Optional Stopping Theorem
apply?
Exercise 33–2. If t > s write $X_t = X_t - X_s + X_s$ and use the independent increments
property to compute as follows: $\mathrm{Cov}(X_t, X_s) = E[X_t X_s] = E[(X_t - X_s) X_s] + E[X_s^2] =
0 + \mathrm{Var}(X_s) = \sigma^2 s$. When s > t a similar argument shows that the covariance is
$\sigma^2 t$.
Exercise 33–3. Let $t \to \infty$ to obtain $P[T_a < \infty] = (2/\sqrt{2\pi}) \int_0^{\infty} e^{-x^2/2}\, dx = 1$.
Differentiation gives the density of $T_a$ as $a e^{-a^2/2t} / \sqrt{2\pi t^3}$ for t > 0, from which it
follows that $E[T_a] = \infty$.
Exercise 33–4. The Reflection Principle gives $T_a \overset{d}{=} T_{|a|}$.
Exercise 33–5. If the maximum exceeds a then the time required to reach level
a must have been smaller than t.
Exercise 33–6. Substitute from the earlier formula for P[T | x| ≤ t2 − t1 ] and then
integrate by parts.
One theory about the behavior of stock price Pt over time is that Pt should
behave like a stochastic process with independent ratios, that is, if t1 < . . . < tn
then Ptn / Ptn−1 , . . . , Pt2 / Pt1 should be independent. A simple model for this sort of
behaviour is given by a geometric Brownian motion process Pt = eBt where Bt is
a Brownian motion with drift. One use of this model of stock prices is to construct a
theoretical pricing model for stock options. A call option is a contract which gives
the owner of the contract the right to buy one share of the underlying stock at a
particular price K called the strike price of the option. The call option contract
expires after a fixed time T. Thus a call that is not used by time T becomes worthless.
1. If the drift and variance parameters of the underlying Brownian motion with
drift are 0.10 and 0.005 respectively, what is the probability that a stock with a price
of $20 today has a price exceeding $25 one year from now?
2. The economist Robert Merton has presented an economic argument that call
options will not be exercised before the expiration time T. This implies that the
value of a call option today depends only on the price of the underlying stock at
time T. Since the value of the call at time T is max{PT − K, 0}, the value of the call
today (at time 0) is just the actuarial present value of this amount. Thus the value
of a call today is $E[v^T \max\{P_T - K, 0\}]$. This is the Black–Scholes option pricing
formula. Simplify this expression to obtain a form suitable for computation.
3. Explain how stock price data could be used in conjunction with the formula
of the previous question in order to value a call option on a particular stock. Choose
a stock for which call options are traded on the exchange and estimate the parameters
governing the stock price process. Compare the option price with that given by the
Black–Scholes formula. Most traders believe that the interest rate on 3 month U.S.
Treasury bills is the interest rate to use when computing the present value.
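As a starting point for these questions, the Python sketch below evaluates the two quantities involved under one reading of the model: the logarithm of $P_T/P_0$ is taken to be normal with mean equal to the drift times T and variance equal to the variance parameter times T, and v is an annual discount factor. These interpretations, the function names, and the sample strike price and interest rate are assumptions, and the expectation is computed exactly as written in the text using the standard formula for a truncated lognormal mean.

```python
import math

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def prob_price_exceeds(p0, level, mu, var, T):
    """P[P_T > level] when log(P_T / p0) is N(mu * T, var * T)."""
    z = (math.log(level / p0) - mu * T) / math.sqrt(var * T)
    return 1.0 - normal_cdf(z)

def call_value(p0, K, mu, var, T, v):
    """E[v^T max(P_T - K, 0)] when log(P_T / p0) is N(mu * T, var * T)."""
    sd = math.sqrt(var * T)
    d2 = (math.log(p0 / K) + mu * T) / sd
    d1 = d2 + sd
    expected_payoff = (p0 * math.exp(mu * T + 0.5 * var * T) * normal_cdf(d1)
                       - K * normal_cdf(d2))
    return v ** T * expected_payoff

# Question 1 with the stated parameters (drift 0.10, variance 0.005, one year).
print(prob_price_exceeds(20.0, 25.0, 0.10, 0.005, 1.0))

# Question 2 with a hypothetical strike of 22 and 5% annual interest.
print(call_value(20.0, 22.0, 0.10, 0.005, 1.0, 1.0 / 1.05))
```

For the third question the drift and variance arguments would be replaced by estimates from observed log price changes, and the discount factor by one based on the chosen Treasury bill rate.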