Undergraduate Texts in Mathematics

Editors
F. W. Gehring
P. R. Halmos
Advisory Board
C. DePrima
I. Herstein
L. R. Foulds

Optimization Techniques
An Introduction

With 72 Illustrations

Springer-Verlag
New York Heidelberg Berlin
L. R. Foulds
Department of Economics
University of Canterbury
Christchurch 1
New Zealand

Editorial Board

P. R. Halmos
Department of Mathematics
Indiana University
Bloomington, IN 47401
U.S.A.

F. W. Gehring
Department of Mathematics
University of Michigan
Ann Arbor, MI 48109
U.S.A.

AMS Classification: 49-01, 90-01

Library of Congress Cataloging in Publication Data

Foulds, L. R., 1948-


Optimization techniques.

Bibliography: p.
Includes index.
1. Mathematical optimization. 2. Programming
(Mathematics) I. Title.
QA402.5.F68 519 81-5642
AACR2

© 1981 by Springer-Verlag New York Inc.


Softcover reprint of the hardcover 1st edition 1981

All rights reserved. No part of this book may be translated or reproduced in any form
without written permission from Springer-Verlag, 175 Fifth Avenue, New York,
New York 10010, U.S.A.
9 8 7 6 5 4 3 2 1

ISBN-13: 978-1-4613-9460-0 e-ISBN-13: 978-1-4613-9458-7
DOI: 10.1007/978-1-4613-9458-7
This book is dedicated to the memory of my father

Richard Seddon Foulds


Contents

Preface

Plan of the Book

Chapter 1
Introduction
1.1 Motivation for Studying Optimization, 1.2 The Scope of Optimization,
1.3 Optimization as a Branch of Mathematics, 1.4 The History of
Optimization, 1.5 Basic Concepts of Optimization

Chapter 2
Linear Programming
2.1 Introduction, 2.2 A Simple L.P. Problem, 2.3 The General L.P.
Problem, 2.4 The Basic Concepts of Linear Programming, 2.5 The
Simplex Algorithm, 2.6 Duality and Postoptimal Analysis, 2.7
Special Linear Program, 2.8 Exercises

Chapter 3
Advanced Linear Programming Topics
3.1 Efficient Computational Techniques for Large L.P. Problems,
3.2 The Revised Simplex Method, 3.3 The Dual Simplex Method,
3.4 The Primal-Dual Algorithm, 3.5 Dantzig-Wolfe Decomposition,
3.6 Parametric Programming, 3.7 Exercises

Chapter 4
Integer Programming
4.1 A Simple Integer Programming Problem, 4.2 Combinatorial
Optimization, 4.3 Enumerative Techniques, 4.4 Cutting Plane
Methods, 4.5 Applications of Integer Programming, 4.6 Exercises

Chapter 5
Network Analysis
5.1 The Importance of Network Models, 5.2 An Introduction to
Graph Theory, 5.3 The Shortest Path Problem, 5.4 The Minimal
Spanning Tree Problem, 5.5 Flow Networks, 5.6 Critical Path
Scheduling, 5.7 Exercises

Chapter 6
Dynamic Programming
6.1 Introduction, 6.2 A Simple D.P. Problem, 6.3 Basic D.P. Structure,
6.4 Multiplicative and More General Recursive Relationships,
6.5 Continuous State Problems, 6.6 The Direction of Computations,
6.7 Tabular Form, 6.8 Multi-state Variable Problems and the
Limitations of D.P., 6.9 Exercises

Chapter 7
Classical Optimization
7.1 Introduction, 7.2 Optimization of Functions of One Variable,
7.3 Optimization of Unconstrained Functions of Several Variables,
7.4 Optimization of Constrained Functions of Several Variables,
7.5 The Calculus of Variations, 7.6 Exercises

Chapter 8
Nonlinear Programming
8.1 Introduction, 8.2 Unconstrained Optimization, 8.3 Constrained
Optimization, 8.4 Exercises

Chapter 9
Appendix
9.1 Linear Algebra, 9.2 Basic Calculus, 9.3 Further Reading

References

Solutions to Selected Exercises

Index

Preface

Optimization is the process by which the optimal solution to a problem, or
optimum, is produced. The word optimum has come from the Latin word
optimus, meaning best. And since the beginning of his existence Man has
strived for that which is best. There has been a host of contributions, from
Archimedes to the present day, scattered across many disciplines. Many of
the earlier ideas, although interesting from a theoretical point of view, were
originally of little practical use, as they involved a daunting amount of com-
putational effort. Now modern computers perform calculations, whose time
was once estimated in man-years, in the figurative blink of an eye. Thus it has
been worthwhile to resurrect many of these earlier methods. The advent of
the computer has helped bring about the unification of optimization theory
into a rapidly growing branch of applied mathematics. The major objective
of this book is to provide an introduction to the main optimization tech-
niques which are at present in use. It has been written for final year undergrad-
uates or first year graduates studying mathematics, engineering, business, or
the physical or social sciences. The book does not assume much mathemati-
cal knowledge. It has an appendix containing the necessary linear algebra
and basic calculus, making it virtually self-contained.
This text evolved out of the experience of teaching the material to finishing
undergraduates and beginning graduates. A feature of the book is that it
adopts the sound pedagogical principle that an instructor should proceed
from the known to the unknown. Hence many of the ideas in the earlier
chapters are introduced by means of a concrete numerical example to which
the student can readily relate. This is followed by generalization to the
underlying theory. The courses on which the book is based usually have a
significant number of students of Business and Engineering. The interests
of these people have been taken into account in the development of the
courses and hence in the writing of this book. Hence many of its arguments
are intuitive rather than rigorous. Indeed, plausibility and clarity have been
given precedence over rigour for its own sake.
Chapter 1 contains a brief historical account and introduces the basic
terminology and concepts common to all the theory of optimization. Chap-
ters 2 and 3 are concerned with linear programming and complications of
the basic model. Chapter 2 on the simplex method, duality, and sensitivity
analysis can be covered in an undergraduate course. However, some of the
topics in Chapter 3, such as considerations of efficiency and parametric
programming, may be best left to graduate level. Chapter 4 deals with only the
basic strategies of integer linear programming. It is of course dependent on
Chapter 2. It does contain a number of formulations of applications of inte-
ger programming. Some of this material has never appeared before in book
form. Chapter 5 is on network analysis and contains a section on using net-
works to analyze some practical problems.
Chapter 6 introduces dynamic programming. It is beyond the scope of
this book to provide a detailed account of this vast topic. Hence techniques
suitable for only deterministic, serial systems are presented. The interested
reader is referred to the extensive literature. Chapter 7 serves as an introduc-
tion to Chapter 8, which is on nonlinear programming. It presents some of
the classical techniques: Jacobian and Lagrangian methods together with the
Kuhn-Tucker conditions. The ideas in this chapter are used in devising the
more computationally efficient strategies of Chapter 8.
This text contains enough material for one semester at the undergraduate
level and one more at the graduate level. The first course could contain Chap-
ters 1, 2, the first half of Chapter 3, and parts of Chapter 4 and Chapter 5.
The remainder can be covered in the second course. A plan outlining this
follows.
The book contains a large number of exercises. Students are strongly en-
couraged to attempt them. One cannot come to grips with the concepts by
solely looking at the work of others. Mathematics is not a spectator sport!
The author is grateful for this opportunity to express his thanks for the
support of his employers, the University of Canterbury, which he enjoyed
while finishing this book. He is also thankful for the faith and encouragement
of his wife, Maureen, without which it would never have been written. He is
also grateful to a number of friends including David Robinson, Hans
Daellenbach, Michael Carter, Ian Coope and Susan Byrne, who read parts of
the manuscript and made valuable suggestions. A vote of thanks should also
go to his student, Trevor Kearney, who read the entire manuscript and dis-
covered an embarrassing number of errors.

Christchurch, New Zealand L. R. Foulds


November 1980
Plan of the Book

[Diagram: plan of the book, grouping the chapters into an Undergraduate
Course and a Graduate Course.]

Chapter 1

Introduction

1.1 Motivation for Studying Optimization


There exists an enormous variety of activities in the everyday world which
can usefully be described as systems, from actual physical systems such as
chemical processing plants to theoretical entities such as economic models.
The efficient operation of these systems often requires an attempt at the
optimization of various indices which measure the performance of the system.
Sometimes these indices are quantified and represented as algebraic vari-
ables. Then values for these variables must be found which maximize the
gain or profit of the system and minimize the waste or loss. The variables
are assumed to be dependent upon a number of factors. Some of these
factors are often under the control, or partial control, of the analyst respon-
sible for the performance of the system.
The process of attempting to manage the limited resources of a system
can usually be divided into six phases: (i) mathematical analysis of the
system; (ii) construction of a mathematical model which reflects the impor-
tant aspects of the system; (iii) validation of the model; (iv) manipulation
of the model to produce a satisfactory, if not optimal, solution to the model;
(v) implementation of the solution selected; and (vi) the introduction of a
strategy which monitors the performance of the system after implementation.
It is with the fourth phase, the manipulation of the model, that the theory
of optimization is concerned. The other phases are very important in the
management of any system and will probably require greater total effort
than the optimization phase. However, in the presentation of optimization
theory here it will be assumed that the other phases have been, or will be,
carried out. Because the theory of optimization provides this link in the
chain of systems management it is an important body of mathematical
knowledge.

1.2 The Scope of Optimization


One of the most important tools of optimization is linear programming. A
linear programming problem is specified by a linear, multivariable function
which is to be optimized (maximized or minimized) subject to a number of
linear constraints. The mathematician G. B. Dantzig (1963) developed an
algorithm called the simplex method to solve problems of this type. The
original simplex method has been modified into an efficient algorithm to
solve large linear programming problems by computer. Problems from a
wide variety of fields of human endeavor can be formulated and solved by
means of linear programming. Resource allocation problems in government
planning, network analysis for urban and regional planning, production
planning problems in industry, and the management of transportation dis-
tribution systems are just a few. Thus linear programming is one of the
successes of modern optimization theory.
Integer programming is concerned with the solution of optimization prob-
lems in which at least some of the variables must assume only integer values.
In this book only integer programming problems in which all terms are
linear will be covered. This subtopic is often called integer linear program-
ming. However, because little is known about how to solve nonlinear integer
programming problems, the word linear will be assumed here for all terms.
Many problems of a combinatorial nature can be formulated in terms of
integer programming. Practical examples include facility location, job se-
quencing in production lines, assembly line balancing, matching problems,
inventory control, and machine replacement. One of the important methods
for solving these problems, due to R. E. Gomory (1958), is based in part on
the simplex method mentioned earlier. Another approach is of a combina-
torial nature and involves reducing the original problem to smaller, hope-
fully easier, problems and partitioning the set of possible solutions into
smaller subsets which can be analyzed more easily. This approach is called
branch and bound or branch and backtrack. Two of the important contri-
butions to this approach have been by Balas (1965) and Dakin (1965).
Although a number of improvements have been made to all these methods,
there does not exist as yet a relatively efficient method for solving realistically-
sized integer programming problems.
Another class of problems involves the management of a network. Problems
in traffic flow, communications, the distribution of goods, and project
scheduling are often of this type. Many of these problems can be solved by
the methods mentioned previously: linear or integer programming. However,
because these problems usually have a special structure, more efficient
specialized techniques have been developed for their solution. Outstanding
contributions have been made in this field by Ford and Fulkerson (1962).
They developed the labelling method for maximizing the flow of a commodity
through a network and the out-of-kilter method for minimizing the cost of
transporting a given quantity of a commodity through a network. These
ideas can be combined with those of integer programming to analyze a
whole host of practical network problems.
Some problems can be decomposed into parts, the decision processes of
which are then optimized. In some instances it is possible to attain the opti-
mum for the original problem solely by discovering how to optimize these
constituent parts. This decomposition process is very powerful, as it allows
one to solve a series of smaller, easier problems rather than one large,
intractable problem. Systems for which this approach will yield a valid
optimum are called serial multistage systems. One of the best known tech-
niques to attack such problems was named dynamic programming by the
mathematician who developed it, R. E. Bellman (1957). Serial multistage
systems are characterized by a process which is performed in stages, such
as manufacturing processes. Rather than attempting to optimize some
performance measure by looking at the problem as a whole, dynamic
programming optimizes one stage at a time to produce an optimal set of
decisions for the whole process. Problems from all sorts of areas, such as
capital budgeting, machine reliability, and network analysis, can be viewed
as serial multistage systems. Thus dynamic programming has wide applica-
bility.
In the formulation of many optimization problems the assumption of
linearity cannot be made, as it was in the case of linear programming. There
do not exist general procedures for nonlinear problems. A large number of
specialized algorithms have been developed to treat special cases. Many of
these procedures are based on the mathematical theory concerned with
analysing the structure of such problems. This theory is usually termed
classical optimization. One of the outstanding modern contributions to this
theory has been made by Kuhn and Tucker (1951) who developed what are
known as the Kuhn-Tucker conditions.
The collection of techniques developed from this theory is called nonlinear
programming. Despite the fact that many nonlinear programming problems
are very difficult to solve, there are a number of practical problems which
can be formulated nonlinearly and solved by existing methods. These
include the design of such entities as electrical transformers, chemical
processes, vapour condensers, microwave matching networks, gallium-arsenic
light sources, digital filters, and also problems concerning maximum
likelihood estimation and optimal parts replacement.

1.3 Optimization as a Branch of Mathematics


It can be seen from the previous section that the theory of optimization is
mathematical in nature. Typically it involves the maximization or minimi-
zation of a function (sometimes unknown) which represents the performance
of some system. This is carried out by the finding of values for those variables
(which are both quantifiable and controllable) which cause the function to
yield an optimal value. A knowledge of linear algebra and differential
multivariable calculus is required in order to understand how the algorithms
operate. A sound knowledge of analysis is necessary for an understanding
of the theory.
Some of the problems of optimization theory can be solved by the classical
techniques of advanced calculus, such as Jacobian methods and the use
of Lagrange multipliers. However, most optimization problems do not
satisfy the conditions necessary for solution in this manner. Of the remaining
problems many, although amenable to the classical techniques, are solved
more efficiently by methods designed for the purpose. Throughout recorded
mathematical history a collection of such techniques has been built up.
Some have been forgotten and reinvented, others received little attention
until modern-day computers made them feasible.
The bulk of the material of the subject is of recent origin because many
of the problems, such as traffic flow, are only now of concern and also
because of the large numbers of people now available to analyze such
problems. When the material is catalogued into a meaningful whole the
result is a new branch of applied mathematics.

1.4 The History of Optimization


One of the first recorded instances of optimization theory concerns the
finding of a geometric curve of given length which will, together with a
straight line, enclose the largest possible area. Archimedes conjectured
correctly that the optimal curve is a semicircle. Some of the early results are
in the form of principles which attempt to describe and explain natural
phenomena. One of the earliest examples was presented approximately
100 years after Archimedes' conjecture. It was formulated by Heron of
Alexandria in c. 100 B.C., who postulated that light always travels by the
shortest path. It was not until 1657 that Fermat correctly generalized this
postulate by stating that light always travels by the path which incurs least
time rather than least distance.
The fundamental problem of another branch of optimization is concerned
with the choosing of a function that minimizes certain functionals. (A
functional is a special type of function whose domain is a set of real-valued
functions.) Two problems of this nature were known at the time of Newton.
The first involves finding a curve such that the solid of revolution created
by rotating the curve about a line through its endpoints causes the minimum
resistance when this solid is moved through the air at constant velocity.
The second problem is called the brachistochrone. In this problem two points
in space are given. One wishes to find the shape of a curve joining the two
points, such that a frictionless bead travelling on the curve from one point
to the other will cover the journey in least time. This problem was posed as a
competition by John Bernoulli in 1696. The problem was successfully solved
by Bernoulli himself, de l'Hôpital, Leibniz, and Newton (who took less than
a day!). Problems such as these led Euler to develop the ideas involved into
a systematic discipline which he called the calculus of variations in 1766.
Also at the time of Euler many laws of mechanics were first formulated in
terms of principles of optimality (examples are the least action principle of
Maupertuis, the principle of least restraint of Gauss, and Lagrange's kinetic
principle). Lagrange and Gauss both made other contributions. In 1760
Lagrange invented a method for solving optimization problems that had
equality constraints using his Lagrange multipliers. Lagrange transforma-
tions are, among other uses, employed to examine the behaviour of a function
in the neighbourhood of a suspected optimum. And Gauss, who made
contributions to many fields, developed the method of least squares curve
fitting which is of interest to those working in optimization as well as
statistics.
In 1834 W. R. Hamilton developed a set of functions called Hamiltonians
which were used in the statement of a principle of optimality that unified
what was known of optics and mechanics at that time. In 1875 J. W. Gibbs
presented a further principle of optimality concerned with the equilibrium
of a thermodynamical system. Between that time and the present there have
been increasing numbers of contributions each year. Among the most out-
standing recent achievements, the works of Dantzig and of Bellman have
already been mentioned. Another is the work of Pontryagin (1962) and others,
who developed the maximum principle which is used to solve problems in
the theory of optimal control.

1.5 Basic Concepts of Optimization


This section introduces some of the basic concepts of optimization. Each
concept is illustrated by means of the following example.
The problem is to:

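\[
\begin{aligned}
\text{Maximize:} \quad & x_0 = f(X) = f(x_1, x_2) && (1.1)\\
\text{subject to:} \quad & h_1(X) \le 0 && (1.2)\\
& x_1 \ge 0 && (1.3)\\
& x_2 \ge 0. && (1.4)
\end{aligned}
\]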

This is a typical problem in the theory of optimization: the maximization
(or minimization) of a real-valued function of a number of real variables
(sometimes just a single variable) subject to a number of constraints
(sometimes the number is zero). The special case of functionals, where the
domain of the function is a set of functions, will be dealt with under the
section on the calculus of variations in Chapter 7.
The function \(f\) is called the objective function. The set of constraints, in
this case a set of inequalities, is called the constraint set. The problem is to
find real values for \(x_1\) and \(x_2\), satisfying (1.2), (1.3) and (1.4), which when
inserted in (1.1) will cause \(f(x_1, x_2)\) to take on a value no less than that for
any other such \(x_1\), \(x_2\) pair. Hence \(x_1\) and \(x_2\) are called independent variables.

Figure 1.1. Objective function contours and the feasible region for an optimization
problem.

Three objective function contours are present in Figure 1.1. The objective
function has the same value at all points on each line, so that the contours
can be likened to isobar lines on a weather map. Thus it is not hard to see
that the solution to the problem is:
\[
X^* = (x_1^*, x_2^*) = (1, 0).
\]
This means that
\[
f(X^*) \ge f(X) \quad \text{for all } X \in S. \qquad (1.5)
\]
When a solution \(X^* \in S\) satisfies (1.5) it is called the optimal solution, and
in this case the maximal solution. If the symbol in (1.5) were "\(\le\)", \(X^*\) would
be called the minimal solution. Also, \(f(X^*)\) is called the optimum and is
written \(x_0^*\).
On looking at Figure 1.1 it can be seen that greater values for \(f\) could
be obtained by choosing certain \(x_1\), \(x_2\) outside \(S\). Any ordered pair of real
numbers is called a solution to the problem and the corresponding value of
\(f\) is called the value of the solution. A solution \(X\) such that
\[
X \in S
\]
is called a feasible solution.
Let us examine which \(x_1\), \(x_2\) pairs are likely candidates to achieve this
maximization. In Figure 1.1 the set of points which satisfy this constraint
set has been shaded. The set is defined as \(S\):
\[
S = \{(x_1, x_2) : h_1(x_1, x_2) \le 0,\ x_1 \ge 0,\ x_2 \ge 0\}.
\]
Such a set \(S\) for an optimization problem is often a connected region and
is called the feasible region.
Many optimization problems do not have unique optimal solutions. For
instance, suppose a fourth constraint
(1.6)
is added to the problem. The feasible region is shown in Figure 1.2. In this
case one of the boundaries of S coincides with an objective function contour.
Thus all points on that boundary represent maximum solutions.
However, if it exists the optimal value is always unique.
As another example of a problem which does not have an optimal solution,
suppose (1.2) is replaced by:
(1.7)
On examining Figure 1.2, it becomes apparent that (1.7) does not hold for
\(X^* = (1, 0)\), hence \(X^* \notin S\). In fact, there is no solution which will satisfy
(1.5), as points successively closer to (but a positive distance away from)
\((1, 0)\) correspond to successively larger \(x_0\) values. To recognize this situation
we call \(f(X')\) an upper bound for \(f\) under \(S\) if
\[
f(X') \ge f(X) \quad \text{for all } X \in S. \qquad (1.8)
\]
Also \(f(X')\) is called a least upper bound or supremum for \(f\) under \(S\) if \(f(X')\)
is an upper bound for \(f\) under \(S\) and
\[
f(X') \le f(X) \quad \text{for all upper bounds } f(X) \text{ for } f \text{ under } S. \qquad (1.9)
\]
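A one-variable illustration may make the distinction between a maximum and a supremum concrete. It is an assumed example chosen for simplicity, not the problem of Figure 1.2: take \(f(x) = x\) on the half-open interval \(S = \{x : 0 \le x < 1\}\).
\[
\begin{aligned}
&\text{For every } x \in S:\quad x < \tfrac{x+1}{2} < 1, \ \text{so } \tfrac{x+1}{2} \in S \ \text{and } f\!\left(\tfrac{x+1}{2}\right) > f(x).\\
&\text{Hence no } x \in S \text{ satisfies (1.5), and there is no maximal solution.}\\
&\text{The upper bounds for } f \text{ under } S \text{ are exactly the numbers } u \ge 1, \ \text{so } \sup_{x \in S} f(x) = 1 = f(1), \ \text{with } 1 \notin S.
\end{aligned}
\]
In this sketch the supremum exists and is unique, but it is not attained at any feasible point.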
Figure 1.2. Feasible region for an optimization problem where one constraint is
identical with an objective function contour.

Most of the preceding ideas have been concerned with maximization. Of
course many optimization problems have the aim of minimization and each
of the above concepts has a minimization counterpart. The sense of the
inequalities in (1.7), (1.8), and (1.9) needs to be reversed for minimization.
The counterparts of the terms are:

    Minimization            Maximization
    minimum                 maximum
    lower bound             upper bound
    greatest lower bound    least upper bound
    infimum                 supremum

Throughout the remainder of the book we shall deal mainly with
maximization problems, because of the following theorem.

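Theorem 1.1. If \(X^*\) is the optimal solution to problem P1:
\[
\begin{aligned}
\text{Maximize:} \quad & f(X),\\
\text{subject to:} \quad & g_j(X) = 0, \quad j = 1, 2, \dots, m\\
& h_j(X) \le 0, \quad j = 1, 2, \dots, k
\end{aligned}
\]
then \(X^*\) is the optimal solution to problem P2:
\[
\begin{aligned}
\text{Minimize:} \quad & -f(X),\\
\text{subject to:} \quad & g_j(X) = 0, \quad j = 1, 2, \dots, m\\
& h_j(X) \le 0, \quad j = 1, 2, \dots, k.
\end{aligned}
\]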

PROOF. Because \(X^*\) is the optimal solution for P1, it is a feasible solution
for P1, hence
\[
\begin{aligned}
g_j(X^*) &= 0, \quad j = 1, 2, \dots, m\\
h_j(X^*) &\le 0, \quad j = 1, 2, \dots, k.
\end{aligned}
\]
Hence \(X^*\) is a feasible solution for P2.
Also,
\[
f(X^*) \ge f(X) \quad \text{for all } X \in S
\]
where
\[
S = \{X : g_j(X) = 0,\ j = 1, 2, \dots, m;\ h_j(X) \le 0,\ j = 1, 2, \dots, k\}.
\]
Hence
\[
-f(X^*) \le -f(X) \quad \text{for all } X \in S.
\]
Hence \(X^*\) is optimal for P2. □

This result allows us to solve any minimization problem by multiplying
its objective function by -1 and solving a maximization problem under the
same constraints. Of course we could have just as easily proven another
theorem concerning the conversion of any maximization problem into an
equivalent minimization problem.
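The equivalence in Theorem 1.1 can be checked numerically on a small case. The sketch below is only an illustration: the objective `f`, the constraint inside `is_feasible`, and the coarse grid standing in for the feasible region are all assumptions made for this example and do not come from the text.

```python
from itertools import product

# Assumed objective for illustration: a concave function with one peak.
def f(x):
    x1, x2 = x
    return -(x1 - 1.0) ** 2 - (x2 - 0.5) ** 2

# Assumed constraint set: x1 + x2 - 1 <= 0, x1 >= 0, x2 >= 0.
def is_feasible(x):
    x1, x2 = x
    return x1 + x2 - 1.0 <= 0 and x1 >= 0 and x2 >= 0

# A coarse grid of candidate points (steps of 0.25) restricted to S.
grid = [i / 4 for i in range(5)]
S = [x for x in product(grid, repeat=2) if is_feasible(x)]

# Problem P1: maximize f over S.
x_max = max(S, key=f)

# Problem P2: minimize -f over the same S.
x_min = min(S, key=lambda x: -f(x))

print(x_max, x_min, f(x_max))
```

Both searches return the same point, and the optimum of P2 is the negative of the optimum of P1, which is what the theorem asserts for this finite stand-in problem.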
Chapter 2

Linear Programming

2.1 Introduction
This present chapter is concerned with a most important area of optimiza-
tion, in which the objective function and all the constraints are linear. Prob-
lems in which this is not the case fall in the nonlinear programming category
and will be covered in Chapters 7 and 8.
There are a large number of real problems that can be either formulated
as linear programming (L.P.) problems or formulated as models which can
be successfully approximated by linear programming. Relatively small prob-
lems can readily be solved by hand, as will be explained later in the chapter.
Large problems can be solved by very efficient computer programs. The
mathematical structure of L.P. allows important questions to be answered
concerning the sensitivity of the optimum to data changes. L.P. is also used
as a subroutine in the solving of more complex problems in nonlinear and
integer programming.
This chapter will begin by introducing the basic ideas of L.P. with a sim-
ple example and then generalize. A very efficient method for solving L.P.
problems, the simplex method, will be developed and it will be shown how
the method deals with the different types of complications that can arise.
Next the idea of a dual problem is introduced with a view to analyzing the
behaviour of the optimal L.P. solution when the problem is changed. This
probing is called postoptimal analysis. Algorithms for special L.P. problems
will also be looked at.

2.2 A Simple L.P. Problem


A coal mining company producing both lignite and anthracite finds itself in
the happy state of being able to sell all the coal it can process. The present
profit is $4.00 and $3.00 (in hundreds of dollars) for a ton of lignite and an-
thracite, respectively. However, because of various restrictions the cutting
machine at the coal face, the screens, and the washing plant can be operated
for no more than 12, 10, and 8 hours per day, respectively. It requires 3, 3,
and 4 hours for the cutting machine, the screens, and the washing plant, re-
spectively, to process one ton of lignite. It requires 4, 3, and 2 hours for the
cutting machine, the screens, and the washing plant, respectively, to process
one ton of anthracite. The problem is to decide how many tons of each type
of coal will be produced so as to maximize daily profits.
In order to solve this problem we need to express it in mathematical terms.
Toward this end the decision (independent) variables are defined as follows.
Let
\(x_1\) = the daily production of lignite in tons,
\(x_2\) = the daily production of anthracite in tons,
\(x_0\) = the profit gained by producing \(x_1\) and \(x_2\) tons of lignite and
anthracite, respectively.
If \(x_1\) tons of lignite are produced each day, and the profit per ton is $4.00,
then the daily profit for lignite is
\[
\$4x_1.
\]
Similarly, if \(x_2\) tons of anthracite are produced each day with a profit of $3.00
per ton, then the daily profit is
\[
\$3x_2.
\]
Thus for a daily production schedule of \(x_1\) and \(x_2\) tons of lignite and
anthracite, the total daily profit, in hundreds of dollars, is:
\[
4x_1 + 3x_2 \ (= x_0).
\]
It is this expression whose value we must maximize.
We can formulate similar expressions for the constraints of time on the
various machines. For instance, consider the cutting operation. If \(x_1\) tons of
lignite are produced each day and each ton of lignite requires 3 hours' cutting
time, then the total cutting time required to produce those \(x_1\) tons of
lignite is
\[
3x_1 \text{ hours}.
\]
Similarly, if \(x_2\) tons of anthracite are produced each day with each ton taking
4 hours to cut, the total cutting time required to produce those \(x_2\) tons of
anthracite is
\[
4x_2 \text{ hours}.
\]
Thus the total cutting time for \(x_1\) tons of lignite and \(x_2\) tons of anthracite is
\[
3x_1 + 4x_2.
\]
But only 12 hours' cutting time are available each day. Hence we have the
constraint:
\[
3x_1 + 4x_2 \le 12.
\]

We can formulate similar constraints for the screening and washing times.
This has been done below. The problem can now be stated mathematically:
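\[
\begin{aligned}
\text{Maximize:} \quad & 4x_1 + 3x_2 = x_0 && (2.1)\\
\text{subject to:} \quad & 3x_1 + 4x_2 \le 12 && (2.2)\\
& 3x_1 + 3x_2 \le 10 && (2.3)\\
& 4x_1 + 2x_2 \le 8 && (2.4)\\
& x_1 \ge 0 && (2.5)\\
& x_2 \ge 0. && (2.6)
\end{aligned}
\]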
The above expressions are now explained:
(2.1): The objective is to maximize daily profit.
(2.2): A maximum of 12 hours cutting time is available each day.
(2.3): A maximum of 10 hours screening time is available each day.
(2.4): A maximum of 8 hours washing time is available each day.
(2.5), (2.6): A nonnegative amount of each type of coal must be produced.
Because only two independent variables are present it is possible to solve
the problem graphically. This can be achieved by first plotting the constraints
(2.2)-(2.6) in two-dimensional space. The origin can be used to test which
half-plane created by each constraint contains feasible points. The feasible
region is shown in Figure 2.1. It can be seen that constraint (2.3) is redundant,
in the sense that it does not define part of the boundary of the feasible region.
The arrow on constraint (2.3) denotes the feasible half-plane defined by the
constraint. The problem now becomes that of selecting the point in the
feasible region which corresponds to the maximum objective function value,
the optimum. This point is found by setting the objective function equal to a
number of values and plotting the resulting lines. Clearly, the maximum
value corresponds to point \((\tfrac{4}{5}, \tfrac{12}{5})\). Thus the optimal solution is
\[
x_1^* = \tfrac{4}{5} \quad \text{and} \quad x_2^* = \tfrac{12}{5}
\]
with value \(10\tfrac{2}{5}\). Hence the best profit the company can hope to make is $1,040
by producing 0.8 tons of lignite and 2.4 tons of anthracite per day.
When more than two independent variables are present, linear programs
are solved by analytic methods, as it is difficult to draw in three dimensions
and impossible in higher dimensions. The next section introduces the general
problem.
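The graphical argument can be mimicked numerically. The following is a minimal sketch, not part of the original text: it assumes the data of (2.1)-(2.6), intersects every pair of constraint boundary lines, keeps the feasible intersection points, and picks the one with the largest objective value. The names `CONSTRAINTS`, `intersection`, `feasible`, `objective`, and `best_vertex` are illustrative only.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a1, a2, b), meaning a1*x1 + a2*x2 <= b.
# Nonnegativity is written as -x1 <= 0 and -x2 <= 0.
CONSTRAINTS = [
    (3, 4, 12),   # (2.2) cutting
    (3, 3, 10),   # (2.3) screening
    (4, 2, 8),    # (2.4) washing
    (-1, 0, 0),   # (2.5) x1 >= 0
    (0, -1, 0),   # (2.6) x2 >= 0
]
PROFIT = (4, 3)   # objective coefficients of (2.1)


def intersection(c1, c2):
    """Intersection of two boundary lines (Cramer's rule), or None if parallel."""
    a1, a2, b = c1
    d1, d2, e = c2
    det = a1 * d2 - a2 * d1
    if det == 0:
        return None
    return (Fraction(b * d2 - a2 * e, det), Fraction(a1 * e - b * d1, det))


def feasible(point):
    """True if the point satisfies every constraint."""
    return all(a1 * point[0] + a2 * point[1] <= b for a1, a2, b in CONSTRAINTS)


def objective(point):
    return PROFIT[0] * point[0] + PROFIT[1] * point[1]


def best_vertex():
    """Corner point of the feasible region with the largest objective value."""
    vertices = []
    for c1, c2 in combinations(CONSTRAINTS, 2):
        point = intersection(c1, c2)
        if point is not None and feasible(point):
            vertices.append(point)
    return max(vertices, key=objective)
```

Exact fractions are used so that the corner points come out as 4/5 and 12/5 rather than rounded decimals. This brute-force enumeration is only practical for very small problems.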
Figure 2.1. Graphical solution to the L.P. example problem.

2.3 The General L.P. Problem


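The problem of (2.1)-(2.6) can be generalized as follows:
$$\begin{aligned}
\text{Maximize:}\quad & c_1x_1 + c_2x_2 + \cdots + c_nx_n = x_0\\
\text{subject to:}\quad & a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n \le b_1\\
& a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n \le b_2\\
& \qquad\vdots\\
& a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n \le b_m\\
& x_i \ge 0, \quad i = 1, 2, \ldots, n.
\end{aligned}$$

Of course this problem can be stated in matrix form:
$$\begin{aligned}
\text{Maximize:}\quad & C^TX\\
\text{subject to:}\quad & AX \le B,\\
& X \ge 0,
\end{aligned}$$
where
$$\begin{aligned}
C &= (c_1, c_2, \ldots, c_n)^T\\
X &= (x_1, x_2, \ldots, x_n)^T\\
A &= (a_{ij})_{m \times n}\\
B &= (b_1, b_2, \ldots, b_m)^T\\
0 &= (0, 0, \ldots, 0)^T \quad (n \text{ entries}).
\end{aligned}$$
Here $(x_1, x_2, \ldots, x_n)^T$ represents the transpose of $(x_1, x_2, \ldots, x_n)$. The
general minimizing linear program has an analogous form:
$$\begin{aligned}
\text{Minimize:}\quad & C^TX\\
\text{subject to:}\quad & AX \ge B\\
& X \ge 0.
\end{aligned}$$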

We are now in a position to discover some basic features of the general
linear programming problem.
1. The objective function and all the constraints are linear functions of the
independent variables. This assumption has some important implica-
tions. It means that both the contribution of the level of each activity
represented by its decision variable value (for the objective function) and
the drain on resources of each activity (for the constraints) are directly
proportional to the level of the activity. That is, for example, doubling the
amount of a product produced will double both the profit gained by the
product and the amount of each resource used on the product. It also
means that both the total contribution to the objective and the total drain
on each resource of all activities are, in each case, the sum of those of the
individual activities.
2. The independent variables are all nonnegative. Nearly all problems which
come from real situations have this property. In the few cases where this
is not so, no great hardship need occur. A method for replacing variables
unrestricted in sign by nonnegative ones will be explained later in this
section.
3. The independent variables are all continuous. This feature does restrict
the application of linear programming. It does not make sense to advo-
cate the allocation of a noninteger number of ships to a task, as this would
be indiscrete in more ways than one! When the variables concerned have
relatively large values at the optimum they can often be rounded to the
nearest feasible combination of integral values to yield a satisfactory so-
lution. When this is not true, specialized artillery, collectively called inte-
ger linear programming, must be brought into service. Some of the shots
that can be fired are examined in Chapter 4.
4. Each constraint involves either a "≤" or a "≥" sign. In many problems,
one or more constraints contain an equality sign. A method for replacing
such equations by inequalities will be explained later. In the previous
chapter we found that a problem with a strict inequality constraint
(involving either a "<" or a ">" sign) does not necessarily have an optimal
solution. This is also true for linear programming. Most problems from
real situations do not contain strict inequality constraints, and methods
for solving L.P. problems do not allow for strict inequalities. Thus we
shall confine our attention to problems in which all the inequalities are
nonstrict, i.e., are of the "≤" or "≥" type, not the "<" or ">" type.

Although all L.P. problems possess all four features outlined above, it is
obvious that there can be many variations. The problem could be one of
maximization or minimization, it may contain variables unrestricted in sign,
and it may contain a mixture of constraint signs. Rather than devise a method
for each class of problems, a method will be presented which will solve the
problems of one common class. The method is completely general, as it will
be shown that any L.P. problem can be made a member of the class by a
series of simple steps. L.P.'s belonging to the class of interest are said to be in
standard form.
An L.P. is in standard form if it can be expressed as:
$$\begin{aligned}
\text{Maximize:}\quad & C^TX && (2.7)\\
\text{subject to:}\quad & AX = B && (2.8)\\
& X \ge 0, && (2.9)
\end{aligned}$$
where
$$B \ge 0.$$

Thus the features of a problem in standard form are

1. The objective function is to be maximized.


2. All constraints except the nonnegativity conditions are strict equations.
3. The independent variables are all nonnegative.
4. The constant to the right of each equality sign in each constraint is non-
negative.

The steps that transform any L.P. into standard form are as follows.

1. A minimizing problem can be transformed into a maximizing problem by


replacing the objective function by a new function in which the signs of
the objective function coefficients have all been changed. (See Section 1.5).
2. Each variable unrestricted in sign can be replaced by an expression
representing the difference between two new nonnegative variables. For
example, if $x_i$ is unrestricted, it is replaced by
$$x_j - x_k,$$
where $x_j$ and $x_k$ are new variables. Two new constraints,
$$x_j \ge 0, \qquad x_k \ge 0,$$
are added to the problem.
3. Each negative right-hand-side constraint constant can be made positive
by multiplying the entire equation or inequality by minus one. For an
inequality this also reverses the direction of the inequality sign.
4. Each inequality constraint can be made an equation by adding a
nonnegative variable to a "≤" constraint, or subtracting a nonnegative
variable from a "≥" constraint. For example, consider the constraint
$$3x_1 + 4x_2 \le 6.$$
This becomes
$$3x_1 + 4x_2 + x_i = 6,$$
and a new constraint is added:
$$x_i \ge 0.$$
Similarly, a constraint of the form
$$5x_3 - 9x_4 \ge 18$$
becomes
$$5x_3 - 9x_4 - x_j = 18,$$
with the additional constraint:
$$x_j \ge 0.$$
Note that as all decision variables must be nonnegative the new variables
which force equality must be added for "≤" constraints and subtracted for
"≥" constraints. The new variables added to the constraints are called slack
variables. The original variables are called structural variables.
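The problem of Section 2.2 has the following standard form:

PROBLEM 2.1
$$\begin{aligned}
\text{Maximize:}\quad & 4x_1 + 3x_2 = x_0 && (2.10)\\
\text{subject to:}\quad & 3x_1 + 4x_2 + x_3 = 12 && (2.11)\\
& 3x_1 + 3x_2 + x_4 = 10 && (2.12)\\
& 4x_1 + 2x_2 + x_5 = 8 && (2.13)\\
& x_i \ge 0, \quad i = 1, 2, \ldots, 5.
\end{aligned}$$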
Now that the problem is in a form suitable to be attacked, we can consider
ways to find its solution. It is apparent that realistically-sized problems will
present quite a challenge and thus trial-and-error methods would be futile.
Before unveiling the algorithm, some mathematical preliminaries are pre-
sented which are essential to the understanding of the method.
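Steps 3 and 4 of the conversion to standard form described above are mechanical, and a short sketch may make them concrete. The function below is an illustration under stated assumptions, not the book's procedure: the name `to_standard_form` and the `(coefficients, sense, rhs)` input layout are hypothetical, and steps 1 and 2 (minimization and unrestricted variables) are not handled.

```python
def to_standard_form(constraints):
    """Turn inequality constraints into equations with nonnegative right-hand sides.

    constraints: list of (coeffs, sense, rhs), with sense one of "<=", ">=", "=".
    Returns (rows, rhs); each row holds the structural coefficients followed by
    one column per slack variable.
    """
    flipped = {"<=": ">=", ">=": "<=", "=": "="}
    normalized = []
    for coeffs, sense, b in constraints:
        # Step 3: multiply through by -1 when the right-hand side is negative,
        # which also reverses the direction of an inequality.
        if b < 0:
            coeffs, sense, b = [-c for c in coeffs], flipped[sense], -b
        normalized.append((list(coeffs), sense, b))

    n_slack = sum(1 for _, sense, _ in normalized if sense != "=")
    rows, rhs, k = [], [], 0
    for coeffs, sense, b in normalized:
        slack = [0] * n_slack
        if sense != "=":
            # Step 4: add a slack to "<=" rows, subtract one from ">=" rows.
            slack[k] = 1 if sense == "<=" else -1
            k += 1
        rows.append(coeffs + slack)
        rhs.append(b)
    return rows, rhs


# The constraints (2.2)-(2.4) of the example problem.
ROWS, RHS = to_standard_form([
    ([3, 4], "<=", 12),
    ([3, 3], "<=", 10),
    ([4, 2], "<=", 8),
])
```

Applied to the example, this reproduces the coefficient rows of (2.11)-(2.13), with the slack columns corresponding to $x_3$, $x_4$, and $x_5$.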

2.4 The Basic Concepts of Linear Programming


Consider the L.P. problem (2.7)-(2.9). Suppose that the problem has $n$
variables and $m$ constraints, so that $X$ has $n$ components, $B$ has $m$
components, and $A$ is an $m \times n$ matrix.
A solution $X$ is feasible if it satisfies (2.8) and (2.9). Let us now consider (2.8):
$$AX = B.$$
This represents a system of $m$ equations in $n$ unknowns.
If
$$m > n,$$
some of the constraints are redundant (assuming the system is consistent).
If
$$m = n,$$
and $A$ is nonsingular (see Section 9.1.5), a unique solution can be found:
$$X = A^{-1}B.$$
If
$$m < n,$$
$n - m$ of the variables can be set equal to zero. This corresponds to the
formation of an $m \times m$ submatrix $\bar{A}$ of $A$.
As an example of this last possibility, consider Problem 2.1, where
$$m = 3 \quad\text{and}\quad n = 5.$$
Here
$$A = \begin{pmatrix} 3 & 4 & 1 & 0 & 0\\ 3 & 3 & 0 & 1 & 0\\ 4 & 2 & 0 & 0 & 1 \end{pmatrix}.$$
By setting
$$x_4 = 0 \quad\text{and}\quad x_5 = 0,$$
we obtain
$$\bar{A} = \begin{pmatrix} 3 & 4 & 1\\ 3 & 3 & 0\\ 4 & 2 & 0 \end{pmatrix}.$$
Provided $\bar{A}$ is nonsingular, the values of the remaining variables can be
found, as there are now $m$ equations in $m$ unknowns. Such a solution is called
a basic solution and the $m$ variables are called basic variables.

If this basic solution, which must satisfy (2.8), also satisfies (2.9) it is called
a basic feasible solution. A basic feasible solution is called degenerate if at
least one of the basic variables has a zero value.
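To make the definitions of basic and basic feasible solutions concrete, the sketch below enumerates every choice of $m = 3$ columns of $A$ for Problem 2.1, solves the resulting square system exactly, and records whether the solution is nonnegative. It is an illustration only; the helper names `solve_square` and `basic_solutions` are assumptions, not notation from the text.

```python
from fractions import Fraction
from itertools import combinations

# Coefficient matrix and right-hand side of (2.11)-(2.13).
A = [[3, 4, 1, 0, 0],
     [3, 3, 0, 1, 0],
     [4, 2, 0, 0, 1]]
B = [12, 10, 8]


def solve_square(matrix, rhs):
    """Solve matrix * y = rhs by Gauss-Jordan elimination.

    Returns None when the matrix is singular.
    """
    m = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(m):
        pivot_row = next((r for r in range(col, m) if aug[r][col] != 0), None)
        if pivot_row is None:
            return None
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(m):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


def basic_solutions(a, b):
    """List (basis, values, is_feasible) for every nonsingular set of m columns."""
    m, n = len(a), len(a[0])
    found = []
    for basis in combinations(range(n), m):
        sub = [[a[i][j] for j in basis] for i in range(m)]
        values = solve_square(sub, b)
        if values is not None:
            found.append((basis, values, all(v >= 0 for v in values)))
    return found


SOLUTIONS = basic_solutions(A, B)
```

Column indices are zero-based, so the basis `(2, 3, 4)` stands for $(x_3, x_4, x_5)$. For this example the enumeration gives ten basic solutions, four of which are feasible; they correspond to the four corner points of the feasible region in Figure 2.1.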
A subset $S$ of $R^n$ is said to be convex if the line segment joining any two
points of $S$ is also in $S$. That is, $S$ is convex $\Leftrightarrow \alpha X_1 + (1 - \alpha)X_2 \in S$, for all
$X_1, X_2 \in S$, $0 \le \alpha \le 1$. Using this definition we can form some idea of what
a convex set is like in two dimensions. In Figure 2.2, sets D and E are convex,
sets F and G are not.
It is not difficult to show that the set $S$ of all feasible solutions to an L.P.
problem in standard form is convex. If the set is nonempty it must be
examined in order to identify which of its points corresponds to the optimum.
A point $X$ of a convex set $S$ is said to be an extreme point of $S$ if $X$ cannot be
expressed as:
$$X = \alpha X_1 + (1 - \alpha)X_2 \quad\text{for some } \alpha,\ 0 < \alpha < 1;\ X_1 \ne X_2;\ X_1, X_2 \in S.$$
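The claim that the feasible set of a standard-form problem is convex follows directly from the definition. A short derivation, using only (2.8), (2.9), and the definition of convexity given above, might run as follows.

```latex
Let $X_1, X_2 \in S$, so that $AX_1 = B$, $AX_2 = B$, $X_1 \ge 0$, $X_2 \ge 0$,
and let $0 \le \alpha \le 1$. Put $X = \alpha X_1 + (1 - \alpha) X_2$. Then
\begin{align*}
AX &= \alpha A X_1 + (1 - \alpha) A X_2 \\
   &= \alpha B + (1 - \alpha) B \\
   &= B,
\end{align*}
so $X$ satisfies (2.8). Since $\alpha \ge 0$, $1 - \alpha \ge 0$, $X_1 \ge 0$ and
$X_2 \ge 0$, every component of $X$ is nonnegative, so $X$ satisfies (2.9).
Hence $X \in S$, and $S$ is convex.
```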

Figure 2.2. Convex and nonconvex sets.

Suppose that the convex set of feasible solutions to an L.P. problem is
denoted by $S$ and $S$ is bounded. Then an optimal solution to the problem
corresponds to an extreme point of $S$. This fact considerably reduces the
effort required to examine $S$ for an optimal solution. We need examine only
the extreme points of $S$ to find an optimum. The next section introduces
the method which takes advantage of this fact.

2.5 The Simplex Algorithm

2.5.1 Background

In the previous section it was noted that the optimal solution to the L.P.
problem corresponds to an extreme point of the feasible region of the
problem. Each extreme point can be determined by a basic solution. Now
by (2.9) all the variables have to be non-negative in a feasible solution.
Thus it is necessary to examine only the basic feasible solutions, rather than
all basic solutions, in order to find the optimum. This amounts to examining
only those extreme points for which all variables are non-negative. The
algorithm is a process by which successive basic feasible solutions are
identified and in which each has an objective function value which is greater
than that of the preceding solution. Each basic feasible solution in this series
is obtained from the previous one (after the first has been selected) by replacing
one of the basic variables by a nonbasic variable. This is attained by setting
one of the basic variables equal to zero and calculating the values of the
other basic variables and the new variable (which is now part of the basis)
which satisfy (2.8). This replacement of one variable by another is carried
out with the following criterion in mind. The new variable that is becoming
part of the basis (the entering variable) is selected so as to improve the
objective function value. This happens if the nonbasic variable with the
largest per unit increase is selected (as long as the solution is not degenerate).
The variable to leave the basis is selected so as to guarantee that feasibility
is preserved. This procedure is repeated until no improvement in
objective function value can be made. When this happens the optimal
solution has been found.
Consider once again Problem 2.1. Suppose we choose an initial basis
of $(x_3, x_4, x_5)$. The nonbasic variables are then $x_1$ and $x_2$, which are set
equal to zero. The submatrix $\bar{A}$ corresponding to this basis is the identity
matrix $I$ and is of course nonsingular. Hence we can solve for the basic
variables:
$$x_3 = 12, \qquad x_4 = 10, \qquad x_5 = 8.$$

As all these basic variables are nonnegative, we have found a basic feasible
solution. The next step is to find a new basic feasible (b.f.) solution with an
improved (larger) value. Recall that when a new b.f. solution is created
exactly one new variable replaces one existing variable in the basis. Which
variable should be brought into the basis in the present problem? On looking
at (2.10) it can be seen that $x_1$ has the largest gain per unit (4) in the objective
function. Hence it seems wise to prefer $x_1$ to $x_2$. This criterion
will not always yield the greatest improvement; however, it has been shown
that other criteria usually require more overall computation to find the
optimum. Now that $x_1$ has been chosen to enter, which of $x_3$, $x_4$, or $x_5$
should leave the basis? Two factors must be considered:
1. We wish to allow $x_1$ to assume as large a value as possible in order to
make the objective function take on the largest possible value.
2. The new basic solution must be feasible: all variables must be
nonnegative.
How much can we increase $x_1$ and still satisfy factor 2? Suppose we
write the constraints (2.11)-(2.13) as functions of $x_1$:
$$\begin{aligned}
x_1 &= 4 - \tfrac{4}{3}x_2 - \tfrac{1}{3}x_3\\
x_1 &= \tfrac{10}{3} - x_2 - \tfrac{1}{3}x_4\\
x_1 &= 2 - \tfrac{1}{2}x_2 - \tfrac{1}{4}x_5.
\end{aligned}$$
Now, as
$$x_2 = 0,$$
these equations reduce to
$$\begin{aligned}
x_1 &= 4 - \tfrac{1}{3}x_3 && (2.14)\\
x_1 &= \tfrac{10}{3} - \tfrac{1}{3}x_4 && (2.15)\\
x_1 &= 2 - \tfrac{1}{4}x_5. && (2.16)
\end{aligned}$$
Consider in turn the removal of one of $x_3$, $x_4$, or $x_5$ from the basis. That
is, set $x_3$, $x_4$, or $x_5$ equal to zero one at a time. Here $x_1$ will take on the
following values:
$$\begin{aligned}
x_3 = 0 &\Rightarrow x_1 = 4\\
x_4 = 0 &\Rightarrow x_1 = \tfrac{10}{3}\\
x_5 = 0 &\Rightarrow x_1 = 2.
\end{aligned}$$
Now it can be seen from (2.15) and (2.16) that
$$\begin{aligned}
x_3 = 0 &\Rightarrow x_4 < 0,\ x_5 < 0,\\
x_4 = 0 &\Rightarrow x_5 < 0.
\end{aligned}$$
Hence setting either of $x_3$ or $x_4$ equal to zero will cause the new basis to be
infeasible. Therefore, the leaving variable should be $x_5$, and the new basis
is $(x_1, x_3, x_4)$. It should be noted that the leaving variable belongs to the
equation which has the minimum positive constant out of (2.14), (2.15), and
(2.16). This is no coincidence, and will always occur.
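The observation that the leaving variable belongs to the equation with the minimum positive constant is usually applied as a "minimum ratio" rule. A minimal sketch of that rule follows; the function name `leaving_row` is illustrative, and the inputs are the entering variable's column and the right-hand sides of (2.11)-(2.13).

```python
from fractions import Fraction


def leaving_row(column, rhs):
    """Index of the row with the smallest ratio rhs[i] / column[i].

    Only rows with a positive entry in the entering column are considered.
    Returns None when no entry is positive.
    """
    ratios = [(Fraction(b, a), i)
              for i, (a, b) in enumerate(zip(column, rhs)) if a > 0]
    return min(ratios)[1] if ratios else None


# Entering variable x1: its column in (2.11)-(2.13) and the right-hand sides.
ROW = leaving_row([3, 3, 4], [12, 10, 8])
```

The ratios are 4, 10/3, and 2, so the third row is selected (index 2, counting from zero). That row contains the unit coefficient of $x_5$, matching the choice of leaving variable made above.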
Now that the new basis has been chosen, the values of its variables can
be found. We have:
$$\bar{A}\,(x_1, x_3, x_4)^T = B,$$
where
$$\bar{A} = \begin{pmatrix} 3 & 1 & 0\\ 3 & 0 & 1\\ 4 & 0 & 0 \end{pmatrix}.$$
Thus
$$x_1 = 2, \qquad x_3 = 6,$$
and
$$x_4 = 4.$$
The corresponding objective function value is 8.
What has been performed here is basically one iteration of the simplex
method. In order to perform the iterations of the simplex algorithm it is
convenient to set out the problem in a tableau. How this is done is dis-
cussed in the next section.

2.5.2 Canonical Form

As was mentioned previously, the calculations of the simplex method are
most easily performed when the problem is set out in a tableau. We shall
assume that all the inequalities of the problem are of the "≤" type, with a
nonnegative right-hand-side (r.h.s.) constant. Thus in converting the problem
into standard form a slack variable is added to each inequality to transform
it into an equation. Other cases shall be considered in Section 2.5.4. Problem
2.1 is of the required form and will be used for illustrative purposes.
Refer to Table 2.1. Each column of the tableau corresponds to a variable,
except the last column, which corresponds to the r.h.s. of each standard
form equation. For consistency, the objective function equation must be
put in the same form as the constraint equations. In Problem 2.1, (2.10)
can be expressed as:
$$x_0 - 4x_1 - 3x_2 = 0.$$

Table 2.1

                                Variables
Constraint        x0     x1     x2     x3     x4     x5  r.h.s.

(2.11)             0      3      4      1      0      0      12
(2.12)             0      3      3      0      1      0      10
(2.13)             0      4      2      0      0      1       8

(2.10)             1     -4     -3      0      0      0       0

The $x_0$ column is usually not included in the tableau. Each row of the
tableau corresponds to a constraint equation, the last row corresponding
to the objective function. The r.h.s. entry of the objective function row equals
the value of the objective function for the current basis.
We must now select an initial basis for the problem and calculate the
values of the basic variables. An initial basic feasible solution can always
be found by letting all the slack variables only be basic. Each basic variable
then has a value equal to the r.h.s. constant of its equation. The value of
this solution is zero, as all basic variables have a zero objective function
coefficient. It can be seen from Table 2.1 that the coefficients in $A$
corresponding to the basis form an identity matrix. As the simplex method is
applied to the elements of the tableau, their values will be manipulated.
However, at the end of each iteration, the coefficients of the current basis
will form an identity matrix (within a permutation of rows) and the objective
function coefficients of basic variables will be zero. A tableau which possesses
this property is said to be in canonical form.

2.5.3 The Algorithm

Before discussing the steps of the algorithm it is necessary to make a
digression into the area of matrix manipulation. It has been noted that the columns
in the simplex tableau corresponding to the basic variables form an identity
matrix (within a permutation of rows). When another iteration is performed
(if necessary), one of the basic variables is replaced by a nonbasic variable.
This new basis must have coefficients in the tableau which form an identity
matrix. How is the tableau to be transformed so as to create this new identity
matrix?
As an example, consider Table 2.1. It was decided that $x_1$ should replace
$x_5$ in the basis. Thus the $x_1$ column should be manipulated until it looks
like the present $x_5$ column. It can be shown (Hu (1969)) that Gauss-Jordan
elimination can achieve this without altering the set of feasible solutions to
the problem. For convenience, Table 2.1 is reproduced in Table 2.2 with
extraneous matter omitted and the objective function row labelled $x_0$ rather
than (2.10).

Table 2.2

Constraints       x1     x2     x3     x4     x5  r.h.s.

(2.11)             3      4      1      0      0      12
(2.12)             3      3      0      1      0      10
(2.13)           [4]      2      0      0      1       8
x0                -4     -3      0      0      0       0

Brackets mark the pivot element.


The entry which lies at the intersection of the entering variable column
and the row containing the unit element of the leaving variable is called
the pivot element. It is circled in Table 2.2. The first step is to divide each
element in the pivot row (the row containing the pivot element) by the pivot
element. This produces Table 2.3. We have now produced a unit element in
the correct position in the $x_1$ column.

Table 2.3

Constraints       x1     x2     x3     x4     x5  r.h.s.

(2.11)             3      4      1      0      0      12
(2.12)             3      3      0      1      0      10
(2.13)             1    1/2      0      0    1/4       2
x0                -4     -3      0      0      0       0

Next, each row other than the pivot row has an amount subtracted from
it, element by element. The amount subtracted from each element is equal
to the present entry of the corresponding pivot row element multiplied by
a constant. That constant is equal to the entry in the row concerned which
lies in the pivot column, that is, the column containing the pivot element (the
entering variable column).
For example, let us subtract a multiple of the pivot row from the first row of
Table 2.3, element by element. The multiplying constant is the entry in row
(2.11) in the $x_1$ column: 3. Thus row (2.11) becomes:
$$3 - 3(1) \qquad 4 - 3(\tfrac{1}{2}) \qquad 1 - 3(0) \qquad 0 - 3(0) \qquad 0 - 3(\tfrac{1}{4}) \qquad 12 - 3(2)$$
This produces Table 2.4.

Table 2.4

Constraints       x1     x2     x3     x4     x5  r.h.s.

(2.11)             0    5/2      1      0   -3/4       6
(2.12)             3      3      0      1      0      10
(2.13)             1    1/2      0      0    1/4       2
x0                -4     -3      0      0      0       0

We have now produced a zero element in the first entry of the $x_1$ column.
Performing the same operation for each other row (other than the pivot
row) produces Table 2.5. The new basis $(x_1, x_3, x_4)$ now has coefficients
which form an identity matrix (within a permutation of rows).
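The two row operations just described (scale the pivot row, then clear the rest of the pivot column) can be written as one small routine. This is a sketch under the assumption that the tableau is stored as a list of rows with the r.h.s. as the last entry of each row and the $x_0$ row last; the name `pivot` and the constants `TABLE_2_2` and `TABLE_2_5` are illustrative.

```python
from fractions import Fraction


def pivot(tableau, row, col):
    """Gauss-Jordan pivot on tableau[row][col], returning a new tableau.

    The pivot row is divided by the pivot element, then a multiple of it is
    subtracted from every other row so that the pivot column becomes a unit
    column.
    """
    t = [[Fraction(v) for v in r] for r in tableau]
    pivot_element = t[row][col]
    t[row] = [v / pivot_element for v in t[row]]
    for i in range(len(t)):
        if i != row:
            factor = t[i][col]
            t[i] = [v - factor * w for v, w in zip(t[i], t[row])]
    return t


# Table 2.2: rows (2.11), (2.12), (2.13) and the x0 row; last column is r.h.s.
TABLE_2_2 = [
    [3, 4, 1, 0, 0, 12],
    [3, 3, 0, 1, 0, 10],
    [4, 2, 0, 0, 1, 8],
    [-4, -3, 0, 0, 0, 0],
]

# Pivot on the entry 4 in row (2.13), column x1 (zero-based row 2, column 0).
TABLE_2_5 = pivot(TABLE_2_2, row=2, col=0)
```

Because the same subtraction is applied to the $x_0$ row, the objective row is updated along with the constraints, which is how the value 8 appears in the r.h.s. column after this iteration.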
Table 2.5

Constraints       x1     x2     x3     x4     x5  r.h.s.

(2.11)             0    5/2      1      0   -3/4       6
(2.12)             0    3/2      0      1   -3/4       4
(2.13)             1    1/2      0      0    1/4       2
x0                 0     -1      0      0      1       8

The simplex method can now be outlined.
1. Transform the problem into standard form.
2. Set up the initial simplex tableau.
3. Identify the negative entry which is largest in magnitude among all
entries corresponding to nonbasic variables in the objective function row.
Ties may be settled arbitrarily. (If all such entries are nonnegative, go to
step 10.) Suppose the entry in column $i$ is identified.
4. Identify all positive elements in column $i$.
5. For each element identified in step 4, form a ratio of the r.h.s. constant
for the row of the element to the element itself.
6. Choose the minimum such ratio and identify to which row it belongs, say
row $j$. Ties may be settled arbitrarily.
7. Identify the basic variable which has a unit entry in row $j$, say $x_k$.
8. Replace variable $x_k$ by variable $x_i$ in the basis using Gauss-Jordan
elimination.
9. Go to step 3.
10. The optimal solution has been found. Each basic variable is set equal to
the entry in the r.h.s. column corresponding to the row in which the
variable has a unit entry. All other variables are set equal to zero. The
optimal solution value is equal to the entry at the intersection of the $x_0$ row
and the r.h.s. column.
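Steps 3 to 10 of the outline above can be collected into a short routine. The following is a minimal sketch rather than a production implementation: it assumes the tableau is already in canonical form for a maximization problem, stores the $x_0$ row last and the r.h.s. column last, and uses exact fractions. The name `simplex` and the `basis` bookkeeping list are assumptions of this sketch, and the check for an unbounded objective is an added safeguard that is not part of the ten steps. No protection against cycling in degenerate problems is included.

```python
from fractions import Fraction


def simplex(tableau, basis):
    """Tableau simplex method for a maximization problem in canonical form.

    tableau: constraint rows followed by the x0 row; last column is the r.h.s.
    basis:   basis[i] is the (zero-based) column of the basic variable of row i.
    Returns (values, optimum), where values[j] is the value of variable j.
    """
    t = [[Fraction(v) for v in row] for row in tableau]
    basis = list(basis)
    m, n = len(t) - 1, len(t[0]) - 1
    while True:
        # Step 3: most negative x0-row entry (first one wins a tie).
        col = min(range(n), key=lambda j: t[m][j])
        if t[m][col] >= 0:
            break  # Step 10: no negative entries remain.
        # Steps 4-6: minimum ratio over the positive entries of the column.
        ratios = [(t[i][n] / t[i][col], i) for i in range(m) if t[i][col] > 0]
        if not ratios:
            raise ValueError("objective is unbounded")  # added safeguard
        row = min(ratios)[1]
        # Steps 7-8: Gauss-Jordan pivot; the entering variable replaces the
        # basic variable of the pivot row.
        pivot_element = t[row][col]
        t[row] = [v / pivot_element for v in t[row]]
        for i in range(m + 1):
            if i != row:
                factor = t[i][col]
                t[i] = [v - factor * w for v, w in zip(t[i], t[row])]
        basis[row] = col
    values = [Fraction(0)] * n
    for i, j in enumerate(basis):
        values[j] = t[i][n]
    return values, t[m][n]


# Problem 2.1 with initial basis (x3, x4, x5).
VALUES, OPTIMUM = simplex(
    [[3, 4, 1, 0, 0, 12],
     [3, 3, 0, 1, 0, 10],
     [4, 2, 0, 0, 1, 8],
     [-4, -3, 0, 0, 0, 0]],
    basis=[2, 3, 4],
)
```

On Problem 2.1 the routine performs two pivots, first bringing in $x_1$ and then $x_2$, and stops at $x_1 = 4/5$, $x_2 = 12/5$, $x_4 = 2/5$ with objective value $52/5$.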
Problem 2.1 will now be solved by the simplex method. Refer to Table
2.6. The initial basis is $(x_3, x_4, x_5)$, with values
$$x_3 = 12, \qquad x_4 = 10, \qquad x_5 = 8$$
and
$$x_0 = 0.$$

Table 2.6

Constraints       x1     x2     x3     x4     x5  r.h.s.   Ratio

(2.11)             3      4      1      0      0      12    12/3
(2.12)             3      3      0      1      0      10    10/3
(2.13)           [4]      2      0      0      1       8     8/4
x0                -4     -3      0      0      0       0

Table 2.7

Constraints       x1     x2     x3     x4     x5  r.h.s.   Ratio

(2.11)             0  [5/2]      1      0   -3/4       6    12/5
(2.12)             0    3/2      0      1   -3/4       4     8/3
(2.13)             1    1/2      0      0    1/4       2       4
x0                 0     -1      0      0      1       8

Brackets mark the pivot element in each table.


The entering variable is $x_1$, as it has the smallest $x_0$-row (objective-function-row)
coefficient. The leaving variable is $x_5$, as it has a unit element in the row
corresponding to the minimum ratio ($\frac{8}{4}$). The pivot element has been circled.
Gauss-Jordan elimination produces Table 2.7.
The new basis is $(x_1, x_3, x_4)$, with values
$$x_1 = 2, \qquad x_3 = 6, \qquad x_4 = 4$$
and
$$x_0 = 8.$$
The entering variable is $x_2$, as it has the smallest $x_0$-row coefficient ($-1$). The
leaving variable is $x_3$, as it has a unit element in the row corresponding to
the minimum ratio ($\frac{12}{5}$). The pivot element has been circled. Gauss-Jordan
elimination produces Table 2.8.
As there are no more negative entries in the $x_0$ row, the optimal solution
has been found. It can be read off from the tableau, the basic variables being
equal to the r.h.s. values of the rows in which their column entry is a unit
element. Thus
$$x_1^* = \tfrac{4}{5}, \qquad x_2^* = \tfrac{12}{5}, \qquad x_4^* = \tfrac{2}{5}.$$
All other variables are zero. The optimum is
$$x_0^* = \tfrac{52}{5}.$$

Table 2.8

Constraints       x1     x2     x3     x4     x5  r.h.s.

(2.11)             0      1    2/5      0  -3/10    12/5
(2.12)             0      0   -3/5      1  -3/10     2/5
(2.13)             1      0   -1/5      0    2/5     4/5
x0                 0      0    2/5      0   7/10    52/5


The slack variables in constraints (2.11) and (2.13) are zero at the optimal
solution. This means that the amount of resource available in each of these
constraints (cutting and washing time, respectively) is to be fully used. There
are 12 and 8 hours' cutting and washing time available per day, respectively,
and all this is going to be used in the optimal solution. Such constraints are
called binding constraints. The slack variable of constraint (2.12) is positive.
This means that not all of the available screening time of 10 hours is to be
used. The amount unused per day is equal to the optimal value of the slack
variable, $\frac{2}{5}$ hour. A constraint such as (2.12) is called a slack constraint.
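The interpretation of the slack variables can be checked by substituting the optimal production plan back into the original resource constraints. The sketch below does this for the example; the dictionary `RESOURCES` and the function `unused_hours` are illustrative names, and the data are those of (2.2)-(2.4).

```python
from fractions import Fraction

# (hours needed per ton of lignite and anthracite, hours available per day)
RESOURCES = {
    "cutting": ((3, 4), 12),
    "screening": ((3, 3), 10),
    "washing": ((4, 2), 8),
}


def unused_hours(plan):
    """Unused hours of each resource for a plan (x1, x2); this is the slack."""
    return {name: limit - sum(a * x for a, x in zip(coeffs, plan))
            for name, (coeffs, limit) in RESOURCES.items()}


SLACK = unused_hours((Fraction(4, 5), Fraction(12, 5)))
BINDING = [name for name, hours in SLACK.items() if hours == 0]
```

At the optimal plan the cutting and washing slacks are zero, so those constraints are binding, while screening has 2/5 of an hour left over each day.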
The algorithm presented above is designed to solve maximization problems
only. A minimization problem can be converted into a maximization
problem by maximizing the negative of its objective function. However, the
algorithm can instead be easily modified to solve such problems directly.
With the $x_0$ row written as in Section 2.5.2, increasing a nonbasic variable
whose $x_0$-row element is positive decreases the objective function value.
Thus, at the beginning of each iteration in which a minimization problem is
being solved, the $x_0$-row element that is the maximum of all positive elements
is identified. The column of this element becomes the pivot column. The
iteration then proceeds as before. When all elements in the $x_0$ row are
nonpositive the optimum has been found.

2.5.4 Artificial Variables

Until now it has been assumed that all constraints in the linear programming
problem were of the "≤" type. This allowed slack variables to be added to
(rather than subtracted from) each inequality to transform it to an equation.
The positive unit coefficients of these slack variables meant that an identity
submatrix was present in $A$. Thus the collection of slack variables
conveniently formed an initial basis which represented a basic feasible solution.
Hence the simplex algorithm could be initiated using this easily
found basis.
With constraints of the "=" or "≥" type the procedures differ: no slack
variable need be introduced in the former case and the slack variable is
subtracted in the latter, so an equation does not necessarily contain a
variable which has a positive unit coefficient and appears in no other
equation. Therefore an identity submatrix of $A$ is not necessarily present. As
many problems contain constraints of these types, we must develop a
systematic method for creating an initial feasible basis so that the simplex
algorithm can be used.

2.5.4.1 The Big M Method


The problem is first transformed into standard form. Next a new variable
is added to the left-hand side of each constraint equation which was of the
"=" or "≥" type. The collection of these variables together with the slack
variables in the equations that were of the "≤" type form the initial feasible
basis. As with all other variables these new variables are constrained to be
non-negative. Any feasible solution must contain these new variables all at
the zero level, for any positive new variable causes its constraint to be
violated. In order to ensure that all the new variables are forced to zero
in any feasible solution, each is included in the objective function. The
coefficient of each new variable in the objective function is assigned a
relatively large negative (positive) value for a maximization (minimization)
problem. These coefficients are usually represented by the symbol $M$. Thus
this technique is sometimes called the big M method. The new variables
introduced have no physical interpretation and are called artificial variables.
To illustrate the method, suppose an additional constraint is added to
Problem 2.1. Because of contractual commitments at least one ton of coal
must be produced and the buyers are not concerned about the ratio of
lignite to anthracite. The new constraint is
$$x_1 + x_2 \ge 1, \qquad (2.17)$$
which, on the introduction of the slack variable $x_6$, becomes
$$x_1 + x_2 - x_6 = 1.$$
When the artificial variable $x_7$ is introduced we have
$$x_1 + x_2 - x_6 + x_7 = 1,$$
and the new objective function is
$$x_0 = 4x_1 + 3x_2 - Mx_7.$$

In mathematical form the new problem is

PROBLEM 2.2
$$\begin{aligned}
\text{Maximize:}\quad & 4x_1 + 3x_2 - Mx_7 = x_0\\
\text{subject to:}\quad & 3x_1 + 4x_2 + x_3 = 12\\
& 3x_1 + 3x_2 + x_4 = 10\\
& 4x_1 + 2x_2 + x_5 = 8\\
& x_1 + x_2 - x_6 + x_7 = 1\\
& x_i \ge 0, \quad i = 1, 2, \ldots, 7.
\end{aligned}$$
The feasible region for this problem is shown in Figure 2.3. The optimal
solution remains unchanged because the optimal solution of the previous
problem is still a solution to the new problem, whose feasible set is a subset
of the original feasible set. The initial tableau for the problem is displayed
in Table 2.9.
The initial basis is $(x_3, x_4, x_5, x_7)$. However, because the objective function
coefficient of the basic variable $x_7$ is nonzero, the tableau is not yet in
canonical form. Gauss-Jordan elimination is used to remedy this by replacing
the $x_0$ row by the sum of the $x_0$ row and $-M$ times (2.17). This creates
Table 2.10. The simplex iterations required to reach the optimal solution
are displayed in Tables 2.11-2.13. The optimal solution is
$$\begin{aligned}
x_1^* &= \tfrac{4}{5}, & x_2^* &= \tfrac{12}{5},\\
x_4^* &= \tfrac{2}{5}, & x_6^* &= \tfrac{11}{5},\\
x_3^*, x_5^*, x_7^* &= 0, & x_0^* &= \tfrac{52}{5}.
\end{aligned}$$

Figure 2.3. The graphical solution to the expanded example problem.

Table 2.9

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.

(2.11)             3      4      1      0      0      0      0      12
(2.12)             3      3      0      1      0      0      0      10
(2.13)             4      2      0      0      1      0      0       8
(2.17)             1      1      0      0      0     -1      1       1

x0                -4     -3      0      0      0      0      M       0


Table 2.10

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.   Ratio

(2.11)             3      4      1      0      0      0      0      12    12/3
(2.12)             3      3      0      1      0      0      0      10    10/3
(2.13)             4      2      0      0      1      0      0       8     8/4
(2.17)           [1]      1      0      0      0     -1      1       1     1/1

x0              -M-4   -M-3      0      0      0      M      0      -M

Table 2.11

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.   Ratio

(2.11)             0      1      1      0      0      3     -3       9     9/3
(2.12)             0      0      0      1      0      3     -3       7     7/3
(2.13)             0     -2      0      0      1    [4]     -4       4     4/4
(2.17)             1      1      0      0      0     -1      1       1
x0                 0      1      0      0      0     -4    M+4       4

Table 2.12

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.   Ratio

(2.11)             0  [5/2]      1      0   -3/4      0      0       6    12/5
(2.12)             0    3/2      0      1   -3/4      0      0       4     8/3
(2.13)             0   -1/2      0      0    1/4      1     -1       1
(2.17)             1    1/2      0      0    1/4      0      0       2     4/1
x0                 0     -1      0      0      1      0      M       8

Table 2.13

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.

(2.11)             0      1    2/5      0  -3/10      0      0    12/5
(2.12)             0      0   -3/5      1  -3/10      0      0     2/5
(2.13)             0      0    1/5      0   1/10      1     -1    11/5
(2.17)             1      0   -1/5      0    2/5      0      0     4/5
x0                 0      0    2/5      0   7/10      0      M    52/5

Brackets mark the pivot element in each table.

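The big M calculation for Problem 2.2 can be reproduced numerically. The sketch below is illustrative only: it substitutes the numerical value 1000 for $M$, which is an assumption that happens to be large enough for this example, and uses exact fractions so that no roundoff arises. The helper `simplex_max` is a compact tableau routine written for this sketch, not a library function.

```python
from fractions import Fraction


def simplex_max(tableau, basis):
    """Tableau simplex for a maximization problem in canonical form.

    Returns (values, optimum); the x0 row is last, the r.h.s. column is last.
    """
    t = [[Fraction(v) for v in row] for row in tableau]
    basis, m, n = list(basis), len(tableau) - 1, len(tableau[0]) - 1
    while True:
        col = min(range(n), key=lambda j: t[m][j])      # entering column
        if t[m][col] >= 0:
            break
        ratios = [(t[i][n] / t[i][col], i) for i in range(m) if t[i][col] > 0]
        if not ratios:
            raise ValueError("objective is unbounded")
        row = min(ratios)[1]                            # leaving row
        pivot_element = t[row][col]
        t[row] = [v / pivot_element for v in t[row]]
        for i in range(m + 1):
            if i != row:
                factor = t[i][col]
                t[i] = [v - factor * w for v, w in zip(t[i], t[row])]
        basis[row] = col
    values = [Fraction(0)] * n
    for i, j in enumerate(basis):
        values[j] = t[i][n]
    return values, t[m][n]


M = 1000  # assumed "large enough" penalty for this example

CONSTRAINT_ROWS = [
    [3, 4, 1, 0, 0, 0, 0, 12],    # (2.11)
    [3, 3, 0, 1, 0, 0, 0, 10],    # (2.12)
    [4, 2, 0, 0, 1, 0, 0, 8],     # (2.13)
    [1, 1, 0, 0, 0, -1, 1, 1],    # (2.17) with slack x6 and artificial x7
]
x0_row = [-4, -3, 0, 0, 0, 0, M, 0]                        # as in Table 2.9
# Canonical form: add -M times row (2.17) to the x0 row, as in Table 2.10.
x0_row = [a - M * b for a, b in zip(x0_row, CONSTRAINT_ROWS[3])]

VALUES, OPTIMUM = simplex_max(CONSTRAINT_ROWS + [x0_row], basis=[2, 3, 4, 6])
```

With this choice of $M$ the pivots occur in the same positions as in Tables 2.10 to 2.12, and the artificial variable $x_7$ ends at zero. In floating-point arithmetic a very large $M$ can swamp the other coefficients, which is the roundoff concern mentioned in connection with the two-phase method.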

2.5.4.2 The Two-Phase Method


There exists another method for finding an initial feasible solution to an
L.P. problem with "=" or "≥" constraints. It is called the two-phase method.
Phase I of the method begins by introducing slack and artificial variables
as before. The objective function is then replaced by the sum of the artificial
variables. In terms of the present example, the new objective function is
$x_0' = x_7$. This creates a new problem in which this new objective function
is to be minimized subject to the original constraints.
When this minimization has taken place, the optimal solution value is
analyzed. A value greater than zero indicates that the original problem does
not have a feasible solution. A value of zero corresponds to a solution which
is basic and feasible for the original problem as all the artificial variables
have value zero. In this case the original objective function is substituted in
the Xo row of the final tableau, and this basic feasible solution without the
artificial variables is used as a starting solution for further iterations of the
simplex method. This is phase II.
The two-phase method is usually preferred to the big M method as it
does not involve the problem of roundoff error that occurs in using the large
value assigned to M. It will now be illustrated by employing Problem 2.2.

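PHASE I
$$\begin{aligned}
\text{Minimize:}\quad & x_7 = x_0'\\
\text{subject to:}\quad & 3x_1 + 4x_2 + x_3 = 12\\
& 3x_1 + 3x_2 + x_4 = 10\\
& 4x_1 + 2x_2 + x_5 = 8\\
& x_1 + x_2 - x_6 + x_7 = 1\\
& x_i \ge 0, \quad i = 1, 2, \ldots, 7.
\end{aligned}$$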

Table 2.14 shows the initial tableau for phase I. Note that the $x_0'$-row
coefficient of $x_7$ is $+1$ rather than $-1$ as the objective has been changed
to one of maximization. Transforming the problem to canonical form, we
obtain Tables 2.15 and 2.16. It is clear from Table 2.16 that phase I is now
complete, as the objective function has value zero. (Note that the objective
can never attain an optimal negative value as it is the sum of a set of variables
all constrained to be nonnegative.) The solution in Table 2.16 represents a
basic feasible solution to the original problem.

Table 2.14

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.

(2.11)             3      4      1      0      0      0      0      12
(2.12)             3      3      0      1      0      0      0      10
(2.13)             4      2      0      0      1      0      0       8
(2.17)             1      1      0      0      0     -1      1       1
x0'                0      0      0      0      0      0      1       0

Table 2.15

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.   Ratio

(2.11)             3      4      1      0      0      0      0      12    12/3
(2.12)             3      3      0      1      0      0      0      10    10/3
(2.13)             4      2      0      0      1      0      0       8     8/4
(2.17)           [1]      1      0      0      0     -1      1       1     1/1
x0'               -1     -1      0      0      0      1      0      -1

Table 2.16

Constraints       x1     x2     x3     x4     x5     x6     x7  r.h.s.

(2.11)             0      1      1      0      0      3     -3       9
(2.12)             0      0      0      1      0      3     -3       7
(2.13)             0     -2      0      0      1      4     -4       4
(2.17)             1      1      0      0      0     -1      1       1
x0'                0      0      0      0      0      0      1       0

Brackets mark the pivot element.

PHASE II. The original objective function is substituted, neglecting the
artificial variable $x_7$. This gives Table 2.17, which is expressed in canonical
form as Table 2.18. Subsequent iterations are shown in Tables 2.19 and 2.20.
Table 2.20 displays the same optimal solution as that found by the big
M method in Table 2.13. It can be seen that the iterations in phase II are
identical to those of the big M method. This is no coincidence, and will
always happen.
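The two phases described above can be sketched in code for Problem 2.2. This is an illustration under stated assumptions: phase I is run as the maximization of $-x_7$, the helper `simplex_max` is written for this sketch, and the case in which an artificial variable remains in the basis at zero level is not handled.

```python
from fractions import Fraction


def simplex_max(tableau, basis):
    """Tableau simplex for a maximization problem in canonical form.

    Returns the final tableau and basis (column index of each row's basic
    variable). The x0 row is last and the r.h.s. column is last.
    """
    t = [[Fraction(v) for v in row] for row in tableau]
    basis, m, n = list(basis), len(tableau) - 1, len(tableau[0]) - 1
    while True:
        col = min(range(n), key=lambda j: t[m][j])
        if t[m][col] >= 0:
            return t, basis
        ratios = [(t[i][n] / t[i][col], i) for i in range(m) if t[i][col] > 0]
        if not ratios:
            raise ValueError("objective is unbounded")
        row = min(ratios)[1]
        pivot_element = t[row][col]
        t[row] = [v / pivot_element for v in t[row]]
        for i in range(m + 1):
            if i != row:
                factor = t[i][col]
                t[i] = [v - factor * w for v, w in zip(t[i], t[row])]
        basis[row] = col


CONSTRAINT_ROWS = [
    [3, 4, 1, 0, 0, 0, 0, 12],
    [3, 3, 0, 1, 0, 0, 0, 10],
    [4, 2, 0, 0, 1, 0, 0, 8],
    [1, 1, 0, 0, 0, -1, 1, 1],
]
ARTIFICIAL = 6  # zero-based column of x7

# Phase I: maximize x0' = -x7. Start from the row of Table 2.14, then
# subtract row (2.17) to reach canonical form (Table 2.15).
phase1_row = [0, 0, 0, 0, 0, 0, 1, 0]
phase1_row = [a - b for a, b in zip(phase1_row, CONSTRAINT_ROWS[3])]
t1, basis = simplex_max(CONSTRAINT_ROWS + [phase1_row], [2, 3, 4, ARTIFICIAL])
if t1[-1][-1] != 0:
    raise ValueError("the original problem has no feasible solution")
if ARTIFICIAL in basis:
    raise ValueError("artificial variable still basic; not handled in this sketch")

# Phase II: drop the x7 column, substitute the original objective row
# (Table 2.17), and restore canonical form with respect to the basis (Table 2.18).
body = [row[:ARTIFICIAL] + row[ARTIFICIAL + 1:] for row in t1[:-1]]
x0_row = [Fraction(c) for c in (-4, -3, 0, 0, 0, 0, 0)]
for i, j in enumerate(basis):
    factor = x0_row[j]
    x0_row = [a - factor * b for a, b in zip(x0_row, body[i])]
t2, basis = simplex_max(body + [x0_row], basis)

OPTIMUM = t2[-1][-1]
SOLUTION = {f"x{j + 1}": t2[i][-1] for i, j in enumerate(basis)}
```

No large constant appears anywhere in the computation, which is the practical advantage claimed for the two-phase method over the big M method.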

Table 2.17

Constraints       x1     x2     x3     x4     x5     x6  r.h.s.

(2.11)             0      1      1      0      0      3       9
(2.12)             0      0      0      1      0      3       7
(2.13)             0     -2      0      0      1      4       4
(2.17)             1      1      0      0      0     -1       1
x0                -4     -3      0      0      0      0       0

Table 2.18

Constraints       x1     x2     x3     x4     x5     x6  r.h.s.   Ratio

(2.11)             0      1      1      0      0      3       9     9/3
(2.12)             0      0      0      1      0      3       7     7/3
(2.13)             0     -2      0      0      1    [4]       4     4/4
(2.17)             1      1      0      0      0     -1       1
x0                 0      1      0      0      0     -4       4

Table 2.19

Constraints       x1     x2     x3     x4     x5     x6  r.h.s.   Ratio

(2.11)             0  [5/2]      1      0   -3/4      0       6    12/5
(2.12)             0    3/2      0      1   -3/4      0       4     8/3
(2.13)             0   -1/2      0      0    1/4      1       1
(2.17)             1    1/2      0      0    1/4      0       2     4/1
x0                 0     -1      0      0      1      0       8

Table 2.20

Constraints       x1     x2     x3     x4     x5     x6  r.h.s.

(2.11)             0      1    2/5      0  -3/10      0    12/5
(2.12)             0      0   -3/5      1  -3/10      0     2/5
(2.13)             0      0    1/5      0   1/10      1    11/5
(2.17)             1      0   -1/5      0    2/5      0     4/5
x0                 0      0    2/5      0   7/10      0    52/5

Brackets mark the pivot element in each table.

2.5.5 Multiple Optimal Solutions

Suppose that in order to compete with other companies in the sale of lignite,
the firm must reduce its price per ton. The profit is now \$3 per ton. In order
to compensate, the profit on anthracite is raised to \$4/ton. Although the
feasible region of the problem remains unchanged, as given in Figure 2.1,
the new objective function is:
$$x_0 = 3x_1 + 4x_2.$$
The problem is solved graphically in Figure 2.4. When the objective function
is drawn at the optimal level, it coincides with constraint line (2.2). This
means that all points on the line from point $(0, 3)$ to $(\frac{4}{5}, \frac{12}{5})$ represent optimal
solutions. This situation can be stated as follows:
3x1* + 4x2* = 12
0 ≤ x1* ≤ 4/5
12/5 ≤ x2* ≤ 3
x0* = 12.

Figure 2.4. An L.P. problem with multiple optimal solutions. (Objective
function lines are drawn at x0 = 4, x0 = 8, and x0 = 12.)

Note that for multiple optimal solutions to be present the objective
function line, plane, or hyperplane (in two, three, or more dimensions,
respectively) must be parallel to that of a binding constraint. When this
occurs there is always an infinite number of optimal solutions (except when
the solution is degenerate; see Section 2.5.6).
The problem is now solved using the simplex method (see Tables 2.21
and 2.22). Table 2.22 yields the following optimal solution:
x2* = 3
x4* = 1
x5* = 2
x1*, x3* = 0
x0* = 12.

Table 2.21

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           3     [4]     1      0      0      12    12/4
(2.12)           3      3      0      1      0      10    10/3
(2.13)           4      2      0      0      1       8     8/2
x0              -3     -4      0      0      0       0

Table 2.22

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)         3/4      1    1/4      0      0       3    12/3
(2.12)         3/4      0   -3/4      1      0       1     4/3
(2.13)        [5/2]     0   -1/2      0      1       2     4/5
x0               0      0      1      0      0      12

However, the nonbasic variable x1 has a zero x0-row coefficient, indicating
that the objective function value would remain unchanged if x1 was
brought into the basis. This is carried out in Table 2.23, which yields
the optimal solution:
x1* = 4/5
x2* = 12/5
x4* = 2/5
x3*, x5* = 0
and
x0* = 12.
Of course the x0-row value of x5 is zero, indicating that x5 could replace x1
in the basis at no change in objective function value. This would produce
Table 2.22.

Table 2.23

Constraints     x1     x2     x3     x4     x5   r.h.s.

(2.11)           0      1    2/5      0  -3/10    12/5
(2.12)           0      0   -3/5      1  -3/10     2/5
(2.13)           1      0   -1/5      0    2/5     4/5
x0               0      0      1      0      0      12

The significance of this example is that we have discovered two basic
optimal solutions. It is straightforward to prove that if more than one basic
feasible solution is optimal, then any convex combination of those points is
also optimal (see, for example, Gass (1969)). As we have seen from Figure 2.4,
any point on the line segment joining the two basic optimal solutions is
optimal. Multiple optimal solutions are present if nonbasic variables have
zero entries in the x0-row of the simplex tableau which displays an optimal
solution.
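
The claim about the line segment can be checked numerically. The following is
a minimal Python sketch, not part of the original text: the helper names
(is_feasible, objective, convex_combination) are illustrative assumptions, and
exact fractions are used so that no rounding is involved. It forms several
convex combinations of the two basic optimal solutions and confirms that each
is feasible with objective value 12.

```python
from fractions import Fraction as F

# Problem 2.1 with the new objective x0 = 3*x1 + 4*x2 (inequality form).
A = [[3, 4], [3, 3], [4, 2]]          # constraints (2.11), (2.12), (2.13)
b = [12, 10, 8]
c = [3, 4]

def is_feasible(x):
    """True if x >= 0 and A x <= b."""
    return all(xi >= 0 for xi in x) and all(
        sum(a * xi for a, xi in zip(row, x)) <= bi for row, bi in zip(A, b))

def objective(x):
    """Value of c^T x."""
    return sum(ci * xi for ci, xi in zip(c, x))

p = (F(0), F(3))             # basic optimal solution of Table 2.22
q = (F(4, 5), F(12, 5))      # basic optimal solution of Table 2.23

def convex_combination(t):
    """The point t*p + (1 - t)*q; it lies on the segment when 0 <= t <= 1."""
    return tuple(t * pi + (1 - t) * qi for pi, qi in zip(p, q))

for t in (F(0), F(1, 4), F(1, 2), F(1)):
    x = convex_combination(t)
    print(x, is_feasible(x), objective(x))
```

Taking t outside the interval [0, 1] (for example t = 2) gives a point on the
same line that is no longer feasible, which is why the combination must be
convex rather than an arbitrary linear one.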

2.5.6 Degeneracy

Suppose that the management of the mining company would like to reduce
the number of hours of screening time available each day. They reason that,
as it is not all being used in the present optimal plan, why not reduce it?
Exactly 9⅗ hours are used daily, so this becomes the amount available.
Mathematically the new problem is the same as Problem 2.1, except that
constraint (2.12) is replaced by
3x1 + 3x2 + x4 = 9⅗. (2.18)
This problem is solved graphically in Figure 2.5. Notice that constraint
(2.18) coincides with exactly one point of the feasible region: the optimal
point. This problem is solved by the simplex method in Tables 2.24-2.26.

Figure 2.5. The graphical solution to a degenerate L.P. problem.

Table 2.24

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           3      4      1      0      0      12    12/3
(2.18)           3      3      0      1      0      9⅗    16/5
(2.13)          [4]     2      0      0      1       8     8/4
x0              -4     -3      0      0      0       0

Table 2.25

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           0   [5/2]     1      0   -3/4       6    12/5
(2.18)           0    3/2      0      1   -3/4      3⅗    12/5
(2.13)           1    1/2      0      0    1/4       2     4/1
x0               0     -1      0      0      1       8

Table 2.26

Constraints     x1     x2     x3     x4     x5   r.h.s.

(2.11)           0      1    2/5      0  -3/10    12/5
(2.18)           0      0   -3/5      1  -3/10       0
(2.13)           1      0   -1/5      0    2/5     4/5
x0               0      0    2/5      0   7/10    52/5

It can be seen from the tableau of Table 2.25 that x2 should enter the basis.
However, on forming the ratios to decide which variable leaves the basis, a
tie occurs. Whenever this happens the next iteration will produce one or
more basic variables with value zero. Such basic feasible solutions are called
degenerate solutions. As it happens, we have reached the optimum in the
same tableau as the first instance of degeneracy, so no problems occur.
However, if Table 2.26 did not display the optimum, complications might
have arisen. These are best explained by means of another example.
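
The tie in the ratio column can be detected mechanically. The sketch below is
an illustration only (the function name ratio_test is an assumption, not
something defined in the text): it forms the ratios for the entering column of
Table 2.25 and reports every row that attains the minimum. More than one such
row signals that the next basic feasible solution will be degenerate.

```python
from fractions import Fraction as F

def ratio_test(column, rhs):
    """Return (minimum ratio, rows attaining it) for an entering column.

    Only rows with a strictly positive entry in the entering column take
    part, as in the simplex ratio rule. Returns (None, []) if no row does.
    """
    ratios = {i: F(r) / F(a)
              for i, (a, r) in enumerate(zip(column, rhs)) if a > 0}
    if not ratios:
        return None, []
    best = min(ratios.values())
    return best, [i for i, v in ratios.items() if v == best]

# x2 column and r.h.s. of Table 2.25; rows are (2.11), (2.18), (2.13).
x2_column = [F(5, 2), F(3, 2), F(1, 2)]
rhs = [F(6), F(18, 5), F(2)]

best, tied_rows = ratio_test(x2_column, rhs)
print(best, tied_rows)   # two tied rows: (2.11) and (2.18)
```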
Suppose that a new screening plant is built and it now takes 4 hours to
process one ton of lignite and 1 hour to process one ton of anthracite. There
are 8 hours' screening time available per day. This means that the problem
is the same as Problem 2.1 except that constraint (2.12) is replaced by
4x1 + x2 + x4 = 8. (2.19)
The problem is solved graphically in Figure 2.6. It is solved by the simplex
method in Tables 2.27-2.29.

Figure 2.6. The graphical solution to a second degenerate L.P. problem.

Table 2.27

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           3      4      1      0      0      12    12/3
(2.19)           4      1      0      1      0       8     8/4
(2.13)          [4]     2      0      0      1       8     8/4
x0              -4     -3      0      0      0       0

Table 2.28

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           0   [5/2]     1      0   -3/4       6    12/5
(2.19)           0     -1      0      1     -1       0      -
(2.13)           1    1/2      0      0    1/4       2     4/1
x0               0     -1      0      0      1       8

Table 2.29

Constraints     x1     x2     x3     x4     x5   r.h.s.

(2.11)           0      1    2/5      0  -3/10    12/5
(2.19)           0      0    2/5      1 -13/10    12/5
(2.13)           1      0   -1/5      0    2/5     4/5
x0               0      0    2/5      0   7/10    52/5

We can see from Table 2.28 that one of the basic feasible solutions produced
by the simplex method was degenerate, as the variable x4 has zero
value. However, there is no degeneracy in the tableau of the next iteration.
This is because the entering-variable (x2) coefficient is negative (-1) in
(2.19). Thus no ratio is formed, hence the dash in the ratio column. What
would have happened if that x2 coefficient had been positive and the
appropriate ratio was formed? This would have caused the ratio to be zero.
Thus that x2 coefficient would become the pivot element. Then the next
basic feasible solution would also be degenerate. Also there would be no
improvement in the value of the objective function.
But the simplex algorithm assumes that each new basic feasible solution
value is an improvement over the preceding one. When this does not happen,
there is a danger that eventually a previous basis will reappear, and an
endless series of iterations will be performed, with no improvement in the
objective function value. And the optimal solution would never be found.
This unhappy phenomenon is termed cycling.
Degeneracy occurs often in realistic large-scale problems. However, there
do not appear to be any reported cases of cycling of the simplex technique
in solving realistic problems. Because of this most computer codes do not
contain measures to prevent cycling. This appears to be quite safe, because
the accumulation of rounding errors will usually prevent any basic variable
from assuming a value of exactly zero. There are a number of theoretical
techniques which do prevent cycling (see, for example, Gass (1969)).
In the previous paragraph we asked the question, what would happen if
the x2 coefficient in the (2.19) row of Table 2.28 had been positive. This will
come about if constraint (2.19) is replaced by
4x1 + 2½x2 + x4 = 8 (2.20)
in Table 2.27. We are solving the following problem:
Maximize: 4x1 + 3x2 = x0
subject to: 3x1 + 4x2 + x3 = 12
4x1 + 2½x2 + x4 = 8
4x1 + 2x2 + x5 = 8
xi ≥ 0, i = 1, 2, ..., 5.

Figure 2.7. The graphical solution to a third degenerate L.P. problem.

This problem is solved graphically in Figure 2.7 and analytically in Tables
2.30-2.33. The optimal solution is
x1* = 4/17
x2* = 48/17
x5* = 24/17
x3* = x4* = 0
x0* = 9 7/17.

Table 2.30

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           3      4      1      0      0      12    12/3
(2.20)           4     2½      0      1      0       8     8/4
(2.13)          [4]     2      0      0      1       8     8/4
x0              -4     -3      0      0      0       0

Table 2.31

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           0    5/2      1      0   -3/4       6    12/5
(2.20)           0   [1/2]     0      1     -1       0       0
(2.13)           1    1/2      0      0    1/4       2     4/1
x0               0     -1      0      0      1       8

Table 2.32

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           0      0      1     -5 [17/4]       6   24/17
(2.20)           0      1      0      2     -2       0
(2.13)           1      0      0     -1    5/4       2     8/5
x0               0      0      0      2     -1       8

Table 2.33

Constraints     x1     x2     x3     x4     x5   r.h.s.

(2.11)           0      0   4/17 -20/17      1   24/17
(2.20)           0      1   8/17  -6/17      0   48/17
(2.13)           1      0  -5/17   8/17      0    4/17
x0               0      0   4/17  14/17      0  9 7/17

Notice in Table 2.31 that the x2 coefficient in question, in row (2.20), is
indeed positive. Thus the corresponding ratio is formed and is zero, and this
coefficient becomes the pivot element. Therefore the next basic solution, in
Table 2.32, has the same value as that of the previous solution and is still
degenerate. Happily, the next iteration produces a nondegenerate optimal
solution.
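
The effect of a degenerate pivot can be reproduced with a few lines of exact
arithmetic. The following Python sketch is illustrative only: the function
pivot and the list layout of the tableau (columns x1 to x5 followed by the
r.h.s., with the x0-row last) are assumptions made for this example. It
applies the pivot of Table 2.31 and then that of Table 2.32.

```python
from fractions import Fraction as F

def pivot(tableau, row, col):
    """Return a new tableau after a Gauss-Jordan pivot on tableau[row][col]."""
    p = tableau[row][col]
    new_row = [v / p for v in tableau[row]]
    result = []
    for i, r in enumerate(tableau):
        if i == row:
            result.append(new_row)
        else:
            factor = r[col]
            result.append([v - factor * w for v, w in zip(r, new_row)])
    return result

# Table 2.31: rows (2.11), (2.20), (2.13), x0; columns x1..x5, r.h.s.
t31 = [
    [F(0), F(5, 2), F(1), F(0), F(-3, 4), F(6)],
    [F(0), F(1, 2), F(0), F(1), F(-1),    F(0)],
    [F(1), F(1, 2), F(0), F(0), F(1, 4),  F(2)],
    [F(0), F(-1),   F(0), F(0), F(1),     F(8)],
]

t32 = pivot(t31, row=1, col=1)   # x2 enters on the zero-ratio row (2.20)
t33 = pivot(t32, row=0, col=4)   # x5 enters, pivot 17/4 in row (2.11)

print(t32[3][-1])   # objective value after the degenerate pivot
print(t33[3][-1])   # objective value at the optimum
```

The first pivot leaves the objective value at 8 because the leaving row has a
zero r.h.s.; only the second pivot improves it.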
In Section 2.5.5 it was stated that for multiple optimal solutions to be
present, the objective function hyperplane must be parallel to that of a
binding constraint. The converse is not universally true, as illustrated by
the following example. Suppose Problem 2.1 is modified by simultaneously
adopting the following changes. The price per ton of both lignite and
anthracite is $3. The amount of screening time available is 9⅗ hours per day.
The problem we are solving is given by:

Maximize: 3x1 + 3x2 = x0
subject to: 3x1 + 4x2 + x3 = 12 (2.11)
3x1 + 3x2 + x4 = 9⅗ (2.18)
4x1 + 2x2 + x5 = 8. (2.13)

This problem is solved graphically in Figure 2.8 and analytically in
Tables 2.34-2.36. The optimal solution is
x1* = 4/5
x2* = 12/5
x3* = x4* = x5* = 0
x0* = 9⅗.

Figure 2.8. The optimal solution to an L.P. problem in which the objective
function is parallel to a binding redundant constraint.

Table 2.34

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           3      4      1      0      0      12    12/3
(2.18)           3      3      0      1      0      9⅗    16/5
(2.13)          [4]     2      0      0      1       8     8/4
x0              -3     -3      0      0      0       0

Table 2.35

Constraints     x1     x2     x3     x4     x5   r.h.s.   Ratio

(2.11)           0   [5/2]     1      0   -3/4       6    12/5
(2.18)           0    3/2      0      1   -3/4      3⅗    12/5
(2.13)           1    1/2      0      0    1/4       2     4/1
x0               0   -3/2      0      0    3/4       6

Table 2.36

Constraints     x1     x2     x3     x4     x5   r.h.s.

(2.11)           0      1    2/5      0  -3/10    12/5
(2.18)           0      0   -3/5      1  -3/10       0
(2.13)           1      0   -1/5      0    2/5     4/5
x0               0      0    3/5      0   3/10      9⅗

Consider Table 2.36. As there are no nonbasic x0-row coefficients with
zero value, there are no alternative optimal solutions. Yet the objective
function is parallel to the binding constraint (2.18). (Constraint (2.18) is
binding because its slack variable, x4, has zero optimal value.) This is
possible because the constraint is redundant (although binding), as shown
in Figure 2.8. This creates a degenerate optimal solution.

2.5.7 Nonexistent Feasible Solutions

Suppose now that the company management, heartened by the efficiency of
the L.P. approach, demands a plan that guarantees that at least 4 tons of
coal are produced each day. The coal produced no longer need be screened.
This means that the problem is the same as Problem 2.1 except that
constraint (2.12) is to be replaced by a constraint representing the new
guarantee. Mathematically, this guarantee can be expressed as
x1 + x2 ≥ 4.
When we introduce a slack variable (x6) and an artificial variable (x4), it
becomes
x1 + x2 + x4 - x6 = 4. (2.21)

So the problem is the following:

Maximize: 4x1 + 3x2 - Mx4 = x0
subject to: 3x1 + 4x2 + x3 = 12 (2.11)
x1 + x2 + x4 - x6 = 4 (2.21)
4x1 + 2x2 + x5 = 8 (2.13)
xi ≥ 0, i = 1, 2, ..., 6.
When this problem is expressed graphically, as in Figure 2.9, it can be seen
that there does not exist a point which will satisfy all constraints
simultaneously. Hence the problem does not have a feasible solution. We need a
strategy for detecting this situation in the simplex method. Towards this end
the present problem shall be "solved" by the simplex algorithm. Table 2.37
shows the initial tableau. The first step is to transform the problem into
canonical form, as in Table 2.38. Tables 2.39 and 2.40 complete the process.

Figure 2.9. An L.P. problem with no feasible solution.

Table 2.37

Constraints     x1     x2     x3     x4     x5     x6   r.h.s.

(2.11)           3      4      1      0      0      0      12
(2.21)           1      1      0      1      0     -1       4
(2.13)           4      2      0      0      1      0       8
x0              -4     -3      0      M      0      0       0

Table 2.38

Constraints         x1          x2     x3     x4     x5     x6   r.h.s.   Ratio

(2.11)               3           4      1      0      0      0      12    12/3
(2.21)               1           1      0      1      0     -1       4     4/1
(2.13)              [4]          2      0      0      1      0       8     8/4
x0            -(M + 4)    -(M + 3)      0      0      0      M     -4M

Table 2.39

Constraints     x1            x2     x3     x4        x5     x6     r.h.s.   Ratio

(2.11)           0         [5/2]      1      0      -3/4      0          6    12/5
(2.21)           0           1/2      0      1      -1/4     -1          2     4/1
(2.13)           1           1/2      0      0       1/4      0          2     4/1
x0               0   -(M/2 + 1)       0      0   M/4 + 1      M    -2M + 8

Table 2.40

Constraints     x1     x2           x3     x4            x5     x6         r.h.s.

(2.11)           0      1          2/5      0         -3/10      0           12/5
(2.21)           0      0         -1/5      1         -1/10     -1            4/5
(2.13)           1      0         -1/5      0           2/5      0            4/5
x0               0      0   (M + 2)/5       0   (M + 7)/10       M   (52 - 4M)/5

Table 2.40 displays the "optimal" solution. However, it will be noticed
that the artificial variable x4 has a positive value (4/5) in this solution.
Whenever this occurs in the final simplex tableau, it can be concluded that
the problem has no feasible solution. If the two-phase method had been used,
we would have obtained a positive x0 at the end of the first phase, indicating
no feasible solution.
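
For a two-variable problem the same conclusion can be cross-checked without
the simplex method. The sketch below is an illustration under stated
assumptions, not the method of the text: the function feasible_vertices and
the (a1, a2, b) encoding of each constraint a1*x1 + a2*x2 <= b are invented
for this example. It intersects every pair of constraint lines and keeps the
points that satisfy all constraints. Because the nonnegativity conditions are
included, a nonempty feasible region must have at least one vertex, so an
empty result means that no feasible solution exists.

```python
from fractions import Fraction as F
from itertools import combinations

def feasible_vertices(constraints):
    """Vertices of {(x1, x2) : a1*x1 + a2*x2 <= b for each (a1, a2, b)}."""
    vertices = set()
    for (a1, a2, b1), (c1, c2, b2) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if det == 0:
            continue                       # parallel lines never meet
        x1 = F(b1 * c2 - a2 * b2, det)     # Cramer's rule
        x2 = F(a1 * b2 - b1 * c1, det)
        if all(p * x1 + q * x2 <= r for p, q, r in constraints):
            vertices.add((x1, x2))
    return vertices

nonnegativity = [(-1, 0, 0), (0, -1, 0)]             # x1 >= 0, x2 >= 0
problem_21 = [(3, 4, 12), (3, 3, 10), (4, 2, 8)] + nonnegativity
# (2.12) replaced by the guarantee x1 + x2 >= 4, written as -x1 - x2 <= -4.
with_guarantee = [(3, 4, 12), (-1, -1, -4), (4, 2, 8)] + nonnegativity

print(sorted(feasible_vertices(problem_21)))
print(feasible_vertices(with_guarantee))             # empty set
```

Evaluating 4x1 + 3x2 at the vertices of Problem 2.1 recovers the optimal value
52/5 at (4/5, 12/5), while the modified problem yields no vertices at all.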

2.5.8 Unboundedness

Consider the following L.P. problem.

Maximize: x1 + 2x2 = x0
subject to: -4x1 + x2 ≤ 2
x1 + x2 ≥ 3
x1 + 2x2 ≥ 4
x1 - x2 ≤ 2
x1 ≥ 0
x2 ≥ 0.
On looking at the graphical solution to the problem in Figure 2.10 it can
be seen that the feasible region is unbounded. Because of the slope of the
objective function (dashed line), the x0 line can be moved parallel to itself
an arbitrary distance from the origin and still coincide with feasible points.
Therefore this problem does not have a bounded optimal solution value.
We shall now attempt to apply the simplex method to the problem.
Transforming the problem into standard form yields:
Maximize: x1 + 2x2 - Mx5 - Mx7 = x0
subject to: -4x1 + x2 + x3 = 2 (2.22)
x1 + x2 - x4 + x5 = 3 (2.23)
x1 + 2x2 - x6 + x7 = 4 (2.24)
x1 - x2 + x8 = 2 (2.25)
xi ≥ 0, i = 1, 2, ..., 8.
The simplex tableaux are given in Tables 2.41-2.45.
From Table 2.45 it can be seen that x4 should enter the basis next. Which
variable should leave the basis in order to ensure feasibility? All of the x4
constraint coefficients are now negative. So x4 can be brought into the basis
at an arbitrarily large positive level. This will cause the objective function to
assume an arbitrarily large value. Thus there is no bounded optimal solution
to the problem. This situation can always be detected in the simplex
algorithm by the presence of a negative x0-row variable coefficient with column
entries all nonpositive.
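
This detection rule is easy to state as code. The following Python sketch is
illustrative only: the function name unbounded_columns is an assumption, and
the symbol M is replaced by the arbitrary large number 1000 so that the
x0-row of Table 2.45 can be compared numerically. It returns the columns that
have a negative x0-row coefficient and no positive constraint entry.

```python
from fractions import Fraction as F

def unbounded_columns(rows, x0_row):
    """Columns with a negative x0-row entry and all constraint entries <= 0."""
    return [j for j, cj in enumerate(x0_row)
            if cj < 0 and all(row[j] <= 0 for row in rows)]

M = 1000  # assumed stand-in for the big-M constant

# Table 2.45 without the r.h.s. column; columns are x1..x8.
rows = [
    [0, 1, F(1, 5),  F(-4, 5), F(4, 5), 0,  0, 0],   # (2.22)
    [0, 0, F(1, 5),  F(-9, 5), F(9, 5), 1, -1, 0],   # (2.23)
    [1, 0, F(-1, 5), F(-1, 5), F(1, 5), 0,  0, 0],   # (2.24)
    [0, 0, F(2, 5),  F(-3, 5), F(3, 5), 0,  0, 1],   # (2.25)
]
x0_row = [0, 0, F(1, 5), F(-9, 5), M + F(9, 5), 0, M, 0]

print(unbounded_columns(rows, x0_row))   # index 3 corresponds to x4
```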

Figure 2.10. L.P. problems with an unbounded feasible region.

Table 2.41

Constraints     x1     x2     x3     x4     x5     x6     x7     x8   r.h.s.

(2.22)          -4      1      1      0      0      0      0      0       2
(2.23)           1      1      0     -1      1      0      0      0       3
(2.24)           1      2      0      0      0     -1      1      0       4
(2.25)           1     -1      0      0      0      0      0      1       2
x0              -1     -2      0      0      M      0      M      0       0

Table 2.42

Constraints          x1          x2     x3     x4     x5     x6     x7     x8   r.h.s.   Ratio

(2.22)               -4         [1]      1      0      0      0      0      0       2     2/1
(2.23)                1           1      0     -1      1      0      0      0       3     3/1
(2.24)                1           2      0      0      0     -1      1      0       4     4/2
(2.25)                1          -1      0      0      0      0      0      1       2
x0           -(1 + 2M)   -(3M + 2)       0      M      0      M      0      0     -7M

Table 2.43

Constraints           x1     x2         x3     x4     x5     x6     x7     x8   r.h.s.   Ratio

(2.22)                -4      1          1      0      0      0      0      0       2
(2.23)                 5      0         -1     -1      1      0      0      0       1     1/5
(2.24)                [9]     0         -2      0      0     -1      1      0       0     0/9
(2.25)                -3      0          1      0      0      0      0      1       4
x0           -(14M + 9)       0   (2 + 3M)      M      0      M      0      0   4 - M

Table 2.44

Constraints     x1     x2       x3     x4     x5            x6            x7     x8   r.h.s.   Ratio

(2.22)           0      1      1/9      0      0          -4/9           4/9      0       2
(2.23)           0      0      1/9     -1      1         [5/9]          -5/9      0       1     9/5
(2.24)           1      0     -2/9      0      0          -1/9           1/9      0       0
(2.25)           0      0      1/3      0      0          -1/3           1/3      1       4
x0               0      0     -M/9      M      0   -(5M + 9)/9   (14M + 9)/9      0   4 - M

Table 2.45

Constraints     x1     x2     x3     x4        x5     x6     x7     x8   r.h.s.

(2.22)           0      1    1/5   -4/5       4/5      0      0      0    14/5
(2.23)           0      0    1/5   -9/5       9/5      1     -1      0     9/5
(2.24)           1      0   -1/5   -1/5       1/5      0      0      0     1/5
(2.25)           0      0    2/5   -3/5       3/5      0      0      1    23/5
x0               0      0    1/5   -9/5   M + 9/5      0      M      0    29/5

Sometimes a problem will have an unbounded feasible region but still
have a bounded optimum. This is illustrated by the dotted line in Figure
2.10, where the problem is the same as the previous one, except the objective
is now to maximize:
x1 - 2x2 = x0.
This problem is solved by the simplex method in Tables 2.46-2.52. The
optimal solution is
x1* = 8/3
x2* = 2/3
x3* = 12
x4* = 1/3
x5* = x6* = x7* = x8* = 0
x0* = 1⅓.

Table 2.46

Constraints     x1     x2     x3     x4     x5     x6     x7     x8   r.h.s.

(2.22)          -4      1      1      0      0      0      0      0       2
(2.23)           1      1      0     -1      1      0      0      0       3
(2.24)           1      2      0      0      0     -1      1      0       4
(2.25)           1     -1      0      0      0      0      0      1       2
x0              -1      2      0      0      M      0      M      0       0

Table 2.47

Constraints          x1        x2     x3     x4     x5     x6     x7     x8   r.h.s.   Ratio

(2.22)               -4       [1]      1      0      0      0      0      0       2     2/1
(2.23)                1         1      0     -1      1      0      0      0       3     3/1
(2.24)                1         2      0      0      0     -1      1      0       4     4/2
(2.25)                1        -1      0      0      0      0      0      1       2
x0           -(1 + 2M)    2 - 3M       0      M      0      M      0      0     -7M

Table 2.48

Constraints         x1     x2       x3     x4     x5     x6     x7     x8     r.h.s.   Ratio

(2.22)              -4      1        1      0      0      0      0      0          2
(2.23)               5      0       -1     -1      1      0      0      0          1     1/5
(2.24)              [9]     0       -2      0      0     -1      1      0          0     0/9
(2.25)              -3      0        1      0      0      0      0      1          4
x0            7 - 14M       0   3M - 2      M      0      M      0      0   -(M + 4)

Table 2.49

Constraints     x1     x2           x3     x4     x5           x6            x7     x8     r.h.s.   Ratio

(2.22)           0      1          1/9      0      0         -4/9           4/9      0          2
(2.23)           0      0          1/9     -1      1        [5/9]          -5/9      0          1     9/5
(2.24)           1      0         -2/9      0      0         -1/9           1/9      0          0
(2.25)           0      0          1/3      0      0         -1/3           1/3      1          4
x0               0      0   -(M + 4)/9      M      0   (7 - 5M)/9   (14M - 7)/9      0   -(M + 4)

Table 2.50

Constraints     x1     x2      x3     x4        x5     x6     x7     x8   r.h.s.   Ratio

(2.22)           0      1     1/5   -4/5       4/5      0      0      0    14/5    14/1
(2.23)           0      0   [1/5]   -9/5       9/5      1     -1      0     9/5     9/1
(2.24)           1      0    -1/5   -1/5       1/5      0      0      0     1/5
(2.25)           0      0     2/5   -3/5       3/5      0      0      1    23/5    23/2
x0               0      0    -3/5    7/5   M - 7/5      0      M      0   -27/5

Table 2.51

Constraints     x1     x2     x3     x4      x5     x6      x7     x8   r.h.s.   Ratio

(2.22)           0      1      0      1      -1     -1       1      0       1     1/1
(2.23)           0      0      1     -9       9      5      -5      0       9
(2.24)           1      0      0     -2       2      1      -1      0       2
(2.25)           0      0      0     [3]     -3     -2       2      1       1     1/3
x0               0      0      0     -4   4 + M      3   M - 3      0       0

Table 2.52

Constraints     x1     x2     x3     x4     x5     x6           x7     x8   r.h.s.

(2.22)           0      1      0      0      0   -1/3          1/3   -1/3     2/3
(2.23)           0      0      1      0      0     -1            1      3      12
(2.24)           1      0      0      0      0   -1/3          1/3    2/3     8/3
(2.25)           0      0      0      1     -1   -2/3          2/3    1/3     1/3
x0               0      0      0      0      M    1/3   (3M - 1)/3    4/3      1⅓

2.6 Duality and Postoptimal Analysis


Duality is an important concept and we now present some of the reasons for
this importance. In the previous section it became obvious that the more
constraints an L.P. problem had, the longer it took to solve. Experience with
efficient computer codes has shown that computational time is more sensi-
tive to the number of constraints than to the number of variables. In order to
solve a relatively large problem it would therefore be convenient to reduce
its number of constraints. This can often be done by constructing a new L.P.
problem from the given problem, where the new problem has fewer con-
straints. This new problem is then solved more easily than the original one.
The information obtained in the final simplex tableau can be used to deduce
the optimal solution to the original problem. The new problem constructed
for this purpose is called the dual problem to the original problem. The orig-
inal problem is called the primal.
After an L.P. problem has been solved one would often like to know the
sensitivity of the solution to changes in the objective function, constraint
coefficients and the r.h.s. constants and to the addition of new variables and
constraints. Duality can be used to answer such questions.

2.6.1 Duality

2.6.1.1 The Relationship Between the Primal and the Dual


Consider once again the initial L.P. problem outlined in Section 2.2. Suppose
that a corporation is considering hiring the equipment of the mining com-
pany. The corporation is uncertain about the hourly hireage rates it should
offer the company for the three types of implements. During negotiations
the mining company reveals that its profits per ton of lignite and anthracite
are $4 and $3, respectively. The company states that it will not accept hireage
rates which amount to less revenue than these present figures. For the pur-
pose of fixing acceptable rates the following variables are defined. Let
y1 = the hourly hireage rate of the cutting machine,
y2 = the hourly hireage rate of the screens,
y3 = the hourly hireage rate of the washing plant.
Recall that it requires 3, 3, and 4 hours for the cutting machine, the screens,
and the washing plant, respectively, to process 1 ton of lignite. The revenue
of the company from hiring out the machines for the corporation to process
one ton of lignite is then
3y1 + 3y2 + 4y3.
Because the company requires a revenue no less than its present profit, this
revenue must be such that
3y1 + 3y2 + 4y3 ≥ 4.
By analogy, the constraint for anthracite is
4y1 + 3y2 + 2y3 ≥ 3.
The corporation obviously wishes to minimize the total daily hireage cost
it has to pay. Recall that the cutting machine, the screens, and the washing
plant can be operated for no more than 12, 10, and 8 hours per day,
respectively. The objective of the corporation is to minimize
12y1 + 10y2 + 8y3.
Of course all hireage costs have to be nonnegative. The corporation is then
faced with the following L.P. problem:
Minimize: 12y1 + 10y2 + 8y3 = y0
subject to: 3y1 + 3y2 + 4y3 ≥ 4
4y1 + 3y2 + 2y3 ≥ 3
yi ≥ 0, i = 1, 2, 3.
Let us now compare this problem with the original L.P. problem, which is
reproduced here for convenience:
Maximize: 4x1 + 3x2 = x0
subject to: 3x1 + 4x2 ≤ 12
3x1 + 3x2 ≤ 10
4x1 + 2x2 ≤ 8
xi ≥ 0, i = 1, 2.
A moment's comparison shows that both problems have the same set of
constants, but in different positions. In particular, each "row" of the hireage
problem contains the same coefficients as one "column" of the original prob-
lem. When two L.P. problems have the special relationship displayed here,
the original problem is called the primal and the new problem is called the
dual. We shall now formalize this relationship by showing how the dual is
constructed from the primal.
1. Replace each primal equality constraint by a "≤" constraint and a "≥"
constraint. For example, replace
3x1 + 4x2 + 5x3 = 6,
by
3x1 + 4x2 + 5x3 ≥ 6
and
3x1 + 4x2 + 5x3 ≤ 6.
2. If the primal is a maximization (minimization) problem, multiply all "≥"
("≤") constraints by (-1). This ensures all constraints are of the "≤"
type for maximization and of the "≥" type for minimization.
3. Define a unique nonnegative dual variable for each primal constraint.
4. Define each dual objective function coefficient to be equal to the r.h.s.
constant of the primal constraint corresponding to that variable.
5. If the primal objective is maximization, define the dual objective to be
minimization, and vice versa.
6. Define the dual r.h.s. constraint constants to be the primal objective
function coefficients.
7. If the primal objective is maximization (minimization), define the dual
constraint inequalities to be of the "≥" ("≤") type.
8. Define the dual constraint coefficient matrix to be the transpose of the
primal constraint coefficient matrix, A (defined in Section 2.3).
These steps can be summed up in mathematical form. The primal:

Maximize: c1x1 + c2x2 + ... + cnxn = x0
subject to: a11x1 + a12x2 + ... + a1nxn ≤ b1
a21x1 + a22x2 + ... + a2nxn ≤ b2
...
am1x1 + am2x2 + ... + amnxn ≤ bm
xi ≥ 0, i = 1, 2, ..., n,

has dual:

Minimize: b1y1 + b2y2 + ... + bmym = y0
subject to: a11y1 + a21y2 + ... + am1ym ≥ c1
a12y1 + a22y2 + ... + am2ym ≥ c2
...
a1ny1 + a2ny2 + ... + amnym ≥ cn
yi ≥ 0, i = 1, 2, ..., m.

This can also be expressed in matrix form. The primal:

Maximize: C^T X
subject to: AX ≤ B (2.26a)
X ≥ 0

has dual:

Minimize: B^T Y
subject to: A^T Y ≥ C (2.26b)
Y ≥ 0,
where
C is n x 1,
X is n x 1,
A is m x n,
B is m x 1, and
Y is m x 1.


Suppose the dual of (2.26b) is constructed:

Maximize: C^T X
subject to: (A^T)^T X ≤ (B^T)^T
X ≥ 0.

Because the transpose of the transpose of a matrix (vector) is the matrix
(vector), we have proven:

Theorem 2.1. The dual of the dual of a primal L.P. problem is the primal L.P.
problem itself.
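
The construction, and Theorem 2.1, can be illustrated with a short Python
sketch. It is a minimal illustration for problems already in the symmetric
form (2.26), not a general implementation of steps 1 to 8; the function name
dual and the tuple layout (sense, c, A, b) are assumptions made for this
example.

```python
def dual(problem):
    """Dual of an L.P. given in the symmetric form (2.26).

    problem = (sense, c, A, b), where sense "max" means
        maximize c^T x subject to A x <= b, x >= 0,
    and sense "min" means
        minimize c^T x subject to A x >= b, x >= 0.
    """
    sense, c, A, b = problem
    A_T = [list(column) for column in zip(*A)]    # transpose of A
    new_sense = "min" if sense == "max" else "max"
    # The r.h.s. constants become objective coefficients, and vice versa.
    return (new_sense, list(b), A_T, list(c))

# The mining problem: maximize 4x1 + 3x2 subject to (2.11)-(2.13).
primal = ("max", [4, 3], [[3, 4], [3, 3], [4, 2]], [12, 10, 8])

hireage = dual(primal)
print(hireage)         # the hireage problem of Section 2.6.1.1
print(dual(hireage))   # the primal again, as Theorem 2.1 asserts
```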

2.6.1.2 The Optimal Solution to the Dual


The dual problem introduced in the last section will now be solved by the
two-phase method. In standard form the problem is as follows:

PROBLEM 2.3
Maximize: -12y1 - 10y2 - 8y3 = y0 (2.27)
subject to: 3y1 + 3y2 + 4y3 - y4 + y5 = 4 (2.28)
4y1 + 3y2 + 2y3 - y6 + y7 = 3 (2.29)
yi ≥ 0, i = 1, 2, ..., 7.

Phase I, with y0' = y5 + y7, is shown in Tables 2.53-2.56.


Phase II, with columns corresponding to artificial variables removed, is
shown in Tables 2.57 and 2.58.

Table 2.53

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.

(2.28)           3      3      4     -1      1      0      0       4
(2.29)           4      3      2      0      0     -1      1       3
y0'              0      0      0      0      1      0      1       0

Table 2.54

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.   Ratio

(2.28)           3      3      4     -1      1      0      0       4     4/3
(2.29)          [4]     3      2      0      0     -1      1       3     3/4
y0'             -7     -6     -6      1      0      1      0      -7

Table 2.55

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.   Ratio

(2.28)           0    3/4   [5/2]    -1      1    3/4   -3/4     7/4    7/10
(2.29)           1    3/4    1/2      0      0   -1/4    1/4     3/4     3/2
y0'              0   -3/4   -5/2      1      0   -3/4    7/4    -7/4

Table 2.56

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.

(2.28)           0   3/10      1   -2/5    2/5   3/10  -3/10    7/10
(2.29)           1    3/5      0    1/5   -1/5   -2/5    2/5     2/5
y0'              0      0      0      0      1      0      1       0

Table 2.57

Constraints     y1     y2     y3     y4     y6   r.h.s.

(2.28)           0   3/10      1   -2/5   3/10    7/10
(2.29)           1    3/5      0    1/5   -2/5     2/5
y0              12     10      8      0      0       0

Table 2.58

Constraints     y1     y2     y3     y4     y6   r.h.s.

(2.28)           0   3/10      1   -2/5   3/10    7/10
(2.29)           1    3/5      0    1/5   -2/5     2/5
y0               0    2/5      0    4/5   12/5   -52/5

The solution to the original minimization problem is:

y1* = 2/5
y3* = 7/10
yi* = 0, otherwise
y0* = 52/5.

2.6.1.3 Properties of the Primal-Dual Relationship
Compare Table 2.58, the optimum tableau for the dual, with Table 2.8, the
optimum tableau for the primal, which is reproduced here for convenience.
As the primal and the dual had the same set of constants in their
mathematical formulation, it is not surprising to find some similarities in
their optimal tableaux.

Table 2.8

Constraints     x1     x2     x3     x4     x5   r.h.s.

(2.11)           0      1    2/5      0  -3/10    12/5
(2.12)           0      0   -3/5      1  -3/10     2/5
(2.13)           1      0   -1/5      0    2/5     4/5
x0               0      0    2/5      0   7/10    52/5

These similarities are:

1. The value of optimal solutions of the primal and dual are equal.
2. The optimal value of each slack variable in one problem is equal to the
objective function coefficient of the structural variable of the
corresponding equation in the other.
3. (a) Whenever a primal structural variable has a positive optimal value,
the corresponding dual slack variable has zero optimal value.
(b) Whenever a primal slack variable has positive optimal value, the
corresponding dual structural variable has zero optimal value.
Result 3 is an example of what is known as the complementary slackness
theorem. Indeed, these results are true for any primal-dual pair of L.P.
problems which have finite optimal solutions. We now go on to prove some
general results concerning duality for the pair of problems defined by (2.26),
with a view to proving the complementary slackness theorem in general.

Theorem 2.2. If X and Y are feasible solutions for (2.26a) and (2.26b),
respectively, then the value of Y is no less than the value of X. That is,
C^T X ≤ B^T Y.

PROOF. As X is feasible,
AX ≤ B.
As Y is feasible,
Y ≥ 0.
Therefore
Y^T AX ≤ Y^T B = B^T Y.
As Y is feasible,
A^T Y ≥ C.
As X is feasible,
X ≥ 0.
Therefore
X^T A^T Y ≥ X^T C = C^T X.
But
X^T A^T Y = (AX)^T Y
= Y^T AX.
Therefore
C^T X ≤ B^T Y. □

Theorem 2.3. If X* and Y* are feasible solutions for (2.26a) and (2.26b) such
that C^T X* = B^T Y*, then X* and Y* are optimal solutions for (2.26a) and
(2.26b), respectively.

PROOF. By Theorem 2.2,
C^T X ≤ B^T Y*, for any feasible X.
But, by assumption,
C^T X* = B^T Y*.
Therefore
C^T X ≤ C^T X*, for any feasible X.
Thus X* is optimal for (2.26a). Similarly,
C^T X* ≤ B^T Y, for any feasible Y.
And, by assumption,
C^T X* = B^T Y*.
Therefore
B^T Y* ≤ B^T Y, for any feasible Y.
Thus Y* is optimal for (2.26b). □

We can make a number of inferences from these results. Firstly, the value
of any feasible primal solution is a lower bound on the value of any feasible
dual solution. Conversely, the value of any feasible dual solution is an upper
bound on the value of any feasible primal solution. The reader should verify
that these observations are true for the numerical example. Secondly, if the
primal has an unbounded optimal solution value, the dual cannot have any
feasible solutions.
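
These bounds can be verified for the numerical example with a few lines of
Python. The sketch below is illustrative only: the helper names and the
particular feasible points chosen are assumptions of this example, not taken
from the text. It checks that every listed primal value is no greater than
every listed dual value, with equality only at the optimal pair.

```python
from fractions import Fraction as F

A = [[3, 4], [3, 3], [4, 2]]     # constraints (2.11), (2.12), (2.13)
b = [12, 10, 8]
c = [4, 3]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def primal_feasible(x):
    """x >= 0 and A x <= b."""
    return all(xi >= 0 for xi in x) and all(
        dot(row, x) <= bi for row, bi in zip(A, b))

def dual_feasible(y):
    """y >= 0 and A^T y >= c."""
    return all(yi >= 0 for yi in y) and all(
        dot(col, y) >= cj for col, cj in zip(zip(*A), c))

# Arbitrarily chosen feasible points; the last of each list is optimal.
primal_points = [(F(0), F(0)), (F(1), F(1)), (F(2), F(0)), (F(4, 5), F(12, 5))]
dual_points = [(F(1), F(0), F(1)), (F(0), F(2), F(0)), (F(2, 5), F(0), F(7, 10))]

for x in primal_points:
    for y in dual_points:
        print(dot(c, x), "<=", dot(b, y))
```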
The converse to Theorem 2.3 is also true:

Theorem 2.4. If X* and Y* are optimal solutions for (2.26a) and (2.26b),
respectively, then
C^T X* = B^T Y*.

For a proof of this theorem, see the book by David Gale (1960).

Theorem 2.5 (Complementary Slackness). Feasible solutions X* and Y* are
optimal for (2.26a) and (2.26b), respectively, if and only if
(X*)^T [A^T Y* - C] + (Y*)^T [B - AX*] = 0.

PROOF. Let U and V be the set of slack variables for (2.26a) and (2.26b),
respectively, with respect to X* and Y*, i.e.,

AX* + U = B
A^T Y* - V = C
U, V ≥ 0.

Premultiplying the first equation by (Y*)^T, we obtain
(Y*)^T AX* + (Y*)^T U = (Y*)^T B;
premultiplying the second by (X*)^T, we obtain
(X*)^T A^T Y* - (X*)^T V = (X*)^T C.
As
(X*)^T A^T Y* = (Y*)^T AX*,
we can eliminate this common expression from these two equations, to
obtain
(Y*)^T B - (Y*)^T U = (X*)^T C + (X*)^T V. (2.30)
In view of the way the slack variables have been introduced, we have
U = B - AX*
V = A^T Y* - C.

Thus, in order to prove the theorem we must show that X* and Y* are
optimal for (2.26a) and (2.26b) if and only if

(X*)^T V + (Y*)^T U = 0. (2.31)

(⇒) If X* and Y* are assumed optimal for (2.26a) and (2.26b), respectively,
then, by Theorem 2.4,
C^T X* = B^T Y*.
Thus (2.30) reduces to (2.31).

(⇐) Assuming (2.31) holds, (2.30) reduces to
C^T X* = B^T Y*.
Thus, by Theorem 2.3, X* and Y* are optimal solutions for (2.26a) and
(2.26b), respectively. □

Let us examine (2.31) more closely in order to discover why Theorem 2.5
is named the complementary slackness theorem. Because X*, Y*, U, and V
are all nonnegative we have
(X*)^T V ≥ 0,
and
(Y*)^T U ≥ 0.
Indeed, each individual term is nonnegative:
xi* vi ≥ 0, i = 1, 2, ..., n, and
yj* uj ≥ 0, j = 1, 2, ..., m.
But by (2.31) the sum of all these terms is zero, so we can conclude that
xi* vi = 0, i = 1, 2, ..., n, and
yj* uj = 0, j = 1, 2, ..., m.
Thus we can conclude that the results 1, 2, and 3 hold for any pair of
primal-dual L.P. problems with finite optimal solution values.
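
For the numerical example these products can be computed directly. The Python
sketch below is an illustration only; the variable names are assumptions of
this example. It forms the slack vectors U = B - AX* and V = A^T Y* - C from
the optimal solutions found earlier and evaluates every product in (2.31).

```python
from fractions import Fraction as F

A = [[3, 4], [3, 3], [4, 2]]     # constraints (2.11), (2.12), (2.13)
b = [12, 10, 8]
c = [4, 3]

x_star = (F(4, 5), F(12, 5))            # primal optimum, Table 2.8
y_star = (F(2, 5), F(0), F(7, 10))      # dual optimum, Table 2.58

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

U = [bi - dot(row, x_star) for row, bi in zip(A, b)]        # primal slacks
V = [dot(col, y_star) - cj for col, cj in zip(zip(*A), c)]  # dual slacks

products_xv = [xi * vi for xi, vi in zip(x_star, V)]
products_yu = [yj * uj for yj, uj in zip(y_star, U)]
print(U, V)
print(products_xv, products_yu)
```

Only the screening constraint (2.12) has positive slack (2/5), and the
corresponding dual variable y2 is zero, in agreement with result 3(b).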
It will now be shown what happens to the dual when the primal either
does not have a feasible solution or else has an unbounded optimum. Recall
the problem of Section 2.5.7. The dual of that problem is
Minimize: 12y1 - 4y2 + 8y3 = y0 (2.32)
subject to: 3y1 - y2 + 4y3 ≥ 4 (2.33)
4y1 - y2 + 2y3 ≥ 3 (2.34)
y1, y2, y3 ≥ 0.
An attempt will now be made to solve this problem by the two-phase method.
Phase I is shown in Tables 2.59-2.62. Here,

y0' = y5 + y7.
Phase II is shown in Tables 2.63 and 2.64. This problem has an unbounded
optimum because y2 can be introduced to the basis at an arbitrarily high
level, causing an arbitrarily large objective function value. It is true in general
that when a primal L.P. problem has no feasible solution the dual has either an
unbounded optimum or no feasible solution.

Table 2.59

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.

(2.33)           3     -1      4     -1      1      0      0       4
(2.34)           4     -1      2      0      0     -1      1       3
y0'              0      0      0      0      1      0      1       0

Table 2.60

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.   Ratio

(2.33)           3     -1      4     -1      1      0      0       4     4/3
(2.34)          [4]    -1      2      0      0     -1      1       3     3/4
y0'             -7      2     -6      1      0      1      0      -7

Table 2.61

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.   Ratio

(2.33)           0   -1/4   [5/2]    -1      1    3/4   -3/4     7/4    7/10
(2.34)           1   -1/4    1/2      0      0   -1/4    1/4     3/4     3/2
y0'              0    1/4   -5/2      1      0   -3/4    7/4    -7/4

Table 2.62

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.

(2.33)           0  -1/10      1   -2/5    2/5   3/10  -3/10    7/10
(2.34)           1   -1/5      0    1/5   -1/5   -2/5    2/5     2/5
y0'              0      0      0      0      1      0      1       0

Table 2.63

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.

(2.33)           0  -1/10      1   -2/5    2/5   3/10  -3/10    7/10
(2.34)           1   -1/5      0    1/5   -1/5   -2/5    2/5     2/5
y0              12     -4      8      0      0      0      0       0

Table 2.64

Constraints     y1     y2     y3     y4     y5     y6     y7   r.h.s.

(2.33)           0  -1/10      1   -2/5    2/5   3/10  -3/10    7/10
(2.34)           1   -1/5      0    1/5   -1/5   -2/5    2/5     2/5
y0               0   -4/5      0    4/5   -4/5   12/5  -12/5   -52/5

Considering problem (2.32)-(2.34) as the primal, we have an example of
the following statement which is true in general: When a primal L.P.
problem has an unbounded optimum the dual has no feasible solution. This
result is a corollary to Theorem 2.2.

2.6.2 Postoptimal Analysis

When the optimal solution to a linear program is analyzed to answer
questions concerning changes in its formulation, the study is called
postoptimal analysis. What changes can be made to an L.P. problem? Of the
variety that can be studied, the following will be considered:

1. Changes in the coefficients of the objective function.
2. Changes in the r.h.s. constants of the constraints.
3. Changes in the l.h.s. coefficients of the constraints.
4. The introduction of new variables.
5. The introduction of new constraints.
Obviously, when the original L.P. is changed, the new problem could be
solved from scratch. If the changes are minor, however, it seems a shame to
ignore the valuable information gained in solving the original problem. The
following sections show how the optimal solution to a modified problem can
be found using duality and the solution to the original L.P.

2.6.2.1 Changes in the Objective Function Coefficients


When changes are made to the objective function only, the optimal solution
is still feasible, as the feasible region is unaltered.
Consider once again problem 2.1. The optimal simplex tableau for this
problem was presented in Table 2.8, which is reproduced here for conve-
nience.

Table 2.8

Constraints    x1    x2     x3    x4     x5     r.h.s.

(2.11)          0     1    2/5     0   -3/10     12/5
(2.12)          0     0   -3/5     1   -3/10      2/5
(2.13)          1     0   -1/5     0    2/5       4/5
x0              0     0    2/5     0    7/10     52/5

(i) Changes to Basic Variable Coefficients. Suppose that the objective
function coefficient $c_2$ of $x_2$ is going to be changed. What is the range about
its present value of 3 for which the present solution will remain optimal?
Suppose $c_2$ is changed from 3 to $3 + q$. The initial simplex tableau for the
problem then is as shown in Table 2.65.

Table 2.65
Constraints Xl X2 X3 X4 Xs r.h.s.

(2.11) 3 4 1 0 0 12
(2.12) 3 3 0 1 0 10
(2.13) 4 2 0 0 1 8
Xo -4 -(3 + q) 0 0 0 0

Table 2.66

Constraints    x1    x2     x3    x4     x5     r.h.s.

(2.11)          0     1    2/5     0   -3/10     12/5
(2.12)          0     0   -3/5     1   -3/10      2/5
(2.13)          1     0   -1/5     0    2/5       4/5
x0              0    -q    2/5     0    7/10     52/5

Table 2.67

Constraints    x1    x2        x3         x4         x5            r.h.s.

(2.11)          0     1       2/5          0       -3/10            12/5
(2.12)          0     0      -3/5          1       -3/10             2/5
(2.13)          1     0      -1/5          0        2/5              4/5
x0              0     0   2/5 + (2/5)q     0   7/10 - (3/10)q   52/5 + (12/5)q

It is easily verified that the tableau corresponding to Table 2.8 is that shown
in Table 2.66. In order for the present basis to remain optimal, $x_2$ must still
be basic. Therefore, the $x_2$ value in the $x_0$ row must have zero value. This is
achieved in Table 2.67 by adding $q$ times (2.11) to the $x_0$ row.
For the present basis to remain optimal, all $x_0$-row values must be nonnegative.
Thus,
$\tfrac{2}{5} + \tfrac{2}{5}q \ge 0$
$\tfrac{7}{10} - \tfrac{3}{10}q \ge 0.$

Therefore,
$-1 \le q \le \tfrac{7}{3}.$
Hence the range for $c_2$ is $(3 - 1,\ 3 + \tfrac{7}{3}) = (2, \tfrac{16}{3})$, with a corresponding optimum range
of (8, 16). This is illustrated in Figure 2.11.
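
The ranging calculation above can be checked mechanically. The following is a minimal sketch, not part of the original text: it takes the $x_0$ row and the (2.11) row of Table 2.8 as data and finds the values of $q$ for which every $x_0$-row entry stays nonnegative after $q$ times the (2.11) row is added. The names `q_range`, `x0_row`, `x2_row` and `BASIC_COL` are illustrative assumptions.

```python
from fractions import Fraction as F

# x0 row and the (2.11) row of Table 2.8, columns x1..x5; x2 is basic in (2.11).
x0_row = [F(0), F(0), F(2, 5), F(0), F(7, 10)]
x2_row = [F(0), F(1), F(2, 5), F(0), F(-3, 10)]
x0_value, x2_value = F(52, 5), F(12, 5)
BASIC_COL = 1  # position of x2 in the rows above


def q_range(x0_row, basic_row, basic_col):
    """Bounds (lo, hi) on q keeping x0_row[j] + q * basic_row[j] >= 0.

    The basic column itself is skipped, because its x0-row entry is
    restored to zero. None means that side is unbounded.
    """
    lo = hi = None
    for j, (z, a) in enumerate(zip(x0_row, basic_row)):
        if j == basic_col or a == 0:
            continue  # zero entries impose no condition on q
        bound = -z / a
        if a > 0:
            lo = bound if lo is None else max(lo, bound)
        else:
            hi = bound if hi is None else min(hi, bound)
    return lo, hi


lo, hi = q_range(x0_row, x2_row, BASIC_COL)
print("range of q:", lo, "to", hi)
print("range of c2:", 3 + lo, "to", 3 + hi)
print("range of the optimum:", x0_value + lo * x2_value,
      "to", x0_value + hi * x2_value)
```

Exact fractions are used rather than floating point so that the end points can be compared directly with the hand calculation.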
This approach can be generalized. If the objective function coefficient Ci
of a basic variable Xi is replaced by (Ci + q), it is of interest to know whether
the original optimal solution is still optimal or not. On considering the
mechanics of the simplex method, it is clear that if the same iterations were
repeated on the new problem the only change in the optimal tableau is that
the Xi coefficient in Xo is reduced by q. Hence this coefficient is ( - q), as it was
originally zero as Xi was basic.
For the present basis to remain optimal, Xi must remain basic. That is, the
Xi coefficient in the Xo row must be zero. This is achieved by adding q times
the equation containing Xi as its basic variable to the Xo row. The tableau is
now in canonical form. For the present basis to remain optimal, all the $x_0$-row
coefficients must be nonnegative. Conditions on $q$ can be deduced to
achieve this. If a particular value of $q$ is given, it can be deduced whether the
present basis is optimal. If it is not, further simplex iterations can be carried
out in the normal way.

[Figure 2.11. The graphical solution to an L.P. when an objective function coefficient
ranges. Labelled in the figure: (2.2), (2.4), and the objective function lines
$x_0 = 4x_1 + 2x_2 = 8$ and $x_0 = 4x_1 + \tfrac{16}{3}x_2 = 16$.]

(ii) Changes to Nonbasic Variable Coefficients. The situation is even sim-
pler for a change of + q to the objective function coefficient cj of a variable
Xj that turns out to be nonbasic in the optimal solution. The coefficient of Xj
in the Xo row is still reduced by q. However, in the present case there is no need
for the coefficient to become zero (as Xj is nonbasic). Hence it simply remains
to check whether the coefficient is nonnegative for the present basis to remain
optimal. Thus once more conditions on q can be deduced.

2.6.2.2 Changes in the r.h.s. Constants of the Constraints


Suppose that a r.h.s. constant of an L.P. problem is altered. Is the current
optimal solution still feasible? If it is still feasible it will still be optimal, as
the xo-row coefficients are unchanged.

For example, consider Problem 2.1. The optimal simplex tableau for this
problem is given in Table 2.8.
(i) Change in r.h.s. Constant Whose Slack Variable Is Basic. Suppose that
the r.h.s. constant of constraint (2.12) is changed from 10 to (10 + r). For what
values of r will the present solution remain feasible and hence optimal?
Recall that (2.12) was:
$3x_1 + 3x_2 + x_4 = 10.$
It now becomes:
$3x_1 + 3x_2 + 1 \cdot x_4 = 10 + 1 \cdot r.$
Note that the columns corresponding to $x_4$ and $r$ are identical in the initial
tableau of the new problem, i.e.,
$(0, 1, 0, 0)^T.$
Hence they will remain equal in any subsequent simplex tableau. But, as $x_4$
is basic in Table 2.8, its column of coefficients is unchanged. Hence when the
same sequence of iterations that produced Table 2.8 is performed on the new
problem the only place in which $r$ will appear is the r.h.s. of (2.12). This new
constant becomes $(\tfrac{2}{5} + r)$. For this solution to remain feasible, all the r.h.s.
constants must be nonnegative, i.e.,
$\tfrac{2}{5} + r \ge 0$
or
$r \ge -\tfrac{2}{5}.$
Thus, as long as
$10 + r \ge 10 - \tfrac{2}{5},$
i.e.,
$b_2 \ge 9\tfrac{3}{5},$
the current solution will remain feasible and optimal.
Let us now generalize the above considerations. Suppose it is decided to
increase the r.h.s. constant of constraint $i$ from $b_i$ to $b_i + r$, and the slack
variable of the constraint is basic at the optimum. Then the only possible
change in the new optimal tableau will occur in the final $b_i$ entry. This entry
could be negative, indicating that the present solution may be infeasible.
However, if it is feasible it will still be optimal. Now if the final $b_i$ entry was
$\bar{b}_i$, the present solution will be optimal if
$\bar{b}_i + r \ge 0,$
i.e.,
$r \ge -\bar{b}_i.$
It may be that a specific value of $r$ has been given that forces this inequality
to be violated, and hence for the present solution to be infeasible. One then
may ask what is the new optimal solution and its value? The negative r.h.s.
entry $(\bar{b}_i + r)$ for constraint $i$ is removed to attain feasibility. This is achieved
by replacing the basic variable associated with this constraint by a nonbasic
variable. How this is done forms the kernel of the dual simplex method, which
is explained in the next chapter.
(ii) Change in r.h.s. Constant Whose Slack Variable Is Nonbasic. Suppose
now that the r.h.s. constant of constraint (2.11) is changed from 12 to $(12 + r)$.
Once again we ask for what values of $r$ will the present solution remain
feasible and hence optimal? Equation (2.11) was
$3x_1 + 4x_2 + x_3 = 12.$
It now becomes
$3x_1 + 4x_2 + 1 \cdot x_3 = 12 + 1 \cdot r.$
As with the previous case, the columns in any simplex tableau corresponding
to $x_3$ and $r$ are identical, and will remain identical in any subsequent simplex
tableau. But now $x_3$ is nonbasic, and hence its column in the optimal tableau
is substantially changed. Hence the r.h.s. column in the tableau found by
performing the same iterations on the new problem is:
$(\tfrac{12}{5} + \tfrac{2}{5}r,\ \tfrac{2}{5} - \tfrac{3}{5}r,\ \tfrac{4}{5} - \tfrac{1}{5}r,\ \tfrac{52}{5} + \tfrac{2}{5}r)^T.$
The first three entries must be nonnegative to preserve feasibility (and
optimality):
$\tfrac{12}{5} + \tfrac{2}{5}r \ge 0$
$\tfrac{2}{5} - \tfrac{3}{5}r \ge 0$
$\tfrac{4}{5} - \tfrac{1}{5}r \ge 0.$
That is,
$-6 \le r \le \tfrac{2}{3},$
with a corresponding solution value
$\tfrac{52}{5} + \tfrac{2}{5}r.$
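
Both r.h.s. cases can be checked with the same small routine. The sketch below is an illustration under stated assumptions rather than anything from the original text: it takes the final r.h.s. column of Table 2.8 and the final column of the relevant slack variable, and returns the range of $r$ for which every entry $\bar{b}_k + \bar{a}_{kj}r$ remains nonnegative. The function name `r_range` and the variable names are hypothetical.

```python
from fractions import Fraction as F

# Final r.h.s. entries of Table 2.8 for rows (2.11), (2.12), (2.13).
b_final = [F(12, 5), F(2, 5), F(4, 5)]
# Final columns of the slack variables x3 (constraint 2.11, nonbasic)
# and x4 (constraint 2.12, basic).
x3_col = [F(2, 5), F(-3, 5), F(-1, 5)]
x4_col = [F(0), F(1), F(0)]


def r_range(b_final, slack_col):
    """Bounds (lo, hi) on r keeping b_final[k] + slack_col[k] * r >= 0 for all k.

    None means that side is unbounded.
    """
    lo = hi = None
    for b, a in zip(b_final, slack_col):
        if a > 0:
            bound = -b / a
            lo = bound if lo is None else max(lo, bound)
        elif a < 0:
            bound = -b / a
            hi = bound if hi is None else min(hi, bound)
    return lo, hi


basic_case = r_range(b_final, x4_col)      # change to constraint (2.12)
nonbasic_case = r_range(b_final, x3_col)   # change to constraint (2.11)
print("constraint (2.12):", basic_case)
print("constraint (2.11):", nonbasic_case)
```

For a basic slack variable the column is a unit vector, so only one row contributes a bound; for a nonbasic slack variable every row with a nonzero entry may do so.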
This can be generalized quite naturally. Suppose that the r.h.s. constant
$b_i$ of constraint $i$ is changed to $(b_i + r)$, where the starting basic variable in
constraint $i$ is $x_j$. That is, constraint $i$ is changed from
$\sum_{k \ne j} a_{ik}x_k + x_j = b_i$
to
$\sum_{k \ne j} a_{ik}x_k + x_j = b_i + r.$
Assuming that $x_j$ is nonbasic in the optimal tableau, let its coefficients in this
tableau be given in the vector:
$(\bar{a}_{1j}, \bar{a}_{2j}, \ldots, \bar{a}_{mj})^T$
and let the final r.h.s. coefficients be given in the vector:
$(\bar{b}_1, \bar{b}_2, \ldots, \bar{b}_m)^T.$
Then the current solution will still be feasible (and optimal) if
$\bar{b}_1 + \bar{a}_{1j}r \ge 0$
$\bar{b}_2 + \bar{a}_{2j}r \ge 0$
$\vdots$
$\bar{b}_m + \bar{a}_{mj}r \ge 0$
all hold.
These inequalities can be used to deduce a range in which the present
solution remains optimal. Provided $r$ remains within this range, this yields
optimal solution values
$x_0^* + \bar{a}_{0j}r,$
where $x_0^*$ is the present optimal solution value and $\bar{a}_{0j}$ is the $x_0$-row
coefficient of $x_j$ in the optimal tableau.

2.6.2.3 Changes in the l.h.s. Coefficients of the Constraints


(i) Changes to Nonbasic Variable Coefficients. Consider changing a l.h.s.
coefficient $a_{ij}$ to $a'_{ij}$ when its associated variable $x_j$ is nonbasic in the optimal
tableau. Suppose that the same sequence of simplex iterations is carried out
on the new problem. How will the new optimal tableau differ from the original
optimal tableau? The only differences that can possibly occur are in the $x_j$
column. However, we have assumed that $x_j$ is nonbasic. Thus
$x_j^* = 0.$
Therefore changes in the $x_j$ coefficients in the constraints have no effect, and
the original solution will still be obtained and so must still be feasible. It
remains to settle the question of its optimality. The new $x_0$-row coefficient
of $x_j$ can be obtained as follows.
Let the starting basic variable from constraint $i$ be $x_{s(i)}$. This variable does
not appear in any original equation other than constraint $i$, where it has
coefficient 1. So it is possible to deduce what multiple of $a_{ij}$ was added to
$(-c_j)$, the coefficient of $x_j$ in the $x_0$ row. Indeed, if $a^*_{0s(i)}$ is the final $x_0$-row
coefficient of $x_{s(i)}$, then exactly $a^*_{0s(i)}$ times equation $i$ must have been, in effect,
added to equation 0. Let the current $x_0$-row coefficient of $x_j$ be $a^*_{0j}$. Then the
new $x_0$-row coefficient of $x_j$ should become
$a^*_{0j} + a^*_{0s(i)}(a'_{ij} - a_{ij}).$
If this value is nonnegative, the present solution is still optimal. If the value is
negative, further simplex iterations must be performed, beginning with $x_j$
entering the basis. In order to decide which variable leaves the basis it is
necessary to update the rest of the $x_j$ column. It can be shown by an argument
similar to that for $a^*_{0j}$ that the coefficient of $x_j$ in row $k$ $(k = 1, 2, \ldots, m)$ should
be changed from $a^*_{kj}$ to
$a^*_{kj} + a^*_{ks(i)}(a'_{ij} - a_{ij}).$
It is also possible to decide the question of whether the change to $a_{ij}$ affects
the optimality of the current solution by analyzing the dual problem. The
only change in the dual problem is that the $j$th constraint
$a_{1j}y_1 + a_{2j}y_2 + \cdots + a_{ij}y_i + \cdots + a_{mj}y_m \ge c_j$
becomes
$a_{1j}y_1 + a_{2j}y_2 + \cdots + a'_{ij}y_i + \cdots + a_{mj}y_m \ge c_j.$
It is possible to deduce the optimal $y_i$ values, $y_i^*$, either by having solved the
dual originally or from the $x_0$-row coefficients in the optimal primal tableau.
Hence one can substitute in these $y_i^*$ values and check whether this new
constraint is satisfied. If it is, the solution is still optimal. If it is not, and the
optimal dual tableau is available, the optimal solution to the new problem
can be obtained by using the dual simplex method. This method will be
explained in Chapter 3.
(ii) Changes to Basic Variable Coefficients. Consider once again Problem
2.1. Suppose that a32 is changed from 2 to 3 in (2.13). Suppose now that the
same sequence of simplex iterations is performed on the new problem as
that which produced Table 2.8. This will produce a tableau which differs
from Table 2.8 only in the X2 column. The new X2 column values can be
calculated by the method outlined in the previous section. This new tableau
is shown in Table 2.68. Here the condition of canonical form is destroyed,
as $x_2$ is supposed to be a basic variable with column
$(1, 0, 0, 0)^T.$
This condition is restored by row manipulation in Table 2.69.

Table 2.68

Constraints    x1     x2      x3    x4     x5     r.h.s.

(2.11)          0    7/10    2/5     0   -3/10     12/5
(2.12)          0   -3/10   -3/5     1   -3/10      2/5
(2.13)          1    2/5    -1/5     0    2/5       4/5
x0              0    7/10    2/5     0    7/10     52/5

Table 2.69

Constraints    x1    x2     x3    x4     x5     r.h.s.

(2.11)          0     1    4/7     0    -3/7     24/7
(2.12)          0     0   -3/7     1    -3/7     10/7
(2.13)          1     0   -3/7     0     4/7     -4/7
x0              0     0     0      0      1        8

It can be seen that this solution is infeasible, as
$x_1^* = -\tfrac{4}{7}.$
The dual simplex method (detailed in Chapter 3) can be used to transform
this tableau into an optimal one.
However, if the condition for optimality (all $x_0$-row coefficients nonnegative)
has not been satisfied the situation may be somewhat gloomy. In
this case one can select an earlier tableau where the condition is satisfied
and use the dual simplex method from there. If there are no such suitable
earlier tableaux, little of the computation can be saved and it is necessary
to solve the new problem from scratch.
If Table 2.69 had displayed a feasible but suboptimal solution, further
simplex iterations would have been needed to produce optimality. If it had
displayed a feasible solution satisfying the condition for optimality, nothing
more would have been necessary.

2.6.2.4 The Introduction of New Variables


Suppose that a new variable, $x_6$, is added to Problem 2.1 as follows:
Maximize: $4x_1 + 3x_2 + 5x_6$
subject to: $3x_1 + 4x_2 + x_3 + 2x_6 = 12$
$3x_1 + 3x_2 + x_4 + 3x_6 = 10$
$4x_1 + 2x_2 + x_5 + 4x_6 = 8$
$x_i \ge 0, \quad i = 1, 2, \ldots, 6.$
The original optimal solution given in Table 2.8 can be considered a solution
to this new problem with $x_6$ nonbasic, i.e.,
$x_6^* = 0.$
Hence it must still be a feasible solution. One must now decide whether it
is optimal or not. The new dual problem can be used to make this decision.
It will be identical to the original dual except that a new constraint,
$2y_1 + 3y_2 + 4y_3 \ge 5,$ (2.35)
based upon $x_6$ must be added to Problem 2.2. So the original dual solution
given in Table 2.64 remains feasible if and only if it satisfies (2.35). And
feasibility of the original dual solution implies optimality of the original
primal solution (with $x_6^* = 0$) for the new primal problem. However the
optimal dual solution unfortunately does not satisfy (2.35): with the dual
values $y_1 = \tfrac{2}{5}$, $y_2 = 0$, $y_3 = \tfrac{7}{10}$ read from the $x_0$ row of Table 2.8, the
left-hand side is only $\tfrac{18}{5}$. So more primal simplex iterations are necessary
to produce optimality.
Before they can be carried out it is necessary to calculate the coefficients
of $x_6$ in the tableau produced when the iterations that produced Table 2.8
are applied to the new problem. These coefficients can be found by considering
$x_6$ as an original variable with constraint and objective function
coefficients equal to zero. Then the introduction of $x_6$ corresponds to a
change in the value of these coefficients from zero to their present values.
How to perform calculations based on these changes was explained in
Section 2.6.2.3. As each $x_6$ coefficient was assumed to be zero, it would
remain zero when the simplex iterations are performed on the original
problem. Hence $a^*_{k6} = 0$, $k = 0, 1, \ldots, 3$. For the purposes of the calculations
it is assumed that the changes occur one at a time. This produces Table 2.70.
Now further simplex iterations can be carried out with first $x_6$ entering the
basis.

Table 2.70

Constraint   x1   x2    x3    x4    x5     x6                                         r.h.s.

(2.11)        0    1   2/5     0  -3/10   [2/5(2) + 0(3) - 3/10(4)] = -2/5             12/5
(2.12)        0    0  -3/5     1  -3/10   [-3/5(2) + 1(3) - 3/10(4)] = 3/5              2/5
(2.13)        1    0  -1/5     0   2/5    [-1/5(2) + 0(3) + 2/5(4)] = 6/5               4/5
x0            0    0   2/5     0   7/10   [-5 + 2/5(2) + 0(3) + 7/10(4)] = -7/5        52/5
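
The bracketed calculations for the $x_6$ column can be reproduced as follows. This is a minimal sketch under stated assumptions, not the author's procedure verbatim: it uses the $x_3$, $x_4$, $x_5$ columns of Table 2.8 as the multipliers that were applied to the original constraints, and applies them to the original coefficients (2, 3, 4) and objective coefficient 5 of $x_6$. The variable names are illustrative.

```python
from fractions import Fraction as F

# Columns x3, x4, x5 of Table 2.8, listed row by row. Row k records how much of
# each original constraint ended up in row k of the optimal tableau.
multipliers = [
    [F(2, 5),  F(0), F(-3, 10)],   # (2.11)
    [F(-3, 5), F(1), F(-3, 10)],   # (2.12)
    [F(-1, 5), F(0), F(2, 5)],     # (2.13)
]
x0_multipliers = [F(2, 5), F(0), F(7, 10)]   # x0-row entries of x3, x4, x5

a6 = [F(2), F(3), F(4)]   # original constraint coefficients of x6
c6 = F(5)                 # original objective coefficient of x6

# Column of x6 in the optimal tableau, and its x0-row coefficient.
column = [sum(m * a for m, a in zip(row, a6)) for row in multipliers]
x0_coeff = -c6 + sum(y * a for y, a in zip(x0_multipliers, a6))

print("x6 column:", column)
print("x6 coefficient in the x0 row:", x0_coeff)
print("x6 should enter the basis:", x0_coeff < 0)
```

A negative $x_0$-row coefficient corresponds to the new dual constraint (2.35) being violated, which is the same test expressed in primal terms.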

2.6.2.5 The Introduction of a New Constraint


Suppose that a new constraint,
$x_1 + x_2 \le 3,$ (2.36)
is added to Problem 2.1. Is the solution in Table 2.8 still feasible? When a
further constraint is added to an L.P. problem, a new optimal solution
cannot improve on the original one. So if the original optimal solution is
still feasible, it is still optimal. However, the solution in Table 2.8 does not
satisfy (2.36), since $x_1^* + x_2^* = \tfrac{4}{5} + \tfrac{12}{5} = \tfrac{16}{5} > 3$. Hence a new slack
variable, $x_6$, is added to (2.36) to produce
$x_1 + x_2 + x_6 = 3,$ (2.37)
which is added to Table 2.8 to give Table 2.71.

Table 2.71

Constraints    x1    x2     x3    x4     x5     x6    r.h.s.

(2.11)          0     1    2/5     0   -3/10     0     12/5
(2.12)          0     0   -3/5     1   -3/10     0      2/5
(2.13)          1     0   -1/5     0    2/5      0      4/5
(2.37)          1     1     0      0     0       1       3
x0              0     0    2/5     0    7/10     0     52/5
Table 2.72

Constraints    x1    x2     x3    x4     x5     x6    r.h.s.

(2.11)          0     1    2/5     0   -3/10     0     12/5
(2.12)          0     0   -3/5     1   -3/10     0      2/5
(2.13)          1     0   -1/5     0    2/5      0      4/5
(2.37)          0     0   -1/5     0   -1/10     1     -1/5
x0              0     0    2/5     0    7/10     0     52/5

When Table 2.71 is reduced to canonical form by subtracting (2.11)
and (2.13) from (2.37) it is seen that the resulting solution is infeasible, as
$x_6 = -\tfrac{1}{5},$
even though the condition for optimality is satisfied. This is shown in Table
2.72. This situation can be remedied to produce optimality using the dual
simplex method of Chapter 3.
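
The row operation that restores canonical form can be written out directly. The sketch below is illustrative only: it assumes the rows of Table 2.8 extended with a zero $x_6$ column, and subtracts from the new row the rows in which $x_1$ and $x_2$ are basic. The list names are hypothetical.

```python
from fractions import Fraction as F

# Rows of Table 2.8 with an added x6 column: [x1, x2, x3, x4, x5, x6, r.h.s.].
row_211 = [F(0), F(1), F(2, 5), F(0), F(-3, 10), F(0), F(12, 5)]   # x2 basic
row_213 = [F(1), F(0), F(-1, 5), F(0), F(2, 5), F(0), F(4, 5)]     # x1 basic
new_row = [F(1), F(1), F(0), F(0), F(0), F(1), F(3)]               # (2.37)

# x1 and x2 both have coefficient 1 in the new row, so subtracting one copy of
# each basic row removes them and leaves x6 as the basic variable of the row.
canonical = [n - a - b for n, a, b in zip(new_row, row_211, row_213)]

print("canonical form of (2.37):", canonical)
print("value of x6:", canonical[-1])
print("feasible:", canonical[-1] >= 0)
```

If the new constraint had other coefficients on the basic variables, each basic row would be multiplied by the corresponding coefficient before being subtracted.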

2.7 Special Linear Programs

2.7.1 The Transportation Problem

The transportation problem is a special type of linear program. Because of
its structure it can be solved more efficiently by a modification of the simplex
technique than by the simplex technique itself. Consider a supply system
comprising three factories which must supply the needs for a single
commodity of three warehouses. The unit cost of shipping one item from
each factory to each warehouse is known. The production capacity of each
factory is limited to a known amount. Each warehouse must receive a minimum
number of units of the commodity. The problem is to find the minimum
cost supply schedule which satisfies the production and demand constraints.
Figure 2.12 shows a typical supply system in diagrammatic form, the numbers
associated with the arrows representing unit shipping costs. The supply
schedule to be found consists of a list which describes how much of the
commodity should be shipped from each factory to each warehouse. For this
purpose, define $x_{ij}$ to be the number of units shipped from factory $i$ to
warehouse $j$.
[Figure 2.12. The supply system of a typical transportation problem. Factories 1, 2,
and 3 have capacities 20, 15, and 10; warehouses 1, 2, and 3 have demands 5, 20,
and 20; each arrow from a factory to a warehouse carries its unit shipping cost.]

Consider factory 1 with capacity 20. Factory 1 cannot supply more than
20 units in total to warehouses 1, 2, and 3. Hence
$x_{11} + x_{12} + x_{13} \le 20.$
The production constraints for factories 2 and 3 are, respectively,
$x_{21} + x_{22} + x_{23} \le 15$
and
$x_{31} + x_{32} + x_{33} \le 10.$
Consider warehouse 1 with a demand of 5. Warehouse 1 must receive at
least five units in total from factories 1, 2, and 3, hence
$x_{11} + x_{21} + x_{31} \ge 5.$
The demand constraints for warehouses 2 and 3 are, respectively,
$x_{12} + x_{22} + x_{32} \ge 20$
and
$x_{13} + x_{23} + x_{33} \ge 20.$
Of course, all quantities shipped must be nonnegative; thus,
$x_{ij} \ge 0, \quad i = 1, 2, 3, \quad j = 1, 2, 3.$

The objective is to find a supply schedule with minimum cost. The total
cost is the sum of all costs from all factories to all warehouses. This cost $x_0$
can be expressed as
$x_0 = 0.9x_{11} + 1.0x_{12} + 1.0x_{13} + 1.0x_{21} + 1.4x_{22} + 0.8x_{23} + 1.3x_{31} + 1.0x_{32} + 0.8x_{33}.$
The problem can now be summarized in linear programming form as
follows.
Minimize: $x_0 = 0.9x_{11} + 1.0x_{12} + 1.0x_{13} + 1.0x_{21} + 1.4x_{22} + 0.8x_{23} + 1.3x_{31} + 1.0x_{32} + 0.8x_{33}$
subject to: $x_{11} + x_{12} + x_{13} \le 20$
$x_{21} + x_{22} + x_{23} \le 15$
$x_{31} + x_{32} + x_{33} \le 10$
$x_{11} + x_{21} + x_{31} \ge 5$
$x_{12} + x_{22} + x_{32} \ge 20$
$x_{13} + x_{23} + x_{33} \ge 20$
$x_{ij} \ge 0, \quad i = 1, 2, 3, \quad j = 1, 2, 3.$

The problem can be generalized as follows. Let

$m$ = the number of factories;
$n$ = the number of warehouses;
$a_i$ = the number of units available at factory $i$, $i = 1, 2, \ldots, m$;
$b_j$ = the number of units required by warehouse $j$, $j = 1, 2, \ldots, n$;
$c_{ij}$ = the unit transportation cost from factory $i$ to warehouse $j$.

Then the problem is to

Minimize: $x_0 = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij}x_{ij}$ (2.38)
subject to: $\sum_{i=1}^{m} x_{ij} \ge b_j, \quad j = 1, 2, \ldots, n$ (2.39)
$\sum_{j=1}^{n} x_{ij} \le a_i, \quad i = 1, 2, \ldots, m$ (2.40)
$x_{ij} \ge 0, \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n.$
[Figure 2.13. The distinctive pattern of the unit constraint coefficients in the
transportation problem.]

Problems which belong to this class of L.P. problems are called transportation
problems. However, many of the problems of this class do not
involve the transporting of a commodity between sources and destinations.
In the particular problem studied here, total supply is equal to total demand.
Hence in any feasible solution each factory will be required to ship its entire
supply and each warehouse will receive exactly its demand. Therefore, all
constraints will be binding in any feasible solution. The algorithm for the
solution of the transportation problem, shortly to be explained, assumes
that supply and demand are balanced in this way. Of course, there may exist
well formulated problems in which "supply" exceeds "demand" or vice versa,
as the problems may have nothing to do with the transportation of a commodity.
In this case a fictitious "warehouse" or "factory" is introduced,
whichever is required. Its "capacity" or "demand" is defined so as to balance
total supply with total demand. All unit transportation costs to or from this
fictitious location are defined to be zero. Then the value of the optimal
solution to this balanced problem will equal that of the original problem.
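
The balancing step can be sketched in a few lines. The function name `balance` and the small data set are hypothetical, introduced only for illustration; the rule itself (a zero-cost fictitious location that absorbs the difference) is the one described above.

```python
def balance(costs, supply, demand):
    """Return (costs, supply, demand) with a zero-cost fictitious location added.

    If supply exceeds demand a fictitious warehouse (extra column) is added;
    if demand exceeds supply a fictitious factory (extra row) is added.
    The inputs are not modified.
    """
    costs = [list(row) for row in costs]
    supply, demand = list(supply), list(demand)
    surplus = sum(supply) - sum(demand)
    if surplus > 0:
        for row in costs:
            row.append(0)            # shipping to the fictitious warehouse is free
        demand.append(surplus)
    elif surplus < 0:
        costs.append([0] * len(demand))   # shipping from the fictitious factory is free
        supply.append(-surplus)
    return costs, supply, demand


# Hypothetical data: 15 units available, 10 required.
demo_costs, demo_supply, demo_demand = balance([[1, 2], [3, 4]], [10, 5], [4, 6])
print(demo_costs, demo_supply, demo_demand)
```

Units "shipped" to or from the fictitious location simply represent unused capacity or unmet demand in the original problem.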
It was mentioned earlier that, because of its structure, the transportation
problem could be solved efficiently by a modified simplex procedure. This
structure is as follows.
1. All l.h.s. constraint coefficients are either zero or one.
2. All l.h.s. unit coefficients are always positioned in a distinctive pattern in
the initial simplex tableau representing the problem (ignoring slack variables).
This is shown in Figure 2.13.
3. All r.h.s. constraint constants are integers.
This structure implies a very important result, that the optimal values of the
decision variables will be integer.
In solving problems by hand using the simplex method it was convenient
to display each iteration in a tableau. This is also done in the transportation
problem, except a different type of tableau is used. The general tableau is
given in Table 2.73.
The tableau for the example problem is given in Table 2.73a. The value
of each decision variable is written in each cell. A feasible solution to the
problem is displayed in Table 2.74. Methods by which an initial feasible
solution can be identified are outlined in the next section.
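Table 2.73

                          Warehouses
               1      2     ...     j     ...     n      Supply

           1   c11    c12   ...    c1j    ...    c1n       a1
           2   c21    c22   ...    c2j    ...    c2n       a2
Factories ...
           i   ci1    ci2   ...    cij    ...    cin       ai
          ...
           m   cm1    cm2   ...    cmj    ...    cmn       am

Demand         b1     b2    ...    bj     ...    bn

Table 2.73a

                   Warehouses
               1       2       3      Supply

Factory 1     0.9      1       1        20
Factory 2      1      1.4     0.8       15
Factory 3     1.3      1      0.8       10

Demand         5       20      20

Table 2.74

                   Warehouses
               1          2          3        Supply

Factory 1     0.9 (5)    1 (15)     1           20
Factory 2      1        1.4 (5)    0.8 (10)     15
Factory 3     1.3        1         0.8 (10)     10

Demand         5          20         20

(Each cell shows the unit cost, followed by the allocation in parentheses where one is made.)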

2.7.1.1 The Identification of an Initial Feasible Solution


2.7.1.1.1 The Northwest Corner Method. This method starts by allocating
as much as possible to the cell in the northwest corner of the tableau of the
problem, cell (1,1) (row 1, column 1). In the example problem, the maximum
that can be allocated is five units, as the demand of warehouse 1 is five. This
satisfies the demand of warehouse 1 and leaves factory 1 with 15 units left.
As warehouse 1 is satisfied, column 1 is removed from consideration. Then
cell (1,2) becomes the new northwest corner. As much as possible is allocated
to this cell. The maximum that can be allocated is 15, all that remains in
factory 1. Warehouse 2 now has its demand reduced to 5, as it has just
received 15 units from factory 1. Row 1 is dropped from consideration as it
has now expended all its resources. This means that cell (2,2) becomes the
new northwest corner. This procedure continues until all demand is met.
Table 2.74 shows the feasible solution thus obtained.
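
The northwest corner rule is short enough to state as code. The following is a minimal sketch assuming balanced supply and demand; the function names and the choice to store unit costs in tenths of a dollar (so that the arithmetic is exact) are illustrative assumptions.

```python
def northwest_corner(supply, demand):
    """Allocate as much as possible to the current northwest corner cell.

    Assumes total supply equals total demand. Returns the allocation matrix.
    """
    supply, demand = list(supply), list(demand)
    alloc = [[0] * len(demand) for _ in supply]
    i = j = 0
    while i < len(supply) and j < len(demand):
        q = min(supply[i], demand[j])
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            i += 1      # factory i is exhausted: drop its row
        else:
            j += 1      # warehouse j is satisfied: drop its column
    return alloc


def total_cost(costs, alloc):
    """Sum of unit cost times allocation over all cells."""
    return sum(c * x for crow, arow in zip(costs, alloc) for c, x in zip(crow, arow))


# Unit costs of the example problem, in tenths of a dollar.
COSTS = [[9, 10, 10], [10, 14, 8], [13, 10, 8]]
nw_schedule = northwest_corner([20, 15, 10], [5, 20, 20])
print(nw_schedule)
print("total cost:", total_cost(COSTS, nw_schedule) / 10)
```

When a row and a column are exhausted at the same step, this sketch drops the row only, so the next allocation is zero; that corresponds to a degenerate solution.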

2.7.1.1.2 The Least Cost Method. Although the northwest corner method
is easy to implement and always produces a feasible solution, it takes no
account of the relative unit transportation costs. It is quite likely that the
solution thus produced will be far from optimal. The methods of this section
and the next usually produce less costly initial solutions. The least cost
method starts by allocating the largest possible amount to the cell in the
tableau with the least unit cost. In the example problem, this amounts to
allocating to either cell (2,3) or cell (3,3). Suppose cell (2,3) is chosen arbi-
trarily and 15 units are assigned to it. This procedure will always satisfy a
row or column which is removed from consideration. In this case row 2 is
removed. The demand of warehouse 3 is reduced to 5, as it has been allocated
15 by factory 2. The cell with the next smallest unit cost is identified and the
maximum is allocated to it. This means 5 units are allocated to cell (3,3).
This procedure continues until all demand is met. Table 2.75 shows the
feasible solution thus obtained.
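
A sketch of the least cost rule follows, again assuming a balanced problem. Visiting the cells in order of increasing unit cost and skipping any cell whose row or column is already exhausted is equivalent to repeatedly choosing the cheapest remaining cell. Ties are broken here by row and then column, which is one arbitrary choice among several; the names and the use of costs in tenths of a dollar are illustrative assumptions.

```python
def least_cost(costs, supply, demand):
    """Allocate to cells in order of increasing unit cost.

    Assumes total supply equals total demand. Returns the allocation matrix.
    """
    supply, demand = list(supply), list(demand)
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    cells = sorted((costs[i][j], i, j) for i in range(m) for j in range(n))
    for _, i, j in cells:
        q = min(supply[i], demand[j])
        if q > 0:                 # skip cells in exhausted rows or columns
            alloc[i][j] = q
            supply[i] -= q
            demand[j] -= q
    return alloc


# Unit costs of the example problem, in tenths of a dollar.
COSTS = [[9, 10, 10], [10, 14, 8], [13, 10, 8]]
lc_schedule = least_cost(COSTS, [20, 15, 10], [5, 20, 20])
lc_cost = sum(c * x for crow, arow in zip(COSTS, lc_schedule)
              for c, x in zip(crow, arow))
print(lc_schedule)
print("total cost:", lc_cost / 10)
```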

Table 2.75
                   Warehouses
               1          2          3        Supply

Factory 1     0.9 (5)    1 (15)     1           20
Factory 2      1        1.4        0.8 (15)     15
Factory 3     1.3        1 (5)     0.8 (5)      10

Demand         5          20         20
2.7.1.1.3 The Vogel Approximation Method. The Vogel approximation
method often produces initial solutions which are even better than those
of the least cost method. However, the price of this attractiveness is con-
siderably more computation than the previous two methods. The approach
is similar to that of the Hungarian method for the assignment problem,
discussed later in this chapter, and also to that used in solving the travelling
salesman problem by branch and bound enumeration, discussed in Chapter 4.
The variation of the Vogel approximation method described here begins
by first reducing the matrix of unit costs. This reduction is achieved by sub-
tracting the minimum quantity in each row from all elements in that row.
This results in the following unit costs in the current example in Table 2.76.
The costs are further reduced by carrying out this procedure on the columns
of the new cost matrix. This produces Table 2.77.
A penalty is then calculated for each cell which currently has zero unit
cost. Each cell penalty represents the unit cost incurred if a positive allocation
is not made to that cell. Each cell penalty is found by adding together the
second smallest costs of the row and column of the cell. These second
smallest costs for each row and column are shown alongside each row and
column for the example problem in Table 2.78. The penalties are shown in
square brackets beside each appropriate cell.

Table 2.76

               1       2       3

1              0      0.1     0.1     (-0.9)
2             0.2     0.6      0      (-0.8)
3             0.5     0.2      0      (-0.8)

Table 2.77

               1       2       3

1              0       0      0.1
2             0.2     0.5      0
3             0.5     0.1      0

              (0)    (-0.1)   (0)

Table 2.78

               1           2           3

1            0 [0.2]     0 [0.1]      0.1          (0)
2            0.2         0.5         0 [0.2]       (0.2)
3            0.5         0.1         0 [0.1]       (0.1)

             (0.2)       (0.1)       (0)


The cell with the largest penalty is identified. The maximum amount
possible is then allocated to this cell. Ties are settled arbitrarily. In the
example, either cell (1, 1) or cell (2,3) could be chosen, each with a penalty of
0.2. Cell (1, 1) will be arbitrarily chosen, and 5 units are allocated to it. This
procedure will always satisfy a row or column (or both), which is then re-
moved from further consideration. This removal may necessitate a further
reduction in the cost matrix and a recalculation of some penalties. This
results in Table 2.79. This process is repeated until all demand is met. The
final allocation is given in Table 2.80.
A comparison of the three techniques shows that the northwest corner
method produced an initial solution with value 42.5, whereas the least cost
method and the Vogel approximation method produced the same solution, with
value 40.5. It will be shown that this latter solution is optimal.

Table 2.79

               1            2           3

1           x11 = 5       0 [0.2]      0.1          (0.1)
2                         0.5         0 [0.5]       (0.5)
3                         0.1         0 [0.1]       (0.1)

                          (0.1)       (0)

Table 2.80

               1          2          3

1             0.9 (5)    1 (15)     1
2              1        1.4        0.8 (15)
3             1.3        1 (5)     0.8 (5)

2.7.1.2 The Stepping Stone Algorithm


Once an initial feasible solution has been found by one of the three preceding
methods, it is desired to transform it into the optimal solution. This is
achieved by the stepping stone algorithm. Consider the initial feasible solu-
tion found by the northwest corner method given in Table 2.74. To deter-
mine whether this solution is optimal or not it is necessary to ask, for each
cell individually, if the allocation of one unit to that cell would reduce the
total cost. This is done only for those cells which presently have no units
assigned to them.
For example, cell (1,3) has nothing assigned to it. Would the total cost
be reduced if at least one unit was assigned to that cell? Assume that one
unit is assigned, i.e.,
$x_{13} = 1.$
This means row 1 and column 3 are unbalanced: the sums of their assignments
do not add up to the appropriate capacity and demand. To balance
row 1, one unit is subtracted from cell (1,2) so that now
$x_{12} = 14.$
Now column 2 is unbalanced. To correct this, one unit is added to cell (2,2),
so that now
$x_{22} = 6.$
Now row 2 is unbalanced. To correct this, one unit is subtracted from cell
(2,3):
$x_{23} = 9.$
This also balances column 3.
What we have done is to trace out a circuit of cells, the only empty one
being the cell under scrutiny. This circuit is shown in Table 2.81. Is this
solution an improvement over the initial solution? The solution is displayed
in Table 2.82.
Table 2.81

               1          2          3        Supply

1             0.9 (5)    1 (15)     1 (?)       20
2              1        1.4 (5)    0.8 (10)     15
3             1.3        1         0.8 (10)     10

Demand         5          20         20

Table 2.82

               1          2          3        Supply

1             0.9 (5)    1 (14)     1 (1)       20
2              1        1.4 (6)    0.8 (9)      15
3             1.3        1         0.8 (10)     10

Demand         5          20         20

The difference between the solutions in Tables 2.81 and 2.82 is


+$1.0 for the unit shipped from factory 1 to warehouse 3
- $1.0 for the unit less from factory 1 to warehouse 2
+ $1.4 for the unit shipped from factory 2 to warehouse 2
- $0.8 for the unit less from factory 2 to warehouse 3
$0.6
Thus an allocation of one unit to cell (1,3) causes an increase of $0.6. Hence
such an allocation is not worthwhile. We can evaluate the worth of all other
empty cells in a similar manner; that is, for each empty cell we can form a
circuit of cells, the only empty cell in the circuit being the cell in question.
The reader should verify that the changes in x0 for a unit allocation to
cells (2,1), (3,1), and (3,2) are -$0.3, $0.0, and -$0.4, respectively. The circuit
of cells for this last proposed allocation is shown in Table 2.83.
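
The evaluation of an empty cell amounts to an alternating sum of unit costs round its circuit. The sketch below is illustrative: the function name `circuit_change` is hypothetical, and the circuits for cells (2,1) and (3,1) are not listed in the text but were traced here from the allocation in Table 2.74. Each circuit is written starting at the empty cell, without repeating it at the end.

```python
from fractions import Fraction as F

# Unit costs of the example problem, indexed by (factory, warehouse).
costs = {
    (1, 1): F(9, 10),  (1, 2): F(1),      (1, 3): F(1),
    (2, 1): F(1),      (2, 2): F(14, 10), (2, 3): F(8, 10),
    (3, 1): F(13, 10), (3, 2): F(1),      (3, 3): F(8, 10),
}


def circuit_change(circuit):
    """Change in total cost when one unit is moved round the circuit.

    Cells in even positions (0, 2, ...) gain a unit and cells in odd
    positions lose one, so costs are added and subtracted alternately.
    """
    return sum(costs[cell] if k % 2 == 0 else -costs[cell]
               for k, cell in enumerate(circuit))


# Circuits for the empty cells of the northwest corner solution (Table 2.74).
changes = {
    (1, 3): circuit_change([(1, 3), (1, 2), (2, 2), (2, 3)]),
    (2, 1): circuit_change([(2, 1), (1, 1), (1, 2), (2, 2)]),
    (3, 1): circuit_change([(3, 1), (1, 1), (1, 2), (2, 2), (2, 3), (3, 3)]),
    (3, 2): circuit_change([(3, 2), (2, 2), (2, 3), (3, 3)]),
}
for cell, change in changes.items():
    print(cell, float(change))
```

The cell with the most negative change, here (3,2), is the one to which an allocation is made.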
Table 2.83

               1          2          3

1             0.9 (5)    1 (15)     1
2              1        1.4 (5)    0.8 (10)
3             1.3        1 (?)     0.8 (10)

As there is a decrease in x0 of $0.4 for each unit allocated to cell (3,2),
we wish to allocate the maximum possible amount to (3,2). The cells which
are going to have their allocations reduced are (2,2) and (3,3). The minimum
allocation among these is 5 units in (2,2). Hence the maximum allocation we
can make to (3,2) is 5: any more and (2,2) would have a negative allocation,
which would be infeasible. The new assignment is shown in Table 2.84. The
total cost is decreased by
$(5 x 0.4) = $2.0.
Thus the new total cost is
$(42.5 - 2.0) = $40.5,
which can be verified by direct computation. We have now completed one
iteration of the stepping stone method. All empty cells in Table 2.84 are
examined in the same way. Of course we know from the previous iteration
that a unit allocation to (2,2) will produce an increase in Xo of $0.4. Indeed
an allocation to any empty cell in Table 2.84 will effect an increase in Xo.
The circuit of cells for each empty cell is:
(1,3): ⟨(1,3), (3,3), (3,2), (1,2), (1,3)⟩ (+$0.2)
(2,1): ⟨(2,1), (1,1), (1,2), (3,2), (3,3), (2,3), (2,1)⟩ (+$0.1)
(2,2): ⟨(2,2), (2,3), (3,3), (3,2), (2,2)⟩ (+$0.4)
(3,1): ⟨(3,1), (1,1), (1,2), (3,2), (3,1)⟩ (+$0.4).

Table 2.84
               1          2          3

1             0.9 (5)    1 (15)     1
2              1        1.4        0.8 (15)
3             1.3        1 (5)     0.8 (5)

Note that the circuit for (2,1) crosses over itself, but this need not cause
any alarm.
This means that we have arrived at the optimal solution. The shipping
schedule is: ship
5 units from factory 1 to warehouse 1,
15 units from factory 1 to warehouse 2,
15 units from factory 2 to warehouse 3,
5 units from factory 3 to warehouse 2,
5 units from factory 3 to warehouse 3,
for a total cost of $40.5. This solution is the same as that obtained by the
Vogel approximation method and the least cost method.
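
The arithmetic of this iteration can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the northwest corner allocations are inferred from the text (5 units in (2,2), the allocations of the final schedule, and the stated cost of $42.5) rather than copied from Table 2.74.

```python
# Unit costs for factory i+1 and warehouse j+1, read from Table 2.84.
COST = [
    [0.9, 1.0, 1.0],
    [1.0, 1.4, 0.8],
    [1.3, 1.0, 0.8],
]

def total_cost(allocation):
    """Total cost of a schedule given as {(factory, warehouse): units}, 1-based."""
    return sum(COST[i - 1][j - 1] * units for (i, j), units in allocation.items())

def circuit_change(circuit):
    """Change in x0 per unit sent round a circuit: costs alternate +, -, +, -."""
    return sum((-1) ** k * COST[i - 1][j - 1] for k, (i, j) in enumerate(circuit))

# Northwest corner start (inferred, see above) and the final schedule.
northwest = {(1, 1): 5, (1, 2): 15, (2, 2): 5, (2, 3): 10, (3, 3): 10}
final = {(1, 1): 5, (1, 2): 15, (2, 3): 15, (3, 2): 5, (3, 3): 5}

print(round(total_cost(northwest), 2))                             # 42.5
print(round(total_cost(final), 2))                                 # 40.5
print(round(circuit_change([(3, 2), (2, 2), (2, 3), (3, 3)]), 2))  # -0.4
```

The circuit is listed without repeating its starting cell, so the signs alternate cleanly from the cell being filled.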
It has been stated that total supply must equal total demand for the
problem to be in a form suitable for the stepping stone algorithm. This means
that one of the constraints (2.39), (2.40) can be expressed in terms of the others
and is redundant. Hence the problem possesses in effect (m + n - 1) constraints.
Thus any basic feasible solution should contain (m + n - 1) basic
variables. It may occur that a solution contains fewer than (m + n - 1) basic
(positive) variables. Such a solution is degenerate.
It is not possible to analyze all the empty cells of the tableau of a degenerate
solution to find an improvement. This problem can be overcome by
declaring basic as many cells as necessary to bring the number in the basis
up to (m + n - 1). This is achieved by allocating a very small positive real
number ε to these cells. These allocations are made to cells in rows or columns
where there is only one basic cell, in order to enable circuits of cells to be
created for all empty cells. These ε's are then removed when the optimal
solution has been found.

2.7.1.3 Dantzig's Method


The stepping stone method is guaranteed to find the minimal solution for
any well formulated transportation problem in a finite number of steps.
However, its implementation becomes very laborious on all but the smallest
problems. For realistically sized problems the following simpler method due
to Dantzig is recommended. Like the stepping stone method it evaluates each
empty cell in order to see whether it would be profitable to make a positive
assignment to it. This evaluation is based on the theory of duality of Section
2.6.1. To be more specific, values are calculated for variables in the dual of
the transportation problem regarded as an L.P.
Unlike the stepping stone method, Dantzig's method does not create a
circuit of cells in order to evaluate the worth of an empty cell. Instead it
calculates values for the dual variables; these enable one to determine which
empty cell should be filled. It then creates one circuit of cells in order to
determine how much should be allocated and which cell leaves the basis. As only
one circuit is created at each iteration, this method is far simpler than the
preceding one.

We now explain how the method works by using it to solve our example
problem. Consider once again the solution obtained by the northwest corner
method, given in Table 2.74. We associate multipliers \(u_i\) with each row i and
\(v_j\) with each column j. For each basic cell (i, j) set
\[ u_i + v_j = c_{ij}, \]
the unit transportation cost, and
\[ u_1 = 0. \]
The values of all the \(u_i\)'s and \(v_j\)'s can then be calculated, as the \(c_{ij}\)'s are known
constants. From Table 2.74 we have
\[
\begin{aligned}
0 + v_1 &= 0.9, \\
0 + v_2 &= 1, \\
u_2 + v_2 &= 1.4, \\
u_2 + v_3 &= 0.8, \\
u_3 + v_3 &= 0.8,
\end{aligned}
\]
which can be solved to yield
\[ v_1 = 0.9, \quad v_2 = 1, \quad u_2 = 0.4, \quad v_3 = 0.4, \quad u_3 = 0.4. \]
Having determined values for what will be seen to be dual variables, we
now calculate the change in \(x_0\) for a unit allocation to each nonbasic cell
(k, l):
\[ \bar{c}_{kl} = c_{kl} - u_k - v_l. \]
The \(\bar{c}_{kl}\)'s will have the same values as those determined by the stepping stone
method. For our example:
\[
\begin{aligned}
\bar{c}_{13} &= c_{13} - u_1 - v_3 = 1 - 0 - 0.4 = 0.6 \\
\bar{c}_{21} &= c_{21} - u_2 - v_1 = 1 - 0.4 - 0.9 = -0.3 \\
\bar{c}_{31} &= c_{31} - u_3 - v_1 = 1.3 - 0.4 - 0.9 = 0 \\
\bar{c}_{32} &= c_{32} - u_3 - v_2 = 1 - 0.4 - 1 = -0.4.
\end{aligned}
\]
Thus, as with the stepping stone method, we have discovered that the maximum
amount possible should be allocated to cell (3,2). This allocation is
made as in the previous method. We effect the change of basis, producing
Table 2.84 from Table 2.83. The multipliers for Table 2.84 are now calculated:
\[
\begin{aligned}
0 + v_1 &= 0.9 \qquad (u_1 = 0) \\
0 + v_2 &= 1 \\
u_2 + v_3 &= 0.8 \\
u_3 + v_2 &= 1 \\
u_3 + v_3 &= 0.8.
\end{aligned}
\]
These are solved to yield:
\[ u_1 = 0, \quad u_2 = 0, \quad u_3 = 0, \quad v_1 = 0.9, \quad v_2 = 1, \quad v_3 = 0.8. \]

We can now calculate the change in \(x_0\) for a unit allocation to each nonbasic
cell:
\[
\begin{aligned}
\bar{c}_{13} &= 1 - 0 - 0.8 = 0.2 \\
\bar{c}_{21} &= 1 - 0 - 0.9 = 0.1 \\
\bar{c}_{22} &= 1.4 - 0 - 1 = 0.4 \\
\bar{c}_{31} &= 1.3 - 0 - 0.9 = 0.4.
\end{aligned}
\]
These values are identical with those obtained by the stepping stone method and
are all nonnegative. Hence the optimal solution has been found, as displayed
in Table 2.84, and the method is terminated.
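
The multiplier calculation lends itself to a short program. The following Python sketch uses illustrative function names and 0-based indices, and assumes a nondegenerate basis of (m + n - 1) cells. It solves \(u_i + v_j = c_{ij}\) over the basic cells with \(u_1 = 0\) and then evaluates \(c_{kl} - u_k - v_l\) for every nonbasic cell.

```python
COST = [
    [0.9, 1.0, 1.0],
    [1.0, 1.4, 0.8],
    [1.3, 1.0, 0.8],
]

def multipliers(cost, basis):
    """Solve u_i + v_j = c_ij over the basic cells (0-based), with u_1 = 0."""
    u = [None] * len(cost)
    v = [None] * len(cost[0])
    u[0] = 0.0
    changed = True
    while changed:  # propagate known values along the basic cells
        changed = False
        for i, j in basis:
            if u[i] is not None and v[j] is None:
                v[j] = cost[i][j] - u[i]
                changed = True
            elif v[j] is not None and u[i] is None:
                u[i] = cost[i][j] - v[j]
                changed = True
    return u, v

def reduced_costs(cost, basis):
    """Change in x0 per unit allocated to each nonbasic cell, keyed 1-based."""
    u, v = multipliers(cost, basis)
    return {(i + 1, j + 1): round(cost[i][j] - u[i] - v[j], 10)
            for i in range(len(cost)) for j in range(len(cost[0]))
            if (i, j) not in basis}

northwest_basis = {(0, 0), (0, 1), (1, 1), (1, 2), (2, 2)}  # Table 2.74
final_basis = {(0, 0), (0, 1), (1, 2), (2, 1), (2, 2)}      # Table 2.84

print(reduced_costs(COST, northwest_basis))
print(reduced_costs(COST, final_basis))
```

For the first basis the most negative value belongs to cell (3,2); for the second basis no value is negative, which is the optimality test used above.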
In order to explain why the method works, let us take the dual of the
capacitated version of (2.38), (2.39), (2.40). That is, we assume equality in the
constraints and the problem becomes
\[
\begin{aligned}
\text{Minimize:}\quad & x_0 = \sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}x_{ij} \\
\text{subject to:}\quad & \sum_{i=1}^{m} x_{ij} = b_j, \quad j = 1, 2, \ldots, n \\
& \sum_{j=1}^{n} x_{ij} = a_i, \quad i = 1, 2, \ldots, m \\
& x_{ij} \ge 0.
\end{aligned}
\]
The reader unfamiliar with L.P. duality should refer to Section 2.6.1. In
taking the dual, suppose we associate a dual variable \(v_j\) with each of the
first n constraints and a dual variable \(u_i\) with each of the next m constraints.
The dual problem is:
\[
\begin{aligned}
\text{Maximize:}\quad & \sum_{j=1}^{n} b_j v_j + \sum_{i=1}^{m} a_i u_i \\
\text{subject to:}\quad & v_j + u_i \le c_{ij}, \quad i = 1, 2, \ldots, m; \; j = 1, 2, \ldots, n.
\end{aligned}
\]
Note that the \(u_i\)'s and \(v_j\)'s are not restricted to nonnegative values, as they
arise from equality constraints. The special nature of the inequality
constraints in the dual arises because of the structure of the primal constraint
matrix, as illustrated in Figure 2.13.
Suppose we are solving the transportation problem as a regular L.P. using
the simplex method. We would wish to calculate the \(x_0\)-row coefficients \(\bar{c}_{ij}\)
at each iteration in order to test for optimality and, if the test is negative,
decide which variable enters the basis. Now, according to property 3(a) of
complementary slackness (see Section 2.6.1.3),
\[ x_{ij} > 0 \;\Rightarrow\; v_j + u_i = c_{ij} \]
for every basic variable \(x_{ij}\). This creates (m + n - 1) equations in (m + n)
unknowns, which can be solved by assigning an arbitrary value to one of
the unknowns. Traditionally \(u_1\) is set to zero.
Now the value of each slack variable in the dual constraint
\[ v_j + u_i \le c_{ij} \]
is
\[ c_{ij} - u_i - v_j. \]
Thus from property 2 of Section 2.6.1.3 we have
\[ \bar{c}_{ij} = c_{ij} - u_i - v_j. \]
Hence in order to determine which variable \(x_{ij}\) should enter the basis (if
optimality has not yet been reached) we must simply select the \(x_{ij}\) which has the
most negative value of \(c_{ij} - u_i - v_j\). Note that the most negative (rather than
most positive) is selected, as our original objective is one of minimization.
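
As a numerical illustration of this duality argument, the final multipliers of the example can be checked against the dual problem. In the sketch below the supplies and demands are not quoted from the problem statement; they are inferred as the row and column sums of the final shipping schedule, and the variable names are illustrative.

```python
COST = [
    [0.9, 1.0, 1.0],
    [1.0, 1.4, 0.8],
    [1.3, 1.0, 0.8],
]
supply = [20, 15, 10]  # a_i: row sums of the final schedule (inferred)
demand = [5, 20, 20]   # b_j: column sums of the final schedule (inferred)
u = [0.0, 0.0, 0.0]    # final row multipliers
v = [0.9, 1.0, 0.8]    # final column multipliers

# Dual feasibility: v_j + u_i <= c_ij for every cell (small tolerance for floats).
dual_feasible = all(u[i] + v[j] <= COST[i][j] + 1e-12
                    for i in range(3) for j in range(3))

# Dual objective: sum of b_j v_j plus sum of a_i u_i.
dual_value = (sum(b * vj for b, vj in zip(demand, v))
              + sum(a * ui for a, ui in zip(supply, u)))

print(dual_feasible)         # True
print(round(dual_value, 2))  # 40.5
```

The dual objective equals the primal cost of $40.5, as expected at an optimal pair of solutions.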

2.7.2 The Assignment Problem

The assignment problem is a special type of transportation problem. Because
of its structure it can be solved more efficiently by a special algorithm than
by the stepping stone algorithm. Consider a collection of n workers and n
machines. Each worker must be assigned one and only one machine. Each
worker has been rated on each machine and a standardized time for him to
complete a standard task is known. The problem is to make an assignment
of workers to machines so as to minimize the total amount of standardized
time of the assignment.
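For the purposes of describing an assignment, define
\[
x_{ij} = \begin{cases} 1, & \text{if worker } i \text{ is assigned to machine } j, \\ 0, & \text{otherwise,} \end{cases}
\]
n = the number of workers and the number of machines,
\(c_{ij}\) = the standardized time of worker i on machine j, assumed to be nonnegative.
Then the problem is to
\begin{align}
\text{Minimize:}\quad & x_0 = \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} \tag{2.41} \\
\text{subject to:}\quad & \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, 2, \ldots, n \tag{2.42} \\
& \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, 2, \ldots, n \tag{2.43} \\
& x_{ij} = 0 \text{ or } 1, \quad i = 1, 2, \ldots, n; \; j = 1, 2, \ldots, n. \notag
\end{align}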

It can be seen on comparison with (2.38), (2.39), (2.40) that this formulation
is indeed a special case of the transportation problem, the workers
representing factories and the machines representing warehouses. Here each
"factory" and each "warehouse" has a capacity and demand of one unit.
The problem can be represented by a tableau like Table 2.73, shown in
Table 2.85. The problem could be solved using the techniques developed
for the transportation problem. However, let us examine the problem a little
more deeply in order to discover a more efficient method which exploits
the special structure of this assignment problem.

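Table 2.85 (tableau with workers 1, 2, ..., n as rows and machines 1, 2, ..., n as columns)

The matrix of standardized times
\[
C = \begin{pmatrix}
c_{11} & c_{12} & \cdots & c_{1j} & \cdots & c_{1n} \\
c_{21} & c_{22} & \cdots & c_{2j} & \cdots & c_{2n} \\
\vdots & \vdots & & \vdots & & \vdots \\
c_{i1} & c_{i2} & \cdots & c_{ij} & \cdots & c_{in} \\
\vdots & \vdots & & \vdots & & \vdots \\
c_{n1} & c_{n2} & \cdots & c_{nj} & \cdots & c_{nn}
\end{pmatrix}
\]
holds the key to the problem. Because there is a one-to-one assignment of
workers to machines, our problem reduces to finding a set S of n entries of C
with the properties that (i) exactly one entry of S appears in each row of C
and (ii) exactly one entry of S appears in each column of C. Then among all
such sets S of n entries of C we require the one with the least sum. In order to
make use of this fact, consider the following numerical example: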


n = 5
and
      5 ① 5 1 3
      4 6 ① 8 4
C =   ① 3 2 2 3
      5 6 5 ① 6
      2 1 5 3 ①
In this case it is possible to identify a minimal S quite easily. Because every
entry in C is at least one, any set S with sum five must be minimal. Such a
set has been circled and it has, of course, exactly one entry in each row and in
each column, thus obeying properties (i) and (ii).
Suppose now that we subtract one unit from each entry in C to obtain C':

      4 ⓪ 4 0 2
      3 5 ⓪ 7 3
C' =  ⓪ 2 1 1 2
      4 5 4 ⓪ 5
      1 0 4 2 ⓪

Because the relative values of the entries remain unchanged, the minimal
solution remains the same.
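
For a matrix this small the claim can be confirmed by enumerating all 5! = 120 assignments. The Python sketch below is only an illustration: the entries of C are reconstructed from the example (every circled entry is 1, and C' is C less one), and `best_assignment` is an illustrative name.

```python
from itertools import permutations

# Matrix C of the example, as reconstructed from C' = C - 1.
C = [
    [5, 1, 5, 1, 3],
    [4, 6, 1, 8, 4],
    [1, 3, 2, 2, 3],
    [5, 6, 5, 1, 6],
    [2, 1, 5, 3, 1],
]

def best_assignment(matrix):
    """Return (columns chosen for rows 0..n-1, total) of least sum, by brute force."""
    n = len(matrix)
    best = min(permutations(range(n)),
               key=lambda p: sum(matrix[i][p[i]] for i in range(n)))
    return best, sum(matrix[i][best[i]] for i in range(n))

C_prime = [[x - 1 for x in row] for row in C]

print(best_assignment(C))        # ((1, 2, 0, 3, 4), 5)
print(best_assignment(C_prime))  # ((1, 2, 0, 3, 4), 0)
```

Both calls select the same cells (columns are 0-based here), and the two totals differ by n = 5, the amount subtracted along any assignment.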
These observations hold true for any matrix C, and furthermore, because
all entries are assumed to be nonnegative, once a set S of all zero entries has
been identified it must be minimal. The Hungarian method, which is based on
the work of the Hungarian mathematician König, has this as its aim. The method
progressively reduces the entries in a manner similar to our step from C to C'
until a set S of zeros can be identified.
The method is made up of three parts:
1. C is reduced by subtracting the least entry in column j from every element
in column j, for each column j, j = 1, 2, ..., n. Then if any row has all
positive entries, the same operation is applied to it.
2. A check is made to see whether a set S of all zeros can be found in the
matrix. If so, S represents a minimal solution and the method is terminated.
If not, step 3 is applied.
3. As a minimal S cannot yet be identified, the zeros in C are redistributed
and possibly some new zeros are created. How this is carried out will be
explained shortly. Then the check of step 2 is performed again. This cycle
of steps 2 and 3 is repeated until a minimal S is found.
A few comments about these steps will now be made. We need to show
that if C' is the matrix obtained from C by step 1, then the minimal
sets S for C and C' are identical. Suppose that \(\alpha_i\) and \(\beta_j\), positive real numbers,
are subtracted from the ith row and jth column of C, respectively, for each
row i and each column j. Then, if \(c_{ij}\) and \(c'_{ij}\) are the (i, j) elements of C and
C', respectively,
\[ c'_{ij} = c_{ij} - \alpha_i - \beta_j. \]
Further, if \(x'_0\) is the objective function associated with the new assignment
problem represented by C', then
\[
\begin{aligned}
x'_0 &= \sum_{i=1}^{n}\sum_{j=1}^{n} c'_{ij}x_{ij} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} (c_{ij} - \alpha_i - \beta_j)x_{ij} \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} c_{ij}x_{ij} - \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i x_{ij} - \sum_{i=1}^{n}\sum_{j=1}^{n} \beta_j x_{ij} \\
&= x_0 - \sum_{i=1}^{n} \alpha_i \Bigl(\sum_{j=1}^{n} x_{ij}\Bigr) - \sum_{j=1}^{n} \beta_j \Bigl(\sum_{i=1}^{n} x_{ij}\Bigr).
\end{aligned}
\]
By (2.42) and (2.43),
\[ x'_0 = x_0 - \sum_{i=1}^{n} \alpha_i - \sum_{j=1}^{n} \beta_j. \]
Thus \(x'_0\) and \(x_0\) differ only by the total amount subtracted, which is a
constant. Therefore they have identical minimal sets.
We need to have an efficient way of performing step 2. That is, we need
to be able to pronounce whether or not a set S of zeros exists, and if it does,
which entries belong to it. A moment's reflection reveals that any such S
has the property that its zeros in C' can be transformed into a leading
diagonal of zeros by interchanges of rows. For example the C' of our
numerical example (shown here with rows 1 and 3 already interchanged):
⓪ 2 1 1 2
3 5 ⓪ 7 3
4 ⓪ 4 0 2
4 5 4 ⓪ 5
1 0 4 2 ⓪
upon the interchange of rows 2 and 3 becomes:

⓪ 2 1 1 2
4 ⓪ 4 0 2
3 5 ⓪ 7 3
4 5 4 ⓪ 5
1 0 4 2 ⓪

One needs exactly n (in this case 5) straight lines in order to cross out all the
circled zeros in the matrix above, since no row or column contains more than one
of them: no smaller number of straight lines will suffice.

Because the interchange of rows does not affect this minimum number
of crossing lines, we have discovered a simple test to determine whether or
not a minimal S can be found:

If the minimum number of lines necessary to cross out all the zeros equals n,
a minimal S can be identified. If the minimum number of lines is strictly less
than n, a minimal S is not yet at hand.

How can we be sure we are using the smallest possible number of crossing
lines? The following rules of thumb are most helpful in this regard: (a)
Identify a row (column) with exactly one uncrossed zero. Draw a vertical
(horizontal) line through this zero. (b) If all rows or columns with zeros
have at least two uncrossed zeros, choose the row or column with the least,
identify one of the zeros and proceed as in (a). Ties are settled arbitrarily.
In order to make these rules clear, we illustrate them on the following
example:
0 0 4 4 9
6 0 0 5 0
8 0 8 0 8
3 0 1 8 0
4 2 0 0 5

The first column has exactly one zero, so according to (a) we cross out the
first row:
0 0 4 4 9
6 0 0 5 0
8 0 8 0 8
3 0 1 8 0
4 2 0 0 5

We must now use (b). We arbitrarily choose \(c_{32}\) and cross out the second
column, leaving:

6 0 5 0
8 8 0 8
3 1 8 0
4 0 0 5

Now row 4 has one uncrossed zero, thus we cross out column 5. Proceeding
in this way, rows 2 and 3 each have one uncrossed zero, so columns 3 and 4 are
crossed out in turn. The lines drawn are then row 1 and columns 2, 3, 4, and 5.


This requires five lines and thus contains a minimal S. Identifying such an
S is usually not difficult if one begins by looking for rows or columns with
exactly one zero.
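
The line-counting test can also be carried out mechanically. By König's theorem, quoted later in this section, the minimum number of lines covering all the zeros equals the maximum number of zeros no two of which lie on the same line. The Python sketch below counts the latter with the standard augmenting-path method for bipartite matching; it is a substitute for the rules of thumb above, not the book's procedure, and the function name is illustrative.

```python
def max_independent_zeros(matrix):
    """Largest set of zeros with no two in the same row or column.

    Returns (size, cells), the cells being 1-based (row, column) pairs.
    """
    n = len(matrix)
    match_col = [-1] * n  # match_col[j] = row whose chosen zero lies in column j

    def try_row(i, seen):
        # Look for a zero in row i, re-seating earlier rows where necessary.
        for j in range(n):
            if matrix[i][j] == 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    size = sum(try_row(i, [False] * n) for i in range(n))
    cells = sorted((match_col[j] + 1, j + 1) for j in range(n) if match_col[j] != -1)
    return size, cells

example = [
    [0, 0, 4, 4, 9],
    [6, 0, 0, 5, 0],
    [8, 0, 8, 0, 8],
    [3, 0, 1, 8, 0],
    [4, 2, 0, 0, 5],
]
size, cells = max_independent_zeros(example)
print(size)   # 5
print(cells)  # one zero in each row and each column
```

A result of n means that a minimal S is at hand, and the returned cells form one such S.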
Now we come to step 3. Suppose that strictly fewer than n lines are needed
to cross out all the zeros in C. We know then that a minimal S cannot be
found directly. In order to transform C we make use of König's theorem:

If the elements of a matrix are divided into two classes by property R, then the
minimum number of lines that contain all the elements with the property R is
equal to the maximum number of elements with the property R, with no two
on the same line.

Applying this to C, where R is the property of being zero, we now present
a way to transform C. We wish to change at least one of the uncrossed
(and hence positive) numbers to become zero. This is brought about by (i)
subtracting the minimum uncrossed entry from all uncrossed entries; and
(ii) adding this same number to each doubly crossed entry (an entry with
both horizontal and vertical lines passing through it). All lines are then
removed and step 3 is completed.

Table 2.86

Machines
1 2 3 4 5

1 6 5 9 4 6
2 3 8 3 9 5
Workers 3 2 7 6 5 6
4 5 9 8 3 8
5 1 3 7 4 2

2.7.2.1 A Numerical Example


The Hungarian method will now be used to solve the problem whose C
matrix is given in Table 2.86. Here n = 5. Following step 1 we subtract a
quantity from each column of the \((c_{ij})\) matrix. This amount is equal to the
minimum quantity in that column. Thus the initial \((c_{ij})\) matrix becomes:

5 2 6 1 4
2 5 0 6 3
1 4 3 2 4
4 6 5 0 6
0 0 4 1 0
( -1) (-3) (-3) (-3) (-2)

The next step is to carry out the same operation for each row. The matrix
becomes
4 1 5 0 3 (-1)
2 5 0 6 3 (0)
0 3 2 1 3 (-1)
4 6 5 0 6 (0)
0 0 4 1 0 (0)
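
Step 1 is simple to program. The following Python sketch uses an illustrative function name and subtracts the column minima and then the row minima; subtracting a row minimum of zero leaves a row unchanged, so this agrees with applying the row operation only to rows with all positive entries.

```python
def reduce_matrix(c):
    """Step 1: subtract each column's minimum, then each row's minimum."""
    col_min = [min(col) for col in zip(*c)]
    after_cols = [[x - col_min[j] for j, x in enumerate(row)] for row in c]
    row_min = [min(row) for row in after_cols]
    reduced = [[x - r for x in row] for row, r in zip(after_cols, row_min)]
    return reduced, col_min, row_min

# The matrix of Table 2.86.
C = [
    [6, 5, 9, 4, 6],
    [3, 8, 3, 9, 5],
    [2, 7, 6, 5, 6],
    [5, 9, 8, 3, 8],
    [1, 3, 7, 4, 2],
]

reduced, col_min, row_min = reduce_matrix(C)
print(col_min)  # [1, 3, 3, 3, 2]
print(row_min)  # [1, 0, 1, 0, 0]
for row in reduced:
    print(row)
```

The printed rows agree with the matrix above.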
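Following step 2, the minimum number of lines passing through all the
zero elements is drawn. Four lines suffice: through columns 1 and 4 and
through rows 2 and 5. The entries left uncrossed are those of rows 1, 3, and 4
in columns 2, 3, and 5:
1 5 3
3 2 3
6 5 6
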

If the minimum number of lines had been 5, an optimum solution could
have been found by inspection. This involves selecting five zero elements,
one such element in each row and each column. This selection will be
illustrated shortly. Such is not the case in the present problem, where only
four lines are necessary. This means that an implementation of step 3 is
required. The minimum uncrossed number is selected. It is subtracted from
all uncrossed numbers:
4 0 4 0 2
2 5 0 6 3
0 2 1 1 2 (-1)
4 5 4 0 5
0 0 4 1 0

This same number (1 in this case) is added to all numbers with two lines
passing through them:

4 0 4 0 2
3 5 0 7 3
0 2 1 1 2 (+ 1)
4 5 4 0 5
1 0 4 2 0
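
Step 3 can be sketched in Python as follows. The function name is illustrative and indices are 0-based; the crossed lines (rows 2 and 5, columns 1 and 4 in the book's numbering) are supplied by hand rather than computed.

```python
def redistribute_zeros(matrix, crossed_rows, crossed_cols):
    """Step 3: subtract the least uncrossed entry from every uncrossed entry
    and add it to every doubly crossed entry."""
    n = len(matrix)
    delta = min(matrix[i][j] for i in range(n) for j in range(n)
                if i not in crossed_rows and j not in crossed_cols)
    result = [row[:] for row in matrix]
    for i in range(n):
        for j in range(n):
            if i not in crossed_rows and j not in crossed_cols:
                result[i][j] -= delta   # uncrossed entry
            elif i in crossed_rows and j in crossed_cols:
                result[i][j] += delta   # doubly crossed entry
    return result, delta

reduced = [
    [4, 1, 5, 0, 3],
    [2, 5, 0, 6, 3],
    [0, 3, 2, 1, 3],
    [4, 6, 5, 0, 6],
    [0, 0, 4, 1, 0],
]
new_matrix, delta = redistribute_zeros(reduced, crossed_rows={1, 4}, crossed_cols={0, 3})
print(delta)  # 1
for row in new_matrix:
    print(row)
```

Entries crossed by exactly one line are left unchanged, which is why the existing zeros in the crossed rows and columns survive.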

The minimum number of lines is again drawn through all the zero elements
of this matrix.


As five lines are required, the minimal solution can be found.
The solution for the present problem is
\[ x_{12} = 1, \quad x_{23} = 1, \quad x_{31} = 1, \quad x_{44} = 1, \quad x_{55} = 1. \]

The value of this solution is equal to the total of the numbers subtracted,
i.e.,
\[ x_0^* = 1 + 3 + 3 + 3 + 2 + 1 + 0 + 1 + 0 + 0 + 1 = 15. \]
This value can be checked by inspecting the original \((c_{ij})\) matrix.
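
For a problem of this size the check can also be made by exhaustive enumeration. The Python sketch below uses illustrative variable names; it evaluates the assignment found above on the matrix of Table 2.86 and compares it with the least total over all 120 assignments.

```python
from itertools import permutations

# The matrix of Table 2.86.
C = [
    [6, 5, 9, 4, 6],
    [3, 8, 3, 9, 5],
    [2, 7, 6, 5, 6],
    [5, 9, 8, 3, 8],
    [1, 3, 7, 4, 2],
]

# Assignment found by the Hungarian method: worker -> machine (1-based).
solution = {1: 2, 2: 3, 3: 1, 4: 4, 5: 5}
solution_value = sum(C[i - 1][j - 1] for i, j in solution.items())

# Least total over every one-to-one assignment of workers to machines.
least_value = min(sum(C[i][p[i]] for i in range(5)) for p in permutations(range(5)))

print(solution_value)  # 15
print(least_value)     # 15
```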

2.8 Exercises

(I) Computational

1. Solve the following problems graphically.


(a) A baker bakes two types of cakes each day, one chocolate and one banana. He
makes a profit of $0.75 for the chocolate cake and $0.60 for the banana cake.
The chocolate cake requires 4 units of flour and 2 units of butter and the banana
cake requires 6 units of flour and 1 unit of butter. However, only 96 units of
flour and 24 units of butter are available each day. How many of each type of
cake should he bake each day so as to maximize profit?
(b) A bakery produces two types of bread. Each type requires two grades of flour.
The first type requires 5 kg of grade 1 flour and 4 kg of grade 2 flour per batch.
The second type requires 4 kg of grade 1 and 6 kg of grade 2 per batch. The
bakery makes a profit of $10 and $20 per batch on the first and second types,
respectively. How many batches of each type should be made per day if 200 kg
of grade 1 and 240 kg of grade 2 flour can be supplied per day?
(c) In the production of wool yarn by carding it is found that the waste produced
is dependent on the quantity by weight of a lubricant/water emulsion added
before processing. Because of pumping restrictions the concentration of the
emulsion should not exceed 1 part lubricant to 2 parts water. The application
of the emulsion should be at a rate so that no more than 5% dry wool weight of
emulsified wool is emulsion. Assume that the densities of water and lubricant
are the same. Quality control measures stipulate that the lubricant should not
be more than 4% (dry wool weight) of the emulsified wool. It is found that the
waste produced decreases by 8 kg per kg lubricant added and decreases by 5 kg
per kg water added. Find the amounts of lubricant and water to apply to 100 kg
of dry wool so as to minimize the waste produced.
(d) A company makes two types of brandy: The Seducer (S), and Drunkard's
Delight (D). Each barrel of S requires 5 hours in the fermenter and 2 hours in
the distiller, while each barrel of D requires 3 hours in the fermenter and 4 hours
in the distiller. Because of various restrictions the fermenter and the distiller
can be operated for no more than 15 and 8 hours per day, respectively. The
company makes a profit of $210 for a barrel of S and $140 for a barrel of D. How
many barrels of each type should be produced to maximize daily profit?
(e) A farmer produces potatoes at a profit of $200 per unit and pumpkins at a
profit of $140 per unit. It takes him 5 days to crop a unit of potatoes and 7 days
to crop a unit of pumpkins. Earlier in the year it takes him 5 days to prepare
the land and plant seeds for a unit of potatoes and 3 days for a unit of pumpkins.
He has 90 cropping days and 50 preparation days available. What amount of
each vegetable should he plan on in order to maximize profit?

2. Solve the following problems by the simplex method.


(a) A plant manufactures three types of vehicle: automobiles, trucks, and vans, on
which the company makes a profit of $4,000, $6,000, and $3,000, respectively,
per vehicle. The plant has three main departments: parts, assembly, and
finishing. The labour in these departments is restricted, with parts, assembly,
and finishing operating 120, 100, and 80 hours, respectively, each two-week
period. It takes 50, 40, and 30 hours, respectively, to manufacture the parts for
an automobile, truck, and van. Assembly takes 40, 30, and 20 hours, respectively,
for an automobile, truck, and van. Finishing takes 20, 40 and 10 hours, respec-
tively, for an automobile, truck, and van. How many of each type of vehicle
should the company manufacture in order to maximize profit for a two-week
period?
(b) A manufacturer produces three soft drink cocktails: Fruito, Fifty/fifty, and
Sweeto. The amounts of sugar and extract in one barrel of each are shown in
Table 2.87. The manufacturer can obtain 6 kg, 4 kg, and 3 kg per day of sugar,
orange extract, and lemon extract, respectively. The profit is proportional if
fractional quantities of a barrel are produced. How much of each cocktail
should be produced in order to maximize daily profit?

Table 2.87. Data for Exercise 2(b).

Cocktail   Profit/barrel   Sugar   Orange extract   Lemon extract

Fruito     $30             1 kg    2 kg             1 kg
50/50      $20             2 kg    1 kg             1 kg
Sweeto     $30             3 kg    1 kg             1 kg

(c) The problem is to maximize the satisfaction gained on a 140 km journey when
only 3 constant speeds are permitted: 0, 50, and 80 km/hr. The satisfaction
gained from stopping and resting (0 km/hr), travelling slow (50 km/hr) and
travelling fast (80 km/hr) is rated at 5, 9, and 2 units/hr, respectively. Restric-
tions imply that the journey must be completed in no longer than 4 hours, the
total time spent stationary or travelling at high speed must not exceed 1 hour,
and the average speed of the journey must not be less than 40 km/hr.
(d) A small company makes 3 types of biscuits: A, B, and C. 10 kg of biscuit A
requires 5 kg of sugar, 3 kg of butter, and 2 kg of flour. 10 kg of biscuit B re-
quires 4 kg of sugar, 3 kg of butter, and 3 kg of flour. 10 kg of biscuit C requires
3 kg of sugar, 4 kg of butter, and 3 kg of flour. The company has available per
day 40 kg of sugar, 33 kg of butter, and 24 kg of flour. The company can sell all
it produces, and makes a profit of $60, $50, and $30 from 10 kg of biscuits A,
B, and C, respectively. How much of each biscuit should the company make to
maximize daily profit?
(e) Recall the farmer in Exercise 1(e). He discovers he can now make a profit of
$160/kg from beets. These take 4 days for planting and 4 days for cropping per
kilogram. He also considers the time it takes to sell his produce in the market.
It takes 2 days to sell one kilogram of any vegetable. He has 30 days to sell his
vegetables. What weights of the three crops should he now plan for in order to
maximize profit?
(f) A man has approximately 100 m² of garden space. He decides to grow corn (C),
tomatoes (T), and lettuce (L) in the 20 week growing season. He estimates that
on average for every expected kg of yield from the crops it takes 0.5, 1.0, and
0.5 minutes each week to cultivate the corn, tomatoes, and lettuces respectively.
He does not want to spend more than 3 hours per weekend cultivating. He will
spend up to $2.00/week for seeds. The seed costs (on a weekly basis) per kg
yield for C, T, and L are 0.5, 1.5, and 1.0 cents, respectively. Each crop, C, T,
and L requires t i-, and 1- m 2, respectively, in space per kg yield. He can sell the
vegetables for 0.40, 1.00, and 0.50 dollars per kg for C, T, and L respectively.
What amounts should he plan for in order to maximize revenue?
(g) An ice cream factory makes 3 different types of ice cream: plain (P), hokey
pokey (H), and chocolate (C). Profits for one unit of each type are $5, $2, and
$1 for P, H, and C, respectively. Time constraints for producing a unit of each
are shown in Table 2.88. Available hours per day are 8, 10, and 4 for machining,
men, and packing, respectively. What amounts of the different ice creams
should be manufactured to maximize daily profit?

Table 2.88. Data for Exercise 2(g).

Product Machine hours Man hours Packing hours

P 4 3 1
H 2 2 1
C 1 2 1

3. Solve the following problems graphically and by the simplex method and compare
your solutions.
(a) A housewife makes sauce (S) and chutney (C) which she sells to the local store
each week. She obtains a profit of 40 and 50 cents for a pound of C and S,
respectively. C requires 3 lb of tomatoes and 4 cups of vinegar; S requires 5 lb of
tomatoes and 2 cups of vinegar. She can buy 24 lb of tomatoes and 3 bottles of
vinegar at discount price each week. The 3 bottles provide 16 cups of vinegar.
In order to make it worthwhile, the store insists on buying at least 3 lb of goods
each week. What combination should be made in order to maximize profit?
(b) A man makes glue in his backyard shed. Glue A requires 2 g of oyster shell and
4 g of a special rock to produce a 2 kg package. Glue B requires 3 g of shell
and 2 g of rock for each 2 kg package. He must produce at least 8 kg of glue
per day to stay in business. His son scours the sea shore for the shell and rock
and can gather 12 kg of each per day. If the profit is $3 and $4 on a 2 kg package
of type A and B, respectively, what is the maximum profit he can hope to make?
(c) An orchard which grows apples (A) and pears (B) wishes to know how many
pickers to employ to maximize the quantity of fruit picked in a given period.
The average quantity of fruit a picker can gather is 14 kg of A or 9 kg of B. The
orchard can afford to employ no more than 18 people. There cannot be more
than 9 picking apples or the supply will be exhausted too soon, flooding the
market and reducing returns. But there must be more than half as many picking
apples as there are picking pears or costs are increased because of fallen fruit
being wasted.
(d) Recall Exercise 1(e). Suppose in time of war the government insists that the
farmer produces at least 5 kilograms of vegetables. What should he do to
maximize profits now?
(e) Consider the gardener who decides to eat some of his corn and tomatoes. A
100 g serving of corn will add 80 calories; a 100 g serving of tomatoes will add
20 calories. He does not want to take in more than 200 calories from this part
of his diet. He needs at least 50 mg of vitamin C and at least 1.8 mg of iron
from these vegetables to make up his daily intake. A 100 g serving of corn
yields 10 mg of vitamin C and 0.6 mg of iron, while a 100 g serving of tomatoes
yields 18 mg of vitamin C and 0.8 mg of iron. With corn and tomatoes costing
4 and 10 cents per 100 g, how should he achieve his dietary needs while mini-
mizing costs?
4. The following problems have multiple optimal solutions. Solve each graphically
and by the simplex method. Define the set of all optimal solutions.
(a) A builder finds he is commonly asked to build two types of buildings, A and B.
The profits per building are $4,000 and $5,000 for A and B respectively. There
are certain restrictions on available materials. A requires 4,000 board feet of

timber, 4 units of steel, 3 units of roofing iron, and 2 units of concrete. B re-
quires 5,000 board feet of timber, 3 units of steel, 2 units of roofing iron, and
1 ton of concrete. However only 32,000 board feet of timber, 24 units of steel,
20 units of roofing iron, and 16 units of concrete are available per year. What
combination of A and B should he build per year to maximize profit?
(b) Recall the glue manufacturer of Exercise 3(b). In order to remain competitive
he finds he must add resin and filler to his glues. 3 g of resin must be included
in each packet of each glue and 4 g of filler must be included in glue A and
2 g of filler in glue B. His son can manufacture only 15 g of resin and 9 g of
filler each day. He can now make a profit of $8 and $4 for a package of glues A
and B, respectively. What is the maximum profit he can hope to make?
(c) A recording company is going to produce an hour-long recording of speeches
and music. The problem is to fully utilize the 60 available minutes. There can be
no more than 3 speeches and 5 musical items. The time allotted for speeches
must be no less than one-eighth of the time allotted to music. The gaps between
items or speeches must be filled with commentary, which must be no more than
12 minutes in total. The speeches are 5 minutes long, the items 8 minutes. How
many of each should be included so as to minimize the commentary time on
the recording?
(d) A bakery makes 2 types of cakes, A and B. 10 lb of cake A requires 2 lb of flour,
3 lb of sugar, 3 eggs, and 4 lb of butter. 10 lb of cake B requires 4 lb of flour, 3 lb
of sugar, 6 eggs, and 1 lb of butter. The bakery can afford to purchase 24 lb of
flour, 27 lb of sugar, 24 eggs, and 20 lb of butter per day. The bakery makes a
profit of $3 for 10 lb of cake A and $6 for 10 lb of B. How much of each cake
should be made daily in order to maximize profit?
(e) Melt-In-Your-Mouth Biscuit Co. finds that its two best sellers are Coco De-
lights (C) and Cheese Barrel Crackers (B). C and B produce a profit of $10 and
$15 per carton sold to the supermarkets. Some ingredients are common to each
biscuit. Each week no more than 500 kg of flour, 360 kg of sugar, 250 kg of
butter and 180 kg of milk can be used effectively. Every 100 kg of C requires
20 kg of flour, 16 kg of sugar, 18 kg of butter, and 15 kg of milk. Every 100 kg
of B requires 30 kg of flour, 20 kg of sugar, 12 kg of butter, and 10 kg of milk.
Find the weekly combination of production which maximizes profit.
(f) A man finds he is eating a lot of corn and no cheese and decides to do
something about it. Being very careful of his dietary considerations he realises that
he needs 600 I.U. of vitamin A, 1 mg of iron, 0.12 mg of calcium, and no more
than 400 calories per day. Now 100 g of cheese gives 400 I.U. vitamin A, 0.3 mg
iron, 0.2 mg calcium, and 120 calories. Also, 100 g of corn gives 160 I.U.
vitamin A, 0.6 mg iron, no calcium, and 80 calories. Moreover, cheese costs 10 cents
and corn 4 cents per 100 g. What should his daily intake of these two items be
if he is to satisfy the requirements above at minimal cost?
5. The following problems have degenerate optimal solutions. Solve them by the
simplex method and interpret the final tableau.
(a) Consider a farmer who wishes to plant 4 types of grain: oats, barley, wheat,
and corn. The profits he can make from an acre of corn, barley, wheat and oats
are $300, $200, $400, and $100, respectively. However, there are a number of
restrictions regarding fertilizing, spraying, and cultivation. These are as follows:
the corn, barley, wheat, and oats require 8, 2, 5, and 4 cwt of fertilizer per acre,
respectively, but only 16 cwt is available for the season. Similarly corn, barley,

wheat, and oats require 6, 4, 3, and 2 gallons of insecticide per acre, respectively,
but only 10 gallons are available. Also it takes 3, 3, 2, and 1 day to cultivate
1 acre of corn, barley, wheat, and oats, respectively, but the farmer can spare a
total of only ~ days. What crop combination maximizes profit?
(b) A man operates a small warehouse to store goods for other companies on a
temporary basis. His warehouse is limited to 150 m² in usable space. He can
afford to employ up to 10 men. Each load of product A, B, C, and D requires
16, 15, 20, and 30 m² of space, respectively. Each load of A, B, C, and D keeps
1, 9, 1, and 2 men fully occupied, respectively. His storage charges are $200,
$300, $400, and $700 per load for A, B, C, and D respectively. What combination
of goods should he attempt to store in order to maximize revenue?
(c) In a carpet wool spinning plant four blends of wool can be produced and are
worth $80, $60, $50, and $20 per kg for blends 1, 2, 3, and 4, respectively. As-
suming that all the yarn produced can be sold, find the amount of each blend
necessary to maximize profit. Because of certain restrictions with shiftwork and
staff regulations, the carding, spinning, twisting, and hanking machinery can
only be operated for a maximum of 18, 15, 10, and 12 hours a day, respectively.
The hours each machine takes to process 10³ kg of each blend are shown in
Table 2.89. A further restriction limits the quantity of blends 3 and 4 to
5 × 10³ kg per day.
Table 2.89. Data for Exercise 5(c).

Blend Carding Spinning Twisting Hanking

1 4 4 4 4
2 4 3 4 2
3 3 3 2 4
4 2 2 0 1

(d) A company produces 4 types of fertilizer: A, B, C, and D. 10 lb of A requires
3 lb of potash (P), 4 lb of phosphate (H), and 3 lb of nitrogen (N). 10 lb of B
requires 3 lb of P, 3 lb of H, and 4 lb of N. 10 lb of C requires 5 lb of P, 2 lb of
H, and 3 lb of N. 10 lb of D requires 4 lb of P, 4 lb of H, and 2 lb of N. The
company can produce 40 lb, 40 lb, and 60 lb of P, H, and N, respectively, per
day. The company makes a profit of $20, $40, $50, and $30 per 10 lb of A, B,
C, and D respectively. Determine the amount of each type that should be
produced each day so as to maximize profit.
(e) A brick manufacturer produces red (R), white (W), brown (B), and grey (G)
bricks at profits of $100, $200, $300, and $300 per ton, respectively. These are
all produced using the same equipment, which can operate continuously. It
takes 2, 3, 5, and 4 equipment hours to produce a ton of R, W, B, and G, respec-
tively. He has a maximum electric power allocation of 252 units because of
shortages. It takes 3, 4, 5, and 6 units to produce a ton of R, W, B, and G re-
spectively. Find his maximum weekly profit.
(f) Recall Exercise 2(f). The man now decides to plant pumpkins as well. To
produce one kg of yield, cultivation will take 0.5 minutes and seed will cost
0.5 cents, each per week, and 0.5 m² of garden space is required. Pumpkins can be
sold for 40 cents per kg. Also the cultivation time for corn can now be reduced
to ~ of a minute per week. Solve 2(f) over.
6. The following L.P. problems exhibit temporary degeneracy during the simplex
iterations. Solve each by this method and comment on this phenomenon.
(a) A dairy factory is about to start production. The manager wishes to know what
lines of production (butter, cheese, milk powder, or yoghurt) would be most
profitable. The various restrictions, requirements and unit profits are shown
in Table 2.90. Solve this problem by the simplex method.

Table 2.90. Data for Exercise 6(a).

Profit Milk Labour Electricity

Butter 3 3 1 2
Cheese 4 2 3 3
Milk powder 2 4 4 5
Yoghurt 2 2 1
Amount available 8 9 9

(b) Recall the warehouse problem of 5(b). The manager finds that he receives too
many orders for storage of A. So he streamlines the process for A by reducing
its storage requirement per load to 120 m² and increases the storage fee to
$360 per load. What is his optimal strategy now?
(c) Recall the wool spinning plant in Exercise 5(c). A competing plant realises it
has to match the efficiency of the first plant if it is to survive. It produces the
same 4 blends, but has a profit of $120, 60, 60, and 30 per kg for each type.
The plant can obtain 23 hours and 14 hours carding and hanking time per day,
respectively. The other times are identical. However this factory has older
twisting machines and it takes 4 hours to produce 10³ kg of blend 3. Also no
more than 2.5 × 10³ kg of blends 3 and 4 are to be produced per day. What is
the best policy for the plant?
(d) Recall the fertilizer problem of Exercise 5(d). A rival company has the data
shown in Table 2.91. What is the best way for this company to operate?

Table 2.91. Data for Exercise 6(d).

Fertilizer P H N Profit

A (10 lb) 4 1 5 $3
B (10 lb) 3 2 5 $4
C (10 lb) 5 2 3 $4
D (10 lb) 4 4 2 $5
Availability 40 40 30

(e) An ice cream manufacturer makes 2 types of ice cream: creamy and ordinary.
Creamy sells at a profit of $5 per unit, ordinary at $4 per unit. Each requires
4 tanks of milk per unit. Creamy requires 5 tanks of cream and 5 bags of sugar
per unit. Ordinary requires 2 tanks of cream and 3 bags of sugar. Also 10 tanks
of cream and 10 bags of sugar are available each day. How does the manu-
facturer maximize profit?
(f) Recall Exercise 2(f). The man decides to plant cucumbers as well. Corn can be
cut down to t minute cultivation time per week per kg yield. One kg yield of
cucumbers require iz minutes of cultivation per week, 0.5 cents per week on
seeds and H m 2 of garden space. They sell for 40 cents per kg. The garden has
been reduced to 88& m 2 • Also the gardener decides he cannot spend more than
150 minutes in the garden each weekend. Solve the problem over with the new
data.
7. The following L.P. problems have no feasible solutions. Prove that this is so by
use of the simplex method. Also attempt to solve the problems graphically where
possible.
(a) A small clothing factory makes shirts and skirts for a boutique in town. A
profit of $4 and $3 is made from a shirt and skirt respectively. A shirt requires
3 yards of material and a skirt 4, with only 12 yards available daily. It takes
5 hours of total time to make a shirt and 2 hours to make a skirt, with 8 hours
available daily. At least 5 garments must be made per day. Attempt to maximize
profit.
(b) A man has a part time job making chairs (C), deck chairs (D), and stools (S).
Each chair takes 5 hours to complete and weighs 2 kg. Each deck chair takes
3 hours and weighs 1.5 kg. Each stool takes 2 hours and weighs 1 kg. He has
only 10 hours to spend each weekend on this work. Now his employers sud-
denly state that in order to make it worth their while he must produce at least
20 kg of furniture per week. Can he continue?
(c) Assuming an unlimited supply of paint and turpentine, attempt to maximize
the coverage of a mixture of the two when the addition of an equal quantity of
turpentine to the paint increases the coverage by 50%, the coverage of paint
alone being 8 m²/litre. Paint costs $3.00 a litre and turpentine $0.50. The total
cost of the mixture must be no more than $21.00. To aid spraying, the volume
of turpentine plus i- times the volume of paint must be greater than 50 litres.
(d) A nursery covers 5,000 m². It grows trees at a profit of 35 cents each and shrubs
at a profit of 20 cents each. At least 2,000 plants must be grown. A tree needs
4 m² to grow, a shrub 1 m². Each tree requires 2 g of fertilizer, each shrub 3 g,
while 4 kg is available. Attempt to maximize profit.
(e) Recall Exercise 2(f). Suppose that seed costs are to be neglected. However, it
is vital that the energy value gained from the crop should be greater than
300 calories. Now corn, lettuce, and tomatoes will yield 0.8, 0.1, and 0.2 calories
per gram, respectively. Attempt to solve 2(f) over with the new data.
8. Create and solve the dual for each of the following problems. Find the optimal
solution to the original problem by interpreting the optimal dual tableau.
(a) A local vintner makes two types of wine, medium white (M) and dry white (D),
to sell to the local shop. He makes $5 profit per gallon from M and $4 a gallon
from D. Now M requires 3 boxes of grapes, 4 lb of sugar, and 2 pints of extract
per gallon. Also, D requires 4 boxes of grapes, 2 lb of sugar, and 1 pint of
extract per gallon. He has 14 boxes of grapes, 8 lb of sugar, and 6 pints of extract
left before selling his business. How should he use these resources to maximize
profit?
(b) A turning workshop manufactures two alloys, A and B, at a profit of $5 and $2
a kg, respectively. Alloy A requires 2, 5, 5, and 2 g of nickel, chrome, germanium,
and magnesium, respectively. Alloy B requires 3, 2, 3, and 1 g of the metals in
the same order. Supplies of the metals are reduced to 7, 11, 10, and 6 kg of the
metals in the same order. The furnace cannot be operated for more than 6 hours
per day. Alloys A and B require 1 and 2 hours of furnace time, respectively, to
produce 1 kg of alloy. How can profits be maximized?
(c) Recall Exercise 7(c). Suppose now the total cost of paint and turpentine cannot
exceed $100. The paint now used is of lower quality and costs only $2/litre.
However, because of price rises and the decision to use a better grade, the price
of turpentine has risen to $2/litre. A new sprayer has been purchased, and now
the volume of turpentine plus twice the volume of paint need exceed only
20 litres. However, the ratio of turpentine to paint must be between 1:4 and
3:1. Solve Exercise 7(c) over with the new data.
(d) Recall Exercise 7(d). Having discovered that this problem was infeasible, the
nursery removed the restriction that 2,000 plants must be grown. During the
summer each plant requires one litre of water, but because of restrictions
brought on by the annual drought only 6,000 gallons can be used per day. Also
4 g of beetle powder must be used on each shrub each day and 1 g on each tree.
There are 4 kg of powder available per day. Solve Exercise 7(d) over with the
new data.
(e) A person has the option of eating chocolate, oranges, or ice cream as a means
of obtaining at least 10% of the minimum recommended daily vitamin intake.
At least 0.1 g of calcium, 1 mg of iron, 8 mg of vitamin C, 0.2 mg of riboflavin,
and 2 mg of niacin are required daily. The three foods would provide these, as
shown in Table 2.92. The problem is to keep calorie intake down to a minimum
where 100 gm of chocolate, oranges, or ice cream provide 400, 40, and 160
calories, respectively. What combination of the foods should be eaten to achieve
these objectives?

Table 2.92. Data for Exercise 8(e).

100 g of:

Chocolate Oranges Ice cream

Calcium 0.5 gm 0.03 gm 0.1 gm
Iron 1.0 mg 0.4 mg 0.1 mg
Vitamin C 40 mg 1 mg
Riboflavin 0.2 mg 0.02 mg 0.1 mg
Niacin 1.0 mg 0.2 mg 0.1 mg

9. In each of the following problems an objective function coefficient has been changed.
Examine the effect that this has on the optimal solution and its value by solving
the original problem and then performing sensitivity analysis.
(a) Consider the primal L.P. problem of 8(a):
Maximize:    $5x_1 + 4x_2$
subject to:  $3x_1 + 4x_2 \le 14$
             $4x_1 + 2x_2 \le 8$
             $2x_1 + x_2 \le 6$
             $x_1, x_2 \ge 0.$
For what range of profit for $x_2$ will the present optimal basis remain optimal?
(b) Recall Exercise 2(c). The driver finds that he gets satisfaction at the rate of
10 units/hour driving at 80 km/hr. What is the optimal solution to this new
problem?
(c) Recall Exercise 2(d). Suppose that the manager of the company discovers that
biscuit B can be sold to another buyer at $7 per 10 kg. What is the optimal
solution now?
(d) A paper manufacturer produces 3 grades of paper: fine (F) at a profit of $600
per ton, medium (M) at a profit of $400 per ton, and heavy (H) at a profit of
$250 per ton. F requires 90 tons of wet pulp and 60 units of electric power to
produce one ton. M requires 80 tons of wet pulp and 50 units of power to pro-
duce one ton. H requires 70 tons of wet pulp and 30 units of power to produce
one ton. 5,000 tons of pulp and 2,000 units of power are allocated each week
for this activity. Find the optimal solution to the problem. If the profit for H is
increased to $350, how does this affect the solution?
(e) Recall Exercise 6(f). As their contribution to fighting inflation the supermarkets
are going to pay only 80 cents per kg for tomatoes. How does this affect the
present basis?
10. Solve the following L.P. problems by the simplex method. Then analyze what effect
the given change in a r.h.s. constant has.
(a) Recall Exercise 8(a). Suppose the vintner wishes to vary the supply of grapes he
requires in the production of his two white wines. He wants to know if his
winemaking business will still be profitable if for some reason there is a shortage of
grapes. How much below 14 can the supply drop for the present basis to be still
optimal?
(b) Recall Exercise 5(b). One of the men is injured and cannot work for one month.
Is the present policy still optimal? If so, what is the new optimal solution value?
(c) Recall Exercise 8(c). Suppose the cost limitation is raised from $100 to $110.
What is the new optimal solution and its value?
(d) Consider the following problem:

Maximize:    $21x_1 + 14x_2$
subject to:  $5x_1 + 3x_2 \le 15$        (2.44)
             $2x_1 + 4x_2 \le 8$
             $x_1 + x_2 \ge 1$
             $x_1, x_2 \ge 0.$
Solve this problem by the simplex method. What is the optimal solution and
its value if the r.h.s. of (2.44) is changed from 15 to 9?
(e) Recall Exercise 9(d). Suppose that the total weekly pulp production is halved.
Find the new optimum.
(f) Recall Exercise 2(f). Suppose the gardener now wishes to spend 15 minutes less
in the garden each week. How does this affect the optimal solution?
11. In each of the following L.P. problems one of the l.h.s. constraint coefficients is
changed from an original value. Analyze the effect of this change by using sensitivity
analysis rather than solving the problem again from scratch.
(a) Recall Exercise 8(a). Suppose now that the medium white requires 7t units of
extract. How does this affect the solution?
(b) Recall Exercise 5(b). Suppose now the floor space is increased to 175 m².
Perform sensitivity analysis with this change.
(c) Recall Exercise 8(c). Suppose that it is desired to change the restriction that the
volume of turpentine plus twice the volume of paint exceeds 20 litres. If the
twice is replaced by thrice is the present basis still optimal?
(d) Recall Exercise 9(d). Suppose that a new system is put into operation whereby
power consumption is reduced for medium grade paper from 50 units to 30 units.
How does this affect optimality?
(e) Recall Exercise 3(e). Consider the dual of that problem as a problem of maxi-
mizing the benefits to be gained from a diet of corn and tomatoes with cost
constraints. The relative cost of iron has been re-estimated at 0.5 cents per
100 gm.
12. In each of the following L.P. problems a further variable is introduced. The new
optimal solution is then to be found.
(a) Recall Exercise 5(b). Suppose a new product comes on the market with the
following storage requirements: 2 m² per truckload, one man week per truck-
load, and $500 profit per truckload. Is it worthwhile for the storage agency to
accept orders to store this new product?
(b) Consider the L.P. problem:
Maximize:    $8x_1 + 4x_2$
subject to:  $2x_1 + 2x_2 \le 100$       (2.45)
             $2x_1 + x_2 \ge 20$         (2.46)
             $-3x_1 + x_2 \le 0$         (2.47)
             $x_1 - 4x_2 \le 0$          (2.48)
             $x_1, x_2 \ge 0.$
Suppose a new variable is added, with coefficients 1, 1, 1, and -1 in constraints
(2.45), (2.46), (2.47), and (2.48), respectively. You should have solved the original
problem when you did Exercise 8(c). Now if the coefficient in the objective
function of the new variable is 6, use sensitivity analysis to see if it is worthwhile
using this new variable.
(c) Recall Exercise 1(d). Suppose the company has decided to manufacture another
type of brandy. Each barrel of this brandy requires 4 hours' fermentation, and
5 hours' distillation. Should the company produce this brandy if its profit is $16
per barrel?
(d) Recall Exercise 9(d). A new grade of paper, extra fine, is now to be made for a
profit of $800/ton. It requires 95 tons of wet pulp and 70 units of power for
every ton produced. Find the new optimum.
13. Each of the following problems is a transportation problem. For each problem find
an initial basis by (i) the northwest corner method, (ii) the least cost method, (iii) the
Vogel approximation method. Solve the problem by the stepping stone algorithm
and by Dantzig's method, starting with each basis.
(a) Consider the supply system of 4 breweries, supplying the needs of 4 taverns for
beer. The transportation cost for a barrel of beer from each brewery to each
tavern is as shown in Table 2.93. The production capacities of breweries 1, 2, 3,
and 4 are 20, 10, 10, and 5 barrels per day, respectively. The demands of taverns
1, 2, 3, and 4 are 5, 20, 10, and 10 barrels per day, respectively. Find the minimum
cost schedule.

Table 2.93. Data for Exercise 13(a).

                   Taverns
                1    2    3    4

Breweries   1   8   14   12   17
            2  11    9   15   13
            3  12   19   10    6
            4  12    5   13    8

(b) A bread manufacturer has 4 factories. He supplies 4 towns. The unit
transportation costs are shown in Table 2.94. The demands of towns 1, 2, 3, and 4
are 6,000, 12,000, 5,000, and 8,000 loaves per day, respectively. The daily production
capacities of the factories 1, 2, 3, and 4 are 7,000, 8,000, 11,000, and 5,000 loaves,
respectively. Find the minimum cost schedule.

Table 2.94. Data for Exercise 13(b).

                 Town
              1   2   3   4

Factory   1   7   5   4   3
          2   5   4   4   3
          3   6   5   6   7
          4   3   4   7   9

(c) An effluent treatment plant has 4 independent oxidation systems with capacities
of 15, 18, 20, and 30 (in millions of litres) per day. These systems can be
interconnected in any combination to any of 5 effluent mains by intermediate
pumping stations. The outputs of the mains are 12, 17, 15, 19, and 14 (× 10⁶) litres
per day. The cost involved in pumping 10⁶ litres from any of the mains to any of
the systems is shown in Table 2.95. Find the least cost flow.

Table 2.95. Data for Exercise 13(c).

System
1 2 3 4

1 4 6 7 5
2 3 2 2 1
Main 3 7 4 3
4
5
7
2
3
3 °
8
4
2
(d) Four large farms produce all the potatoes to satisfy the demands of markets in
four towns. The monthly production of the farms and the demand of the towns
are shown in Table 2.96, and the transportation costs per ton are shown in Table
2.97. Find the minimum cost schedule.

Table 2.96. Production and Demand in Exercise 13(d).

Farm Production Town Demand

1 30 1 20
2 40 2 35
3 25 3 50
4 45 4 35

Table 2.97. Transportation Cost in Exercise 13(d).

               Town
            1    2    3    4

Farm   1    7    7   10    8
       2    6    6    9    6
       3    8    7    9    5
       4   11   10   12    8

(e) The roads board is about to complete four urgent tasks on state highways in the
Wellington province. Costs must be minimized to the satisfaction of the audit
team from the treasury. A costly part of the operation involves the transportation
of suitable base course and sealing metal from screening plants at Masterton,
Otaki, Bulls, Raetihi, and the Desert Road to the tasks at Levin, Palmerston
North, Taihape, and Wanganui. In the time available the plants can supply (in
1,000-ton units): Masterton, 10; Otaki, 18; Bulls, 12; Raetihi, 14; Desert Road,
24. The demand is: Levin, 20; Palmerston North, 10; Taihape, 30; and Wanga-
nui, 15. Unit costs of loading, transportation and unloading in terms of man
hours are shown in Table 2.98. Find the minimum cost schedule.

Table 2.98. Transportation Costs in Exercise 13 (e).

Masterton Otaki Bulls Raetihi Desert Road

Levin 6 2 2 8 7
Palmerston North 4 4 1 7 6
Taihape 8 7 3 3 2
Wanganui 7 6 2 5 7
14. Each of the following problems is an assignment problem. Solve each one by the
Hungarian method.
(a) Consider a collection of six students and six assignments. Each student must be
assigned a different assignment. The time (in hours) it is likely to take each
student to complete each assignment is given in Table 2.99. Find the minimum
time assignment.

Table 2.99. Assignments in Exercise 14(a).

Tasks
2 3 4 5 6

1 7 5 3 9 2 4
2 8 6 1 4 5 2
3 2 3 5 6 8 9
Students
4 6 8 3 7 2
5 4 5 6 9 4 7
6 9 2 3 5 8

(b) A factory manager has a table (Table 2.100) which shows how much profit is
accomplished in an hour when each of six men operate each of six machines.
Note that: man 6 cannot work on machine 1 because this task requires good
eyesight; man 2 cannot work on machine 3 because he is allergic to dust; and
man 5 cannot work on machine 6 because this job requires two hands. You
are required to find the maximum profit assignment.

Table 2.100. Assignments in Exercise 14(b).

Men
2 3 4 5 6

1 7 7 8 6 7
2 8 5 8 6 5 5
3 6 7 5 6 5
Machines
4 5 4 5 5 4 4
5 6 6 7 7 6 6
6 7 8 7 6 6

(c) In carpet manufacture the carpets are inspected for faults and repaired by hand
sewing, called picking. In a certain factory there are 6 picking boards and the
management wishes to assign 6 rated workers to these boards so that the total
time to repair any quantity of carpet is minimized. The rates of the workers on
the different picking boards are shown in Table 2.101; they vary because the
boards handle different sizes and types of carpets depending upon their location.
(d) An air force has six pilots which it wishes to assign to six different types of aircraft.
Each pilot has been rated on each one and given a numerical rating in terms of
Table 2.101. Data for Exercise 14(c).

Workers
2 3 4 5 6

1 8 6 1 4 9 4
2 3 4 10 2 3 2
3 4 5 6 7 4 6
Boards
4 1 8 5 5 6
5 7 2 3 7 10 3
6 2 5 5 3 7 5

Table 2.102. Ratings in Exercise 14(d).

Aircraft
2 3 4 5 6

1 7 4 8 2 3 5
2 8 3 3 6 2 4
3 2 5 3 7 4 9
Pilots
4 5 2 6 6 7 2
5 6 4 2 8 3
6 3 5 6 4 5 7

errors in operation (Table 2.102). Make an assignment of pilots to aircraft so
as to minimize the cumulative rating of those assigned.
(e) Table 2.103 gives the standardized times of seven workers on seven machines.
Find a minimum time assignment.

Table 2.103. Data for Problem 14(e).

                  Machines
            1   2   3   4   5   6   7

Workers  1  6   4   4   5   6   7   4
         2  7   5   1   1   3   9   2
         3  3   3   7   1   9   6   6
         4  4   6   5   8   1   5   8
         5  6   1   4   4   2   1   4
         6  6   6   9   8   8   2   9
         7  5   7   3   9   1   8   2

(f) A novelty athletics meeting is to be held for teams of eight. There are eight
events, and one man from each team is to enter each event. A certain team has a
member who predicts where each man would be placed if he entered each event
(Table 2.104). Given that the team accepts his predictions, and that each finisher's
score is inversely proportional to the place he gets, find the optimal allocation.
Table 2.104. Predictions of Exercise 14(f).

                                   Events
Athletes        100m  100m     400m  1500m  Long  High  Javelin  Shot
                      hurdles               jump  jump           put

Allan           4     3        3     4      3     2     4        6
Big Billy       3     2        2     1      3     1     2        4
Chris           5     3        5     7      2     4     2        2
Dangerous Dan   3     2        2     5      4     3     3        2
Ewen R.         2     3        1     2      1     2     3        4
Freddy          1     1        2     3      2     4     4        6
George          2     4        3     3      2     5     1        1
Harry           4     6        4     6      4     6     3        5

(II) Theoretical

15. Formulate a number of real-world problems as linear programming problems.


16. Show that the set of feasible solutions to an L.P. in standard form is convex.
17. Prove that, if more than one basic feasible solution is optimal for a linear program-
ming problem in standard form, then any convex combination of those basic
feasible solutions is also optimal.
18. Attempt to solve the problem of Section 2.5.7 by the two-phase method. Compare
the efficiency of that approach with using the big M method.
19. Attempt to solve the problems of Section 2.5.8 by the two-phase method. Draw
conclusions about the use of that method on problems with unbounded optima in
general. Prove your conclusions.
20. Solve the transportation problem of Section 2.7 as a linear programming problem
by the simplex method. Compare the process, step by step, with that obtained by
the stepping stone method.
21. Formulate a 3 x 3 assignment problem as a linear programming problem. Solve it
by the simplex method. Formulate the same problem as a transportation problem
and solve it by the stepping stone method. Compare these processes with solving
the problem by the Hungarian method.
22. If a linear programming problem has multiple optima, then its objective function
hyperplane is parallel to that of a binding constraint. State conditions which must
hold when the converse is not true.
23. Prove that if a linear programming problem has an unbounded optimum its dual
cannot have any feasible solutions.
24. Prove that a variable is unrestricted in sign in an L.P. if and only if the corresponding
constraint in the dual is an equality.
Chapter 3

Advanced Linear Programming Topics

3.1 Efficient Computational Techniques for Large L.P. Problems

We shall now discuss how large linear programming problems may be
solved on a digital computer with the aid of properly organized calculations.
In spite of the recent tremendous advancement in the computational power
and memory size of modern computers, computational difficulties still arise
in solving large L.P. problems. New techniques have been developed to
overcome some of these. The techniques that we shall discuss are: the re-
vised simplex method, the dual simplex method, the primal-dual algorithm,
and Wolfe-Dantzig decomposition.

3.2 The Revised Simplex Method


We turn now to improving the efficiency of the simplex method presented
in the previous chapter. Although fairly small problems can be solved by
hand using the method, realistic industrial problems are too large for even
the most patient arithmetician. As they are to be solved using scarce, expen-
sive computer time, it is desirable to make the simplex method as efficient
as possible.
Anyone who has used the simplex method on a nontrivial problem will
have noticed that most tableau entries have their values calculated and
recalculated. Often, many such values are never actually used to make decisions
about entering and leaving basic variables and may just as well never
have been computed. At any iteration, what entries are necessary in order to
know how to proceed? The $x_0$-row coefficients of nonbasic variables are
needed to decide whether or not to continue, and if so what variable enters
the basis. The other coefficients of the entering variable and the r.h.s. entries
are needed to take ratios to decide which variable should leave the basis. It
is desirable to have these values available without having to calculate all
the others, which are of no immediate interest.
The revised simplex method achieves this. It is in essence no different
from the simplex method; it is simply a more efficient way of going about
things when using a computer. As fewer numbers are calculated at each
iteration, less storage is required by the computer, which may be an impor-
tant factor in dealing with relatively large problems.

3.2.1 A Numerical Example

Let us return once more to Problem 2.1 and find how we can solve it in an
efficient manner. The first two tableaux generated in solving the problem
by the regular simplex method are given in Tables 3.1 and 3.2.

Table 3.1

Constraints   x1    x2    x3    x4    x5     r.h.s.   Ratio

(2.11)         3     4     1     0     0       12     12/3
(2.12)         3     3     0     1     0       10     10/3
(2.13)         ④     2     0     0     1        8      8/4
x0            -4    -3     0     0     0        0

Table 3.2

Constraints   x1    x2    x3    x4    x5     r.h.s.   Ratio

(2.11)         0    5/2    1     0   -3/4       6     12/5
(2.12)         0    3/2    0     1   -3/4       4      8/3
(2.13)         1    1/2    0     0    1/4       2      4/1
x0             0    -1     0     0     1        8

Given Table 3.1, what information is needed to generate the next iteration,
which produces Table 3.2? In order to decide whether any further iterations
are necessary, the $x_0$ row is required: $(-4, -3, 0, 0, 0, 0)$. The column
$(3, 3, 4)^T$ of the incoming variable $x_1$ and the r.h.s. column $(12, 10, 8)^T$ are
required to decide upon the outgoing basic variable. None of the other
information is relevant at this moment.
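
The selection just described can be checked mechanically. The following is a minimal Python sketch and not part of the original text; the names `x0_row`, `columns`, `rhs`, and `basis` are illustrative assumptions. It uses only the three pieces of information singled out above, read off Table 3.1, and applies the usual rules of the regular simplex method (the most negative $x_0$-row coefficient enters, the minimum ratio leaves).

```python
from fractions import Fraction

# Data read off Table 3.1 (initial tableau of Problem 2.1).
x0_row = [-4, -3, 0, 0, 0]           # x0-row coefficients of x1..x5
columns = {                           # constraint columns of x1..x5
    1: [3, 3, 4], 2: [4, 3, 2], 3: [1, 0, 0], 4: [0, 1, 0], 5: [0, 0, 1],
}
rhs = [12, 10, 8]                     # r.h.s. column
basis = [3, 4, 5]                     # basic variable of rows (2.11)-(2.13)

# Entering variable: the most negative x0-row coefficient.
entering = min(range(1, 6), key=lambda j: x0_row[j - 1])

# Leaving variable: minimum ratio over positive entries of the entering column.
ratios = {
    row: Fraction(rhs[row], columns[entering][row])
    for row in range(3)
    if columns[entering][row] > 0
}
pivot_row = min(ratios, key=ratios.get)
leaving = basis[pivot_row]

print(entering, leaving, ratios[pivot_row])   # prints: 1 5 2
```

Only one column and the r.h.s. are touched by the ratio test; the $x_2$ column, for example, plays no part in this iteration.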
Recall that the simplex method starts with an initial basis and a corre-
sponding basic feasible solution and generates a sequence of improving basic
feasible solutions by replacing one basic variable at a time. The kernel of the
revised simplex method is that the basic feasible solution corresponding to
any basis can be calculated from the original tableau by a correct sequence
of row operations. In order to motivate this, consider Problem 2.1 in matrix
form:
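Maximize:
\[ x_0 = (4, 3, 0, 0, 0)(x_1, x_2, x_3, x_4, x_5)^T \]
subject to:
\[
\begin{pmatrix} 3 & 4 & 1 & 0 & 0 \\ 3 & 3 & 0 & 1 & 0 \\ 4 & 2 & 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
=
\begin{pmatrix} 12 \\ 10 \\ 8 \end{pmatrix}
\tag{3.1}
\]
\[ x_i \ge 0, \quad i = 1, 2, \ldots, 5. \tag{3.2} \]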
In this problem, three equations in five unknowns form the constraints.
Hence any basic feasible solution is found by setting two variables equal to
zero and solving the remaining three equations in three unknowns. This
creates a basis matrix, a submatrix of the original constraint matrix, found
by deleting the columns corresponding to nonbasic variables.
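The initial basis is $\beta_1 = \{x_3, x_4, x_5\}$. In Table 3.1 it can be seen that $x_1$
should replace $x_5$ in the basis, creating a basis $\beta_2 = \{x_3, x_4, x_1\}$. The basis
matrix for $\beta_2$ is found by deleting columns 2 and 5 from the constraint
matrix in (3.1) and rearranging the order of the remaining columns if
necessary, i.e.
\[ B = \begin{pmatrix} 1 & 0 & 3 \\ 0 & 1 & 3 \\ 0 & 0 & 4 \end{pmatrix}. \]
And, as
\[ x_2 = x_5 = 0, \]
(3.1) can be abbreviated as
\[ B \begin{pmatrix} x_3 \\ x_4 \\ x_1 \end{pmatrix} = \begin{pmatrix} 12 \\ 10 \\ 8 \end{pmatrix}, \]
so that the basic feasible solution (b.f.s.) corresponding to $\beta_2$ is
\[
\begin{pmatrix} x_3 \\ x_4 \\ x_1 \end{pmatrix}
= B^{-1} \begin{pmatrix} 12 \\ 10 \\ 8 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & -\frac{3}{4} \\ 0 & 1 & -\frac{3}{4} \\ 0 & 0 & \frac{1}{4} \end{pmatrix}
\begin{pmatrix} 12 \\ 10 \\ 8 \end{pmatrix}
= \begin{pmatrix} 6 \\ 4 \\ 2 \end{pmatrix}.
\]
Once $B^{-1}$ has been found, any column in the tableau representing the
b.f.s. based on $\beta_2$ can be calculated. This is achieved by multiplying the
original column by $B^{-1}$. For instance, if it is desired to find the $x_2$ column
$\bar{C}_2$ in the tableau corresponding to $\beta_2$,
\[
\bar{C}_2 = B^{-1} C_2
= \begin{pmatrix} 1 & 0 & -\frac{3}{4} \\ 0 & 1 & -\frac{3}{4} \\ 0 & 0 & \frac{1}{4} \end{pmatrix}
\begin{pmatrix} 4 \\ 3 \\ 2 \end{pmatrix}
= \begin{pmatrix} \frac{5}{2} \\ \frac{3}{2} \\ \frac{1}{2} \end{pmatrix},
\]
where
$C_i$ = $x_i$ column in original tableau,
$\bar{C}_i$ = updated $x_i$ column.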
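
As a hedged illustration (not from the original text), the two products above can be reproduced with exact fractions in Python. The helper `mat_vec` and the variable names are assumptions introduced for this sketch; the numbers are those of Problem 2.1 for the basis $\{x_3, x_4, x_1\}$.

```python
from fractions import Fraction as F

def mat_vec(M, v):
    """Multiply matrix M (a list of rows) by the column vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Inverse of the basis matrix B = [[1, 0, 3], [0, 1, 3], [0, 0, 4]].
B_inv = [[F(1), F(0), F(-3, 4)],
         [F(0), F(1), F(-3, 4)],
         [F(0), F(0), F(1, 4)]]

b = [12, 10, 8]     # original r.h.s. column
C2 = [4, 3, 2]      # original x2 column

x_basic = mat_vec(B_inv, b)     # values of x3, x4, x1
C2_bar = mat_vec(B_inv, C2)     # updated x2 column

print([str(v) for v in x_basic])   # prints: ['6', '4', '2']
print([str(v) for v in C2_bar])    # prints: ['5/2', '3/2', '1/2']
```

Both results agree with the r.h.s. and $x_2$ columns of Table 3.2, which is the point of the method: any single column can be recovered on demand from the original data and $B^{-1}$.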
It is necessary to calculate the $x_0$ row in any new tableau to find whether
the new tableau is optimal and, if not, which variable should enter the
basis. The $x_0$-row coefficients of the basic variables will be zero. How are
the $x_0$-row coefficients of the nonbasic variables calculated in the regular
simplex method? For instance, let us discover the steps taken to calculate
$\bar{c}_2$, the $x_0$-row coefficient of $x_2$ in Table 3.2. Suppose that rows (2.11), (2.12),
and (2.13) have been updated, and now the $x_0$ row is to be revised. One can
form a vector of the coefficients of the basic variables in the original tableau:
\[ c_B = (0, 0, -4). \]
The scalar
\[ c_B \bar{C}_2 \]
is the quantity that has been subtracted from the original $x_0$-row coefficient
of $x_2$ when Table 3.2 has been arrived at. Hence
\[
\bar{c}_2 = c_2 - c_B \bar{C}_2
= -3 - (0, 0, -4) \begin{pmatrix} \frac{5}{2} \\ \frac{3}{2} \\ \frac{1}{2} \end{pmatrix}
= -1.
\]
But we have seen how to deduce $\bar{C}_2$ from $C_2$, i.e.,
\[ \bar{C}_2 = B^{-1} C_2, \]
\[ \bar{c}_2 = c_2 - c_B B^{-1} C_2. \]
For brevity let
\[ \pi = c_B B^{-1}. \]
Then
\[ \bar{c}_2 = c_2 - \pi C_2. \]
The entries in $\pi$ are called simplex multipliers.


We are now in a position to calculate all the nonbasic variable coefficients,
given that the new basis is $\beta_2$:
\[
\pi = c_B B^{-1}
= (0, 0, -4) \begin{pmatrix} 1 & 0 & -\frac{3}{4} \\ 0 & 1 & -\frac{3}{4} \\ 0 & 0 & \frac{1}{4} \end{pmatrix}
= (0, 0, -1).
\]
Hence
\[ \bar{c}_2 = -1, \quad \text{as before,} \]
\[
\bar{c}_5 = c_5 - \pi C_5
= 0 - (0, 0, -1) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= 1.
\]
On examining $\bar{c}_2$ and $\bar{c}_5$ we see that $\bar{c}_2$ alone is negative and therefore $x_2$
enters the basis. We must now decide which variable leaves the basis. The
information required for this is the $x_2$ column entries and the r.h.s. entries in
the new tableau, Table 3.2. These can be obtained, as shown earlier, by
multiplying the original $x_2$ column and r.h.s. by $B^{-1}$. On taking ratios it is seen
that $x_3$ should leave the basis. The new basis becomes
\[ \beta_3 = \{x_2, x_4, x_1\}. \]
Thus the new $B$ is
\[ B = \begin{pmatrix} 4 & 0 & 3 \\ 3 & 1 & 3 \\ 2 & 0 & 4 \end{pmatrix}. \]
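
The pricing step and ratio test for the basis $\{x_3, x_4, x_1\}$ can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the author's code: the dictionaries `c` and `C` hold the original $x_0$-row coefficients and constraint columns of Problem 2.1, and all names are hypothetical.

```python
from fractions import Fraction as F

B_inv = [[F(1), F(0), F(-3, 4)],
         [F(0), F(1), F(-3, 4)],
         [F(0), F(0), F(1, 4)]]
c = {1: -4, 2: -3, 3: 0, 4: 0, 5: 0}        # original x0-row coefficients
C = {1: [3, 3, 4], 2: [4, 3, 2], 3: [1, 0, 0], 4: [0, 1, 0], 5: [0, 0, 1]}
b = [12, 10, 8]
basis = [3, 4, 1]                            # basic variable of each row

# Simplex multipliers: pi = c_B B^{-1} (row vector times matrix).
c_B = [c[j] for j in basis]
pi = [sum(c_B[k] * B_inv[k][j] for k in range(3)) for j in range(3)]

# Price out only the nonbasic columns: c_bar_j = c_j - pi C_j.
c_bar = {j: c[j] - sum(p * a for p, a in zip(pi, C[j]))
         for j in C if j not in basis}
entering = min(c_bar, key=c_bar.get)

# Ratio test on the updated entering column and updated r.h.s.
col = [sum(r * a for r, a in zip(row, C[entering])) for row in B_inv]
rhs = [sum(r * v for r, v in zip(row, b)) for row in B_inv]
ratios = {k: rhs[k] / col[k] for k in range(3) if col[k] > 0}
leaving = basis[min(ratios, key=ratios.get)]

print([str(p) for p in pi], entering, leaving)   # prints: ['0', '0', '-1'] 2 3
```

Note that the sketch never forms the updated $x_4$ or $x_3$ columns; only $\pi$, the two nonbasic coefficients, one updated column, and the updated r.h.s. are computed.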
We could calculate the new $B^{-1}$ by directly inverting $B$. However, because of
the nature of the simplex iteration it is computationally more efficient to
calculate each new $B^{-1}$ from the previous one. In order to understand how
this can be achieved it is necessary to realise that each entry $b^{-1}_{ij}$, $i \ne j$, in
$B^{-1}$ is simply the multiple of the original constraint $(j)$ which has been
finally added to the original constraint $(i)$ to obtain the $i$th row in the present
tableau. For instance, if we wished to create row (2.12) in the next tableau
from Table 3.2 using the regular simplex method we would subtract $\frac{3}{2} / \frac{5}{2}$
times row (2.11) from row (2.12). Thus the middle row in the new $B^{-1}$
(corresponding to (2.12)) can be obtained from the previous $B^{-1}$ in the same
way, i.e.,
\[
(b^{-1}_{21}, b^{-1}_{22}, b^{-1}_{23})
= (0, 1, -\tfrac{3}{4}) - \left(\tfrac{3}{2} / \tfrac{5}{2}\right)(1, 0, -\tfrac{3}{4})
= (-\tfrac{3}{5}, 1, -\tfrac{3}{10}).
\]
The bottom row can be found in a similar manner:
\[
(b^{-1}_{31}, b^{-1}_{32}, b^{-1}_{33})
= (0, 0, \tfrac{1}{4}) - \left(\tfrac{1}{2} / \tfrac{5}{2}\right)(1, 0, -\tfrac{3}{4})
= (-\tfrac{1}{5}, 0, \tfrac{2}{5}).
\]
Of course, the top row can be found by dividing the top row of the previous
$B^{-1}$ by $\frac{5}{2}$. Therefore,
\[
(b^{-1}_{11}, b^{-1}_{12}, b^{-1}_{13})
= \tfrac{2}{5}(1, 0, -\tfrac{3}{4})
= (\tfrac{2}{5}, 0, -\tfrac{3}{10}).
\]
Hence the new $B^{-1}$ is
\[
B^{-1} = \begin{pmatrix}
\frac{2}{5} & 0 & -\frac{3}{10} \\
-\frac{3}{5} & 1 & -\frac{3}{10} \\
-\frac{1}{5} & 0 & \frac{2}{5}
\end{pmatrix}.
\]
Now the very first $B$ is $I$, so
\[ B^{-1} = I, \quad \text{initially.} \]
The next $B^{-1}$ can be obtained from this one, using the method illustrated
above. And indeed each $B^{-1}$ can be found from the one before.
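
The row operations used above to pass from one $B^{-1}$ to the next can be written as a small routine. The following Python sketch is an assumption-laden illustration rather than the book's own procedure: the function name `update_inverse` and its arguments are hypothetical. It takes the previous inverse, the updated column of the entering variable, and the pivot row index `q` (counted from 0).

```python
from fractions import Fraction as F

def update_inverse(B_inv, col, q):
    """Return the next basis inverse after pivoting on row q of the updated column col."""
    # Pivot row: divide the old row q by the pivot element.
    pivot_row = [F(a) / F(col[q]) for a in B_inv[q]]
    # Every other row k: subtract col[k] times the new pivot row.
    return [pivot_row if k == q else
            [F(a) - F(col[k]) * p for a, p in zip(B_inv[k], pivot_row)]
            for k in range(len(B_inv))]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

# x1 enters in row (2.13), index 2; its column in Table 3.1 is (3, 3, 4).
B_inv_2 = update_inverse(identity, [3, 3, 4], q=2)

# x2 enters in row (2.11), index 0; its updated column is (5/2, 3/2, 1/2).
B_inv_3 = update_inverse(B_inv_2, [F(5, 2), F(3, 2), F(1, 2)], q=0)

for row in B_inv_3:
    print([str(a) for a in row])
# prints:
# ['2/5', '0', '-3/10']
# ['-3/5', '1', '-3/10']
# ['-1/5', '0', '2/5']
```

Starting from the identity, two calls reproduce both inverses of the example, so no matrix is ever inverted from scratch.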
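Going back to the example, we can now calculate the new simplex
multipliers associated with the new basis $\beta_3$:
\[
\pi = c_B B^{-1}
= (-3, 0, -4) \begin{pmatrix}
\frac{2}{5} & 0 & -\frac{3}{10} \\
-\frac{3}{5} & 1 & -\frac{3}{10} \\
-\frac{1}{5} & 0 & \frac{2}{5}
\end{pmatrix}
= (-\tfrac{2}{5}, 0, -\tfrac{7}{10}).
\]
We now calculate the nonbasic $\bar{c}_j$ to discover whether or not $\beta_3$ is an optimal
basis:
\[
\bar{c}_3 = c_3 - \pi C_3
= 0 - (-\tfrac{2}{5}, 0, -\tfrac{7}{10}) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
= \tfrac{2}{5},
\]
\[
\bar{c}_5 = c_5 - \pi C_5
= 0 - (-\tfrac{2}{5}, 0, -\tfrac{7}{10}) \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}
= \tfrac{7}{10}.
\]
As both entries are nonnegative, the optimal solution has been found. This
solution is
\[
\begin{pmatrix} x_2^* \\ x_4^* \\ x_1^* \end{pmatrix}
= B^{-1} \begin{pmatrix} 12 \\ 10 \\ 8 \end{pmatrix}
= \begin{pmatrix} \frac{12}{5} \\ \frac{2}{5} \\ \frac{4}{5} \end{pmatrix},
\qquad x_i^* = 0, \quad i = 3, 5.
\]
The actual solution value can be found by substituting these values in the
original objective function:
\[
x_0^* = (3, 0, 4) \begin{pmatrix} \frac{12}{5} \\ \frac{2}{5} \\ \frac{4}{5} \end{pmatrix}
= \tfrac{52}{5}.
\]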
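
The whole example can be replayed by chaining the steps above. The routine below is a minimal sketch in Python, offered under several assumptions that are not in the original text: the function name `revised_simplex` is hypothetical, the problem is a maximization whose slack variables occupy the last columns and form the first basis, `c` holds the original $x_0$-row coefficients (the negated objective coefficients), and no safeguard against cycling under degeneracy is included.

```python
from fractions import Fraction as F

def revised_simplex(c, A, b):
    """Sketch: returns (basis, x_B) for a maximization with a slack starting basis."""
    m, n = len(A), len(A[0])
    cols = [[F(A[i][j]) for i in range(m)] for j in range(n)]
    basis = list(range(n - m, n))                    # slack variables first
    B_inv = [[F(int(i == j)) for j in range(m)] for i in range(m)]
    while True:
        # Simplex multipliers and nonbasic x0-row coefficients.
        pi = [sum(c[basis[k]] * B_inv[k][j] for k in range(m)) for j in range(m)]
        c_bar = {j: c[j] - sum(p * a for p, a in zip(pi, cols[j]))
                 for j in range(n) if j not in basis}
        x_B = [sum(r * F(v) for r, v in zip(row, b)) for row in B_inv]
        p = min(c_bar, key=c_bar.get)
        if c_bar[p] >= 0:                            # optimality test
            return basis, x_B
        # Updated entering column and ratio test.
        col = [sum(r * a for r, a in zip(row, cols[p])) for row in B_inv]
        ratios = {k: x_B[k] / col[k] for k in range(m) if col[k] > 0}
        if not ratios:
            raise ValueError("unbounded optimum")
        q = min(ratios, key=ratios.get)
        # Update B^{-1} by the same row operations as a regular pivot.
        pivot_row = [a / col[q] for a in B_inv[q]]
        B_inv = [pivot_row if k == q else
                 [a - col[k] * pr for a, pr in zip(B_inv[k], pivot_row)]
                 for k in range(m)]
        basis[q] = p

c = [-4, -3, 0, 0, 0]
A = [[3, 4, 1, 0, 0], [3, 3, 0, 1, 0], [4, 2, 0, 0, 1]]
b = [12, 10, 8]
basis, x_B = revised_simplex(c, A, b)
value = -sum(c[j] * x for j, x in zip(basis, x_B))
print([j + 1 for j in basis], [str(x) for x in x_B], value)
# prints: [2, 4, 1] ['12/5', '2/5', '4/5'] 52/5
```

Exact fractions are used so that the output can be compared directly with the hand calculation; a practical code would work in floating point and would need tolerances in the optimality and ratio tests.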
3.2.2 Summary of the Revised Simplex Method

Consider the following problem:


Maximize: (3.3)
subject to:  $AX = B$
                                                        (3.4)
             $X \ge 0$,

with m constraints and n variables.


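Let
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}.
\]
Let the $j$th column of $A$ be denoted by $C_j$, i.e.,
\[ C_j = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}. \]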
Suppose at some point in the implementation of the revised simplex method
a basis $\beta_i$ has been identified corresponding to a basic feasible solution to
the problem. Without loss of generality, let this basis be given by the first
$m$ variables, i.e.
\[ \beta_i = \{x_1, x_2, \ldots, x_m\}. \]
A basis matrix $B_i$ is defined for each basis $\beta_i$. $B_i$ is the matrix formed by
ordering the columns of $A$ corresponding to the variables in $\beta_i$ in the order
in which they would form the columns of an identity matrix in the regular
simplex tableau.
Suppose that for $\beta_i$ this order is $1, 2, \ldots, m$; then
\[
B_i = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1m} \\
a_{21} & a_{22} & \cdots & a_{2m} \\
\vdots & \vdots &        & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mm}
\end{pmatrix}
= (C_1, C_2, \ldots, C_m).
\]
For the first basic feasible solution, with a basis of slack and artificial
variables,
\[ B_1 = I. \tag{3.5} \]

In order to discover whether or not the b.f.s. corresponding to $\beta_i$ is optimal,
it is necessary to calculate $B_i^{-1}$. With the first basis matrix, because of (3.5),
\[
B_1^{-1} = I.
\]
However, in general $B_i^{-1}$ can be calculated from $B_{i-1}^{-1}$ as follows.
Suppose that the last variable to enter $\beta_i$ is $x_p$, and that $x_p$ is the basic
variable for the qth constraint. That is, in terms of the regular simplex
method, the element in the qth row and pth column was the pivot element.
Now let $a'_{kp}$, $k = 1, 2, \ldots, m$, be the column entries according to $x_p$ in the
tableau corresponding to $\beta_{i-1}$ and
\[
B_{i-1}^{-1} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1m} \\ a_{21} & a_{22} & \cdots & a_{2m} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mm} \end{pmatrix}
\]
(here the $a_{kj}$ denote the entries of $B_{i-1}^{-1}$, not those of $A$).

Then the entry in the kth row and jth column of $B_i^{-1}$ is
\[
a_{kj} - \frac{a'_{kp}}{a'_{qp}}\,a_{qj}, \quad \text{for } k \ne q,
\]
or
\[
\frac{a_{qj}}{a'_{qp}}, \quad \text{for } k = q.
\]
Once $B_i^{-1}$ is calculated as described above, the $x_0$ row corresponding to
$\beta_i$ is found.
Let $c_B$ be the row vector of the negatives of the objective function
coefficients of the basic variables.
Define the simplex multipliers $\pi_i$ as
\[
\pi_i = c_B B_i^{-1}.
\]
Once the row vector $\pi_i$ has been found, the nonbasic variable coefficients of
the $x_0$ row are calculated.
Let $\bar{c}_j$ be the $x_0$-row coefficient of each nonbasic variable $x_j$. Then
\[
\bar{c}_j = c_j - \pi_i C_j, \quad \text{for all nonbasic variables } x_j,
\]
where $c_j$ is the coefficient of $x_j$ in the initial $x_0$ row.
If
\[
\bar{c}_j \ge 0, \quad \text{for all nonbasic variables } x_j, \qquad (3.6)
\]
the basis $\beta_i$ corresponds to an optimal solution. This solution $x_B$ can be
found as follows:
\[
x_B = B_i^{-1} b, \qquad (3.7)
\]
and the optimal solution value is
\[
x_0^* = -c_B x_B.
\]
If (3.6) is not satisfied, the column corresponding to the negative entry which
is largest in magnitude is identified. Let this be column $p$. $x_p$ will enter the
next basis, $\beta_{i+1}$.

In order to determine which basic variable $x_p$ replaces, it is necessary to
calculate the r.h.s. and pth column in the tableau according to $\beta_i$. The r.h.s.
column is given in (3.7); the pth column, $C'_p$, is
\[
C'_p = B_i^{-1} C_p.
\]
Ratios of corresponding elements of $x_B$ and $C'_p$ are formed to decide which
variable leaves $\beta_i$, say the basic variable for the qth constraint. Now $\beta_{i+1}$
can be identified, and the previous steps can be repeated with $\beta_{i+1}$ replacing
$\beta_i$. The process stops, as with the regular simplex method, when (3.6) is
satisfied.
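The steps summarized above can be collected into a short program. What follows is a minimal sketch, not a production implementation: it assumes a maximization problem already in equality form whose starting basis columns form an identity matrix, it uses exact fractions, and the function name `revised_simplex` and its argument layout are choices made here for illustration. Degeneracy and cycling are not handled.

```python
from fractions import Fraction as F

def revised_simplex(A, b, c, basis):
    """Maximize c.x subject to A x = b, x >= 0.

    basis lists the (0-based) indices of the starting basic variables, whose
    columns are assumed to form an identity matrix. Returns (x, value).
    """
    m, n = len(A), len(A[0])
    B_inv = [[F(int(i == j)) for j in range(m)] for i in range(m)]
    basis = list(basis)
    while True:
        c_B = [-F(c[j]) for j in basis]                 # negated coefficients, as in the text
        pi = [sum(c_B[i] * B_inv[i][k] for i in range(m)) for k in range(m)]
        x_B = [sum(B_inv[i][k] * b[k] for k in range(m)) for i in range(m)]   # (3.7)
        # x0-row coefficients of the nonbasic variables
        cbar = {j: -F(c[j]) - sum(pi[k] * A[k][j] for k in range(m))
                for j in range(n) if j not in basis}
        p = min(cbar, key=cbar.get)                     # most negative coefficient
        if cbar[p] >= 0:                                # criterion (3.6)
            x = [F(0)] * n
            for i, j in enumerate(basis):
                x[j] = x_B[i]
            return x, sum(F(c[j]) * x[j] for j in range(n))
        # pth column of the current tableau and the ratio test
        col = [sum(B_inv[i][k] * A[k][p] for k in range(m)) for i in range(m)]
        ratios = [(x_B[i] / col[i], i) for i in range(m) if col[i] > 0]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, q = min(ratios)
        # update B_inv by the row operations of a pivot on row q
        pivot_row = [v / col[q] for v in B_inv[q]]
        B_inv = [pivot_row if i == q
                 else [v - col[i] * w for v, w in zip(B_inv[i], pivot_row)]
                 for i in range(m)]
        basis[q] = p

# Problem 2.1 with slack variables x3, x4, x5 as the starting basis.
A = [[3, 4, 1, 0, 0],
     [3, 3, 0, 1, 0],
     [4, 2, 0, 0, 1]]
x, value = revised_simplex(A, [12, 10, 8], [4, 3, 0, 0, 0], basis=[2, 3, 4])
print([str(v) for v in x], value)
```

Note that only the entering column is ever transformed by the inverse; the full tableau is never formed.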

3.2.3 The Calculations in Compact Form

Problem 2.1 will now be reworked using the revised simplex method with
the calculations laid out in the normal compact form. The reader should
compare this with the tableaux necessary for the regular simplex method,
shown in Tables 2.2-2.8.
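From (3.1):
\[
A = \begin{pmatrix} 3 & 4 & 1 & 0 & 0 \\ 3 & 3 & 0 & 1 & 0 \\ 4 & 2 & 0 & 0 & 1 \end{pmatrix}, \qquad c = (4, 3, 0, 0, 0).
\]
The iterations are shown in Tables 3.3-3.5. Table 3.5 reveals the same
optimal solution as that found by the regular simplex method in Table 2.8,
namely,
\[
x_1^* = \tfrac{4}{5}, \quad x_2^* = \tfrac{12}{5}, \quad x_4^* = \tfrac{2}{5}, \quad x_3^* = x_5^* = 0, \quad x_0^* = \tfrac{52}{5}.
\]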
Table 3.3

                                                              Entering
β1      c_B     B_1^{-1}          π1      b      C'_j   Ratio   variable

x3       0      1     0     0      0      12      3     12/3
x4       0      0     1     0      0      10      3     10/3
x5       0      0     0     1      0       8      4      8/4       x1

(-c)    -4    -3     0     0     0         0

Table 3.4

                                                              Entering
β2      c_B     B_2^{-1}          π2      b      C'_j   Ratio   variable

x3       0      1     0   -3/4     0       6     5/2    12/5       x2
x4       0      0     1   -3/4     0       4     3/2     8/3
x1      -4      0     0    1/4    -1       2     1/2     4/1

(-c)     0    -1     0     0     1         8

Table 3.5

β3      c_B     B_3^{-1}             π3       b

x2      -3     2/5     0   -3/10    -2/5    12/5
x4       0    -3/5     1   -3/10     0       2/5
x1      -4    -1/5     0    2/5     -7/10    4/5

(-c)     0     0    2/5    0    7/10        52/5

To sum up, the advantages of the revised over the regular simplex method
are:
1. Fewer calculations are required.
2. Less storage is required when implementing the revised simplex method
on a computer.
3. There is less accumulation of round-off error, as tableau entries are not
repeatedly recalculated. An updated column of entries is not calculated
until its variable is about to become basic.

3.3 The Dual Simplex Method

3.3.1 Background

Consider the application of the simplex method to an L.P. problem. When


an optimal solution has been found (assuming its existence), the optimal
solution to the dual problem can be found by inspecting the optimal primal
tableau. However, each tableau generated by the simplex method in solving
the primal can be inspected to yield a solution to the dual. What is the
nature of this sequence of solutions to the dual? It can be shown that all
except the last are infeasible and have solution values which are better than

the optimum. As an example of this, consider Problem 2.1 and its solution
by the simplex method:
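\[
\begin{aligned}
\text{Maximize:} \quad & 4x_1 + 3x_2 = x_0 && (2.10) \\
\text{subject to:} \quad & 3x_1 + 4x_2 + x_3 = 12 && (2.11) \\
& 3x_1 + 3x_2 + x_4 = 10 && (2.12) \\
& 4x_1 + 2x_2 + x_5 = 8 && (2.13)
\end{aligned}
\]

This problem has the following dual:

\[
\begin{aligned}
\text{Minimize:} \quad & 12y_1 + 10y_2 + 8y_3 = y_0 \\
\text{subject to:} \quad & 3y_1 + 3y_2 + 4y_3 - y_4 = 4 \\
& 4y_1 + 3y_2 + 2y_3 - y_6 = 3 \\
& y_1, y_2, y_3, y_4, y_6 \ge 0.
\end{aligned}
\]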
The initial tableau for the primal is shown in Table 2.6, repeated here for
convenience.

Table 2.6

Constraints     x1     x2     x3     x4     x5     r.h.s.

(2.11)           3      4      1      0      0      12
(2.12)           3      3      0      1      0      10
(2.13)          [4]     2      0      0      1       8
(2.10)          -4     -3      0      0      0       0

Using the summary given in Section 2.6.1.3 on interpreting the primal
tableau to find a solution to the dual, it can be seen that Table 2.6 corresponds
to the following dual solution:
\[
y_1 = 0, \quad y_2 = 0, \quad y_3 = 0, \quad y_4 = -4, \quad y_6 = -3,
\]
and
\[
y_0 = 0.
\]
When we solved the dual in Section 2.6.1.2 we found that the optimal
solution had value
\[
y_0^* = \tfrac{52}{5}.
\]
Hence this present solution is better in value ($y_0 = 0$), as we are minimizing.

Table 2.7

Constraints     x1     x2     x3     x4     x5     r.h.s.

(2.11)           0    5/2      1      0    -3/4      6
(2.12)           0    3/2      0      1    -3/4      4
(2.13)           1    1/2      0      0     1/4      2
(2.10)           0     -1      0      0      1       8

The next simplex iteration for the primal produces Table 2.7, repeated
here for convenience.
This corresponds to the following dual solution:
\[
y_1 = 0, \quad y_2 = 0, \quad y_3 = 1, \quad y_4 = 0, \quad y_6 = -1,
\]
and
\[
y_0 = 8.
\]
This solution is still infeasible, and is worse in value than the last produced,
but still better than the optimal solution which is generated in the next
iteration (see Section 2.6.1.3).
It is true in general that the sequence of solutions (all but the last) for a
dual problem generated by interpreting successive primal tableaux have the
following properties:

1. They are infeasible.


2. Each has a solution value worse than the last.
The very last such solution is optimal, as has been seen in the previous
chapter. So the possibility presents itself of a new approach to solving L.P.
problems. Rather than start with a feasible solution and produce a sequence
of feasible and improving (better solution values) solutions, why not start
with an infeasible solution and produce a sequence of infeasible solutions
with worsening solution values ultimately terminating with the optimal
solution? The dual simplex method does just that.
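The behaviour of the dual solutions attached to successive primal bases can be checked with a short calculation. The sketch below assumes the three bases visited for Problem 2.1, namely (x3, x4, x5), (x3, x4, x1) and (x2, x4, x1), and obtains each dual solution by solving yB = c_B exactly; the helper names `solve` and `dual_from_basis` are illustrative only.

```python
from fractions import Fraction as F

A = [[3, 4, 1, 0, 0],
     [3, 3, 0, 1, 0],
     [4, 2, 0, 0, 1]]
b = [12, 10, 8]
c = [4, 3, 0, 0, 0]

def solve(M, rhs):
    """Solve M z = rhs exactly by Gauss-Jordan elimination (M square, nonsingular)."""
    n = len(M)
    T = [[F(v) for v in row] + [F(r)] for row, r in zip(M, rhs)]
    for i in range(n):
        piv = next(r for r in range(i, n) if T[r][i] != 0)
        T[i], T[piv] = T[piv], T[i]
        T[i] = [v / T[i][i] for v in T[i]]
        for r in range(n):
            if r != i:
                f = T[r][i]
                T[r] = [v - f * w for v, w in zip(T[r], T[i])]
    return [row[-1] for row in T]

def dual_from_basis(basis):
    """Return (y1, y2, y3), the surpluses (y4, y6) and the value y0 for a primal basis."""
    # y satisfies y B = c_B, i.e. B^T y = c_B, where B holds the basic columns of A.
    Bt = [[A[k][j] for k in range(3)] for j in basis]
    y = solve(Bt, [c[j] for j in basis])
    y4 = sum(A[k][0] * y[k] for k in range(3)) - c[0]
    y6 = sum(A[k][1] * y[k] for k in range(3)) - c[1]
    y0 = sum(b[k] * y[k] for k in range(3))
    return y, (y4, y6), y0

results = [dual_from_basis(basis) for basis in ([2, 3, 4], [2, 3, 0], [1, 3, 0])]
for y, surplus, y0 in results:
    print([str(v) for v in y], [str(v) for v in surplus], y0)
```

The first two dual solutions have a negative surplus variable and are therefore infeasible, and the values 0, 8 and 52/5 worsen (increase) until the last, feasible solution is reached.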
When is such a strategy likely to produce a more efficient procedure?
When a problem contains many "≥" constraints, many artificial variables
have to be introduced in the regular simplex method. Considerable effort
may be expended in reaching a solution in which all of these have zero value.
In such circumstances it is usually better to start with an initial solution of
slack variables. Such a solution will be infeasible, as each slack in a "≥"
constraint will have negative value. However, usually fewer iterations are

needed to attain optimality than in the two-phase method or the big M


method.
A second situation in which one is trying to transform an infeasible
solution into a feasible solution with a worse value is in postoptimal analysis.
The optimal solution to an L.P. problem may no longer be feasible once
changes are made to the parameters of the problem. The dual simplex
method can be applied to transform this basic, infeasible solution into the
optimal one. The mechanics of the method will be explained by means of
an example in the next section.

3.3.2 The Dual Simplex Method Applied to a Numerical Example

One of the problems of the previous section will be solved by the dual
simplex method:
\[
\begin{aligned}
\text{Minimize:} \quad & 12y_1 + 10y_2 + 8y_3 = y_0 && (3.8) \\
\text{subject to:} \quad & 3y_1 + 3y_2 + 4y_3 - y_4 = 4 && (3.9) \\
& 4y_1 + 3y_2 + 2y_3 - y_6 = 3 && (3.10) \\
& y_1, y_2, y_3, y_4, y_6 \ge 0. && (3.11)
\end{aligned}
\]

As usual we shall adopt the criterion of maximization:
\[
\text{Maximize:} \quad -12y_1 - 10y_2 - 8y_3 = y_0'. \qquad (3.12)
\]
Consider problem (3.9)-(3.12) as the primal L.P. The initial step is to
find a basic solution in the first tableau in which:
1. The criterion for optimality is satisfied (all nonbasic variables have
nonnegative $y_0'$-row coefficients); and
2. All basic variables have zero $y_0'$-row coefficients.
Table 3.6 displays the problem. It can be seen that criteria 1 and 2 would
be satisfied if the nonzero entries in the $y_4$ and $y_6$ columns were of opposite
sign. Then $\{y_4, y_6\}$ would be a suitable basic set. This is achieved by
multiplying the two constraints by negative one, as shown in Table 3.7, i.e.,
\[
y_4 = -4, \qquad y_6 = -3.
\]

Table 3.6

Constraints     y1     y2     y3     y4     y6     r.h.s.

(3.9)            3      3      4     -1      0       4
(3.10)           4      3      2      0     -1       3
y0'             12     10      8      0      0       0

Table 3.7

Constraints     y1     y2     y3     y4     y6     r.h.s.

(3.9)           -3     -3     -4      1      0      -4
(3.10)          -4     -3     -2      0      1      -3
y0'             12     10      8      0      0       0

This solution corresponds to a basis of all slack variables, which is
usually the case. If it were feasible it would be optimal, as the criterion for
optimality is satisfied. Alas, this is not the case, as both basic variables are
negative. However, the solution value is
\[
y_0' = 0,
\]
which is better than the known optimum of $\tfrac{52}{5}$. This solution corresponds
to the solution in Table 2.6, as discussed in the previous section. (The reader
is urged to compare the discussion concerning the numerical example in
Section 3.3.1 with the similar steps of the present section.) As the present
solution is infeasible, a change of basis is made in order to reduce this
infeasibility, so it must be decided which variable leaves the basis and which
enters.
First the question of the leaving variable is settled. Recall that when
the regular simplex method is applied to the dual the variable with the most
negative $x_0$-row coefficient is selected to enter the basis. This is shown in
Section 3.3.1, where $x_1$ is selected as the incoming basic variable in Table 2.6.
This most negative $x_0$-row coefficient corresponds to a value of one of the
basic variables in the problem. For instance, the most negative $x_0$-row
coefficient of variable $x_1$, equalling $-4$, corresponds to the present value of
$y_4$. (Compare Tables 2.6 and 3.7.) It is this basic variable which is to leave
the basis. This is intuitively quite reasonable, as it is natural to remove the
most negative variable when trying to attain feasibility by eventually making
all variables nonnegative.
Next the question of which variable enters the basis is settled. Once
again, let us consider the mechanics of the regular simplex method in solving
the dual. Having selected $x_1$ to enter the basis, one then takes the ratios
\[
\left( \frac{b_1}{a_{11}}, \frac{b_2}{a_{21}}, \frac{b_3}{a_{31}} \right) = \left( \frac{12}{3}, \frac{10}{3}, \frac{8}{4} \right)
\]
and selects the minimum. The ratios correspond to a set of ratios in Table
3.7:
\[
\left( \frac{12}{-3}, \frac{10}{-3}, \frac{8}{-4} \right).
\]
Note that when taking ratios with the regular simplex method, ratios with
negative denominators are ignored. Now, as all equations in our primal in

Table 3.7 have been multiplied by -1, only ratios with negative denominators
are taken into account and the variable corresponding to the largest ratio
enters the basis. Thus Y3 enters the basis.
Note that the above criteria for deciding which variables enter and leave
the basis represent a departure from the regular simplex method. However,
the mechanics of the regular and dual simplex methods are otherwise the
same.
Once it is determined that $y_3$ enters the basis and $y_4$ leaves, this
transformation is carried out by the usual Gauss-Jordan elimination, which
produces Table 3.8. This solution corresponds to the second one found in the
previous section, i.e.,
\[
y_3 = 1, \qquad y_6 = -1, \qquad y_0' = -8, \ \text{i.e.,}\ y_0 = 8.
\]

Note that the value needs to be multiplied by $-1$, as we took the negative
of the objective function in (3.12) in order to maximize.

Table 3.8

Constraints     y1      y2      y3      y4      y6     r.h.s.

(3.9)           3/4     3/4      1     -1/4      0       1
(3.10)         -5/2    -3/2      0     -1/2      1      -1
y0'              6       4       0       2       0      -8

The process is repeated once more. The leaving basic variable is $y_6$, as it
is the only one with a negative value. Taking the ratios, we obtain
\[
6/(-\tfrac{5}{2}), \qquad 4/(-\tfrac{3}{2}), \qquad 2/(-\tfrac{1}{2}).
\]
The largest of these is the first, $-\tfrac{12}{5}$. Thus $y_1$ enters the basis. This
produces Table 3.9, which represents the optimal solution to the problem, as
it is feasible and satisfies the criterion for optimality. The solution is
identical to that found previously.

Table 3.9

Constraints     y1      y2      y3      y4      y6     r.h.s.

(3.9)            0     3/10      1     -2/5    3/10     7/10
(3.10)           1     3/5       0      1/5   -2/5      2/5
y0'              0     2/5       0      4/5   12/5    -52/5

3.3.3 Summary of the Dual Simplex Method

When solving an L.P. problem with the dual simplex method the following
steps are carried out.

Step 1. Equality constraints are split into pairs of inequality constraints with
opposite sense. For example, the equality constraint

is replaced by

and

which on the introduction of slack variables become

and

Step 2. A basic solution (normally comprising exactly the set of slack
variables) is found which satisfies the criterion for optimality. This criterion
is that all x0-row coefficients for nonbasic variables are nonnegative. Of course
all basic variable x0-row coefficients must be zero, as usual.

Step 3
3.1. Determination of the feasibility of the present solution. Each solution
generated satisfies the condition for optimality. Thus if it is feasible it
will be optimal. A solution will be feasible if all its variable values are
nonnegative. If this is so, the process is terminated and the present
solution is optimal. Otherwise, proceed.
3.2. Determination of variable to leave the basis. Among all variables with
negative values, the one with the value which is largest in magnitude is
selected to leave the basis.
3.3. Determination of variable to enter the basis. Identify the equation,
say (j), which contains a unit coefficient for the leaving variable
discovered in step 3.2. Identify all variables which have negative
coefficients in this equation. For each such variable, form a ratio of its
current x0-row coefficient divided by its coefficient in equation (j). The
variable with the ratio which is largest enters the basis.
3.4. Make the change of basis according to the variables found in steps 3.2
and 3.3 by Gauss-Jordan elimination and create a new tableau. Go to
step 3.1.
It is important that the reader realises that the dual simplex method
performs corresponding iterations on the L.P. problem as the regular simplex
method would perform on the dual problem.
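Steps 3.1-3.4 can be sketched as follows. This is a minimal illustration under stated assumptions: the tableau is supplied already in the form required by Step 2 (as in Table 3.7), entries are exact fractions, and the function name `dual_simplex` and the tableau layout (constraint rows first, objective row last, r.h.s. in the final column) are conventions chosen here.

```python
from fractions import Fraction as F

def dual_simplex(T, basis):
    """Apply the dual simplex method to tableau T, modifying it in place.

    basis[i] is the column index of the variable that is basic in row i.
    """
    m = len(T) - 1                                    # number of constraint rows
    while True:
        # Steps 3.1 and 3.2: the most negative basic variable leaves.
        q = min(range(m), key=lambda i: T[i][-1])
        if T[q][-1] >= 0:
            return T, basis                           # feasible, hence optimal
        # Step 3.3: among negative entries of row q, the largest ratio enters.
        candidates = [j for j in range(len(T[q]) - 1) if T[q][j] < 0]
        if not candidates:
            raise ValueError("no feasible solution exists")
        p = max(candidates, key=lambda j: T[-1][j] / T[q][j])
        # Step 3.4: Gauss-Jordan elimination on the pivot element.
        pivot = T[q][p]
        T[q] = [v / pivot for v in T[q]]
        for r in range(len(T)):
            if r != q:
                f = T[r][p]
                T[r] = [v - f * w for v, w in zip(T[r], T[q])]
        basis[q] = p

# Table 3.7: columns y1, y2, y3, y4, y6, r.h.s.; initial basis {y4, y6}.
rows = [[-3, -3, -4, 1, 0, -4],
        [-4, -3, -2, 0, 1, -3],
        [12, 10,  8, 0, 0,  0]]
T = [[F(v) for v in row] for row in rows]
T, basis = dual_simplex(T, [3, 4])
for row in T:
    print([str(v) for v in row])
```

Run on Table 3.7, the sketch passes through Table 3.8 and stops at Table 3.9, with y3 and y1 basic.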

3.4 The Primal-Dual Algorithm


The dual simplex method was presented in Section 3.3 as a way of overcoming
the inefficiency brought about by introducing a relatively large number of
artificial variables when solving an L.P. problem. There is another approach
possible. Rather than begin with an infeasible solution with value better
than the optimum, as in the dual simplex method, why not begin with an
infeasible, worse than optimal solution? At least such an initial solution
should be easy to find. This is the essence of the primal-dual algorithm.
The algorithm begins by constructing the dual L.P. problem. A feasible
solution is then found for the dual. On the basis of this solution, the original
primal L.P. is modified and this modified problem is used to create a new
feasible solution to the dual with an improved value. The process continues
in this manner, examining solutions to the dual and a modified primal
alternately until the optimal dual solution is produced. (Convergence must
take place.) Loosely speaking the successive solutions to the dual correspond
to primal solutions which are successively less infeasible for the primal, until
the optimal dual solution corresponds to a primal solution which is not
only feasible but optimal.
The algorithm is fully described in Hadley (1962) and Dantzig (1963).

3.5 Dantzig-Wolfe Decomposition

3.5.1 Background

Nearly all real-world L.P. problems have far more variables and constraints
than the small problems concocted for illustrative purposes so far in this
book. In fact some industrial problems are so large that it is not very practical
to consider solving them by the methods presented up to this point. One
line of approach is to ask what special structure an L.P. must possess in
order for it to be possible to break it up into a number of smaller, hopefully
easier subproblems. The idea is to somehow combine the solutions of the
subproblems in order to find the solution for the original problem.
It has been found that many realistic L.P. problems possess a matrix A
of l.h.s. coefficients which has the property called block angular structure.
What this structure is will be described a little later. The point is that block
angular L.P. problems can be decomposed into smaller subproblems. By
solving these in a special way it is possible to identify an optimal solution
to the original problem. This is achieved by Dantzig-Wolfe decomposition,
which is due to Dantzig and Wolfe (1960). Their method is now introduced
by means of a numerical example. This section requires a more thorough

understanding of matrix theory. The unprepared reader is referred to the


Appendix.

3.5.2 Numerical Example

We shall now consider an expanded version of the coal mining problem


which was introduced at the beginning of the previous chapter. Suppose
now that the coal mine mentioned earlier (mine No.1) is taken over by
another company which already owns a mine (No.2). Thus the company
now has two mines. For simplicity, constraints generated by screening the
coal are neglected. However the company is anxious to maintain good
labour relations with its new miners, and so maintains the same restriction
of 12 and 8 hours of cutting and washing, respectively, per day, with unit
consumption being 2 and 1 hour for lignite and 1 hour for anthracite for
both cutting and washing. A restriction of 24 and 14 hours per day of cutting
and washing are in force at mine 2. Electricity and gas are purchased by the
company and supplied to its mines. Because the mines employ different
processes, the same type of coal in different mines requires different amounts
of electricity and gas to produce the same quantity. Thus, to produce one
ton of lignite and anthracite requires 2 and 3 units of electricity, respectively,
in mine 1 and 4 and 1 units, respectively, in mine 2. The corresponding
figures for gas consumption are 46/19, 46/19, 6, and 8. Due to energy shortages
the company is allocated a maximum of 20 and 30 units daily of electricity
and gas, respectively. The unit profit for mine 1 lignite and anthracite was
$4 and $3, respectively (in hundreds of dollars), and this can be maintained.
Because of increased shipping costs, as mine 2 is in a remote area, unit
profit for mine 2 lignite and anthracite is $57/23 and $76/23, respectively. Let
x1 = daily production of mine 1 lignite in tons
x2 = daily production of mine 1 anthracite in tons
x3 = daily production of mine 2 lignite in tons
x4 = daily production of mine 2 anthracite in tons.
Then the problem can be expressed as follows:
\[
\begin{aligned}
\text{Maximize:} \quad & 4x_1 + 3x_2 + \tfrac{57}{23}x_3 + \tfrac{76}{23}x_4 = x_0 && (3.13) \\
\text{subject to:} \quad & 2x_1 + 3x_2 + 4x_3 + x_4 \le 20 && (3.14) \\
& \tfrac{46}{19}x_1 + \tfrac{46}{19}x_2 + 6x_3 + 8x_4 \le 30 && (3.15) \\
& 3x_1 + 4x_2 \le 12 && (3.16) \\
& 4x_1 + 2x_2 \le 8 && (3.17) \\
& 2x_3 + x_4 \le 24 && (3.18) \\
& x_3 + x_4 \le 14 && (3.19) \\
& x_1, x_2, x_3, x_4 \ge 0. && (3.20)
\end{aligned}
\]

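Matrix $A$ for this problem is
\[
A = \left( \begin{array}{cc|cc}
2 & 3 & 4 & 1 \\
\tfrac{46}{19} & \tfrac{46}{19} & 6 & 8 \\ \hline
3 & 4 & 0 & 0 \\
4 & 2 & 0 & 0 \\ \hline
0 & 0 & 2 & 1 \\
0 & 0 & 1 & 1
\end{array} \right).
\]
Now, letting
\[
A_1 = \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix}, \quad
A_3 = \begin{pmatrix} 3 & 4 \\ 4 & 2 \end{pmatrix}, \quad
A_4 = \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix},
\]
$A$ can be expressed as
\[
A = \begin{pmatrix} A_1 & A_2 \\ A_3 & 0 \\ 0 & A_4 \end{pmatrix},
\]
where $0$ is a $2 \times 2$ matrix of zeros. In general, a matrix which can be
expressed as
\[
\begin{pmatrix}
A_1 & A_2 & \cdots & A_N \\
A_{N+1} & 0 & \cdots & 0 \\
0 & A_{N+2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & A_{2N}
\end{pmatrix}
\]
is termed block angular. Let $\mathbf{x}_i$ be the vector of variables corresponding to
the columns of submatrix $A_i$. The constraints:
\[
\sum_{i=1}^{N} A_i \mathbf{x}_i \le b_0
\]
are called global constraints. The constraints:
\[
A_{N+i}\,\mathbf{x}_i \le b_i, \quad i = 1, 2, \ldots, N,
\]
are called local constraints.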


Block angular matrices appear in L.P. problems when, as in the present
example, the total operation can be divided into groups of activities, each
with its own exclusive resources, and there are further resources which
must be shared by all activities. Thus, apart from the gas and electricity
constraints the problem can be considered as two separate L.P. problems,
one for each mine. These subproblems are:

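Mine 1:
\[
\begin{aligned}
\text{Maximize:} \quad & 4x_1 + 3x_2 \\
\text{subject to:} \quad & 3x_1 + 4x_2 \le 12 \\
& 4x_1 + 2x_2 \le 8 \\
& x_1, x_2 \ge 0.
\end{aligned}
\]
Mine 2:
\[
\begin{aligned}
\text{Maximize:} \quad & \tfrac{57}{23}x_3 + \tfrac{76}{23}x_4 \\
\text{subject to:} \quad & 2x_3 + x_4 \le 24 \\
& x_3 + x_4 \le 14 \\
& x_3, x_4 \ge 0.
\end{aligned}
\]

In the general case there will be $N$ subproblems of the form:
\[
\begin{aligned}
\text{Maximize:} \quad & c_j^T \mathbf{x}_j \\
\text{subject to:} \quad & A_{N+j}\,\mathbf{x}_j \le b_j, \quad \text{for } j = 1, 2, \ldots, N, \\
& \mathbf{x}_j \ge 0,
\end{aligned}
\]
where
\[
\begin{aligned}
c &= (c_1^T, c_2^T, \ldots, c_N^T)^T = \text{the vector of objective function coefficients,} \\
X &= (\mathbf{x}_1^T, \mathbf{x}_2^T, \ldots, \mathbf{x}_N^T)^T = \text{the vector of decision variables,} \\
b &= (b_0^T, b_1^T, \ldots, b_N^T)^T = \text{the vector of r.h.s. constants.}
\end{aligned}
\]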
Now suppose the constraints (3.14) and (3.15) are temporarily ignored. Then
if the optimal solutions to the two subproblems are feasible for these con-
straints, their combination represents an optimal solution to the original
problem. Hence it appears worthwhile to attempt to solve the original
problem by analyzing the subproblems. Of course, there must be some
modification to the strategy of simply solving the subproblems, as their
combined solutions will seldom satisfy the global constraints (3.14) and
(3.15). A technique called the method of decomposition developed by Dantzig
and Wolfe will now be explained by using it to solve the example problem.
The simplified version of the method assumes that each subproblem has a
set of feasible solutions which is bounded, i.e., no variable can take on an
infinite feasible value. We make that assumption here.
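To see why simply solving the subproblems is not enough here, one can test their separate optima against the global constraints. The sketch below assumes the gas coefficients 46/19 for mine 1, and assumes that the isolated optima are (4/5, 12/5) for mine 1 and (0, 14) for mine 2, as can be read off graphically.

```python
from fractions import Fraction as F

# Optimal extreme points of the two subproblems taken in isolation
# (assumed here; they can be confirmed graphically).
x1, x2 = F(4, 5), F(12, 5)     # mine 1
x3, x4 = F(0), F(14)           # mine 2

# Left-hand sides of the global constraints (3.14) and (3.15).
electricity = 2 * x1 + 3 * x2 + 4 * x3 + x4
gas = F(46, 19) * (x1 + x2) + 6 * x3 + 8 * x4

print("electricity:", electricity, "within limit:", electricity <= 20)
print("gas:", gas, "within limit:", gas <= 30)
```

Both limits are exceeded, so the combined subproblem optima are infeasible for the original problem and the strategy has to be modified.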
The assumption of boundedness implies that the set of feasible solutions
for each subproblem has a finite number of extreme points. Furthermore,
any point in such a set can be expressed as a convex combination of these
extreme points. More precisely, if subproblem $j$, $j = 1, 2, \ldots, N$, has $m_j$
extreme points, denoted by $\mathbf{x}_j^k$, $k = 1, 2, \ldots, m_j$, then any feasible solution
$\mathbf{x}_j$ to the subproblem $j$ can be expressed as:
\[
\mathbf{x}_j = \sum_{k=1}^{m_j} \alpha_j^k \mathbf{x}_j^k,
\]
where
\[
\sum_{k=1}^{m_j} \alpha_j^k = 1, \qquad \alpha_j^k \ge 0, \quad k = 1, 2, \ldots, m_j.
\]
Also, no point outside the set can be expressed in this way.

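For example, in subproblem 1, Figure 2.1 reveals that there are four
extreme points:
\[
\mathbf{x}_1^1 = (0, 0), \quad \mathbf{x}_1^2 = (2, 0), \quad \mathbf{x}_1^3 = (0, 3), \quad \mathbf{x}_1^4 = (\tfrac{4}{5}, \tfrac{12}{5}).
\]
So $m_1 = 4$. Thus any point $\mathbf{x}_1$ in the feasible region satisfies
\[
\mathbf{x}_1 = \alpha_1^1 \mathbf{x}_1^1 + \alpha_1^2 \mathbf{x}_1^2 + \alpha_1^3 \mathbf{x}_1^3 + \alpha_1^4 \mathbf{x}_1^4,
\]
where
\[
\alpha_1^1 + \alpha_1^2 + \alpha_1^3 + \alpha_1^4 = 1, \qquad \alpha_1^k \ge 0, \quad k = 1, 2, 3, 4.
\]
A similar expression for subproblem 2 can be found which also involves
four extreme points:
\[
\mathbf{x}_1 = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \alpha_1^1 \mathbf{x}_1^1 + \alpha_1^2 \mathbf{x}_1^2 + \alpha_1^3 \mathbf{x}_1^3 + \alpha_1^4 \mathbf{x}_1^4 = \sum_{k=1}^{4} \alpha_1^k \mathbf{x}_1^k \qquad (3.21)
\]
and
\[
\mathbf{x}_2 = \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} = \alpha_2^1 \mathbf{x}_2^1 + \alpha_2^2 \mathbf{x}_2^2 + \alpha_2^3 \mathbf{x}_2^3 + \alpha_2^4 \mathbf{x}_2^4 = \sum_{k=1}^{4} \alpha_2^k \mathbf{x}_2^k. \qquad (3.22)
\]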
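For two-variable subproblems such as these, the extreme points can be listed mechanically by intersecting pairs of boundary lines and discarding infeasible intersections. The sketch below does this with exact fractions; the function names `extreme_points` and `convex_combination` are illustrative, and the approach is only practical for very small problems.

```python
from fractions import Fraction as F
from itertools import combinations

def extreme_points(rows, rhs):
    """Vertices of {x in R^2 : rows[i].x <= rhs[i], x >= 0}."""
    # Append the nonnegativity constraints as -x1 <= 0 and -x2 <= 0.
    G = [[F(v) for v in r] for r in rows] + [[F(-1), F(0)], [F(0), F(-1)]]
    h = [F(v) for v in rhs] + [F(0), F(0)]
    points = set()
    for i, j in combinations(range(len(G)), 2):
        det = G[i][0] * G[j][1] - G[i][1] * G[j][0]
        if det == 0:
            continue                                   # parallel boundary lines
        x = (h[i] * G[j][1] - G[i][1] * h[j]) / det    # Cramer's rule
        y = (G[i][0] * h[j] - h[i] * G[j][0]) / det
        if all(g[0] * x + g[1] * y <= k for g, k in zip(G, h)):
            points.add((x, y))
    return sorted(points)

def convex_combination(points, alphas):
    """Weighted sum of points; the weights must be nonnegative and sum to one."""
    assert sum(alphas) == 1 and all(a >= 0 for a in alphas)
    return tuple(sum(a * p[i] for a, p in zip(alphas, points)) for i in range(2))

mine1 = extreme_points([[3, 4], [4, 2]], [12, 8])
mine2 = extreme_points([[2, 1], [1, 1]], [24, 14])
inside = convex_combination(mine1, [F(1, 4)] * 4)
print(mine1, mine2, inside)
```

Each subproblem has four extreme points, and any choice of admissible weights, such as the equal weights used for `inside`, yields a feasible point of the corresponding subproblem.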

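Now problem (3.13)-(3.20) can be written in matrix form as follows:
\[
\begin{aligned}
\text{Maximize:} \quad & (4, 3) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + (\tfrac{57}{23}, \tfrac{76}{23}) \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \\
\text{subject to:} \quad & \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix} \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \le \begin{pmatrix} 20 \\ 30 \end{pmatrix} && (3.23) \\
& \begin{pmatrix} 3 & 4 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \le \begin{pmatrix} 12 \\ 8 \end{pmatrix} && (3.24) \\
& \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \le \begin{pmatrix} 24 \\ 14 \end{pmatrix} && (3.25) \\
& x_i \ge 0, \quad i = 1, 2, 3, 4. && (3.26)
\end{aligned}
\]
Now if (3.21) and (3.22) are used to eliminate $x_1$, $x_2$, $x_3$, and $x_4$ from this
formulation, (3.24), (3.25), and (3.26) are no longer needed, as they are
implicitly satisfied in (3.21) and (3.22). Making this substitution, the problem
becomes:
\[
\text{Maximize:} \quad (4, 3) \sum_{k=1}^{4} \alpha_1^k \mathbf{x}_1^k + (\tfrac{57}{23}, \tfrac{76}{23}) \sum_{k=1}^{4} \alpha_2^k \mathbf{x}_2^k \qquad (3.27)
\]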

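\[
\begin{aligned}
\text{subject to:} \quad & \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} \sum_{k=1}^{4} \alpha_1^k \mathbf{x}_1^k + \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix} \sum_{k=1}^{4} \alpha_2^k \mathbf{x}_2^k \le \begin{pmatrix} 20 \\ 30 \end{pmatrix} && (3.28) \\
& \alpha_j^1 + \alpha_j^2 + \alpha_j^3 + \alpha_j^4 = 1, \quad j = 1, 2 && (3.29) \\
& \alpha_j^k \ge 0, \quad j = 1, 2, \quad k = 1, 2, 3, 4.
\end{aligned}
\]
The $\mathbf{x}_1^k$ and $\mathbf{x}_2^k$ are constant. The decision variables are the $\alpha_j^k$. However,
the formulation has fewer constraints and should be easier to solve.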
It would appear that it is necessary to find all the extreme points of the
subproblems before the solving process can be started. This is a substantial
task and would negate the gains made by the reduction in the number of
constraints. Fortunately, it is unnecessary to find all the extreme points
first; they can be found one at a time as needed. The basis for achieving
this is the revised simplex method.
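We begin by introducing slack variables into (3.28):
\[
\begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} \sum_{k=1}^{4} \alpha_1^k \mathbf{x}_1^k + \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix} \sum_{k=1}^{4} \alpha_2^k \mathbf{x}_2^k + \begin{pmatrix} x_5 \\ x_6 \end{pmatrix} = \begin{pmatrix} 20 \\ 30 \end{pmatrix}.
\]
Define the actual variable values at the extreme points as follows:
\[
\mathbf{x}_1^k = \begin{pmatrix} x_1^k \\ x_2^k \end{pmatrix}, \qquad \mathbf{x}_2^k = \begin{pmatrix} x_3^k \\ x_4^k \end{pmatrix}, \qquad k = 1, 2, 3, 4.
\]
On substituting these into problem (3.27)-(3.29), we obtain the following
problem:
\[
\begin{aligned}
\text{Maximize:} \quad & (4, 3) \sum_{k=1}^{4} \alpha_1^k \begin{pmatrix} x_1^k \\ x_2^k \end{pmatrix} + (\tfrac{57}{23}, \tfrac{76}{23}) \sum_{k=1}^{4} \alpha_2^k \begin{pmatrix} x_3^k \\ x_4^k \end{pmatrix} \\
\text{subject to:} \quad & \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} \sum_{k=1}^{4} \alpha_1^k \begin{pmatrix} x_1^k \\ x_2^k \end{pmatrix} + \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix} \sum_{k=1}^{4} \alpha_2^k \begin{pmatrix} x_3^k \\ x_4^k \end{pmatrix} + \begin{pmatrix} x_5 \\ x_6 \end{pmatrix} = \begin{pmatrix} 20 \\ 30 \end{pmatrix} \\
& \alpha_1^1 + \alpha_1^2 + \alpha_1^3 + \alpha_1^4 = 1 \\
& \alpha_2^1 + \alpha_2^2 + \alpha_2^3 + \alpha_2^4 = 1 \\
& \alpha_j^k \ge 0, \quad \text{for all } j, k.
\end{aligned}
\]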

On rearrangement, this becomes
\[
\begin{aligned}
\text{Maximize:} \quad & \sum_{k=1}^{4} (4x_1^k + 3x_2^k)\,\alpha_1^k + \sum_{k=1}^{4} (\tfrac{57}{23}x_3^k + \tfrac{76}{23}x_4^k)\,\alpha_2^k && (3.30) \\
\text{subject to:} \quad & \sum_{k=1}^{4} (2x_1^k + 3x_2^k)\,\alpha_1^k + \sum_{k=1}^{4} (4x_3^k + x_4^k)\,\alpha_2^k + x_5 = 20 && (3.31) \\
& \sum_{k=1}^{4} (\tfrac{46}{19}x_1^k + \tfrac{46}{19}x_2^k)\,\alpha_1^k + \sum_{k=1}^{4} (6x_3^k + 8x_4^k)\,\alpha_2^k + x_6 = 30 && (3.32) \\
& \alpha_1^1 + \alpha_1^2 + \alpha_1^3 + \alpha_1^4 = 1 \\
& \alpha_2^1 + \alpha_2^2 + \alpha_2^3 + \alpha_2^4 = 1 \\
& \alpha_j^k \ge 0, \quad \text{for all } j, k.
\end{aligned}
\]
Now on examining the subproblems it can be seen that $(0, 0)^T$ is an extreme
point for both problems. Let
\[
\begin{pmatrix} x_1^1 \\ x_2^1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]
and
\[
\begin{pmatrix} x_3^1 \\ x_4^1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]
These points are associated with $\alpha_1^1$ and $\alpha_2^1$, respectively. These values of
$x_1^1$, $x_2^1$, $x_3^1$, and $x_4^1$ cause $\alpha_1^1$ and $\alpha_2^1$ to drop out of (3.31) and (3.32). Thus a
suitable initial basis is
\[
\beta_1 = \{x_5, x_6, \alpha_1^1, \alpha_2^1\}.
\]
Hence, in terms of the revised simplex method,
\[
B = I = B^{-1}, \qquad x_B = [20, 30, 1, 1]^T, \qquad c_B = [0, 0, 0, 0]^T.
\]
The subscript of the $B$, denoting the iteration number, has been dropped,
as we shall use it for another purpose soon.
We must now decide whether or not this solution is optimal. This is done
in the revised simplex method by examining the sign of the minimum
element in the $x_0$ row. Let $\bar{c}_{jk}$ be the $x_0$-row coefficient of $\alpha_j^k$. Now the
evaluation of the $\bar{c}_{jk}$ depends upon the extreme points $\mathbf{x}_j^k$. However, rather
than determining all extreme points one can simply find the extreme point
for each subproblem $j$ which yields the smallest $\bar{c}_{jk}$. Now remember that the
$\mathbf{x}_j^k$ are the extreme points for subproblem $j$, which has the following feasible
region:
\[
A_{N+j}\,\mathbf{x}_j \le b_j, \qquad \mathbf{x}_j \ge 0.
\]
Thus the problem of finding the extreme point for each subproblem $j$ yielding
the smallest $\bar{c}_{jk}$ reduces to solving a number of L.P.'s of the following form:
\[
\begin{aligned}
\text{Minimize:} \quad & \bar{c}_{jk} && (3.33) \\
\text{subject to:} \quad & A_{N+j}\,\mathbf{x}_j \le b_j \\
& \mathbf{x}_j \ge 0.
\end{aligned}
\]
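At any iteration of the revised simplex method, an expression for the $\bar{c}_{jk}$
can be found as follows. Recall that the $x_0$-row coefficient $\bar{c}_j$ of each nonbasic
variable $x_j$ in an ordinary problem is found by
\[
\bar{c}_j = c_j - \pi_i C_j,
\]
where
\[
\begin{aligned}
c_j &= \text{the original coefficient of } x_j, \\
C_j &= \text{the jth column of } A, \\
\pi_i &= c_B^T B^{-1}, \text{ the current simplex multipliers.}
\end{aligned}
\]
Now for our original problem let $p$ be the number of global constraints. In
the present example, $p = 2$. Let
\[
\begin{aligned}
(B^{-1})^p &= \text{the first } p \text{ columns of } B^{-1}, \\
B_j^{-1} &= \text{the jth column of } B^{-1}, \text{ considered as a vector.}
\end{aligned}
\]
Then
\[
\begin{aligned}
\bar{c}_{jk} &= c_B^T (B^{-1})^p A_j \mathbf{x}_j^k + c_B^T B_{p+j}^{-1} - c_j^T \mathbf{x}_j^k \\
&= \left( c_B^T (B^{-1})^p A_j - c_j^T \right) \mathbf{x}_j^k + c_B^T B_{p+j}^{-1}.
\end{aligned}
\]
This expression can be substituted into (3.33) to produce the following
formulation:
\[
\begin{aligned}
\text{Minimize:} \quad & \left( c_B^T (B^{-1})^p A_j - c_j^T \right) \mathbf{x}_j + c_B^T B_{p+j}^{-1} \\
\text{subject to:} \quad & A_{N+j}\,\mathbf{x}_j \le b_j \\
& \mathbf{x}_j \ge 0.
\end{aligned}
\]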
The optimal (minimal) solution value of this L.P. corresponds to the mini-
mum xo-row coefficient in the original problem. If it is negative, the optimal
solution to (3.33) corresponds to the extreme point we are trying to find.
Thus an L.P. of the form of (3.33) must be solved for each subproblem.
If all solution values are nonnegative, no further iterations are required.
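The objective function of such a pricing L.P. is easily assembled from the current basis inverse. The sketch below is illustrative only: the function name `pricing_objective` is an assumption, the data are those of the example after its first change of basis (basis x5, α with superscript 2 of subproblem 2, and the two origin weights), and the minimum is taken over a list of the extreme points of subproblem 1 rather than by solving an L.P.

```python
from fractions import Fraction as F

def pricing_objective(c_B, B_inv, A_j, c_j, j, p):
    """Coefficients and constant of (c_B^T (B^-1)^p A_j - c_j^T) x_j + c_B^T B^-1_{p+j}.

    j is the 1-based subproblem index and p the number of global constraints.
    """
    m = len(c_B)
    # pi = c_B^T B^-1; its first p entries multiply the global rows A_j.
    pi = [sum(c_B[i] * B_inv[i][k] for i in range(m)) for k in range(m)]
    coeffs = [sum(pi[r] * A_j[r][v] for r in range(p)) - c_j[v]
              for v in range(len(c_j))]
    return coeffs, pi[p + j - 1]

c_B = [F(0), F(1064, 23), F(0), F(0)]
B_inv = [[F(1), F(-1, 8), F(0), F(0)],
         [F(0), F(1, 112), F(0), F(0)],
         [F(0), F(0), F(1), F(0)],
         [F(0), F(-1, 112), F(0), F(1)]]
A_1 = [[F(2), F(3)], [F(46, 19), F(46, 19)]]
A_2 = [[F(4), F(1)], [F(6), F(8)]]

coeffs1, const1 = pricing_objective(c_B, B_inv, A_1, [F(4), F(3)], j=1, p=2)
coeffs2, const2 = pricing_objective(c_B, B_inv, A_2, [F(57, 23), F(76, 23)], j=2, p=2)

# Minimum of the subproblem 1 objective over its extreme points.
vertices1 = [(F(0), F(0)), (F(2), F(0)), (F(0), F(3)), (F(4, 5), F(12, 5))]
best1 = min(vertices1, key=lambda v: coeffs1[0] * v[0] + coeffs1[1] * v[1])
value1 = coeffs1[0] * best1[0] + coeffs1[1] * best1[1] + const1
print(coeffs1, const1, coeffs2, const2, best1, value1)
```

A negative minimum, as obtained here for subproblem 1, signals that the corresponding extreme point supplies a column that should enter the basis.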
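Returning to the example problem, the L.P. of the form (3.33) for the first
subproblem is
\[
\begin{aligned}
\text{Minimize:} \quad & \left( (0, 0, 0, 0) \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} - (4, 3) \right) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + (0, 0, 0, 0) \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} = x_0' \\
\text{subject to:} \quad & \begin{pmatrix} 3 & 4 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \le \begin{pmatrix} 12 \\ 8 \end{pmatrix} \\
& \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \ge \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\end{aligned}
\]
This problem can be solved graphically by examining Figure 2.1. More
general problems can be solved by the simplex method. The optimal solution is
\[
x_1^* = \tfrac{4}{5}, \qquad x_2^* = \tfrac{12}{5},
\]
and
\[
(x_0')^* = -\tfrac{52}{5}.
\]
This corresponds to an extreme point of subproblem 1, say $\mathbf{x}_1^2$. Therefore
\[
\mathbf{x}_1^2 = [\tfrac{4}{5}, \tfrac{12}{5}]^T
\]
and
\[
\bar{c}_{12} = -\tfrac{52}{5} < 0.
\]
Hence optimality has not been reached.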
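The L.P. associated with the second subproblem is
\[
\begin{aligned}
\text{Minimize:} \quad & \left( (0, 0, 0, 0) \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix} - (\tfrac{57}{23}, \tfrac{76}{23}) \right) \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} + (0, 0, 0, 0) \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = x_0' \\
\text{subject to:} \quad & \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \le \begin{pmatrix} 24 \\ 14 \end{pmatrix} \\
& \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \ge \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\end{aligned}
\]
This problem can also be solved graphically to yield the following optimal
solution:
\[
x_3^* = 0, \qquad x_4^* = 14, \qquad (x_0')^* = -\tfrac{1064}{23}.
\]
Let this solution correspond to the solution $\mathbf{x}_2^2$ of subproblem 2:
\[
\mathbf{x}_2^2 = [x_3^2, x_4^2]^T = [0, 14]^T \quad \text{and} \quad \bar{c}_{22} = -\tfrac{1064}{23} < 0.
\]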

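As this value $\bar{c}_{22}$ represents the minimum $x_0$-row coefficient, and it is
negative, its variable $\alpha_2^2$ enters the basis. Following the steps of the revised
simplex method we next calculate the $\alpha_2^2$ column in the tableau. On looking at
(3.31) and (3.32) it can be seen that this column is
\[
[4x_3^2 + x_4^2,\ 6x_3^2 + 8x_4^2,\ 0,\ 1]^T = [14, 112, 0, 1]^T.
\]
The r.h.s. column is $(20, 30, 1, 1)^T$. On taking the ratios:
\[
\left( \tfrac{20}{14}, \tfrac{30}{112}, -, \tfrac{1}{1} \right)
\]
it can be seen that the minimum corresponds to variable $x_6$, which now
leaves the basis. Therefore,
\[
\beta_2 = \{x_5, \alpha_2^2, \alpha_1^1, \alpha_2^1\}
\]
and
\[
c_B = (0, \tfrac{1064}{23}, 0, 0)^T.
\]
$B^{-1}$ is now updated and becomes
\[
B^{-1} = \begin{pmatrix} 1 & -\tfrac{1}{8} & 0 & 0 \\ 0 & \tfrac{1}{112} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -\tfrac{1}{112} & 0 & 1 \end{pmatrix}.
\]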
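We must now test whether $\beta_2$ corresponds to an optimal solution. The
L.P. for subproblem 1 is:
\[
\begin{aligned}
\text{Minimize:} \quad & \left( (0, \tfrac{1064}{23}, 0, 0) \begin{pmatrix} 1 & -\tfrac{1}{8} \\ 0 & \tfrac{1}{112} \\ 0 & 0 \\ 0 & -\tfrac{1}{112} \end{pmatrix} \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} - (4, 3) \right) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + (0, \tfrac{1064}{23}, 0, 0) \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} = x_0' \\
\text{subject to:} \quad & \begin{pmatrix} 3 & 4 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \le \begin{pmatrix} 12 \\ 8 \end{pmatrix}, \qquad \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \ge 0.
\end{aligned}
\]
Now
\[
x_0' = \left( (0, \tfrac{19}{46}) \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} - (4, 3) \right) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + 0 = -3x_1 - 2x_2.
\]
The optimal solution to this problem is
\[
x_1^* = \tfrac{4}{5}, \qquad x_2^* = \tfrac{12}{5},
\]
and
\[
x_0'^* = -\tfrac{36}{5}.
\]
As $x_0'^*$ is negative, the optimal solution has not been reached. The L.P.
problem for subproblem 2 is
\[
\begin{aligned}
\text{Minimize:} \quad & \left( (0, \tfrac{19}{46}) \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix} - (\tfrac{57}{23}, \tfrac{76}{23}) \right) \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} + (0, \tfrac{1064}{23}, 0, 0) \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = x_0' = 0 \\
\text{subject to:} \quad & \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \le \begin{pmatrix} 24 \\ 14 \end{pmatrix}, \qquad \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \ge \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\end{aligned}
\]
Hence the minimum solution is provided by subproblem 1, and corresponds
to the extreme point
\[
\mathbf{x}_1^2 = [x_1^2, x_2^2]^T = [\tfrac{4}{5}, \tfrac{12}{5}]^T
\]
and
\[
\bar{c}_{12} = -\tfrac{36}{5}.
\]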

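The corresponding variable $\alpha_1^2$ enters the basis. We next calculate the $\alpha_1^2$
column in the tableau, which is
\[
B^{-1} \begin{pmatrix} 2x_1^2 + 3x_2^2 \\ \tfrac{46}{19}x_1^2 + \tfrac{46}{19}x_2^2 \\ 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} 1 & -\tfrac{1}{8} & 0 & 0 \\ 0 & \tfrac{1}{112} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -\tfrac{1}{112} & 0 & 1 \end{pmatrix} \begin{pmatrix} \tfrac{44}{5} \\ \tfrac{736}{95} \\ 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} \tfrac{744}{95} \\ \tfrac{46}{665} \\ 1 \\ -\tfrac{46}{665} \end{pmatrix}.
\]
The r.h.s. column is
\[
B^{-1} b = \begin{pmatrix} 1 & -\tfrac{1}{8} & 0 & 0 \\ 0 & \tfrac{1}{112} & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -\tfrac{1}{112} & 0 & 1 \end{pmatrix} \begin{pmatrix} 20 \\ 30 \\ 1 \\ 1 \end{pmatrix}
= \begin{pmatrix} \tfrac{65}{4} \\ \tfrac{15}{56} \\ 1 \\ \tfrac{41}{56} \end{pmatrix}.
\]
On taking the ratios:
\[
\left( \tfrac{6175}{2976}, \tfrac{9975}{2576}, \tfrac{1}{1}, - \right)
\]
it can be seen that the minimum corresponds to variable $\alpha_1^1$, which now
leaves the basis. Therefore
\[
\beta_3 = \{x_5, \alpha_2^2, \alpha_1^2, \alpha_2^1\}
\]
and
\[
c_B = (0, \tfrac{1064}{23}, \tfrac{52}{5}, 0)^T.
\]
$B^{-1}$ is now updated and becomes:
\[
B^{-1} = \begin{pmatrix} 1 & -\tfrac{1}{8} & -\tfrac{744}{95} & 0 \\ 0 & \tfrac{1}{112} & -\tfrac{46}{665} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -\tfrac{1}{112} & \tfrac{46}{665} & 1 \end{pmatrix}.
\]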

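We now test whether $\beta_3$ corresponds to an optimal solution. The L.P.
for subproblem 1 is
\[
\begin{aligned}
\text{Minimize:} \quad & \left( (0, \tfrac{1064}{23}, \tfrac{52}{5}, 0) \begin{pmatrix} 1 & -\tfrac{1}{8} \\ 0 & \tfrac{1}{112} \\ 0 & 0 \\ 0 & -\tfrac{1}{112} \end{pmatrix} \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} - (4, 3) \right) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + (0, \tfrac{1064}{23}, \tfrac{52}{5}, 0) \begin{pmatrix} -\tfrac{744}{95} \\ -\tfrac{46}{665} \\ 1 \\ \tfrac{46}{665} \end{pmatrix} = x_0' \\
\text{subject to:} \quad & \begin{pmatrix} 3 & 4 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \le \begin{pmatrix} 12 \\ 8 \end{pmatrix}, \qquad \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \ge 0.
\end{aligned}
\]
Now
\[
x_0' = \left( (0, \tfrac{19}{46}) \begin{pmatrix} 2 & 3 \\ \tfrac{46}{19} & \tfrac{46}{19} \end{pmatrix} - (4, 3) \right) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + \tfrac{36}{5} = -3x_1 - 2x_2 + \tfrac{36}{5},
\]
and the optimal solution is
\[
x_1^* = \tfrac{4}{5}, \qquad x_2^* = \tfrac{12}{5}, \qquad x_0'^* = 0.
\]
The L.P. problem for subproblem 2 is
\[
\begin{aligned}
\text{Minimize:} \quad & \left( (0, \tfrac{19}{46}) \begin{pmatrix} 4 & 1 \\ 6 & 8 \end{pmatrix} - (\tfrac{57}{23}, \tfrac{76}{23}) \right) \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} + (0, \tfrac{1064}{23}, \tfrac{52}{5}, 0) \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix} = x_0' = 0 \\
\text{subject to:} \quad & \begin{pmatrix} 2 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \le \begin{pmatrix} 24 \\ 14 \end{pmatrix}, \qquad \begin{pmatrix} x_3 \\ x_4 \end{pmatrix} \ge \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\end{aligned}
\]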

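Hence the minimum solution is provided by subproblem 1. We have arrived
at an optimum as none of the $x_0$-row coefficients are negative.
The optimal solution is
\[
x_B = B^{-1} b = \begin{pmatrix} 1 & -\tfrac{1}{8} & -\tfrac{744}{95} & 0 \\ 0 & \tfrac{1}{112} & -\tfrac{46}{665} & 0 \\ 0 & 0 & 1 & 0 \\ 0 & -\tfrac{1}{112} & \tfrac{46}{665} & 1 \end{pmatrix} \begin{pmatrix} 20 \\ 30 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \tfrac{3199}{380} \\ \tfrac{151}{760} \\ 1 \\ \tfrac{609}{760} \end{pmatrix}.
\]
Therefore
\[
\mathbf{x}_1^* = \begin{pmatrix} x_1^* \\ x_2^* \end{pmatrix} = \sum_{k=1}^{4} \alpha_1^k \mathbf{x}_1^k = 0 + 1 \times \begin{pmatrix} \tfrac{4}{5} \\ \tfrac{12}{5} \end{pmatrix} + 0 + 0 = \begin{pmatrix} \tfrac{4}{5} \\ \tfrac{12}{5} \end{pmatrix},
\]
\[
\mathbf{x}_2^* = \begin{pmatrix} x_3^* \\ x_4^* \end{pmatrix} = \sum_{k=1}^{4} \alpha_2^k \mathbf{x}_2^k = \tfrac{609}{760} \begin{pmatrix} 0 \\ 0 \end{pmatrix} + \tfrac{151}{760} \begin{pmatrix} 0 \\ 14 \end{pmatrix} + 0 + 0 = \begin{pmatrix} 0 \\ \tfrac{1057}{380} \end{pmatrix},
\]
\[
x_5^* = \tfrac{3199}{380}, \quad \alpha_2^{2*} = \tfrac{151}{760}, \quad \alpha_1^{2*} = 1, \quad \alpha_2^{1*} = \tfrac{609}{760}, \quad x_6^* = 0,
\]
\[
x_0^* = [0, \tfrac{1064}{23}, \tfrac{52}{5}, 0] \begin{pmatrix} \tfrac{3199}{380} \\ \tfrac{151}{760} \\ 1 \\ \tfrac{609}{760} \end{pmatrix} = \tfrac{2253}{115} \approx 19.59.
\]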
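As a check on the arithmetic, the original variables can be rebuilt from the optimal weights and substituted into the constraints of the original problem. The following sketch assumes the weights 1 on the extreme point (4/5, 12/5) of subproblem 1, and 609/760 and 151/760 on the extreme points (0, 0) and (0, 14) of subproblem 2; the helper name `combine` is illustrative.

```python
from fractions import Fraction as F

# Optimal weights attached to the extreme points of each subproblem.
alpha1 = {(F(4, 5), F(12, 5)): F(1)}
alpha2 = {(F(0), F(0)): F(609, 760), (F(0), F(14)): F(151, 760)}

def combine(weights):
    """Convex combination of the extreme points stored as dictionary keys."""
    return tuple(sum(w * pt[i] for pt, w in weights.items()) for i in range(2))

x1, x2 = combine(alpha1)
x3, x4 = combine(alpha2)

# Slack variables of the global constraints (3.14) and (3.15).
x5 = 20 - (2 * x1 + 3 * x2 + 4 * x3 + x4)
x6 = 30 - (F(46, 19) * (x1 + x2) + 6 * x3 + 8 * x4)
# Objective function (3.13).
x0 = 4 * x1 + 3 * x2 + F(57, 23) * x3 + F(76, 23) * x4

print(x1, x2, x3, x4, x5, x6, x0, float(x0))
```

The slack values agree with the basic solution above, the gas constraint is binding, and the objective value matches 2253/115.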

The objective function hyperplane is parallel to that representing the
constraint on the gas:
\[
\tfrac{46}{19}x_1 + \tfrac{46}{19}x_2 + 6x_3 + 8x_4 \le 30.
\]
As this constraint is binding at the optimum ($x_6^* = 0$), multiple optima exist.
The complete set of solutions is given by:
x~ = 2ltl
\[
x_1^* = \tfrac{4}{5}, \qquad x_2^* = \tfrac{12}{5}, \qquad 3x_3^* + 4x_4^* = \tfrac{1057}{95},
\]

where

and

3.5.3 Summary of the Decomposition Algorithm

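Given an L.P. in the following form:

Maximize: $c^T X$

subject to:
$$
\begin{pmatrix}
A_1 & A_2 & \cdots & A_N \\
A_{N+1} & 0 & \cdots & 0 \\
0 & A_{N+2} & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & A_{2N}
\end{pmatrix} X \le b
$$
$X \ge 0$.

Let
$c = (c_1^T, c_2^T, \ldots, c_N^T)^T$
$X = (X_1, X_2, \ldots, X_N)^T$
and
$b = (b_0^T, b_1^T, \ldots, b_N^T)^T$.

Define a set of feasible points $X_j$ satisfying
$A_{N+j} X_j \le b_j$
$X_j \ge 0$.
Let the $j$th such set have $n_j$ extreme points $x_j^1, x_j^2, \ldots, x_j^{n_j}$. Then any point
$X_j$ in the $j$th set can be expressed as
$$X_j = \sum_{k=1}^{n_j} \alpha_j^k x_j^k, \qquad \sum_{k=1}^{n_j} \alpha_j^k = 1, \qquad \alpha_j^k \ge 0, \quad k = 1, 2, \ldots, n_j.$$
The given problem can now be reformulated with the introduction of a
vector $x_s$ of slack variables:

Maximize: $\sum_{j=1}^{N} \sum_{k=1}^{n_j} (c_j^T x_j^k)\,\alpha_j^k$

subject to: $\sum_{j=1}^{N} \sum_{k=1}^{n_j} (A_j x_j^k)\,\alpha_j^k + x_s = b_0$

$\sum_{k=1}^{n_j} \alpha_j^k = 1, \quad j = 1, 2, \ldots, N,$

$\alpha_j^k \ge 0, \quad j = 1, 2, \ldots, N, \quad k = 1, 2, \ldots, n_j.$

This formulation is then solved using the revised simplex method.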
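As a small illustration of the representation above, the following Python sketch forms a point of one subproblem set as a convex combination of its extreme points and computes the entries that a single extreme point contributes to the reformulated problem. All of the data (the set $x_1 + x_2 \le 4$, the linking block `A_j` and the cost block `c_j`) are hypothetical and chosen only for illustration; they are not taken from the example of this section.

```python
from fractions import Fraction as F

# Hypothetical subproblem set: x1 + x2 <= 4, x1, x2 >= 0 (not from the text).
extreme_points = [(0, 0), (4, 0), (0, 4)]
A_j = [[1, 2]]   # hypothetical linking-constraint block for this subproblem
c_j = [3, 1]     # hypothetical objective block for this subproblem

def combine(alphas):
    """X_j = sum_k alpha_k * x_j^k for weights that are >= 0 and sum to 1."""
    assert all(a >= 0 for a in alphas) and sum(alphas) == 1
    return tuple(sum(a * p[i] for a, p in zip(alphas, extreme_points))
                 for i in range(2))

def master_column(point):
    """Entries one extreme point contributes to the reformulated problem:
    A_j x_j^k for the linking rows and c_j^T x_j^k for the objective."""
    links = [sum(row[i] * point[i] for i in range(2)) for row in A_j]
    cost = sum(c_j[i] * point[i] for i in range(2))
    return links, cost

x = combine([F(1, 4), F(1, 2), F(1, 4)])
print(x)                      # the combined point
print(master_column((4, 0)))  # linking entries and objective entry
```

The weights play the role of the variables $\alpha_j^k$: the reformulated problem chooses them, rather than the original variables, subject to their summing to one for each subproblem.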

In order to calculate the minimum $x_0$-row coefficient, $N$ linear programming
problems of the form
Minimize:
subject to: $A_{N+j} X_j \le b_j$
$X_j \ge 0$
are solved, where the terms above have been defined in the previous section.
The minimum solution value obtained in solving the above problems is
equal to the minimum xo-row coefficient. If it is nonnegative, the optimal
solution has been found. Otherwise, a substitution in the set of basic vari-
ables is made in the usual way.

3.6 Parametric Programming


3.6.1 Background

In Section 2.6 the sensitivity of the optimal solution of an L.P. problem to
changes in its coefficients was discussed. It was assumed that these changes
were made one at a time. We now look at the possibility of analyzing the
effects of simultaneous changes. Only changes in objective function coeffi-
cients and r.h.s. constants will be dealt with here. The approach is to develop
techniques whereby the investigation can take place in an efficient manner,
as opposed to solving the whole problem from scratch with the new values
inserted. The techniques are collectively called parametric linear programming,
although the term linear will be dropped, as it is understood that we are
dealing solely with L.P. problems. Such methods are useful in situations in
which, because of the effects of some predictable process, many of the L.P.
parameters vary at constant rates. For example, profits or costs could vary
as a result of inflation, or daily consumption of resources may have to be
steadily reduced as the supply of raw materials dwindles. It will be assumed
that the coefficients vary linearly with time.

3.6.2 Numerical Example

3.6.2.1 Changes in the Objective Function Coefficients


Consider again Problem 2.1.
Maximize: $4x_1 + 3x_2 = x_0$
subject to: $3x_1 + 4x_2 + x_3 = 12$
$3x_1 + 3x_2 + x_4 = 10$
$4x_1 + 2x_2 + x_5 = 8$
$x_i \ge 0, \quad i = 1, 2, \ldots, 5.$

Suppose that the $x_0$-row coefficients, 4 and 3, are changing at the rates of 2
and 3 units per unit of time, respectively. Then if $\theta$ is defined as the amount
of elapsed time, after $\theta$ units of time have elapsed $x_0$ becomes
$x_0 = (4 + 2\theta)x_1 + (3 + 3\theta)x_2.$ (3.34)
Given a particular value of $\theta$, it is possible to determine the optimal
solution and its value. This has already been done for the case $\theta = 0$. We now
address ourselves to the task of using this solution to find the solution for
any other positive $\theta$ in a way which requires less work than solving the new
problem from the beginning using the simplex method.
The optimal solution to the problem when $\theta = 0$ is given in Table 2.8,
repeated here for convenience. When $\theta$ is given any nonnegative real value,
the only change in Problem 2.1 occurs in the objective function. Hence the
solution in Table 2.8 will be feasible for the problem corresponding to any
$\theta$. As $\theta$ is increased from zero to a relatively small positive value it is likely
that the present solution will remain optimal. However, it may be that as
$\theta$ is progressively increased in value there will occur a critical point at which
the present solution is no longer optimal. A new optimal solution can be
established and $\theta$ increased further. Later a new critical point may be
established. We shall now establish the ranges for $\theta$ for which the various possible
bases are optimal.

Table 2.8

Constraints   x1   x2   x3     x4   x5      r.h.s.
(2.11)        0    1    2/5    0    -3/10   12/5
(2.12)        0    0    -3/5   1    -3/10   2/5
(2.13)        1    0    -1/5   0    2/5     4/5
x0            0    0    2/5    0    7/10    52/5

Suppose Problem 2.1 has its objective function replaced by (3.34). If the
manipulations carried out in Tables 2.1-2.8 are applied to this new problem,
the final tableau will be as shown in Table 3.10. Transforming this to
canonical form, we get Table 3.11.

Table 3.10

Constraints   x1    x2    x3     x4   x5      r.h.s.
(2.11)        0     1     2/5    0    -3/10   12/5
(2.12)        0     0     -3/5   1    -3/10   2/5
(2.13)        1     0     -1/5   0    2/5     4/5
x0            -2θ   -3θ   2/5    0    7/10    52/5

Thus it can be seen that for the present basis to remain optimal all the
$x_0$-row coefficients must be nonnegative, i.e.,

$\frac{2 + 4\theta}{5} \ge 0$

and

$\frac{7 - \theta}{10} \ge 0.$

As it has been assumed that $\theta$ is nonnegative, these conditions reduce to

$\theta \le 7.$
So setting $\theta = 7$ we have reached the first critical point.
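
The transformation to canonical form can be written out term by term. As a sketch, using the constraint rows of Table 2.8: adding $3\theta$ times row (2.11) and $2\theta$ times row (2.13) to the $x_0$-row removes the entries $-3\theta$ and $-2\theta$ under the basic variables $x_2$ and $x_1$, and changes the remaining entries as follows.

$$
\begin{aligned}
x_3:&\quad \tfrac{2}{5} + 3\theta\left(\tfrac{2}{5}\right) + 2\theta\left(-\tfrac{1}{5}\right) = \frac{2 + 4\theta}{5},\\
x_5:&\quad \tfrac{7}{10} + 3\theta\left(-\tfrac{3}{10}\right) + 2\theta\left(\tfrac{2}{5}\right) = \frac{7 - \theta}{10},\\
\text{r.h.s.}:&\quad \tfrac{52}{5} + 3\theta\left(\tfrac{12}{5}\right) + 2\theta\left(\tfrac{4}{5}\right) = \frac{52 + 44\theta}{5}.
\end{aligned}
$$

The first expression is nonnegative for every $\theta \ge 0$, so only the second restricts $\theta$.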

Table 3.11

Constraints   x1   x2   x3           x4   x5          r.h.s.
(2.11)        0    1    2/5          0    -3/10       12/5
(2.12)        0    0    -3/5         1    -3/10       2/5
(2.13)        1    0    -1/5         0    2/5         4/5
x0            0    0    (2 + 4θ)/5   0    (7 - θ)/10  (52 + 44θ)/5

Substituting this value into Table 3.11, we obtain Table 3.12. This, of
course, corresponds to a situation with multiple optimal solutions. The
objective function for this value of $\theta$ is:

$x_0 = (4 + 2(7))x_1 + (3 + 3(7))x_2 = 18x_1 + 24x_2.$

It can be seen from Figure 3.1 that the increase in $\theta$ from 0 to 7 has changed
the slope of the objective function to a point where it is now parallel to (2.11).

Table 3.12

Constraints   x1   x2   x3     x4   x5      r.h.s.   Ratio
(2.11)        0    1    2/5    0    -3/10   12/5
(2.12)        0    0    -3/5   1    -3/10   2/5
(2.13)        1    0    -1/5   0    2/5     4/5      (4/5)/(2/5) = 2
x0            0    0    6      0    0       72
Figure 3.1. Parametric programming with $x_0$-row coefficient changes.
(Labels in the figure: $x_0$ with $\theta = 7$; $x_0$ with $\theta = 0$; (2.13).)

Table 3.13

Constraints   x1          x2   x3           x4   x5   r.h.s.
(2.11)        3/4         1    1/4          0    0    3
(2.12)        3/4         0    -3/4         1    0    1
(2.13)        5/2         0    -1/2         0    1    2
x0            (θ - 7)/4   0    (3 + 3θ)/4   0    0    9 + 9θ

Bringing $x_5$ into the basis at the expense of $x_1$ in Table 3.11 produces
Table 3.13. Now as it has been assumed that $\theta$ is nonnegative, all that is
required is that
$\theta \ge 7.$
For any value of $\theta$ no less than 7 the present basis remains optimal. Thus
there is only one critical point. These results are summarized in Table 3.14.

Table 3.14. Results of a Change in the Objective Function Coefficients.

        0 ≤ θ ≤ 7        7 ≤ θ
        4 ≤ c1 ≤ 18      18 ≤ c1
        3 ≤ c2 ≤ 24      24 ≤ c2
x0*     (52 + 44θ)/5     (45 + 45θ)/5
x1*     4/5              0
x2*     12/5             3
x3*     0                0
x4*     2/5              1
x5*     0                2
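
The two ranges found above can be checked numerically. The sketch below is not the parametric procedure itself: it simply enumerates the vertices of the feasible region of Problem 2.1 and picks the best one for a given $\theta$, which is practical only because the problem has two variables. The function names are illustrative. For $\theta \le 7$ the best value should be $(52 + 44\theta)/5$ at $(4/5, 12/5)$, and for $\theta \ge 7$ it should be $3(3 + 3\theta) = 9 + 9\theta$ at $(0, 3)$.

```python
from fractions import Fraction as F
from itertools import combinations

# Constraints of Problem 2.1 written as a*x1 + b*x2 <= r, plus x1, x2 >= 0.
CONSTRAINTS = [(3, 4, 12), (3, 3, 10), (4, 2, 8), (-1, 0, 0), (0, -1, 0)]

def vertices():
    """Intersect every pair of boundary lines and keep the feasible points."""
    points = set()
    for (a1, b1, r1), (a2, b2, r2) in combinations(CONSTRAINTS, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines never meet
        x1 = F(r1 * b2 - r2 * b1, det)   # Cramer's rule
        x2 = F(a1 * r2 - a2 * r1, det)
        if all(a * x1 + b * x2 <= r for a, b, r in CONSTRAINTS):
            points.add((x1, x2))
    return points

def best(theta):
    """Return (x0*, x1*, x2*) for the objective (4+2θ)x1 + (3+3θ)x2."""
    theta = F(theta)
    return max(((4 + 2 * theta) * x1 + (3 + 3 * theta) * x2, x1, x2)
               for x1, x2 in vertices())

for theta in (0, 3, 7, 10):
    print(theta, best(theta))
```

At $\theta = 7$ both vertices give the value 72, which is the multiple-optima situation of Table 3.12.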

3.6.2.2 Changes in the r.h.s. Constants


Suppose now that the r.h.s. constants in Problem 2.1 are increasing from
12, 10, and 8 at the rates of 2, 2, and 3 units per unit of time, respectively.
Then after $\theta$ units of time have elapsed, these r.h.s. constants are, respectively,

$12 + 2\theta$, $10 + 2\theta$, and $8 + 3\theta$.

Once again we wish to identify the optimal solution and its value for
increasing $\theta$. The optimal solution when $\theta = 0$ is given in Table 2.8. Now
suppose the new r.h.s. constants for a given positive value of $\theta$ are
introduced to the problem and this new problem has the same simplex iterations
applied to it as produced Table 2.8. Then if the methods of Section 2.6.2.2
are used repeatedly for each r.h.s. constant, the new tableau will be as shown
in Table 3.15. For this basis to remain feasible, all r.h.s. values must be
nonnegative. Therefore
$\frac{12}{5} + \left(\frac{4}{5} - \frac{9}{10}\right)\theta \ge 0$
$\frac{2}{5} + \left(-\frac{6}{5} + 2 - \frac{9}{10}\right)\theta \ge 0$
and
$\frac{4}{5} + \left(-\frac{2}{5} + \frac{6}{5}\right)\theta \ge 0,$
i.e.,
$\theta \le 4.$
So the first critical point occurs at $\theta = 4$. Substituting this value into
Table 3.15, we obtain Table 3.16. If $\theta$ is increased beyond 4 the solution in
Table 3.16 will become infeasible, as $x_4$ will become negative. Thus $x_4$
should leave the basis. This is done using the dual simplex method, which is
explained in Section 3.3.

Table 3.15

Constraints   x1   x2   x3     x4   x5      r.h.s.
(2.11)        0    1    2/5    0    -3/10   12/5 + ((2/5)(2) + 0(2) - (3/10)(3))θ
(2.12)        0    0    -3/5   1    -3/10   2/5 + (-(3/5)(2) + 1(2) - (3/10)(3))θ
(2.13)        1    0    -1/5   0    2/5     4/5 + (-(1/5)(2) + 0(2) + (2/5)(3))θ
x0            0    0    2/5    0    7/10    52/5 + ((2/5)(2) + 0(2) + (7/10)(3))θ

Table 3.16

Constraints   x1   x2   x3     x4   x5      r.h.s.
(2.11)        0    1    2/5    0    -3/10   2
(2.12)        0    0    -3/5   1    -3/10   0
(2.13)        1    0    -1/5   0    2/5     4
x0            0    0    2/5    0    7/10    22
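
The r.h.s. column of Table 3.15 is the product of the $x_3$, $x_4$, $x_5$ columns of Table 2.8 with the vector of new constants. The following sketch, with illustrative names, computes each r.h.s. entry as a constant plus a multiple of $\theta$ and then finds the largest $\theta$ for which all entries stay nonnegative.

```python
from fractions import Fraction as F

# Columns x3, x4, x5 of Table 2.8, one row per basic variable (x2, x4, x1).
B_INV = [[F(2, 5), F(0), F(-3, 10)],
         [F(-3, 5), F(1), F(-3, 10)],
         [F(-1, 5), F(0), F(2, 5)]]
b = [F(12), F(10), F(8)]      # original r.h.s. constants
delta = [F(2), F(2), F(3)]    # rates of change per unit of time

def rhs_lines():
    """Return (const, slope) per basic variable: value = const + slope*theta."""
    return [(sum(r * v for r, v in zip(row, b)),
             sum(r * v for r, v in zip(row, delta))) for row in B_INV]

def largest_feasible_theta(lines):
    """Largest theta keeping every const + slope*theta >= 0 (None if unbounded)."""
    bounds = [-const / slope for const, slope in lines if slope < 0]
    return min(bounds) if bounds else None

lines = rhs_lines()
print(lines)
print(largest_feasible_theta(lines))
```

The entry that reaches zero first belongs to the second row, that is, to $x_4$, which is the variable chosen to leave the basis.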

Thus $x_3$ enters the basis in Table 3.15 at the expense of $x_4$, producing
Table 3.17. This basis will remain feasible if all the r.h.s. values are
nonnegative:
$\frac{16 - \theta}{6} \ge 0$
$\frac{\theta - 4}{6} \ge 0$
$\frac{4 + 5\theta}{6} \ge 0,$
i.e.,
$4 \le \theta \le 16.$

Table 3.17

Constraints   x1   x2   x3   x4     x5     r.h.s.
(2.11)        0    1    0    2/3    -1/2   (16 - θ)/6
(2.12)        0    0    1    -5/3   1/2    (θ - 4)/6
(2.13)        1    0    0    -1/3   1/2    (4 + 5θ)/6
x0            0    0    0    2/3    1/2    (64 + 17θ)/6

Table 3.18

Constraints   x1   x2   x3   x4     x5   r.h.s.
(2.11)        0    -2   0    -4/3   1    (θ - 16)/3
(2.12)        0    1    1    -1     0    2
(2.13)        1    1    0    1/3    0    (10 + 2θ)/3
x0            0    1    0    4/3    0    (40 + 8θ)/3

So the second critical point occurs at $\theta = 16$. If $\theta$ is increased beyond 16 the
solution in Table 3.17 will become infeasible, as $x_2$ will become negative.
Thus $x_2$ should leave the basis.
Now $x_5$ enters the basis at the expense of $x_2$, producing Table 3.18. This
basis will remain feasible if all r.h.s. values are nonnegative:
$\frac{\theta - 16}{3} \ge 0$
$\frac{10 + 2\theta}{3} \ge 0,$
i.e.,
$\theta \ge 16.$
For any value of $\theta$ no less than 16 the present basis remains feasible. Thus
there are two critical points. These results are summarized in Table 3.19.
Changing r.h.s. constants in an L.P. problem is equivalent to changing
objective function coefficients in the dual. Hence let us re-solve the problem
just analyzed by examining its dual. The dual of the problem is
Minimize: $12y_1 + 10y_2 + 8y_3 = y_0$
subject to: $3y_1 + 3y_2 + 4y_3 - y_4 = 4$
$4y_1 + 3y_2 + 2y_3 - y_6 = 3$
$y_1, y_2, y_3, y_4, y_6 \ge 0.$
Changing the r.h.s. constants of the primal by 2, 2, and 3 units per unit of
time corresponds to changing the dual $y_0$-row coefficients by the same
amounts. The optimal solution to the above problem is given in Table 3.20.
Following the ideas of Section 3.6.2.1, when $\theta$ is introduced the objective
becomes
Minimize: $(12 + 2\theta)y_1 + (10 + 2\theta)y_2 + (8 + 3\theta)y_3,$
i.e.,
Maximize: $y_0' = -(12 + 2\theta)y_1 - (10 + 2\theta)y_2 - (8 + 3\theta)y_3.$ (3.35)

Table 3.19. Results of Changes in the r.h.s. Constants.

        0 ≤ θ ≤ 4         4 ≤ θ ≤ 16       16 ≤ θ
        12 ≤ b1 ≤ 20      20 ≤ b1 ≤ 44     44 ≤ b1
        10 ≤ b2 ≤ 18      18 ≤ b2 ≤ 42     42 ≤ b2
        8 ≤ b3 ≤ 20       20 ≤ b3 ≤ 56     56 ≤ b3
x0*     (104 + 29θ)/10    (64 + 17θ)/6     (40 + 8θ)/3
x1*     (4 + 4θ)/5        (4 + 5θ)/6       (10 + 2θ)/3
x2*     (24 - θ)/10       (16 - θ)/6       0
x3*     0                 (θ - 4)/6        2
x4*     (4 - θ)/10        0                0
x5*     0                 0                (θ - 16)/3
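
A direct substitution gives a quick check on these results. The sketch below, with illustrative function names, takes $x_1$ and $x_2$ from the r.h.s. columns of Tables 3.15, 3.17 and 3.18, substitutes them into Problem 2.1 with the r.h.s. constants $12 + 2\theta$, $10 + 2\theta$, $8 + 3\theta$, and reports the objective value and the slack variables. It confirms feasibility and the value of $x_0$ only; it does not by itself prove optimality.

```python
from fractions import Fraction as F

def basic_solution(theta):
    """(x1, x2) read from the r.h.s. columns of Tables 3.15, 3.17 and 3.18."""
    t = F(theta)
    if t <= 4:
        return (4 + 4 * t) / 5, (24 - t) / 10
    if t <= 16:
        return (4 + 5 * t) / 6, (16 - t) / 6
    return (10 + 2 * t) / 3, F(0)

def evaluate(theta):
    """Return (x0, (x3, x4, x5)) for Problem 2.1 with r.h.s. b + theta*delta."""
    t = F(theta)
    x1, x2 = basic_solution(t)
    slacks = (12 + 2 * t - 3 * x1 - 4 * x2,   # x3
              10 + 2 * t - 3 * x1 - 3 * x2,   # x4
              8 + 3 * t - 4 * x1 - 2 * x2)    # x5
    return 4 * x1 + 3 * x2, slacks

for theta in (2, 10, 19):
    x0, slacks = evaluate(theta)
    assert min(slacks) >= 0   # the point satisfies every constraint
    print(theta, x0, slacks)
```

At the critical points $\theta = 4$ and $\theta = 16$ the formulas of adjacent ranges give the same point, as they must.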

Table 3.20

Constraints   y1   y2     y3   y4     y6     r.h.s.
(3.09)        0    3/10   1    -2/5   3/10   7/10
(3.10)        1    3/5    0    1/5    -2/5   2/5
y0'           0    2/5    0    4/5    12/5   -52/5

When the manipulations applied to the dual to produce Table 3.20 are
applied to the problem with (3.35) as an objective function, Table 3.21 is
produced. Transforming this into canonical form, we obtain Table 3.22.

Table 3.21

Constraints   y1   y2         y3   y4     y6     r.h.s.
(3.09)        0    3/10       1    -2/5   3/10   7/10
(3.10)        1    3/5        0    1/5    -2/5   2/5
y0'           2θ   2/5 + 2θ   3θ   4/5    12/5   -52/5

Table 3.22

Constraints   y1   y2           y3   y4            y6            r.h.s.
(3.09)        0    3/10         1    -2/5          3/10          7/10
(3.10)        1    3/5          0    1/5           -2/5          2/5
y0'           0    2/5 - θ/10   0    4/5 + 4θ/5    12/5 - θ/10   -52/5 - 29θ/10

Hence for the present basis to remain optimal all the $y_0'$-row coefficients
must be nonnegative:
$\frac{2}{5} - \frac{\theta}{10} \ge 0$
$\frac{4}{5} + \frac{4\theta}{5} \ge 0$
and
$\frac{12}{5} - \frac{\theta}{10} \ge 0,$
i.e.,
$\theta \le 4.$

Thus $\theta = 4$ is the first critical point. When $\theta > 4$ the objective function
coefficient of $y_2$ is negative, so $y_2$ enters the basis, as in Table 3.23. For this
basis to remain optimal all the $y_0'$-row coefficients must be nonnegative:

$\frac{\theta - 4}{6} \ge 0$
$\frac{4 + 5\theta}{6} \ge 0$
$\frac{16 - \theta}{6} \ge 0,$

i.e.,
$4 \le \theta \le 16.$

Table 3.23

Constraints   y1          y2   y3   y4           y6           r.h.s.
(3.09)        -1/2        0    1    -1/2         1/2          1/2
(3.10)        5/3         1    0    1/3          -2/3         2/3
y0'           (θ - 4)/6   0    0    (4 + 5θ)/6   (16 - θ)/6   -(64 + 17θ)/6

Table 3.24

Constraints   y1   y2   y3           y4            y6   r.h.s.
(3.09)        -1   0    2            -1            1    1
(3.10)        1    1    4/3          -1/3          0    4/3
y0'           2    0    (θ - 16)/3   (10 + 2θ)/3   0    -(40 + 8θ)/3

Thus $\theta = 16$ is the second critical point. When $\theta > 16$ the objective function
coefficient of $y_6$ is negative, so $y_6$ enters the basis, as in Table 3.24. For this
basis to remain optimal all the $y_0'$-row coefficients must be nonnegative,
which is certainly true for $\theta \ge 16$. Thus there are only two critical points,
at 4 and 16. These results confirm what was discovered by analyzing the
primal earlier in this section.

3.6.3 Summary of Parametric Programming

3.6.3.1 Changes in the Objective Function Coefficients


Given an L.P. in the following form:

Maximize: $x_0 = \sum_{i=1}^{n} c_i x_i$ (3.36)

subject to: $\sum_{i=1}^{n} a_{ij} x_i \le b_j, \quad j = 1, 2, \ldots, m$
$x_i \ge 0, \quad i = 1, 2, \ldots, n.$ (3.37)

Suppose that the $x_0$-row coefficients $c_i$, $i = 1, 2, \ldots, n$, are changing at the
rate of $\delta_i$ units per unit of time. Then after $\theta$ units of time, $x_0$ becomes

$x_0 = \sum_{i=1}^{n} (c_i + \delta_i \theta) x_i.$

It is obvious that $x_0^*$ is a function of $\theta$. It may be desirable to find $x_0^*(\theta)$ and
to find ranges for $\theta$ for which the various possible bases are optimal. As $\theta$
represents elapsed time it is assumed that $\theta \ge 0$.
First the problem is solved for $\theta = 0$. Then $\theta$ at the positive level is
introduced. As the only change in the problem comes about in the objective
function, the present solution (found when $\theta = 0$) will still be feasible for the
problem when $\theta > 0$. Thus if the same manipulations used in solving the
problem when $\theta = 0$ are applied to the problem when $\theta > 0$, only changes
in the $x_0$-row will occur. The new $x_0$-row can be obtained by subtracting
$\delta_i \theta$ from the $x_0$-row coefficient of $x_i$, and then transforming the tableau into
canonical form, as explained in Section 2.5.2.
If all the new $x_0$-row coefficients are nonnegative, the present solution is
still optimal. Hence the maximum value for $\theta$, say $\theta_1$, for which
nonnegativity of all the coefficients occurs can be found; $\theta_1$ is called the first critical
value. This value is substituted into the $x_0$-row, producing at least one
nonbasic coefficient with value zero. A variable corresponding to this zero
is brought into the basis in the usual manner, and $\theta$ is introduced once
more. The process is repeated to produce further critical values until it is
obvious that further increases in the value of $\theta$ will not create a situation in
which the current basis is suboptimal. Successive tableaux can be examined
to find the ranges for $\theta$ and their corresponding solutions and values.
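
The step of finding a critical value can be sketched in a few lines of Python. The sketch assumes that each $x_0$-row coefficient in canonical form has already been written as a constant plus a multiple of $\theta$; the function name and the pair representation are illustrative choices, not notation from the text.

```python
from fractions import Fraction as F

def first_critical_value(row):
    """row: list of (const, slope) pairs, each meaning const + slope*theta.
    Returns the largest theta >= 0 keeping every entry nonnegative,
    or None if no finite bound exists."""
    bounds = []
    for const, slope in row:
        if const < 0:
            raise ValueError("row is not optimal at theta = 0")
        if slope < 0:
            # const + slope*theta >= 0  <=>  theta <= const / (-slope)
            bounds.append(F(const) / F(-slope))
    return min(bounds) if bounds else None

# Nonbasic x0-row entries of Table 3.11: (2 + 4*theta)/5 and (7 - theta)/10.
row_3_11 = [(F(2, 5), F(4, 5)), (F(7, 10), F(-1, 10))]
print(first_critical_value(row_3_11))
```

A return value of `None` corresponds to the stopping condition described above: no further increase in $\theta$ can make the current basis suboptimal.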

3.6.3.2 Changes in the r.h.s. Constants


Given an L.P. in the form of (3.36) and (3.37), suppose that the r.h.s. constants
$b_j$, $j = 1, 2, \ldots, m$, are changing at the rate of $\delta_j$ units per unit of time. Then
after $\theta$ units of time the constraints (3.37) become

$\sum_{i=1}^{n} a_{ij} x_i \le b_j + \delta_j \theta, \quad j = 1, 2, \ldots, m.$

First the problem is solved for $\theta = 0$. Then $\theta$ at the positive level is introduced.
If the present solution, found when $\theta = 0$, is still feasible it will still be optimal.
The final tableau, produced by applying the manipulations that created the
original optimum to the new problem with $\theta > 0$, can now be deduced.
This final tableau can be obtained from the original optimal tableau by
repeatedly using the considerations of Section 2.6.2.2 for each r.h.s. constant.
For this new tableau to represent an optimal solution, all the entries in the
r.h.s. column must be nonnegative. As they are functions of $\theta$, an upper
bound on $\theta$ can be obtained. That is, a value $\theta_1$ can be found such that if
$\theta > \theta_1$
at least one r.h.s. entry will be negative. $\theta_1$ is the first critical value. The
solution and its value, as functions of $\theta$, can be found from the tableau for
$0 \le \theta \le \theta_1.$
$\theta_1$ is substituted into the tableau, creating at least one zero entry in the
r.h.s. column. The dual simplex method of Section 3.3 is now applied to
effect a change of basis, with a basic variable with present value of zero
departing. When a nondegenerate basis has been found, the above procedure
is repeated and a second critical point is identified.
The process is repeated until a basis is found with the property that
further increases in the value of $\theta$ will not lead to the basis being suboptimal.

This analysis could also be carried out by taking the dual of the problem,
which is

Minimize: $\sum_{j=1}^{m} (b_j + \theta\delta_j) y_j$

subject to: $\sum_{j=1}^{m} a_{ij} y_j \ge c_i, \quad i = 1, 2, \ldots, n$

$y_j \ge 0, \quad j = 1, 2, \ldots, m.$

Then the procedure of Section 3.6.3.1 can be used.
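
For Problem 2.1 the link between the two routes can be checked directly. The sketch below takes the dual solution $y = (2/5, 0, 7/10)$ read from Table 3.20, verifies that it satisfies the dual constraints, and evaluates the dual objective with the coefficients $12 + 2\theta$, $10 + 2\theta$, $8 + 3\theta$. While this dual basis remains optimal ($0 \le \theta \le 4$) the result should agree with the primal value $(104 + 29\theta)/10$. The function names are illustrative.

```python
from fractions import Fraction as F

A = [[3, 4], [3, 3], [4, 2]]    # primal constraint rows of Problem 2.1
c = [4, 3]                      # primal objective coefficients
b0 = [12, 10, 8]                # primal r.h.s. constants at theta = 0
delta = [2, 2, 3]               # rates of change of the r.h.s. constants
y = [F(2, 5), F(0), F(7, 10)]   # dual solution read from Table 3.20

def dual_feasible(y):
    """Check sum_j a_ij * y_j >= c_i for each primal variable i, and y >= 0."""
    covers = all(sum(A[j][i] * y[j] for j in range(3)) >= c[i] for i in range(2))
    return covers and all(v >= 0 for v in y)

def dual_value(theta):
    """Dual objective sum_j (b_j + theta*delta_j) * y_j at the fixed point y."""
    return sum((b0[j] + delta[j] * F(theta)) * y[j] for j in range(3))

print(dual_feasible(y))
print([dual_value(t) for t in range(5)])
```

Because only the dual objective coefficients depend on $\theta$, the point $y$ stays feasible for every $\theta$; what changes at $\theta = 4$ is its optimality.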

3.7 Exercises
1. Solve the following problems using the revised simplex method.
(a) Maximize: $3x_1 + 2x_2 + x_3 + 2x_4$
subject to: $3x_1 + x_2 + x_3 + 2x_4 \le 9$
$x_1 + 2x_2 + x_3 + 4x_4 \le 12$
$2x_1 + x_2 + 3x_3 + x_4 \le 8$
$3x_1 + 3x_2 + 2x_3 + x_4 \le 10$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$

(b) Maximize: $2x_1 - 3x_2 + 2x_3 + 4x_4$
subject to: $2x_1 + 5x_2 + 3x_3 + 3x_4 \le 20$
$2x_1 + 4x_2 + x_3 + 6x_4 \le 20$
$2x_1 + 2x_2 + 2x_3 + 3x_4 \le 12$
$x_1 + 2x_2 + 2x_3 + 4x_4 \le 16$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$

(c) Maximize: $x_1 + 2x_2 + 3x_3 - x_4$
subject to: $x_1 + x_2 - x_3 + x_4 \le 3$
$2x_1 + 3x_3 \le 6$
$3x_1 + x_2 + 2x_3 - 2x_4 \le 10$
$2x_2 + 3x_3 + 2x_4 \le 8$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$

(d) Maximize: $4x_1 + 2x_2 + 3x_3 - x_4$
subject to: $x_1 + 2x_2 + x_4 \le 8$
$3x_1 + 2x_3 + x_4 \le 12$
$2x_1 + x_2 + 3x_3 \le 20$
$2x_2 + 2x_3 + x_4 \le 10$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$

2. Solve the problems of Exercise 1 by the regular simplex method. Compare the amount
of computational effort required with that required by the revised simplex method.

3. Solve the duals of the following problems by the dual simplex method.

(a) Maximize: $3x_1 + 2x_2 + x_3 + 2x_4$
subject to: $x_1 + 2x_2 + x_3 + 3x_4 \le 6$
$3x_1 + 4x_2 + 2x_3 + x_4 \le 8$
$2x_1 + 3x_2 + 3x_3 + x_4 \le 9$
$2x_1 + x_2 + 2x_3 + 2x_4 \le 12$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$

(b) Maximize: $2x_1 + 4x_2 + x_3 + 3x_4$
subject to: $2x_1 - x_2 + x_3 + 2x_4 \le 6$
$2x_2 - x_4 \le 1$
$x_1 + x_2 + 2x_4 \le 4$
$3x_1 + 2x_2 + 2x_3 + x_4 \le 9$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$

(c) Maximize: $2x_1 + x_2 + x_3 + x_4$
subject to: $x_1 - 2x_2 + x_3 + x_4 \le 11$
$-4x_1 - x_2 + 2x_3 \le 4$
$2x_1 - 2x_3 + x_4 \le 1$
$-x_3 + x_4 \ge 2$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$

(d) Minimize: $x_1 + 4x_2$
subject to: $x_1 + 2x_2 - x_3 + x_4 \ge 3$
$-2x_1 - x_2 + 4x_3 + x_4 \ge 2$
$x_1 + 2x_2 + x_3 \le 11$
$2x_2 + 2x_3 + x_4 \ge 8$
$x_i \ge 0, \quad i = 1, 2, 3, 4.$
4. Solve the problems of Exercise 3 by using the regular simplex method on the duals.
Compare the computation step by step for each problem.

5. Solve the following parametric programming problems where the $x_0$-row coefficient
of $x_i$ is changing at the rate of $\delta_i$ units per unit of time, where $\delta = (\delta_1, \delta_2, \delta_3)$.

(a) Maximize: $3x_1 + x_2 + 2x_3$
subject to: $2x_1 + x_2 + 4x_3 \le 10$
$x_1 + 2x_2 + x_3 \le 4$
$3x_1 - 2x_2 + x_3 \le 6$
$x_i \ge 0, \quad i = 1, 2, 3$
$\delta = (1, 2, 3).$

(b) Minimize: $x_1 + 4x_2 + x_3$
subject to: $x_1 + 2x_2 - x_3 \ge 3$
$-2x_1 - x_2 + 4x_3 \ge 1$
$x_1 + 2x_2 + x_3 \le 11$
$x_i \ge 0, \quad i = 1, 2, 3$
$\delta = (1, 6, 1).$

(c) Maximize: $x_1 + 2x_2 + 2x_3$
subject to: $2x_1 + 2x_2 - x_3 \le 8$
$2x_1 - x_2 + x_3 \le 2$
$x_i \ge 0, \quad i = 1, 2, 3$
$\delta = (2, -2, 1).$

(d) Maximize: $3x_1 + 2x_2 + x_3$
subject to: $3x_1 + x_2 + x_3 \le 9$
$x_1 + 2x_2 + x_3 \le 12$
$2x_1 + x_2 + 3x_3 \le 8$
$x_i \ge 0, \quad i = 1, 2, 3$
$\delta = (2, 3, 4).$
6. Solve each of the problems of Exercise 5 by taking the dual and using postoptimal
analysis on the r.h.s. parameters.
7. Solve each of the parametric programming problems in Exercise 5 when the r.h.s.
parameters bi increase with time at the rate of Ai units per unit of time, A = (AI' A2, A3)'
(a) A = (2, 1, 3)
(b) A = (1, -2,3)
(c) A= (-5,1,1)
(d) A = (2, - 3, 3).
Chapter 4

Integer Programming

4.1 A Simple Integer Programming Problem


The Speed of Light Freight Company has just secured a contract from a
corporation which wants its big crates of machine parts periodically shipped
from its factory to its new mineral exploration site. There are two types of
crate, A and B, weighing 3 and 4 units, with volumes of 4 and 2 units, respectively.
The company has one aircraft with a capacity of 12 and 9 units of weight
and volume, respectively. The company gains revenue of 4 and 3 units (in
hundreds of dollars), respectively, for each crate of A and B flown to the site.
As the revenue for road transport is much lower, the company would like
to make maximum revenue from its one aircraft, the remaining goods being
trucked. We can formulate this problem mathematically as follows. Let
$x_1$ = the number of crates of type A flown
$x_2$ = the number of crates of type B flown.
As 4 units are gained for one A crate flown, the revenue for $x_1$ crates is
$4x_1$. Similarly, $3x_2$ is gained for $x_2$ B crates. Thus the total return for a
policy of flying $x_1$ A crates and $x_2$ B crates is $4x_1 + 3x_2$, which we denote
by $x_0$. Now as one A crate weighs 3 units, $x_1$ A crates will weigh $3x_1$.
Similarly $x_2$ B crates weigh $4x_2$. Thus the total weight flown by the policy
is $3x_1 + 4x_2$, which must be less than or equal to 12 units. By similar
reasoning we can formulate a constraint for volume:
$4x_1 + 2x_2 \le 9.$
We are now in a position to define the problem mathematically.
Maximize: $x_0 = 4x_1 + 3x_2$ (4.1)
subject to: $3x_1 + 4x_2 \le 12$ (4.2)
$4x_1 + 2x_2 \le 9$ (4.3)
$x_1, x_2 \ge 0$ (4.4)
$x_1, x_2$ integers. (4.5)
This problem is an example of an integer programming problem. It would
be a linear programming problem if it were not for (4.5). Before going on to
develop methods which will solve this problem, let us define the general
area of combinatorial optimization of which integer programming is a part.

4.2 Combinatorial Optimization


A combinatorial optimization problem is defined as that of assigning discrete
numerical values (from a finite set of values) to a finite set of variables $X$ so
as to maximize some function $f(X)$ while satisfying a given set of constraints
on the values the variables can assume. Some problems of this type have
already been considered: the transportation problem of Section 2.7.1 and
the assignment problem of Section 2.7.2. Stated formally the combinatorial
optimization problem is

Maximize: $f(X)$
subject to: $g_j(X) = 0, \quad j = 1, 2, \ldots, m,$
$h_i(X) \le 0, \quad i = 1, 2, \ldots, k,$
$X$ a vector of integer values.

Note that there are no restrictions on the functions $f$, $g_j$, $j = 1, 2, \ldots, m$,
and $h_i$, $i = 1, 2, \ldots, k$. These functions may be nonlinear, discontinuous, or
implicit. This general problem is difficult to solve, and so we confine our
attention to a drastic simplification, which is a linear programming problem
in which at least one specified variable must have an integer value in any
feasible solution.
Let $n$ be the number of decision variables. Without loss of generality,
suppose that the first $q$ ($1 \le q \le n$) variables are constrained to be integer.
Consider the following problem:
Maximize: $C^T X$ (4.6)
subject to: $AX = B,$ (4.7)
$X \ge 0$ (4.8)
$x_1, x_2, \ldots, x_q$ integer, (4.9)
where $X = (x_1, x_2, \ldots, x_q, \ldots, x_n)^T$ and $C$ is $n \times 1$, $B$ is $m \times 1$, and $A$ is $m \times n$.
If
$q = n,$
the problem is termed an integer linear programming problem. Our air freight
problem comes into this category.
If
$1 \le q < n,$
the problem is termed a mixed integer linear programming problem.
If (4.9) is replaced by
$x_i = 0$ or $1, \quad i = 1, 2, \ldots, n,$
then the problem is termed a zero-one linear programming problem.
Of course, if
$q = 0,$
(4.9) ceases to be relevant and the problem becomes an ordinary linear
programming problem.
The transportation problem is an integer linear programming problem
and the assignment problem is a zero-one linear programming problem.
Further examples of integer linear programming problems will be given
later. Because only linear problems will be considered, we will simply refer
to an integer program (I.P.).
The formulation (4.6)-(4.9) is identical to an L.P. except for the presence
of (4.9). Because the simplex method is a very efficient way of solving an
L.P., it seems natural to ask whether this method might not be used on the
I.P., solving it by ignoring (4.9). If the solution obtained satisfies (4.9) it is
optimal. However, suppose the solution contained, for at least one $i$, $1 \le i \le q$,
$x_i = b_i,$
where $b_i$ is noninteger. In this case the L.P. solution is infeasible as an I.P.
solution. The value for each such $x_i$ could be rounded either up or down as
$x_i = [b_i]$ or $x_i = [b_i] + 1$
to achieve feasibility, where $[b_i]$ denotes the integer part of $b_i$. Sometimes
this approach yields a satisfactory solution. There are, however, problems.
Consider, for example, Figure 4.1, where the constraints for the following
small I.P. problem have been drawn:
Maximize: Xo = Xl + X2
subject to: 2Xl + 12x2 ~ 39
4X2 ~ 9
Xl,X2 ~ 0
Xl' X2 integer.
Figure 4.1. An example showing the failure to obtain a feasible I.P. solution by
rounding an L.P. solution. (Labels in the figure: optimal I.P. solution; rounded solution.)

It can be seen in Figure 4.1 that the L.P. solution is

xi = 6
x! =t;,
which is infeasible for the above I.P. The rounding of X2, either up or down,
does not produce a feasible solution. In fact the optimal I.P. solution, as
shown, is not at all close, relatively speaking, to the L.P. solution.
This example points up the pitfalls of rounding L.P. solutions to obtain
I.P. solutions. No combination of rounding either up or down of the non-
integer variables may be feasible, let alone optimal. Even when rounding
does produce feasibility, the solution may be far from optimal.
It is obvious that more sophisticated methods need to be developed if
we are to guarantee an optimal solution to an I.P. problem. Some such
methods are described in the next two sections.

4.3 Enumerative Techniques


Theoretically any I.P. problem can be solved by simply listing all possible
feasible solutions, finding the value of each one and choosing the best. Such
a technique is called exhaustive enumeration. However even for zero-one
problems, in which there are just two possibilities for each variable, as the
number of variables increases the number of possibilities quickly becomes
very large: $2^n$ for $n$ variables. The situation is far worse for general problems.
Hence it is impractical to solve anything other than trivial problems in
this way.
What can be done, however, is to examine the set of all possible solutions
in such a way that whole sets of solutions can be discarded without specific
evaluation of all the solutions in each of these sets. Thus the enumeration is
carried out implicitly, and this approach is termed implicit enumeration.
Dynamic programming, which will be covered in Chapter 6, is an example
of implicit enumeration. An implicit enumeration technique designed espe-
cially for integer programming problems, called branch and bound enumera-
tion, will be described next.
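
For a problem as small as the air freight example (4.1)-(4.5), exhaustive enumeration is easy to carry out, and a minimal Python sketch is given here for comparison with the branch and bound approach. The search ranges follow from the constraints: (4.3) forces $x_1 \le 2$ and (4.2) forces $x_2 \le 3$. The function name is illustrative.

```python
from itertools import product

def enumerate_ip():
    """List every integer point in the search box, keep the feasible ones,
    and return the best value together with the point attaining it."""
    best_value, best_point = None, None
    for x1, x2 in product(range(0, 3), range(0, 4)):
        if 3 * x1 + 4 * x2 <= 12 and 4 * x1 + 2 * x2 <= 9:
            value = 4 * x1 + 3 * x2
            if best_value is None or value > best_value:
                best_value, best_point = value, (x1, x2)
    return best_value, best_point

print(enumerate_ip())
```

Only 12 candidate points are examined here, but the number of candidates grows exponentially with the number of variables, which is why implicit enumeration is preferred.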

4.3.1 Branch and Bound Enumeration

Branch and bound enumeration is a sequential technique for solving
combinatorial optimization problems. Its use on such problems produces a
decision tree. The first iteration produces the point at which the tree is
rooted. Any subsequent iteration produces a number of new points which
are connected to the existing tree by lines which all emanate from one
existing point. A set of decisions concerning the values that the variables
can assume is associated with each point along with a bound. The bound
represents a value which is at least as good as that which could be attained
by any feasible solution obeying the set of decisions of that point. The
process begins by creating the root of the decision tree, which represents
all feasible solutions to the problem. A bounding routine calculates a bound
for this point, i.e., a bound on the optimal value. If the solution associated
with this bound is feasible it is optimal and the procedure is terminated. If
not, a partitioning routine partitions the set of feasible solutions into a
number of subsets, each represented by a distinct point in the decision tree,
all connected by lines to the parent point. The bounding routine then cal-
culates a bound for each of these points. An elimination routine discards a
point from the tree if it can be shown that no solution in its set can be optimal.
This would occur, for example, if its bound is worse than the value of a
known feasible solution. The process continues generating new points at
each iteration. Termination occurs when finally the optimal solution or
evidence that no such solution exists has been obtained.

4.3.1.1 Solving the Numerical Example by Dakin's Method


Land and Doig (1960) presented a branch and bound algorithm for solving
I.P. or mixed I.P. problems. It was found to be very difficult to program
a computer to implement it efficiently. However, Dakin (1965) introduced
a modification of their algorithm which overcame this difficulty. The
latter algorithm will be explained here.
The branch and bound decision tree built up in applying Dakin's method
to problem (4.1)-(4.5) is shown in Figure 4.2. The algorithm begins by
solving (4.1)-(4.4) as an L.P. This has the following optimal solution:
$x_1^* = \frac{6}{5}$
$x_2^* = \frac{21}{10}$
$x_0^* = \frac{111}{10}.$
If this first optimal solution had satisfied (4.5), it would have been optimal
for the I.P. and the method would have been terminated. However, as this is
not the case, we proceed. The bound of $\frac{111}{10}$ is associated with the highest
point of the decision tree, labelled a.f.s. (which means that it represents the
set of all feasible solutions). Any feasible solution for the I.P. cannot have
a value greater than this bound. As this solution is infeasible with regard
to (4.5), one of the variables with a noninteger value is arbitrarily chosen,
say $x_2$. The integer part of its value is identified. That is, we find the greatest
integer less than or equal to the current value ($\frac{21}{10}$) of $x_2$. As
$\frac{21}{10} = 2 + \frac{1}{10},$
this integer part is 2. Now, as $x_2$ must be integral in any feasible solution,
either
$x_2 \le 2$ (4.10)
or
$x_2 \ge 3.$ (4.11)
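(Figure 4.2 is a decision tree: the root, a.f.s., has bound 111/10; below it are
points (1), with bound 11, and (2), with bound 9; below point (1) are points (3),
with bound 10, and (4).)

Figure 4.2. A decision tree for Dakin's method.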

We now create two new L.P. problems, I and II:

I: (4.1), (4.2), (4.3), (4.4), and (4.10)
II: (4.1), (4.2), (4.3), (4.4), and (4.11),
that is:

PROBLEM I
Maximize: $4x_1 + 3x_2$
subject to: $3x_1 + 4x_2 \le 12$
$4x_1 + 2x_2 \le 9$
$x_2 \le 2$
$x_1, x_2 \ge 0.$

PROBLEM II
Maximize: $4x_1 + 3x_2$
subject to: $3x_1 + 4x_2 \le 12$
$4x_1 + 2x_2 \le 9$
$x_2 \ge 3$
$x_1, x_2 \ge 0.$

We have in effect partitioned the set of feasible solutions to the original
I.P. into two disjoint subsets: one comprising all the solutions where $x_2 \le 2$,
and the other all solutions where $x_2 \ge 3$. Consequently, two new points
representing these two sets of solutions are added to the decision tree in
Figure 4.2. Problems I and II are now solved. Problem I has optimal solution
$x_1^* = \frac{5}{4}$
$x_2^* = 2$
$x_0^* = 11.$
Problem II has an optimal solution
$x_1^* = 0$
$x_2^* = 3$
$x_0^* = 9.$
The solution to II satisfies (4.5) and is thus stored as the best solution
found so far for the I.P., with value 9. However the bound 11 is associated
with point 1 in the tree. Thus the possibility remains that there may be a
better I.P. solution lurking in its set. We choose $x_1$ as the noninteger valued
variable, with value $\frac{5}{4}$. As the integer part of this value is 1, we create two
constraints,
$x_1 \le 1$ and $x_1 \ge 2.$

We create two new L.P.'s:

PROBLEM III
Maximize: $4x_1 + 3x_2$
subject to: $3x_1 + 4x_2 \le 12$
$4x_1 + 2x_2 \le 9$
$x_2 \le 2$
$x_1 \le 1$
$x_1, x_2 \ge 0.$

PROBLEM IV
Maximize: $4x_1 + 3x_2$
subject to: $3x_1 + 4x_2 \le 12$
$4x_1 + 2x_2 \le 9$
$x_2 \le 2$
$x_1 \ge 2$
$x_1, x_2 \ge 0.$

These problems are now solved. Problem III has an optimal solution:
$x_1^* = 1$
$x_2^* = 2$
$x_0^* = 10.$
As this satisfies (4.5) and its value exceeds that of the best solution found so
far, it is stored as our best solution. Problem IV has value $9\frac{1}{2}$, which is less
than the value of the present incumbent. We have found a solution whose value
exceeds the bound for any other set of feasible solutions. This solution must
be optimal.
We have discovered that the company should fly one A crate and two
B crates on each trip for a maximum return of 10 units.
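
The calculation above can be reproduced with a short Python sketch. Two simplifications are assumed and are not part of the method as described in the text: each L.P. is solved by enumerating the vertices of its two-variable feasible region rather than by the simplex method, and the tree is searched depth first (the branch $x_i \le [b_i]$ before the branch $x_i \ge [b_i] + 1$), so the order in which the problems are solved differs slightly from that of the worked example. All names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations
from math import floor

C = (4, 3)  # objective 4*x1 + 3*x2
# Constraints (4.2)-(4.4) written as a*x1 + b*x2 <= r.
BASE = [(3, 4, 12), (4, 2, 9), (-1, 0, 0), (0, -1, 0)]

def solve_lp(cons):
    """Best vertex (value, x1, x2) of a bounded two-variable L.P., or None."""
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundary lines
        x = (F(r1 * b2 - r2 * b1, det), F(a1 * r2 - a2 * r1, det))
        if all(a * x[0] + b * x[1] <= r for a, b, r in cons):
            cand = (C[0] * x[0] + C[1] * x[1], x[0], x[1])
            if best is None or cand > best:
                best = cand
    return best

def dakin(cons=BASE, incumbent=None):
    """Depth-first branch and bound using the cuts x_i <= [b_i], x_i >= [b_i]+1."""
    lp = solve_lp(cons)
    if lp is None or (incumbent is not None and lp[0] <= incumbent[0]):
        return incumbent              # infeasible, or bound no better: discard
    value, *x = lp
    fractional = [i for i, v in enumerate(x) if v.denominator != 1]
    if not fractional:
        return lp                     # integer solution becomes the incumbent
    i = fractional[-1]                # branch on x2 before x1, as in the text
    lo = floor(x[i])
    unit = (1, 0) if i == 0 else (0, 1)
    down = cons + [(unit[0], unit[1], lo)]           # x_i <= [b_i]
    up = cons + [(-unit[0], -unit[1], -(lo + 1))]    # x_i >= [b_i] + 1
    for branch in (down, up):
        incumbent = dakin(branch, incumbent)
    return incumbent

print(solve_lp(BASE))   # the L.P. bound at the root of the tree
print(dakin())          # best integer solution found
```

With this search order Problem II is discarded by its bound of 9 rather than stored as a first incumbent, but the final answer is the same.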

4.3.1.2 Dakin's Method in General


Dakin's method begins to solve a problem of the form of (4.6)-(4.9) by first
ignoring (4.9) and solving the problem as an L.P. using the simplex method.
The value of the solution thus found is the bound assigned to the first point
of the decision tree, representing all feasible solutions to the original I.P.
problem. This makes sense, as (4.6)-(4.9) can be thought of as the equivalent
L.P. with the added constraint of (4.9). Hence it cannot have an optimal solution
better than the equivalent L.P. If the optimal L.P. solution has integer
values for the first q variables, it is optimal for the I.P. and the method
terminates. However, suppose at least one variable $x_i$ ($1 \le i \le q$) has a
noninteger value
\[ x_i = \bar{b}_i, \quad \bar{b}_i \text{ noninteger}. \]
Now as $x_i$ is constrained to be an integer, values in the range
\[ [\bar{b}_i] < x_i < [\bar{b}_i] + 1 \]
are infeasible. Hence $x_i$ must obey exactly one of the following constraints:
\[ x_i \le [\bar{b}_i] \]
or
\[ x_i \ge [\bar{b}_i] + 1. \]
Two new L.P. problems are now created:

I: (4.6), (4.7), (4.8), and $x_i \le [\bar{b}_i]$

II: (4.6), (4.7), (4.8), and $x_i \ge [\bar{b}_i] + 1$.
Note that problems I and II differ from the original problem only in the fact
that one more constraint has been added. It is thus possible to deduce the
optimal solutions to these amended problems with relatively little extra
computational effort using the ideas of Section 2.6.2.5 and the dual simplex
method of Section 3.3. Constraints of the type $x_i \le [\bar{b}_i]$ and $x_i \ge [\bar{b}_i] + 1$
are called Dakin cuts. Notice that it is no longer possible for $x_i$ to take on
the offending value $\bar{b}_i$ in either problem I or II. Two new points are created
in the decision tree, both joined by lines to the original point. The first
represents all feasible solutions to problem I, the second to problem II. The
optimal solution to the original I.P. (if such a solution exists) must lie in one
of these sets. In fact the set $S$ of feasible solutions to (4.6)-(4.9) has been
partitioned into two sets $S_{\mathrm{I}}$ and $S_{\mathrm{II}}$, in the sense that
\[ S_{\mathrm{I}} \cup S_{\mathrm{II}} = S \]
and
\[ S_{\mathrm{I}} \cap S_{\mathrm{II}} = \emptyset, \text{ the empty set.} \]
Both L.P. problems I and II are solved. Their optimal solution values are
bounds assigned to the corresponding points in the decision tree. The better
of the two bounds is identified. As the objective is one of maximization the
larger bound will be selected. Ties can be settled arbitrarily. If this better
bound corresponds to a feasible solution to (4.6)-(4.9) this solution is de-
clared optimal and the procedure is terminated. If it corresponds to an
infeasible solution another of the variables constrained to be integral with a
noninteger value is identified. Two more cuts are defined based on this
variable. The partitioning (branching) routine is repeated, creating two more
decision tree points.
The algorithm is continued until either (a) a feasible solution with value
no less than that for any other bound is found (this solution is then pro-
nounced optimal), or (b) it is found that no feasible solution exists (all points
have been eliminated from the tree). When a solution is found to be feasible,
its point is never selected for branching, and the point is said to be fathomed.
The point is eliminated unless it is the best feasible solution so far found, in
which case it is recorded as the incumbent. Of course any point with a bound
worse than that of the incumbent is eliminated.
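
The branching rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the book's procedure in full: the names `solve_lp` and `dakin` are hypothetical, the L.P. subproblems are solved by brute-force vertex enumeration (workable only for two variables and a bounded region) rather than by the simplex or dual simplex method, and the first noninteger variable is always chosen for branching. The point with the largest bound is expanded at each step, so the first all-integer solution selected is optimal.

```python
from fractions import Fraction as Fr
from itertools import combinations
from math import floor

def solve_lp(c, cons):
    """Maximize c.x over {x >= 0 : a.x <= b for (a, b) in cons}, two variables.
    Brute force: a bounded L.P. attains its optimum at a vertex, i.e. at the
    intersection of two constraint boundaries. Returns (value, x) or None."""
    lines = list(cons) + [((-1, 0), 0), ((0, -1), 0)]   # x1 >= 0, x2 >= 0
    best = None
    for (a, b), (d, e) in combinations(lines, 2):
        det = a[0] * d[1] - a[1] * d[0]
        if det == 0:
            continue                                     # parallel boundaries
        x = (Fr(b * d[1] - a[1] * e, det), Fr(a[0] * e - b * d[0], det))
        if all(p[0] * x[0] + p[1] * x[1] <= q for p, q in lines):
            val = c[0] * x[0] + c[1] * x[1]
            if best is None or val > best[0]:
                best = (val, x)
    return best

def dakin(c, cons):
    """Best-bound-first branch and bound using Dakin cuts (two variables)."""
    cons = list(cons)
    root = solve_lp(c, cons)
    open_points = [(root, cons)] if root else []
    while open_points:
        open_points.sort(key=lambda point: point[0][0])
        (val, x), node = open_points.pop()               # largest bound
        fractional = [i for i in range(2) if x[i].denominator != 1]
        if not fractional:
            return val, x          # integer, and no other bound is larger
        i = fractional[0]
        lo = floor(x[i])
        unit = (1, 0) if i == 0 else (0, 1)
        neg = (-1, 0) if i == 0 else (0, -1)
        # Dakin cuts: x_i <= [b_i]  and  x_i >= [b_i] + 1
        for extra in ((unit, lo), (neg, -(lo + 1))):
            child = node + [extra]
            sol = solve_lp(c, child)
            if sol is not None:                          # infeasible points are dropped
                open_points.append((sol, child))
    return None
```

Applied to the example (4.1)-(4.5) with `dakin((4, 3), [((3, 4), 12), ((4, 2), 9)])`, the sketch branches on $x_1$ first rather than $x_2$, so its tree differs from Figure 4.2, but it arrives at the same optimal solution.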

4.3.1.3 The Zero-One Method of Balas


We now examine the zero-one programming problem and a method for
its solution. Although the model assumes that each variable is binary it is
still useful, as many I.P. problems are formulated this way. Often the vari-
ables represent decisions as to whether to adopt a particular policy or
not, i.e.
\[ x_i = \begin{cases} 1, & \text{if policy } i \text{ is adopted} \\ 0, & \text{otherwise.} \end{cases} \]
Further, any I.P. can be converted into a zero-one problem by redefining
each nonbinary variable Xj as follows.
Let $u_j$ be the largest integer value that $x_j$ could possibly assume
in any feasible solution. This bound $u_j$ is usually deduced by examining the
constraints.
Let $N_j$ be the smallest integer such that
\[ 2^{N_j + 1} > u_j. \]
Then $x_j$ can be expressed in terms of the binary variables $y_1^j, y_2^j, \ldots, y_{N_j+1}^j$ as
\[ x_j = \sum_{i=1}^{N_j+1} 2^{i-1} y_i^j. \]

Examples of this conversion will be given in the next section.
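
As a small illustration of the conversion just described, the following Python sketch computes the binary weights $2^{i-1}$ for a given upper bound and decodes a binary vector back into an integer. The function names are hypothetical, and $N_j$ is assumed to be nonnegative.

```python
def binary_expansion(u):
    """Return the weights 2**(i-1), i = 1, ..., N+1, where N is the smallest
    nonnegative integer such that 2**(N+1) > u."""
    n = 0
    while 2 ** (n + 1) <= u:
        n += 1
    return [2 ** i for i in range(n + 1)]

def decode(weights, y):
    """Recover x_j from the binary values y (same order as the weights)."""
    return sum(w * bit for w, bit in zip(weights, y))
```

For $u_j = 2$ or $u_j = 3$ this gives the weights 1 and 2, so two binary variables replace $x_j$. Note that the binary variables may be able to represent values larger than $u_j$; these are excluded by the original constraints.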


Balas (1965) has developed a method for solving zero-one problems which
involves branch and bound enumeration. His approach differs from that of
Dakin's in that it does not require the simplex method as a subroutine.
Balas describes the method as "additive," as it requires only the addition
and not the multiplication of numbers. The method is applicable only to
problems with nonnegative objective function coefficients. Any zero-one
programme can be converted into this form by replacing any variable $x_i$
with negative $c_i$ by $x_i = 1 - x_i'$.
The method partitions the variables into three sets:
W: the set of variables which have been assigned a value 1
V: the set of variables which have been assigned a value 0
F: the set of unassigned (free) variables.
Initially all variables are assigned to F. For maximization problems all
variables in F are next temporarily assigned a value of 1. If this solution is
feasible it is clearly optimal, as $c_i \ge 0$, $i = 1, 2, \ldots, n$. If this solution is
infeasible an upper bound on the value of the optimal solution can be
obtained. This bound is equal to the sum of the $c_i$, neglecting the minimum
$c_i$, for all $x_i \in F$. After this first iteration, a bound can be found for any
partition of the variables among W, V, and F as follows.
The bound is equal to the sum of the $c_i$ for all $x_i \in W$ plus the sum of the
$c_i$, neglecting the minimum $c_i$, for all $x_i \in F$, i.e.,
\[ \sum_{x_i \in W} c_i + \sum_{x_i \in F} c_i - \min_{x_i \in F} \{c_i\}. \]

When a solution is found to be infeasible its corresponding point in the
decision tree sprouts two new points, effecting the branching step. Suppose
W, V, and F denote the partition at the parent point and the variable
corresponding to the minimum $c_i$ among variables in F is $x_i$. Then the partition
at one of the new points is:
F becomes $F \setminus \{x_i\}$, W becomes $W \cup \{x_i\}$, V remains the same
and at the other new point:
F becomes $F \setminus \{x_i\}$, W remains the same, V becomes $V \cup \{x_i\}$.
Bounds are then calculated for these two new nodes as just described.
When the partition of a particular point cannot possibly lead to an optimal
solution, the point is eliminated from the tree. When the partition of a par-
ticular point corresponds to a feasible solution the point is fathomed and
no further branching takes place from it. When a point corresponds to a
feasible solution and has a value no less than that for any other node, this
solution is declared optimal.

4.3.1.4 Numerical Example


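The method will be illustrated using the problem (4.1)-(4.5). First upper
bounds on $x_1$ and $x_2$ must be found, as they are not binary:
\[ u_1 = \min_{i=1,2} \{[b_i/a_{i1}]\} = \min \{[\tfrac{12}{3}], [\tfrac{9}{4}]\} = 2 \]
\[ u_2 = \min_{i=1,2} \{[b_i/a_{i2}]\} = \min \{[\tfrac{12}{4}], [\tfrac{9}{2}]\} = 3. \]
Therefore
\[ N_1 = N_2 = 1. \]
Let
\[ x_1 = 2^0 y_1^1 + 2^1 y_2^1 \]
\[ x_2 = 2^0 y_1^2 + 2^1 y_2^2. \]

Then the problem becomes
\[
\begin{aligned}
\text{Maximize:} \quad & x_0 = 4(y_1^1 + 2y_2^1) + 3(y_1^2 + 2y_2^2) = 4y_1^1 + 8y_2^1 + 3y_1^2 + 6y_2^2 \\
\text{subject to:} \quad & 3(y_1^1 + 2y_2^1) + 4(y_1^2 + 2y_2^2) \le 12 \\
& 4(y_1^1 + 2y_2^1) + 2(y_1^2 + 2y_2^2) \le 9 \\
& y_1^1, y_2^1, y_1^2, y_2^2 = 0 \text{ or } 1.
\end{aligned}
\]
The decision tree built up by the method is shown in Figure 4.3. The
method begins by partitioning the variables into
\[ W = \emptyset, \quad V = \emptyset, \quad F = \{y_1^1, y_2^1, y_1^2, y_2^2\}. \]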

Figure 4.3. The decision tree for the method of Balas.

Next all the variables in F are temporarily assigned a value 1. This is
obviously infeasible. The bound on the first point is calculated as 4 + 8 + 6 = 18,
the sum of all $c_i$ of the variables in F except the minimum, which is 3,
corresponding to $y_1^2$. Branching now takes place and $y_1^2$ is transferred to W in
point (2) and V in point (3). Bounds are calculated for these nodes as
explained earlier. For instance, the bound for point (2) is arrived at by setting
all the variables in F, $y_1^1$, $y_2^1$, $y_2^2$, equal to 1. This is infeasible, as $y_1^2 = 1$
because it is in W. Hence the variable with the minimum coefficient in F,
$y_1^1$, is discounted when calculating the bound, which is 8 + 6 plus the 3
from $y_1^2$, i.e., 17. For point (3) all the free variables are set equal to 1, but this is
infeasible. The variable discarded is $y_1^1$ ($y_1^2$ is unavailable, as it is in V) and
the bound is 8 + 6 = 14.
As we are maximizing we branch from point (2) because it has the higher
bound. This produces points (4) and (5). Branching from point (4) produces
points (6) and (7). However, point (6) cannot represent any feasible solutions,
as $y_1^1$, $y_1^2$, and $y_2^2$ cannot all be equal to 1, so it is eliminated. (Hence the
"$-\infty$" symbol.)
At this stage point (3) has the largest bound. This eventually produces
point (10) with a bound of 10 which represents a feasible solution. All points
with inferior bounds can be eliminated. This leaves point (5), which spawns
points (12) and (13) both with bounds less than 10.
Therefore point (10) is declared optimal, with the solution
\[ y_1^1 = y_2^2 = 1, \quad y_2^1 = y_1^2 = 0, \quad x_0^* = 10. \]
This solution corresponds to that found by Dakin's method:
\[ x_1^* = 2^0(1) + 2^1(0) = 1 \]
\[ x_2^* = 2^0(0) + 2^1(1) = 2. \]
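
The bounding and branching rules of this section can be tried out with the following Python sketch. It is an illustration under stated assumptions rather than a transcription of Balas's algorithm: the name `balas` is hypothetical, all objective and constraint coefficients are assumed nonnegative with constraints of the form "less than or equal to" (so a point whose W variables alone violate a constraint can be eliminated), and the point with the largest bound is always expanded next. Variables are indexed 0, 1, 2, 3 for $y_1^1, y_2^1, y_1^2, y_2^2$.

```python
def balas(c, A, b):
    """Additive branch and bound for: maximize c.y, A y <= b, y binary.
    Assumes c >= 0 and A >= 0. Returns (value, indices set to 1) or None."""

    def feasible(ones):
        # Only additions are needed to test a 0-1 assignment.
        return all(sum(row[j] for j in ones) <= rhs for row, rhs in zip(A, b))

    def evaluate(W, F):
        """Return (bound, is_solution), or None if the point is eliminated."""
        if not feasible(W):
            return None                      # no completion can be feasible
        total = sum(c[j] for j in W | F)
        if feasible(W | F):
            return total, True               # all free variables at 1 works
        return total - min(c[j] for j in F), False

    root = (frozenset(), frozenset(range(len(c))))
    open_points = []
    first = evaluate(*root)
    if first is not None:
        open_points.append((first, root))
    while open_points:
        open_points.sort(key=lambda point: point[0][0])
        (bound, is_solution), (W, F) = open_points.pop()   # largest bound
        if is_solution:
            return bound, sorted(W | F)
        j = min(F, key=lambda k: c[k])       # free variable with minimum c
        for child in ((W | {j}, F - {j}), (W, F - {j})):   # j joins W, or V
            result = evaluate(*child)
            if result is not None:
                open_points.append((result, child))
    return None
```

For the example, `balas([4, 8, 3, 6], [[3, 6, 4, 8], [4, 8, 2, 4]], [12, 9])` sets $y_1^1$ and $y_2^2$ to 1 with value 10. Because the sketch only recognizes a feasible solution when all free variables can be set to 1, it may create a few more points than the tree of Figure 4.3.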

4.4 Cutting Plane Methods


Gomory (1958) developed cutting plane algorithms for solving all-integer
and mixed-integer programming problems. He proved that these methods
will produce an optimal solution in a finite number of iterations when
applied to problems with rational data. The methods revolve around the
idea of introducing new constraints (or cuts) to the problem. These cuts
slice away noninteger optimal solutions to the associated L.P. problem, but
leave all feasible integer solutions untouched. This is similar to what is done
in Dakin's method, but there are fundamental differences between the two
approaches. In cutting plane methods successive constraints are added to
just one problem, whereas in branch and bound methods many different
(linear programming) problems may be created. Thus in cutting plane
methods the original feasible region for the associated L.P. problem is
gradually reduced as extra constraints are added. In contrast, in branch and
bound methods the original feasible region is often broken up into discon-
nected subregions. In Dakin's method cuts are parallel to the axes; this
seldom happens in cutting plane methods. Finally, cutting plane methods
always preserve all feasible integer solutions, while some feasible integer
solutions are usually eliminated from some of the problems created by the
branch and bound method.
The all-integer and mixed-integer methods will be explained in the next
two sections.

4.4.1 Gomory's All-Integer I.P. Method

The way in which the Gomory all-integer cutting plane algorithm solves
(4.6)-(4.9) will now be explained. It will be assumed that all variables are
constrained to be integers in (4.6)-(4.9):
q = n.
The outline of the algorithm is as follows. Problem (4.6)-(4.8) is solved by
the simplex method. If the optimal solution is all-integer the problem is
solved and the algorithm is terminated. If at least one variable is noninteger
a new constraint is added to the problem. This constraint is derived by
choosing a noninteger valued variable and examining the tableau row in
which it appears. The problem is then resolved with this new constraint.
It has been assumed that all variables, including slack variables, are to
be integer in any feasible solution. This assumption can be made workable
by clearing fractions from the constraint coefficients before introducing the
slack variables. That is, if one is confronted with a constraint like

one can multiply the constraint by the lowest common denominator of the
coefficients (99) to obtain

Once this has been done for all necessary constraints the initial L.P. problem
is then solved by the simplex method.
The way in which a new constraint is constructed from a noninteger
tableau will now be explained. Suppose the associated L.P. problem has
been solved and at least one variable, say $x_i$, has a noninteger value. The
row in the optimal tableau in which $x_i$ has a unit entry is found, say the
jth row. Let it correspond to the equation
\[ x_i + \sum_{k=1}^{p} \bar{a}_{jk} y_k = \bar{b}_j, \tag{4.12} \]
where $y_k$, $k = 1, 2, \ldots, p$, are the nonbasic variables; $\bar{a}_{jk}$, $k = 1, 2, \ldots, p$, is
the coefficient of $y_k$ in this jth row; and $\bar{b}_j$ is the value of $x_i$. Now (4.12)
is solved for $x_i$:
\[ x_i = \bar{b}_j - \sum_{k=1}^{p} \bar{a}_{jk} y_k. \tag{4.13} \]
For any $a \in R$, let $[a]$ denote the largest integer no greater than $a$. Then
\[ a = [a] + a', \tag{4.14} \]
where a' is the fractional part of a; for example,

a = 3~ => [a] = 3 and a' = ~


a = - %=> [a] = - 1 and a' = ~
a = 2 => [a] = 2 and a' = O.
Each rational number in the r.h.s. of (4.13) can be expressed in the
following format:
\[ x_i = ([\bar{b}_j] + \bar{b}_j') - ([\bar{a}_{j1}] + \bar{a}_{j1}') y_1 - \cdots - ([\bar{a}_{jp}] + \bar{a}_{jp}') y_p. \]
On collecting integer terms, this becomes
\[ x_i = \{[\bar{b}_j] - [\bar{a}_{j1}] y_1 - \cdots - [\bar{a}_{jp}] y_p\} + \{\bar{b}_j' - \bar{a}_{j1}' y_1 - \cdots - \bar{a}_{jp}' y_p\}. \]
Now the first part,
\[ [\bar{b}_j] - [\bar{a}_{j1}] y_1 - \cdots - [\bar{a}_{jp}] y_p, \]
will be an integer if all the variables $y_1, y_2, \ldots, y_p$ are integers, which is
true by assumption. Hence for $x_i$ to be an integer, the second part,
\[ \bar{b}_j' - \bar{a}_{j1}' y_1 - \cdots - \bar{a}_{jp}' y_p, \tag{4.15} \]
must be an integer. But
\[ 0 < \bar{b}_j' < 1, \]
as $\bar{b}_j$ was assumed to be noninteger. Also
\[ 0 \le \bar{a}_{jk}' < 1, \quad k = 1, 2, \ldots, p, \]
because of the definition (4.14). Hence, as $y_1, y_2, \ldots, y_p$ are constrained
to be nonnegative integers, (4.15) is at most $\bar{b}_j' < 1$ and so cannot be a positive
integer. Hence (4.15) must be a nonpositive integer. So the constraint
\[ \bar{b}_j' - \bar{a}_{j1}' y_1 - \cdots - \bar{a}_{jp}' y_p \le 0 \tag{4.16} \]
must hold in any feasible integer solution.
Let the slack variable $x_r$ be introduced into (4.16):
\[ \bar{b}_j' - \bar{a}_{j1}' y_1 - \cdots - \bar{a}_{jp}' y_p + x_r = 0. \tag{4.17} \]
As (4.15) must be an integer, then $x_r$ must of necessity be an integer also.
This constraint (4.17) is now added to the final simplex tableau and an
optimal solution to the amended L.P. problem is found using the dual
simplex method of Section 3.3.
The constraint (4.16) represents a Gomory cut. The process is repeated
until the dual simplex method either produces an all-integer solution (which
will be an optimal solution for the original I.P. problem) or evidence that
no feasible solution exists (in which case there are no feasible all-integer
solutions). The algorithm will now be illustrated by solving a numerical example.
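
The construction of a cut from a tableau row is purely mechanical, as the following Python sketch suggests. The function name `gomory_cut` is hypothetical; the row is assumed to be given as its right-hand side together with the coefficients of the nonbasic variables, all as exact fractions.

```python
from fractions import Fraction as Fr
from math import floor

def fractional_part(a):
    """Return a' in the decomposition a = [a] + a' of (4.14)."""
    return a - floor(a)

def gomory_cut(rhs, coeffs):
    """For a row  x_i + sum_k coeffs[k] * y_k = rhs  return (b', [a'_1, ..., a'_p]),
    describing the cut  b' - a'_1*y_1 - ... - a'_p*y_p <= 0  of (4.16)."""
    return fractional_part(rhs), [fractional_part(a) for a in coeffs]
```

For example, a row with right-hand side 6/5 and coefficients -1/5 and 2/5 gives `gomory_cut(Fr(6, 5), [Fr(-1, 5), Fr(2, 5)])`, whose fractional parts are 1/5, 4/5 and 2/5. Note that the fractional part of a negative coefficient such as -1/5 is 4/5, since [-1/5] = -1.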

4.4.1.1 Numerical Example


The method will be illustrated on the problem (4.1)-(4.5). The problem is
first solved by the simplex method, ignoring (4.5). This produces Table 4.1,
where X3 and X4 are the slack variables introduced in (4.2) and (4.3).

Table 4.1

Constraints    x1    x2    x3      x4       r.h.s.

(4.19)         0     1     2/5     -3/10    21/10
(4.20)         1     0     -1/5    2/5      6/5

x0             0     0     2/5     7/10     111/10

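This solution is noninteger, so we must introduce a cut. Consider the
second row in Table 4.1, corresponding to the noninteger-valued variable
$x_1$. This row corresponds to the equation:
\[ x_1 - \tfrac{1}{5}x_3 + \tfrac{2}{5}x_4 = \tfrac{6}{5}. \]
Therefore
\[ x_1 = \tfrac{6}{5} - (-\tfrac{1}{5})x_3 - \tfrac{2}{5}x_4 = (1 + \tfrac{1}{5}) - (-1 + \tfrac{4}{5})x_3 - (0 + \tfrac{2}{5})x_4. \]
The fractional part of this expression is
\[ \tfrac{1}{5} - \tfrac{4}{5}x_3 - \tfrac{2}{5}x_4, \]
which cannot be a positive integer; hence
\[ \tfrac{1}{5} - \tfrac{4}{5}x_3 - \tfrac{2}{5}x_4 \le 0. \tag{4.18} \]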
As the problem has only two structural variables, it is instructive to
follow the progress of the method graphically. Figure 4.4 shows the graphical
solution to the original L.P. problem. Using the equations
\[ 3x_1 + 4x_2 + x_3 = 12 \tag{4.19} \]
\[ 4x_1 + 2x_2 + x_4 = 9, \tag{4.20} \]
one can substitute for $x_3$ and $x_4$ in (4.18), producing
\[ 4x_1 + 4x_2 \le 13, \tag{4.21} \]
which is shown in Figure 4.4. Note that this constraint cuts away part of the
feasible region, including the optimal solution to the present L.P., but leaves all
feasible integer solutions still in the region. This will always happen.

Figure 4.4. Graphical solution to the example problem using Gomory's method.
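
The substitution that turns the first cut into a constraint on $x_1$ and $x_2$, and the claim that no feasible integer point is lost, can both be checked numerically. The Python sketch below is an illustration only; the helper names are hypothetical, the cut is taken to be $\tfrac{1}{5} - \tfrac{4}{5}x_3 - \tfrac{2}{5}x_4 \le 0$, and affine expressions in $(1, x_1, x_2)$ are stored as coefficient triples.

```python
from fractions import Fraction as Fr

# Slack variables written as affine expressions (constant, x1 coeff, x2 coeff).
X3 = (12, -3, -4)     # x3 = 12 - 3*x1 - 4*x2, from (4.19)
X4 = (9, -4, -2)      # x4 = 9 - 4*x1 - 2*x2,  from (4.20)

def combine(const, terms):
    """Return const + sum(coef * expr) as a (constant, x1, x2) triple."""
    out = [Fr(const), Fr(0), Fr(0)]
    for coef, expr in terms:
        for i in range(3):
            out[i] += coef * expr[i]
    return tuple(out)

def value(expr, x1, x2):
    """Evaluate an affine expression at the point (x1, x2)."""
    return expr[0] + expr[1] * x1 + expr[2] * x2

# Left-hand side of the cut 1/5 - (4/5)*x3 - (2/5)*x4 <= 0 in terms of x1, x2.
cut = combine(Fr(1, 5), [(Fr(-4, 5), X3), (Fr(-2, 5), X4)])

# All integer points that satisfy the two original constraints.
integer_points = [(a, b) for a in range(4) for b in range(4)
                  if 3 * a + 4 * b <= 12 and 4 * a + 2 * b <= 9]
```

Here `cut` comes out as `(-13, 4, 4)`, that is, $-13 + 4x_1 + 4x_2 \le 0$. Every point in `integer_points` satisfies it, while the L.P. optimum $(\tfrac{6}{5}, \tfrac{21}{10})$ does not.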
On adding a slack variable $x_5$ to (4.18) and taking the constant to the
r.h.s., we have
\[ -\tfrac{4}{5}x_3 - \tfrac{2}{5}x_4 + x_5 = -\tfrac{1}{5}. \tag{4.22} \]
This constraint is added to Table 4.1, and the dual simplex method is used
to produce a new optimum, given in Table 4.2.

Table 4.2

Constraints    x1    x2    x3    x4      x5      r.h.s.

(4.19)         0     1     0     -1/2    1/2     2
(4.20)         1     0     0     1/2     -1/4    5/4
(4.22)         0     0     1     1/2     -5/4    1/4

x0             0     0     0     1/2     1/2     11

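As can be seen in Table 4.2, $x_3$ has a noninteger value. Hence the equation
\[ x_3 + \tfrac{1}{2}x_4 - \tfrac{5}{4}x_5 = \tfrac{1}{4} \]
is expressed in terms of $x_3$ with integer and noninteger parts:
\[ x_3 = (0 + \tfrac{1}{4}) - (0 + \tfrac{1}{2})x_4 - (-2 + \tfrac{3}{4})x_5. \]
So the new cut is:
\[ \tfrac{1}{4} - \tfrac{1}{2}x_4 - \tfrac{3}{4}x_5 \le 0. \tag{4.23} \]
As an aside, we can use (4.19), (4.20), and (4.22) to show that (4.23) is
equivalent to
\[ 5x_1 + 4x_2 \le 14. \]
This constraint is plotted in Figure 4.4. On adding the slack variable $x_6$ to
(4.23) we have
\[ -\tfrac{1}{2}x_4 - \tfrac{3}{4}x_5 + x_6 = -\tfrac{1}{4}. \tag{4.24} \]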
(4.24) is added to Table 4.2 and the dual simplex method is used to produce
the optimal tableau shown in Table 4.3.

Table 4.3

Constraints    x1    x2    x3    x4      x5    x6      r.h.s.

(4.19)         0     1     0     -5/6    0     2/3     11/6
(4.20)         1     0     0     2/3     0     -1/3    4/3
(4.22)         0     0     1     4/3     0     -5/3    2/3
(4.24)         0     0     0     2/3     1     -4/3    1/3

x0             0     0     0     1/6     0     2/3     65/6

Now all the basic variables have noninteger values. The reader who thinks
we are chasing our tails is asked not to despair. The "optimal" solution value
is steadily being reduced at each iteration: from an initial $\tfrac{111}{10}$ to 11 to $\tfrac{65}{6}$.
The Gomory cuts are slicing away nonoptimal parts of the original feasible
region, as can be seen in Figure 4.4. Applying the technique to row (4.19) in
Table 4.3 produces
\[ x_2 = \tfrac{11}{6} - (-\tfrac{5}{6})x_4 - \tfrac{2}{3}x_6 = (1 + \tfrac{5}{6}) - (-1 + \tfrac{1}{6})x_4 - (0 + \tfrac{2}{3})x_6. \]
Therefore
\[ \tfrac{5}{6} - \tfrac{1}{6}x_4 - \tfrac{2}{3}x_6 \le 0, \tag{4.25} \]
which corresponds to
\[ 4x_1 + 3x_2 \le 10, \]
shown in Figure 4.4.


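Adding the slack variable $x_7$ to (4.25), which is then added to Table 4.3,
allows the dual simplex method to produce the optimal tableau shown in
Table 4.4. Applying the technique to (4.20) in Table 4.4 produces
\[ x_1 = \tfrac{7}{4} - \tfrac{3}{4}x_4 - (-\tfrac{1}{2})x_7 = (1 + \tfrac{3}{4}) - (0 + \tfrac{3}{4})x_4 - (-1 + \tfrac{1}{2})x_7. \]
Therefore
\[ \tfrac{3}{4} - \tfrac{3}{4}x_4 - \tfrac{1}{2}x_7 \le 0, \tag{4.26} \]
which corresponds to
\[ 5x_1 + 3x_2 \le 11, \]
shown in Figure 4.4.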


Table 4.4

Constraints    x1    x2    x3    x4     x5    x6    x7      r.h.s.

(4.19)         0     1     0     -1     0     0     1       1
(4.20)         1     0     0     3/4    0     0     -1/2    7/4
(4.22)         0     0     1     7/4    0     0     -5/2    11/4
(4.24)         0     0     0     1      1     0     -2      2
(4.25)         0     0     0     1/4    0     1     -3/2    5/4

x0             0     0     0     0      0     0     1       10

Adding the slack variable $x_8$ to (4.26), which is then added to Table 4.4,
allows the dual simplex method to produce Table 4.5, which displays the
optimal solution to the original problem, (4.1)-(4.5):
\[ x_1^* = 1, \quad x_2^* = 2, \quad x_0^* = 10. \]

Table 4.5

Constraints    x1    x2    x3    x4    x5    x6    x7       x8      r.h.s.

(4.19)         0     1     0     0     0     0     5/3      -4/3    2
(4.20)         1     0     0     0     0     0     -1       1       1
(4.22)         0     0     1     0     0     0     -11/3    7/3     1
(4.24)         0     0     0     0     1     0     -8/3     4/3     1
(4.25)         0     0     0     0     0     1     -5/3     1/3     1
(4.26)         0     0     0     1     0     0     2/3      -4/3    1

x0             0     0     0     0     0     0     1        0       10

4.4.2 Gomory's Mixed-Integer I.P. Algorithm

Consider now a mixed-integer programming problem, i.e., some but not all
of the variables are constrained to be integer. In terms of (4.6)-(4.9),
\[ 0 < q < n. \]
Gomory's mixed-integer I.P. algorithm follows the same initial pattern as
the all-integer algorithm. Suppose the initial simplex solution contains a
noninteger-valued variable $x_j$ which is one of those which is constrained to
be integer. Then its tableau equation (4.12) can be rewritten as
\[ [\bar{b}_j] + \bar{b}_j' - x_j = \sum_{k=1}^{p} \bar{a}_{jk} y_k. \tag{4.27} \]

At this point the analysis takes a different path from that of Section 4.4.1,
because not all of the variables $y_k$, $k = 1, \ldots, p$, may be constrained to be
integer. Let
\[ S_+ = \{k : \bar{a}_{jk} \ge 0\} \]
\[ S_- = \{k : \bar{a}_{jk} < 0\}. \]
Then (4.27) can be written as
\[ [\bar{b}_j] + \bar{b}_j' - x_j = \sum_{k \in S_+} \bar{a}_{jk} y_k + \sum_{k \in S_-} \bar{a}_{jk} y_k. \tag{4.28} \]
Case I. Assume
\[ [\bar{b}_j] + \bar{b}_j' - x_j < 0. \]
As $[\bar{b}_j]$ is an integer, $x_j$ is constrained to be an integer in any feasible solution,
and $\bar{b}_j'$ is a nonnegative fraction. Hence
\[ [\bar{b}_j] - x_j \]
must be a negative integer, say $-u$. Therefore
\[ [\bar{b}_j] + \bar{b}_j' - x_j = \bar{b}_j' - u, \]
where $u \in \{1, 2, 3, \ldots\}$. Substituting this into (4.28) produces
\[ \bar{b}_j' - u = \sum_{k \in S_+} \bar{a}_{jk} y_k + \sum_{k \in S_-} \bar{a}_{jk} y_k. \]
Now, since
\[ u \ge 1, \]
we have
\[ \bar{b}_j' - 1 \ge \sum_{k \in S_+} \bar{a}_{jk} y_k + \sum_{k \in S_-} \bar{a}_{jk} y_k. \]
And, from the definition of $S_+$ and the fact that $y_k \ge 0$ for all k,
\[ \bar{b}_j' - 1 \ge \sum_{k \in S_-} \bar{a}_{jk} y_k. \]
Now, as
\[ \bar{b}_j' - 1 < 0, \]
we have
\[ 1 \le (\bar{b}_j' - 1)^{-1} \sum_{k \in S_-} \bar{a}_{jk} y_k. \]
Multiplying both sides by $\bar{b}_j'$, we obtain
\[ \bar{b}_j' \le \bar{b}_j' (\bar{b}_j' - 1)^{-1} \sum_{k \in S_-} \bar{a}_{jk} y_k. \tag{4.29} \]

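Case II. Assume
\[ [\bar{b}_j] + \bar{b}_j' - x_j \ge 0. \]
As $x_j$ is constrained to be an integer in any feasible solution, we have
\[ [\bar{b}_j] + \bar{b}_j' - x_j = \bar{b}_j' + v \]
for some $v$, where $v \in \{0, 1, 2, 3, \ldots\}$. Substituting this into (4.28), we get
\[ \bar{b}_j' + v = \sum_{k \in S_+} \bar{a}_{jk} y_k + \sum_{k \in S_-} \bar{a}_{jk} y_k. \]
Now, since
\[ v \ge 0, \]
we have
\[ \bar{b}_j' \le \sum_{k \in S_+} \bar{a}_{jk} y_k + \sum_{k \in S_-} \bar{a}_{jk} y_k \]
and, from the definition of $S_-$ and the fact that $y_k \ge 0$ for all k,
\[ \bar{b}_j' \le \sum_{k \in S_+} \bar{a}_{jk} y_k. \tag{4.30} \]

One of the two cases must hold, and the right-hand sides of (4.29) and (4.30)
are both nonnegative. Combining (4.29) and (4.30), we therefore obtain
\[ \bar{b}_j' \le \bar{b}_j' (\bar{b}_j' - 1)^{-1} \sum_{k \in S_-} \bar{a}_{jk} y_k + \sum_{k \in S_+} \bar{a}_{jk} y_k. \tag{4.31} \]
This inequality must be satisfied if $x_j$ is to be an integer. The constraint (4.31)
is the Gomory cut, which is introduced into the final tableau.
A slack variable $x_r$ is now added to (4.31):
\[ \bar{b}_j' = \bar{b}_j' (\bar{b}_j' - 1)^{-1} \sum_{k \in S_-} \bar{a}_{jk} y_k + \sum_{k \in S_+} \bar{a}_{jk} y_k - x_r. \tag{4.32} \]
Now, as the nonbasic variables are currently zero,
\[ y_k = 0, \quad k = 1, 2, \ldots, p, \]
we have
\[ x_r = -\bar{b}_j', \]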

which is infeasible. The dual simplex method is used to remedy this situation.
The above process is repeated until either:
1. A tableau is produced in which Xi' i = 1,2, ... , q are integer, in which
case the corresponding solution is optimal; or
2. The use of the dual simplex method leads to the conclusion that no feasible
solution exists, in which case one can conclude that the original mixed-
integer problem has no feasible solution.
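
The coefficients of the mixed-integer cut (4.31) follow directly from the sign of each tableau coefficient, which the following Python sketch makes explicit. The function name `mixed_integer_cut` and the use of a dictionary keyed by variable name are assumptions made for illustration; exact fractions are used throughout.

```python
from fractions import Fraction as Fr
from math import floor

def mixed_integer_cut(rhs, coeffs):
    """For a row  x_j + sum_k coeffs[k] * y_k = rhs  with x_j integer-constrained
    and rhs noninteger, return (b', weights) describing the cut
        b' <= sum_k weights[k] * y_k
    of (4.31). Coefficients in S+ are kept; those in S- are scaled by b'/(b' - 1)."""
    b = rhs - floor(rhs)                  # fractional part b'
    scale = b / (b - 1)                   # negative, since 0 < b' < 1
    weights = {k: (a if a >= 0 else scale * a) for k, a in coeffs.items()}
    return b, weights
```

With a right-hand side of 6/5 and coefficients -1/5 and 2/5, the call `mixed_integer_cut(Fr(6, 5), {"x3": Fr(-1, 5), "x4": Fr(2, 5)})` returns the fractional part 1/5 together with the weights 1/20 and 2/5, so the cut reads $\tfrac{1}{5} \le \tfrac{1}{20}x_3 + \tfrac{2}{5}x_4$.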

4.4.2.1 Numerical Example


The method will be illustrated on the problem (4.1)-(4.4), with the following
additional constraint:
\[ x_1 \text{ must be an integer}, \]
i.e.,
\[ q = 1. \]
On examining Table 4.1 it can be seen that $x_1$ is noninteger and can be
expressed as
\[ x_1 - \tfrac{1}{5}x_3 + \tfrac{2}{5}x_4 = \tfrac{6}{5}. \]
Therefore, in terms of (4.27),
\[ [\bar{b}_j] = 1, \quad \bar{b}_j' = \tfrac{1}{5}, \quad j = 2, \quad i = 1, \quad p = 2, \]
\[ \bar{a}_{j1} = -\tfrac{1}{5}, \quad \bar{a}_{j2} = \tfrac{2}{5}, \]
\[ y_1 = x_3, \quad y_2 = x_4. \]
Also,
\[ S_+ = \{4\} \]
\[ S_- = \{3\}. \]
Letting
\[ x_r = x_5, \]
in terms of (4.32) the cut becomes
\[ \tfrac{1}{5} = \tfrac{1}{5}(\tfrac{1}{5} - 1)^{-1}(-\tfrac{1}{5})x_3 + \tfrac{2}{5}x_4 - x_5. \tag{4.33} \]

Adding the negative of this constraint to Table 4.1 yields Table 4.6. The
application of the dual simplex method to Table 4.6 yields Table 4.7, which
displays the optimal solution to the problem, as $x_1$ is now integer-valued.

Table 4.6

Constraints    x1    x2    x3       x4       x5    r.h.s.

(4.19)         0     1     2/5      -3/10    0     21/10
(4.20)         1     0     -1/5     2/5      0     6/5
(4.33)         0     0     -1/20    -2/5     1     -1/5

x0             0     0     2/5      7/10     0     111/10

Table 4.7

Constraints    x1    x2    x3      x4    x5      r.h.s.

(4.19)         0     1     7/16    0     -3/4    9/4
(4.20)         1     0     -1/4    0     1       1
(4.33)         0     0     1/8     1     -5/2    1/2

x0             0     0     5/16    0     7/4     43/4

This solution is
\[ x_1^* = 1, \quad x_2^* = \tfrac{9}{4}, \quad x_0^* = \tfrac{43}{4}. \]

4.5 Applications of Integer Programming


In the sections that follow we shall outline some real-world problems that
can be formulated in terms of integer programming. There is quite an art in
this. On the surface it does not seem possible to describe many of the problems
as integer programs. However with imaginative definition of variables
and construction of constraints it can be done. Once it has been recognized
that a problem is amenable to I.P. formulation there is a great deal to the
task of making the formulation efficient. That is, it is one matter to be able
to formulate a problem, it is another matter to endow the formulation with a
structure or size that can be solved efficiently.

4.5.1 The Travelling Salesman Problem

The travelling salesman problem is one of the classical problems of
combinatorial optimization. It is concerned with a salesman who must visit a
number of cities once each and return to the city from which he started. The
problem is to assign an itinerary to the salesman which minimizes the total
distance travelled in order to accomplish this circuit. It is assumed that the
distance travelled in proceeding directly from one city to any other is known
for all pairs of cities. Note that it is not assumed that the distance from city
i to city j is necessarily the same as the distance from city j to city i. These
two distances may differ for example when the "cities" are intersections in a
one-way street network. When all such i, j-pairs of distances are equal the
problem is called the symmetric travelling salesman problem (T.S.P.), other-
wise it is called the asymmetric travelling salesman problem.
The T.S.P. can be formulated as a zero-one I.P. problem. Let
n = the number of cities,
$c_{ij}$ = the cost of travelling from city i to city j.
Note that if one does not wish the salesman to travel directly from a certain
town to another one can assign a prohibitively large value (denoted by "$\infty$")
to the appropriate $c_{ij}$ value. This will ensure that such a path is never selected
in any optimal solution. For instance, one sets
\[ c_{ii} = \infty, \quad i = 1, 2, \ldots, n. \]
Let
\[ x_{ij} = \begin{cases} 1, & \text{if the salesman is to proceed directly from city } i \text{ to city } j \\ 0, & \text{otherwise.} \end{cases} \]
Because each city i must be left exactly once,
\[ \sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, 2, \ldots, n. \tag{4.34} \]
Also, because each city j must be visited exactly once,
\[ \sum_{i=1}^{n} x_{ij} = 1, \quad j = 1, 2, \ldots, n. \tag{4.35} \]
For any given circuit defined by $x_{ij}$ the objective is to
\[ \text{Minimize:} \quad \sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij}. \tag{4.36} \]

The reader will recognize that minimizing (4.36) subject to (4.34) and (4.35)
is the assignment problem of Section 2.72. Unfortunately, extra constraints
are needed in order to formulate the T.S.P. This is because (4.34), (4.35), (4.36)
do not exclude the possibility of subtours being formed.
For instance, in a six-city problem one might make the assignments X12 =
X23 = X31 = X45 = X56 = X64 = 1, all other xij = O. That is, the "circuit" is
1 --+ 2 --+ 3 --+ 1 and then 4 --+ 5 --+ 6 --+ 4. This is a feasible solution for (4.34)
and (4.35), as each city is left once and arrived at once. However, it represents
two disjoint subtours. (A subtour is a circuit which does not involve all cities).
Hence such a solution is not feasible for the T.S.P., and we need an extra
family of constraints which prevent subtours from being formed. In order to
develop this, we notice that there is a partition $T$, $T'$ of the set of cities $N$:
\[ T = \{1, 2, 3\} \quad \text{and} \quad T' = N - T = \{4, 5, 6\} \]
such that $x_{ij} = 0$ for all $i \in T$ and all $j \in T'$. This occurs if and only if subtours
exist. Thus the following constraint will prevent subtours:
\[ \sum_{i \in T} \sum_{j \in T'} x_{ij} \ge 1, \quad \text{for all proper partitions } T, T' \text{ of } N. \tag{4.37} \]
(A proper partition $T$, $T'$ of $N$ is a partition in which $T$ is neither $\emptyset$ nor $N$.) Thus the
T.S.P. can be expressed as the following zero-one I.P.: minimize (4.36)
subject to (4.34), (4.35), and (4.37).
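
Whether a solution of (4.34) and (4.35) contains subtours is easy to check once the assignments are read as a successor map. The Python sketch below is illustrative only; the names `subtours` and `crossing` are hypothetical, and the input is assumed to satisfy (4.34) and (4.35), so that every city has exactly one successor and one predecessor.

```python
def subtours(succ):
    """succ[i] is the city visited directly after city i (x_ij = 1 for j = succ[i]).
    Return the list of circuits into which the cities fall."""
    remaining = set(succ)
    tours = []
    while remaining:
        city = min(remaining)              # start a new circuit
        tour = []
        while city in remaining:
            remaining.remove(city)
            tour.append(city)
            city = succ[city]
        tours.append(tour)
    return tours

def crossing(succ, T):
    """Left-hand side of (4.37): the number of arcs used from T to its complement."""
    return sum(1 for i in T if succ[i] not in T)
```

For the six-city example, `subtours({1: 2, 2: 3, 3: 1, 4: 5, 5: 6, 6: 4})` returns the two circuits `[1, 2, 3]` and `[4, 5, 6]`, and `crossing` gives 0 for $T = \{1, 2, 3\}$, so (4.37) is violated. A solution is a feasible tour exactly when `subtours` returns a single circuit.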
Of course, (4.37) involves a relatively large number of constraints for non-
trivial n. Hence it is not practical to use the above formulation on anything
other than very small problems. However before the reader despairs, one can
consider solving the problem ignoring (4.37). If the resulting solution is a
feasible circuit it is optimal; if not, its value represents a valid lower bound
on the value of the optimal T.S.P. solution. This suggests that one could use a
branch and bound approach calculating bounds in this way. This has indeed
been done initially by Little et al. (1963) and Eastman (1958). There have been
a number of improvements to this approach, including those by Bellmore
and Malone (1971), which have been adopted by Garfinkel and Nemhauser
(1972).

4.5.2 The Vehicle Scheduling Problem


The travelling salesman problem of the previous section can be extended in a
number of ways. Suppose that there are now a number of salesmen, all
operating from one base, which is one of the cities. All of the other cities must
be visited by one salesman who delivers a quantity of goods. Each city has a
known demand for the goods and each salesman has a capacity for carrying
goods. The problem is to assign each salesman a circuit of cities, starting and
ending at the base, where total demand on a circuit must not exceed the
salesman's capacity. All cities must have their demand met and the total
cost of travel is to be minimized.
This problem can be made more realistic by thinking of the "salesmen" as
representing vehicles (say delivery vans) and the "cities" as demand points
within one city. This problem has a number of important applications, such
as school bus scheduling (Foulds et al. 1977a), milk tanker scheduling (Foulds
et al. 1977b), municipal waste collection (Beltrami and Bodin 1974), fuel oil
delivery (Garvin et al. 1957) and newspaper distribution (Golden et al. 1975).
Surveys of literature on the problem have been carried out by Turner, Ghare,
and Foulds (1974) and Watson-Gandy and Foulds (1981).
The problem will now be formulated in terms of integer programming.
The first formulation is due to Balinski and Quandt (1964). First all feasible
circuits which begin and end at the base are identified. This may be an
extremely difficult task for problems with 20 or more demand points.
However, the formulation is still useful as a conceptual tool. Let
m = the total number of feasible circuits
$\delta_{ij}$ = 1 if the jth demand point is on the ith feasible circuit, and 0 otherwise
$c_i$ = the total cost of travelling the ith feasible circuit
$x_i$ = 1 if the feasible circuit i is chosen, and 0 otherwise.
Then, with n denoting the number of demand points, the problem is to
\[
\begin{aligned}
\text{Minimize:} \quad & \sum_{i=1}^{m} c_i x_i \\
\text{subject to:} \quad & \sum_{i=1}^{m} \delta_{ij} x_i = 1, \quad j = 1, 2, \ldots, n \\
& x_i = 0 \text{ or } 1, \quad i = 1, 2, \ldots, m.
\end{aligned}
\]
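
To make the circuit-selection formulation concrete, the following Python sketch enumerates every subset of circuits and keeps the cheapest one that visits each demand point exactly once. It is a brute-force illustration only, usable for a handful of circuits; the function name `best_partition` and the data in the usage note are invented for the purpose and do not come from the text.

```python
from itertools import combinations

def best_partition(circuits, costs, n):
    """circuits[i] lists the demand points (numbered 1..n) on feasible circuit i,
    costs[i] is its cost c_i. Return (total cost, chosen circuit indices) for the
    cheapest choice covering every demand point exactly once, or None."""
    best = None
    for r in range(1, len(circuits) + 1):
        for chosen in combinations(range(len(circuits)), r):
            visited = sorted(p for i in chosen for p in circuits[i])
            if visited == list(range(1, n + 1)):     # each point exactly once
                cost = sum(costs[i] for i in chosen)
                if best is None or cost < best[0]:
                    best = (cost, chosen)
    return best
```

With the hypothetical circuits `[(1,), (2,), (3,), (1, 2), (2, 3), (1, 2, 3)]` and costs `[4, 4, 6, 5, 8, 12]` for three demand points, the cheapest selection uses the circuits `(3,)` and `(1, 2)` at a total cost of 11.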
The following formulation, due to Garvin et al. (1957), is more explicit
and is far more amenable to integer programming techniques. Let
$p_k$ = the demand at point k
$C$ = the capacity of each vehicle (assumed to be identical for all vehicles)
$d_{ij}$ = the cost of travelling from point i to point j
$y_{ijk}$ = the quantity shipped from point i to point j which is destined
for point k
$x_{ij}$ = 1 if a vehicle travels directly from point i to point j, and 0 otherwise.
The base shall be denoted by the subscript 0.
Consider two distinct demand points, j and k. Then $y_{ijk}$ denotes the
quantity arriving at point j from point i which is destined for point k. Thus
\[ \sum_i y_{ijk} \]
denotes the total quantity arriving at point j destined for point k. Also,
$y_{jrk}$ denotes the quantity leaving point j for point r which is destined for
point k. Thus
\[ \sum_r y_{jrk} \]
denotes the total quantity leaving point j which is destined for
point k. Now because all goods arriving at point j, destined for point k,
should leave point j, we have:
\[ \sum_i y_{ijk} = \sum_r y_{jrk}, \quad \text{for all points } j, k, \; j \ne k. \tag{4.38} \]

Also, $y_{ikk}$ denotes the quantity arriving at point k from point i which is
destined for point k. Thus
\[ \sum_i y_{ikk} \]
denotes the total quantity arriving at point k which is destined for point k.
Now because this total quantity must equal the demand of point k, we have:
\[ \sum_i y_{ikk} = p_k, \quad \text{for all points } k. \tag{4.39} \]
Also, $y_{0jk}$ denotes the quantity leaving the base for point j which is destined
for point k. Thus
\[ \sum_j y_{0jk} \]
denotes the total quantity leaving the base destined for point k and
\[ \sum_j \sum_k y_{0jk} \]
denotes the total quantity leaving the base. Also
\[ \sum_k p_k \]
denotes the total demand. Now, as the total quantity leaving the base must
equal the total demand, we have:
\[ \sum_j \sum_k y_{0jk} = \sum_k p_k. \tag{4.40} \]

It is usually assumed in formulating vehicle scheduling models that only
one vehicle will visit each point. The problem of having points with demand
greater than vehicle capacity can be overcome by distributing the demand
of such a point between a number of artificial points all at the same location,
one vehicle visiting each. The assumption implies that only one vehicle will
leave each point. Thus we have:
\[ \sum_i x_{ij} = \sum_r x_{jr} = 1, \quad \text{for all points } j. \tag{4.41} \]
Also
\[ \sum_k y_{ijk} \]
denotes the total quantity carried by the vehicle (if any) which leaves point i
for point j. This quantity cannot exceed vehicle capacity, and if no vehicle
travels on this segment, the quantity is zero. Thus we have:
\[ \sum_k y_{ijk} \le x_{ij} C, \quad \text{for all points } i, j, \; i \ne j. \tag{4.42} \]
Of course it is implicit that
\[ y_{ijk} \ge 0 \]
and
\[ x_{ij} = 0 \text{ or } 1, \quad \text{for all points } i, j, k. \tag{4.43} \]
Then the objective is to
\[ \text{Minimize:} \quad \sum_i \sum_j d_{ij} x_{ij}, \tag{4.44} \]
which is the total cost of all travel.


Thus it can be seen that the problem of minimizing (4.44) subject to
(4.38)-(4.43) is a mixed 0-1 programming problem. It would be difficult to
solve such problems when there are more than about 10 points, as the
number of constraints would be prohibitive. What is usually done is to solve
realistically sized vehicle scheduling problems by heuristic techniques, such
as those of Clarke and Wright (1964), or Foster and Ryan (1976). A heuristic
technique is a solution procedure represented by a series of rules which,
although not guaranteed to find the optimum, usually produce relatively
good solutions. Techniques guaranteed to produce the optimal solution,
such as branch and bound enumeration, can at present be used only on
small problems because of the amounts of computer time and storage they
require. Hence most people studying the vehicle scheduling problem prefer
to concentrate on heuristic techniques. The heuristic of Foster and Ryan
mentioned above does actually use an integer programming formulation.

4.5.3 Political Redistricting


Consider the problem of finding a just method of assigning the census tracts
of a region to a number of electorates (voting districts) for the purposes of
voting. The assignment must satisfy a number of criteria, including approxi-
mate population equality between electorates and connectedness and com-
pactness of electorates. Each tract is indivisible in the sense that all of it
must be included in exactly one electorate. The number of electorates
created must be equal to the given number of members of parliament
(congressmen) for the region. Each electorate should be connected in the
sense that it is possible to travel between any two points of the electorate
without leaving the electorate. Each electorate should be relatively compact
in the sense that its physical shape should be somewhat circular or square
rather than long and thin.
Some of the above criteria will now be expressed in mathematical form.
Let
m = the number of tracts in the region,
n = the number of electorates to be created,
$x_{ij}$ = 1 if tract $i$ is assigned to electorate $j$, and 0 otherwise,
$p_i$ = the population of tract $i$.

Let
$$V = \frac{1}{n} \sum_{i=1}^{m} p_i$$
be the mean electorate population. In any true democratic system each
electorate should have a population V to ensure voting equality. However,
this is usually impossible because of the indivisibility of each tract.
The population of electorate $j$ is
$$\sum_{i=1}^{m} p_i x_{ij}.$$

Let its deviation from $V$ be defined as
$$d_j = \left| \sum_{i=1}^{m} p_i x_{ij} - V \right|.$$

Then one might attempt to make the maximum deviation over all electorates
as small as possible:
$$\text{Minimize: } \max_{j = 1, 2, \ldots, n} d_j. \tag{4.45}$$

Each tract i must belong to precisely one electorate:


$$\sum_{j=1}^{n} x_{ij} = 1, \quad i = 1, 2, \ldots, m. \tag{4.46}$$

Also, there must be exactly n electorates created. That is, each electorate
must have at least one tract assigned to it:
$$\sum_{i=1}^{m} x_{ij} \ge 1, \quad j = 1, 2, \ldots, n. \tag{4.47}$$

As
$$x_{ij} = 0 \text{ or } 1, \quad i = 1, 2, \ldots, m, \quad j = 1, 2, \ldots, n, \tag{4.48}$$
(4.45)-(4.48) would be a zero-one I.P. except for the form of (4.45). However,
all is not lost, as one can convert the problem into a standard zero-one
I.P. as follows. Let
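$$v = \max_{j = 1, 2, \ldots, n} d_j,$$
i.e.,
$$v \ge d_j, \quad j = 1, 2, \ldots, n.$$
Hence
$$\left| \sum_{i=1}^{m} p_i x_{ij} - V \right| \le v, \quad j = 1, 2, \ldots, n,$$
and therefore
$$\left. \begin{aligned} \sum_{i=1}^{m} p_i x_{ij} - V &\le v \\ \sum_{i=1}^{m} p_i x_{ij} - V &\ge -v \end{aligned} \right\} \quad j = 1, 2, \ldots, n. \tag{4.49}$$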
Now the problem becomes
Minimize: v
subject to: (4.46)-(4.49),
which is a straightforward I.P.
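For very small instances the minimax model can be checked by direct enumeration. The following Python sketch is an illustration only: the function name `best_districting` and the tract populations are hypothetical, and the connectedness and compactness requirements are ignored. It tries every assignment satisfying (4.46) and (4.47) and keeps the one with the smallest maximum deviation $v$.

```python
from itertools import product

def best_districting(populations, n):
    """Return (smallest maximum deviation v, assignment achieving it).

    assignment[i] is the electorate (0 .. n-1) of tract i.
    """
    m = len(populations)
    mean = sum(populations) / n              # V, the mean electorate population
    best = None
    # Each tuple gives every tract exactly one electorate, so (4.46) holds.
    for assignment in product(range(n), repeat=m):
        if len(set(assignment)) < n:         # (4.47): no electorate may be empty
            continue
        totals = [0] * n
        for tract, electorate in enumerate(assignment):
            totals[electorate] += populations[tract]
        v = max(abs(t - mean) for t in totals)   # v = max_j d_j
        if best is None or v < best[0]:
            best = (v, assignment)
    return best

# Hypothetical data: five tracts to be split into two electorates.
v, assignment = best_districting([30, 20, 25, 15, 10], 2)
print(v, assignment)
```

Enumeration examines $n^m$ assignments, so it is useful only as a check on tiny examples; realistic instances need the integer programming formulation or a heuristic.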
One can introduce the concept of tract area and develop further con-
straints concerning the connectedness and compactness of the electorates.
This has been done by Smith, Foulds, and Read (1976) and others, including
Garfinkel and Nemhauser (1970); Hess et al. (1965); and Wagner (1968),
who used integer programming to solve his model.

4.5.4 The Fixed Charge Problem


Consider the problem of a factory which must produce at least M units of a
certain commodity and there are n machines available. Let
$p_i$ = the unit cost of producing one article on machine $i$, $i = 1, 2, \ldots, n$,
$F_i$ = the positive fixed cost of setting up machine $i$ for production, $i = 1, 2, \ldots, n$,
$x_i$ = the number of units produced on machine $i$, $i = 1, 2, \ldots, n$.

Then the production cost for producing $x_i$ units on machine $i$ is
$$C_i(x_i) = \begin{cases} F_i + p_i x_i, & \text{if } x_i > 0, \\ 0, & \text{otherwise,} \end{cases}$$
where we have assumed that production costs for each article are additive.
The problem is to minimize the total production cost:
$$\text{Minimize: } \sum_{i=1}^{n} C_i(x_i). \tag{4.50}$$

At least M units must be produced, hence


$$\sum_{i=1}^{n} x_i \ge M. \tag{4.51}$$
Also,
Xi is a nonnegative integer, i = 1,2, ... , n. (4.52)

Now (4.50), (4.51), (4.52) would be an I.P. apart from the nonlinearity of
(4.50). However, this nonlinearity can be overcome by defining
$$y_i = \begin{cases} 1, & \text{if machine } i \text{ is set up,} \\ 0, & \text{otherwise.} \end{cases}$$
Also, let
$U_i$ = the maximum number of units that machine $i$ could possibly produce.
Then
i = 1,2, ... , n.
So (4.50) becomes
$$\text{Minimize: } \sum_{i=1}^{n} p_i x_i + \sum_{i=1}^{n} F_i y_i. \tag{4.53}$$

Some extra constraints need to be added:


$$x_i \le U_i y_i, \quad i = 1, 2, \ldots, n. \tag{4.54}$$
(4.54) ensures that
$$x_i > 0 \Rightarrow y_i = 1$$
and
$$x_i = 0 \Rightarrow y_i = 0,$$
the latter implication arising from the facts that (4.53) has the objective of
minimization and all Fi > O. So, with the proviso
Yi = 0 or 1, i = 1,2, ... , n, (4.55)
the problem (4.51)-(4.55) is a mixed integer programming problem.
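The role of the linking constraint (4.54) can be seen on a tiny instance. The sketch below is a brute-force illustration in Python, not a practical solution method; the function name `solve_fixed_charge` and the data are hypothetical. It enumerates all integer production levels $x_i$ between 0 and $U_i$ and all zero-one set-up decisions $y_i$, keeps those satisfying (4.51) and (4.54), and minimizes (4.53).

```python
from itertools import product

def solve_fixed_charge(p, F, U, M):
    """Return (minimum cost, x, y) for the fixed charge problem by enumeration."""
    n = len(p)
    best = None
    for x in product(*(range(u + 1) for u in U)):      # 0 <= x_i <= U_i, integer
        if sum(x) < M:                                 # (4.51): at least M units
            continue
        for y in product((0, 1), repeat=n):            # (4.55): y_i = 0 or 1
            if any(x[i] > U[i] * y[i] for i in range(n)):
                continue                               # (4.54) violated
            cost = sum(p[i] * x[i] + F[i] * y[i] for i in range(n))   # (4.53)
            if best is None or cost < best[0]:
                best = (cost, x, y)
    return best

# Hypothetical data: two machines, at least 4 units required.
result = solve_fixed_charge(p=[2, 1], F=[3, 10], U=[5, 5], M=4)
print(result)   # (11, (4, 0), (1, 0))
```

In the result machine 2 is not set up even though its unit cost is lower, because its fixed cost outweighs the saving. Note also that $y_2 = 0$ is chosen by the minimization, not forced by (4.54), exactly as argued above.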

4.5.5 Capital Budgeting

Consider a company which has the opportunity to initiate a number of projects. Let
n = the number of projects available,
m = the number of time periods during which funds will have to be injected into the projects,
$p_i$ = the ultimate profit of project $i$,
$f_{ij}$ = the level of funds that needs to be allocated to project $i$ in time period $j$,
$C_j$ = the total capital available for distribution in time period $j$,
$x_i$ = 1 if project $i$ is selected, and 0 otherwise.


Then the objective is to maximize ultimate profit, i.e.,

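$$\text{Maximize: } \sum_{i=1}^{n} p_i x_i \tag{4.56}$$

subject to the requirement that the total capital required in each period $j$,
$$\sum_{i=1}^{n} f_{ij} x_i,$$
cannot exceed the amount available, i.e.,
$$\sum_{i=1}^{n} f_{ij} x_i \le C_j, \quad j = 1, 2, \ldots, m. \tag{4.57}$$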

Also
Xi = 0 or 1, i = 1,2, ... , n. (4.58)

Problem (4.56)-(4.58) is a standard zero-one I.P. If $m = 1$, the variables can be redefined as follows:
$$f_{i1} = f_i, \quad i = 1, 2, \ldots, n,$$
$$C_1 = C.$$
Consider now the problem of deciding which items to take on a hiking trip.
Let
n = the number of different types of possessions to be taken,
$p_i$ = the value assigned to an item of type $i$,
$f_i$ = the weight of an item of type $i$,
$C$ = the total weight that can be carried,
$x_i$ = the number of items of type $i$ to be taken.

Then let us assume the objective of maximizing the total value of all posses-
sions taken, i.e.,

$$\text{Maximize: } \sum_{i=1}^{n} p_i x_i. \tag{4.59}$$

The total weight of all items,
$$\sum_{i=1}^{n} f_i x_i,$$
cannot exceed the total allowable weight, i.e.,
$$\sum_{i=1}^{n} f_i x_i \le C. \tag{4.60}$$

Of course, only integer quantities of each item can be taken along:


Xi = a nonnegative integer, i = 1,2, ... , n. (4.61)
Problem (4.59), (4.60), (4.61) is called the knapsack problem for obvious
reasons, and will be further examined in Chapter 6.
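When the weights $f_i$ and the limit $C$ are positive integers, small knapsack problems of the form (4.59)-(4.61) can be solved by a simple recursion on the weight limit. The Python sketch below is one such illustration (it is not the treatment given in Chapter 6); the function name `knapsack` and the data are hypothetical. Here `best[c]` is the greatest value attainable with total weight at most `c`.

```python
def knapsack(values, weights, capacity):
    """Integer knapsack: any nonnegative integer number of each item type.

    Assumes all weights and the capacity are positive integers.
    Returns (maximum total value, list of item counts x_i).
    """
    best = [0] * (capacity + 1)          # best[c]: best value with weight <= c
    choice = [None] * (capacity + 1)     # item added last at limit c, if any
    for c in range(1, capacity + 1):
        best[c] = best[c - 1]            # option: leave one unit of weight unused
        for i, (v, w) in enumerate(zip(values, weights)):
            if w <= c and best[c - w] + v > best[c]:
                best[c] = best[c - w] + v
                choice[c] = i
    # Recover the counts x_i by walking back through the recorded choices.
    counts = [0] * len(values)
    c = capacity
    while c > 0:
        i = choice[c]
        if i is None:
            c -= 1
        else:
            counts[i] += 1
            c -= weights[i]
    return best[capacity], counts

# Hypothetical data: three item types, weight limit 11.
value, counts = knapsack(values=[10, 7, 4], weights=[5, 4, 2], capacity=11)
print(value, counts)   # 22 [1, 0, 3]
```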

4.6 Exercises

(I) Computational

1. Solve the following integer programming problems by Dakin's method.

(a) Maximize: $3x_1 + 5x_2 + 4x_3$
    subject to: $2x_1 + 6x_2 + 3x_3 \le 8$
                $5x_1 + 4x_2 + 4x_3 \le 7$
                $6x_1 + x_2 + x_3 \le 12$
                $x_1, x_2, x_3$ nonnegative integers.

(b) Maximize: $4x_1 + 3x_2 + 3x_3$
    subject to: $4x_1 + 2x_2 + x_3 \le 10$
                $3x_1 + 4x_2 + 2x_3 \le 14$
                $2x_1 + x_2 + 3x_3 \le 7$
                $x_1, x_2, x_3$ nonnegative integers.

(c) Maximize: $2x_1 + 4x_2 + 5x_3$
    subject to: $x_1 + x_2 + 2x_3 \le 9$
                $2x_1 + x_2 + 3x_3 \le 13$
                $3x_1 + 2x_2 + x_3 \le 11$
                $x_1, x_2, x_3$ nonnegative integers.

(d) Maximize: $5x_1 + 4x_2 + 3x_3$
    subject to: $3x_1 + 4x_2 + x_3 \le 12$
                $4x_1 + 2x_2 + x_3 \le 9$
                $2x_1 + 3x_2 + 2x_3 \le 15$
                $x_1, x_2, x_3$ nonnegative integers.

(e) Maximize: $4x_1 + 6x_2 + x_3$
    subject to: $2x_1 + x_2 + 2x_3 \le 16$
                $x_1 + 2x_2 + x_3 \le 10$
                $3x_1 + x_2 + x_3 \le 13$
                $x_1, x_2, x_3$ nonnegative integers.



(f) A food factory produces three types of fruit salad: A, B, C. Each type requires
a different amount of three varieties of fruits: peaches, pears, and apples as
summarized in Table 4.8. No more than 5, 4, and 6 pounds of pears, peaches,
and apples can be used in producing a can. How many of each type of can
should be produced in order to maximize profits?

Table 4.8. Data for Exercise 1(f).

        Weight in pounds
Type    Pears   Peaches   Apples   Profit per can
A       2       3         4        6
B       2       2         4        5
C       3       3         2        4

(g) Maximize: $x_1 + 2x_2 + x_3$
    subject to: $2x_1 + x_2 + 3x_3 \le 12$
                $x_1 + 4x_2 + 2x_3 \le 10$
                $x_1 + 3x_2 + x_3 \le 14$
                $x_1, x_2, x_3$ nonnegative integers.

(h) Maximize: $x_1 + 3x_2 + 2x_3$
    subject to: $x_1 + 2x_2 + 2x_3 \le 9$
                $2x_1 + x_2 + x_3 \le 18$
                $2x_1 + 2x_2 + x_3 \le 20$
                $x_1, x_2, x_3$ nonnegative integers.

(i) Maximize: $x_1 + 3x_2 + 2x_3$
    subject to: $2x_1 + 4x_2 + x_3 \le 7$
                $3x_1 + 2x_2 + 2x_3 \le 5$
                $x_1 + x_2 + 3x_3 \le 6$
                $x_1, x_2, x_3$ nonnegative integers.

(j) Maximize: $x_1 + 2x_2 + 3x_3$
    subject to: $3x_1 + 2x_2 + x_3 \le 5$
                $4x_1 + 3x_3 \le 7$
                $2x_1 + 4x_2 + x_3 \le 4$
                $x_1, x_2, x_3$ nonnegative integers.

(k) Maximize: $3x_1 + 4x_2 + x_3$
    subject to: $x_1 + x_2 + x_3 \le 8$
                $x_1 + 3x_2 + 4x_3 \le 15$
                $x_2 + 2x_3 \le 12$
                $x_1, x_2, x_3$ nonnegative integers.



(l) Maximize: $3x_1 + 4x_2 + x_3$
    subject to: $x_1 + 2x_2 - 2x_3 \le 9$
                $2x_1 - x_2 + 4x_3 \le 15$
                $3x_1 + 3x_2 - x_3 \le 0$
                $x_1, x_2, x_3$ nonnegative integers.

(m) Maximize: $2x_1 + 2x_2 + 4x_3$
    subject to: $x_1 + x_2 + x_3 \le 9$
                $3x_1 + 4x_2 + 2x_3 \le 10$
                $-2x_1 + 4x_2 + 4x_3 \le 8$
                $x_1, x_2, x_3$ nonnegative integers.

(n) Maximize: $3x_1 + 5x_2 + 2x_3$
    subject to: $2x_1 + x_2 + 5x_3 \le 12$
                $x_1 + 3x_2 + x_3 \le 8$
                $5x_1 + 2x_2 + 3x_3 \le 9$
                $x_1, x_2, x_3$ nonnegative integers.

(o) Maximize: $5x_1 + 7x_2 + 4x_3$
    subject to: $x_1 + x_2 - x_3 \le 0$
                $2x_1 + x_2 + 4x_3 \le 32$
                $6x_1 + 9x_2 \le 50$
                $x_1, x_2, x_3$ nonnegative integers.

(p) Maximize: $2x_1 + 4x_2 + 5x_3$
    subject to: $x_1 + x_2 + 2x_3 \le 9$
                $2x_1 + x_2 + 3x_3 \le 13$
                $3x_1 + 2x_2 + x_3 \le 11$
                $x_1, x_2, x_3$ nonnegative integers.

(q) A surfboard manufacturer wants to know how many of each type of surfboard
he should make per week in order to maximize profits. He makes three types
of board: the knee board (K), the beacher (B), and the cruiser (C), which are
4, 6, and 8 feet long, respectively, but he can blow only 50 feet of foam per week.
The profits are $40, $60, and $30 for K, B, and C, and they require 10, 15, and
25 feet of fibreglass cloth, respectively. He has 140 feet of cloth available per
week, and 70 pounds of resin per week. K, B, and C need 6, 10, and 14 pounds
of resin each, respectively.

(r) Maximize: $x_1 + 2x_2 + 3x_3$
    subject to: $x_2 + 2x_3 \le 6$
                $x_1 + x_2 + x_3 \le 5$
                $3x_1 + 2x_2 \le 4$
                $x_1, x_2, x_3$ nonnegative integers.

(s) Maximize: $16x_1 + 10x_2 + 12x_3$
    subject to: $2x_1 + 3x_2 + 4x_3 \le 10$
                $4x_1 + 3x_2 + 2x_3 \le 12$
                $x_1 + 2x_2 + 3x_3 \le 6$
                $x_1, x_2, x_3$ nonnegative integers.

(t) A hobbyist making cane baskets (B), trays (T), and plant holders (P), makes a
profit of $10 on each item, and incorporates three colours: white (W), red (R),
and yellow (Y). He has a maximum of 6, 9, and 10 yards of W, R, and Y cane
per week, respectively. B, T, and P require 2, 1, 1; 1, 3, 1; and 1, 2, 2 of W, R,
and Y cane, respectively. How many items of each type of product should he
make per week in order to maximize profit?

(u) Maximize: $2x_1 + 3x_2 + x_3$
    subject to: $x_1 + 2x_2 + x_3 \le 17$
                $3x_1 + x_2 \le 15$
                $x_2 + 4x_3 \le 12$
                $x_1, x_2, x_3$ nonnegative integers.

(v) Maximize: Xl + X2 + 2X3


subject to: tXl + tX2 + tX3 ::5: II
tXl + tX2 + t X3 ::5: V
iXl + tX2 + tX3 ::5: 1/
Xl>X 2 ,X3 nonnegative integers.

(w) Maximize: $2x_1 + 4x_2 + 6x_3$
    subject to: $2x_1 + x_2 + x_3 \le 3$
                $x_1 - 2x_3 \le 6$
                $4x_2 + 6x_3 \le 10$
                $x_1, x_2, x_3$ nonnegative integers.

(x) Maximize: $6x_1 + 5x_2 + 4x_3$
    subject to: $5x_1 + 4x_2 + 2x_3 \le 40$
                $3x_1 + 3x_2 + 4x_3 \le 30$
                $2x_1 + 3x_2 + 3x_3 \le 20$
                $x_1, x_2, x_3$ nonnegative integers.

(y) A jeweller makes three types of silver rings. Ring A takes 3 hours, 20 g of silver
and 1 hour of polishing. These quantities are 3, 10, and 3; 1, 20, and 1 for rings
B and C, respectively. The polishing machine is available to him for 2 hours
per day and he can work for another 11 hours per day and can afford to buy
60 g of silver per day. Profits are $30, $20, and $10 for A, B, and C rings, respectively.
How many of each type of ring should he make per day in order to maximize profit?

(z) Maximize: $10x_1 + 12x_2 + 16x_3$
    subject to: $2x_1 + 3x_2 + 4x_3 \le 20$
                $3x_1 + 3x_2 + 4x_3 \le 30$
                $4x_1 + 3x_2 + 2x_3 \le 25$
                $x_1, x_2, x_3$ nonnegative integers.
2. Assume for each problem in Exercise 1 that

Find the new optimal solution for each problem using the method of Balas.
3. By converting to zero-one variables solve each problem in Exercise 1 by the method
of Balas.
4. Solve each problem in Exercise 1 by the Gomory cutting plane method.
5. Solve each problem in Exercise 1, assuming that only $x_2$ must be integral, by the
Gomory mixed integer method.

(II) Theoretical
6. Formulate the N-city travelling salesman problem as an I.P. in a way that requires
fewer constraints than the formulation given in Section 4.5.1.
7. Construct a branch and bound algorithm for the travelling salesman problem along
the lines of the approach suggested at the end of Section 4.5.1.
8. List at least three realistic applications for the vehicle scheduling problem not
listed in Section 4.5.2.
9. Construct a branch and bound algorithm for the vehicle scheduling problem.
10. Construct a branch and bound algorithm for the fixed charge problem.
11. Construct a branch and bound algorithm for the knapsack problem.
12. Construct a branch and bound algorithm for the assignment problem of Chapter 2.
13. Solve each of the problems of Exercise 2 by exhaustive enumeration. Compare the
amount of computation involved with that required by the method of Balas.
Chapter 5

Network Analysis

5.1 The Importance of Network Models


Many important decision-making problems can be described in terms of
networks. Some obvious examples are concerned with traffic and the ship-
ment of goods. However, there are many other examples with less obvious
links with network modelling such as production planning, capital budgeting,
machine replacement, and project scheduling.
One of the basic network optimization problems is concerned with finding
the shortest path between two given points in a network, the shortest path
problem. A second problem arises in connection with finding a subset of
links of the network which has the property that there is a path between
every pair of points in the network and the total length of the links in the
subset is minimal. This problem is called the minimal spanning tree problem.
A third problem is connected with maximizing the flow of some commodity
through the links of a network from a given origin to a given destination
where each link has a capacity of flow. This is the maximal flow problem.
A fourth problem is related to minimizing the cost of transporting a given
quantity of a commodity from a given origin to a given destination: the
minimum cost flow problem. A fifth problem, critical path scheduling, is
concerned with scheduling the activities of a project.
Because these and other basic network problems can be modelled as
L.P. problems requiring integer solutions, network analysis has strong links
with integer programming. In the next section the basic mathematical
notions necessary to study networks are introduced. The underlying mathe-
matical subject is called graph theory.


5.2 An Introduction to Graph Theory


What most people normally think of as a network (as in road, communica-
tion, or telephone networks) is a special example of a mathematical entity
called a graph. In order to analyze network problems efficiently it is neces-
sary to master some graph theoretic concepts, which are presented in this
section. The discussion here is chiefly for reference; that is, the reader should
proceed directly to Section 5.3, returning to this section for clarification of
terminology when needed. The interested reader who wishes a more
detailed exposition of graph theory is directed to any of a number of excellent
texts on graph theory, including Busacker and Saaty (1965), Deo (1974),
and Harary (1969).
We begin by defining the term graph, and we use the terminology of
Harary (1969).
A graph G = (P, L) is an ordered pair where P is a nonempty set of points
(sometimes called vertices, nodes, or junctions) and L is a set of unordered
pairs of distinct points of P, called lines (sometimes also called links, edges,
or branches). Although a graph is an abstract mathematical concept, it is
usual to represent a graph by a picture. For instance, the graph G = (P, L)
where
$$P = \{p_1, p_2, p_3, p_4\}$$
$$L = \{\{p_1, p_2\}, \{p_2, p_3\}, \{p_3, p_4\}, \{p_1, p_4\}, \{p_3, p_1\}\}$$
is represented in Figure 5.1. It is important to realise that pictures like
Figure 5.1 are only diagrams of graphs, not the graphs themselves, which
are defined abstractly by the specification of P and L. A similar relationship
holds between Venn diagrams and formally defined sets.

Figure 5.1. A graph.

Further terminology is now introduced.


A walk is an alternating sequence of points and lines of the form:
$$p_0, \{p_0, p_1\}, p_1, \{p_1, p_2\}, p_2, \ldots, \{p_{n-1}, p_n\}, p_n.$$
For example, the sequence
$$p_1, \{p_1, p_4\}, p_4, \{p_4, p_3\}, p_3, \{p_3, p_1\}, p_1, \{p_1, p_4\}, p_4$$
is a walk for the graph in Figure 5.1. A walk is termed closed if
$$p_0 = p_n$$
and open if
$$p_0 \ne p_n.$$
The sample walk above is open, but if the last two elements, $\{p_1, p_4\}$ and
$p_4$, are removed it becomes closed.
A trail is a walk in which all the lines are distinct. Hence the walk
$$p_1, \{p_1, p_4\}, p_4, \{p_4, p_3\}, p_3, \{p_3, p_1\}, p_1, \{p_1, p_2\}, p_2$$
in Figure 5.1 is a trail. A path is a trail in which all the points are distinct.
Hence the trail
$$p_1, \{p_1, p_4\}, p_4, \{p_4, p_3\}, p_3, \{p_3, p_2\}, p_2$$
in Figure 5.1 is a path. Of course if all the points are distinct in any trail,
all its lines are also distinct. A cycle is a closed walk of at least three points
with all its points distinct except that the first and the last are the same.
Hence the walk

in Figure 5.1 is a cycle. A graph is said to be connected if there exists a path
between every pair of points. The graph in Figure 5.1 is certainly connected.
However, if lines $\{p_1, p_2\}$ and $\{p_2, p_3\}$ are removed the graph is no longer
connected, as there are no paths from $p_2$ to any of the other points.
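The distinctions between walks, trails, and paths can be checked mechanically. The Python sketch below is an illustration only: the function name `classify` is hypothetical, points are represented by their indices, and a sequence is given by the points visited in order (the lines are implied by consecutive points). It uses the lines of the graph in Figure 5.1.

```python
# Lines of the graph in Figure 5.1, each stored as an unordered pair.
LINES = {frozenset(line) for line in [(1, 2), (2, 3), (3, 4), (1, 4), (1, 3)]}

def classify(points):
    """Classify a sequence of point indices as 'path', 'trail', 'walk',
    or 'not a walk', using the most restrictive term that applies."""
    steps = [frozenset(pair) for pair in zip(points, points[1:])]
    if any(len(step) != 2 or step not in LINES for step in steps):
        return "not a walk"          # some consecutive pair is not a line
    if len(set(steps)) < len(steps):
        return "walk"                # a line is repeated
    if len(set(points)) < len(points):
        return "trail"               # lines distinct, but a point is repeated
    return "path"                    # all points (hence all lines) distinct

print(classify([1, 4, 3, 1, 4]))     # walk
print(classify([1, 4, 3, 1, 2]))     # trail
print(classify([1, 4, 3, 2]))        # path
```

The three sequences are the sample walk, trail, and path given above. A closed walk such as 1, 4, 3, 1 is reported as a trail by this sketch, since its first and last points coincide.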
A tree is a connected graph without any cycles. If the lines $\{p_1, p_2\}$ and
$\{p_3, p_4\}$ are removed from the graph in Figure 5.1 it becomes a tree. The
concept of a tree is one of the most important in graph theory. We can make
some interesting observations about trees. If a graph G = (P, L) is a tree then
1. Every two distinct points of G are joined by exactly one path.
2. The number of lines in L is one less than the number of points in P.
3. If a line not present in L is added to G, then exactly one cycle is created.
The reader should construct a number of trees according to the definition
and verify that these properties are true for those trees.
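As one such verification, the following Python sketch checks properties 2 and 3 for the tree obtained from Figure 5.1 by removing $\{p_1, p_2\}$ and $\{p_3, p_4\}$. The function names are hypothetical, and points are represented by their indices.

```python
def is_connected(points, lines):
    """True if there is a path between every pair of points."""
    neighbours = {p: set() for p in points}
    for a, b in lines:
        neighbours[a].add(b)
        neighbours[b].add(a)
    start = next(iter(points))
    seen, stack = {start}, [start]
    while stack:
        for q in neighbours[stack.pop()]:
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen == set(points)

def is_acyclic(points, lines):
    """True if the lines contain no cycle (checked by merging components)."""
    parent = {p: p for p in points}
    def find(p):
        while parent[p] != p:
            p = parent[p]
        return p
    for a, b in lines:
        ra, rb = find(a), find(b)
        if ra == rb:                 # a and b already joined: this line closes a cycle
            return False
        parent[ra] = rb
    return True

def is_tree(points, lines):
    return is_connected(points, lines) and is_acyclic(points, lines)

P = {1, 2, 3, 4}
figure_5_1 = [(1, 2), (2, 3), (3, 4), (1, 4), (1, 3)]
tree = [(2, 3), (1, 4), (1, 3)]      # Figure 5.1 without {p1,p2} and {p3,p4}

print(is_tree(P, tree))                                   # True
print(len(tree) == len(P) - 1)                            # property 2
print([is_acyclic(P, tree + [line])                       # property 3: each
       for line in figure_5_1 if line not in tree])       # added line makes a cycle
```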
A graph $G'$ is said to be a subgraph of a graph $G$ if $G'$ has all its points
and lines in $G$ and $G'$ is a graph. Hence the graph $(P', L')$ defined by
$$P' = \{p_1, p_2, p_3\}, \quad L' = \{\{p_1, p_2\}, \{p_2, p_3\}, \{p_1, p_3\}\}$$
is a subgraph of the graph in Figure 5.1. A subgraph $(P', L')$ is said to span
a graph $(P, L)$ if
$$P' = P,$$
i.e., all the points of the graph are part of the spanning subgraph. A graph
that is a tree and a spanning subgraph of some graph $(P, L)$ is said to be a
spanning tree (of $(P, L)$).

In some applications it is desirable to orient each line of a graph with a


direction. Graphs with directed lines are called digraphs (short for directed
graphs). Pictures of digraphs are drawn in the same manner as those of
graphs, except that each line has an arrow attached to it to signify its direc-
tion. For example, Figure 5.2 depicts a digraph obtained from the graph in
Figure 5.1 by orienting its lines.

Figure 5.2. A digraph.

More formally, a digraph D = (P, A) is an ordered pair where P is a


nonempty set of points and A is a set of ordered pairs of distinct points of
P, called arcs (sometimes called directed lines). The digraph in Figure 5.2
can be expressed formally as follows:
$$P = \{p_1, p_2, p_3, p_4\}$$
$$A = \{(p_1, p_2), (p_2, p_3), (p_1, p_3), (p_3, p_4), (p_4, p_1)\}.$$
Many of the concepts of graphs can be defined in an analogous fashion
for digraphs. A directed walk is an alternating sequence of points and arcs
of the form:
$$p_0, (p_0, p_1), p_1, (p_1, p_2), p_2, \ldots, (p_{n-1}, p_n), p_n.$$
For example, the sequence:
$$p_1, (p_1, p_2), p_2, (p_2, p_3), p_3, (p_3, p_4), p_4, (p_4, p_1), p_1, (p_1, p_3), p_3$$
in Figure 5.2 is a directed walk. A directed path is a directed walk in which
all the points are distinct. As an example the directed walk

is a directed path. A cycle is a directed walk of at least two points with all
its points distinct except that the first and the last are the same. Hence the
directed walk:
$$p_1, (p_1, p_2), p_2, (p_2, p_3), p_3, (p_3, p_4), p_4, (p_4, p_1), p_1$$
is a cycle.
If $p_i$ and $p_j$ are points in a digraph $D$ and there is a directed walk from
$p_i$ to $p_j$ in $D$, then $p_j$ is said to be reachable from $p_i$. A network is a digraph
with at least one point a (called the source) such that every point is reachable
from it, and another point w (called the sink) which is reachable from every
other point. It is usual to associate flows of some commodity with the arcs
of a network.
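Reachability is easy to test by a simple search. The Python sketch below is an illustration only (the function name `reachable_from` is hypothetical, and points are represented by their indices); it finds which points of the digraph in Figure 5.2 could serve as a source and which as a sink in the sense just defined.

```python
def reachable_from(arcs, start):
    """Return the set of points reachable from start by a directed walk
    (start itself is included)."""
    successors = {}
    for a, b in arcs:
        successors.setdefault(a, []).append(b)
    seen, stack = {start}, [start]
    while stack:
        for q in successors.get(stack.pop(), []):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

POINTS = {1, 2, 3, 4}
ARCS = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 1)]      # the digraph of Figure 5.2

# A source reaches every point; a sink is reached from every point,
# which is the same as reaching every point when all arcs are reversed.
sources = sorted(p for p in POINTS if reachable_from(ARCS, p) == POINTS)
reversed_arcs = [(b, a) for a, b in ARCS]
sinks = sorted(p for p in POINTS if reachable_from(reversed_arcs, p) == POINTS)
print(sources, sinks)    # [1, 2, 3, 4] [1, 2, 3, 4]
```

Every point qualifies here because the digraph of Figure 5.2 contains a cycle through all four points.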

5.3 The Shortest Path Problem


Consider a digraph $D$ in which each arc has a given traversal cost, i.e., for
each arc $(i, j)$ in $D$ let $c_{ij}$ be the cost of travelling from point $i$ to point $j$ along
arc $(i, j)$. This traversal cost may be in terms of distance, time, money, or
some other optimality criterion. Graphs or digraphs of this nature are called
weighted. Figure 5.3 shows a weighted digraph.

Figure 5.3. A weighted digraph.

The cost of a path in a digraph is defined to be the sum of the costs of the
arcs of the path. In the shortest path problem one must find the path of least
cost which joins one given point to another given point. The cost Cij of a
point pair i,j for which there is no arc (i,j) in A is set equal to a prohibitively
large number.

5.3.1 Dijkstra's Method

Suppose one is given a weighted digraph and a source-sink pair of points


such that it is desired to find the shortest path from the source to the sink.
At each iteration Dijkstra's method identifies a new point which is the
closest to the source among all those points which are currently not yet
identified. The length of the path from the source to this point is calculated

and associated with the point. The method builds up a series of shortest
paths from the source to successive points until the sink is included in this
set, at which stage the problem is solved. All that remains is to find the actual
arcs making up this shortest path by a backtracking process. The procedure
can be continued until all points have been identified if it is desired to find
shortest paths from the origin to all other points.
The method will now be illustrated by finding the shortest path from
point $p_1$ to point $p_8$ in the digraph in Figure 5.3. We begin by partitioning
the set of points into two sets: $A$ containing the origin and $B$ containing all
other points. So
$$A = \{p_1\}, \quad B = \{p_2, p_3, p_4, p_5, p_6, p_7, p_8\}.$$
In the course of the method some of the points are going to be labelled, a
label of $d(i)$ for point $p_i$ representing the shortest distance from the source
to point $p_i$. First the origin is assigned a label $d(1) = 0$. Next the point in
$B$ which is closest to the origin is found. We require
$$\min_{i \in A,\ j \in B} \{d(i) + c_{ij}\}.$$

The point $p_j$ in $B$ satisfying the minimization is the point required. It has
its label $d(j)$ set equal to the minimum quantity found. In the present example
this minimum is
(I) $d(1) + c_{15} = 0 + 1 = 1$.
Then the point $p_j$ in $B$ found is removed from $B$ and placed in $A$. So in the
present example:
$$A = \{p_1, p_5\}, \quad B = \{p_2, p_3, p_4, p_6, p_7, p_8\}$$
and
$$d(5) = 1.$$

This series of steps is now repeated until the sink (in the present example,
$p_8$) is transferred from $B$ to $A$. These steps are now carried out. In each case
the minimum is taken over $i \in A$, $j \in B$.

(II) $\min \{d(i) + c_{ij}\} = d(1) + c_{13} = 0 + 2 = 2$,
     $A = \{p_1, p_3, p_5\}$, $B = \{p_2, p_4, p_6, p_7, p_8\}$, $d(3) = 2$;
(III) $\min \{d(i) + c_{ij}\} = d(5) + c_{52} = 1 + 8 = 9$,
     $A = \{p_1, p_2, p_3, p_5\}$, $B = \{p_4, p_6, p_7, p_8\}$, $d(2) = 9$;
(IV) $\min \{d(i) + c_{ij}\} = d(2) + c_{24} = 9 + 9 = 18$,
     $A = \{p_1, p_2, p_3, p_4, p_5\}$, $B = \{p_6, p_7, p_8\}$, $d(4) = 18$;
(V) $\min \{d(i) + c_{ij}\} = d(4) + c_{48} = 18 + 7 = 25$,
     $A = \{p_1, p_2, p_3, p_4, p_5, p_8\}$, $B = \{p_6, p_7\}$, $d(8) = 25$.
Now as $p_8$, the sink, is included in $A$ the repetition of the above series of
steps is terminated. (If the shortest paths from the origin to all points were
required the steps would be repeated until all points were in $A$.) To find
the sequence(s) of arcs making up the shortest path(s) from source to sink
we must work backwards (backtrack) through the digraph as follows. One
forms a list of values of the form
$$d(j) - c_{ij} - d(i), \tag{5.1}$$
where $p_j$ is the sink and $p_i$ are labelled points connected directly to $p_j$.
Now
$$d(8) - c_{38} - d(3) = 25 - 27 - 2 \ne 0,$$
hence arc $(p_3, p_8)$ is not on the shortest path. However
$$d(8) - c_{48} - d(4) = 25 - 7 - 18 = 0,$$
so arc $(p_4, p_8)$ is.
Next replace $p_8$ in (5.1) by $p_4$, the point just found to be on the shortest
path. A new list of values of the form of (5.1) is found.
$$d(4) - c_{24} - d(2) = 18 - 9 - 9 = 0$$
so arc $(p_2, p_4)$ is on the shortest path. Also
$$d(2) - c_{52} - d(5) = 9 - 8 - 1 = 0$$
and
$$d(5) - c_{15} - d(1) = 1 - 1 - 0 = 0,$$
so arcs $(p_5, p_2)$ and $(p_1, p_5)$ are also on the shortest path. Unravelling this
information we conclude that the shortest path is
$$(p_1, p_5), (p_5, p_2), (p_2, p_4), (p_4, p_8),$$
with a length of $d(8)$, which is 25.
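The labelling procedure can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: the function name `dijkstra` is hypothetical, a priority queue is used to find the closest unlabelled point, and the path is recovered by recording the predecessor of each point as it is labelled, not by applying test (5.1). The arc costs below are only those quoted in the worked example; the full digraph of Figure 5.3 contains further arcs that are not reproduced here.

```python
import heapq

def dijkstra(arcs, source, sink):
    """arcs maps (i, j) to the cost c_ij. Returns (d(sink), list of points
    on a shortest path), or None if the sink cannot be reached."""
    successors = {}
    for (i, j), cost in arcs.items():
        successors.setdefault(i, []).append((j, cost))
    d = {}                                # permanent labels: the set A of the text
    pred = {}                             # predecessor of each labelled point
    heap = [(0, source, source)]          # entries: (tentative label, point, predecessor)
    while heap:
        dist, j, i = heapq.heappop(heap)
        if j in d:
            continue                      # j was already moved from B to A
        d[j], pred[j] = dist, i
        if j == sink:
            break
        for k, cost in successors.get(j, []):
            if k not in d:
                heapq.heappush(heap, (dist + cost, k, j))
    if sink not in d:
        return None
    path = [sink]
    while path[-1] != source:             # backtrack through the predecessors
        path.append(pred[path[-1]])
    return d[sink], path[::-1]

# Arc costs quoted in the worked example (a partial copy of Figure 5.3).
ARCS = {(1, 5): 1, (1, 3): 2, (5, 2): 8, (2, 4): 9, (4, 8): 7, (3, 8): 27}
print(dijkstra(ARCS, 1, 8))               # (25, [1, 5, 2, 4, 8])
```

On these arcs the points are labelled in the same order as in steps (I)-(V): $p_5$, $p_3$, $p_2$, $p_4$, and finally $p_8$ with $d(8) = 25$.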

There are a number of related shortest path problems including that of


finding a shortest path between each pair of points. The above procedure
(Dijkstra (1959)) could, in theory, be used to solve this problem; however,
more efficient procedures have been developed; see for example Floyd
(1962) and Murchland (1967).

5.4 The Minimal Spanning Tree Problem


The following problem is somewhat similar to the shortest path problem;
however, it is concerned with graphs rather than digraphs. Given a weighted
graph one desires to find among all its subgraphs a spanning tree (see Section
5.2) of minimum total weight. In other words we wish to find a subset of
lines forming a tree which spans the graph and whose total line weight
is a minimum among all such spanning trees.
This problem is called the minimal spanning tree problem.
There are many applications of the problem. Examples are transportation
planning problems where the points represent cities or distribution centres
and the lines represent air lanes, railway lines, or roads. In these cases one
is trying to design a system in which it is possible to travel between all
pairs of centres at minimum total outlay. A less direct application arises in
finding a lower bound for the length of a travelling salesman's circuit (see
Section 4.5.1). One can represent symmetric T.S.P.'s by weighted graphs.
In solving such problems by branch and bound enumeration (see section
4.2) one needs the minimum distance the salesman would be required to
travel given that certain lines must be used and others must not. The weight
of a minimum spanning tree incorporating such decisions provides this
information. Other applications occur in project planning and communica-
tions network design.

5.4.1 Kruskal's Algorithm

Given a weighted, connected graph, suppose it is desired to find a spanning


tree of minimum total weight. Kruskal (1956) showed that the following
algorithm always produces such a tree. One begins by ordering all the lines
in the graph in order of nondecreasing weight, i.e., least weight first. Each
line is then examined in this order in turn. When a line is examined it is
accepted as part of the spanning tree unless it would form a cycle with those
lines already accepted, in which case it is rejected and the next line is
examined. The examination process is terminated when all the accepted lines
form a spanning tree. This tree constitutes a minimal spanning tree.

Figure 5.4. A weighted graph.

As an example, consider the weighted graph in Figure 5.4. The minimal
spanning tree is found for this graph using Kruskal's algorithm as follows.
The order of lines is:

{Pl,PS}, {P3,PS}, {P2,PS}, {Pl,P3}, {Pl>P2}, {P6,PS}, {P6,P7}, {P3,P6},


{Ps,Ps}, {P4,PS}, {P2,P4}, {P4,PS}, {PS,P6}, {P4,P7}, {PS,P7}'
The lines with weight 4, namely $\{p_1, p_3\}$, $\{p_1, p_2\}$, $\{p_3, p_6\}$, $\{p_6, p_8\}$, and $\{p_6, p_7\}$,
have been assembled in arbitrary order. We now start building the tree by
examining each line in this order. Lines $\{p_1, p_8\}$, $\{p_3, p_8\}$, and $\{p_2, p_8\}$ are
all accepted. Next, $\{p_1, p_3\}$ is rejected, as it would create a cycle with $\{p_1, p_8\}$
and $\{p_3, p_8\}$. Then $\{p_1, p_2\}$ is rejected as it would create a cycle with $\{p_1, p_8\}$
and $\{p_2, p_8\}$. Moving on, $\{p_6, p_8\}$ and $\{p_6, p_7\}$ are accepted and $\{p_3, p_6\}$ is
rejected, as it would create a cycle with $\{p_3, p_8\}$ and $\{p_6, p_8\}$. Then
{Ps,Ps} and {P4,PS} are accepted. At this point a spanning tree has been
created and the examination process stops. The minimal spanning tree,
with weight
1+2+3+4 +4+6+7= 27
is shown in Figure 5.5.
It is true that a graph with n points will result in a spanning tree with n - 1
lines. Thus, once n - 1 lines creating no cycles have been accepted, the minimal
spanning tree algorithm has constructed a spanning tree.
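Kruskal's algorithm can be sketched in a few lines of Python. The version below is an illustration only: the function name `kruskal` is hypothetical, and cycles are detected by keeping track of which points are already joined (a union-find structure), a device not described in the text. The line weights are read from the worked example, but the figure is not reproduced here, so the endpoints chosen for the weight-7 line at $p_4$ and the weights 8, 8, 9, 9, 10 given to the last five lines are assumptions. Any such assignment leads to a tree of the same total weight.

```python
def kruskal(n_points, lines):
    """lines is a list of (weight, a, b) with points numbered 1 .. n_points.
    Returns the accepted lines as (a, b, weight), in order of acceptance."""
    parent = list(range(n_points + 1))
    def find(p):                          # representative of p's component
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p
    tree = []
    # sorted() is stable, so lines of equal weight keep their listed order.
    for weight, a, b in sorted(lines, key=lambda line: line[0]):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                      # rejected: would form a cycle
        parent[ra] = rb                   # accepted: merge the two components
        tree.append((a, b, weight))
        if len(tree) == n_points - 1:     # n - 1 lines accepted: spanning tree
            break
    return tree

# Lines of Figure 5.4 in the order used in the text (later weights assumed).
LINES = [(1, 1, 8), (2, 3, 8), (3, 2, 8), (4, 1, 3), (4, 1, 2),
         (4, 6, 8), (4, 6, 7), (4, 3, 6), (6, 5, 8), (7, 4, 8),
         (8, 2, 4), (8, 4, 5), (9, 5, 6), (9, 4, 7), (10, 5, 7)]
tree = kruskal(8, LINES)
print(tree, sum(w for _, _, w in tree))   # total weight 27
```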

Figure 5.5. A minimal spanning tree found by Kruskal's algorithm.

5.4.2 Prim's Algorithm

We now consider a second algorithm, due to Prim (1957), which also guar-
antees to find a minimal spanning tree in any connected, weighted graph.
Despite refinements to increase efficiency in Kruskal's algorithm, Prim's
approach is superior for all but very sparse (few lines) graphs.
Prim's algorithm does not require that the lines of the graph are ordered
in advance. It builds up a single connected component (which is actually a
tree) until this component spans the original graph. This component then
represents a minimal spanning tree. One begins by selecting the line of least
weight, say $\{p_i, p_j\}$. This line and its two incident points form the initial
component. One then finds the line of minimum weight among all those that
connect a point in the component to a point that is not. This line and its
noncomponent point then become part of the component.
Both Kruskal's and Prim's algorithms involve making the best (in this
case least weight) decision at each stage with little regard to previous
decisions. Combinatorial optimization procedures with this philosophy are
termed greedy. Greedy procedures seldom guarantee optimal solutions as
they do in the two algorithms for the minimal spanning tree problem.
However, a greedy procedure is often used to find relatively good (near optimal)
solutions with little computational effort for many combinatorial
optimization problems.
In order to illustrate Prim's algorithm, let us apply it to the graph in
Figure 5.4. The least-weight line is $\{p_1, p_8\}$ with weight 1. So the initial
component is $[p_1, p_8; \{p_1, p_8\}]$. We now look for points which are directly
connected to the component. There are two: $p_2$ and $p_3$. The lines $\{p_8, p_3\}$, $\{p_8, p_2\}$,
$\{p_1, p_2\}$, and $\{p_1, p_3\}$ connect them to the component, $\{p_8, p_3\}$ being the
smallest. Hence this and $p_3$ are added to the component, which becomes
$[p_1, p_8, p_3; \{p_1, p_8\}, \{p_8, p_3\}]$. Now $p_2$ and $p_6$ are directly connected to the
component. The least-weight line is $\{p_8, p_2\}$, which is added, along with $p_2$,
to the component. Now $p_4$, $p_5$, and $p_6$ are directly connected to the component.
However, there is a tie among the weights of the connecting lines:
$\{p_3, p_6\}$ and $\{p_8, p_6\}$ are both of weight 4. Let us arbitrarily choose $\{p_3, p_6\}$,
which is added to the component, along with $p_6$. Next $\{p_6, p_7\}$ and $p_7$ are
added to the component, then {Ps,Ps} and Ps, and finally {PS,P4} and P4·
The component is now:
[Pl, Ps, P3, P2, P6, P7, Ps, P4; {Pl, Ps}, {Ps, P3}, {Ps, P2}' {P3, P6}'
{P6, P7}' {Ps,Ps}, {Ps, P4}].
The component now contains all the points of the graph and hence rep-
resents a minimal spanning tree. The minimal spanning tree is given by the
lines present in the component. This tree is shown in Figure 5.6.

Figure 5.6. A minimal spanning tree found by Prim's algorithm.

Although this tree has a different set of lines from that found by Kruskal's
algorithm, it has the same weight (27). Differences between the two trees are
due only to the way in which ties between line weights were settled. Indeed,
both algorithms are capable of producing both trees.
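Prim's algorithm can be sketched in Python as follows. This is an illustration under stated assumptions: the function name `prim` is hypothetical, a priority queue holds the lines leaving the component, and the component is started from a single chosen point instead of from the least-weight line (the first line accepted from $p_1$ is $\{p_1, p_8\}$ in any case). As in the Kruskal example, the line weights are read from the worked example, and the endpoints of the weight-7 line at $p_4$ together with the weights of the five heaviest lines are assumptions.

```python
import heapq

def prim(points, lines, start):
    """lines is a list of (weight, a, b). Returns the lines added to the
    component as (component point, new point, weight), in order."""
    incident = {p: [] for p in points}
    for weight, a, b in lines:
        incident[a].append((weight, a, b))
        incident[b].append((weight, b, a))
    component = {start}
    heap = list(incident[start])          # lines leaving the component
    heapq.heapify(heap)
    tree = []
    while heap and len(component) < len(points):
        weight, a, b = heapq.heappop(heap)   # least-weight candidate line
        if b in component:
            continue                         # both ends already in the component
        component.add(b)
        tree.append((a, b, weight))
        for line in incident[b]:
            if line[2] not in component:
                heapq.heappush(heap, line)
    return tree

# Lines of Figure 5.4 (weights of the five heaviest lines are assumed).
LINES = [(1, 1, 8), (2, 3, 8), (3, 2, 8), (4, 1, 3), (4, 1, 2),
         (4, 6, 8), (4, 6, 7), (4, 3, 6), (6, 5, 8), (7, 4, 8),
         (8, 2, 4), (8, 4, 5), (9, 5, 6), (9, 4, 7), (10, 5, 7)]
tree = prim(range(1, 9), LINES, start=1)
print(tree, sum(w for _, _, w in tree))   # total weight 27
```

With this data the tie between the two weight-4 lines at $p_6$ happens to be broken in favour of $\{p_3, p_6\}$, so the points join the component in the same order as in the worked example.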

5.5 Flow Networks


Recall that in Section 5.2 a network was defined as a digraph with a source
and a sink. Many flow network problems are concerned with optimizing
some parameter of a network system where there is a flow of material or
goods from its source to its sink. A network of pipes carrying crude petro-
leum from an oil field to a port is an example. It is assumed that there is no
loss of the commodity being transported at the intermediate points. This
assumption is called conservation of flow. In effect it means that, for points
other than the source and sink, the total flow travelling into each point is
equal to the total flow travelling out of it. Associated with each arc is a
capacity, which represents the maximum amount of flow that the arc can
accommodate. Many flow networks are such that each of their arcs has a unit
transportation cost representing the cost of shipping one unit of the com-
modity along the arc.
Until now we have implicitly assumed that a network has exactly one
source and exactly one sink, and the algorithms to be presented in the next
two sections are designed for networks of this nature only. Any network
with multiple sources and sinks can easily be converted into one with a
single source and a single sink using the following artificial device. If more
than one source is present, a supersource $s_0$ is created and represented by
a new point. This new point is connected to each source $s_i$ by an arc
$(s_0, s_i)$.

Figure 5.7. The conversion from multiple sources and sinks (a) to a unique
source and sink (b).

For networks with multiple sinks a supersink $s_{n+1}$ is created. Each sink
$s_j$ is connected to $s_{n+1}$ by an arc $(s_j, s_{n+1})$. Arcs of the form
$(s_0, s_i)$ and $(s_j, s_{n+1})$ in multisource-multisink networks are
assigned zero unit transportation cost and infinite capacity. An example of
the conversion from a multisource-multisink network to a single-source,
single-sink network is given in Figure 5.7. The capacity and unit
transportation cost are given as an ordered pair for each new arc. In
Figure 5.7(a) the sources are $s_1$, $s_2$, and $s_3$, and the sinks are
$s_{11}$ and $s_{12}$. In Figure 5.7(b) the supersource is $s_0$ and the
supersink is $s_{13}$. Once the conversion has been made and a solution to
the problem has been found, the arcs from $s_0$ and to $s_{n+1}$ are ignored.
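
The conversion can be sketched as follows. The function name `add_super_terminals`, the dictionary layout (arc mapped to a capacity and unit cost pair), and the internal arcs of the small network are assumptions made for illustration; only the names of the sources, sinks, supersource, and supersink follow Figure 5.7.

```python
import math


def add_super_terminals(arcs, sources, sinks, super_source, super_sink):
    """Return a copy of `arcs` with a supersource and a supersink added.

    arcs: dict mapping an arc (tail, head) to (capacity, unit_cost).
    Every new arc gets infinite capacity and zero unit cost.
    """
    converted = dict(arcs)
    for s in sources:
        converted[(super_source, s)] = (math.inf, 0)
    for t in sinks:
        converted[(t, super_sink)] = (math.inf, 0)
    return converted


# Hypothetical internal arcs; only the terminal names follow Figure 5.7.
arcs = {("s1", "s4"): (3, 2), ("s2", "s4"): (2, 1), ("s3", "s5"): (4, 3),
        ("s4", "s11"): (5, 1), ("s5", "s12"): (4, 2)}
converted = add_super_terminals(arcs, ["s1", "s2", "s3"], ["s11", "s12"],
                                "s0", "s13")
print(len(converted) - len(arcs), "arcs added")
```

After solving on the converted network, the flows on the added arcs are simply discarded, as described above.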
Two network flow problems and solution procedures for them are pre-
sented in the next few sections. In the maximal flow problem one must maxi-
mize the total rate of flow from source to sink neglecting unit transportation
costs. In the minimal cost flow problem one must minimize the cost of ship-
ping a given quantity of a commodity from source to sink.

5.5.1 The Maximal Flow Problem

Consider a network with arc capacities but no unit transportation costs.
The maximal flow problem is concerned with finding an assignment of flow
to each arc so that the total flow from source to sink is maximized. The
problem can be formulated in mathematical terms. Let

    $n$ = the number of points in the network
    $c_{ij}$ = the capacity of arc $(p_i, p_j)$
    $f_{ij}$ = the flow assigned to arc $(p_i, p_j)$
    $p_1$ = the source
    $p_n$ = the sink.

Given a set of flow assignments $f_{ij}$, the flow out of point $p_i$ is

$$\sum_{\text{all arcs } (p_i, p_j)} f_{ij}.$$

The flow into point $p_i$ is

$$\sum_{\text{all arcs } (p_j, p_i)} f_{ji}.$$

Hence the assumption of the conservation of flow implies:

$$\sum_{\text{all arcs } (p_i, p_j)} f_{ij} - \sum_{\text{all arcs } (p_j, p_i)} f_{ji} = 0, \qquad i \neq 1,\ i \neq n. \tag{5.2}$$

Note that the restriction on $i$ in (5.2) is important. Conservation of flow
does not hold for the source or sink. Let $F$ denote the total amount of flow
travelling through the network. This amount of flow $F$ must leave the source
and arrive at the sink. Thus

$$\sum_{\text{all arcs } (p_1, p_j)} f_{1j} = \sum_{\text{all arcs } (p_j, p_n)} f_{jn} = F.$$

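In the maximal flow problem one must

$$\begin{aligned}
\text{Maximize:} \quad & F \\
\text{subject to:} \quad & \sum_{\text{all arcs } (p_i, p_j)} f_{ij} - \sum_{\text{all arcs } (p_j, p_i)} f_{ji} =
\begin{cases} F, & \text{if } i = 1 \\ 0, & \text{if } i \neq 1,\ i \neq n \\ -F, & \text{if } i = n \end{cases} \\
& 0 \leq f_{ij} \leq c_{ij}, \quad \text{for all arcs } (p_i, p_j).
\end{aligned}$$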
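
As a small check on the formulation, the constraints can be tested directly for any proposed flow. The sketch below is illustrative only: the function name `flow_value` is an assumption, and the capacities and flows are those that can be read off from the worked example of Section 5.5.2 (the figures themselves are not reproduced here, so the data should be treated as an assumption).

```python
def flow_value(n, capacity, flow):
    """Check the constraints of the maximal flow problem and return F.

    Points are numbered 1..n, with point 1 the source and point n the sink.
    capacity and flow map an arc (i, j) to c_ij and f_ij respectively.
    """
    for arc, f in flow.items():
        if not 0 <= f <= capacity[arc]:
            raise ValueError(f"capacity bound violated on arc {arc}")
    # net[i] = flow out of point i minus flow into point i.
    net = {i: 0 for i in range(1, n + 1)}
    for (i, j), f in flow.items():
        net[i] += f
        net[j] -= f
    for i in range(2, n):
        if net[i] != 0:
            raise ValueError(f"conservation of flow violated at point {i}")
    return net[1]  # equals -net[n] once conservation holds elsewhere


# Capacities and final flows as read from the example of Section 5.5.2.
capacity = {(1, 2): 2, (1, 4): 4, (2, 3): 4, (2, 5): 2, (3, 4): 5,
            (3, 6): 2, (4, 3): 4, (4, 6): 1, (5, 6): 2}
flow = {(1, 2): 2, (1, 4): 3, (2, 5): 2, (3, 6): 2,
        (4, 3): 2, (4, 6): 1, (5, 6): 2}
F = flow_value(6, capacity, flow)
print(F)
```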


We turn now to developing methods for solving the maximal flow
problem. Consider the network in Figure 5.8, where arc capacities are shown.
If the arcs $(p_4, p_6)$, $(p_3, p_6)$, $(p_5, p_6)$ were removed from the
network it would be disconnected, in the sense that there would no longer be
any paths from the source, $p_1$, to the sink, $p_6$. A set of arcs with at
least one element in every source-sink path is called a cut. Thus the removal
of the arcs in any cut disconnects every source-sink path. Hence the set
$C = \{(p_4, p_6), (p_3, p_6), (p_5, p_6)\}$ is a cut.
The capacity of a cut is defined to be the sum of the capacities of the
individual arcs in the cut. Thus the capacity of $C$ is

$$c_{46} + c_{36} + c_{56} = 1 + 2 + 2 = 5.$$

The cut with the smallest capacity is called the minimum cut. The reader
should verify that $C$ is the minimum cut for the network in Figure 5.8.
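
The verification suggested above can be done by brute force on a network this small. The sketch below is a hedged illustration: the function names are hypothetical, and the capacities are those recoverable from the worked example of Section 5.5.2 rather than read from the figure. It checks that the named set of arcs is a cut, computes its capacity, and compares it with the smallest capacity over all ways of separating the source from the sink.

```python
from itertools import combinations


def cut_capacity(capacity, cut):
    """Sum of the capacities of the arcs in the cut."""
    return sum(capacity[arc] for arc in cut)


def is_cut(capacity, cut, source, sink):
    """True if removing the arcs in `cut` leaves no source-sink path."""
    remaining = [arc for arc in capacity if arc not in cut]
    reached, stack = {source}, [source]
    while stack:
        i = stack.pop()
        for (u, v) in remaining:
            if u == i and v not in reached:
                reached.add(v)
                stack.append(v)
    return sink not in reached


def min_cut_capacity(capacity, points, source, sink):
    """Smallest capacity over all source-side point sets (brute force)."""
    inner = [p for p in points if p not in (source, sink)]
    best = float("inf")
    for r in range(len(inner) + 1):
        for extra in combinations(inner, r):
            side = {source, *extra}
            cap = sum(c for (i, j), c in capacity.items()
                      if i in side and j not in side)
            best = min(best, cap)
    return best


# Capacities as recovered from the worked example (an assumption).
capacity = {(1, 2): 2, (1, 4): 4, (2, 3): 4, (2, 5): 2, (3, 4): 5,
            (3, 6): 2, (4, 3): 4, (4, 6): 1, (5, 6): 2}
C = {(4, 6), (3, 6), (5, 6)}
print(is_cut(capacity, C, 1, 6), cut_capacity(capacity, C),
      min_cut_capacity(capacity, range(1, 7), 1, 6))
```

With these capacities other cuts attain the same capacity, so the minimum cut need not be unique even though its capacity is.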

Figure 5.8. A minimum cut.

The following important theorem is very useful in designing an algorithm
to solve the maximal flow problem:
Theorem 5.1 (The maximum-flow, minimum-cut theorem). In any network
the value of the maximum flow from source to sink equals the capacity of
the minimum cut.
This theorem was proved by Ford and Fulkerson (1962) in an excellent text
which made a substantial contribution to the theory of network flows. The
book deals with the maximal flow problem and many of the other topics
of this chapter.
When confronted with a maximal flow problem one can begin by iden-
tifying the minimum cut. The network can then be gradually loaded with
flow that satisfies the assumption of conservation of flow. When the flow
from source to sink has total volume equal to the capacity of the minimum
cut, we know because of Theorem 5.1 that no further addition of flow is
possible. Thus the loading can be stopped and the present assignment is
optimal. The strategy by which the network is loaded is called the labelling
method, and is due to Ford and Fulkerson (1962). This method is explained
in the next section.

5.5.2 The Labelling Method

Consider the network in Figure 5.8. Flows in opposite directions in a single
arc have their magnitudes subtracted to produce a single flow in the direction
of the larger flow. For instance, if arc (3,4) had a flow of 5 from 3 to 4 and
a flow of 4 from 4 to 3, the net result would be a flow of 1 from 3 to 4. In
order to be able to change flows already assigned we allow the possibility of
a notional flow in an arc in a direction in which it cannot receive further flow.
This is brought about by the concept of excess capacity. The excess capacity
$e_{ij}$ of an arc $(i,j)$ for a given assigned flow $f_{ij}$ is initially
defined as

$$e_{ij} = c_{ij} - f_{ij}, \tag{5.3}$$

i.e., the amount of extra flow that the arc could accommodate, over and
above what it is now assigned. Suppose an arc $(i,j)$ has a present flow of
$f_{ij}$ and a capacity of $c_{ij}$. If a further flow $f'_{ij}$ is assigned
to it, its excess capacity is reduced by $f'_{ij}$, but the excess capacity of
arc $(j,i)$ is increased by $f'_{ij}$. This allows us the possibility of later
changing our minds and reducing the flow in $(i,j)$ by $f'_{ij}$ to get back
to the original flow of $f_{ij}$. For example, suppose initially:

$$c_{36} = 2, \quad f_{36} = 0, \quad c_{63} = 0, \quad f_{63} = 0.$$

Then

$$e_{36} = 2 - 0 = 2, \qquad e_{63} = 0 - 0 = 0.$$

Now suppose a flow of 1 unit is assigned to (3,6); then

$$f_{36} = 1, \qquad e_{36} = 1,$$

but $e_{63}$ is increased to 1. Although in reality it is impossible for arc
(6,3) to accommodate any flow, this positive excess capacity is a useful tool.
It allows us to notionally assign a flow of 1 to arc (6,3) (since it has
excess capacity of 1). This unit of flow cancels with the unit flowing along
(3,6), leaving no flow at all. Also, $e_{63}$ is reduced to 0, $e_{36}$ is
increased to 2, and we are back where we started.
Armed with the above ideas we shall now explain the labelling method
by using it to solve the problem defined by the network in Figure 5.8. We
begin by labelling the source with the symbol $b_1 = \infty$, to indicate that
it is theoretically the source of an infinite amount of flow as far as the
method is concerned. All arcs are initially assigned an excess capacity as
defined by (5.3), with $f_{ij} = 0$. Any unlabelled points directly connected
to a labelled point by arcs with positive excess capacity are identified.
Thus, if unlabelled point $j$ is such that

$$e_{ij} > 0$$

for some arc $(i,j)$ and some labelled point $i$, point $j$ is then labelled
with the ordered pair $(a_j, b_j)$, where

    $a_j = i$, the starting point for $(i,j)$
    $b_j = \min\{e_{ij}, b_i\}$, the maximum possible flow.

This represents the fact that it is possible to find a path from the source
to point $j$ which can carry an extra $b_j$ units of flow. Thus points $p_2$
and $p_4$ are unlabelled and connected to the labelled point $p_1$. Hence they
are labelled (1,2) and (1,4), respectively. The labelled point with the
smallest index which is connected to an unlabelled point is identified. This
is point $p_2$, connected to point $p_3$. Thus point $p_3$ is labelled (2,2).
Next the sink is labelled (3,2) by the same reasoning.
Once the sink has been labelled, breakthrough has been achieved. We
have now discovered a path from source to sink which is capable of carrying
$b_n$ additional units of flow; $b_n$ is the second label associated with the
sink, point $p_n$. In the present case the path we have found is capable of
carrying $b_6 = 2$ extra units of flow. This path can be traced back to the
source by examining the $a_i$ values of point labels. For instance, $a_6 = 3$,
hence the path proceeds $p_3 \to p_6$; $a_3 = 2$, hence the path is
$p_2 \to p_3 \to p_6$; and $a_2 = 1$, hence the complete path is
$p_1 \to p_2 \to p_3 \to p_6$. The flows in the arcs of this path are
increased by $b_n$ ($= 2$); i.e., $f_{12} = f_{23} = f_{36} = 2$.
The excess capacity in these arcs is reduced by the amount of flow just
assigned, i.e.,

$$e_{12} = 2 - 2 = 0, \qquad e_{23} = 4 - 2 = 2, \qquad e_{36} = 2 - 2 = 0.$$

The arcs in the opposite direction to those on the path have their excess
capacities increased by the amount of flow just assigned, i.e.,

$$e_{21} = 0 + 2 = 2, \qquad e_{32} = 0 + 2 = 2, \qquad e_{63} = 0 + 2 = 2.$$
Figure 5.9. An initial flow assignment.

All labels except that of the source are then removed. This completes one
iteration of the method. Figure 5.9 indicates the present flow assignment.
Actual flows assigned are shown in parentheses, excess capacities without
parentheses.
The process is then repeated. This time only point $p_4$ can be initially
labelled from the source, as the arc connecting point $p_2$ has zero excess
capacity. It is once again labelled (1,4). Next points $p_3$ and $p_6$ can be
labelled from point $p_4$. The maximum amount of extra flow that can travel
via the path $p_1 \to p_4 \to p_6$ is the minimum of two quantities: the
amount that can arrive at point $p_4$ (4 units) and the excess capacity of arc
(4,6), namely 1. Hence the points $p_3$ and $p_6$ are labelled (4,4) and
(4,1), respectively. Breakthrough has once again been achieved. We have
identified a path, $p_1 \to p_4 \to p_6$, to which we can assign a flow of 1.
We now perform the necessary bookkeeping tasks to keep track of present flow
assignments:

$$\begin{aligned}
f_{14} &= f_{46} = 1 \\
e_{14} &= 4 - 1 = 3 \\
e_{46} &= 1 - 1 = 0 \\
e_{41} &= 0 + 1 = 1 \\
e_{64} &= 0 + 1 = 1.
\end{aligned}$$
Figure 5.10 indicates the present flow assignments. It now looks as if we
have reached a stalemate and cannot assign any further flows by this method.
Arc (1,2) has zero excess capacity. Arc (1,4) has positive excess capacity (3),
but arc (4,6) has zero excess capacity. Hence we would have to send any
flow arriving at $p_4$ from $p_1$ on to $p_3$. But arc (3,6) has zero excess
capacity, so this flow would have to be sent along arcs (3,2), (2,5), and
(5,6). All the arcs on this path have excess capacity. Hence this path
represents a possibility for increasing flow. We have already assigned a flow
of 2 along arc (2,3). Hence if we send any flow along arc (3,2), the flow
along arc (2,3) would be correspondingly reduced. This possibility allows us
to change our minds and remove the allocation of 2 units along arc (2,3).
In practice, the next iteration of the labelling method achieves what we
have just discussed: $p_4$ is labelled (1,3), $p_3$ is labelled (4,3), $p_2$
is labelled (3,2), $p_5$ is labelled (2,2), and $p_6$ is labelled (5,2).
Breakthrough has been achieved. We have discovered the path
$p_1 \to p_4 \to p_3 \to p_2 \to p_5 \to p_6$ along which it is possible to
send an extra 2 units. When we perform the necessary bookkeeping, what happens
to arc (2,3)? The flow of 2 presently in arc (2,3) is cancelled with the flow
of 2 notionally assigned to arc (3,2), leaving zero flow in both (2,3) and
(3,2). The excess capacity of (2,3) is increased:

$$e_{23} = 2 + 2 = 4$$

and

$$e_{32} = 2 - 2 = 0.$$

Hence we are back to the original situation of zero flow between points $p_2$
and $p_3$. The rest of the bookkeeping is recorded:

$$\begin{aligned}
f_{14} &= 1 + 2 = 3, & e_{14} &= 3 - 2 = 1, & e_{41} &= 1 + 2 = 3 \\
f_{43} &= 0 + 2 = 2, & e_{43} &= 4 - 2 = 2, & e_{34} &= 5 + 2 = 7 \\
f_{32} &= 2 - 2 = 0 \\
f_{23} &= 2 - 2 = 0 \\
f_{25} &= 0 + 2 = 2, & e_{25} &= 2 - 2 = 0, & e_{52} &= 0 + 2 = 2 \\
f_{56} &= 0 + 2 = 2, & e_{56} &= 2 - 2 = 0, & e_{65} &= 0 + 2 = 2.
\end{aligned}$$

The present flow assignment is shown in Figure 5.11.
When the next iteration is performed it is found that the sink cannot be
labelled, as there are no arcs incident with the sink with positive excess
capacity. When this occurs the present flow assignment is optimal. As the
value of the present assignment is equal to the capacity of the minimum
cut (5), we could have stopped before this last iteration, knowing the optimum
was at hand by the maximum-flow, minimum-cut theorem.

Figure 5.11

The labelling method in algorithmic form is given below.

Labelling Method
1. Label point $p_1$, the source, with $b_1 = \infty$. Set
       $f_{ij} = 0$
       $e_{ij} = c_{ij}$, for all arcs $(i,j)$.
2. If there is no unlabelled point $p_j$ connected to a labelled point $p_i$
   by an arc with positive excess capacity, terminate: the present assignment
   of flow is optimal. Otherwise go to step 3.
3. Choose the smallest index $i$ of those found in step 2. Set
       $a_j = i$
       $b_j = \min\{e_{ij}, b_i\}$.
4. If the sink point $p_n$ is unlabelled, go to step 2. Otherwise go to step 5.
5. Identify a path of labelled points from source to sink. For each arc
   $(i,j)$ on this path, let
       $e_{ij}$ become $e_{ij} - b_n$
       $e_{ji}$ become $e_{ji} + b_n$.
   If $f_{ji} = 0$, let
       $f_{ij}$ become $f_{ij} + b_n$.
   If $f_{ji} > 0$, let
       $f_{ij}$ become $b_n - f_{ji}$ and $f_{ji}$ become 0, if $b_n \geq f_{ji}$,
   and
       $f_{ij}$ become 0 and $f_{ji}$ become $f_{ji} - b_n$, if $b_n < f_{ji}$.
6. Erase all point labels except that of the source. Go to step 2.
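
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the function name `labelling_max_flow` and the data layout are hypothetical, labelled points are scanned in increasing index order, flows are recovered at the end from the excess capacities instead of being updated arc by arc as in step 5, and the capacities are those recoverable from the worked example (including an arc (3,4) of capacity 5 implied by the bookkeeping).

```python
import math


def labelling_max_flow(n, capacity):
    """Labelling method for points 1..n (1 = source, n = sink).

    capacity: dict mapping an arc (i, j) to c_ij.
    Returns (F, flow), where flow maps each arc to its net flow.
    """
    # Step 1: zero flow, so every excess capacity equals the capacity.
    excess = {}
    for (i, j), c in capacity.items():
        excess[(i, j)] = excess.get((i, j), 0) + c
        excess.setdefault((j, i), 0)
    total = 0
    while True:
        labels = {1: (None, math.inf)}      # point -> (a_j, b_j)
        to_scan = [1]
        # Steps 2-4: label points until the sink is reached or we are stuck.
        while to_scan and n not in labels:
            i = min(to_scan)                # smallest labelled index first
            to_scan.remove(i)
            for j in range(1, n + 1):
                if j not in labels and excess.get((i, j), 0) > 0:
                    labels[j] = (i, min(excess[(i, j)], labels[i][1]))
                    to_scan.append(j)
        if n not in labels:
            break                           # no breakthrough: flow is optimal
        # Step 5: trace the path back from the sink and update the excesses.
        b_n = labels[n][1]
        j = n
        while j != 1:
            i = labels[j][0]
            excess[(i, j)] -= b_n
            excess[(j, i)] += b_n
            j = i
        total += b_n                        # step 6: labels are rebuilt next pass
    flow = {arc: max(0, c - excess[arc]) for arc, c in capacity.items()}
    return total, flow


# Capacities as recovered from the worked example (an assumption).
capacity = {(1, 2): 2, (1, 4): 4, (2, 3): 4, (2, 5): 2, (3, 4): 5,
            (3, 6): 2, (4, 3): 4, (4, 6): 1, (5, 6): 2}
F, flow = labelling_max_flow(6, capacity)
print(F, flow)
```

On this data the sketch finds the same three paths as the worked example, in the same order, and ends with zero flow on arc (2,3).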

5.5.3 The Minimal Cost Flow Problem

Suppose now that a network has not only a capacity but also a unit cost
associated with each arc. The minimal cost flow problem involves finding
the flow assignment for transporting a given quantity $F$ from source to sink
at minimal cost. Using the terminology of Section 5.5.1, the problem can
be formulated mathematically as follows:

$$\begin{aligned}
\text{Minimize:} \quad & \sum_{\text{all arcs } (p_i, p_j)} d_{ij} f_{ij} \\
\text{subject to:} \quad & \sum_{\text{all arcs } (p_i, p_j)} f_{ij} - \sum_{\text{all arcs } (p_j, p_i)} f_{ji} =
\begin{cases} F, & \text{if } i = 1 \\ 0, & \text{if } i \neq 1,\ i \neq n \\ -F, & \text{if } i = n \end{cases} \\
& 0 \leq f_{ij} \leq c_{ij}, \quad \text{for all arcs } (p_i, p_j),
\end{aligned}$$

where $p_1$ corresponds to the source, $p_n$ corresponds to the sink, and
$d_{ij}$ is the unit traversal cost of arc $(p_i, p_j)$.
A multiple-source, multiple-sink minimal cost flow problem with no
intermediate nodes is the transportation problem studied in Chapter 2. Also,
if

$$F = 1$$

and

$$c_{ij} = 1, \quad \text{for all arcs } (p_i, p_j),$$

then the minimal cost flow problem reduces to the shortest path problem
of Section 5.3.

5.5.4 An Algorithm for the Minimal Cost Flow Problem

The following algorithm, due to Busacker and Gowen (1961), will be explained
by using it to solve a minimal cost flow problem concerned with the
network shown in Figure 5.12. Each arc has an ordered pair associated with
it. The first entry in the ordered pair specifies the capacity of the arc, the
second the unit cost. Suppose it is desired to assign a total flow of 5 from
source to sink with minimal cost.
Basically, the algorithm identifies at each iteration a least cost path which
can accommodate further flow. The maximum possible flow is added to the
path. This is repeated until the total flow from source to sink is built up to F.
Figure 5.12


In calculating the cost of each path which can be assigned further flow one
adds the cost of arcs oriented in the direction of the path and subtracts
costs oriented in the opposite direction.
The method begins with zero flow in each arc. The least cost path from
source to sink for the network in Figure 5.12 is
$p_1 \to p_2 \to p_3 \to p_6$. The maximum flow which can be assigned to this
path is the smallest arc capacity, namely 2, due to arcs (1,2) and (3,6). This
flow is duly assigned. The capacity of the arcs involved is correspondingly
reduced. Arcs with zero capacity are given a unit cost of $\infty$. Arcs in
the opposite direction to those on the path are assigned a capacity equal to
the flow just assigned, and a unit cost equal to the negative of that
originally belonging to the arc concerned. This is shown in Figure 5.13. The
flows assigned are written without parentheses.

Figure 5.13

Figure 5.14

The process is now repeated. This time the shortest path is
$p_1 \to p_4 \to p_6$. A flow of 1 can be assigned, as this is the minimum arc
capacity, belonging to (4,6). The arc labels are adjusted, and the result is
shown in Figure 5.14. The next shortest path is
$p_1 \to p_4 \to p_3 \to p_2 \to p_5 \to p_6$, with a length of

$$d_{14} + d_{43} - d_{23} + d_{25} + d_{56} = 3 + 2 - 1 + 5 + 6 = 15.$$

The maximum that can be assigned to this path is 2 units. When this assignment
is made, the 2 units assigned to (3,2) cancel with the 2 units assigned to arc
(2,3) to produce a label for (2,3) of (4,1). Arc (2,3) is now in the same state
as it was originally. This has been brought about by the fact that we changed
our minds about the assignment of the 2 units originally made to (2,3) and
withdrew that allocation. The current assignments and arc labels are shown
in Figure 5.15.

Figure 5.15

The flow now assigned is

    1 unit on path $p_1 \to p_4 \to p_6$
    2 units on path $p_1 \to p_4 \to p_3 \to p_6$
    2 units on path $p_1 \to p_2 \to p_5 \to p_6$,

which sums to the total of 5 units to be assigned. The cost is:

$$\begin{aligned}
1 \times (\text{cost of path } p_1 \to p_4 \to p_6) &= 1 \times (3 + 4) = 7 \\
{}+ 2 \times (\text{cost of path } p_1 \to p_4 \to p_3 \to p_6) &= 2 \times (3 + 2 + 1) = 12 \\
{}+ 2 \times (\text{cost of path } p_1 \to p_2 \to p_5 \to p_6) &= 2 \times (1 + 5 + 6) = 24 \\
\text{Total} &= 43.
\end{aligned}$$

Note that if the flow to be assigned had been more than 5, the problem
would have had no feasible solution, as there are no $p_1 \to p_6$ paths of
finite cost and positive excess capacity left.
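
The procedure of this section can be sketched in Python as below. Several things here are assumptions rather than part of the text: the function name `min_cost_flow`, the use of a Bellman-Ford search to find each least cost path (it copes with the negative unit costs on reversed arcs), the restriction to networks with no pair of opposite arcs, and the arc data, which are the capacities and unit costs recoverable from the worked example (the capacity of arc (4,3) is taken to be 4, as in Figure 5.8).

```python
import math


def min_cost_flow(n, arcs, required):
    """Successive least-cost-path method for points 1..n (1 = source, n = sink).

    arcs: dict mapping an arc (i, j) to (capacity, unit_cost).
          Assumes no pair of opposite arcs (i, j) and (j, i).
    Returns (total_cost, flow); raises ValueError if `required` cannot be sent.
    """
    # residual[(i, j)] = [remaining capacity, unit cost]
    residual = {}
    for (i, j), (cap, cost) in arcs.items():
        residual[(i, j)] = [cap, cost]
        residual[(j, i)] = [0, -cost]       # reversed arc, negative cost
    sent, total_cost = 0, 0
    while sent < required:
        # Bellman-Ford: least cost path using arcs with capacity left.
        dist = {v: math.inf for v in range(1, n + 1)}
        dist[1] = 0
        pred = {}
        for _ in range(n - 1):
            for (i, j), (cap, cost) in residual.items():
                if cap > 0 and dist[i] + cost < dist[j]:
                    dist[j] = dist[i] + cost
                    pred[j] = i
        if dist[n] == math.inf:
            raise ValueError("no feasible solution for the required flow")
        # Largest amount the path can take, without exceeding what is needed.
        amount = required - sent
        j = n
        while j != 1:
            amount = min(amount, residual[(pred[j], j)][0])
            j = pred[j]
        # Assign the flow and adjust the arc labels in both directions.
        j = n
        while j != 1:
            i = pred[j]
            residual[(i, j)][0] -= amount
            residual[(j, i)][0] += amount
            j = i
        sent += amount
        total_cost += amount * dist[n]
    flow = {arc: cap - residual[arc][0] for arc, (cap, _) in arcs.items()}
    return total_cost, flow


# (capacity, unit cost) pairs as recovered from the worked example.
arcs = {(1, 2): (2, 1), (1, 4): (4, 3), (2, 3): (4, 1), (2, 5): (2, 5),
        (3, 6): (2, 1), (4, 3): (4, 2), (4, 6): (1, 4), (5, 6): (2, 6)}
cost, flow = min_cost_flow(6, arcs, 5)
print(cost, flow)
```

On this data the three paths found have costs 3, 7, and 15, matching the worked example, and asking for more than 5 units raises the error.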

5.5.5 The Out-of-Kilter Method*


The procedure of Section 5.5.4 requires the search for a source-to-sink path
prior to labelling and flow assignment at each iteration. Thus the procedure
is not very suitable for large networks. A more general algorithm has been
developed by Ford and Fulkerson (1962) and is efficient when applied to large
networks. This algorithm, called the out-of-kilter method, is presented in
this section.
As with the algorithm of the previous section, flow is progressively added
to the network in the out-of-kilter method. Flow may be added to existing
flow in an arc in the direction of the arc (termed forward flow) and this
increases the flow in the arc. Flow may be added to existing flow in an arc
in the opposite direction to the arc (termed backward flow), and this
decreases the flow in the arc. For instance, the arc $(i,j)$ in
Figure 5.16(a) has 6 units of flow. An addition of 3 units of forward flow
increases the flow in the $i \to j$ direction to 9, as shown in
Figure 5.16(b). An addition of 4 units of backward flow decreases the flow in
the $i \to j$ direction to 2, as shown in Figure 5.16(c).

Figure 5.16. Forward and backward flow.

It is assumed here that no arc can accommodate a positive flow in the
direction opposite to its orientation. The out-of-kilter algorithm will handle
the case where each arc $(p_i, p_j)$ has a positive lower bound $b_{ij}$ on
the amount of flow it carries.
* This section is based on pp. 132-145 of Plane and McMillan (1971).
The reader may have wondered about the unusual name of the method.
It comes about as follows. During the course of the method each arc is
assigned a definite state. The state of a particular arc may change from time
to time. There are two possible states: in kilter, signifying that a change of
flow in the arc will not bring about an improvement; and out of kilter,
signifying that a change of flow in the arc will bring about an improvement.
When all the arcs are in kilter, no further improvement is possible, and the
optimal solution is at hand.
In order to decide how to assign states to the arcs, modified costs are
assigned to them. Recall that each arc has a unit transportation cost,
representing the cost of shipping one unit of flow in the direction of the arc.
These unit costs are modified by adding and subtracting tolls from them.
Let $t_i$ be the toll for one unit arriving at point $i$. When a number of
units arrives at point $i$, let us suppose that $t_i$ must be paid for each
unit. When the units are shipped along arc $(i,j)$, $t_i$ is charged for each
unit. Because conservation of flow is assumed, this shipping causes neither
profit nor loss. However, the total unit cost of shipping along arc $(i,j)$,
including tolls, is denoted by

$$a_{ij} = d_{ij} + t_i - t_j. \tag{5.4}$$

Thus $a_{ij}$ is the modified cost for each arc. The values assigned to the
$t_i$ change from time to time during the course of the method, according to
strict rules. These rules will be explained later.
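
As a small illustration, the modified costs can be computed for every arc at once from the unit costs and the current tolls, using the rule that appears in step 2 of the algorithm, a_ij = d_ij + t_i - t_j. The function name `modified_costs` is an assumption, as are the data: the unit costs are those recoverable from the worked example of Section 5.5.4, together with a zero-cost return arc (6,1), and the tolls are zero at the source and 1 elsewhere.

```python
def modified_costs(unit_cost, toll):
    """Return a_ij = d_ij + t_i - t_j for every arc (i, j)."""
    return {(i, j): d + toll[i] - toll[j] for (i, j), d in unit_cost.items()}


# Unit costs d_ij recovered from the worked example, plus return arc (6, 1).
unit_cost = {(1, 2): 1, (1, 4): 3, (2, 3): 1, (2, 5): 5, (3, 6): 1,
             (4, 3): 2, (4, 6): 4, (5, 6): 6, (6, 1): 0}
toll = {1: 0, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1}
a = modified_costs(unit_cost, toll)
print(a)
```

Only arcs whose two end points carry different tolls change their modified cost; here these are the arcs touching point 1.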
Given a set of modified costs and flows for the arcs, one is in a position
to discover how the flows might be rearranged in order to save costs. For
instance, if a modified cost is negative, the flow in the appropriate arc should
be increased until it reaches capacity or until the modified cost becomes
zero. An arc with a negative modified cost will be out of kilter unless its
flow is at capacity. Also, if a modified cost is positive, the flow in the
appropriate arc should be reduced until it becomes equal to the lower bound
$b_{ij}$.
Once all arcs are in kilter, no more savings can be made and an optimal
solution has been found.
One can also ascertain whether it is possible to add forward flow, backward
flow, both, or neither to a particular arc. We associate with each arc
the following symbols,

    I: in kilter,
    O: out of kilter,
    F: forward flow possible,
    B: backward flow possible,

depending upon its status. An arc will be endowed with either an I or an O
depending upon whether it is in or out of kilter, and with either F, B, FB,
or no further symbol depending upon whether forward flow, backward flow,
both, or neither is capable of being assigned. The particular mix of symbols
assigned to an arc depends upon its current level of flow relative to its
capacity and its modified cost. The rules of assignment are summarized in
Table 5.1. An explanation of how some of the symbols in the table are
arrived at has been given. The reader should satisfy himself that, in view of
the previous discussion, the other entries in the table make sense.

Table 5.1. Assignment rules for the out-of-kilter method

                                        Flow level
Modified
cost          f_ij < b_ij   f_ij = b_ij   b_ij < f_ij < c_ij   f_ij = c_ij   f_ij > c_ij

a_ij > 0      OF            I             OB                   OB            OB
a_ij = 0      OF            IF            IFB                  IB            OB
a_ij < 0      OF            OF            OF                   I             OB
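
The assignment rules can be expressed as a small function. This is a sketch under stated assumptions: the name `arc_state` is hypothetical; an arc with positive modified cost at its lower bound, or with negative modified cost at capacity, is taken to be in kilter with no flow change possible (state I), as the preceding discussion implies; and when the lower bound equals the capacity and the modified cost is zero, the state is taken to be plain I, a case the table does not cover.

```python
def arc_state(a, f, b, c):
    """State of an arc with modified cost a, flow f, lower bound b, capacity c.

    Returns a string such as "OF", "OB", "I", "IF", "IB", or "IFB".
    """
    if f < b:
        return "OF"          # below the lower bound: flow must increase
    if f > c:
        return "OB"          # above capacity: flow must decrease
    if a > 0:
        # Positive modified cost: flow should sit at the lower bound.
        return "I" if f == b else "OB"
    if a < 0:
        # Negative modified cost: flow should sit at capacity.
        return "I" if f == c else "OF"
    # Zero modified cost: in kilter; record which changes remain possible.
    return "I" + ("F" if f < c else "") + ("B" if f > b else "")


print(arc_state(1, 0, 0, 2), arc_state(0, 0, 0, 2), arc_state(0, 0, 5, 5))
```

For example, with all flows zero, an ordinary arc with positive modified cost and lower bound 0 is in state I, while a return arc with lower bound 5 is in state OF.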
Before stating the complete method in algorithmic form we shall outline
the out-of-kilter method in general terms. As with the previous methods in
Section 5.5, we usually begin with all flow assignments set at zero. (However,
if a feasible set of flow assignments is known this could be used instead.) All
point tolls ti are initially set at zero, and then the modified costs aij can be
calculated using (5.4). One can then assign a state to each arc according to
Table 5.1. Next, a path of labelled points is built up. Once breakthrough is
achieved, additional flow is added to the labelled path. Point tolls are
adjusted, all labels are removed, modified costs are recalculated, and the
states of certain arcs may be altered. The process then begins all over again,
building up a new labelled path. When all arcs are in kilter the method is
terminated.
Let us take each of the processes of the method in turn, beginning with
the labelling of the points.

The Labelling Process


To begin the labelling process, when all points are unlabelled, one arbitrarily
chooses an arc $(p_i, p_j)$ that is out of kilter: point $p_j$ is labelled if
forward flow is possible in the arc; point $p_i$ is labelled if backward flow
is possible in the arc. Having done this one searches for other points to be
labelled. A point $p_i$ can be labelled if it is either:

1. Directly connected by an arc $(p_i, p_j)$ to a labelled point $p_j$ and
   backward flow is possible in $(p_i, p_j)$; or
2. Directly connected by an arc $(p_j, p_i)$ to a labelled point $p_j$ and
   forward flow is possible in $(p_j, p_i)$.

A label for point $p_i$ connected to labelled point $p_j$ is of the form:

$$[j^{\alpha}, A_i],$$

where

$$\alpha = \begin{cases} +, & \text{if extra forward flow is possible in arc } (p_j, p_i), \\ -, & \text{if extra backward flow is possible in arc } (p_i, p_j), \end{cases}$$

and $A_i$ is the maximum amount of flow (either forward or backward) that
can be added to the arc joining $p_i$ and $p_j$ which arrives at $p_i$ from
$p_j$. Thus $A_i$ will be the smaller of the following two amounts, where the
subscripts $ij$ refer to the arc joining the two points:

L1. $A_j$, the value which is part of the label of point $p_j$.
L2. (a) $b_{ij} - f_{ij}$, if $a_{ij} > 0$ and $f_{ij} < b_{ij}$;
    (b) $f_{ij} - c_{ij}$, if $a_{ij} < 0$ and $f_{ij} > c_{ij}$;
    (c) $c_{ij} - f_{ij}$, if forward flow is possible and neither (a) nor (b);
    (d) $f_{ij} - b_{ij}$, if backward flow is possible and neither (a) nor (b).

When the very first point at each iteration is labelled, no other points will
have been labelled. In this case $A_i$ is assigned a value according to the
second alternative.
It may be that no further labelling of points can be carried out but some
arcs are still out of kilter. When this occurs all the tolls of unlabelled points
must be adjusted. This means that some modified costs must be recomputed.
This leads to a change of state of at least one arc, making either forward or
backward flow possible. Thus further labelling will be possible. This toll
adjustment is carried out as follows.

Toll Adjustment

T1. Identify all arcs which connect a labelled point and an unlabelled point.
T2. Among all such arcs found in T1, identify those arcs $(p_i, p_j)$ such
    that:
    (a) $a_{ij} > 0$, $p_i$ is labelled, and $f_{ij} \leq c_{ij}$, or
    (b) $a_{ij} < 0$, $p_i$ is not labelled, and $f_{ij} \geq b_{ij}$.
    If no arc meets conditions (a) or (b), the problem has no feasible solution.
T3. Among all arcs identified in T2, find the one with minimum $|a_{ij}|$
    (absolute value of $a_{ij}$).
T4. Increase the tolls of all unlabelled points by the amount found in T3.
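
Steps T1 to T4 can be sketched as a single function. The name `adjust_tolls` and the data layout (each arc mapped to its modified cost, flow, lower bound, and capacity) are assumptions made for illustration, and the sample data are a hypothetical snapshot in which points 1 and 2 are labelled, all flows are zero, and arc (6,1) is a return arc with lower bound and capacity 5.

```python
def adjust_tolls(toll, labelled, arcs):
    """Apply steps T1-T4 and return the new tolls.

    toll:     dict point -> current toll t_i.
    labelled: set of labelled points.
    arcs:     dict (i, j) -> (a_ij, f_ij, b_ij, c_ij).
    Raises ValueError if no arc qualifies (no feasible solution).
    """
    amounts = []
    for (i, j), (a, f, b, c) in arcs.items():
        # T1: keep only arcs joining a labelled and an unlabelled point.
        if (i in labelled) == (j in labelled):
            continue
        # T2(a) and T2(b).
        if a > 0 and i in labelled and f <= c:
            amounts.append(abs(a))
        elif a < 0 and i not in labelled and f >= b:
            amounts.append(abs(a))
    if not amounts:
        raise ValueError("no feasible solution")
    step = min(amounts)                       # T3
    # T4: raise the tolls of the unlabelled points only.
    return {p: t + (0 if p in labelled else step) for p, t in toll.items()}


# Hypothetical snapshot: (a_ij, f_ij, b_ij, c_ij) for each arc.
toll = {1: 0, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1}
arcs = {(1, 2): (0, 0, 0, 2), (1, 4): (2, 0, 0, 4), (2, 3): (1, 0, 0, 4),
        (2, 5): (5, 0, 0, 2), (6, 1): (1, 0, 5, 5)}
new_toll = adjust_tolls(toll, {1, 2}, arcs)
print(new_toll)
```

Here the smallest qualifying modified cost is 1, on arc (2,3), so the tolls of points 3, 4, 5, and 6 each rise by 1.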

The out-of-kilter method operates by considering circulation flows, rather
than source-to-sink flows. A circulation flow is one in which flow travels
round a cycle in the network, returning to the point from which it started
out. In order to be able to use the method on a minimal cost flow problem,
we have to make a minor addition to the network concerned. A sink-to-source
arc $(p_t, p_s)$ is added, where $p_t$ is the sink and $p_s$ is the source.
The arc is assigned unit cost $d_{ts} = 0$ (so as to not affect the cost of
the final solution) and bounds $b_{ts} = c_{ts} = F$, the amount of
source-to-sink flow required. Because of conservation of flow, any feasible
solution must allow $F$ units to flow from $p_s$ to $p_t$ (and back via
$(p_t, p_s)$).

The out-of-kilter method is now stated in algorithmic form.


1. Add arc $(p_t, p_s)$ to the network with $d_{ts} = 0$,
   $b_{ts} = c_{ts} = F$, where $s$ is the source and $t$ is the sink. Set
       $t_i = 0$, for all points $p_i$ in the network
       $f_{ij} = 0$, for all arcs $(p_i, p_j)$ in the network.
2. Calculate $a_{ij} = d_{ij} + t_i - t_j$, for all arcs. Assign a state to
   each arc.
3. If all arcs are in kilter, go to step 13; otherwise, continue.
4. Choose arbitrarily an arc $(p_i, p_j)$ which is out of kilter, and label
   $p_i$ and $p_j$ according to the point labelling procedure.
5. If there is a path of labelled points including the arc $(p_i, p_j)$ found
   in step 4, go to step 11; otherwise, continue.
6. If another point can be labelled, label it according to the point labelling
   procedure and go to step 5; otherwise, continue.
7. Change the tolls according to the toll adjustment procedure. If no tolls
   can be adjusted, no feasible solution exists; terminate.
8. Calculate new modified costs according to (5.4), for all arcs with only
   one unlabelled point.
9. Assign new states for arcs where necessary.
10. If all arcs are in kilter, go to step 13; otherwise, go to step 6.
11. Adjust the flow in each arc on the path found in step 5 by the minimum
    $A_i$ among its point labels.
12. Remove all labels and arc states and go to step 2.
13. The present flow assignment is optimal. Terminate the algorithm.

5.5.6 Numerical Example Illustrating the Out-of-Kilter Method


We shall now solve again the minimal cost flow problem of Figure 5.12 using
the out-of-kilter method. The out-of-kilter method has a rather elaborate
mechanism, and its use on such a relatively small problem is rather like
using a sledgehammer to crack a peanut. The method is designed for large
problems; we use it on a small one only so that the explanation will be brief.
Following the algorithm, we begin in step 1 by setting all tolls and flows
equal to zero. Thus each modified cost calculated in step 2 will equal the
corresponding unit cost. These modified costs and the arc states are shown
in Figure 5.17(a), as well as the sink-to-source arc $(p_6, p_1)$ with
capacity and lower bound 5. (Variables with current value zero are not shown
in the figures accompanying this discussion.)
All arcs are found to be in state I except $(p_6, p_1)$, which is in state OF.
This arc is chosen as in step 4, and point $p_1$ is labelled $[6^+, 5]$,
indicating that a forward flow of 5 is possible from $p_6$ to $p_1$ according
to part L2(c) of the point labelling process. No further labelling can take
place, as neither of the arcs out of $p_1$, namely $(p_1, p_4)$ and
$(p_1, p_2)$, has forward flow possible. But we have not been able to find a
path of labelled points including $(p_6, p_1)$, the original out-of-kilter
arc. Hence, according to step 6, we go to step 7 and adjust the tolls. Arcs
connecting a labelled point to an unlabelled point are $(p_1, p_4)$ and
$(p_1, p_2)$. Hence, according to T2(a), both arcs can be identified and the
tolls of unlabelled points should be increased by $|a_{12}| = 1$.
New modified costs and states are computed as in steps 8 and 9, and this is
shown in Figure 5.17(b).

Figure 5.17(a). Applying the out-of-kilter method.

Figure 5.17(b)

Going back to step 6, we can now label point $p_2$, as forward flow is
possible in arc $(p_1, p_2)$. The label is $[1^+, 2]$, where
$A_2 = 2 = \min\{A_1, e_{12}\} = \min\{5, 2\}$.
Once again no further labelling can take place, so the tolls are adjusted. Arcs
connecting a labelled point to an unlabelled one are $(p_1, p_4)$,
$(p_2, p_3)$, $(p_6, p_1)$, and $(p_2, p_5)$. The tolls of unlabelled points
are increased by $|a_{23}| = 1$. New modified costs and states are computed as
in steps 8 and 9, and this is shown in Figure 5.17(c).

Figure 5.17(c)

Going back to step 6, we can now label point $p_3$, as forward flow is
possible in arc $(p_2, p_3)$. Once again no further labelling can take place,
so the tolls are adjusted. Arcs connecting a labelled point to an unlabelled
one are $(p_1, p_4)$, $(p_2, p_5)$, and $(p_3, p_6)$. The tolls of unlabelled
points are increased by $|a_{36}| = 1$. New modified costs and states are
computed as in steps 8 and 9 and this is shown in Figure 5.17(d).
Going back to step 6, we can now label points $p_4$ and $p_6$, as forward
flow is possible in arcs $(p_1, p_4)$ and $(p_3, p_6)$. We have now created a
cycle of labelled points $\langle p_1, p_2, p_3, p_6, p_1 \rangle$, as
required in step 5. Going to step 11, the flow in the arcs of this path is
adjusted by the minimum $A_i$ among the labels of the points on the path,
namely $A_6 = 2$. All arc states and point labels are removed, as in step 12.
States are calculated as in step 2. These are shown in Figure 5.17(e).
The only arc out of kilter is arc (P6,Pl)' This arc is chosen as in step 4,
and point Pl is labelled. Next point P4 is labelled, as it is connected by arc
(Pl,P4) in which forward flow is possible. As no further labelling is possible,
216 5 Network Analysis

[1 +, 4]
IF (1,4)
a l4 =0 a 46 =4
(4,3)
t6 = 3

P6
[6 +, 5]
PI
a36 = 0 IF
[3 +,2]
(2, 1)
a l2 =0 aS6 =6
IF a 2S = 3
Ps
(2,6)

ts = 3

OF (5,0) b61 = 5 a 61 = 3
Figure 5.17(d)

IF (1,4)
JI4 = 0

(4,3)
= 3
a =0
t6
36
PI r-----------{ P
a12 = 0
6
(2, 1) f36 = 2 IB

(2, 1)
f12 = 2
IB

ts = 3

fl6 = 2 OF (5,0) a61 = 3 b 61 = 5


Figure 5.17(e)

the tolls of unlabelled points are increased by la43 1= 3. New modified costs
and states are computed and are shown in Figure 5.17(f).
It is now possible to label point P3 and then P2' Once again the tolls of
unlabelled points are adjusted. This time they are increased by Id 46 1 = 1. New
modified costs and states are computed and are shown in Figure 5.l7(g).
Figure 5.17(f)

Figure 5.17(g)

It is now possible to label point $p_6$. We have now created a path of labelled
points $\langle p_1, p_4, p_6, p_1 \rangle$, as required in step 5. Going to step 11,
the flow in the arcs of this path is increased by the minimum
$\lambda_i = \lambda_6 = 1$. All arc states and point labels are removed as in
step 12. New modified costs and states are calculated as in step 2, and are shown
in Figure 5.17(h).
Figure 5.17(h)

Figure 5.17(i)

The only out-of-kilter arc is $(p_6, p_1)$. This arc is chosen as in step 4, and
point $p_1$ is labelled. Next point $p_4$ is labelled, as it is connected to $p_1$
by arc $(p_1, p_4)$ in which forward flow is possible. Then point $p_3$ is labelled,
as it is connected to $p_4$ by arc $(p_4, p_3)$. Then point $p_2$ can be labelled, as
it is connected to the labelled point $p_3$ by arc $(p_2, p_3)$ in which backward
flow is possible. We cannot label any more points, so the tolls of unlabelled
points are changed. They are increased by $|\bar{a}_{25}| = 2$. New modified costs
and states are computed and are shown in Figure 5.17(i).

Figure 5.17(j)

It is now possible to label point $p_5$, but no further labels can be attached.
Once again the tolls of unlabelled points are adjusted, and are increased by
$|\bar{a}_{56}| = 6$. New modified costs and states are computed and are shown in
Figure 5.17(j).
It is now possible to label $p_6$. We have created a path of labelled points
$\langle p_1, p_4, p_3, p_2, p_5, p_6, p_1 \rangle$, as required in step 5. Going to
step 11, the flow in the arcs of this path is adjusted by the minimum
$\lambda_i = \lambda_6 = 2$. All arc states and point labels are removed as in
step 12. New modified costs and states are calculated as in step 2. These are
shown in Figure 5.17(k).
The final solution, as shown in Figure 5.17(k), is identical to that found in
Section 5.5.4, as the arc $(p_6, p_1)$ can now be ignored. It will be noticed that
once an arc was in kilter, it never became out of kilter. This is no coincidence
and will always happen. In fact, the method adopts the strategy of changing
the status of out-of-kilter arcs to in kilter, while keeping the status of all
in-kilter arcs unchanged.
Figure 5.17(k)

5.6 Critical Path Scheduling


The reader has no doubt come across industrial or other real-life projects
which on analysis can be seen to be made up of a number of individual
activities, many of which may possibly be carried out simultaneously, assuming
sufficient resources. Usually there are certain pairs of activities $\{a_i, a_j\}$
with the property that $a_j$ cannot be started before $a_i$ is completed. For
example, in the project of building a house it may not be wise to lay the carpet
before the interior walls are painted.
It is often desirable to represent the interrelationships between the activities
by a network, that is, a digraph with a source and a sink. In order to see
how this can be done we need to develop the notion of precedence. We say
that activity $a_i$ precedes $a_j$ if $a_i$ must be completed before $a_j$ can begin.
Of course, certain activities may be preceded by more than one other. We concern
ourselves only with direct precedence. If $a_i$ precedes $a_j$ and $a_j$ precedes $a_k$
then strictly speaking $a_i$ also precedes $a_k$; however, in the construction of a
network to represent the project we shall not take notice of this last fact: the
$a_i$-$a_j$ and $a_j$-$a_k$ precedences imply the $a_i$-$a_k$ precedence.
We associate with each activity $a_i$ a duration time $t_i$, which is the estimated
time to complete $a_i$. The network for a given project is constructed as follows.
Each activity is represented by a point in the network. There is also a unique
source $\alpha$, which represents the start of the project, and a unique sink
$\omega$, which represents the completion of the project. It is assumed that the
activity $\alpha$ precedes any activities with no other precedents. Also the
activity $\omega$ is preceded by any activities which precede no other activities.
The duration times of the activities represented by $\alpha$ and $\omega$ are
defined to be zero. Whenever activity $a_i$ precedes activity $a_j$, join point
$a_i$ to point $a_j$ by arc $(a_i, a_j)$. Associate with each point the duration
time of its activity. It should be noted that the network will not possess any
cycles, for if it did, no activity on a cycle could ever be started.
In any project there will be a number of activities with the following
property: If the start of the activity is delayed any later than it strictly has to
be, or if the duration time of the activity is prolonged, then the completion
time of the whole project will be extended. Such activities are termed critical.
Because of the nature of the precedence relationships and the way we have
constructed the network, there will be at least one source-to-sink path of
critical activities-the longest path from source to sink (in terms of the sum
of the duration times of its points) in the network. The aim of our analysis is
to identify all such critical paths. Then a schedule can be devised giving the
recommended starting and finishing times for each activity. Then if an activity
looks like it is falling behind, extra resources may possibly be channelled into
it from other activities with a comfortable margin.
There is another approach to modelling projects of this sort by digraphs.
This uses arcs to represent activities, while the points represent the events that
certain activities have been completed. Coverage of this approach is beyond the
scope of this book and the interested reader is referred to Taha (1976). That
author also covers the case where the duration time estimates are probabi-
listic in nature; in this case a technique called PERT (Program Evaluation
and Review Technique) is explained. We confine ourselves in this chapter to
constant, given duration times and present what is called the Critical Path
Method (C.P.M.) which will find all critical paths.
We shall explain C.P.M. by using it on a numerical example which has
been streamlined in a rather simple-minded way for expository purposes. Let
us construct the network for the project of building a house with the activities
shown in Table 5.2. Activities 1 and 2 have no precedents, so we create arcs
$(\alpha, 1)$ and $(\alpha, 2)$ as in Figure 5.18. Then we see that activities 3 and 4 are
preceded by these two, and that 3 precedes 4. Thus arcs (2, 3), (1, 3), (1,4), and
(3,4) are created. Proceeding in this way, as 5, 7, 8, and 11 depend upon 3 and
4, arcs (3,11), (3, 7), (3, 8), and (4, 5) are drawn. No arcs are drawn between any
pair of 5, 7, 8, and 11, as they are not related. The next iteration creates arcs
(11,12), (7, 14), (8,9), (8, 10), and (5, 6). Then arcs (12, 13) and (6, 14) are drawn.
Then arcs (13,14) and (13,15) come into being, where the points of arc (13,14)
were already present. Finally points 14 and 15 are connected to $\omega$, as they do
not precede any activities.
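
The construction just described can be sketched in code. The following is a minimal illustration, not part of the original method: the names `PRECEDENCE`, `build_network`, and `topological_order` are hypothetical, the labels `"alpha"` and `"omega"` stand for the source and sink, and the arcs are derived directly from the precedence column of Table 5.2 (so arcs (9, 14) and (10, 14) appear as well). The cycle check uses Kahn's algorithm, which is an assumption of this sketch rather than a technique from the text.

```python
# Precedence lists copied from Table 5.2: activity -> activities that must finish first.
PRECEDENCE = {
    1: [], 2: [], 3: [1, 2], 4: [1, 3], 5: [4], 6: [5], 7: [3], 8: [3],
    9: [8], 10: [8], 11: [3], 12: [11], 13: [12], 14: [6, 7, 9, 10, 13], 15: [13],
}


def build_network(precedence, source="alpha", sink="omega"):
    """Return the arcs of the activity network, including source and sink arcs."""
    # One arc (i, j) for every direct precedence "i precedes j".
    arcs = [(i, j) for j, preds in precedence.items() for i in preds]
    has_successor = {i for i, _ in arcs}
    # The source precedes every activity that has no other precedent.
    arcs += [(source, j) for j, preds in precedence.items() if not preds]
    # The sink is preceded by every activity that precedes no other activity.
    arcs += [(i, sink) for i in precedence if i not in has_successor]
    return arcs


def topological_order(arcs):
    """Order the points so that every arc goes forward; fail if a cycle exists."""
    nodes = {p for arc in arcs for p in arc}
    indegree = {p: 0 for p in nodes}
    successors = {p: [] for p in nodes}
    for i, j in arcs:
        indegree[j] += 1
        successors[i].append(j)
    ready = [p for p in nodes if indegree[p] == 0]
    order = []
    while ready:
        p = ready.pop()
        order.append(p)
        for q in successors[p]:
            indegree[q] -= 1
            if indegree[q] == 0:
                ready.append(q)
    if len(order) != len(nodes):
        raise ValueError("cycle detected: no activity on it could ever be started")
    return order


arcs = build_network(PRECEDENCE)
order = topological_order(arcs)
print(len(arcs), order[0], order[-1])
```

Any order returned by `topological_order` is a valid sequence in which to process the points during the forward pass of the critical path calculation.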
We shall now find all critical paths in the network; their length represents
the minimum possible completion time of the project. Secondly we shall
discover for each activity the earliest time at which it could possibly be started
and the latest time at which it could possibly be finished if the whole project is
to be completed at the earliest possible instant.

Table 5.2. House Building Projects

Activity                                                 Precedence      Time (days)

 1. Excavate to prepare for foundations                  -               10
 2. Establish driveway                                   -               2
 3. Deliver building materials                           1,2             3
 4. Establish foundations                                1,3             15
 5. Build walls and interior                             4               40
 6. Build roof                                           5               10
 7. Build separate garage                                3               10
 8. Hook up power supply                                 3               1
 9. Reticulate house with water, gas and electricity     8               5
10. Wallpaper interior                                   8               2
11. Fence property                                       3               2
12. Landscape section                                    11              5
13. Build swimming pool                                  12              4
14. Spray paint inside and out                           6,7,9,10,13     8
15. Plant garden                                         13              2

Figure 5.18. An activity network.

Naturally, a critical activity will have its earliest start time plus its
duration time equal to its latest finish time, as there is no leeway. Let
    $es_j$ = earliest start time for activity $a_j$
    $lf_j$ = latest finish time for activity $a_j$
    $t_j$ = duration time of activity $a_j$.

Then
    $es_j + t_j = lf_j$, if $a_j$ is critical.    (5.5)
However, if $a_j$ is not critical,
    $es_j + t_j < lf_j$.    (5.6)
The actual leeway is called the total float $tf_j$ for $a_j$:
    $tf_j = lf_j - t_j - es_j$.    (5.7)
Thus a critical activity has zero total float. Given the possibility of
reallocating manpower to speed up ailing activities, it is desirable to define
two further variables for each activity $a_j$:
    $ls_j$ = latest start time of $a_j$ if the project is to be completed on time
    $ef_j$ = earliest possible finish time of $a_j$ given its precedence.
For each activity $a_j$,
    $ef_j = es_j + t_j$    (5.8)
    $lf_j = ls_j + t_j$.    (5.9)
Therefore,
    $tf_j = ls_j - es_j = lf_j - ef_j$.    (5.10)
We associate $es_j$, $lf_j$, $ls_j$, $ef_j$, and $t_j$ with each point $a_j$ in the
network, as shown in Figure 5.19. We fill in the four numbers in the interior of
each circle by a two-pass process.

Figure 5.19. Labelling point $a_j$.

PASS I
(a) Define $es_\alpha = 0$, the earliest start time of the source.
(b) $ef_j$ is defined by (5.8).
(c) $es_j = \max_{(a_i, a_j)} \{ef_i\}$,
where this maximum is taken over all $ef_i$ where arc $(a_i, a_j)$ exists.
Using (a), (b), and (c), $es_i$ and $ef_i$ can be calculated for all points in the
network. We now illustrate pass I on our example, calculating the top two
numbers in each circle in Figure 5.20.
(a) $es_\alpha = 0$.
(b) $ef_\alpha = 0$, by (5.8).
(c) $es_1 = 0$
    $es_2 = 0$.
(b) $ef_1 = 0 + 10 = 10$
    $ef_2 = 0 + 2 = 2$.

Figure 5.20. Calculating start and finish times.

(c) $es_3 = \max\{ef_1, ef_2\} = \max\{10, 2\} = 10$.

(b) $ef_3 = 10 + 3 = 13$.

Proceeding in this way, we eventually calculate

    $ef_\omega = 86$.

This establishes that the earliest possible finish time for building the house
is 86 days.
We now make a backwards pass through the network, filling in the bottom
two numbers in each circle. This is done as follows:

PASS II

(d) Define $lf_\omega = ef_\omega$.
(e) $ls_i$ is defined by (5.9), i.e., $ls_i = lf_i - t_i$.
(f) $lf_i = \min_{(a_i, a_j)} \{ls_j\}$,
where this minimum is taken over all $ls_j$ where arc $(a_i, a_j)$ exists.

Using (d), (e), and (f), $ls_i$ and $lf_i$ can be calculated for all points in the
network. For our example, as shown in Figure 5.20,
(d) $lf_\omega = ef_\omega = 86$.
(e) $ls_\omega = lf_\omega - t_\omega = 86$, by (5.9).
(f) $lf_{14} = 86$
    $lf_{15} = 86$.
(e) $ls_{14} = 86 - 8 = 78$
    $ls_{15} = 86 - 2 = 84$.
(f) $lf_{13} = \min\{ls_{14}, ls_{15}\} = \min\{78, 84\} = 78$.
(e) $ls_{13} = 78 - 4 = 74$.
Proceeding in this way we eventually calculate

    $ls_\alpha = lf_\alpha = 0$,    (5.11)

which must be true for any network. In fact, (5.11) is a good check on the
accuracy of one's arithmetic. Having calculated the four numbers in each
point, it can be seen that, for $i = \alpha, 1, 3, 4, 5, 6, 14, \omega$,
    $es_i = ls_i$    (5.12)
and
    $ef_i = lf_i$.    (5.13)
Points for which (5.12) and (5.13) hold are critical. Thus the critical path is
$\langle \alpha, 1, 3, 4, 5, 6, 14, \omega \rangle$.
We can calculate the total float for each activity using (5.10); the results
are shown in Table 5.3. For example, the total float of $a_2$ is 8 days. This
means that as long as activity 2 is started within 8 days of the earliest possible
time it can be started (day 0), and there are no other critical delays, then the
whole project will still be completed on time.
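
As a quick check of (5.10), the two expressions for the total float of activity 2 can be evaluated side by side. The numbers are those recorded for activity 2 in Table 5.3; the $es$, $ls$, $ef$, $lf$, and $tf$ symbols are the variables defined for (5.5) to (5.10).

```latex
\begin{align*}
tf_2 &= ls_2 - es_2 = 8 - 0 = 8, \\
tf_2 &= lf_2 - ef_2 = 10 - 2 = 8.
\end{align*}
```

Both forms agree, as they must, since $lf_2 - ef_2 = (ls_2 + t_2) - (es_2 + t_2)$ by (5.8) and (5.9).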

Table 5.3.

Activity   Precedence       t_i   es_i   ls_i   ef_i   lf_i   tf_i   ff_i

alpha      -                0     0      0      0      0      0      0
1          -                10    0      0      10     10     0      0
2          -                2     0      8      2      10     8      8
3          1,2              3     10     10     13     13     0      0
4          1,3              15    13     13     28     28     0      0
5          4                40    28     28     68     68     0      0
6          5                10    68     68     78     78     0      0
7          3                10    13     68     23     78     55     55
8          3                1     13     72     14     73     59     0
9          8                5     14     73     19     78     59     59
10         8                2     14     76     16     78     62     62
11         3                2     13     67     15     69     54     0
12         11               5     15     69     20     74     54     0
13         12               4     20     74     24     78     54     0
14         6,7,9,10,13      8     78     78     86     86     0      0
15         13               2     24     84     26     86     60     60
omega      14,15            0     86     86     86     86     0      0

There is another type of float called free float. Total float is a global
concept, in the sense that it defines the leeway in getting an activity started
with regard to the project as a whole. Free float is a local concept, in the
sense that it defines the leeway in getting an activity started with regard
only to the activities it precedes. For example, consider activity 8 with total
float 59. Assume we wish to start the activities which $a_8$ precedes ($a_9$ and
$a_{10}$) as early as possible. Activities 9 and 10 both have earliest start times
of 14. As the earliest finish time of $a_8$ is 14, it cannot be delayed. In this
case the free float of $a_8$ is zero. However consider activity 2. Its earliest
finish time is 2 but the earliest start time of the only activity it precedes
($a_3$) is 10. Thus $a_2$ could be delayed 10 - 2 = 8 days and activity 3 would
still be started as early as possible. In this case the free float of $a_2$ is 8.
In general we define the free float $ff_i$ for activity $a_i$ to be
    $ff_i = \min_{(a_i, a_j)} \{es_j - ef_i\}$,
where the minimum is taken over all $es_j - ef_i$ where arc $(a_i, a_j)$ exists. The
free floats for all activities are also listed in Table 5.3.
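
The two passes and the float definitions can be collected into a short program. This is a sketch under stated assumptions rather than a definitive implementation: the function name `critical_path_method` and the labels `"alpha"` and `"omega"` are hypothetical, the data are those of Table 5.2, and the code relies on the fact that in Table 5.2 every activity has a larger number than all of its precedents, so processing activities in numerical order respects precedence.

```python
DURATION = {1: 10, 2: 2, 3: 3, 4: 15, 5: 40, 6: 10, 7: 10, 8: 1,
            9: 5, 10: 2, 11: 2, 12: 5, 13: 4, 14: 8, 15: 2}
PRECEDENCE = {1: [], 2: [], 3: [1, 2], 4: [1, 3], 5: [4], 6: [5], 7: [3],
              8: [3], 9: [8], 10: [8], 11: [3], 12: [11], 13: [12],
              14: [6, 7, 9, 10, 13], 15: [13]}


def critical_path_method(duration, precedence):
    """Return es, ef, ls, lf, tf, ff for every point, following (5.5)-(5.10)."""
    A, W = "alpha", "omega"                       # source and sink, zero duration
    t = {A: 0, W: 0, **duration}
    # Predecessor lists: the source precedes activities with no other precedent.
    preds = {j: (list(p) if p else [A]) for j, p in precedence.items()}
    preds[A] = []
    with_successor = {i for p in precedence.values() for i in p}
    preds[W] = [i for i in precedence if i not in with_successor]
    succs = {p: [] for p in preds}
    for j, ps in preds.items():
        for i in ps:
            succs[i].append(j)
    # Assumption: numerical order of activities respects precedence (true here).
    order = [A] + sorted(duration) + [W]
    es, ef, ls, lf = {}, {}, {}, {}
    for j in order:                               # Pass I, steps (a)-(c)
        es[j] = max((ef[i] for i in preds[j]), default=0)
        ef[j] = es[j] + t[j]
    for i in reversed(order):                     # Pass II, steps (d)-(f)
        lf[i] = min((ls[j] for j in succs[i]), default=ef[W])
        ls[i] = lf[i] - t[i]
    tf = {j: ls[j] - es[j] for j in order}        # total float, (5.10)
    ff = {i: min((es[j] - ef[i] for j in succs[i]), default=0) for i in order}
    return es, ef, ls, lf, tf, ff


es, ef, ls, lf, tf, ff = critical_path_method(DURATION, PRECEDENCE)
critical = [j for j in ["alpha"] + sorted(DURATION) + ["omega"] if tf[j] == 0]
print(ef["omega"], critical)
```

Run on the house-building data, the sketch reproduces the completion time of 86 days and the floats listed in Table 5.3.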

5.7 Exercises

(I) Computational

1. Solve the following shortest path problems using Dijkstra's method. The entry
i, j in each matrix is the cost of traversing arc (i, j); a dash or a blank space indicates
the fact that there is no arc present. In all cases the i, j and j, i entries are equal.
(a) From 1 to 11.

2 3 4 5 6 7 8 9 10 11

12 12
2 12 6 11
3 12 3 9
4 6 3 5
5 11 9 10
6 5 9 6 12
7 9 6 11
8 10 8 7
9 12 8 9 12
10 II 9 10
11 7 12 10

(b) From 1 to 10.


2 3 4 5 6 7 8 9 10

I 25 14
2 7 2 8
3
4 12
5 18 13
6 20
7 16 7
8 4
9 6

(c) From 1 to 11.


2 3 4 5 6 7 8 9 10 11

1 6 2 3
2 6 7
3 7 8 2
4 6
5 5 8
6 7 1
7 4
8 5 4
9 4
10 5

(d) From 1 to 10.


2 3 4 5 6 7 8 9 10

1 21
2 5 8
3 16 17 24
4 13
5 10
6 12
7 18
8 20
9 19
10

(e) From 1 to 12.


2 3 4 5 6 7 8 9 10 11 12

1 3 2
2 5 6
3 6 7
4 7 7 5
5 2 4
6 1 3
7 4 3 2
8 6
9 9 4
10 5
11

2. Find a minimal spanning tree for each of the problems in Exercise 1 using the
method of Prim.
3. Find a minimal spanning tree for each of the problems in Exercise 1 using the
method of Kruskal.

4. In the following maximum flow problems, the source is point 1 and the sink is the
point with the largest number as its label. The i, j entry in each matrix represents
the capacity of arc (i,j). Find the minimum source-sink cut.

(a) 2 3 4 5 6 7 8

7 12
2 6 4
3 3 3
4 8
5 9 5
6 2 3 4
7 5
8

(b) 2 3 4 5 6 7 8 9 10

I aJ aJ
2 7
3 5
4 6
5 7 4
6 5 8 2
7 4
8
9 4
10

(c) 2 3 4 5 6 7 8

2 3
2 4 8
3 2
4 2 6
5 5 4
6 8
7 9
8

(d) 2 3 4 5 6 7 8 9

4
2 3
3 3 2 .5
4 2 2
5 5 4
6 4 I
7 3
8 2 1 1
9 3 3

(e) 2 3 4 5 6 7 8

3 2
2
3 2 2
4 I
5 3
6 2
7 5
8

(f) 2 3 4 5 6 7 8

2
2 3
3 3 2 2 2
4 2
5 2 4
6 2 2
7 2
8 4 2

(g) 2 3 4 5 6 7 8

I 2
2 3 2
3 2 I 2
4 3 I 2
5 2 2 3 4
6 2 3
7 4
8

(h) 2 3 4 5 6 7 8

I 3 3
2 3 5 3
3 5 2 2 3
4 3 2 4 4
5 4 2
6 2 2
7 3 2 4
8 2 I 4

5. Solve each of the problems in Exercise 4 by the labelling method.


6. Solve each of the following minimal cost flow problems using the out-of-kilter
method. The networks with their arc capacities are given in Exercise 4. Each matrix
below indicates the arc costs. Assume that the amount of flow to be transported
is the maximum amount possible, as found in Exercise 4.

(a) 2 3 4 5 6 7 8

1 5 3
2 3 3
3 8
4 7
5 2 3
6 2 6 4
7 5
8

(b) 2 3 4 5 6 7 8 9 10

2 5 7
2
3 2
4 2
5 7
6 10
7
8 6
9 5
10

(c) 2 3 4 5 6 7 8

1
2 2 3
3 4
4 2 4
5 4
6
7

(d) 2 3 4 5 6 7 8 9

2
3 2
4 2
5 2
6 1
7 2
8
9

(e) 2 3 4 5 6 7 8

2
2
3 2 3
4 I
5 2
6 2
7 2
8

(f) 2 3 4 5 6 7 8

4 3 3
2 4 6
3 3 6 6 7
4 3
6 3 7 2
7 2 5
8 2 5

(g) 2 3 4 5 6 7 8

2
2 2 2
3 2 2
4 2 2 2
5 2 2 2 2
6 2
7 2 2
8

(h) 2 3 4 5 6 7 8

2 4
2 2 8 9
3 6 I I
4 4 6 5 3
5 5 7
6 3 2
7 9 3
8 7 2 3

7. For each of the following projects identify critical activities, earliest completion
time and activity float.

(a) Activity Precedence Duration time

1 14,16,13 3
2 16 7
3 1,2 9
4 9,10,3 4
5 9,10,3 6
6 4,5 1
7 1
8 7 8
9 8 7
10 8,12 4
11 7 5
12 7 2
13 12 3
14 11,15 16
15 12 20
16 17 11
17 11 19

(b) Activity Precedence Duration time

1 5
2 10
3 8
4 6
5 1 12
6 2,4 7
7 3 4
8 5,6,7 6
9 3 10

(c) Activity Precedence Duration time

1 6
2 4
3 2 5
4 2 6
5 2 4
6 3 3
7 4,5 10
8 7 12
9 6,8 4

8. Consider Exercise 7(c). Suppose the duration time of activity 4 is reduced from 6
to 4 units. How does this affect the outcome?

9. Consider the project of painting the exterior of a house with two coats of paint.
Assume a team of three men is to carry out the task. Construct a list of about 10
activities with their duration times. Analyze the project using critical path sched-
uling.
10. Carry out critical path scheduling on each of the following tasks: making jam,
bottling fruit, making a cup of coffee, laying a concrete path.
11. Critical path scheduling assumes there is sufficient manpower to do as many
activities simultaneously as is necessary. Examine the solutions obtained to
Exercise 10 to determine the smallest number of people necessary to carry out each
task in minimum time.

(II) Theoretical
12. Prove observations 1-3 of Section 5.2.
13. A graph is termed simple if it has no loops (lines of the form $\{p_i, p_i\}$) or parallel lines
(lines connecting the same pair of points). Show that a simple graph with n points
can have no more than n(n - 1)/2 lines.
14. Prove that a simple graph (see Exercise 13) with n vertices must be connected if it
has more than (n - 1)(n - 2)/2 lines.
15. Prove that if $G_1$ and $G_2$ are the two subgraphs resulting from any decomposition of
a connected graph G, then there must be at least one point which is in both $G_1$ and $G_2$.
16. Prove that a line in a graph G belongs to at least one circuit in G if and only if G
remains connected after the removal of the line.
17. Prove that all trees are simple (see Exercise 13).
18. Suppose that it is desired to find the shortest tour for a travelling salesman in a
connected, weighted graph G. Prove that the weight of a minimal spanning tree
of G is a lower bound on the weight of the minimal tour.
19. Two distributors, A and B, have 6 and 4 units, respectively, of a commodity on
hand. Warehouses C and D require 3 and 4 units, respectively. Unit shipping costs
to supply C and D from A are $1.00 and $2.00, respectively, and from B are $4.00
and $3.00, respectively.
(a) Devise a network representation of this situation, regarded as a minimal cost
flow problem. Add a supersource $S_o$, a supersink $S_i$, and an arc $(S_i, S_o)$ to the
network, taking care to label all arcs as is necessary for implementation of the
out-of-kilter algorithm.
(b) Implement the out-of-kilter algorithm on the problem until toll adjustment
occurs for the first time.
(c) State why it is no longer possible to proceed with the implementation of the
algorithm.
(d) Devise a new network formulation with a single node representing both $S_i$ and
$S_o$, making it possible to solve the problem using the out-of-kilter algorithm.
(e) Implement the algorithm on the new network up to and including the first toll
adjustment.
(f) State the difference in conditions between the situations reached for (b) and
(e), and explain why it is possible to proceed with the algorithm from the latter
situation.
Chapter 6

Dynamic Programming

6.1 Introduction
Dynamic programming is a technique for formulating problems in which
decisions are to be made in stages-a multistage decision problem. This
represents a departure from the types of problems we have analyzed so far,
where it has been assumed that all decisions are made at one time. It is not
difficult to think of real world scenarios which are multistage decision
problems. Many construction projects can be divided up into stages cor-
responding to the completion of events. However, there are also many such
problems in which different stages are not identified with different time
periods. For instance, many problems involving the investment of funds to
maximize return can be formulated with the different investment options
being represented by different stages.
Dynamic programming (D.P.) has been used successfully to solve problems
from a wide variety of areas including all branches of engineering, operations
research, and business. It is an implicit enumeration approach (as was
branch and bound enumeration, presented in Chapter 4) and can be very
useful in reducing the computational effort required to solve a problem by
other means. However, before the reader begins to think that he has found
the answer to all his planning problems let us sound a note of caution.
There are weaknesses with the D.P. approach, including the large number
of intermediate calculations that have to be recorded. This is summed up as
"the curse of dimensionality," which will be referred to later in this chapter.
The name of the technique was coined by Richard Bellman (1957), who
developed D.P. and also wrote the first book on the subject. Since that time
many books on D.P. have appeared, including those by Bellman and Dreyfus
(1962), Hadley (1964), Nemhauser (1966) and White (1969). This vast and
ever expanding field could not be explained in any depth in a single chapter
of a book of the present size. Hence all that is attempted here is to introduce
some of the basic D.P. ideas with a view to stimulating the reader to attempt
some of the more specialized texts mentioned earlier. In particular Hadley's
book is recommended for techniques and White's for the mathematical
theory of D.P. A knowledge of the calculus is required to comprehend the
remainder of this book. The unprepared reader is referred to the appendix.

6.2 A Simple D.P. Problem


Consider a tramper who wishes to walk from a national park hut to the
coast. On studying the map of the area he finds that there is quite a network
of paths linking the intermediate huts one day's walk apart. He rates each
path with a number which represents the enjoyment to be gained by walking
along it, based upon scenery, the likely number of users, and travel time.
He wishes to select a route with maximum enjoyment. The network is shown
in Figure 6.1, where point 1 represents his present hut and points 8,9, and
10 each represent coastal huts. The rating for each path is shown alongside
its arc.

Figure 6.1. The network for the tramper's problem.

The problem is to find the longest path from point 1 to any of points
8, 9, or 10.
As in most combinatorial optimization problems, it is theoretically
possible to evaluate all solutions to this problem and select the best; this is
exhaustive enumeration, as discussed in Chapter 4. As more points are
introduced into the network, however, the amount of computational effort
required quickly becomes enormous. Clearly, a method which reduces the
number of calculations required for exhaustive enumeration must be
employed for networks with a reasonably large number of points. Dynamic
programming offers such a reduction and will now be applied to the present
problem.
It can be seen from Figure 6.1 that the tramper must pass through exactly
one of the points from each of the following sets:
{l} (stage 0)
{2,3,4} (stage 1)
{5,6,7} (stage 2)
{8,9,10} (stage 3).
When the tramper is currently at a point in one of these sets he is at a parti-
cular stage of his journey. The stages are numbered so that the number of
a stage represents the number of paths walked to get to it from point 1.
When the tramper is at a particular stage he will be in a particular state
(apart from probably being cold, wet, tired, or hungry!), defined to be the
particular point of that stage at which he is located. Associated with each
state there is a return, which represents the maximum possible enjoyment the
tramper could have experienced so far in arriving at that point. These
concepts will now be used to solve the problem.
Initially the tramper leaves point 1 (stage 0), walks to one of points
2, 3, or 4, and finds himself at stage 1. He is now in either state 2, with a
return of 0.3; state 3, with a return of 0.9; or state 4, with a return of 0.6.
He now leaves stage 1 and proceeds to stage 2, ending up in one of states
5,6, or 7. If he proceeds to state 5, which route is best in the sense of affording
maximum enjoyment? He could have come from state 2, with cumulative
enjoyment of 1.1 (0.3 + 0.8); or from state 3, with cumulative enjoyment of
1.3 (0.4 + 0.9). Thus the return at state 5 is the maximum of these two,
which is 1.3. By the same reasoning, the return at state 6 is the maximum
of (0.2 + 0.3), (0.5 + 0.9), and (0.6 + 1.0), which is 1.6. Similarly, the return
at state 7 is the maximum of (0.7 + 0.9) and (0.9 + 0.6), which is 1.6. In sum,
the returns at states 5, 6, and 7 are 1.3, 1.6, and 1.6, respectively.
The tramper now leaves stage 2 and arrives in stage 3 in one of states
8, 9, or 10. The return at state 8 can be calculated by adding the returns for
states from which state 8 is accessible to the gains incurred in making the
transition to state 8. For instance, if the tramper arrived at state 8 from state
5, the maximum enjoyment would be the return at state 5 (1.3) plus 0.8,
i.e., 2.1. Note that one does not need to know how the return of 1.3 for
state 5 was arrived at. It is sufficient to know that the state 5 return is 1.3.
This fact embodies a very important assumption made in the problems to
be solved by D.P. in this chapter. This assumption is that the return for a
state depends only upon the returns at the states of the previous stage and
the gains in making the transitions from them. If the tramper arrived at state 8
from state 6
the maximum enjoyment would be the return at state 6 plus 0.4, i.e., 1.6 +
0.4. Thus the return at state 8 is the maximum of 1.3 + 0.8 and 1.6 + 0.4,
i.e., 2.1. Similarly the return at state 9 is the maximum of 1.3 + 0.8, 1.6 + 0.2,
and 1.6 + 0.1, i.e., 2.1. The return at state 10 is the maximum of 1.6 + 0.1
and 1.6 + 0.6, i.e., 2.2.
So state 10 has the largest return, and we now know that this return of
2.2 represents the maximum enjoyment that can be attained. The actual path
to be traversed in attaining this maximum can be found by unravelling the
information contained in the state returns. To begin with it was the return
at state 7 (1.6) plus the gain from the state 7 to state 10 transition (0.6) that
produced the return of 2.2 at state 10. Hence point 7 and arc (7, 10) are on
the longest path. By the same token it was the return at state 3 (0.9) plus the
gain from the state 3 to state 7 transition (0.7) that produced the return of
1.6 at state 7. Hence point 3 and arc (3, 7) are on the path. Thus arc (1, 3) must
also be included. The optimal path is then
    $\langle 1, 3, 7, 10 \rangle$.
There are 17 paths from point 1 to points 8,9, and 10. We could have
evaluated them all and chosen the longest. The above approach involves
less calculation and benefits become more and more apparent as the net-
work size increases. The solution procedure just unfolded contains the basic
approach of dynamic programming. The next section sets the stage in a
more general fashion.
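
The count of 17 paths can be checked by brute force. The sketch below is illustrative only: the arc ratings in `RATING` are reconstructed from the arithmetic of the worked example above (they are assumed to match Figure 6.1), and the names `all_paths` and `enjoyment` are hypothetical.

```python
# Path ratings, reconstructed from the worked example (assumed to match Figure 6.1).
RATING = {
    (1, 2): 0.3, (1, 3): 0.9, (1, 4): 0.6,
    (2, 5): 0.8, (2, 6): 0.2,
    (3, 5): 0.4, (3, 6): 0.5, (3, 7): 0.7,
    (4, 6): 1.0, (4, 7): 0.9,
    (5, 8): 0.8, (5, 9): 0.8,
    (6, 8): 0.4, (6, 9): 0.2, (6, 10): 0.1,
    (7, 9): 0.1, (7, 10): 0.6,
}
COAST = {8, 9, 10}


def all_paths(start, goals):
    """Yield every path (as a list of points) from start to any goal point."""
    if start in goals:
        yield [start]
        return
    for (i, j) in RATING:
        if i == start:
            for rest in all_paths(j, goals):
                yield [start] + rest


def enjoyment(path):
    """Total rating of a path: the sum of the ratings of its arcs."""
    return sum(RATING[i, j] for i, j in zip(path, path[1:]))


paths = list(all_paths(1, COAST))
best = max(paths, key=enjoyment)
print(len(paths), best, round(enjoyment(best), 1))
```

Exhaustive enumeration evaluates every complete path separately, whereas the stage-by-stage calculation reuses each state's return once it is known; that reuse is where the saving comes from.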

6.3 Basic D.P. Structure


The longest path problem of the previous section has the following property:
In finding the return at a particular point $p_j$ by arriving from a given point
$p_i$, all one needed to know was the return at $p_i$ and the gain in the $p_i$ to $p_j$
transition. This gain was independent of the way in which the system
arrived at $p_i$. Systems with this property are called serial systems. Thus a
serial system is one in which the return at a stage i of the system depends
only upon the returns at the stage (i - 1) immediately preceding it and the
gains in transforming the system from stage (i - 1) into stage i.
In nonserial systems this property is not present and feedback loops or
dependence upon earlier stages may occur. D.P. can be extended to analyze
these problems; the resulting theory is related to another area called optimal
control. This topic is outside the scope of the present book. We shall deal
with only serial systems.
As can be seen by the example problem, in a serial system one is required
to make a number of sequential, interrelated decisions. A complete set of
decisions for a serial system problem, representing a solution to the problem,
is called a policy. A single decision of how to transform the system from one
stage to the next is called a policy choice. And a set of policy choices which
transform the system from some intermediate stage to the final stage is
called a subpolicy.
We now introduce notation which will allow us to express the D.P.
approach to the example problem in general terms. Let
N = the number of the last stage in the problem
Sn = the state of the system at the nth stage

cij = the benefit gained in transforming the system from state i to state j
f,.(s) = the return when the system is in state s at the nth stage
(i.e., f,.(s) is the optimal benefit gained in transforming the system from the
initial stage to state s at the nth stage). Let us now explain this notation in
terms of the longest path problem. We wish to find the longest path from
stage 0 to stage 3. That is, we require a path of arcs of the form
«so, S1), (Sh S2), (S2, S3)
whose total benefit
N

L
;=1
CSi-1Si' N=3
is a maximum.
We will begin to solve the problem by using the above machinery. Initially
the system is in state 1 at stage 0, having accrued no benefit so far. Thus
So = I
and
lo(so) = O.
The returns at the next stage are calculated as simple additions: the return
at stage I in state 2 is
11(2) = 10(1) + C12 = 0 + 0.3 = 0.3.
Similarly,
11(3) = 0.9
and
11(4) = 0.6.
Let us now calculate the return at stage 2, state 6. The tramper can
arrive at state 6 from one of states 2, 3, or 4. The respective benefits are
11(2) + C26
11(3) + C36
11(4) + C46'
240 6 Dynamic Programming

The return at stage 2, state 6 (f2(6)) is the maximum of these. Thus

f2(6) = max {J1(sd + Cs,6}


s,; 2.3,4
= max {(0.3 + 0.2), (0.9 + 0.5), (0.6 + 1.0)} = 1.6.

The other returns at stage 2, f2(5) and f2(7), can be found in the same way.
Having calculated the three stage 2 returns, we can then use them to
find the stage 3 returns. In general the return for stage n, state s is:
$$f_n(s) = \max_{s_{n-1}} \{ f_{n-1}(s_{n-1}) + c_{s_{n-1} s} \}, \quad n = 1, 2, \ldots \qquad (6.1)$$

We can now use (6.1) to solve the longest path problem:


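$$f_0(1) = 0$$
$$f_1(2) = 0.3, \quad f_1(3) = 0.9, \quad f_1(4) = 0.6$$
$$f_2(5) = \max_{s_1 = 2, 3} \{ f_1(s_1) + c_{s_1 5} \} = \max \{(0.3 + 0.8), (0.9 + 0.4)\} = 1.3$$
$$f_2(6) = 1.6, \text{ as found before}$$
$$f_2(7) = \max_{s_1 = 3, 4} \{ f_1(s_1) + c_{s_1 7} \} = \max \{(0.9 + 0.7), (0.6 + 0.9)\} = 1.6.$$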

Using these values in (6.1) recursively, we can calculate the stage 3 returns:
$$f_3(8) = \max_{s_2 = 5, 6} \{ f_2(s_2) + c_{s_2 8} \} = \max \{(1.3 + 0.8), (1.6 + 0.4)\} = 2.1$$
$$f_3(9) = \max_{s_2 = 5, 6, 7} \{ f_2(s_2) + c_{s_2 9} \} = \max \{(1.3 + 0.8), (1.6 + 0.2), (1.6 + 0.1)\} = 2.1$$
$$f_3(10) = \max_{s_2 = 6, 7} \{ f_2(s_2) + c_{s_2 10} \} = \max \{(1.6 + 0.1), (1.6 + 0.6)\} = 2.2.$$
Thus the optimal solution has value 2.2, with the actual longest path being
$\langle 1, 3, 7, 10 \rangle$, as found before.
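The forward recursion (6.1) for the longest path problem can be sketched in a few lines of Python. This is only an illustration: the arc benefits are the values that appear in the worked calculations (Figure 6.1 is not reproduced here, so the dictionary `ARCS` is an assumption), and the names `ARCS`, `STAGES` and `longest_path` are hypothetical.

```python
# Arc benefits c[(i, j)], read off the worked calculations in the text.
# Figure 6.1 is not reproduced here, so treat this dictionary as an assumption.
ARCS = {
    (1, 2): 0.3, (1, 3): 0.9, (1, 4): 0.6,
    (2, 5): 0.8, (3, 5): 0.4,
    (2, 6): 0.2, (3, 6): 0.5, (4, 6): 1.0,
    (3, 7): 0.7, (4, 7): 0.9,
    (5, 8): 0.8, (6, 8): 0.4,
    (5, 9): 0.8, (6, 9): 0.2, (7, 9): 0.1,
    (6, 10): 0.1, (7, 10): 0.6,
}
STAGES = [[1], [2, 3, 4], [5, 6, 7], [8, 9, 10]]


def longest_path(arcs, stages):
    """Forward recursion (6.1): f_n(s) = max over s' of f_{n-1}(s') + c[s', s]."""
    returns = {s: 0.0 for s in stages[0]}        # f_0(1) = 0
    best_prev = {}                               # best predecessor of each state
    for prev_states, states in zip(stages, stages[1:]):
        for s in states:
            candidates = [(returns[p] + arcs[(p, s)], p)
                          for p in prev_states if (p, s) in arcs]
            returns[s], best_prev[s] = max(candidates)
    end = max(stages[-1], key=lambda s: returns[s])
    path = [end]
    while path[-1] in best_prev:                 # walk back to the initial state
        path.append(best_prev[path[-1]])
    return returns[end], path[::-1]


value, path = longest_path(ARCS, STAGES)
print(round(value, 3), path)
```

Only the returns of the previous stage are consulted when a stage is evaluated, which is what allows the problem to be solved one stage at a time.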
Equations of the form of (6.1) are called recursive equations. Such equations,
in one form or another, are usually used in solving a problem by
dynamic programming. The family of equations in (6.1) underlines the key
fact that an optimal subpolicy at any stage of a multistage decision problem
depends upon the state at that stage, and does not depend upon policy
choices made at earlier stages. This can be stated as follows:
The Dynamic Programming Principle of Optimality. When a system is at a
given stage, the decisions of the optimal policy for future stages will con-
stitute an optimal subpolicy regardless of how the system entered that stage.
Any system optimization problem for which the above principle is true
can be attacked using D.P. Such systems are the serial systems, as described
earlier.
We now examine some of the implications of this principle. In the longest
path problem, when the tramper left stage 2 and walked to one of points
8, 9, or 10 (stage 3), the return at each state of stage 3 was calculated without
regard to states prior to stage 2. This allowed us to solve the problem one
stage at a time. For example, we could temporarily "forget" about earlier
decisions and find the best returns for the stage 3 states by examining only
the stage 2 state returns and the stage 2 to stage 3 benefits.
In the example problem the returns at each stage were calculated by
finding the maximum among sums of pairs of numbers. In other uses of
D.P. different ways of calculating optima must be used. How this is done is
not part of the D.P. approach in itself. The user of D.P. is very much alone
in finding returns and must use what ingenuity and knowledge of the partic-
ular system that he has.

6.4 Multiplicative and More General Recursive Relationships
As was stated in the previous section, the returns at each stage were cal-
culated by finding the maximum among sums of pairs of numbers. Dynamic
programming can also be applied to serial problems in which returns are
calculated in other ways. Such a problem will now be presented.
Consider once again the network of Figure 6.1, but suppose it represents
a different scenario. We now have a spy who wishes to send a confidential
document from his present station (point 1) to any of the three receiving
stations in his home base (points 8, 9, and 10). The number $c_{ij}$ attached to
the arc $(p_i, p_j)$ in the network represents the probability that the document
will be safely transmitted from the station represented by $p_i$ to the station
represented by $p_j$ without falling into enemy hands. The problem is to find
the route from $p_1$ to one of $p_8$, $p_9$, and $p_{10}$ which affords the highest
probability of a safe trip.
Using the notation of the previous section, we wish to find a path of
arcs of the form
$$\langle (s_0, s_1), (s_1, s_2), (s_2, s_3) \rangle$$
whose total probability
$$\prod_{i=1}^{N} c_{s_{i-1} s_i}, \quad N = 3$$
is a maximum.
Notice here that we are multiplying relevant $c_{ij}$ values together, rather
than adding them as we did in the longest path problem. This is because
of a basic property of probability theory: If events A and B are independent
with probabilities p(A) and p(B), then the probability of both A and B
occurring is p(A)p(B). Thus (6.1) has a different form for this problem, namely
$$f_n(s) = \max_{s_{n-1}} \{ f_{n-1}(s_{n-1}) \times c_{s_{n-1} s} \}, \quad n = 1, 2, \ldots, \qquad (6.2)$$
where
$$f_0(1) = 1.0. \qquad (6.3)$$

Note that (6.3) is true because the system begins in state 1 with probability
one.
The only difference between (6.1) and (6.2) occurs in the replacement of
the "+" in (6.1) by the "×" in (6.2). It is conceivable that other operations
may be involved in the interaction between $f_{n-1}(s_{n-1})$ and $c_{s_{n-1} s}$, such as
$$f_n(s) = \max_{s_{n-1}} \{ f_{n-1}(s_{n-1}) \pm \sqrt{c_{s_{n-1} s}} \}, \quad n = 1, 2, \ldots$$

Hence the general recursive equation (forward form) is:
$$f_n(s) = \operatorname{optimum}_{s_{n-1}} \{ f_{n-1}(s_{n-1}) \oplus c_{s_{n-1} s} \}, \quad n = 1, 2, \ldots, \qquad (6.4)$$
where "$\oplus$" is an operation on $f_{n-1}(s_{n-1})$ and $c_{s_{n-1} s}$ depending upon the
particular system being analyzed, and the objective may be one of
maximization or minimization.
The problem of the spy will now be solved using (6.2) and (6.3):
$$f_1(2) = 1.0 \times 0.3 = 0.3$$
$$f_1(3) = 1.0 \times 0.9 = 0.9$$
$$f_1(4) = 1.0 \times 0.6 = 0.6$$
$$f_2(5) = \max_{s_1 = 2, 3} \{ f_1(s_1) \times c_{s_1 5} \} = \max \{(0.3 \times 0.8), (0.9 \times 0.4)\} = 0.36$$
$$f_2(6) = \max_{s_1 = 2, 3, 4} \{ f_1(s_1) \times c_{s_1 6} \} = \max \{(0.3 \times 0.2), (0.9 \times 0.5), (0.6 \times 1.0)\} = 0.6$$
$$f_2(7) = \max_{s_1 = 3, 4} \{ f_1(s_1) \times c_{s_1 7} \} = \max \{(0.9 \times 0.7), (0.6 \times 0.9)\} = 0.63.$$

Using these values in (6.2) we can calculate the stage 3 returns:
$$f_3(8) = \max_{s_2 = 5, 6} \{ f_2(s_2) \times c_{s_2 8} \} = \max \{(0.36 \times 0.8), (0.6 \times 0.4)\} = 0.288$$
$$f_3(9) = \max_{s_2 = 5, 6, 7} \{ f_2(s_2) \times c_{s_2 9} \} = \max \{(0.36 \times 0.8), (0.6 \times 0.2), (0.63 \times 0.1)\} = 0.288$$
$$f_3(10) = \max_{s_2 = 6, 7} \{ f_2(s_2) \times c_{s_2 10} \} = \max \{(0.6 \times 0.1), (0.63 \times 0.6)\} = 0.378.$$

Thus the optimal solution has probability 0.378. (Let's hope it isn't vital
that the documents arrive safely, as the chances aren't too high!) The actual
route is
$\langle 1, 3, 7, 10 \rangle$.
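The general forward form (6.4) differs between problems only in the operation that combines a return with an arc value. A minimal Python sketch of that idea follows; the function name `forward_recursion` and its parameters are hypothetical, and the arc values in `ARCS` are those used in the worked calculations rather than read from Figure 6.1, which is not reproduced here.

```python
import operator

# Arc values c[(i, j)] as used in the worked calculations of Sections 6.3 and 6.4.
# Figure 6.1 is not reproduced here, so this dictionary is an assumption.
ARCS = {
    (1, 2): 0.3, (1, 3): 0.9, (1, 4): 0.6,
    (2, 5): 0.8, (3, 5): 0.4,
    (2, 6): 0.2, (3, 6): 0.5, (4, 6): 1.0,
    (3, 7): 0.7, (4, 7): 0.9,
    (5, 8): 0.8, (6, 8): 0.4,
    (5, 9): 0.8, (6, 9): 0.2, (7, 9): 0.1,
    (6, 10): 0.1, (7, 10): 0.6,
}
STAGES = [[1], [2, 3, 4], [5, 6, 7], [8, 9, 10]]


def forward_recursion(arcs, stages, combine, initial_return, optimum=max):
    """Forward recursion (6.4); `combine` stands for the operation in (6.4)."""
    returns = {s: initial_return for s in stages[0]}
    best_prev = {}
    for prev_states, states in zip(stages, stages[1:]):
        for s in states:
            candidates = [(combine(returns[p], arcs[(p, s)]), p)
                          for p in prev_states if (p, s) in arcs]
            returns[s], best_prev[s] = optimum(candidates)
    best_value, end = optimum((returns[s], s) for s in stages[-1])
    route = [end]
    while route[-1] in best_prev:                # walk back to the initial state
        route.append(best_prev[route[-1]])
    return best_value, route[::-1]


# Spy problem: multiply probabilities, starting from f_0(1) = 1.0 as in (6.3).
spy_value, spy_route = forward_recursion(ARCS, STAGES, operator.mul, 1.0)
# Longest path problem: add benefits, starting from f_0(1) = 0.
walk_value, walk_route = forward_recursion(ARCS, STAGES, operator.add, 0.0)
print(round(spy_value, 3), spy_route)
print(round(walk_value, 3), walk_route)
```

Passing `optimum=min` would turn the same routine into a minimization, which is the other case allowed by (6.4).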

6.5 Continuous State Problems


In the problems analyzed so far the state and decision variables have been
allowed to assume only values from a finite, discrete set. In this section this
assumption is relaxed, and we allow the variables to assume any feasible
real value: the continuous state serial system problems. Such a problem is
presented below:
$$\text{Maximize:} \quad x_0 = \sum_{i=1}^{N} \sqrt{x_i} \qquad (6.5)$$
$$\text{subject to:} \quad \sum_{i=1}^{N} x_i = d, \text{ a positive real constant} \qquad (6.6)$$
$$x_i > 0, \quad i = 1, 2, \ldots, N. \qquad (6.7)$$

That is, it is desired to subdivide a given positive real number $d$ into $N$
positive parts, $x_1, x_2, \ldots, x_N$ ($N$ being given), so that the sum of the square
roots of the parts is a maximum. We now approach this problem with
dynamic programming.
The problem can be looked upon as a serial system problem in which it
is desired to assign a value to each $x_i$ one at a time in the order $x_1, x_2, \ldots, x_N$.
Thus the problem has $N$ stages. When the system is at the nth stage, the
state of the system $s_n$ is defined to be the amount of the number $d$ which
has been assigned to $x_1, x_2, \ldots, x_n$ so far, i.e.,
$$s_n = x_1 + x_2 + \cdots + x_n, \quad n = 1, 2, \ldots, N. \qquad (6.8)$$
Because the only restrictions on the decision variables $x_i$ are those of (6.6)
and (6.7), there is an infinite number of possibilities at each stage.
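It so happens, because of the addition of the terms $\sqrt{x_i}$ in (6.5), that the
recursive relationship is additive. So (6.4) becomes
$$f_n(s_n) = \max_{x_n} \{ f_{n-1}(s_{n-1}) + \sqrt{x_n} \}, \quad n = 2, 3, \ldots, N.$$
But, from (6.8),
$$s_{n-1} = s_n - x_n, \quad n = 2, 3, \ldots, N.$$
Thus
$$f_n(s_n) = \max_{0 < x_n \le s_n} \{ f_{n-1}(s_n - x_n) + \sqrt{x_n} \}, \quad n = 2, 3, \ldots, N \qquad (6.9)$$
and
$$f_1(s_1) = \sqrt{x_1}. \qquad (6.10)$$
Now, from (6.8),
$$s_1 = x_1,$$
so that, from (6.10),
$$f_1(s_1) = \sqrt{s_1}. \qquad (6.11)$$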

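Setting n = 2 in (6.9), we obtain
$$f_2(s_2) = \max_{0 < x_2 \le s_2} \{ f_1(s_2 - x_2) + \sqrt{x_2} \}.$$
By (6.11),
$$f_2(s_2) = \max_{0 < x_2 \le s_2} \{ \sqrt{s_2 - x_2} + \sqrt{x_2} \}.$$
Thus, in order to find $f_2(s_2)$ we must find the maximum value of
$(\sqrt{s_2 - x_2} + \sqrt{x_2})$, where $x_2$ can range between 0 and $s_2$. To do this
we use basic differential calculus. Let
$$F_2(x_2) = \sqrt{s_2 - x_2} + \sqrt{x_2}.$$
Then
$$\frac{\partial F_2}{\partial x_2} = \tfrac{1}{2}(s_2 - x_2)^{-1/2}(-1) + \tfrac{1}{2} x_2^{-1/2}.$$
For a stationary point,
$$\frac{\partial F_2}{\partial x_2} = 0.$$
Hence
$$(s_2 - x_2)^{-1/2} = x_2^{-1/2}.$$
Therefore
$$x_2^* = s_2/2,$$
which is certainly in the range $(0, s_2]$. Also,
$$\frac{\partial^2 F_2(s_2/2)}{\partial x_2^2} = \left[ -\tfrac{1}{4}(s_2 - x_2)^{-3/2} - \tfrac{1}{4} x_2^{-3/2} \right]_{x_2 = s_2/2} = -\tfrac{1}{4}(s_2/2)^{-3/2} - \tfrac{1}{4}(s_2/2)^{-3/2} < 0,$$
indicating a maximum. Thus
$$f_2(s_2) = \sqrt{s_2 - (s_2/2)} + \sqrt{s_2/2} = \sqrt{2 s_2}. \qquad (6.12)$$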
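Now, setting n = 3 in (6.9),
$$f_3(s_3) = \max_{0 < x_3 \le s_3} \{ f_2(s_3 - x_3) + \sqrt{x_3} \}.$$
By (6.12),
$$f_3(s_3) = \max_{0 < x_3 \le s_3} \{ \sqrt{2(s_3 - x_3)} + \sqrt{x_3} \}.$$
We repeat the calculus technique just used in order to find $f_3(s_3)$. Let
$$F_3(x_3) = \sqrt{2(s_3 - x_3)} + \sqrt{x_3}.$$
Then
$$\frac{\partial F_3}{\partial x_3} = \tfrac{1}{2}[2(s_3 - x_3)]^{-1/2}(-2) + \tfrac{1}{2} x_3^{-1/2}.$$
For a stationary point,
$$\frac{\partial F_3}{\partial x_3} = 0,$$
hence
$$2[2(s_3 - x_3)]^{-1/2} = x_3^{-1/2}.$$
Therefore
$$x_3^* = s_3/3, \qquad (6.13)$$
which is certainly in the range $(0, s_3]$. Also,
$$\frac{\partial^2 F_3(s_3/3)}{\partial x_3^2} = \left[ -\tfrac{1}{4}[2(s_3 - x_3)]^{-3/2}(4) - \tfrac{1}{4} x_3^{-3/2} \right]_{x_3 = s_3/3} = -\tfrac{1}{4}[2(s_3 - (s_3/3))]^{-3/2}(4) - \tfrac{1}{4}(s_3/3)^{-3/2} < 0,$$
indicating a maximum. Thus
$$f_3(s_3) = \sqrt{2(s_3 - (s_3/3))} + \sqrt{s_3/3} = \sqrt{3 s_3}.$$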
Let us now review what has been achieved so far by way of temporarily
setting N = 3. In this case the problem has been solved, and
$$s_3 = x_1 + x_2 + x_3 = d$$
$$x_3^* = s_3/3 = d/3, \quad \text{by (6.13).}$$
Thus
$$x_1^* + x_2^* = s_2 = 2d/3,$$
hence
$$x_2^* = s_2/2 = d/3,$$
and therefore
$$x_1^* = d/3.$$
Also,
$$f_n(s_n) = \sqrt{n s_n}, \quad n = 1, 2, 3$$
and the optimal solution has value $f_3(s_3) = \sqrt{3d}$.
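The reader has no doubt suspected by now that the above results are true
for general N. That this is so will be proved by induction:
$$x_n^* = s_n/n, \quad n = 1, 2, 3, \ldots, N \qquad (6.14)$$
$$f_n(s_n) = \sqrt{n s_n}, \quad n = 1, 2, 3, \ldots, N. \qquad (6.15)$$
Now (6.14) and (6.15) are certainly true for n = 1, 2, and 3. Assume that they
are true for n = k, i.e.,
$$x_k^* = s_k/k$$
$$f_k(s_k) = \sqrt{k s_k}.$$
By (6.9),
$$f_{k+1}(s_{k+1}) = \max_{0 < x_{k+1} \le s_{k+1}} \{ f_k(s_{k+1} - x_{k+1}) + \sqrt{x_{k+1}} \} = \max_{0 < x_{k+1} \le s_{k+1}} \{ \sqrt{k(s_{k+1} - x_{k+1})} + \sqrt{x_{k+1}} \}.$$
Let
$$F_{k+1}(x_{k+1}) = \sqrt{k(s_{k+1} - x_{k+1})} + \sqrt{x_{k+1}}.$$
For a stationary point,
$$\frac{\partial F_{k+1}}{\partial x_{k+1}} = 0,$$
hence
$$\tfrac{1}{2}[k(s_{k+1} - x_{k+1})]^{-1/2}(-k) + \tfrac{1}{2} x_{k+1}^{-1/2} = 0.$$
Therefore
$$x_{k+1}^* = s_{k+1}/(k + 1),$$
which is certainly in the range $(0, s_{k+1}]$. Also,
$$\frac{\partial^2 F_{k+1}}{\partial x_{k+1}^2} < 0 \quad \text{at } x_{k+1} = \frac{s_{k+1}}{k+1},$$
indicating a maximum:
$$f_{k+1}(s_{k+1}) = \sqrt{k(s_{k+1} - s_{k+1}/(k + 1))} + \sqrt{s_{k+1}/(k + 1)} = \sqrt{(k + 1) s_{k+1}},$$
which completes the proof that (6.14) and (6.15) are true. Since $s_N = d$,
(6.14) gives $x_N^* = d/N$ and leaves $s_{N-1} = (N - 1)d/N$; applying (6.14)
again at each earlier stage gives the same amount every time. The solution to
the problem is
$$x_n^* = d/N, \quad n = 1, 2, \ldots, N$$
and the optimal solution value is
$$\sqrt{Nd}.$$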
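The closed-form result can be checked numerically by applying (6.9) on a grid. The sketch below is an illustration under stated assumptions: $d$ is restricted to whole units, an allocation of zero is allowed for convenience (a relaxation of (6.7) that does not change the maximum), and the function name `best_split` is hypothetical.

```python
import math


def best_split(d, n_parts):
    """Grid version of (6.9): split d whole units into n_parts parts so that
    the sum of the square roots of the parts is as large as possible."""
    f = [math.sqrt(s) for s in range(d + 1)]      # f_1(s) = sqrt(s), as in (6.11)
    decisions = []                                # decisions[k][s] = best x at stage k + 2
    for _ in range(2, n_parts + 1):
        stage = [max((f[s - x] + math.sqrt(x), x) for x in range(s + 1))
                 for s in range(d + 1)]
        f = [value for value, _ in stage]
        decisions.append([x for _, x in stage])
    parts, s = [], d
    for best_x in reversed(decisions):            # trace the decisions back from s_N = d
        parts.append(best_x[s])
        s -= best_x[s]
    parts.append(s)                               # x_1 = s_1
    return f[d], parts[::-1]


value, parts = best_split(12, 3)
print(value, parts, math.sqrt(3 * 12))
```

With d = 12 and N = 3 the equal split lies on the grid, so the tabulated recursion and the formula $\sqrt{Nd}$ can be compared directly.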

6.6 The Direction of Computations


The problems solved so far in this chapter have all been approached by
finding values for the return functions $f_n$ in the order $f_1, f_2, \ldots, f_N$. This is
called forward recursion. In terms of the longest path problem and the spy
problem this approach seems logical. However, there exist some serial
systems for which the reverse order is more straightforward. That is, it is
sometimes desirable to calculate the $f_n$ in the order $f_N, f_{N-1}, \ldots, f_1$. This is
called backward recursion, and it involves "working back" through the
problem in the opposite direction to the actual sequence of events as they
will take place when a feasible solution is implemented.
Let us approach the longest path problem using backward recursion.
First we redefine $f_n(s)$ as follows:

$f_n(s)$ is the return (optimal benefit to be gained) from the remaining stages,
(n + 1), (n + 2), ..., N, given that the system is in state $s$ at the nth stage.

This represents a departure from the definition of $f_n(s)$ in Section 6.3, where
$f_n(s)$ was the return gained so far from the previous stages, 1, 2, ..., n, given
that the system is in state $s$ at the nth stage.
In terms of the longest path problem, we know that there is no further
benefit to be gained once the system is in any of states 8, 9, or 10 at stage 3.
Thus
$$f_3(8) = f_3(9) = f_3(10) = 0.$$
Suppose the system is in state 6, stage 2. What further benefit can the
tramper look forward to? If he proceeds to state 8, 9, or 10, the extra benefits
are 0.4, 0.2, and 0.1, respectively. Thus the extra benefit to be gained in
leaving state 6 and arriving at the coast is $(f_3(8) + 0.4)$, $(f_3(9) + 0.2)$, or
$(f_3(10) + 0.1)$, depending upon which choice is made. However, each first
term in these expressions is zero; thus the maximum additional benefit to be
had in leaving state 6 is 0.4:
$$f_2(6) = \max_{s_3 = 8, 9, 10} \{ f_3(s_3) + c_{6 s_3} \} = f_3(8) + 0.4 = 0.4.$$

The state returns can be similarly calculated for each other stage 2 state.
In general the backward recursive equations for this problem are
$$f_n(s) = \max_{s_{n+1}} \{ f_{n+1}(s_{n+1}) + c_{s s_{n+1}} \}, \quad n = 0, 1, 2 \qquad (6.16)$$
where
$$f_3(s) = 0, \quad s = 8, 9, 10. \qquad (6.17)$$

The reader may find it instructive to actually use (6.16) and (6.17) to verify
that this approach produces the same optimal solution as is obtained by
forward recursion.
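One way to carry out this verification is with a short Python sketch of backward recursion. The arc benefits are again those appearing in the worked calculations (an assumption, since Figure 6.1 is not reproduced here), and the names `ARCS`, `STAGES` and `longest_path_backward` are hypothetical.

```python
# Arc benefits c[(i, j)], read off the worked calculations in the text.
# Figure 6.1 is not reproduced here, so treat this dictionary as an assumption.
ARCS = {
    (1, 2): 0.3, (1, 3): 0.9, (1, 4): 0.6,
    (2, 5): 0.8, (3, 5): 0.4,
    (2, 6): 0.2, (3, 6): 0.5, (4, 6): 1.0,
    (3, 7): 0.7, (4, 7): 0.9,
    (5, 8): 0.8, (6, 8): 0.4,
    (5, 9): 0.8, (6, 9): 0.2, (7, 9): 0.1,
    (6, 10): 0.1, (7, 10): 0.6,
}
STAGES = [[1], [2, 3, 4], [5, 6, 7], [8, 9, 10]]


def longest_path_backward(arcs, stages):
    """Backward recursion: f_n(s) = max over next states t of f_{n+1}(t) + c[s, t]."""
    returns = {s: 0.0 for s in stages[-1]}       # nothing further to gain at the last stage
    best_next = {}                               # best successor of each state
    for states, next_states in zip(reversed(stages[:-1]), reversed(stages[1:])):
        for s in states:
            candidates = [(returns[t] + arcs[(s, t)], t)
                          for t in next_states if (s, t) in arcs]
            returns[s], best_next[s] = max(candidates)
    start = stages[0][0]
    path = [start]
    while path[-1] in best_next:                 # follow the best successors forward
        path.append(best_next[path[-1]])
    return returns, path


returns, path = longest_path_backward(ARCS, STAGES)
print(round(returns[1], 3), round(returns[6], 3), path)
```

Here `returns[6]` is the return-to-go from state 6 computed in the text, and `returns[1]` is the value of the whole problem.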
There is no difference in the computational effort required to solve the
problem by forward or backward recursion. This is because the benefit cij
gained from transforming the system from state i to state j is a given constant
which is simply added to the return of the present state (state i for forward
recursion and state j for backward recursion). However, not all serial systems
rejoice in such simplicity. Indeed, for problems with more complicated state
transformations there may be a very marked difference in the amount of
computational effort required depending upon whether forward or backward
recursion is used. In fact some problems can be solved in only one direction.
For instance, serial systems in which the state variable at each stage is
random can be solved only by backward recursion.
We end this section by stating the general recursive equation (backward
form):
$$f_n(s) = \operatorname{optimum}_{s_{n+1}} \{ f_{n+1}(s_{n+1}) \oplus c_{s s_{n+1}} \}.$$

6.7 Tabular Form


In the problems examined so far it has been a relatively easy matter to keep
track of all the intermediate information necessary to use the recursive
equations, as in each case there have not been many stages. For problems
with many more stages and many possibilities at each stage one needs an
efficient way of recording state returns, such as in tables. We illustrate this
now on the following problem.
The Easy Tread shoe company has received an order for 20 truckloads
of its biggest seller, the Won't Get Wet Tennis Shoe. The warehouse wishes
to receive the total order all at one time. As the company can make at most
5 truckloads in any one production period, production must be scheduled
over a number of periods. In fact, production costs vary over the next 6
periods, the time during which the order must be filled or it will be lost.
Table 6.1 gives the total production cost for different numbers of truckloads
produced in the different periods, 1, 2, ..., 6. In addition to production
costs there are also storage (inventory) costs of $1.00 per truckload stored per
period. These inventory costs are incurred for complete periods only. Thus
if 3 loads are produced in period 5 they incur only the cost of storage in
period 6, i.e. $(3 x 1). The problem is to schedule production over the
6 periods so as to guarantee that the 20 loads are ready at the end of the
sixth period and the total costs (production and inventory) are minimized.

Table 6.1.

Production              Cost in period
number           1     2     3     4     5     6

0                3     2     4     1     5     2
1                4     6     6     7    11     6
2                9    11    11    12    12     9
3               16    15    12    19    14    10
4               19    18    14    27    19    15
5               20    21    20    32    23    20

This problem will now be formulated as a serial system and solved by
D.P. Each production period will constitute a stage. At the nth stage, the
state variable $s_n$ is defined as the total number of loads produced so far by
the end of that stage. Thus, assuming the system starts out at stage 0,
$$s_0 = 0.$$
Suppose, for instance, that three loads were produced in the first period;
then
$$s_1 = 3.$$
As 20 loads must be produced by the end of the sixth period,
$$s_6 = 20.$$
Setting the problem up in general terms, let
$N$ = the number of the last stage (in this case, N = 6)
$c_{ij}$ = the production cost if $i$ loads are produced in period $j$
$x_i$ = the number of loads produced in period $i$
$f_n(s)$ = the return (minimum cost that can be incurred) when the system
is in state $s$ at the nth stage.
Assume that after n - 1 periods $s_{n-1}$ loads have been produced, i.e.,
$$x_1 + x_2 + \cdots + x_{n-1} = s_{n-1},$$
and the return for the system to be in state $s_{n-1}$ at the (n - 1)th stage is
known, i.e., $f_{n-1}(s_{n-1})$ has been calculated. Suppose now that it is decided
to produce $x_n$ loads in period n. The costs involved are $c_{x_n n}$ for production
and $1.0\,x_n(N - n)$ for inventory. (We have assumed forward recursion is to
be adopted.) Thus the complete cost involved in this decision is
$$f_{n-1}(s_n - x_n) + c_{x_n n} + 1.0\,x_n(N - n),$$
as
$$s_{n-1} = s_n - x_n.$$
Thus
$$f_n(s_n) = \min_{0 \le x_n \le 5,\; x_n \le s_n} \{ f_{n-1}(s_n - x_n) + c_{x_n n} + 1.0\,x_n(N - n) \}, \quad n = 1, 2, 3, \ldots, N \qquad (6.18)$$
where
$$f_0(s_0) = 0 \qquad (6.19)$$
and
$$s_0 = 0.$$
We now use (6.18) and (6.19) to solve the problem, storing information
calculated in tables. We begin by creating a table of inventory costs, where
the entry in the ith row, jth column represents the total inventory cost if i
loads are produced in the jth period. (See Table 6.2.) Using (6.18), (6.19) and
Tables 6.1 and 6.2 we can calculate the stage 1 returns $f_1(s_1)$ for each possible
state, $s_1 = 0, 1, 2, \ldots, 5$. For instance, if nothing is produced at stage 1,
$s_1 = 0$. The contribution from Table 6.1 is $c_{01} = 3$ and the contribution
from Table 6.2 is zero, i.e.,
$$f_1(0) = 3.$$
However, if one load is produced at stage 1, then $s_1 = 1$. The production
cost is $c_{11} = 4$ and the inventory cost of Table 6.2 is 5, i.e.,
$$f_1(1) = 9.$$
The complete set of values is given in Table 6.3.

Table 6.2

                     Cost in period:
Number           1     2     3     4     5     6

0                0     0     0     0     0     0
1                5     4     3     2     1     0
2               10     8     6     4     2     0
3               15    12     9     6     3     0
4               20    16    12     8     4     0
5               25    20    15    10     5     0

Table 6.3

s_1    x_1*    f_1(s_1)

0      0        3
1      1        9
2      2       19
3      3       31
4      4       39
5      5       45

We now calculate the stage 2 returns. Recall that
$$s_2 = x_1 + x_2.$$
As
$$0 \le x_1 \le 5$$
$$0 \le x_2 \le 5,$$
then
$$0 \le s_2 \le 10.$$
Suppose, for instance, that
$$s_2 = 6.$$
The $(x_1, x_2)$ pairs which result in this value of $s_2$ are: (1,5), (2,4), (3,3), (4,2),
(5,1). Using (6.18) and Table 6.3, we obtain the following costs:
$x_1 = 1, x_2 = 5$: cost = $f_1(1)$ + 20 + 21 = 50
$x_1 = 2, x_2 = 4$: cost = $f_1(2)$ + 16 + 18 = 53
$x_1 = 3, x_2 = 3$: cost = $f_1(3)$ + 12 + 15 = 58
$x_1 = 4, x_2 = 2$: cost = $f_1(4)$ + 8 + 11 = 58
$x_1 = 5, x_2 = 1$: cost = $f_1(5)$ + 4 + 6 = 55.
Taking the minimum of these costs, we obtain
$$f_2(6) = 50.$$

Table 6.4

                   x_2
s_2      0     1     2     3     4     5    x_2*   f_2(s_2)

0        5     -     -     -     -     -    0        5
1       11    13     -     -     -     -    0       11
2       21    19    22     -     -     -    1       19
3       33    29    28    30     -     -    2       28
4       41    41    38    36    37     -    3       36
5       47    49    50    46    43    44    4       43
6        -    55    58    58    53    50    5       50
7        -     -    64    66    65    60    5       60
8        -     -     -    72    73    72    3,5     72
9        -     -     -     -    79    80    4       79
10       -     -     -     -     -    86    5       86

We now lay out the calculations for all possible stage 2 states and their
returns in Table 6.4. As 20 loads have to be produced by stage 6, at least
20 - 5(6 - n) loads have to be produced by stage n, where n = 3, 4, 5, 6.
Hence at least 5 loads must be produced by stage 3, i.e.,
$$5 \le s_3 \le 15.$$
The stage 3 returns are calculated as shown in Table 6.5 by the previous
method, the stage 4 returns are calculated in Table 6.6, remembering that
$10 \le s_4 \le 20$; the stage 5 returns are calculated in Table 6.7, remembering
that $15 \le s_5 \le 20$; and the stage 6 returns, where $s_6 = 20$, are shown in
Table 6.8.
The optimal solution is $x_1^* = 1$, $x_2^* = 5$, $x_3^* = 4$, $x_4^* = 0$, $x_5^* = 5$, $x_6^* = 5$,
with minimum cost $125.

Table 6.5

                   x_3
s_3      0     1     2     3     4     5    x_3*   f_3(s_3)

5       47    45    45    40    37    40    4       37
6       54    52    53    49    45    46    4       45
7       64    59    60    57    54    54    4,5     54
8       76    69    67    64    62    63    4       62
9       83    81    77    71    69    71    4       69
10      90    88    89    81    76    78    4       76
11       -    95    96    93    86    85    5       85
12       -     -   103   100    98    95    5       95
13       -     -     -   107   105   107    4      105
14       -     -     -     -   112   114    4      112
15       -     -     -     -     -   121    5      121

Table 6.6

                   x_4
s_4      0     1     2     3     4     5    x_4*   f_4(s_4)

10      77    78    78    79    80    79    0       77
11      86    85    85    87    89    87    1,2     85
12      96    94    92    94    97    96    2       92
13     106   104   101   101   104   104    2,3    101
14     113   114   111   110   111   111    3      110
15     122   121   121   120   120   118    5      118
16       -   130   128   130   130   127    5      127
17       -     -   137   137   140   137    2,3,5  137
18       -     -     -   146   147   147    3      146
19       -     -     -     -   156   154    5      154
20       -     -     -     -     -   163    5      163

Table 6.7

                   x_5
s_5      0     1     2     3     4     5    x_5*   f_5(s_5)

15     123   122   115   109   108   105    5      105
16     132   130   124   118   115   113    5      113
17     142   139   132   127   124   120    5      120
18     151   149   141   135   133   129    5      129
19     159   158   151   144   141   138    5      138
20     168   166   160   154   150   146    5      146

Table 6.8

                   x_6
s_6      0     1     2     3     4     5    x_6*   f_6(s_6)

20     148   144   138   130   128   125    5      125
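The tabular calculation can be cross-checked with a short Python sketch of recursion (6.18). It is an illustration under stated assumptions: the names `PRODUCTION_COST` and `schedule` are hypothetical, the cost of producing nothing in period 4 is taken to be 1 (the value implied by the first row of Table 6.6, where 77 = 76 + 1), and each stage is filled in by extending every reachable state, which is equivalent to taking the minimum in (6.18).

```python
# PRODUCTION_COST[i][j - 1] is the cost of producing i loads in period j (Table 6.1).
# The period-4 entry for zero loads is assumed to be 1, as implied by Table 6.6.
PRODUCTION_COST = [
    [3, 2, 4, 1, 5, 2],
    [4, 6, 6, 7, 11, 6],
    [9, 11, 11, 12, 12, 9],
    [16, 15, 12, 19, 14, 10],
    [19, 18, 14, 27, 19, 15],
    [20, 21, 20, 32, 23, 20],
]


def schedule(cost, order, max_per_period, storage_rate):
    """Forward recursion (6.18) over the states s_n = loads produced so far."""
    n_periods = len(cost[0])
    f = {0: 0}                                   # (6.19): f_0(0) = 0
    decisions = []                               # decisions[n - 1][s] = best x_n for state s
    for n in range(1, n_periods + 1):
        new_f, best_x = {}, {}
        for s_prev, cost_so_far in f.items():
            for x in range(max_per_period + 1):
                s = s_prev + x
                if s > order:
                    continue
                total = (cost_so_far + cost[x][n - 1]
                         + storage_rate * x * (n_periods - n))
                if s not in new_f or total < new_f[s]:
                    new_f[s], best_x[s] = total, x
        f = new_f
        decisions.append(best_x)
    plan, s = [], order
    for best_x in reversed(decisions):           # trace back from s_N = order
        plan.append(best_x[s])
        s -= best_x[s]
    return f[order], plan[::-1]


total_cost, plan = schedule(PRODUCTION_COST, order=20, max_per_period=5, storage_rate=1)
print(total_cost, plan)
```

The dictionaries play the role of the tables: `f` holds the last column of the current stage table and `decisions` holds the columns of optimal choices.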

6.8 Multi-state Variable Problems and the Limitations of D.P.
The dynamic programming formulation of each serial system problem
examined so far has the property of possessing just one state variable at
each stage. Many problems cannot be adequately formulated without
defining two or more state variables per stage. (There are techniques available
to reduce the number of state variables. See, for example, Bellman and
Dreyfus (1962). Coverage of these is, however, beyond the scope of this book.)
Although such problems can in theory be solved by dynamic programming,
in practice the amount of computational effort is often enormous. For
example, consider a 5-stage serial system with a single state variable, capable
of assuming 10 different states at each stage. At each stage no more than
10 calculations are necessary to evaluate the return for each state, i.e., there
are $10^2$ calculations per stage. Thus a maximum of $5 \times 10^2$ calculations are
needed to solve the problem. Suppose now that a new state variable is added
at each stage, which also is capable of assuming 10 states at each stage. A
maximum of $10^3$ calculations are necessary per stage and thus $5 \times 10^3$
calculations may be necessary to solve the problem. Thus the number of
calculations has increased by a factor of 10. For problems with more
possibilities per stage, the increases are enormous, hence the term "the curse of
dimensionality," coined by R. Bellman (1957) for this problem. This is a
severe limitation to the ability of dynamic programming to solve realistic
serial system problems.
We now present and formulate a two-state variable serial system problem
with dynamic programming.
The Quick As A Flash freight company carries cargo in its aircraft
between two cities. Each aircraft has 1,000 cubic feet of capacity and can
carry 5,500 lb of freight. The company accepts three commodities for
carriage, $C_1$, $C_2$, and $C_3$, with unit volumes of 100, 200, and 50 cu. ft, respectively,
and unit weights of 1,000, 500, and 1,500 lb, respectively. The profits for
transporting one item of each of $C_1$, $C_2$, and $C_3$ are $90, $100, and $150,
respectively. The problem is to decide how many of each of $C_1$, $C_2$, and $C_3$
will be flown per trip in order to maximize profit.
The decision variable is $x_n$, the number of $C_n$ accepted, and the stages
correspond to an allocation of each of the three commodities, $C_1$, $C_2$, $C_3$.
Since the problem has two constraints (volume and weight) at each stage
the system will be in two states, $s_n$ and $t_n$, corresponding to the total volume
and weight, respectively, which has been allocated to the nth stage. Let
$f_n(s_n, t_n)$ be the return at the nth stage when the system is in states $s_n$ and $t_n$.
The problem can be stated as:
Maximize: $90x_1 + 100x_2 + 150x_3 = x_0$
subject to: $100x_1 + 200x_2 + 50x_3 \le 1000$
$1000x_1 + 500x_2 + 1500x_3 \le 5500$
$x_1, x_2, x_3$ nonnegative integers.
This is, of course, an integer programming problem and could be solved by
the methods of Chapter 4. This particular scenario, of deciding how many
of a number of commodities to select subject to various restrictions, is
another example of the knapsack problem, also called the fly-away kit problem.

Assume that at the nth stage $s_n$ and $t_n$ units of volume and weight,
respectively, have been allocated:
$$\sum_{i=1}^{n} v_i x_i = s_n, \quad n = 1, 2, 3$$
and
$$\sum_{i=1}^{n} w_i x_i = t_n, \quad n = 1, 2, 3,$$
where
$$(v_1, v_2, v_3) = (100, 200, 50)$$
and
$$(w_1, w_2, w_3) = (1000, 500, 1500).$$
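We shall adopt forward recursion. Suppose that the return of the system
in states $s_n$ and $t_n$ is known, i.e., $f_n(s_n, t_n)$ has been calculated. Suppose now
that it has been decided to allocate $x_{n+1}$ of $C_{n+1}$. The profit at the (n + 1)th
stage is then
$$p_{n+1} x_{n+1},$$
where
$$(p_1, p_2, p_3) = (90, 100, 150).$$
Thus the return at the (n + 1)th stage is
$$f_{n+1}(s_{n+1}, t_{n+1}) = \max_{x_{n+1}} \{ p_{n+1} x_{n+1} + f_n(s_n, t_n) \}$$
$$= \max_{x_{n+1}} \{ p_{n+1} x_{n+1} + f_n(s_{n+1} - v_{n+1} x_{n+1},\; t_{n+1} - w_{n+1} x_{n+1}) \}, \qquad (6.20)$$
where $x_{n+1}$ is an integer such that
$$0 \le x_{n+1} \le \left[ \frac{1000}{v_{n+1}} \right] \qquad (6.21)$$
and
$$0 \le x_{n+1} \le \left[ \frac{5500}{w_{n+1}} \right], \qquad (6.22)$$
where [a] is the integer part of real number a.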


If the reader solves this problem using (6.20) and (6.21) he will become
convinced that the addition of the extra constraint (6.22) has created a great
deal of extra computation. If further constraints of the form of (6.21) and
(6.22) were added to the problem, it would become increasingly unattractive
to solve it using dynamic programming.

The optimal solution to problem (6.20), (6.21), (6.22) is
$$x_1^* = 0$$
$$x_2^* = 4$$
$$x_3^* = 2$$
with value
$$x_0^* = 700.$$
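A two-state recursion of this kind can be sketched in Python as follows. The sketch is an illustration, not the author's formulation: it tracks the volume and weight still available rather than the amounts already allocated (an equivalent bookkeeping choice), and the names `VOLUME`, `WEIGHT`, `PROFIT` and `best` are hypothetical.

```python
from functools import lru_cache

VOLUME = (100, 200, 50)        # cu. ft per item of C1, C2, C3
WEIGHT = (1000, 500, 1500)     # lb per item of C1, C2, C3
PROFIT = (90, 100, 150)        # dollars per item of C1, C2, C3


@lru_cache(maxsize=None)
def best(n, volume_left, weight_left):
    """Best (profit, allocation) using commodities C1..Cn within the two capacities."""
    if n == 0:
        return 0, ()
    v, w, p = VOLUME[n - 1], WEIGHT[n - 1], PROFIT[n - 1]
    options = []
    # x is bounded by both the volume and the weight still available.
    for x in range(min(volume_left // v, weight_left // w) + 1):
        profit, allocation = best(n - 1, volume_left - v * x, weight_left - w * x)
        options.append((profit + p * x, allocation + (x,)))
    return max(options)


profit, allocation = best(3, 1000, 5500)
print(profit, allocation)
```

Each distinct pair of remaining volume and weight is a separate state, so the cache grows with the product of the two ranges; this is the extra computation that a second state variable brings.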

6.9 Exercises
1. Find the shortest path from the point with the lowest index number to the point
with the highest index number in the networks in Exercise 1, Chapter 5, using
dynamic programming.

2. Solve the following problem by dynamic programming using (a) forward recursion
and (b) backward recursion, and compare the computational effort involved in the
two approaches.
Given a total resource of 8 units and a benefit of txn - nx~ at stage n (n = 1,2,
3,4), where Xn is the allocation made at the nth stage, find the optimal allocation
policy to maximize total return if all 8 units must be allocated. Assume that each
X n , n = 1,2, ... , 4, is a nonnegative real number.

3. Solve Exercise 2 if Xn must be a nonnegative integer, n = 1, 2, 3, 4.

4. A production process produces integer numbers of units of a single commodity
over 4 periods I, II, III, and IV, where the maximum number produced in any
period is 6. There is a storage cost of $1.00 per unit per complete period. Table 6.9
gives the production cost for different numbers of units in the different periods.
Find the minimum total cost of production and storage if 19 units must be produced
by the end of period IV. Solve this problem by dynamic programming.

Table 6.9. Data for Exercise 4.

Number             Cost in period:
produced        I     II    III    IV

0               2      6      5     4
1               4      7      8     5
2               8      9     11     9
3               9     11     15    13
4              11     15     16    15
5              12     19     17    17
6              14     20     20    22

5. Solve the following problem by dynamic programming:

Maximize: $x_0 = x_1 x_2 x_3 x_4$
subject to: $x_1 + x_2 + x_3 + x_4 = 9$
i = 1,2,3,4.

6. Solve the following problem by dynamic programming:


Minimize: xf + x~ + x~ + x~ + x~
subject to: XIX2X3X4XS = 11
i = 1,2,3,4, 5.
7. Solve the following linear integer programming problem by dynamic program-
ming:
Minimize: Xl + X2

subject to: 3x I + 4X2 ~ 12


X 1> X 2 nonnegative integers.
8. Solve the following nonlinear integer programming problem by dynamic program-
ming using (a) forward recursion and (b) backward recursion. Compare the amount
of computational difficulty involved in the two approaches.
Maximize: 8xf + 4x~ - 3XI - 4X2

subject to: 3XI + 4X2 ~ 24


4XI + 5x 2 ~ 20
X I, X2 nonnegative integers.
9. (A knapsack problem.) A burglar is confronted with seven objects with respective
weights and values (40,50,30,10, 10,40,30) and (40,60,10,10,3,20,60). He can
carry away 100 units in weight. Solve by dynamic programming the problem of
determining which objects he should remove given an objective of maximum value
and that only one item of each object is available.
10. Solve Exercise 9 with the extra proviso that each object is now also assigned a
volume of (25, 50, 25, 25, 50,0,75), respectively, and the total volume that the bur-
glar can remove is 100 units.
11. Solve the problem posed in Section 6.8.
12. (The farmer's problem.) At the beginning of a certain year a farmer has 20 tons of
seed potatoes. In five years' time he is going to sell all the potatoes, if any, that he
then has. If he keeps a ton of seed potatoes it will produce 3 tons of seed potatoes in
a year's time. He estimates that the selling price of a ton of seed potatoes over the
next five years is going to be 400,330,44, 15, and 5 units, respectively. The problem
is to decide how many tons of seed to keep and plant each year and how many to
sell. Formulate and solve this problem using dynamic programming, assuming
that only integer numbers of tons of potatoes are considered.
Chapter 7

Classical Optimization

7.1 Introduction
Until now we have considered the optimization of a linear function subject
to linear constraints. This assumption of linearity is now relaxed and we
examine the more complex problem of optimizing a function which is not
necessarily linear, possibly subject to constraints which are also not
necessarily linear. The present chapter is concerned with the calculus
necessary to identify the optimal points of a continuous function or a
functional. This is often called classical optimization, even though many of the
results are of relatively recent origin. Occasionally these methods can be
used to solve real-world problems. Usually, however, too many variables
are present for the methods to be at all efficient from the point of view
of numerical computation. In these cases nonlinear programming algorithms
must be developed and some of these are presented in the next chapter.
However, most of these algorithms rely on the theoretical development of
the present chapter.

7.2 Optimization of Functions of One Variable

7.2.1 Definitions

Consider a continuous function,
$$f: I \to R,$$
where $I = (a, b)$ is some open interval on the real line and $R$ is the set of real
numbers. We now present some definitions concerning properties of the
values that $f(x)$, $x \in I$, can assume.

Definition 7.1. $f$ has a global minimum at $x_1 \in I$ if
$$f(x_1) \le f(x) \quad \text{for all } x \in I.$$
(A global minimum is sometimes called an absolute minimum.)

Definition 7.2. $f$ has a global maximum at $x_1 \in I$ if
$$f(x_1) \ge f(x) \quad \text{for all } x \in I.$$
(A global maximum is sometimes called an absolute maximum.)

Definition 7.3. $f$ has a global extremum at $x_1 \in I$ if $f$ has either a global
minimum or a global maximum at $x_1$. (A global extremum is sometimes called
an absolute extremum.)

Definition 7.4. $f$ has a local minimum at $x_1 \in I$ if there exists a $\delta \in R^+$ such
that
$$f(x_1) \le f(x)$$
for all $x \in I$ satisfying
$$|x - x_1| < \delta.$$
(A local minimum is sometimes called a relative minimum.)

Definition 7.5. $f$ has a local maximum at $x_1 \in I$ if there exists a $\delta \in R^+$ such
that
$$f(x_1) \ge f(x)$$
for all $x \in I$ satisfying
$$|x - x_1| < \delta.$$
(A local maximum is sometimes called a relative maximum.)

Definition 7.6. $f$ has a local extremum at $x_1 \in I$ if $f$ has either a local
minimum or a local maximum at $x_1$. (A local extremum is sometimes called a
relative extremum.)

Figure 7.1 serves to illustrate the concepts just defined.

Definition 7.7. $f$ has a stationary point at $x_1 \in I$ if $f$ is differentiable at $x_1$ and
$$f'(x_1) = 0.$$
(A stationary point is called a critical point by some authors.)

There may be points which are local or even global extrema of $f$ but
which are not stationary points. Such a point is $x_6$ in Figure 7.1, where $f$ is
not differentiable.

Figure 7.1. Points $x_1$, $x_3$, and $x_5$ are local maxima. Point $x_3$ is a global maximum.
Points $x_2$, $x_4$, and $x_6$ are local minima. Point $x_6$ is a global minimum.

Further, if $f$ is defined on the closed interval $[a, b]$, Definitions 7.4 and 7.5
have to be modified for the special cases $x_1 = a$ or $x_1 = b$.
In these cases the neighbourhoods $|x_1 - x| < \delta$ are defined as
$$|x_1 - x| < \delta, \quad \text{if } x_1 \ne a \text{ or } b,$$
$$0 < x - x_1 < \delta, \quad \text{if } x_1 = a,$$
$$0 < x_1 - x < \delta, \quad \text{if } x_1 = b.$$
It is possible that $x_1 = a$ or $x_1 = b$ are global extrema even though $f$ does
not have a stationary point at either.

7.2.2 A Necessary Condition for Local Extrema

Given a function $f: I \to R$ with $I$ open, it is often of interest to find the global
extrema of $f$. Unfortunately it is not easy to find the global extrema directly.
Thus we set our sights a little lower and develop ways of finding all local
extrema.

Theorem 7.1. If $f: I \to R$ is differentiable at $x_1$, in an open interval $I$, then
$f$ has a local extremum at $x_1 \in I$ $\Rightarrow$ $f'(x_1) = 0$.

PROOF. Suppose $x_1 \in I$ is a local minimum. By assumption $f'(x_1)$ exists, and
$$\lim_{\Delta x \to 0^+} \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x} = \lim_{\Delta x \to 0^-} \frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x} = f'(x_1). \qquad (7.1)$$
But, as $x_1$ is a local minimum, there exists a $\delta \in R^+$ such that, for all $\Delta x$
where $|\Delta x| \le \delta$,
$$\frac{f(x_1 + \Delta x) - f(x_1)}{|\Delta x|} \ge 0.$$
Hence the first limit in (7.1) is nonnegative. However, if
$$\Delta x < 0,$$
as it is in the middle expression in (7.1), we have
$$\frac{f(x_1 + \Delta x) - f(x_1)}{\Delta x} \le 0.$$
Hence the second limit in (7.1) is nonpositive. As $f'(x_1)$ must equal both of
these limits, one nonnegative and one nonpositive, they must both be zero.
Hence $f'(x_1) = 0$.
The proof when $x_1$ is a local maximum is similar. □

The necessary condition for a local extremum is not sufficient, as evidenced by point $x_7$ in Figure 7.1, where

$$f'(x_7) = 0,$$

but $x_7$ is not a local extremum. Indeed $x_7$ is a point of inflection, which is defined as follows.

Definition 7.8. $f$ has a point of inflection at $x_1$ if $f$ has a stationary point at $x_1$ but $f$ does not have a local extremum at $x_1$ and $f'$ has a local extremum at $x_1$.

This leads to a distinction between the stationary points of $f$:

Definition 7.9. $f$ has a critical point at $x_1$ if $f$ has a stationary point at $x_1$ and $x_1$ is a local extremum for $f$ but not for $f'$.

Thus points of inflection are stationary points which are not critical points.

7.2.3 Sufficient Conditions for Local Extrema

In view of Theorem 7.1, it is desirable to develop sufficient conditions for local extrema to exist. Then the stationary points found can be examined to see if any of them are local extrema. To this end, consider the Taylor series expansion about a point $x_1 \in I$ (see Section 9.2 of the Appendix):

$$f(x_1 + h) = f(x_1) + h f'(x_1) + \frac{h^2}{2} f''(\theta x_1 + (1 - \theta)(x_1 + h)), \quad \text{for some } \theta,\ 0 < \theta < 1, \tag{7.2}$$

where it is assumed that

$$(x_1 + h) \in I,$$

and $f$ has first and second derivatives $f'$, $f''$ for all points in $I$. Suppose that $f$ has a stationary point at $x_1$. Then, by Definition 7.7,

$$f'(x_1) = 0.$$

Then (7.2) can be rearranged as:

$$f(x_1 + h) - f(x_1) = \frac{h^2}{2} f''(\theta x_1 + (1 - \theta)(x_1 + h)), \quad 0 < \theta < 1. \tag{7.3}$$

Assume

$$f''(x_1) > 0.$$

Now if $f''$ is continuous on $I$, at all points sufficiently near $x_1$, $f''$ will have the same sign as it does at $x_1$, i.e., positive. Thus, for all nonzero $h$ sufficiently small in magnitude,

$$\frac{h^2}{2} f''(\theta x_1 + (1 - \theta)(x_1 + h)) > 0.$$

Using this result in (7.3), we obtain

$$f(x_1 + h) - f(x_1) > 0.$$

We conclude that if

$$f'(x_1) = 0,$$

and $f''$ is continuous in a neighbourhood of $x_1$, then

$$f''(x_1) > 0$$

is a sufficient condition for $x_1$ to be a local minimum.
It can be shown analogously that if

$$f'(x_1) = 0,$$

and $f''$ is continuous in a neighbourhood of $x_1$, then

$$f''(x_1) < 0$$

is a sufficient condition for $x_1$ to be a local maximum.
The preceding deductions cannot be used to come to any conclusions about the character of $x_1$ if

$$f''(x_1) = 0.$$

Indeed, it may be that

$$f^{(n)}(x_1) = 0, \quad n = 1, 2, \ldots, k$$

for some integer $k > 2$. The following theorem settles such cases.

Theorem 7.2. If

$$f^{(n)}(x_1) = 0, \quad n = 1, 2, \ldots, k, \tag{7.4}$$

and

$$f^{(k+1)}(x_1) \neq 0, \tag{7.5}$$

and $f^{(k+1)}$ is continuous in a neighbourhood of $x_1$, then $f$ has a local extremum at $x_1$ if and only if $(k + 1)$ is even. If

$$f^{(k+1)}(x_1) > 0,$$

$x_1$ is a local minimum. If

$$f^{(k+1)}(x_1) < 0,$$

$x_1$ is a local maximum.

PROOF. Sufficient condition. We assume the hypothesis of Theorem 7.2. We wish to show that $f$ has a local extremum at $x_1$ when $k + 1$ is even, and none when $k + 1$ is odd. Taylor's theorem about point $x_1$ yields

$$f(x_1 + h) = f(x_1) + h f'(x_1) + \frac{h^2}{2!} f''(x_1) + \cdots + \frac{h^k}{k!} f^{(k)}(x_1) + \frac{h^{k+1}}{(k+1)!} f^{(k+1)}(\theta x_1 + (1 - \theta)(x_1 + h)),$$

for some $\theta$, $0 \le \theta \le 1$. Using (7.4) and rearranging, this becomes

$$f(x_1 + h) - f(x_1) = \frac{h^{k+1}}{(k+1)!} f^{(k+1)}(\theta x_1 + (1 - \theta)(x_1 + h)), \quad 0 \le \theta \le 1. \tag{7.6}$$

It is assumed that $f^{(k+1)}$ is continuous at $x_1$. This fact can be used to show that at all points sufficiently near $x_1$, $f^{(k+1)}$ will have the same sign as $f^{(k+1)}(x_1)$. Hence if $h$ is sufficiently small, $f^{(k+1)}(\theta x_1 + (1 - \theta)(x_1 + h))$ will have the same sign as $f^{(k+1)}(x_1)$. In view of this, on examining (7.6) we can see that for odd $k + 1$ and sufficiently small positive $h$, $f(x_1 + h) - f(x_1)$ will have the same sign as $f^{(k+1)}(x_1)$. However, for sufficiently small negative $h$, $f(x_1 + h) - f(x_1)$ has the opposite sign to $f^{(k+1)}(x_1)$. Hence, for odd $k + 1$, $x_1$ is not a local extremum. However, if $k + 1$ is assumed to be even, $f(x_1 + h) - f(x_1)$ has the same sign as $f^{(k+1)}(x_1)$, independently of the sign of $h$. If

$$f^{(k+1)}(x_1) > 0$$

then

$$f(x_1 + h) - f(x_1) > 0$$

for all nonzero $h$ sufficiently small in magnitude, and thus $x_1$ is a local minimum. If

$$f^{(k+1)}(x_1) < 0$$

then

$$f(x_1 + h) - f(x_1) < 0$$

for all nonzero $h$ sufficiently small in magnitude, and thus $x_1$ is a local maximum.


Necessary condition. We assume (7.4) and (7.5) and that $f$ has a local extremum at $x_1$. We wish to show that $k + 1$ is even. Let us suppose for definiteness that $f$ has a local minimum at $x_1$, i.e.,

$$f(x_1 + h) - f(x_1) > 0$$

for all nonzero $h$ sufficiently small in magnitude. Using (7.6), we obtain

$$\frac{h^{k+1}}{(k+1)!} f^{(k+1)}(\theta x_1 + (1 - \theta)(x_1 + h)) > 0, \quad 0 < \theta < 1, \tag{7.7}$$

i.e., the expression on the left-hand side of (7.7) is of constant sign, namely positive. However, from the arguments mounted earlier in the proof, $f^{(k+1)}(\theta x_1 + (1 - \theta)(x_1 + h))$ will have constant sign (it cannot be zero if (7.7) is to hold) for $h$ sufficiently small in magnitude. Now when $h$ is negative the expression in (7.7) can have constant sign only if $k + 1$ is even.
A similar argument follows when $f$ has a local maximum at $x_1$. This completes the proof. □

7.2.4 Examples

Consider

$$f(x) = x^3 - 9x^2 + 27x - 27.$$

We use the previous results to find the extrema of this function. The first derivative is

$$f'(x) = 3x^2 - 18x + 27,$$

which has a unique zero at $x_1 = 3$, which, by Theorem 7.1, is the only candidate for an extremum. However,

$$f''(x) = 6x - 18,$$

so that

$$f''(3) = 0.$$

Now

$$f^{(3)}(x) = 6 \neq 0,$$

but, as $k + 1 = 3$ is odd, $x_1 = 3$ is not an extremum. Indeed, $f$ has a point of inflection at $x_1 = 3$.

Consider

$$f(x) = x^4 - 8x^3 + 24x^2 - 32x + 16.$$

The first derivative is

$$f'(x) = 4x^3 - 24x^2 + 48x - 32,$$

which has a unique zero at $x_1 = 2$, which is thus the only candidate for an extremum. However,

$$f''(x) = 12x^2 - 48x + 48$$

and

$$f''(2) = 0.$$

Also,

$$f^{(3)}(x) = 24x - 48$$

and

$$f^{(3)}(2) = 0.$$

But

$$f^{(4)}(x) = 24.$$

Hence

$$f^{(4)}(2) = 24 \neq 0.$$

Now as $(k + 1) = 4$, which is even, by Theorem 7.2 $x_1 = 2$ is a local extremum. As

$$f^{(4)}(2) > 0,$$

$x_1 = 2$ is a local minimum.
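The test of Theorem 7.2 can be mechanized for polynomials. The following is only a sketch under stated assumptions: polynomials are represented as coefficient lists (highest degree first), and the helper names `poly_eval`, `poly_deriv`, and `classify_stationary_point` are illustrative inventions, not part of the text. It differentiates repeatedly until a derivative is nonzero at $x_1$ and then applies the even/odd rule.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients highest degree first) by Horner's rule."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result


def poly_deriv(coeffs):
    """Return the coefficient list of the derivative polynomial."""
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]


def classify_stationary_point(coeffs, x1, tol=1e-9):
    """Apply the higher-order derivative test (Theorem 7.2) at x1."""
    deriv = poly_deriv(coeffs)              # f'
    if abs(poly_eval(deriv, x1)) > tol:
        return "not stationary"
    order = 1                               # order of the derivative held in `deriv`
    while len(deriv) > 1:                   # stop once the derivative is a constant
        deriv = poly_deriv(deriv)
        order += 1
        value = poly_eval(deriv, x1)
        if abs(value) > tol:                # first nonvanishing derivative: order = k + 1
            if order % 2 == 1:
                return "no extremum"
            return "local minimum" if value > 0 else "local maximum"
    return "undetermined"


# The two examples of Section 7.2.4.
first = classify_stationary_point([1, -9, 27, -27], 3)
second = classify_stationary_point([1, -8, 24, -32, 16], 2)
print(first, "/", second)
```

A floating-point tolerance `tol` is used to decide whether a derivative vanishes; for polynomials with integer coefficients and integer test points, as here, the arithmetic happens to be exact.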

7.2.5 The Solution of Nonlinear Equations

It can be seen in the previous example that in order to locate the extrema of a function $f$ it is necessary to find the roots of

$$f'(x) = 0. \tag{7.8}$$

This is often a difficult task when $f$ is of high order. Many numerical methods exist for locating the roots. Some of these are presented in Conte and de Boor (1972). We present one simple method here; the interested reader should seek further advice if the function is suspected to be ill-behaved. The method presented here is called Newton's method and is motivated as follows.
We assume that $f$ has continuous second derivatives and that some estimate $x_1$ of a solution to (7.8) is available. If no such estimate is known, $x_1$ is chosen at random. If $x_1$ is a reasonably good estimate, the Taylor series expansion of $f'$ about $x_1$ can be approximated as:

$$f'(x) \approx f'(x_1) + (x - x_1) f''(x_1).$$

Hence if $x$ is a solution to (7.8),

$$0 \approx f'(x_1) + (x - x_1) f''(x_1),$$

so that, provided $f''(x_1) \neq 0$,

$$x = x_1 - f'(x_1)/f''(x_1). \tag{7.9}$$

Now unless $f$ is a quadratic, $x$ will not in general be an exact solution to (7.8). However, $x$ can be used as an improved estimate. Indeed, (7.9) can be looked upon as the first equation in a family which generates successive improved estimates of a solution to (7.8). The family has the following general form:

$$x_{n+1} = x_n - f'(x_n)/f''(x_n), \quad n = 1, 2, \ldots. \tag{7.10}$$

Once an estimate is finally found which is sufficiently close to a root, a new starting point can be selected in an effort to find a new root. This procedure is repeated until all roots are found. However, there is no guarantee that this method will be successful. The reader is referred to Himmelblau (1972) for a more complete treatment of this problem.
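The iteration just described can be sketched in a few lines. This is a minimal illustration rather than a robust root finder: the function name `newton_stationary_point`, the stopping tolerance, and the iteration cap are assumptions made here, and the cubic $f(x) = x^3 - 9x^2 + 24x + 1$ is used purely as a test function whose derivative has roots at 2 and 4.

```python
def newton_stationary_point(fprime, fsecond, x, tol=1e-10, max_iter=50):
    """Iterate x <- x - f'(x)/f''(x) until the step is smaller than tol."""
    for _ in range(max_iter):
        curvature = fsecond(x)
        if curvature == 0:
            # The update is undefined here; a different starting point is needed.
            raise ZeroDivisionError("f''(x) vanished; choose another starting point")
        step = fprime(x) / curvature
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


def fprime(x):
    return 3 * x**2 - 18 * x + 24   # derivative of x^3 - 9x^2 + 24x + 1


def fsecond(x):
    return 6 * x - 18


root_low = newton_stationary_point(fprime, fsecond, 1.0)
root_high = newton_stationary_point(fprime, fsecond, 5.0)
print(root_low, root_high)
```

Different starting points lead to different roots, which is why the text suggests restarting from new estimates; starting exactly at $x = 3$, where $f''$ vanishes, makes the update undefined.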

7.2.6 Global Extrema

Let us now return to the classical optimization of a function of one variable. As was explained in Section 7.2.2, if $I$ is an open interval Theorem 7.1 is used to identify local extrema. However, if $I$ is closed the possibility exists that a global extremum occurs at one or both of the endpoints of $I$. Hence when $I$ is closed its endpoints must be considered candidates for global extrema. This possibility will occur when $f$ has no critical points in the interior of $I$.
For example, let

$$f(x) = x^3 - 9x^2 + 24x + 1, \quad x \in I = (0, 3).$$

Then

$$f'(x) = 3x^2 - 18x + 24,$$

which has zeros at $x_1 = 2$ and $4$. By Theorem 7.1, as $I$ is open, these points are the only candidates for extrema. However, $4 \notin (0, 3)$, hence $x_1 = 4$ can be disregarded. Now,

$$f''(x) = 6x - 18.$$

Therefore

$$f''(2) = -6 < 0.$$

Thus $x_1 = 2$ is a local maximum. No local (or global) minimum exists within $I$. This is because, as $0$ is approached from the right, values of $f$ become successively lower, without ever attaining the limit of $f(0)$.
However, if $I$ is redefined as

$$I = [0, 6],$$

it is now closed and the endpoints $x_1 = 0$ and $x_1 = 6$ must be checked. Indeed,

$$f(0) = 1, \quad f(6) = 37.$$

But

$$f(2) = 21, \quad f(4) = 17.$$

Hence

$$f(0) < f(4) < f(2) < f(6).$$

Thus
$x_1 = 0$ is the global minimum,
$x_1 = 4$ is a local minimum,
$x_1 = 2$ is a local maximum,
$x_1 = 6$ is the global maximum.
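The comparison of interior stationary points with endpoints is easy to reproduce numerically. The sketch below simply evaluates the example function at the candidate points on $[0, 6]$; the variable names are illustrative, and the stationary points 2 and 4 are taken from the calculation above rather than recomputed.

```python
def f(x):
    return x**3 - 9 * x**2 + 24 * x + 1


a, b = 0, 6
stationary_points = [2, 4]          # zeros of f'(x) = 3x^2 - 18x + 24

# Candidates for global extrema on the closed interval: the endpoints
# plus every stationary point lying strictly inside the interval.
candidates = [a] + [x for x in stationary_points if a < x < b] + [b]
values = {x: f(x) for x in candidates}

x_min = min(values, key=values.get)
x_max = max(values, key=values.get)
print(values, "min at", x_min, "max at", x_max)
```

On this interval both global extrema fall at endpoints, even though neither endpoint is a stationary point.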

7.2.7 Concave and Convex Functions

It was pointed out in Section 7.2.2 that the necessary condition of Theo-
rem 7.1 is not always sufficient. However, there are two classes of functions
for which the condition is sufficient. These are concave and convex functions,
which are defined next.

Definition 7.10. A function $f$ defined on a closed interval $I$ is said to be concave on $I$ if for all $\alpha \in \mathbb{R}$, $0 \le \alpha \le 1$, and for all $x_1, x_2 \in I$,

$$f(\alpha x_1 + (1 - \alpha) x_2) \ge \alpha f(x_1) + (1 - \alpha) f(x_2). \tag{7.11}$$

Definition 7.11. A function $f$ defined on a closed interval $I$ is said to be convex on $I$ if $-f$ is concave on $I$.

Some examples of concave and convex functions are given in Figures 7.2(a) and (b), respectively.

Figure 7.2. Concave and convex functions: (a) concave functions; (b) convex functions.

We now build up a series of results which amount to a somewhat stronger result than the converse of Theorem 7.1. This is that, for $f$ concave (convex), $f'(x^*) = 0$ is sufficient for $x^*$ to be a global maximum (minimum). We begin with:

Theorem 7.3. If $f$ is concave on a closed interval $I$ with a local maximum at $x^* \in I$, then $f$ must have a global maximum at $x^*$.

PROOF. As $f$ has a local maximum at $x^*$ there exists $\delta \in \mathbb{R}^+$ such that for all $x \in I$ satisfying

$$|x^* - x| < \delta, \tag{7.12}$$

we have

$$f(x^*) \ge f(x). \tag{7.13}$$

Hence if we can show for all $x \in I$ that (7.13) holds we have shown that $x^*$ is a global maximum. We prove this by contradiction. Let $x_1 \in I$ be such that

$$f(x^*) < f(x_1). \tag{7.14}$$

Now, as $f$ is concave, we can invoke (7.11), with

$$x_2 = x^*.$$

That is, for all $\alpha \in \mathbb{R}$, $0 \le \alpha \le 1$,

$$f(\alpha x_1 + (1 - \alpha) x^*) \ge \alpha f(x_1) + (1 - \alpha) f(x^*). \tag{7.15}$$

By taking $\alpha$ sufficiently close to 0, i.e., $0 < \alpha < \delta / |x^* - x_1|$, the point $\alpha x_1 + (1 - \alpha) x^*$ will satisfy (7.12), so that (7.13) applies to it; that is,

$$f(x^*) \ge f(\alpha x_1 + (1 - \alpha) x^*). \tag{7.16}$$

Then by (7.15), we have

$$f(x^*) \ge \alpha f(x_1) + (1 - \alpha) f(x^*),$$

which, as $\alpha > 0$, simplifies to $f(x^*) \ge f(x_1)$ and so contradicts (7.14). Thus no point $x_1 \in I$ can be found for which (7.14) holds. Thus

$$f(x^*) \ge f(x) \quad \text{for all } x \in I.$$

That is, $f$ has a global maximum at $x^*$. □

One can prove an analogous theorem for convex functions, as follows:

Theorem 7.4. If $f$ is convex on a closed interval $I$ with a local minimum at $x^* \in I$, then $f$ must have a global minimum at $x^*$.

We leave the proof of Theorem 7.4 as an exercise for the reader. We now
prove a theorem which leads to Theorem 7.7, the main result of this section.

Theorem 7.5. If $f$ is concave on a closed interval $I$ and there exists a neighbourhood $N(x_1)$ of an interior point $x_1 \in I$ such that $f'$ is continuous in $N(x_1)$, then

$$f(x) \le f(x_1) + f'(x_1)(x - x_1) \quad \text{for all } x \in I.$$

PROOF. Let $x$ be a point in $I$. Then as $f$ is concave on $I$, by (7.11) we have

$$f(\alpha x + (1 - \alpha) x_1) \ge \alpha f(x) + (1 - \alpha) f(x_1) \quad \text{for all } \alpha \in \mathbb{R},\ 0 \le \alpha \le 1.$$

On rearranging, we obtain

$$f(x_1 + \alpha(x - x_1)) - f(x_1) \ge \alpha (f(x) - f(x_1)). \tag{7.17}$$

By Taylor's theorem (see Section 9.2 in the Appendix), with

$$h = \alpha(x - x_1),$$

we have

$$f(x_1 + \alpha(x - x_1)) - f(x_1) = \alpha f'(x_1 + \theta\alpha(x - x_1))(x - x_1), \quad \text{for some } \theta,\ 0 < \theta < 1,$$

hence, by (7.17) and on dividing by $\alpha > 0$,

$$f'(x_1 + \theta\alpha(x - x_1))(x - x_1) \ge f(x) - f(x_1).$$

Therefore, as $f'$ is continuous in $N(x_1)$,

$$\lim_{\alpha \to 0} f'(x_1 + \theta\alpha(x - x_1))(x - x_1) = f'(x_1)(x - x_1) \ge f(x) - f(x_1),$$

and

$$f(x) \le f(x_1) + f'(x_1)(x - x_1),$$

as required. □

The corresponding theorem for convex functions is left for the reader to prove.

Theorem 7.6. If $f$ is convex on a closed interval $I$ and is differentiable at a point $x_1$ in $I$, then

$$f(x) \ge f(x_1) + f'(x_1)(x - x_1) \quad \text{for all } x \in I.$$

We leave the proof of Theorem 7.6 as an exercise for the reader. We are now in a position to prove the following theorem:

Theorem 7.7. If $f$ is concave on a closed interval $I$ and there exists a point $x^* \in I$ such that

$$f'(x^*) = 0,$$

then $f$ has a global maximum at $x^*$.

PROOF. Let $N$ be a neighbourhood of $x^*$ contained in $I$. Let $x$ be any point in $N$. Then, by Theorem 7.5, we have

$$f(x) \le f(x^*) + f'(x^*)(x - x^*),$$

which reduces to

$$f(x) \le f(x^*).$$

Thus $f$ has a local maximum at $x^*$. Hence by Theorem 7.3 $f$ has a global maximum at $x^*$. □

The sequel for convex functions is

Theorem 7.8. If $f$ is convex on a closed interval $I$ and there exists a point $x^*$ in $I$ such that

$$f'(x^*) = 0,$$

then $f$ has a global minimum at $x^*$.

We now turn our attention to the minimization of a concave function. Of course no minima may exist for such a function defined on an interval $I$ where $I = \mathbb{R}$ or where $I$ is bounded but open. However, if $I$ is closed and bounded the global minimum will occur at an endpoint. This is now stated and proven formally.

Theorem 7.9. If f is concave on a closed interval I = [a, b], then f will have
a global minimum at a or b or both.

PROOF. This proof is by contradiction. Suppose that there exists a point $x^*$ in the interior of $I$ which is a global minimum and $a$ and $b$ are not global minima, i.e.,

$$f(x^*) < f(a), \quad f(x^*) < f(b). \tag{7.18}$$

Then, as $a < x^* < b$, there exists $\alpha \in \mathbb{R}$, $0 < \alpha < 1$, such that

$$x^* = \alpha a + (1 - \alpha) b.$$

Now from Definition 7.10, with $x_1 = a$ and $x_2 = b$, we have

$$f(x^*) \ge \alpha f(a) + (1 - \alpha) f(b),$$

and, by (7.18),

$$f(x^*) > \alpha f(x^*) + (1 - \alpha) f(x^*) = f(x^*),$$

which is a contradiction. □

For completeness we state the equivalent result for convex functions. We leave the proof as an exercise for the reader.

Theorem 7.10. If f is convex on a closed interval I = [a, b], then f will have
a global maximum at a or b or both.
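Inequality (7.11) and the endpoint result of Theorem 7.9 can be spot-checked numerically. The sketch below is an illustration only: the helper `is_concave_on_samples`, the grid sizes, and the test function $f(x) = -(x - 1)^2$ are all choices made here. A failed check disproves concavity, whereas a passed check merely fails to find a counterexample on the sampled grid.

```python
def is_concave_on_samples(f, a, b, n=50, tol=1e-12):
    """Test inequality (7.11) on a grid of point pairs and weights in [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    alphas = [j / 10 for j in range(11)]
    for x1 in xs:
        for x2 in xs:
            for alpha in alphas:
                lhs = f(alpha * x1 + (1 - alpha) * x2)
                rhs = alpha * f(x1) + (1 - alpha) * f(x2)
                if lhs < rhs - tol:      # (7.11) violated: f is not concave
                    return False
    return True


def f(x):
    return -(x - 1) ** 2     # a concave function, chosen for illustration


a, b = 0.0, 3.0
passes_check = is_concave_on_samples(f, a, b)

# Theorem 7.9: the global minimum over [a, b] should sit at an endpoint.
grid = [a + (b - a) * i / 300 for i in range(301)]
x_min = min(grid, key=f)
print(passes_check, x_min)
```

Here $f(0) = -1$ and $f(3) = -4$, so the minimum over the grid is found at the right-hand endpoint, in line with the theorem.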

7.3 Optimization of Unconstrained Functions of Several Variables

Of course many models of real-world problems involve functions of many variables. In this section we study the classical mathematics required for the optimization of such functions and generalize the results obtained in the earlier sections of this chapter.

7.3.1 Background

As in earlier chapters we denote a vector of $n$ variables by $X = (x_1, x_2, \ldots, x_n)^T$. Consider a function $f: S \to \mathbb{R}$, where $S$ is a region in $n$-dimensional Euclidean space. Then Definitions 7.1-7.11 hold for multidimensional functions with $X$ replacing $x$.

7.3.2 A Necessary Condition for Local Extrema

As in the single-variable case, we develop a necessary condition for the existence of local extrema.

Theorem 7.11. If $\partial f(X)/\partial x_j$ exists for all $X \in S$ and for all $j = 1, 2, \ldots, n$, and if $f$ has a local extremum at $X^*$ in the interior of $S$, then

$$\frac{\partial f(X^*)}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n.$$

PROOF. Suppose that the conditions of the theorem hold. Let

$$X^* = (x_1^*, x_2^*, \ldots, x_n^*)^T.$$

Consider the points in $S$ which are generated when all the variables $x_i$ except for $x_j$ are held fixed at $x_i = x_i^*$, $i = 1, 2, \ldots, j - 1, j + 1, \ldots, n$, for some $j$, $1 \le j \le n$. Now define

$$\hat{f}(x_j) = f(x_1^*, x_2^*, \ldots, x_{j-1}^*, x_j, x_{j+1}^*, \ldots, x_n^*).$$

As $f$ has a local extremum at $X^*$, $\hat{f}$ must have a local extremum at $x_j^*$. Thus, by Theorem 7.1, we have

$$\hat{f}'(x_j^*) = 0.$$

But as

$$\hat{f}'(x_j^*) = \frac{\partial f(X^*)}{\partial x_j},$$

we have

$$\frac{\partial f(X^*)}{\partial x_j} = 0.$$

Thus, as $j$ was chosen arbitrarily, we have

$$\frac{\partial f(X^*)}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n. \qquad \Box$$

7.3.3 A Sufficient Condition for Local Extrema

As with the single-variable case, it is desirable to develop sufficient conditions for local extrema to exist. Then any points identified by an application of the result of Theorem 7.11 can be examined to see if they are local extrema. Theorem 7.12 provides the desired conditions. We assume that the second partial derivatives not only exist but are continuous in some neighbourhood of any point $X^*$ for which the condition of Theorem 7.11 holds.

Theorem 7.12. If

$$\frac{\partial f(X^*)}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n,$$

for some $X^*$ in the interior of $S$, and if $H(X^*)$, the Hessian matrix of $f$ evaluated at $X^*$, is negative definite, then $f$ has a local maximum at $X^*$.

PROOF. Consider the Taylor series expansion of $f$ about $X^*$ (see Section 9.2 of the Appendix), where it is assumed that the first and second derivatives of $f$ exist in $S$:

$$f(X^* + h) = f(X^*) + \nabla f(X^*)^T h + \tfrac{1}{2} h^T H(\theta X^* + (1 - \theta)(X^* + h)) h, \quad \text{for some } \theta,\ 0 < \theta < 1. \tag{7.19}$$

In view of the hypothesis, we have

$$\nabla f(X^*) = \left( \frac{\partial f(X^*)}{\partial x_1}, \frac{\partial f(X^*)}{\partial x_2}, \ldots, \frac{\partial f(X^*)}{\partial x_n} \right)^T = 0.$$

Hence (7.19) can be rearranged to become

$$f(X^* + h) - f(X^*) = \tfrac{1}{2} h^T H(\theta X^* + (1 - \theta)(X^* + h)) h, \quad 0 < \theta < 1. \tag{7.20}$$

Let us now consider the sign of the right-hand side of (7.20). As the second partial derivatives of $f$ are continuous in some neighbourhood of $X^*$, for $h$ sufficiently small the entries of $H(\theta X^* + (1 - \theta)(X^* + h))$ will be arbitrarily close to the corresponding entries of $H(X^*)$. Now, as $H(X^*)$ is negative definite, and negative definiteness is preserved under sufficiently small changes in the entries of a matrix, $H(\theta X^* + (1 - \theta)(X^* + h))$ will be negative definite. Thus $h^T H(\theta X^* + (1 - \theta)(X^* + h)) h$ will be negative for $h \neq 0$ (see Section 9.2 on quadratic forms).
We have shown, using (7.20), that, for all $X^* + h$ in a neighbourhood of $X^*$ with $h \neq 0$,

$$f(X^* + h) - f(X^*) < 0,$$

i.e., $f$ has a local maximum at $X^*$. □

One can prove the following theorem in an analogous fashion:

Theorem 7.13. If

$$\frac{\partial f(X^*)}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n,$$

for some $X^*$ in the interior of $S$ and $H(X^*)$ is positive definite, then $f$ has a local minimum at $X^*$.

We leave the proof of Theorem 7.13 as an exercise for the reader.

7.3.4 Illustrative Examples

Find the extreme points of

$$f(X) = -x_1^2 - 6x_2^2 - 4x_1 + 8x_2 + 143.$$

Using the result of Theorem 7.11, we have

$$\frac{\partial f}{\partial x_1} = -2x_1 - 4 = 0 \;\Rightarrow\; x_1 = -2,$$

$$\frac{\partial f}{\partial x_2} = -12x_2 + 8 = 0 \;\Rightarrow\; x_2 = \tfrac{2}{3}.$$

Thus $X_0 = (-2, \tfrac{2}{3})$ is the only candidate for an extreme point. Now

$$H(X) = \begin{pmatrix} -2 & 0 \\ 0 & -12 \end{pmatrix},$$

which is negative definite. Thus, by Theorem 7.12, $X_0$ is a local maximum. As $X_0$ is the only maximum point, $f$ has a global maximum at $X_0$, with value

$$f(X_0) = 149\tfrac{2}{3}.$$

Let us now consider an example that is a little more challenging. Find the extreme points of

$$f(X) = x_1^2 + x_2^2 + x_3^2 + x_1 x_2 + x_1 x_3 + x_2 x_3 - 7x_1 - 8x_2 - 9x_3 + 101, \quad X \in \mathbb{R}^3.$$

Using the result of Theorem 7.11, we have

$$\frac{\partial f}{\partial x_1} = 2x_1 + x_2 + x_3 - 7 = 0,$$
$$\frac{\partial f}{\partial x_2} = x_1 + 2x_2 + x_3 - 8 = 0,$$
$$\frac{\partial f}{\partial x_3} = x_1 + x_2 + 2x_3 - 9 = 0.$$

Solving these three equations simultaneously yields the unique solution $X_0 = (1, 2, 3)$, which is thus the only candidate for an extreme point. Now

$$H(X) = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix},$$

which is positive definite. Thus, by Theorem 7.13, $X_0$ is a local minimum. As $X_0$ is the only minimum point, $f$ has a global minimum at $X_0$, with value

$$f(X_0) = 76.$$
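For quadratic functions such as these, the stationary point solves a linear system and the definiteness of the (constant) Hessian can be read off its leading principal minors. The sketch below uses exact rational arithmetic from the standard library; the helper names `solve_linear`, `det`, and `definiteness` are illustrative, the Hessian is assumed symmetric and the linear system nonsingular, and the leading-minor criterion is used as one standard test for definiteness.

```python
from fractions import Fraction


def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A assumed nonsingular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]


def det(A):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * a * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j, a in enumerate(A[0]))


def definiteness(H):
    """Classify a symmetric matrix by the signs of its leading principal minors."""
    minors = [det([row[:k] for row in H[:k]]) for k in range(1, len(H) + 1)]
    if all(m > 0 for m in minors):
        return "positive definite"
    if all((m < 0) if k % 2 == 1 else (m > 0) for k, m in enumerate(minors, 1)):
        return "negative definite"
    return "indefinite or semidefinite"


# First example: gradient = H1 x + (-4, 8) = 0.
H1 = [[-2, 0], [0, -12]]
X0_first = solve_linear(H1, [4, -8])

# Second example: gradient = H2 x - (7, 8, 9) = 0.
H2 = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
X0_second = solve_linear(H2, [7, 8, 9])

print(X0_first, definiteness(H1))
print(X0_second, definiteness(H2))
```

The cofactor determinant is exponential in the matrix size and is used here only because the examples are 2 by 2 and 3 by 3.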

7.3.5 Discussion

Lest the reader begin to believe that the above procedure is always as straightforward as in analyzing the examples of Section 7.3.4, a few words of caution are in order. First, a system of equations derived from

$$\frac{\partial f(X)}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n \tag{7.21}$$

must be solved in order to find the stationary points. The system of equations will be nonlinear if $f$ has terms of cubic or higher powers. Solving such a system can sometimes be achieved using what is known as Newton's method for systems. However, this usually requires a great deal of computational effort, and unless there is some information available about the likely location of roots, the method may fail to converge. The reader is referred to Henrici (1964) for a fuller discussion of this problem. Of course it is possible that the system (7.21) may be inconsistent in the sense that it has no solutions. In this case $f$ has no extreme points.
Even if it is possible to locate the possible candidates for extrema by finding all solutions $X_0$ to (7.21), one still has to establish the definiteness of $H(X_0)$. For nontrivial systems this is often a difficult task. In fact, for systems arising from most real-world problems it is usually far more efficient to try to establish the nature of $X_0$ by examining the behaviour of $f$ in the neighbourhood of $X_0$ directly.
As has been seen in Theorems 7.12 and 7.13, if (7.21) holds and $H(X^*)$ is negative (positive) definite then $f$ has a local maximum (minimum) at $X^*$. However, if $h^T H(X^*) h$ changes sign for different $h$, then $X^*$ is not a local extremum. The reader will note that nothing has been said about the cases where $H(X^*)$ is negative semidefinite or positive semidefinite. These are equivalent to the single-variable situations covered in Theorem 7.2. The multivariable situation, however, is complicated and will not be examined here. The reader is referred to Hancock (1960) for a detailed treatment.

In the previous paragraph we alluded to the single-variable case. It is easily seen that this is merely a special case of the theory developed in Section 7.3.

7.3.6 Global Extrema

In the examples in Section 7.3.4, $S$, the domain of $f$, was unrestricted (the whole of $\mathbb{R}^2$ or $\mathbb{R}^3$). When $S$ is thus unrestricted, the point satisfying (7.21) which yields the highest (lowest) value of $f$ will be the global maximum (minimum), if one exists. As in Section 7.2.6, if the domain of $f$ is closed, the possibility exists that the global extrema occur on the boundary. This will certainly happen if no interior points of $S$ satisfy (7.21).
For example, consider the first function given in Section 7.3.4, with $S$ redefined as the rectangle:

$$S = \{(x_1, x_2) : -3 \le x_1 \le 0,\ 0 \le x_2 \le 1\}.$$

Then, as $X_0 = (-2, \tfrac{2}{3})$, the only candidate for a local extremum, belongs to $S$, it is still the global maximum. However, if we redefine $S$ as the square:

$$S = \{(x_1, x_2) : 0 \le x_1 \le 1,\ 0 \le x_2 \le 1\},$$

then $X_0 \notin S$. Thus we must examine the boundaries of $S$, which are

$$B_1 = \{\alpha(0, 0) + (1 - \alpha)(0, 1) : 0 \le \alpha \le 1\}$$
$$B_2 = \{\alpha(0, 1) + (1 - \alpha)(1, 1) : 0 \le \alpha \le 1\}$$
$$B_3 = \{\alpha(1, 1) + (1 - \alpha)(1, 0) : 0 \le \alpha \le 1\}$$
$$B_4 = \{\alpha(1, 0) + (1 - \alpha)(0, 0) : 0 \le \alpha \le 1\}.$$

We now find the extrema of $f$ on each boundary $B_i$, $i = 1, 2, 3, 4$. The value of $f$ at any point on a $B_i$ can be expressed as a function of one variable in $\alpha$. Starting with $B_1$, let

$$g(\alpha) = f(0, 1 - \alpha) = -6(1 - \alpha)^2 + 8(1 - \alpha) + 143.$$

Using the methods of Section 7.2, we obtain

$$g'(\alpha) = -12(1 - \alpha)(-1) - 8,$$

which has a unique zero at

$$\alpha^* = \tfrac{1}{3}.$$

Also, we have

$$g''(\alpha^*) = -12 < 0,$$

indicating that $\alpha^*$ corresponds to a maximum. As the interval over which $g$ is defined is closed we must also check the endpoints:

$$g(0) = 145, \quad g(1) = 143, \quad g(\tfrac{1}{3}) = 145\tfrac{2}{3}.$$

Thus the maximum value of $f$ on $B_1$ occurs at $(0, \tfrac{2}{3})$, with value $145\tfrac{2}{3}$, and the minimum at $(0, 0)$ with value 143.
A complete display of this analysis for all four boundaries is given in Figure 7.3, with values of $f$ given. It can be seen that $f$ has a global maximum at $(0, \tfrac{2}{3})$ with value $145\tfrac{2}{3}$ and a global minimum at $(1, 0)$ with value 138.

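Figure 7.3. Examining boundaries for global extrema. Values of $f$ on the boundary of the unit square: $f(0, 1) = 145$, $f(1, 1) = 140$, $f(0, \tfrac{2}{3}) = 145\tfrac{2}{3}$, $f(1, \tfrac{2}{3}) = 140\tfrac{2}{3}$, $f(0, 0) = 143$, $f(1, 0) = 138$.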
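The boundary analysis can be reproduced by restricting $f$ to each edge of the square and treating the restriction as a function of one variable. The sketch below is an illustration under stated assumptions: the helper `edge_candidates` is a name invented here, and it relies on $f$ being quadratic, so that the restriction to an edge is exactly a quadratic $g(t) = at^2 + bt + c$ recoverable from three sampled values.

```python
from fractions import Fraction as F


def f(x1, x2):
    return -x1**2 - 6 * x2**2 - 4 * x1 + 8 * x2 + 143


def edge_candidates(p, q):
    """Endpoints of the segment p -> q plus any interior stationary point of f on it."""
    def point(t):
        return (p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1]))

    g0, gh, g1 = f(*point(F(0))), f(*point(F(1, 2))), f(*point(F(1)))
    # Fit g(t) = a t^2 + b t + c through t = 0, 1/2, 1 (exact, as f is quadratic).
    a = 2 * g0 - 4 * gh + 2 * g1
    b = -3 * g0 + 4 * gh - g1
    found = [p, q]
    if a != 0:
        t = -b / (2 * a)              # zero of g'(t) = 2 a t + b
        if 0 < t < 1:
            found.append(point(t))
    return found


corners = [(F(0), F(0)), (F(0), F(1)), (F(1), F(1)), (F(1), F(0))]
candidates = []
for i in range(4):
    candidates += edge_candidates(corners[i], corners[(i + 1) % 4])

best_max = max(candidates, key=lambda pt: f(*pt))
best_min = min(candidates, key=lambda pt: f(*pt))
print(best_max, f(*best_max), best_min, f(*best_min))
```

Only boundary points are examined because the single interior stationary point $(-2, \tfrac{2}{3})$ of $f$ lies outside the square.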

7.3.7 Concave and Convex Functions

The definitions for concave and convex functions of a single variable can be generalized for functions of several variables.

Definition 7.12. A function $f$ defined on $S$, a simply connected region in $n$-dimensional Euclidean space, is said to be concave on $S$ if for all $\alpha \in \mathbb{R}$, $0 \le \alpha \le 1$, and for all $X_1, X_2 \in S$,

$$f(\alpha X_1 + (1 - \alpha) X_2) \ge \alpha f(X_1) + (1 - \alpha) f(X_2). \tag{7.22}$$

Definition 7.13. A function $f$ defined on $S$, a simply connected region in $n$-dimensional Euclidean space, is said to be convex on $S$ if $-f$ is concave on $S$.

As with the one-dimensional case, we can prove far stronger results for concave and convex functions than for more general functions. We begin by generalizing Theorem 7.3. The proof of Theorem 7.14 follows along the lines of that for Theorem 7.3.

Theorem 7.14. If $f$ is concave on a simply connected region $S \subseteq \mathbb{R}^n$ with a local maximum $X^* \in S$, then $f$ has a global maximum at $X^*$.

PROOF. As $f$ has a local maximum at $X^*$ there exists a $\delta$-neighbourhood about $X^*$ such that $f$ at $X^*$ is no less than at any other point in the neighbourhood. That is, there exists $\delta \in \mathbb{R}^+$ such that, for all $X \in S$ such that $\|X^* - X\| < \delta$, we have

$$f(X^*) \ge f(X). \tag{7.23}$$

Hence if we can show for all $X \in S$ that (7.23) holds we have shown that $X^*$ is a global maximum. This is done by contradiction. Suppose (7.23) does not hold for all $X \in S$, i.e., there exists $X_1 \in S$ such that

$$f(X^*) < f(X_1). \tag{7.24}$$

Now, as $f$ is concave, we have

$$f(\alpha X_1 + (1 - \alpha) X^*) \ge \alpha f(X_1) + (1 - \alpha) f(X^*) \quad \text{for all } \alpha,\ 0 \le \alpha \le 1. \tag{7.25}$$

By taking $\alpha$ to within $\delta / \|X^* - X_1\|$ of 0, i.e.,

$$0 < \alpha < \delta / \|X^* - X_1\|, \tag{7.26}$$

the point $\alpha X_1 + (1 - \alpha) X^*$ will lie in the $\delta$-neighbourhood of $X^*$, so that (7.23) applies to it; that is,

$$f(X^*) \ge f(\alpha X_1 + (1 - \alpha) X^*). \tag{7.27}$$

Then, by (7.25), we have

$$f(X^*) \ge \alpha f(X_1) + (1 - \alpha) f(X^*),$$

which, as $\alpha > 0$, simplifies to $f(X^*) \ge f(X_1)$ and so contradicts (7.24). Thus

$$f(X^*) \ge f(X) \quad \text{for all } X \in S.$$

That is, $f$ has a global maximum at $X^*$. □

The proof of the analogous theorem for convex functions is left to the reader.

Theorem 7.15. If $f$ is convex on a simply connected region $S \subseteq \mathbb{R}^n$ with a local minimum $X^* \in S$, then $f$ has a global minimum at $X^*$.

7.4 Optimization of Constrained Functions of Several Variables

Feasible solutions to many realistic optimization problems are constrained to be within a subset of $n$-dimensional Euclidean space. As examples, a company may not be able to invest more funds than it possesses, time allocated to a machine must be nonnegative, and pollution laws may require values of certain variables to be less than a given level. When this is the case, as it is in nearly all real-world problems, one must maximize the objective function subject to a number of constraints on its variables. These constraints are usually expressed in the form of equations or inequalities. We begin with the case where all the constraints are equations.

7.4.1 Multidimensional Optimization with Equality Constraints

This problem can be stated in general as follows:

$$\text{Maximize:} \quad f(X) \tag{7.28}$$
$$\text{subject to:} \quad g_j(X) = 0, \quad j = 1, 2, \ldots, m \tag{7.29}$$

where

$$X = (x_1, x_2, \ldots, x_n)^T.$$

A typical numerical example of such a problem is given in the next section.

One obvious approach is to use the equations to eliminate some of the variables from the problem. For example consider the following problem:

$$\text{Maximize:} \quad f(X) = x_1^2 + 2(x_2 - 4)^2 + 8$$
$$\text{subject to:} \quad x_1 - x_2^2 + 4 = 0.$$

As

$$x_1 = x_2^2 - 4,$$

we are left with the following unconstrained problem in one dimension:

$$\text{Maximize:} \quad (x_2^2 - 4)^2 + 2(x_2 - 4)^2 + 8,$$

which is easier to solve. Of course, this approach of elimination will be successful in reducing the number of variables in the problem only if it is possible to express a solution for one or more of the variables explicitly. Often, however, this cannot be done.
It can be shown that when the variables of the objective function must satisfy constraints which are equations, the optimal point must lie on the boundary of the feasible region $F$. There are a number of methods available for locating optima which lie in the interior of $F$. We now present the Jacobian method and Lagrange's method, which both transform such a problem into one with its optima all lying in the interior of $F$.

7.4.1.1 The Jacobian Method


We now present a method which solves the problem (7.28), (7.29). It is assumed that $f$ and $g_j$, $j = 1, 2, \ldots, m$, have continuous second derivatives. The strategy is to find a suitable expression for the first derivatives of $f$ at all points which satisfy (7.29). The feasible stationary points of $f$ are the ones among these for which

$$\frac{\partial f}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n. \tag{7.30}$$

The maximum points are identified among those satisfying (7.30) by using Theorem 7.12.
These ideas are now placed on a firm mathematical basis. Consider any point $X$ which satisfies (7.29). In any neighbourhood of $X$ there will exist at least one point $X + h$ which satisfies (7.29), because $X$ is on the boundary of the region defined by (7.29). Expanding $f$ and $g_j$, $j = 1, 2, \ldots, m$, in a Taylor series about $X$, we get

$$f(X + h) = f(X) + \nabla f(X)^T h + \tfrac{1}{2} h^T H_f(\theta X + (1 - \theta)(X + h)) h,$$
$$g_j(X + h) = g_j(X) + \nabla g_j(X)^T h + \tfrac{1}{2} h^T H_{g_j}(\theta X + (1 - \theta)(X + h)) h, \quad j = 1, 2, \ldots, m,$$

for some $\theta$, $0 < \theta < 1$. As $X + h$ approaches $X$, we get

$$f(X + h) \approx f(X) + \nabla f(X)^T h,$$
$$g_j(X + h) \approx g_j(X) + \nabla g_j(X)^T h, \quad j = 1, 2, \ldots, m.$$

Therefore, writing $\partial X$ for the perturbation $h$ and $\partial f(X)$, $\partial g_j(X)$ for the resulting changes in $f$ and $g_j$,

$$\partial f(X) \approx \nabla f(X)^T \partial X,$$
$$\partial g_j(X) \approx \nabla g_j(X)^T \partial X, \quad j = 1, 2, \ldots, m.$$

Using (7.29), we get

$$\partial g_j(X) = 0, \quad j = 1, 2, \ldots, m.$$

Thus we can state, to within a first order approximation,

$$\partial f(X) - \nabla f(X)^T \partial X = 0,$$
$$\nabla g_j(X)^T \partial X = 0, \quad j = 1, 2, \ldots, m. \tag{7.31}$$

Now as $\nabla f(X)$ and $\nabla g_j(X)$, $j = 1, 2, \ldots, m$, consist of known constants, (7.31) constitutes a set of $(m + 1)$ linear equations in $(n + 1)$ unknowns, $\partial x_1, \partial x_2, \ldots, \partial x_n, \partial f(X)$. If the equations are linearly dependent one discards the smallest number whose removal leaves an independent set. Hence we can assume that there are no more equations than variables, i.e.,

$$m \le n.$$

Now

$$m = n$$

leads to the unique solution

$$\partial X = 0,$$

which implies that there are no feasible points other than $X$ in any neighbourhood of $X$. That is, the set of feasible points is discrete. Hence we can assume that

$$m < n.$$

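We therefore partition the variables as

$$X = (w_1, w_2, \ldots, w_m, y_1, y_2, \ldots, y_{n-m})^T. \tag{7.32}$$

The variables $w_i$, $i = 1, 2, \ldots, m$, are called state variables and the variables $y_i$, $i = 1, 2, \ldots, (n - m)$, are called decision variables. Now (7.31) can be rewritten using (7.32), as follows:

$$\sum_{i=1}^{m} \frac{\partial f(X)}{\partial w_i} \partial w_i + \sum_{i=1}^{n-m} \frac{\partial f(X)}{\partial y_i} \partial y_i = \partial f(X), \tag{7.33}$$

$$\sum_{i=1}^{m} \frac{\partial g_j(X)}{\partial w_i} \partial w_i + \sum_{i=1}^{n-m} \frac{\partial g_j(X)}{\partial y_i} \partial y_i = 0, \quad j = 1, 2, \ldots, m. \tag{7.34}$$

Suppose now that the $\partial y_i$, $i = 1, 2, \ldots, (n - m)$, are given arbitrary values. When these are substituted into (7.34) unique values for the $\partial w_i$, $i = 1, 2, \ldots, m$, can be found which keep $X + h$ inside the feasible region. One can then use all these values in (7.33) to see if

$$\partial f(X) > 0,$$

i.e., the new point $X + h$ is an improvement over $X$.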
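We now state the explicit steps needed to carry this out using vector notation. The matrix

$$J = \begin{pmatrix}
\dfrac{\partial g_1}{\partial w_1} & \dfrac{\partial g_1}{\partial w_2} & \cdots & \dfrac{\partial g_1}{\partial w_m} \\
\dfrac{\partial g_2}{\partial w_1} & \dfrac{\partial g_2}{\partial w_2} & \cdots & \dfrac{\partial g_2}{\partial w_m} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial g_m}{\partial w_1} & \dfrac{\partial g_m}{\partial w_2} & \cdots & \dfrac{\partial g_m}{\partial w_m}
\end{pmatrix}$$

is called the Jacobian matrix, and the matrix

$$C = \begin{pmatrix}
\dfrac{\partial g_1}{\partial y_1} & \dfrac{\partial g_1}{\partial y_2} & \cdots & \dfrac{\partial g_1}{\partial y_{n-m}} \\
\dfrac{\partial g_2}{\partial y_1} & \dfrac{\partial g_2}{\partial y_2} & \cdots & \dfrac{\partial g_2}{\partial y_{n-m}} \\
\vdots & \vdots & & \vdots \\
\dfrac{\partial g_m}{\partial y_1} & \dfrac{\partial g_m}{\partial y_2} & \cdots & \dfrac{\partial g_m}{\partial y_{n-m}}
\end{pmatrix}$$

is called the control matrix. It is important in defining the state and decision variables that the left-hand sums in (7.33) and (7.34) be linearly independent. It is always possible to make a choice of which $x_j$'s become state variables so that this happens, because we have assumed that the equations in (7.31) are linearly independent. The implication of this is that $J$ is nonsingular. Now let

$$W = (w_1, w_2, \ldots, w_m)^T,$$
$$Y = (y_1, y_2, \ldots, y_{n-m})^T.$$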

Then (7.33) and (7.34) become

$$\nabla_w f^T \partial W + \nabla_y f^T \partial Y = \partial f(W, Y) \tag{7.35}$$

and

$$J\,\partial W + C\,\partial Y = 0, \tag{7.36}$$

respectively. As $J$ is nonsingular, we can multiply (7.36) by $J^{-1}$:

$$\partial W = -J^{-1} C\,\partial Y. \tag{7.37}$$

It can be seen, as was stated earlier, that if the elements in $\partial Y$ are given values, $\partial W$ can be calculated using (7.37). Substituting this into (7.35) yields

$$\partial f(W, Y) = (\nabla_y f^T - \nabla_w f^T J^{-1} C)\,\partial Y. \tag{7.38}$$

From (7.38) we can form what is known as the constrained gradient of $f$ with respect to $Y$, which is

$$\nabla_y^c f = \frac{\partial^c f(W, Y)}{\partial^c Y} = \nabla_y f^T - \nabla_w f^T J^{-1} C. \tag{7.39}$$

Each element of $\nabla_y^c f$, namely $\partial^c f / \partial^c y_i$, $i = 1, 2, \ldots, (n - m)$, is called a constrained derivative. It represents the rate of change of $f$ when the decision variable $y_i$ is perturbed, the other decision variables being held constant and the state variables adjusting so that the point remains feasible.
When constrained derivatives are used one can show that Theorem 7.11 is applicable, i.e., if $X^*$ is a feasible maximum it is necessary that

$$\nabla_y^c f(X^*) = 0. \tag{7.40}$$

Equation (7.40) can be used to identify all the stationary points; it remains to find which one is the global maximum. To do this we use Theorem 7.12, with the modification that $H$ is the matrix of constrained second derivatives with respect to the independent variables $y_1, y_2, \ldots, y_{n-m}$ only, and not $w_1, w_2, \ldots, w_m$. The complete method will be illustrated with a numerical example.

7.4.1.2 Numerical Example


Consider the following problem:
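Maximize: $f(X) = f((x_1, x_2, x_3)) = -2x_1^2 - x_2^2 - 3x_3^2$
subject to: $g_1(X) = x_1 + 2x_2 + x_3 - 1 = 0$
$g_2(X) = 4x_1 + 3x_2 + 2x_3 - 2 = 0.$
Here m = 2, n = 3, and we define
$w = (w_1, w_2)^T = (x_1, x_2)^T$
$y = (y_1) = (x_3)$
$\nabla_w f = (-4x_1, -2x_2)^T$
$\nabla_y f = (-6x_3)$
$J = \begin{pmatrix} 1 & 2 \\ 4 & 3 \end{pmatrix}$
$C = \begin{pmatrix} 1 \\ 2 \end{pmatrix}.$
Now, by (7.39), we have
$\nabla_y^c f(X) = \nabla_y f - \nabla_w f^T J^{-1} C$
$= -6x_3 - (-4x_1, -2x_2) \begin{pmatrix} -3/5 & 2/5 \\ 4/5 & -1/5 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix}$
$= -6x_3 + \tfrac{4}{5}x_1 + \tfrac{4}{5}x_2$
$= 0$, by (7.40).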
Combining this equation with the two original constraints, we have

$\tfrac{4}{5}x_1 + \tfrac{4}{5}x_2 - 6x_3 = 0$
$x_1 + 2x_2 + x_3 = 1$
$4x_1 + 3x_2 + 2x_3 = 2,$
which have a unique solution:
$X^* = (\tfrac{5}{27}, \tfrac{10}{27}, \tfrac{2}{27})^T,$
which is a stationary point. It is now determined whether this is a maximum
point by using Theorem 7.12:
$\nabla_y^c f = \tfrac{4}{5}x_1 + \tfrac{4}{5}x_2 - 6x_3.$
Therefore
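$\frac{\partial^{c\,2} f}{\partial^c y_1^2} = \frac{4}{5}\frac{dx_1}{dx_3} + \frac{4}{5}\frac{dx_2}{dx_3} - 6.$  (7.41)
Now, from (7.37), we have

$\begin{pmatrix} dx_1/dx_3 \\ dx_2/dx_3 \end{pmatrix} = -J^{-1}C = -\begin{pmatrix} -3/5 & 2/5 \\ 4/5 & -1/5 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} -1/5 \\ -2/5 \end{pmatrix}.$

Substituting these values into (7.41) yields

$\frac{\partial^{c\,2} f}{\partial^c y_1^2} = (\tfrac{4}{5})(-\tfrac{1}{5}) + (\tfrac{4}{5})(-\tfrac{2}{5}) - 6 < 0.$

Thus X* is indeed a maximum point.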
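
The arithmetic of this example can be checked mechanically. The following Python sketch is an illustration only: it assumes exact rational arithmetic from the standard `fractions` module, and the helper `solve` and the variable names are hypothetical choices, not part of the Jacobian method itself. It recomputes $J^{-1}C$, the coefficients of the constrained gradient, the stationary point, and the constrained second derivative.

```python
from fractions import Fraction as Fr

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fr(v) for v in row] + [Fr(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [row[n] for row in M]

# Data of the example: w = (x1, x2) are the state variables, y = (x3).
J = [[1, 2], [4, 3]]   # derivatives of g1, g2 with respect to w
C = [1, 2]             # derivatives of g1, g2 with respect to y

# z = J^{-1} C, obtained by solving J z = C.
z = solve(J, C)

# grad_w f = (-4 x1, -2 x2) and grad_y f = -6 x3, so by (7.39) the constrained
# gradient -6 x3 - grad_w f . z is linear in (x1, x2, x3) with these coefficients.
coeffs = [4 * z[0], 2 * z[1], Fr(-6)]

# Stationarity (7.40) together with the two constraints.
x_star = solve([coeffs, [1, 2, 1], [4, 3, 2]], [0, 1, 2])

# Constrained second derivative (7.41), using dw/dy = -J^{-1} C from (7.37).
second = coeffs[0] * (-z[0]) + coeffs[1] * (-z[1]) + coeffs[2]

print(x_star, second)
```

Exact fractions are used so that the result can be compared directly with the values quoted in the text; a floating-point linear solver would serve equally well for larger problems.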

7.4.1.3 The Method of Lagrange


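The following method was developed by Lagrange in 1761. From the argument
that was used to develop (7.31) it can be said that
$\partial f(w, y) = \nabla_w f^T\,\partial w + \nabla_y f^T\,\partial y,$  (7.42)
$\partial g = J\,\partial w + C\,\partial y,$  (7.43)
where $g = (g_1, g_2, \ldots, g_m)^T$, the vector of constraint functions. Eliminating
$\partial w$ from (7.42) and (7.43), that is, solving (7.43) for
$\partial w = J^{-1}(\partial g - C\,\partial y)$ and substituting into (7.42), produces
$\partial f(w, y) - \nabla_w f^T J^{-1}\,\partial g = \nabla_y f^T\,\partial y - \nabla_w f^T J^{-1} C\,\partial y.$
Therefore
$\partial f(w, y) = \nabla_w f^T J^{-1}\,\partial g + (\nabla_y f^T - \nabla_w f^T J^{-1} C)\,\partial y,$
and from (7.39),
$\partial f(w, y) = \nabla_w f^T J^{-1}\,\partial g + \nabla_y^c f\,\partial y.$  (7.44)
Now, if $X^{*T} = (w^{*T}, y^{*T})$ is a local maximum, then
$\nabla_y^c f(X^*) = 0.$
So, from (7.44), we have
$\partial f(w^*, y^*) = \nabla_w f^T J^{-1}\,\partial g,$
hence
$\frac{\partial f(w^*, y^*)}{\partial g} = \nabla_w f^T J^{-1}.$  (7.45)

Equation (7.45) is useful in allowing one to analyze the rate at which
$f(w^*, y^*)$, the optimal value, changes when g is perturbed. The individual
components of this vector are called sensitivity coefficients for this reason.
These sensitivity coefficients (which are constant, as can be seen from
(7.45)) are now introduced into (7.29). Let
$\lambda = \nabla_w f^T J^{-1}.$  (7.46)
From (7.46) we have
$\partial f(w^*, y^*) = \lambda\,\partial g.$  (7.47)
Define F, the Lagrangian, as
$F(X, \lambda) = f(X) - \lambda g.$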
Then the system of equations (7.29) and (7.47) corresponds to

$\frac{\partial F}{\partial \lambda_j} = 0, \quad j = 1, 2, \ldots, m$  (7.48)

$\frac{\partial F}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n.$  (7.49)

It is necessary that any stationary point satisfies (7.48) and (7.49), which
constitute a system of m + n equations in m + n unknowns, $\lambda_1, \lambda_2, \ldots, \lambda_m$,
$x_1, x_2, \ldots, x_n$. Any stationary point will produce a unique set of values for
the elements of $\lambda$, as long as (7.48) and (7.49) are independent. Hence these
values are independent of which members of X are assigned to w and
which to y. We now illustrate these ideas by using the method to solve the
previous numerical example.

7.4.1.4 Numerical Example


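Maximize: $f(X) = -2x_1^2 - x_2^2 - 3x_3^2$
subject to: $g_1(X) = x_1 + 2x_2 + x_3 - 1 = 0$
$g_2(X) = 4x_1 + 3x_2 + 2x_3 - 2 = 0.$
Now
$F(X, \lambda) = F(x_1, x_2, x_3, \lambda_1, \lambda_2)$
$= -2x_1^2 - x_2^2 - 3x_3^2 - \lambda_1(x_1 + 2x_2 + x_3 - 1) - \lambda_2(4x_1 + 3x_2 + 2x_3 - 2)$

$\frac{\partial F}{\partial x_1} = -4x_1 - \lambda_1 - 4\lambda_2 = 0$
$\frac{\partial F}{\partial x_2} = -2x_2 - 2\lambda_1 - 3\lambda_2 = 0$
$\frac{\partial F}{\partial x_3} = -6x_3 - \lambda_1 - 2\lambda_2 = 0$
$\frac{\partial F}{\partial \lambda_1} = -(x_1 + 2x_2 + x_3 - 1) = 0$
$\frac{\partial F}{\partial \lambda_2} = -(4x_1 + 3x_2 + 2x_3 - 2) = 0.$
This yields
$x_1^* = \tfrac{5}{27}$
$x_2^* = \tfrac{10}{27}$
$x_3^* = \tfrac{2}{27}$
$\lambda_1^* = -\tfrac{4}{27}$
$\lambda_2^* = -\tfrac{4}{27},$
which is of course the same optimal solution as that produced by the Jacobian
method.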
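
Because f is quadratic and the constraints are linear, the conditions (7.48) and (7.49) for this example form a linear system in the five unknowns. The sketch below, offered only as a check and not as part of the method, solves that system exactly and then confirms that the multipliers agree with the sensitivity coefficients $\nabla_w f^T J^{-1}$ of (7.45) and (7.46). The helper `solve` and all variable names are illustrative assumptions.

```python
from fractions import Fraction as Fr

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fr(v) for v in row] + [Fr(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [row[n] for row in M]

# Unknowns ordered as (x1, x2, x3, lambda1, lambda2).
A = [
    [-4, 0, 0, -1, -4],   # dF/dx1 = 0
    [0, -2, 0, -2, -3],   # dF/dx2 = 0
    [0, 0, -6, -1, -2],   # dF/dx3 = 0
    [1, 2, 1, 0, 0],      # g1(X) = 0
    [4, 3, 2, 0, 0],      # g2(X) = 0
]
b = [0, 0, 0, 1, 2]
x1, x2, x3, lam1, lam2 = solve(A, b)

# Sensitivity coefficients: lambda = grad_w f^T J^{-1} with J = [[1, 2], [4, 3]].
# A row vector times J^{-1} is found by solving J^T z = grad_w f.
grad_w = [-4 * x1, -2 * x2]
sens = solve([[1, 4], [2, 3]], grad_w)

print((x1, x2, x3), (lam1, lam2), sens)
```

The agreement between `sens` and the multipliers illustrates the remark above that the values of $\lambda$ do not depend on how X is partitioned into w and y.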

7.4.2 Multidimensional Optimization with Inequality Constraints


This problem can be stated in general as follows:
Maximize: $f(X)$  (7.50)
subject to: $g_j(X) \le 0, \quad j = 1, 2, \ldots, m.$  (7.51)
Necessary (and in some cases, sufficient) conditions for X to be a stationary
point for (7.50), (7.51) were developed by Kuhn and Tucker (1951). Many of
the algorithms for solving the above problem are based on these conditions,
and termination criteria concerned with recognizing when a stationary point
has been reached are derived from them. Wilde and Beightler (1967) presented
a development of the conditions based on constrained derivatives. The
nonrigorous formulation given here uses the Lagrangian.

7.4.2.1 The Kuhn-Tucker Conditions


Consider problem (7.50), (7.51). Adding nonnegative slack variables $s_j$ to the
left-hand side of (7.51) produces the following equations:
$g_j(X) + s_j = 0, \quad j = 1, 2, \ldots, m$  (7.52)
$s_j \ge 0, \quad j = 1, 2, \ldots, m.$  (7.53)
Apart from (7.53), we are now confronted with a problem in equality constraints
and can use one of the methods of Section 7.4.1. In particular, we
use the method of Lagrange and form the Lagrangian:
$F(X, \lambda) = f(X) - \sum_{j=1}^{m} \lambda_j (g_j(X) + s_j).$  (7.54)

From (7.46), we have
$\lambda_j = \frac{\partial f}{\partial g_j}, \quad j = 1, 2, \ldots, m.$
That is, $\lambda_j$ is the rate of change of f with respect to $g_j$, the jth constraint
function. Now, if
$g_j(X) \le 0$
becomes
$g_j(X) \le \varepsilon,$
where $\varepsilon$ is a relatively small positive number, the set of feasible solutions for
the original problem is no smaller. Hence f(X*), the optimal solution value,
will be no less than what it is for (7.50), (7.51). Hence
$\lambda_j \ge 0, \quad j = 1, 2, \ldots, m.$  (7.55)
Following the method of Lagrange, for any local maximum X*, with corresponding
values $s = s^*$ and $\lambda = \lambda^*$, we have
$\frac{\partial F(X^*)}{\partial X} = 0,$
$\frac{\partial F(X^*)}{\partial \lambda} = 0,$
and, by (7.55),
$\lambda_j^* \ge 0, \quad j = 1, 2, \ldots, m,$
that is,
$\nabla f(X^*) - \sum_{j=1}^{m} \lambda_j^* \nabla g_j(X^*) = 0$
$g(X^*) + S^* = 0$
$\lambda^* \ge 0,$
where S is the vector $(s_1, s_2, \ldots, s_m)^T$.

Now if $\lambda_j^* > 0$, by (7.46) we have
$\frac{\partial f}{\partial g_j} > 0,$
that is, the jth constraint is tight and $s_j^* = 0$. Hence
$g_j(X^*) = 0.$
However, if $g_j(X^*) < 0$, then $s_j^* > 0$ and the jth constraint is not tight. In
this case
$\frac{\partial f}{\partial g_j} = 0.$
Therefore
$\lambda_j^* = 0.$
Recapitulating, we have
$\lambda_j^* > 0 \Rightarrow g_j(X^*) = 0$
$g_j(X^*) < 0 \Rightarrow \lambda_j^* = 0.$
Therefore
$\lambda_j^* g_j(X^*) = 0, \quad j = 1, 2, \ldots, m.$
We have given an intuitive, nonrigorous outline of the following theorem:

Theorem 7.16. If f has a local maximum X* in the feasible region R of the
problem:
maximize: $f(X)$
subject to: $g_j(X) \le 0, \quad j = 1, 2, \ldots, m,$
where f and $g_j$, j = 1, 2, ..., m, have continuous first derivatives, and R is
well-behaved at its boundary, then it is necessary that
$\nabla f(X^*) - \sum_{j=1}^{m} \lambda_j^* \nabla g_j(X^*) = 0$  (7.56)
$g(X^*) \le 0$  (7.57)
$\lambda_j^* g_j(X^*) = 0, \quad j = 1, 2, \ldots, m$  (7.58)
$\lambda_j^* \ge 0, \quad j = 1, 2, \ldots, m$  (7.59)
for some set of real numbers $\lambda^* = (\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*)$.

The phrase "R is well-behaved at its boundary" needs some explanation.


When Kuhn and Tucker developed the above theory they found that for
certain feasible regions their theorem did not hold. This occurred when it
was possible to find a point X in such a region R with the following property.
There does not exist a continuous, differentiable curve C beginning at X
such that one can travel along C from X for a positive distance and remain
in R. This situation is thankfully rare in practice and occurs when X is at
the vertex of a cusp in R. Kuhn and Tucker therefore qualified the conditions
of their theorem by stating that it must be possible to find such a curve C
for any point X in R. This is called the constraint qualification.
The values $\lambda^* = (\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*)$ are called generalized Lagrange multipliers
for obvious reasons. Of course (7.57) represents the fact that X* must
be a feasible point for the original set of constraints. Conditions (7.56)-(7.59) are
termed the Kuhn-Tucker conditions.
For completeness we state the analogous theorem for the problem with
a minimization objective. We leave the proof as an exercise for the reader.

Theorem 7.17. If f has a local minimum X* in the feasible region R of the
problem:
Minimize: $f(X)$
subject to: $g_j(X) \le 0, \quad j = 1, 2, \ldots, m,$
where f and $g_j$, j = 1, 2, ..., m, have continuous first derivatives and the constraint
qualification is satisfied, then it is necessary that
$\nabla f(X^*) - \sum_{j=1}^{m} \lambda_j^* \nabla g_j(X^*) = 0$  (7.60)
$g_j(X^*) \le 0, \quad j = 1, 2, \ldots, m$  (7.61)
$\lambda_j^* g_j(X^*) = 0, \quad j = 1, 2, \ldots, m$  (7.62)
$\lambda_j^* \le 0, \quad j = 1, 2, \ldots, m$  (7.63)
for some set of real numbers $\lambda^* = (\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*)$.

Note that in (7.63) the inequality signs have the opposite sense to those
in (7.59).

7.4.2.1.1 When the Kuhn-Tucker Conditions are Sufficient. In previous


sections in this chapter we have shown that the necessary conditions for
f to have a local maximum (minimum) at X* are sufficient if f is concave
(convex). This is also true for Theorem 7.16 if the feasible region R defined
by (7.51) is convex. When will R be convex? A sufficient condition is given
by Theorem 7.18.

Theorem 7.18. The region R defined by
$g_j(X) \le 0, \quad j = 1, 2, \ldots, m$
will be convex if $g_j$ is convex for all j = 1, 2, ..., m.

PROOF. Consider two distinct points $X_1, X_2 \in R$. Then for all j, j = 1,
2, ..., m,
$g_j(X_1) \le 0$
$g_j(X_2) \le 0.$
Also, as each $g_j$ is convex, for all real $\alpha$, $0 \le \alpha \le 1$,
$g_j(\alpha X_1 + (1 - \alpha)X_2) \le \alpha g_j(X_1) + (1 - \alpha) g_j(X_2)$
$\le \alpha 0 + (1 - \alpha) 0.$
Hence
$g_j(\alpha X_1 + (1 - \alpha)X_2) \le 0.$
Therefore
$\alpha X_1 + (1 - \alpha)X_2 \in R.$ □

Before stating the main result of this section we first prove two lemmas
which are needed in the proof of Theorem 7.21. The lemmas (Theorems 7.19
and 7.20) are an n-dimensional generalization of Theorem 7.5.

Theorem 7.19. If f is convex on a convex region $R \subset \mathbb{R}^n$ with continuous
first partial derivatives within R, then for any two points $X, X + h \in R$,
$f(X + h) - f(X) \ge \nabla f(X)^T h.$

PROOF. As f is convex, we have
$f(\alpha(X + h) + (1 - \alpha)X) \le \alpha f(X + h) + (1 - \alpha) f(X)$, for all real $\alpha$, $0 \le \alpha \le 1$.
On rearranging, we obtain
$f(X + \alpha h) - f(X) \le \alpha (f(X + h) - f(X)).$  (7.64)
Making a first-order expansion of the Taylor series of the left-hand side
of (7.64), we obtain
$\nabla f(X + \theta\alpha h)^T \alpha h \le \alpha (f(X + h) - f(X))$, for some $\theta$, $0 < \theta < 1$,
$\nabla f(X + \theta\alpha h)^T h \le f(X + h) - f(X)$, for $\alpha > 0$,
and
$\lim_{\alpha \to 0} \nabla f(X + \theta\alpha h)^T h = \nabla f(X)^T h \le f(X + h) - f(X).$ □

We also need the analogous result for concave functions:

Theorem 7.20. If f is concave on a convex region R with continuous
partial derivatives within R, then for any two points $X, X + h \in R$,
$f(X + h) - f(X) \le \nabla f(X)^T h.$

We leave the proof of Theorem 7.20 as an exercise for the reader.


We come now to the main result of this section: the sufficiency of Kuhn-
Tucker conditions for concave functions.

Theorem 7.21. If, in the problem:
Maximize: $f(X)$,
subject to: $g_j(X) \le 0, \quad j = 1, 2, \ldots, m,$
f is concave and $g_j$ is convex for j = 1, 2, ..., m and there exist X* and
$\lambda^* = (\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*)$ which satisfy (7.56)-(7.59), then f has a global maximum
at X*.

PROOF. As f is concave, by Theorem 7.20 we have
$\nabla f(X^*)^T h \ge f(X^* + h) - f(X^*),$
and by (7.56) we have
$\sum_{j=1}^{m} \lambda_j^* \nabla g_j(X^*)^T h \ge f(X^* + h) - f(X^*).$  (7.65)
Now using the result of Theorem 7.19 in the left-hand side of (7.65), as $g_j$
is convex, together with (7.59), we obtain
$\sum_{j=1}^{m} \lambda_j^* [g_j(X^* + h) - g_j(X^*)] \ge f(X^* + h) - f(X^*),$
which by (7.58) becomes
$\sum_{j=1}^{m} \lambda_j^* g_j(X^* + h) \ge f(X^* + h) - f(X^*).$  (7.66)
Now if (X* + h) is a feasible solution to the problem, then
$g_j(X^* + h) \le 0, \quad j = 1, 2, \ldots, m.$
Hence, by (7.59), we have
$\lambda_j^* g_j(X^* + h) \le 0, \quad j = 1, 2, \ldots, m.$
Therefore, from (7.66), we have
$0 \ge f(X^* + h) - f(X^*),$
for all feasible X* + h. That is, f has a global maximum at X*. □

We state the analogous result for the problem with a minimization


objective; we leave the proof as an exercise for the reader.

Theorem 7.22. If, in the problem
Minimize: $f(X)$
subject to: $g_j(X) \le 0, \quad j = 1, 2, \ldots, m,$
f and $g_j$, j = 1, 2, ..., m, are convex and there exist X* and
$\lambda^* = (\lambda_1^*, \lambda_2^*, \ldots, \lambda_m^*)$ which satisfy (7.60)-(7.63), then f has a global minimum at X*.

7.4.2.1.2 Numerical Example. Let us return to the numerical example of


Section 7.4.1.2 and relax the equality constraints so that the problem be-
comes
Maximize: $f(X) = -2x_1^2 - x_2^2 - 3x_3^2$
subject to: $g_1(X) = x_1 + 2x_2 + x_3 - 1 \le 0$
$g_2(X) = 4x_1 + 3x_2 + 2x_3 - 2 \le 0.$
It can be shown that the feasible region defined by $g_1(X)$ and $g_2(X)$ obeys
the constraint qualification and so we can apply the result of Theorem 7.16.
Let $X^* = (x_1^*, x_2^*, x_3^*)$ be a local maximum; then:
$(-4x_1^*, -2x_2^*, -6x_3^*) - \lambda_1^*(1, 2, 1) - \lambda_2^*(4, 3, 2) = 0$  (7.56)'
$x_1^* + 2x_2^* + x_3^* - 1 \le 0$
$4x_1^* + 3x_2^* + 2x_3^* - 2 \le 0$  (7.57)'
$\lambda_1^*(x_1^* + 2x_2^* + x_3^* - 1) = 0$
$\lambda_2^*(4x_1^* + 3x_2^* + 2x_3^* - 2) = 0$  (7.58)'
$\lambda_1^* \ge 0, \quad \lambda_2^* \ge 0.$  (7.59)'
From (7.56)' and (7.58)' we have the following system of five equations in
five unknowns:
$-4x_1^* - \lambda_1^* - 4\lambda_2^* = 0$
$-2x_2^* - 2\lambda_1^* - 3\lambda_2^* = 0$
$-6x_3^* - \lambda_1^* - 2\lambda_2^* = 0$
$\lambda_1^* x_1^* + 2\lambda_1^* x_2^* + \lambda_1^* x_3^* = \lambda_1^*$
$4\lambda_2^* x_1^* + 3\lambda_2^* x_2^* + 2\lambda_2^* x_3^* = 2\lambda_2^*,$
whose only solution that also satisfies (7.57)' and (7.59)' is:

$(x_1^*, x_2^*, x_3^*) = (0, 0, 0),$
$(\lambda_1^*, \lambda_2^*) = (0, 0).$

Hence X* = 0 is a local maximum. However, as f is concave and $g_1$ and
$g_2$ are convex, we can apply Theorem 7.21 and need look no further for other
local maxima. X* = 0 is a unique local maximum and hence is the global
maximum.
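
One way to see why only this solution survives is to enumerate the possible sets of tight constraints. The sketch below is an illustration under stated assumptions, not a general algorithm: it relies on f being a quadratic with diagonal Hessian and on the constraints being linear, so that each choice of active set gives a linear system. The names `kkt_points`, `H`, `G`, and `c` are hypothetical. For each active set it solves (7.56) together with $g_j(X) = 0$ for the active j, sets the remaining multipliers to zero so that (7.58) holds, and keeps the candidate only if (7.57) and (7.59) are satisfied.

```python
from fractions import Fraction as Fr
from itertools import combinations

def solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A square, nonsingular)."""
    n = len(A)
    M = [[Fr(v) for v in row] + [Fr(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[col])]
    return [row[n] for row in M]

# f(X) = -2 x1^2 - x2^2 - 3 x3^2, so grad f = (H[0] x1, H[1] x2, H[2] x3).
H = [-4, -2, -6]
G = [[1, 2, 1], [4, 3, 2]]   # rows are the gradients of g1 and g2
c = [1, 2]                   # g_j(X) = G[j] . X - c[j] <= 0

def kkt_points():
    """Enumerate active sets and keep candidates satisfying (7.56)-(7.59)."""
    found = []
    m, n = len(G), len(H)
    for k in range(m + 1):
        for active in combinations(range(m), k):
            size = n + k     # unknowns: x_1..x_n, then lambda_j for active j
            A = [[Fr(0)] * size for _ in range(size)]
            b = [Fr(0)] * size
            for i in range(n):                       # condition (7.56)
                A[i][i] = Fr(H[i])
                for col, j in enumerate(active):
                    A[i][n + col] = Fr(-G[j][i])
            for row, j in enumerate(active):         # g_j(X) = 0 for active j
                A[n + row][:n] = [Fr(v) for v in G[j]]
                b[n + row] = Fr(c[j])
            sol = solve(A, b)
            x = sol[:n]
            lam = [Fr(0)] * m                        # inactive multipliers are zero
            for col, j in enumerate(active):
                lam[j] = sol[n + col]
            feasible = all(
                sum(g * xi for g, xi in zip(G[j], x)) <= c[j] for j in range(m)
            )                                        # condition (7.57)
            if feasible and all(l >= 0 for l in lam):  # condition (7.59)
                found.append((x, lam))
    return found

print(kkt_points())
```

Every active set containing at least one constraint yields a negative multiplier here, which is why the origin with both constraints slack is the only Kuhn-Tucker point.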

7.5 The Calculus of Variations


The calculus of variations is the branch of mathematics which is concerned
with the optimization of functionals. A functional is a special kind of function
which has as its domain a set of functions and as its range the set of real
numbers. The calculus of variations has applications in many areas: astro-
nautics, economics, business management, the physical sciences, engineering,
and others. As will be seen in the next section, some of the problems of this
subject have been studied since the dawn of mathematics.

7.5.1 Historical Background

One of the earliest recorded problems on this topic is concerned with the
finding of a curve of fixed length which encloses the greatest area with a
given straight line. It is said that this problem was solved intuitively by the
Phoenician queen Dido in approximately 850 B.C. According to Virgil she
persuaded a North African chieftain to allow her to have as much of his
land as she could enclose within the hide of a bull. She apparently had the
hide cut up into very thin strips which were joined together to form a single
length. This she laid out in a semicircle with the Mediterranean coast as
diameter. The piece of land enclosed, which has the maximum possible area
for the given length, was used to found the city of Carthage.
The calculus of variations received a large impetus in the seventeenth
and eighteenth centuries when some of the great mathematicians of those
times studied some of its problems. Many tasks were undertaken, such as
finding the shape of an object which caused least resistance when propelled
at constant velocity through a fluid. One of the most famous problems has
already been discussed in Chapter 1: the brachistochrone. Newton also
considered a related problem: that of finding the shape of a tunnel through
the earth joining two points on the surface which would cause a bead on a
frictionless wire in the tunnel to travel between the two points in minimum
time when falling under gravity. Contrary to the intuitive feeling of some
people, the solution turns out to be not a straight line joining the two points,
but a hypocycloid.
150 years later the German mathematician Zermelo solved the following
problem. Find the path of a boat crossing a river in minimum time from a
given point on one bank to a given point on the other. The river current is
known at all points and it is assumed that the boat has constant power.
In 1962 an isoperimetric problem similar to Queen Dido's was solved
by the Soviet mathematician Chaplygin. The problem was to find the course
of an aeroplane which encloses the greatest area in a given time while a
constant wind blows. It is assumed that the aeroplane has constant power.
The solution is an ellipse, which tends to a circle as the wind velocity tends
to zero.

7.5.2 Modern Applications

As was mentioned earlier there are numerous applications of the calculus


of variations to diverse areas, a few of which will be detailed now. One
of the main applications is concerned with problems in rocket control. For
example, designers often wish to find the minimum amount of fuel required
for a rocket of given specifications to achieve a given height above the earth's
surface while it experiences atmospheric resistance; or a designer may wish
to find the minimum time required for a rocket to reach the height when it
has only a given amount of fuel. Other applications occur in the financial
planning of both companies and individuals. For instance, a manager may
wish to discover how to maximize the production of certain commodities
within a fixed budget where costs are due to storage, machine set up, pro-
duction runs, and inflation.

7.5.3 A Simple Variational Problem

In this section we introduce a simple general problem of the calculus of


variations. Unfortunately problems of this type cannot be solved by the
methods of elementary calculus. Hence we extend the theory so that such
problems can be tackled.

Definition 7.14. A functional J is a function:
$J: D \to \mathbb{R},$
where D is a set of real-valued functions each of which is defined on a real
interval.

Note that in all the optimization problems studied so far in this book
we have wished to optimize a function f whose domain is some subset S
of $\mathbb{R}^n$ ($n \ge 1$). That is, we have searched for a vector $X = (x_1, x_2, \ldots, x_n)^T$
such that f(X) is a maximum or minimum among all vectors in S. In this
section we consider the optimization of a functional rather than a function.
That is, we search for a function f (rather than a vector) such that J(f) is a
maximum or minimum among all functions in D. The mathematics necessary
to optimize functionals is known as the calculus of variations.
Let D be defined by:
$D = \{f : f(x) = \sin nx, \; x_0 \le x \le x_1, \; n = 0, 1, 2, \ldots\}$
where $x_0$, $x_1$ are two given real numbers. Let J be defined by
$J(f) = \min_{x_0 \le x \le x_1} \{f(x)\}.$

Then a typical variational problem is to find $f^* \in D$ such that
$J(f^*) = \max_{f \in D} \{J(f)\}.$
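
The idea that a functional takes a whole function as its argument can be made concrete in code. The following sketch is only an approximation of the example above, under assumptions that are not in the text: the interval is taken to be $[0, \pi]$, the minimum over x is estimated by sampling a uniform grid, and D is truncated to n = 0, 1, ..., 5. The names `J`, `make_f`, and `candidates` are illustrative.

```python
import math

def J(f, x0, x1, samples=1001):
    """Approximate J(f) = min of f over [x0, x1] by sampling a uniform grid."""
    return min(f(x0 + (x1 - x0) * k / (samples - 1)) for k in range(samples))

def make_f(n):
    """Return the member f_n(x) = sin(n x) of the set D."""
    return lambda x: math.sin(n * x)

x0, x1 = 0.0, math.pi          # assumed interval, chosen for illustration
candidates = range(0, 6)       # D truncated to n = 0, 1, ..., 5 (an assumption)

# The variational problem: pick the function in D that maximizes J.
best_n = max(candidates, key=lambda n: J(make_f(n), x0, x1))
best_value = J(make_f(best_n), x0, x1)

print(best_n, best_value)
```

Note that `J` receives a function and returns a number, exactly as in Definition 7.14, whereas `make_f(n)` itself maps numbers to numbers.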

7.5.3.1 Necessary and Sufficient Conditions for a Local Optimum


As can be seen from the description of the historical problems given earlier,
often in the calculus of variations one wishes to optimize some functional
of time, distance, area, or volume. In many such cases J is of the form:

$J(f) = \int_{x_0}^{x_1} F(x, f, f')\,dx,$

where D is the set of continuous bounded functions defined on $[x_0, x_1]$ with
continuous second derivatives. D is called the set of admissible curves. F is
a continuous three-variable function with continuous partial derivatives. We
assume J is of this form throughout this section, and we further assume that
any $f \in D$ obeys boundary conditions. That is, there exist $y_0, y_1 \in \mathbb{R}$ such that

$f(x_0) = y_0$ and $f(x_1) = y_1$ for all $f \in D$.

What we wish to do is to find an $f^* \in D$ which optimizes J. In some variational
problems,
$J(f^*) = \max_{f \in D} \{J(f)\},$
while in others,
$J(f^*) = \min_{f \in D} \{J(f)\}.$

J is said to have a local maximum at $f^* \in D$ if there exists a positive real
number $\beta$ such that
$J(f^*) \ge J(f)$
for all $f \in D$ such that
$|f(x) - f^*(x)| < \beta$ for all $x \in [x_0, x_1]$.


Since all $f \in D$ are bounded there exist $m, M \in \mathbb{R}$ such that
$m \le f(x) \le M$ for all $x \in [x_0, x_1]$.

We now address ourselves to the task of developing necessary conditions
for J to have a local maximum at f*. It is possible to find a positive real
number $\delta$ such that

Let $f \in D$ be a function within the $\delta$-neighbourhood of f*, i.e.,
$|f(x) - f^*(x)| < \delta$ for all $x \in [x_0, x_1]$.  (7.67)
Then we can represent f as follows.
Let $\alpha$ be an arbitrary real-valued function with domain $[x_0, x_1]$ and continuous
second derivative such that
$\alpha(x_0) = \alpha(x_1) = 0.$  (7.68)
Then it is possible to find a small number $\varepsilon$ such that
$f(x) = f^*(x) + \varepsilon\alpha(x).$  (7.69)
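The function $\varepsilon\alpha$ is called the variation of f*. We now define the variation
of J as follows:

$\Delta J = \int_{x_0}^{x_1} F(x, f, f')\,dx - \int_{x_0}^{x_1} F(x, f^*, f^{*\prime})\,dx.$  (7.70)

As it has been assumed that J has a local maximum at f*, for all $f \in D$
satisfying (7.67) we have
$\Delta J \le 0$
for sufficiently small $\delta$. Substituting (7.69) in (7.70), we get

$\Delta J = \int_{x_0}^{x_1} F(x, f^* + \varepsilon\alpha, f^{*\prime} + \varepsilon\alpha')\,dx - \int_{x_0}^{x_1} F(x, f^*, f^{*\prime})\,dx$
$= \int_{x_0}^{x_1} \{F(x, f^* + \varepsilon\alpha, f^{*\prime} + \varepsilon\alpha') - F(x, f^*, f^{*\prime})\}\,dx.$  (7.71)

We can form the Taylor series of the integrand of (7.71) about $(x, f^*, f^{*\prime})$ to
obtain

$\Delta J = \int_{x_0}^{x_1} \left\{ \frac{\partial F}{\partial f^*}\varepsilon\alpha + \frac{\partial F}{\partial f^{*\prime}}\varepsilon\alpha' + \frac{1}{2}\frac{\partial^2 F}{\partial f^{*2}}\varepsilon^2\alpha^2 + \frac{\partial^2 F}{\partial f^*\,\partial f^{*\prime}}\varepsilon\alpha\,\varepsilon\alpha' + \frac{1}{2}\frac{\partial^2 F}{\partial f^{*\prime 2}}(\alpha')^2\varepsilon^2 + \cdots \right\} dx$
$= \varepsilon \int_{x_0}^{x_1} \left\{ \frac{\partial F}{\partial f^*}\alpha + \frac{\partial F}{\partial f^{*\prime}}\alpha' \right\} dx + \frac{\varepsilon^2}{2} \int_{x_0}^{x_1} \left\{ \frac{\partial^2 F}{\partial f^{*2}}\alpha^2 + 2\frac{\partial^2 F}{\partial f^*\,\partial f^{*\prime}}\alpha\alpha' + \frac{\partial^2 F}{\partial f^{*\prime 2}}(\alpha')^2 \right\} dx + O(\varepsilon^3).$  (7.72)
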

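The terms to the right of the first integral in (7.72), which are of order $\varepsilon^2$
and higher, can be neglected if $\varepsilon$ is small in magnitude. Thus we have

$\Delta J = \varepsilon \int_{x_0}^{x_1} \left\{ \frac{\partial F}{\partial f^*}\alpha + \frac{\partial F}{\partial f^{*\prime}}\alpha' \right\} dx.$

Now there is no restriction on the sign of $\varepsilon$, so that

$\left( \int_{x_0}^{x_1} \left\{ \frac{\partial F}{\partial f^*}\alpha + \frac{\partial F}{\partial f^{*\prime}}\alpha' \right\} dx > 0 \text{ and } \varepsilon > 0 \right) \Rightarrow \Delta J > 0$

and

$\left( \int_{x_0}^{x_1} \left\{ \frac{\partial F}{\partial f^*}\alpha + \frac{\partial F}{\partial f^{*\prime}}\alpha' \right\} dx < 0 \text{ and } \varepsilon < 0 \right) \Rightarrow \Delta J > 0.$

But we must have
$\Delta J \le 0.$
Hence
$\int_{x_0}^{x_1} \left\{ \frac{\partial F}{\partial f^*}\alpha + \frac{\partial F}{\partial f^{*\prime}}\alpha' \right\} dx = 0.$
Therefore
$\int_{x_0}^{x_1} \frac{\partial F}{\partial f^*}\alpha\,dx + \int_{x_0}^{x_1} \frac{\partial F}{\partial f^{*\prime}}\alpha'\,dx = 0,$

which becomes, on integrating the second expression by parts,

$\int_{x_0}^{x_1} \frac{\partial F}{\partial f^*}\alpha\,dx + \left[ \frac{\partial F}{\partial f^{*\prime}}\alpha \right]_{x_0}^{x_1} - \int_{x_0}^{x_1} \alpha \frac{d}{dx}\frac{\partial F}{\partial f^{*\prime}}\,dx = 0.$

On rearranging, we obtain

$\int_{x_0}^{x_1} \alpha \left( \frac{\partial F}{\partial f^*} - \frac{d}{dx}\frac{\partial F}{\partial f^{*\prime}} \right) dx + \left[ \frac{\partial F}{\partial f^{*\prime}}\alpha \right]_{x_0}^{x_1} = 0.$

Because of (7.68) the term to the right of the integral vanishes. Hence

$\int_{x_0}^{x_1} \alpha \left( \frac{\partial F}{\partial f^*} - \frac{d}{dx}\frac{\partial F}{\partial f^{*\prime}} \right) dx = 0.$  (7.73)

Now suppose there exists at least one $x \in [x_0, x_1]$ for which
$\frac{\partial F}{\partial f^*} - \frac{d}{dx}\frac{\partial F}{\partial f^{*\prime}}$  (7.74)
is nonzero. As $\alpha$ is arbitrary, it is possible to define $\alpha(x)$ to have the same sign
as (7.74). Thus the integrand
$\alpha \left( \frac{\partial F}{\partial f^*} - \frac{d}{dx}\frac{\partial F}{\partial f^{*\prime}} \right)$
will be positive everywhere it is nonzero, and there is at least one point x
where this occurs. This contradicts (7.73). Thus
$\frac{\partial F}{\partial f^*} - \frac{d}{dx}\frac{\partial F}{\partial f^{*\prime}} = 0$
is a necessary condition for J to have a local maximum at f*. The proof for
the case of a local minimum is analogous. We have proven a result known
as the Euler-Lagrange lemma, which is now stated formally: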

Theorem 7.23. (The Euler-Lagrange lemma.) If J has a local extremum at
f* it is necessary that

$\frac{\partial F}{\partial f^*} - \frac{d}{dx}\frac{\partial F}{\partial f^{*\prime}} = 0.$  (7.75)

Of course, the result in Theorem 7.23 is only necessary and not sufficient.
One strategy that may be considered to identify the global extremum is to
find all local extrema using the lemma (if there are not too many) and then
choose the best. A sufficient condition for the existence of an extremum has
been provided by Elsgolc (1961): If J has a local extremum at f*, a sufficient
condition for f* to be a local maximum (minimum) is
$\frac{\partial^2 F}{\partial (f^{*\prime})^2} \le 0 \quad (\ge 0).$

The reader will have noticed the strong similarity between the results on
the optimization of functionals in the calculus of variations and the optimization
of functions in elementary calculus. We now apply the result of
Theorem 7.23 to some examples.

7.5.3.2 Applications of the Euler-Lagrange Lemma


(i) The Shortest Length Problem. Consider the problem of joining two
given points $(x_0, y_0), (x_1, y_1) \in \mathbb{R}^2$ with the curve of shortest length. A curve
f joining the points has arc length
$\int_{x_0}^{x_1} \sqrt{1 + (f'(x))^2}\,dx,$
where
$f(x_0) = y_0, \quad f(x_1) = y_1.$  (7.76)
In the context of the general problem, we have
$F(x, f, f') = \sqrt{1 + (f'(x))^2}$
and
$J(f) = \int_{x_0}^{x_1} \sqrt{1 + (f'(x))^2}\,dx.$

Applying the Euler-Lagrange lemma, we obtain

$\frac{\partial F}{\partial f} = 0$

$\frac{\partial F}{\partial f'} = f'(1 + (f')^2)^{-1/2}$

$\frac{d}{dx}\left(\frac{\partial F}{\partial f'}\right) = f''(1 + (f')^2)^{-1/2} - (f')^2 f''(1 + (f')^2)^{-3/2}.$

Substituting these results into (7.75) produces
$f^{*\prime\prime}(1 + (f^{*\prime})^2)^{-1/2} - (f^{*\prime})^2 f^{*\prime\prime}(1 + (f^{*\prime})^2)^{-3/2} = 0.$
Hence
$f^{*\prime\prime}(1 + (f^{*\prime})^2)^{-3/2} = 0$
and therefore
$f^{*\prime\prime}(x) = 0$
$f^{*\prime}(x) = a$, a a constant
$f^*(x) = ax + b$, b a constant.

This is the equation of a straight line, with (7.76) uniquely determining a and b.
Hence we have shown that the shortest distance between two points is a
straight line.
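
This conclusion can be illustrated numerically by comparing the arc length of the straight line with that of nearby admissible curves $f^* + \varepsilon\alpha$. The sketch below is an illustration only, under assumptions not in the text: the endpoints (0, 0) and (1, 2), the particular variation $\alpha(x) = \sin(\pi(x - x_0)/(x_1 - x_0))$, which vanishes at both endpoints as (7.68) requires, and a midpoint-rule approximation of the integral. The function names are hypothetical.

```python
import math

def arc_length(f_prime, x0, x1, n=2000):
    """Midpoint-rule approximation of the integral of sqrt(1 + f'(x)^2) over [x0, x1]."""
    h = (x1 - x0) / n
    return sum(math.sqrt(1.0 + f_prime(x0 + (i + 0.5) * h) ** 2) for i in range(n)) * h

# Assumed endpoints, chosen only for illustration.
x0, y0, x1, y1 = 0.0, 0.0, 1.0, 2.0
slope = (y1 - y0) / (x1 - x0)   # slope a of the straight line f*(x) = a x + b

def perturbed_slope(eps):
    """Derivative of f*(x) + eps * alpha(x), with alpha(x) = sin(pi (x - x0)/(x1 - x0))."""
    k = math.pi / (x1 - x0)
    return lambda x: slope + eps * k * math.cos(k * (x - x0))

straight = arc_length(perturbed_slope(0.0), x0, x1)
lengths = {eps: arc_length(perturbed_slope(eps), x0, x1) for eps in (-0.5, -0.1, 0.1, 0.5)}

print(straight, lengths)
```

Every perturbed curve is longer than the straight line, and the excess shrinks roughly like $\varepsilon^2$, in line with the first-order term of (7.72) vanishing at f*.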
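(ii) The Problem of Least Surface Area of Rotation. Consider once again
a curve f joining two points $(x_0, y_0), (x_1, y_1) \in \mathbb{R}^2$. Suppose now that f is
rotated about the x-axis. The surface described by this rotation has area:

$J(f) = 2\pi \int_{x_0}^{x_1} f(x)\sqrt{1 + (f'(x))^2}\,dx.$

The problem is to find the curve f which describes least surface area, that is,
minimizes J(f). Now, taking $F = f\sqrt{1 + (f')^2}$,

$\frac{\partial F}{\partial f} = (1 + (f')^2)^{1/2}$

$\frac{\partial F}{\partial f'} = f f'(1 + (f')^2)^{-1/2}$

$\frac{d}{dx}\frac{\partial F}{\partial f'} = (f')^2(1 + (f')^2)^{-1/2} + f f''(1 + (f')^2)^{-1/2} - f (f')^2 f''(1 + (f')^2)^{-3/2}.$

Applying the Euler-Lagrange lemma, we obtain

$(1 + (f^{*\prime})^2)^{1/2} - \{(f^{*\prime})^2 + f^* f^{*\prime\prime}\}(1 + (f^{*\prime})^2)^{-1/2} + f^*(f^{*\prime})^2 f^{*\prime\prime}(1 + (f^{*\prime})^2)^{-3/2} = 0,$
hence, on multiplying by $(1 + (f^{*\prime})^2)^{3/2}$,
$(1 + (f^{*\prime})^2)^2 - \{(f^{*\prime})^2 + f^* f^{*\prime\prime}\}(1 + (f^{*\prime})^2) + f^*(f^{*\prime})^2 f^{*\prime\prime} = 0,$
and therefore
$(1 + (f^{*\prime})^2)(1 - f^* f^{*\prime\prime}) + f^*(f^{*\prime})^2 f^{*\prime\prime} = 0,$
or
$1 + (f^{*\prime})^2 - f^* f^{*\prime\prime} = 0.$
The solution of this differential equation is a curve f* called a catenary:
$f^*(x) = a \cosh\frac{x + b}{a},$
where a and b can be determined uniquely by (7.76).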
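
That the catenary satisfies the differential equation $1 + (f')^2 - f f'' = 0$ can be confirmed by direct substitution, since $f' = \sinh((x + b)/a)$ and $f'' = \cosh((x + b)/a)/a$. The short sketch below evaluates the residual numerically; the values a = 1.5 and b = -0.3 and the sample points are arbitrary assumptions made for illustration, and the function name is hypothetical.

```python
import math

def catenary_residual(a, b, x):
    """Residual 1 + (f')^2 - f f'' for the catenary f(x) = a cosh((x + b)/a)."""
    u = (x + b) / a
    f = a * math.cosh(u)
    f1 = math.sinh(u)           # first derivative f'
    f2 = math.cosh(u) / a       # second derivative f''
    return 1.0 + f1 ** 2 - f * f2

# Arbitrary illustrative parameters and sample points on [-2, 2].
residuals = [catenary_residual(1.5, -0.3, x / 10.0) for x in range(-20, 21)]
max_residual = max(abs(r) for r in residuals)

print(max_residual)
```

The residual vanishes up to rounding error because $1 + \sinh^2 u = \cosh^2 u$.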
(iii) The Brachistochrone. This problem was described in Chapter 1. It
involves finding a curve f joining points $(x_0, y_0), (x_1, y_1) \in \mathbb{R}^2$ which, if made
of frictionless wire, would cause a bead to slide under gravity from one point
to the other in least time. Thus the problem is to find the curve f* which
minimizes
$J(f) = \int_{x_0}^{x_1} \sqrt{\frac{1 + (f'(x))^2}{2g f(x)}}\,dx,$

where g is the acceleration due to gravity. In order to solve this problem
the following theorem is useful.

Theorem 7.24. If F does not depend upon x, then f*, the solution to (7.75), obeys

$F(f^*, f^{*\prime}) - f^{*\prime} F_{f'}(f^*, f^{*\prime}) = c,$
where c is a constant.

PROOF. Consider the expression
$F(f, f') - f' F_{f'}(f, f').$
Upon differentiation, this gives

$\frac{d}{dx}\{F(f, f') - f' F_{f'}(f, f')\} = F_f(f, f') f' + F_{f'}(f, f') f'' - f'' F_{f'}(f, f') - f' \frac{d}{dx} F_{f'}(f, f')$
$= f' \left\{ F_f(f, f') - \frac{d}{dx} F_{f'}(f, f') \right\}.$

Now, setting
$f = f^*,$
and using the Euler-Lagrange lemma, we get
$\frac{d}{dx}\{F(f^*, f^{*\prime}) - f^{*\prime} F_{f'}(f^*, f^{*\prime})\} = 0.$
Therefore
$F(f^*, f^{*\prime}) - f^{*\prime} F_{f'}(f^*, f^{*\prime}) = c.$ □

Returning to the brachistochrone, F is defined by:
$F(x, f, f') = F(f, f') = \sqrt{\frac{1 + (f')^2}{2gf}},$
which is not directly dependent upon x. Hence, on using Theorem 7.24, we
obtain
$\sqrt{\frac{1 + (f')^2}{2gf}} - \frac{(f')^2}{\sqrt{2gf(1 + (f')^2)}} = c.$
Hence
$\frac{1 + (f')^2 - (f')^2}{f^{1/2}(1 + (f')^2)^{1/2}} = \sqrt{2g}\,c,$
and therefore
$f(1 + (f')^2) = a,$
where
$a = \frac{1}{2gc^2}.$
Therefore
$f'(x) = \sqrt{\frac{a - f(x)}{f(x)}}.$

The solution to this differential equation can be expressed in parametric
form as follows:

$x = x_0 + \frac{a}{2}(t - \sin t)$
$f(x) = \frac{a}{2}(1 - \cos t), \quad t_0 \le t \le t_1,$  (7.77)
where it has been assumed that $f(x_0) = 0$, and $t_0$, $t_1$ correspond to the endpoints
of the wire, $(x_0, y_0)$, $(x_1, y_1)$. (7.77) describes a curve known as a cycloid.
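
The parametric solution can be checked against the first integral $f(1 + (f')^2) = a$ derived above, using $f' = (df/dt)/(dx/dt)$. The sketch below does this numerically; the value a = 2 and the sample values of t are arbitrary assumptions for illustration, and the function name is hypothetical. The point t = 0 is excluded because dx/dt vanishes there.

```python
import math

def first_integral(a, t):
    """Evaluate f (1 + f'^2) on the cycloid (7.77), with f' = (df/dt)/(dx/dt)."""
    f = 0.5 * a * (1.0 - math.cos(t))
    df_dt = 0.5 * a * math.sin(t)
    dx_dt = 0.5 * a * (1.0 - math.cos(t))
    slope = df_dt / dx_dt
    return f * (1.0 + slope ** 2)

a = 2.0                                                     # arbitrary illustrative constant
values = [first_integral(a, 0.1 * k) for k in range(1, 31)]  # t = 0.1, 0.2, ..., 3.0
max_deviation = max(abs(v - a) for v in values)

print(max_deviation)
```

The quantity stays equal to a along the whole curve, as Theorem 7.24 requires of any solution when F does not depend on x.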

7.5.4 The Relationship Between C.V. and D.P.

The calculus of variations (C.V.) and dynamic programming (D.P.) (intro-


duced in the previous chapter) have a great deal in common as branches of
mathematics. We now present one instance of how D.P. can be used to solve
a simple, general variational problem.
Suppose it is wished to find the curve f* satisfying (7.76) which minimizes
some functional J, where
$J(f) = \int_{x_0}^{x_1} F(x, f, f')\,dx.$

Consider now an intermediate point (x', y') on f*. Then, as f* is optimal, the
part of the curve from (x', y') to $(x_1, y_1)$ must also be optimal for the problem:

Minimize: $\int_{x'}^{x_1} F(x, f, f')\,dx.$

The reasoning behind this statement is embodied in the principle of optimality
stated in Chapter 6. We can look upon the problem as one of D.P.
in which there is an infinite number of stages (points along the x-axis from
$x_0$ to $x_1$), each point x' corresponding to a state (x', f(x')).
Suppose f* is the optimal curve for $[x_0, x_1 - \Delta x]$ and is arbitrary on
$[x_1 - \Delta x, x_1]$, except that
$f^*(x_1) = y_1.$
Then

$\int_{x_0}^{x_1} F(x, f^*, f^{*\prime})\,dx = \int_{x_0}^{x_1 - \Delta x} F(x, f^*, f^{*\prime})\,dx + \int_{x_1 - \Delta x}^{x_1} F(x, f^*, f^{*\prime})\,dx.$
Now define a two-variable function dependent upon (x, y) by

$S(x, y) = \min_{f \in D} \int_{x_0}^{x} F(x, f, f')\,dx.$

As f* is taken to be optimal from $x_0$ to $x_1 - \Delta x$,

$S(x_1 - \Delta x, f^*(x_1 - \Delta x)) = \int_{x_0}^{x_1 - \Delta x} F(x, f^*, f^{*\prime})\,dx.$
Expanding $f^*(x_1 - \Delta x)$ in a Taylor series, we obtain

$S(x_1 - \Delta x, f^*(x_1) - f^{*\prime}(x_1)\Delta x + O(\Delta x^2)) = \int_{x_0}^{x_1 - \Delta x} F(x, f^*, f^{*\prime})\,dx.$
Also, it can be shown that

$\int_{x_1 - \Delta x}^{x_1} F(x, f, f')\,dx = F(x, f, f')\Delta x + O(\Delta x^2).$

Putting these results together, we have

$\min_{f \in D} \int_{x_0}^{x_1} F(x, f, f')\,dx = \int_{x_0}^{x_1 - \Delta x} F(x, f^*, f^{*\prime})\,dx + \min_{f \in D} \int_{x_1 - \Delta x}^{x_1} F(x, f, f')\,dx$

$= S(x_1 - \Delta x, f^*(x_1) - f^{*\prime}(x_1)\Delta x + O(\Delta x^2)) + \min_{f \in D} \{F(x, f, f')\Delta x + O(\Delta x^2)\}$

$= S(x_1, f^*(x_1)) - \frac{\partial S}{\partial x_1}\Delta x - \frac{\partial S}{\partial f^*} f^{*\prime}(x_1)\Delta x + O(\Delta x^2) + \min_{f \in D} \{F(x, f, f')\Delta x + O(\Delta x^2)\}.$

The left-hand side is $S(x_1, f^*(x_1))$, so this term cancels. Dividing by $\Delta x$
and letting $\Delta x \to 0$, we obtain

$0 = \min_{f \in D} \left\{ F(x, f, f') - \frac{\partial S}{\partial x} - f'(x_1)\frac{\partial S}{\partial f} \right\}.$

This is the basic partial differential equation of D.P.



However, the use of D.P. to solve C.V. problems which have simple
analytical solutions is like using a sledge hammer to crack a peanut. The
approach is most appropriate when f is so complicated that it has to be
approximated by numerical methods.
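
As a hedged illustration of the numerical approach, the following sketch applies a discrete form of the recursion for S(x, y) to the shortest length problem of Section 7.5.3.2, where the cost of one stage is approximated by the length of a straight segment. The grid, the number of stages, the endpoints (0, 0) and (1, 1), and the function name `dp_shortest_curve` are all assumptions made for this example; a finer grid would be needed for a general F.

```python
import math

def dp_shortest_curve(x0, y0, x1, y1, stages, y_grid):
    """Discrete analogue of S(x, y): least polyline length from (x0, y0) to (x1, y1).

    The x-axis is cut into equal stages and the curve may only pass through
    the heights listed in y_grid, which must contain y0 and y1.
    """
    dx = (x1 - x0) / stages
    # S[y] holds the least length of a polyline from (x0, y0) to height y
    # at the current stage; initially only the starting height is reachable.
    S = {y: (0.0 if y == y0 else math.inf) for y in y_grid}
    for _ in range(stages):
        # Principle of optimality: extend every optimal partial curve by one segment.
        S = {
            y: min(S[prev] + math.hypot(dx, y - prev) for prev in y_grid)
            for y in y_grid
        }
    return S[y1]

y_grid = [k / 10 for k in range(11)]           # heights 0.0, 0.1, ..., 1.0
length = dp_shortest_curve(0.0, 0.0, 1.0, 1.0, stages=10, y_grid=y_grid)

print(length, math.sqrt(2.0))
```

Here the straight line passes exactly through grid points, so the discrete optimum coincides with the analytical answer; in general the grid introduces a discretization error that decreases as the grid is refined.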

7.5.5 Further Horizons of C.V.

As the reader has no doubt gathered, the material presented so far in this
chapter represents only a mere glimpse at the most elementary theory of
the calculus of variations. While a detailed analysis of the more advanced
ideas is beyond the scope of this book, we present a brief outline of the
scope of the topic.

7.5.5.1 Multivariable Functions


Until now we have assumed that D, the domain of the functionals under
consideration, comprises functions of a single variable. It is desirable to
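generalize this to functions of many variables, as applications of this generalization
arise in many areas. So now we are considering functionals J where
$J: D \to \mathbb{R},$

and D is a set of functions such that if $f \in D$:
$f: \mathbb{R}^n \to \mathbb{R}, \quad n > 1.$
In this case the variational problem becomes

Optimize: $J(f), \quad f \in D,$

where each $f \in D$ must satisfy appropriate boundary conditions. If certain
conditions are met Theorem 7.23 can be generalized as follows:

Theorem 7.25. If f has continuous second partial derivatives and if J has a
local extremum at f* it is necessary that
$\frac{\partial F}{\partial f^*} - \sum_{i=1}^{n} \frac{\partial}{\partial x_i}\left(\frac{\partial F}{\partial f^*_{x_i}}\right) = 0,$
where F is the integrand of J, taken to depend on $x_1, \ldots, x_n$, f, and the
partial derivatives $f_{x_i} = \partial f/\partial x_i$.

PROOF. See Gelfand and Fomin (1963).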

7.5.5.2 Multivariable Functionals


We can also make a different generalization to the case in which the functional
J is multivariable, that is, J depends upon, say, n functions:
$J: D^n \to \mathbb{R}.$

In this case the variational problem becomes

Optimize: $J(f_1, f_2, \ldots, f_n),$

where the functions $f_i$, i = 1, 2, ..., n, are assumed to satisfy appropriate
boundary conditions. Once again if certain conditions are met Theorem 7.23
can be generalized:

Theorem 7.26. If each $f_i$, i = 1, 2, ..., n, has continuous second partial derivatives
and if J has a local extremum at $(f_1^*, f_2^*, \ldots, f_n^*)$ then it is necessary
that
$\frac{\partial F}{\partial f_i^*} - \frac{d}{dx}\frac{\partial F}{\partial f_i^{*\prime}} = 0, \quad i = 1, 2, \ldots, n.$

PROOF. See Gelfand and Fomin (1963).

7.5.5.3 Parametric Form


The solution to the brachistochrone was expressed in parametric form in
Section 7.5.3.2. Indeed, it is often convenient to express the curves of certain
variational problems in parametric form. Consider the simple variational
problem given at the beginning of Section 7.5.3.1 and suppose that x and f
depend upon the parameter t. If
$x(t_0) = x_0$ and $x(t_1) = x_1,$
then J becomes

$J(f) = \int_{t_0}^{t_1} F\left(x(t), f(t), \frac{df/dt}{dx/dt}\right)\frac{dx}{dt}\,dt$
$= \int_{t_0}^{t_1} G(x, f, \dot{f}, \dot{x}, t)\,dt,$

where $\dot{f}$ and $\dot{x}$ denote, respectively, the derivatives of f and x with respect
to t and G is the appropriate five-variable function.
The following theorem provides a necessary condition for a local extremum
for J.

Theorem 7.27. If J has a local extremum at f* it is necessary that

$\frac{\partial G}{\partial x} - \frac{d}{dt}\left(\frac{\partial G}{\partial \dot{x}}\right) = 0$  (7.78)

$\frac{\partial G}{\partial f} - \frac{d}{dt}\left(\frac{\partial G}{\partial \dot{f}}\right) = 0.$  (7.79)

PROOF. See Gelfand and Fomin (1963).
Equations (7.78) and (7.79) are not independent and are equivalent to
(7.75).

7.5.5.4 Constrained Variational Problems


In the simple variational problem of Section 7.5.3 the curves in D had to
obey very few conditions. Namely, any such curve had to be bounded, have
continuous second derivatives and obey boundary conditions. However, it
is necessary in many applications that the curves also obey additional constraints.
These are usually of three types: integral, differential, or algebraic
equations or inequalities.

7.5.5.4.1 Integral Constraints. As an example of a problem with integral


constraints we introduce the isoperimetric problem:
Optimize: $J(f) = \int_{x_0}^{x_1} F(x, f, f')\,dx, \quad f \in D$

subject to: $K(f) = \int_{x_0}^{x_1} G(x, f, f')\,dx - q = 0,$  (7.80)
where F and G have continuous second derivatives and q is a given real
constant. Applying the ideas of Section 7.4.1.3 we form the Lagrangian:

$J + \lambda K = \int_{x_0}^{x_1} F(x, f, f')\,dx + \lambda \left\{ \int_{x_0}^{x_1} G(x, f, f')\,dx - q \right\}$

$= \int_{x_0}^{x_1} \{F(x, f, f') + \lambda G(x, f, f')\}\,dx - \lambda q,$

which will have the same optimum as

$\int_{x_0}^{x_1} \{F(x, f, f') + \lambda G(x, f, f')\}\,dx.$
Applying Theorem 7.23 to this last expression produces the following necessary
condition for J to have a local extremum at f*:
$\frac{\partial F}{\partial f} - \frac{d}{dx}\frac{\partial F}{\partial f'} + \lambda\left(\frac{\partial G}{\partial f} - \frac{d}{dx}\frac{\partial G}{\partial f'}\right) = 0.$  (7.81)

Equations (7.80) and (7.81) can be solved to find f* and $\lambda$.

7.5.5.4.2 Differential Constraints. Let us now consider problems involving


differential constraints. Consider the following problem of two functions,
$f_1$ and $f_2$:
Optimize: $J(f_1, f_2) = \int_{x_0}^{x_1} F(x, f_1, f_1', f_2, f_2')\,dx$

subject to: $K(f_1, f_2) = G(x, f_1, f_1', f_2, f_2') = 0,$  (7.82)
where once again it is assumed that F and G have continuous second derivatives.
Here we do not form the Lagrangian, but instead the integral:
$I(f_1, f_2) = J + \int_{x_0}^{x_1} \lambda(x) K\,dx$
$= \int_{x_0}^{x_1} \{F(x, f_1, f_1', f_2, f_2') + \lambda(x) G(x, f_1, f_1', f_2, f_2')\}\,dx.$

It can be shown that if I has a local extremum at $f_1^*$ and $f_2^*$ then it is necessary
that
$\frac{d}{dx}\left(\frac{\partial F}{\partial f_i^{*\prime}} + \lambda(x)\frac{\partial G}{\partial f_i^{*\prime}}\right) = \frac{\partial F}{\partial f_i^*} + \lambda(x)\frac{\partial G}{\partial f_i^*}, \quad i = 1, 2.$  (7.83)

It can further be shown (see Gottfried and Weisman (1973)) that the application
of necessary conditions for the extremization of I is equivalent to the
application of them for the original constrained problem (7.82). Hence (7.83)
constitutes a set of necessary conditions for (7.82).
7.5.5.4.3 Algebraic Constraints. The constraint in (7.82) involved $f_1'$ and
$f_2'$ and hence was called a differential constraint. If these derivatives
are not present, we are left with a variational problem with a solely
algebraic constraint:

Optimize: $J(f_1,f_2) = \int_{x_0}^{x_1} F(x,f_1,f_2,f_1',f_2')\,dx,$

subject to: $K(f_1,f_2) = G(x,f_1,f_2) = 0.$

Once again one can use the integral

$$I(f_1,f_2) = J + \int_{x_0}^{x_1} \lambda(x) K\,dx$$

to develop necessary conditions for the existence of a local extremum. These
are presented in the following theorem.

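Theorem 7.28. If J has a local extremum at $(f_1^*, f_2^*)$ it is necessary that

$$\frac{\partial F}{\partial f_1} + \lambda(x)\frac{\partial G}{\partial f_1} - \frac{d}{dx}\frac{\partial F}{\partial f_1'} = 0$$

and

$$\frac{\partial F}{\partial f_2} + \lambda(x)\frac{\partial G}{\partial f_2} - \frac{d}{dx}\frac{\partial F}{\partial f_2'} = 0.$$

PROOF. See Gelfand and Fomin (1963).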

7.5.5.5 The Maximum Principle


The topic of control theory or optimal control is concerned with the finding
of a policy for the efficient operation of a physical system. Sometimes the
state of the system can be described by a real vector $x = (x_1, x_2, \ldots, x_n)$,
the elements of which vary with time as follows:

$$\frac{dx_i}{dt} = F_i(x_1, x_2, \ldots, x_n, f_1, f_2, \ldots, f_m), \quad i = 1, 2, \ldots, n. \tag{7.84}$$

Here $f_1, f_2, \ldots, f_m$ are bounded, piecewise continuous real functions,
dependent on time, forming a vector $f = (f_1, f_2, \ldots, f_m)$, and the $F_i$
are also continuous.

Now the $x_i$, $i = 1, 2, \ldots, n$, also depend upon t and it is assumed that
a set of initial boundary conditions:

$$x_i(t_0) = a_i, \quad i = 1, 2, \ldots, n, \tag{7.85}$$

are satisfied at the beginning $t_0$ of the time span $[t_0, t_1]$ under
consideration, where the $a_i$ are given constants. Consider now some
measurement $F_0(x,f)$ (a differentiable function) of the performance of the
system. Then for any solution $f_1, f_2, \ldots, f_m$ to (7.84) we can calculate
a real number J(f) where

$$J(f) = \int_{t_0}^{t_1} F_0(x,f)\,dt.$$

Let $D = \{f = (f_1, f_2, \ldots, f_m) : f_i,\ i = 1, 2, \ldots, m$, are continuous real
functions defined on $[t_0, t_1]$, satisfying (7.84)$\}$. D is called the set of
admissible processes and sometimes has further restrictions placed upon it.
Then J is said to have a local minimum at $f^* \in D$ if

$$J(f^*) \le J(f), \quad \text{for all } f \in D.$$

We now examine what conditions are necessary for J to have a local
minimum at f*. To this end we introduce a new variable $x_0$, where

$$\frac{dx_0}{dt} = F_0(x,f), \tag{7.86}$$

and define

$$x_0(t_0) = 0. \tag{7.87}$$

Integrating (7.86), we obtain

$$\int_{t_0}^{t_1} \frac{dx_0}{dt}\,dt = \int_{t_0}^{t_1} F_0(x,f)\,dt = J(f).$$

Therefore

$$J(f) = [x_0(t)]_{t_0}^{t_1} = x_0(t_1) - x_0(t_0) = x_0(t_1), \quad \text{by (7.87)}.$$

Hence the problem can be restated as follows:

Minimize: $x_0(t_1)$, $f \in D$

subject to: $\dfrac{dx_i}{dt} = F_i(x,f), \quad i = 0, 1, \ldots, n$ (7.88)

$x_i(t_0) = a_i, \quad i = 0, 1, \ldots, n,$ (7.89)

where $a_0 = 0$ by (7.87).


Applying the necessary conditions of Section 7.5.5.4.2 to this problem, it
is easy to show that, on ignoring the constraints in (7.88) and (7.89)
corresponding to i = 0, we obtain

$$\frac{\partial F_0}{\partial x_i} - \sum_{k=1}^{n} \lambda_k(t)\frac{\partial F_k}{\partial x_i} - \frac{d\lambda_i}{dt} = 0, \quad i = 1, 2, \ldots, n \tag{7.90}$$

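and

$$\frac{\partial F_0}{\partial f_j} - \sum_{k=1}^{n} \lambda_k(t)\frac{\partial F_k}{\partial f_j} = 0, \quad j = 1, 2, \ldots, m. \tag{7.91}$$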
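We now construct what is known as the Hamiltonian function H where

$$H(x, f, \lambda) = \sum_{i=0}^{n} \lambda_i(t) F_i(x,f). \tag{7.92}$$

Then (7.88) can be expressed as follows:

$$\frac{\partial H}{\partial \lambda_i} = \frac{dx_i}{dt}, \quad i = 0, 1, \ldots, n. \tag{7.93}$$

Taking the partial derivatives of H with respect to $x_i$, we obtain

$$\frac{\partial H}{\partial x_i} = \sum_{k=0}^{n} \lambda_k(t)\frac{\partial F_k}{\partial x_i},$$

which, by (7.90), yields

$$\frac{\partial H}{\partial x_i} = \lambda_0(t)\frac{\partial F_0}{\partial x_i} + \frac{\partial F_0}{\partial x_i} - \frac{d\lambda_i}{dt}.$$

Now, as (7.86), which is the first constraint in the family (7.88), is artificial,
we can assign $\lambda_0(t)$ an arbitrary constant value for all $t \in [t_0, t_1]$.
Thus, let

$$\lambda_0(t) = -1, \quad t \in [t_0, t_1]. \tag{7.94}$$

Then we have

$$\frac{\partial H}{\partial x_i} = -\frac{d\lambda_i}{dt}, \quad i = 0, 1, 2, \ldots, n. \tag{7.95}$$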
Taking the partial derivative of H with respect to $f_j$ and using (7.91)
and (7.86), we obtain

$$\frac{\partial H}{\partial f_j} = 0, \quad j = 1, 2, \ldots, m. \tag{7.96}$$

Thus we can replace the necessary conditions (7.90) and (7.91) by (7.93),
(7.95), and (7.96), and state the following theorem:

Theorem 7.29. (The maximum principle) If $f^* = (f_1^*, f_2^*, \ldots, f_m^*)$ is
optimal and $x = (x_1, x_2, \ldots, x_n)$ obeys (7.84) and (7.85) then there exists
$\lambda(t) = (\lambda_0(t), \lambda_1(t), \ldots, \lambda_n(t))$ such that (7.93), (7.94),
(7.95), and (7.96) are satisfied for H as defined in (7.92).

These results have been known for many years. However, recently Pon-
tryagin et al. (1962) have extended this theory to cover the case when the
functions fl, f2, ... , fm must also obey a family of inequality constraints.
Their results have come to be known as the maximum principle. It is identical
to Theorem 7.29 when the optimal vector f is in the interior of the region
defined by the inequality constraints.

7.6 Exercises
1. Locate all extrema of the following functions and identify the nature of each,
where x E R.
(a) f(x) = x3 + tx2 - 18x + 19
(b) $f(x) = 6x^4 + 3x^2 + 42$
(c) $f(x) = x^2 + 4x - 8$
(d) $f(x) = 6x^2 + \sqrt{3}\,x - 9$
(e) $f(x) = x^{12} - 14x^{11} + x^{10} + 90x^9 + 8x^8 + 6$.
2. Given that x E [ -tH find the global extrema of each function in Exercise 1.
3. Prove that a function $f: I \to R$ is convex if and only if, for all $\alpha \in R$,
$0 \le \alpha \le 1$, and for all $x_1, x_2 \in I$,

4. Prove Theorem 7.4.


5. Prove Theorem 7.6.
6. Prove Theorem 7.8.
7. Prove Theorem 7.10.
8. Use the results of Section 7.2.7 to find the global extrema of the following
functions, which are either concave, convex, or both.
(a) $f(x) = \sin x$, $0 \le x \le \pi$.
(b) $f(x) = \cos x$, $-\pi/2 \le x \le \pi$.
(c) $f(x) = 4x - 2$, $-6 \le x \le 9$.
(d) $f(x) = 3x^2 - 18x + 2$, $-3 \le x \le 20$.
9. Prove Theorem 7.13.

10. Locate all extrema of the following functions and identify the nature of
each, where $X \in R^2$.
(a) $f(x_1, x_2) = x_1^2 - x_1 + 3x_2^2 + 18x_2 + 14$.
(b) $f(x_1, x_2) = 3x_1^2 + 4x_2^2 - 6x_1 - 7x_2 + 13x_1x_2 + 1$.
(c) $f(x_1, x_2) = x_1^2 - 6x_1 + x_2^2 - 16x_2 + 25$.
11. Given that
$-1 \le x_1 \le 5$
$-2 \le x_2 \le 6$

find the global extrema of each function in Exercise 10.

12. Prove that a function $f: S \to R$, $S \subset R^n$, is convex on S if and only
if, for all $\alpha \in R$, $0 \le \alpha \le 1$, and for all $X_1, X_2 \in S$,

13. Prove Theorem 7.15.

14. Solve the following problems using the Jacobian method.


(a) Maximize: $f(x_1,x_2,x_3,x_4) = -4x_1^2 - 3x_2^2 - 6x_3^2 - x_4^2$
subject to: $x_1 + x_2 + x_3 + x_4 - 2 = 0$
$3x_1 + 2x_2 + 4x_3 + x_4 - 3 = 0$
$x_1 + 4x_2 + 3x_3 + x_4 - 1 = 0$
[X* = (0.5752, -0.3856, 0.0784, 1.732)].
(b) Maximize: $f(x_1,x_2) = 6x_1^2 + 3x_2^2 + 4x_1x_2$
subject to: $x_1x_2 = 7$.
(c) Maximize: $f(x_1,x_2) = 2x_1^2 + x_2^2 + 3x_1 + 4x_2 + 9$
subject to: $x_1^2 + x_2 + 3x_1x_2 = 11$
$x_1 + x_2^2 + 4x_1x_2 = 12$.
(d) Maximize: $f(x_1,x_2,x_3,x_4) = -4x_1^2 - 2x_2^2 - x_3^2 - 2x_4^2$
subject to: $2x_1 + x_2 + x_3 + x_4 - 2 = 0$
$x_1 + 2x_2 + 2x_3 + x_4 - 1 = 0$
$3x_1 + 3x_2 + x_3 + x_4 = 0$
[X* = nt ig, - i~, 168)].
(e) Maximize: $f(x_1,x_2,x_3,x_4) = -x_1^2 - 2x_2^2 - 3x_3^2 - 4x_4^2 + 5$
subject to: $x_1 + x_2 - x_3 + x_4 + 1 = 0$
$2x_1 + 3x_2 - x_3 + 2x_4 - 2 = 0$
$2x_1 + x_2 + x_3 + 3x_4 - 1 = 0$
[X* = m,g,g, -m].
(f) Minimize: $f(x_1,x_2) = x_1^2 + x_2^2$
subject to: $x_1x_2 = 8$.
(g) Maximize: $f(x_1,x_2,x_3,x_4) = -x_1^2 + 2x_2^2 + 4x_3^2 - 3x_4^2$
subject to: $x_1 + 3x_2 + 4x_3 - 2x_4 = 0$
$x_1 + x_2 + x_3 + x_4 = 0$
$4x_1 + 3x_2 + 2x_3 + x_4 - 1 = 0$
[x* - (366, -168, -43, -155)].
(h) Maximize: $f(x_1,x_2,x_3,x_4) = -x_1^2 - 2x_2^2 - 3x_3^2 - x_4^2$
subject to: $x_1 + x_2 + x_3 + x_4 = 4$
$x_1 - x_2 + 2x_3 - x_4 = 5$
$3x_1 + 2x_2 - x_3 - 2x_4 = 3$
[X* = eS430, - \S34 , -fJ,W)],
(i) Maximize: $f(x_1,x_2,x_3,x_4) = -2x_1^2 - 3x_2^2 - x_3^2 - 3x_4^2$
subject to: $2x_1 - x_4 = 0$
$x_2 + x_3 + 1 = 0$
$x_2 + 2x_3 + x_4 + 6 = 0$
[X* = (- n, i~, - tL - It)].
(j) Maximize: $f(x_1,x_2,x_3,x_4) = 4x_1 - x_1^2 - x_2^2 - 2x_3^2 - 3x_4^2$
subject to: -4=0
$x_2 + 2x_3 + x_4 + 2 = 0$
$x_1 + x_3 - x_4 - 3 = 0$
[x* = es4 , -!, -!, -~)].
(k) Maximize: $f(x_1,x_2,x_3,x_4) = -3x_1^2 - x_2^2 - 9x_3^2 - 6x_4^2$
subject to: $x_1 + 3x_2 + x_3 + 3x_4 - 1 = 0$
$3x_2 + 4x_3 + 2x_4 - 2 = 0$
$x_1 + 6x_2 + 4x_3 + 3x_4 - 1 = 0$
[X* = (-t -1.1.1)],
(l) Maximize: $f(x_1,x_2,x_3,x_4) = -x_1^2 - x_2^2 - 3x_3^2 - 2x_4^2$
subject to: $2x_1 + 3x_2 + 4x_3 + x_4 = 5$
$3x_1 + 4x_2 + x_3 + 2x_4 = 3$
$x_1 + x_2 + x_3 + x_4 = 1$
[X* = (-!s, U,l, -184)]'
(m) Maximize: $f(x_1,x_2,x_3,x_4) = -3x_1^2 - 4x_2^2 - x_3^2 - 2x_4^2$
subject to: $x_1 + x_2 + x_3 + x_4 - 3 = 0$
$2x_1 + x_2 + 3x_3 + x_4 - 5 = 0$
$4x_1 + x_2 + x_3 + 3x_4 - 4 = 0$
[X* = (-0.34, 1.16, 1.17, 1.04)].
15. Solve the linear programming Problem 2.1 of Chapter 2 by the Jacobian method.
16. Solve the linear programming Problem 2.1 of Chapter 2 by the method of Lagrange.

17. Solve the problems in Exercise 14 by the method of Lagrange.


18. Suppose now that the right-hand side constants of each of the constraints in the
problems in Exercise 17 are increased by 0.01. Use the sensitivity coefficient of
Section 7.4.1.3 to calculate the increase in the value of the optimal solution to each
problem.
19. Prove Theorem 7.17.
20. Prove Theorem 7.20.
21. Prove Theorem 7.22
22. Develop the Kuhn-Tucker conditions for the following problem:
Minimize: I(X)
subject to: j = 1,2, ... , m.

23. Replace each of the "=" signs by "≤" signs in each of the problems in Exercise 17
and present the Kuhn-Tucker conditions for each of the problems.
24. Minimize:
feD

where D = {J: [0, 1] -+ R II has continuous derivatives,


is bounded, 1(0) = 0,/(1) = I}.
25. Find the curve joining points (0,0) and (4,4) whose arc length is 6, the area under
which is a maximum.

26. Prove that if

$$f(x) = f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} f_i(x_i),$$

where each $f_i$, $i = 1, 2, \ldots, n$, is concave (convex), then f is concave (convex).


Chapter 8

Nonlinear Programming

8.1 Introduction
This chapter is concerned with presenting algorithms for finding the optimal
points of a continuous function. As was pointed out in the previous chapter,
there exists a body of knowledge called classical optimization which provides
an underlying theory for the solution of such problems. We now use that
theory to develop methods which are designed to solve the large nonlinear
optimization problems which occur in real-world applications.
The general nonlinear programming problem (N.P.P.) is
Maximize: $f(X) = x_0$ (8.1)
subject to: $g_j(X) = 0, \quad j = 1, 2, \ldots, m$ (8.2)
$h_j(X) \le 0, \quad j = 1, 2, \ldots, k$ (8.3)
where $X = (x_1, x_2, \ldots, x_n)^T$ is an n-dimensional real vector, and f;
$g_j$, $j = 1, 2, \ldots, m$; $h_j$, $j = 1, 2, \ldots, k$, are real valued functions
defined on $R^n$.
Before dealing with the specific techniques we classify some of the special
cases of the N.P.P. If f is quadratic, the $g_j$'s are all linear and
$h_j(X) = -x_j$, $j = 1, 2, \ldots, k$, then the N.P.P. is said to be a quadratic
programming problem. In this case the problem can be expressed as follows:
Maximize: $x_0 = C^T X + X^T D X$
subject to: $AX = B$
$X \ge 0$.

If D is symmetric negative definite, $x_0$ is concave, which guarantees that an
optimum exists.


If there are no equality constraints and $x_0$ and the $h_j$'s are all convex
then the N.P.P. is said to be a convex programming problem, which can be
handled by Zoutendijk's method of feasible directions (see Section 8.3.1).
If $x_0$ can be expressed as
$$x_0 = f_1(x_1) + f_2(x_2) + \cdots + f_n(x_n),$$
where the $f_i$'s are all continuous functions of one variable, then the N.P.P.
is said to be a separable programming problem. Unconstrained problems with
this type of objective function can be attacked using pattern search (see
Section 8.2.4.1). For constrained problems in which each constraint function
is also separable, an approximate solution can be found by making a
linear approximation of each function (including $x_0$) and using linear
programming (see Section 8.3.4.2).
If Xo and the constraint functions are of the form:

where
j = 1,2, ... , p,
then the N.P.P. is said to be a geometric programming problem. Problems of
this type have been solved by a recently developed technique due to Duffin,
Petersen, and Zener (1967; see Section 8.3.6).
Of course many N.P.P.'s belong to more than one of the above groups.
Unfortunately, some N.P.P.'s belong to none. This chapter develops some
of the more popular techniques for various nonlinear problems. Before be-
ginning with the unconstrained case, we mention two simple but important
concepts, resolution and distinguishability.
It may often happen when using the methods outlined in this chapter
that the limit of precision to which numbers are calculated is exceeded. For
example, a computer with 6 decimal place precision will not distinguish
between the numbers 6.8913425 and 6.8913427, because the trailing digits
are chopped. This phenomenon may occur when an objective function f is
being evaluated, and in this case it would be said that the distinguishability
of f is $10^{-6}$. Formally:

Definition 8.1. The distinguishability of f is the minimum positive number
$\gamma$ such that for all $X_1, X_2$ in the domain of f, if
$|f(X_1) - f(X_2)| \ge \gamma$, then it can be concluded that $f(X_1)$ and
$f(X_2)$ are unequal.

Hence if f is of distinguishability $\gamma$ and

$$|f(X_1) - f(X_2)| < \gamma,$$

one cannot conclude that $f(X_1)$ and $f(X_2)$ are unequal.
In the application of any numerical optimization technique there will be
a practical limit on the accuracy with which one can deal with the values
of $x_1, x_2, \ldots, x_n$. This accuracy may be governed by the conditions of an
experiment or task which produces values of f. Readings on a gauge, the
availability of only certain units of quantity of a commodity (for example,
drugs with which to dose rats may be available only in 5 cc lots), or one's
eyesight in reading a slide rule are examples. Hence if one is working with
four-figure logarithm tables it may make little sense to attempt to consider a
value of $x_i$ of 4.0693. In this case we say the resolution of $x_i$ is 0.001.
Formally:

Definition 8.2. The resolution of a variable $x_i$ is the smallest positive number
$\varepsilon_i$ such that, for all pairs $x_i^1, x_i^2$ of values of $x_i$, if
$|x_i^1 - x_i^2| \ge \varepsilon_i$, then it can be concluded that $x_i^1$ and
$x_i^2$ are unequal.

8.2 Unconstrained Optimization


In this case there are no constraints of the form of (8.2) or (8.3) and one is
confronted solely with maximizing a real-valued function with domain Rn.
When such problems arise in practice first or second derivatives of the func-
tion are often difficult or impossible to compute and hence classical methods
are usually unsuitable. Whether derivatives are available or not, the usual
strategy is first to select a point in Rn which is thought to be the most likely
place where the maximum exists. If there is no information available on
which to base such a selection, a point is chosen at random. From this first
point an attempt is made to construct a sequence of points, each of which
yields an improved objective function value over its predecessor. The next
point to be added to the sequence is chosen by analyzing the behaviour of
the function at the previous points. This construction continues until some
termination criterion is met. Methods based upon this strategy are called
ascent methods.
Thus ascent methods are ways to construct a sequence $X_1, X_2, X_3, \ldots$
of n-dimensional real vectors, where $f(X_1) < f(X_2) < f(X_3) < \cdots$. In
generating a new point $X_{j+1}$ from the previous points $X_1, X_2, \ldots, X_j$,
it is usual to express $X_{j+1}$ as some function of $X_j$. Hence it must be
decided (i) in what direction $X_{j+1}$ lies from $X_j$ and (ii) how far (in terms
of the Euclidean metric) $X_{j+1}$ is from $X_j$. So $X_{j+1}$ can be expressed as
follows:

$$X_{j+1} = X_j + s_j D_j.$$

The vector $D_j$ is called the jth direction vector, and the magnitude $|s_j|$ of
the scalar $s_j$ is called the jth step size. Thus we find the new point $X_{j+1}$
by "moving" from $X_j$ a distance $s_j$ in the direction $D_j$.
There are a host of methods which arise from using the information
gained about the behaviour of f at the previous points $X_1, X_2, \ldots, X_j$ to
specify $D_j$ and $s_j$. Of course, in order to generate a new point
$X_j + s_j D_j$ which will satisfy

$$f(X_j + s_j D_j) > f(X_j), \tag{8.4}$$

it is usually necessary to consider only certain $s_j$, $D_j$ pairs. Indeed,
some methods consider only $D_j$'s for which (8.4) holds for a small value of
$s_j$, that is, f must yield improved values near $X_j$.
Ascent methods can be classified according to the information about the
behaviour of f that is required. Direct methods require only that the func-
tion be evaluated at each point. Gradient methods require the evaluation of
first derivatives of f. Hessian methods require the evaluation of second de-
rivatives. Although Hessian methods usually require the least number of
points to be generated in order to locate a local maximum (which is all that
any ascent method aims to produce), these methods are not always the most
efficient in terms of computational effort. In fact, there is no superior method
for all problems, the efficiency of a method being very much dependent
upon the function to be maximized.

8.2.1 Univariate Search

Many search methods for unconstrained problems require searches for the
maximal point of f in a specified direction. Suppose it is necessary to find
the maximal point of f along a direction $D_j$ from a point $X_j$. The feasible
points can be expressed as

$$X_j + s_j D_j, \quad s_j \in R.$$

(Negative values of $s_j$ represent the possibility that the maximal point may
lie in the $-D_j$ direction from $X_j$.) Thus the problem is to maximize a
function $\alpha$ of $s_j$, where

$$\alpha(s_j) = f(X_j + s_j D_j), \quad s_j \in R,$$

with $X_j$ and $D_j$ fixed. Because this type of problem has to be solved repeatedly
in many direct, gradient, and Hessian search methods, it is important that
these one-dimensional searches be performed efficiently.
One crude technique is to first somehow find an interval I of the line
$X_j + s_j D_j$ in which the maximum is known to lie. One then evaluates $\alpha$ at
equally spaced points along I. Then I is replaced by a smaller interval I'
which includes the best point found so far. The procedure is then repeated
with I' replacing I. It is not hard to construct simple examples for which this
technique performs rather poorly. It is usually better to make just one
function evaluation each time and to decide where to make the next on the basis
of the outcome. This approach is still inefficient unless it is assumed that
$\alpha$ belongs to a restrictive class of functions of one variable called unimodal
functions, which are described next.
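
The crude equal-spacing technique described above can be sketched as follows. This is only an illustration under stated assumptions: the name `grid_refine`, the number of grid points, and the test function $\alpha(s) = 6 + s - s^2$ are not taken from the text, and $\alpha$ is assumed to be unimodal on the starting interval so that the maximizing point lies between the neighbours of the best grid point.

```python
def grid_refine(alpha, a, b, points=5, rounds=10):
    """Shrink [a, b] around the maximizer of alpha by repeated equal-spaced sampling.

    Assumes alpha is unimodal on [a, b]; the names here are illustrative only.
    """
    for _ in range(rounds):
        step = (b - a) / (points - 1)
        grid = [a + k * step for k in range(points)]
        values = [alpha(s) for s in grid]
        best = values.index(max(values))      # first index of the best sampled value
        # The new interval I' runs between the neighbours of the best point.
        a = grid[max(best - 1, 0)]
        b = grid[min(best + 1, points - 1)]
    return a, b


# Hypothetical test function with its maximum at s = 0.5.
lo, hi = grid_refine(lambda s: 6 + s - s * s, -10.0, 10.0)
```

Every round costs `points` evaluations, most of which are discarded, which is one way to see why the adaptive procedures that follow are preferred.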
It will be assumed in this section that the global maximum of $\alpha$ is known
to lie in a closed interval and that within this interval the maximum occurs
at a unique point. Thus $\alpha$ must strictly increase in value as s (we shall drop
the subscript j) increases until the maximum is attained. Then $\alpha$ strictly
decreases as s assumes values greater than the maximizing point. A function
satisfying these properties is said to be unimodal. Hence, if $\alpha$ is unimodal and

$$s_0 < s_1 < s^* \quad \text{or} \quad s_0 > s_1 > s^*,$$

then

$$\alpha(s_0) < \alpha(s_1) < \alpha(s^*), \tag{8.5}$$

where $s^*$ is the point at which $\alpha$ attains its maximum.


If a function $\alpha$ is unimodal and its unique maximum is known to lie within
a closed interval [a, b], then when $\alpha$ is evaluated at any pair of points
$s_1, s_2$ where $s_1 > s_2$, such that either

upon comparison of $\alpha(s_1)$ and $\alpha(s_2)$, the interval in which the maximum
$s^*$ lies can be reduced in length from b - a. This is because one of three
events must occur: either
$$\alpha(s_1) > \alpha(s_2) \tag{8.6}$$
or
$$\alpha(s_1) < \alpha(s_2) \tag{8.7}$$
or
$$\alpha(s_1) = \alpha(s_2) \tag{8.8}$$
so that, by (8.5), we have
(8.6) $\Rightarrow s^* \in (s_2, b]$
(8.7) $\Rightarrow s^* \in [a, s_1)$ (8.9)
(8.8) $\Rightarrow s^* \in (s_2, s_1)$.
A one-dimensional search procedure is termed adaptive if it uses the
information gained about the behaviour of $\alpha$ at the previous points to decide
where to evaluate $\alpha$ next. There are many adaptive procedures available
which take advantage of (8.9).
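
The elimination rule (8.9) is small enough to state directly in code. The sketch below is illustrative only: the function name `eliminate` and the test function $\alpha(s) = 6 + s - s^2$ are assumptions, and half-open intervals are returned as plain pairs of end points.

```python
def eliminate(alpha, a, b, s1, s2):
    """Apply rule (8.9) to a unimodal alpha on [a, b], with a <= s2 < s1 <= b.

    Returns the end points of the interval that must still contain s*.
    """
    if not (a <= s2 < s1 <= b):
        raise ValueError("need a <= s2 < s1 <= b")
    v1, v2 = alpha(s1), alpha(s2)
    if v1 > v2:          # case (8.6): maximum lies to the right of s2
        return (s2, b)
    if v1 < v2:          # case (8.7): maximum lies to the left of s1
        return (a, s1)
    return (s2, s1)      # case (8.8): maximum lies between the two points


def alpha(s):
    # Hypothetical unimodal test function with maximum at s = 0.5.
    return 6 + s - s * s
```

For instance, with this test function on [-10, 10] the pair $s_2 = -1$, $s_1 = 0$ falls under (8.6), while $s_2 = -2$, $s_1 = 3$ gives equal values and falls under (8.8).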
The above concepts will be illustrated by some examples. Consider the
functions shown in Figure 8.1. It can readily be seen that IX is unimodal.
It can be seen that
s* = t.
Now,
i = So < Sl = t < s*,
so that
IX(SO) < IX(Sl) < IX(S*),
as can be seen from Figure 8.1. Also, if

Figure 8.1. A unimodal function.

then

as can be seen from Figure 8.1.


Now suppose that this diagram is unavailable and that information can
be gained about the location of s* only by evaluating rx at selected points.
The results of (8.9) will now be illustrated. Suppose So = i and S1 = tare
evaluated. Then, as
rx(so) < rx(S1)'
the interval [0, iJ can be eliminated as shown in Figure 8.2(a). The same
elimination could have occurred if any Si had been chosen instead of S1,
as long as

Suppose S1 = t and S3 = i are evaluated instead. Then, as


rx(S1) > rx(S3)'

the interval G, 1J can be eliminated, as shown in Figure 8.2(b). The same


elimination could have occurred if any Si had been chosen instead of S1, as
long as

Figure 8.2. Interval elimination, panels (a), (b), and (c).

Suppose S1 = * and S2 = i are evaluated instead. Then, as


a{s1) = a(s2),
the intervals [0, t] and [i,l] can be eliminated, as shown in Figure 8.2(c).
So far it has been assumed that the optimal point lies within a known
closed interval [a, b]. There are many ways by which this initial interval
can be found. One method is carried out as follows. Let the most likely
location of the optimal point be $a_1$. If no information is known about the
likely whereabouts of the optimum the point $a_1$ is chosen at random along
the line. Next a positive real number $\beta$ is chosen. The function is then
evaluated at $a_1$ and $a_1 + \beta$. Three cases must be examined.

CASE I. $\alpha(a_1) < \alpha(a_1 + \beta)$. $\alpha$ is evaluated at $a_1 + 2\beta$,
$a_1 + 4\beta, \ldots$, until a decrease occurs in the value of $\alpha$ at, say,
$a_1 + 2^n\beta$. Then set
$$[a, b] = [a_1 + 2^{n-2}\beta,\ a_1 + 2^n\beta].$$
CASE II. $\alpha(a_1) > \alpha(a_1 + \beta)$. $\alpha$ is evaluated at $a_1 - \beta$,
$a_1 - 2\beta$, $a_1 - 4\beta, \ldots$, until no increase occurs in the value of
$\alpha$ at, say, $a_1 - 2^m\beta$. Then set
$$[a, b] = [a_1 - 2^m\beta,\ a_1 - 2^{m-2}\beta].$$
CASE III. $\alpha(a_1) = \alpha(a_1 + \beta)$. Set
$$[a, b] = [a_1,\ a_1 + \beta].$$
Of course if two points a', b' are found such that
$$\alpha(a') = \alpha(b'),$$
then set
$$[a, b] = [a', b'].$$
For example, let
$$a_1 = 4, \quad \beta = 1.$$
Now if
$$\alpha(a_1) = \alpha(4) = -6$$
$$\alpha(a_1 + \beta) = \alpha(5) = -14,$$
then we have case II. Suppose, then, that
$$\alpha(a_1 - \beta) = \alpha(3) = 0$$
$$\alpha(a_1 - 2\beta) = \alpha(2) = 4$$
$$\alpha(a_1 - 4\beta) = \alpha(0) = 6$$
$$\alpha(a_1 - 8\beta) = \alpha(-4) = -14.$$
Then we have
$$[a, b] = [-4, 2].$$
This is shown in Figure 8.3.
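
A minimal sketch of this bracketing procedure is given below. It is not a definitive implementation: the name `bracket_maximum` and the step limit are assumptions, and the test function $\alpha(s) = 6 + s - s^2$ is simply one function that happens to reproduce the six values quoted in the example. The sketch returns the interval spanned by the last three evaluated points, which agrees with the formulas of Cases I and II except when the search stops at the very first trial point.

```python
def bracket_maximum(alpha, a1, beta, max_steps=60):
    """Find an interval containing the maximizer of a unimodal alpha.

    Starts from the guess a1 with a positive initial step beta and doubles
    the step until the function value stops improving.
    """
    f_a, f_b = alpha(a1), alpha(a1 + beta)
    if f_a == f_b:                                   # Case III
        return (a1, a1 + beta)
    if f_a < f_b:                                    # Case I: search to the right
        direction, offset = 1.0, 2.0 * beta
        older, f_newer = a1, f_b
        newer = a1 + beta
    else:                                            # Case II: search to the left
        direction, offset = -1.0, beta
        older, f_newer = a1 + beta, f_a
        newer = a1
    for _ in range(max_steps):
        trial = a1 + direction * offset
        f_trial = alpha(trial)
        if f_trial <= f_newer:                       # no further improvement
            return (min(older, trial), max(older, trial))
        older, newer, f_newer = newer, trial, f_trial
        offset *= 2.0
    raise RuntimeError("no bracket found within max_steps doublings")


# Reproduces the worked example: a1 = 4, beta = 1 gives [-4, 2].
interval = bracket_maximum(lambda s: 6 + s - s * s, 4.0, 1.0)
```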
One of the most efficient adaptive one-dimensional search procedures is
called Fibonacci search. It is described next.

8.2.1.1 Fibonacci Search


Fibonacci search depends upon the Fibonacci numbers $A_0, A_1, A_2, \ldots$,
defined as follows:
$$A_0 = 0$$
$$A_1 = 1$$
$$A_i = A_{i-1} + A_{i-2}, \quad i = 2, 3, 4, \ldots.$$
The procedure is used to reduce the interval of uncertainty of a unimodal
function $\alpha$. Suppose the initial interval is $[a_1, b_1]$. After a number of
iterations the interval is reduced to $[a_i, b_i]$. In order to make a further
reduction two points $s_i$ and $\bar{s}_i$ are generated by the following formula:
$$s_i = a_i + (b_i - a_i)\frac{A_{n-i}}{A_{n+2-i}}, \qquad
\bar{s}_i = a_i + (b_i - a_i)\frac{A_{n+1-i}}{A_{n+2-i}}, \qquad i = 1, 2, \ldots, n-1. \tag{8.10}$$
(Note that $s_i$ and $\bar{s}_i$ are placed symmetrically within $[a_i, b_i]$.) Here n
is the number of function evaluations which must be made in order to achieve
the desired interval reduction.
Now $\alpha$ is evaluated at $s_i$ and $\bar{s}_i$. If
$$\alpha(s_i) > \alpha(\bar{s}_i),$$
then the remaining interval $[a_{i+1}, b_{i+1}]$ is defined as $[a_i, \bar{s}_i]$. If
$$\alpha(s_i) < \alpha(\bar{s}_i),$$
then the remaining interval is defined as $[s_i, b_i]$. If
$$\alpha(s_i) = \alpha(\bar{s}_i),$$
then the remaining interval is defined as $[s_i, \bar{s}_i]$. In this case only, the
search is begun all over again, starting with this new interval $[s_i, \bar{s}_i]$,
and a new number n of evaluations must be calculated.

Figure 8.3. Bounding the interval of search.
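The last two points generated by the procedure as it stands would be
placed at
$$s_{n-1} = a_{n-1} + (b_{n-1} - a_{n-1})A_1/A_3 = \tfrac{1}{2}(b_{n-1} + a_{n-1})$$
$$\bar{s}_{n-1} = a_{n-1} + (b_{n-1} - a_{n-1})A_2/A_3 = \tfrac{1}{2}(b_{n-1} + a_{n-1}).$$
This means that both points would be placed at the same spot, which would
be of no advantage. Hence $s_{n-1}$ is placed in the position defined above
and $\bar{s}_{n-1}$ is placed as close as possible to the right of $s_{n-1}$ so as to
guarantee that the points $s_{n-1}$ and $\bar{s}_{n-1}$ are distinguishable. This minimum
distance of distinguishability is the resolution $\varepsilon$:
$$s_{n-1} = \tfrac{1}{2}(b_{n-1} + a_{n-1})$$
$$\bar{s}_{n-1} = \tfrac{1}{2}(b_{n-1} + a_{n-1}) + \varepsilon.$$
Then the interval $[a_{n-1}, b_{n-1}]$ is reduced as before.
(i) If $\alpha(s_{n-1}) > \alpha(\bar{s}_{n-1})$, set $[a_n, b_n] = [a_{n-1}, \bar{s}_{n-1}]$.
(ii) If $\alpha(s_{n-1}) < \alpha(\bar{s}_{n-1})$, set $[a_n, b_n] = [s_{n-1}, b_{n-1}]$.
(iii) If $\alpha(s_{n-1}) = \alpha(\bar{s}_{n-1})$, set $[a_n, b_n] = [s_{n-1}, \bar{s}_{n-1}]$.
The final interval will be of maximum length when (i) occurs. This maximum
length is
$$b_n - a_n = \bar{s}_{n-1} - a_{n-1}
= \tfrac{1}{2}(b_{n-1} + a_{n-1}) + \varepsilon - a_{n-1}
= \tfrac{1}{2}(b_{n-1} - a_{n-1}) + \varepsilon.$$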
Hence
$$b_{n-1} - a_{n-1} = 2(b_n - a_n - \varepsilon),$$
and therefore
$$b_n - a_n = \frac{b_1 - a_1}{3A_{n-2} + 2A_{n-3}} + \varepsilon,$$
by Theorem 8.1 (below). Thus we have
$$\frac{b_n - a_n}{b_1 - a_1} = \frac{\varepsilon}{b_1 - a_1} + \frac{1}{3A_{n-2} + 2A_{n-3}}.$$
In order to be certain of a reduction of at least the fraction r, that is,
$$r \ge \frac{b_n - a_n}{b_1 - a_1},$$
in the least number of function evaluations, n must be the minimum integer
satisfying
$$r \ge \frac{\varepsilon}{b_1 - a_1} + \frac{1}{3A_{n-2} + 2A_{n-3}}. \tag{8.11}$$
Thus the number n can be determined by (8.11). After the first iteration
only one point, either $s_i$ or $\bar{s}_i$, needs to be calculated, as the other is
already present. Kiefer (1957) has shown that for a given number of function
evaluations, Fibonacci search minimizes the maximum interval of uncertainty and
in that sense is optimal.
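
The choice of n from (8.11) is a simple search over the Fibonacci numbers, sketched below. The function name `evaluations_needed` and the cap `n_max` are assumptions made for this illustration; the inequality tested is (8.11), that is, the required fraction r must be at least $\varepsilon/(b_1 - a_1) + 1/(3A_{n-2} + 2A_{n-3})$.

```python
def evaluations_needed(r, a1, b1, eps, n_max=90):
    """Smallest n >= 3 satisfying inequality (8.11), or ValueError if none exists."""
    fib = [0, 1]                                   # A_0, A_1
    for n in range(3, n_max + 1):
        while len(fib) <= n:                       # extend the table as far as A_n
            fib.append(fib[-1] + fib[-2])
        bound = eps / (b1 - a1) + 1.0 / (3 * fib[n - 2] + 2 * fib[n - 3])
        if bound <= r:
            return n
    raise ValueError("requested reduction is not attainable at this resolution")


# Reducing [-10, 10] to 10% of its length with resolution 1/8.
n = evaluations_needed(0.1, -10.0, 10.0, 0.125)
```

Note that no n can satisfy (8.11) when r does not exceed $\varepsilon/(b_1 - a_1)$, which is why the sketch gives up after `n_max` terms.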
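We now prove Theorem 8.1, which was used in the derivation of (8.11).

Theorem 8.1.
$$b_{n-1} - a_{n-1} = \frac{2(b_1 - a_1)}{3A_{n-2} + 2A_{n-3}}.$$

PROOF. We present the proof of this theorem in outline only. Let
$$l_i = b_i - a_i, \quad i = 1, 2, \ldots, n-1.$$
In particular,
$$l_1 = b_1 - a_1,$$
the length of the initial interval. It can be shown that the lengths of successive
intervals are related by:
$$l_i = l_{i+1} + l_{i+2}, \quad i = 1, 2, \ldots, n-3.$$
Therefore
$$\begin{aligned}
l_1 &= l_2 + l_3 \\
&= (l_3 + l_4) + l_3 \\
&= 2l_3 + l_4 \\
&= 2(l_4 + l_5) + l_4 \\
&= 3l_4 + 2l_5 \\
&\;\;\vdots \\
&= A_{n-2}\,l_{n-2} + A_{n-3}\,l_{n-1} \\
&= A_{n-2}\left(\tfrac{3}{2}\,l_{n-1}\right) + A_{n-3}\,l_{n-1}.
\end{aligned}$$
Hence
$$l_{n-1} = \frac{2\,l_1}{3A_{n-2} + 2A_{n-3}}. \qquad \Box$$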
8.2 Unconstrained Optimization 321

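As an example of Fibonacci search, consider the reduction of the interval
[-10, 10] to at most 10% of its present length. Let
$$\varepsilon = \tfrac{1}{8}.$$
Then the number of evaluations can be found by (8.11):
$$\frac{1}{10} \ge \frac{1/8}{10 - (-10)} + \frac{1}{3A_{n-2} + 2A_{n-3}}.$$
Hence
$$\frac{3}{32} \ge \frac{1}{3A_{n-2} + 2A_{n-3}}, \quad \text{for minimum } n.$$
Therefore
$$n = 6.$$
Thus 6 evaluations will be necessary. The first two points are placed at
$$s_1 = -10 + (10 - (-10))\tfrac{5}{13} = -\tfrac{30}{13}$$
and
$$\bar{s}_1 = -10 + (10 - (-10))\tfrac{8}{13} = \tfrac{30}{13}.$$
Suppose
$$\alpha(s_1) < \alpha(\bar{s}_1).$$
Then
$$[a_2, b_2] = [-\tfrac{30}{13}, 10]$$
$$s_2 = \bar{s}_1$$
$$\bar{s}_2 = -\tfrac{30}{13} + (10 - (-\tfrac{30}{13}))\tfrac{5}{8} = \tfrac{70}{13}.$$
Suppose
$$\alpha(s_2) > \alpha(\bar{s}_2).$$
Then
$$[a_3, b_3] = [-\tfrac{30}{13}, \tfrac{70}{13}]$$
and therefore
$$s_3 = -\tfrac{30}{13} + (\tfrac{70}{13} - (-\tfrac{30}{13}))\tfrac{2}{5} = \tfrac{10}{13}$$
$$\bar{s}_3 = s_2.$$
Suppose
$$\alpha(s_3) > \alpha(\bar{s}_3).$$
Then
$$[a_4, b_4] = [-\tfrac{30}{13}, \tfrac{30}{13}].$$
Hence,
$$s_4 = -\tfrac{30}{13} + (\tfrac{30}{13} - (-\tfrac{30}{13}))\tfrac{1}{3}.$$
Therefore
$$s_4 = -\tfrac{10}{13}$$
and
$$\bar{s}_4 = s_3.$$
Suppose
$$\alpha(s_4) < \alpha(\bar{s}_4).$$
Then
$$[a_5, b_5] = [-\tfrac{10}{13}, \tfrac{30}{13}].$$
It can be seen that the point remaining in the interval $[a_5, b_5]$ is at the
centre of $[a_5, b_5]$. Also
$$s_5 = \tfrac{1}{2}(a_5 + b_5) = \tfrac{10}{13}.$$
Thus in order to place $\bar{s}_5$ symmetrically it must coincide with $s_5$, which is
of no advantage. Hence $\bar{s}_5$ is placed $\varepsilon$ to the right of $s_5$:
$$\bar{s}_5 = \tfrac{1}{2}(\tfrac{30}{13} - \tfrac{10}{13}) + \varepsilon = \tfrac{93}{104}.$$
Suppose
$$\alpha(s_5) > \alpha(\bar{s}_5).$$
The final interval is:
$$[a_6, b_6] = [-\tfrac{10}{13}, \tfrac{93}{104}],$$
which is only 8.3% as long as the original interval.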
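
The procedure can be sketched in a few lines. The code below is an illustration under stated assumptions rather than a definitive implementation: the name `fibonacci_search` is invented here, both trial points are re-evaluated at every iteration for clarity (a careful version would reuse the point already present), and in the case of equal function values the sketch simply returns $[s_i, \bar{s}_i]$ instead of restarting with a new n. The test function $\alpha(s) = 6 + s - s^2$ is an assumed example whose maximum lies at $s = 0.5$.

```python
def fibonacci_search(alpha, a, b, n, eps):
    """Reduce [a, b] for a unimodal alpha using the point placement (8.10).

    n is the planned number of evaluations and eps the resolution used to
    separate the final pair of points.
    """
    fib = [0, 1]
    while len(fib) < n + 2:                        # table A_0 .. A_{n+1}
        fib.append(fib[-1] + fib[-2])
    for i in range(1, n):
        if i < n - 1:
            s = a + (b - a) * fib[n - i] / fib[n + 2 - i]
            s_bar = a + (b - a) * fib[n + 1 - i] / fib[n + 2 - i]
        else:                                      # last pair: centre and centre + eps
            s = 0.5 * (a + b)
            s_bar = s + eps
        f_s, f_bar = alpha(s), alpha(s_bar)
        if f_s > f_bar:
            b = s_bar                              # keep [a_i, s_bar_i]
        elif f_s < f_bar:
            a = s                                  # keep [s_i, b_i]
        else:
            return (s, s_bar)                      # simplification: no restart
    return (a, b)


a6, b6 = fibonacci_search(lambda s: 6 + s - s * s, -10.0, 10.0, 6, 0.125)
```

With n = 6 and $\varepsilon = 1/8$ on [-10, 10], this test function leads to a final interval of length $20/13 + 1/8$, about 8.3% of the original length.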

8.2.1.2 Golden Section Search


The Fibonacci search technique, although most efficient, requires that one
know in advance how many points are going to be evaluated. Golden
section search, although not quite as efficient, does not make such a
requirement. Recall that we must know n, the number of evaluations, in order to
calculate the ratios
$$\frac{A_{n-i}}{A_{n+2-i}} \quad \text{and} \quad \frac{A_{n+1-i}}{A_{n+2-i}}$$
in (8.10) in order to find $s_i$ and $\bar{s}_i$ at each iteration. Golden section search
overcomes this problem by using an approximation of these ratios based on
$$\lim_{n \to \infty} \frac{A_{n-1}}{A_{n+1}} = \frac{3 - \sqrt{5}}{2}$$
$$\lim_{n \to \infty} \frac{A_n}{A_{n+1}} = \frac{\sqrt{5} - 1}{2} = 1 - \frac{3 - \sqrt{5}}{2}.$$
Using these results at each step, (8.10) becomes
$$s_i = a_i + (b_i - a_i)\frac{3 - \sqrt{5}}{2}, \quad i = 0, 1, 2, \ldots$$
$$\bar{s}_i = a_i + (b_i - a_i)\frac{\sqrt{5} - 1}{2}, \quad i = 0, 1, 2, \ldots.$$
With this strategy it can be shown that the ratio of the lengths of successive
intervals found is a constant and
$$\frac{b_i - a_i}{b_{i+1} - a_{i+1}} = \frac{b_{i-1} - a_{i-1}}{b_i - a_i} = \frac{1 + \sqrt{5}}{2} = \left(\frac{\sqrt{5} - 1}{2}\right)^{-1}.$$
The method proceeds as in the previous section. Two initial evaluation
points $s_i$ and $\bar{s}_i$ are found, then at each successive step there will be one point
present in the remaining interval, and the new point is placed symmetrically
with respect to it. The procedure is therefore very similar to that of Fibonacci
search, except that the initial points $s_1$ and $\bar{s}_1$ would most likely differ.
Hence all the remaining points $s_i$ and $\bar{s}_i$ are likely to differ in the two
procedures for the same problem. Also, golden section does not have an automatic
stopping point as does Fibonacci search. The search proceeds until some
termination criterion is met: the interval is sufficiently reduced, or the next
point is to be placed within the resolution distance of the last.
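The performance of golden section search on the problem of Section
8.2.1.1 will be compared with that of Fibonacci search. The problem is now
solved by golden section search. The first two points are placed at
$$s_0 = -10 + (10 - (-10))(2 - r)$$
$$\bar{s}_0 = -10 + (10 - (-10))(r - 1),$$
where
$$r = \frac{1 + \sqrt{5}}{2}$$
is the ratio of the golden section of Greek geometry (hence the name of the
method). Hence
$$s_0 = 10(2 - \sqrt{5})$$
$$\bar{s}_0 = 10(\sqrt{5} - 2).$$
Now if
$$\alpha(s_0) < \alpha(\bar{s}_0),$$
then the new interval becomes
$$[a_1, b_1] = [10(2 - \sqrt{5}), 10]$$
and
$$s_1 = \bar{s}_0$$
$$\bar{s}_1 = 10(2 - \sqrt{5}) + [10 - 10(2 - \sqrt{5})](r - 1) = 50 - 20\sqrt{5}.$$
Now if
$$\alpha(s_1) > \alpha(\bar{s}_1),$$
then the new interval becomes
$$[a_2, b_2] = [10(2 - \sqrt{5}), 10(5 - 2\sqrt{5})]$$
and
$$s_2 = 10(2 - \sqrt{5}) + (10(5 - 2\sqrt{5}) - 10(2 - \sqrt{5}))(2 - r) = 10(9 - 4\sqrt{5})$$
$$\bar{s}_2 = s_1.$$
Now if
$$\alpha(s_2) > \alpha(\bar{s}_2),$$
then the new interval becomes
$$[a_3, b_3] = [10(2 - \sqrt{5}), 10(\sqrt{5} - 2)]$$
and
$$s_3 = 10(2 - \sqrt{5}) + (10(\sqrt{5} - 2) - 10(2 - \sqrt{5}))(2 - r) = 10(4\sqrt{5} - 9)$$
$$\bar{s}_3 = s_2.$$
Now if
$$\alpha(s_3) < \alpha(\bar{s}_3),$$
then the new interval becomes
$$[a_4, b_4] = [10(4\sqrt{5} - 9), 10(\sqrt{5} - 2)]$$
and
$$s_4 = \bar{s}_3$$
$$\bar{s}_4 = 10(4\sqrt{5} - 9) + (10(\sqrt{5} - 2) - 10(4\sqrt{5} - 9))(r - 1) = 10(9\sqrt{5} - 20).$$
Now if
$$\alpha(s_4) > \alpha(\bar{s}_4),$$
the final interval is
$$[a_5, b_5] = [10(4\sqrt{5} - 9), 10(9\sqrt{5} - 20)],$$
which has length $10(5\sqrt{5} - 11)$. This interval is a little over 9% of the
original interval in length. This comparison is typical, and in general golden
section search is not quite as efficient as Fibonacci search.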
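
A compact sketch of golden section search follows. The name `golden_section_search`, the argument `reductions`, and the test function $\alpha(s) = 6 + s - s^2$ are assumptions made for illustration. Function values are computed lazily so that, after the first two, only one new evaluation is made per reduction; equal function values are handled by keeping $[s_i, b_i]$, which still contains the maximum.

```python
import math

TAU = (math.sqrt(5.0) - 1.0) / 2.0      # limiting ratio A_n / A_{n+1}, about 0.618


def golden_section_search(alpha, a, b, reductions):
    """Shrink [a, b] around the maximizer of a unimodal alpha."""
    s = a + (b - a) * (1.0 - TAU)        # (3 - sqrt 5)/2 of the way along
    s_bar = a + (b - a) * TAU
    f_s = f_bar = None                   # evaluated only when needed
    for _ in range(reductions):
        if f_s is None:
            f_s = alpha(s)
        if f_bar is None:
            f_bar = alpha(s_bar)
        if f_s > f_bar:
            b = s_bar                    # keep [a, s_bar]; old s becomes the right point
            s_bar, f_bar = s, f_s
            s, f_s = a + (b - a) * (1.0 - TAU), None
        else:
            a = s                        # keep [s, b]; old s_bar becomes the left point
            s, f_s = s_bar, f_bar
            s_bar, f_bar = a + (b - a) * TAU, None
    return (a, b)


a5, b5 = golden_section_search(lambda s: 6 + s - s * s, -10.0, 10.0, 5)
```

Five reductions use six evaluations, and for this test function the result agrees with the interval $[10(4\sqrt{5} - 9),\ 10(9\sqrt{5} - 20)]$ of the worked example.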

8.2.1.3 The Method of Bolzano


If first derivatives of the objective function are available, then the Bolzano
technique for finding the root of a decreasing function in numerical analysis
can be profitably modified. In using Bolzano's method (also called the
method of successive bisection) one successively evaluates the function in
the middle of the current interval of uncertainty. The right-hand or left-hand
half of the interval is eliminated depending upon whether the derivative is
negative or positive, respectively.
In attempting to find the maximum of the objective function $\alpha$ one is
trying to find the unique root of the first derivative of $\alpha$. The root is unique
because $\alpha$ is assumed unimodal. So the Bolzano technique can be applied
to $\alpha'$ in order to find the maximum of $\alpha$. The modified technique will now
be described in precise terms. Assume that the maximum is bounded by an
initial interval $[a_0, b_0]$. Then
$$\alpha'\left(\frac{a_0 + b_0}{2}\right) > 0 \Rightarrow [a_1, b_1] = \left[\frac{a_0 + b_0}{2}, b_0\right]$$
and
$$\alpha'\left(\frac{a_0 + b_0}{2}\right) < 0 \Rightarrow [a_1, b_1] = \left[a_0, \frac{a_0 + b_0}{2}\right].$$
The general step is
$$\alpha'\left(\frac{a_i + b_i}{2}\right) > 0 \Rightarrow [a_{i+1}, b_{i+1}] = \left[\frac{a_i + b_i}{2}, b_i\right]$$
and
$$\alpha'\left(\frac{a_i + b_i}{2}\right) < 0 \Rightarrow [a_{i+1}, b_{i+1}] = \left[a_i, \frac{a_i + b_i}{2}\right].$$
Of course, if
$$\alpha'\left(\frac{a_i + b_i}{2}\right) = 0,$$
the maximum has been found.

It can readily be seen that at each step the remaining interval is halved.
Thus after n steps,
$$\frac{b_n - a_n}{b_0 - a_0} = \frac{1}{2^n}.$$
Thus the number of derivative evaluations required to achieve a specified
reduction ratio is the minimum integer n satisfying:
$$\frac{b_n - a_n}{b_0 - a_0} \ge \frac{1}{2^n}.$$
Bolzano's method will also be tried on the example of Section 8.2.1.1.
Recall that
$$[a_0, b_0] = [-10, 10].$$
Let
$$s_i = \frac{a_i + b_i}{2}.$$
If $\alpha'(s_0) > 0$, then $[a_1, b_1] = [0, 10]$.
If $\alpha'(s_1) < 0$, then $[a_2, b_2] = [0, 5]$.
If $\alpha'(s_2) < 0$, then $[a_3, b_3] = [0, 2.5]$.
If $\alpha'(s_3) < 0$, then $[a_4, b_4] = [0, 1.25]$.
If $\alpha'(s_4) < 0$, then $[a_5, b_5] = [0, 0.625]$.
If $\alpha'(s_5) > 0$, then $[a_6, b_6] = [0.3125, 0.625]$.
Hence after only six iterations the interval has been reduced to one of
length 0.3125, or just 1.56% of the original length. This rapid decrease
compared with the previous two procedures comes at the cost of calculating
derivatives, which may be no easy task, if not impossible.
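
The bisection scheme is sketched below. The name `bolzano_search` is an assumption, as is the test derivative $\alpha'(s) = 1 - 2s$, which corresponds to the illustrative function $\alpha(s) = 6 + s - s^2$ and produces the same sequence of signs as the example above.

```python
def bolzano_search(d_alpha, a, b, steps):
    """Halve [a, b] `steps` times using the sign of the derivative at the midpoint.

    d_alpha is the first derivative of a unimodal function whose maximum
    lies in [a, b].
    """
    for _ in range(steps):
        mid = 0.5 * (a + b)
        slope = d_alpha(mid)
        if slope > 0:          # still climbing: maximum is to the right
            a = mid
        elif slope < 0:        # already descending: maximum is to the left
            b = mid
        else:                  # stationary point found exactly
            return (mid, mid)
    return (a, b)


a6, b6 = bolzano_search(lambda s: 1.0 - 2.0 * s, -10.0, 10.0, 6)
```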

8.2.1.4 Even Block Search


A simplified version of the general even block search method will be pre-
sented in this section. When derivatives are unavailable, it is still possible
to simulate the Bolzano method in the following way. The sign of the deriva-
tive of a function can be approximated at a point by making two distinct
evaluations, each as close to the point as the resolution $\varepsilon$ will allow. The
points about which these evaluations are made are the same as those that
would be used in the normal Bolzano method.
Suppose that the first derivative of $\alpha$ is unavailable. Let $[a_0, b_0]$ be the
initial interval, bracketing the maximum. Thus the first evaluation would
have been of $\alpha'$ at $(a_0 + b_0)/2$. Instead, we approximate the sign of $\alpha'(s_0)$,
denoted by $\sigma(\alpha'(s_0))$, by the sign of the difference
$$\alpha(s_0 + \varepsilon) - \alpha(s_0).$$
In general, we have
$$\alpha(s_i + \varepsilon) - \alpha(s_i) > 0 \;\Rightarrow\; [a_{i+1}, b_{i+1}] = [s_i, b_i],$$
$$\alpha(s_i + \varepsilon) - \alpha(s_i) < 0 \;\Rightarrow\; [a_{i+1}, b_{i+1}] = [a_i, s_i + \varepsilon],$$
$$\alpha(s_i + \varepsilon) - \alpha(s_i) = 0 \;\Rightarrow\; [a_{i+1}, b_{i+1}] = [s_i, s_i + \varepsilon].$$
In the last case the procedure must be terminated, as no further observations
can be made in the remaining interval.
Neglecting resolution, this simple even block method will require twice
as many evaluations as Bolzano's method. However, because it usually takes
far less effort to evaluate a function than to calculate and evaluate its deriva-
tive, even block search is often more efficient.
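A minimal sketch of this derivative-free variant follows. The stopping rule (halt once the interval is no longer than four resolutions) and the test function are assumptions of the sketch; some such rule is needed because the second case above shrinks the interval to half its length plus $\varepsilon$, which never falls below $2\varepsilon$.

```python
def even_block_search(alpha, a, b, eps):
    """Shrink [a, b] around the maximum of a unimodal alpha without derivatives."""
    while b - a > 4.0 * eps:               # assumed stopping rule, see lead-in
        s = (a + b) / 2.0
        diff = alpha(s + eps) - alpha(s)   # stands in for the sign of alpha'(s)
        if diff > 0:
            a = s                          # [a_{i+1}, b_{i+1}] = [s_i, b_i]
        elif diff < 0:
            b = s + eps                    # [a_{i+1}, b_{i+1}] = [a_i, s_i + eps]
        else:
            return s, s + eps              # no further observations can be placed
    return a, b


# Hypothetical unimodal test function with its maximum at x = 0.4.
def alpha(x):
    return -(x - 0.4) ** 2


lo, hi = even_block_search(alpha, -10.0, 10.0, 0.01)
print(lo, hi)
```

Each pass uses two function evaluations where Bolzano's method uses one derivative evaluation, which is the factor of two mentioned above.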

8.2.2 Hessian Methods

Recall from the initial remarks of Section 8.2 that ascent methods generate
a new point $X_{i+1}$ by a calculation of the form:
$$X_{i+1} = X_i + s_i D_i.$$
In the case of gradient methods and Hessian methods this equation has the
special form:
$$X_{i+1} = X_i + s_i B_i \nabla f(X_i).$$
The matrix $B_i$ may be a constant matrix or may vary according to previous
calculations. In the description of the methods of this section, $B_i$ is a function
of the Hessian matrix $H(X_i)$.

8.2.2.1 The Method of Newton and Raphson


The following is a "classical" method and should be related to Section 7.3.3.
In attempting to optimize an n-dimensional function we are attempting to
find a root of
$$\nabla f(X) = 0. \qquad (8.12)$$
In what follows it is necessary to assume that the Hessian matrix evaluated
at each point $X_i$ is nonsingular, i.e., $H^{-1}(X_i)$ exists. Now suppose we have
found an estimate $X_{i+1}$; the Taylor series of $\nabla f(X_{i+1})$ is expanded about
$X_i$, to first order, as follows:
$$\nabla f(X_{i+1}) = \nabla f(X_i) + H(X_i)(X_{i+1} - X_i).$$
Now if $X_{i+1}$ is an estimate of a root of (8.12) it is hoped that
$$\nabla f(X_{i+1}) \approx 0.$$
Hence
$$\nabla f(X_i) + H(X_i)(X_{i+1} - X_i) = 0,$$
and we have found an iterative method for generating $X_{i+1}, X_{i+2}, \ldots,$
namely
$$X_{i+1} = X_i - H^{-1}(X_i)\nabla f(X_i), \qquad i = 1, 2, \ldots.$$
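The iteration can be sketched for a function of two variables as follows. The test function, its derivatives, the starting point and the stopping tolerance are all assumptions made for illustration; the 2 x 2 linear system is solved by Cramer's rule rather than by forming $H^{-1}$ explicitly.

```python
def newton_raphson(grad, hess, x, tol=1e-10, max_iter=50):
    """Iterate X_{i+1} = X_i - H^{-1}(X_i) grad f(X_i) in two variables."""
    for _ in range(max_iter):
        g1, g2 = grad(x)
        if abs(g1) < tol and abs(g2) < tol:     # gradient is (numerically) zero
            break
        (h11, h12), (h21, h22) = hess(x)
        det = h11 * h22 - h12 * h21             # nonzero because H is nonsingular
        step1 = (g1 * h22 - h12 * g2) / det     # solve H * step = grad (Cramer)
        step2 = (h11 * g2 - h21 * g1) / det
        x = (x[0] - step1, x[1] - step2)
    return x


# Hypothetical concave test function, with u = x - 1 and v = y + 2:
#   f = -u**4 - u**2 - v**2 + 0.5*u*v, maximized at (x, y) = (1, -2).
def grad_f(p):
    u, v = p[0] - 1.0, p[1] + 2.0
    return (-4.0 * u ** 3 - 2.0 * u + 0.5 * v, -2.0 * v + 0.5 * u)


def hess_f(p):
    u = p[0] - 1.0
    return ((-12.0 * u ** 2 - 2.0, 0.5), (0.5, -2.0))


solution = newton_raphson(grad_f, hess_f, (3.0, 1.0))
print(solution)
```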

8.2.2.2 Variable Metric Method


The variable metric method does not require that the Hessian matrix of
the function be calculated and inverted, as does the method of Newton and
Raphson. Instead, the inverse of the Hessian matrix is estimated more and
more accurately until the optimum is found. This means that the method is
often the most efficient currently available when the gradient is available
and when the Hessian matrix is not available, is expensive to calculate, or
must be found by numerical methods. Apart from the initial step, the one-
dimensional searches performed in pursuit of the optimum are not usually
in the direction of the gradient. They are carried out in a direction $E_i \nabla f(X_i)$,
where $X_i$ is the current estimate of the optimum and $E_i$ is a negative definite
matrix. Thus the direction from each point $X_i$ is "deflected" away from the
gradient by the matrix $E_i$.
The method will be outlined with a view to maximizing the following
quadratic f, in which H is assumed negative definite:
$$f(X) = C^T X + \tfrac{1}{2} X^T H X. \qquad (8.13)$$

For a complete description of how the method maximizes a general function
see Davidon (1959), or Fletcher and Powell (1963). Suppose it is desired to
find the optimum $X^*$ to (8.13) from a present estimate $X_1$. Note that
$$\nabla f(X) = C + HX,$$
hence
$$\nabla f(X_1) = C + HX_1.$$
Therefore
$$X_1 = H^{-1}(\nabla f(X_1) - C)$$
and
$$X^* = H^{-1}(\nabla f(X^*) - C).$$
But
$$\nabla f(X^*) = 0.$$
Hence
$$X^* = -H^{-1}C.$$
Thus
$$X^* = X_1 - H^{-1}\nabla f(X_1). \qquad (8.14)$$
Equation (8.14) shows why it is worthwhile to search along a direction which
is different from the gradient direction. Thus when H is known, the optimum
to (8.13) can be found in one step by using (8.14). Problems arise when, for
one reason or another, $H^{-1}$ is not readily at hand.
The method proceeds by calculating $X_{i+1}$ from $X_i$ by using the relation:
$$X_{i+1} = X_i - s_i E_i \nabla f(X_i),$$
where $E_i$ is a negative definite matrix and $s_i$ is the step size taken in the
$E_i \nabla f(X_i)$ direction. If f contains n variables, then
(8.15)
The method generates the estimates of $X^*$ ($X_2, X_3, \ldots, X_{n+1}$) in such a
way that
i = 1,2, ... , n. (8.16)
Now, as
i = 1,2, ... , n

are constructed to be linearly independent, from (8.16) it must be that


$$\nabla f(X_{n+1}) = 0.$$
Hence the optimum is found after n iterations if f is quadratic.
The method begins by setting
$$E_1 = I,$$
where I is the identity matrix (for simplicity), unless the analyst has further
information and can choose $E_1$ such that $E_1 \nabla f(X_1)$ is a more promising
direction than $\nabla f(X_1)$. This first step just turns out to be a basic gradient
search, which will be explained in Section 8.2.3.

In general $E_i$ is computed by the relation:
$$E_i = E_{i-1} + F_i + G_i, \qquad (8.17)$$
where the matrices $F_i$ and $G_i$ are chosen so that
$$\sum_{i=1}^{n} F_i = H^{-1}$$
and
$$\sum_{i=1}^{n} G_i = -E_1.$$
Thus usually
$$\sum_{i=1}^{n} G_i = -I.$$

Now at any iteration of the method $F_i$ and $G_i$ must be found from previous
information, namely
$$\nabla f(X_i), \nabla f(X_{i-1}), \ldots, \quad X_i, X_{i-1}, \ldots, \quad E_{i-1}, E_{i-2}, \ldots$$

One possible choice for $F_i$ and $G_i$ is
$$F_i = \frac{(X_i - X_{i-1})(X_i - X_{i-1})^T}{(X_i - X_{i-1})^T(\nabla f(X_i) - \nabla f(X_{i-1}))}$$
and
$$G_i = -\frac{E_{i-1}(\nabla f(X_i) - \nabla f(X_{i-1}))(\nabla f(X_i) - \nabla f(X_{i-1}))^T E_{i-1}}{(\nabla f(X_i) - \nabla f(X_{i-1}))^T E_{i-1}(\nabla f(X_i) - \nabla f(X_{i-1}))}.$$

This is the well-known D.F.P. formula (Davidon (1959), Fletcher and
Powell (1963)). However, recent numerical evidence supports the comple-
mentary D.F.P. formula, labelled B.F.G.S. For this see Broyden (1970),
Fletcher (1970), Goldfarb (1970), and Shanno (1970).
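The D.F.P. update can be tried on a small quadratic as follows. This is a sketch under stated assumptions: the matrix H and vector C are made-up test data, the estimate starts from $E_1 = -I$ (so that $E_1$ is itself negative definite, a choice made for this sketch), and the step size $s_i$ is obtained from the closed-form line maximum of a quadratic, which uses H; in practice a one-dimensional search of Section 8.2.1 would take its place.

```python
def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def dfp_maximize_quadratic(c, h, x, n_iter):
    """Maximize f(X) = C^T X + (1/2) X^T H X (H negative definite) with D.F.P. updates."""
    n = len(x)
    # Assumption of this sketch: E_1 = -I, a negative definite starting estimate.
    e = [[-1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    g = [ci + hi for ci, hi in zip(c, matvec(h, x))]      # grad f(X) = C + H X
    for _ in range(n_iter):
        d = [-di for di in matvec(e, g)]                  # direction -E_i grad f(X_i)
        dhd = dot(d, matvec(h, d))
        if dhd == 0.0:                                    # zero direction: gradient vanished
            break
        s = -dot(g, d) / dhd                              # exact line maximum (quadratic only)
        x_new = [xi + s * di for xi, di in zip(x, d)]
        g_new = [ci + hi for ci, hi in zip(c, matvec(h, x_new))]
        dx = [p - q for p, q in zip(x_new, x)]
        dg = [p - q for p, q in zip(g_new, g)]
        e_dg = matvec(e, dg)
        dx_dg, dg_e_dg = dot(dx, dg), dot(dg, e_dg)
        x, g = x_new, g_new
        if dx_dg == 0.0 or dg_e_dg == 0.0:
            break
        # E <- E + F + G with the D.F.P. choices of F and G
        e = [[e[i][j] + dx[i] * dx[j] / dx_dg - e_dg[i] * e_dg[j] / dg_e_dg
              for j in range(n)] for i in range(n)]
    return x, e


# Hypothetical test data: the maximum is at (1, 1) and
# H^{-1} = [[-0.6, -0.2], [-0.2, -0.4]].
C = [1.0, 2.0]
H = [[-2.0, 1.0], [1.0, -3.0]]
x_star, e_final = dfp_maximize_quadratic(C, H, [0.0, 0.0], 2)
print(x_star, e_final)
```

After n = 2 iterations the estimate reaches the maximum and the matrix estimate agrees with $H^{-1}$, as the discussion of the quadratic case suggests.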

8.2.3 Gradient Methods

Recall that
$$\nabla f = \left(\frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n}\right)^T,$$

a vector of first partial derivatives. Because this vector points in the direction
of greatest slope of the function at any point, it is called the gradient. (For a
proof of this fact see Theorem 9.7 in the Appendix.)
Gradient methods for seeking a maximum for f involve evaluating the
gradient at an initial point, moving along the gradient direction for a
calculable distance, and repeating this process until the maximum is found.
One of the problems of gradient methods is that they require $\nabla f$ and
hence the first partial derivatives to be calculated. In many problems the
mathematical form of f is unknown and hence it is impossible to find
derivatives. In such cases gradients may be approximated by numerical
procedures. This introduces errors, which make the methods less attractive.
Throughout this section it will be assumed that the necessary gradients are
available either by direct computation or by approximation.
In the basic gradient method first an initial point $X_1$ is selected. The
gradient $\nabla f(X_1)$ of f at $X_1$ is computed. A line is then drawn through $X_1$ in
the gradient direction $\nabla f(X_1)$. The point $X_2$ on this line is then selected which
yields the greatest value for f of all points on the line. Suppose the distance
from $X_1$ to $X_2$ is $s_1$. Then
$$X_2 = X_1 + \nabla f(X_1)\, s_1 \qquad (8.18)$$
and
$$f(X_2) = \max_{s_1 \in R} \{f(X) : X = X_1 + \nabla f(X_1)\, s_1\} = \max_{s_1 \in R} \{f(X_1 + D_1 s_1)\},$$
where
$$D_1 = \nabla f(X_1).$$
The obvious task is to decide where along the line the best point for f lies.
That is, $s_1$ must be found. There are two ways of going about this. Whenever
derivatives can be evaluated and f is well behaved, the best method is to
substitute (8.18) into the equation for f and differentiate with respect to $s_1$.
One can solve for the maximizing value of $s_1$ by setting the derivative
equal to zero. The second method is to use one of the one-dimensional
search methods explained in Section 8.2.1.
Having found $s_1$ and thus $X_2$, the procedure is repeated with $X_2$ replacing
$X_1$. The process continues until no improvement can be made. There are a
number of variations on this process. One such variation, which requires
considerably less effort, is to use a fixed step size $s_i$ at each step. This has the
disadvantage of there being no way to predict a satisfactory step size for a
given f. A relatively small step size will usually produce an improvement at
each step but require a large number of steps. A relatively large step size may
sometimes produce a decrease in objective function value from one step to
another.
Gradient methods were first introduced by Cauchy (1847) and were later
used by Box and Wilson (1951) on problems in industrial statistics.
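The trade-off in choosing a fixed step size can be seen in a small experiment. The function $f(x, y) = -(x^2 + 4y^2)$, the starting point and the two step sizes below are assumptions chosen for illustration: the smaller step improves f at every step, while the larger one lowers f on its very first step.

```python
def f(p):
    x, y = p
    return -(x * x + 4.0 * y * y)      # hypothetical objective, maximum at (0, 0)


def grad_f(p):
    x, y = p
    return (-2.0 * x, -8.0 * y)


def fixed_step_ascent(point, step, n_steps):
    """Move along the gradient with the same step size at every iteration."""
    values = [f(point)]
    for _ in range(n_steps):
        g = grad_f(point)
        point = (point[0] + step * g[0], point[1] + step * g[1])
        values.append(f(point))
    return point, values


small_point, small_values = fixed_step_ascent((1.0, 1.0), 0.05, 50)
large_point, large_values = fixed_step_ascent((1.0, 1.0), 0.3, 1)
print(small_values[-1])    # close to the maximum value 0 after 50 small steps
print(large_values)        # the single large step decreases f
```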

8.2.3.1 Gradient Partan


A specialized version of the gradient method will now be presented. It might
have occurred to the reader that the maximization of a function of two vari-
ables has some aspects in common with climbing a mountain, the maximum
being the peak. As everyone who has looked at a mountain knows, most
mountains have ridges. Quite often these ridges lead to the summit. This
geological fact can sometimes be used to advantage in unconstrained
optimization.
The preceding idea was used by Forsythe and Motzkin (1951) to find the
maximum of certain two-dimensional functions. Consider the objective
function in Figure 8.4, which has ellipsoidal concentric contours. Unless
the initial search point lies on an axis of the ellipses, the normal gradient
search will proceed to the maximum $X^*$ as shown. It can be seen that the
points in the search, $X_1, X_2, \ldots,$ are bounded by two "ridges" which both
pass through the maximum $X^*$. Thus a short-cut could be made after three
search points have been identified. When $X_1$, $X_2$, and $X_3$ have been found
the next search should be made in the direction of the line through $X_1$ and
$X_3$. If the contours of f are concentric ellipses then the maximum will be
found immediately.

Figure 8.4. Accelerated gradient search.

This method can be generalized to maximize a function f of n variables.


The method is most efficient when f is a negative definite quadratic (see the
Appendix). Suppose that the initial starting point is $X_1$. Let the best point
found in the direction of the $X_1$ gradient be $X_3$ (rather than $X_2$). The next
point found by the gradient method is $X_4$. The point $X_5$ is found by
maximizing along the line through $X_1$ and $X_4$. For n > 2 it is unlikely that $X_5$
will be a maximum.
The process can be described in general as follows. Once the process has
"warmed up" and i > 3, $X_i$ is found by gradient search from $X_{i-1}$ for i even,
and for i odd $X_i$ is found by an accelerated step by maximizing over the line
through $X_{i-1}$ and $X_{i-3}$. It can be shown that the global maximum of a negative
definite quadratic function of n variables can be found after 2n - 1 steps
using this procedure.
The above method requires the calculation of first derivatives. A similar
method, due to Shah, Buehler, and Kempthorne (1964), does not involve
such a restriction. Consider a function f of two variables whose contours are
negative definite quadratics. It can be shown that the contours are
concentric ellipses. Let $X^*$ be the global maximum of f and $X_1$ and $X_2$ two points
lying on an arbitrary line through $X^*$. It can be shown that the tangents to
the contours at $X_1$ and $X_2$ are parallel. Conversely, it can be shown that if
the situation of the previous sentence holds then $X_1$ and $X_2$ are collinear
with $X^*$. It is also true that $X_1$ and $X_2$ represent optima for f along the lines
corresponding to the respective tangents.
The foregoing facts suggest an interesting method for locating $X^*$. One
begins by arbitrarily selecting two parallel lines and finds the maximum of
f along each line. These two maxima are denoted by $X_1$ and $X_2$. A line is
then passed through these two points and a search is carried out along this
line. The resulting point is $X^*$.
The method can be generalized to maximize ellipsoidal functions of any
number of variables. The reader may very well have spotted that partan is an
acronym for parallel tangents.
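The parallel tangents construction is easy to check numerically in two variables. In the sketch below the quadratic (given by the made-up data C and H), the two starting points and the common line direction are all assumptions; each line maximum is computed in closed form, which is valid only because f is quadratic.

```python
def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def line_max(c, h, x, d):
    """Maximizer of f(X) = C^T X + (1/2) X^T H X along the line X + t d."""
    g = [ci + hi for ci, hi in zip(c, matvec(h, x))]   # gradient at x
    t = -dot(g, d) / dot(d, matvec(h, d))              # zero of the derivative in t
    return [xi + t * di for xi, di in zip(x, d)]


# Hypothetical negative definite quadratic with its maximum at (1, 1).
C = [1.0, 2.0]
H = [[-2.0, 1.0], [1.0, -3.0]]

direction = [1.0, 0.0]                        # two parallel (horizontal) lines
x1 = line_max(C, H, [0.0, 0.0], direction)    # maximum along the line y = 0
x2 = line_max(C, H, [0.0, 3.0], direction)    # maximum along the line y = 3
joining = [q - p for p, q in zip(x1, x2)]
x_star = line_max(C, H, x1, joining)          # search along the line through X1, X2
print(x1, x2, x_star)
```

Three one-dimensional searches locate the maximum, and no gradient of f is needed when the line maxima are found by a direct one-dimensional search.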

8.2.3.2 Conjugate Directions


This section and the next deal with methods which represent improvements
on the basic gradient method of Section 8.2.3. The reader will recall from his
study of linear algebra that vectors X and Y from n-dimensional Euclidean
space are orthogonal if
$$X^T Y = 0.$$
Now if the identity matrix with n rows is inserted into the expression, the
equation still holds, i.e.,
$$X^T I Y = 0.$$
Let us now replace I by any n-square matrix H. If
$$X^T H Y = 0,$$
X and Y are said to be H-conjugate. Thus orthogonality is a special case of
conjugacy, in the sense that orthogonal vectors are I-conjugate.
The methods of conjugate directions can be used to produce a sequence of
points $X_1, X_2, X_3, \ldots,$ which each yield improving values in maximizing
a quadratic
$$f(X) = AX + \tfrac{1}{2} X^T H X,$$
where H is the Hessian matrix of f. The directions of search $D_i$ and $D_k$ all
obey the relationship
$$D_i^T H D_k = 0, \quad \text{for all } i, k,\ i \neq k.$$
The general method of conjugate directions can be implemented as follows.
A point $X_1$ is chosen initially as the point most likely to be optimal.
(A random choice is made if no relevant information is available.) A
one-dimensional search is carried out in the direction $D_1$, the first conjugate
direction. This produces a new point $X_2$. Another one-dimensional search
is carried out along $D_2$, the next search direction, where
$$D_1^T H D_2 = 0.$$
Here H is the Hessian matrix of f. In general, when point $X_i$ is found, $X_{i+1}$
is found by a one-dimensional search along $D_i$ from $X_i$, where
$$D_j^T H D_k = 0, \qquad j, k \le i,\ j \neq k.$$
The maximum will be located in at most n steps.
The trouble is that the sequence of conjugate directions $D_1, D_2, \ldots$ is not
known in advance. The usual method is to generate each new direction $D_i$ as
it is needed. One way of doing this is to find the new conjugate direction $D_i$
from point $X_i$, using $\nabla f(X_i)$. Such directions are called conjugate gradient
directions and are discussed in the next section.

8.2.3.3 Conjugate Gradients


The conjugate gradient method generates each new conjugate direction from
the gradient at the point concerned in such a way that the direction is
conjugate to all those previously generated. First a starting point, $X_1$, is chosen.
A one-dimensional search is then performed in the gradient direction from
$X_1$. That is, to start the ball rolling the first direction is
$$D_1 = \nabla f(X_1).$$
The maximum point $X_2$ on this line is found. In general the direction $D_i$ is
constructed from $\nabla f(X_i)$ so as to make it conjugate to $D_{i-1}$. For this purpose
define a scalar $a_{i-1}$ such that
$$D_i = \nabla f(X_i) + a_{i-1} D_{i-1} \qquad (8.19)$$
and
$$D_i^T H D_{i-1} = 0. \qquad (8.20)$$
Here H still represents the Hessian matrix of f. Now from (8.19) and (8.20)
we have
$$(\nabla f(X_i) + a_{i-1} D_{i-1})^T H D_{i-1} = 0.$$
Hence
$$(\nabla f(X_i))^T H D_{i-1} + a_{i-1} D_{i-1}^T H D_{i-1} = 0,$$
and therefore
$$a_{i-1} = -\frac{(\nabla f(X_i))^T H D_{i-1}}{D_{i-1}^T H D_{i-1}}.$$
It can be shown by induction that the directions $D_i$ are mutually conjugate.
When it comes to actual implementation the $D_i$'s can be calculated by a
simple recurrence relation and only a few vectors and no matrices need be
stored. In fact, storage requirements vary with dimensionality n rather than
$n^2$ as for the quasi-Newton methods (Fletcher and Reeves 1964).
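The recurrence for $D_i$ and $a_{i-1}$ can be exercised on a quadratic in three variables. The data C and H below are made up for the purpose (H is negative definite and the maximum is at (1, 2, 3)), and the step along each direction is the closed-form line maximum of a quadratic; both are assumptions of this sketch rather than part of the method's statement.

```python
def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def conjugate_gradient_max(c, h, x):
    """Maximize f(X) = C^T X + (1/2) X^T H X in at most n conjugate gradient steps."""
    g = [ci + hi for ci, hi in zip(c, matvec(h, x))]    # grad f(X_1)
    d = list(g)                                         # D_1 = grad f(X_1)
    for _ in range(len(x)):
        hd = matvec(h, d)
        dhd = dot(d, hd)
        if dhd == 0.0:                                  # zero direction: already optimal
            break
        s = -dot(g, d) / dhd                            # exact line maximum along D
        x = [xi + s * di for xi, di in zip(x, d)]
        g = [ci + hi for ci, hi in zip(c, matvec(h, x))]
        a = -dot(g, hd) / dhd                           # the scalar a_{i-1}
        d = [gi + a * di for gi, di in zip(g, d)]       # D_i = grad f(X_i) + a_{i-1} D_{i-1}
    return x


# Hypothetical test data with the maximum at (1, 2, 3).
C = [2.0, 2.0, 4.0]
H = [[-4.0, 1.0, 0.0], [1.0, -3.0, 1.0], [0.0, 1.0, -2.0]]
x_star = conjugate_gradient_max(C, H, [0.0, 0.0, 0.0])
print(x_star)
```

Only the vectors x, g, d and H d are kept between steps, which reflects the remark that storage grows with n rather than $n^2$ (H itself appears here only because the test function is an explicit quadratic).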

8.2.4 Direct Methods

If the mathematical expression defining f is difficult to manipulate, its
gradient vector and Hessian matrix are likely to be complicated to calculate
and evaluate. If the expression is unknown and hence unavailable these
evaluations cannot be performed. When these situations arise, or when f has
many local maxima, the following direct search approach is appropriate. It
assumes that f can be evaluated at any point $X = (x_1, x_2, \ldots, x_n)^T \in R^n$ by
performing some task or experiment, or employing some algorithm with
input the specific values of $x_1, x_2, \ldots, x_n$.

8.2.4.1 Pattern Search


The first direct search method to be examined is called pattern search and
was developed by Hooke and Jeeves (1961). As with most direct search
methods, pattern search begins by evaluating f at the point $X_1$ most likely
to be maximal, or at a random point if all points are initially equally likely
to be maximal. Exploration about $X_1$ now begins in order to find the best
direction for improvement. A pre-defined step is taken in the increasing
direction of the first variable. To make this more clear let $X_a = X_1 =
(x_1, x_2, \ldots, x_n)$ and let the initial perturbation size for each variable $x_i$ be a
given positive real number $e_i$. Then f is first evaluated at $X_a$ and then at
$(x_1 + e_1, x_2, \ldots, x_n)$ and the two values thus obtained are compared. If the
second value is greater than the first (an improvement over the original
point has been found), the next variable, $x_2$, is increased from the new point.
If no improvement was found, $x_1$ is decreased by $e_1$, i.e., f is evaluated at
$(x_1 - e_1, x_2, \ldots, x_n)$. If this second value is greater than $f(X_a)$ then $x_2$ is
increased from the new point. Otherwise $x_2$ is increased from $X_a$. Note that
if $(x_1 + e_1, x_2, \ldots, x_n)$ corresponds to an improvement in f over $X_a$ we don't
bother to evaluate f at $(x_1 - e_1, x_2, \ldots, x_n)$. The result of all this is that we
end up with a point which is the best for f found so far, call it $(\bar{x}_1, x_2, \ldots, x_n)$,
where $\bar{x}_1$ is either $x_1$ or $x_1 + e_1$ or $x_1 - e_1$. We now perturb $x_2$ about this
best point. We evaluate f at $(\bar{x}_1, x_2 + e_2, x_3, \ldots, x_n)$ and if this represents
an improvement we perturb $x_3$ about it. If not we evaluate f at
$(\bar{x}_1, x_2 - e_2, x_3, \ldots, x_n)$ and so on. This strategy is continued, perturbing
(increasing, and if necessary decreasing) each variable about the best point
found so far. When all the variables have been perturbed in this manner the
final best point $X_b$ is identified. The first direction vector $D_1$ is defined as
$$D_1 = \frac{X_b - X_a}{\lVert X_b - X_a \rVert}.$$
The first step size $s_1$ is defined to be twice the Euclidean distance between
$X_b$ and $X_a$. Thus the new point $X_2$ is derived as
$$X_2 = X_1 + s_1 D_1 = X_a + 2(X_b - X_a) = 2X_b - X_a.$$
The process is now repeated with $X_2$ replacing $X_1$. Indeed each successive
step size $s_{j+1}$ is twice its predecessor $s_j$ unless no perturbations about $X_j$
bring about an improvement. If this happens the pre-defined perturbation
sizes $e_i$ are successively halved until the final best point represents an
improvement over $X_j$. If no improvement in f is found before an $e_i$ becomes
less than the corresponding resolution $\varepsilon_i$ for $x_i$ then the process is terminated
and $X_j$ is declared the best point that could be found by the method.
It can be seen that the search seeks a general trend or direction of improvement.
Hooke and Jeeves call this a pattern. If exploration about a point is
unfruitful the perturbation size for each variable is decreased and the
process is repeated about this point. The method seems to reflect the saying:
"nothing succeeds like success", for at each successful iteration the step size
is doubled. However when the exploration process fails to yield further
improvement the method begins over again and slowly builds up increasingly
large steps once more.
Pattern search often has excellent success because of its ability to follow
a ridge of the geometric mountain it is trying to climb. Before the reader
begins to regard pattern search as the ultimate in direct search methods,
a shortcoming should be noted. The method may fail to yield any further
improvement in a function with tightly curved ridges or sharp-cornered contours
while still far from a local maximum. However the technique is often
successful on real applications and is easily programmed, as can be seen
from the following algorithmic statement.
Pattern Search Algorithm. Let
$e_i$ = the initial perturbation size for $x_i$, i = 1, 2, ..., n,
$X_a$ = the current point about which perturbations are being made,
$X_b$ = the current best point found so far while perturbations are being made,
$X_c$ = the best point found once all variables have been perturbed,
$K_i$ = the vector (0, 0, ..., 0, 1, 0, ..., 0) consisting of all zero entries except
for a unit entry in the ith position, i = 1, 2, ..., n.

1. Initialization: set $X_a = X_1$, the initial point of the search, and set j = 1.
2. An exploration is carried out about $X_a$.
(a) Set $X_b = X_a$ and set i = 1.
(b) If $f(X_b + e_i K_i) > f(X_b)$, set $X_b$ to become $X_b + e_i K_i$ and go to step 2(c),
otherwise continue. If $f(X_b - e_i K_i) > f(X_b)$, set $X_b$ to become
$X_b - e_i K_i$, otherwise continue.
(c) Set i to become i + 1. If i ≤ n go to step 2(b), otherwise continue.
3. If any termination criterion is met go to step 6, otherwise continue.
4. If $f(X_b) \le f(X_c)$ [$f(X_b) = f(X_a)$ in the first iteration] go to step 5, otherwise
continue. An extrapolation is made in the new search direction $(X_b - X_c)$.
Set $X_a$ to become $X_c + 2(X_b - X_c)$ and j to become j + 1 ($X_a + 2(X_b - X_a)$
in the first iteration). Set $X_c = X_b$. Go to step 2.
5. Set $e_i$ to become $e_i/2$ for i = 1, 2, ..., n and j = 1. If there exists some i,
1 ≤ i ≤ n, such that $e_i < \varepsilon_i$, then set $X_b = X_c$ (except for first iteration)
and go to step 6. Otherwise, set $X_a = X_c$ (except for first iteration) and
go to step 2.
6. The best point found was $X_b$.
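The steps above can be sketched compactly in Python. This is an illustration under assumptions rather than a transcription: the names `explore` and `pattern_search`, the iteration cap `max_iter`, and the test function are all introduced here, and the counter j and the special first-iteration cases are folded into the ordinary update by initializing $X_c$ to the starting point.

```python
def explore(f, base, e):
    """Step 2: perturb each variable in turn about the best point found so far."""
    best, f_best = list(base), f(base)
    for i in range(len(best)):
        for delta in (e[i], -e[i]):        # try increasing first, then decreasing
            trial = list(best)
            trial[i] += delta
            f_trial = f(trial)
            if f_trial > f_best:
                best, f_best = trial, f_trial
                break                      # skip the decrease if the increase helped
    return best, f_best


def pattern_search(f, x1, e, resolution, max_iter=10000):
    e = list(e)
    xc, fc = list(x1), f(x1)               # X_c: best point confirmed so far
    xa = list(xc)                          # X_a: point about which we explore
    for _ in range(max_iter):
        xb, fb = explore(f, xa, e)
        if fb > fc:                        # step 4: extrapolate along the pattern
            xa = [c + 2.0 * (b - c) for b, c in zip(xb, xc)]
            xc, fc = xb, fb
        else:                              # step 5: halve the perturbation sizes
            e = [ei / 2.0 for ei in e]
            if any(ei < ri for ei, ri in zip(e, resolution)):
                break                      # step 6: resolution reached
            xa = list(xc)
    return xc, fc


# Hypothetical test function with its maximum at (-1, 2).
def f(p):
    return -(p[0] + 1.0) ** 2 - (p[1] - 2.0) ** 2


best_point, best_value = pattern_search(f, [0.0, 0.0], [0.1, 0.1], [0.03, 0.03])
print(best_point, best_value)
```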
Example. The following is an account of the use of pattern search in the
maximization of a function $f(X) = f(x_1, x_2)$, $x_1, x_2 \in R$,
where
$$X_1 = (0, 0),$$
$$e_1 = e_2 = 0.1,$$
$$\varepsilon_1 = \varepsilon_2 = 0.03.$$
The pattern the search takes is shown in Figure 8.5. Following the algorithm,
$$X_a = (0, 0).$$
Now we have
$$f((0,0) + 0.1(1,0)) < f(0,0).$$
But
$$f((0,0) - 0.1(1,0)) > f(0,0).$$
Hence
$$X_b = (-0.1, 0).$$

Further,
$$f((-0.1,0) + 0.1(0,1)) > f(-0.1,0).$$
So
$$X_b = (-0.1, 0.1).$$
Proceeding to step 4, as this is the first iteration, $X_a$ becomes
$$X_a + 2(X_b - X_a) = (0,0) + 2((-0.1,0.1) - (0,0)) = (-0.2, 0.2) \quad (= X_2),$$
$$X_c = (-0.1, 0.1).$$

Figure 8.5. An example of pattern search.


Returning to step 2, we now explore about $X_2$:
$$f((-0.2,0.2) + 0.1(1,0)) < f(-0.2,0.2).$$
But
$$f((-0.2,0.2) - 0.1(1,0)) > f(-0.2,0.2).$$
Hence
$$X_b = (-0.3, 0.2).$$
Further,
$$f((-0.3,0.2) + 0.1(0,1)) > f(-0.3,0.2).$$
So
$$X_b = (-0.3, 0.3).$$
Proceeding to step 4, $X_a$ becomes
$$X_c + 2(X_b - X_c) = (-0.1,0.1) + 2((-0.3,0.3) - (-0.1,0.1)) = (-0.5, 0.5) \quad (= X_3),$$
$$X_c = (-0.3, 0.3).$$
Returning to step 2, we now explore about $X_3$:
$$f((-0.5,0.5) + 0.1(1,0)) > f(-0.5,0.5).$$
So
$$X_b = (-0.4, 0.5).$$
Further,
$$f((-0.4,0.5) + 0.1(0,1)) > f(-0.4,0.5).$$
Hence
$$X_b = (-0.4, 0.6).$$
Proceeding to step 4, $X_a$ becomes
$$X_c + 2(X_b - X_c) = (-0.3,0.3) + 2((-0.4,0.6) - (-0.3,0.3)) = (-0.5, 0.9) \quad (= X_4),$$
$$X_c = (-0.4, 0.6).$$
At this point it is interesting to note that the direction of search has changed
slightly, as seen in Figure 8.5.
Returning to step 2, we now explore about $X_4$:
$$f((-0.5,0.9) + 0.1(1,0)) < f(-0.5,0.9).$$
But
$$f((-0.5,0.9) - 0.1(1,0)) > f(-0.5,0.9).$$
Hence
$$X_b = (-0.6, 0.9).$$
Further,
$$f((-0.6,0.9) + 0.1(0,1)) > f(-0.6,0.9).$$
So
$$X_b = (-0.6, 1.0).$$
Proceeding to step 4, $X_a$ becomes
$$X_c + 2(X_b - X_c) = (-0.4,0.6) + 2((-0.6,1.0) - (-0.4,0.6)) = (-0.8, 1.4) \quad (= X_5),$$
$$X_c = (-0.6, 1.0).$$
Returning to step 2, we now explore about $X_5$:
$$f((-0.8,1.4) + 0.1(1,0)) < f(-0.8,1.4)$$
and
$$f((-0.8,1.4) - 0.1(1,0)) < f(-0.8,1.4).$$
Thus $X_b$ remains at (-0.8, 1.4). However,
$$f((-0.8,1.4) + 0.1(0,1)) > f(-0.8,1.4).$$
Hence
$$X_b = (-0.8, 1.5).$$
Proceeding to step 4, $X_a$ becomes
$$X_c + 2(X_b - X_c) = (-0.6,1.0) + 2((-0.8,1.5) - (-0.6,1.0)) = (-1.0, 2.0) \quad (= X_6),$$
$$X_c = (-0.8, 1.5).$$
Returning to step 2, we now explore about $X_6$:
$$f((-1.0,2.0) + 0.1(1,0)) < f(-1.0,2.0)$$
and
$$f((-1.0,2.0) - 0.1(1,0)) < f(-1.0,2.0).$$
Thus $X_b$ remains at (-1.0, 2.0). Further,
$$f((-1.0,2.0) + 0.1(0,1)) < f(-1.0,2.0)$$
and
$$f((-1.0,2.0) - 0.1(0,1)) < f(-1.0,2.0).$$
Thus $X_b$ still remains at (-1.0, 2.0). Proceeding to step 4,
$$f(-1.0,2.0) > f(-0.8,1.5).$$
So $X_a$ becomes
$$X_c + 2(X_b - X_c) = (-0.8,1.5) + 2((-1.0,2.0) - (-0.8,1.5)) = (-1.2, 2.5) \quad (= X_7),$$
$$X_c = (-1.0, 2.0).$$
Returning to step 2, we now explore about $X_7$:
$$f((-1.2,2.5) + 0.1(1,0)) < f(-1.2,2.5)$$
and
$$f((-1.2,2.5) - 0.1(1,0)) < f(-1.2,2.5).$$
Thus $X_b$ remains at (-1.2, 2.5). Further,
$$f((-1.2,2.5) + 0.1(0,1)) < f(-1.2,2.5)$$
and
$$f((-1.2,2.5) - 0.1(0,1)) < f(-1.2,2.5).$$
Thus $X_b$ still remains at (-1.2, 2.5). Proceeding to step 4,
$$f(-1.0,2.0) > f(-1.2,2.5).$$
The extrapolation to $X_7$ has failed to yield an improvement over $X_6$
and the pattern is destroyed. Proceeding to step 5, an attempt is made to
find a new pattern by casting about near $X_6$:
$$e_1 = e_2 = 0.05.$$
As this exceeds the resolution, setting
$$X_a = X_c = (-1.0, 2.0)$$
and proceeding to step 2, we obtain
$$f((-1.0,2.0) + 0.05(1,0)) > f(-1.0,2.0).$$
Hence
$$X_b = (-0.95, 2.0).$$
Further,
$$f((-0.95,2.0) + 0.05(0,1)) > f(-0.95,2.0).$$
So
$$X_b = (-0.95, 2.05).$$
Proceeding to step 4, as
$$f(-0.95,2.05) > f(-1.0,2.0),$$
a new search direction has been formed. $X_a$ becomes
$$X_c + 2(X_b - X_c) = (-1.0,2.0) + 2((-0.95,2.05) - (-1.0,2.0)) = (-0.9, 2.1) \quad (= X_8),$$
$$X_c = (-0.95, 2.05).$$
As can be seen from Figure 8.5, the search took a new turn at $X_7$ and
proceeded in the (0.1, 0.1) direction. Returning to step 2, we now explore
about $X_8$:
$$f((-0.9,2.1) + 0.05(1,0)) < f(-0.9,2.1)$$
and
$$f((-0.9,2.1) - 0.05(1,0)) < f(-0.9,2.1).$$
Thus $X_b$ remains at (-0.9, 2.1). Further,
$$f((-0.9,2.1) + 0.05(0,1)) < f(-0.9,2.1)$$
and
$$f((-0.9,2.1) - 0.05(0,1)) < f(-0.9,2.1).$$
Thus $X_b$ still remains at (-0.9, 2.1). Proceeding to step 4,
$$f(-0.9,2.1) < f(-0.95,2.05).$$
The extrapolation to $X_8$ has failed to yield an improvement over $X_7$ and
the pattern is destroyed. Proceeding to step 5, an attempt is made to find
a new pattern by casting about near $X_7$:
$$e_1 = e_2 = 0.025.$$
As these values do not exceed the resolution, the search is terminated. Let
$X_b = X_c = (-0.95, 2.05)$, which is the best point found.

8.2.4.2 One-at-a-Time Search


One-at-a-time search, or sectioning, as it is often called, is a classical method
of direct search. Given an initial estimate,
$$X_1 = (x_1, x_2, \ldots, x_n)^T,$$
one first searches in the direction of the first variable $x_1$. Suppose the
maximum of f in this direction from $X_1$ lies at a distance $\delta_1$ from $X_1$. This
maximum point can be found by one of the one-dimensional search methods of
Section 8.2.1. This yields a new estimate $X_2$, where
$$X_2 = (x_1 + \delta_1, x_2, \ldots, x_n)^T.$$
Next one searches for a maximum in the $x_2$ direction from $X_2$. This yields
$X_3$ where
$$X_3 = (x_1 + \delta_1, x_2 + \delta_2, x_3, \ldots, x_n)^T.$$
Eventually $X_{n+1}$ is found, where
$$X_{n+1} = (x_1 + \delta_1, x_2 + \delta_2, \ldots, x_n + \delta_n)^T.$$
The process is then repeated with $X_{n+1}$ replacing $X_1$. The above steps are
carried out until the steps $\delta_i$, i = 1, 2, ..., n become less than the resolution.
The rate of convergence of this method is usually painfully slow. Indeed,
if the function has ridges which are far from parallel to any coordinate axis
the method will probably grind to a halt far from the optimum.
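A sketch of sectioning follows, with a golden-section search standing in for "one of the one-dimensional search methods". The search reach of 10 units in either direction, the tolerances, the cap on the number of sweeps and the quadratic test function are all assumptions made for illustration.

```python
import math


def golden_max(phi, lo, hi, tol=1e-9):
    """Golden-section search for the maximum of a unimodal phi on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    f1, f2 = phi(x1), phi(x2)
    while hi - lo > tol:
        if f1 < f2:                          # maximum lies in [x1, hi]
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = phi(x2)
        else:                                # maximum lies in [lo, x2]
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = phi(x1)
    return (lo + hi) / 2.0


def one_at_a_time(f, x, reach=10.0, resolution=1e-6, max_sweeps=200):
    """Search each coordinate direction in turn until every step delta_i is small."""
    x = list(x)
    for _ in range(max_sweeps):
        largest = 0.0
        for i in range(len(x)):
            def along(t, i=i):               # f restricted to the x_i direction
                trial = list(x)
                trial[i] += t
                return f(trial)
            delta = golden_max(along, -reach, reach)
            x[i] += delta
            largest = max(largest, abs(delta))
        if largest < resolution:
            break
    return x


# Hypothetical quadratic with a mild ridge; its maximum is at (1, 1).
def f(p):
    x, y = p
    return x + 2.0 * y - x * x + x * y - 1.5 * y * y


result = one_at_a_time(f, [0.0, 0.0])
print(result)
```

On this mildly coupled function the sweeps converge quickly; the stronger the coupling between variables (the more a ridge departs from the coordinate axes), the more sweeps are needed.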

8.2.4.3 The Method of Rosenbrock


One of the problems of the previously mentioned direct search strategies is
that they often fail if they encounter a ridge. Rosenbrock (1960) tried to
overcome this by developing a method which attempts to identify a ridge
and then searches in the direction of the ridge.
The method begins by making a first attempt at finding a ridge direction
by using one-at-a-time search. Each variable direction is searched for the
optimum about that direction, as in Section 8.2.4.2. Thus if the initial estimate
is
$$X_1 = (x_1, x_2, \ldots, x_n)^T,$$
let the result of this search be $X_2$, where
$$X_2 = (x_1 + \delta_1, x_2 + \delta_2, \ldots, x_n + \delta_n)^T.$$
Now instead of repeating the process for $X_2$, the method searches in the
direction
$$X_2 - X_1$$
from $X_2$. Thus we replace the original point $X_1$ by a new estimate $X_2$.
The direction from $X_1$ to $X_2$ is given by $(\delta_1, \delta_2, \ldots, \delta_n)^T$, hence it seems
a promising direction in which to search. This acceleration step is similar
in idea to that mentioned in connection with the gradient method of Section
8.2.3.
Once the optimum is found in this direction the remaining search direc-
tions are chosen so that they are orthogonal to all previous directions. These
remaining directions can be generated by Gram-Schmidt orthogonaliza-
tion (see Section 9.1 in the Appendix). Once each of these directions has
been searched, an acceleration step is made in the direction of the line from
the first point to the last point corresponding to these directions. This
furnishes the first direction for the next orthogonalization process. This
cycle of acceleration and orthogonalization is repeated until no significant
improvement can be found, or some termination criterion is satisfied.
Improved procedures, requiring computational time of order $n^2$ rather
than $n^3$ for the Gram-Schmidt process, have been given by Powell (1968).

8.2.4.4 The Method of Powell


This method was developed by M. J. D. Powell (1964). It is similar to the
method of conjugate directions of Section 8.2.3.2, except that derivatives
are not required. The method is also similar to Rosenbrock's method of the
previous section, except that each search is carried out along a conjugate
direction. The directions become conjugate with respect to an approximation
of the Hessian matrix. An algorithmic statement of the method is given next.
Let $X_0$ be the initial estimate of the maximum point. Let $d_1, d_2, \ldots, d_n$
be the search directions.

1. Set $d_1, d_2, \ldots, d_n$ to be equal to the coordinate directions.
2. Set i = 1.
3. Find the maximum of f in the $d_i$ direction from $X_{i-1}$, at, say, $X_{i-1} + s_i d_i$.
4. Let
$$X_i = X_{i-1} + s_i d_i.$$
If i < n, let i become i + 1, and go to step 3. If i = n, go to step 5.
5. Let
$$d_i = d_{i+1}, \qquad i = 1, 2, \ldots, n - 1,$$
$$d_n = X_n - X_0.$$
6. Find the maximum of f in the $d_n$ direction from $X_n$, at, say, $X_n + s_n d_n$.
Let
$$X_0 = X_n + s_n d_n.$$
7. Return to step 2 unless some termination criterion is met.
It can be seen that the initial coordinate directions are gradually replaced
by new directions, one per iteration. When the method is applied to a qua-
dratic function these new directions are usually mutually conjugate. This
means that the method is likely to terminate after at most n iterations for a
quadratic objective function.
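The seven steps can be sketched as follows. Several details are assumptions of the sketch and not part of the algorithmic statement: the line maxima are found by golden-section search over a fixed reach of 10 units, the new direction $X_n - X_0$ is scaled to unit length, and the process stops once $X_n - X_0$ is shorter than a tolerance.

```python
import math


def golden_max(phi, lo, hi, tol=1e-9):
    """Golden-section search for the maximum of a unimodal phi on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    x1, x2 = hi - r * (hi - lo), lo + r * (hi - lo)
    f1, f2 = phi(x1), phi(x2)
    while hi - lo > tol:
        if f1 < f2:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = phi(x2)
        else:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = phi(x1)
    return (lo + hi) / 2.0


def line_max(f, x, d, reach=10.0):
    """Point maximizing f along x + t d for |t| <= reach (an assumed bracket)."""
    t = golden_max(lambda t: f([xi + t * di for xi, di in zip(x, d)]), -reach, reach)
    return [xi + t * di for xi, di in zip(x, d)]


def powell(f, x0, tol=1e-6, max_iter=50):
    n = len(x0)
    dirs = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # step 1
    x0 = list(x0)
    for _ in range(max_iter):
        x = list(x0)
        for d in dirs:                                  # steps 2 to 4
            x = line_max(f, x, d)
        new_dir = [a - b for a, b in zip(x, x0)]        # d_n = X_n - X_0
        length = math.sqrt(sum(c * c for c in new_dir))
        if length < tol:                                # assumed termination criterion
            return x
        new_dir = [c / length for c in new_dir]         # scaling is an assumption
        dirs = dirs[1:] + [new_dir]                     # step 5
        x0 = line_max(f, x, new_dir)                    # step 6
    return x0


# Hypothetical quadratic test function with its maximum at (1, 1).
def f(p):
    x, y = p
    return x + 2.0 * y - x * x + x * y - 1.5 * y * y


result = powell(f, [0.0, 0.0])
print(result)
```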

8.2.4.5 Brent's Praxis Method


There is a problem that may occur in the implementation of Powell's method,
described in the previous section. That is, even if f is quadratic, it may
happen that
$$s_i = 0 \quad \text{for some } i,\ 1 \le i \le n.$$
That is, having some estimate $X_{i-1}$ of the optimum, the next estimate is
calculated to be the same point. This occurs when the directions $d_1, d_2, \ldots,
d_n$ become linearly dependent. When the method is implemented on a digital
computer it is unlikely that the step size will ever become exactly zero
(because of roundoff). However, it can come alarmingly close, and to avoid
this problem a new direction $X_n - X_0$ should replace one of the $d_i$'s so as
to make the set linearly independent. While this modification has been found
to be quite successful (Fletcher 1965; Box 1966), there is no longer any
guarantee that for a quadratic function the set of directions will be mutually
conjugate. This means that the method may not produce fast convergence
to the optimum.
Brent (1973) has suggested a different approach to the problem of avoiding
linear dependence among the search directions. His modification of Powell's
method is to periodically reset the search directions to be a set of orthogonal
directions based on the original conjugate directions which are replaced.
This results in faster convergence than would occur if the search directions
were reset as the coordinate directions, as in that case information built up
about the function is thrown away at each reset.
The new set of normalized search directions $a_1, a_2, \ldots, a_n$ are built up by
assuming that the objective function f is a quadratic. If f is indeed quadratic,
the $a_1, a_2, \ldots, a_n$ will be mutually conjugate. They can be assembled as
column vectors into a matrix
$$D = (a_1, a_2, \ldots, a_n).$$
Then
$$H = (DD^T)^{-1}$$
will be the Hessian of f evaluated at the optimum. D is then replaced by an
orthogonal matrix satisfying this last equation. This ensures fast convergence.
As the directions $a_1, a_2, \ldots, a_n$ are calculated to be orthogonal, they span
$R^n$ and thus no potential optimum is overlooked. The computational details
are given in Brent (1973), p. 129.

8.2.4.6 The Method of Stewart


As can be seen, the variable metric method of Section 8.2.2.2 requires that
the partial derivatives of f be evaluated at successive points of the search.
If it is difficult to perform these evaluations a natural question arises: Is it
possible to estimate successfully the required values of the derivatives at
the necessary points? Although the answer is often yes, difficulties may arise
due to rounding and approximation errors in the computation of the
estimates.
Stewart (1967) followed this approach by estimating the value of each
partial derivative by a difference quotient, i.e.,
$$\frac{\partial f(X)}{\partial x_i} \approx \frac{f(X + \Delta_i K_i) - f(X)}{\Delta_i}, \quad i = 1, 2, \ldots, n, \tag{8.21}$$
where $\Delta_i$, $i = 1, 2, \ldots, n$ is a small positive number (some suggestions for
the choosing of which are given in Stewart's paper) and $K_i$, $i = 1, 2, \ldots, n$
is a vector with all zero entries except for a unit entry in the $i$th position.
If the $\Delta_i$ are relatively small, rounding error will be relatively high and the
error in (8.21) will be unacceptable. However, if the $\Delta_i$ are relatively large
(8.21) does not provide a very accurate approximation. Stewart attempts to
steer a middle course by choosing the $\Delta_i$ according to the curvature at $X$
(requiring estimates of second derivatives). For greater accuracy, rather than
using the forward difference formula (8.21) he uses a central difference
formula:
$$\frac{\partial f(X)}{\partial x_i} \approx \frac{f(X + \tfrac{1}{2}\Delta_i K_i) - f(X - \tfrac{1}{2}\Delta_i K_i)}{\Delta_i}, \quad i = 1, 2, \ldots, n.$$
Despite these precautions, it sometimes happens that the method fails to
yield an improvement in $f$ after reaching a particular point $X_i$. In this case,
$E_i$, the $i$th approximation of $H^{-1}$, is reset to $I$, the identity matrix, i.e., the
search once again sets off initially in the direction $\nabla f(X_i)$.
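
As an illustration, the forward and central difference quotients can be compared numerically. The following Python sketch is not Stewart's procedure for choosing the step size; it simply assumes a fixed step `delta` and a hypothetical test function `f` (both chosen here for illustration) and compares each estimate with the exact partial derivative.

```python
import math

def forward_diff(f, x, i, delta):
    """Forward difference estimate of the i-th partial derivative of f at x."""
    xp = list(x)
    xp[i] += delta
    return (f(xp) - f(x)) / delta

def central_diff(f, x, i, delta):
    """Central difference estimate using half-steps on either side of x."""
    xp, xm = list(x), list(x)
    xp[i] += 0.5 * delta
    xm[i] -= 0.5 * delta
    return (f(xp) - f(xm)) / delta

def f(x):
    # Hypothetical smooth test function, not taken from the text.
    return math.exp(x[0]) * math.sin(x[1])

x = [0.5, 1.0]
exact = math.exp(0.5) * math.sin(1.0)      # exact value of df/dx_0 at x
fwd_err = abs(forward_diff(f, x, 0, 1e-3) - exact)
ctr_err = abs(central_diff(f, x, 0, 1e-3) - exact)
print(fwd_err, ctr_err)
```

For this test function the central estimate is far closer to the exact value than the forward estimate at the same step, at the cost of roughly twice as many function evaluations for a full gradient.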

8.2.5 A Comparison of Unconstrained Optimization Methods

Having presented a large number of unconstrained optimization methods
we must make some attempt to compare them. As would be expected,
Hessian methods usually take fewer steps to converge to the maximum of a
given problem than gradient methods. Also, the latter usually take fewer
steps than direct search methods. However, because derivatives are often
expensive to compute, it is not always true that direct search methods
require more overall computational effort than the more sophisticated
methods.
Whenever second derivatives are available a modified version of the
Newton-Raphson method of Section 8.2.2.1 should be used. The modification
consists of using an acceleration step $X_n - X_0$ as a direction of search
after $X_n$ has been found after $n$ steps. (Further discussion is given in Jacoby,
Kowalik, and Pizzo (1972).) When the Hessian matrix is unavailable, variable
metric methods are generally the most powerful. (See especially the
comment at the end of Section 8.2.2.2.) However when the objective function
is known to be quadratic or near quadratic, the method of conjugate gradients
(Section 8.2.3.3) is just as efficient. When derivatives are unavailable, Powell's
method (Section 8.2.4.4) seems best on problems with a small number of
variables. With problems with a larger number of variables, Stewart's method
(Section 8.2.4.5) is often appropriate. Lill (1970) has modified the linear
search in Stewart's algorithm to yield good results. However Himmelblau
(1972) found Stewart's method with golden section search inferior to a
modification of Powell's method (the modification being to include the
method of Davies, Swann, and Campey, mentioned by Swann (1964), in
both the linear search and iterative quadratic fitting.)

8.3 Constrained Optimization


Feasible solutions to many realistic optimization problems are constrained
to lie within a subset of n-dimensional Euclidean space. These constraints
are usually expressed as equalities or inequalities involving the decision
variables, as outlined in Section 8.1.

8.3.1 The Method of Zoutendijk

The first method to be described for solving nonlinear optimization problems
with inequality constraints is called the method of feasible directions,
presented by Zoutendijk (1960).
The method starts with a point $X_1$ which satisfies the inequalities (8.3).
A feasible direction in which to move from $X_1$ is now determined. A feasible
direction is one in which small steps produce points which are both improvements
over $X_1$ (in magnitude of $f$) and satisfy (8.3). An obvious candidate
would be the gradient direction, as this would yield the biggest improvement
in $f$. If this direction is feasible it is chosen. However, suppose it is not feasible.
Suppose also that $X_1$ lies on the boundary of the feasible region (otherwise
an unconstrained optimization strategy can be used until a constraint is
met). The problem now is to find a direction which both increases the objective
function and leads to points within the feasible region.
The feasible direction chosen is the one which in general makes the
smallest angle $\theta$ with the gradient direction. There may be pitfalls however.
If the "active" constraint (the constraint which forms the part of the boundary
on which $X_1$ lies) is linear, everything is satisfactory and one of the two
directions defined by this constraint is chosen. However, if the active con-
straint is nonlinear, it is possible that this procedure will produce a direction
leading out of the feasible region. Having made such a step, one would then
have to "jump" back into the feasible region. But there is no guarantee
that such pairs of steps would not be performed repeatedly, causing an
inefficient zig zag. To avoid these and other traps, it becomes increasingly
obvious that we must choose a direction which moves decisively away from
the boundary of the feasible region while also increasing the value of f.
For this purpose the desirable direction $d$ is found by solving the following
program:
Maximize:    $\varepsilon$
subject to:  $\nabla h_i(X_1)^T d + t_i \varepsilon \le 0$ for all $i$ for which $h_i(X_1) = 0$,
             $0 \le t_i \le 1$,
             $\nabla f(X_1)^T d \ge \varepsilon$,
             $d^T d = 1$.
The direction $d^*$ which is the solution to this problem is the most desirable
direction to use. One proceeds in this direction as far as possible from
$X_1$ until the function begins to decrease or a boundary of the feasible region
is met. The process is then repeated until the maximum value for $\varepsilon$ is
nonpositive. The process is then terminated. If all the functions in (8.3) are
concave the global maximum will have been found. The method often performs
well on maximization problems when concavity is not present.
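
The direction-finding program can be illustrated in two dimensions, where the unit vectors $d$ can simply be scanned. The Python sketch below is only a brute-force illustration, not Zoutendijk's solution procedure. It assumes the active-constraint condition takes the usual form $\nabla h_i(X_1)^T d + t_i \varepsilon \le 0$, and the function name `feasible_direction_2d`, the grid size, and the test problem (maximize $x_1 + x_2$ subject to $x_1^2 + x_2^2 - 1 \le 0$, starting from the boundary point $(1, 0)$) are all hypothetical choices made here.

```python
import math

def feasible_direction_2d(grad_f, active_grads, t, steps=36000):
    """Scan unit vectors d in the plane and return the largest epsilon and its d."""
    best_eps, best_d = -math.inf, None
    for s in range(steps):
        theta = 2.0 * math.pi * s / steps
        d = (math.cos(theta), math.sin(theta))        # satisfies d^T d = 1
        # Largest epsilon allowed by grad_f . d >= epsilon
        eps = grad_f[0] * d[0] + grad_f[1] * d[1]
        # Largest epsilon allowed by grad_h_i . d + t_i * epsilon <= 0
        for gh, ti in zip(active_grads, t):
            slope = gh[0] * d[0] + gh[1] * d[1]
            if ti > 0:
                eps = min(eps, -slope / ti)
            elif slope > 0:
                eps = -math.inf                        # t_i = 0: d leaves the region
        if eps > best_eps:
            best_eps, best_d = eps, d
    return best_eps, best_d

# Assumed test data at X_1 = (1, 0): grad f = (1, 1), grad h = (2, 0), t_1 = 1.
eps, d = feasible_direction_2d((1.0, 1.0), [(2.0, 0.0)], [1.0])
print(eps, d)
```

The direction returned points back into the disc (negative first component) while still increasing $f$, and the positive value of $\varepsilon$ indicates that the search should continue from this point.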

8.3.2 The Gradient Projection Method

Let X* be the optimal solution to the problem (8.1), (8.3). It is very likely that
at this point some of the m constraints in (8.3) will be active. These active
constraints form a subspace of the original feasible region. Thus if this sub-
space is examined using unconstrained optimization techniques, the opti-
mum will be found. The main problem is to identify the correct subspace
from among the multitude of subspaces defined by combinations of the con-
straints. Rosen (1960) developed the gradient projection method which
solves this problem efficiently if all the constraints are linear. Unlike Zout-
endijk's method, Rosen's method does not require the solution of a linear
programming problem each time a new search direction is to be found. This
decrease in computational effort has a price. At each iteration, the method
does not search in the feasible direction which brings about the greatest
objective function increase. Instead it chooses a direction which both increases
the function value and ensures that a small step in this direction does
not lead to an infeasible point. This direction is defined as the projection of
the gradient vector onto the intersection of the hyperplanes associated with
the active constraints. If there are no active constraints, the gradient direc-
tion is taken as the direction of search.
In outlining the method here it will be assumed that all the functions $h_i$,
$i = 1, 2, \ldots, m$ in (8.3) are linear. Specifically, assume that the method
begins with an estimate $X_1$ of $X^*$. Let $X_k = X_1$.
1. First calculate $\nabla f(X_k)$.
2. Let $P_k$ be the set of hyperplanes corresponding to active constraints at
$X_k$.
3. Find the projection of $\nabla f(X_k)$ onto the intersection of the hyperplanes in
$P_k$. (If there are no active constraints, $X_k$ is an interior point and $P_k$ is
empty. In this case the projection is $\nabla f(X_k)$.)
4. Maximize along the direction of the projection, taking care to remain
within the feasible region.
5. This produces a new point $X_{k+1}$.
6. (a) If $P_k$ was not empty in step 2, replace $X_k$ by $X_{k+1}$ and return to step 1.
(b) If $P_k$ was empty in step 2, then
$$\frac{\partial f}{\partial x_j} + \sum_{i=1}^{q} \lambda_i \frac{\partial h_i}{\partial x_j} = 0, \quad j = 1, 2, \ldots, n,$$
where $h_1, h_2, \ldots, h_q$ are the functions which correspond to the hyperplanes
in $P_k$. If
$$\lambda_i \le 0, \quad i = 1, 2, \ldots, q,$$
$X_k$ satisfies the Kuhn-Tucker conditions (see Chapter 7). Hence $X_k$
is a maximum. If at least one $\lambda_i$ is such that
$$\lambda_i > 0, \tag{8.22}$$
a plane corresponding to a function $h_i$ for which (8.22) holds is removed
from $P_k$. Return to step 2.
The method is unlikely to be as efficient on problems with nonlinear con-
straints. In these cases the projections are made onto hyperplanes which are
tangent to the constraint surfaces. Steps taken in these hyperplanes may very
well move out of the feasible region. Thus jumps back into the feasible region
are likely to be necessary.
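
The projection used in step 3 can be sketched for linear constraints as follows. This is a minimal illustration rather than Rosen's own computational scheme: it assumes each active hyperplane is given by its normal vector, orthonormalizes those normals by Gram-Schmidt, and removes the corresponding components from the gradient. The function name `project_gradient` and the data are hypothetical.

```python
def project_gradient(grad, normals, tol=1e-12):
    """Project grad onto the intersection of the hyperplanes with the given normals."""
    basis = []                                   # orthonormal basis of the normals
    for a in normals:
        v = list(a)
        for b in basis:
            c = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = sum(vi * vi for vi in v) ** 0.5
        if norm > tol:                           # skip linearly dependent normals
            basis.append([vi / norm for vi in v])
    p = list(grad)
    for b in basis:                              # remove components along the normals
        c = sum(pi * bi for pi, bi in zip(p, b))
        p = [pi - c * bi for pi, bi in zip(p, b)]
    return p

# Assumed data: gradient (1, 2, 3), one active constraint x1 + x2 + x3 <= 1.
grad = [1.0, 2.0, 3.0]
normal = [1.0, 1.0, 1.0]
p = project_gradient(grad, [normal])
p_interior = project_gradient(grad, [])          # no active constraints
print(p, p_interior)
```

With no active constraints the function returns the gradient itself, matching the remark in step 3; with one active constraint the result is orthogonal to that constraint's normal, so a small step along it stays on the hyperplane.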

8.3.3 A Penalty Function Method

Carroll (1961) presented a method for solving (8.1), (8.3) which generates a
sequence $X_1, X_2, \ldots$ of successively better estimates of $X^*$, each of which
is feasible. Fiacco and McCormick (1968) refined the method and call their
modification SUMT (Sequential Unconstrained Minimization Technique).
The method can just as easily accommodate maximization problems.
The SUMT method creates a new objective function:
$$F(X, q) = f(X) - q \sum_{i=1}^{m} \frac{1}{h_i(X)},$$
where $q$ is negative as the objective is maximization. First an initial feasible
starting point $X_1$ must be chosen. Then an initial search is carried out for
the maximum of $F$ having chosen a value of $q$ of large magnitude, say $q_0$. The methods of
Powell (Section 8.2.4.4) or Davidon (Section 8.2.2.2) appear to be two of the
most appropriate for this search. Note that the maximum $X_2$ for $F$ will not
lie on the boundary of the feasible region, as, if
$$h_i(X) \to 0$$
for any $i$, $i = 1, 2, \ldots, m$, then $F$ will become arbitrarily small. Hence $X_2$ will
not be the maximum for the original problem if $X^*$ lies on the boundary. A
more accurate approximation of $X^*$ is found by maximizing $F$ by searching
from $X_2$ after reducing the magnitude of $q$. The above series of steps is repeated,
with the magnitude of $q$ being successively reduced at each iteration. The sequence of feasible
points found approaches an optimum if certain assumptions are met.
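
The behaviour of the sequence of maxima can be seen on a one-variable problem. The following Python sketch assumes the barrier form $F(x, q) = f(x) - q / h(x)$ with $q < 0$, applied to the hypothetical problem of maximizing $f(x) = x$ subject to $h(x) = x - 1 \le 0$. For simplicity it uses a ternary search on a fixed interval instead of the methods of Powell or Davidon, and does not restart from the previous maximum; both are simplifications made here.

```python
def barrier(x, q):
    # F(x, q) = f(x) - q / h(x) with f(x) = x and h(x) = x - 1 (feasible when h <= 0)
    return x - q / (x - 1.0)

def maximize_concave(func, lo, hi, iters=200):
    """Ternary search for the maximum of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

q = -1.0                  # assumed initial parameter q_0 (negative for maximization)
path = []
for _ in range(6):
    x = maximize_concave(lambda z: barrier(z, q), -5.0, 1.0 - 1e-12)
    path.append((q, x))
    q *= 0.01             # reduce the magnitude of q for the next search
for q_used, x_max in path:
    print(q_used, x_max)
```

For this problem the maximum of $F$ is at $x = 1 - \sqrt{-q}$, so the iterates remain strictly inside the feasible region and approach the constrained maximum $x^* = 1$ as the magnitude of $q$ shrinks.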

8.3.3.1 The Generalized Reduced Gradient Method


Consider the nonlinear programming problem:
Minimize:    $f(X)$
subject to:  $g_j(X) = 0, \quad j = 1, 2, \ldots, m$
             $L_i \le x_i \le U_i, \quad i = 1, 2, \ldots, n,$
where $X = (x_1, x_2, \ldots, x_n)$ and the $L_i$, $U_i$, $i = 1, 2, \ldots, n$ are given constants.
We now outline a method due to Abadie and Carpentier (described in
Fletcher (1969)) which solves this problem. Inequality constraints can be
handled by this method by introducing nonnegative slack variables to force
equality. To make this clear, suppose a problem contained the constraint:
$$h_j(X) \ge 0.$$
We introduce a new (slack) variable $s_j$, forcing equality:
$$h_j(X) - s_j = 0.$$
We set
$$L_j = 0, \quad U_j = \infty,$$
and thus the problem is now in the desired form.
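As in the simplex method (assuming nondegeneracy) one partitions the
set of variables $\{x_1, x_2, \ldots, x_n\}$ into two subsets: $X_B$, comprising $m$ basic
variables (one for each constraint), and $X_{NB}$ comprising $n - m$ nonbasic
variables. Now
$$df(X) = \nabla_{X_{NB}} f(X)^T \, dX_{NB} + \nabla_{X_B} f(X)^T \, dX_B$$
and
$$\frac{df(X)}{dX_{NB}} = \nabla_{X_{NB}} f(X) + \nabla_{X_B} f(X)^T \frac{dX_B}{dX_{NB}},$$
where
$$\nabla_{X_B} f(X) = \left( \frac{\partial f(X)}{\partial x_{B_1}}, \frac{\partial f(X)}{\partial x_{B_2}}, \ldots, \frac{\partial f(X)}{\partial x_{B_m}} \right)^T$$
and
$$\nabla_{X_{NB}} f(X) = \left( \frac{\partial f(X)}{\partial x_{NB_1}}, \frac{\partial f(X)}{\partial x_{NB_2}}, \ldots, \frac{\partial f(X)}{\partial x_{NB_{n-m}}} \right)^T.$$
Now, as
$$g_j(X) = 0, \quad j = 1, 2, \ldots, m,$$
$$dg_j(X) = 0, \quad j = 1, 2, \ldots, m$$
we have
$$\frac{\partial g}{\partial X_B} \, dX_B + \frac{\partial g}{\partial X_{NB}} \, dX_{NB} = 0,$$
where
$$g = (g_1, g_2, \ldots, g_m)^T.$$
Thus
$$\frac{dX_B}{dX_{NB}} = -\left( \frac{\partial g}{\partial X_B} \right)^{-1} \frac{\partial g}{\partial X_{NB}}.$$
On substitution, we obtain
$$\frac{df(X)}{dX_{NB}} = \nabla_{X_{NB}} f(X) - \nabla_{X_B} f(X)^T \left( \frac{\partial g}{\partial X_B} \right)^{-1} \frac{\partial g}{\partial X_{NB}}.$$
This last expression is called the generalized reduced gradient, and permits a
reduction in the dimensionality of the problem.
Now if $f$ has a local minimum at $X^*$, it is necessary that
$$\frac{df(X^*)}{dX_{NB}} = 0.$$
The search for $X^*$ begins at point $X_0$ on the boundary of the feasible region.
One then searches from $X_0$ along the boundary until
$$\frac{df(X)}{dX_{NB}} = 0,$$
at which point a local minimum has been found.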
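
A small numerical check of the generalized reduced gradient may help. The Python sketch below evaluates it for a hypothetical problem chosen here: minimize $f = x_1^2 + x_2^2 + x_3^2$ subject to the single constraint $g = x_1 + 2x_2 + 3x_3 - 6 = 0$, with $x_1$ taken as the basic variable and $x_2, x_3$ as nonbasic. With one constraint the inverse of $\partial g / \partial X_B$ is a scalar reciprocal, which keeps the example short.

```python
from fractions import Fraction as Fr

def reduced_gradient(x):
    """Reduced gradient of f = x1^2 + x2^2 + x3^2 on g = x1 + 2*x2 + 3*x3 - 6 = 0."""
    x1, x2, x3 = x
    grad_B = 2 * x1                    # df/dx_B  (basic variable x1)
    grad_NB = [2 * x2, 2 * x3]         # df/dx_NB (nonbasic variables x2, x3)
    dg_dB = Fr(1)                      # dg/dx_B
    dg_dNB = [Fr(2), Fr(3)]            # dg/dx_NB
    # grad_NB - grad_B * (dg/dx_B)^(-1) * dg/dx_NB, component by component
    return [gn - grad_B * (1 / dg_dB) * gj for gn, gj in zip(grad_NB, dg_dNB)]

start = (Fr(6), Fr(0), Fr(0))                 # a feasible starting point
optimum = (Fr(3, 7), Fr(6, 7), Fr(9, 7))      # constrained minimum of this example
rg_start = reduced_gradient(start)
rg_opt = reduced_gradient(optimum)
print(rg_start, rg_opt)
```

At the starting point the reduced gradient is nonzero, so $f$ can still be decreased by moving the nonbasic variables; at the constrained minimum it vanishes, in line with the necessary condition stated above.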
8.3.4 Linear Approximation

It must have been obvious to the reader who studied Chapter 2 that linear
programming is a very powerful tool. Hence it seems fruitful to consider the
possibility of converting nonlinear optimization problems into linear ones
so that L.P. theory can be applied.
One of the best known linearization methods is due to Wolfe (Abadie
1967). The feasible region defined by the constraints in (8.3) is approximated
by selecting a number of points called grid points and forming their convex
hull. The $r$ grid points $X_1, X_2, \ldots, X_r$ are chosen by methods described by
Wolfe. The function $f$ is approximated between a pair of grid points $X_{j-1}$
and $X_j$ by linear interpolation; i.e., if
$$X = \alpha X_{j-1} + (1 - \alpha) X_j, \quad j = 2, 3, \ldots, r, \quad 0 \le \alpha \le 1,$$
then
$$f(X) = f(\alpha X_{j-1} + (1 - \alpha) X_j) \approx \sum_{j=1}^{r} \alpha_j f(X_j), \quad \alpha_j \ge 0, \quad j = 1, 2, \ldots, r,$$
where
$$\sum_{j=1}^{r} \alpha_j = 1.$$
The grid points must be carefully selected so that only a small number of the
$\alpha_j$ are nonzero for the representation of any point within the convex hull.
The constraint functions in (8.3) can also be approximated by linear
interpolation:
$$h_i(X) \approx \sum_{j=1}^{r} \alpha_j h_i(X_j), \quad i = 1, 2, \ldots, k.$$
Thus the original problem can now be replaced by an approximate linear
programming problem:
Maximize:    $f = \sum_{j=1}^{r} \alpha_j f(X_j)$
subject to:  $\sum_{j=1}^{r} \alpha_j h_i(X_j) \le 0, \quad i = 1, 2, \ldots, k$
             $\sum_{j=1}^{r} \alpha_j = 1$
             $\alpha_j \ge 0, \quad j = 1, 2, \ldots, r.$
The expressions:
$$f(X_j), \quad h_i(X_j), \quad i = 1, 2, \ldots, k; \quad j = 1, 2, \ldots, r$$
are known constants and the decision variables in the L.P. are the $\alpha_j$'s.
Of course many grid points and hence many $\alpha_j$ are needed for a good
approximation of the nonlinear functions in (8.3). This is accomplished by
allowing the simplex method to choose which new grid points are best by
way of solving certain subproblems. Thus the approximation of (8.3) be-
comes increasingly more accurate as the optimum is approached.

8.3.4.1 The Method of Griffith and Stewart


Griffith and Stewart (1961) presented a linearization method which attacks
a nonlinear programming problem by starting with a feasible solution and
reducing each nonlinear function to a linear one by a Taylor series approxi-
mation. This produces a linear programming problem which is then solved to
yield another solution to the original problem. The cycle continues and a
sequence of L.P. problems are solved.
Consider the following problem:
Maximize:    $f(X)$
subject to:  $g_j(X) = 0, \quad j = 1, 2, \ldots, m$
             $0 \le x_i \le U_i, \quad i = 1, 2, \ldots, n.$
As in the generalized reduced gradient method, any inequality constraints can
be converted into equations. It is assumed that all of the above functions have
continuous first partial derivatives. Assume now that $X_0$ is a feasible solution.
Then the first-order Taylor approximations of the above functions are:
$$f(X_0 + h) \approx f(X_0) + \nabla f(X_0)^T h$$
$$g_j(X_0 + h) \approx g_j(X_0) + \nabla g_j(X_0)^T h = 0, \quad j = 1, 2, \ldots, m.$$
Now we attempt to find $h = (h_1, h_2, \ldots, h_n)$ such that $X_0 + h$ represents an
improvement over $X_0$, i.e.,
$$f(X_0 + h) > f(X_0).$$
Ignoring the constant $f(X_0)$ we
Maximize:    $\nabla f(X_0)^T h = \sum_{i=1}^{n} \frac{\partial f(X_0)}{\partial x_i} h_i$
subject to:  $\sum_{i=1}^{n} \frac{\partial g_j(X_0)}{\partial x_i} h_i = -g_j(X_0), \quad j = 1, 2, \ldots, m,$
which is a linear programming problem with variables $h_1, h_2, \ldots, h_n$. Of
course, these are unrestricted in sign, so the technique for converting the
problem to one with all nonnegative variables (given in Chapter 2) must be
used.
The solution to the above L.P. may produce a new point $X_0 + h$ which is
outside the feasible region of the original problem. Thus we must place
restrictions on the magnitude that the $h_i$'s can attain in the above L.P. to
ensure that this does not happen. Hence we add to the above L.P. the following
constraints:
$$-m_i \le h_i \le m_i, \quad i = 1, 2, \ldots, n.$$
Thus the method proceeds by establishing a feasible point $X_0$, constructing
an L.P. based on the Taylor approximations and the bounds on the $h_i$'s and
then solving this L.P. to produce an improved point $X_0 + h$. Once this new
point is found, the process is repeated until the improvement in $f$ from one
iteration to the next falls below some given level or two successive solutions
are sufficiently close together. The success of the method depends upon
choosing efficient $m_i$'s at each iteration. If the $m_i$ values are too large, the
method may produce an infeasible solution. However, relatively small $m_i$'s
lead to a large number of steps.

8.3.4.2 Separable Programming


Consider a special case of (8.1), (8.2), (8.3) in which
$$f(X) = \sum_{i=1}^{n} f_i(x_i)$$
$$g_j(X) = \sum_{i=1}^{n} g_{ij}(x_i), \quad j = 1, 2, \ldots, m$$
$$h_i(X) = -x_i, \quad i = 1, 2, \ldots, k = n$$
(i.e., $x_i \ge 0$, $i = 1, 2, \ldots, n$),
where
$$X = (x_1, x_2, \ldots, x_n)^T.$$
That is, each function in the problem can be expressed as the sum of a
number of functions of one variable. Such a problem is called a separable
programming problem.
In this section we develop a technique for approximating the above
problem by a linear programming formulation. We then show that when
f is concave and the constraint functions are all convex the approximating
technique can be made more efficient.
We begin by approximating each function by a piecewise linear function
as follows. First, we construct such an approximating function, $\hat{f}_i$, for each
$f_i$. Suppose that for a particular $i$, where $1 \le i \le n$, $f_i$ can be represented by
the graph in Figure 8.6. Suppose
$$0 \le x_i \le u_i$$
for any feasible solution. The values $u_i$ must be calculated by examining the
constraints. Suppose the interval over which $x_i$ is defined, $[0, u_i]$, is divided
into $p_i$ subintervals not necessarily of equal length:
$$[x_{i0}, x_{i1}], \; [x_{i1}, x_{i2}], \; \ldots, \; [x_{i,p_i-1}, x_{ip_i}],$$
where
$$x_{i0} < x_{i1} < x_{i2} < x_{i3} < \cdots < x_{i,p_i-1} < x_{ip_i}$$
and
$$x_{i0} = 0, \quad x_{ip_i} = u_i.$$
$\hat{f}_i$ is defined over each subinterval $[x_{i,k-1}, x_{ik}]$, $k = 1, 2, \ldots, p_i$ by the line
segment joining $(x_{i,k-1}, f_i(x_{i,k-1}))$ and $(x_{ik}, f_i(x_{ik}))$. That is, $\hat{f}_i$ is shown in
Figure 8.6 by the straight line segments approximating $f_i$. Formally:

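$$\hat{f}_i(x_i) = f_i(x_{i,k-1}) + \frac{f_i(x_{ik}) - f_i(x_{i,k-1})}{x_{ik} - x_{i,k-1}} (x_i - x_{i,k-1}), \quad x_i \in [x_{i,k-1}, x_{ik}]. \tag{8.23}$$
Now, as $x_i \in [x_{i,k-1}, x_{ik}]$, $x_i$ can be expressed as
$$x_i = \alpha_{i,k-1} x_{i,k-1} + \alpha_{ik} x_{ik}, \tag{8.24}$$
where
$$\alpha_{i,k-1}, \alpha_{ik} \ge 0$$
and
$$\alpha_{i,k-1} + \alpha_{ik} = 1. \tag{8.25}$$
By (8.24)
$$x_i - x_{i,k-1} = \alpha_{ik} x_{ik} - (1 - \alpha_{i,k-1}) x_{i,k-1}.$$
By (8.25)
$$x_i - x_{i,k-1} = \alpha_{ik} (x_{ik} - x_{i,k-1}).$$
Thus (8.23) becomes
$$\hat{f}_i(x_i) = f_i(x_{i,k-1}) + [f_i(x_{ik}) - f_i(x_{i,k-1})] \alpha_{ik}.$$
By (8.25)
$$\hat{f}_i(x_i) = \alpha_{i,k-1} f_i(x_{i,k-1}) + \alpha_{ik} f_i(x_{ik}). \tag{8.26}$$
In general
$$\hat{f}_i(x_i) = \alpha_{i,k-1} f_i(x_{i,k-1}) + \alpha_{ik} f_i(x_{ik}), \quad x_i \in [x_{i,k-1}, x_{ik}], \tag{8.27}$$
where
$$\alpha_{i,k-1} + \alpha_{ik} = 1$$
and
$$\alpha_{i,k-1}, \alpha_{ik} \ge 0, \quad k = 1, 2, \ldots, p_i.$$

Figure 8.6. A piecewise linear approximation of $f_i$.
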
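The family of $p_i$ equations given by (8.27) can be combined as
$$\hat{f}_i(x_i) = \alpha_{i0} f_i(x_{i0}) + \alpha_{i1} f_i(x_{i1}) + \cdots + \alpha_{ip_i} f_i(x_{ip_i}),$$
where
$$\alpha_{i0} + \alpha_{i1} + \cdots + \alpha_{ip_i} = 1,$$
$$\alpha_{ik} \ge 0, \quad k = 0, 1, \ldots, p_i.$$
And for each $i$, $1 \le i \le n$, at most two $\alpha_{ik}$'s can be positive, and if two are
positive they must be adjacent. That is, if
$$\alpha_{ik} > 0$$
for some $k \in \{0, 1, \ldots, p_i\}$, then at most one of
$$\alpha_{i,k-1} > 0, \quad \alpha_{i,k+1} > 0 \tag{8.27a}$$
holds and all other $\alpha_{ik}$'s are zero.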
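
The weights in this representation are easy to compute for a given point. The Python sketch below is an illustration only: the function names, the breakpoints $0, 2, \ldots, 10$, and the test function $x^2$ are hypothetical choices made here. It returns the weights for a point $x_i$ and evaluates the piecewise linear approximation as the weighted sum of breakpoint values.

```python
def interpolation_weights(breakpoints, x):
    """Weights alpha_0..alpha_p: at most two are positive, and those are adjacent."""
    p = len(breakpoints) - 1
    alphas = [0.0] * (p + 1)
    for k in range(1, p + 1):
        lo, hi = breakpoints[k - 1], breakpoints[k]
        if lo <= x <= hi:
            a = (x - lo) / (hi - lo)
            alphas[k - 1] = 1.0 - a      # weight on the left end of the subinterval
            alphas[k] = a                # weight on the right end of the subinterval
            return alphas
    raise ValueError("x lies outside the breakpoint range")

def piecewise_value(func, breakpoints, x):
    """Piecewise linear approximation of func at x as a weighted sum of breakpoint values."""
    alphas = interpolation_weights(breakpoints, x)
    return sum(a * func(b) for a, b in zip(alphas, breakpoints))

bps = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]    # assumed breakpoints
alphas = interpolation_weights(bps, 5.5)
approx = piecewise_value(lambda z: z * z, bps, 5.5)
print(alphas, approx)
```

Here the weights reproduce the point itself (the weighted sum of the breakpoints is 5.5), and the approximation of $x^2$ at 5.5 is 31 against the true value 30.25, which shows the interpolation error for a convex function on a coarse grid.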
We define the approximating functions for all of the functions in the
problem in this way:
$$\hat{f}_i(x_i) = \sum_{k=0}^{p_i} \alpha_{ik} f_i(x_{ik}), \quad i = 1, 2, \ldots, n, \tag{8.28}$$
$$\hat{g}_{ij}(x_i) = \sum_{k=0}^{p_i} \alpha_{ik} g_{ij}(x_{ik}), \quad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, m. \tag{8.29}$$
We can now formulate an approximation of the original problem by
substituting $\hat{f}_i$ for $f_i$ and $\hat{g}_{ij}$ for $g_{ij}$ using (8.28) and (8.29):
Maximize:    $\sum_{i=1}^{n} \sum_{k=0}^{p_i} \alpha_{ik} f_i(x_{ik})$
subject to:  $\sum_{i=1}^{n} \sum_{k=0}^{p_i} \alpha_{ik} g_{ij}(x_{ik}) = 0, \quad j = 1, 2, \ldots, m,$
             $x_i \ge 0, \quad i = 1, 2, \ldots, n,$
             $\sum_{k=0}^{p_i} \alpha_{ik} = 1, \quad i = 1, 2, \ldots, n,$
             $\alpha_{ik} \ge 0, \quad k = 0, 1, \ldots, p_i, \quad i = 1, 2, \ldots, n.$
And constraint (8.27a) holds.
This is a linear programming problem in which the $f_i(x_{ik})$'s and the
$g_{ij}(x_{ik})$'s are constants and the $\alpha_{ik}$'s are the decision variables. (8.27a) poses
a minor problem in that it imposes restrictions on which variables can enter
the basis if the simplex method is used to solve it. This means that a check
must be made at each iteration to ensure that (8.27a) is not violated when a new
variable is brought into the basis. If the selection criterion specifies that the
variable to enter is such that (8.27a) is violated, then that variable is ignored
and the criterion is applied anew. This type of strategy will also appear in
the quadratic programming technique of Section 8.3.5.
Unfortunately there is no guarantee that the solution to the L.P. problem
will be even feasible let alone optimal for the original problem. This is
because the approximate feasible region may yield an optimal point which
lies outside the original feasible region. Even if the optimal point for the
L.P. is feasible it may be only a local maximum.
When each $f_i$ is concave, $f$, being the sum of a number of concave functions,
is also concave. (The proof of this fact is left as an exercise.) Similarly, when
each $g_{ij}$ is convex, each $g_j$ is convex. Now the necessary conditions for a
point to be a global maximum are sufficient when $f$ is concave and the
feasible region is convex. As the feasible region is defined by a set of convex
functions it is convex. Thus if each $f_i$ is concave and each $g_{ij}$ is convex, any
stationary point will be a global maximum.
In this case the optimal solution to the approximating problem will be
optimal for the original problem. To see why this is so we formulate a new
approximating problem. Let
$$x_i = y_{i1} + y_{i2} + \cdots + y_{ip_i},$$
where
$$y_{ih} = \begin{cases} x_{ih} - x_{i,h-1}, & h < k \\ x_i - x_{i,k-1}, & h = k \\ 0, & h > k \end{cases} \tag{8.30}$$
when
$$x_{i,k-1} \le x_i \le x_{ik}.$$
As an example, if
$$x_{iq} = 2q, \quad q = 0, 1, 2, \ldots, 5,$$
as in Figure 8.7:
$$p_i = 5$$
and
$$x_i = 5.5,$$
then
$$k = 3,$$
i.e., $x_i$ lies in the 3rd interval. In this case
$$y_{i1} = x_{i1} - x_{i0} = 2 - 0 = 2$$
$$y_{i2} = x_{i2} - x_{i1} = 4 - 2 = 2$$
$$y_{i3} = x_i - x_{i2} = 5.5 - 4 = 1.5$$
$$y_{i4} = y_{i5} = 0.$$
Thus
$$x_i = y_{i1} + y_{i2} + y_{i3} + y_{i4} + y_{i5} = 2 + 2 + 1.5 + 0 + 0 = 5.5.$$
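
The decomposition in this example can be reproduced with a few lines of Python. The sketch below assumes the breakpoints $x_{iq} = 2q$ of the example; the function name `increments` is a hypothetical choice. Each $y_{ir}$ is the part of $x_i$ that falls inside the $r$th subinterval, clipped to the length of that subinterval.

```python
def increments(breakpoints, x):
    """Split x into y_1..y_p, where y_r is the portion of x lying in the r-th subinterval."""
    ys = []
    for r in range(1, len(breakpoints)):
        lo, hi = breakpoints[r - 1], breakpoints[r]
        # Full subinterval length below x, a partial length in x's own subinterval, 0 above.
        ys.append(min(max(x - lo, 0.0), hi - lo))
    return ys

bps = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]   # x_iq = 2q, q = 0, 1, ..., 5
ys = increments(bps, 5.5)
print(ys, sum(ys))
```

The output agrees with the values $y_{i1} = 2$, $y_{i2} = 2$, $y_{i3} = 1.5$, $y_{i4} = y_{i5} = 0$ obtained above.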
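Consider $\hat{f}_i(x_i)$ where
$$x_{i,k-1} \le x_i \le x_{ik}.$$
By (8.23)
$$\hat{f}_i(x_i) = f_i(x_{i,k-1}) + \frac{f_i(x_{ik}) - f_i(x_{i,k-1})}{x_{ik} - x_{i,k-1}} (x_i - x_{i,k-1}).$$
Since
$$y_{ih} = x_{ih} - x_{i,h-1}, \quad h < k,$$
$$\hat{f}_i(x_i) = f_i(x_{i0}) + \sum_{r=1}^{k} \frac{f_i(x_{ir}) - f_i(x_{i,r-1})}{x_{ir} - x_{i,r-1}} \, y_{ir}.$$
The expression
$$\frac{f_i(x_{ir}) - f_i(x_{i,r-1})}{x_{ir} - x_{i,r-1}}$$
is an approximation of the slope of $f_i$ over the interval $[x_{i,r-1}, x_{ir}]$, which
we denote by $f_{ir}$. Therefore
$$\hat{f}_i(x_i) = f_i(x_{i0}) + \sum_{r=1}^{p_i} f_{ir} y_{ir}.$$

Figure 8.7. The breakpoints $x_{i0} = 0$, $x_{i1} = 2$, $x_{i2} = 4$, $x_{i3} = 6$, $x_{i4} = 8$, $x_{i5} = 10$ and the point $x_i = 5.5$.

Similar approximations can be constructed for each $g_{ij}$:
$$\hat{g}_{ij}(x_i) = g_{ij}(x_{i0}) + \sum_{r=1}^{p_i} g_{ijr} y_{ir},$$
where
$$g_{ijr} = \frac{g_{ij}(x_{ir}) - g_{ij}(x_{i,r-1})}{x_{ir} - x_{i,r-1}}.$$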
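Summing all expressions of each type we have
$$\hat{f}(X) = \sum_{i=1}^{n} f_i(x_{i0}) + \sum_{i=1}^{n} \sum_{r=1}^{p_i} f_{ir} y_{ir}$$
$$\hat{g}_j(X) = \sum_{i=1}^{n} g_{ij}(x_{i0}) + \sum_{i=1}^{n} \sum_{r=1}^{p_i} g_{ijr} y_{ir}.$$
These expressions can be used to formulate a new linear programming
problem:
Maximize:    $\sum_{i=1}^{n} \sum_{r=1}^{p_i} f_{ir} y_{ir}$
subject to:  $\sum_{i=1}^{n} \sum_{r=1}^{p_i} g_{ijr} y_{ir} \le -\sum_{i=1}^{n} g_{ij}(x_{i0}), \quad j = 1, 2, \ldots, m,$
             $0 \le y_{ir} \le x_{ir} - x_{i,r-1}, \quad i = 1, 2, \ldots, n, \quad r = 1, 2, \ldots, p_i.$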
Note that the expression
$$\sum_{i=1}^{n} f_i(x_{i0})$$
has been omitted as it is a constant. The terms $f_{ir}$, $g_{ijr}$, $g_{ij}(x_{i0})$, $x_{ir}$ are all
constants and the $y_{ir}$'s are the decision variables. Conditions (8.30) have
not been included in the above formulation as they are implicitly satisfied
when $f$ is concave and the $g_j$'s are convex.
This is because for any $i$ and $r$:
$$f_{ir} \le f_{is} \quad \text{for all } s < r, \tag{8.31}$$
as $f_i$ is concave. That is, the slope of $f_i(x_i)$ decreases as $x_i$ increases by virtue
of the nature of concavity. Therefore the objective function coefficients $f_{ir}$
are automatically assembled in nonincreasing order for each given $i$. However,
by analogous reasoning, for any pair $i$ and $j$
$$g_{ijr} \ge g_{ijs} \quad \text{for all } s < r. \tag{8.32}$$
Thus the technological coefficients $g_{ijr}$ are automatically assembled in
nondecreasing order for each pair $i$ and $j$.
Equations (8.31) and (8.32) together imply that for a given $i$, the order
preference of the variables is: $y_{i1}, y_{i2}, \ldots, y_{ip_i}$. Hence conditions (8.30) will
be satisfied automatically. The very last family of constraints can be ignored,
as $f$ is concave and the $g_j$'s are convex.

8.3.5 Quadratic Programming

There exists a special class of nonlinear optimization problems called
quadratic programming (Q.P.), for which a considerable amount of theory
has been developed. The maximization version of the Q.P. problem is given
below:
Maximize:    $f(X) = C^T X + X^T D X$
subject to:  $AX \le B$
             $X \ge 0.$
$X = (x_1, x_2, \ldots, x_n)^T$ is the vector of decision variables, $C$ is an $n \times 1$ vector
and $B$ an $m \times 1$ vector, both of given real numbers. $A$ and $D$ are real matrices
of appropriate dimensions, and $D$ is assumed symmetric negative definite.
This last assumption means that $f$ is strictly concave (see Section 9.1.6
in the Appendix). As can be seen, $f$ is a quadratic function subject to only
linear constraints. The solution to this problem can be obtained by applying
the Kuhn-Tucker conditions of Section 7.4.2.1. As $f$ is strictly concave and
the constraint set defines a convex region, satisfaction of the Kuhn-Tucker
necessary conditions guarantees a global optimum. The constraints, including
the nonnegativity conditions, can be rewritten in a form compatible
with Section 7.4.2.1:
$$AX - B \le 0 \tag{8.33a}$$
$$-X \le 0. \tag{8.33b}$$

Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be the Lagrange multipliers associated with (8.33a), where
$A$ is $m \times n$. Let $\delta_1, \delta_2, \ldots, \delta_n$ be the Lagrange multipliers associated with
(8.33b). Then the Kuhn-Tucker conditions yield (on dropping the *'s):
$$C^T + 2X^T D - \lambda A + \delta = 0$$
$$AX - B \le 0$$
$$-X \le 0$$
$$\lambda (AX - B) = 0$$
$$\delta X = 0$$
$$\lambda \ge 0$$
$$\delta \ge 0,$$
where
$$\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_m)$$
$$\delta = (\delta_1, \delta_2, \ldots, \delta_n).$$
Introducing slack variables $s_1, s_2, \ldots, s_m$, where
$$S = (s_1, s_2, \ldots, s_m)^T$$
we obtain
$$AX + S = B.$$
The conditions can be rearranged as follows:
$$-2X^T D + \lambda A - \delta = C^T \tag{8.34}$$
$$AX + S = B \tag{8.35}$$
$$\delta X = 0 \tag{8.36}$$
$$\lambda S = 0 \tag{8.37}$$
$$\lambda \ge 0, \quad \delta \ge 0, \quad S \ge 0, \quad X \ge 0.$$
The problem is now to solve (8.34) and (8.35) while also satisfying (8.36) and
(8.37). Because $f$ is strictly concave and the feasible region is convex, the
solution found must be optimal for the original problem. Thus it is enough
to find a feasible solution to the system (8.34), (8.35) viewed as the constraint
set of an L.P. problem. The only restrictions are (8.36) and (8.37), which
imply that $\delta_j$ and $x_j$ or $\lambda_i$ and $s_i$ cannot both be simultaneously positive for
any $i$ or $j$. Restrictions of this type occurred in the separable programming
method of Section 8.3.4.2.
The solution is found by using phase I of the two-phase method of Section
2.5.4, making sure that (8.36) and (8.37) are never violated. In practice, this
means that, if one of $\delta_j$ and $x_j$ or $\lambda_i$ and $s_i$ is in the basis (assuming no
degeneracy), then the other cannot enter the basis. When phase I has been
completed, the optimal solution (if it exists) will have been found. As described
in Chapter 2, if all the artificial variables are zero on termination of
phase I, the problem has a feasible solution; otherwise it has no feasible
solution.

8.3.5.1 Examples of Q.P.

Consider the following problem:
Maximize:    $f(X) = f(x_1, x_2) = 3x_1 + 2x_2 - x_1^2 - x_1 x_2 - x_2^2,$
subject to:  $4x_1 + 5x_2 \le 20$
             $x_1, x_2 \ge 0.$
Written in matrix form, the problem is:
Maximize:    $f(X) = (3, 2) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + (x_1, x_2) \begin{pmatrix} -1 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & -1 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$
Thus
$$C^T = (3, 2)$$
$$D = \begin{pmatrix} -1 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & -1 \end{pmatrix}$$
$$A = (4, 5)$$
$$B = 20.$$
We need one Lagrange multiplier $\lambda_1$ to be associated with the constraint
$$4x_1 + 5x_2 \le 20$$
and $\delta_1$ and $\delta_2$ to be associated with the nonnegativity conditions.
Substituting all this into the rearranged Kuhn-Tucker conditions yields:
$$-2 (x_1, x_2) \begin{pmatrix} -1 & -\tfrac{1}{2} \\ -\tfrac{1}{2} & -1 \end{pmatrix} + \lambda_1 (4, 5) - (\delta_1, \delta_2) = (3, 2),$$
$$(4, 5) \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} + s_3 = 20,$$
where the vector of slack variables $S$ consists of a single variable, say $s_3$.
Writing out the first two equations and introducing artificial variables $s_1$
and $s_2$ in the first two, we have:
$$2x_1 + x_2 + 4\lambda_1 - \delta_1 + s_1 = 3,$$
$$x_1 + 2x_2 + 5\lambda_1 - \delta_2 + s_2 = 2,$$
$$4x_1 + 5x_2 + s_3 = 20.$$
Recall that in the two-phase method the objective is to remove the artificial
variables from the basis by minimizing their sum, i.e.,
Minimize:    $s_0 = s_1 + s_2.$
In tabular form the system is:

          x1     x2     λ1     δ1     δ2     s1     s2     s3    r.h.s.
           2      1      4     -1      0      1      0      0      3
           1      2      5      0     -1      0      1      0      2
           4      5      0      0      0      0      0      1     20
  s0       0      0      0      0      0      1      1      0      0

In canonical form:

          x1     x2     λ1     δ1     δ2     s1     s2     s3    r.h.s.
           2      1      4     -1      0      1      0      0      3
           1      2      5      0     -1      0      1      0      2
           4      5      0      0      0      0      0      1     20
  s0      -3     -3     -9      1      1      0      0      0     -5

We apply phase I to this tableau, taking care that none of the pairs: $\delta_1, x_1$;
$\delta_2, x_2$; $\lambda_1, s_3$ are simultaneously positive. $\lambda_1$ cannot enter the basis as $s_3 \ne 0$.
But $x_1$ can enter the basis as $\delta_1 = 0$:

          x1     x2     λ1     δ1     δ2     s1     s2     s3    r.h.s.
           1     1/2     2    -1/2     0     1/2     0      0     3/2
           0     3/2     3     1/2    -1    -1/2     1      0     1/2
           0      3     -8      2      0     -2      0      1     14
  s0       0    -3/2    -3    -1/2     1     3/2     0      0    -1/2

Now $x_2$ can enter the basis as $\delta_2 = 0$:

          x1     x2     λ1     δ1     δ2     s1     s2     s3    r.h.s.
           1      0      1    -2/3    1/3    2/3   -1/3     0     4/3
           0      1      2     1/3   -2/3   -1/3    2/3     0     1/3
           0      0    -14      1      2     -1     -2      1     13
  s0       0      0      0      0      0      1      1      0      0

Phase I has now been completed with $s_0 = 0$. Thus the original problem
does have a feasible solution. Indeed the optimal solution to the original
problem can be found from this tableau with
$$x_1^* = \tfrac{4}{3},$$
$$x_2^* = \tfrac{1}{3},$$
and
$$f(x_1^*, x_2^*) = \tfrac{7}{3}.$$
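
The solution read from the final tableau can be checked against the Kuhn-Tucker system directly. The Python sketch below assumes the right-hand sides of the final tableau are $4/3$, $1/3$ and $13$, with $\lambda_1 = \delta_1 = \delta_2 = 0$ nonbasic, and verifies the two stationarity equations, the slack value, and the complementarity conditions using exact rational arithmetic. The variable names are chosen here for illustration.

```python
from fractions import Fraction as Fr

# Candidate solution read from the final phase I tableau (assumed values).
x1, x2 = Fr(4, 3), Fr(1, 3)
lam, d1, d2 = Fr(0), Fr(0), Fr(0)      # nonbasic multipliers
s3 = 20 - (4 * x1 + 5 * x2)            # slack in 4*x1 + 5*x2 <= 20

# Stationarity equations with the artificial variables at zero.
stationarity = (
    2 * x1 + x2 + 4 * lam - d1 == 3,
    x1 + 2 * x2 + 5 * lam - d2 == 2,
)
# Complementarity: each multiplier times its partner must vanish.
complementarity = (d1 * x1 == 0, d2 * x2 == 0, lam * s3 == 0)

f_star = 3 * x1 + 2 * x2 - x1 ** 2 - x1 * x2 - x2 ** 2
print(stationarity, complementarity, s3, f_star)
```

Because the constraint is inactive at this point ($s_3 = 13 > 0$), the solution coincides with the unconstrained maximum of $f$, which is consistent with $\lambda_1 = 0$.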

8.3.6 Geometric Programming

Geometric programming is a technique developed in the late 1960's for
solving a certain class of nonlinear programming problems. Although this
class includes certain kinds of problems which have constraints, the only
constraints we allow here are that the decision variables must all be strictly
positive. For the general techniques the reader should refer to the book
written by the inventors of the subject, Duffin, Peterson and Zener (1967)
and to Beightler and Phillips (1976) for applications.
Geometric programming is concerned with the optimization of the class
of functions of the form:
Minimize:    $f(X) = \sum_{j=1}^{m} c_j \prod_{i=1}^{n} x_i^{a_{ij}}$
subject to:  $x_i > 0, \quad i = 1, 2, \ldots, n,$
where
$$c_j > 0, \quad j = 1, 2, \ldots, m,$$
and $a_{ij}$, $i = 1, 2, \ldots, n$; $j = 1, 2, \ldots, m$ are arbitrary real numbers. Note that
$f(X)$ is not in general a polynomial as the $a_{ij}$'s may possibly be negative. As
the coefficients $c_j$ of the terms of $f(X)$ must be positive, Duffin, Peterson,
and Zener call $f(X)$ a posynomial.
Let $X^*$ be the optimal solution to the above problem. Then if we define
$$p_j(X) = \prod_{i=1}^{n} x_i^{a_{ij}}, \quad j = 1, 2, \ldots, m,$$
we can express $f$ as
$$f(X) = \sum_{j=1}^{m} c_j p_j(X).$$
As each $x_i$ is constrained to be positive, each term $c_j p_j(X)$ must be positive
when $X = X^*$. Thus once we know the value of $f(X^*)$ we can calculate the
fractional contribution $w_j$ made to it by the $j$th term, that is
$$w_j = \frac{c_j p_j(X^*)}{f(X^*)}, \quad j = 1, 2, \ldots, m.$$
Of course
$$\sum_{j=1}^{m} w_j = 1. \tag{8.38}$$
(8.38) is called the normality condition. Also
$$0 \le w_j \le 1, \quad j = 1, 2, \ldots, m.$$
The fractions $w_j$ are called weights.
Applying the necessary conditions for $f$ to have a minimum at $X^*$ we
have
$$\frac{\partial f(X^*)}{\partial x_k} = \sum_{j=1}^{m} c_j a_{kj} (x_k^*)^{a_{kj} - 1} \prod_{i \ne k} (x_i^*)^{a_{ij}} = 0, \quad k = 1, 2, \ldots, n,$$
where
$$X^* = (x_1^*, x_2^*, \ldots, x_n^*).$$
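Multiplying the $k$th equation by $x_k^*$ $(> 0)$ these conditions reduce to
$$\sum_{j=1}^{m} c_j a_{kj} \prod_{i=1}^{n} (x_i^*)^{a_{ij}} = 0, \quad k = 1, 2, \ldots, n$$
or
$$\sum_{j=1}^{m} c_j a_{kj} p_j(X^*) = 0, \quad k = 1, 2, \ldots, n.$$
Dividing by $f(X^*)$ we obtain
$$\sum_{j=1}^{m} a_{kj} \frac{c_j p_j(X^*)}{f(X^*)} = 0, \quad k = 1, 2, \ldots, n,$$
or
$$\sum_{j=1}^{m} a_{kj} w_j = 0, \quad k = 1, 2, \ldots, n. \tag{8.39}$$
(8.39) is called the set of orthogonality conditions.
Now
$$f(X^*) = \{f(X^*)\}^{1} = \{f(X^*)\}^{\sum_{j=1}^{m} w_j} = \prod_{j=1}^{m} \{f(X^*)\}^{w_j}$$
$$= \prod_{j=1}^{m} \left( \frac{c_j p_j(X^*)}{w_j} \right)^{w_j} = \prod_{j=1}^{m} \left( \frac{c_j}{w_j} \right)^{w_j} \prod_{i=1}^{n} (x_i^*)^{\sum_{j=1}^{m} a_{ij} w_j} = \prod_{j=1}^{m} \left( \frac{c_j}{w_j} \right)^{w_j},$$
by the definition of the weights and (8.39).
As the $c_j$'s are given constants, once the weights are found we can compute
$f(X^*)$. The $w_j$'s are found using (8.38) and (8.39) which represent a system of
$(n + 1)$ linear equations in $m$ unknowns. When $n + 1 = m$ the system can be
solved by conventional methods (such as the Newton-Raphson method
referenced earlier). When $m$ exceeds $(n + 1)$ special techniques must be
employed to find the optimal weights. Indeed the more $m$ exceeds $(n + 1)$,
the harder the problem. This has led to the quantity $m - (n + 1)$ being called
the degree of difficulty of the problem.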
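We now solve a problem with degree of difficulty zero:
Minimize:    $f(X) = f(x_1, x_2, x_3) = 2x_1 x_2^{-1} + 3x_2 x_3^{-2} + 2x_1^{-2} x_2 x_3 + x_1 x_2$
subject to:  $x_1, x_2, x_3 > 0.$
Now
$$C = (c_1, c_2, c_3, c_4) = (2, 3, 2, 1),$$
$$p_1(X) = x_1 x_2^{-1} x_3^{0},$$
$$p_2(X) = x_1^{0} x_2 x_3^{-2},$$
$$p_3(X) = x_1^{-2} x_2 x_3,$$
$$p_4(X) = x_1 x_2 x_3^{0}.$$
Thus
$$(a_{ij}) = \begin{pmatrix} 1 & 0 & -2 & 1 \\ -1 & 1 & 1 & 1 \\ 0 & -2 & 1 & 0 \end{pmatrix}.$$
The orthogonality and normality conditions are
$$\begin{pmatrix} 1 & 0 & -2 & 1 \\ -1 & 1 & 1 & 1 \\ 0 & -2 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$
Thus we have a system of four linear equations in four unknowns with zero
degree of difficulty. This system has a unique solution:
$$w_1^* = \tfrac{7}{14}, \quad w_2^* = \tfrac{2}{14}, \quad w_3^* = \tfrac{4}{14}, \quad w_4^* = \tfrac{1}{14}.$$
Thus
$$f(X^*) = \left( \frac{2}{7/14} \right)^{7/14} \left( \frac{3}{2/14} \right)^{2/14} \left( \frac{2}{4/14} \right)^{4/14} \left( \frac{1}{1/14} \right)^{1/14} = 6.50491068$$
and
$$2x_1^* (x_2^*)^{-1} = \tfrac{7}{14}(6.504)$$
$$3x_2^* (x_3^*)^{-2} = \tfrac{2}{14}(6.504)$$
$$2(x_1^*)^{-2} x_2^* x_3^* = \tfrac{4}{14}(6.504)$$
so that
$$x_1^* = 0.869255252, \quad x_2^* = 0.534522483, \quad x_3^* = 1.313626700.$$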
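
The zero-degree-of-difficulty calculation can be reproduced numerically. The Python sketch below assumes the exponent matrix of this example has rows $(1, 0, -2, 1)$, $(-1, 1, 1, 1)$ and $(0, -2, 1, 0)$. It solves the orthogonality and normality conditions exactly with a small Gauss-Jordan routine written here for illustration, forms the product of the $(c_j / w_j)^{w_j}$, and then recovers the variables from the relations $c_j p_j(X^*) = w_j f(X^*)$.

```python
from fractions import Fraction as Fr

def solve(A, b):
    """Gauss-Jordan elimination with exact rational arithmetic (square, nonsingular A)."""
    n = len(A)
    M = [[Fr(v) for v in row] + [Fr(bv)] for row, bv in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

c = [2, 3, 2, 1]
A = [[1, 0, -2, 1],     # exponents of x1 in the four terms
     [-1, 1, 1, 1],     # exponents of x2
     [0, -2, 1, 0],     # exponents of x3
     [1, 1, 1, 1]]      # normality condition
w = solve(A, [0, 0, 0, 1])

f_star = 1.0
for cj, wj in zip(c, w):
    f_star *= (cj / float(wj)) ** float(wj)

# Recover the variables from c_j * p_j(X*) = w_j * f(X*).
x2 = (2 * float(w[3]) / float(w[0])) ** 0.5       # ratio of term 1 to term 4
x1 = float(w[3]) * f_star / x2                    # term 4: x1 * x2 = w4 * f*
x3 = (3 * x2 / (float(w[1]) * f_star)) ** 0.5     # term 2: 3 * x2 / x3^2 = w2 * f*
print(w, f_star, x1, x2, x3)
```

Substituting the recovered values back into the third term, $2 x_1^{-2} x_2 x_3$, gives $w_3 f(X^*)$ again, which is a convenient consistency check on the three nonlinear equations.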
Note that the final set of equations solved is nonlinear. However, one can
linearize these by taking logarithms. Further, it is interesting to note that
the above derivation of (8.38) and (8.39) does not rely on the $c_j$'s. Thus
the optimal weights $w_j^*$ are independent of these values. Hence for the above
problem, the weights found are optimal for any set of (positive) $c_j$ values. Of
course the optimal solution value $f(X^*)$ will change as the $c_j$'s change. We end
this section with a brief guide to solving problems with positive degree of
difficulty by working through a numerical example.
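Consider the following problem:
Minimize: $f(X) = f(x_1, x_2) = 2x_1x_2^{-1} + 3x_2 + 2x_1^{-2} + x_1x_2^{-3}$
subject to: $x_1, x_2 > 0$.
The orthogonality and normality conditions are:
$$\begin{pmatrix} 1 & 0 & -2 & 1 \\ -1 & 1 & 0 & -3 \\ 1 & 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
As the number of unknowns is one more than the number of constraints,
the degree of difficulty of the problem is one. Solving for the first three
weights in terms of the fourth, we obtain
$$w_1 = \tfrac{2}{5}\left(1 - \tfrac{9}{2}w_4\right),$$
$$w_2 = \tfrac{2}{5}(1 + 3w_4),$$
$$w_3 = \tfrac{1}{5}(1 - 2w_4).$$
There is an infinite number of solutions to this system, and we now set about
selecting the optimal one. Recall that
$$f(X^*) = \prod_{j=1}^{m} \left(\frac{c_j}{w_j}\right)^{w_j}.$$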

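Thus in this case
$$f(X^*) = \left(\frac{2}{\tfrac{2}{5}(1 - \tfrac{9}{2}w_4)}\right)^{\tfrac{2}{5}(1 - \tfrac{9}{2}w_4)} \left(\frac{3}{\tfrac{2}{5}(1 + 3w_4)}\right)^{\tfrac{2}{5}(1 + 3w_4)} \left(\frac{2}{\tfrac{1}{5}(1 - 2w_4)}\right)^{\tfrac{1}{5}(1 - 2w_4)} \left(\frac{1}{w_4}\right)^{w_4}.$$
Now $\ln f(X^*)$ will be minimal when $f(X^*)$ is minimal. Taking logarithms of
both sides, we get
$$\ln f(X^*) = \tfrac{2}{5}\left(1 - \tfrac{9}{2}w_4\right)[\ln 10 - \ln(2 - 9w_4)] + \tfrac{2}{5}(1 + 3w_4)[\ln 15 - \ln(2 + 6w_4)] + \tfrac{1}{5}(1 - 2w_4)[\ln 10 - \ln(1 - 2w_4)] + w_4[\ln 1 - \ln w_4].$$
Employing the necessary conditions for $\ln f$ to have a minimum at X*, we
get
$$\frac{\partial \ln f(X^*)}{\partial w_4} = -\tfrac{9}{5}[\ln 10 - \ln(2 - 9w_4)] + \tfrac{9}{5} + \tfrac{6}{5}[\ln 15 - \ln(2 + 6w_4)] - \tfrac{6}{5} - \tfrac{2}{5}[\ln 10 - \ln(1 - 2w_4)] + \tfrac{2}{5} + [\ln 1 - \ln w_4] - 1 = 0.$$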

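Hence
$$\tfrac{9}{5}\ln(2 - 9w_4) - \tfrac{6}{5}\ln(2 + 6w_4) + \tfrac{2}{5}\ln(1 - 2w_4) - \ln w_4 = 1.81602693,$$
so that
$$w_4^* \approx 0.07980696,$$
$$w_1^* = 0.25634747,$$
$$w_2^* = 0.49576835,$$
$$w_3^* = 0.16807722.$$
Also
$$f(X^*) = \left(\frac{2}{w_1^*}\right)^{w_1^*} \left(\frac{3}{w_2^*}\right)^{w_2^*} \left(\frac{2}{w_3^*}\right)^{w_3^*} \left(\frac{1}{w_4^*}\right)^{w_4^*} = (1.69322)(2.44125)(1.51625)(1.22356) = 7.66869727.$$
Further
$$w_3^* = \frac{2(x_1^*)^{-2}}{f(X^*)}, \qquad w_2^* = \frac{3x_2^*}{f(X^*)},$$
so that
$$x_1^* = 1.24566074,$$
$$x_2^* = 1.26729914.$$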
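The stationarity equation in $w_4$ has no closed-form solution, so it must be solved numerically. The sketch below uses plain bisection, which is an assumption of this illustration rather than the method used in the text. It relies on the left-hand side being strictly decreasing in $w_4$ on the feasible interval $0 < w_4 < 2/9$, where all four weights are positive.

```python
import math

C = [2.0, 3.0, 2.0, 1.0]

def weights(w4):
    """First three weights expressed in terms of the fourth."""
    return [0.4 * (1 - 4.5 * w4), 0.4 * (1 + 3 * w4), 0.2 * (1 - 2 * w4), w4]

def stationarity(w4):
    """d(ln f)/d(w4) with the constant terms collected on the right-hand side."""
    rhs = 2.2 * math.log(10) - 1.2 * math.log(15)   # = 1.81602693...
    lhs = (1.8 * math.log(2 - 9 * w4) - 1.2 * math.log(2 + 6 * w4)
           + 0.4 * math.log(1 - 2 * w4) - math.log(w4))
    return lhs - rhs

# Bisection: the function is positive near 0 and negative near 2/9.
lo, hi = 1e-9, 2 / 9 - 1e-9
for _ in range(200):
    mid = (lo + hi) / 2
    if stationarity(mid) > 0:
        lo = mid
    else:
        hi = mid

w4 = (lo + hi) / 2
w = weights(w4)
f_star = math.prod((c / wj) ** wj for c, wj in zip(C, w))
x1 = math.sqrt(2 / (w[2] * f_star))   # from w3 = 2 x1^(-2) / f
x2 = w[1] * f_star / 3                # from w2 = 3 x2 / f

print(w, f_star, x1, x2)
```

Any one-dimensional root finder would serve here; bisection is chosen only because it needs no derivative and cannot leave the feasible interval.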

8.4 Exercises

(I) Computational

1. Suppose it is wished to locate the maximum value of the following functions within
the given interval I. Reduce the interval to within 10% of its original length
using Fibonacci search.
(a) a(S) = - 2S 2 + S + 4, I = ( - 5, 5), e = ~
[1* = (- /3, N7)].
(b) a(S) = _4S2 + 2S + 2, 1= (-6,6), e = lo
[1* = ( - 0.46, 0.56)],
(c) a(S) = S3 + 6S 2 + 5S - 12, I = (- 5, - 2), e = lo
[1* = (- 3.615, - 3.384)].
(d) a(S) = 2S - S2, 1= (0,3), e = 160
[1* = m,H)].
(e) a(S) = S2 - S - 10, I = (-10,10), e = t
[1* = (-1, ¥o)].
(f) a(S) = -(S + 6)2 + 4, I = (-10,10), e = t
[1* = ( - 6.924, - 5.386)].

(g) a(S) = 3S 2+ 2S + 1, 1 = ( - 5, 5),8 = l2


[1* = ( - 0.383, 0.467)]'
(h) a(S) = - S2 + 4S + 4, 1 = ( - 5, 5), 8 = to
[1* = m~,H)]'
(i) a(S) = S3 - 2S 2 +S- 4,1= (-tt), 8 = iz
[1* = (t, t)].
(j) a(S) = - S2 + 4S - 3, 1 = (0,5), 8 = t
[1* = (1j-, 187 )].

(k) a(S) = S - S2 + 4, 1 = (0,2), 8 = lo


[1* = (f.,-, ~6)]'
(I) a(S) = S2 + 2S - 3, 1 = ( - 4, 6), 8 = t
[1* = m,6)].
(m) a(S) = -2S 2 +3S+6,1=(-3,5),8=i
[1* = (0, ~)].
2. Repeat Exercise 1 using golden section search.
3. Repeat Exercise 1 using Bolzano's method.
4. Repeat Exercise 1 using even block search.
5. Attempt to maximize the following unconstrained functions using pattern search,
given initial starting point X 0, el = e2 = 0.1, and resolution 81 = 82 = 0.001.
(a) f(X) = -xi + X l X 2 - x~, Xo = (2,3)
[X* = (0,0)].
(b) f(X) = 4Xl - xi - +2XlXl - 2x~, Xo = (1' 1)
[X* = (2,4)]'
(c) f(X) = 3x l - xi - 2x~ + 4X2 + 15, Xo = (0,0)
[X* = (t 1)].
(d) f(X) = -(2X2 - xd 2 - 4(Xl + W, XO = (0,0)
[X* = (-3, -f)].
(e) f(X) = -xi-xlx2-x~,Xo=(5,5)
[X* = (0,0)].
(f) f(X) = -xi - x~, Xo = (3, -4)
[X* = (0,0)].
(g) f(X) = - (Xl - 1)2 - (X2 + 2)2, X 0 = (0,0)
[X* = (1, -2)].
(h) f(X) = -(Xl - 2)2 - (X2 - W, XO = (0,0)
[X* = (2,3)].
6. Repeat Exercise 5 using one-at-a-time search.
7. Repeat Exercise 5 using Rosenbrock's method.
8. Repeat Exercise 5 using Powell's method.
9. Maximize the following unconstrained functions using the gradient method of Sec-
tion 8.2.3, with X 0 = (0,0,0).
(a) f(X) = -3(Xl - 2)2 - 4(X2 - W - 2(X3 + 5)2
[X* = (2,3, -5)].

(b) f(X) = -2x l (X l - 4) - X2(X2 - 2)


[X* = (2,1)]'
(c) f(X) = X1X2 - 2x~ - X1X3
[unbounded].
(d) f(X) = X1X2 - 2xI - 2X1X3
[X* = (O,t!)].
(e) f(X) = (Xl - 3)2 - 4(X2 - 2)2 - x~
[X* = (3,2,0)]'
(f) f(X) = -xi - (X2 - 2)2 - 2(X3 - W
[X*= (0,2,3)]'

(g) f(X) = X1X2 - XI - x~ - x~


[X* = (0,0,0)].
(h) f(X) = 5xI + x~ + xi - 4X1X2 - 2Xl - 6X3
[X* = (1,2,3)]'

10. Solve the above problems by using the gradient partan method of Section 8.2.3.1
11. Solve the above problems by using the conjugate gradient method of Section 8.2.3.3.

12. Attempt to maximize the following functions using the method of Newton and
Raphson of Section 8.2.2.1.
(a) f(X) = -(Xl - W - (X2 - 4)2 + 1
[X* = (3,4)].
(b) f(X) = -(Xl + 2)2 - (X2 - 1)2
[X* = (-2,1)].
(c) f(X) = (Xl + 1)2 - X1X2 - 2x~
[X* = (0,2,0)].
(d) f(X) = -(Xl - 3)2 - 4(X2 - 6)3.
(e) f(X) = - 5(X1X2 - W- 4XIX2 - 2(X1X2 - 1)3.
(f) f(X) = (Xl - 2)2 + (X2 - W- 4xIx 2·
(g) f(X) = xi - 3XIX~,
(h) f(X) = 4XI - (2X2 - 2xd 2 + 4XIX2 + 3x 2.
(i) f(X) = 5xI + x~ + x~ - 4X1X2 - 2Xl - 6X3
[X* = (1,2,3)]'

13. Repeat Exercise 8.1.2 using the variable metric method of Section 8.2.2.2.

14. Minimize the following functions using geometric programming.


(a) f(X) = 3xix2x3+ XIX~X3 + 4xIx2X3 + 6xi5XZ5/2X32.
(b) f(X) = 3X1Xz3X~ + 2xi2X2 + X~X3l
[X* = (1.094,1.077,1.041)]'
(c) f(X) = 4xi + 4xi2X~ + 5XZ4X~ + X3 3
[X* = (0.9073,1.0561,0.8211)]'
(d) f(X) = xix2x35 + 4xIx~/2X33 + 4X1XZ2X~
[X* = (2(1/4)1/4, (1/4)1/14, (1/4)- 1128)].

(e) f(X) = 3XIX2 + 5x2x32 + xllx2"l + 2xllx~


[X* = (1.82,0.234, 1/.J2)].
(f) f(X) = 8X1X2 + 2X1X3 + 4x12x2"lx3 2 + x l
l
[X* (1.5004,0.4992,3.9936)]'
=

(g) f(X) = X~/2X2 + 2xIx~X32 + XllX3 l + 4x2"lx2


[X* = (0.2878,2.3438,1.842)].
(h) f(X) = 3X~X3l + 6XIX2" 2 + 2xll + xllx2"2x~
[X* = (1.03,0.872, -1.05)].

(II) Theoretical

15. Show that the directions $D_i$, i = 1, 2, ..., generated in Section 8.2.3.3 are mutually
conjugate.
16. Compare the performance of the methods used in Exercises 1-4 with the method
outlined in Section 8.2.1 by using it to solve the problems in Exercise 1.
17. Prove that the global maximum of a negative definite quadratic function of n vari-
ables can be found after 2n - 1 steps by the gradient partan method.
18. Show that if a function of two variables has contours which are negative definite
quadratics, then these contours are concentric ellipses.
19. If X* is a global maximum for the function described in Exercise 18 and $X_1, X_2 \in R^2$,
prove that the tangents $T_1$ and $T_2$ to the contours of f at $X_1$ and $X_2$ are parallel if and
only if $X_1$ and $X_2$ are collinear with X*.
20. Prove that $X_1$ in Exercise 19 is the maximum point for f along $T_1$.
21. Justify the formulae for $F_j$ and $G_j$ in Section 8.2.2.2.
22. Apply the Kuhn-Tucker conditions to the quadratic programming problem of
Section 8.3.5.
Chapter 9

Appendix

This appendix comprises two parts: an introduction to both linear algebra


and basic calculus. We begin with linear algebra.

9.1 Linear Algebra

9.1.1 Matrices

Definition. A matrix is a rectangular array of elements (often real numbers).

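If A is a matrix, we write
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
Here A is said to have m rows and n columns, and the element in the ith row
and jth column, $1 \le i \le m$, $1 \le j \le n$, is called the i, j element, or $a_{ij}$. A is also
denoted by $(a_{ij})_{m \times n}$.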

Definition. A matrix is termed square if m = n.

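Definition. The identity matrix is a square matrix in which
$$a_{ij} = \begin{cases} 0, & \text{if } i \ne j \\ 1, & \text{otherwise.} \end{cases}$$

The identity matrix with n columns (and rows) is denoted by $I_n$, or simply I
if no confusion arises; it is of the form (n rows, n columns):
$$I_n = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \end{pmatrix}.$$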

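Definition. A matrix is termed a zero matrix if all its elements are zero, i.e.
$$a_{ij} = 0, \qquad 1 \le i \le m, \ 1 \le j \le n.$$

The zero matrix with m rows and n columns is denoted by $0_{m \times n}$, or simply
0 if no confusion arises, and is of the form (m rows, n columns):
$$0_{m \times n} = \begin{pmatrix} 0 & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 \\ \vdots & & & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}.$$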

Definition. The transpose of a matrix $A = (a_{ij})_{m \times n}$ is a matrix with n rows and
m columns with its i, j element $a'_{ij}$ defined by
$$a'_{ij} = a_{ji}, \qquad 1 \le i \le n, \ 1 \le j \le m.$$

The transpose of A is denoted by $A^T$ and is obtained from A by making
the ith row in A the ith column in $A^T$, $1 \le i \le m$. Note that for any matrix A,
$(A^T)^T = A$.

Definition. Two matrices $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{m \times n}$ are termed equal if,
and only if,
$$a_{ij} = b_{ij}, \qquad 1 \le i \le m, \ 1 \le j \le n.$$

Note that equality is not defined if A and B do not have the same number
of rows and of columns.

9.1.2 Vectors

Definition. A vector is a matrix which has either:

a. exactly one row (m = 1), or
b. exactly one column (n = 1).

It is usual to drop the first (second) subscript in case a (b). In case a the
vector is often called a row vector and is denoted by:
$$(a_1, a_2, \ldots, a_n).$$
In case b the vector is often called a column vector and is denoted by
$$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix}.$$
A vector (either row or column) with n entries is called an n-vector.

Definition. A finite set $\{X_1, X_2, \ldots, X_q\}$ of n-vectors is said to be linearly
independent if and only if
$$\sum_{i=1}^{q} \alpha_i X_i = 0,$$
for $\alpha_i$, i = 1, 2, ..., q, real numbers, implies that
$$\alpha_i = 0, \qquad i = 1, 2, \ldots, q.$$

Definition. A finite set of n-vectors is said to be linearly dependent if it is not
linearly independent.
For example, the set of 3-vectors $\{X_1, X_2, X_3\}$ given by
$$X_1 = (1,0,0), \quad X_2 = (0,1,0), \quad X_3 = (0,0,1)$$
is linearly independent. However, if $X_3 = (1,1,0)$ the set is linearly dependent,
since then $X_1 + X_2 - X_3 = 0$.

9.1.3 Arithmetical Operations on Vectors and Matrices


Unless otherwise mentioned, the following operations are defined for vec-
tors, which can be thought of as simply a special type of matrix.

(i) Scalar Multiplication


The scalar product of a matrix $A = (a_{ij})_{m \times n}$ and a real number $\alpha$ is defined as
a new matrix, denoted by $\alpha A$. The i, j element of $\alpha A$ is defined by
$$\alpha a_{ij}, \qquad 1 \le i \le m, \ 1 \le j \le n.$$

(ii) Addition
Two matrices $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{m \times n}$ can be added together to form a
new matrix, called the sum of A and B, denoted by A + B. The i, j element of
A + B is defined by
$$a_{ij} + b_{ij}, \qquad 1 \le i \le m, \ 1 \le j \le n.$$
Note that the sum of two matrices is not defined if they do not have both the
same number of rows and of columns.

(iii) Subtraction
Two matrices $A = (a_{ij})_{m \times n}$ and $B = (b_{ij})_{m \times n}$ can be subtracted to form a new
matrix, denoted by A - B. A - B is formed by taking the sum of A and the
scalar product of -1 and B. Thus the i, j element of A - B is defined by
$$a_{ij} - b_{ij}, \qquad 1 \le i \le m, \ 1 \le j \le n.$$
Note that A - B is not defined if A and B do not have both the same number
of rows and of columns.

(iv) Multiplication
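Two matrices $A = (a_{ij})_{m \times q}$ and $B = (b_{ij})_{q \times n}$ can be multiplied to form a new
matrix, $C = (c_{ij})_{m \times n}$, called the product of A and B. The i, j element of C is
defined by
$$c_{ij} = \sum_{k=1}^{q} a_{ik} b_{kj}, \qquad 1 \le i \le m, \ 1 \le j \le n.$$

C is denoted by AB. Note that the product AB is not defined unless A has the
same number of columns as B has rows. Thus, although AB may be defined
for a given pair of matrices A and B, BA may not necessarily be defined, and
even if it is, it is not necessarily so that
$$AB = BA.$$
Examples of this multiplication are given below:

$$\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 5 & 6 \\ 7 & 8 \end{pmatrix} = \begin{pmatrix} 1 \times 5 + 2 \times 7 & 1 \times 6 + 2 \times 8 \\ 3 \times 5 + 4 \times 7 & 3 \times 6 + 4 \times 8 \end{pmatrix} = \begin{pmatrix} 19 & 22 \\ 43 & 50 \end{pmatrix}$$

$$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \begin{pmatrix} 7 & 10 \\ 8 & 11 \\ 9 & 12 \end{pmatrix} = \begin{pmatrix} 1 \times 7 + 2 \times 8 + 3 \times 9 & 1 \times 10 + 2 \times 11 + 3 \times 12 \\ 4 \times 7 + 5 \times 8 + 6 \times 9 & 4 \times 10 + 5 \times 11 + 6 \times 12 \end{pmatrix} = \begin{pmatrix} 50 & 68 \\ 122 & 167 \end{pmatrix}$$

$$(1, 2) \begin{pmatrix} 3 & 4 \\ 5 & 6 \end{pmatrix} = (1 \times 3 + 2 \times 5, \ 1 \times 4 + 2 \times 6) = (13, 16)$$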

G:)G)=G: ~::: D=G~)


$$(1, 2) \begin{pmatrix} 3 \\ 4 \end{pmatrix} = (1 \times 3 + 2 \times 4) = (11).$$
Note in the last example that the product of a row vector and a column vector yields a scalar.
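The definition $c_{ij} = \sum_k a_{ik} b_{kj}$ translates directly into code. The following is a small sketch using plain nested lists; the function name `matmul` is an assumption of this illustration.

```python
def matmul(a, b):
    """Product of an m x q matrix a and a q x n matrix b (lists of rows)."""
    q = len(b)
    if any(len(row) != q for row in a):
        raise ValueError("A must have as many columns as B has rows")
    n = len(b[0])
    # c[i][j] is the sum over k of a[i][k] * b[k][j].
    return [[sum(row[k] * b[k][j] for k in range(q)) for j in range(n)]
            for row in a]

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
print(matmul([[1, 2, 3], [4, 5, 6]], [[7, 10], [8, 11], [9, 12]]))
print(matmul([[1, 2]], [[3, 4], [5, 6]]))
print(matmul([[1, 2]], [[3], [4]]))
```

Swapping the arguments in the first call gives a different matrix, which illustrates that AB and BA need not be equal.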

It is not difficult to prove some simple properties of the above operations.
For all matrices A, B, C, D, and E with m rows and n columns, and all real
numbers $\alpha$:
$$A + 0 = 0 + A = A$$
$$A + B = B + A$$
$$A + (B + C) = (A + B) + C$$
$$A - (B - C) = (A - B) + C$$
$$(A + B)^T = A^T + B^T$$
$$(A - B)^T = A^T - B^T$$
$$\alpha(A + B) = \alpha A + \alpha B,$$
and if all the necessary multiplication is compatible:
$$IA = AI = A$$
$$A(DE) = (AD)E$$
$$A(D + E) = AD + AE$$
$$(D + E)A = DA + EA$$
$$\alpha(AB) = (\alpha A)B = A(\alpha B).$$

(v) Vector Distance

Definition. The distance between two n-dimensional vectors, $X = (x_1, x_2, \ldots, x_n)^T$
and $Y = (y_1, y_2, \ldots, y_n)^T$, is $\|X - Y\|$, defined as follows:
$$\|X - Y\| = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}.$$

9.1.4 Determinants

Any square matrix A whose entries are real numbers has associated with it a
unique real number called its determinant, denoted by $|A|$ or det A. Rather
than define $|A|$ explicitly, we will outline a method for calculating $|A|$ for any
A. First we introduce some basic concepts. Associated with each element
$a_{ij}$ of A is a number which is the determinant of the matrix arrived at by
deleting the ith row and jth column of A. This determinant is called the i, j
minor and is denoted by $M_{ij}$:
$$M_{ij} = \begin{vmatrix} a_{11} & a_{12} & \cdots & a_{1,j-1} & a_{1,j+1} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2,j-1} & a_{2,j+1} & \cdots & a_{2n} \\ \vdots & & & & & & \vdots \\ a_{i-1,1} & a_{i-1,2} & \cdots & a_{i-1,j-1} & a_{i-1,j+1} & \cdots & a_{i-1,n} \\ a_{i+1,1} & a_{i+1,2} & \cdots & a_{i+1,j-1} & a_{i+1,j+1} & \cdots & a_{i+1,n} \\ \vdots & & & & & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{n,j-1} & a_{n,j+1} & \cdots & a_{nn} \end{vmatrix}.$$

For example, if
$$A = \begin{pmatrix} 6 & 9 & 7 & 5 \\ 8 & 3 & 9 & 5 \\ 4 & 1 & 11 & 13 \\ 2 & 6 & 12 & 20 \end{pmatrix}, \qquad M_{23} = \begin{vmatrix} 6 & 9 & 5 \\ 4 & 1 & 13 \\ 2 & 6 & 20 \end{vmatrix},$$
where the inside parentheses are omitted by common convention. Associated
with each minor $M_{ij}$ is a cofactor $C_{ij}$ defined by
$$C_{ij} = (-1)^{i+j} M_{ij}. \tag{9.1}$$
That is, each cofactor is either +1 or -1 times the associated minor.
We can now begin to calculate $|A|$. Let $A = (a_{ij})_{n \times n}$ and r be such that
$1 \le r \le n$. Then
$$|A| = \sum_{j=1}^{n} a_{rj} C_{rj}. \tag{9.2}$$

Thus to calculate $|A|$ we choose any row, say row r of A:
$$a_{r1}, a_{r2}, \ldots, a_{rn}.$$
We multiply each of these elements $a_{rj}$ by the cofactor $C_{rj}$ and sum up all
the products. Thus (9.2) reduces the problem of finding a determinant of an
n x n matrix to n problems of calculating the determinant of an (n - 1) x
(n - 1) matrix. Substituting (9.1) into (9.2) produces
$$|A| = \sum_{j=1}^{n} a_{rj} (-1)^{r+j} M_{rj}. \tag{9.3}$$

We can now use (9.2) to reduce the problem of finding $M_{rj}$ from that of
finding determinants of (n - 1) x (n - 1) matrices to that of finding (n - 2) x
(n - 2) determinants. Eventually the problem is reduced to finding the
determinants of 2 x 2 matrices. We use the following definition in this case:
$$|A| = |(a_{ij})_{2 \times 2}| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11} a_{22} - a_{12} a_{21}. \tag{9.4}$$

We now illustrate this approach by finding the determinant of the matrix
A given earlier in this section. Let r = 1 in (9.2) throughout the rest of this
example. Then, by (9.3), we have
$$|A| = a_{11}(-1)^{1+1} M_{11} + a_{12}(-1)^{1+2} M_{12} + a_{13}(-1)^{1+3} M_{13} + a_{14}(-1)^{1+4} M_{14} \tag{9.5}$$
$$= 6M_{11} - 9M_{12} + 7M_{13} - 5M_{14}.$$
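Now
$$M_{11} = \begin{vmatrix} 3 & 9 & 5 \\ 1 & 11 & 13 \\ 6 & 12 & 20 \end{vmatrix},$$
which, by (9.3), becomes
$$M_{11} = 3(-1)^{1+1} \begin{vmatrix} 11 & 13 \\ 12 & 20 \end{vmatrix} + 9(-1)^{1+2} \begin{vmatrix} 1 & 13 \\ 6 & 20 \end{vmatrix} + 5(-1)^{1+3} \begin{vmatrix} 1 & 11 \\ 6 & 12 \end{vmatrix}$$
$$= 3(11 \times 20 - 13 \times 12) - 9(1 \times 20 - 6 \times 13) + 5(1 \times 12 - 6 \times 11), \text{ by (9.4)}$$
$$= 192 + 522 - 270 = 444.$$
Similarly,
$$M_{12} = 8(-1)^{1+1} \begin{vmatrix} 11 & 13 \\ 12 & 20 \end{vmatrix} + 9(-1)^{1+2} \begin{vmatrix} 4 & 13 \\ 2 & 20 \end{vmatrix} + 5(-1)^{1+3} \begin{vmatrix} 4 & 11 \\ 2 & 12 \end{vmatrix}$$
$$= 8(220 - 156) - 9(80 - 26) + 5(48 - 22) = 156,$$
$$M_{13} = 8(-1)^{1+1} \begin{vmatrix} 1 & 13 \\ 6 & 20 \end{vmatrix} + 3(-1)^{1+2} \begin{vmatrix} 4 & 13 \\ 2 & 20 \end{vmatrix} + 5(-1)^{1+3} \begin{vmatrix} 4 & 1 \\ 2 & 6 \end{vmatrix}$$
$$= 8(20 - 78) - 3(80 - 26) + 5(24 - 2) = -516,$$
$$M_{14} = 8(-1)^{1+1} \begin{vmatrix} 1 & 11 \\ 6 & 12 \end{vmatrix} + 3(-1)^{1+2} \begin{vmatrix} 4 & 11 \\ 2 & 12 \end{vmatrix} + 9(-1)^{1+3} \begin{vmatrix} 4 & 1 \\ 2 & 6 \end{vmatrix}$$
$$= 8(12 - 66) - 3(48 - 22) + 9(24 - 2) = -312.$$
Substituting all this information into (9.5) yields
$$|A| = 6 \times 444 - 9 \times 156 + 7 \times (-516) - 5 \times (-312) = -792.$$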
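The cofactor expansion (9.3) along the first row is naturally recursive. The sketch below is an illustration only (the names `minor_matrix` and `det` are assumptions); it recomputes the four minors and the determinant of the 4 x 4 example matrix with rows (6, 9, 7, 5), (8, 3, 9, 5), (4, 1, 11, 13), (2, 6, 12, 20).

```python
def minor_matrix(a, i, j):
    """Matrix obtained from a by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(a) if r != i]

def det(a):
    """Determinant by cofactor expansion along the first row, as in (9.3)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    if n == 2:
        return a[0][0] * a[1][1] - a[0][1] * a[1][0]   # definition (9.4)
    return sum((-1) ** j * a[0][j] * det(minor_matrix(a, 0, j)) for j in range(n))

A = [[6, 9, 7, 5],
     [8, 3, 9, 5],
     [4, 1, 11, 13],
     [2, 6, 12, 20]]

minors = [det(minor_matrix(A, 0, j)) for j in range(4)]   # M11, M12, M13, M14
print(minors, det(A))
```

Cofactor expansion takes on the order of n! operations, so it is practical only for small matrices; elimination methods are preferred for larger ones.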
The reader may like to verify that the following properties hold for some
numerical examples and then prove them true in general.
1. If A and B are two matrices which have the property that one can be
obtained from the other by the interchange of two rows (or columns) then
$$|A| = -|B|.$$
2. $|A| = |A^T|$ for all matrices A.
3. If a matrix A has a row (or column) of zeros, then
$$|A| = 0.$$
4. If B is a matrix which is obtained by adding a scalar multiple of a row
(or column) to another row (or column) of another matrix A, then
$$|B| = |A|.$$
From this and property 3 it follows that if two rows (or columns) of a
matrix A are identical then
$$|A| = 0.$$
5. (9.2) can be used to show that if a matrix B is obtained by multiplying by
a scalar $\alpha$ all the elements of a row (or column) of another matrix A, then
$$|B| = \alpha |A|.$$
6. If $A = (a_{ij})_{n \times n}$ and $B = (b_{ij})_{n \times n}$, then the determinant of their product
equals the product of their determinants, i.e.,
$$|AB| = |A||B|.$$

Definition. The cofactor matrix $\tilde{A} = (\tilde{a}_{ij})_{n \times n}$ of a matrix $A = (a_{ij})_{n \times n}$ is a
matrix defined by
$$\tilde{a}_{ij} = C_{ij}, \qquad 1 \le i \le n, \ 1 \le j \le n,$$
where $C_{ij}$ is the cofactor defined in (9.1).

Definition. The adjoint matrix, $A_{adj}$, of a matrix $A = (a_{ij})_{n \times n}$ is a matrix
defined by $A_{adj} = \tilde{A}^T$, the transpose of the cofactor matrix. So
$$A_{adj} = \begin{pmatrix} C_{11} & C_{21} & \cdots & C_{n1} \\ C_{12} & C_{22} & \cdots & C_{n2} \\ \vdots & & & \vdots \\ C_{1n} & C_{2n} & \cdots & C_{nn} \end{pmatrix}.$$

9.1.5 The Matrix Inverse

Definition. The inverse of a square matrix, $A = (a_{ij})_{n \times n}$, is a matrix
$B = (b_{ij})_{n \times n}$ with the property that
$$AB = I.$$
It is usual to denote B by $A^{-1}$.

Definition. A matrix A is termed nonsingular if
$$|A| \ne 0.$$

The reader is encouraged to attempt to prove the following properties
of the inverse.
1. If A is a nonsingular square matrix, $A^{-1}$ is unique.
2. If A and B are both nonsingular square matrices with the same number
of rows then
$$(AB)^{-1} = B^{-1} A^{-1}.$$
3. $A^{-1} A = I$ for all square, nonsingular matrices.
4. If A is nonsingular and the given multiplications are defined, then AB =
AC implies B = C.

The inverse $A^{-1}$ of a square, nonsingular matrix A may be computed using the
following formula:
$$A^{-1} = \frac{1}{|A|} (A_{adj}). \tag{9.6}$$

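As an example, let us take the inverse of the matrix A, where
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 3 & 6 \\ 1 & 1 & 1 \end{pmatrix}.$$
We have
$$\tilde{A} = \begin{pmatrix} -3 & 2 & 1 \\ 1 & -2 & 1 \\ 3 & 6 & -5 \end{pmatrix}$$
and
$$A_{adj} = \begin{pmatrix} -3 & 1 & 3 \\ 2 & -2 & 6 \\ 1 & 1 & -5 \end{pmatrix}.$$
Further,
$$|A| = 4.$$
Hence
$$A^{-1} = \frac{1}{|A|} (A_{adj}) = \frac{1}{4} \begin{pmatrix} -3 & 1 & 3 \\ 2 & -2 & 6 \\ 1 & 1 & -5 \end{pmatrix} = \begin{pmatrix} -\tfrac{3}{4} & \tfrac{1}{4} & \tfrac{3}{4} \\ \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{3}{2} \\ \tfrac{1}{4} & \tfrac{1}{4} & -\tfrac{5}{4} \end{pmatrix}.$$

This can be verified as follows:
$$AA^{-1} = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 3 & 6 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} -\tfrac{3}{4} & \tfrac{1}{4} & \tfrac{3}{4} \\ \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{3}{2} \\ \tfrac{1}{4} & \tfrac{1}{4} & -\tfrac{5}{4} \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = I.$$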

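A second way in which the inverse of a matrix can be calculated is by
Gauss-Jordan elimination. Suppose it is desired to calculate the inverse of
the square, nonsingular matrix A. We first form an appended matrix:
$$B = [A \mid I],$$
where I is the identity matrix with as many rows as A. If the left-hand part of
B is now transformed into I by adding scalar multiples of the rows of B to
other rows and by multiplying rows by nonzero scalars, the right-hand part is
transformed into $A^{-1}$. As an example, we once again calculate the inverse of
the matrix just displayed:
$$B = \left(\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 4 & 3 & 6 & 0 & 1 & 0 \\ 1 & 1 & 1 & 0 & 0 & 1 \end{array}\right) \begin{array}{l} R_1 \\ R_2 \\ R_3 \end{array}$$
becomes
$$\left(\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & -5 & -6 & -4 & 1 & 0 \\ 0 & -1 & -2 & -1 & 0 & 1 \end{array}\right) \begin{array}{l} R_1 \\ R_2 - 4R_1 \\ R_3 - R_1 \end{array}$$

$$\left(\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & 1 & \tfrac{6}{5} & \tfrac{4}{5} & -\tfrac{1}{5} & 0 \\ 0 & -1 & -2 & -1 & 0 & 1 \end{array}\right) \begin{array}{l} R_1 \\ R_2/(-5) \\ R_3 \end{array}$$

$$\left(\begin{array}{ccc|ccc} 1 & 0 & \tfrac{3}{5} & -\tfrac{3}{5} & \tfrac{2}{5} & 0 \\ 0 & 1 & \tfrac{6}{5} & \tfrac{4}{5} & -\tfrac{1}{5} & 0 \\ 0 & 0 & -\tfrac{4}{5} & -\tfrac{1}{5} & -\tfrac{1}{5} & 1 \end{array}\right) \begin{array}{l} R_1 - 2R_2 \\ R_2 \\ R_3 + R_2 \end{array}$$

$$\left(\begin{array}{ccc|ccc} 1 & 0 & \tfrac{3}{5} & -\tfrac{3}{5} & \tfrac{2}{5} & 0 \\ 0 & 1 & \tfrac{6}{5} & \tfrac{4}{5} & -\tfrac{1}{5} & 0 \\ 0 & 0 & 1 & \tfrac{1}{4} & \tfrac{1}{4} & -\tfrac{5}{4} \end{array}\right) \begin{array}{l} R_1 \\ R_2 \\ R_3/(-\tfrac{4}{5}) \end{array}$$

$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -\tfrac{3}{4} & \tfrac{1}{4} & \tfrac{3}{4} \\ 0 & 1 & 0 & \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{3}{2} \\ 0 & 0 & 1 & \tfrac{1}{4} & \tfrac{1}{4} & -\tfrac{5}{4} \end{array}\right) \begin{array}{l} R_1 - \tfrac{3}{5}R_3 \\ R_2 - \tfrac{6}{5}R_3 \\ R_3. \end{array}$$

Thus
$$A^{-1} = \text{right-hand part of } B = \begin{pmatrix} -\tfrac{3}{4} & \tfrac{1}{4} & \tfrac{3}{4} \\ \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{3}{2} \\ \tfrac{1}{4} & \tfrac{1}{4} & -\tfrac{5}{4} \end{pmatrix},$$
which is the same result as that obtained by the previous method.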
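Gauss-Jordan elimination on the appended matrix [A | I] can be sketched as follows. This is an illustrative implementation, not the text's: the function name `inverse` is an assumption, and exact fractions are used so that the result can be compared with the hand computation. Unlike the hand computation, the sketch also swaps rows when a pivot entry is zero.

```python
from fractions import Fraction

def inverse(a):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I]."""
    n = len(a)
    # Build the appended matrix B = [A | I] with exact rational entries.
    b = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if b[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        b[col], b[pivot] = b[pivot], b[col]
        b[col] = [v / b[col][col] for v in b[col]]        # scale pivot row to 1
        for r in range(n):
            if r != col:
                factor = b[r][col]
                b[r] = [v - factor * p for v, p in zip(b[r], b[col])]
    return [row[n:] for row in b]                         # right-hand part

A_inv = inverse([[1, 2, 3], [4, 3, 6], [1, 1, 1]])
for row in A_inv:
    print([str(v) for v in row])
```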



9.1.6 Quadratic Forms, Definiteness, and the Hessian

A quadratic form is an expression of the type:
$$f(X) = X^T H X,$$
where X is an n-dimensional vector and H is an n x n matrix. A quadratic
form f(X) and its associated matrix H are said to be:

negative definite if
$$f(X) < 0, \text{ for all } X \ne 0;$$
negative semidefinite if
$$f(X) \le 0, \text{ for all } X \ne 0,$$
$$f(X) = 0, \text{ for some } X \ne 0;$$
positive definite if
$$f(X) > 0, \text{ for all } X \ne 0;$$
positive semidefinite if
$$f(X) \ge 0, \text{ for all } X \ne 0,$$
$$f(X) = 0, \text{ for some } X \ne 0;$$
indefinite if
$$f(X) > 0, \text{ for some } X,$$
$$f(X) < 0, \text{ for some } X.$$

The following rules can be invoked in order to determine the definiteness
of any matrix H.

1. H is negative (positive) definite if and only if all the eigenvalues of H are
negative (positive). (See Section 9.1.7 for a discussion of eigenvalues.)
2. H is negative (positive) semidefinite if and only if all the eigenvalues of H
are nonpositive (nonnegative) and at least one is zero.
3. H is indefinite if and only if H has some positive and some negative
eigenvalues.

Thus the definiteness of H(X), the Hessian matrix of a function f at X (see
Section 9.2.4), can be discovered by examining the eigenvalues of H(X). This is
often no trivial task. However the rules 1-3 require that only the signs of the
eigenvalues be known, not the values themselves. These signs can be found by
using Descartes' rule of signs.

Let us now examine the behavior of f at a critical point X*, that is, a point at which
$$\nabla f(X^*) = 0.$$
If:
(a) H(X*) is negative definite, X* is a local maximum for f.
(b) H(X*) is positive definite, X* is a local minimum for f.
(c) H(X*) is indefinite, X* is a saddle point for f.
(d) H(X*) is either positive or negative semidefinite, nothing can be said
about X*.
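For a 2 x 2 symmetric matrix the eigenvalues have a closed form, so rules 1-3 can be applied directly. The sketch below is limited to that case; the function names and the tolerance `tol` are assumptions of this illustration. It is applied to the Hessian of $f(X) = -x_1^2 + x_1x_2 - x_2^2$, which is constant and has rows (-2, 1) and (1, -2).

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

def classify(a, b, d, tol=1e-12):
    """Definiteness of [[a, b], [b, d]] from the signs of its eigenvalues."""
    low, high = eigenvalues_sym2(a, b, d)
    if high < -tol:
        return "negative definite"
    if low > tol:
        return "positive definite"
    if low < -tol and high > tol:
        return "indefinite"
    if low < -tol:
        return "negative semidefinite"
    if high > tol:
        return "positive semidefinite"
    return "zero matrix"

print(classify(-2, 1, -2))   # Hessian of -x1^2 + x1*x2 - x2^2
print(classify(1, 2, 1))
print(classify(1, 1, 1))
```

For the first matrix the eigenvalues are -3 and -1, so by case (a) the critical point (0, 0) of that function is a local maximum.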

9.1.7 Eigenvalues and Eigenvectors

Given any n x n matrix H, the scalars $\lambda_1, \lambda_2, \ldots, \lambda_n$ which are the zeros of
the characteristic equation
$$\det(H - \lambda I) = 0$$
are called the eigenvalues of H. Here I is the n x n identity matrix.
Corresponding to the n eigenvalues there are n eigenvectors $X_1, X_2, \ldots, X_n$,
which satisfy
$$H X_i = \lambda_i X_i, \qquad X_i \ne 0, \qquad i = 1, 2, \ldots, n.$$

It can be shown that all the eigenvalues of a real symmetric matrix are real,
and that the corresponding eigenvectors can be chosen to be real.
(Note that the Hessian matrix of a multivariable function with continuous
second partial derivatives is symmetric.) Further, the eigenvectors which
correspond to distinct eigenvalues are orthogonal, i.e.,
$$X_i^T X_j = 0, \qquad \lambda_i \ne \lambda_j.$$
The relationship between the Hessian, the eigenvectors and eigenvalues can
be expressed as
$$H = E^T \Lambda E, \tag{9.7}$$
where
$$E = (X_1, X_2, \ldots, X_n)^T$$
and
$$\Lambda = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{pmatrix},$$
if the eigenvectors $X_1, X_2, \ldots, X_n$ are replaced by unit vectors in the same
direction. Now it can be shown that
$$X_i^T H X_i = X_i^T E^T \Lambda E X_i, \qquad i = 1, 2, \ldots, n, \text{ by (9.7)},$$
$$= \lambda_i, \qquad i = 1, 2, \ldots, n.$$
Hence it can be shown that a necessary and sufficient condition for H to be
negative semidefinite is
$$\lambda_i \le 0, \qquad i = 1, 2, \ldots, n.$$

9.1.8 Gram-Schmidt Orthogonalization

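Given n linearly independent vectors $d_1, d_2, \ldots, d_n$, the Gram-Schmidt
orthogonalization process can be used to construct from them n orthonormal
vectors $e_1, e_2, \ldots, e_n$. A set of vectors $\{e_1, e_2, \ldots, e_n\}$ is termed orthonormal
if
$$e_i \cdot e_j = \begin{cases} 1, & \text{if } i = j \\ 0, & \text{otherwise.} \end{cases} \tag{9.8}$$
The process begins by setting
$$e_1 = \frac{d_1}{|d_1|}$$
so that (9.8) is satisfied for i = j = 1. Next choose
$$e_2 = \gamma_1 d_1 + \gamma_2 d_2 = \theta_1 e_1 + \theta_2 d_2.$$
Now
$$e_1 \cdot e_2 = 0.$$
Hence
$$e_1 \cdot (\theta_1 e_1 + \theta_2 d_2) = \theta_1 (e_1 \cdot e_1) + \theta_2 (e_1 \cdot d_2) = \theta_1 + \theta_2 (e_1 \cdot d_2) = 0$$
and
$$\theta_1 = -\theta_2 (e_1 \cdot d_2).$$
Therefore
$$e_2 = \theta_1 e_1 + \theta_2 d_2 = -\theta_2 (e_1 \cdot d_2) e_1 + \theta_2 d_2 = \theta_2 [d_2 - (e_1 \cdot d_2) e_1].$$
Let
$$g_2 = d_2 - (e_1 \cdot d_2) e_1,$$
hence
$$e_2 = \theta_2 g_2.$$
Now
$$e_2 \cdot e_2 = 1.$$
Therefore
$$e_2 = \frac{g_2}{|g_2|}.$$
Next choose
$$e_3 = \sigma_1 d_1 + \sigma_2 d_2 + \sigma_3 d_3 = \omega_1 e_1 + \omega_2 e_2 + \omega_3 d_3.$$
Now
$$e_1 \cdot e_3 = 0$$
and
$$e_2 \cdot e_3 = 0.$$
Therefore
$$0 = e_1 \cdot (\omega_1 e_1 + \omega_2 e_2 + \omega_3 d_3) = \omega_1 (e_1 \cdot e_1) + \omega_2 (e_1 \cdot e_2) + \omega_3 (e_1 \cdot d_3),$$
$$0 = e_2 \cdot (\omega_1 e_1 + \omega_2 e_2 + \omega_3 d_3) = \omega_1 (e_2 \cdot e_1) + \omega_2 (e_2 \cdot e_2) + \omega_3 (e_2 \cdot d_3).$$
Hence
$$0 = \omega_1 + \omega_3 (e_1 \cdot d_3) \Rightarrow \omega_1 = -\omega_3 (e_1 \cdot d_3),$$
$$0 = \omega_2 + \omega_3 (e_2 \cdot d_3) \Rightarrow \omega_2 = -\omega_3 (e_2 \cdot d_3).$$
Therefore
$$e_3 = \omega_1 e_1 + \omega_2 e_2 + \omega_3 d_3 = \omega_3 [d_3 - (e_1 \cdot d_3) e_1 - (e_2 \cdot d_3) e_2].$$
Let
$$g_3 = d_3 - (e_1 \cdot d_3) e_1 - (e_2 \cdot d_3) e_2.$$
Hence
$$e_3 = \omega_3 g_3.$$
Now
$$e_3 \cdot e_3 = 1.$$
Therefore
$$e_3 = \frac{g_3}{|g_3|}.$$
The process continues in this manner until $e_n$ is constructed.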
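As an example, consider the orthogonalization of the following (row)
vectors
$$\{d_1, d_2, d_3, d_4\} = \{(0,2,0,0), (2,0,1,0), (0,0,1,1), (1,0,0,2)\}.$$
Now
$$e_1 = \frac{d_1}{|d_1|} = (0,1,0,0),$$
$$g_2 = d_2 - (e_1 \cdot d_2) e_1 = (2,0,1,0) - [(0,1,0,0) \cdot (2,0,1,0)](0,1,0,0) = (2,0,1,0).$$
Therefore
$$e_2 = \frac{g_2}{|g_2|} = \frac{1}{\sqrt{5}} (2,0,1,0).$$
Now
$$g_3 = d_3 - (e_1 \cdot d_3) e_1 - (e_2 \cdot d_3) e_2$$
$$= (0,0,1,1) - [(0,1,0,0) \cdot (0,0,1,1)](0,1,0,0) - \left[\tfrac{1}{\sqrt{5}} (2,0,1,0) \cdot (0,0,1,1)\right] \tfrac{1}{\sqrt{5}} (2,0,1,0)$$
$$= \left(-\tfrac{2}{5}, 0, \tfrac{4}{5}, 1\right).$$
Therefore
$$e_3 = \frac{g_3}{|g_3|} = \frac{\sqrt{5}}{3} \left(-\tfrac{2}{5}, 0, \tfrac{4}{5}, 1\right),$$
$$g_4 = d_4 - (e_1 \cdot d_4) e_1 - (e_2 \cdot d_4) e_2 - (e_3 \cdot d_4) e_3$$
$$= (1,0,0,2) - [(0,1,0,0) \cdot (1,0,0,2)](0,1,0,0) - \left[\tfrac{1}{\sqrt{5}} (2,0,1,0) \cdot (1,0,0,2)\right] \tfrac{1}{\sqrt{5}} (2,0,1,0) - \tfrac{8}{9} \left(-\tfrac{2}{5}, 0, \tfrac{4}{5}, 1\right)$$
$$= \left(\tfrac{5}{9}, 0, -\tfrac{10}{9}, \tfrac{10}{9}\right).$$
Therefore
$$e_4 = \frac{g_4}{|g_4|} = \frac{1}{3} (1, 0, -2, 2).$$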
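The process can be sketched in a few lines of code. This is an illustration under the assumption that the input vectors are linearly independent (otherwise some $g_k$ is zero and the division fails); the function names are hypothetical.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors: g_k = d_k - sum (e_i . d_k) e_i."""
    basis = []
    for d in vectors:
        g = list(d)
        for e in basis:
            coeff = dot(e, d)
            g = [gi - coeff * ei for gi, ei in zip(g, e)]
        norm = math.sqrt(dot(g, g))
        basis.append([gi / norm for gi in g])
    return basis

d = [(0, 2, 0, 0), (2, 0, 1, 0), (0, 0, 1, 1), (1, 0, 0, 2)]
e = gram_schmidt(d)
for vec in e:
    print([round(v, 6) for v in vec])
```

Checking that every pair $e_i \cdot e_j$ is 0 for $i \ne j$ and 1 for $i = j$ is a quick way to validate a hand computation such as the one above.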

9.2 Basic Calculus

9.2.1 Functions of One Variable

Definition. A real-valued function of one real variable f comprises a set D
together with a rule for associating exactly one real number f(x) with each
element x of D.

The set D is called the domain of f and the set $\{f(x) : x \in D\}$ is called the
range of f. In this book the range is assumed to be a subset of R, the set of
real numbers, as is the domain (except for the functionals of Chapter 7).
We now turn to the concept of a limit of a function.

Definition. The limiting value of f as $x \in D$ tends to b is said to be d if and
only if for all $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$0 < |x - b| < \delta \Rightarrow |f(x) - d| < \varepsilon.$$

The fact that f has a limiting value of d as x tends to b is denoted by:
$$\lim_{x \to b} f(x) = d.$$

Sometimes a function f is not defined for values of x that are either greater
than a given value b or less than b. In these cases the above definition of a
limit is invalid and we are led to the concept of one-sided limits:

Definition. The limiting value of f as $x \in D$ tends to b from the right (left)
is said to be d if and only if for all $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$0 < x - b < \delta \Rightarrow |f(x) - d| < \varepsilon$$
$$(0 < b - x < \delta \Rightarrow |f(x) - d| < \varepsilon).$$

These limits are denoted by:

from the right: $\lim_{x \to b^+} f(x) = d$

and

from the left: $\lim_{x \to b^-} f(x) = d.$

We come now to the concept of continuity of a function of one variable.

Definition. A function f is said to be continuous at a point $d \in D$ if and only
if for all $\varepsilon > 0$, there exists a $\delta > 0$ such that
$$|x - d| < \delta \Rightarrow |f(x) - f(d)| < \varepsilon.$$

Note that, unlike the definition of a limit, the above definition is such that
there is no necessity that the left-hand quantity $|x - d|$ be positive.

Definition. A function f is said to be differentiable at $d \in D$ if and only if the
limit:
$$\lim_{h \to 0} \frac{f(d + h) - f(d)}{h}$$
exists.

If this limit does exist it is denoted by $f'(d)$, and is called the derivative
of f at d.

Definition. If f is differentiable at x for all $x \in D$, f is said to be differentiable.

The proof of the following theorem is left as an exercise for the reader:

Theorem 9.1. If a function f is differentiable at $d \in D$, then f is continuous at d.

One can attempt to find derivatives of $f'$; if $f'$ is differentiable then the
derivative of $f'$ at a point $d \in D$ is denoted by $f''(d)$. In general, when the
process is repeated k times, the final derivative is denoted by $f^{(k)}(d)$.

9.2.2 Some Differential Theorems of Calculus

Theorem 9.2 (Weierstrass' Theorem). If f is a continuous function on
$[x_1, x_1 + h] \subseteq D$, then f attains both a maximum and a minimum value on
$[x_1, x_1 + h]$.

We omit proof of Theorem 9.2. Stated in other words, this theorem means
that there exist $x^*, x_* \in [x_1, x_1 + h]$ such that
$$f(x) \le f(x^*) \text{ for all } x \in [x_1, x_1 + h],$$
$$f(x) \ge f(x_*) \text{ for all } x \in [x_1, x_1 + h].$$

Theorem 9.3 (Rolle's Theorem). If f is differentiable on $(x_1, x_1 + h) \subseteq D$ and
continuous on $[x_1, x_1 + h]$ and
$$f(x_1) = f(x_1 + h) = 0,$$
then there exists $\theta$, $0 < \theta < 1$, such that
$$f'(\theta x_1 + (1 - \theta)(x_1 + h)) = 0.$$

PROOF. If
$$f(x) = 0, \text{ for all } x \in [x_1, x_1 + h],$$
the result is true. If not, there exists $x_2 \in [x_1, x_1 + h]$ such that
$$f(x_2) \ne 0.$$
Assume
$$f(x_2) > 0. \tag{9.9}$$
(If f is negative at this point an analogous proof follows.) Since f is continuous
we can invoke Weierstrass' theorem and state that f attains a maximum
value on $[x_1, x_1 + h]$. Thus there exists $x^* \in [x_1, x_1 + h]$ such that
$$f(x) \le f(x^*), \text{ for all } x \in [x_1, x_1 + h]. \tag{9.10}$$
By (9.9), we have
$$f(x^*) > 0.$$
But as
$$f(x_1) = f(x_1 + h) = 0,$$
we have
$$x^* \in (x_1, x_1 + h).$$
Hence $f'(x^*)$ exists, by assumption. Assume
$$f'(x^*) > 0.$$
Then there exists $\delta > 0$ such that
$$f(x^* - \delta) < f(x^*) < f(x^* + \delta).$$
(Prove this.) But this contradicts (9.10). Assuming
$$f'(x^*) < 0$$
leads to a similar contradiction. Thus
$$f'(x^*) = 0.$$
Define $\theta$ to be such that
$$x^* = \theta x_1 + (1 - \theta)(x_1 + h).$$
Then $0 < \theta < 1$ and the theorem is proved. □

Theorem 9.4 (First Mean Value Theorem). If f is differentiable on $(x_1, x_1 + h)$
and continuous on $[x_1, x_1 + h] \subseteq D$ then there exists $\theta$, $0 < \theta < 1$, such that
$$f(x_1 + h) = f(x_1) + h f'(\theta x_1 + (1 - \theta)(x_1 + h)).$$

PROOF. Set
$$g(x) = \left\{ \frac{x - x_1}{h} \left( f(x_1 + h) - f(x_1) \right) \right\} + f(x_1) - f(x).$$
Now
$$g(x_1) = g(x_1 + h) = 0$$
and g is differentiable on $(x_1, x_1 + h)$ as f is. Thus, by Rolle's theorem there
exists $\theta$, $0 < \theta < 1$, such that
$$g'(\theta x_1 + (1 - \theta)(x_1 + h)) = 0.$$
Hence
$$0 = \frac{f(x_1 + h) - f(x_1)}{h} - f'(\theta x_1 + (1 - \theta)(x_1 + h)).$$
Hence the result. □

9.2.3 Taylor's Theorem

Let a function f with domain D be differentiable at every $x \in [x_1, x_1 + h] \subseteq D$. The
first mean value theorem states that
$$f(x_1 + h) = f(x_1) + h f'(\theta x_1 + (1 - \theta)(x_1 + h)), \text{ for some } \theta, \ 0 < \theta < 1.$$
If $f^{(k)}$ is continuous on $[x_1, x_1 + h]$ and $f^{(k+1)}(x)$ exists for all $x \in (x_1, x_1 + h)$
we can generalize this result as follows:

Theorem 9.5 (Taylor's Theorem). If $f^{(k)}$ is continuous on $[x_1, x_1 + h]$ and
$f^{(k+1)}(x)$ exists for all $x \in (x_1, x_1 + h)$ then
$$f(x_1 + h) = f(x_1) + h f'(x_1) + \frac{h^2}{2!} f''(x_1) + \cdots + \frac{h^{k+1}}{(k+1)!} f^{(k+1)}(\theta x_1 + (1 - \theta)(x_1 + h)), \text{ for some } \theta, \ 0 < \theta < 1.$$

PROOF (By induction). For k = 0, the above statement is the first mean
value theorem. Assume the result holds for k - 1. Let
$$g(x) = f(x_1) + (x - x_1) f'(x_1) + \frac{(x - x_1)^2}{2!} f''(x_1) + \cdots + \frac{(x - x_1)^k}{k!} f^{(k)}(x_1) + \frac{(x - x_1)^{k+1}}{(k+1)!} R,$$
where R is such that
$$g(x_1 + h) = f(x_1 + h).$$
Now we wish to show that
$$R = f^{(k+1)}(\theta x_1 + (1 - \theta)(x_1 + h)), \text{ for some } \theta, \ 0 < \theta < 1.$$
Let
$$m(x) = f(x) - g(x).$$
As
$$m(x_1) = m(x_1 + h) = 0,$$
by Rolle's theorem we have
$$m'(\eta) = 0, \qquad \eta = \theta_1 x_1 + (1 - \theta_1)(x_1 + h), \text{ for some } \theta_1, \ 0 < \theta_1 < 1,$$
so that $f'(\eta) = g'(\eta)$. Applying the induction hypothesis to $f'$ on $[x_1, \eta]$ we
also have
$$f'(\eta) = f'(x_1) + (\eta - x_1) f''(x_1) + \cdots + \frac{(\eta - x_1)^k}{k!} f^{(k+1)}(\theta_2 x_1 + (1 - \theta_2)\eta), \text{ for some } \theta_2, \ 0 < \theta_2 < 1.$$
Comparing this with $g'(\eta)$, whose last term is $\frac{(\eta - x_1)^k}{k!} R$, we obtain
$$R = f^{(k+1)}(\theta_2 x_1 + (1 - \theta_2)\eta).$$
Define $\theta$ such that
$$\theta x_1 + (1 - \theta)(x_1 + h) = \theta_2 x_1 + (1 - \theta_2)\eta$$
and the theorem is proven. □

9.2.4 Functions of Several Variables

Many of the results of the previous sections can be generalized to functions
of several variables. The definition of a function can be so generalized by
simply defining D, the domain of f, to be a set of n-dimensional real vectors.
The definition of the limiting value of f can be amended as follows:

Definition. The limiting value of f as $X \in D$ tends to b is said to be d if and
only if for all $\varepsilon > 0$, there exists $\delta > 0$ such that
$$0 < \|X - b\| < \delta \Rightarrow |f(X) - d| < \varepsilon.$$
The definition of continuity follows analogously:

Definition. A function f is said to be continuous at $d \in D$ if and only if for
all $\varepsilon > 0$ there exists $\delta > 0$ such that
$$\|X - d\| < \delta \Rightarrow |f(X) - f(d)| < \varepsilon.$$

Things are a little more complicated when it comes to generalizing
differentiation:

Definition. Let
$$X_0 = (x_1, x_2, \ldots, x_i, \ldots, x_n) \quad \text{and} \quad X_1 = (x_1, x_2, \ldots, x_i + h, \ldots, x_n)$$
be two points in the domain D of a function f. Then f is said to be differentiable
with respect to $x_i$ at $X_0$ if and only if the limit:
$$\lim_{h \to 0} \frac{f(X_1) - f(X_0)}{h}$$
exists.

If this limit does exist, it is denoted by $\partial f(X_0)/\partial x_i$ and is called the first
partial derivative of f with respect to $x_i$.
Assuming that all the partial derivatives $\partial f(X)/\partial x_i$, i = 1, 2, ..., n, exist
for all $X \in D$, each can be thought of as a function on D. Each of these
functions may have partial derivatives, which are termed second partial
derivatives of f. Thus if $\partial f/\partial x_i$ has a partial derivative with respect to $x_j$
at X, the derivative is denoted by $\partial^2 f/\partial x_j \partial x_i$. This process can of course be
repeated if the necessary limits exist.

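The first partial derivatives of f at X can be assembled into a vector:
$$\left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right)^T$$
called the gradient vector of f at X, denoted by $\nabla f(X)$. The set of second
partial derivatives of f at X can be assembled into a matrix:
$$\begin{pmatrix} \dfrac{\partial^2 f}{\partial x_1 \partial x_1} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\ \dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\ \vdots & & & \vdots \\ \dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n \partial x_n} \end{pmatrix}$$
called the Hessian matrix of f at X, denoted by H(X) in honour of the German
mathematician who discovered it, Hesse.
Taylor's theorem can be extended to functions of several variables:

Theorem 9.6 (Taylor's Theorem for Functions of Several Variables). If the
second partial derivatives of f are continuous and X and X + h are two points
in the domain of f, then
$$f(X + h) = f(X) + \nabla f(X)^T h + \tfrac{1}{2} h^T H(\theta X + (1 - \theta)(X + h)) h, \text{ for some } \theta, \ 0 < \theta < 1.$$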

Another theorem involving the gradient is of some importance in


optimization:

Theorem 9.7 (Gradient Direction Theorem). The gradient

Vf=(oOf 'oOf , .. "OOf)T


x! X2 Xn

points in the direction of steepest slope of the hypersurface of f.

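PROOF. Construct an $n$-dimensional hypersphere of radius $r$ about an
arbitrary point $X \in D$. Let points on the sphere be of the form:
$$X + \Delta X,$$
where
$$\Delta X = (\Delta x_1, \Delta x_2, \ldots, \Delta x_n)^T.$$
Then
$$(\Delta X)^T(\Delta X) = r^2.$$

Now by using a first-order Taylor series approximation it can be shown that
$$\Delta f = f(X + \Delta X) - f(X) \approx \nabla f(X)^T \Delta X.$$
Let us now attempt to find the point on the hypersphere for which $\Delta f$ is a
maximum. We must form the Lagrangian:
$$L(\Delta X) = \nabla f^T \Delta X - \lambda[(\Delta X)^T(\Delta X) - r^2]$$
$$\frac{\partial L}{\partial \Delta X} = \nabla f - 2\lambda \Delta X = 0$$
for a maximum. Therefore
$$\Delta X^* = \frac{1}{2\lambda} \nabla f.$$

Hence the vector $\Delta X^*$ yielding the greatest improvement in $f$ has the same
direction as the gradient, $\nabla f$ (as $1/2\lambda$ is a scalar, which is positive at the maximum). □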
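The conclusion of the proof can be checked numerically: among all steps of a fixed length, none should give a larger first-order change in $f$ than the step along the gradient. The sketch below assumes a two-dimensional example with a hypothetical function; the names and the sampling scheme are illustrative only.

```python
import math
import random


def steepest_step(gradient, radius):
    """Step of length `radius` along the gradient direction."""
    norm = math.sqrt(sum(g * g for g in gradient))
    return [radius * g / norm for g in gradient]


def linear_gain(gradient, step):
    """First-order change in f for a given step, i.e. grad(f) . step."""
    return sum(g * s for g, s in zip(gradient, step))


def best_random_gain(gradient, radius, trials=1000, seed=0):
    """Largest first-order gain over random steps of length `radius` (2-D only)."""
    rng = random.Random(seed)
    best = -math.inf
    for _ in range(trials):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        step = [radius * math.cos(angle), radius * math.sin(angle)]
        best = max(best, linear_gain(gradient, step))
    return best


# Hypothetical example: f(x1, x2) = x1^2 + 3*x1*x2 + 2*x2^2 at X = (1, -2),
# whose gradient is (2*x1 + 3*x2, 3*x1 + 4*x2) = (-4, -5).
grad = [-4.0, -5.0]
radius = 0.01
gradient_gain = linear_gain(grad, steepest_step(grad, radius))
sampled_gain = best_random_gain(grad, radius)
```

Here `gradient_gain` equals the radius times the length of the gradient, and no sampled direction should exceed it (apart from rounding).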

Theorem 9.8. If $x_1, x_2, \ldots, x_n$ are nonnegative numbers and $\lambda_1, \lambda_2, \ldots, \lambda_n$
are positive numbers such that
$$\sum_{i=1}^{n} \lambda_i = 1,$$
then
$$\sum_{j=1}^{n} \lambda_j x_j \ge \prod_{j=1}^{n} x_j^{\lambda_j}. \tag{9.11}$$

If
$$\lambda_j = \frac{1}{n}, \quad j = 1, 2, \ldots, n,$$

then the left- and right-hand sides of (9.11) are the arithmetic mean and the
geometric mean of $x_1, x_2, \ldots, x_n$, respectively.
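Inequality (9.11) is easy to spot-check numerically. The following sketch evaluates both sides for a small set of numbers chosen purely for illustration; the function name and the data are assumptions, not part of the text.

```python
import math


def weighted_means(xs, weights):
    """Return the two sides of (9.11): sum(w*x) and prod(x**w)."""
    arithmetic = sum(w * x for w, x in zip(weights, xs))
    geometric = math.prod(x ** w for w, x in zip(weights, xs))
    return arithmetic, geometric


# Equal weights 1/n give the ordinary arithmetic and geometric means.
xs = [1.0, 4.0, 16.0]
equal = [1.0 / 3.0] * 3
am, gm = weighted_means(xs, equal)

# Unequal positive weights that still sum to one.
am_w, gm_w = weighted_means(xs, [0.5, 0.3, 0.2])
```

With equal weights the arithmetic mean of 1, 4, 16 is 7 and the geometric mean is the cube root of 64, so the left-hand side should exceed the right-hand side in both cases.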

9.3 Further Reading


The purpose of this section is to outline some of the texts that are available
to the reader who wishes to pursue some of the topics of this book to a
deeper level. We begin with general books on optimization and then cover,
in order, linear programming, integer programming, network analysis, dyna-
mic programming, and finally nonlinear programming. In addition to those
listed here, the reader should be aware of books in the fields of operations
research, management science, industrial engineering, and computer science
which sometimes contain substantial content of an optimization nature.
One of the most important general books on optimization is by Beightler,
Phillips, and Wilde (1979) covering classical optimization; linear, integer,
nonlinear, and dynamic programming; and optimal control, all at an
advanced level. Some of the topics are also covered at a more elementary
level by Wilde (1964). Another more elementary text covering linear pro-
gramming and a little nonlinear programming is Claycombe and Sullivan
(1975). Mital (1976) covers most of the topics of this book, together with a
chapter on game theory at an elementary level, as do Cooper and Steinberg
(1970). Husain and Gangiah (1976) cover nonlinear programming and varia-
tional methods, with special emphasis on the techniques required for certain
problems in chemical engineering. Sivazlian and Stanfel (1975) have written
an undergraduate text covering the optimization techniques needed to solve
many of the deterministic models of operations research. Gottfried and
Weisman (1973) cover most of the optimization topics at a more advanced
level than Sivazlian and Stanfel, and include a chapter on optimization
under uncertainty and risk. Finally Geoffrion (1972) has edited a collection
of expository papers covering a wide range of optimization topics at an
advanced level.
Dantzig (1963) is the most important early reference on linear program-
ming at an advanced level. Coverages at an elementary level include: Clay-
combe and Sullivan (1975), Daellenbach and Bell (1970) (with good sections
on the formulation of L.P.'s and a computer code), Driebeek (1969) (with a
good coverage of real-world applications), Campbell (1965) (covering the
linear algebra underlying L.P.), and Fryer (1978). Intermediate level texts
include: Hadley (1962), Spivey and Thrall (1970), Garvin (1960) (with some
good applications of L.P.), Smythe and Johnson (1966), and Bazaraa and
Jarvis (1977). Of the advanced texts on linear programming we mention:
Gass (1969) and Simmonard (1966) (both requiring a mathematical back-
ground). Also Gal (1978) has written an advanced text covering postoptimal
analysis and parametric programming.
Recent publications in integer programming include Salkin (1974) (ad-
vanced level) and Taha (1978) (very readable). Of the earlier works, Plane
and McMillan (1971) is easily accessible to those with little mathematical
background, Garfinkel and Nemhauser (1972) is more advanced and Greenberg
(1971) is an intermediate text with interesting I.P. applications. A recent
survey of integer programming articles published between 1976 and 1978
was completed by Hausmann (1978). An earlier survey was published by
Geoffrion and Marsten (1972).
The best early reference on network analysis is Ford and Fulkerson (1962).
Since then Busacker and Saaty (1965) have covered some aspects of network
flow in a mathematically sophisticated fashion and Hu (1969) has given
network flow problems a thorough examination at an advanced level. Geof-
frion (1972), mentioned earlier, contains articles on optimization in networks.
Plane and McMillan (1971), already mentioned, contains a chapter covering
most of the material in Chapter 5 which is accessible to those with little
background. Finally, Bazaraa and Jarvis (1977) cover the network flow and
shortest path problems in a book that is very easy to read.
As was mentioned in Chapter 6, Bellman (1957) wrote the first book on
dynamic programming. It is an advanced-level treatise. Since then Bellman
and Dreyfus (1962), Hadley (1964), Nemhauser (1966) and White (1969) have
written books which are also somewhat advanced in level. Bellman and
Dreyfus present many applications. Hadley contains, apart from two chapters
on D.P., a great deal of useful material on classical optimization, stochastic,
integer, and nonlinear programming. Nemhauser's book is difficult to read,
while White concentrates on the mathematical aspects of D.P. For the
reader with limited mathematics background, Dreyfus and Law (1977) is
recommended. While Dreyfus and Law state their book is graduate-level,
it concentrates on applications and numerical examples of D.P.
Of the many books which specialize in classical optimization we mention
Panik (1976). This book is intermediate in level and contains a great deal
of mathematical background before covering classical optimization and
many of its extensions. Panik does not cover variational problems and hence
we cite the following, which cover the calculus of variations in increasing
depth: Arthurs (1975), Craggs (1973), Young (1969), Smith (1974), Pars (1962),
Ewing (1969). Well worth special mention are Hestenes (1966) and Gelfand
and Fomin (1963). Hestenes covers introductory variational theory and also
optimal control theory in some detail. Gelfand and Fomin slant their
approach toward physical applications, and the proofs of many of the
theorems of Chapter 7 of the present book can be found there. Finally,
Blatt and Gray (1977) have provided an elementary derivation of Pontrya-
gin's maximum principle. Many of the books mentioned in the next para-
graph also contain sections on classical optimization.
There is an enormous amount of literature on nonlinear programming and
we mention only a relatively small number of references here. Among the
general references, Wilde and Beightler (1967) was mentioned earlier; Abadie
(1967) surveys many of the areas of nonlinear programming in a collection
of expository papers; Luenberger (1969) contains an advanced coverage of
the mathematical aspects of nonlinear programming; Zangwill (1969) has
become something of a classic and represents one of the first attempts at
unifying nonlinear programming theory; Pierre (1969) covers classical opti-
mization, the calculus of variations, linear and dynamic programming, the
maximum principle, as well as many nonlinear programming techniques at
the graduate level; Beveridge and Schechter (1970) constitutes a comprehensive
treatment of most of the theory of nonlinear programming as it was in
1970, with a valuable section on optimization in practice, at the senior/graduate
level; Aoki (1971) is an undergraduate level text written for an audience
interested in the applications rather than the mathematical theory of N.L.P.
and contains some applications to engineering; Martos (1975) sets out to
give a systematic treatment of the most important aspects of N.L.P. with
many numerical examples; and finally Simmons (1975) covers some classical
optimization but emphasises solution algorithms that have shown themselves
to be of continuing importance and practical utility. Worthy of special
mention are Himmelblau (1972) and Adby and Dempster (1974), which describe
and compare in simple terms many of the N.L.P. methods which have
proven to be effective.
Nonlinear programming books of a more specialized nature include:
Zoutendijk (1960), the original work on feasible directions; Hadley (1964),
which includes advanced-level material on separable problems, the Kuhn-
Tucker conditions, quadratic programming, and gradient methods; Kunzi,
Tzschach, and Zehnder (1968), which contains a list of FORTRAN and
ALGOL computer codes for some N.L.P. algorithms; Kowalik and Osborne
(1968), Fiacco and McCormick (1968), and Murray (1972) (covering methods of
computing optima of unconstrained problems); and Duffin, Peterson, and
Zener (1967), the fathers of geometric programming, present the mathe-
matical theory of G.P. and its application to problems in engineering design.
Also there is an excellent survey of unconstrained optimization methods by
Powell in Geoffrion (1972). Though dated, the survey gives a good introduc-
tion to the area.
References*

Abadie, J., ed. (1967) Nonlinear Programming. North-Holland. [350, 393]
Adby, P. R., and Dempster, M. A. H. (1974) Introduction to Optimization Methods.
Chapman and Hall. [394]
Aoki, M. (1971) Introduction to Optimization Techniques. Macmillan. [394]
Arthurs, A. M. (1975) Calculus of Variations. Routledge and Kegan Paul. [393]
Balas, E. (1965) An additive algorithm for solving linear programs with zero-one
variables. Operations Research 13: 517-546. [159]
Balinski, M. L., and Quandt, R. E. (1964) On an integer program for a delivery problem.
Operations Research 12: 300-304. [174]
Bazaraa, M. S., and Jarvis, J. J. (1977) Linear Programming and Network Flows. Wiley.
[392, 393]
Beightler, C. S., and Phillips, D. T. (1976) Applied Geometric Programming. Wiley.
[361]
Beightler, C. S., Phillips, D. T., and Wilde, D. J. (1979) Foundations of Optimization,
2nd ed. Prentice-Hall. [392]
Bellman, R. (1957) Dynamic Programming. Princeton Univ. Press. [3, 235, 393]
Bellman, R., and Dreyfus, S. E. (1962) Applied Dynamic Programming. Princeton
University Press. [235, 253, 393]
Bellmore, M., and Malone, J. C. (1971) Pathology of travelling salesmen subtour
elimination algorithms. Operations Research 19: 278-307. [174]
Beltrami, E., and Bodin, L. (1974) Networks and vehicle routing for municipal waste
collection. Networks 1: 65-94. [174]
Beveridge, G. S., and Schechter, R. S. (1970) Optimization: Theory and Practice.
McGraw-Hill. [394]
Blatt, J. M., and Gray, J. D. (1977) An elementary derivation of Pontryagin's maximum
principle of optimal control theory. J. Aust. Math. Soc. 20B: 142-6. [393]
Box, G. E. P., and Wilson, K. B. (1951) On the experimental attainment of optimum
conditions. J. Roy. Stat. Soc. B13: 1. [330]
Box, M. J. (1966) A comparison of several current optimization methods and the use
of transformation in constrained problems. Comp. J. 9: 67-77. [343]
Brent, R. P. (1973) Algorithms for Minimization Without Derivatives. Prentice-Hall.
[343]

* References to page numbers in the text are given in square parentheses.

Broyden, C. G. (1971) The convergence of an algorithm for solving sparse nonlinear
systems. Math. Comp. 25: 285-294. [329]
Busacker, R. G., and Gowan, P. J. (1961) A procedure for determining a family of
minimal-cost network flow patterns. ORO Technical Rept. 15, Operations Research
Office, Johns Hopkins University. [206]
Busacker, R. G., and Saaty, T. L. (1965) Finite Graphs and Networks. McGraw-Hill.
[188, 392]
Campbell, H. G. (1965) Matrices, Vectors and Linear Programming. Appleton-Century-Crofts.
[392]
Carroll, G. W. (1961) The created response surface technique for optimizing nonlinear
restrained systems. Operations Research 9: 169-184. [347]
Cauchy, A. (1847) Méthode générale pour la résolution des systèmes d'équations
simultanées. Compt. Rend. Acad. Sci. (Paris) 25: 536-38. [330]
Clarke, G., and Wright, S. W. (1964) Scheduling of vehicles from a central depot to
a number of delivery points. Operations Research 12: 568-681. [177]
Claycombe, W. W., and Sullivan, W. G. (1975) Foundations of Mathematical Programming.
Reston. [392]
Conte, S. D., and de Boor, C. (1965) Elementary Numerical Analysis, 2nd ed. McGraw-Hill.
[264]
Cooper, L., and Steinberg, D. (1970) Introduction to Methods of Optimization. Saunders.
[392]
Craggs, J. W. (1973) Calculus of Variations. Allen and Unwin. [393]
Daellenbach, H. G., and Bell, E. J. (1970) User's Guide to Linear Programming. Prentice-Hall.
[392]
Dakin, R. J. (1965) A tree search algorithm for mixed integer programming problems.
Computer J. 8: 250-255. [155]
Dantzig, G. B. (1963) Linear Programming and Extensions. Princeton Univ. Press.
[2, 122, 392]
Dantzig, G. B., and Wolfe, P. (1960) Decomposition principle for linear programs.
Operations Research 8: 101-111. [122]
Davidon, W. C. (1959) Variable metric methods for minimization. Argonne Laboratory
Rep. ANL-5990, Revised. [328, 329, 348]
Deo, N. (1974) Graph Theory with Applications. Prentice-Hall. [188]
Dijkstra, E. W. (1959) A note on two problems in connection with graphs. Numerische
Mathematik 1: 269. [194]
Dreyfus, S. E., and Law, A. M. (1977) The Art and Theory of Dynamic Programming.
Academic Press. [393]
Driebeek, N. J. (1969) Applied Linear Programming. Addison-Wesley. [392]
Duffin, R. J., Peterson, E., and Zener, C. (1967) Geometric Programming: Theory and
Applications. Wiley. [311, 361, 393]
Eastman, W. L. (1958) Linear programming with pattern constraints. Ph.D. Diss.,
Harvard University. [174]
Elsgolc, L. E. (1961) Calculus of Variations. Pergamon Press Ltd. [295]
Ewing, G. M. (1969) Calculus of Variations with Applications. Norton. [393]
Fiacco, A. V., and McCormick, G. P. (1968) Nonlinear Programming: Sequential
Unconstrained Minimization Techniques. Wiley. [347, 393]
Fletcher, R. (1965) Function minimization without evaluating derivatives - a review.
Comp. J. 8: 33-41. [343]
Fletcher, R., ed. (1969) Optimization. Academic Press. [348]
Fletcher, R. (1970) A new approach to variable metric algorithms. Comp. J. 13: 317-322.
[329]
Fletcher, R., and Powell, M. J. D. (1963) A rapidly convergent descent method for
minimization. Comp. J. 6: 163-8. [328, 329]
Fletcher, R., and Reeves, C. M. (1964) Function minimization by conjugate gradients.
Comp. J. 7: 149. [334]
Floyd, R. W. (1962) Algorithm 97-Shortest Path, Comm. ACM 5: 345. [194]
Ford, L. R., and Fulkerson, D. R. (1962) Flows in Networks. Princeton Univ. Press.
[201,209,392]
Forsythe, G. E., and Motzkin, T. S. (1951) Acceleration of the optimum gradient
method. Bull. Amer. Math. Soc. 57: 304-305. [331]
Foster, B. A., and Ryan, D. M. (1976) An integer programming approach to the vehicle
scheduling problem. Opnal. Res. Quart. 27: 367-384. [177]
Foulds, L. R., Robinson, D. F., and Read, E. G. (1977a) A manual procedure for the
school bus routing problem. Australian Road Research 7: 21-25. [174]
Foulds, L. R., O'Brien, L. E., and Pun, T. J. (1977b) Computer-based milk tanker
scheduling. NZ J. Dairy Sci. and Tech. 12: 141-145. [174]
Fryer, M. J., (1978) An Introduction to Linear Programming and Game Theory. Arnold.
[392]
Gal, T. (1978) Postoptimal Analysis, Parametric Programming and Related Topics.
McGraw-Hill. [392]
Gale, D. (1960) The Theory of Linear Economic Models. McGraw-Hill. [56]
Garfinkel, R. S., and Nemhauser, G. L. (1970) Optimal political districting by implicit
enumeration techniques. Mgmt. Sci. 16: 495-508. [179]
Garfinkel, R. S., and Nemhauser, G. L. (1972) Integer Programming. Wiley. [174, 392]
Garvin, W. W. (1960) Introduction to Linear Programming. McGraw-Hill. [392]
Garvin, W. W., Crandall, H. W., John, J. B., and Spellman, R. A. (1957) Applications
of linear programming in the oil industry. Mgmt. Sci. 3: 407. [174, 175]
Gass, S. I. (1969) Linear Programming, Methods and Applications, 3rd ed. McGraw-Hill.
[35, 38, 392]
Gelfand, I. M., and Fomin, S. V. (1963) Calculus of Variations. Prentice-Hall. [300,
301,393]
Geoffrion, A. M., ed. (1972) Perspectives on Optimization. Addison-Wesley. [392, 393]
Geoffrion, A., and Marsten, R. (1972) Integer programming: A framework and state-
of-the-art survey. Mgmt. Sci. 18: 465-91. [392]
Golden, B., Magnanti, T., and Nguyen, H. (1975) Implementing vehicle routing
algorithms. M.I.T. Operations Research Center Technical Rept. No. 115. [174]
Goldfarb, D. (1970) A family of variable-metric methods derived by variational means.
Math. Comp. 24: 23-26. [329]
Gomory, R. E. (1958) Outline of an algorithm for integer solutions to linear programs.
Bull. Amer. Math. Soc. 64: 275-278. [2, 162]
Gottfried, B. S., and Weisman, J. (1973) Introduction to Optimization Theory. Prentice-
Hall. [303, 392]
Greenberg, N. (1971) Integer Programming. Academic Press. [392]
Griffith, R. E., and Stewart, R. A. (1961) A nonlinear programming technique for the
optimization of continuous processing systems. Mgmt. Sci. 7: 379-392. [351]
Hadley, G. H. (1962) Linear Programming. Addison-Wesley. [122,392]
Hadley, G. (1964) Nonlinear and Dynamic Programming. Addison-Wesley. [235, 393]
Hancock, H. (1960) Theory of Maxima and Minima. Dover Publications. [273]
Harary, F. (1969) Graph Theory. Addison-Wesley. [188]
Hausmann, D., ed. (1978) Integer Programming and Related Areas. Springer-Verlag.
[392]
Henrici, P. (1964) Elements of Numerical Analysis. Wiley. [273]
Hess, S., Weaver, J., Siegfeldt, H., Whelan, J., and Zitlau, P. (1965) Nonpartisan
political districting by computer. Operations Research 13: 998-1006. [179]
Hestenes, M. R. (1966) Calculus of Variations and Optimal Control Theory. Wiley.
[393]
Himmelblau, D. M. (1972) Applied Nonlinear Programming. McGraw-Hill. [265, 345,
394]
Hooke, K., and Jeeves, T. A. (1961) Direct search solution of numerical and statistical
problems. J. ACM 8: 212-229. [334]
Hu, T. C. (1969) Integer Programming and Network Flows. Addison-Wesley. [22,392]
Husain, A., and Gangiah, K. (1976) Optimization Techniques. Macmillan India. [392]
Jacoby, S. L. S., Kowalik, J. S., and Pizzo, J. T. (1972) Iterative Methods for Nonlinear
Optimization Problems. Prentice-Hall. [345]
Kiefer, J. (1957) Sequential minimax search for a maximum. Proc. Amer. Math. Soc.
4: 502-506. [320]
Kowalik, J., and Osborne, M. R. (1968) Methods for Unconstrained Optimization
Problems. Elsevier. [393]
Kruskal, J. B. (1956) On the shortest spanning subtree of a graph and the traveling
salesman problem. Proc. Amer. Math. Soc. 7:48. [194]
Kuhn, H. W., and Tucker, A. W. (1951) Nonlinear programming. Proc. 2nd Berkeley
Symp. on Math. Stat. Prob., Univ. Calif. Press., (1951) p. 481-492. [3, 284]
Kunzi, H. P., Tzschach, H. G., and Zehnder, C. A. (1968) Numerical Methods of Math-
ematical Optimization. Academic Press. [393]
Land, A., and Doig, A. (1960) An automatic method of solving discrete programming
problems. Econometrica 28:497-520. [155]
Lill, S. A. (1970) A modified Davidon method for finding the minimum of a function
using difference approximations for derivatives. Computer J. 13: 111-113. [345]
Little, J. D. C., Murty, K. G., Sweeney, P. W., and Karel, C. (1963) An algorithm for
the traveling salesman problem. Operations Research 11: 979-989. [174]
Luenberger, D. G. (1969) Optimization by Vector Space Methods. Wiley. [393]
Martos, B. (1975) Nonlinear Programming. North-Holland. [394]
Mital, K. V. (1976) Optimization Methods. Wiley Eastern. [392]
Murchland, J. D. (1967) The once-through method of finding all shortest distances in a
graph from a single origin. London Graduate School of Business Studies, Rept.
LBS-TNT-56. [194]
Murray, W., ed. (1972) Numerical Methods for Unconstrained Optimization. Academic
Press. [393]
Nemhauser, G. L., (1966) Introduction to Dynamic Programming. Wiley. [235, 393]
Panik, M. J. (1976) Classical Optimization. North-Holland. [393]
Pars, L. A. (1962) An Introduction to the Calculus of Variations. Heineman. [393]
Pierre, D. A. (1969) Optimization Theory with Applications. Wiley. [394]
Plane, D. R., and McMillan, C. (1971) Discrete Optimization. Prentice-Hall. [209,
392]
Pontryagin, L. S., Boltyanskii, V. G., Gamkrelidze, R. V., and Mishchenko, E. F. (1962)
The Mathematical Theory of Optimal Processes. Wiley-Interscience. [5,306]
Powell, M. J. D. (1964) An efficient method for finding the minimum of a function of
several variables without calculating derivatives. Comp. J. 7: 155-62. [342, 345,
348]
Powell, M. J. D. (1968) On the calculation of orthogonal vectors. Comp. J. 11: 302-304.
[342]
Prim, R. C. (1957) Shortest connection networks and some generalizations. Bell Syst.
Tech. J. 36: 1389. [196]
Rosen, J. B. (1960) The gradient projection method for nonlinear programming. Part 1 :
Linear constraints. SIAM J. Appl. Math. 8: 181-217. [346]
Rosenbrock, H. H. (1960) An automatic method for finding the greatest or least value
of a function. Comp. J. 3: 175-184. [341]
Salkin, H. M. (1974) Integer Programming. Addison-Wesley. [392]
Shah, B. V., Buehler, R. J., and Kempthorne, O. (1964) Some algorithms for minimizing
a function of several variables. SIAM J. Appl. Math. 12: 74-92. [332]
Shanno, D. F. (1970) Conditioning of Quasi-Newton methods for function minimization.
Math. Comp. 24: 647-656. [329]
Simmonard, M. (1966) Linear Programming. Prentice-Hall. [392]
Simmons, D. M. (1975) Nonlinear Programming for Operations Research. Prentice-Hall.
[394]
Sivazlian, B. D., and Stanfel, L. E. (1975) Optimization Techniques in Operations
Research. Prentice-Hall. [392]
Smith, D. R. (1974) Variational Methods in Optimization. Prentice-Hall. [393]
Smith, R. G., Foulds, L. R., and Read, E. G. (1976) A political redistricting problem.
New Zealand Operational Research 4: 37 -52. [179]
Smythe, W. R., and Johnson, L. A. (1966) Introduction to Linear Programming with
Applications. Prentice-Hall. [392]
Spivey, W. A., and Thrall, R. M. (1970) Linear Optimization. Holt, Rinehart and
Winston. [392]
Stewart, G. W. (1957) A modification of Davidon's minimization method to accept
difference approximations of derivatives. J. A.C.M. 14: 72-83. [344, 345]
Swann, W. H. (1964) Report on the development of a new direct search method for
optimization. I.C.I. C.I. Lab. Res. Note. 6413. [345]
Taha, H. A. (1976) Operations Research, 2nd ed. Macmillan. [221]
Taha, H. A. (1978) Integer Programming. Macmillan. [392]
Turner, W. C., Ghare, P. M., and Foulds, L. R. (1974) Transportation routing problem:
A survey. AIIE Transactions 6: 288-301. [174]
Wagner, J. (1968) An application of integer programming to legislative redistricting.
34th National Meeting of the Operations Research Society of America. [179]
Watson-Gandy, C. and Foulds, L. R. (1981) The Vehicle Scheduling Problem: A survey.
New Zealand Operational Research 9 no. 2: 73-92. [174]
White, D. J. (1969) Dynamic Programming. Oliver and Boyd. [235,393]
Wilde, D. J. (1964) Optimum Seeking Methods. Prentice-Hall. [392]
Wilde, D. J., and Beightler, C. S. (1967) Foundations of Optimization. Prentice-Hall.
[284,393]
Young, L. C. (1969) Calculus of Variations and Optimal Control Theory. Saunders. [393]
Zangwill, W. I. (1969) Nonlinear Programming. Prentice-Hall. [394]
Zoutendijk, K. G. (1960) Method of Feasible Directions. Elsevier. [345, 393]
Solutions to Selected Exercises

Chapter 2

Section 2.8
1(a). Let
$x_1$ = the number of chocolate cakes
$x_2$ = the number of banana cakes.
Then the problem is to:

Maximize:    $75x_1 + 60x_2$
subject to:  $4x_1 + 6x_2 \le 96$
             $2x_1 + x_2 \le 24$
             $x_1, x_2 \ge 0$.
The optimum point can be found graphically (see Figure S.1) or by solving
the two equations:
$4x_1 + 6x_2 = 96$
$2x_1 + x_2 = 24$
$\Rightarrow 4x_2 = 48$.

Thus the optimal solution is

$x_1^* = 6$, $x_2^* = 12$.
The value is
$6 \times 0.75 + 12 \times 0.60 = 11.7$.

[Figure S.1]

Hence the best profit the baker can hope to make is $11.70 by baking six
chocolate cakes and twelve banana cakes a day.
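The arithmetic of this solution can be reproduced exactly. The sketch below solves the two binding constraints by Cramer's rule with exact fractions; the helper name `solve_2x2` is an illustrative assumption.

```python
from fractions import Fraction


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*x1 + a12*x2 = b1, a21*x1 + a22*x2 = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the two lines are parallel")
    x1 = Fraction(b1 * a22 - a12 * b2, det)
    x2 = Fraction(a11 * b2 - b1 * a21, det)
    return x1, x2


# Binding constraints of Exercise 1(a): 4*x1 + 6*x2 = 96 and 2*x1 + x2 = 24.
x1, x2 = solve_2x2(4, 6, 96, 2, 1, 24)

# Objective in cents (75 per chocolate cake, 60 per banana cake), then dollars.
profit_cents = 75 * x1 + 60 * x2
profit_dollars = profit_cents / 100
```

The intersection is $x_1 = 6$, $x_2 = 12$, and the profit of 1170 cents agrees with the $11.70 quoted above.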
2(a). Let
$x_1$ = the number of trucks manufactured
$x_2$ = the number of automobiles manufactured
$x_3$ = the number of vans manufactured.
Then the problem is (on dividing the profits by 1000) to:

Maximize:    $6x_1 + 4x_2 + 3x_3$
subject to:  $4x_1 + 5x_2 + 3x_3 \le 12$
             $3x_1 + 4x_2 + 2x_3 \le 10$
             $4x_1 + 2x_2 + x_3 \le 8$
             $x_1, x_2, x_3 \ge 0$.
On introducing slack variables, the problem becomes

Maximize:    $6x_1 + 4x_2 + 3x_3$
subject to:  $4x_1 + 5x_2 + 3x_3 + x_4 = 12$
             $3x_1 + 4x_2 + 2x_3 + x_5 = 10$
             $4x_1 + 2x_2 + x_3 + x_6 = 8$
             $x_i \ge 0$, $i = 1, 2, 3, 4, 5, 6$.

The problem can now be solved using the simplex method.

          x_1    x_2    x_3    x_4    x_5    x_6    r.h.s.   Ratio

          4      5      3      1      0      0      12       12/4
          3      4      2      0      1      0      10       10/3
          [4]    2      1      0      0      1      8        8/4
x_0       -6     -4     -3     0      0      0      0

          x_1    x_2    x_3    x_4    x_5    x_6    r.h.s.   Ratio

          0      3      [2]    1      0      -1     4        4/2
          0      5/2    5/4    0      1      -3/4   4        16/5
          1      1/2    1/4    0      0      1/4    2        8
x_0       0      -1     -3/2   0      0      3/2    12

          x_1    x_2    x_3    x_4    x_5    x_6    r.h.s.

          0      3/2    1      1/2    0      -1/2   2
          0      5/8    0      -5/8   1      -1/8   3/2
          1      1/8    0      -1/8   0      3/8    3/2
x_0       0      5/4    0      3/4    0      3/4    15

Tableau 3 yields the optimal solution:

$x_1^* = 3/2$, $x_3^* = 2$, $x_5^* = 3/2$,
$x_2^* = x_4^* = x_6^* = 0$, $x_0^* = 15,000$.
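The tableau calculations for problems of this kind can be automated. What follows is only a minimal sketch of a tableau simplex routine for problems of the form "maximize $cx$ subject to $Ax \le b$, $x \ge 0$" with nonnegative right-hand sides; it uses exact fractions and the most-negative-coefficient entering rule. The function name and layout are assumptions made for illustration, and the routine takes no precautions against cycling in degenerate problems.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming every b_i >= 0."""
    m, n = len(A), len(c)
    # Each row: constraint coefficients, slack columns, right-hand side.
    rows = [[Fraction(v) for v in A[i]]
            + [Fraction(int(i == k)) for k in range(m)]
            + [Fraction(b[i])] for i in range(m)]
    # The x0 row stores -c_j, as in the tableaux of the text.
    obj = [Fraction(-v) for v in c] + [Fraction(0)] * (m + 1)
    basis = [n + i for i in range(m)]          # the slacks start in the basis
    while True:
        col = min(range(n + m), key=lambda j: obj[j])
        if obj[col] >= 0:
            break                              # no negative x0-row entry: optimal
        ratios = [(rows[i][-1] / rows[i][col], i)
                  for i in range(m) if rows[i][col] > 0]
        if not ratios:
            raise ValueError("problem is unbounded")
        _, r = min(ratios)                     # minimum-ratio test
        pivot = rows[r][col]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(m):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        factor = obj[col]
        obj = [a - factor * p for a, p in zip(obj, rows[r])]
        basis[r] = col
    x = [Fraction(0)] * (n + m)
    for i, j in enumerate(basis):
        x[j] = rows[i][-1]
    return x, obj[-1]


# Exercise 2(a): profits already divided by 1000.
x, value = simplex_max([6, 4, 3],
                       [[4, 5, 3], [3, 4, 2], [4, 2, 1]],
                       [12, 10, 8])
```

For the data of Exercise 2(a) the routine should follow the same two pivots as the tableaux above and return $x_1 = 3/2$, $x_3 = 2$, $x_5 = 3/2$ with objective value 15.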

3(a). Let
$x_1$ = the number of pounds of chutney produced per week
$x_2$ = the number of pounds of sauce produced per week.
Then, with the introduction of the slack variables $x_3$, $x_4$, and $x_5$, and the
artificial variable $x_6$, the problem is

Maximize:    $4x_1 + 5x_2 - Mx_6 = x_0$
subject to:  $3x_1 + 5x_2 + x_3 = 24$          (1)
             $4x_1 + 2x_2 + x_4 = 16$          (2)
             $x_1 + x_2 - x_5 + x_6 = 3$       (3)
             $x_j \ge 0$, $j = 1, 2, \ldots, 6$.

This formulation will be solved by the big M method. The feasible region
for the problem is shown in Figure S.2.

Figure S.2

The initial tableau for the problem is:

Constraints   x_1       x_2       x_3   x_4   x_5   x_6   r.h.s.

(1)           3         5         1     0     0     0     24
(2)           4         2         0     1     0     0     16
(3)           1         [1]       0     0     -1    1     3
x_0           -4        -5        0     0     0     M     0

The initial basis is $(x_3, x_4, x_6)$. However, because the objective function
coefficient of the basic variable $x_6$ is nonzero, the tableau is not yet in canonical
form. This is remedied by replacing the $x_0$ row by the sum of the $x_0$ row
and $-M$ times (3). This gives:

Constraints   x_1       x_2       x_3   x_4   x_5   x_6   r.h.s.   Ratio

(1)           3         5         1     0     0     0     24       24/5
(2)           4         2         0     1     0     0     16       16/2
(3)           1         [1]       0     0     -1    1     3        3/1
x_0           -(M+4)    -(M+5)    0     0     M     0     -3M

The simplex iterations required to reach the optimal solution are

Constraints   x_1     x_2   x_3     x_4     x_5   x_6     r.h.s.   Ratio

(1)           -2      0     1       0       [5]   -5      9        9/5
(2)           2       0     0       1       2     -2      10       10/2
(3)           1       1     0       0       -1    1       3
x_0           1       0     0       0       -5    (M+5)   15

Constraints   x_1     x_2   x_3     x_4     x_5   x_6     r.h.s.

(1)           -2/5    0     1/5     0       1     -1      9/5
(2)           [14/5]  0     -2/5    1       0     0       32/5
(3)           3/5     1     1/5     0       0     0       24/5
x_0           -1      0     1       0       0     M       24

Constraints   x_1     x_2   x_3     x_4     x_5   x_6     r.h.s.

(1)           0       0     1/7     1/7     1     -1      19/7
(2)           1       0     -1/7    5/14    0     0       16/7
(3)           0       1     2/7     -3/14   0     0       24/7
x_0           0       0     6/7     5/14    0     M       184/7

The optimal solution is

$x_1^* = 2\tfrac{2}{7}$, $x_2^* = 3\tfrac{3}{7}$, $x_5^* = 2\tfrac{5}{7}$,
$x_3^* = x_4^* = 0$, $x_0^* = 26\tfrac{2}{7}$.

Thus the housewife should make $2\tfrac{2}{7}$ lbs chutney and $3\tfrac{3}{7}$ lbs sauce to obtain
a maximum profit of $2.63.
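Since this problem has only two structural variables, the result of the big M calculation can be cross-checked by examining every vertex of the feasible region. The sketch below is such a check, not the method of the text; it assumes a bounded, nonempty feasible region, and the function name is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def best_vertex(c, constraints):
    """Maximize c[0]*x1 + c[1]*x2 over {a1*x1 + a2*x2 <= b} by checking vertices."""
    best = None
    for (a1, a2, b1), (a3, a4, b2) in combinations(constraints, 2):
        det = a1 * a4 - a2 * a3
        if det == 0:
            continue                       # parallel boundary lines never meet
        x1 = Fraction(b1 * a4 - a2 * b2, det)
        x2 = Fraction(a1 * b2 - b1 * a3, det)
        if all(p * x1 + q * x2 <= r for p, q, r in constraints):
            value = c[0] * x1 + c[1] * x2
            if best is None or value > best[0]:
                best = (value, x1, x2)
    return best


# Exercise 3(a), every constraint written as a1*x1 + a2*x2 <= b.
constraints = [
    (3, 5, 24),      # (1)
    (4, 2, 16),      # (2)
    (-1, -1, -3),    # (3): x1 + x2 >= 3
    (-1, 0, 0),      # x1 >= 0
    (0, -1, 0),      # x2 >= 0
]
value, x1, x2 = best_vertex((4, 5), constraints)
```

The best vertex is the intersection of constraints (1) and (2), which agrees with the final tableau: $x_1 = 16/7$, $x_2 = 24/7$, $x_0 = 184/7$.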
4(a). This problem can be expressed in mathematical terms. The variables
are defined as follows. Let

$x_1$ = the number of classrooms constructed
$x_2$ = the number of houses constructed.
The problem can now be stated:

Maximize:    $4x_1 + 5x_2$
subject to:  $4x_1 + 5x_2 \le 32$    (1)
             $4x_1 + 3x_2 \le 24$    (2)
             $3x_1 + 2x_2 \le 20$    (3)
             $2x_1 + x_2 \le 16$     (4)

which converted to standard form is

Maximize:    $4x_1 + 5x_2$
subject to:  $4x_1 + 5x_2 + x_3 = 32$
             $4x_1 + 3x_2 + x_4 = 24$
             $3x_1 + 2x_2 + x_5 = 20$
             $2x_1 + x_2 + x_6 = 16$.
This problem can now be solved using the simplex method.

Constraints   x_1     x_2   x_3     x_4   x_5   x_6   r.h.s.   Ratio

(1)           4       [5]   1       0     0     0     32       32/5
(2)           4       3     0       1     0     0     24       24/3
(3)           3       2     0       0     1     0     20       20/2
(4)           2       1     0       0     0     1     16       16/1
x_0           -4      -5    0       0     0     0     0

Constraints   x_1     x_2   x_3     x_4   x_5   x_6   r.h.s.   Ratio

(1)           4/5     1     1/5     0     0     0     32/5     8
(2)           [8/5]   0     -3/5    1     0     0     24/5     3
(3)           7/5     0     -2/5    0     1     0     36/5     36/7
(4)           6/5     0     -1/5    0     0     1     48/5     8
x_0           0       0     1       0     0     0     32
The last tableau yields the optimal solution:

$x_2^* = 32/5$, $x_4^* = 24/5$, $x_5^* = 36/5$,
$x_6^* = 48/5$, $x_1^* = x_3^* = 0$, $x_0^* = 32$.
However, the nonbasic variable $x_1$ has a zero $x_0$-row coefficient, indicating
that the objective function value would remain unchanged if $x_1$ was brought
into the basis:

Constraints   x_1   x_2   x_3     x_4     x_5   x_6   r.h.s.

(1)           0     1     1/2     -1/2    0     0     4
(2)           1     0     -3/8    5/8     0     0     3
(3)           0     0     1/8     -7/8    1     0     3
(4)           0     0     1/4     -3/4    0     1     6
x_0           0     0     1       0       0     0     32

This tableau yields the optimal solution:

$x_1^* = 3$, $x_2^* = 4$, $x_5^* = 3$, $x_6^* = 6$, $x_3^* = x_4^* = 0$, $x_0^* = 32$.
Thus the builder should build 3 classrooms and 4 houses and maximize his
profit at $32,000.
The $x_0$-row value of $x_4$ is zero, indicating that $x_4$ could replace $x_1$ in the
basis at no change in objective function value. This would produce tableau
2. Thus this problem has two basic optimal solutions.
The problem is solved graphically in Figure S.3. When the objective
function is drawn at the optimal level, it coincides with constraint line (1). This
means that all points on the line from point $(0, 6\tfrac{2}{5})$ to $(3, 4)$ represent optimal
solutions. This situation can be stated as follows:
$4x_1^* + 5x_2^* = 32$, $0 \le x_1^* \le 3$, $x_0^* = 32$.
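The claim that every point on this line segment is optimal can be checked directly. The sketch below samples a few points between the two basic optimal solutions and tests feasibility and objective value with exact fractions; the helper names and the number of sample points are arbitrary choices made for illustration.

```python
from fractions import Fraction

# Constraints (1)-(4) of Exercise 4(a) as (a1, a2, b): a1*x1 + a2*x2 <= b.
CONSTRAINTS = [(4, 5, 32), (4, 3, 24), (3, 2, 20), (2, 1, 16)]


def objective(x1, x2):
    return 4 * x1 + 5 * x2


def is_feasible(x1, x2):
    return x1 >= 0 and x2 >= 0 and all(
        a1 * x1 + a2 * x2 <= b for a1, a2, b in CONSTRAINTS)


def segment_points(start, end, pieces=4):
    """Evenly spaced points on the segment from start to end, ends included."""
    points = []
    for k in range(pieces + 1):
        t = Fraction(k, pieces)
        points.append(tuple((1 - t) * s + t * e for s, e in zip(start, end)))
    return points


# The two basic optimal solutions found above: (0, 32/5) and (3, 4).
points = segment_points((Fraction(0), Fraction(32, 5)), (Fraction(3), Fraction(4)))
values = [objective(x1, x2) for x1, x2 in points]
```

Every sampled point should be feasible with objective value exactly 32, as the tableaux and the graph indicate.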

[Figure S.3]

5(a). In mathematical form the problem is

Maximize:    $3x_1 + 2x_2 + 4x_3 + x_4$
subject to:  $8x_1 + 2x_2 + 5x_3 + 4x_4 \le 16$
             $6x_1 + 4x_2 + 3x_3 + 2x_4 \le 10$
             $3x_1 + 3x_2 + 2x_3 + x_4 \le 6\tfrac{5}{7}$.
In standard form the problem becomes

Maximize:    $3x_1 + 2x_2 + 4x_3 + x_4$
subject to:  $8x_1 + 2x_2 + 5x_3 + 4x_4 + x_5 = 16$
             $6x_1 + 4x_2 + 3x_3 + 2x_4 + x_6 = 10$
             $3x_1 + 3x_2 + 2x_3 + x_4 + x_7 = 6\tfrac{5}{7}$
             $x_i \ge 0$, $i = 1, 2, \ldots, 7$.

The final two tableaux required to solve the problem are displayed:

        x_1     x_2      x_3   x_4     x_5     x_6      x_7   r.h.s.   Ratio

        8/5     2/5      1     4/5     1/5     0        0     16/5     8
        6/5     [14/5]   0     -2/5    -3/5    1        0     2/5      1/7
        -1/5    11/5     0     -3/5    -2/5    0        1     11/35    1/7
x_0     17/5    -2/5     0     11/5    4/5     0        0     64/5

        x_1     x_2      x_3   x_4     x_5     x_6      x_7   r.h.s.

        10/7    0        1     6/7     2/7     -1/7     0     22/7
        3/7     1        0     -1/7    -3/14   5/14     0     1/7
        -8/7    0        0     -2/7    1/14    -11/14   1     0
x_0     25/7    0        0     15/7    5/7     1/7      0     90/7

It can be seen that $x_2$ should enter the basis in tableau 2 but a tie occurs
on forming the ratios to decide which variable leaves the basis. In the next
iteration one of the basic variables is $x_7 = 0$. This basic feasible solution is
called a degenerate solution. The optimum is reached at the first stage of
degeneracy.
The solution to the problem is that the farmer should cultivate $\tfrac{1}{7}$ acre
of barley and $3\tfrac{1}{7}$ acres of wheat. His profit would be $1285.71.
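The tie in the ratio test that produces the degenerate solution can be reproduced from the $x_2$ column and right-hand side of the first tableau displayed above. The following is a small illustrative sketch; the function name is an assumption.

```python
from fractions import Fraction as F


def ratio_test(column, rhs):
    """Minimum-ratio test: return the smallest ratio and every row attaining it."""
    ratios = {i: rhs[i] / column[i]
              for i in range(len(column)) if column[i] > 0}
    smallest = min(ratios.values())
    tied_rows = [i for i, q in ratios.items() if q == smallest]
    return smallest, tied_rows


# x2 column and right-hand sides of the three constraint rows.
x2_column = [F(2, 5), F(14, 5), F(11, 5)]
rhs = [F(16, 5), F(2, 5), F(11, 35)]
smallest, tied_rows = ratio_test(x2_column, rhs)
```

Rows two and three (indices 1 and 2) share the minimum ratio 1/7, so whichever is chosen as the pivot row, the other basic variable drops to zero in the next tableau.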
6(a). The problem can be expressed in mathematical terms as follows. Let
$x_1$ = the units of cheese produced
$x_2$ = the units of butter produced
$x_3$ = the units of milk powder produced
$x_4$ = the units of yoghurt produced.

The problem can now be restated:

Maximize:    $4x_1 + 3x_2 + 2x_3 + x_4$
subject to:  $2x_1 + 3x_2 + 4x_3 + 2x_4 \le 8$
             $3x_1 + x_2 + 4x_3 + 2x_4 \le 9$
             $3x_1 + 2x_2 + 5x_3 + x_4 \le 9$,
which converted to standard form is
Maximize:    $4x_1 + 3x_2 + 2x_3 + x_4$
subject to:  $2x_1 + 3x_2 + 4x_3 + 2x_4 + x_5 = 8$
             $3x_1 + x_2 + 4x_3 + 2x_4 + x_6 = 9$
             $3x_1 + 2x_2 + 5x_3 + x_4 + x_7 = 9$.
The problem can now be solved using the simplex method.

        x_1   x_2     x_3     x_4     x_5    x_6   x_7     r.h.s.   Ratio

        2     3       4       2       1      0     0       8        8/2
        3     1       4       2       0      1     0       9        9/3
        [3]   2       5       1       0      0     1       9        9/3
x_0     -4    -3      -2      -1      0      0     0       0

        x_1   x_2     x_3     x_4     x_5    x_6   x_7     r.h.s.   Ratio

        0     [5/3]   2/3     4/3     1      0     -2/3    2        6/5
        0     -1      -1      1       0      1     -1      0
        1     2/3     5/3     1/3     0      0     1/3     3        9/2
x_0     0     -1/3    14/3    1/3     0      0     4/3     12

        x_1   x_2     x_3     x_4     x_5    x_6   x_7     r.h.s.

        0     1       2/5     4/5     3/5    0     -2/5    6/5
        0     0       -3/5    9/5     3/5    1     -7/5    6/5
        1     0       7/5     -1/5    -2/5   0     3/5     11/5
x_0     0     0       24/5    3/5     1/5    0     6/5     62/5

The optimal solution can be found from tableau 3:

$x_1^* = 11/5$, $x_2^* = 6/5$, $x_6^* = 6/5$, $x_i^* = 0$ otherwise, $x_0^* = \$1,240$.
Thus the dairy factory should produce $11/5$ ton of cheese and $6/5$ ton of butter
daily to maximize profit at $1,240.
Note that one of the basic feasible solutions produced by the simplex
method was degenerate, as the variable $x_6$ had zero value. However, there
is no degeneracy in the tableau of the next iteration. This is because the
coefficient of the entering variable $x_2$ is negative in the $x_6$ row. Thus no ratio is
formed.
7(a).
Maximize:    $4x_1 + 3x_2$
subject to:  $3x_1 + 4x_2 \le 12$    (1)
             $5x_1 + 2x_2 \le 8$     (2)
             $x_1 + x_2 \ge 5$       (3)
             $x_1, x_2 \ge 0$.
When this problem is expressed graphically (Figure S.4) it can be seen that
there does not exist a point which will satisfy all constraints simultaneously.
Hence the problem does not have a feasible solution.
Two-Phase Method
Maximize:    $4x_1 + 3x_2$
subject to:  $3x_1 + 4x_2 + x_3 = 12$
             $5x_1 + 2x_2 + x_4 = 8$
             $x_1 + x_2 + x_5 - x_6 = 5$.

[Figure S.4; axes $x_1$ and $x_2$]

Phase I

x_1     x_2   x_3     x_4     x_5   x_6   r.h.s.

3       4     1       0       0     0     12
5       2     0       1       0     0     8
1       1     0       0       1     -1    5
0       0     0       0       1     0     0

3       [4]   1       0       0     0     12
5       2     0       1       0     0     8
1       1     0       0       1     -1    5
-1      -1    0       0       0     1     -5

3/4     1     1/4     0       0     0     3
[7/2]   0     -1/2    1       0     0     2
1/4     0     -1/4    0       1     -1    2
-1/4    0     1/4     0       0     1     -2

0       1     5/14    -3/14   0     0     18/7
1       0     -1/7    2/7     0     0     4/7
0       0     -3/14   -1/14   1     -1    13/7
0       0     3/14    1/14    0     1     -13/7

$x_5^* > 0 \Rightarrow$ no feasible solution.



8(a). In mathematical form the problem is:

Primal
Maximize:   5x1 + 4x2 = x0
subject to: 3x1 + 4x2 ≤ 14
            4x1 + 2x2 ≤ 8
            2x1 +  x2 ≤ 6
            x1, x2 ≥ 0.

Since the number of constraints is greater than the number of variables, the
problem is more easily solved when its dual is created. The problem can
be written as follows.
Dual
Minimize:   14y1 + 8y2 + 6y3 = y0'
subject to: 3y1 + 4y2 + 2y3 ≥ 5
            4y1 + 2y2 +  y3 ≥ 4
            y1, y2, y3 ≥ 0.

In standard form, with surplus variables y4, y6 and artificial variables
y5, y7, the problem is

Maximize:   -14y1 - 8y2 - 6y3 - My5 - My7 = y0
subject to: 3y1 + 4y2 + 2y3 - y4 + y5 = 5
            4y1 + 2y2 +  y3 - y6 + y7 = 4
            yi ≥ 0, i = 1, 2, ..., 7.

The tableaux required to solve the problem are displayed next.

        y1    y2    y3    y4    y5    y6    y7   r.h.s.
         3     4     2    -1     1     0     0     5
         4     2     1     0     0    -1     1     4
y0      14     8     6     0     M     0     M     0

        y1           y2          y3        y4   y5   y6   y7   r.h.s.  Ratio
         3            4           2        -1    1    0    0     5      5/3
        (4)           2           1         0    0   -1    1     4      4/4
y0  -(7M - 14)   -(6M - 8)   -(3M - 6)      M    0    M    0   -9M

        y1        y2               y3          y4   y5        y6              y7         r.h.s.    Ratio
         0      (5/2)             5/4          -1    1        3/4            -3/4           2       4/5
         1       1/2              1/4           0    0       -1/4             1/4           1        2
y0       0  -((5/2)M - 1)   -((5/4)M - 5/2)     M    0   -((3/4)M - 7/2)   (7M - 14)/4   -(2M + 14)

        y1    y2    y3    y4       y5       y6       y7       r.h.s.
         0     1    1/2  -2/5      2/5     3/10    -3/10       4/5
         1     0     0    1/5     -1/5    -2/5      2/5        3/5
y0       0     0     2    2/5    M - 2/5   16/5   M - 16/5   -74/5

Hence the solution to the original minimization problem is

y1* = 3/5, y2* = 4/5,
yi* = 0, otherwise
y0'* = 74/5 = 14 4/5.
The solution to the primal problem can be found by observing the surplus
variables y4 and y6 in the objective function row. Thus x1* has value 2/5 and
x2* value 16/5. The vintner should produce 2/5 gallon of the medium white wine
and 3 1/5 gallons of the dry white wine. He would then maximize his profit
at $14.80.
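As a sanity check on the primal-dual pair, the following sketch verifies that both solutions are feasible and that their objective values agree, which is what the duality theory requires at optimality. The data are those of 8(a); the variable names are chosen for this illustration only.

```python
from fractions import Fraction as F

A = [[3, 4], [4, 2], [2, 1]]      # primal constraint rows
b = [14, 8, 6]                    # primal right-hand sides
c = [5, 4]                        # primal objective coefficients
x = [F(2, 5), F(16, 5)]           # primal solution read from the dual tableau
y = [F(3, 5), F(4, 5), F(0)]      # dual solution

primal_ok = all(sum(a * xj for a, xj in zip(row, x)) <= bi
                for row, bi in zip(A, b))
dual_ok = all(sum(A[i][j] * y[i] for i in range(3)) >= c[j]
              for j in range(2))
primal_value = sum(cj * xj for cj, xj in zip(c, x))
dual_value = sum(bi * yi for bi, yi in zip(b, y))
print(primal_ok, dual_ok, primal_value, dual_value)
```

Both objective values come out as 74/5, that is, $14.80.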
9(a).
Maximize:   5x1 + 4x2
subject to: 3x1 + 4x2 + x3           = 14
            4x1 + 2x2      + x4      = 8
            2x1 +  x2           + x5 = 6
            xi ≥ 0, i = 1, 2, ..., 5.

The problem is now solved using the simplex method.

        x1     x2     x3     x4     x5    r.h.s.  Ratio
         3      4      1      0      0     14     14/3
        (4)     2      0      1      0      8      8/4
         2      1      0      0      1      6      6/2
x0      -5     -4      0      0      0      0

        x1     x2     x3     x4     x5    r.h.s.  Ratio
         0    (5/2)    1    -3/4     0      8     16/5
         1     1/2     0     1/4     0      2      4
         0      0      0    -1/2     1      2
x0       0    -3/2     0     5/4     0     10

        x1     x2     x3     x4     x5    r.h.s.
         0      1     2/5   -3/10    0     16/5
         1      0    -1/5    2/5     0      2/5
         0      0      0    -1/2     1      2
x0       0      0     3/5    4/5     0     14 4/5
Suppose c2 is changed from 4 to 4 + p. Then the initial simplex tableau
for the problem becomes

        x1      x2       x3     x4     x5    r.h.s.
         3       4        1      0      0     14
         4       2        0      1      0      8
         2       1        0      0      1      6
x0      -5    -(4 + p)    0      0      0      0

The corresponding final tableau would be

        x1      x2       x3     x4     x5    r.h.s.
         0       1       2/5   -3/10    0     16/5
         1       0      -1/5    2/5     0      2/5
         0       0        0    -1/2     1      2
x0       0      -p       3/5    4/5     0     14 4/5

In order for the present basis to remain optimal, x2 must still be basic.
Therefore the x2 value in the x0 row must have zero value. This results in the
following tableau.

        x1    x2        x3               x4           x5       r.h.s.
         0     1        2/5            -3/10           0        16/5
         1     0       -1/5             2/5            0         2/5
         0     0         0             -1/2            1          2
x0       0     0   3/5 + (2/5)p   4/5 - (3/10)p        0   74/5 + (16/5)p
For the present basis to remain optimal all x0-row values must be
nonnegative. Thus

3/5 + (2/5)p ≥ 0

and

4/5 - (3/10)p ≥ 0.

This implies

-3/2 ≤ p ≤ 8/3.

Hence the range for c2 is

(4 - 3/2, 4 + 8/3) = (5/2, 20/3).
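The two inequalities can be evaluated mechanically from the final tableau of 9(a). In the sketch below the dictionaries hold the x0-row and x2-row entries in the nonbasic columns x3 and x4; the names are illustrative only.

```python
from fractions import Fraction as F

# Final-tableau entries in the nonbasic columns x3 and x4.
x0_row = {"x3": F(3, 5), "x4": F(4, 5)}
x2_row = {"x3": F(2, 5), "x4": F(-3, 10)}

lower, upper = None, None
for col in x0_row:
    # Optimality requires x0_row[col] + p * x2_row[col] >= 0.
    bound = -x0_row[col] / x2_row[col]
    if x2_row[col] > 0:
        lower = bound if lower is None else max(lower, bound)
    else:
        upper = bound if upper is None else min(upper, bound)

print(lower, upper)            # -3/2 8/3
print(4 + lower, 4 + upper)    # 5/2 20/3
```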
10(a). Consider the problem:
Maximize:   5x1 + 4x2 = x0
subject to: 3x1 + 4x2 ≤ 14
            4x1 + 2x2 ≤ 8
            2x1 +  x2 ≤ 6.

The final tableau is

        x1     x2     x3     x4     x5    r.h.s.
         0      1     2/5   -3/10    0     16/5
         1      0    -1/5    2/5     0      2/5
         0      0      0    -1/2     1      2
x0       0      0     3/5    4/5     0     14 4/5

Suppose we change the r.h.s. constant of the first constraint from 14 to
14 + γ. Since x3 is the slack variable for this constraint, the r.h.s. values
in the final tableau will change to
16/5 + (2/5)γ
 2/5 - (1/5)γ
 2.
However, in order that the solution be feasible these values must be
nonnegative. Thus
16/5 + (2/5)γ ≥ 0 ⇒ γ ≥ -8
 2/5 - (1/5)γ ≥ 0 ⇒ γ ≤ 2.

γ ≥ -8 implies that the r.h.s. constant must be at least 6, and γ ≤ 2
implies that the r.h.s. constant must be at most 16, in order for the
solution to be feasible. Thus the range is -8 ≤ γ ≤ 2, with a r.h.s. constant
range of 6 to 16. This means that for the present basis to remain optimal and
feasible the number of boxes of grapes can be no less than 6 and no
greater than 16.
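The same ranging calculation can be sketched in a few lines. The lists below hold the final r.h.s. column and the x3 column of the final tableau (x3 being the slack of the first constraint); the names are illustrative.

```python
from fractions import Fraction as F

rhs = [F(16, 5), F(2, 5), F(2)]        # final r.h.s. values
x3_col = [F(2, 5), F(-1, 5), F(0)]     # final x3 column

# Feasibility requires rhs[i] + gamma * x3_col[i] >= 0 in every row.
lower = max(-r / a for r, a in zip(rhs, x3_col) if a > 0)
upper = min(-r / a for r, a in zip(rhs, x3_col) if a < 0)

print(lower, upper)              # -8 2
print(14 + lower, 14 + upper)    # 6 16
```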
11(a). From 9(a), the tableau:

        x1     x2     x3     x4     x5    r.h.s.
         3      4      1      0      0     14
         4      2      0      1      0      8
         2      1      0      0      1      6
x0      -5     -4      0      0      0      0

becomes at optimality:

        x1     x2     x3     x4     x5    r.h.s.
         0      1     2/5   -3/10    0     16/5
         1      0    -1/5    2/5     0      2/5
         0      0      0    -1/2     1      2
x0       0      0     3/5    4/5     0     74/5

If a31 becomes 7 1/2 instead of 2, the same iterations produce:

        x1     x2     x3     x4     x5    r.h.s.
         0      1     2/5   -3/10    0     16/5
         1      0    -1/5    2/5     0      2/5
       11/2     0      0    -1/2     1      2
x0       0      0     3/5    4/5     0     74/5

which in canonical form is

        x1     x2     x3      x4      x5    r.h.s.
         0      1     2/5    -3/10     0     16/5
         1      0    -1/5     2/5      0      2/5
         0      0    11/10  (-27/10)   1     -1/5
x0       0      0     3/5     4/5      0     74/5

This is infeasible, as x5 < 0. Using the dual simplex method, x4 replaces x5
in the basis (the only negative ratio):

        x1     x2     x3      x4     x5     r.h.s.
         0      1     5/18     0    -1/9     29/9
         1      0    -1/27     0     4/27    10/27
         0      0   -11/27     1   -10/27     2/27
x0       0      0    25/27     0     8/27   398/27

Thus the new optimal solution is

x1* = 10/27,
x2* = 29/9,
x0* = 398/27.
12(a). Solving 5(b), let
x1 = the number of truckloads of A
x2 = the number of truckloads of B
x3 = the number of truckloads of C
x4 = the number of truckloads of D.
Then the problem is to
Maximize:   2x1 + 3x2 + 4x3 + 7x4 = x0          ($100)
subject to: 16x1 + 15x2 + 20x3 + 30x4 ≤ 150      (Area)
              x1 +  9x2 +   x3 +  2x4 ≤ 10       (Manpower)
            xi ≥ 0, i = 1, 2, 3, 4.
(I) In standard form, this is
Maximize:   x0 = 2x1 + 3x2 + 4x3 + 7x4
subject to: 16x1 + 15x2 + 20x3 + 30x4 + x5      = 150
              x1 +  9x2 +   x3 +  2x4      + x6 = 10
            xi ≥ 0, i = 1, 2, ..., 6.

Table a

XI X2 X3 X4 Xs X6 r.h.s.

16 15 20 30 1 0 150
1 9 1 CIl 0 1 10
-2 -3 -4 -7 0 0 10

t 5 0 -15 0
"2
I

3
9
-20
3
ill
I
0 t
7
5
2 20 -2 0 0 "2 35

t ..1...
10 1 0 S
1. - 3 0
t 130 0 I
-TO 2 5
! 130 0 0 I
TO 2 35

Table a shows the iterations to the optimal solution.



(II) Suppose a new variable x7 is introduced which represents the amount
of the new product to be stored. The problem then becomes:
Maximize:   2x1 + 3x2 + 4x3 + 7x4 + 5x7 = x0
subject to: 16x1 + 15x2 + 20x3 + 30x4 + x5      + 2x7 = 150
              x1 +  9x2 +   x3 +  2x4      + x6 +  x7 = 10
            xi ≥ 0, i = 1, 2, ..., 7.
The dual of this problem is
Minimize:   150y1 + 10y2 = y0
subject to: 16y1 +  y2 ≥ 2
            15y1 + 9y2 ≥ 3
            20y1 +  y2 ≥ 4
            30y1 + 2y2 ≥ 7
             2y1 +  y2 ≥ 5
            y1, y2 ≥ 0.
The last constraint can be tested to see whether the present primal solution
is optimal or not. The dual values are the x0-row entries of the slack
variables x5 and x6 in the final tableau of Table a:

y1 = 1/10, y2 = 2

and

2(1/10) + 2 < 5.

Hence the present primal solution is suboptimal. If the new primal (II) had
the same primal iterations applied to it as had (I) to produce Table a, the
final tableau would be

Xl X2 X3 X4 X5 X6 X7 r.h.s.
1 + (-
5 130 0 t -3 [(t)(2) 3)(1)] 0
2 1 [( -/0)(2) + (2)(1)]
5 130 0 -TO 2 5
8 3 0 0 1
-TO 2 [ - 5 + (/0)(2) + (2)(1)] 35
5 TO
1 3 1 0 1 -3 -ll 0
5 TO 5
3 1
! TO 0 -TO 2 t 5
1 14
! 130 0 0 TO 2 -5 35
1 -g1 65
~
11
5 V -18 0 9
2 1 0 5 1 10 1 25
9 "6 9 18 9 9
1,il50 23 0 14 1 496 0 ~
30 9 -18 9
14 656 18 26 1 -2 0 130
9 2 0 10
TO
3 3 3 0 5 0 50
"2
So the new profit is $5,000 and

x7* = 10,
xi* = 0, i = 1, 2, 3, 4.
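The test applied to the new variable x7 is a pricing-out calculation: the activity is worth introducing when the dual values price its resource use below its profit. A minimal sketch, using the dual values y1 = 1/10 and y2 = 2 quoted in the solution (variable names are illustrative):

```python
from fractions import Fraction as F

y = [F(1, 10), F(2)]     # dual values for the area and manpower constraints
a7 = [2, 1]              # resource use of one unit of the new product x7
c7 = 5                   # profit of one unit of x7 (in $100)

# x0-row entry of x7 in the final tableau: y.a7 - c7.
reduced_cost = sum(yi * ai for yi, ai in zip(y, a7)) - c7
print(reduced_cost)      # -14/5, negative, so x7 should enter the basis

# The reported solution x7 = 10 is feasible and gives profit 50 (i.e. $5,000).
x7 = 10
print(2 * x7 <= 150, 1 * x7 <= 10, c7 * x7)
```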
13(a). Production constraints for breweries 1, 2, 3, 4 are
x11 + x12 + x13 + x14 ≤ 20
x21 + x22 + x23 + x24 ≤ 10
x31 + x32 + x33 + x34 ≤ 10
x41 + x42 + x43 + x44 ≤ 15.
Demand constraints for hotels 1, 2, 3, 4 are
x11 + x21 + x31 + x41 ≥ 15
x12 + x22 + x32 + x42 ≥ 20
x13 + x23 + x33 + x43 ≥ 10
x14 + x24 + x34 + x44 ≥ 10.

All quantities transported must be nonnegative. Thus

xij ≥ 0,    i = 1, 2, 3, 4
            j = 1, 2, 3, 4.
The objective was to find a supply schedule with minimum cost. The total
cost is the sum of all costs from all breweries to all hotels. This cost x0 can
be expressed as
x0 = 8x11 + 14x12 + 12x13 + 17x14 + 11x21 + 9x22 + 15x23 + 13x24
     + 12x31 + 19x32 + 10x33 + 6x34 + 12x41 + 5x42 + 13x43 + 18x44.
The problem can now be summarized in linear programming form:
Minimize:   x0 = 8x11 + 14x12 + 12x13 + 17x14 + 11x21 + 9x22
                 + 15x23 + 13x24 + 12x31 + 19x32 + 10x33 + 6x34
                 + 12x41 + 5x42 + 13x43 + 18x44
subject to: x11 + x12 + x13 + x14 ≤ 20
            x21 + x22 + x23 + x24 ≤ 10
            x31 + x32 + x33 + x34 ≤ 10
            x41 + x42 + x43 + x44 ≤ 15
            x11 + x21 + x31 + x41 ≥ 15
            x12 + x22 + x32 + x42 ≥ 20
            x13 + x23 + x33 + x43 ≥ 10
            x14 + x24 + x34 + x44 ≥ 10
            xij ≥ 0,    i = 1, 2, 3, 4
                        j = 1, 2, 3, 4.
The tableau for the example problem is given below, with the unit cost shown
in each cell.

                         Hotels
                  1     2     3     4    Supply
             1    8    14    12    17      20
Breweries    2   11     9    15    13      10
             3   12    19    10     6      10
             4   12     5    13    18      15
Demand           15    20    10    10

Identification of initial feasible solutions by the four required methods
follows.

(1) The Northwest Corner Method. The method starts by allocating as
much as possible to the cell in the northwest corner of the tableau of the
problem, cell 1, 1 or row 1, column 1. The maximum that can be allocated
is 15 units, as the demand of hotel 1 is 15 units. Column 1 is removed and cell
1, 2 becomes the new northwest corner. A maximum of 5 units is allocated
to this cell, all that remains in brewery 1. Row 1 is removed and cell 2, 2
becomes the new northwest corner. This procedure continues until all
demand is met. The tableau shows the feasible solution obtained.

                         Hotel
                  1     2     3     4    Supply
             1   15     5                  20
Brewery      2         10                  10
             3          5     5            10
             4                5    10      15
Demand           15    20    10    10

Conclusion
Brewery 1 supplies 15 units to hotel 1 and 5 units to hotel 2.
Brewery 2 supplies 10 units to hotel 2.
Brewery 3 supplies 5 units to hotel 2 and 5 units to hotel 3.
Brewery 4 supplies 5 units to hotel 3 and 10 units to hotel 4.
Total cost: 670 units.
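The northwest corner rule is simple enough to state as a short routine. The sketch below is an illustration of the rule as described above; the function name and data layout are assumptions made for this example, and the cost matrix is taken from the objective function of 13(a).

```python
def northwest_corner(supply, demand):
    """Return an allocation matrix built by the northwest corner rule."""
    supply, demand = supply[:], demand[:]      # work on copies
    alloc = [[0] * len(demand) for _ in supply]
    i = j = 0
    while i < len(supply) and j < len(demand):
        q = min(supply[i], demand[j])          # as much as possible
        alloc[i][j] = q
        supply[i] -= q
        demand[j] -= q
        if supply[i] == 0:
            i += 1                             # row exhausted: move down
        else:
            j += 1                             # column satisfied: move right
    return alloc

cost = [[8, 14, 12, 17],
        [11, 9, 15, 13],
        [12, 19, 10, 6],
        [12, 5, 13, 18]]
alloc = northwest_corner([20, 10, 10, 15], [15, 20, 10, 10])
total = sum(cost[i][j] * alloc[i][j] for i in range(4) for j in range(4))
print(alloc, total)
```

The rule ignores the costs entirely, which is why its starting solution (670 units) is so much more expensive than those of the other methods.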

(2) The Least Cost Method. This method starts by allocating the largest
possible amount to the cell in the tableau with the least unit cost. This means
allocating 15 units to cell 4, 2, and row 4 is removed. The demand of hotel
2 is reduced to 5 units. The cell with the next smallest cost is identified, i.e.,
cell 3,4 and 10 units are allocated to it removing row 3 and column 4. This
procedure continues until all the demand is met. The following tableau
illustrates the feasible solution which is obtained.

                         Hotel
                  1     2     3     4    Supply
             1   15           5            20
Brewery      2          5     5            10
             3                     10      10
             4         15                  15
Demand           15    20    10    10

Conclusion
Brewery 1 supplies 15 units to hotel 1 and 5 units to hotel 3.
Brewery 2 supplies 5 units to hotel 2 and 5 units to hotel 3.
Brewery 3 supplies 10 units to hotel 4.
Brewery 4 supplies 15 units to hotel 2.
Total cost: 435 units.
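The least cost method can be sketched in the same style. The routine below is an illustrative implementation of the rule described above (the function name is an assumption): it visits the cells in order of increasing unit cost and allocates as much as the remaining supply and demand allow.

```python
def least_cost(cost, supply, demand):
    """Return an allocation matrix built by the least cost rule."""
    supply, demand = supply[:], demand[:]
    m, n = len(supply), len(demand)
    alloc = [[0] * n for _ in range(m)]
    # Visit cells from cheapest to dearest.
    for c, i, j in sorted((cost[i][j], i, j) for i in range(m) for j in range(n)):
        q = min(supply[i], demand[j])
        if q > 0:
            alloc[i][j] = q
            supply[i] -= q
            demand[j] -= q
    return alloc

cost = [[8, 14, 12, 17],
        [11, 9, 15, 13],
        [12, 19, 10, 6],
        [12, 5, 13, 18]]
alloc = least_cost(cost, [20, 10, 10, 15], [15, 20, 10, 10])
total = sum(cost[i][j] * alloc[i][j] for i in range(4) for j in range(4))
print(alloc, total)
```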

(3) The Vogel Approximation Method. This method begins by first re-
ducing the matrix of unit costs. This reduction is achieved by subtracting
the minimum quantity in each row from all elements in that row. This results
in the following tableau:

2 3 4

0 6 4 9 (-8) 20

2 2 0 6 4 (-9) 10

3 6 13 4 0 (- 6) 10

4 7 0 8 13 (- 5) 15

15 20 10 10

The costs are further reduced by carrying out this procedure on the columns
of the new cost matrix:

2 3 4

0 6 0 9

2 2 0 2 4

3 6 13 0 0

4 7 0 4 13

(0) (0) (-4) (0)

A penalty is then calculated for each cell which currently has zero unit
cost. Each cell penalty is found by adding together the second smallest costs

of the row and column of the cell:

2 3 4

0 2 6 0 0 9 (0)

2 2 0 2 2 4 (2)

3 6 13 0 0 0 4 (0)

4 7 0 4 4 13 (4)

(2) (0) (0) (4)

The penalties are shown in the top right-hand corner of each appropriate
cell and the cell with the largest penalty is identified. The maximum amount
possible is then allocated to this cell. Cell 3, 4 will be arbitrarily chosen and
10 units are allocated to it. Row 3 and column 4 are removed from considera-
tion. A further reduction in the cost matrix and a recalculation of some
penalties is necessary. This results in the following tableau:

0 2 6 0 2 20 (0)

2 0 2 2 10 (2)

10

7 0 0 4 4 15 (4)

15 20 10
(2) (0) (2)

Cell 4, 2 is chosen and 15 units are allocated to it. Row 4 is then removed
from consideration. This process is repeated until all demand is met.

0 2 6 0 0 2 20 (0)

2 0 8 2 10 (2)

10

15

15 20 10
(2) (6) (2)

Cell 2, 2 is chosen and 5 units are allocated to it, removing column 2 from
consideration.

0 2 0 2 20 (0)

2 5 2 10 (2)

10

15

15 10
(2) (2)

Cell 1, 1 is arbitrarily chosen and 15 units are allocated to it, removing


column 1. Cell 1, 3 must be allocated 5 units and cell 2, 3 5 units in order
that all demand shall be met.

The final allocation is shown in the following tableau:

2 3 4

15 20

2 10

3 10

4 15 15

15 20 10 10

Conclusion
Brewery 1 supplies 15 units to hotel 1 and 5 units to hotel 3.
Brewery 2 supplies 5 units to hotel 2 and 5 units to hotel 3.
Brewery 3 supplies 10 units to hotel 4.
Brewery 4 supplies 15 units to hotel 2.
Total cost: 435 units.

(4) Stepping Stone Algorithm. Consider the initial feasible solution found
by the northwest corner method.

2 3 4

15 5 20

2 10

3 10

4 5 10 15

15 20 10 10

To determine whether this solution is optimal or not it is necessary to ask


for each cell individually if the allocation of one unit to that cell would
reduce the total cost. This is done for the cells which at present have no
units assigned to them.
Cell 4, 2 has the greatest decrease (17 units) and as much as possible, (5 units)
is allocated to this cell. This means a decrease in cost of $(17 x 5) = $85.
The new solution is displayed in the following tableau.

~ ~ ~ ~
15 5

~ ~ ~ ~
10 0

~ ~ ~ ~
10

~ ~ ~ ~
5 10

The same procedure occurs-all empty cells in the new tableau are examined
as before and the process is repeated. Since a basic feasible solution should
contain (m + n - 1) basic variables, one ofthe empty cells is assigned a zero.
Cell 2, 4 has the greatest decrease (19 units) and as much as possible (10 units)
is allocated to this cell. This means a decrease in cost of $(19 x 5) = $95.
The new solution is displayed.

15 5

15 o

The process is repeated and cell 2, 3 with a decrease of 8 units is allocated


5 units. This means a decrease in cost of $(8 x 5) = $40. The tableau is dis-
played next.

2 3 4

15 20

2 10

3 10

4 15 o 15

15 20 10 10

The process is repeated, but there is no allocation which will cause a cost
reduction. Thus the optimal solution has been found.

Conclusion
Brewery 1 supplies 15 units to hotel 1 and 5 units to hotel 3
Brewery 2 supplies 5 units to hotel 2 and 5 units to hotel 3
Brewery 3 supplies 10 units to hotel 4.
Brewery 4 supplies 15 units to hotel 2.
Total cost: 435 units.
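Optimality of the 435-unit schedule can be confirmed independently with a dual certificate: if row multipliers u and column multipliers v satisfy u[i] + v[j] ≤ c[i][j] for every cell, then the sum of supply times u plus demand times v is a lower bound on the cost of any schedule. The multipliers below were found by hand for this check and are not taken from the text.

```python
cost = [[8, 14, 12, 17],
        [11, 9, 15, 13],
        [12, 19, 10, 6],
        [12, 5, 13, 18]]
supply = [20, 10, 10, 15]
demand = [15, 20, 10, 10]
alloc = [[15, 0, 5, 0],      # the final schedule from the conclusion above
         [0, 5, 5, 0],
         [0, 0, 0, 10],
         [0, 15, 0, 0]]

u = [0, 3, -2, -1]           # assumed row multipliers (found by hand)
v = [8, 6, 12, 8]            # assumed column multipliers (found by hand)

dual_feasible = all(u[i] + v[j] <= cost[i][j]
                    for i in range(4) for j in range(4))
bound = (sum(s * ui for s, ui in zip(supply, u))
         + sum(d * vj for d, vj in zip(demand, v)))
total = sum(cost[i][j] * alloc[i][j] for i in range(4) for j in range(4))
print(dual_feasible, bound, total)
```

Since the bound equals the cost of the schedule, no cheaper schedule exists.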
14(a)

                          Tasks
                 1     2     3     4     5     6
            1    7     5     3     9     2     4    (-1)
            2    8     6     1     4     5     2    (-0)
Students    3    2     3     5     6     8     9    (-0)
            4    6     8     1     3     7     2    (-0)
            5    4     5     6     9     4     7    (-2)
            6    9     2     3     5     1     8    (-0)
               (-2)  (-2)  (-1)  (-3)  (-1)  (-2)

(1) Subtract the minimum quantity from each column, and then from each row,
of the cij matrix to obtain:

                 4     2     1     5     0     1
                 6     4     0     1     4     0
                 0     1     4     3     7     7
                 4     6     0     0     6     0
                 0     1     3     4     1     3
                 7     0     2     2     0     6

(2) As the minimum number oflines is less than n, the minimum uncrossed
number is subtracted from all the uncrossed numbers and added to all
numbers with two lines passing through them to obtain:

1 5 1
"v ~ "v
( 3 2 6
"v "v "
v

2 3 ~ 2
2 2 6

(3) The process of (1) and (2) is repeated to obtain

5 Z e 4 e e
8 5 e I 5 e
e e z I 6 5
6 9- e e 9- e
e e I z e I
8 e I I e 5

The solution for this problem is

                   Student   Task
x13 = 1    i.e.       1        3
x26 = 1    i.e.       2        6
x31 = 1    i.e.       3        1
x44 = 1    i.e.       4        4
x52 = 1    i.e.       5        2
x65 = 1    i.e.       6        5.

The value of this solution is equal to the total of the numbers subtracted, i.e.,

x0* = 2 + 2 + 1 + 3 + 1 + 2 + 1 + 0 + 0 + 0 + 2 + 0 + 1 + 1 = 16.

By inspecting the original cij matrix we also obtain

x0* = 3 + 2 + 2 + 3 + 5 + 1 = 16.
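With only six students there are 720 possible assignments, so the result of the Hungarian method can be confirmed by exhaustive search. This is a brute-force check for illustration only, not the method used above.

```python
from itertools import permutations

c = [[7, 5, 3, 9, 2, 4],
     [8, 6, 1, 4, 5, 2],
     [2, 3, 5, 6, 8, 9],
     [6, 8, 1, 3, 7, 2],
     [4, 5, 6, 9, 4, 7],
     [9, 2, 3, 5, 1, 8]]

def value(p):
    """Total cost when student i is given task p[i] (0-based)."""
    return sum(c[i][p[i]] for i in range(6))

best = min(value(p) for p in permutations(range(6)))
book = (2, 5, 0, 3, 1, 4)     # tasks 3, 6, 1, 4, 2, 5 for students 1..6
print(best, value(book))      # 16 16
```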

Chapter 3

Section 3.7
l(a)

BI CB Bi l 1t1 b CI Ratio Entering Leaving

X5 0 0 0 0 0 9 3 3
X6 0 0 1 0 0 0 12 1 12
X7 0 0 0 0 0 8 2 4
Xs 0 0 0 0 1 0 10 3 3
-CI -3 -2 -1 -2 0000 XI Xs

B2 CB Bi l 1t2 b C2 Ratio

XI -3 I
3 0 0 0 -1 3 t 9
X6 0 -3
I 1 0 0 0 9 t 27
""5
x7 0 -3
2
0 1 0 0 2 t 6
I
Xs 0 -1 0 0 1 0 1 2 Z
-C2 0 -1 0 0 000

B3 CB B3 1 1t3 b C4 Ratio

XI -3 I
Z 0 0 -6
I
-zI 17
"6 i- 17
""5
I 5 .4.2. .ll 49
X6 0 Z 0 -6 0 6 6 TI
I 1 I I
X7 0 -2 0 -6 0 Ii -6
I I
X2 -2 -2 0 0 Z -zI zI -zI
-C3 0 0 I
2
I
-2 too I
Z

B4 Cb B4 1 1t4 b

XI -3 i -sI 0 0 i !
X4 -2 3
TI
6
TI 0 -sI 3
-TI '1
5S4
X7 0 12
-TI is 1 -sI 0
X2 -2 II
-TI ls 0 i -s2 37
TI
-C4 00 n 0 ~~ !s 2
S

xt =~, x*4 -- 49
5, X*7 -- 54 x!
5 , =n, X~, x!, xt, x: = 0,
X6 = eBb = 226l·

2(a)

Xl X2 X3 X4 Xs X6 X7 Xs r.h.s. Ratio

G) 1 2 1 0 0 0 9 3
1 2 1 4 0 1 0 0 12 12
2 3 1 0 0 1 0 8 4
10
3 2 0 0 0 10 "3
3
-3 -2 -1 -2 0 0 0 0 0
1
3 t 3
2
3
1
0 0 0 3 9
0 i 2
3 130 -3
1
0 0 9 27
s-
O 1
3 t -3
1
-3
2
0 0 2 6
0 0 -1 -1 0 0 1 1 1
2
0 -1 0 0 0 0 0 9

1 0 1
"6 i t 0 0 1
-"6 167 17
s-
O 0 -"6
1
® t 1 0 5
-"6
49
""6 4/

0 0 13
""6
1
-"6 -2
1
0 1 1
-"6 II
1 1 1 1 1
0 2 -2 -2 0 0 "2 "2
0 0 1
2 -2
1 1
"2 0 0 t 1i
1-
0 5 0 ~ -s1 0 0 5
§.

1
1 ...J... 1 49
0 0 -15 25
~
25 0 -s 15
0 0 M
25 0 12
-25 i5 1 -s1 54
15
0 1 n 0 11
-15 -Is 0 £
5
37
15
0 0 n 0 14
15 -Is 0 2
s l
226

The solution and its value are as in l(a).


3(a). Forming the dual:

Minimize:   6y1 + 8y2 + 9y3 + 12y4 = y0
subject to:  y1 + 3y2 + 2y3 + 2y4 ≥ 3
            2y1 + 4y2 + 3y3 +  y4 ≥ 2
             y1 + 2y2 + 3y3 + 2y4 ≥ 1
            3y1 +  y2 +  y3 + 2y4 ≥ 2
            y1, y2, y3, y4 ≥ 0.

Introducing Ys, Y6' Y7' Ys as slack variables:

YI Y2 Y3 Y4 Ys Y6 Y7 Ys r.h.s.

1 3 2 2 -1 0 0 0 3
2 4 3 1 0 -1 0 0 2
1 2 3 2 0 0 -1 0 1
3 1 1 2 0 0 0 -1 2
Yo 6 8 9 12 0 0 0 0 0

Multiplying each constraint by (-1):

YI Y2 Y3 Y4 Ys Y6 Y7 Ys r.h.s.

-1 Q) - 2 -2 1 0 0 0 - 3
-2 -4 - 3 -1 0 1 0 0 - 2
-1 -2 - 3 -2 0 0 1 0 - 1
-3 -1 - 1 -2 0 0 0 1 - 2
Yo 6 8 9 12 0 0 0 0 0
ratios -6 8
-3 -
9
"2' -6
I
3 1 t 2
3 -3
I
0 0 0 1
-3
2
0 -
I
3 1 -3
4
0 0 2
-3
I
0 - i -3
2
-3
2
0 0
ED 0 -
11
)- -3
4

20
-3
I

S
0 0 - 1
- 8
Yo 130 0 ""3 ""3 3 0 0 0
Ratios s -11 -5 -8 0 0 0
-4

0 1 i t -8
3
0 0 i 7
8

*
0 0 -
I
4 2 s 1 0 I
-4 -4
13 I S I 9
0 0 -8 -"2' -8 0 -8 8
1 0 I
8
I
"2 8
I
0 0 -i i
Yo 0 0 Ii 5 9
4 0 0 ~
4 -4'
37

Hence
y1* = 3/8, y2* = 7/8, y6* = 9/4, y7* = 9/8, y0* = 37/4,
x1* = 9/4, x4* = 5/4, x2*, x3* = 0, x0* = 37/4.

4(a) (The Two-Phase Method). Introducing artificial variables y9, y10,
y11, y12:

Phase I

Y1 Y2 Y3 Y4 Y5 Y9 Y6 Yio Y7 Y11 Ys Y12 r.h.s.

1 3 2 2 -1 1 0 o 0 o 0 o 3
2 4 3 1 0 o -1 1 0 o 0 o 2
1 2 3 2 0 o 0 o -1 1 0 o 1
3 1 1 2 0 o 0 o 0 o -1 1 2
Yo o o o o 0 1 0 1 0 1 0 1 o

In canonical form:

Y1 Y2 Y3 Y4 Y5 Y9 Y6 YIO Y7 Y11 Ys Y12 r.h.s. Ratio

3 2 2 -1 1 0 o 0 o 0 o 3
2 4 3 1 0 0-1 1 0 o 0 o 2 t
Q)3 2000 o -1 o o t
3 1 2 0 0 0 o 0 o -1 1 2 2
Yo -7 -10 -9 -7 0 o o 1 o -8

1
2" o -~ -1 -1 0 0 1 -1 0 0 1 1
o o -3 -3 0 0 -1 Q) -2 0 0 0 o
t 1 o 0 o o -t 2"1 o o t
~ o -t 1 o 0 o o t -t -1 1 1 3
-2 o 6 3 1 0 o -4 5 1 o -3

-t o -t i - I 1 1-1 000 o 1
o o -1 -1 0 o -t t 1 -1 0 o 0
t 1 1 t 0 o -t t 000 o t 2
~
2 o t CD 0 o t-t o 0-1 1
Yo -2 o 0 -3 1 o -1 2 o 1 o -3

_176 o -~ o -1 o 0 CD-~
o -t o 0 1 -1 -~ ~
o 0 o 0 t-t 2
o t 1 0 o t-t o 0 -4 4
Yo 176 o ~ o o -4 V o 1 -t V -~

_156 o -i o -t t !-! o 0 1 -1
-i o -~ o -~ ~ i-i 1 -1 o 0
J.
5 1 ! o t -t -i i o 0 o 0
-5
2
o -t 1 -! o 0 o 0
Yo 0 o 0 o 0 0 1 o o

Phase II
Minimize: 6Yl + 8Y2 + 9Y3 + 12Y4 = Yo·

YI Y2 Y3 Y4 Y5 Y6 Y7 Ys r.h.s.

_\6 0 -t 0 -5
7 ±
s 0 t
3 9 2
-5 0 -5 0 -5
6
5 0 ~
5
J. 4
0 1. 2
0 0 I
S 5 s -5 5
2 I 4 3
-5 0 -5 -5 5 0 0 §.
5

Yo 6 8 9 12 0 0 0 0 0

In canonical form:

YI Y2 Y3 Y4 Ys Y6 Y7 Ys r.h.s. Ratio

-T
16
0 -5
3
0 -5
7
CD 0 1 t i
-5
3
0 -5
9
0 -5
6 1.
s 1 0 ~
5 !
t 1 5
4
0 5
I 2
-5 0 0 I
5
2
0 I
1 4 J. 0 0 6
2
-5 -5 -5 s 5
Yo 6 0 5 0 8 -4 0 0 -16

-4 0 -"4
3
0 7
-"4 0 i i
3 I I 3 3
0 -2 0 -2 0 1 -2 2 2
-1 1 i 0 -2
I
0 0 i I
2
(1) 0 t 1 1.
4 0 0 -"4
3 3
"4
3
8
Yo -10 0 2 0 0 0 5 -13
s
0 0 -"4
I
2 -"4 0 -"4
I
t
13 I 5 1
0 0 -8 -2 -8 0 -8 ~
0 1 i i -8
3
0 0 t 7
8
0 t i t 0 0 -8
3 J.
s
Yo 0 0 Ii 5 t 0 0 ~
4 -4
37

5(a)

XI X2 X3 X4 Xs X6 r.h.s. Ratio

2 1 4 1 0 0 10 5
1 2 1 0 1 0 4 4
Q) -2 1 0 0 1 6 2
-3 -1 -2 0 0 0 0

Xl X2 X3 X4 Xs X6 r.h.s. Ratio

0 ! \0 0 -t 6 178

0 CD t 0 1 -~
1
2 i
1 -3
2
t 0 0 t 2
0 -3 -1 0 0 6

0 0 W -8"
7
-8"
3 12 17
IT
0 1
4 0 i -8"
1
! 3
1 t 0 1
1 5
*t
0 4
0 0 -4
1
0 i 31

0 0 1 IT
4 7
-n -22
3
n
0 1 0 1 .i. 1 ..£
-IT 11 -IT 11
2
0 0 -IT 12 272 ~~
0 0 0 It n n n

Hence
x1* = 19/11, x2* = 4/11, x3* = 17/11, x4*, x5*, x6* = 0, x0* = 95/11.
Let
θ = amount of elapsed time.
x0 = (3 + θ)x1 + (1 + 2θ)x2 + (2 + 3θ)x3.

Xl X2 X3 X4 Xs X6 r.h.s.

0 0 4
IT
1
7
-n
.i.
-22
3

1
n4
0 0 -IT 11 -IT IT
1 0 0 2
-IT 272 II
12 11
-0 -20 -30 ...L
11 n II
22 n
In canonical form:

Xl X2 X3 X4 Xs X6 r.h.s.

0 0 141
7
-22
3
-22 n
0 1 0 1
-IT 5 1 ..£
IT -IT 11
0 0 9 7 II
-11 n n 11

1 + 80 23 + 0 13 - 60 95 + 70
0 0 0
11 22 22 11

The first critical point is θ = 13/6.

XI X2 X3 X4 Xs X6 r.h.s.

4 7 3 17
0 0 1 IT -TI -22 IT
I 1 4
0 0 -IT ~
II -IT IT
2 9 7 19
1 0 0 IT 22 TI IT
S 11
0 0 0 :1 6"" 0 24

Replacing Xl by X6:

XI X2 X3 X4 Xs X6 r.h.s.

2 1 16
~ 0 7 7 0 7
2 1 4 6
7 1 0 -7 7 0 7
22 4 2- 378
7 0 0 7 7
s II
0 0 0 :1 6 0 24

Searching for further critical points:

Xl X2 X3 X4 Xs X6 r.h.s.

2 I 16
~ 0 7 -7 0 7
2 1 4
7 0 7 7 0 ~
22 4 2-
7 0 0 -7 7
;!Ji
7
-8 -28 - 38 s II
:1 6"" 0 24

In canonical form:

XI X2 X3 X4 Xs X6 r.h.s.

3 I 16
7 0 1 2
7 -7 0 7
2 1 4 6
7 0 7 7 0 7
22 4 9 38
7 0 0 -7 7 1 7
~8 0 0 i + 48 Ii + ~8 0 24 + 67°8

This basis remains optimal for all further increases in θ. Thus, in terms of
the original parameter θ:

0 ≤ θ ≤ 13/6:  x1* = 19/11, x2* = 4/11, x3* = 17/11, x0* = (95 + 78θ)/11
θ > 13/6:      x2* = 6/7, x3* = 16/7, x6* = 38/7, x0* = 24 + (60/7)(θ - 13/6).
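The critical point can be checked by evaluating the parametric objective at the two basic solutions involved. In this sketch `v1` is the solution that is optimal for small θ and `v2` the solution reached after x6 replaces x1; the function and variable names are illustrative.

```python
from fractions import Fraction as F

def x0(x, theta):
    """Parametric objective (3 + θ)x1 + (1 + 2θ)x2 + (2 + 3θ)x3."""
    return (3 + theta) * x[0] + (1 + 2 * theta) * x[1] + (2 + 3 * theta) * x[2]

v1 = (F(19, 11), F(4, 11), F(17, 11))   # optimal basis for small θ
v2 = (F(0), F(6, 7), F(16, 7))          # basis after x6 replaces x1
crit = F(13, 6)

print(x0(v1, 0), x0(v2, 0))             # v1 is better at θ = 0
print(x0(v1, crit), x0(v2, crit))       # both equal 24 at the critical point
print(x0(v1, 3), x0(v2, 3))             # v2 is better beyond it
```

The two objective values coincide exactly at θ = 13/6, where both equal 24, which is the r.h.s. value shown in the tableau at the critical point.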

6(a). Dual:

Minimize:   10y1 + 4y2 + 6y3
subject to: 2y1 +  y2 + 3y3 ≥ 3
             y1 + 2y2 - 2y3 ≥ 1
            4y1 +  y2 +  y3 ≥ 2
            y1, y2, y3 ≥ 0.

Phase I
Minimize: Ys + Y7 + Y9

Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 t.h.s. Ratios

2 3 -1 0 0 0 0 3 0
1 2 -2 0 0 -1 0 0 1 20
4 1 0 0 0 0 -1 1 2 30
0 0 0 0 1 0 0 1 0 0

2 1 3 -1 1 0 0 0 0 3 3
2 0
1 2 -2 0 0 -1 0 0 1 20
@) 1 0 0 0 0 -1 1 2 1
2 30
-7 -4 0 1 0 1 0 -6 -60

0
0
t
CD -4
t
9
-1
0
1
0 -1
0 0
4
1
2
1
-2
-4
1

!
2
t ,4 -to
iO
iO
! ! 0 0 0 0 1
-4
1
2 2

, i -iO
3 5
0 9
-4 -4
1
0 0 -4 -2

0
0
1
0
1
0
®
-7
9

4
7
-1
0
0
0
0
-7
4

1
7
-7

-7
2

4
7
1
-7
~
1
7
2
-7
-7
, -V
3

1, V
~
13
22

J.
4
-~O
~O
40
0 0 -7
22
1 0 -7
2
t -7
3 17° ~O

7 1 3 il
0 0 -22 272 /1 -IT £2 -22 22 - 131 0
0 1 0 -i2
...2...
22
5
-IT i
11 272
7
-22 H 141 0
0 0 ...L
11 --fi /1 -IT
1
-22
4
141 h 181 0
0 0 0 0 0 0 0 0

Phase II
Minimize: lOYI + 4Y2 + 6Y3

Yl Y2 Y3 Y4 Y5 Y6 Y7 Ys Y9 r.h.s. Ratios

0 0 1 SiJ 9
1
IT
5
-b
7
13
21
23
-fi8
0 0 -21 -IT TI 21 141 8

0 0 2
IT /1 -n4 fi- IS1 8

10 4 6 0 0 0 0

In canonical form:

Yl Y2 Y3 Y4 Y5 Y6 Y7 Ys Y9 r.h.s. Ratios

7 1 13
0 0 -21 IT l2 21 -fi8
0 1 0 -21
9
-IT
5 7
21
4
23
21
1
n 8
1 0 0 ?1 /1 -IT IT I 18
S

0 0 0 li 4
IT
17
IT -IT
95
-n8

Y! = xt = 1\ x! = n
Y! = x! = ~~ x! = 141

y~ = x~ = g x~ = Ii by complementary slackness.
yt = Y! = y~ = 0
Y6 = n
Yl Y2 Y3 Y4 Y6 Ys r.h.s.

7 1 3 13-68
0 0 -21 IT 21 ---
22

9 5 7 23 + 88
0 0 -21 -IT 21 ---
22
1 + 88
0 0 fi 1
IT
4
-IT
11
95 - 788
0 0 0 19
IT
4
IT Ii 11

For r.h.s. entries to be nonnegative, 0 ≤ θ ≤ 13/6. y4 enters the basis with

θ = 13/6.

Y1 Y2 Y3 Y4 Y6 Y8 r.h.s.

7 1
0 0 1 -21 IT l2 0
0 1 0 9
-22 -IT
5 7
21 1i
1 0 0 l1 -h -IT
4 5
3"
0 0 0 19
IT
-±-
11
17
IT -24
22 2 3
0 0 -7 -7 -7 0 ~8
9 4 1 11
0 1 -7 0 -7 7 6 -A8
1 0 4
7 0 1.
7 -7
2
t 48
0 0 38
7 0 ~ 176 -24 -¥8

yT = xt = i x*6 -- 38
7

yi = x~ = Ii xi = ~
yt = xT = ° x*3 -- 1.§. by complementary slackness.
°
7

y! = yt = y: =
Y6 = 24

Y1 Y2 Y3 Y4 Y6 Y8
0 0 -7
22
-7
2
-7
3
o+ ~8
9 4
0 -7 0 -7
1
7 1i + 141 8

0 4
7 0 + -7
2
t + 48
0 0 \8 0 ~ ¥ -24 - 6,'8

As all r.h.s. entries are nonnegative for e ~ 0, no further critical points


can be found. Thus the solution is as in 5(a).
7(a). A = (2,1,3)

Y1 Y2 Y3 Y4 Y6 Y8 r.h.s.

7 3 13
0 0 -21 /1 21 TI
9 5 7 23
0 0 -21 -IT 22 22
4
1 0 0 /1 /1 -IT ...L
11
2,1, ,1, 3,1, }~ 141 IT
17
-if
7 1
0 0 1 -21 IT TI
3 13
TI
0 1 0 -TI
9

2
-IT
5
TI
7

4
n 1
0 0 IT /1 -IT IT
0 0 0 tt +,1, 141
17
IT -H - 3,1,

Basis remains optimal for Ie ~ 0. Therefore

xi = ~i + Ie, x! = n, x!, x!, x~ = 0, x~ = ii + 31e.

Chapter 4

Section 4.6
1(a). The decision tree for this problem is shown in Figure S.5. The
optimal solution is

x1*, x3* = 0, x2* = 1, x0* = 5.
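Because the feasible region is small, the branch and bound result can be confirmed by enumerating integer points. The constraint coefficients used below are an assumption: they are read off the first simplex tableau of this solution (objective 3x1 + 5x2 + 4x3, with right-hand sides 8, 7 and 12), not quoted from the exercise statement.

```python
from itertools import product

def feasible(x1, x2, x3):
    # Constraints as read from the initial tableau (assumed coefficients).
    return (2*x1 + 6*x2 + 3*x3 <= 8
            and 5*x1 + 4*x2 + 4*x3 <= 7
            and 6*x1 + x2 + x3 <= 12)

# The second constraint already forces every variable to be at most 1,
# so searching 0..2 in each coordinate is more than enough.
points = [x for x in product(range(3), repeat=3) if feasible(*x)]
best = max(points, key=lambda x: 3*x[0] + 5*x[1] + 4*x[2])
best_value = 3*best[0] + 5*best[1] + 4*best[2]
print(best, best_value)   # (0, 1, 0) 5
```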
To obtain the first node:

        x1     x2     x3     x4     x5     x6    r.h.s.  Ratio
         2     (6)     3      1      0      0      8      4/3
         5      4      4      0      1      0      7      7/4
         6      1      1      0      0      1     12     12
x0      -3     -5     -4      0      0      0      0

        1/3     1     1/2    1/6     0      0     4/3     8/3
       11/3     0     (2)   -2/3     1      0     5/3     5/6
       17/3     0     1/2   -1/6     0      1    32/3    64/3
x0     -4/3     0    -3/2    5/6     0      0    20/3

       -7/12    1      0     1/3   -1/4     0    11/12
       11/6     0      1    -1/3    1/2     0     5/6
       19/4     0      0      0    -1/4     1    41/4
x0     17/12    0      0     1/3    3/4     0    95/12

The problem has an optimal (noninteger) solution:

x2* = 11/12, x3* = 5/6, x6* = 41/4, x0* = 95/12.
To create nodes (I) and (II), examine x2* = 11/12 and introduce
x2 ≤ 0, (I)
x2 ≥ 1, (II).
Figure S.5. Decision tree for problem 1(a).

(I) Introducing a new constraint with slack variable X7:

x2 + x 7 = O.

Xl X2 X3 X4 Xs X6 X7 r.h.s.
7 1 1 11
-12 0 3 -4 0 0 IT
II 0 -3
1
"2
1
0 0 i
19 1 41
4" 0 0 0 -4 1 0 4"
0 1 0 0 0 0 0
17
IT 0 0 1
"3 i 0 0 9S
IT

In canonical form:

Xl" X2 X3 X4 Xs X6 X7 r.h.s.

7 1 1 11
-12 0 3 -4 0 0 IT
II 0 -3
1
t 0 0 .s.6
19 1 41
4" 0 0 0 -4 1 0 4"
172 0 0 ED t 0 1 -IT
11

17
12 0 0 1
3 i 0 0 H

Applying the dual simplex method:

Xl X2 X3 X4 X5 X6 X7 r.h.s.

0 0 0 0 0 0 0
5 I 7
4 0 0 4 0 0 4
11 0 0 0 I
-4 0 41
"4
7 3 11
-4 0 0 -4 0 "4
2 0 0 0 0 0 7

This has solution

x2* = 0
(as expected, as x2 ≤ 0 and x2 ≥ 0),

x3* = 7/4, x4* = 11/4, x0* = 7.

(II) Introducing a new constraint with slack variable xs:

x 2 - Xs = l.

Xl X2 X3 X4 X5 X6 Xs r.h.s.

7 1 1 11
-12 0 3 4 0 0 12
11 1 1 5
"6 0 1 -3 "2 0 0 "6
1 41
11 0 0 0 -4 1 0 "4
0 1 0 0 0 0 -1
17 1 3 95
12 0 0 3 4 0 0 12

In canonical form:

Xl X2 X3 X4 X5 X6 Xs r.h.s.

7 1 1 11
-12 0 3 -4 0 0 12
"6
11
0 -3
1 1
"2 0 0 i
19 1 41
"4 0 0 0 -4 0 "4
7 1 1
-12 0 0 -4 0 1 -12
12
17
0 0 3
4 0 0 n

Applying the dual simplex method:

Xl X2 X3 X4 X5 X6 Xs r.h.s.

0 0 0 0 0 -1

,
0 0 7
~
-7
2
0 V 4
7
0 0 0 19
7 -7
16
1 V 677

0 0 -7
4
0 -7
12
t
1 17 54
0 0 0 ~ 7 0 7 7

This has the solution:

x1* = 1/7, x2* = 1, x3* = 4/7, x6* = 67/7, x0* = 54/7.
(III) Introducing a new constraint with slack variable X9 :

Xl + X9 = o.

Xl X2 X3 X4 X5 X6 Xs X9 r.h.s.

0 0 0 0 0 - 1 0 1
0 0 1 ~ -
2
7 0 22
7 0 4
0 0 0 li -If 57
7 0 67
T
4 3 12 1
0 0 -7 7 0 -7 0 7
0 0 0 0 0 0 0
0 0 0 ~
7 t 0 17
7 0 574

In canonical form:

Xl X2 X3 X4 X5 X6 Xs X9 r.h.s.

0 1 0 0 0 0 - 1 0 1

,
0 0 5
7 -
2
7 0 22
7 0 4
0 0 0 li -7
16 .n
7 0 67
7
0 0 -7
4
0 12
-7 0 t
0 0 0 4 8) 0 12
7 -7
1

0 0 0 ~ t 0 17
7 0 54
7

Applying the dual simplex method:

XI X2 X3 X4 X5 X6 Xs X9 r.h.s.

0 1 0 0 0 0 -1 0
I
0 0 1 3 0 0 2 -
2
3
2
3
0 0 0 -3
I
0 1 -1 -3
16
V
1 0 0 0 0 0 0 0
4 7 I
0 0 0 -3 0 -4 - 3 3
4 I 23
0 0 0 3 0 0 3 3 3

which has the solution:

x1* = 0
(as expected, as x1 ≥ 0 and x1 ≤ 0),
x2* = 1, x3* = 2/3, x5* = 1/3, x6* = 31/3, x0* = 23/3.

(IV) Introducing a new constraint with slack variable X 10 :

Xl - XIO = 1.

XI X2 X3 X4 X5 X6 Xs XIO r.h.s.

0 0 0 0 0 -1 0
0 0 5
7 -
2
7 0 2l 0 4
7
19 16 57 67
0 0 0 7 -7 1 7 0 7
4 3 12 I
0 0 -7 7 0 -7 0 7
1 0 0 0 0 0 0 -1
S I 17 54
0 0 0 7 7 0 7 0 7

In canonical form:

XI X2 X3 X4 X5 X6 Xs XIO r.h.s.

0 0 0 0 0 -1 0
0 0 1 5
7 - 7
2
0 22
7 0 4
19 16 57 67
0 0 0 7 -7 7 0 7
0 0 4
-7
3
7 0 -7
12
0 +
0 0 0 4
-7
;l
7 0 0})
- 7 -7
6

8 I 17 574
0 0 0 7 7 0 7 0

Applying the dual simplex method:



Xl X2 X3 X4 X5 X6 Xs XlO r.h.s.

1 1 7 3
0 1 0 "3" -4 0 0 -TI -,:
1 1 11
0 0 -"3" -,: 0 0 b -1
0 0 0 0 -4
1
0 it H
1 0 0 0 0 0 0 -1 1
0 0 0 1 1
0 7 I
"3" -4 -TI -,:
1 3 17 13
0 0 0 "3" 4 0 0 TI 2

0 1 1 0 t 0 0 5
"4
1
2"
3 11
0 0 -3 1 --,: 0 0 -2 3
1 57 11
0 0 0 0 -4 1 0 TI 2
0 0 0 0 0 0 -1 1
0 0 0 "4
1
0 1 i -t
0 0 1 0 i 0 0 13
"4
11
"2

Note that this subproblem required two iterations. The final tableau
indicates that it does not have a feasible solution.
(V) Introducing a new constraint with slack variable XII:
X3 + X 11 = o.

Xl X2 X3 X4 X5 X6 Xs Xg X11 r.h.s.

0 1 0 0 0 0 -1 0 0
0 0 1
"3" 0 0 2 - 1 0 t
0 0 0 1
-"3" 0 1 -1 -¥ 0 31
3"
0 0 0 0 0 0 1 0 0
0 0 0 4
-"3" 1 0 -4 -
7
3 0 1
"3"
0 0 1 0 0 0 0 0 0
0 0 0 4
"3" 0 0 3 t 0 ¥
In canonical form:

Xl X2 X3 X4 X5 X6 Xs Xg X11 r.h.s.

0 1 0 0 0 0 -1 0 0 1
0 0 1 t 0 0 2 -
2
"3" 0 t
0 0 0 -3
1
0 -1 -¥ 0 ¥
1 0 0 0 0 0 0 1 0 0
0 0 0 4
-"3" 0 -4 -
7
"3" 0 1
"3"
0 0 0 -"3"
1
0 0 8> 2
"3" t
0 0 0 4
"3" 0 0 3 t 0 23
3"

Applying the dual simplex method:

Xl X2 X3 X4 Xs X6 Xs X9 X11 r.h.s.

I I I 4
0 1 0 "6 0 0 0 - 1" -2 1"
0 0 1 0 0 0 0 0 1 0
I 17 1 32
0 0 0 -"6 0 0 -3 -2 3
0 0 0 0 0 0 1 0 0
2 11 -2 5
0 0 0 1" 1 0 0 -3 1"
1 I 1 1
0 0 0 6 0 0 - 3 -2 1"
5 4 3 20
0 0 0 6 0 0 0 1" 2 3

which has solution:

x1*, x3* = 0 (as expected), x2* = 4/3, x5* = 5/3,
x6* = 32/3, x8* = 1/3, x0* = 20/3.

(VI) Introducing a new constraint with slack variable X12:

X3 - X12 = l.

XI X2 X3 X4 Xs X6 Xs X9 X12 r.h.s.

0 1 0 0 0 0 -1 0 0
0 0 t 0 0 2 -3
2
0 2
3
0 0 0 1
-1" 0 -1 -¥ 0 3
31

0 0 0 0 0 0 1 0 0
0 0 0 -3
4
0 -4 -3
7
0 t
0 0 0 0 0 0 0 -1
4 1 23
0 0 0 1" 0 0 3 1" 0 3

In canonical form:

XI X2 X3 X4 Xs X6 Xs X9 X12 r.h.s.

0 1 0 0 0 0 -1 0 0
0 0 3
I
0 0 2 -3
2
0 t
1 -1 16 31
0 0 0 -1" 0 -3 0 3
0 0 0 0 0 0 1 0 0
4 7
0 0 0 -3 1 0 -4 -3 0 I
3
0 0 0 3
1
0 0 2 8) 1 -"3
1

0 0 0 4
3 0 0 3 t 0 ¥-

Applying the dual simplex method:

XI X2 X3 X4 X5 X6 Xs X9 X12 r.h.s.

0 0 0 0 0 -1 0 0
0 0 1 0 0 0 0 0 -1
0 0 0 -3 0 -17 0 -8 13
1 0 0 ! 0 0 3 0 3
2" -t
0 0 0 -1 0 11 0 7
-2"
3
2"
I 3 I
0 0 0 -2" 0 0 -3 -2" 2"
0 0 0 J. 0 0 4 0 I
15
2 2"

This does not have a feasible solution.


(VII) Introducing a new constraint with slack variable X13:

X3 + X13 = 1.

XI X2 X3 X4 X5 X6 X7 XI3 r.h.s.

0 0 0 0 0 0 0
4
5
0 1 0 t 0 -1 0 i
19
"4 0 0 0 -4
I
0 0 4J
7 3 11
-4 0 0 -4 0 3 0 "4
0 0 1 0 0 0 0 1
2 0 0 0 0 0 7

In canonical form:

XI X2 X3 X4 X5 X6 X7 XI3 r.h.s.

0 0 0 0 0 0 0
CD 0 1 0 4
I
0 -1 0 4
7

lj- 0 0 0 -t 0 0 41
"4
7 3
-4 0 0 1 -4 0 3 0 If

-i 0 0 0 -4
I
0 1 1 -1
2 0 0 0 1 0 0 7

Applying the dual simplex method:

XI X2 X3 X4 X5 X6

0 0 0 0 0
0 0 1 0 0 0
6
0 0 0 0 -5 1
2
0 0 0 -5 0
I
0 0 0 5 0
3
0 0 0 0 5 0

This has solution:

x1* = 3/5, x2* = 0 (as expected), x3* = 1,
x4* = 19/5, x6* = 37/5, x0* = 29/5.

(VIII) Introducing a new constraint with slack variable X14:

X3 - X 14 = 2.

XI X2 X3 X4 X5 X6 X7 XI4 r.h.s.

0 0 0 0 0 0 0
4
~ 0 1 0 I
4 0 -1 0 i
19 I 41
4 0 0 0 -4 0 0 4
7 3 .li
-4 0 0 -4 0 -3 0 4
0 0 1 0 0 0 0 -1 20
2 0 0 0 0 0 7

In canonical form:

XI X2 X3 X4 X5 X6 X7 X I4 r.h.s.

0 1 0 0 0 0 1 0 0
i 0 0 4
I
0 -1 0 7
4
19 I 41
4 0 0 0 -4 1 0 0 4
7 3
-4 0 0 -4 0 -3 0 Ii

i 0 0 0 i 0 ED 1 -4
I

2 0 0 0 0 0 7

Applying the dual simplex method:

Xl X2 X3 X4 Xs X6 X7 X 14 r.h.s.

i 1 0 0 1
4 0 0 1 -4
1

0 0 0 0 0 0 -1 2
11 0 0 0 -4
1
0 0 4f
-2
11
0 0 -2
3
0 0 -3 !
s 0 0 0 1
0 1 -1 1
-4 -4 4
If 0 0 0 i 0 0 1 2;;

This does not have a feasible solution.


(IX) Introducing a new constraint with slack variable x 15 :
X2 + X 15 = l.

Xl X2 X3 X4 Xs X6 Xs X9 X I1 X 1S r.h.s.

0 1 0 i 0 0 0 -3
1
-2
1
0 t
0 0 0 0 0 0 0 0 0
1 17 1
0 0 0 -6 0 0 -3 -2 0 3l
0 0 0 0 0 0 0 0 0
0 0 0 i 1 0 0 11
-3 -2 0 s
3
0 0 0 i 0 0 1 -3
1
-2
1
0 1
3
0 0 0 0 0 0 0 0 1
0 0 0 ~
6 0 0 0 4
3 ! 0 20
3

In canonical form:

Xl X2 X3 X4 Xs X6 Xs X9 XI1 X1S r.h.s.

0 1 0 i 0 0 0 -3
1
-2
1
0 4
3
0 0 1 0 0 0 0 0 0 0
1 17 1 32
0 0 0 -6 0 1 0 -3 -2 0 3
0 0 0 0 0 0 0 0 0
0 0 0 i 0 0 11
-3 -2 0 s
3
0 0 0 i 0 0 1 -3
1
-2
1
0 t
0 0 0 ED 0 0 0 t 1
2 I -3
4

0 0 0 ~
6 0 0 0 t J.
2 0 230

Applying the dual simplex method:

XI X2 X3 X4 Xs X6 Xs Xg Xli XIS r.h.s.

0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 1 0 0
0 0 0 0 0 1 0 -6 -1 -1 11
0 0 0 0 0 0 0 0 0
0 0 0 0 7 I
0 0 -3 0 4 3
0 0 0 0 0 0 1 0 0 1 0
0 0 0 0 0 0 -2 -3 -6 2
0 0 0 0 0 0 0 3 4 5 5

This has a feasible integral solution:

x1*, x3* = 0, x2* = 1, x0* = 5.
This is the first incumbent.
(X) Introducing a new constraint with slack variable X 16 :

x 2 - X 16 = 2.

XI X2 X3 X4 Xs X6 Xs Xg Xli X I6 r.h.s.

I I
0 1 0 6 0 0 0 -3 -zI 0 4
3
0 0 0 0 0 0 0 0 0
I 17 I II
0 0 0 -6 0 0 -3 Z 0 3
0 0 0 0 0 0 0 0 0
0 0 0 2
3 1 0 0 3
11
-2 0 i
I I
0 0 0 6 0 0 -3 -zI 0 1.
3
0 1 0 0 0 0 0 0 0 -1 2
s 4 3
0 0 0 6 0 0 0 3 Z 0 23°

In canonical form:

XI X2 X3 X4 Xs X6 Xs Xg Xli X I6 r.h.s.

0 0 I
6 0 0 0 -3
I
-zI 0 4
0 0 0 0 0 0 0 0 0
I 17 I 32
0 0 0 -6 0 0 -3 Z 0 -3
0 0 0 0 0 0 0 0 0
0 0 0 2
3 0 0 ¥ -2 0 ~
3
I I I I
0 0 0 6 0 0 1 -3 -z 0 3
0 0 0 I
6 0 0 0 -3
I
ED -3
2

0 0 0 ~
6 0 0 0 4
3
3
Z 0 Zf

Applying the dual simplex method:

XI X2 X3 X4 X5 X6 Xs X9 XII X I6 r.h.s.

0 0 0 0 0 0 0 0 -1 2
0 0 1 I
3
I
0 0 0 ED
16
0 2
-1
-3
4

0 0 0 -3" 0 0 "3 0 10
0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 -3"
7
0 -4 12
"3
0 0 0 0 0 0 0 0 -1 1
0 0 0 -3"
1
0 0 0 2
3 1 -2 1
0 0 0 t 0 0 0 t 0 3 1/

Xl X2 X3 X4 X5 X6 Xs X9 XII X 16 r.h.s.

0 0 0 0 0 0 0 0 -1 2
3 1
0 0 -2 -2 0 0 0 0 -3 2
7 2
0 0 8 3 0 0 0 0 15 -3
1 0 3
2
1
2 0 0 0 0 0 3 -2
7 7
0 0 -2 -6" 0 0 0 0 -11 9
0 0 0 0 0 0 1 0 0 -1 1
0 0 1 0 0 0 0 0 0 0
1
0 0 -2 3 0 0 0 0 0 4 4

This does not have a feasible solution.


(XI) Introducing a new constraint with slack variable X17:
Xl + X l7 = o.

Xl X2 X3 X4 X5 X6 X7 X13 X17 r.h.s.

0 1 0 0 0 0 1 0 0 0
0 0 1 0 0 0 0 0
0 0 0 0 -! 1 1/ 1/ 0 V
0 0 0 1 -s 2
0 -s- 22 27
-s- 0 1/
O 0 0 t 0 -s4 -s4 0 t
0 0 0 0 0 0 0 0
0 0 0 0 t 0 Il t 0 ¥

In canonical form:

XI X2 X3 X4 Xs X6 X7 X13 X 17 r.h.s.

0 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0
ti 37
0 0 0 0 -s6 s It 0 5
ti
0 0 0 -s2 0 -5
22
-5
27
0 s
1 0 0 0 t 0 -s4 -s4 0 J.
s
0 0 0 0 8) 0 ! 4
S 1 -s3
0 0 0 0 ! 0 Il ! 0 2t

XI X2 X3 X4 Xs X6 X7 XI 3 X17 r.h.s.

0 0 0 0 0 1 0 0 0
0 0 1 0 0 0 0 1 0 1
0 0 0 0 0 1 -1 -1 -6 11
0 0 0 1 0 0 -6 -7 -2 5
1 0 0 0 0 0 0 0 1 0
0 0 0 0 1 0 -4 -4 -5 3
0 0 0 0 0 0 5 4 3 4

This has a feasible integral solution:

$x_1^* = x_2^* = 0,\quad x_3^* = 1,\quad x_0^* = 4.$

This is less than the value of the incumbent, so it can be discarded as suboptimal.
(XII) Introducing a new constraint with slack variable x1S :

XI X2 X3 X4 Xs X6 X7 XI3 XIS r.h.s.

0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0
0 0 0 0 -! It It 0 37
5
2 22
0 0 0 1 -5 0 -5 -s7 0 It

1 0 0 0 t 0 - s 4
-s4 0 !
0 0 0 0 0 0 0 -1
0 0 0 0 J.
s 0 IS3
! 0 l.2.
s

In canonical form:

Xl X2 X3 X4 X5 X6 X7 X13 XIS r.h.s.

0 0 0 0 0 1 0 0 0
0 0 1 0 0 0 0 0
19 19 37
0 0 0 0 -s-6 1 5 5 0 5
0 0 0 1 -s-2 0 -5
22
-s-7 0 II
5
1
0 0 0 s- 0 - s- 4
-s-4 0 ~
O 0 0 0 t 0 - s-4 8) -s-2
O 0 0 0 3
s- 0 V ! 0 2/

Applying the dual simplex method:

Xl X2 X3 X4 X5 X6 X7 X13 XIS r.h.s.

0 0 0 0 0 1 0 0 0
0 0 1 0 1
4 0 -1 0 i 1.
2
0 0 0 0 -4
1
1 0 0 11 II
0 0 0 -4
3
0 3 0 -4
7
!
1 0 0 0 0 0 0 0 -1
1 5 1
0 0 0 0 -4 0 1 -4 "2
0 0 0 0 0 0 2 5

This has the solution:


xi = 1, xi =0, x~ = t, x~ = 5,
which is no better than the incumbent and can be ignored.
Thus there is no node in the decision tree with bound greater than that of
node (IV). As this represents a feasible solution it is optimal. Hence
$x_1^* = x_3^* = 0,\quad x_2^* = 1,\quad x_0^* = 5.$
2(a). The decision tree is shown in Figure S.6. The optimal solution is
$x_1^* = x_3^* = 0,\quad x_2^* = 1,\quad x_0^* = 5.$
3(a). This has the same solution as 2(a).
3(b). $x_1 = y_1^0 + 2y_1^1 \le 2$
$x_2 = y_2^0 + 2y_2^1 \le 3$
$x_3 = y_3^0 + 2y_3^1 \le 2.$

Figure S.6

Hence the problem becomes

Maximize: $4y_1^0 + 8y_1^1 + 3y_2^0 + 6y_2^1 + 3y_3^0 + 6y_3^1$

subject to: $3y_1^0 + 6y_1^1 + 4y_2^0 + 8y_2^1 + 2y_3^0 + 4y_3^1 \le 14$
$4y_1^0 + 8y_1^1 + 2y_2^0 + 4y_2^1 + y_3^0 + 2y_3^1 \le 10$
$2y_1^0 + 4y_1^1 + y_2^0 + 2y_2^1 + 3y_3^0 + 6y_3^1 \le 7$
$y_j^i = 0 \text{ or } 1,\quad i = 0, 1,\quad j = 1, 2, 3.$
The optimal solution is
$y_1^0 = y_2^1 = y_3^0 = 1$
$y_1^1 = y_2^0 = y_3^1 = 0,$
with value 13.
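The 0-1 problem above has only 2^6 = 64 candidate points, so the stated optimum can be checked by complete enumeration. The following is a minimal sketch, not the method used in the text; the function name `solve_by_enumeration` and the variable ordering $(y_1^0, y_1^1, y_2^0, y_2^1, y_3^0, y_3^1)$ are assumptions made for illustration.

```python
from itertools import product

# Coefficients in the (assumed) order (y_1^0, y_1^1, y_2^0, y_2^1, y_3^0, y_3^1).
OBJECTIVE = [4, 8, 3, 6, 3, 6]
CONSTRAINTS = [
    ([3, 6, 4, 8, 2, 4], 14),
    ([4, 8, 2, 4, 1, 2], 10),
    ([2, 4, 1, 2, 3, 6], 7),
]


def solve_by_enumeration():
    """Try every 0-1 vector and keep the best feasible one."""
    best_value, best_y = None, None
    for y in product((0, 1), repeat=len(OBJECTIVE)):
        # A point is feasible when every "<=" constraint holds.
        feasible = all(
            sum(a * v for a, v in zip(row, y)) <= rhs for row, rhs in CONSTRAINTS
        )
        if not feasible:
            continue
        value = sum(c * v for c, v in zip(OBJECTIVE, y))
        if best_value is None or value > best_value:
            best_value, best_y = value, y
    return best_value, best_y


best_value, best_y = solve_by_enumeration()
# Recover the original integer variables x_j = y_j^0 + 2 * y_j^1.
x = [best_y[2 * j] + 2 * best_y[2 * j + 1] for j in range(3)]
print(best_value, best_y, x)
```

Enumeration is only practical here because the problem is tiny; it serves as a check on the branch-and-bound or cutting-plane answer rather than as a replacement for it.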
4(a). From 1(a) the final tableau is

   x1      x2    x3    x4      x5     x6    r.h.s.
  -7/12    1     0     1/3    -1/4    0     11/12
  11/6     0     1    -1/3     1/2    0     5/6
  19/4     0     0     0      -1/4    1     41/4
  17/12    0     0     1/3     3/4    0     95/12

Now
$x_2 - \tfrac{7}{12}x_1 + \tfrac{1}{3}x_4 - \tfrac{1}{4}x_5 = \tfrac{11}{12}$
$x_2 = \tfrac{11}{12} - (-1 + \tfrac{5}{12})x_1 - (0 + \tfrac{1}{3})x_4 - (-1 + \tfrac{3}{4})x_5.$
Adding in slack variable $x_7$, the constraint is
$-\tfrac{5}{12}x_1 - \tfrac{1}{3}x_4 - \tfrac{3}{4}x_5 + x_7 = -\tfrac{11}{12}.$

Introducing this into the above tableau (the pivot element is shown in brackets):

   x1      x2    x3    x4       x5     x6    x7    r.h.s.
  -7/12    1     0     1/3     -1/4    0     0     11/12
  11/6     0     1    -1/3      1/2    0     0     5/6
  19/4     0     0     0       -1/4    1     0     41/4
  -5/12    0     0    [-1/3]   -3/4    0     1    -11/12
  17/12    0     0     1/3      3/4    0     0     95/12

Applying the dual simplex method:

   x1      x2    x3    x4    x5     x6    x7    r.h.s.
  -1       1     0     0    -1      0     1     0
   9/4     0     1     0     5/4    0    -1     7/4
  19/4     0     0     0    -1/4    1     0     41/4
   5/4     0     0     1     9/4    0    -3     11/4
   1       0     0     0     0      0     1     7

Therefore
$x_3 + \tfrac{9}{4}x_1 + \tfrac{5}{4}x_5 - x_7 = \tfrac{7}{4}.$
Adding in slack variable $x_8$, the constraint is:
$-\tfrac{1}{4}x_1 - \tfrac{1}{4}x_5 + x_8 = -\tfrac{3}{4}.$

Introducing this into the preceding tableau:

   x1      x2    x3    x4    x5       x6    x7    x8    r.h.s.
  -1       1     0     0    -1        0     1     0     0
   9/4     0     1     0     5/4      0    -1     0     7/4
  19/4     0     0     0    -1/4      1     0     0     41/4
   5/4     0     0     1     9/4      0    -3     0     11/4
  -1/4     0     0     0    [-1/4]    0     0     1    -3/4
   1       0     0     0     0        0     1     0     7

   x1    x2    x3    x4    x5    x6    x7     x8    r.h.s.
   0     1     0     0     0     0     1     -4     3
   1     0     1     0     0     0    -1      5    -2
   5     0     0     0     0     1     0     -1     11
  -1     0     0     1     0     0    [-3]    9    -4
   1     0     0     0     1     0     0     -4     3
   1     0     0     0     0     0     1      0     7

   x1      x2    x3    x4       x5    x6    x7    x8    r.h.s.
  -1/3     1     0     1/3      0     0     0    -1     5/3
   4/3     0     1    [-1/3]    0     0     0     2    -2/3
   5       0     0     0        0     1     0    -1     11
   1/3     0     0    -1/3      0     0     1    -3     4/3
   1       0     0     0        1     0     0    -4     3
   2/3     0     0     1/3      0     0     0     3     17/3

   x1    x2    x3    x4    x5    x6    x7    x8    r.h.s.
   1     1     1     0     0     0     0     1     1
  -4     0    -3     1     0     0     0    -6     2
   5     0     0     0     0     1     0    -1     11
  -1     0    -1     0     0     0     1    -5     2
   1     0     0     0     1     0     0    -4     3
   2     0     1     0     0     0     0     5     5

The last tableau represents the optimal solution:

$x_1^* = x_3^* = 0,\quad x_2^* = 1,\quad x_0^* = 5.$
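Each cut in 4(a) is obtained by taking the fractional part of every entry in a tableau row whose right-hand side is not an integer. The following is a small sketch of that step using exact rational arithmetic; the function name `gomory_cut` is hypothetical, and the returned row is meant to be read together with a new slack column whose coefficient is 1.

```python
from fractions import Fraction as F
from math import floor


def gomory_cut(coefficients, rhs):
    """Return (cut_coefficients, cut_rhs) for the fractional cut

        -sum(frac(a_j) * x_j) + s = -frac(b),

    where frac(q) = q - floor(q) and s is a new nonnegative slack variable.
    """
    def frac(q):
        return q - floor(q)

    return [-frac(a) for a in coefficients], -frac(rhs)


# The x2 row of the final tableau from 1(a): columns x1..x6, then the r.h.s.
row = [F(-7, 12), F(1), F(0), F(1, 3), F(-1, 4), F(0)]
cut_row, cut_rhs = gomory_cut(row, F(11, 12))
print(cut_row, cut_rhs)
```

Using `Fraction` avoids the rounding problems that floating-point fractional parts would introduce (for example, the fractional part of -1/4 must come out as exactly 3/4).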
5(a). From 1(a) the final tableau is

   x1      x2    x3    x4      x5     x6    r.h.s.
  -7/12    1     0     1/3    -1/4    0     11/12
  11/6     0     1    -1/3     1/2    0     5/6
  19/4     0     0     0      -1/4    1     41/4
  17/12    0     0     1/3     3/4    0     95/12

Now substituting in:

Df.J = D'.(D'.
J J
- 1) - I Ls_
kE
ajkYk +
k
Ls+
E
ajkYk - x,

$0 + \tfrac{11}{12} - x_2 = -\tfrac{7}{12}x_1 + \tfrac{1}{3}x_4 - \tfrac{1}{4}x_5$

$\tfrac{11}{12} = \left\{\tfrac{11}{12}\left(\tfrac{11}{12} - 1\right)^{-1}\left[-\tfrac{7}{12}x_1 - \tfrac{1}{4}x_5\right]\right\} + \tfrac{1}{3}x_4 - x_7.$
Therefore
$\tfrac{11}{12} = \tfrac{77}{12}x_1 + \tfrac{11}{4}x_5 + \tfrac{1}{3}x_4 - x_7.$

Introducing this into the tableau (the pivot element is shown in brackets):

   x1         x2    x3    x4      x5      x6    x7    r.h.s.
  -7/12       1     0     1/3    -1/4     0     0     11/12
  11/6        0     1    -1/3     1/2     0     0     5/6
  19/4        0     0     0      -1/4     1     0     41/4
 [-77/12]     0     0    -1/3    -11/4    0     1    -11/12
  17/12       0     0     1/3     3/4     0     0     95/12

   x1    x2    x3    x4        x5       x6    x7        r.h.s.
   0     1     0     4/11      0        0    -1/11      1
   0     0     1    -3/7      -2/7      0     2/7       4/7
   0     0     0    -19/77    -16/7     1     57/77     67/7
   1     0     0     4/77      3/7      0    -12/77     1/7
   0     0     0     20/77     1/7      0     17/77     54/7

Hence the optimal solution is

$x_1^* = \tfrac{1}{7},$
$x_2^* = 1$ (which was constrained to be integral)
Chapter 5

Section 5.6
1(a). The graph is shown in Figure S.7. The shortest path is
$\langle p_1, p_2, p_5, p_8, p_{11} \rangle$ with a length of 40.

Figure S.7

Figure S.8

2(a) and 3(a). The graph is shown in Figure S.8. The order of inclusion
of the lines for Kruskal's method is
{P3,P4}, {P4,P6}' {P6,P7}' {P4,P2}, {PS,Pll}, {PS,P9}' {P6,PS}'
{P9,P10} [reject: {P3,P7}' {Ps,Ps}] {Ps,Ps} [reject: {P10,Pll},
{P2'PS}' {P7,PlO}, {P6,P9}] {P1,P2}'
The weight of the minimal spanning tree is 75.
4(a). The minimum cut is {(P1,P2),(P6,P2),(P6,PS),(P7'PS)}, with a ca-
pacity of 18. The arc capacities are shown in Figure S.9.
5(a). The optimal flow assignment is
112 = 7, 115 = 11, 124 = 4, 123 = 5, 162 = 2,
156 = 6, 157 = 5,
16S = 4, 167 = 0,


Figure S.9

6(a). The optimal solution is shown in Figure S.10. The cost of this flow
is 325. Note that it is different from the solution to 5(a).

Figure S.10

7(b)

Activity   t_i   es_i   ls_i   ec_i   lc_i   tf_i   ff_i
   α        0     0      0      0      0      0      0     critical
   1        5     0      0      5      5      0      0     critical
   2       10     0      1     10     11      1      1
   3        8     0      6      8     14      6      0
   4        6     5      5     11     11      0      0     critical
   5       12     5      6     17     18      1      1
   6        7    11     11     18     18      0      0     critical
   7        4     8     14     12     18      6      6
   8        6    18     18     24     24      0      0     critical
   9       10     8     14     18     24      6      6
   ω        0    24     24     24     24      0      0     critical

7(c)

Activity   t_i   es_i   ls_i   ec_i   lc_i   tf_i   ff_i
   α        0     0      0      0      0      0      0     critical
   1        6     0     30      6     36     30     30
   2        4     0      0      4      4      0      0     critical
   3        5     4     24      9     29     20      0
   4        6     4      4     10     10      0      0     critical
   5        4     4      6      8     10      2      2
   6        3     9     29     12     32     20     20
   7       10    10     10     20     20      0      0     critical
   8       12    20     20     32     32      0      0     critical
   9        4    32     32     36     36      0      0     critical
   ω        0    36     36     36     36      0      0     critical

8. Activity 4 ceases to be critical. The new critical path is

(α, 2, 5, 7, 8, 9, ω).
The earliest completion time is now 34.

Chapter 6

Section 6.9
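1(a). The solution is shown in Figure S.11. The shortest path is
$\langle p_1, p_2, p_5, p_8, p_{11} \rangle$
with a length of 40.

Figure S.11

2(a). Let $f_n(s)$ be the return when $s$ has been allocated to $x_1, x_2, \ldots, x_n$, $n = 1, 2, 3, 4$. Let

$s_n = x_1 + x_2 + \cdots + x_n.$

Then

$f_n(s_n) = \max_{0 \le x_n \le s_n} \{f_{n-1}(s_n - x_n) + 7x_n - n x_n^2\},\quad n = 2, 3, 4,$

$s_1 = x_1$
$f_1(s_1) = [7x_1 - x_1^2]_{x_1 = s_1} = 7s_1 - s_1^2,$

$f_2(s_2) = \max_{0 \le x_2 \le s_2} \{f_1(s_2 - x_2) + 7x_2 - 2x_2^2\}$
$\phantom{f_2(s_2)} = \max_{0 \le x_2 \le s_2} \{7(s_2 - x_2) - (s_2 - x_2)^2 + 7x_2 - 2x_2^2\},$

which is maximized at $x_2^* = \tfrac{1}{3}s_2$, giving $f_2(s_2) = 7s_2 - \tfrac{2}{3}s_2^2$. Similarly, $x_3^* = \tfrac{2}{11}s_3$ and

$f_3(s_3) = 7s_3 - \tfrac{2}{3}s_3^2 + \tfrac{8}{33}s_3^2 - \tfrac{4}{33}s_3^2 = 7s_3 - \tfrac{6}{11}s_3^2,$

$s_4 = x_4 + s_3 = 8$
$f_4(8) = \max_{0 \le x_4 \le 8} \{f_3(8 - x_4) + 7x_4 - 4x_4^2\}$
$\phantom{f_4(8)} = \max_{0 \le x_4 \le 8} \{7(8 - x_4) - \tfrac{6}{11}(8 - x_4)^2 + 7x_4 - 4x_4^2\}$
$\phantom{f_4(8)} = \max_{0 \le x_4 \le 8} \{\tfrac{232}{11} + \tfrac{96}{11}x_4 - \tfrac{50}{11}x_4^2\}.$

Let

$F(x_4) = \tfrac{232}{11} + \tfrac{96}{11}x_4 - \tfrac{50}{11}x_4^2.$

Then

$\dfrac{dF(x_4^*)}{dx_4} = \tfrac{96}{11} - \tfrac{100}{11}x_4^* = 0.$

Therefore
$x_4^* = \tfrac{24}{25} \in [0, 8].$
Also,

$\dfrac{d^2F(x_4^*)}{dx_4^2} = -\tfrac{100}{11} < 0,$

hence $x_4^*$ is a maximum point. Therefore

$f_4(8) = \tfrac{232}{11} + \tfrac{1}{2}\left(\tfrac{96}{11}\right)\left(\tfrac{24}{25}\right) = \tfrac{632}{25}.$

Recapitulating:

$s_4 = 8 \Rightarrow x_4^* = \tfrac{24}{25}$
$s_3 = 8 - \tfrac{24}{25} = \tfrac{176}{25} \Rightarrow x_3^* = \left(\tfrac{2}{11}\right)\left(\tfrac{176}{25}\right) = \tfrac{32}{25}$
$s_2 = \tfrac{176}{25} - \tfrac{32}{25} = \tfrac{144}{25} \Rightarrow x_2^* = \left(\tfrac{1}{3}\right)\left(\tfrac{144}{25}\right) = \tfrac{48}{25}$
$s_1 = \tfrac{144}{25} - \tfrac{48}{25} = \tfrac{96}{25} \Rightarrow x_1^* = \tfrac{96}{25}.$
The optimum is $\tfrac{632}{25}$.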

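2(b). Let $f_n(s)$ be the return when $s$ has been allocated to $x_4, x_3, \ldots, x_n$.
Let
$s_n = x_n + x_{n+1} + \cdots + x_4,\quad n = 1, 2, 3, 4.$
Then

$s_4 = x_4$
$f_4(s_4) = [7x_4 - 4x_4^2]_{x_4 = s_4} = 7s_4 - 4s_4^2,$

$f_3(s_3) = \max_{0 \le x_3 \le s_3} \{f_4(s_3 - x_3) + 7x_3 - 3x_3^2\}$
$\phantom{f_3(s_3)} = \max_{0 \le x_3 \le s_3} \{7(s_3 - x_3) - 4(s_3 - x_3)^2 + 7x_3 - 3x_3^2\}$
$\phantom{f_3(s_3)} = \max_{0 \le x_3 \le s_3} \{7s_3 - 4s_3^2 + 8s_3x_3 - 7x_3^2\}.$

Let

$F_3(x_3) = 7s_3 - 4s_3^2 + 8s_3x_3 - 7x_3^2.$

Then

$\dfrac{dF_3(x_3^*)}{dx_3} = 8s_3 - 14x_3^* = 0.$

Therefore

$x_3^* = \tfrac{4}{7}s_3 \in [0, s_3]$

and

$\dfrac{d^2F_3(x_3^*)}{dx_3^2} = -14 < 0,$

hence $x_3^*$ is a maximum point. Hence

$f_3(s_3) = 7s_3 - 4s_3^2 + \tfrac{32}{7}s_3^2 - \tfrac{16}{7}s_3^2 = 7s_3 - \tfrac{12}{7}s_3^2,$

$f_2(s_2) = \max_{0 \le x_2 \le s_2} \{f_3(s_2 - x_2) + 7x_2 - 2x_2^2\}$
$\phantom{f_2(s_2)} = \max_{0 \le x_2 \le s_2} \{7s_2 - \tfrac{12}{7}s_2^2 + \tfrac{24}{7}s_2x_2 - \tfrac{26}{7}x_2^2\}.$

Let

$F_2(x_2) = 7s_2 - \tfrac{12}{7}s_2^2 + \tfrac{24}{7}s_2x_2 - \tfrac{26}{7}x_2^2.$

Hence

$\dfrac{dF_2(x_2^*)}{dx_2} = \tfrac{24}{7}s_2 - \tfrac{52}{7}x_2^* = 0$

and

$x_2^* = \tfrac{6}{13}s_2 \in [0, s_2].$

Hence

$\dfrac{d^2F_2(x_2^*)}{dx_2^2} = -\tfrac{52}{7} < 0$

and $x_2^*$ is a maximum point. Therefore

$f_2(s_2) = 7s_2 - \tfrac{12}{7}s_2^2 + \tfrac{144}{91}s_2^2 - \tfrac{72}{91}s_2^2 = 7s_2 - \tfrac{12}{13}s_2^2.$

Now

$f_1(s_1) = \max_{0 \le x_1 \le 8} \{f_2(8 - x_1) + 7x_1 - x_1^2\}$
$\phantom{f_1(s_1)} = \max_{0 \le x_1 \le 8} \{7(8 - x_1) - \tfrac{12}{13}(8 - x_1)^2 + 7x_1 - x_1^2\}.$

Let

$F_1(x_1) = 7(8 - x_1) - \tfrac{12}{13}(8 - x_1)^2 + 7x_1 - x_1^2.$

Hence

$\dfrac{dF_1(x_1^*)}{dx_1} = \tfrac{24}{13}(8 - x_1^*) - 2x_1^* = 0,\quad x_1^* = \tfrac{96}{25} \in [0, 8],$

and

$\dfrac{d^2F_1(x_1^*)}{dx_1^2} = -\tfrac{50}{13} < 0,$

hence $x_1^*$ is a maximum point. Therefore

$f_1(s_1) = 7\left(8 - \tfrac{96}{25}\right) - \tfrac{12}{13}\left(8 - \tfrac{96}{25}\right)^2 + 7\left(\tfrac{96}{25}\right) - \left(\tfrac{96}{25}\right)^2 = \tfrac{632}{25}.$

Recapitulating:
$s_1 = 8 \Rightarrow x_1^* = \tfrac{96}{25}$
$s_2 = 8 - \tfrac{96}{25} = \tfrac{104}{25} \Rightarrow x_2^* = \left(\tfrac{6}{13}\right)\left(\tfrac{104}{25}\right) = \tfrac{48}{25}$
$s_3 = \tfrac{104}{25} - \tfrac{48}{25} = \tfrac{56}{25} \Rightarrow x_3^* = \left(\tfrac{4}{7}\right)\left(\tfrac{56}{25}\right) = \tfrac{32}{25}$
$s_4 = \tfrac{56}{25} - \tfrac{32}{25} = \tfrac{24}{25} \Rightarrow x_4^* = \tfrac{24}{25}.$
The optimum is $\tfrac{632}{25}$.
There is slightly less effort involved in forward recursion.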
3. Let $f_n(s)$ be the return when $s$ has been allocated to $x_1, x_2, \ldots, x_n$,
$n = 1, 2, 3, 4$. Let

$s_n = x_1 + x_2 + \cdots + x_n.$

Then
$f_n(s_n) = \max_{0 \le x_n \le s_n} \{f_{n-1}(s_n - x_n) + 7x_n - n x_n^2\}.$

Now
Sl = Xl

11(Sl) = max {7Xl - xi}.


XI
Xl=Sl

Sl 0 1 2 3 4 5 6 7 8

11(Sl) 0 6 10 12 12 10 6 0 -8

Now

12(S2) = max {fl(Sl) + 7X2 - 2xD·


X2
O.s X2.s 82

S2 0 2 3 4 5 6 7 8 x~ H S 2)
0 0 0 0
1 6 5 1 6
2 10 11 6 1 11
3 12 15 12 3 15
4 12 17 16 9 -4 1 17
5 10 17 18 13 2 -15 2 18
6 6 15 18 15 6 -9 -30 2 18
7 0 11 16 15 8 -5 -24 -49 2 16
8 -8 5 12 13 8 -3 -20 -43 -72 3 13

Now

S3 = X3 + S2
$f_3(s_3) = \max_{0 \le x_3 \le s_3} \{f_2(s_2) + 7x_3 - 3x_3^2\}.$

X3

S3 0 2 3 4 5 6 7 8 x~ f2(S3)

0 0 0 0
1 1 4 0 6
2 11 10 2 0 11
3 15 15 8 -6 0,1 15
4 17 19 13 0 -20 1 19
5 18 21 17 5 -14 -40 1 21
6 18 22 19 9 -10 -34 -66 1 22
7 16 22 20 11 -8 -30 -60 -98 1 22
8 13 20 20 12 -8 -28 -56 -92 -136 1,2 20

Now

S4 = S3 + X4 = 8
f4(S4) = max {f3(S3) + 7X4 - 4xn·
X4
0:s;x4:s;8

x.

S4 0 1 2 3 4 5 6 7 8 xl f4(S4)

8 20 25 20 0 -17 -44 -80 -125 -180 1 25

$s_4 = 8 \Rightarrow x_4^* = 1$
$s_3 = 8 - 1 = 7 \Rightarrow x_3^* = 1$
$s_2 = 7 - 1 = 6 \Rightarrow x_2^* = 2$
$s_1 = 6 - 2 = 4 \Rightarrow x_1^* = 4$
with an optimum of 25.
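The tabular forward recursion of Exercise 3 can be reproduced mechanically. The sketch below assumes, as the tables indicate, that the stage return is $7x_n - n x_n^2$, that the variables are nonnegative integers, and that exactly 8 units are allocated; the function name `allocate` and its parameters are illustrative only.

```python
def allocate(total=8, stages=4):
    """Forward dynamic programming for max sum(7*x_n - n*x_n**2), sum(x_n) = total."""
    # Stage 1: everything allocated so far goes to x1, so f1(s) = 7s - s^2.
    f = [7 * s - s * s for s in range(total + 1)]
    choice = [list(range(total + 1))]  # choice[n-1][s] = best x_n for state s

    for n in range(2, stages + 1):
        new_f, best_x = [], []
        for s in range(total + 1):
            # Try every split of s into x_n and the s - x_n left for earlier stages.
            x_star = max(range(s + 1), key=lambda x: f[s - x] + 7 * x - n * x * x)
            best_x.append(x_star)
            new_f.append(f[s - x_star] + 7 * x_star - n * x_star * x_star)
        f = new_f
        choice.append(best_x)

    # Backtrack from the final state to recover x4, x3, x2, x1.
    s, xs = total, []
    for n in range(stages, 0, -1):
        x = choice[n - 1][s]
        xs.append(x)
        s -= x
    return f[total], xs[::-1]


value, allocation = allocate()
print(value, allocation)
```

The backtracking loop mirrors the recapitulation step in the text: each stage reads off its optimal decision for the current state and passes the remainder to the preceding stage.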

4. Let
Xn = the number of units produced at stage.
J..(s) = return when a total of s units have been produced by the end of
stage n.
Sn = Xl + X 2 + ... + X n •
cx,j = production cost of Xi units in period j.

J..(sn) = min {J.. -1 (Sn - Xn) + CXnn + (4 - n)Xn}, n = 1,2,3,4.


X4
O,:=;;;xn~sn

Now

Therefore
f1(S!) = min {c x .! + 3X1}.
x.
O:SX1:S S1

o 1 2 3 4 5 6

2 7 14 18 23 27 32

Now
S2 = Sl + X2
f2(S2) = min {J1(S2 - x 2) + cx2 2 + 2x 2}.
x2,,6
1 :::;X2:::;S2
5"'2,,12

The restrictions 1 :s; X 2 and 5 :s; S2 :s; 12 arise from the facts that 19 units
must be produced but that no more than 6 can be produced per period.

X2

S2 1 2 3 4 5 6 x! f2(S2)

7 41 40 40 41 43 39 2,3 39
8 45 44 46 47 46 3 44
9 49 50 52 50 3 49
10 55 56 55 4,6 55
11 61 59 6 59
12 64 6 64

S3 = S2 + X3
f3(S3) = min {f2(S3 - X3) + cx3 3 + X3}·
l5x356
135'3518

x2

S2 2 3 4 5 6 x! f3(s3)

13 73 72 73 69 66 66 5,6 65
14 77 77 75 71 70 6 70
15 82 79 77 75 6 75
16 84 81 81 5,6 81
17 86 85 6 85
18 90 6 90

$s_4 = s_3 + x_4 = 19$

f4(S4) = min {f3(19 - x 4) + CX44 }·


l5x456

1 2 3 4 5 6

19 95 94 94 90 87 87 5

The minimum cost is $87. Backtracking:

xt = 5 or xt = 6
xj = 6 xj = 6
xi = 3 xi = 6
xi = 5 xi = 1.

5. Let
f..(S) = return when S has been allocated to Xl' X2' ... , xn
Sn = Xl + x 2 + ... + x n •

Then

Xn
O<xn<sn

11(Sl) = max {xd = Sl


XI
Xl =SI

12(S2) = max {f(S2 - X2)X 2}


X2
o <X2 <52
max {(S2 - X 2)X 2}.
X2
o <X2 <S2

Let

Hence
aF 2 (xf) _
- S2 -
*_0
sX 2 -
aX 2

and
xi = S2/ 2 E [0, S2J.
Also

a2 Fz{xi) = -2
< 0,
aX2
2

hence xi is a maximum point.

13(S3) = max {fz{S3 - X3)X3}


x3
o <X3 <S3

Let

Then

and

hence S3/3 a maximum point, and


a2F3(S3) =
::l 2
2 0
S3> ,
uX3

hence S3 is a minimum point. Therefore


x~ = S3/3
f3(S3) = i(s 3/3)(S3 - (S3/3)f = (S3/3)3,

Hence
f4(S4) = max {J3(9 - X4)X4 }
X4
o <X4 < 9

Let

Then

and
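$x_4^* = 0,\ \tfrac{9}{4},\ \text{or } 9.$
But as $0 < x_4^* < 9$,

$x_4^* = \tfrac{9}{4}.$

Backtracking:
$s_4 = 9 \Rightarrow x_4^* = \tfrac{9}{4}$
$s_3 = 9 - x_4^* = \tfrac{27}{4} \Rightarrow x_3^* = s_3/3 = \tfrac{9}{4}$
$s_2 = \tfrac{27}{4} - x_3^* = \tfrac{18}{4} \Rightarrow x_2^* = s_2/2 = \tfrac{9}{4}$
$s_1 = \tfrac{18}{4} - x_2^* = \tfrac{9}{4} \Rightarrow x_1^* = s_1 = \tfrac{9}{4}.$

The value of the optimal solution is $\left(\tfrac{9}{4}\right)^4$.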


6. Let
/,,(S) = return when S has been allocated to Xl' x 2, ... , Xn
Sn = Xl + X2 + ... + X n •

Then

Xn
o <X n

Hence
11(sd = min {xi} = sf
Xl
Xl =51

12(S2) = min
X2
Ul(S2/X2) xn +
o
{(S2/X2)2 + xn.
<Xl

= min
X2
o <Xl
Let

Then

Therefore

Also
a2 F2 (xi) = 6s~ + 2 > 0
ax~ (.js;f '
hence xi is a minimum point. Therefore
12(S2) = 2s 2,

Hence
13(S3) = min U2(S3/X3) + xD
X3
O<X3

= min {2(S3/X3) + xD.


X3
o <X3
Let

Then

Therefore

Also

hence xj is a minimum point. Therefore


13(S3) = 3s~/3,
S3 = S4/X4'
Hence
14(S4) = min {J3(S4/X4) + xn
X4
0<X4

= min {3(S4/X4)2/3 + xl}.


X4
0<X4
Let

Then

Therefore

Also
iJ2F4(x:) lOsl/3
()X4* = -S43
2/3 + 2 > 0,

hence x: is a minimum point. Therefore


14(S4) = 3sl /3 si 1/6 + si /2 = 4si /2 ,
S4 = ss/xs = 11/x 5·
Hence
Is(ss) = 15(11) = min {J4(11/xs) + xn
x,
o<x,
= min {4(11/x S )1/2 + xn.
x,
o<x,
Let

Then

and

Hence
$f_5(11) = \dfrac{4(11)^{1/2}}{11^{1/10}} + 11^{2/5} = 5(11)^{2/5}.$

Backtracking:
$s_5 = 11 \Rightarrow x_5^* = 11^{1/5}$
$s_4 = \dfrac{11}{11^{1/5}} = 11^{4/5} \Rightarrow x_4^* = (11^{4/5})^{1/4} = 11^{1/5}$
$s_3 = \dfrac{11^{4/5}}{11^{1/5}} = 11^{3/5} \Rightarrow x_3^* = (11^{3/5})^{1/3} = 11^{1/5}$
$s_2 = \dfrac{11^{3/5}}{11^{1/5}} = 11^{2/5} \Rightarrow x_2^* = (11^{2/5})^{1/2} = 11^{1/5}$
$s_1 = \dfrac{11^{2/5}}{11^{1/5}} = 11^{1/5} \Rightarrow x_1^* = 11^{1/5}.$

7. Let In(s) be the return when s has been allocated to V1 X1 + ... + VnX n,
n = 1, 2, where

Then
n

Sn = L ViX i
i= 1

n = 2,

hence

Hence
11(Sl) = min {xd = sd3.
XI
o :0; XI :0;51/3

Therefore
x! = sd3,

Hence
12(S2) = min {II (S2 - 4x 2) + X2}
X2
o :O;X2:O; 52/4

Hence

and
f2(S2) = s2/4,
S2 ~ 12.
Therefore

Recapitulating:
xi = 3, xt = o.
8(a). Let f..(s, t) be the return when
n

L ViXi = Sn' n = 1,2


i= 1
and
n
I WiX i =
i= 1
tn' n = 1,2
where
(V 1 , V 2 ) = (3,4)
(W 1 , W 2 ) = (4, - 5)
S2 ::; 24
t2 ::; 20.
From the constraints,

Therefore
f..(sn, tn) = max {f..-1(Sn-1' tn- 1) + dnx;; + enxn}, n= 2

where
(d 1 ,d2 ) = (8,4)
(e 1 ,e 2 ) = (-3, -4).

SI tI XI fl(sj,tJ!

0 0 0 0
3 4 1 5
6 8 2 26
9 12 3 63
12 16 4 116
15 20 5 185

f2(S2' t 2) = max {I1 (S1' t 1) + 4x~ - 4x 2 }


X2
0,; X2'; 6

S2 = S1 + 4X2
t2 = t1 + 5x2·

Therefore
fz(sz, t z ) = max {fl(SZ - 4x 2 ,tZ - 5x 2 ) + 4x~ - 4X 2}'
X2
O:$x2$6

S2 t2 X2 fz(sz, t z )

0 0 0 0
3 4 0 5
4 5 1 0
6 8 0 26
7 9 1 5
8 10 2 8
9 12 0 63
10 13 1 26
11 14 2 13
12 16 0 116
13 17 1 63
14 18 2 34
15 20 0 185
16 20 4 116

Hence the maximum value occurs when $s_2 = 15$, $t_2 = 20$, $x_2 = 0$. Backtracking:
$x_1^* = 5,\quad x_2^* = 0,\quad x_0^* = 185.$
Let J..(s, t) be the return when
z
I
i=n
DiX i = Sn' n = 1,2

and
2
I
i-n
WiX i = tn' n = 1,2.

Then
fl(Sl' t 1 ) = max {fz(sz, t z ) + 8xi - 3xd·
Xl

Sz tz Xz fz(sz, t 2)

0 0 0 0
4 5 0
8 10 2 8
12 15 3 24
16 20 4 48

SI t1 Xl fl(sl, t l )

0 0 0 0
3 4 1 5
4 5 0 0
6 8 0 26
7 9 1 5
8 10 0 8
9 12 3 63
10 13 2 26
11 14 13
12 15 0 24
12 16 4 116
13 17 3 63
14 18 2 34
15 20 5 185
16 20 4 116

Hence, as before,
$x_1^* = 5,\quad x_2^* = 0,\quad x_0^* = 185.$
9. Number the objects 1,2, ... ,7 in nonincreasing order of weight. Let
X. = {I, if object i is taken
! 0, otherwise
Wi = weight of object i
Vi = value of object i, i = 1,2, ... , 7
n

Sn = L WiX i,
i= 1
n = 1,2, ... , 7

f,.(S) = return when the weight of the objects selected after the first n
objects have been considered is s.
Then
f,.(sn) = max {f,. -1 (Sn - 1) + VnXn}
Xn

r=1 WiXi ~Sn


n

i
s7:>100

SI 0 50
Xl 0 1
11(SI) 0 60

S2 0 40 50 90
X2 0 1 0 1
12(S2) 0 40 60 100

o 40 50 80 90
o o o 1 o
o 40 60 60 100

o 30 40 50 70 80 90
o 1 o o 1 1 0
o 10 40 60 50 70 100

o 30 40 50 70 80 90
o 1 o o 1 1 0
o 60 40 60 100 120 100

o 10 30 40 50 60 70 80 90 100
o 1 o 1 o 0,1 o 1 1 1
o 10 60 70 60 70 100 110 130 110

o 10 30 40 50 60 70 80 90 100
o o o 0 o 0 o o o 1
o 10 60 70 60 70 100 110 130 133

Backtracking:
x~ = 1, x~ = 1, x~ = 1, xl = 0,
xj = 0, x! = 0, xi = 1, x~ = 133.
10. Let
Ui = volume of object i
n

tn = L UiX i , n = 1,2, ... ,7


i= 1

t7 ::; 100
gn(s, t) = return when the weight and volume of the objects selected after
the first n objects have been considered in sand t, respectively.

Sl 0 50
tl 0 50
Xl 0 1
gl(Sl' t 1) 0 60

S2 0 40 50 90
t2 0 25 50 75
X2 0 1 0 1
g2(S2' t 2) 0 40 60 100

S3 0 40 40 50 80 90 90
t3 0 0 25 50 25 75 50
X3 0 1 0 0 1 0 1
g3(S3, t 3 ) 0 20 40 60 60 100 80

S4 0 30 40 40 50 70 70 80 80 90 90
t4 0 25 0 25 50 25 50 75 25 75 50
X4 0 1 0 0 0 1 1 1 0 0 0
g4(S4, t 4 ) 0 10 20 40 60 30 50 70 60 100 80

S5 0 30 30 40 40 50 60 70 70 70 70 80 80 90 90
t5 0 25 75 0 25 50 100 25 50 100 75 75 25 75 90
X5 0 0 1 0 0 0 1 0 0 1 1 0 0 0 0
g5(S5, t 5 ) 0 10 60 20 40 60 70 30 50 100 80 70 60 100 80

S6 0 10 30 30 40 40 50 50 60 60 70 70 70 70 80 80 80 80
t6 0 25 25 75 o 25 50 25 75 100 25 100 75 50 75 25 50 100
X6 0 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1
g6 0 10 10 60 20 40 50 30 70 70 30 100 80 50 70 60 40 90

S6 90 90 90 100 100
t6 75 50 100 100 75
X6 0 0 1 1 1
g6 100 80 80 110 90

S7 0 10 10 20 30 30 40 40 40 50 50 50 50 60 60
t7 0 24 50 75 25 75 0 25 75 50 25 75 100 75 100
X7 0 0 1 1 0 0 0 0 1 0 0 1 1 0 0
g7 0 10 3 13 10 60 20 40 13 60 30 43 63 70 70

S7 60 70 70 70 70 80 80 80 80 90 90 90 100 100
t7 50 25 100 75 50 75 25 50 100 75 50 100 100 75
X7 1 0 0 0 0 0 0 0 0 0 0 0 0 0
g7 63 30 100 80 70 70 60 40 90 100 80 80 110 90
Backtracking:
x~ = 0, x: = 1, x~ =0, xl = 0,
x! =0, x! = 1, xt = 1, x~ = 110.
11. First divide all constants by 10.

Sl 0 10 20 30 40 50
tl 0 100 200 300 400 500
Xl 0 1 2 3 4 5
11 (Sl' t 1) 0 9 18 27 36 45

82 0 10 20 20 30 30 40 40 40 50
t2 0 100 50 200 150 300 100 250 400 200
X2 0 0 1 0 1 0 2 1 0 2
f2(82' t 2) 0 9 10 18 19 27 20 28 36 29

82 50 50 60 60 60 70 70 70 80
t2 350 500 150 300 450 250 400 550 200
X2 1 0 3 2 1 3 2 1 4
f2(82 , t 2) 37 45 30 38 46 39 47 55 40

82 80 80 90 90 100 100 100


t2 350 500 300 450 250 400 550
X2 3 2 4 3 5 4 3
f2(8 2, t 2) 48 56 49 57 50 58 66

83 0 5 10 10 15 15 20 20 25 25 25 30
t3 0 150 100 300 250 450 50 400 200 350 550 150
X3 0 1 0 2 1 3 0 2 1 1 3 0
f3(83, t 3) 0 15 9 30 24 45 10 39 25 33 54 19

83 30 30 35 35 35 40 40 40 40 45 45 45
t3 300 500 300 450 500 100 250 400 450 250 400 550
X3 0 2 1 1 3 0 0 0 2 1 1 1
f3(83' t 3) 27 48 34 42 55 20 28 36 49 35 43 51

83 50 50 50 50 50 55 55 55 60 60 60 60
t3 200 350 400 500 550 350 500 550 150 300 450 500
X3 0 0 2 0 2 1 1 3 0 0 0 2
f3(8 3, t 3) 29 37 50 45 58 44 52 65 30 38 46 59

83 65 65 70 70 70 70 75 75 80 80 80 80 85
t3 300 450 250 400 450 550 400 550 200 350 500 550 350
X3 1 1 0 0 2 0 1 1 0 0 0 2 1
f3(83 , t 3) 45 53 39 47 60 55 54 62 40 48 56 69 55

83 85 90 90 90 95 100 100 100


t3 500 300 450 500 450 250 400 550
X3 1 0 0 2 1 0 0 0
f3(83 , t 3) 63 49 57 70 64 50 58 66

Backtracking:
x~ = 2, x! =4, xI =0
x~ = 70, or 700 in terms of the original data.

12. Let
$s_i$ = number of tons available at the end of year $i$
$x_i$ = number of tons sold at the end of year $i$, $i = 1, 2, \ldots, 5$
$s_n = 3(s_{n-1} - x_{n-1}),\quad n = 2, 3, 4, 5.$
Using backward recursion: Let
$f_n(s)$ = the return that can be accrued from years $n, n + 1, \ldots, 5$
given $s$ tons are available at the end of year $n$.

Then
$f_n(s_n) = \max_{0 \le x_n \le s_n} \{f_{n+1}(3(s_n - x_n)) + d_n x_n\},\quad n = 1, 2, 3, 4,$

where
$(d_1, d_2, d_3, d_4, d_5) = (400, 330, 44, 15, 5),$
$s_1 = 20.$

Hence
$f_5(s_5) = \max_{x_5 = s_5} \{5x_5\} = 5s_5,\ \text{and } x_5^* = s_5,$

$f_4(s_4) = \max_{0 \le x_4 \le s_4} \{f_5(3(s_4 - x_4)) + 15x_4\} = 15s_4,\ \text{and } x_4^* = 0,$

$f_3(s_3) = \max_{0 \le x_3 \le s_3} \{f_4(3(s_3 - x_3)) + 44x_3\} = \max_{0 \le x_3 \le s_3} \{45s_3 - x_3\} = 45s_3,\ \text{and } x_3^* = 0,$

$f_2(s_2) = \max_{0 \le x_2 \le s_2} \{f_3(3(s_2 - x_2)) + 330x_2\} = \max_{0 \le x_2 \le s_2} \{135s_2 + 195x_2\} = 330s_2,\ \text{and } x_2^* = s_2,$

$f_1(s_1) = \max_{0 \le x_1 \le 20} \{f_2(3(20 - x_1)) + 400x_1\} = \max_{0 \le x_1 \le 20} \{990(20 - x_1) + 400x_1\} = 19{,}800,\ \text{and } x_1^* = 0.$

Backtracking:
$x_1^* = 0,\quad s_1 = 20$
$s_2 = 3(s_1 - 0) = 60$
$x_2^* = 60$
$s_3 = 3(60 - 60) = 0$
$s_4 = 0$
$s_5 = 0.$
Hence all potatoes should be sold at the end of the second year for a total
profit of 19,800 units.
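Because every stage return in Exercise 12 is linear in the stock, each $f_n(s)$ has the form $c_n s$, and the backward recursion collapses to $c_n = \max(d_n, 3c_{n+1})$. The sketch below exploits that observation; it is not how the text presents the solution, and the function name `plan_sales` and the tie-breaking rule (keep the stock when selling and holding are equally good) are assumptions.

```python
def plan_sales(prices, growth, initial_stock):
    """Backward recursion for a stock that multiplies by `growth` each year.

    Returns (maximum total return, list of amounts sold at the end of each year).
    """
    years = len(prices)
    coeff = [0] * years          # f_n(s) = coeff[n] * s
    sell_all = [False] * years   # optimal decision is all-or-nothing per year

    coeff[-1] = prices[-1]       # in the last year everything left is sold
    sell_all[-1] = True
    for n in range(years - 2, -1, -1):
        keep_value = growth * coeff[n + 1]   # value of holding one ton
        sell_all[n] = prices[n] > keep_value
        coeff[n] = max(prices[n], keep_value)

    # Forward pass: apply the decisions to the actual stock.
    stock, sales = initial_stock, []
    for n in range(years):
        sold = stock if sell_all[n] else 0
        sales.append(sold)
        stock = growth * (stock - sold)
    return coeff[0] * initial_stock, sales


total_return, sales = plan_sales([400, 330, 44, 15, 5], growth=3, initial_stock=20)
print(total_return, sales)
```

The coefficients computed here are 990, 330, 45, 15 and 5, matching the multipliers of $s_n$ in $f_1, \ldots, f_5$ above.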

Chapter 7

Section 7.6
1(a)
$f(x) = x^3 + \tfrac{3}{2}x^2 - 18x + 19$
$f'(x) = 3x^2 + 3x - 18$
$f''(x) = 6x + 3.$
If
$f'(x^*) = 0,$
then
$x^* = -3 \text{ or } 2,$
$f''(-3) < 0,$
hence $x^* = -3$ is the global maximum; also,
$f''(2) > 0,$
hence $x^* = 2$ is the global minimum.
2(a). Referring to Exercise l(a), as -3 ¢ [ -1 ..i] the global maximum of
f occurs at one of the endpoints of [ - 1, n
As f( - 1) > fm,
x* =-1
is the global maximum, and 2 ¢ [ -1,n Therefore x* = 2 is still the global
minimum.
8(a)
$f(x) = \sin x$
$f'(x) = \cos x.$
If
$f'(x^*) = 0,$
then
$\cos x^* = 0.$
Therefore
$x^* = \pi/2,\quad x^* \in [0, \pi].$
Hence as $f$ is concave, by Theorem 7.7,

$x^* = \pi/2$
is a global maximum. $f$ has global minima at $x = 0$ and $x = \pi$.
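10(a)

$\dfrac{\partial^2 f}{\partial x_1 \partial x_2} = 0$

$\dfrac{\partial^2 f}{\partial x_2 \partial x_1} = 0$

Hence
$2x_1^* - 1 = 0.$
Therefore
$x_1^* = \tfrac{1}{2}.$

Hence
$6x_2^* + 18 = 0.$
Therefore
$x_2^* = -3.$
Therefore
$X_0 = (\tfrac{1}{2}, -3)$
is the only candidate for an extreme point. As

$H(X) = \begin{pmatrix} 2 & 0 \\ 0 & 6 \end{pmatrix}$
is positive definite, $X_0$ is a global minimum.
$f(X_0) = -\tfrac{53}{4}.$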

11(a). From Exercise 10(a),

$x_1^* = \tfrac{1}{2},\quad x_2^* = -3.$
But
$x_2^* \notin [-2, 6].$
Hence the global minimum lies on the boundary of
$S = \{(x_1, x_2): -1 \le x_1 \le 5,\ -2 \le x_2 \le 6\}.$
The boundaries of $S$ are
$B_1 = \{\alpha(-1, -2) + (1 - \alpha)(5, -2): 0 \le \alpha \le 1\}$
$B_2 = \{\alpha(5, -2) + (1 - \alpha)(5, 6): 0 \le \alpha \le 1\}$
$B_3 = \{\alpha(-1, 6) + (1 - \alpha)(5, 6): 0 \le \alpha \le 1\}$
$B_4 = \{\alpha(-1, -2) + (1 - \alpha)(-1, 6): 0 \le \alpha \le 1\}.$
For $B_1$. Let
$g(\alpha) = f(\alpha(-1, -2) + (1 - \alpha)(5, -2))$
$= f(5 - 6\alpha, -2)$
$= (5 - 6\alpha)^2 - (5 - 6\alpha) + 3(-2)^2 + 18(-2) + 14.$
Then
$g'(\alpha) = 2(5 - 6\alpha)(-6) + 6,$
which is zero at $\alpha^*$. Therefore
$\alpha^* = \tfrac{3}{4} \in [0, 1]$ and $g''(\alpha^*) > 0,$
indicating a minimum. Therefore
$(x_1^*, x_2^*) = \tfrac{3}{4}(-1, -2) + \tfrac{1}{4}(5, -2) = (\tfrac{1}{2}, -2)$
$f(x_1^*, x_2^*) = -\tfrac{41}{4}.$
For $B_2$. Let
$g(\alpha) = f(\alpha(5, -2) + (1 - \alpha)(5, 6))$
$= f(5, 6 - 8\alpha)$
$= 5^2 - 5 + 3(6 - 8\alpha)^2 + 18(6 - 8\alpha) + 14.$
Then
$g'(\alpha) = 6(6 - 8\alpha)(-8) - 8(18),$
which is zero at $\alpha^*$. Therefore

$\alpha^* = \tfrac{9}{8} \notin [0, 1].$
Hence the global minimum for $f$ cannot lie on $B_2$.
For $B_3$. Let
$g(\alpha) = f(\alpha(-1, 6) + (1 - \alpha)(5, 6))$
$= f(5 - 6\alpha, 6)$
$= (5 - 6\alpha)^2 - (5 - 6\alpha) + (3)6^2 + 18(6) + 14.$

Then
$g'(\alpha) = 2(5 - 6\alpha)(-6) + 6,$
which is zero at $\alpha^*$. Therefore
$\alpha^* = \tfrac{3}{4} \in [0, 1]$ and $g''(\alpha^*) > 0,$
indicating a minimum. Therefore
$(x_1^*, x_2^*) = \tfrac{3}{4}(-1, 6) + \tfrac{1}{4}(5, 6) = (\tfrac{1}{2}, 6)$
and
$f(x_1^*, x_2^*) = \tfrac{919}{4}.$
For $B_4$. Let
$g(\alpha) = f(\alpha(-1, -2) + (1 - \alpha)(-1, 6))$
$= f(-1, 6 - 8\alpha)$
$= (-1)^2 - (-1) + 3(6 - 8\alpha)^2 + 18(6 - 8\alpha) + 14.$
Then
$g'(\alpha) = 6(6 - 8\alpha)(-8) - 8(18),$
which is zero at $\alpha^*$. Therefore
$\alpha^* = \tfrac{9}{8} \notin [0, 1].$
Hence the global minimum for $f$ cannot lie on $B_4$.
As the lowest value for $f$ occurs on the boundary $B_1$, we have
$X^* = (\tfrac{1}{2}, -2)$
$f(X^*) = -\tfrac{41}{4}.$
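The function in 11(a) is a sum of one-variable convex quadratics, so its minimum over a rectangle can also be found by clamping each coordinate of the unconstrained minimizer to its interval. The sketch below uses that shortcut as a cross-check on the edge-by-edge search; it relies on separability and would not apply to a quadratic with a cross term. The helper name `minimize_separable_quadratic` is hypothetical.

```python
def minimize_separable_quadratic(a, b, lower, upper):
    """Minimise sum(a_i*x_i**2 + b_i*x_i) over lower_i <= x_i <= upper_i, a_i > 0.

    Each term is convex in its own variable, so the best x_i is the
    unconstrained minimiser -b_i / (2*a_i) clamped to [lower_i, upper_i].
    """
    return [
        min(max(-bi / (2 * ai), lo), hi)
        for ai, bi, lo, hi in zip(a, b, lower, upper)
    ]


def f(x1, x2):
    # Objective as it appears in the boundary calculations above.
    return x1 ** 2 - x1 + 3 * x2 ** 2 + 18 * x2 + 14


x_star = minimize_separable_quadratic(
    a=[1, 3], b=[-1, 18], lower=[-1, -2], upper=[5, 6]
)
print(x_star, f(*x_star))  # [0.5, -2.0] -10.25
```

The result agrees with the point found on $B_1$: the $x_1$ coordinate stays at its unconstrained value while $x_2$ is pushed to the nearest bound.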
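14(a)
$h_1(X) = x_1 + x_2 + x_3 + x_4 - 2 = 0$
$h_2(X) = 3x_1 + 2x_2 + 4x_3 + x_4 - 3 = 0$
$h_3(X) = x_1 + 4x_2 + 3x_3 + x_4 - 1 = 0.$

Let
$X = (x_1, x_2, x_3, x_4) = (w_1, w_2, w_3, y_1)$
$W = (w_1, w_2, w_3)$
$Y = y_1$

$J = \begin{pmatrix} 1 & 1 & 1 \\ 3 & 2 & 4 \\ 1 & 4 & 3 \end{pmatrix}$

$J^{-1} = \begin{pmatrix} 2 & -\tfrac{1}{5} & -\tfrac{2}{5} \\ 1 & -\tfrac{2}{5} & \tfrac{1}{5} \\ -2 & \tfrac{3}{5} & \tfrac{1}{5} \end{pmatrix}$

$K = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$

Then
$\dfrac{\partial f(W, Y)}{\partial y} = \nabla f(x_4) - \nabla f(x_1, x_2, x_3) J^{-1} K$

$= -2x_4 + (8x_1, 6x_2, 12x_3) J^{-1} K$

$= -2x_4 + (8x_1, 6x_2, 12x_3) \begin{pmatrix} \tfrac{7}{5} \\ \tfrac{4}{5} \\ -\tfrac{6}{5} \end{pmatrix}$

$= \tfrac{56}{5}x_1 + \tfrac{24}{5}x_2 - \tfrac{72}{5}x_3 - 2x_4$
$= 0$, for a stationary point.
Combining this equation with the previous three, we get
$\tfrac{56}{5}x_1 + \tfrac{24}{5}x_2 - \tfrac{72}{5}x_3 - 2x_4 = 0$
$x_1 + x_2 + x_3 + x_4 = 2$
$3x_1 + 2x_2 + 4x_3 + x_4 = 3$
$x_1 + 4x_2 + 3x_3 + x_4 = 1,$

which can be solved to yield

$X^* = (0.58, -0.39, 0.08, 1.73).$
This point is a maximum and
$f(X^*) = -4.86.$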
15(a)
Maximize: $4x_1 + 3x_2 = x_0$

subject to: $3x_1 + 4x_2 + x_3 = 12$
$3x_1 + 3x_2 + x_4 = 10$
$4x_1 + 2x_2 + x_5 = 8$
$x_i \ge 0,\quad i = 1, 2, \ldots, 5.$
In order to guarantee that the nonnegativity conditions are satisfied we
introduce squared slack variables:
Maximize: $4x_1 + 3x_2 = x_0$
subject to: $3x_1 + 4x_2 + x_3^2 = 12$
$3x_1 + 3x_2 + x_4^2 = 10$
$4x_1 + 2x_2 + x_5^2 = 8$
$x_1, x_2 \ge 0.$
We must still guarantee that $x_1$ and $x_2$ are nonnegative. Hence we introduce
the following constraints:
$x_1 \ge 0,\quad x_2 \ge 0,$
with squared slack variables:

$x_1 - x_6^2 = 0$
$x_2 - x_7^2 = 0.$
Substituting for $x_1$ and $x_2$, we get
Maximize: $4x_6^2 + 3x_7^2 = x_0$
subject to: $3x_6^2 + 4x_7^2 + x_3^2 = 12$ (A)
$3x_6^2 + 3x_7^2 + x_4^2 = 10$ (B)
$4x_6^2 + 2x_7^2 + x_5^2 = 8.$ (C)
There is now no need for nonnegativity conditions and the problem is in a
form which is amenable to solution by the Jacobian method. Hence
$m = 3,\quad n = 5.$
We must choose which variables are assigned to $W$ and which to $Y$. In
order to do this we call on Theorem 7.12. If the Hessian matrix is negative
definite at $X^*$, then $X^*$ is a candidate for a local maximum for $x_0$.
Let
$W = (w_1, w_2, w_3) = (x_3, x_4, x_5)$
$Y = (y_1, y_2) = (x_6, x_7).$
Then
$x_0 = 4x_6^2 + 3x_7^2$
and
$\dfrac{\partial x_0}{\partial x_6} = 8x_6,\quad \dfrac{\partial^2 x_0}{\partial x_6^2} = 8$

$\dfrac{\partial x_0}{\partial x_7} = 6x_7,\quad \dfrac{\partial^2 x_0}{\partial x_7^2} = 6.$

Therefore

$H_Y = \begin{pmatrix} 8 & 0 \\ 0 & 6 \end{pmatrix},$
which is positive definite, not negative definite. This choice will not lead to
a local maximum.
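Now let
$W = (w_1, w_2, w_3) = (x_4, x_6, x_7)$
$Y = (y_1, y_2) = (x_3, x_5).$

By (A) and (C), we have

$5x_6^2 - x_3^2 + 2x_5^2 = 4$
and
$10x_7^2 + 4x_3^2 - 3x_5^2 = 24.$
Therefore
$x_6^2 = \tfrac{1}{5}(x_3^2 - 2x_5^2 + 4)$
$x_7^2 = \tfrac{1}{10}(3x_5^2 - 4x_3^2 + 24)$
$x_0 = \tfrac{4}{5}(x_3^2 - 2x_5^2 + 4) + \tfrac{3}{10}(3x_5^2 - 4x_3^2 + 24).$
Therefore
$\dfrac{\partial x_0}{\partial x_3} = -\tfrac{4}{5}x_3,\quad \dfrac{\partial^2 x_0}{\partial x_3^2} = -\tfrac{4}{5}$

$\dfrac{\partial x_0}{\partial x_5} = -\tfrac{7}{5}x_5,\quad \dfrac{\partial^2 x_0}{\partial x_5^2} = -\tfrac{7}{5}.$
Therefore

$H_Y = \begin{pmatrix} -\tfrac{4}{5} & 0 \\ 0 & -\tfrac{7}{5} \end{pmatrix},$
which is negative definite. Proceeding with the Jacobian method:
$\nabla_W x_0 = (0, 8x_6, 6x_7)$
$\nabla_Y x_0 = (0, 0).$

Therefore

$J^{-1} = \begin{pmatrix} -\dfrac{3}{10x_4} & \dfrac{1}{2x_4} & -\dfrac{3}{20x_4} \\ -\dfrac{1}{10x_6} & 0 & \dfrac{1}{5x_6} \\ \dfrac{1}{5x_7} & 0 & -\dfrac{3}{20x_7} \end{pmatrix}$

$C = \begin{pmatrix} 2x_3 & 0 \\ 0 & 0 \\ 0 & 2x_5 \end{pmatrix}.$

Now

$\nabla_Y^c x_0 = (0, 0) - (0, 8x_6, 6x_7) J^{-1} C$

$= -\left(-\tfrac{16x_3}{10} + \tfrac{12x_3}{5},\ \tfrac{16x_5}{5} - \tfrac{18x_5}{10}\right)$

$= \left(-\tfrac{4}{5}x_3,\ -\tfrac{7}{5}x_5\right),$
which agrees with the partial derivatives found above. Therefore
$\nabla_Y^c x_0 = 0 \Rightarrow x_3^* = 0,\quad x_5^* = 0.$
Hence the original constraints become
$3x_6^2 + 4x_7^2 = 12$
$3x_6^2 + 3x_7^2 + x_4^2 = 10$
$4x_6^2 + 2x_7^2 = 8.$
This means that
$x_6^2 = \tfrac{4}{5}\ (= x_1^*)$
$x_7^2 = \tfrac{12}{5}\ (= x_2^*)$
$x_4^2 = \tfrac{2}{5}$
$x_0 = \tfrac{52}{5}.$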
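17(a). Let
$F(X, \lambda) = f(X) - \sum_{j=1}^{m} \lambda_j h_j(X),$
where
$f(X) = -4x_1^2 - 3x_2^2 - 6x_3^2 - x_4^2$
$h_1(X) = x_1 + x_2 + x_3 + x_4 - 2$
$h_2(X) = 3x_1 + 2x_2 + 4x_3 + x_4 - 3$
$h_3(X) = x_1 + 4x_2 + 3x_3 + x_4 - 1.$
Then

$\dfrac{\partial F}{\partial x_1} = -8x_1 - \lambda_1 - 3\lambda_2 - \lambda_3 = 0$

$\dfrac{\partial F}{\partial x_2} = -6x_2 - \lambda_1 - 2\lambda_2 - 4\lambda_3 = 0$

$\dfrac{\partial F}{\partial x_3} = -12x_3 - \lambda_1 - 4\lambda_2 - 3\lambda_3 = 0$

$\dfrac{\partial F}{\partial x_4} = -2x_4 - \lambda_1 - \lambda_2 - \lambda_3 = 0$

$\dfrac{\partial F}{\partial \lambda_1} = -(x_1 + x_2 + x_3 + x_4 - 2) = 0$

$\dfrac{\partial F}{\partial \lambda_2} = -(3x_1 + 2x_2 + 4x_3 + x_4 - 3) = 0$

$\dfrac{\partial F}{\partial \lambda_3} = -(x_1 + 4x_2 + 3x_3 + x_4 - 1) = 0,$

which can be solved to yield:

$x_1^* = 0.58,\quad x_2^* = -0.39,\quad x_3^* = 0.08,\quad x_4^* = 1.74$
$\lambda_1^* = -5.02,\quad \lambda_2^* = -0.57,\quad \lambda_3^* = 2.12.$
Therefore
$f(X^*) = -4.86.$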
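The seven stationarity conditions of 17(a) are linear in $(x_1, \ldots, x_4, \lambda_1, \lambda_2, \lambda_3)$, so they can be solved as a single linear system. The sketch below builds that system and solves it with a small Gauss-Jordan routine written for this illustration (the names `solve_linear` and `lagrange_system` are hypothetical); it is a numerical cross-check, not the solution procedure used in the text.

```python
def solve_linear(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Bring the largest remaining entry of this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [u - factor * v for u, v in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def lagrange_system():
    q = [4.0, 3.0, 6.0, 1.0]                  # f(X) = -sum(q_i * x_i^2)
    h = [[1.0, 1.0, 1.0, 1.0],                # constraint coefficients
         [3.0, 2.0, 4.0, 1.0],
         [1.0, 4.0, 3.0, 1.0]]
    rhs_h = [2.0, 3.0, 1.0]
    a, b = [], []
    for i in range(4):                        # dF/dx_i = -2 q_i x_i - sum_j lambda_j h_ji = 0
        row = [0.0] * 7
        row[i] = -2.0 * q[i]
        for j in range(3):
            row[4 + j] = -h[j][i]
        a.append(row)
        b.append(0.0)
    for j in range(3):                        # h_j(X) = 0
        a.append(h[j] + [0.0, 0.0, 0.0])
        b.append(rhs_h[j])
    return a, b


a, b = lagrange_system()
solution = solve_linear(a, b)
x, lam = solution[:4], solution[4:]
print([round(v, 2) for v in x], [round(v, 2) for v in lam])
```

The computed point agrees with the values quoted above to within about 0.01, which is the precision the text reports.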

Chapter 8
Section 8.4
The number of evaluations n must be such that
bm -a,.. e 1
r '2:: = + -::-:----=-:--
b 1 - a1 b 1 - a1 3An - 2 + 2An - 3
Therefore
1 1. 1
- > ~ + --::--:----=---
10 - 10 3An - 2 + 2A n - 3•

Therefore
n=6.
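Thus 6 evaluations will be necessary. The first two points are placed at
$S_1 = -5 + (5 - (-5))(A_5/A_7)$
$\bar{S}_1 = -5 + (5 - (-5))(A_6/A_7).$
Therefore
$S_1 = -5 + 10(\tfrac{5}{13}) = -\tfrac{15}{13}$
$\bar{S}_1 = -5 + 10(\tfrac{8}{13}) = \tfrac{15}{13},$

Hence the new interval becomes $[a_1, b_1] = [-\tfrac{15}{13}, 5]$:

$S_2 = \bar{S}_1 = \tfrac{15}{13}$
$\bar{S}_2 = -\tfrac{15}{13} + (5 - (-\tfrac{15}{13}))(A_5/A_6)$
$= -\tfrac{15}{13} + (\tfrac{80}{13})(\tfrac{5}{8}) = \tfrac{35}{13},$
$f(S_2) = -2(\tfrac{15}{13})^2 + (\tfrac{15}{13}) + 4 = -2.66 + 1.15 + 4 = 2.49$
$f(\bar{S}_2) = -2(\tfrac{35}{13})^2 + (\tfrac{35}{13}) + 4 = -14.5 + 2.69 + 4 = -7.8,$

Hence the new interval becomes $[a_2, b_2] = [-\tfrac{15}{13}, \tfrac{35}{13}]$:

$S_3 = -\tfrac{15}{13} + (\tfrac{35}{13} - (-\tfrac{15}{13}))(A_3/A_5)$
$= -\tfrac{15}{13} + (\tfrac{50}{13})(\tfrac{2}{5}) = \tfrac{5}{13}$
$\bar{S}_3 = S_2 = \tfrac{15}{13},$

$f(S_3) = -2(\tfrac{5}{13})^2 + (\tfrac{5}{13}) + 4 = -0.29 + 0.38 + 4 = 4.09$

$f(\bar{S}_3) = -2(\tfrac{15}{13})^2 + (\tfrac{15}{13}) + 4 = -2.66 + 1.15 + 4 = 2.49,$

Hence the new interval becomes $[a_3, b_3] = [-\tfrac{15}{13}, \tfrac{15}{13}]$:

$S_4 = -\tfrac{15}{13} + (\tfrac{15}{13} - (-\tfrac{15}{13}))(A_2/A_4)$
$= -\tfrac{15}{13} + (\tfrac{30}{13})(\tfrac{1}{3}) = -\tfrac{5}{13}$
$\bar{S}_4 = S_3 = \tfrac{5}{13},$

Hence the new interval becomes $[a_4, b_4] = [-\tfrac{5}{13}, \tfrac{15}{13}]$. It can be seen that
the point remaining in the interval $[a_4, b_4]$, namely, $\bar{S}_4 = \tfrac{5}{13}$, is exactly at
the centre of $[a_4, b_4]$. Also $S_5 = \bar{S}_4$. Thus in order to place $\bar{S}_5$ symmetrically
it must coincide with $S_5$, which is of no advantage. Hence $\bar{S}_5$ is placed $\varepsilon$ to
the right of $S_5$. Thus

$f(S_5) = -2(\tfrac{5}{13})^2 + (\tfrac{5}{13}) + 4 = -0.29 + 0.38 + 4 = 4.09$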


1(8 5 ) = -2(N7)2 + (N7) + 4 = -0.49 + 0.49 + 4 = 4,

Hence the final interval is [ - 153' if.r]. The length ofthe interval is t~~, which
is only 8.8% as long as the original interval.
The first two points are placed at

$S_0 = -5 + (5 - (-5))\left(\dfrac{3 - \sqrt{5}}{2}\right) = 10 - 5\sqrt{5} = 5(2 - \sqrt{5})$

$\bar{S}_0 = -5 + (5 - (-5))\left(\dfrac{\sqrt{5} - 1}{2}\right) = 5\sqrt{5} - 10 = 5(\sqrt{5} - 2),$

Hence the new interval becomes $[a_2, b_2] = [10 - 5\sqrt{5}, 25 - 10\sqrt{5}]$:

$S_2 = 5(2 - \sqrt{5}) + [5(5 - 2\sqrt{5}) - 5(2 - \sqrt{5})]\left(\dfrac{3 - \sqrt{5}}{2}\right) = 45 - 20\sqrt{5} = 5(9 - 4\sqrt{5})$

$\bar{S}_2 = S_1 = 5\sqrt{5} - 10,$

Hence the new interval becomes $[a_3, b_3] = [10 - 5\sqrt{5}, 5\sqrt{5} - 10]$:

$S_3 = 5(2 - \sqrt{5}) + [5(\sqrt{5} - 2) - 5(2 - \sqrt{5})]\left(\dfrac{3 - \sqrt{5}}{2}\right) = -45 + 20\sqrt{5} = 5(4\sqrt{5} - 9)$

$\bar{S}_3 = S_2 = 5(9 - 4\sqrt{5}) = 45 - 20\sqrt{5},$

$f(S_3) < f(\bar{S}_3).$

Hence the new interval becomes $[a_4, b_4] = [5(4\sqrt{5} - 9), 5(\sqrt{5} - 2)]$.

$S_5 = \bar{S}_4 = 45 - 20\sqrt{5}$

$\bar{S}_5 = 5(4\sqrt{5} - 9) + [5(\sqrt{5} - 2) - 5(4\sqrt{5} - 9)]\left[\dfrac{\sqrt{5} - 1}{2}\right] = 45\sqrt{5} - 100 = 5(9\sqrt{5} - 20),$

Hence the final interval is $[a_5, b_5] = [5(4\sqrt{5} - 9), 5(9\sqrt{5} - 20)]$. This interval has length
$5(9\sqrt{5} - 20) - 5(4\sqrt{5} - 9) = 25\sqrt{5} - 55 = 5(5\sqrt{5} - 11).$
The original interval has length 10; this interval is a little over 9% of the original
interval in length.
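The golden section steps above can be reproduced with a short routine. The sketch below assumes the same objective as the Fibonacci example, $f(S) = -2S^2 + S + 4$, on $[-5, 5]$ and performs five interval reductions; the function name `golden_section_max` is illustrative, and the routine reuses one interior point per step, as the hand calculation does.

```python
import math


def golden_section_max(f, a, b, reductions):
    """Shrink [a, b] around a maximum of a unimodal f by golden section search."""
    r = (math.sqrt(5) - 1) / 2          # about 0.618
    left = b - r * (b - a)              # a + ((3 - sqrt 5)/2) * (b - a)
    right = a + r * (b - a)
    f_left, f_right = f(left), f(right)
    for _ in range(reductions):
        if f_left > f_right:
            # Maximum cannot lie to the right of `right`; old left becomes new right.
            b = right
            right, f_right = left, f_left
            left = b - r * (b - a)
            f_left = f(left)
        else:
            # Maximum cannot lie to the left of `left`; old right becomes new left.
            a = left
            left, f_left = right, f_right
            right = a + r * (b - a)
            f_right = f(right)
    return a, b


a_final, b_final = golden_section_max(lambda s: -2 * s * s + s + 4, -5.0, 5.0, 5)
print(a_final, b_final, (b_final - a_final) / 10)
```

The printed ratio is about 0.09, in line with the "a little over 9%" figure quoted above.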
3(a)
$f(S) = -2S^2 + S + 4$ and $[a_0, b_0] = [-5, 5].$
Therefore
$f'(S) = -4S + 1.$
Let
$S_i = \dfrac{a_i + b_i}{2}.$
Then
$f'(S_0) = f'(0) > 0$ so that $[a_1, b_1] = [0, 5],$
and
$f'(S_1) = f'(2.5) < 0$ so that $[a_2, b_2] = [0, 2.5],$
$f'(S_2) = f'(1.25) < 0$ so that $[a_3, b_3] = [0, 1.25],$
$f'(S_3) = f'(0.625) < 0$ so that $[a_4, b_4] = [0, 0.625],$
$f'(S_4) = f'(0.3125) < 0$ so that $[a_5, b_5] = [0, 0.3125],$
$f'(S_5) = f'(0.15625) > 0$ so that $[a_6, b_6] = [0.15625, 0.3125].$
Hence, after only 6 iterations the interval has been reduced to one of length
0.3125 - 0.15625. This is equal to 0.15625/10 or 1.5625% of the original
length.
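The interval halving in 3(a) depends only on the sign of $f'$ at the midpoint. A minimal sketch of that rule follows; the function name `bisect_on_derivative` is hypothetical, and the routine assumes $f$ is unimodal with a maximum inside the starting interval.

```python
def bisect_on_derivative(df, a, b, iterations):
    """Halve [a, b] repeatedly, keeping the half that contains the maximum."""
    for _ in range(iterations):
        mid = (a + b) / 2
        if df(mid) > 0:
            a = mid      # f is still increasing at mid: maximum lies to the right
        else:
            b = mid      # f is decreasing (or flat) at mid: maximum lies to the left
    return a, b


a6, b6 = bisect_on_derivative(lambda s: -4 * s + 1, -5.0, 5.0, 6)
print(a6, b6)  # 0.15625 0.3125
```

Each iteration costs one derivative evaluation and halves the interval, so six iterations reduce the length by a factor of 64.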
4(a). Refer to the solution to Exercise 3.
5(a). X* = (0,0).
(b). X* = (2,4).
(c). X* = (!, 1).
(d). X* = (- 3, -i).
(e). X* = (0,0).
(t). X* = (0,0).
(g). X* = (1, -2).
(h). X* = (2, 3).
6. Refer to Exercise 5.
7. Refer to Exercise 5.
8. Refer to Exercise 5.
9(a). Let
Xo = (0,0,0)
then
f(Xo) = -98
490 Solutions to Selected Exercises

and

Therefore
Vf = (- 6(X1 - 2), - 8(X2 - 3), -4(X3 + 5))
Vf(Xo) = (12,24, -20)
Xl = Xo + sD
= (0,0,0) + s(12,24, - 20).
Hence
f(X 1) = - 3(12s - 2)2 - 4(24s - 3)2 - 2( - 20s + 5)2
and

df~~ 1) = _ 6(12s _ 2)(12) - 8(24s - 3)(24) - 4( - 20s + 5)( - 20)

which is zero at s*. Therefore

s* = 0.158 and d 2f(X 1) < 0


ds 2
indicating a maximum. Therefore
Xl = (1.9,3.8, - 3.2)
f(X 1 ) = -9.07> f(Xo)
Vf(X 1 ) = (0.6, -6.4, -7.2),
X 2 = (1.9,3.8, - 3.2) + 5(0.6, - 6.4, - 7.2).
Therefore
f(X 2) = -9.07 - 2.68.6s 2 + 96.16s
and

which is zero at s*. Therefore

s* = 0.173
Solutions to Selected Exercises 491

indicating a maximum. Therefore
X_2 = (2, 2.7, -4.4)
f(X_2) = -1.08 > f(X_1)
\nabla f(X_2) = (0, 2.4, -2.4),
X_3 = (2, 2.7 + 2.4s, -4.4 - 2.4s).
Therefore
f(X_3) = -1.08 + 11.52s - 34.56s^2
and
df(X_3)/ds = 11.52 - 69.12s,
which is zero at s*. Therefore
s* = 0.1666 and d^2 f(X_3)/ds^2 < 0,
indicating a maximum. Therefore
X_3 = (2.0, 3.0, -4.8), f(X_3) > f(X_2)
\nabla f(X_3) = (0, 0, -0.8),
X_4 = (2, 3, -4.8) + s(0, 0, -0.8)
    = (2, 3, -4.8 - 0.8s).
Therefore
f(X_4) = -1.28s^2 + 0.64s - 0.08
and
df(X_4)/ds = -2.56s + 0.64,
which is zero at s*. Therefore
s* = 0.25,
indicating a maximum. Therefore
X_4 = (2, 3, -4.8) + (0, 0, -0.2)
    = (2, 3, -5)
f(X_4) = 0 > f(X_3)
\nabla f(X_4) = 0.
Therefore
X* = (2, 3, -5)
f(X*) = 0.
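The iteration in 9(a) can be sketched in code. The function below is the one implied by the gradient and by f(X_0) = -98 above, namely f(X) = -3(x_1 - 2)^2 - 4(x_2 - 3)^2 - 2(x_3 + 5)^2; the helper names and the stopping tolerance are assumptions made for illustration. Because the sketch carries full floating-point precision rather than one-decimal rounding, it takes more than four iterations and its intermediate points differ slightly from the rounded ones printed above.

```python
# Diagonal of the constant Hessian of f (an assumption read off the gradient).
H_DIAG = (-6.0, -8.0, -4.0)


def f(x):
    return -3 * (x[0] - 2) ** 2 - 4 * (x[1] - 3) ** 2 - 2 * (x[2] + 5) ** 2


def grad(x):
    return (-6 * (x[0] - 2), -8 * (x[1] - 3), -4 * (x[2] + 5))


def steepest_ascent(x, tol=1e-8, max_iter=200):
    """Repeat X <- X + s* grad f(X), with s* from an exact line search."""
    steps = []
    for _ in range(max_iter):
        g = grad(x)
        g_dot_g = sum(gi * gi for gi in g)
        if g_dot_g ** 0.5 < tol:
            break
        # For a quadratic, df/ds = 0 gives s* = -(g.g) / (g^T H g).
        s = -g_dot_g / sum(h * gi * gi for h, gi in zip(H_DIAG, g))
        steps.append(s)
        x = tuple(xi + s * gi for xi, gi in zip(x, g))
    return x, steps


x_star, steps = steepest_ascent((0.0, 0.0, 0.0))
print("first step s* =", round(steps[0], 3))
print("iterations    =", len(steps))
print("X*            =", tuple(round(v, 4) for v in x_star))
```

The closed-form step is specific to quadratic objectives; for a general f the one-dimensional maximization over s would itself need a line search.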

10(a). From 9(a),
X_0 = (0, 0, 0)
X_2 = (2, 2.7, -4.4).
The search direction is
X_2 - X_0 = (2, 2.7, -4.4).
Therefore
X_3 = X_0 + s(X_2 - X_0)
    = (0, 0, 0) + s(2, 2.7, -4.4)
so that
f(X_3) = -3(2s - 2)^2 - 4(2.7s - 3)^2 - 2(-4.4s + 5)^2
and
df(X_3)/ds = -24s + 24 - 58.32s + 64.8 - 77.44s + 88,
which is zero at s*. Therefore
s* = 1.107,
indicating a maximum. Therefore
X_3 = (2.2, 3.0, -4.9)
\partial f/\partial x_1 = -6(2.2 - 2) = -1.2
\partial f/\partial x_2 = -8(3 - 3) = 0
\partial f/\partial x_3 = -4(-4.9 + 5) = -0.4.
Therefore
X_4 = (2.2, 3.0, -4.9) + s(-1.2, 0, -0.4),
so that
f(X_4) = -3(2.2 - 1.2s - 2)^2 - 4(3 - 3)^2 - 2(-4.9 - 0.4s + 5)^2
and
df(X_4)/ds = -6(-1.2s + 0.2)(-1.2) - 4(0.1 - 0.4s)(-0.4),

which is zero at s*. Therefore
s* = 0.172,
indicating a maximum. Therefore
X_4 = (2.0, 3.0, -5.0),
which is the maximum point of f from 9(a).
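The two line searches in 10(a) can be reproduced with a short sketch. It assumes the function implied by 9(a), f(X) = -3(x_1 - 2)^2 - 4(x_2 - 3)^2 - 2(x_3 + 5)^2, and uses the closed-form step that is available for a quadratic; the helper exact_step is a hypothetical name. The sketch does not round X_3 to one decimal place, so its second step length differs slightly from a hand calculation based on the rounded point.

```python
# Diagonal of the constant Hessian of the quadratic from 9(a) (assumed).
H_DIAG = (-6.0, -8.0, -4.0)


def grad(x):
    return (-6 * (x[0] - 2), -8 * (x[1] - 3), -4 * (x[2] + 5))


def exact_step(x, d):
    """Step s* maximizing f(x + s d) for the quadratic f: -(g.d) / (d^T H d)."""
    g = grad(x)
    slope = sum(gi * di for gi, di in zip(g, d))
    curvature = sum(h * di * di for h, di in zip(H_DIAG, d))
    return -slope / curvature


x0 = (0.0, 0.0, 0.0)
x2 = (2.0, 2.7, -4.4)  # second steepest-ascent point, taken from 9(a)

# Acceleration step: search along the line through X_0 and X_2.
direction = tuple(b - a for a, b in zip(x0, x2))
s1 = exact_step(x0, direction)
x3 = tuple(a + s1 * d for a, d in zip(x0, direction))

# Followed by one ordinary gradient step from X_3.
g3 = grad(x3)
s2 = exact_step(x3, g3)
x4 = tuple(a + s2 * g for a, g in zip(x3, g3))

print("s1 =", round(s1, 3), " X_3 =", tuple(round(v, 1) for v in x3))
print("s2 =", round(s2, 3), " X_4 =", tuple(round(v, 1) for v in x4))
```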
11(a)
f(X) = (12, 24, -20)(x_1, x_2, x_3)^T + \frac{1}{2}(x_1, x_2, x_3) \mathrm{diag}(-6, -8, -4) (x_1, x_2, x_3)^T - 98
X_0 = 0
\nabla f(X_0) = (12, 24, -20)
\partial f/\partial x_1 = -6(x_1 - 2) = 12 at x_1 = 0
\partial f/\partial x_2 = -8(x_2 - 3) = 24 at x_2 = 0
\partial f/\partial x_3 = -4(x_3 + 5) = -20 at x_3 = 0.
From 9(a),
X_1 = (1.9, 3.8, -3.2)
a_0 = \frac{(-0.6, 6.4, 7.2) \mathrm{diag}(-6, -8, -4) (12, 24, -20)^T}{(12, 24, -20) \mathrm{diag}(-6, -8, -4) (12, 24, -20)^T}
    = -0.2843
D_1 = \nabla f(X_1) + a_0 D_0
    = (0.6, -6.4, -7.2) - 0.2843(12, 24, -20)
    = (-2.8166, -13.2232, -1.514)
X_2 = X_1 + s_1 D_1
    = (1.9, 3.8, -3.2) + s(-2.8, -13.2, -1.5)
f(X_2) = -3(1.9 - 2.8s - 2)^2 - 4(3.8 - 13.2s - 3)^2 - 2(-3.2 - 1.5s + 5)^2
df(X_2)/ds = -6(-2.8s - 0.1)(-2.8) - 8(0.8 - 13.2s)(-13.2) - 4(1.8 - 1.5s)(-1.5),

which is zero at s*. Also
d^2 f(X_2)/ds^2 < 0,
indicating a maximum point. Therefore
s* = 0.0645
X_2 = (1.9, 3.8, -3.2) + 0.0645(-2.8, -13.2, -1.5)
    = (1.7, 2.9, -3.3)
\nabla f(X_2) = (-6(1.7 - 2), -8(2.9 - 3), -4(-3.3 + 5))
    = (1.8, 0.8, -6.06)
a_1 = \frac{(-1.8, -0.8, 6.06) \mathrm{diag}(-6, -8, -4) (-2.81, -13.22, -1.514)^T}{(-2.81, -13.22, -1.51) \mathrm{diag}(-6, -8, -4) (-2.81, -13.22, -1.51)^T}
    = 0.0538257
D_2 = \nabla f(X_2) + a_1 D_1
    = (1.8, 0.8, -6.06) + 0.0538257(-2.8166, -13.2232, -1.514)
    = (1.648, 0.088, -6.141)
X_3 = X_2 + sD_2
    = (1.7, 2.9, -3.3) + s(1.648, 0.088, -6.141)
f(X_3) = -3(-0.3 + 1.648s)^2 - 4(-0.1 + 0.088s)^2 - 2(1.7 - 6.141s)^2
df(X_3)/ds = -6(-0.3 + 1.648s)(1.648) - 8(-0.1 + 0.088s)(0.088) - 4(1.7 - 6.141s)(-6.141),
which is zero at s*. Therefore
s* = 0.4275 and d^2 f(X_3)/ds^2 < 0,
indicating a maximum point. Therefore
X_3 = (1.7, 2.9, -3.3) + 0.4275(1.648, 0.088, -6.141)
    = (2.4, 3.0, -5.9).
At the next iteration
X_4 = (2.0, 3.0, -5.0),
which is the maximum point of f.
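A conjugate gradient ascent for this quadratic can be sketched as follows. It uses the update D_{k+1} = \nabla f(X_{k+1}) + a_k D_k from the solution, with a_k chosen so that successive directions are conjugate with respect to the Hessian, and an exact line search along each direction. The helper names are illustrative, and the sketch works in full floating-point precision, so the coefficients and intermediate points it produces need not coincide with the hand-rounded figures above.

```python
# Diagonal of the constant Hessian diag(-6, -8, -4).
H_DIAG = (-6.0, -8.0, -4.0)


def grad(x):
    return (-6 * (x[0] - 2), -8 * (x[1] - 3), -4 * (x[2] + 5))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def h_times(d):
    """Product H d for the diagonal Hessian."""
    return tuple(h * di for h, di in zip(H_DIAG, d))


def conjugate_gradient_ascent(x, iterations=3):
    g = grad(x)
    d = g  # first direction is the gradient, as in steepest ascent
    directions = []
    for _ in range(iterations):
        if dot(g, g) < 1e-20:  # already at the stationary point
            break
        hd = h_times(d)
        s = -dot(g, d) / dot(d, hd)       # exact line search along d
        x = tuple(xi + s * di for xi, di in zip(x, d))
        directions.append(d)
        g = grad(x)
        a = -dot(g, hd) / dot(d, hd)      # makes the next direction H-conjugate to d
        d = tuple(gi + a * di for gi, di in zip(g, d))
    return x, directions


x_final, directions = conjugate_gradient_ascent((0.0, 0.0, 0.0))
conjugacy = dot(directions[0], h_times(directions[1]))
print("X after 3 iterations:", tuple(round(v, 6) for v in x_final))
print("D_0^T H D_1 =", round(conjugacy, 9))
```

For a quadratic in three variables the method reaches the maximum in at most three line searches, which is the property the conjugacy condition is designed to give.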



12(a)

Therefore

Assume
Xo= (0,0).
Then
X_1 = X_0 - H^{-1}(X_0) \nabla f(X_0)

= (0,0) _ (l~ _~)( -2~)


= a~,4)

H(X l ) = C~7 _~)


Vf(X 1) = ( - ;7).
Therefore

=(H,4)-(m -t)(-~)
= (1.73,4)
Vf(X 2 ) = (-~.83)

H(X 2 )= (7.6o2 -20).



Therefore
X_3 = X_2 - H^{-1}(X_2) \nabla f(X_2)

= (1.73,4) - (0.~3 -~.5)( -~.83)


= (2.36,4)

V!(X 3) = ( - ~.23)

H(X3) = e~4 _~)


X_4 = X_3 - H^{-1}(X_3) \nabla f(X_3)

= (2.34,4) - (0.~6 -~.5)( - ~.23)


= (2.75,4)

V!(X 4 ) = ( - ~8)
H(X_4) = \begin{pmatrix} 1.5 & 0 \\ 0 & -2 \end{pmatrix}.
Therefore
X_5 = X_4 - H^{-1}(X_4) \nabla f(X_4)

= (2.75,4) - (0.~7 -~.5)( -~.l8)


= (2.87,4)

V!(X s) = (-~.06)

H(Xs) = (0.~8 _~}


Therefore
X_6 = X_5 - H^{-1}(X_5) \nabla f(X_5)

= (2.87,4) - C·~8 -~.5)( -~.06)


= (2.95,4)
Eventually
X* = (3,4), f(X*) = 1.
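The update X_{k+1} = X_k - H^{-1}(X_k) \nabla f(X_k) used throughout 12(a) can be illustrated with a small sketch. The objective of 12(a) is not reproduced in this section, so the sketch borrows, purely as an assumption for illustration, the quadratic implied by 9(a), f(X) = -3(x_1 - 2)^2 - 4(x_2 - 3)^2 - 2(x_3 + 5)^2, whose Hessian is the constant diagonal matrix diag(-6, -8, -4). The name newton_step is hypothetical.

```python
# Constant diagonal Hessian of the quadratic borrowed from 9(a).
H_DIAG = (-6.0, -8.0, -4.0)


def grad(x):
    return (-6 * (x[0] - 2), -8 * (x[1] - 3), -4 * (x[2] + 5))


def newton_step(x):
    """One Newton update X - H^{-1} grad f(X); H is diagonal, so the
    inverse is applied by dividing each gradient component by its entry."""
    g = grad(x)
    return tuple(xi - gi / h for xi, gi, h in zip(x, g, H_DIAG))


x0 = (0.0, 0.0, 0.0)
x1 = newton_step(x0)
print("X_1 =", x1)
print("gradient at X_1 =", grad(x1))
```

For a quadratic objective a single Newton step lands on the stationary point. The slow approach to (3, 4) in 12(a) shows that this does not carry over to non-quadratic functions, where the Hessian changes from point to point and must be re-evaluated at every iteration.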
13(a)
X* = (3,4), f(X*) = 1.

13(b)
X* = (-2,1).
13(c)
X* = (0,2,0).
14(b). X* = (t, 0.4, 0.0182).
(c). X* = (1.094, 1.077, 1.041).
(d). X* = (0.9073,1.0561,0.8211).
(e). X* = (2(i)1/4, (!)1/i4, (i)-1 /28).
(f). X* = (1.82,0.234,j2/2).
(g). X* = (1.5004,0.4992,3.9936).
(h). X* = (0.2878,2.3438, 1.8420).
(i). X* = (1.03,0.103,0.872, -1.05).
Index

In the case of duplicated page numbers the major reference is given in boldface. Authors
not referred to here can be found in the references.
Archimedes 4
Ascent methods 312
Assignment problem 83

Bernoulli, J. 5
Big M method 26
Bolzano's method 324
Bound 7
    greatest lower 8
    least upper 7
    lower 8
    upper 7
Brachistochrone 4, 297
Brent's method 343

Calculus of variations 5, 290
Canonical form 21
Capital budgeting 180
Chaplygin 291
Characteristic equation 381
Circuit of cells 77
Classical optimization 3, 257
Cofactor 375
Combinatorial optimization 151
Complementary slackness 55, 56
Conjugate directions 332
Conjugate gradients 333
Conservation of flow 198
Constraint
    binding 26
    global 124
    local 124
    new 68
    redundant 12
    set 6
    slack 26
Convex
    function 18
    programming 311
Critical path scheduling 187, 220
    method 221
Cycling 38
Cutting plane methods 162

Dantzig's method 80
Decomposition 122
Definiteness 380
Degeneracy 35
Degree of difficulty 364
de l'Hopital 5
Derivative 385
    constrained 280
Determinant 374

Diet problem 98 (8e)
Digraph 190
Dijkstra's algorithm 191
Direct methods 313, 334
Distinguishability 311
Domain 385
Dual simplex method 66, 69, 115
Duality 10, 50
Dynamic programming 3, 235, 298

Eigen values 381
Eigen vectors 381
Enumeration
    branch and bound 154
    exhaustive 154
    implicit 154
Euler, L. 5
Euler-Lagrange lemma 295
Extreme point 18
Extremum
    global 258
    local 258

Farmer's problem 256
Feasible region 7
Feasible directions method 345
Fermat 4
First mean value theorem 387
Fixed charge problem 179
Flow
    backward 209
    forward 209
Fly-away kit problem 182, 253, 256
Functions 384
    concave 266
    continuous 385
    convex 266
    differentiable 385
    unimodal 314
Functional 291

Gauss 5
Gauss-Jordan elimination 22, 378
Generalized reduced gradient method 348
Geometric programming 311, 361
Gibbs, J. W. 5
Golden section ratio 323
Gradient
    methods 313, 329
    partan 330
    projection method 346
Gradient direction theorem 390
Gram-Schmidt orthogonalization 342, 382
Graph theory 188
Griffith and Stewart's method 351

Hamilton, W.R. 5
Heron of Alexandria 4
Hessian methods 313, 326
Hungarian method 85

Infimum 8
Integer programming 2, 10, 150
    all integer 163
    mixed 152
    zero-one 152

Jacobian methods 4, 277

kinetic principle 5
Knapsack problem 182
Konig 85
Kruskal's algorithm 194
Kuhn-Tucker conditions 284, 358

Labelling
    method 201
    process 211
Lagrange multipliers 4, 282
Least cost method 74
Least surface area problem 296
Leibniz 5
Limiting value 385
Linear approximation 350
Linear dependence 372
Linear programming 2, 10

Matrix 370
    adjoint 377

    basis 108
    block angular 124
    cofactor 377
    control 279
    Hessian 390
    identity 370
    inverse 377
    nonsingular 377
    square 370
    transpose 371
    zero 371
Maupertuis 5
Maximal flow problem 187, 199
Minimal spanning tree problem 187, 194
Maximum
    absolute 258
    global 258
    local 258
    principle 5, 303
    relative 258
Minimum 8
    absolute 258
    cost flow problem 187, 206
    cut 200
    global 258
    local 258
    relative 258

Network 190
Network analysis 2, 187
Newton, I. 4, 5
    method 264, 273
    Raphson 327
Nonlinear programming 10, 310
Normality condition 362
Northwest corner method 74

Objective function 6
One-at-a-time search 341
Optimum 7
Orthogonality conditions 363
Out-of-kilter method 209

Parametric programming 136
Pattern search 334
Penalty function method 347
PERT 221
Policy 239
Political redistricting 177
Point
    critical 258
    inflection 260
    stationary 258
Postoptimal analysis 10, 50, 59
Posynomial 362
Powell's method 342
Praxis method 343
Primal 50
Primal-Dual algorithm 122
Prim's algorithm 196
Principle of optimality 240

Quadratic form 379
Quadratic programming 310, 358

Range 385
Recursion
    backward 246
    forward 246
Resolution 311
Return 237
Revised simplex method 106
Rolle's theorem 386
Rosenbrock's method 341

Scalar multiplication 372
Search
    adaptive 314
    Bolzano's method 324
    even block 326
    Fibonacci 317
    golden section 322
    one-at-a-time 341
    pattern 334
    univariate 313
Separable programming 311, 352
Serial systems 238
Shortest length problem 295
Shortest path problem 187, 191
Simplex method 10, 19
    multipliers 109
Solution 7
    basic 17
    basic feasible 17

    degenerate 17
    feasible 7
    maximal 7
    minimal 7
    multiple 32
    nonexistent feasible 42
    optimal 7
    unbounded 45
    value 7
Stage 237
Standard form 15
State 237
Step size 312
Stepping stone algorithm 77
Stewart's method 344
SUMT 348
Supremum 7

Taylor's theorem 388
Toll adjustment 212
Transportation problem 69
Travelling salesman problem 172
Two phase method 30

u-v method 80

Variable
    artificial 26
    basic 17
    decision 278
    independent 6
    metric method 327
    new 68
    slack 16
    state 278
    structural 16
Vector 371
    direction 312
Vehicle scheduling problem 174
Vogel approximation method 75

Weierstrass' theorem 386
Weights 362

Zermelo 291
Zoutendijk's method 345
