A First Course in
LINEAR ALGEBRA
An Open Text
by Ken Kuttler
UNIVERSITY OF CALGARY
MATH 211 LINEAR METHODS I
ALL SECTIONS FALL 2016
OPEN TEXT
This text can be downloaded in electronic format, printed, and can be distributed to students
at no cost. Lyryx will also adapt the content and provide custom editions for specific courses
adopting Lyryx Assessment. The original TeX files are also available if instructors wish to adapt
certain sections themselves.
ONLINE ASSESSMENT
Lyryx has developed corresponding formative online assessment for homework and quizzes.
These are genuine questions for the subject and adapted to the content. Student answers are
carefully analyzed by the system and personalized feedback is immediately provided to help
students improve on their work.
Lyryx provides all the tools required to manage online assessment including student grade re-
ports and student performance statistics.
INSTRUCTOR SUPPLEMENTS
A number of resources are available, including a full set of beamer slides for instructors and
students, as well as a partial solution manual.
SUPPORT
Lyryx provides all of the support instructors and students need! Starting from the course prepa-
ration time to beyond the end of the course, Lyryx staff is available 7 days/week to provide
assistance. This may include adapting the text, managing multiple sections of the course, pro-
viding course supplements, as well as timely assistance to students with registration, navigation,
and daily organization.
Contact Lyryx!
[email protected]
A First Course in Linear Algebra
Ken Kuttler
Edits 2012-2014: The content of the text has been modified and adapted with the addition of new material
and several images.
Edits 2015: The content of the text continues to be modified and adapted with the addition of new material
including additional examples and proofs to existing material.
Edits 2016: The layout and appearance of the text have been updated, including the title page and newly
designed back cover.
All new content (text and images) is released under the same license as noted below.
Lyryx Learning
Copyright
Creative Commons License (CC BY): This text, including the art and illustrations, is available under
the Creative Commons license (CC BY), allowing anyone to reuse, revise, remix, and redistribute the text.
Contents

1 Systems of Equations 3
1.1 Systems of Equations, Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Systems Of Equations, Algebraic Procedures . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Elementary Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.2 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.3 Uniqueness of the Reduced Row-Echelon Form . . . . . . . . . . . . . . . . . . 25
1.2.4 Rank and Homogeneous Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2 Matrices 41
2.1 Matrix Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.1.1 Addition of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.1.2 Scalar Multiplication of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.1.3 Multiplication of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.1.4 The ijth Entry of a Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.1.5 Properties of Matrix Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . 55
2.1.6 The Transpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.1.7 The Identity and Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
2.1.8 Finding the Inverse of a Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
2.1.9 Elementary Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
2.1.10 More on Matrix Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3 Determinants 87
3.1 Basic Techniques and Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.1.1 Cofactors and 2 × 2 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.1.2 The Determinant of a Triangular Matrix . . . . . . . . . . . . . . . . . . . . . . . 92
3.1.3 Properties of Determinants I: Examples . . . . . . . . . . . . . . . . . . . . . . . 94
3.1.4 Properties of Determinants II: Some Important Proofs . . . . . . . . . . . . . . . 98
3.1.5 Finding Determinants using Row Operations . . . . . . . . . . . . . . . . . . . . 103
3.2 Applications of the Determinant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.2.1 A Formula for the Inverse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
3.2.2 Cramer's Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
3.2.3 Polynomial Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4 R^n 125
4.1 Vectors in R^n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.2 Algebra in R^n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.2.1 Addition of Vectors in R^n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
4.2.2 Scalar Multiplication of Vectors in R^n . . . . . . . . . . . . . . . . . . . . . . . . 129
4.3 Geometric Meaning of Vector Addition . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.4 Length of a Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.5 Geometric Meaning of Scalar Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.6 Parametric Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
4.7 The Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
4.7.1 The Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
4.7.2 The Geometric Significance of the Dot Product . . . . . . . . . . . . . . . . . . . 149
4.7.3 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
4.8 Planes in R^n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
4.9 The Cross Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
4.9.1 The Box Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
4.10 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
4.10.1 Vectors and Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
4.10.2 Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
Index 305
Preface
A First Course in Linear Algebra presents an introduction to the fascinating subject of linear algebra.
As the title suggests, this text is designed as a first course in linear algebra for students who have a
reasonable understanding of basic algebra. Major topics of linear algebra are presented in detail, with
proofs of important theorems provided. Connections to additional topics covered in advanced courses are
introduced, in an effort to assist those students who are interested in continuing on in linear algebra.
Each chapter begins with a list of desired outcomes which a student should be able to achieve upon
completing the chapter. Throughout the text, examples and diagrams are given to reinforce ideas and
provide guidance on how to approach various problems. Suggested exercises are given at the end of each
section, and students are encouraged to work through a selection of these exercises.
A brief review of complex numbers is given, which can serve as an introduction to anyone unfamiliar
with the topic.
Linear algebra is a wonderful and interesting subject, which should not be reduced to an exercise in
correct arithmetic. The use of a computer algebra system can be a great help in long and difficult
computations. Some of the standard computations of linear algebra are easily done by the computer,
including finding the reduced row-echelon form. While the use of a computer system is encouraged, it is
not meant to be done without the student having an understanding of the computations.
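For example, the reduced row-echelon form of a matrix (a computation introduced in Section 1.2) can be
obtained in a few lines with a computer algebra system. The sketch below uses Python's sympy library,
one possible tool among many:

# A minimal sketch using sympy: rref() returns the reduced row-echelon
# form together with the indices of the pivot columns.
from sympy import Matrix

A = Matrix([[1, 3, 6, 25],
            [2, 7, 14, 58],
            [0, 2, 5, 19]])

reduced, pivots = A.rref()
print(reduced)  # Matrix([[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 3]])
print(pivots)   # (0, 1, 2)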
1. Systems of Equations
Outcomes
A. Relate the types of solution sets of a system of two (three) variables to the intersections of
lines in a plane (the intersections of planes in three-space).
As you may remember, linear equations like 2x + 3y = 6 can be graphed as straight lines in the coordi-
nate plane. We say that this equation is in two variables, in this case x and y. Suppose you have two such
equations, each of which can be graphed as a straight line, and consider the resulting graph of two lines.
What would it mean if there exists a point of intersection between the two lines? This point, which lies on
both graphs, gives x and y values for which both equations are true. In other words, this point gives the
ordered pair (x, y) that satisfies both equations. If the point (x, y) is a point of intersection, we say that (x, y)
is a solution to the two equations. In linear algebra, we often are concerned with finding the solution(s)
to a system of equations, if such solutions exist. First, we consider graphical representations of solutions
and later we will consider the algebraic methods for finding solutions.
When looking for the intersection of two lines in a graph, several situations may arise. The follow-
ing picture demonstrates the possible situations when considering two equations (two lines in the graph)
involving two variables.
[Three graphs of two lines in the plane: One Solution (lines crossing at a point), No Solutions (parallel lines), Infinitely Many Solutions (coincident lines)]
In the first diagram, there is a unique point of intersection, which means that there is only one (unique)
solution to the two equations. In the second, there are no points of intersection and no solution. When no
solution exists, this means that the two lines are parallel and they never intersect. The third situation which
can occur, as demonstrated in diagram three, is that the two lines are really the same line. For example,
x + y = 1 and 2x + 2y = 2 are equations which when graphed yield the same line. In this case there are
infinitely many points which are solutions of these two equations, as every ordered pair which is on the
graph of the line satisfies both equations. When considering linear systems of equations, there are always
three possible types of solutions: exactly one (unique) solution, infinitely many solutions, or no solution.
Example 1.1: A Graphical Solution

Use a graph to find the solution to the following system of equations:

x + y = 3
y - x = 5
Solution. Through graphing the above equations and identifying the point of intersection, we can find the
solution(s). Remember that we must have either one solution, infinitely many, or no solutions at all. The
following graph shows the two equations, as well as the intersection. Remember, the point of intersection
represents the solution of the two equations, or the (x, y) which satisfy both equations. In this case, there
is one point of intersection at (-1, 4) which means we have one unique solution, x = -1, y = 4.
[Graph: the two lines intersect at the point (x, y) = (-1, 4)]
In the above example, we investigated the intersection point of two equations in two variables, x and
y. Now we will consider the graphical solutions of three equations in two variables.
Consider a system of three equations in two variables. Again, these equations can be graphed as
straight lines in the plane, so that the resulting graph contains three straight lines. Recall the three possible
types of solutions: no solution, one solution, and infinitely many solutions. There are now more complex
ways of achieving these situations, due to the presence of the third line. For example, you can imagine
the case of three intersecting lines having no common point of intersection. Perhaps you can also imagine
three intersecting lines which do intersect at a single point. These two situations are illustrated below.
[Two graphs of three lines in the plane: one with no common point of intersection (No Solution), one with a single common point (One Solution)]
Consider the first picture above. While all three lines intersect with one another, there is no common
point of intersection where all three lines meet at one point. Hence, there is no solution to the three
equations. Remember, a solution is a point (x, y) which satisfies all three equations. In the case of the
second picture, the lines intersect at a common point. This means that there is one solution to the three
equations whose graphs are the given lines. You should take a moment now to draw the graph of a system
which results in three parallel lines. Next, try the graph of three identical lines. Which type of solution is
represented in each of these graphs?
We have now considered the graphical solutions of systems of two equations in two variables, as well
as three equations in two variables. However, there is no reason to limit our investigation to equations in
two variables. We will now consider equations in three variables.
You may recall that equations in three variables, such as 2x + 4y - 5z = 8, form a plane. Above, we
were looking for intersections of lines in order to identify any possible solutions. When graphically solving
systems of equations in three variables, we look for intersections of planes. These points of intersection
give the (x, y, z) that satisfy all the equations in the system. What types of solutions are possible when
working with three variables? Consider the following picture involving two planes, which are given by
two equations in three variables.
Notice how these two planes intersect in a line. This means that the points (x, y, z) on this line satisfy
both equations in the system. Since the line contains infinitely many points, this system has infinitely
many solutions.
It could also happen that the two planes fail to intersect. However, is it possible to have two planes
intersect at a single point? Take a moment to attempt drawing this situation, and convince yourself that it
is not possible! This means that when we have only two equations in three variables, there is no way to
have a unique solution! Hence, the types of solutions possible for two equations in three variables are no
solution or infinitely many solutions.
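This can also be checked algebraically. Handing two plane equations to a computer algebra system
returns a one-parameter family of points, never a single point. Here is a sketch using Python's sympy
library; the two equations are illustrative and not taken from the text:

# Two equations in three variables: sympy reports a line of solutions,
# with z left as a free variable (the equations here are made up).
from sympy import symbols, linsolve

x, y, z = symbols("x y z")
planes = [x + y + z - 3, x - y + 2*z - 1]  # each expression equals 0
print(linsolve(planes, x, y, z))
# a one-parameter family such as {(2 - 3*z/2, z/2 + 1, z)}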
Now imagine adding a third plane. In other words, consider three equations in three variables. What
types of solutions are now possible? Consider the following diagram.
[Figure: three planes with no common point of intersection; the third is labeled "New Plane"]
In this diagram, there is no point which lies in all three planes. There is no intersection between all
planes so there is no solution. The picture illustrates the situation in which the line of intersection of the
new plane with one of the original planes forms a line parallel to the line of intersection of the first two
planes. However, in three dimensions, it is possible for two lines to fail to intersect even though they are
not parallel. Such lines are called skew lines.
Recall that when working with two equations in three variables, it was not possible to have a unique
solution. Is it possible when considering three equations in three variables? In fact, it is possible, and we
demonstrate this situation in the following picture.
[Figure: three planes intersecting in a single point; the third is labeled "New Plane"]
In this case, the three planes have a single point of intersection. Can you think of other types of
solutions possible? Another is that the three planes could intersect in a line, resulting in infinitely many
solutions, as in the following diagram.
We have now seen how three equations in three variables can have no solution, a unique solution, or
intersect in a line resulting in infinitely many solutions. It is also possible that the three equations graph
the same plane, which also leads to infinitely many solutions.
You can see that when working with equations in three variables, there are many more ways to achieve
the different types of solutions than when working with two variables. It may prove enlightening to spend
time imagining (and drawing) many possible scenarios, and you should take some time to try a few.
You should also take some time to imagine (and draw) graphs of systems in more than three variables.
Equations like x + y - 2z + 4w = 8 with more than three variables are often called hyper-planes. You may
soon realize that it is tricky to draw the graphs of hyper-planes! Through the tools of linear algebra, we
can algebraically examine these types of systems which are difficult to graph. In the following section, we
will consider these algebraic tools.
Exercises
Exercise 1.1.1 Graphically, find the point (x1, y1) which lies on both lines, x + 3y = 1 and 4x - y = 3.
That is, graph each line and see where they intersect.
Exercise 1.1.2 Graphically, find the point of intersection of the two lines 3x + y = 3 and x + 2y = 1. That
is, graph each line and see where they intersect.
Exercise 1.1.3 You have a system of k equations in two variables, k ≥ 2. Explain the geometric signifi-
cance of
(a) No solution.
Outcomes
A. Use elementary operations to find the solution to a linear system of equations.
We have taken an in-depth look at graphical representations of systems of equations, as well as how to
find possible solutions graphically. Our attention now turns to working with systems algebraically.
where the aij and bi are real numbers. The above is a system of m equations in the n variables
x1, x2, ..., xn. Written more simply in terms of summation notation, the above can be written in
the form

∑_{j=1}^{n} aij xj = bi,    i = 1, 2, 3, ..., m
The relative size of m and n is not important here. Notice that we have allowed aij and bi to be any
real number. We can also call these numbers scalars. We will use this term throughout the text, so keep
in mind that the term scalar just means that we are working with real numbers.
Now, suppose we have a system where bi = 0 for all i. In other words, the right-hand side of every
equation equals 0. This is a special type of system.
Recall from the previous section that our goal when working with systems of linear equations was to
find the point of intersection of the equations when graphed. In other words, we looked for the solutions to
the system. We now wish to find these solutions algebraically. We want to find values for x1, ..., xn which
solve all of the equations. If such a set of values exists, we call (x1, ..., xn) the solution set.
Recall the above discussions about the types of solutions possible. We will see that systems of linear
equations will have one unique solution, infinitely many solutions, or no solution. Consider the following
definition.
If you think of each equation as a condition which must be satisfied by the variables, consistent would
mean there is some choice of variables which can satisfy all the conditions. Inconsistent would mean there
is no choice of the variables which can satisfy all of the conditions.
The following sections provide methods for determining if a system is consistent or inconsistent, and
finding solutions if they exist.
We begin this section with an example. Recall from Example 1.1 that the solution to the given system was
(x, y) = (-1, 4).
x + y = 3
y - x = 5
Solution. By graphing these two equations and identifying the point of intersection, we previously found
that (x, y) = (-1, 4) is the unique solution.
We can verify algebraically by substituting these values into the original equations, and ensuring that
the equations hold. First, we substitute the values into the first equation and check that it equals 3.

x + y = (-1) + (4) = 3

This equals 3 as needed, so we see that (-1, 4) is a solution to the first equation. Substituting the values
into the second equation yields

y - x = (4) - (-1) = 4 + 1 = 5

which is true. For (x, y) = (-1, 4) each equation is true and therefore, this is a solution to the system.
Now, the interesting question is this: If you were not given these numbers to verify, how could you
algebraically determine the solution? Linear algebra gives us the tools needed to answer this question.
The following basic operations are important tools that we will utilize.
It is important to note that none of these operations will change the set of solutions of the system of
equations. In fact, elementary operations are the key tool we use in linear algebra to find solutions to
systems of equations.
Example 1.7: Show that the system

x + y = 7
2x - y = 8

has the same solution set as the system

x + y = 7
-3y = -6
Solution. Notice that the second system has been obtained by taking the second equation of the first system
and adding -2 times the first equation, as follows:
2x - y + (-2)(x + y) = 8 + (-2)(7)

By simplifying, we obtain

-3y = -6
which is the second equation in the second system. Now, from here we can solve for y and see that y = 2.
Next, we substitute this value into the first equation as follows
x+y = x+2 = 7
Hence x = 5 and so (x, y) = (5, 2) is a solution to the second system. We want to check if (5, 2) is also a
solution to the first system. We check this by substituting (x, y) = (5, 2) into the system and ensuring the
equations are true.
x + y = (5) + (2) = 7
2x - y = 2(5) - (2) = 8
Hence, (5, 2) is also a solution to the first system.
This example illustrates how an elementary operation applied to a system of two equations in two
variables does not affect the solution set. However, a linear system may involve many equations and many
variables and there is no reason to limit our study to small systems. For any size of system in any number
of variables, the solution set is still the collection of solutions to the equations. In every case, the above
operations of Definition 1.6 do not change the set of solutions to the system of linear equations.
In the following theorem, we use the notation Ei to represent an equation, while bi denotes a constant.
E1 = b1
E2 = b2        (1.1)

Then the following systems have the same solution set as (1.1):

1. E2 = b2
   E1 = b1        (1.2)

2. E1 = b1
   kE2 = kb2        (1.3)

for any scalar k, provided k ≠ 0.

3. E1 = b1
   E2 + kE1 = b2 + kb1        (1.4)

for any scalar k (including k = 0).
Before we proceed with the proof of Theorem 1.8, let us consider this theorem in the context of Example
1.7. Then,

E1 = x + y, b1 = 7
E2 = 2x - y, b2 = 8

Recall the elementary operations that we used to modify the system in the solution to the example. First,
we added (-2) times the first equation to the second equation. In terms of Theorem 1.8, this action is
given by

E2 + (-2) E1 = b2 + (-2) b1

or

2x - y + (-2)(x + y) = 8 + (-2)(7)

This gave us the second system in Example 1.7, given by

E1 = b1
E2 + (-2) E1 = b2 + (-2) b1
From this point, we were able to find the solution to the system. Theorem 1.8 tells us that the solution
we found is in fact a solution to the original system.
We will now prove Theorem 1.8.
Proof.
1. The proof that the systems (1.1) and (1.2) have the same solution set is as follows. Suppose that
(x1, ..., xn) is a solution to E1 = b1, E2 = b2. We want to show that this is a solution to the system
in (1.2) above. This is clear, because the system in (1.2) is the original system, but listed in a different
order. Changing the order does not affect the solution set, so (x1, ..., xn) is a solution to (1.2).
2. Next we want to prove that the systems (1.1) and (1.3) have the same solution set. That is, E1 = b1,
E2 = b2 has the same solution set as the system E1 = b1, kE2 = kb2 provided k ≠ 0. Let (x1, ..., xn) be a
solution of E1 = b1, E2 = b2. We want to show that it is a solution to E1 = b1, kE2 = kb2. Notice that
the only difference between these two systems is that the second involves multiplying the equation
E2 = b2 by the scalar k. Recall that when you multiply both sides of an equation by the same number,
the sides are still equal to each other. Hence if (x1, ..., xn) is a solution to E2 = b2, then it will also
be a solution to kE2 = kb2. Hence, (x1, ..., xn) is also a solution to (1.3).
Similarly, let (x1, ..., xn) be a solution of E1 = b1, kE2 = kb2. Then we can multiply the equation
kE2 = kb2 by the scalar 1/k, which is possible only because we have required that k ≠ 0. Just as
above, this action preserves equality and we obtain the equation E2 = b2. Hence (x1, ..., xn) is also
a solution to E1 = b1, E2 = b2.
3. Finally, we will prove that the systems (1.1) and (1.4) have the same solution set. We will show that
any solution of E1 = b1, E2 = b2 is also a solution of (1.4). Then, we will show that any solution of
(1.4) is also a solution of E1 = b1, E2 = b2. Let (x1, ..., xn) be a solution to E1 = b1, E2 = b2. Then
in particular it solves E1 = b1. Hence, it solves the first equation in (1.4). Similarly, it also solves
E2 = b2. By our proof of (1.3), it also solves kE1 = kb1. Notice that if we add E2 and kE1, this is equal
to b2 + kb1. Therefore, if (x1, ..., xn) solves E1 = b1, E2 = b2 it must also solve E2 + kE1 = b2 + kb1.
Now suppose (x1, ..., xn) solves the system E1 = b1, E2 + kE1 = b2 + kb1. Then in particular it is a
solution of E1 = b1. Again by our proof of (1.3), it is also a solution to kE1 = kb1. Now if we subtract
these equal quantities from both sides of E2 + kE1 = b2 + kb1 we obtain E2 = b2, which shows that
the solution also satisfies E1 = b1, E2 = b2.
Stated simply, the above theorem shows that the elementary operations do not change the solution set
of a system of equations.
We will now look at an example of a system of three equations and three variables. Similarly to the
previous examples, the goal is to find values for x, y, z such that each of the given equations is satisfied
when these values are substituted in.
x + 3y + 6z = 25
2x + 7y + 14z = 58 (1.5)
2y + 5z = 19
Solution. We can relate this system to Theorem 1.8 above. In this case, we have
E1 = x + 3y + 6z, b1 = 25
E2 = 2x + 7y + 14z, b2 = 58
E3 = 2y + 5z, b3 = 19
Theorem 1.8 claims that if we do elementary operations on this system, we will not change the solution
set. Therefore, we can solve this system using the elementary operations given in Definition 1.6. First,
1.2. Systems Of Equations, Algebraic Procedures 13
replace the second equation by (-2) times the first equation added to the second. This yields the system
x + 3y + 6z = 25
y + 2z = 8 (1.6)
2y + 5z = 19
Now, replace the third equation with (-2) times the second added to the third. This yields the system
x + 3y + 6z = 25
y + 2z = 8 (1.7)
z=3
At this point, we can easily find the solution. Simply take z = 3 and substitute this back into the previous
equation to solve for y, and similarly to solve for x.
x + 3y + 6 (3) = x + 3y + 18 = 25
y + 2 (3) = y + 6 = 8
z=3
x + 3y = 7
y=2
z=3
Now add (-3) times the second to the first. This yields
x=1
y=2
z=3
a system which has the same solution set as the original system. This avoided back substitution and led
to the same solution set. It is your decision which you prefer to use, as both methods lead to the correct
solution, (x, y, z) = (1, 2, 3).
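As a quick sanity check, the solution can be verified by direct substitution; the following sketch in
Python is not part of the text's method, just a confirmation:

# Substitute (x, y, z) = (1, 2, 3) into each equation of system (1.5).
x, y, z = 1, 2, 3
assert x + 3*y + 6*z == 25
assert 2*x + 7*y + 14*z == 58
assert 2*y + 5*z == 19
print("(1, 2, 3) satisfies all three equations")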
The work we did in the previous section will always find the solution to the system. In this section, we
will explore a less cumbersome way to find the solutions. First, we will represent a linear system with
an augmented matrix. A matrix is simply a rectangular array of numbers. The size or dimension of a
matrix is defined as m × n, where m is the number of rows and n is the number of columns. In order to
construct an augmented matrix from a linear system, we create a coefficient matrix from the coefficients
of the variables in the system, as well as a constant matrix from the constants. The coefficients from one
equation of the system create one row of the augmented matrix.
For example, consider the linear system in Example 1.9
x + 3y + 6z = 25
2x + 7y + 14z = 58
2y + 5z = 19
The augmented matrix for this system is

1 3 6 | 25
2 7 14 | 58
0 2 5 | 19

Notice that it has exactly the same information as the original system. Here it is understood that the
first column contains the coefficients from x in each equation, in order: 1, 2, 0. Similarly, we create a
column from the coefficients on y in each equation, 3, 7, 2, and a column from the coefficients on z in
each equation, 6, 14, 5. For a system of more than three variables, we would continue in this way,
constructing a column for each variable. Similarly, for a system of fewer than three variables, we simply
construct a column for each variable. Finally, we construct a column from the constants of the equations,
25, 58, 19.

The rows of the augmented matrix correspond to the equations in the system. For example, the top
row in the augmented matrix, 1 3 6 | 25, corresponds to the equation

x + 3y + 6z = 25.
For a general system of the form

a11 x1 + ... + a1n xn = b1
...
am1 x1 + ... + amn xn = bm

where the xi are variables and the aij and bi are constants, the augmented matrix of this system is
given by

a11 ... a1n | b1
...
am1 ... amn | bm
Now, consider elementary operations in the context of the augmented matrix. The elementary opera-
tions in Definition 1.6 can be used on the rows just as we used them on equations previously. Changes to
a system of equations as a result of an elementary operation are equivalent to changes in the augmented
matrix resulting from the corresponding row operation. Note that Theorem 1.8 implies that any elementary
row operations used on an augmented matrix will not change the solution to the corresponding system of
equations. We now formally define elementary row operations. These are the key tool we will use to find
solutions to systems of equations.
Recall how we solved Example 1.9. We can do the exact same steps as above, except now in the
context of an augmented matrix and using row operations. The augmented matrix of this system is
1 3 6 25
2 7 14 58
0 2 5 19
Thus the first step in solving the system given by (1.5) would be to take (-2) times the first row of the
augmented matrix and add it to the second row,
1 3 6 25
0 1 2 8
0 2 5 19
Note how this corresponds to (1.6). Next take (-2) times the second row and add to the third,
1 3 6 25
0 1 2 8
0 0 1 3
This augmented matrix corresponds to the system
x + 3y + 6z = 25
y + 2z = 8
z=3
which is the same as (1.7). By back substitution you obtain the solution x = 1, y = 2, and z = 3.
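The same two row operations can be carried out programmatically. The following sketch uses Python's
sympy library (one possible choice) so that the arithmetic stays exact:

# Apply the two row operations above to the augmented matrix.
from sympy import Matrix

M = Matrix([[1, 3, 6, 25],
            [2, 7, 14, 58],
            [0, 2, 5, 19]])

M[1, :] = M[1, :] + (-2) * M[0, :]  # R2 <- R2 + (-2) R1, giving (1.6)
M[2, :] = M[2, :] + (-2) * M[1, :]  # R3 <- R3 + (-2) R2, giving (1.7)
print(M)  # Matrix([[1, 3, 6, 25], [0, 1, 2, 8], [0, 0, 1, 3]])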
Through a systematic procedure of row operations, we can simplify an augmented matrix and carry it
to row-echelon form or reduced row-echelon form, which we define next. These forms are used to find
the solutions of the system of equations corresponding to the augmented matrix.
In the following definitions, the term leading entry refers to the first nonzero entry of a row when
scanning the row from left to right. An augmented matrix is in row-echelon form if

1. All nonzero rows are above any rows of zeros.

2. Each leading entry of a row is in a column to the right of the leading entries of any row above
it.

3. Each leading entry of a row is equal to 1.

We also consider another reduced form of the augmented matrix which has one further condition. An
augmented matrix is in reduced row-echelon form if, in addition to the three conditions above,

4. All entries in a column above and below a leading entry are zero.
Notice that the first three conditions on a reduced row-echelon form matrix are the same as those for
row-echelon form.
Hence, every reduced row-echelon form matrix is also in row-echelon form. The converse is not
necessarily true; we cannot assume that every matrix in row-echelon form is also in reduced row-echelon
form. However, it often happens that the row-echelon form is sufficient to provide information about the
solution of a system.
The following examples describe matrices in these various forms. As an exercise, take the time to
carefully verify that they are in the specified form.
Notice that we could apply further row operations to these matrices to carry them to reduced row-
echelon form. Take the time to try that on your own. Consider the following matrices, which are in
reduced row-echelon form.
One way in which the row-echelon form of a matrix is useful is in identifying the pivot positions and
pivot columns of the matrix.
This is all we need in this example, but note that this matrix is not in reduced row-echelon form.
In order to identify the pivot positions in the original matrix, we look for the leading entries in the
row-echelon form of the matrix. Here, the entry in the first row and first column, as well as the entry in
the second row and second column are the leading entries. Hence, these locations are the pivot positions.
We identify the pivot positions in the original matrix, as in the following:
1 2 3 4
3 2 1 6
4 4 4 10
Thus the pivot columns in the matrix are the first two columns.
The following is an algorithm for carrying a matrix to row-echelon form and reduced row-echelon
form. You may wish to use this algorithm to carry the above matrix to row-echelon form or reduced
row-echelon form yourself for practice.
1. Starting from the left, find the first nonzero column. This is the first pivot column, and the
position at the top of this column is the first pivot position. Switch rows if necessary to place
a nonzero number in the first pivot position.
2. Use row operations to make the entries below the first pivot position (in the first pivot column)
equal to zero.
3. Ignoring the row containing the first pivot position, repeat steps 1 and 2 with the remaining
rows. Repeat the process until there are no more rows to modify.
4. Divide each nonzero row by the value of the leading entry, so that the leading entry becomes
1. The matrix will then be in row-echelon form.
The following step will carry the matrix from row-echelon form to reduced row-echelon form.
5. Moving from right to left, use row operations to create zeros in the entries of the pivot columns
which are above the pivot positions. The result will be a matrix in reduced row-echelon form.
Most often we will apply this algorithm to an augmented matrix in order to find the solution to a system
of linear equations. However, we can use this algorithm to compute the reduced row-echelon form of any
matrix which could be useful in other applications.
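For concreteness, here is one possible Python implementation of Algorithm 1.19; it is a sketch that
follows the five steps above, not the only way to organize them:

# A minimal implementation of Algorithm 1.19 using exact fractions.
from fractions import Fraction

def reduced_row_echelon(rows):
    A = [[Fraction(x) for x in row] for row in rows]
    m, n = len(A), len(A[0])
    pivot_row, pivots = 0, []
    for col in range(n):                         # step 1: scan left to right
        r = next((i for i in range(pivot_row, m) if A[i][col] != 0), None)
        if r is None:
            continue                             # no pivot in this column
        A[pivot_row], A[r] = A[r], A[pivot_row]  # switch rows if necessary
        for i in range(pivot_row + 1, m):        # step 2: zeros below pivot
            f = A[i][col] / A[pivot_row][col]
            A[i] = [a - f * b for a, b in zip(A[i], A[pivot_row])]
        pivots.append((pivot_row, col))
        pivot_row += 1                           # step 3: repeat on the rest
    for r, col in pivots:                        # step 4: leading entries 1
        lead = A[r][col]
        A[r] = [a / lead for a in A[r]]
    for r, col in reversed(pivots):              # step 5: zeros above pivots
        for i in range(r):
            f = A[i][col]
            A[i] = [a - f * b for a, b in zip(A[i], A[r])]
    return A

result = reduced_row_echelon([[0, -5, -4], [1, 4, 3], [5, 10, 7]])
print([[str(x) for x in row] for row in result])
# [['1', '0', '-1/5'], ['0', '1', '4/5'], ['0', '0', '0']]

Exact fractions matter here: with floating-point arithmetic, an entry that should be exactly zero can
come out as a tiny nonzero number and be mistaken for a pivot.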
Consider the following example of Algorithm 1.19, in which we carry the matrix

0 -5 -4
1 4 3
5 10 7

to row-echelon form and then to reduced row-echelon form.
Solution. In working through this example, we will use the steps outlined in Algorithm 1.19.
1. The first pivot column is the first column of the matrix, as this is the first nonzero column from the
left. Hence the first pivot position is the one in the first row and first column. Switch the first two
rows to obtain a nonzero entry in the first pivot position, outlined in a box below.
1 4 3
0 -5 -4
5 10 7
2. Step two involves creating zeros in the entries below the first pivot position. The first entry of the
second row is already a zero. All we need to do is subtract 5 times the first row from the third row.
The resulting matrix is
1 4 3
0 -5 -4
0 -10 -8
3. Now ignore the top row. Apply steps 1 and 2 to the smaller matrix
-5 -4
-10 -8
In this matrix, the first column is a pivot column, and -5 is in the first pivot position. Therefore, we
need to create a zero below it. To do this, add (-2) times the first row (of this matrix) to the second.
The resulting matrix is
-5 -4
0 0
Our original matrix now looks like
1 4 3
0 -5 -4
0 0 0
We can see that there are no more rows to modify.
4. Now, we need to create leading 1s in each row. The first row already has a leading 1 so no work is
needed here. Divide the second row by -5 to create a leading 1. The resulting matrix is
1 4 3
0 1 4/5
0 0 0
This matrix is now in row-echelon form.
5. Now create zeros in the entries above pivot positions in each column, in order to carry this matrix
all the way to reduced row-echelon form. Notice that there is no pivot position in the third column
so we do not need to create any zeros in this column! The column in which we need to create zeros
is the second. To do so, subtract 4 times the second row from the first row. The resulting matrix is
1 0 -1/5
0 1 4/5
0 0 0
In the next example, we look at how to solve a system of equations using the corresponding augmented
matrix.
2x + 4y - 3z = -1
5x + 10y - 7z = -2
3x + 6y + 5z = 9
In order to find the solution to this system, we wish to carry the augmented matrix to reduced row-
echelon form. We will do so using Algorithm 1.19. Notice that the first column is nonzero, so this is our
first pivot column. The first entry in the first row, 2, is the first leading entry and it is in the first pivot
position. We will use row operations to create zeros in the entries below the 2. First, replace the second
row with (-5) times the first row plus 2 times the second row. This yields
2 4 -3 -1
0 0 1 1
3 6 5 9
Now, replace the third row with (-3) times the first row plus 2 times the third row. This yields
2 4 -3 -1
0 0 1 1
0 0 19 21
Now the entries in the first column below the pivot position are zeros. We now look for the second pivot
column, which in this case is column three. Here, the 1 in the second row and third column is in the pivot
position. We need to do just one row operation to create a zero below the 1.
Taking (-19) times the second row and adding it to the third row yields

2 4 -3 -1
0 0 1 1
0 0 0 2
We could proceed with the algorithm to carry this matrix to row-echelon form or reduced row-echelon
form. However, remember that we are looking for the solutions to the system of equations. Take another
look at the third row of the matrix. Notice that it corresponds to the equation
0x + 0y + 0z = 2

There is no solution to this equation because for all x, y, z, the left side will equal 0 and 0 ≠ 2. This shows
there is no solution to the given system of equations. In other words, this system is inconsistent.
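A computer algebra system reaches the same verdict. In the sketch below (a check, not part of the
text), the reduced row-echelon form of the augmented matrix contains the row 0 0 0 1, the signature of
an inconsistent system:

# The bottom row of the rref is [0, 0, 0, 1]: no solution exists.
from sympy import Matrix

M = Matrix([[2, 4, -3, -1],
            [5, 10, -7, -2],
            [3, 6, 5, 9]])
print(M.rref()[0])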
The following is another example of how to find the solution to a system of equations by carrying the
corresponding augmented matrix to reduced row-echelon form.
3x - y - 5z = 9
y - 10z = 0        (1.8)
-2x + y = -6
In order to find the solution to this system, we will carry the augmented matrix to reduced row-echelon
form, using Algorithm 1.19. The first column is the first pivot column. We want to use row operations to
create zeros beneath the first entry in this column, which is in the first pivot position. Replace the third
row with 2 times the first row added to 3 times the third row. This gives
3 -1 -5 9
0 1 -10 0
0 1 -10 0
Now, we have created zeros beneath the 3 in the first column, so we move on to the second pivot column
(which is the second column) and repeat the procedure. Take (-1) times the second row and add to the third
row.
3 -1 -5 9
0 1 -10 0
0 0 0 0
The entry below the pivot position in the second column is now a zero. Notice that we have no more pivot
columns because we have only two leading entries.
At this stage, we also want the leading entries to be equal to one. To do so, divide the first row by 3.
1 -1/3 -5/3 3
0 1 -10 0
0 0 0 0
Next, add 1/3 times the second row to the first row.
1 0 -5 3
0 1 -10 0
0 0 0 0
This is in reduced row-echelon form, which you should verify using Definition 1.13. The equations
corresponding to this reduced row-echelon form are
x - 5z = 3
y - 10z = 0
or
x = 3 + 5z
y = 10z
Observe that z is not restrained by any equation. In fact, z can equal any number. For example, we can
let z = t, where we can choose t to be any number. In this context t is called a parameter. Therefore, the
solution set of this system is
x = 3 + 5t
y = 10t
z=t
where t is arbitrary. The system has an infinite set of solutions which are given by these equations. For
any value of t we select, x, y, and z will be given by the above equations. For example, if we choose t = 4
then the corresponding solution would be
x = 3 + 5(4) = 23
y = 10(4) = 40
z=4
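A computer algebra system reports the same one-parameter family. In the sketch below, sympy's
linsolve expresses x and y in terms of the free variable z, which plays the role of the parameter t:

# The free variable z parameterizes the infinite solution set.
from sympy import symbols, linsolve

x, y, z = symbols("x y z")
system = [3*x - y - 5*z - 9,  # 3x - y - 5z = 9
          y - 10*z,           # y - 10z = 0
          -2*x + y + 6]       # -2x + y = -6
print(linsolve(system, x, y, z))  # {(5*z + 3, 10*z, z)}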
In Example 1.22 the solution involved one parameter. It may happen that the solution to a system
involves more than one parameter, as shown in the following example.
Example 1.23: Find the solution to the following system of equations.

x + 2y - z + w = 3
x + y - z + w = 1
x + 3y - z + w = 5

Solution. The augmented matrix of this system is

1 2 -1 1 3
1 1 -1 1 1
1 3 -1 1 5

Take (-1) times the first row and add to the second. Then take (-1) times the first row and add to the
third. This yields

1 2 -1 1 3
0 -1 0 0 -2
0 1 0 0 2
Now add the second row to the third row and divide the second row by -1.

1 2 -1 1 3
0 1 0 0 2        (1.9)
0 0 0 0 0
This matrix is in row-echelon form and we can see that x and y correspond to pivot columns, while
z and w do not. Therefore, we will assign parameters to the variables z and w. Assign the parameter s
to z and the parameter t to w. Then the first row yields the equation x + 2y - s + t = 3, while the second
row yields the equation y = 2. Since y = 2, the first equation becomes x + 4 - s + t = 3, showing that the
solution is given by

x = -1 + s - t
y=2
z=s
w=t
It is customary to write this solution in the form

x     -1 + s - t
y          2
z   =      s          (1.10)
w          t
This example shows a system of equations with an infinite solution set which depends on two param-
eters. It can be less confusing in the case of an infinite solution set to first place the augmented matrix in
reduced row-echelon form rather than just row-echelon form before seeking to write down the description
of the solution.
In the above steps, this means we don't stop with the row-echelon form in equation (1.9). Instead we
first place it in reduced row-echelon form as follows.
1 0 -1 1 -1
0 1 0 0 2
0 0 0 0 0

Then the solution is y = 2 from the second row and x = -1 + z - w from the first. Thus letting z = s and
w = t, the solution is given by (1.10).
You can see here that there are two paths to the correct answer, both of which yield the same solution.
Hence, either approach may be used. The process which we first used in the above solution is called
Gaussian Elimination. This process involves carrying the matrix to row-echelon form, converting back to
equations, and using back substitution to find the solution. When you do row operations until you obtain
reduced row-echelon form, the process is called Gauss-Jordan Elimination.
We have now seen examples of systems of equations with no solution and with infinitely many solutions,
involving one parameter as well as two parameters. Recall the three types of solution sets which we discussed
in the previous section: no solution, one solution, and infinitely many solutions. Each of these types of
solutions could be identified from the graph of the system. It turns out that we can also identify the type
of solution from the reduced row-echelon form of the augmented matrix.
No Solution: In the case where the system of equations has no solution, the row-echelon form of
the augmented matrix will have a row of the form
0 0 0 | 1
This row indicates that the system is inconsistent and has no solution.
One Solution: In the case where the system of equations has one solution, every column of the
coefficient matrix is a pivot column. The following is an example of an augmented matrix in reduced
row-echelon form for a system of equations with one solution.
1 0 0 5
0 1 0 0
0 0 1 2
Infinitely Many Solutions: In the case where the system of equations has infinitely many solutions,
the solution contains parameters. There will be columns of the coefficient matrix which are not
pivot columns. The following are examples of augmented matrices in reduced row-echelon form for
systems of equations with infinitely many solutions.
1 0 0 5
0 1 2 3
0 0 0 0
or
1 0 0 5
0 1 0 3
As we have seen in earlier sections, we know that every matrix can be brought into reduced row-echelon
form by a sequence of elementary row operations. Here we will prove that the resulting matrix is unique;
in other words, the resulting matrix in reduced row-echelon form does not depend upon the particular
sequence of elementary row operations or the order in which they were performed.
Let A be the augmented matrix of a homogeneous system of linear equations in the variables x1 , x2 , , xn
which is also in reduced row-echelon form. The matrix A divides the set of variables into two different types.
We say that xi is a basic variable whenever A has a leading 1 in column number i, in other words, when
column i is a pivot column. Otherwise we say that xi is a free variable.
Recall Example 1.23.
x + 2y - z + w = 3
x + y - z + w = 1
x + 3y - z + w = 5
Solution. Recall from the solution of Example 1.23 that the row-echelon form of the augmented matrix of
this system is given by
1 2 -1 1 3
0 1 0 0 2
0 0 0 0 0
You can see that columns 1 and 2 are pivot columns. These columns correspond to variables x and y,
making these the basic variables. Columns 3 and 4 are not pivot columns, which means that z and w are
free variables.
We can write the solution to this system as
x = -1 + s - t
y=2
z=s
w=t
Here the free variables are written as parameters, and the basic variables are given by linear functions
of these parameters.
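The classification of variables can be read off mechanically from the pivot columns. A sketch using
sympy (one possible tool):

# Pivot columns of the rref give the basic variables; the rest are free.
from sympy import Matrix

augmented = Matrix([[1, 2, -1, 1, 3],
                    [1, 1, -1, 1, 1],
                    [1, 3, -1, 1, 5]])
_, pivots = augmented.rref()
names = ["x", "y", "z", "w"]
basic = [names[j] for j in pivots if j < len(names)]
free = [v for v in names if v not in basic]
print(basic, free)  # ['x', 'y'] ['z', 'w']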
In general, all solutions can be written in terms of the free variables. In such a description, the free
variables can take any values (they become parameters), while the basic variables become simple linear
functions of these parameters. Indeed, a basic variable xi is a linear function of only those free variables
xj with j > i. This leads to the following observation.
Using this proposition, we prove a lemma which will be used in the proof of the main result of this
section below.
Proof. With respect to the linear systems associated with the matrices A and B, there are two cases to
consider:
Case 1: the two systems have the same basic variables
Case 2: the two systems do not have the same basic variables
In case 1, the two matrices will have exactly the same pivot positions. However, since A and B are not
identical, there is some row of A which is different from the corresponding row of B and yet the rows each
have a pivot in the same column position. Let i be the index of this column position. Since the matrices are
in reduced row-echelon form, the two rows must differ at some entry in a column j > i. Let these entries
be a in A and b in B, where a ≠ b. Since A is in reduced row-echelon form, if xj were a basic variable
for its linear system, we would have a = 0. Similarly, if xj were a basic variable for the linear system of
the matrix B, we would have b = 0. Since a and b are unequal, they cannot both be equal to 0, and hence
xj cannot be a basic variable for both linear systems. However, since the systems have the same basic
variables, xj must then be a free variable for each system. We now look at the solutions of the systems in
which xj is set equal to 1 and all other free variables are set equal to 0. For this choice of parameters, the
solution of the system for matrix A has xj = a, while the solution of the system for matrix B has xj = b,
so that the two systems have different solutions.
In case 2, there is a variable xi which is a basic variable for one matrix, let's say A, and a free variable
for the other matrix B. The system for matrix B has a solution in which xi = 1 and xj = 0 for all other free
variables xj. However, by Proposition 1.25 this cannot be a solution of the system for the matrix A. This
completes the proof of case 2.
Now, we say that the matrix B is equivalent to the matrix A provided that B can be obtained from A
by performing a sequence of elementary row operations beginning with A. The importance of this concept
lies in the following result.
Proof. Let A be an m × n matrix and let B and C be matrices in reduced row-echelon form, each equivalent
to A. It suffices to show that B = C.
Let A+ be the matrix A augmented with a new rightmost column consisting entirely of zeros. Similarly,
augment matrices B and C each with a rightmost column of zeros to obtain B+ and C+. Note that B+ and
C+ are matrices in reduced row-echelon form which are obtained from A+ by respectively applying the
same sequence of elementary row operations which were used to obtain B and C from A.
Now, A+, B+, and C+ can all be considered as augmented matrices of homogeneous linear systems
in the variables x1, x2, ..., xn. Because B+ and C+ are each equivalent to A+, Theorem 1.27 ensures that
all three homogeneous linear systems have exactly the same solutions. By Lemma 1.26 we conclude that
B+ = C+ . By construction, we must also have B = C.
According to this theorem we can say that each matrix A has a unique reduced row-echelon form.
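This uniqueness is easy to observe in practice: applying a different sequence of elementary row
operations first still lands on the same reduced row-echelon form. A small sketch with sympy, on an
illustrative matrix that is not from the text:

# Two different starting sequences of row operations, one final rref.
from sympy import Matrix

A = Matrix([[1, 2, 3],
            [4, 5, 6]])
B = A.copy()
B[0, :] = 3 * B[0, :]            # scale a row first...
B[1, :] = B[1, :] + 2 * B[0, :]  # ...then add a multiple of it
assert A.rref() == B.rref()      # same rref and same pivot columns
print(A.rref()[0])  # Matrix([[1, 0, -1], [0, 1, 2]])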
There is a special type of system which requires additional study. This type of system is called a homo-
geneous system of equations, which we defined above in Definition 1.3. Our focus in this section is to
consider what types of solutions are possible for a homogeneous system of equations.
Consider the following definition.
Then, x1 = 0, x2 = 0, ..., xn = 0 is always a solution to this system. We call this the trivial solution.

If the system has a solution in which not all of the x1, ..., xn are equal to zero, then we call this solution
nontrivial. The trivial solution does not tell us much about the system, as it says that 0 = 0! Therefore,
when working with homogeneous systems of equations, we want to know when the system has a nontrivial
solution.
Suppose we have a homogeneous system of m equations, using n variables, and suppose that n > m.
In other words, there are more variables than equations. Then, it turns out that this system always has
a nontrivial solution. Not only will the system have a nontrivial solution, but it also will have infinitely
many solutions. It is also possible, but not required, to have a nontrivial solution if n = m or n < m.
Consider the following example.
2x + y - z = 0
x + 2y - 2z = 0
Solution. Notice that this system has m = 2 equations and n = 3 variables, so n > m. Therefore by our
previous discussion, we expect this system to have infinitely many solutions.
The process we use to find the solutions for a homogeneous system of equations is the same process
we used in the previous section. First, we construct the augmented matrix, given by
2 1 -1 0
1 2 -2 0
Then, we carry this matrix to its reduced row-echelon form, given below.
1 0 0 0
0 1 -1 0

The corresponding system is x = 0 and y - z = 0. Letting z = t, the solution is

x = 0
y = t
z = t
Hence this system has infinitely many solutions, with one parameter t.
Suppose we were to write the solution to the previous example in another form. Specifically,

x = 0
y = 0 + t
z = 0 + t

can be written as

x     0       0
y  =  0  + t  1
z     0       1
Notice that we have constructed a column from the constants in the solution (all equal to 0), as well as a
column corresponding to the coefficients on t in each equation. While we will discuss this form of solution
more in further chapters,
for now consider the column of coefficients of the parameter t. In this case, this is the column

0
1
1

There is a special name for this column: it is called a basic solution. The basic solutions of a system are
columns constructed from the coefficients on parameters in the solution. We often denote basic solutions
by X1, X2, etc., depending on how many solutions occur. Therefore, Example 1.30 has the basic solution

X1 =
0
1
1
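Computer algebra systems produce basic solutions directly. For instance, sympy's nullspace method
returns one column per parameter; a sketch for the system of Example 1.30:

# nullspace() yields the basic solutions of the homogeneous system.
from sympy import Matrix

coefficients = Matrix([[2, 1, -1],
                       [1, 2, -2]])
for X in coefficients.nullspace():
    print(X.T)  # Matrix([[0, 1, 1]])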
We explore this further in the following example.
x + 4y + 3z = 0
3x + 12y + 9z = 0
Solution. The augmented matrix of this system and the resulting reduced row-echelon form are
1 4 3 0        →        1 4 3 0
3 12 9 0                0 0 0 0

When written as equations, the reduced row-echelon form says

x + 4y + 3z = 0
Notice that only x corresponds to a pivot column. In this case, we will have two parameters, one for y and
one for z. Let y = s and z = t for any numbers s and t. Then, our solution becomes
x = -4s - 3t
y = s
z = t
We now present a new definition: a column V is said to be a linear combination of the columns
X1, ..., Xn if there exist scalars a1, ..., an such that

V = a1 X1 + ... + an Xn
A remarkable result of this section is that a linear combination of the basic solutions is again a solution
to the system. Even more remarkable is that every solution can be written as a linear combination of these
solutions. Therefore, if we take a linear combination of the two solutions to Example 1.31, this would also
be a solution. For example, we could take the following linear combination
      -4        -3        -18
3      1   + 2   0    =     3
       0         1          2
A row-echelon form of a matrix A always has the same number r of leading entries, no matter which
sequence of row operations produced it, and this number r is called the rank of A. Equivalently, we could
count the number of pivot positions (or pivot columns) to determine the rank of A.
Solution. First, we need to find the reduced row-echelon form of A. Through the usual algorithm, we find
that this is
1 0 1
0 1 2
0 0 0
Here we have two leading entries, or two pivot positions, shown above in boxes. The rank of A is r = 2.
Notice that we would have achieved the same answer if we had found the row-echelon form of A
instead of the reduced row-echelon form.
Suppose we have a homogeneous system of m equations in n variables, and suppose that n > m. From
our above discussion, we know that this system will have infinitely many solutions. If we consider the
rank of the coefficient matrix of this system, we can find out even more about the solution. Note that we
are looking at just the coefficient matrix, not the entire augmented matrix.
Consider our above Example 1.31 in the context of this theorem. The system in this example has m = 2
equations in n = 3 variables. First, because n > m, we know that the system has a nontrivial solution, and
therefore infinitely many solutions. This tells us that the solution will contain at least one parameter. The
rank of the coefficient matrix can tell us even more about the solution! The rank of the coefficient matrix
of the system is 1, as it has one leading entry in row-echelon form. Theorem 1.35 tells us that the solution
will have n - r = 3 - 1 = 2 parameters. You can check that this is true in the solution to Example 1.31.
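The rank, and with it the number of parameters, can be computed directly. A sketch for the coefficient
matrix of Example 1.31, using sympy:

# rank() counts pivot columns; n - r is the number of parameters.
from sympy import Matrix

A = Matrix([[1, 4, 3],
            [3, 12, 9]])
r, n = A.rank(), A.cols
print(r, n - r)  # 1 2: one leading entry, two parameters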
Notice that if n = m or n < m, it is possible to have either a unique solution (which will be the trivial
solution) or infinitely many solutions.
We are not limited to homogeneous systems of equations here. The rank of a matrix can be used to
learn about the solutions of any system of linear equations. In the previous section, we discussed that a
system of equations can have no solution, a unique solution, or infinitely many solutions. Suppose the
system is consistent, whether it is homogeneous or not. The following theorem tells us how we can use
the rank to learn about the type of solution we have.
We will not present a formal proof of this, but consider the following discussions.
1. No Solution The above theorem assumes that the system is consistent, that is, that it has a solution.
It turns out that it is possible for the augmented matrix of a system with no solution to have any
rank r as long as r > 1. Therefore, we must know that the system is consistent in order to use this
theorem!
2. Unique Solution Suppose r = n. Then, there is a pivot position in every column of the coefficient
matrix of A. Hence, there is a unique solution.
3. Infinitely Many Solutions Suppose r < n. Then there are infinitely many solutions. There are fewer
pivot positions (and hence fewer leading entries) than columns, meaning that not every column is a
pivot column. The columns which are not pivot columns correspond to parameters. In fact, in this
case we have n - r parameters.
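These three cases can be distinguished mechanically. The sketch below first compares the rank of the
coefficient matrix with the rank of the augmented matrix to detect inconsistency (a test not stated in
this section, but it follows from the row of the form 0 0 ... 0 | 1 discussed above), and then compares
the rank with n:

# Classify a system from its augmented matrix using ranks.
from sympy import Matrix

def classify(augmented):
    coefficient = augmented[:, :-1]
    n = coefficient.cols
    if coefficient.rank() < augmented.rank():
        return "no solution"
    if coefficient.rank() == n:
        return "unique solution"          # every column is a pivot column
    return "infinitely many solutions"    # n - r parameters

print(classify(Matrix([[1, 3, 6, 25],
                       [2, 7, 14, 58],
                       [0, 2, 5, 19]])))  # unique solution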
Exercises
Exercise 1.2.4 Find the point (x1, y1) which lies on both lines, x + 3y = 1 and 4x - y = 3.
Exercise 1.2.5 Find the point of intersection of the two lines 3x + y = 3 and x + 2y = 1.
Exercise 1.2.8 Four times the weight of Gaston is 150 pounds more than the weight of Ichabod. Four
times the weight of Ichabod is 660 pounds less than seventeen times the weight of Gaston. Four times the
weight of Gaston plus the weight of Siegfried equals 290 pounds. Brunhilde would balance all three of the
others. Find the weights of the four people.
Exercise 1.2.9 Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?
0 0
0 0
0 0 0 0
Exercise 1.2.10 Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?
0
0 0
Exercise 1.2.11 Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?
0 0 0
0 0 0
0 0 0 0
Exercise 1.2.12 Consider the following augmented matrix in which ∗ denotes an arbitrary number and ■
denotes a nonzero number. Determine whether the given augmented matrix is consistent. If consistent, is
the solution unique?
Exercise 1.2.13 Suppose a system of equations has fewer equations than variables. Will such a system
necessarily be consistent? If so, explain why and if not, give an example which is not consistent.
Exercise 1.2.14 If a system of equations has more equations than variables, can it have a solution? If so,
give an example and if not, tell why not.
Exercise 1.2.18 Choose h and k such that the augmented matrix shown has each of the following:
(b) no solution
Exercise 1.2.19 Choose h and k such that the augmented matrix shown has each of the following:
(b) no solution
1 2 2
2 h k
Exercise 1.2.20 Determine if the system is consistent. If so, is the solution unique?
x + 2y + z - w = 2
x - y + z + w = 1
2x + y - z = 1
4x + 2y + z = 5
Exercise 1.2.21 Determine if the system is consistent. If so, is the solution unique?
\[
\begin{array}{c} x + 2y + z − w = 2 \\ x − y + z + w = 0 \\ 2x + y − z = 1 \\ 4x + 2y + z = 3 \end{array}
\]
Exercise 1.2.23 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.
\[
\begin{bmatrix} 2 & 1 & 3 & 1 \\ 1 & 0 & 2 & 1 \\ 1 & 1 & 1 & 2 \end{bmatrix}
\]
Exercise 1.2.24 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.
\[
\begin{bmatrix} 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}
\]
Exercise 1.2.25 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.
\[
\begin{bmatrix} 3 & 6 & 7 & 8 \\ 1 & 2 & 2 & 2 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
Exercise 1.2.26 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.
\[
\begin{bmatrix} 2 & 4 & 5 & 15 \\ 1 & 2 & 3 & 9 \\ 1 & 2 & 2 & 6 \end{bmatrix}
\]
Exercise 1.2.27 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.
\[
\begin{bmatrix} 4 & 1 & 7 & 10 \\ 1 & 0 & 3 & 3 \\ 1 & 1 & 2 & 1 \end{bmatrix}
\]
Exercise 1.2.28 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.
\[
\begin{bmatrix} 3 & 5 & 4 & 2 \\ 1 & 2 & 1 & 1 \\ 1 & 1 & 2 & 0 \end{bmatrix}
\]
Exercise 1.2.29 Row reduce the following matrix to obtain the row-echelon form. Then continue to obtain
the reduced row-echelon form.
\[
\begin{bmatrix} 2 & 3 & 8 & 7 \\ 1 & 2 & 5 & 5 \\ 1 & 3 & 7 & 8 \end{bmatrix}
\]
Exercise 1.2.30 Find the solution of the system whose augmented matrix is
\[
\left[ \begin{array}{ccc|c} 1 & 2 & 0 & 2 \\ 1 & 3 & 4 & 2 \\ 1 & 0 & 2 & 1 \end{array} \right]
\]
Exercise 1.2.31 Find the solution of the system whose augmented matrix is
\[
\left[ \begin{array}{ccc|c} 1 & 2 & 0 & 2 \\ 2 & 0 & 1 & 1 \\ 3 & 2 & 1 & 3 \end{array} \right]
\]
Exercise 1.2.32 Find the solution of the system whose augmented matrix is
\[
\left[ \begin{array}{ccc|c} 1 & 1 & 0 & 1 \\ 1 & 0 & 4 & 2 \end{array} \right]
\]
Exercise 1.2.33 Find the solution of the system whose augmented matrix is
\[
\left[ \begin{array}{ccccc|c} 1 & 0 & 2 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 & 1 \\ 1 & 2 & 0 & 0 & 1 & 3 \\ 1 & 0 & 1 & 0 & 2 & 2 \end{array} \right]
\]
Exercise 1.2.34 Find the solution of the system whose augmented matrix is
\[
\left[ \begin{array}{ccccc|c} 1 & 0 & 2 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 & 2 & 1 \\ 0 & 2 & 0 & 0 & 1 & 3 \\ 1 & 1 & 2 & 2 & 2 & 0 \end{array} \right]
\]
Exercise 1.2.35 Find the solution to the system of equations, 7x + 14y + 15z = 22, 2x + 4y + 3z = 5, and
3x + 6y + 10z = 13.
Exercise 1.2.37 Find the solution to the system of equations, 9x − 2y + 4z = −17, 13x − 3y + 6z = −25,
and −2x − z = 3.
Exercise 1.2.38 Find the solution to the system of equations, 65x + 84y + 16z = 546, 81x + 105y + 20z =
682, and 84x + 110y + 21z = 713.
Exercise 1.2.40 Find the solution to the system of equations, 8x + 2y + 5z = 18, 8x + 3y + 5z = 13,
and 4x + y + 5z = 19.
Exercise 1.2.42 Find the solution to the system of equations, 9x + 15y = 66, 11x + 18y = 79, x + y =
4, and z = 3.
Exercise 1.2.43 Find the solution to the system of equations, 19x + 8y = 108, 71x + 30y = 404,
2x + y = 12, 4x + z = 14.
Exercise 1.2.44 Suppose a system of equations has fewer equations than variables and you have found a
solution to this system of equations. Is it possible that your solution is the only one? Explain.
Exercise 1.2.45 Suppose a system of linear equations has a 2 × 4 augmented matrix and the last column
is a pivot column. Could the system of linear equations be consistent? Explain.
Exercise 1.2.46 Suppose the coefficient matrix of a system of n equations with n variables has the property
that every column is a pivot column. Does it follow that the system of equations must have a solution? If
so, must the solution be unique? Explain.
Exercise 1.2.47 Suppose there is a unique solution to a system of linear equations. What must be true of
the pivot columns in the augmented matrix?
Exercise 1.2.48 The steady state temperature, u, of a plate solves Laplace's equation, Δu = 0. One way
to approximate the solution is to divide the plate into a square mesh and require the temperature at each
node to equal the average of the temperature at the four adjacent nodes. In the following picture, the
numbers represent the observed temperature at the indicated nodes. Find the temperature at the interior
nodes, indicated by x, y, z, and w. One of the equations is z = \frac{1}{4}(10 + 0 + w + x).
30 30
20 y w 0
20 x z 0
10 10
Exercise 1.2.60 Suppose A is an m × n matrix. Explain why the rank of A is always no larger than
min(m, n).
Exercise 1.2.61 State whether each of the following sets of data are possible for the matrix equation
AX = B. If possible, describe the solution set. That is, tell whether there exists a unique solution, no
solution or infinitely many solutions. Here, [A|B] denotes the augmented matrix.
Exercise 1.2.62 Consider the system 5x + 2y − z = 0 and 5x − 2y − z = 0. Both equations equal zero
and so 5x + 2y − z = 5x − 2y − z, which is equivalent to y = 0. Does it follow that x and z can equal
anything? Notice that when x = 1, z = 4, and y = 0 are plugged in to the equations, the equations do
not equal 0. Why?
2. Matrices
Outcomes
A. Perform the matrix operations of matrix addition, scalar multiplication, transposition and ma-
trix multiplication. Identify when these operations are not defined. Represent these operations
in terms of the entries of a matrix.
B. Prove algebraic properties for matrix addition, scalar multiplication, transposition, and ma-
trix multiplication. Apply these properties to manipulate an algebraic expression involving
matrices.
C. Compute the inverse of a matrix using row operations, and prove identities involving matrix
inverses.
You have now solved systems of equations by writing them in terms of an augmented matrix and
then doing row operations on this augmented matrix. It turns out that matrices are important not only for
systems of equations but also in many applications.
Recall that a matrix is a rectangular array of numbers. Several of them are referred to as matrices.
For example, here is a matrix.
\[
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{bmatrix} \tag{2.1}
\]
Recall that the size or dimension of a matrix is defined as m × n where m is the number of rows and n is
the number of columns. The above matrix is a 3 × 4 matrix because there are three rows and four columns.
You can remember the columns are like columns in a Greek temple. They stand upright while the rows
lay flat like rows made by a tractor in a plowed field.
When specifying the size of a matrix, you always list the number of rows before the number of
columns. You might remember that you always list the rows before the columns by using the phrase
Rowman Catholic.
There is some notation specific to matrices which we now introduce. We denote the columns of a
matrix A by A_j as follows
\[
A = \begin{bmatrix} A_1 & A_2 & \cdots & A_n \end{bmatrix}
\]
Therefore, A_j is the jth column of A, when counted from left to right.
The individual elements of the matrix are called entries or components of A. Elements of the matrix
are identified according to their position. The (i, j)-entry of a matrix is the entry in the ith row and jth
column. For example, in the matrix 2.1 above, 8 is in position (2, 3) (and is called the (2, 3)-entry) because
it is in the second row and the third column.
In order to remember which matrix we are speaking of, we will denote the entry in the ith row and
the jth column of matrix A by a_{ij}. Then, we can write A in terms of its entries, as A = [a_{ij}]. Using this
notation on the matrix in 2.1, a_{23} = 8, a_{32} = −9, a_{12} = 2, etc.
There are various operations which are done on matrices of appropriate sizes. Matrices can be added
to and subtracted from other matrices, multiplied by a scalar, and multiplied by other matrices. We will
never divide a matrix by another matrix, but we will see later how matrix inverses play a similar role.
In doing arithmetic with matrices, we often define the action by what happens in terms of the entries
(or components) of the matrices. Before looking at these operations in depth, consider a few general
definitions.
Note there is a 2 × 3 zero matrix, a 3 × 4 zero matrix, etc. In fact there is a zero matrix for every size!
In other words, two matrices are equal exactly when they are the same size and the corresponding
entries are identical. Thus
\[
\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \neq \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
because they are different sizes. Also,
\[
\begin{bmatrix} 0 & 1 \\ 3 & 2 \end{bmatrix} \neq \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix}
\]
because, although they are the same size, their corresponding entries are not identical.
In the following section, we explore addition of matrices.
When adding matrices, all matrices in the sum need to have the same size. For example,
\[
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 2 \end{bmatrix}
\]
and
\[
\begin{bmatrix} 1 & 4 & 8 \\ 2 & 8 & 5 \end{bmatrix}
\]
cannot be added, as one has size 3 × 2 while the other has size 2 × 3.
However, the addition
\[
\begin{bmatrix} 4 & 6 & 3 \\ 5 & 0 & 4 \\ 11 & 2 & 3 \end{bmatrix} + \begin{bmatrix} 0 & 5 & 0 \\ 4 & 4 & 14 \\ 1 & 2 & 6 \end{bmatrix}
\]
is possible.
The formal definition is as follows.
This definition tells us that when adding matrices, we simply add corresponding entries of the matrices.
This is demonstrated in the next example.
Solution. Notice that both A and B are of size 2 × 3. Since A and B are of the same size, the addition is
possible. Using Definition 2.5, the addition is done as follows.
\[
A + B = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 0 & 4 \end{bmatrix} + \begin{bmatrix} 5 & 2 & 3 \\ -6 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 1+5 & 2+2 & 3+3 \\ 1+(-6) & 0+2 & 4+1 \end{bmatrix} = \begin{bmatrix} 6 & 4 & 6 \\ -5 & 2 & 5 \end{bmatrix}
\]
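As an illustrative aside (not part of the original text), entrywise addition is exactly what NumPy's `+` does for arrays; the entries below are those of A and B as reconstructed in the example above:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [1, 0, 4]])
B = np.array([[ 5, 2, 3],
              [-6, 2, 1]])

# Matrix addition is entrywise; both matrices must have the same size.
print(A + B)    # [[ 6  4  6]
                #  [-5  2  5]]
```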
Addition of matrices obeys very much the same properties as normal addition with numbers. Note that
when we write for example A + B then we assume that both matrices are of equal size so that the operation
is indeed possible.
\[
(A + B) + C = A + (B + C) \tag{2.3}
\]
Proof. Consider the Commutative Law of Addition given in 2.2. Let A, B, C, and D be matrices such that
A + B = C and B + A = D. We want to show that D = C. To do so, we will use the definition of matrix
addition given in Definition 2.5. Now,
\[
c_{ij} = a_{ij} + b_{ij} = b_{ij} + a_{ij} = d_{ij}
\]
Therefore, C = D because the ijth entries are the same for all i and j. Note that the conclusion follows
from the commutative law of addition of numbers, which says that if a and b are two numbers, then
a + b = b + a. The proofs of the other results are similar, and are left as an exercise.
We call the zero matrix in 2.4 the additive identity. Similarly, we call the matrix −A in 2.5 the
additive inverse. −A is defined to equal (−1)A = [−a_{ij}]. In other words, every entry of A is multiplied
by −1. In the next section we will study scalar multiplication in more depth to understand what is meant
by (−1)A.
Recall that we use the word scalar when referring to numbers. Therefore, scalar multiplication of a matrix
is the multiplication of a matrix by a number. To illustrate this concept, consider the following example in
which a matrix is multiplied by the scalar 3.
\[
3 \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 2 & 8 & 7 \\ 6 & -9 & 1 & 2 \end{bmatrix} = \begin{bmatrix} 3 & 6 & 9 & 12 \\ 15 & 6 & 24 & 21 \\ 18 & -27 & 3 & 6 \end{bmatrix}
\]
The new matrix is obtained by multiplying every entry of the original matrix by the given scalar.
The formal definition of scalar multiplication is as follows.
Similarly to addition of matrices, there are several properties of scalar multiplication which hold.
\[
k (A + B) = kA + kB
\]
\[
(k + p) A = kA + pA
\]
\[
k (pA) = (kp) A
\]
The proof of this proposition is similar to the proof of Proposition 2.7 and is left as an exercise to the
reader.
The next important matrix operation we will explore is multiplication of matrices. The operation of matrix
multiplication is one of the most important and useful of the matrix operations. Throughout this section,
we will also demonstrate how matrix multiplication relates to linear systems of equations.
First, we provide a formal definition of row and column vectors.
We may simply use the term vector throughout this text to refer to either a column or row vector. If
we do so, the context will make it clear which we are referring to.
In this chapter, we will again use the notion of linear combination of vectors as in Definition 4.7. In
this context, a linear combination is a sum consisting of vectors multiplied by scalars. For example,
\[
\begin{bmatrix} 50 \\ 122 \end{bmatrix} = 7 \begin{bmatrix} 1 \\ 4 \end{bmatrix} + 8 \begin{bmatrix} 2 \\ 5 \end{bmatrix} + 9 \begin{bmatrix} 3 \\ 6 \end{bmatrix}
\]
is a linear combination of three vectors.
It turns out that we can express any system of linear equations as a linear combination of vectors. In
fact, the vectors that we will use are just the columns of the corresponding augmented matrix! The system
\[
\begin{array}{c} a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n = b_m \end{array}
\]
can be written as
\[
x_1 \begin{bmatrix} a_{11} \\ \vdots \\ a_{m1} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{bmatrix} = \begin{bmatrix} b_1 \\ \vdots \\ b_m \end{bmatrix}
\]
Notice that each vector used here is one column from the corresponding augmented matrix. There is
one vector for each variable in the system, along with the constant vector.
The first important form of matrix multiplication is multiplying a matrix by a vector. Consider the
product given by
\[
\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix}
\]
We will soon see that this equals
\[
7 \begin{bmatrix} 1 \\ 4 \end{bmatrix} + 8 \begin{bmatrix} 2 \\ 5 \end{bmatrix} + 9 \begin{bmatrix} 3 \\ 6 \end{bmatrix} = \begin{bmatrix} 50 \\ 122 \end{bmatrix}
\]
In general terms,
\[
\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = x_1 \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} + x_3 \begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \end{bmatrix}
\]
Thus you take x_1 times the first column, add to x_2 times the second column, and finally x_3 times the third
column. The above sum is a linear combination of the columns of the matrix. When you multiply a matrix
on the left by a vector on the right, the numbers making up the vector are just the scalars to be used in the
linear combination of the columns as illustrated above.
Here is the formal definition of how to multiply an m × n matrix by an n × 1 column vector.
Then the product AX is the m × 1 column vector which equals the following linear combination of
the columns of A:
\[
x_1 A_1 + x_2 A_2 + \cdots + x_n A_n = \sum_{j=1}^{n} x_j A_j
\]
If we write the columns of A in terms of their entries, they are of the form
\[
A_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{bmatrix}
\]
Solution. We will use Definition 2.13 to compute the product. Therefore, we compute the product AX as
follows.
\[
1 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + 2 \begin{bmatrix} 2 \\ 2 \\ 1 \end{bmatrix} + 0 \begin{bmatrix} 1 \\ 1 \\ 4 \end{bmatrix} + 1 \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + \begin{bmatrix} 4 \\ 4 \\ 2 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 3 \\ -2 \\ 1 \end{bmatrix} = \begin{bmatrix} 8 \\ 2 \\ 5 \end{bmatrix}
\]
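The following sketch (Python with NumPy; not part of the original text) checks numerically that the product AX equals the linear combination of the columns of A, using A and X as reconstructed in the example above:

```python
import numpy as np

A = np.array([[1, 2, 1,  3],
              [0, 2, 1, -2],
              [2, 1, 4,  1]])
X = np.array([1, 2, 0, 1])

# Direct product, and the same result built as x_1 A_1 + ... + x_n A_n.
direct = A @ X
combo = sum(X[j] * A[:, j] for j in range(A.shape[1]))

print(direct)                          # [8 2 5]
print(np.array_equal(direct, combo))   # True
```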
Using the above operation, we can also write a system of linear equations in matrix form. In this
form, we express the system as a matrix multiplied by a vector. Consider the following definition.
\[
\begin{array}{c} a_{11}x_1 + \cdots + a_{1n}x_n = b_1 \\ a_{21}x_1 + \cdots + a_{2n}x_n = b_2 \\ \vdots \\ a_{m1}x_1 + \cdots + a_{mn}x_n = b_m \end{array}
\]
The expression AX = B is also known as the Matrix Form of the corresponding system of linear
equations. The matrix A is simply the coefficient matrix of the system, the vector X is the column vector
constructed from the variables of the system, and finally the vector B is the column vector constructed
from the constants of the system. It is important to note that any system of linear equations can be written
in this form.
Notice that if we write a homogeneous system of equations in matrix form, it would have the form
AX = 0, for the zero vector 0.
You can see from this definition that a vector
\[
X = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\]
will satisfy the equation AX = B only when the entries x_1, x_2, \ldots, x_n of the vector X are solutions to the
original system.
Now that we have examined how to multiply a matrix by a vector, we wish to consider the case where
we multiply two matrices of more general sizes, although these sizes still need to be appropriate as we will
see. For example, in Example 2.14, we multiplied a 3 × 4 matrix by a 4 × 1 vector. We want to investigate
how to multiply other sizes of matrices.
We have not yet given any conditions on when matrix multiplication is possible! For matrices A and
B, in order to form the product AB, the number of columns of A must equal the number of rows of B.
Consider a product AB where A has size m × n and B has size n × p. Then, the product in terms of size of
matrices is given by
\[
(m \times \overbrace{n)\,(n}^{\text{these must match!}} \times p) = m \times p
\]
Note the two outside numbers give the size of the product. One of the most important rules regarding
matrix multiplication is the following. If the two middle numbers don't match, you can't multiply the
matrices!
When the number of columns of A equals the number of rows of B the two matrices are said to be
conformable and the product AB is obtained as follows. Write
\[
B = \begin{bmatrix} B_1 & \cdots & B_p \end{bmatrix}
\]
where B_1, \ldots, B_p are the n × 1 columns of B. Then the m × p matrix AB is defined as follows:
\[
AB = A \begin{bmatrix} B_1 & \cdots & B_p \end{bmatrix} = \begin{bmatrix} AB_1 & \cdots & AB_p \end{bmatrix}
\]
where (AB)_k = AB_k is an m × 1 matrix or column vector which gives the kth column of AB.
Solution. The first thing you need to verify when calculating a product is whether the multiplication is
possible. The first matrix has size 2 × 3 and the second matrix has size 3 × 3. The inside numbers are
equal, so A and B are conformable matrices. According to the above discussion AB will be a 2 × 3 matrix.
You know how to multiply a matrix times a vector, using Definition 2.13 for each of the three columns.
Thus
\[
\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ -2 & 1 & 1 \end{bmatrix} = \begin{bmatrix} -1 & 9 & 3 \\ -2 & 7 & 3 \end{bmatrix}
\]
Since vectors are simply n × 1 or 1 × m matrices, we can also multiply a vector by another vector.
Solution. In this case we are multiplying a matrix of size 3 × 1 by a matrix of size 1 × 4. The inside
numbers match so the product is defined. Note that the product will be a matrix of size 3 × 4. Using
Definition 2.16, we can compute this product column by column (first, second, third, and fourth columns):
\[
\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 & 0 \end{bmatrix} = \begin{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} 1 , \ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} 2 , \ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} 1 , \ \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} 0 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 1 & 0 \\ 2 & 4 & 2 & 0 \\ 1 & 2 & 1 & 0 \end{bmatrix}
\]
Solution. First check if it is possible. This product is of the form (3 × 3)(2 × 3). The inside numbers do
not match and so you can't do this multiplication.
In this case, we say that the multiplication is not defined. Notice that these are the same matrices which
we used in Example 2.17. In this example, we tried to calculate BA instead of AB. This demonstrates
another property of matrix multiplication. While the product AB may be defined, we cannot assume
that the product BA will be possible. Therefore, it is important to always check that the product is defined
before carrying out any calculations.
Earlier, we defined the zero matrix 0 to be the matrix (of appropriate size) containing zeros in all
entries. Consider the following example for multiplication by the zero matrix.
Hence, A0 = 0.
Notice that we could also multiply A by the 2 × 1 zero vector given by \begin{bmatrix} 0 \\ 0 \end{bmatrix}. The result would be the
2 × 1 zero vector. Therefore, it is always the case that A0 = 0, for an appropriately sized zero matrix or
vector.
In previous sections, we used the entries of a matrix to describe the action of matrix addition and scalar
multiplication. We can also study matrix multiplication using the entries of matrices.
What is the ijth entry of AB? It is the entry in the ith row and the jth column of the product AB.
Now if A is m × n and B is n × p, then we know that the product AB has the form
\[
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} & \cdots & b_{1j} & \cdots & b_{1p} \\ b_{21} & b_{22} & \cdots & b_{2j} & \cdots & b_{2p} \\ \vdots & \vdots & & \vdots & & \vdots \\ b_{n1} & b_{n2} & \cdots & b_{nj} & \cdots & b_{np} \end{bmatrix}
\]
The jth column of AB is the vector AB_j, and the ijth entry of AB is the entry in row i of this vector. This
is computed by
\[
a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik} b_{kj}
\]
The following is the formal definition for the ijth entry of a product of matrices.
In other words, to find the (i, j)-entry of the product AB, or (AB)_{ij}, you multiply the ith row of A, on
the left, by the jth column of B. To express AB in terms of its entries, we write AB = [(AB)_{ij}].
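The entry formula translates directly into code. Here is an illustrative Python sketch (not part of the original text; `mat_mult` is a name chosen here, and the matrices are those of the example that follows):

```python
def mat_mult(A, B):
    """Multiply matrices given as lists of lists, using the entry
    formula (AB)_ij = sum over k of A_ik * B_kj."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "columns of A must equal rows of B"
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

# A (3 x 2)(2 x 3) product.
A = [[1, 2], [3, 1], [2, 6]]
B = [[2, 3, 1], [7, 6, 2]]
print(mat_mult(A, B))   # [[16, 15, 5], [13, 15, 5], [46, 42, 14]]
```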
Consider the following example.
Solution. First check if the product is possible. It is of the form (3 × 2)(2 × 3) and since the inside
numbers match, it is possible to do the multiplication. The result should be a 3 × 3 matrix. We can first
compute AB:
\[
\begin{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 2 \\ 7 \end{bmatrix} , \ \begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ 6 \end{bmatrix} , \ \begin{bmatrix} 1 & 2 \\ 3 & 1 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \end{bmatrix} \end{bmatrix}
\]
where the commas separate the columns in the resulting product. Thus the above product equals
\[
\begin{bmatrix} 16 & 15 & 5 \\ 13 & 15 & 5 \\ 46 & 42 & 14 \end{bmatrix}
\]
Solution. This product is of the form (3 × 3)(3 × 2). The middle numbers match so the matrices are
conformable and it is possible to compute the product.
We want to find the (2, 1)-entry of AB, that is, the entry in the second row and first column of the
product. We will use Definition 2.21, which states
\[
(AB)_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}
\]
As pointed out above, it is sometimes possible to multiply matrices in one order but not in the other order.
However, even if both AB and BA are defined, they may not be equal.
Solution. First, notice that A and B are both of size 2 × 2. Therefore, both products AB and BA are defined.
The first product, AB, is
\[
AB = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix}
\]
The second product, BA, is
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}
\]
Therefore, AB ≠ BA.
This example illustrates that you cannot assume AB = BA even when multiplication is defined in both
orders. If for some matrices A and B it is true that AB = BA, then we say that A and B commute. This is
one important property of matrix multiplication.
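As a quick numerical illustration (Python with NumPy; not part of the original text), one can check with the matrices of the example above that the two products differ:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

print(A @ B)                            # [[2 1] [4 3]]
print(B @ A)                            # [[3 4] [1 2]]
print(np.array_equal(A @ B, B @ A))     # False: A and B do not commute
```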
The following are other important properties of matrix multiplication. Notice that these properties hold
only when the sizes of the matrices are such that the products are defined.
Proof. First we will prove 2.6. We will use Definition 2.21 and prove this statement using the ijth entries
of a matrix. Therefore,
\[
(A(rB + sC))_{ij} = \sum_k a_{ik}(rB + sC)_{kj} = \sum_k a_{ik}(r b_{kj} + s c_{kj}) = r \sum_k a_{ik} b_{kj} + s \sum_k a_{ik} c_{kj} = r(AB)_{ij} + s(AC)_{ij}
\]
so that A(rB + sC) = r(AB) + s(AC). For 2.8,
\[
((AB)C)_{ij} = \sum_l (AB)_{il} c_{lj} = \sum_l \Big( \sum_k a_{ik} b_{kl} \Big) c_{lj} = \sum_k a_{ik} \sum_l b_{kl} c_{lj} = \sum_k a_{ik} (BC)_{kj} = (A(BC))_{ij}.
\]
This proves 2.8.
Another important operation on matrices is that of taking the transpose. For a matrix A, we denote the
transpose of A by A^T. Before formally defining the transpose, we explore this operation on the following
matrix.
\[
\begin{bmatrix} 1 & 4 \\ 3 & 1 \\ 2 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 3 & 2 \\ 4 & 1 & 6 \end{bmatrix}
\]
What happened? The first column became the first row and the second column became the second row.
Thus the 3 × 2 matrix became a 2 × 3 matrix. The number 4 was in the first row and the second column
and it ended up in the second row and first column.
The definition of the transpose is as follows.
The definition of the transpose is as follows.
Solution. By Definition 2.26, we know that for A = [a_{ij}], A^T = [a_{ji}]. In other words, we switch the row
and column location of each entry. The (1, 2)-entry becomes the (2, 1)-entry.
Thus,
\[
A^T = \begin{bmatrix} 1 & 3 \\ 2 & 5 \\ 6 & 4 \end{bmatrix}
\]
2. (AB)^T = B^T A^T
Solution. By Definition 2.29, we need to show that A = A^T. Now, using Definition 2.26,
\[
A^T = \begin{bmatrix} 2 & 1 & 3 \\ 1 & 5 & 3 \\ 3 & 3 & 7 \end{bmatrix}
\]
Hence, A = A^T, so A is symmetric.
You can see that each entry of A^T is equal to −1 times the same entry of A. Hence, A^T = −A and so
by Definition 2.29, A is skew symmetric.
There is a special matrix, denoted I, which is referred to as the identity matrix. The identity matrix is
always a square matrix, and it has the property that there are ones down the main diagonal and zeroes
elsewhere.
The first is the 1 × 1 identity matrix, the second is the 2 × 2 identity matrix, and so on. By extension, you
can likely see what the n × n identity matrix would be. When it is necessary to distinguish which size of
identity matrix is being discussed, we will use the notation I_n for the n × n identity matrix.
The identity matrix is so important that there is a special symbol to denote the ijth entry of the identity
matrix. This symbol is given by I_{ij} = δ_{ij} where δ_{ij} is the Kronecker symbol defined by
\[
\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases}
\]
I_n is called the identity matrix because it is a multiplicative identity in the following sense:
\[
\sum_k a_{ik} \delta_{kj} = a_{ij}
\]
\[
A A^{-1} = A^{-1} A = I_n
\]
Such a matrix A^{-1} will have the same size as the matrix A. It is very important to observe that the
inverse of a matrix, if it exists, is unique. Another way to think of this is that if it acts like the inverse, then
it is the inverse.
Proof. In this proof, it is assumed that I is the n × n identity matrix. Let A, B be n × n matrices such that
A^{-1} exists and AB = BA = I. We want to show that A^{-1} = B. Now using properties we have seen, we get:
\[
A^{-1} = A^{-1} I = A^{-1} (AB) = (A^{-1} A) B = IB = B
\]
and
\[
\begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I
\]
showing that this matrix is indeed the inverse of A.
Unlike ordinary multiplication of numbers, it can happen that A ≠ 0 but A may fail to have an inverse.
This is illustrated in the following example.
Solution. One might think A would have an inverse because it does not equal zero. However, note that
\[
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
If A^{-1} existed, we could multiply both sides on the left by A^{-1} to obtain
\[
\begin{bmatrix} -1 \\ 1 \end{bmatrix} = A^{-1} \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
This says that
\[
\begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}
\]
which is impossible! Therefore, A does not have an inverse.
In the next section, we will explore how to find the inverse of a matrix, if it exists.
In Example 2.35, we were given A1 and asked to verify that this matrix was in fact the inverse of A. In
this section, we explore how to find A1 .
Let
\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\]
as in Example 2.35. In order to find A^{-1}, we need to find a matrix \begin{bmatrix} x & z \\ y & w \end{bmatrix} such that
\[
\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x & z \\ y & w \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
We can multiply these two matrices, and see that in order for this equation to be true, we must find the
solution to the systems of equations,
x+y = 1
x + 2y = 0
and
z+w = 0
z + 2w = 1
Writing the augmented matrix for these two systems gives
\[
\left[ \begin{array}{cc|c} 1 & 1 & 1 \\ 1 & 2 & 0 \end{array} \right]
\]
for the first system and
\[
\left[ \begin{array}{cc|c} 1 & 1 & 0 \\ 1 & 2 & 1 \end{array} \right] \tag{2.9}
\]
for the second.
Let's solve the first system. Take −1 times the first row and add to the second to get
\[
\left[ \begin{array}{cc|c} 1 & 1 & 1 \\ 0 & 1 & -1 \end{array} \right]
\]
Now take −1 times the second row and add to the first to get
\[
\left[ \begin{array}{cc|c} 1 & 0 & 2 \\ 0 & 1 & -1 \end{array} \right]
\]
This says x = 2 and y = −1, so the first column of the inverse, \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \end{bmatrix}, was obtained by solving the first system. The second column, \begin{bmatrix} z \\ w \end{bmatrix}, is obtained by solving the second system in the same way.
To simplify this procedure, we could have solved both systems at once! To do so, we could have
written
\[
\left[ \begin{array}{cc|cc} 1 & 1 & 1 & 0 \\ 1 & 2 & 0 & 1 \end{array} \right]
\]
and row reduced until we obtained
\[
\left[ \begin{array}{cc|cc} 1 & 0 & 2 & -1 \\ 0 & 1 & -1 & 1 \end{array} \right]
\]
and read off the inverse as the 2 × 2 matrix on the right side.
This exploration motivates the following important algorithm. Form the augmented matrix
\[
[A \,|\, I]
\]
and row reduce until it is of the form
\[
[I \,|\, B]
\]
When this has been done, B = A^{-1}. In this case, we say that A is invertible. If it is impossible to
row reduce to a matrix of the form [I|B], then A has no inverse.
This algorithm shows how to find the inverse if it exists. It will also tell you if A does not have an
inverse.
Consider the following example.
Notice that the left hand side of this matrix is now the 3 × 3 identity matrix I_3. Therefore, the inverse is
the 3 × 3 matrix on the right hand side, given by
\[
\begin{bmatrix} -\frac{1}{7} & \frac{2}{7} & \frac{2}{7} \\[4pt] \frac{1}{2} & -\frac{1}{2} & 0 \\[4pt] \frac{1}{14} & \frac{5}{14} & -\frac{1}{7} \end{bmatrix}
\]
It may happen that through this algorithm, you discover that the left hand side cannot be row reduced
to the identity matrix. Consider the following example of this situation.
At this point, you can see there will be no way to obtain I on the left side of this augmented matrix. Hence,
there is no way to complete this algorithm, and therefore the inverse of A does not exist. In this case, we
say that A is not invertible.
If the algorithm provides an inverse for the original matrix, it is always possible to check your answer.
To do so, use the method demonstrated in Example 2.35. Check that the products AA^{-1} and A^{-1}A both
equal the identity matrix. Through this method, you can always be sure that you have calculated A^{-1}
properly!
One way in which the inverse of a matrix is useful is to find the solution of a system of linear equations.
Recall from Definition 2.15 that we can write a system of equations in matrix form, which is of the form
AX = B. Suppose you find the inverse A^{-1} of the matrix A. Then you could multiply both sides of this
equation on the left by A^{-1} and simplify to obtain
\[
\begin{array}{rcl} A^{-1}(AX) &=& A^{-1}B \\ (A^{-1}A)X &=& A^{-1}B \\ IX &=& A^{-1}B \\ X &=& A^{-1}B \end{array}
\]
Therefore we can find X, the solution to the system, by computing X = A^{-1}B. Note that once you have
found A^{-1}, you can easily get the solution for different right hand sides (different B). It is always just
A^{-1}B.
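This reuse of A^{-1} across right-hand sides is easy to see in code. The following Python/NumPy sketch is illustrative only (the 2 × 2 matrix here is an assumption for the demo, not the matrix used in the text):

```python
import numpy as np

# An invertible coefficient matrix (illustrative choice).
A = np.array([[2., 1.],
              [1., 1.]])
A_inv = np.linalg.inv(A)

# Once A^(-1) is known, each new right-hand side costs only one product.
for b in (np.array([1., 2.]), np.array([0., 3.])):
    x = A_inv @ b
    print(x, np.allclose(A @ x, b))   # the solution, and a check that Ax = b
```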
We will explore this method of finding the solution to a system in the following example.
What if the right side, B, of 2.10 had been \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}? In other words, what would be the solution to
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & -1 & 1 \\ 1 & 1 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 3 \end{bmatrix}?
\]
This illustrates that for a system AX = B where A1 exists, it is easy to find the solution when the vector
B is changed.
We conclude this section with some important properties of the inverse.
1. I is invertible and I^{-1} = I
We now turn our attention to a special type of matrix called an elementary matrix. An elementary matrix
is always a square matrix. Recall the row operations given in Definition 1.11. Any elementary matrix,
which we often denote by E, is obtained from applying one row operation to the identity matrix of the
same size.
For example, the matrix
\[
E = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]
is the elementary matrix obtained from switching the two rows. The matrix
\[
E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
is the elementary matrix obtained from multiplying the second row of the 3 × 3 identity matrix by 3. The
matrix
\[
E = \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}
\]
is the elementary matrix obtained from adding 3 times the first row to the second row.
You may construct an elementary matrix from any row operation, but remember that you can only
apply one operation.
Consider the following definition.
Therefore, E constructed above by switching the two rows of I2 is called a permutation matrix.
Elementary matrices can be used in place of row operations and therefore are very useful. It turns out
that multiplying (on the left hand side) by an elementary matrix E will have the same effect as doing the
row operation used to obtain E.
The following theorem is an important result which we will use throughout this text.
Therefore, instead of performing row operations on a matrix A, we can row reduce through matrix
multiplication with the appropriate elementary matrix. We will examine this theorem in detail for each of
the three row operations given in Definition 1.11.
First, consider the following lemma.
Solution. You can see that the matrix P_{12} is obtained by switching the first and second rows of the 3 × 3
identity matrix I.
Using our usual procedure, compute the product P_{12}A = B. The result is given by
\[
B = \begin{bmatrix} g & d \\ a & b \\ e & f \end{bmatrix}
\]
Notice that B is the matrix obtained by switching rows 1 and 2 of A. Therefore by multiplying A by P_{12},
the row operation which was applied to I to obtain P_{12} is applied to A to obtain B.
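The same effect is easy to observe numerically. Here is a minimal Python/NumPy sketch (not part of the original text; the 3 × 2 matrix is an arbitrary illustration):

```python
import numpy as np

A = np.arange(6).reshape(3, 2)     # an arbitrary 3 x 2 matrix to act on

P12 = np.eye(3)
P12[[0, 1]] = P12[[1, 0]]          # switch rows 1 and 2 of the identity

print(P12 @ A)                     # same as switching rows 1 and 2 of A
```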
Theorem 2.44 applies to all three row operations, and we now look at the row operation of multiplying
a row by a scalar. Consider the following lemma.
\[
E(k, i)\, A = B
\]
Solution. You can see that E(5, 2) is obtained by multiplying the second row of the identity matrix by 5.
Using our usual procedure for multiplication of matrices, we can compute the product E(5, 2)A. The
resulting matrix is given by
\[
B = \begin{bmatrix} a & b \\ 5c & 5d \\ e & f \end{bmatrix}
\]
Example 2.50: Adding Two Times the First Row to the Last
Let
\[
E(2 \cdot 1 + 3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix}, \quad A = \begin{bmatrix} a & b \\ c & d \\ e & f \end{bmatrix}
\]
Find B where B = E(2 \cdot 1 + 3)A.
Solution. You can see that the matrix E(2 \cdot 1 + 3) was obtained by adding 2 times the first row of I to the
third row of I.
Using our usual procedure, we can compute the product E(2 \cdot 1 + 3)A. The resulting matrix B is
given by
\[
B = \begin{bmatrix} a & b \\ c & d \\ 2a + e & 2b + f \end{bmatrix}
\]
You can see that B is the matrix obtained by adding 2 times the first row of A to the third row.
Suppose we have applied a row operation to a matrix A. Consider the row operation required to return
A to its original form, to undo the row operation. It turns out that this action is how we find the inverse of
an elementary matrix E.
Consider the following theorem.
In fact, the inverse of an elementary matrix is constructed by doing the reverse row operation on I.
E^{-1} will be obtained by performing the row operation which would carry E back to I.
If E is obtained by switching rows i and j, then E^{-1} is also obtained by switching rows i and j.
If E is obtained by multiplying row i by the scalar k, then E^{-1} is obtained by multiplying row i by the
scalar 1/k.
If E is obtained by adding k times row i to row j, then E^{-1} is obtained by subtracting k times row i
from row j.
Here, E is obtained from the 2 × 2 identity matrix by multiplying the second row by 2. In order to carry E
back to the identity, we need to multiply the second row of E by 1/2. Hence, E^{-1} is given by
\[
E^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{2} \end{bmatrix}
\]
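A one-line check of this inverse (Python with NumPy; illustrative, not part of the original text):

```python
import numpy as np

E = np.array([[1., 0.],
              [0., 2.]])            # multiply row 2 of I by 2
E_inv = np.array([[1., 0.],
                  [0., 0.5]])       # undo it: multiply row 2 by 1/2

print(np.allclose(E @ E_inv, np.eye(2)))   # True: E E^(-1) = I
```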
Solution. To find B, row reduce A. For each step, we will record the appropriate elementary matrix. First,
switch rows 1 and 2.
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \\ 2 & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{bmatrix}
\]
The resulting matrix is equivalent to finding the product of P_{12} = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} and A.
Next, add (−2) times row 1 to row 3.
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 2 & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}
\]
This is equivalent to multiplying by the matrix E(-2 \cdot 1 + 3) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}. Notice that the
resulting matrix is B, the required reduced row-echelon form of A.
We can then write
\[
\begin{array}{rcl} B &=& E(-2 \cdot 1 + 3) \big( P_{12} A \big) \\ &=& \big( E(-2 \cdot 1 + 3) P_{12} \big) A \\ &=& UA \end{array}
\]
where
\[
\begin{array}{rcl} U &=& E(-2 \cdot 1 + 3) P_{12} \\ &=& \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \\ &=& \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & -2 & 1 \end{bmatrix} \end{array}
\]
We can verify that
\[
UA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix} = B
\]
While the process used in the above example is reliable and simple when only a few row operations
are used, it becomes cumbersome in a case where many row operations are needed to carry A to B. The
following theorem provides an alternate way to find the matrix U .
Let's revisit the above example using the process outlined in Theorem 2.55. Form the matrix [A | I] and
row reduce this matrix until the left side equals the reduced row-echelon form of A.
\[
\left[ \begin{array}{cc|ccc} 0 & 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{cc|ccc} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 2 & 0 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{cc|ccc} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 0 & -2 & 1 \end{array} \right]
\]
The left side of this matrix is B, and the right side is U. Comparing this to the matrix U found above
in Example 2.54, you can see that the same matrix is obtained regardless of which process is used.
Recall from Algorithm 2.37 that an n × n matrix A is invertible if and only if A can be carried to the
n × n identity matrix using the usual row operations. This leads to an important consequence related to the
above discussion.
Suppose A is an n × n invertible matrix. Then, set up the matrix [A | I_n] as done above, and row reduce
until it is of the form [B | U]. In this case, B = I_n because A is invertible. Then
\[
\begin{array}{rcl} B &=& UA \\ I_n &=& UA \\ U &=& A^{-1} \end{array}
\]
Now suppose that U = E_1 E_2 \cdots E_k where each E_i is an elementary matrix representing a row operation
used to carry A to I. Then,
\[
A = U^{-1} = E_k^{-1} \cdots E_2^{-1} E_1^{-1}
\]
Solution. We will use the process outlined in Theorem 2.55 to write A as a product of elementary matrices.
We will set up the matrix [A | I] and row reduce, recording each row operation as an elementary matrix.
First, switch rows 1 and 2:
\[
\left[ \begin{array}{ccc|ccc} 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right]
\]
represented by the elementary matrix E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
Secondly, add −1 times row 2 to row 1:
\[
\left[ \begin{array}{ccc|ccc} 1 & 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right]
\]
represented by the elementary matrix E_2 = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
Finally, add 2 times row 2 to row 3:
\[
\left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 & 1 \end{array} \right]
\]
represented by the elementary matrix E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}.
Notice that the reduced row-echelon form of A is I. Hence I = UA where U is the product of the
above elementary matrices. It follows that A = U^{-1}. Since we want to write A as a product of elementary
matrices, we wish to express U^{-1} as a product of elementary matrices.
\[
\begin{array}{rcl} U^{-1} &=& (E_3 E_2 E_1)^{-1} \\ &=& E_1^{-1} E_2^{-1} E_3^{-1} \\ &=& \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} \\ &=& A \end{array}
\]
This gives A written as a product of elementary matrices. By Theorem 2.57 it follows that A is invertible.
In this section, we will prove three theorems which will clarify the concept of matrix inverses. In order to
do this, first recall some important properties of elementary matrices.
Recall that an elementary matrix is a square matrix obtained by performing an elementary operation
on an identity matrix. Each elementary matrix is invertible, and its inverse is also an elementary matrix. If
E is an m × m elementary matrix and A is an m × n matrix, then the product EA is the result of applying to
A the same elementary row operation that was applied to the m × m identity matrix in order to obtain E.
Let R be the reduced row-echelon form of an m × n matrix A. R is obtained by iteratively applying
a sequence of elementary row operations to A. Denote by E_1, E_2, \ldots, E_k the elementary matrices asso-
ciated with the elementary row operations which were applied, in order, to the matrix A to obtain the
resulting R. We then have that R = E_k(\cdots(E_2(E_1 A))\cdots) = E_k \cdots E_2 E_1 A. Let E denote the product matrix
E_k \cdots E_2 E_1 so that we can write R = EA where E is an invertible matrix whose inverse is the product
(E_1)^{-1}(E_2)^{-1} \cdots (E_k)^{-1}.
Now, we will consider some preliminary lemmas.
Proof. Let R be the reduced row-echelon form of A. Then R = EA for some invertible square matrix E as
described above. By hypothesis AB = I where I is an identity matrix, so we have a chain of equalities
\[
R(BE^{-1}) = (EA)(BE^{-1}) = E(AB)E^{-1} = EIE^{-1} = EE^{-1} = I
\]
If R had a row of zeros, then so would the product R(BE^{-1}). But since the identity matrix I does
not have a row of zeros, neither can R have one.
We now consider a second important lemma.
Proof. Let R be the reduced row-echelon form of A. By Lemma 2.59, we know that R does not have a row
of zeros, and therefore each row of R has a leading 1. Since each column of R contains at most one of
these leading 1s, R must have at least as many columns as it has rows.
An important theorem follows from this lemma.
Proof. Suppose that A and B are matrices such that both products AB and BA are identity matrices. We will
show that A and B must be square matrices of the same size. Let the matrix A have m rows and n columns,
so that A is an m × n matrix. Since the product AB exists, B must have n rows, and since the product BA
exists, B must have m columns so that B is an n × m matrix. To finish the proof, we need only verify that
m = n.
We first apply Lemma 2.60 with A and B, to obtain the inequality m ≤ n. We then apply Lemma 2.60
again (switching the order of the matrices), to obtain the inequality n ≤ m. It follows that m = n, as we
wanted.
Of course, not all square matrices are invertible. In particular, zero matrices are not invertible, along
with many other square matrices.
The following proposition will be useful in proving the next theorem.
The proof of this proposition is left as an exercise to the reader. We now consider the second important
theorem of this section.
Proof. Let R be the reduced row-echelon form of a square matrix A. Then, R = EA where E is an invertible
matrix. Since AB = I, Lemma 2.59 gives us that R does not have a row of zeros. By noting that R is a
square matrix and applying Proposition 2.62, we see that R = I. Hence, EA = I.
Using both that EA = I and AB = I, we can finish the proof with a chain of equalities as given by
\[
E = EI = E(AB) = (EA)B = IB = B, \quad \text{so that} \quad BA = EA = I
\]
Hence AA^T is not the 3 × 3 identity matrix. This shows that for Theorem 2.63, it is essential that both
matrices be square and of the same size.
Is it possible to have matrices A and B such that AB = I, while BA = 0? This question is left to the
reader to answer, and you should take a moment to consider the answer.
A is invertible
Proof. In order to prove this, we show that for any given matrix A, each condition implies the other. We
first show that if A is invertible, then its reduced row-echelon form is an identity matrix, then we show that
if the reduced row-echelon form of A is an identity matrix, then A is invertible.
If A is invertible, there is some matrix B such that AB = I. By Lemma 2.59, we get that the reduced row-
echelon form of A does not have a row of zeros. Then by Theorem 2.61, it follows that A and the reduced
row-echelon form of A are square matrices. Finally, by Proposition 2.62, this reduced row-echelon form of
A must be an identity matrix. This proves the first implication.
Now suppose the reduced row-echelon form of A is an identity matrix I. Then I = EA for some product
E of elementary matrices. By Theorem 2.63, we can conclude that A is invertible.
Theorem 2.65 corresponds to Algorithm 2.37, which claims that A^{-1} is found by row reducing the
augmented matrix [A|I] to the form [I|A^{-1}]. This will be a matrix product E[A|I] where E is a product of
elementary matrices. By the rules of matrix multiplication, we have that E[A|I] = [EA|EI] = [EA|E].
It follows that the reduced row-echelon form of [A|I] is [EA|E], where EA gives the reduced row-
echelon form of A. By Theorem 2.65, if EA ≠ I, then A is not invertible, and if EA = I, A is invertible. If
EA = I, then by Theorem 2.63, E = A^{-1}. This proves that Algorithm 2.37 does in fact find A^{-1}.
Exercises
Exercise 2.1.1 For the following pairs of matrices, determine if the sum A + B is defined. If so, find the
sum.
(a) A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
(b) A = \begin{bmatrix} 2 & 1 & 2 \\ 1 & 1 & 0 \end{bmatrix}, B = \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 4 \end{bmatrix}
(c) A = \begin{bmatrix} 1 & 0 \\ 2 & 3 \\ 4 & 2 \end{bmatrix}, B = \begin{bmatrix} 2 & 7 & 1 \\ 0 & 3 & 4 \end{bmatrix}
Exercise 2.1.2 For each matrix A, find the matrix −A such that A + (−A) = 0.
(a) A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}
(b) A = \begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}
(c) A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & 1 & 3 \\ 4 & 2 & 0 \end{bmatrix}
Exercise 2.1.4 For each matrix A, find the products (−2)A, 0A, and 3A.
(a) A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}
(b) A = \begin{bmatrix} 2 & 3 \\ 0 & 2 \end{bmatrix}
(c) A = \begin{bmatrix} 0 & 1 & 2 \\ 1 & 1 & 3 \\ 4 & 2 & 0 \end{bmatrix}
Exercise 2.1.5 Using only the properties given in Proposition 2.7 and Proposition 2.10, show −A is
unique.
Exercise 2.1.6 Using only the properties given in Proposition 2.7 and Proposition 2.10, show 0 is unique.
Exercise 2.1.7 Using only the properties given in Proposition 2.7 and Proposition 2.10, show 0A = 0.
Here the 0 on the left is the scalar 0 and the 0 on the right is the zero matrix of appropriate size.
Exercise 2.1.8 Using only the properties given in Proposition 2.7 and Proposition 2.10, as well as previ-
ous problems, show (−1)A = −A.
Exercise 2.1.9 Consider the matrices A = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 7 \end{bmatrix}, B = \begin{bmatrix} 3 & 1 & 2 \\ 3 & 2 & 1 \end{bmatrix}, C = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix},
D = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}, E = \begin{bmatrix} 2 \\ 3 \end{bmatrix}.
Find the following if possible. If it is not possible explain why.
(a) 3A
(b) 3B A
(c) AC
(d) CB
(e) AE
(f) EA
Exercise 2.1.10 Consider the matrices A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{bmatrix}, B = \begin{bmatrix} 2 & 5 & 2 \\ 3 & 2 & 1 \end{bmatrix}, C = \begin{bmatrix} 1 & 2 \\ 5 & 0 \end{bmatrix},
D = \begin{bmatrix} 1 & 1 \\ 4 & 3 \end{bmatrix}, E = \begin{bmatrix} 1 \\ 3 \end{bmatrix}
Find the following if possible. If it is not possible explain why.
(a) 3A
(b) 3B A
(c) AC
(d) CA
(e) AE
(f) EA
(g) BE
(h) DE
Exercise 2.1.11 Let A = \begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 1 & 2 \end{bmatrix}, B = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 2 \end{bmatrix}, and C = \begin{bmatrix} 1 & 1 & 3 \\ 1 & 2 & 0 \\ 3 & 1 & 0 \end{bmatrix}. Find the
following if possible.
(a) AB
(b) BA
(c) AC
(d) CA
(e) CB
(f) BC
Exercise 2.1.12 Let A = \begin{bmatrix} 1 & 1 \\ 3 & 3 \end{bmatrix}. Find all 2 × 2 matrices B such that AB = 0.
Exercise 2.1.13 Let X = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix} and Y = \begin{bmatrix} 0 & 1 & 2 \end{bmatrix}. Find X^T Y and XY^T if possible.
Exercise 2.1.14 Let A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, B = \begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix}. Is it possible to choose k such that AB = BA? If so,
what should k equal?
Exercise 2.1.15 Let A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, B = \begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix}. Is it possible to choose k such that AB = BA? If so,
what should k equal?
Exercise 2.1.16 Find 2 × 2 matrices A, B, and C such that A ≠ 0, C ≠ B, but AC = AB.
Exercise 2.1.17 Give an example of matrices (of any size) A, B, C such that B ≠ C, A ≠ 0, and yet AB =
AC.
Exercise 2.1.19 Give an example of matrices (of any size) A, B such that A ≠ 0 and B ≠ 0 but AB = 0.
Exercise 2.1.20 Find 2 × 2 matrices A and B such that A ≠ 0 and B ≠ 0 with AB ≠ BA.
Exercise 2.1.25 For each pair of matrices, find the (1, 2)-entry and (2, 3)-entry of the product AB.
(a) A = \begin{bmatrix} 1 & 2 & 1 \\ 3 & 4 & 0 \\ 2 & 5 & 1 \end{bmatrix}, B = \begin{bmatrix} 4 & 6 & 2 \\ 7 & 2 & 1 \\ 1 & 0 & 0 \end{bmatrix}
(b) A = \begin{bmatrix} 1 & 3 & 1 \\ 0 & 2 & 4 \\ 1 & 0 & 5 \end{bmatrix}, B = \begin{bmatrix} 2 & 3 & 0 \\ 4 & 16 & 1 \\ 0 & 2 & 2 \end{bmatrix}
Exercise 2.1.26 Suppose A and B are square matrices of the same size. Which of the following are
necessarily true?
(b) (AB)^2 = A^2 B^2
(d) (A + B)^2 = A^2 + AB + BA + B^2
(e) A^2 B^2 = A(AB)B
(g) (A + B)(A − B) = A^2 − B^2
Exercise 2.1.27 Consider the matrices A = \begin{bmatrix} 1 & 2 \\ 3 & 2 \\ 1 & 1 \end{bmatrix}, B = \begin{bmatrix} 2 & 5 & 2 \\ 3 & 2 & 1 \end{bmatrix}, C = \begin{bmatrix} 1 & 2 \\ 5 & 0 \end{bmatrix},
D = \begin{bmatrix} 1 & 1 \\ 4 & 3 \end{bmatrix}, E = \begin{bmatrix} 1 \\ 3 \end{bmatrix}
Find the following if possible. If it is not possible explain why.
(a) 3A^T
(b) 3B − A^T
(c) E^T B
(d) EE^T
(e) B^T B
(f) CA^T
(g) D^T BE
Exercise 2.1.28 Let A be an n × n matrix. Show A equals the sum of a symmetric and a skew symmetric
matrix. Hint: Show that \frac{1}{2}(A^T + A) is symmetric and then consider using this as one of the matrices.
Exercise 2.1.29 Show that the main diagonal of every skew symmetric matrix consists of only zeros.
Recall that the main diagonal consists of every entry of the matrix which is of the form a_{ii}.
Exercise 2.1.30 Prove 3. That is, show that for an m × n matrix A, an m × n matrix B, and scalars r, s, the
following holds:
\[
(rA + sB)^T = rA^T + sB^T
\]
Exercise 2.1.32 Suppose AB = AC and A is an invertible n × n matrix. Does it follow that B = C? Explain
why or why not.
Exercise 2.1.33 Suppose AB = AC and A is a non-invertible n × n matrix. Does it follow that B = C?
Explain why or why not.
Exercise 2.1.34 Give an example of a matrix A such that A^2 = I and yet A ≠ I and A ≠ −I.
Exercise 2.1.44 Using the inverse of the matrix, find the solution to the systems:
(a)
\[
\begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}
\]
(b)
\[
\begin{bmatrix} 2 & 4 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}
\]
Exercise 2.1.45 Using the inverse of the matrix, find the solution to the systems:
(a)
\[
\begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]
(b)
\[
\begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ 2 \end{bmatrix}
\]
Exercise 2.1.46 Show that if A is an n × n invertible matrix and X is an n × 1 matrix such that AX = B for
B an n × 1 matrix, then X = A^{-1}B.
Exercise 2.1.48 Show that if A^{-1} exists for an n × n matrix, then it is unique. That is, if BA = I and AB = I,
then B = A^{-1}.
Exercise 2.1.49 Show that if A is an invertible n × n matrix, then so is A^T and (A^T)^{-1} = (A^{-1})^T.
and
\[
B^{-1}A^{-1}(AB) = I
\]
Exercise 2.1.58 Let A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & 1 & 4 \end{bmatrix}. Suppose a row operation is applied to A and the result is
B = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 10 & 2 \\ 2 & 1 & 4 \end{bmatrix}. Find the elementary matrix E that represents this row operation.
Exercise 2.1.59 Let A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & 1 & 4 \end{bmatrix}. Suppose a row operation is applied to A and the result is
B = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 1 & \frac{1}{2} & 2 \end{bmatrix}. Find the elementary matrix E that represents this row operation.
Exercise 2.1.60 Let A = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 5 & 1 \\ 2 & 1 & 4 \end{bmatrix}. Suppose a row operation is applied to A and the result is
B = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 4 & 5 \\ 2 & 1 & 4 \end{bmatrix}. Find the elementary matrix E that represents this row operation.
3. Determinants
Outcomes
A. Evaluate the determinant of a square matrix using either Laplace Expansion or row operations.
Let A be an n × n matrix. That is, let A be a square matrix. The determinant of A, denoted by det(A), is a
very important number which we will explore throughout this section.
If A is a 2 × 2 matrix, the determinant is given by the following formula.
The determinant is also often denoted by enclosing the matrix with two vertical lines. Thus
\[
\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad − bc
\]
The 2 × 2 determinant can be used to find the determinant of larger matrices. We will now explore how
to find the determinant of a 3 × 3 matrix, using several tools including the 2 × 2 determinant.
We begin with the following definition.
Hence, there is a minor associated with each entry of A. Consider the following example which
demonstrates this definition.
Solution. First we will find minor(A)_{12}. By Definition 3.3, this is the determinant of the 2 × 2 matrix
which results when you delete the first row and the second column. This minor is given by
\[
\operatorname{minor}(A)_{12} = \det \begin{bmatrix} 4 & 2 \\ 3 & 1 \end{bmatrix}
\]
Similarly, minor(A)_{23} is the determinant of the 2 × 2 matrix which results when you delete the second
row and the third column. This minor is therefore
\[
\operatorname{minor}(A)_{23} = \det \begin{bmatrix} 1 & 2 \\ 3 & 2 \end{bmatrix} = -4
\]
It is also convenient to refer to the cofactor of an entry of a matrix as follows. If a_{ij} is the ijth entry of
the matrix, then its cofactor is just cof(A)_{ij}.
Hence,
\[
\operatorname{cof}(A)_{23} = (-1)^{2+3} \operatorname{minor}(A)_{23} = (-1)^{2+3}(-4) = 4
\]
You may wish to find the remaining cofactors for the above matrix. Remember that there is a cofactor
for every entry in the matrix.
We have now established the tools we need to find the determinant of a 3 × 3 matrix.
When calculating the determinant, you can choose to expand any row or any column. Regardless of
your choice, you will always get the same number which is the determinant of the matrix A. This method of
evaluating a determinant by expanding along a row or a column is called Laplace Expansion or Cofactor
Expansion.
Consider the following example.
Solution. First, we will calculate det(A) by expanding along the first column. Using Definition 3.7, we
take the 1 in the first column and multiply it by its cofactor,
\[
1(-1)^{1+1} \begin{vmatrix} 3 & 2 \\ 2 & 1 \end{vmatrix} = (1)(1)(-1) = -1
\]
Similarly, we take the 4 in the first column and multiply it by its cofactor, as well as with the 3 in the first
column. Finally, we add these numbers together, as given in the following equation.
As mentioned in Definition 3.7, we can choose to expand along any row or column. Let's try now by
expanding along the second row. Here, we take the 4 in the second row and multiply it to its cofactor, then
add this to the 3 in the second row multiplied by its cofactor, and the 2 in the second row multiplied by its
cofactor. The calculation is as follows.
\[
\det(A) = \overbrace{4(-1)^{2+1} \begin{vmatrix} 2 & 3 \\ 2 & 1 \end{vmatrix}}^{4 \operatorname{cof}(A)_{21}} + \overbrace{3(-1)^{2+2} \begin{vmatrix} 1 & 3 \\ 3 & 1 \end{vmatrix}}^{3 \operatorname{cof}(A)_{22}} + \overbrace{2(-1)^{2+3} \begin{vmatrix} 1 & 2 \\ 3 & 2 \end{vmatrix}}^{2 \operatorname{cof}(A)_{23}}
\]
You can see that for both methods, we obtained det (A) = 0.
As mentioned above, we will always come up with the same value for det (A) regardless of the row or
column we choose to expand along. You should try to compute the above determinant by expanding along
other rows and columns. This is a good way to check your work, because you should come up with the
same number each time!
We present this idea formally in the following theorem.
We have now looked at the determinant of 2 × 2 and 3 × 3 matrices. It turns out that the method used
to calculate the determinant of a 3 × 3 matrix can be used to calculate the determinant of any sized matrix.
Notice that Definition 3.3, Definition 3.5 and Definition 3.7 can all be applied to a matrix of any size.
For example, the ijth minor of a 4 × 4 matrix is the determinant of the 3 × 3 matrix you obtain when you
delete the ith row and the jth column. Just as with the 3 × 3 determinant, we can compute the determinant
of a 4 × 4 matrix by Laplace Expansion, along any row or column.
Consider the following example.
Solution. As in the case of a 3 × 3 matrix, you can expand this along any row or column. Let's pick the
third column. Then, using Laplace Expansion,
\[
\det(A) = 3(-1)^{1+3} \begin{vmatrix} 5 & 4 & 3 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix} + 2(-1)^{2+3} \begin{vmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 3 & 4 & 2 \end{vmatrix} + 4(-1)^{3+3} \begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 3 & 4 & 2 \end{vmatrix} + 3(-1)^{4+3} \begin{vmatrix} 1 & 2 & 4 \\ 5 & 4 & 3 \\ 1 & 3 & 5 \end{vmatrix}
\]
Now, you can calculate each 3 × 3 determinant using Laplace Expansion, as we did above. You should
complete these as an exercise and verify that det(A) = −12.
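Laplace Expansion also lends itself to a short recursive program. The following Python sketch (not part of the original text; `det_laplace` is a name chosen here) expands along the first row and, applied to the matrix as reconstructed in the example above, reproduces −12:

```python
def det_laplace(A):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Delete row 1 and column j+1 to form the minor.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det_laplace(minor)
    return total

A = [[1, 2, 3, 4],
     [5, 4, 2, 3],
     [1, 3, 4, 5],
     [3, 4, 3, 2]]
print(det_laplace(A))   # -12
```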
The following provides a formal definition for the determinant of an n × n matrix. You may wish
to take a moment and consider the above definitions for 2 × 2 and 3 × 3 determinants in context of this
definition.
The first formula consists of expanding the determinant along the ith row and the second expands
the determinant along the jth column.
In the following sections, we will explore some important properties and characteristics of the deter-
minant.
There is a certain type of matrix for which finding the determinant is a very simple procedure. Consider
the following definition.
A lower triangular matrix is defined similarly as a matrix for which all entries above the main
diagonal are equal to zero.
The following theorem provides a useful way to calculate the determinant of a triangular matrix.
The verification of this Theorem can be done by computing the determinant using Laplace Expansion
along the first row or column.
Consider the following example.
Solution. From Theorem 3.13, it suffices to take the product of the elements on the main diagonal. Thus
det(A) = 1 × 2 × 3 × (−1) = −6.
Without using Theorem 3.13, you could use Laplace Expansion. We will expand along the first column.
This gives
\[
\det(A) = 1 \begin{vmatrix} 2 & 6 & 7 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix} + 0(-1)^{2+1} \begin{vmatrix} 2 & 3 & 77 \\ 0 & 3 & 33.7 \\ 0 & 0 & -1 \end{vmatrix} + 0(-1)^{3+1} \begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 0 & -1 \end{vmatrix} + 0(-1)^{4+1} \begin{vmatrix} 2 & 3 & 77 \\ 2 & 6 & 7 \\ 0 & 3 & 33.7 \end{vmatrix}
\]
Now find the determinant of the remaining 3 × 3 matrix, by expanding along the first column to obtain
\[
\det(A) = 1 \left( 2 \begin{vmatrix} 3 & 33.7 \\ 0 & -1 \end{vmatrix} + 0(-1)^{2+1} \begin{vmatrix} 6 & 7 \\ 0 & -1 \end{vmatrix} + 0(-1)^{3+1} \begin{vmatrix} 6 & 7 \\ 3 & 33.7 \end{vmatrix} \right) = 1 \times 2 \begin{vmatrix} 3 & 33.7 \\ 0 & -1 \end{vmatrix}
\]
Next use Definition 3.1 to find the determinant of this 2 × 2 matrix, which is just 3 × (−1) − 0 × 33.7 = −3.
Putting all these steps together, we have
\[
\det(A) = 1 \times 2 \times 3 \times (-1) = -6
\]
which is just the product of the entries down the main diagonal of the original matrix!
You can see that while both methods result in the same answer, Theorem 3.13 provides a much quicker
method.
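As a quick numerical check (Python with NumPy; not part of the original text; the entries are those of the example as reconstructed above), the product of the diagonal agrees with the determinant:

```python
import numpy as np

A = np.array([[1., 2., 3., 77. ],
              [0., 2., 6., 7.  ],
              [0., 0., 3., 33.7],
              [0., 0., 0., -1. ]])

print(np.prod(np.diag(A)))    # -6.0: product of the main-diagonal entries
print(np.linalg.det(A))       # -6.0 (up to floating-point rounding)
```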
In the next section, we explore some important properties of determinants.
There are many important properties of determinants. Since many of these properties involve the row
operations discussed in Chapter 1, we recall that definition now.
We will now consider the effect of row operations on the determinant of a matrix. In future sections,
we will see that using the following properties can greatly assist in finding determinants. This section will
use the theorems as motivation to provide various examples of the usefulness of the properties.
The first theorem explains the effect on the determinant of a matrix when two rows are switched.
When we switch two rows of a matrix, the determinant is multiplied by −1. Consider the following
example.
Solution. By Definition 3.1, det(A) = 1 × 4 − 3 × 2 = −2. Notice that the rows of B are the rows of A but
switched. By Theorem 3.16, since two rows of A have been switched, det(B) = −det(A) = −(−2) = 2.
You can verify this using Definition 3.1.
The next theorem demonstrates the effect on the determinant of a matrix when we multiply a row by a
scalar.
Notice that this theorem is true when we multiply one row of the matrix by k. If we were to multiply
two rows of A by k to obtain B, we would have det(B) = k² det(A). Suppose we were to multiply all n
rows of A by k to obtain the matrix B, so that B = kA. Then, det(B) = kⁿ det(A). This gives the next
theorem.
Solution. By Definition 3.1, det(A) = −2. We can also compute det(B) using Definition 3.1, and we see
that det(B) = −10.
Now, let's compute det(B) using Theorem 3.18 and see if we obtain the same answer. Notice that the
first row of B is 5 times the first row of A, while the second row of B is equal to the second row of A. By
Theorem 3.18, det(B) = 5 det(A) = 5 × (−2) = −10.
You can see that this matches our answer above.
Finally, consider the next theorem for the last row operation, that of adding a multiple of a row to
another row.
Therefore, when we add a multiple of a row to another row, the determinant of the matrix is unchanged.
Note that if a matrix A contains a row which is a multiple of another row, det(A) will equal 0. To see this,
suppose the first row of A is equal to −1 times the second row. By Theorem 3.21, we can add the first row
to the second row, and the determinant will be unchanged. However, this row operation will result in a
row of zeros. Using Laplace Expansion along the row of zeros, we find that the determinant is 0.
Consider the following example.
Solution. By Definition 3.1, det(A) = −2. Notice that the second row of B is two times the first row of A
added to the second row. By Theorem 3.21, det(B) = det(A) = −2. As usual, you can verify this answer
using Definition 3.1.
\[
\det(A) = 1 \times 4 − 2 \times 2 = 0
\]
However, notice that the second row is equal to 2 times the first row. Then by the discussion above
following Theorem 3.21 the determinant will equal 0.
Until now, our focus has primarily been on row operations. However, we can carry out the same
operations with columns, rather than rows. The three operations outlined in Definition 3.15 can be done
with columns instead of rows. In this case, in Theorems 3.16, 3.18, and 3.21 you can replace the word,
"row" with the word "column".
There are several other major properties of determinants which do not involve row (or column) opera-
tions. The first is the determinant of a product of matrices.
In order to find the determinant of a product of matrices, we can simply take the product of the deter-
minants.
Consider the following example.
Solution. Consider the matrix A first. Using Definition 3.1 we can find the determinant as follows:
\[
\det(A) = 3 \times 4 − 2 \times 6 = 12 − 12 = 0
\]
Similarly, for B,
\[
\det(B) = 2 \times 1 − 5 \times 3 = 2 − 15 = −13
\]
Therefore, det(AB) = det(A) det(B) = 0 × (−13) = 0.
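The product rule can be checked directly (Python with NumPy; not part of the original text; A and B are the matrices of the example above):

```python
import numpy as np

A = np.array([[3., 6.], [2., 4.]])
B = np.array([[2., 5.], [3., 1.]])

print(np.linalg.det(A @ B))                  # approximately 0
print(np.linalg.det(A) * np.linalg.det(B))   # 0 * (-13) = 0
```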
Example 3.30:
(1) Let E_{ij} be the elementary matrix obtained by interchanging the ith and jth rows of I. Then
det E_{ij} = −1.
(2) Let E_{ik} be the elementary matrix obtained by multiplying the ith row of I by k. Then det E_{ik} = k.
(3) Let E_{ijk} be the elementary matrix obtained by multiplying the ith row of I by k and adding it to its
jth row. Then det E_{ijk} = 1.
(4) If C and B are such that CB is defined and the ith row of C consists of zeros, then the ith row of
CB consists of zeros.
(5) If E is an elementary matrix, then det E = det E^T.
Many of the proofs in this section use the Principle of Mathematical Induction. This concept is discussed
in Appendix A.2 and is reviewed here for convenience. First we check that the assertion is true for n = 2
(the case n = 1 is either completely trivial or meaningless).
Next, we assume that the assertion is true for n − 1 (where n ≥ 3) and prove it for n. Once this is
accomplished, by the Principle of Mathematical Induction we can conclude that the statement is true for
all n × n matrices for every n ≥ 2.
If A is an n × n matrix and 1 ≤ j ≤ n, then the matrix obtained by removing the 1st column and the jth row
from A is an (n − 1) × (n − 1) matrix (we shall denote this matrix by A(j) below). Since these matrices are used
in the computation of the cofactors cof(A)_{1,i}, for 1 ≤ i ≤ n, the inductive assumption applies to these matrices.
Consider the following lemma.
Lemma 3.31:
If A is an n × n matrix such that one of its rows consists of zeros, then det A = 0.
Lemma 3.32:
Assume A, B and C are n × n matrices that for some 1 ≤ i ≤ n satisfy the following.
2. Each entry in the ith row of A is the sum of the corresponding entries in the ith rows of B and C.
This proves that the assertion is true for all n and completes the proof.
Theorem 3.33:
Let A and B be n × n matrices.
1. If A is obtained by interchanging the ith and jth rows of B (with i ≠ j), then det A = −det B.
Proof. We prove all statements by induction. The case n = 2 is easily checked directly (and it is strongly
suggested that you do check it).
We assume n ≥ 3 and that (1)–(4) are true for all matrices of size (n − 1) × (n − 1).
(1) We prove the case when j = i + 1, i.e., we are interchanging two consecutive rows.
Let l ∈ {1, . . ., n} \ {i, j}. Then A(l) is obtained from B(l) by interchanging two of its rows (draw a
picture) and by our assumption
\[
\operatorname{cof}(A)_{1,l} = -\operatorname{cof}(B)_{1,l}. \tag{3.2}
\]
Now consider a_{1,i} cof(A)_{1,i}. We have that a_{1,i} = b_{1,j} and also that A(i) = B(j). Since j = i + 1, we have
cof(A)_{1,i} = -cof(B)_{1,j} and cof(A)_{1,j} = -cof(B)_{1,i},
and therefore a_{1i} cof(A)_{1i} = -b_{1j} cof(B)_{1j} and a_{1j} cof(A)_{1j} = -b_{1i} cof(B)_{1i}. Putting this together with (3.2)
into (3.1) we see that if in the formula for det A we change the sign of each of the summands we obtain the
formula for det B.
\[
\det A = \sum_{l=1}^{n} a_{1l} \operatorname{cof}(A)_{1l} = -\sum_{l=1}^{n} b_{1l} \operatorname{cof}(B)_{1l} = -\det B.
\]
We have therefore proved the case of (1) when j = i + 1. In order to prove the general case, one needs
the following fact. If i < j, then in order to interchange the ith and jth rows one can proceed by interchanging
two adjacent rows 2(j − i) + 1 times: First swap the ith and (i + 1)st, then the (i + 1)st and (i + 2)nd, and so on. After
one interchanges the (j − 1)st and jth rows, we have the ith row in the position of the jth and the lth row in the position of the (l − 1)st
for i + 1 ≤ l ≤ j. Then proceed backwards, swapping adjacent rows until everything is in place.
Since 2(j − i) + 1 is an odd number, (−1)^{2(j−i)+1} = −1, and we have that det A = −det B.
(2) This is like (1). . . but much easier. Assume that (2) is true for all (n - 1) × (n - 1) matrices. We have that a_{ji} = k b_{ji} for 1 ≤ j ≤ n. In particular a_{1i} = k b_{1i}, and for l ≠ i the matrix A(l) is obtained from B(l) by multiplying one of its rows by k. Therefore cof(A)_{1,l} = k cof(B)_{1,l} for l ≠ i, and for all l we have a_{1,l} cof(A)_{1,l} = k b_{1,l} cof(B)_{1,l}. By (3.1), we have det A = k det B.
(3) This is a consequence of (1). If two rows of A are identical, then A is equal to the matrix obtained by interchanging those two rows, and therefore by (1) det A = -det A. This implies det A = 0.
(4) Assume (4) is true for all (n - 1) × (n - 1) matrices and fix A and B such that A is obtained by multiplying the ith row of B by k and adding it to the jth row of B (i ≠ j); we must show that det A = det B. If k = 0 then A = B and there is nothing to prove, so we may assume k ≠ 0.
Let C be the matrix obtained by replacing the jth row of B by the ith row of B multiplied by k. By Lemma 3.32, we have that

det A = det B + det C

and we only need to show that det C = 0. But the ith and jth rows of C are proportional. If D is obtained by multiplying the jth row of C by 1/k, then by (2) we have det C = k det D (recall that k ≠ 0!). But the ith and jth rows of D are identical, hence by (3) we have det D = 0 and therefore det C = 0.
Theorem 3.34:
Let A and B be two n × n matrices. Then

det (AB) = det (A) det (B)
Proof. If A is an elementary matrix of either type, then multiplying by A on the left has the same effect as
performing the corresponding elementary row operation. Therefore the equality det(AB) = det A det B in
this case follows by Example 3.30 and Theorem 3.33.
If C is the reduced row-echelon form of A then we can write A = E1 E2 · · · Em C for some elementary matrices E1, . . . , Em.
Now we consider two cases. First assume C = I. Then A = E1 E2 · · · Em, and applying the previous paragraph repeatedly gives det (AB) = det (E1) · · · det (Em) det (B) = det (A) det (B).
Now assume C ≠ I. Since it is in reduced row-echelon form, its last row consists of zeros and by (4) of Example 3.30 the last row of CB consists of zeros. By Lemma 3.31 we have det C = det (CB) = 0 and therefore
det A = det (E1 E2 · · · Em) det (C) = det (E1 E2 · · · Em) · 0 = 0

and also

det (AB) = det (E1 E2 · · · Em) det (CB) = det (E1 E2 · · · Em) · 0 = 0
hence det AB = 0 = det A det B.
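The multiplicative property is easy to test numerically. The following short sketch (ours, not from the text) uses Python with the numpy library; the two matrices are samples chosen so that det (A) = 0 and det (B) = -13, matching the determinants computed in the example above:

    import numpy as np

    A = np.array([[3.0, 2.0], [6.0, 4.0]])  # det(A) = 12 - 12 = 0
    B = np.array([[2.0, 5.0], [3.0, 1.0]])  # det(B) = 2 - 15 = -13

    # det(AB) agrees with det(A) * det(B), up to floating-point rounding
    print(np.linalg.det(A @ B))                 # ~0.0
    print(np.linalg.det(A) * np.linalg.det(B))  # ~0.0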
The same machinery used in the previous proof will be used again.
Theorem 3.35:
Let A be an n × n matrix and let A^T denote the transpose of A. Then

det (A^T) = det (A)
Proof. Note first that the conclusion is true if A is elementary by (5) of Example 3.30.
Let C be the reduced row-echelon form of A. Then we can write A = E1 E2 · · · Em C, and so A^T = C^T Em^T · · · E2^T E1^T. By Theorem 3.34 we have

det (A^T) = det (C^T) det (Em^T) · · · det (E1^T)

By (5) of Example 3.30 we have that det E_j = det E_j^T for all j. Also, det C is either 0 or 1 (depending on
whether C = I or not) and in either case det C = det C^T. Therefore det A = det A^T.
The above discussions allow us to now prove Theorem 3.9. It is restated below.
Theorem 3.36:
Expanding an n × n matrix along any row or column always gives the same result, which is the determinant.
Proof. We first show that the determinant can be computed along any row. The case n = 1 does not apply and thus let n ≥ 2.
Let A be an n × n matrix and fix j > 1. We need to prove that

det A = ∑_{i=1}^{n} a_{j,i} cof(A)_{j,i} .
Theorems 3.16, 3.18 and 3.21 illustrate how row operations affect the determinant of a matrix. In this section, we look at two examples where row operations are used to find the determinant of a large matrix. Recall that when working with large matrices, Laplace Expansion is effective but time-consuming, as there are many steps involved. This section provides useful tools for an alternative method. By first applying row operations, we can obtain a simpler matrix to which we then apply Laplace Expansion.
While working through questions such as these, it is useful to record your row operations as you go
along. Keep this in mind as you read through the next example.
Solution. We will use the properties of determinants outlined above to find det (A), where

A = [1 2 3 4; 5 1 2 3; 4 5 4 3; 2 2 -4 5]

First, add -5 times the first row to the second row. Then add -4 times the first row to the third row, and -2 times the first row to the fourth row. This yields the matrix

B = [1 2 3 4; 0 -9 -13 -17; 0 -3 -8 -13; 0 -2 -10 -3]

Notice that the only row operation we have done so far is adding a multiple of a row to another row. Therefore, by Theorem 3.21, det (B) = det (A).
At this stage, you could use Laplace Expansion to find det (B). However, we will continue with row operations to find an even simpler matrix to work with.
Add -3 times the third row to the second row. By Theorem 3.21 this does not change the value of the determinant. Then, multiply the fourth row by 3. This results in the matrix

C = [1 2 3 4; 0 0 11 22; 0 -3 -8 -13; 0 -6 -30 -9]

Here, det (C) = 3 det (B), which means that det (B) = (1/3) det (C).
Since det (A) = det (B), we now have that det (A) = (1/3) det (C). Again, you could use Laplace Expansion here to find det (C). However, we will continue with row operations.
Now add -2 times the third row to the fourth row. This does not change the value of the determinant by Theorem 3.21. Finally, switch the third and second rows. This causes the determinant to be multiplied by -1. Thus det (C) = -det (D) where

D = [1 2 3 4; 0 -3 -8 -13; 0 0 11 22; 0 0 -14 17]
Hence, det (A) = (1/3) det (C) = -(1/3) det (D).
You could do more row operations or you could note that this can be easily expanded along the first column. Then, expand the resulting 3 × 3 matrix also along the first column. This results in

det (D) = 1 · (-3) · det [11 22; -14 17] = (-3)(187 + 308) = -1485

and so det (A) = -(1/3)(-1485) = 495.
You can see that by using row operations, we can simplify a matrix to the point where Laplace Ex-
pansion involves only a few steps. In Example 3.37, we also could have continued until the matrix was in
upper triangular form, and taken the product of the entries on the main diagonal. Whenever computing the
determinant, it is useful to consider all the possible methods and tools.
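As a check on the row-operation computation, the determinant of A can also be evaluated directly in software. A quick Python sketch (ours; it assumes the numpy library):

    import numpy as np

    A = np.array([[1, 2, 3, 4],
                  [5, 1, 2, 3],
                  [4, 5, 4, 3],
                  [2, 2, -4, 5]], dtype=float)

    print(round(np.linalg.det(A)))  # 495, as found above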
Consider the next example.
Solution. Once again, we will simplify the matrix through row operations, this time starting from

A = [1 2 3 2; 1 -3 2 1; 2 1 2 5; 3 -4 1 2]

Add -1 times the first row to the second row. Next add -2 times the first row to the third and finally take -3 times the first row and add to the fourth row. This yields

B = [1 2 3 2; 0 -5 -1 -1; 0 -3 -4 1; 0 -10 -8 -4]
By Theorem 3.21, det (A) = det (B).
Remember you can work with the columns also. Take -5 times the fourth column and add to the second column. This yields

C = [1 -8 3 2; 0 0 -1 -1; 0 -8 -4 1; 0 10 -8 -4]
By Theorem 3.21 det (A) = det (C).
Now take -1 times the third row and add to the top row. This gives

D = [1 0 7 1; 0 0 -1 -1; 0 -8 -4 1; 0 10 -8 -4]
which by Theorem 3.21 has the same determinant as A.
Now, we can find det (D) by expanding along the first column as follows. You can see that there will be only one nonzero term.

det (D) = 1 · det [0 -1 -1; -8 -4 1; 10 -8 -4] + 0 + 0 + 0
Expanding again along the first column, we have

det (D) = 1 · ( 0 + 8 · det [-1 -1; -8 -4] + 10 · det [-1 -1; -4 1] ) = -82
Now since det (A) = det (D), it follows that det (A) = -82.
Remember that you can verify these answers by using Laplace Expansion on A. Similarly, if you first
compute the determinant using Laplace Expansion, you can use the row operation method to verify.
Exercises
Exercise 3.1.2 Let A = [1 2 4; 0 1 3; 2 5 1]. Find the following.
(a) minor(A)11
(b) minor(A)21
(c) minor(A)32
(d) cof(A)11
(e) cof(A)21
(f) cof(A)32
Exercise 3.1.4 Find the following determinant by expanding along the first row and second column.
1 2 1
2 1 3
2 1 1
Exercise 3.1.5 Find the following determinant by expanding along the first column and third row.
1 2 1
1 0 1
2 1 1
Exercise 3.1.6 Find the following determinant by expanding along the second row and first column.
1 2 1
2 1 3
2 1 1
Exercise 3.1.7 Compute the determinant by cofactor expansion. Pick the easiest row or column to use.
1 0 0 1
2 1 1 0
0 0 0 2
2 1 3 1
Exercise 3.1.9 An operation is done to get from the first matrix to the second. Identify what was done and tell how it will affect the value of the determinant.

[a b; c d] → [a c; b d]
Exercise 3.1.10 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
[a b; c d] → [c d; a b]
Exercise 3.1.11 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
[a b; c d] → [a b; a+c b+d]
Exercise 3.1.12 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
[a b; c d] → [a b; 2c 2d]
Exercise 3.1.13 An operation is done to get from the first matrix to the second. Identify what was done
and tell how it will affect the value of the determinant.
[a b; c d] → [b a; d c]
Exercise 3.1.14 Let A be an r × r matrix and suppose there are r - 1 rows (columns) such that all rows (columns) are linear combinations of these r - 1 rows (columns). Show det (A) = 0.
Exercise 3.1.15 Show det (aA) = a^n det (A) for an n × n matrix A and scalar a.
Exercise 3.1.16 Construct 2 × 2 matrices A and B to show that det (A) det (B) = det (AB).
Exercise 3.1.17 Is it true that det (A + B) = det (A) + det (B)? If this is so, explain why. If it is not so, give
a counter example.
Exercise 3.1.18 An n × n matrix is called nilpotent if A^k = 0 for some positive integer k. If A is a nilpotent matrix and k is the smallest possible integer such that A^k = 0, what are the possible values of det (A)?
Exercise 3.1.19 A matrix is said to be orthogonal if A^T A = I. Thus the inverse of an orthogonal matrix is just its transpose. What are the possible values of det (A) if A is an orthogonal matrix?
Exercise 3.1.20 Let A and B be two n × n matrices. A ∼ B (A is similar to B) means there exists an invertible matrix P such that A = P^{-1}BP. Show that if A ∼ B, then det (A) = det (B).
Exercise 3.1.21 Tell whether each statement is true or false. If true, provide a proof. If false, provide a
counter example.
(a) If A is a 3 × 3 matrix with a zero determinant, then one column must be a multiple of some other column.
(b) If any two columns of a square matrix are equal, then the determinant of the matrix equals zero.
(c) For two n × n matrices A and B, det (A + B) = det (A) + det (B).
(f) If B is obtained by multiplying a single row of A by 4, then det (B) = 4 det (A).
Exercise 3.1.22 Find the determinant using row operations to first simplify.
1 2 1
2 3 2
4 1 2
Exercise 3.1.23 Find the determinant using row operations to first simplify.
2 1 3
2 4 2
1 4 5
Exercise 3.1.24 Find the determinant using row operations to first simplify.
1 2 1 2
3 1 2 3
1 0 3 1
2 3 2 2
Exercise 3.1.25 Find the determinant using row operations to first simplify.
1 4 1 2
3 2 2 3
1 0 3 3
2 1 2 2
Outcomes
A. Use determinants to determine whether a matrix has an inverse, and evaluate the inverse using cofactors.
B. Apply Cramer's Rule to solve a system of equations with an invertible coefficient matrix.
C. Given data points, find an appropriate interpolating polynomial and use it to estimate points.
The determinant of a matrix also provides a way to find the inverse of a matrix. Recall the definition of the inverse of a matrix in Definition 2.33. We say that A^{-1}, an n × n matrix, is the inverse of A, also n × n, if AA^{-1} = I and A^{-1}A = I.
We now define a new matrix called the cofactor matrix of A. The cofactor matrix of A is the matrix whose ijth entry is the ijth cofactor of A. The formal definition is as follows.
Note that cof (A)_{ij} denotes the ijth entry of the cofactor matrix.
We will use the cofactor matrix to create a formula for the inverse of A. First, we define the adjugate
of A to be the transpose of the cofactor matrix. We can also call this matrix the classical adjoint of A, and
we denote it by adj (A).
In the specific case where A is a 2 × 2 matrix given by

A = [a b; c d]

the adjugate is adj (A) = [d -b; -c a].
Notice that the first formula holds for any n × n matrix A, and in the case A is invertible we actually have a formula for A^{-1}.
Consider the following example.
Solution. First we will find the determinant of the matrix

A = [1 2 3; 3 0 1; 1 2 1]

Using Theorems 3.16, 3.18, and 3.21, we can first simplify the matrix through row operations. First, add -3 times the first row to the second row. Then add -1 times the first row to the third row to obtain

B = [1 2 3; 0 -6 -8; 0 0 -2]

By Theorem 3.21, det (A) = det (B). By Theorem 3.13, det (B) = (1)(-6)(-2) = 12. Hence, det (A) = 12.
Now, we need to find adj (A). To do so, first we will find the cofactor matrix of A. This is given by

cof (A) = [-2 -2 6; 4 -2 0; 2 8 -6]
Here, the ijth entry is the ijth cofactor of the original matrix A, which you can verify. Therefore, from Theorem 3.40, the inverse of A is given by

A^{-1} = (1/12) [-2 -2 6; 4 -2 0; 2 8 -6]^T = [-1/6 1/3 1/6; -1/6 -1/6 2/3; 1/2 0 -1/2]
Remember that we can always verify our answer for A^{-1}. Compute the products AA^{-1} and A^{-1}A and make sure each product is equal to I.
Compute A^{-1}A as follows:

A^{-1}A = [-1/6 1/3 1/6; -1/6 -1/6 2/3; 1/2 0 -1/2] [1 2 3; 3 0 1; 1 2 1] = [1 0 0; 0 1 0; 0 0 1] = I
You can verify that AA^{-1} = I and hence our answer is correct.
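The cofactor method itself is easy to automate. Below is a minimal Python sketch (ours, assuming numpy; the helper name adjugate is our own) that reproduces A^{-1} for the matrix A of this example:

    import numpy as np

    def adjugate(A):
        # Transpose of the cofactor matrix of the square matrix A.
        n = A.shape[0]
        cof = np.zeros((n, n))
        for i in range(n):
            for j in range(n):
                # Delete row i and column j, then take the signed minor.
                minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
                cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
        return cof.T

    A = np.array([[1.0, 2.0, 3.0],
                  [3.0, 0.0, 1.0],
                  [1.0, 2.0, 1.0]])
    A_inv = adjugate(A) / np.linalg.det(A)    # det(A) = 12
    print(np.allclose(A_inv @ A, np.eye(3)))  # True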
We will look at another example of how to use this formula to find A^{-1}.
Solution. First we need to find det (A). This step is left as an exercise and you should verify that det (A) = 1/6. The inverse is therefore equal to

A^{-1} = (1/(1/6)) adj (A) = 6 adj (A)
We continue to calculate as follows. Each entry of the cofactor matrix is a 2 × 2 determinant formed from the entries of A; after computing all nine cofactors, we transpose the cofactor matrix and multiply by 6 to obtain A^{-1}.
Again, you can always check your work by multiplying A^{-1}A and AA^{-1} and ensuring these products equal I. Carrying out the multiplication,

A^{-1}A = [1 0 0; 0 1 0; 0 0 1] = I
This tells us that our calculation for A^{-1} is correct. It is left to the reader to verify that AA^{-1} = I.
The verification step is very important, as it is a simple way to check your work! If you multiply A^{-1}A and AA^{-1} and these products are not both equal to I, be sure to go back and double check each step. One common error is to forget to take the transpose of the cofactor matrix, so be sure to complete this step.
We will now prove Theorem 3.40.
Proof. (of Theorem 3.40) Recall that the (i, j)-entry of adj(A) is equal to cof(A)_{ji}. Thus the (i, j)-entry of B = A · adj(A) is:

B_{ij} = ∑_{k=1}^{n} a_{ik} adj(A)_{kj} = ∑_{k=1}^{n} a_{ik} cof(A)_{jk}
By the cofactor expansion theorem, we see that this expression for B_{ij} is equal to the determinant of the matrix obtained from A by replacing its jth row by a_{i1}, a_{i2}, . . . , a_{in}, i.e., its ith row.
If i = j then this matrix is A itself and therefore B_{ii} = det A. If on the other hand i ≠ j, then this matrix has its ith row equal to its jth row, and therefore B_{ij} = 0 in this case. Thus we obtain:
A adj (A) = det (A)I
Similarly we can verify that:
adj (A) A = det (A)I
Next, if A is invertible, then taking determinants on both sides of AA^{-1} = I gives det (A) det (A^{-1}) = 1, and thus det (A) ≠ 0. Equivalently, if det (A) = 0, then A is not invertible.
Finally if det (A) ≠ 0, then the above formula shows that A is invertible and that:

A^{-1} = (1/det (A)) adj (A)
Another context in which the formula given in Theorem 3.40 is important is Cramer's Rule. Recall that we can represent a system of linear equations in the form AX = B, where the solutions to this system are given by X. Cramer's Rule gives a formula for the solutions X in the special case that A is a square invertible matrix. Note this rule does not apply if you have a system of equations in which there is a different number of equations than variables (in other words, when A is not square), or when A is not invertible.
Suppose we have a system of equations given by AX = B, and we want to find solutions X which satisfy this system. Then recall that if A^{-1} exists,

AX = B
A^{-1}(AX) = A^{-1}B
(A^{-1}A)X = A^{-1}B
IX = A^{-1}B
X = A^{-1}B
Hence, the solutions X to the system are given by X = A^{-1}B. Since we assume that A^{-1} exists, we can use the formula for A^{-1} given above. Substituting this formula into the equation for X, we have

X = A^{-1}B = (1/det (A)) adj (A) B
Let x_i be the ith entry of X and b_j be the jth entry of B. Then this equation becomes

x_i = ∑_{j=1}^{n} (A^{-1})_{ij} b_j = (1/det (A)) ∑_{j=1}^{n} adj (A)_{ij} b_j

By the cofactor expansion theorem, this last sum is the determinant of the matrix obtained from A when the ith column of A is replaced with the column vector [b_1, · · ·, b_n]^T. The determinant of this modified matrix is taken and divided by det (A). This formula is known as Cramer's rule.
We formally define this method now.
x_i = det (A_i) / det (A)

where A_i is the matrix obtained by replacing the ith column of A with the column matrix

B = [b_1; . . . ; b_n]
Solution. We will use the method outlined in Procedure 3.44 to find the values for x, y, z which give the solution to this system, whose coefficient matrix is

A = [1 2 1; 3 2 1; 2 -3 2]

Let

B = [1; 2; 3]
In order to find x, we calculate

x = det (A_1) / det (A)

where A_1 is the matrix obtained from replacing the first column of A with B. Hence, A_1 is given by

A_1 = [1 2 1; 2 2 1; 3 -3 2]
Therefore,

x = det (A_1) / det (A) = det [1 2 1; 2 2 1; 3 -3 2] / det [1 2 1; 3 2 1; 2 -3 2] = 1/2
Similarly, to find y we construct A_2 by replacing the second column of A with B. Hence, A_2 is given by

A_2 = [1 1 1; 3 2 1; 2 3 2]
Therefore,

y = det (A_2) / det (A) = det [1 1 1; 3 2 1; 2 3 2] / det [1 2 1; 3 2 1; 2 -3 2] = -1/7
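Cramer's rule translates directly into code. A short Python sketch (ours, assuming numpy; the function name cramer is our own) applied to the system above:

    import numpy as np

    def cramer(A, b):
        # Solve Ax = b for square invertible A by Cramer's rule.
        det_A = np.linalg.det(A)
        x = np.empty(len(b))
        for i in range(len(b)):
            Ai = A.copy()
            Ai[:, i] = b  # replace the ith column of A with b
            x[i] = np.linalg.det(Ai) / det_A
        return x

    A = np.array([[1.0, 2.0, 1.0],
                  [3.0, 2.0, 1.0],
                  [2.0, -3.0, 2.0]])
    b = np.array([1.0, 2.0, 3.0])
    print(cramer(A, b))  # [0.5, -0.14285..., ...], i.e. x = 1/2, y = -1/7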
Solution. We are asked to find the value of z in the solution. We will solve using Cramer's rule. Thus

z = det [1 0 1; 0 e^t cos t t; 0 -e^t sin t t^2] / det [1 0 0; 0 e^t cos t e^t sin t; 0 -e^t sin t e^t cos t] = t((cos t)t + sin t) e^{-t}
In studying a set of data that relates variables x and y, it may be the case that we can use a polynomial to fit the data. If such a polynomial can be established, it can be used to estimate values of x and y which
Consider the following example.
p(x) = r_0 + r_1 x + r_2 x^2

such that p(1) = 4, p(2) = 9 and p(3) = 12. To find this polynomial, substitute the known values in for x and solve for r_0, r_1, and r_2.

p(1) = r_0 + r_1 + r_2 = 4
p(2) = r_0 + 2r_1 + 4r_2 = 9
p(3) = r_0 + 3r_1 + 9r_2 = 12

Therefore the solution to the system is r_0 = -3, r_1 = 8, r_2 = -1 and the required interpolating polynomial is

p(x) = -3 + 8x - x^2
To estimate the value for x = 1/2, we calculate p(1/2):

p(1/2) = -3 + 8(1/2) - (1/2)^2 = -3 + 4 - 1/4 = 3/4
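The interpolation procedure amounts to solving a linear system with a Vandermonde coefficient matrix. A small Python sketch of ours (assuming numpy):

    import numpy as np

    xs = np.array([1.0, 2.0, 3.0])
    ys = np.array([4.0, 9.0, 12.0])

    V = np.vander(xs, 3, increasing=True)  # columns 1, x, x^2
    r = np.linalg.solve(V, ys)             # [-3., 8., -1.]
    print(r)

    # Estimate p(1/2) = r0 + r1*(1/2) + r2*(1/2)^2
    print(r[0] + r[1] * 0.5 + r[2] * 0.25)  # 0.75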
This procedure can be used for any number of data points, and any degree of polynomial. The steps
are outlined below.
4. Solving this system will result in a unique solution r_0, r_1, · · ·, r_{n-1}. Use these values to construct p(x), and estimate the value of p(a) for any x = a.

p(x) = r_0 + r_1 x + r_2 x^2 + r_3 x^3
p(0) = r0 = 1
p(1) = r0 + r1 + r2 + r3 = 2
p(3) = r0 + 3r1 + 9r2 + 27r3 = 22
p(5) = r0 + 5r1 + 25r2 + 125r3 = 66
Exercises
Determine whether the matrix A has an inverse by finding whether the determinant is nonzero. If the determinant is nonzero, find the inverse using the formula for the inverse.
Exercise 3.2.31 For the following matrices, determine if they are invertible. If so, use the formula for the
inverse in terms of the cofactor matrix to find each inverse. If the inverse does not exist, explain why.
(a) [1 1; 1 2]

(b) [1 2 3; 0 2 1; 4 1 1]

(c) [1 2 1; 2 3 0; 0 1 2]
Does there exist a value of t for which this matrix fails to have an inverse? Explain.
Exercise 3.2.36 Show that if det (A) 6= 0 for A an n n matrix, it follows that if AX = 0, then X = 0.
Exercise 3.2.37 Suppose A, B are n × n matrices and that AB = I. Show that then BA = I. Hint: First explain why det (A), det (B) are both nonzero. Then note (AB)A = A and then show BA(BA - I) = 0. From this use what is given to conclude A(BA - I) = 0. Then use Problem 3.2.36.
Exercise 3.2.38 Use the formula for the inverse in terms of the cofactor matrix to find the inverse of the matrix

A = [e^t 0 0; 0 e^t cos t e^t sin t; 0 e^t cos t - e^t sin t e^t cos t + e^t sin t]
Exercise 3.2.40 Suppose A is an upper triangular matrix. Show that A^{-1} exists if and only if all elements of the main diagonal are nonzero. Is it true that A^{-1} will also be upper triangular? Explain. Could the same be concluded for lower triangular matrices?
Exercise 3.2.41 If A, B, and C are each n × n matrices and ABC is invertible, show why each of A, B, and C is invertible.
Exercise 3.2.42 Decide if this statement is true or false: Cramer's rule is useful for finding solutions to systems of linear equations in which there is an infinite set of solutions.

x + 2y + z = 1
2x - y - z = 2
x + z = 1
4. Rn
4.1 Vectors in Rn
Outcomes
A. Find the position vector of a point in Rn .
The notation Rn refers to the collection of ordered lists of n real numbers, that is

Rn = { (x_1, · · ·, x_n) : x_j ∈ R for j = 1, · · ·, n }
In this chapter, we take a closer look at vectors in Rn . First, we will consider what Rn looks like in more
detail. Recall that the point given by 0 = (0, , 0) is called the origin.
Now, consider the case of Rn for n = 1. Then from the definition we can identify R with points in R1
as follows:
R = R1 = { (x_1) : x_1 ∈ R }
Hence, R is defined as the set of all real numbers and geometrically, we can describe this as all the points
on a line.
Now suppose n = 2. Then, from the definition,
R2 = { (x_1, x_2) : x_j ∈ R for j = 1, 2 }
Consider the familiar coordinate plane, with an x axis and a y axis. Any point within this coordinate plane
is identified by where it is located along the x axis, and also where it is located along the y axis. Consider
as an example the following diagram.
[Figure: the coordinate plane, showing the points P = (2, 1) and Q = (-3, 4).]
Hence, every element in R2 is identified by two components, x and y, in the usual manner. The coordinates x, y (or x_1, x_2) uniquely determine a point in the plane. Note that while the definition uses x_1 and x_2 to label the coordinates and you may be used to x and y, these notations are equivalent.
Now suppose n = 3. You may have previously encountered the 3-dimensional coordinate system, given
by
R3 = (x1 , x2 , x3 ) : x j R for j = 1, 2, 3
Points in R3 will be determined by three coordinates, often written (x, y, z) which correspond to the x,
y, and z axes. We can think as above that the first two coordinates determine a point in a plane. The third
component determines the height above or below the plane, depending on whether this number is positive
or negative, and all together this determines a point in space. You see that the ordered triples correspond to
points in space just as the ordered pairs correspond to points in a plane and single real numbers correspond
to points on a line.
The idea behind the more general Rn is that we can extend these ideas beyond n = 3. This discussion
regarding points in Rn leads into a study of vectors in Rn . While we consider Rn for all n, we will largely
focus on n = 2, 3 in this section.
Consider the following definition.
For this reason we may write both P = (p1 , , pn ) Rn and 0P = [p1 pn ]T Rn .
This definition is illustrated in the following picture for the special case of R3 .
P = (p1 , p2 , p3 )
T
0P = p1 p2 p3
Thus every point P in Rn determines its position vector 0P. Conversely, every such position vector 0P
which has its tail at 0 and point at P determines the point P of Rn .
Now suppose we are given two points, P, Q whose coordinates are (p1 , , pn ) and (q1 , , qn ) re-
spectively. We can also determine the position vector from P to Q (also called the vector from P to Q)
defined as follows.
PQ = [q_1 - p_1; . . . ; q_n - p_n] = 0Q - 0P
Now, imagine taking a vector in Rn and moving it around, always keeping it pointing in the same
direction as shown in the following picture.
[Figure: the position vector 0P = [p_1 p_2 p_3]^T and an equal vector AB drawn with its tail at A and tip at B.]
After moving it around, it is regarded as the same vector. Each of the vectors 0P and AB has the same length (or magnitude) and direction. Therefore, they are equal.
Consider now the general definition for a vector in Rn: a column ~x = [x_1 · · · x_n]^T with entries in R is called a vector. Vectors have both size (magnitude) and direction. The numbers x_j are called the components of ~x.
Using this notation, we may use ~p to denote the position vector of point P. Notice that in this context,
~p = 0P. These notations may be used interchangeably.
You can think of the components of a vector as directions for obtaining the vector. Consider n = 3.
Draw a vector with its tail at the point (0, 0, 0) and its tip at the point (a, b, c). This vector is obtained
by starting at (0, 0, 0), moving parallel to the x axis to (a, 0, 0) and then from here, moving parallel to the
y axis to (a, b, 0) and finally parallel to the z axis to (a, b, c) . Observe that the same vector would result if
you began at the point (d, e, f ), moved parallel to the x axis to (d + a, e, f ) , then parallel to the y axis to
(d + a, e + b, f ) , and finally parallel to the z axis to (d + a, e + b, f + c). Here, the vector would have its
tail sitting at the point determined by A = (d, e, f ) and its point at B = (d + a, e + b, f + c) . It is the same
vector because it will point in the same direction and have the same length. It is like you took an actual
arrow, and moved it from one location to another keeping it pointing the same direction.
We conclude this section with a brief discussion regarding notation. In previous sections, we have
written vectors as columns, or n 1 matrices. For convenience in this chapter we may write vectors as the
transpose of row vectors, or 1 n matrices. These are of course equivalent and we may move between
both notations. Therefore, recognize that

[2; 3] = [2 3]^T
Notice that two vectors ~u = [u_1 · · · u_n]^T and ~v = [v_1 · · · v_n]^T are equal if and only if all corresponding components are equal. Precisely,

~u = ~v if and only if u_j = v_j for all j = 1, · · ·, n
Thus [1 2 4]^T ∈ R3 and [2 1 4]^T ∈ R3, but [1 2 4]^T ≠ [2 1 4]^T because, even though the same numbers are involved, the order of the numbers is different.
For the specific case of R3, there are three special vectors which we often use. They are given by

~i = [1 0 0]^T
~j = [0 1 0]^T
~k = [0 0 1]^T

We can write any vector ~u = [u_1 u_2 u_3]^T as a linear combination of these vectors, written as ~u = u_1~i + u_2~j + u_3~k. This notation will be used throughout this chapter.
4.2 Algebra in Rn
Outcomes
A. Understand vector addition and scalar multiplication, algebraically.
Addition and scalar multiplication are two important algebraic operations done with vectors. Notice
that these operations apply to vectors in Rn , for any value of n. We will explore these operations in more
detail in the following sections.
To add vectors, we simply add corresponding components exactly as we did for matrices. Therefore,
in order to add vectors, they must be the same size.
Similarly to matrices, addition of vectors satisfies some important properties. These are outlined in the
following theorem.
~u +~0 = ~u (4.1)

~u + (-~u) = ~0

The proof of this theorem follows from the similar properties for matrix operations. Thus the additive identity shown in equation 4.1 is also called the zero vector, the n × 1 vector in which all components are equal to 0. Further, -~u is simply the vector with all components having the same value as those of ~u but opposite sign; this is just (-1)~u. This will be made more explicit in the next section when we explore scalar multiplication of vectors. Note that subtraction is defined as ~u -~v = ~u + (-~v).
Scalar multiplication of vectors in Rn is defined as follows. Notice that, just like addition, this definition
is the same as the corresponding definition for matrices.
Just as with addition, scalar multiplication of vectors satisfies several important properties. These are
k (p~u) = (kp)~u
Proof. Again the verification of these properties follows from the corresponding properties for scalar
multiplication of matrices.
As a refresher we can show that

k(~u +~v) = k~u + k~v

Note that:

k(~u +~v) = k [u_1 + v_1 · · · u_n + v_n]^T
         = [k(u_1 + v_1) · · · k(u_n + v_n)]^T
         = [ku_1 + kv_1 · · · ku_n + kv_n]^T
         = [ku_1 · · · ku_n]^T + [kv_1 · · · kv_n]^T
         = k~u + k~v
We now present a useful notion you may have seen earlier combining vector addition and scalar multiplication. For example,

3 [4; -1; 0] + 2 [3; 0; 1] = [18; -3; 2].
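In code, both operations are componentwise, so a linear combination is a one-liner. For instance, the computation above in Python (a sketch of ours, assuming numpy):

    import numpy as np

    u = np.array([4, -1, 0])
    v = np.array([3, 0, 1])
    print(3 * u + 2 * v)  # [18 -3  2]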
Exercises
Exercise 4.2.1 Find 3 [5; 1; 2; 3] + 5 [8; 2; 3; 6].
Exercise 4.2.2 Find 7 [6; 0; 4; 1] + 6 [13; 1; 1; 6].
Outcomes
A. Understand vector addition, geometrically.
Recall that an element of Rn is an ordered list of numbers. For the specific case of n = 2, 3 this can
be used to determine a point in two or three dimensional space. This point is specified relative to some
coordinate axes.
Consider the case n = 3. Recall that taking a vector and moving it around without changing its length or
direction does not change the vector. This is important in the geometric representation of vector addition.
Suppose we have two vectors, ~u and ~v in R3 . Each of these can be drawn geometrically by placing the
tail of each vector at 0 and its point at (u1 , u2 , u3 ) and (v1 , v2 , v3 ) respectively. Suppose we slide the vector
~v so that its tail sits at the point of ~u. We know that this does not change the vector ~v. Now, draw a new
vector from the tail of ~u to the point of ~v. This vector is ~u +~v.
The geometric significance of vector addition in Rn for any n is given in the following definition.
[Figure: ~v placed with its tail at the tip of ~u; the vector ~u +~v runs from the tail of ~u to the tip of ~v.]
This definition is illustrated in the following picture in which ~u +~v is shown for the special case n = 3.
[Figure: ~u +~v shown in R3 with coordinate axes x, y, z.]
Notice the parallelogram created by ~u and ~v in the above diagram. Then ~u +~v is the directed diagonal
of the parallelogram determined by the two vectors ~u and ~v.
When you have a vector ~v, its additive inverse -~v will be the vector which has the same magnitude as ~v but the opposite direction. When one writes ~u -~v, the meaning is ~u + (-~v) as with real numbers. The following example illustrates these definitions and conventions.
[Figure: two vectors ~u and ~v.]
Solution. We will first sketch ~u +~v. Begin by drawing ~u and then at the point of ~u, place the tail of ~v as
shown. Then ~u +~v is the vector which results from drawing a vector from the tail of ~u to the tip of ~v.
[Figure: ~u drawn first, ~v placed at its tip, and ~u +~v drawn from the tail of ~u to the tip of ~v.]
Next consider ~u -~v. This means ~u + (-~v). From the above geometric description of vector addition, -~v is the vector which has the same length but which points in the opposite direction to ~v. Here is a picture.
[Figure: ~u, -~v, and the sum ~u -~v.]
Outcomes
A. Find the length of a vector and the distance between two points in Rn .
In this section, we explore what is meant by the length of a vector in Rn . We develop this concept by
first looking at the distance between two points in Rn .
First, we will consider the concept of distance for R, that is, for points in R1. Here, the distance between two points P and Q is given by the absolute value of their difference. We denote the distance between P and Q by d(P, Q), which is defined as

d(P, Q) = √((P - Q)^2) (4.2)

Consider the following picture in the case n = 2.

[Figure: the points P = (p_1, p_2) and Q = (q_1, q_2) in the plane, together with the corner point (p_1, q_2) of the right triangle they determine.]
There are two points P = (p1 , p2 ) and Q = (q1 , q2 ) in the plane. The distance between these points
is shown in the picture as a solid line. Notice that this line is the hypotenuse of a right triangle which
is half of the rectangle shown in dotted lines. We want to find the length of this hypotenuse which will
give the distance between the two points. Note the lengths of the sides of this triangle are |p_1 - q_1| and |p_2 - q_2|, the absolute value of the difference in these values. Therefore, the Pythagorean Theorem implies the length of the hypotenuse (and thus the distance between P and Q) equals

(|p_1 - q_1|^2 + |p_2 - q_2|^2)^{1/2} = ((p_1 - q_1)^2 + (p_2 - q_2)^2)^{1/2} (4.3)
Now suppose n = 3 and let P = (p1 , p2 , p3 ) and Q = (q1 , q2 , q3 ) be two points in R3 . Consider the
following picture in which the solid line joins the two points and a dotted line joins the points (q1 , q2 , q3 )
and (p1 , p2 , q3 ) .
[Figure: the points P = (p_1, p_2, p_3) and Q = (q_1, q_2, q_3), with auxiliary points (p_1, p_2, q_3) and (p_1, q_2, q_3).]
Here, we need to use the Pythagorean Theorem twice in order to find the length of the solid line. First, by the Pythagorean Theorem, the length of the dotted line joining (q_1, q_2, q_3) and (p_1, p_2, q_3) equals

((p_1 - q_1)^2 + (p_2 - q_2)^2)^{1/2}

while the length of the line joining (p_1, p_2, q_3) to (p_1, p_2, p_3) is just |p_3 - q_3|. Therefore, by the Pythagorean Theorem again, the length of the line joining the points P = (p_1, p_2, p_3) and Q = (q_1, q_2, q_3) equals

( ( ((p_1 - q_1)^2 + (p_2 - q_2)^2)^{1/2} )^2 + (p_3 - q_3)^2 )^{1/2}
= ((p_1 - q_1)^2 + (p_2 - q_2)^2 + (p_3 - q_3)^2)^{1/2} (4.4)
This discussion motivates the following definition for the distance between points in Rn .
This is called the distance formula. We may also write |P - Q| for the distance between P and Q.
From the above discussion, you can see that Definition 4.10 holds for the special cases n = 1, 2, 3, as
in Equations 4.2, 4.3, 4.4. In the following example, we use Definition 4.10 to find the distance between
two points in R4 .
P = (1, 2, -4, 6)

and

Q = (2, 3, -1, 0)
Solution. We will use the formula given in Definition 4.10 to find the distance between P and Q. Use the distance formula and write

d(P, Q) = ((1 - 2)^2 + (2 - 3)^2 + (-4 - (-1))^2 + (6 - 0)^2)^{1/2} = 47^{1/2}

Therefore, d(P, Q) = √47.
There are certain properties of the distance between points which are important in our study. These are
outlined in the following theorem.
d(P, Q) = d(Q, P)
There are many applications of the concept of distance. For instance, given two points, we can ask what collection of points are all the same distance from each of the given points. This is explored in the following example.
Solution. Let P = (p_1, p_2, p_3) be such a point. Therefore, P is the same distance from (1, 2, 3) and (0, 1, 2). Then by Definition 4.10,

√((p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2) = √(p_1^2 + (p_2 - 1)^2 + (p_3 - 2)^2)

Squaring both sides,

(p_1 - 1)^2 + (p_2 - 2)^2 + (p_3 - 3)^2 = p_1^2 + (p_2 - 1)^2 + (p_3 - 2)^2

and so

p_1^2 - 2p_1 + 14 + p_2^2 - 4p_2 + p_3^2 - 6p_3 = p_1^2 + p_2^2 - 2p_2 + 5 + p_3^2 - 4p_3

Simplifying, this becomes

-2p_1 + 14 - 4p_2 - 6p_3 = -2p_2 + 5 - 4p_3

which can be written as

2p_1 + 2p_2 + 2p_3 = 9 (4.5)

Therefore, the points P = (p_1, p_2, p_3) which are the same distance from each of the given points form a plane whose equation is given by 4.5.
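Equation 4.5 is easy to spot-check numerically: any point satisfying 2p_1 + 2p_2 + 2p_3 = 9 should be equidistant from the two given points. A quick Python sketch of ours (assuming numpy):

    import numpy as np

    P = np.array([4.5, 0.0, 0.0])  # satisfies 2(4.5) + 0 + 0 = 9
    A = np.array([1.0, 2.0, 3.0])
    B = np.array([0.0, 1.0, 2.0])

    # Both distances equal sqrt(25.25)
    print(np.linalg.norm(P - A), np.linalg.norm(P - B))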
We can now use our understanding of the distance between two points to define what is meant by the
length of a vector. Consider the following definition.
This definition corresponds to Definition 4.10, if you consider the vector ~u to have its tail at the point
0 = (0, , 0) and its tip at the point U = (u1 , , un ). Then the length of ~u is equal to the distance between
0 and U , d(0,U ). In general, d(P, Q) = kPQk.
Consider Example 4.11. By Definition 4.14, we could also find the distance between P and Q as the length of the vector connecting them. Hence, if we were to draw a vector PQ with its tail at P and its point at Q, this vector would have length equal to √47.
We conclude this section with a new definition for the special case of vectors of length 1.
k~uk = 1
Let ~v be a vector in Rn . Then, the vector ~u which has the same direction as ~v but length equal to 1 is
the corresponding unit vector of ~v. This vector is given by
~u = (1/k~vk) ~v
We often use the term normalize to refer to this process. When we normalize a vector, we find the
corresponding unit vector of length 1. Consider the following example.
Solution. We will use Definition 4.15 to solve this. Therefore, we need to find the length of ~v which, by Definition 4.14, is given by

k~vk = √(v_1^2 + v_2^2 + v_3^2)

Using the corresponding values we find that

k~vk = √(1^2 + (-3)^2 + 4^2)
= √(1 + 9 + 16)
= √26

In order to find ~u, we divide ~v by √26. The result is

~u = (1/k~vk) ~v = (1/√26) [1 -3 4]^T = [1/√26 -3/√26 4/√26]^T
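Normalization is a single division in code. A Python sketch of ours (assuming numpy):

    import numpy as np

    v = np.array([1.0, -3.0, 4.0])
    u = v / np.linalg.norm(v)  # divide by ||v|| = sqrt(26)
    print(np.linalg.norm(u))   # 1.0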
Outcomes
A. Understand scalar multiplication, geometrically.
Recall that the point P = (p_1, p_2, p_3) determines a vector ~p from 0 to P. The length of ~p, denoted k~pk, is equal to √(p_1^2 + p_2^2 + p_3^2) by Definition 4.10.
Now suppose we have a vector ~u = [u_1 u_2 u_3]^T and we multiply ~u by a scalar k. By Definition 4.5, k~u = [ku_1 ku_2 ku_3]^T. Then, by using Definition 4.10, the length of this vector is given by

√((ku_1)^2 + (ku_2)^2 + (ku_3)^2) = |k| √(u_1^2 + u_2^2 + u_3^2)

Hence the following holds: kk~uk = |k| k~uk.
[Figure: two vectors ~u and ~v.]
Solution.
In order to find -~u, we preserve the length of ~u and simply reverse the direction. For 2~v, we double the length of ~v, while preserving the direction. Finally -(1/2)~v is found by taking half the length of ~v and reversing the direction. These vectors are shown in the following diagram.

[Figure: the vectors ~u, -~u, ~v, 2~v, and -(1/2)~v.]
Now that we have studied both vector addition and scalar multiplication, we can combine the two
actions. Recall Definition 4.7 of linear combinations of column matrices. We can apply this definition to
vectors in Rn . A linear combination of vectors in Rn is a sum of vectors multiplied by scalars.
In the following example, we examine the geometric meaning of this concept.
[Figure: the vectors ~u and ~v together with the linear combinations ~u + 2~v and ~u - (1/2)~v.]
Outcomes
A. Find the vector and parametric equations of a line.
We can use the concept of vectors and points to find equations for arbitrary lines in Rn , although in
this section the focus will be on lines in R3 .
To begin, consider the case n = 1 so we have R1 = R. There is only one line here which is the familiar
number line, that is R itself. Therefore it is not necessary to explore the case of n = 1 further.
Now consider the case where n = 2, in other words R2 . Let P and P0 be two different points in R2
which are contained in a line L. Let ~p and ~p0 be the position vectors for the points P and P0 respectively.
Suppose that Q is an arbitrary point on L. Consider the following diagram.

[Figure: the line L through P0 and P, with position vectors ~p0 and ~p and an arbitrary point Q on L.]
Our goal is to be able to define Q in terms of P and P0. Consider the vector P0P = ~p - ~p0 which has its tail at P0 and point at P. If we add ~p - ~p0 to the position vector ~p0 for P0, the sum would be a vector with its point at P. In other words,

~p = ~p0 + (~p - ~p0)
Now suppose we were to add t(~p - ~p0) to ~p0 where t is some scalar. You can see that by doing so, we could find a vector with its point at Q. In other words, we can find t such that

~q = ~p0 + t (~p - ~p0)
This equation determines the line L in R2 . In fact, it determines a line L in Rn . Consider the following
definition.
Note that this definition agrees with the usual notion of a line in two dimensions and so this is consistent with earlier concepts. Consider now points in R3. If a point P ∈ R3 is given by P = (x, y, z), and P0 ∈ R3 by P0 = (x_0, y_0, z_0), then we can write
[x; y; z] = [x_0; y_0; z_0] + t [a; b; c]

where d~ = [a; b; c]. This is the vector equation of L written in component form.
The following theorem claims that such an equation is in fact a line.
Proof. Let ~x_1, ~x_2 ∈ Rn. Define ~x_1 = ~a and let ~x_2 - ~x_1 = ~b. Since ~b ≠ ~0, it follows that ~x_2 ≠ ~x_1. Then ~a + t~b = ~x_1 + t(~x_2 - ~x_1). It follows that ~x = ~a + t~b is a line containing the two different points X_1 and X_2 whose position vectors are given by ~x_1 and ~x_2 respectively.
We can use the above discussion to find the equation of a line when given two distinct points. Consider
the following example.
Solution. We will use the definition of a line given above in Definition 4.19 to write this line in the form ~p = ~p0 + t d~.

Find a vector equation for the line which contains the point P0 = (1, 2, 0) and has direction vector

d~ = [1; 2; 1]
We sometimes elect to write a line such as the one given in 4.6 in the form

x = 1 + t
y = 2 + 2t    where t ∈ R (4.7)
z = t
This set of equations gives the same information as 4.6, and is called the parametric equation of the line.
Consider the following definition.
You can verify that the form discussed following Example 4.22 in equation 4.7 is of the form given in
Definition 4.23.
There is one other form for a line which is useful, which is the symmetric form. Consider the line given by 4.7. You can solve for the parameter t to write

t = x - 1
t = (y - 2)/2
t = z

Therefore,

x - 1 = (y - 2)/2 = z

This is the symmetric form of the line.
In the following example, we look at how to take the equation of a line from symmetric form to
parametric form.
Solution. We want to write this line in the form given by Definition 4.23. This is of the form
x = x0 + ta
y = y_0 + tb    where t ∈ R
z = z0 + tc
Let t = (x - 2)/3, t = (y - 1)/2 and t = z + 3, as given in the symmetric form of the line. Then solving for x, y, z yields

x = 2 + 3t
y = 1 + 2t    with t ∈ R
z = -3 + t
This is the parametric equation for this line.
Now, we want to write this line in the form given by Definition 4.19. This is the form
~p = ~p0 + t d~
Exercises
Exercise 4.6.5 Find the vector equation for the line through (7, 6, 0) and (1, 1, 4) . Then, find the
parametric equations for this line.
Exercise 4.6.6 Find parametric equations for the line through the point (7, 7, 1) with direction vector d~ = [1; 6; 2].
x = t + 2
y = 6 - 3t
z = -t - 6
Find a direction vector for the line and a point on the line.
Exercise 4.6.8 Find the vector equation for the line through the two points (5, 5, 1), (2, 2, 4) . Then, find
the parametric equations.
Exercise 4.6.9 The equation of a line in two dimensions is written as y = x - 5. Find parametric equations for this line.
Exercise 4.6.10 Find parametric equations for the line through (6, 5, 2) and (5, 1, 2) .
x = 2t + 2
y = 5 - 4t
z = -t - 3
Find a direction vector for the line and a point on the line, and write the vector equation of the line.
Exercise 4.6.13 Find the vector equation and parametric equations for the line through the two points
(4, 10, 0), (1, 5, 6) .
Exercise 4.6.14 Find the point on the line segment from P = (4, 7, 5) to Q = (2, 2, 3) which is 1/7 of the way from P to Q.
Exercise 4.6.15 Suppose a triangle in Rn has vertices at P_1, P_2, and P_3. Consider the lines which are drawn from a vertex to the midpoint of the opposite side. Show these three lines intersect in a point and find the coordinates of this point.
Outcomes
A. Compute the dot product of vectors, and use this to compute vector projections.
There are two ways of multiplying vectors which are of great importance in applications. The first of these
is called the dot product. When we take the dot product of vectors, the result is a scalar. For this reason,
the dot product is also called the scalar product and sometimes the inner product. The definition is as
follows.
The dot product ~u •~v is sometimes denoted as (~u,~v) where a comma replaces •. It can also be written as ⟨~u,~v⟩. If we write the vectors as row matrices, it is equal to the matrix product ~u~v^T.
Consider the following example.
With this definition, there are several important properties satisfied by the dot product.
~u •~v = ~v •~u

k~uk^2 = ~u •~u
The proof is left as an exercise. This proposition tells us that we can also use the dot product to find the length of a vector.

Solution. By Proposition 4.27, k~uk^2 = ~u •~u. Therefore, k~uk = √(~u •~u). First, compute ~u •~u; this is given by ~u •~u = 25. Then,

k~uk = √(~u •~u) = √25 = 5
You may wish to compare this to our previous definition of length, given in Definition 4.14.
The Cauchy Schwarz inequality is a fundamental inequality satisfied by the dot product. It is given
in the following theorem.
Furthermore equality is obtained if and only if one of ~u or ~v is a scalar multiple of the other.
Proof. First note that if ~v = ~0 both sides of 4.8 equal zero and so the inequality holds in this case. Therefore, it will be assumed in what follows that ~v ≠ ~0.
Define a function of t ∈ R by

f (t) = (~u + t~v) • (~u + t~v)

Then by Proposition 4.27, f (t) ≥ 0 for all t ∈ R. Also from Proposition 4.27,

f (t) = k~uk^2 + 2t (~u •~v) + t^2 k~vk^2
Now this means the graph of y = f (t) is a parabola which opens up and either its vertex touches the t axis or else the entire graph is above the t axis. In the first case, there exists some t where f (t) = 0 and this requires ~u + t~v = ~0 so one vector is a multiple of the other. Then clearly equality holds in 4.8. In the case where ~v is not a multiple of ~u, it follows f (t) > 0 for all t which says f (t) has no real zeros and so from the quadratic formula,

(2 (~u •~v))^2 - 4 k~uk^2 k~vk^2 < 0

which is equivalent to |~u •~v| < k~uk k~vk.
Notice that this proof was based only on the properties of the dot product listed in Proposition 4.27.
This means that whenever an operation satisfies these properties, the Cauchy Schwarz inequality holds.
There are many other instances of these properties besides vectors in Rn .
The Cauchy Schwarz inequality provides another proof of the triangle inequality for distances in Rn .
Proof. By properties of the dot product and the Cauchy Schwarz inequality,
Hence,

k~u +~vk^2 ≤ (k~uk + k~vk)^2

Taking square roots of both sides you obtain 4.9.
It remains to consider when equality occurs. Suppose ~u = ~0. Then, ~u = 0~v and the claim about when
equality occurs is verified. The same argument holds if ~v = ~0. Therefore, it can be assumed both vectors
are nonzero. To get equality in 4.9 above, Theorem 4.29 implies one of the vectors must be a multiple of
the other. Say ~v = k~u. If k < 0 then equality cannot occur in 4.9 because in this case
Therefore, k ≥ 0.
Given two vectors, ~u and ~v, the included angle is the angle θ between these two vectors such that 0 ≤ θ ≤ π. The dot product can be used to determine the included angle between two vectors. Consider the following picture where θ gives the included angle.

[Figure: the vectors ~u and ~v with the included angle θ between them.]

Proposition 4.31 states that

~u •~v = k~uk k~vk cos θ

In words, the dot product of two vectors equals the product of the magnitude (or length) of the two vectors multiplied by the cosine of the included angle. Note this gives a geometric description of the dot product which does not depend explicitly on the coordinates of the vectors.
Consider the following example.
Then,

k~uk = √((2)(2) + (1)(1) + (1)(1)) = √6
k~vk = √((3)(3) + (4)(4) + (1)(1)) = √26

Therefore, the cosine of the included angle equals

cos θ = 9 / (√26 √6) = 0.7205766...

With the cosine known, the angle can be determined by computing the inverse cosine of that value, giving approximately θ = 0.76616 radians.
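The computation above is easy to reproduce in software. In the Python sketch below (ours, assuming numpy), the components of ~u and ~v are one sign choice consistent with the norms and dot product used above:

    import numpy as np

    u = np.array([2.0, 1.0, -1.0])  # ||u|| = sqrt(6)
    v = np.array([3.0, 4.0, 1.0])   # ||v|| = sqrt(26), u.v = 9

    cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    print(cos_theta)             # 0.72057...
    print(np.arccos(cos_theta))  # 0.76616... radians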
Another application of the geometric description of the dot product is in finding the angle between two lines. Typically one would assume that the lines intersect. In some situations, however, it may make sense to ask this question when the lines do not intersect, such as the angle between two object trajectories. In any case we understand it to mean the smallest angle between (any of) their direction vectors. The only subtlety here is that if ~u is a direction vector for a line, then so is any multiple k~u, and thus we will find supplementary angles among all angles between direction vectors for two lines, and we simply take the smaller of the two.
and

L2 : [x; y; z] = [0; 4; -3] + s [2; 1; -1]
Solution. You can verify that these lines do not intersect, but as discussed above this does not matter and we simply find the smallest angle between any direction vectors for these lines.
To do so we first find the angle between the direction vectors given above:

~u = [-1; 1; 2],  ~v = [2; 1; -1]
We apply the formula

~u •~v = k~uk k~vk cos θ

to obtain cos θ = -1/2, and since we choose included angles between 0 and π we obtain θ = 2π/3.
Now the angles between any two direction vectors for these lines will either be 2π/3 or its supplement π - 2π/3 = π/3. We choose the smaller angle, and therefore conclude that the angle between the two lines is π/3.
We can also use Proposition 4.31 to compute the dot product of two vectors.
Solution. From the geometric description of the dot product in Proposition 4.31
Two nonzero vectors are said to be perpendicular, sometimes also called orthogonal, if the included angle is π/2 radians (90°).
Consider the following proposition.
~u •~v = 0

Proof. This follows directly from Proposition 4.31. First, if the dot product of two nonzero vectors is equal to 0, this tells us that cos θ = 0 (this is where we need nonzero vectors). Thus θ = π/2 and the vectors are perpendicular.
If on the other hand ~v is perpendicular to ~u, then the included angle is π/2 radians. Hence cos θ = 0 and ~u •~v = 0.
Consider the following example: determine whether ~u = [2; 1; -1] and ~v = [1; 3; 5] are perpendicular.
Solution. In order to determine if these two vectors are perpendicular, we compute the dot product. This is given by

~u •~v = (2)(1) + (1)(3) + (-1)(5) = 0

Therefore, by Proposition 4.35 these two vectors are perpendicular.
4.7.3 Projections
In some applications, we wish to write a vector as a sum of two related vectors. Through the concept
of projections, we can find these two vectors. First, we explore an important theorem. The result of this
theorem will provide our definition of a vector projection.
Proof. Suppose 4.13 holds and ~v_|| = k~u. Taking the dot product of both sides of 4.13 with ~u and using ~v_⊥ •~u = 0, this yields

~v •~u = (~v_|| +~v_⊥) •~u = k~u •~u +~v_⊥ •~u = k k~uk^2

which requires k = ~v •~u / k~uk^2. Thus there can be no more than one vector ~v_||. It follows ~v_⊥ must equal ~v -~v_||. This verifies there can be no more than one choice for both ~v_|| and ~v_⊥ and proves their uniqueness.
Now let

~v_|| = (~v •~u / k~uk^2) ~u

and let

~v_⊥ = ~v -~v_|| = ~v - (~v •~u / k~uk^2) ~u

Then ~v_|| = k~u where k = ~v •~u / k~uk^2. It only remains to verify ~v_⊥ •~u = 0. But

~v_⊥ •~u = ~v •~u - (~v •~u / k~uk^2) ~u •~u = ~v •~u -~v •~u = 0
The vector ~v_|| in Theorem 4.37 is called the projection of ~v onto ~u and is denoted by proj_~u (~v).
Solution. We can use the formula provided in Definition 4.38 to find proj_~u (~v). First, compute ~v •~u. This is given by

~v •~u = (2)(1) + (3)(-2) + (-4)(1) = 2 - 6 - 4 = -8

Similarly, ~u •~u is given by

~u •~u = (2)(2) + (3)(3) + (-4)(-4) = 4 + 9 + 16 = 29
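The projection formula is one line of code. A Python sketch of ours (assuming numpy), using the vectors of this example:

    import numpy as np

    v = np.array([1.0, -2.0, 1.0])
    u = np.array([2.0, 3.0, -4.0])

    proj = (v @ u) / (u @ u) * u  # (-8/29) * u
    print(proj)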
We will conclude this section with an important application of projections. Suppose a line L and a
point P are given such that P is not contained in L. Through the use of projections, we can determine the
shortest distance from P to L.
Solution. In order to determine the shortest distance from P to L, we will first find the vector P0P and then find the projection of this vector onto the direction vector d~ = [2; 1; 2] of L. The vector P0P is given by

[1; 3; 5] - [0; 4; -2] = [1; -1; 7]

Then proj_d~ (P0P) = ((P0P • d~) / kd~k^2) d~ = (15/9) [2; 1; 2] = (5/3) [2; 1; 2], so 0Q = 0P0 + (5/3) [2; 1; 2], and the shortest distance is kP0P - proj_d~ (P0P)k = √26.
Therefore, Q = (10/3, 17/3, 4/3).
Exercises
Exercise 4.7.16 Find [1; 2; 3; 4] • [2; 0; 1; 3].
Exercise 4.7.17 Use the formula given in Proposition 4.31 to verify the Cauchy Schwarz inequality and
to show that equality occurs if and only if one of the vectors is a scalar multiple of the other.
Exercise 4.7.18 For ~u,~v vectors in R3, define the product ~u ∗~v = u_1 v_1 + 2u_2 v_2 + 3u_3 v_3. Show the axioms for a dot product all hold for this product. Prove that the Cauchy Schwarz inequality |~u ∗~v| ≤ (~u ∗~u)^{1/2} (~v ∗~v)^{1/2} holds for this product.

Exercise 4.7.19 Let ~a, ~b be vectors. Show that ~a •~b = (1/4) (k~a +~bk^2 - k~a -~bk^2).
Exercise 4.7.20 Using the axioms of the dot product, prove the parallelogram identity:

k~a +~bk^2 + k~a -~bk^2 = 2 k~ak^2 + 2 k~bk^2
Exercise 4.7.21 Let A be a real m × n matrix and let ~u ∈ Rn and ~v ∈ Rm. Show A~u •~v = ~u • A^T~v. Hint: Use the definition of matrix multiplication to do this.

Exercise 4.7.22 Use the result of Problem 4.7.21 to verify directly that (AB)^T = B^T A^T without making any reference to subscripts.
Exercise 4.7.25 Find proj_~v (~w) where ~w = [1; 0; 2] and ~v = [1; 2; 3].
Exercise 4.7.26 Find proj_~v (~w) where ~w = [1; 2; 2] and ~v = [1; 0; 3].
Exercise 4.7.27 Find proj_~v (~w) where ~w = [1; 2; 2; 1] and ~v = [1; 2; 3; 0].
Exercise 4.7.28 Let P = (1, 2, 3) be a point in R3. Let L be the line through the point P0 = (1, 4, 5) with direction vector d~ = [1; 1; 1]. Find the shortest distance from P to L, and find the point Q on L that is closest to P.
Exercise 4.7.29 Let P = (0, 2, 1) be a point in R3. Let L be the line through the point P0 = (1, 1, 1) with direction vector d~ = [3; 0; 1]. Find the shortest distance from P to L, and find the point Q on L that is closest to P.
Exercise 4.7.31 Prove the Cauchy Schwarz inequality in Rn as follows. For ~v, ~w vectors, consider

(~w - proj_~v (~w)) • (~w - proj_~v (~w)) ≥ 0

Simplify using the axioms of the dot product and then put in the formula for the projection. Notice that this expression equals 0 and you get equality in the Cauchy Schwarz inequality if and only if ~w = proj_~v (~w). What is the geometric meaning of ~w = proj_~v (~w)?
Exercise 4.7.32 Let ~v, ~w, ~u be vectors. Show that (~w +~u)_⊥ = ~w_⊥ +~u_⊥ where ~w_⊥ = ~w - proj_~v (~w).
4.8 Planes in Rn
Outcomes
A. Find the vector and scalar equations of a plane.
Much like the above discussion with lines, vectors can be used to determine planes in Rn . Given a
vector ~n in Rn and a point P0 , it is possible to find a unique plane which contains P0 and is perpendicular
to the given vector.
~n •~v = 0

In other words, we say that ~n is orthogonal (perpendicular) to every vector in the plane.
Consider now a plane with normal vector given by ~n, and containing a point P0 . Notice that this plane
is unique. If P is an arbitrary point on this plane, then by definition the normal vector is orthogonal to the
vector between P0 and P. Letting 0P and 0P0 be the position vectors of points P and P0 respectively, it
follows that
~n • (0P - 0P0) = 0

or

~n • P0P = 0
The first of these equations gives the vector equation of the plane.
Notice that this equation can be used to determine if a point P is contained in a certain plane.
Notice that since P0 is given, ax0 + by0 + cz0 is a known scalar, which we can call d. This equation
becomes
ax + by + cz = d
Solution. The above vector ~n is the normal vector for this plane. Using Definition 4.42, we can determine the vector equation for this plane:

~n • (0P - 0P0) = 0

[2; 4; 1] • ([x; y; z] - [3; -2; 5]) = 0

[2; 4; 1] • [x - 3; y + 2; z - 5] = 0
Using Definition 4.44, we can determine the scalar equation of the plane.
From the above scalar equation, we have that ~n = [2; 1; 2]. Now, choose P0 = (1, 0, 0) so that ~n • 0P0 = 2 = d. Then,

P0P = [3; 2; 3] - [1; 0; 0] = [2; 2; 3]
Next, compute QP = proj_~n (P0P):

QP = proj_~n (P0P) = ((P0P •~n) / k~nk^2) ~n = (12/9) [2; 1; 2] = (4/3) [2; 1; 2]
Then, kQPk = 4 so the shortest distance from P to the plane is 4.
Next, to find the point Q on the plane which is closest to P we have

0Q = 0P - QP = [3; 2; 3] - (4/3) [2; 1; 2] = (1/3) [1; 2; 1]

Therefore, Q = (1/3, 2/3, 1/3).
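The same computation can be checked in software. A Python sketch of ours (assuming numpy):

    import numpy as np

    n = np.array([2.0, 1.0, 2.0])   # normal vector of the plane
    P = np.array([3.0, 2.0, 3.0])   # the given point
    P0 = np.array([1.0, 0.0, 0.0])  # a point on the plane

    QP = ((P - P0) @ n) / (n @ n) * n  # projection of P0P onto n
    print(np.linalg.norm(QP))          # 4.0, the shortest distance
    print(P - QP)                      # [0.333..., 0.666..., 0.333...], the point Q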
Outcomes
Recall that the dot product is one of two important products for vectors. The second type of product
for vectors is called the cross product. It is important to note that the cross product is only defined in
R3 . First we discuss the geometric meaning and then a description in terms of coordinates is given, both
of which are important. The geometric description is essential in order to understand the applications to
physics and geometry while the coordinate description is necessary to compute the cross product.
Consider the following definition.
For an example of a right handed system of vectors, see the following picture.
[Figure: a right handed system of vectors ~u, ~v, ~w, with ~w pointing up from the plane of ~u and ~v.]
In this picture the vector ~w points upwards from the plane determined by the other two vectors. Point
the fingers of your right hand along ~u, and close them in the direction of ~v. Notice that if you extend the
thumb on your right hand, it points in the direction of ~w.
You should consider how a right hand system would differ from a left hand system. Try using your left
hand and you will see that the vector ~w would need to point in the opposite direction.
Notice that the special vectors, ~i, ~j,~k will always form a right handed system. If you extend the fingers
of your right hand along ~i and close them in the direction ~j, the thumb points in the direction of ~k.
[Figure: the right handed system ~i, ~j, ~k.]
The following is the geometric description of the cross product. Recall that the dot product of two
vectors results in a scalar. In contrast, the cross product results in a vector, as the product gives a direction
as well as magnitude.
~i ×~j = ~k    ~j ×~i = -~k
~k ×~i = ~j    ~i ×~k = -~j
~j ×~k = ~i    ~k ×~j = -~i
With this information, the following gives the coordinate description of the cross product.
Recall that the vector ~u = [u_1 u_2 u_3]^T can be written in terms of ~i, ~j, ~k as ~u = u_1~i + u_2~j + u_3~k. Then

~u ×~v = (u_2 v_3 - u_3 v_2)~i + (u_3 v_1 - u_1 v_3)~j + (u_1 v_2 - u_2 v_1)~k (4.15)
There is another version of 4.14 which may be easier to remember in case you have already covered
the notion of matrix determinant; otherwise you can skip the next few lines. We can express the cross
product as the determinant of a matrix, as follows.
~u ×~v = det [~i ~j ~k; u_1 u_2 u_3; v_1 v_2 v_3] (4.16)
Proof. Formula 1. follows immediately from the definition. The vectors ~u ×~v and ~v ×~u have the same magnitude, |~u| |~v| sin θ, and an application of the right hand rule shows they have opposite direction.
Formula 2. is proven as follows. If k is a non-negative scalar, the direction of (k~u) ×~v is the same as the direction of ~u ×~v, k(~u ×~v) and ~u × (k~v). The magnitude is k times the magnitude of ~u ×~v which is the same as the magnitude of k(~u ×~v) and ~u × (k~v). Using this yields equality in 2. In the case where k < 0, everything works the same way except the vectors are all pointing in the opposite direction and you must multiply by |k| when comparing their magnitudes.
The distributive laws, 3. and 4., are much harder to establish. For now, it suffices to notice that if we know that 3. is true, 4. follows. Thus, assuming 3., and using 1.,

(~v + ~w) ×~u = -~u × (~v + ~w) = -(~u ×~v + ~u × ~w) = ~v ×~u + ~w ×~u
We will now look at an example of how to compute a cross product.
Solution. Note that we can write ~u, ~v in terms of the special vectors ~i, ~j, ~k as

~u = ~i -~j + 2~k
~v = 3~i - 2~j +~k

We will use the equation given by 4.16 to compute the cross product.

~u ×~v = det [~i ~j ~k; 1 -1 2; 3 -2 1] = det [-1 2; -2 1]~i - det [1 2; 3 1]~j + det [1 -1; 3 -2]~k = 3~i + 5~j +~k
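numpy provides the coordinate description of the cross product directly, which makes the computation above easy to verify (a sketch of ours):

    import numpy as np

    u = np.array([1, -1, 2])
    v = np.array([3, -2, 1])
    print(np.cross(u, v))  # [3 5 1]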
[Figure: the parallelogram determined by ~u and ~v; its height is k~vk sin(θ).]
Solution. Notice that these vectors are the same as the ones given in Example 4.51. Recall from the
geometric description of the cross product, that the area of the parallelogram is simply the magnitude of
~u × ~v. From Example 4.51, ~u × ~v = 3~i + 5~j + ~k. We can also write this as

~u × ~v = [3, 5, 1]^T

Thus the area of the parallelogram is k~u × ~vk = √(3² + 5² + 1²) = √35.
Solution. This triangle is obtained by connecting the three points with lines. Picking (1, 2, 3) as a starting point, there are two displacement vectors, [−1, 0, 2]^T and [4, −1, −1]^T. Notice that if we add either of these vectors to the position vector of the starting point, the result is the position vectors of the other two points. Now, the area of the triangle is half the area of the parallelogram determined by [−1, 0, 2]^T and [4, −1, −1]^T. The required cross product is given by

[−1, 0, 2]^T × [4, −1, −1]^T = [2, 7, 1]^T

Taking the size of this vector gives the area of the parallelogram:

√((2)(2) + (7)(7) + (1)(1)) = √(4 + 49 + 1) = √54

Hence the area of the triangle is (1/2)√54 = (3/2)√6.
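The computation above can be checked numerically. The following short Python sketch (not part of the text; it assumes the NumPy library) crosses the two displacement vectors and halves the norm:

    import numpy as np

    # Displacement vectors from the solution above
    u = np.array([-1, 0, 2])
    v = np.array([4, -1, -1])
    w = np.cross(u, v)                  # array([2, 7, 1])
    area = 0.5 * np.linalg.norm(w)      # 0.5 * sqrt(54) = (3/2)*sqrt(6), about 3.674
    print(w, area)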
In general, if you have three points in R3 , P, Q, R, the area of the triangle is given by
(1/2) kPQ × PRk
Recall that PQ is the vector running from point P to point Q.
[Figure: triangle with vertices P, Q, R]
Recall that we can use the cross product to find the area of a parallelogram. It follows that we can use
the cross product together with the dot product to find the volume of a parallelepiped.
We begin with a definition. The parallelepiped determined by three vectors ~u, ~v, ~w consists of all vectors of the form r~u + s~v + t~w where r, s, t each lie in [0, 1]. That is, if you pick three numbers, r, s, and t each in [0, 1] and form r~u + s~v + t~w then the collection of all such points makes up the parallelepiped determined by these three vectors.
[Figure: parallelepiped with base spanned by ~u and ~v, third edge ~w, and the vector ~u × ~v normal to the base]
Notice that the base of the parallelepiped is the parallelogram determined by the vectors ~u and ~v. Therefore, its area is equal to k~u × ~vk. The height of the parallelepiped is k~wk cos θ where θ is the angle shown in the picture between ~w and ~u × ~v. The volume of this parallelepiped is the area of the base times the height which is just

k~u × ~vk k~wk cos θ = (~u × ~v) · ~w
This expression is known as the box product and is sometimes written as [~u, ~v, ~w]. You should consider what happens if you interchange the ~v with the ~w or the ~u with the ~w. You can see geometrically from drawing pictures that this merely introduces a minus sign. In any case the box product of three vectors always equals either the volume of the parallelepiped determined by the three vectors or else −1 times this volume.
In summary, the volume of the parallelepiped determined by ~u, ~v, ~w is |(~u × ~v) · ~w|. As an example, consider finding the volume of the parallelepiped determined by ~u = [1, 2, −5]^T, ~v = [1, 3, −6]^T, and ~w = [3, 2, 3]^T.
Solution. According to the above discussion, pick any two of these vectors, take the cross product and then
take the dot product of this with the third of these vectors. The result will be either the desired volume or
1 times the desired volume. Therefore by taking the absolute value of the result, we obtain the volume.
We will take the cross product of ~u and ~v. This is given by
~u × ~v = [1, 2, −5]^T × [1, 3, −6]^T

          | ~i  ~j  ~k |
        = |  1   2  −5 | = 3~i + ~j + ~k = [3, 1, 1]^T
          |  1   3  −6 |

Now take the dot product of this vector with ~w, which yields

(~u × ~v) · ~w = [3, 1, 1]^T · [3, 2, 3]^T
             = (3~i + ~j + ~k) · (3~i + 2~j + 3~k)
             = 9 + 2 + 3
             = 14

Therefore the volume of the parallelepiped is 14 cubic units.
Proof. This follows from observing that either (~u × ~v) · ~w and ~u · (~v × ~w) both give the volume of the parallelepiped or they both give −1 times the volume.
In case you have covered the notion of matrix determinant, you will remember that we can express the
cross product as the determinant of a particular matrix. It turns out that the same can be done for the box
product. Suppose you have three vectors, ~u = [a, b, c]^T, ~v = [d, e, f]^T, and ~w = [g, h, i]^T.
To take the box product, you can simply take the determinant of the matrix which results by letting the rows be the components of the given vectors in the order in which they occur in the box product:

                | a  b  c |
~u × ~v · ~w =  | d  e  f |
                | g  h  i |

This follows directly from the definition of the cross product given above and the way we expand determinants. Thus the volume of a parallelepiped determined by the vectors ~u, ~v, ~w is just the absolute value of the above determinant.
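As a numeric sanity check (a sketch, not part of the text), the box product of the vectors from the volume example can be computed both as a determinant and as a cross product followed by a dot product; both should give 14:

    import numpy as np

    u = np.array([1, 2, -5])
    v = np.array([1, 3, -6])
    w = np.array([3, 2, 3])
    box_det = np.linalg.det(np.array([u, v, w]))   # rows are u, v, w
    box_cross = np.dot(np.cross(u, v), w)
    print(round(box_det), box_cross)               # 14 14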
Exercises
Exercise 4.9.34 Show that if ~a × ~u = ~0 for any unit vector ~u, then ~a = ~0.
Exercise 4.9.35 Find the area of the triangle determined by the three points, (1, 2, 3) , (4, 2, 0) and (3, 2, 1) .
Exercise 4.9.36 Find the area of the triangle determined by the three points, (1, 0, 3) , (4, 1, 0) and (3, 1, 1) .
Exercise 4.9.37 Find the area of the triangle determined by the three points, (1, 2, 3) , (2, 3, 4) and (3, 4, 5) .
Did something interesting happen here? What does it mean geometrically?
Exercise 4.9.38 Find the area of the parallelogram determined by the vectors [1, 2, 3]^T and [3, −2, 1]^T.
Exercise 4.9.39 Find the area of the parallelogram determined by the vectors [1, 0, 3]^T and [4, −2, 1]^T.
Exercise 4.9.40 Is ~u × (~v × ~w) = (~u × ~v) × ~w? What is the meaning of ~u × ~v × ~w? Explain. Hint: Try ~i, ~j, ~k.
Exercise 4.9.41 Verify directly that the coordinate description of the cross product, ~u × ~v, has the property that it is perpendicular to both ~u and ~v. Then show by direct computation that this coordinate description satisfies

k~u × ~vk² = k~uk² k~vk² − (~u · ~v)² = k~uk² k~vk² (1 − cos²(θ))

where θ is the angle included between the two vectors. Explain why k~u × ~vk has the correct magnitude.
Exercise 4.9.42 Suppose A is a 3 × 3 skew symmetric matrix such that A^T = −A. Show there exists a vector ~Ω such that for all ~u ∈ R3

A~u = ~Ω × ~u

Hint: Explain why, since A is skew symmetric, it is of the form

    [   0   −ω3    ω2 ]
A = [  ω3     0   −ω1 ]
    [ −ω2    ω1     0 ]

where the ωi are numbers.
Exercise 4.9.44 Suppose ~u,~v, and ~w are three vectors whose components are all integers. Can you con-
clude the volume of the parallelepiped determined from these three vectors will always be an integer?
Exercise 4.9.45 What does it mean geometrically if the box product of three vectors gives zero?
Exercise 4.9.46 Using Problem 4.9.45, find an equation of a plane containing the two position vectors, ~p
and ~q and the point 0. Hint: If (x, y, z) is a point on this plane, the volume of the parallelepiped determined
by (x, y, z) and the vectors ~p,~q equals 0.
Exercise 4.9.47 Using the notion of the box product yielding either plus or minus the volume of the parallelepiped determined by the given three vectors, show that

(~u × ~v) · ~w = ~u · (~v × ~w)

In other words, the dot and the cross can be switched as long as the order of the vectors remains the same.
Hint: There are two ways to do this, by the coordinate description of the dot and cross product and by
geometric reasoning.
Exercise 4.9.50 For ~u, ~v, ~w functions of t, prove the following product rules:

(~u × ~v)′ = ~u′ × ~v + ~u × ~v′
(~u · ~v)′ = ~u′ · ~v + ~u · ~v′
4.10 Applications
Outcomes
A. Apply the concepts of vectors in Rn to the applications of physics and work.
4.10.1 Vectors and Physics
Suppose you push on something. Then, your push is made up of two components, how hard you push and
the direction you push. This illustrates the concept of force.
Vectors are used to model force and other physical vectors like velocity. As with all vectors, a vector
modeling force has two essential ingredients, its magnitude and its direction.
Recall the special vectors which point along the coordinate axes. These are given by

~ei = [0 · · · 0 1 0 · · · 0]^T

where the 1 is in the ith slot and there are zeros in all the other spaces. The direction of ~ei is referred to as
the ith direction.
Consider the following picture which illustrates the case of R3 . Recall that in R3 , we may refer to these
vectors as ~i, ~j, and ~k.
[Figure: the vectors ~e1, ~e2, ~e3 along the x, y, and z axes in R3]
What does addition of vectors mean physically? Suppose two forces are applied to some object. Each
of these would be represented by a force vector and the two forces acting together would yield an overall
force acting on the object which would also be a force vector known as the resultant. Suppose the two
vectors are ~u = ∑_{i=1}^n ui~ei and ~v = ∑_{i=1}^n vi~ei. Then the vector ~u involves a component in the ith direction
given by ui~ei , while the component in the ith direction of ~v is vi~ei . Then the vector ~u +~v should have a
component in the ith direction equal to (ui + vi )~ei . This is exactly what is obtained when the vectors, ~u and
~v are added.
~u + ~v = [u1 + v1, . . . , un + vn]^T = ∑_{i=1}^n (ui + vi)~ei
Thus the addition of vectors according to the rules of addition in Rn which were presented earlier,
yields the appropriate vector which duplicates the cumulative effect of all the vectors in the sum.
Consider now some examples of vector addition.
Solution. To find the total force, we add the vectors as described above. Hence, the total force is 10~i + 7~j + ~k Newtons. Therefore, the force in the ~i direction is 10 Newtons.
Consider another example.
Therefore, we need to find the vector ~u which has length 100 and direction as shown in this diagram.
We can consider the vector ~u as the hypotenuse of a right triangle having equal sides, since the direction of ~u corresponds with the 45° line. The sides, corresponding to the ~i and ~j directions, should each be of length 100/√2. Therefore, the vector is given by

~u = (100/√2)~i + (100/√2)~j = [100/√2, 100/√2]^T
This example also motivates the concept of velocity, defined below.
Solution. Here imagine a Cartesian coordinate system in which the third component is altitude and the
first and second components are measured on a line from West to East and a line from South to North.
Consider the vector [1, 2, 1]^T, which is the initial position vector of the airplane. As the plane moves, the position vector changes according to the velocity vector. After one minute (considered as 1/60 of an hour) the airplane has moved in the ~i direction a distance of 100 × (1/60) = 5/3 kilometers. In the ~j direction it has moved 1/60 kilometer during this same time, while it moves 1/60 kilometer in the ~k direction. Therefore, the new displacement vector for the airplane is

[1, 2, 1]^T + [5/3, 1/60, 1/60]^T = [8/3, 121/60, 61/60]^T
Now consider an example which involves combining two velocities.
Solution. Consider the following picture which demonstrates the above scenario.
[Figure: the swimmer's velocity, 3 km/h across the river and 4 km/h downstream]
First we want to know the total time of the swim across the river. The velocity in the direction across the river is 3 kilometers per hour, and the river is 1/2 kilometer wide. It follows the trip takes 1/6 hour or 10 minutes.

Now, we can compute how far downstream he will end up. Since the river runs at a rate of 4 kilometers per hour, and the trip takes 1/6 hour, the distance traveled downstream is given by 4 × (1/6) = 2/3 kilometers.

The distance traveled by the swimmer is given by the hypotenuse of a right triangle. The two arms of the triangle are given by the distance across the river, 1/2 km, and the distance traveled downstream, 2/3 km. Then, using the Pythagorean Theorem, we can calculate the total distance d traveled.

d = √((2/3)² + (1/2)²) = 5/6 km

Therefore, the swimmer travels a total distance of 5/6 kilometers.
4.10.2 Work
The mathematical concept of work is an application of vectors in Rn . The physical concept of work differs
from the notion of work employed in ordinary conversation. For example, suppose you were to slide a
150 pound weight off a table which is three feet high and shuffle along the floor for 50 yards, keeping the
height always three feet and then deposit this weight on another three foot high table. The physical concept
of work would indicate that the force exerted by your arms did no work during this project. The reason
for this definition is that even though your arms exerted considerable force on the weight, the direction of
motion was at right angles to the force they exerted. The only part of a force which does work in the sense
of physics is the component of the force in the direction of motion.
Work is defined to be the magnitude of the component of this force times the distance over which it
acts, when the component of force points in the direction of motion. In the case where the force points
in exactly the opposite direction of motion, work is given by (−1) times the magnitude of this component
times the distance. Thus the work done by a force on an object as the object moves from one point to
another is a measure of the extent to which the force contributes to the motion. This is illustrated in the
following picture in the case where the given force contributes to the motion.
[Figure: force ~F applied to an object moving from P to Q, decomposed as ~F = ~F|| + ~F⊥]
Recall that for any vector ~u in Rn, we can write ~u as a sum of two vectors, as in

~u = ~u|| + ~u⊥
In the above picture the force, ~F, is applied to an object which moves on the straight line from P to Q. There are two vectors shown, ~F|| and ~F⊥, and the picture is intended to indicate that when you add these two vectors you get ~F. In other words, ~F = ~F|| + ~F⊥. Notice that ~F|| acts in the direction of motion and ~F⊥ acts perpendicular to the direction of motion. Only ~F|| contributes to the work done by ~F on the object as it moves from P to Q. ~F|| is called the component of the force in the direction of motion. From trigonometry, you see the magnitude of ~F|| should equal k~Fk |cos θ|. Thus, since ~F|| points in the direction of the vector from P to Q, the total work done should equal

k~Fk kPQk cos θ = k~Fk k~q − ~pk cos θ
Now, suppose the included angle had been obtuse. Then the work done by the force ~F on the object would have been negative because ~F|| would point in −1 times the direction of the motion. In this case, cos θ would also be negative and so it is still the case that the work done would be given by the above formula. Thus from the geometric description of the dot product given above, the work equals

W = k~Fk k~q − ~pk cos θ = ~F · (~q − ~p)

For example, the work done by a force of ~F = 2~i + 7~j − 3~k Newtons in moving an object from the point (1, 2, 3) to the point (−9, −3, 4), with distances in meters, is

~F · (~q − ~p) = [2, 7, −3]^T · [−10, −5, 1]^T = −20 − 35 − 3 = −58 Newton meters
Note that if the force had been given in pounds and the distance had been given in feet, the units on
the work would have been foot pounds. In general, work has units equal to units of a force times units of
a length. Recall that 1 Newton meter is equal to 1 Joule. Also notice that the work done by the force can
be negative as in the above example.
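A minimal sketch (not part of the text) recomputing the work in the example above as the dot product ~F · (~q − ~p):

    import numpy as np

    F = np.array([2.0, 7.0, -3.0])         # force in Newtons
    p = np.array([1.0, 2.0, 3.0])          # starting point, meters
    q = np.array([-9.0, -3.0, 4.0])        # ending point, meters
    print(np.dot(F, q - p))                # -58.0 Newton meters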
Exercises
Exercise 4.10.51 The wind blows from the South at 20 kilometers per hour and an airplane which flies at
600 kilometers per hour in still air is heading East. Find the velocity of the airplane and its location after
two hours.
Exercise 4.10.52 The wind blows from the West at 30 kilometers per hour and an airplane which flies at
400 kilometers per hour in still air is heading North East. Find the velocity of the airplane and its position
after two hours.
Exercise 4.10.53 The wind blows from the North at 10 kilometers per hour. An airplane which flies at
300 kilometers per hour in still air is supposed to go to the point whose coordinates are at (100, 100) . In
what direction should the airplane fly?
Exercise 4.10.54 Three forces act on an object. Two are [3, 1, 1]^T and [1, 3, 4]^T Newtons. Find the third force if the object is not to move.
Exercise 4.10.55 Three forces act on an object. Two are [6, 3, 3]^T and [2, 1, 3]^T Newtons. Find the third force if the total force on the object is to be [7, 1, 3]^T Newtons.
Exercise 4.10.56 A river flows West at the rate of b miles per hour. A boat can move at the rate of 8 miles
per hour. Find the smallest value of b such that it is not possible for the boat to proceed directly across the
river.
Exercise 4.10.57 The wind blows from West to East at a speed of 50 miles per hour and an airplane which
travels at 400 miles per hour in still air is heading North West. What is the velocity of the airplane relative
to the ground? What is the component of this velocity in the direction North?
Exercise 4.10.58 The wind blows from West to East at a speed of 60 miles per hour and an airplane travels at 100 miles per hour in still air. How many degrees West of North should the airplane head
in order to travel exactly North?
Exercise 4.10.59 The wind blows from West to East at a speed of 50 miles per hour and an airplane which
travels at 400 miles per hour in still air heading somewhat West of North so that, with the wind, it is flying
due North. It uses 30.0 gallons of gas every hour. If it has to travel 600.0 miles due North, how much gas
will it use in flying to its destination?
Exercise 4.10.60 An airplane is flying due north at 150.0 miles per hour but it is not actually going due
North because there is a wind which is pushing the airplane due east at 40.0 miles per hour. After one
hour, the plane starts flying 30° East of North. Assuming the plane starts at (0, 0), where is it after 2
hours? Let North be the direction of the positive y axis and let East be the direction of the positive x axis.
Exercise 4.10.61 City A is located at the origin (0, 0) while city B is located at (300, 500) where distances
are in miles. An airplane flies at 250 miles per hour in still air. This airplane wants to fly from city A to
city B but the wind is blowing in the direction of the positive y axis at a speed of 50 miles per hour. Find a
unit vector such that if the plane heads in this direction, it will end up at city B having flown the shortest
possible distance. How long will it take to get there?
Exercise 4.10.62 A certain river is one half mile wide with a current flowing at 2 miles per hour from
East to West. A man swims directly toward the opposite shore from the South bank of the river at a speed
of 3 miles per hour. How far down the river does he find himself when he has swum across? How far does
he end up traveling?
Exercise 4.10.63 A certain river is one half mile wide with a current flowing at 2 miles per hour from
East to West. A man can swim at 3 miles per hour in still water. In what direction should he swim in order
to travel directly across the river? What would the answer to this problem be if the river flowed at 3 miles
per hour and the man could swim only at the rate of 2 miles per hour?
Exercise 4.10.64 Three forces are applied to a point which does not move. Two of the forces are 2~i + 2~j − 6~k Newtons and 8~i + 8~j + 3~k Newtons. Find the third force.
Exercise 4.10.65 The total force acting on an object is to be 4~i + 2~j − 3~k Newtons. A force of 3~i − ~j + 8~k Newtons is being applied. What other force should be applied to achieve the desired total force?
Exercise 4.10.66 A bird flies from its nest 8 km in the direction 56° north of east where it stops to rest on a tree. It then flies 1 km in the direction due southeast and lands atop a telephone pole. Place an xy coordinate system so that the origin is the bird's nest, and the positive x axis points east and the positive y axis points north. Find the displacement vector from the nest to the telephone pole.
Exercise 4.10.67 If ~F is a force and ~D is a vector, show proj~D(~F) = (k~Fk cos θ)~u where ~u is the unit vector in the direction of ~D, that is ~u = ~D/k~Dk, and θ is the included angle between the two vectors, ~F and ~D. k~Fk cos θ is sometimes called the component of the force, ~F, in the direction, ~D.
Exercise 4.10.68 A boy drags a sled for 100 feet along the ground by pulling on a rope which is 20 degrees
from the horizontal with a force of 40 pounds. How much work does this force do?
Exercise 4.10.69 A girl drags a sled for 200 feet along the ground by pulling on a rope which is 30 degrees
from the horizontal with a force of 20 pounds. How much work does this force do?
Exercise 4.10.70 A large dog drags a sled for 300 feet along the ground by pulling on a rope which is 45
degrees from the horizontal with a force of 20 pounds. How much work does this force do?
Exercise 4.10.71 How much work does it take to slide a crate 20 meters along a loading dock by pulling
on it with a 200 Newton force at an angle of 30° from the horizontal? Express your answer in Newton
meters.
Exercise 4.10.72 An object moves 10 meters in the direction of ~j. There are two forces acting on this object, ~F1 = ~i + ~j + 2~k and ~F2 = 5~i + 2~j − 6~k. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force. Why?
Exercise 4.10.73 An object moves 10 meters in the direction of ~j + ~i. There are two forces acting on this object, ~F1 = ~i + 2~j + 2~k and ~F2 = 5~i + 2~j − 6~k. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force. Why?
Exercise 4.10.74 An object moves 20 meters in the direction of ~k + ~j. There are two forces acting on this object, ~F1 = ~i + ~j + 2~k and ~F2 = ~i + 2~j − 6~k. Find the total work done on the object by the two forces. Hint: You can take the work done by the resultant of the two forces or you can add the work done by each force.
5. Linear Transformations
5.1 Linear Transformations
Outcomes
A. Understand the definition of a linear transformation, and that all linear transformations are
determined by matrix multiplication.
Recall that when we multiply an m × n matrix by an n × 1 column vector, the result is an m × 1 column vector. In this section we will discuss how, through matrix multiplication, an m × n matrix transforms an n × 1 column vector into an m × 1 column vector.
Recall that the n × 1 vector given by

~x = [x1, x2, . . . , xn]^T

is said to belong to Rn, which is the set of all n × 1 vectors. In this section, we will discuss transformations
of vectors in Rn .
Consider the following example.
Solution. First, recall that vectors in R3 are vectors of size 3 × 1, while vectors in R2 are of size 2 × 1. If we multiply A, which is a 2 × 3 matrix, by a 3 × 1 vector, the result will be a 2 × 1 vector. This is what we mean when we say that A transforms vectors.
Now, for [x, y, z]^T in R3, multiply on the left by the given matrix to obtain the new vector. This product looks like

[ 1  2  0 ] [ x ]   [  x + 2y ]
[ 2  1  0 ] [ y ] = [ 2x + y  ]
            [ z ]
The resulting product is a 2 × 1 vector which is determined by the choice of x and y. Here are some numerical examples.

[ 1  2  0 ] [ 1 ]   [ 5 ]
[ 2  1  0 ] [ 2 ] = [ 4 ]
            [ 3 ]

Here, the vector [1, 2, 3]^T in R3 was transformed by the matrix into the vector [5, 4]^T in R2.

Here is another example:

[ 1  2  0 ] [ 10 ]   [ 20 ]
[ 2  1  0 ] [  5 ] = [ 25 ]
            [  3 ]
The idea is to define a function which takes vectors in R3 and delivers new vectors in R2 . In this case,
that function is multiplication by the matrix A.
Let T denote such a function. The notation T : Rn → Rm means that the function T transforms vectors
in Rn into vectors in Rm . The notation T (~x) means the transformation T applied to the vector~x. The above
example demonstrated a transformation achieved by matrix multiplication. In this case, we often write
TA (~x) = A~x
Therefore, TA is the transformation determined by the matrix A. In this case we say that T is a matrix
transformation.
Recall the property of matrix multiplication that states that for k and p scalars, A(kB + pC) = kAB + pAC.
Solution. By Definition 5.2 we need to show that T (k~x1 + p~x2 ) = kT (~x1 ) + pT (~x2 ) for all scalars k, p and
vectors ~x1 ,~x2 . Let
~x1 = [x1, y1, z1]^T,  ~x2 = [x2, y2, z2]^T

Then

T (k~x1 + p~x2) = T (k[x1, y1, z1]^T + p[x2, y2, z2]^T)
              = T ([kx1 + px2, ky1 + py2, kz1 + pz2]^T)
              = [(kx1 + px2) + (ky1 + py2), (kx1 + px2) − (kz1 + pz2)]^T
              = [(kx1 + ky1) + (px2 + py2), (kx1 − kz1) + (px2 − pz2)]^T
              = [kx1 + ky1, kx1 − kz1]^T + [px2 + py2, px2 − pz2]^T
              = k[x1 + y1, x1 − z1]^T + p[x2 + y2, x2 − z2]^T
              = kT (~x1) + pT (~x2)
It turns out that every linear transformation can be expressed as a matrix transformation, and thus linear
transformations are exactly the same as matrix transformations.
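The following sketch (not part of the text; the matrix is the 2 × 3 matrix used earlier) shows a matrix transformation in code and spot-checks the linearity property T (k~x1 + p~x2) = kT (~x1) + pT (~x2) numerically:

    import numpy as np

    A = np.array([[1, 2, 0],
                  [2, 1, 0]])
    T = lambda x: A @ x                     # the matrix transformation T_A
    x1, x2 = np.array([1, 2, 3]), np.array([10, 5, -3])
    k, p = 2, -4
    print(T(k * x1 + p * x2))               # same as k*T(x1) + p*T(x2)
    print(k * T(x1) + p * T(x2))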
Exercises
Exercise 5.1.1 Show the map T : Rn → Rm defined by T (~x) = A~x where A is an m × n matrix and ~x is an n × 1 column vector is a linear transformation.
Exercise 5.1.2 Show that the function T~u defined by T~u(~v) = ~v − proj~u(~v) is also a linear transformation.
Exercise 5.1.3 Let ~u be a fixed vector. The function T~u defined by T~u(~v) = ~u + ~v has the effect of translating all vectors by adding ~u ≠ ~0. Show this is not a linear transformation. Explain why it is not possible to represent T~u in R3 by multiplying by a 3 × 3 matrix.
5.2 The Matrix of a Linear Transformation
Outcomes
A. Find the matrix of a linear transformation and determine the action on a vector in Rn .
In the above examples, the action of the linear transformations was to multiply by a matrix. It turns
out that this is always the case for linear transformations. If T is any linear transformation which maps
Rn to Rm, there is always an m × n matrix A with the property that

T (~x) = A~x  (5.1)

for all ~x ∈ Rn.
Here is why. Suppose T : Rn → Rm is a linear transformation and you want to find the matrix defined
by this linear transformation as described in 5.1. Note that
~x = [x1, x2, . . . , xn]^T = x1~e1 + x2~e2 + · · · + xn~en = ∑_{i=1}^n xi~ei
where ~ei is the ith column of In, that is the n × 1 vector which has zeros in every slot but the ith and a 1 in this slot.
Then since T is linear,

T (~x) = ∑_{i=1}^n xi T (~ei)

      = [ T (~e1) · · · T (~en) ] [x1, . . . , xn]^T

      = A [x1, . . . , xn]^T
Therefore, the desired matrix is obtained from constructing the ith column as T (~ei ) . We state this formally
as the following theorem.
The matrix of T is A = [ T (~e1) | T (~e2) | · · · | T (~en) ], where ~ei is the ith column of In, and then T (~ei) is the ith column of A.
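As a sketch of this theorem in code (not from the text; the sample map mirrors the earlier linearity example, T ([x, y, z]^T) = [x + y, x − z]^T), the matrix of a linear T can be recovered by evaluating T on the columns of the identity matrix:

    import numpy as np

    def matrix_of(T, n):
        I = np.eye(n)
        # the ith column of the matrix is T(e_i)
        return np.column_stack([T(I[:, i]) for i in range(n)])

    T = lambda x: np.array([x[0] + x[1], x[0] - x[2]])
    print(matrix_of(T, 3))    # [[ 1.  1.  0.]
                              #  [ 1.  0. -1.]]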
Find the matrix A of T such that T (~x) = A~x for all ~x.
In this case, A will be a 2 × 3 matrix, so we need to find T (~e1), T (~e2), and T (~e3). Luckily, we have been given these values so we can fill in A as needed, using these vectors as the columns of A. Hence,

A = [ 1  9  1 ]
    [ 2  3  1 ]
In this example, we were given the resulting vectors of T (~e1 ) , T (~e2 ) , and T (~e3 ). Constructing the
matrix A was simple, as we could simply use these vectors as the columns of A. The next example shows
how to find A when we are not given the T (~ei ) so clearly.
Suppose T is a linear transformation with T ([1, 1]^T) = [1, 2]^T and T ([0, −1]^T) = [3, 2]^T. Find the matrix A of T such that T (~x) = A~x for all ~x.
Solution. By Theorem 5.6 to find this matrix, we need to determine the action of T on ~e1 and ~e2 . In
Example 5.8, we were given these resulting vectors. However, in this example, we have been given T of
two different vectors. How can we find out the action of T on~e1 and~e2 ? In particular for~e1 , suppose there
exist x and y such that
[1, 0]^T = x[1, 1]^T + y[0, −1]^T  (5.2)
Then, since T is linear,
T ([1, 0]^T) = xT ([1, 1]^T) + yT ([0, −1]^T)  (5.3)
Therefore, if we know the values of x and y which satisfy 5.2, we can substitute these into equation
5.3. By doing so, we find T (~e1 ) which is the first column of the matrix A.
We proceed to find x and y. We do so by solving 5.2, which can be done by solving the system
x = 1
x − y = 0
We see that x = 1 and y = 1 is the solution to this system. Substituting these values into equation 5.3,
we have

T ([1, 0]^T) = 1[1, 2]^T + 1[3, 2]^T = [1, 2]^T + [3, 2]^T = [4, 4]^T

Therefore [4, 4]^T is the first column of A.

Computing the second column is done in the same way, and is left as an exercise.

The resulting matrix A is given by

A = [ 4  −3 ]
    [ 4  −2 ]
This example illustrates a very long procedure for finding the matrix A. While this method is reliable
and will always result in the correct matrix A, the following procedure provides an alternative method.
We will illustrate this procedure in the following example. You may also find it useful to work through
Example 5.9 using this procedure.
Solution. By Procedure 5.10,

    [ 1  0  1 ]          [ 0  2  0 ]
A = [ 3  1  1 ]  and B = [ 1  1  0 ]
    [ 1  1  0 ]          [ 1  3  1 ]

Then, Procedure 5.10 claims that the matrix of T is

              [ 2  −2  4 ]
C = BA^{−1} = [ 0   0  1 ]
              [ 4  −3  6 ]

Indeed you can first verify that T (~x) = C~x for the 3 vectors above:

C [1, 3, 1]^T = [0, 1, 1]^T,   C [0, 1, 1]^T = [2, 1, 3]^T,   C [1, 1, 0]^T = [0, 0, 1]^T
But more generally T (~x) = C~x for any ~x. To see this, let ~y = A^{−1}~x and then using linearity of T:

T (~x) = T (A~y) = T (∑_i yi~ai) = ∑_i yi T (~ai) = ∑_i yi~bi = B~y = BA^{−1}~x = C~x
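A quick numeric check of this computation (a sketch, not part of the text):

    import numpy as np

    A = np.array([[1, 0, 1],      # columns are the given input vectors
                  [3, 1, 1],
                  [1, 1, 0]])
    B = np.array([[0, 2, 0],      # columns are their images under T
                  [1, 1, 0],
                  [1, 3, 1]])
    C = B @ np.linalg.inv(A)
    print(np.round(C))                 # [[ 2. -2.  4.], [ 0.  0.  1.], [ 4. -3.  6.]]
    print(C @ np.array([1, 3, 1]))     # [0. 1. 1.]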
Recall the dot product discussed earlier. Consider the map ~v ↦ proj~u(~v) which takes a vector and transforms it to its projection onto a given vector ~u. It turns out that this map is linear, a result which follows from the properties of the dot product. This is shown as follows.
proj~u (k~v + p~w) = (((k~v + p~w) · ~u)/(~u · ~u)) ~u
                  = k ((~v · ~u)/(~u · ~u)) ~u + p ((~w · ~u)/(~u · ~u)) ~u
                  = k proj~u (~v) + p proj~u (~w)
Consider the following example, in which we find the matrix of the transformation T (~v) = proj~u (~v) for ~u = [1, 2, 3]^T and any ~v ∈ R3.
Solution.
1. First, we have just seen that T (~v) = proj~u (~v) is linear. Therefore by Theorem 5.5, we can find a
matrix A such that T (~x) = A~x.
2. The columns of the matrix for T are defined above as T (~ei ). It follows that T (~ei ) = proj~u (~ei ) gives
the ith column of the desired matrix. Therefore, we need to find
proj~u (~ei) = ((~ei · ~u)/(~u · ~u)) ~u
For the given vector ~u, this implies the columns of the desired matrix are

(1/14)[1, 2, 3]^T,  (2/14)[1, 2, 3]^T,  (3/14)[1, 2, 3]^T

so that

           [ 1  2  3 ]
A = (1/14) [ 2  4  6 ]
           [ 3  6  9 ]
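In code, the projection matrix can be formed directly, since proj~u (~x) = ((~x · ~u)/(~u · ~u))~u means the matrix is (~u ~u^T)/(~u · ~u). A sketch (not from the text):

    import numpy as np

    u = np.array([1.0, 2.0, 3.0])
    A = np.outer(u, u) / np.dot(u, u)       # (1/14) * [[1,2,3],[2,4,6],[3,6,9]]
    print(A @ np.array([1.0, 0.0, 0.0]))    # proj_u(e1) = (1/14) * [1, 2, 3]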
Exercises
(b) T replaces the ith component of ~x with b times the jth component added to the ith component.
(c) T switches the ith and jth components.
Show these functions are linear transformations and describe their matrices A such that T (~x) = A~x.
Exercise 5.2.5 You are given a linear transformation T : Rn → Rm and you know that

T (Ai) = Bi

where [A1 · · · An]^{−1} exists. Show that the matrix of T is of the form

[B1 · · · Bn] [A1 · · · An]^{−1}
Suppose T is a linear transformation with

T ([1, 2, 6]^T) = [1, 3, 3]^T
T ([0, 1, 2]^T) = [5, 3, 3]^T

Find the matrix of T. That is, find A such that T (~x) = A~x.
Exercise 5.2.11 Consider the following functions T : R3 → R2. Show that each is a linear transformation and determine for each the matrix A such that T (~x) = A~x.

(a) T ([x, y, z]^T) = [x + 2y + 3z, 2y − 3x + z]^T

(b) T ([x, y, z]^T) = [7x + 2y + z, 3x − 11y + 2z]^T
(c) T ([x, y, z]^T) = [3x + 2y + z, x + 2y + 6z]^T

(d) T ([x, y, z]^T) = [2y − 5x + z, x + y + z]^T
Exercise 5.2.12 Consider the following functions T : R3 → R2. Explain why each of these functions T is not linear.

(a) T ([x, y, z]^T) = [x + 2y + 3z + 1, 2y − 3x + z]^T

(b) T ([x, y, z]^T) = [x + 2y² + 3z, 2y + 3x + z]^T

(c) T ([x, y, z]^T) = [sin x + 2y + 3z, 2y + 3x + z]^T

(d) T ([x, y, z]^T) = [x + 2y + 3z, 2y + 3x − ln z]^T
5.3 Properties of Linear Transformations
Outcomes
A. Use properties of linear transformations to solve problems.
Let T : Rn → Rm be a linear transformation. Then there are some important properties of T which will be examined in this section. Consider the following theorem: if T : Rn → Rm is a linear transformation, then

1. T preserves the zero vector: T (~0) = ~0.
2. T preserves the negative of a vector: T (−~x) = −T (~x).
3. T preserves linear combinations: T (a1~x1 + · · · + ak~xk) = a1 T (~x1) + · · · + ak T (~xk).
These properties are useful in determining the action of a transformation on a given vector. Consider
the following example.
Solution. Using the third property in Theorem 5.13, we can find T ([−7, 3, −9]^T) by writing [−7, 3, −9]^T as a linear combination of [1, 3, 1]^T and [4, 0, 5]^T. Therefore we want to find a, b ∈ R such that

[−7, 3, −9]^T = a[1, 3, 1]^T + b[4, 0, 5]^T

The necessary augmented matrix and resulting reduced row-echelon form are given by:

[ 1  4 | −7 ]      [ 1  0 |  1 ]
[ 3  0 |  3 ]  →   [ 0  1 | −2 ]
[ 1  5 | −9 ]      [ 0  0 |  0 ]

Hence a = 1, b = −2 and

[−7, 3, −9]^T = 1[1, 3, 1]^T + (−2)[4, 0, 5]^T

Now, using the third property above, we have

T ([−7, 3, −9]^T) = T (1[1, 3, 1]^T + (−2)[4, 0, 5]^T)
                 = 1T ([1, 3, 1]^T) − 2T ([4, 0, 5]^T)
                 = [4, 4, 0, −2]^T − 2[4, 5, −1, 5]^T
                 = [−4, −6, 2, −12]^T

Therefore, T ([−7, 3, −9]^T) = [−4, −6, 2, −12]^T.
Suppose two linear transformations act in the same way on ~x for all vectors, that is, S (~x) = T (~x) for all ~x. Then we say that these transformations are equal.
Suppose two linear transformations act on the same vector ~x, first the transformation T and then a
second transformation given by S. We can find the composite transformation that results from applying
both transformations.
S ∘ T : Rk → Rm

Notice that the resulting vector will be in Rm. Be careful to observe the order of transformations. We write S ∘ T but apply the transformation T first, followed by S.
(S ∘ T )(~x) = ~x

and

(T ∘ S)(~x) = ~x
Then, S is called an inverse of T and T is called an inverse of S. Geometrically, they reverse the
action of each other.
The following theorem is crucial, as it claims that the above inverse transformations are unique.
Show that T^{−1} exists and find the matrix B by which it is induced.
Solution. Since the matrix A is invertible, it follows that the transformation T is invertible. Therefore, T^{−1} exists.
You can verify that A^{−1} is given by:

A^{−1} = [ 4  3 ]
         [ 3  2 ]
Exercises
Exercise 5.3.17 Show that if a function T : Rn → Rm is linear, then it is always the case that T (~0) = ~0.
Exercise 5.3.18 Let T be a linear transformation induced by the matrix

A = [ 3  1 ]        B = [ 0  2 ]
    [ 1  2 ]            [ 4  2 ]

and S a linear transformation induced by B. Find the matrix of S ∘ T and find (S ∘ T )(~x) for ~x = [2, 1]^T.
Exercise 5.3.19 Let T be a linear transformation and suppose T ([1, 4]^T) = [2, 3]^T. Suppose S is a linear transformation induced by the matrix

B = [ 1  2 ]
    [ 1  3 ]

Find (S ∘ T )(~x) for ~x = [1, 4]^T.
Exercise 5.3.20 Let T be a linear transformation induced by the matrix

A = [ 2  3 ]        B = [ 1  3 ]
    [ 1  1 ]            [ 1  2 ]

and S a linear transformation induced by B. Find the matrix of S ∘ T and find (S ∘ T )(~x) for ~x = [5, 6]^T.
Exercise 5.3.21 Let T be a linear transformation induced by the matrix

A = [ 2  1 ]
    [ 5  2 ]

Find the matrix of T^{−1}.
Exercise 5.3.22 Let T be a linear transformation induced by the matrix

A = [ 4  3 ]
    [ 2  2 ]

Find the matrix of T^{−1}.
Exercise 5.3.23 Let T be a linear transformation and suppose T ([1, 2]^T) = [9, 8]^T and T ([0, 1]^T) = [4, 3]^T. Find the matrix of T^{−1}.
5.4 Special Linear Transformations in R2
Outcomes
A. Find the matrix of rotations and reflections in R2 and determine the action of each on a vector
in R2 .
In this section, we will examine some special examples of linear transformations in R2 including rota-
tions and reflections. We will use the geometric descriptions of vector addition and scalar multiplication
discussed earlier to show that a rotation of vectors through an angle and reflection of a vector across a line
are examples of linear transformations.
More generally, denote a transformation given by a rotation by T . Why is such a transformation linear?
Consider the following picture which illustrates a rotation. Let ~u,~v denote vectors.
[Figure: rotation T applied to ~u, ~v, and the parallelogram they determine; T (~u + ~v) is the diagonal of the parallelogram determined by T (~u) and T (~v)]
Let's consider how to obtain T (~u +~v). Simply, you add T (~u) and T (~v). Here is why. If you add
T (~u) to T (~v) you get the diagonal of the parallelogram determined by T (~u) and T (~v), as this action is our
usual vector addition. Now, suppose we first add ~u and ~v, and then apply the transformation T to ~u +~v.
Hence, we find T (~u +~v). As shown in the diagram, this will result in the same vector. In other words,
T (~u +~v) = T (~u) + T (~v).
This is because the rotation preserves all angles between the vectors as well as their lengths. In par-
ticular, it preserves the shape of this parallelogram. Thus both T (~u) + T (~v) and T (~u +~v) give the same
vector. It follows that T distributes across addition of the vectors of R2 .
Similarly, if k is a scalar, it follows that T (k~u) = kT (~u). Thus rotations are an example of a linear
transformation by Definition 5.2.
The following theorem gives the matrix of a linear transformation which rotates all vectors through an angle of θ.
Proof. Let ~e1 = [1, 0]^T and ~e2 = [0, 1]^T. These identify the geometric vectors which point along the positive x axis and positive y axis as shown.
[Figure: Rθ (~e1) has coordinates (cos(θ), sin(θ)) and Rθ (~e2) has coordinates (−sin(θ), cos(θ))]
From Theorem 5.6, we need to find Rθ (~e1) and Rθ (~e2), and use these as the columns of the matrix A of T. We can use cos, sin of the angle θ to find the coordinates of Rθ (~e1) as shown in the above picture. The coordinates of Rθ (~e2) also follow from trigonometry. Thus

Rθ (~e1) = [cos θ, sin θ]^T,   Rθ (~e2) = [−sin θ, cos θ]^T

Therefore, from Theorem 5.6,

A = [ cos θ  −sin θ ]
    [ sin θ   cos θ ]
We can also prove this algebraically without the use of the above picture. The definition of (cos(θ), sin(θ)) is as the coordinates of the point of Rθ (~e1). Now the point of the vector ~e2 is exactly π/2 further along the unit circle from the point of ~e1, and therefore after rotation through an angle of θ the coordinates x and y of the point of Rθ (~e2) are given by

(x, y) = (cos(θ + π/2), sin(θ + π/2)) = (−sin θ, cos θ)
Consider the following example.
We now look at an example of a linear transformation involving two angles.
Solution. Let Rθ+φ denote the linear transformation which rotates every vector through an angle of θ + φ. Then to obtain Rθ+φ, we first apply Rφ and then Rθ, where Rφ is the linear transformation which rotates through an angle of φ and Rθ is the linear transformation which rotates through an angle of θ. Denoting the corresponding matrices by Aθ+φ, Aφ, and Aθ, it follows that for every ~u, Rθ+φ (~u) = Aθ+φ ~u = Aθ Aφ ~u. Consequently,

[ cos(θ + φ)  −sin(θ + φ) ]   [ cos θ  −sin θ ] [ cos φ  −sin φ ]
[ sin(θ + φ)   sin(θ + φ) ] = [ sin θ   cos θ ] [ sin φ   cos φ ] = Aθ Aφ

Multiplying out the matrices on the right, the entries give

cos(θ + φ) = cos θ cos φ − sin θ sin φ
sin(θ + φ) = sin θ cos φ + cos θ sin φ
Don't these look familiar? They are the usual trigonometric identities for the sum of two angles, derived here using linear algebra concepts.
Here we have focused on rotations in two dimensions. However, you can consider rotations and other
geometric concepts in any number of dimensions. This is one of the major advantages of linear algebra.
You can break down a difficult geometrical procedure into small steps, each corresponding to multiplica-
tion by an appropriate matrix. Then by multiplying the matrices, you can obtain a single matrix which can
give you numerical information on the results of applying the given sequence of simple procedures.
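A short numeric confirmation (a sketch, not part of the text) that rotation matrices compose by adding angles:

    import numpy as np

    def rot(t):
        return np.array([[np.cos(t), -np.sin(t)],
                         [np.sin(t),  np.cos(t)]])

    theta, phi = 0.7, 0.4
    print(np.allclose(rot(theta + phi), rot(theta) @ rot(phi)))   # True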
Linear transformations which reflect vectors across a line are a second important type of transforma-
tions in R2 . Consider the following theorem.
Consider the following example which incorporates a reflection as well as a rotation of vectors.
Solution. By Theorem 5.22, the matrix of the transformation which involves rotating through an angle of π/6 is

[ cos(π/6)  −sin(π/6) ]   [ (1/2)√3    −1/2   ]
[ sin(π/6)   cos(π/6) ] = [   1/2     (1/2)√3 ]
Reflecting across the x axis is the same action as reflecting vectors over the line ~y = m~x with m = 0. By Theorem 5.25, the matrix for the transformation which reflects all vectors through the x axis is

(1/(1 + m²)) [ 1 − m²     2m   ] = (1/(1 + 0²)) [ 1 − 0²    2(0)  ] = [ 1   0 ]
             [   2m     m² − 1 ]                [  2(0)    0² − 1 ]   [ 0  −1 ]
Therefore, the matrix of the linear transformation which first rotates through π/6 and then reflects through the x axis is given by

[ 1   0 ] [ (1/2)√3    −1/2   ]   [ (1/2)√3      −1/2    ]
[ 0  −1 ] [   1/2     (1/2)√3 ] = [  −1/2     −(1/2)√3   ]
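The final matrix can be checked numerically; the sketch below (not from the text) multiplies the reflection matrix by the rotation matrix:

    import numpy as np

    R = np.array([[np.cos(np.pi/6), -np.sin(np.pi/6)],
                  [np.sin(np.pi/6),  np.cos(np.pi/6)]])
    F = np.array([[1, 0],
                  [0, -1]])          # reflection across the x axis
    print(F @ R)   # [[ 0.866 -0.5  ]
                   #  [-0.5   -0.866]]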
Exercises
Exercise 5.4.24 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of π/3.

Exercise 5.4.25 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of π/4.

Exercise 5.4.26 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of −π/3.

Exercise 5.4.27 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of 2π/3.
Exercise 5.4.28 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of π/12. Hint: Note that π/12 = π/3 − π/4.

Exercise 5.4.29 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of 2π/3 and then reflects across the x axis.

Exercise 5.4.30 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of π/3 and then reflects across the x axis.

Exercise 5.4.31 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of π/4 and then reflects across the x axis.

Exercise 5.4.32 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of π/6 and then reflects across the x axis followed by a reflection across the y axis.

Exercise 5.4.33 Find the matrix for the linear transformation which reflects every vector in R2 across the x axis and then rotates every vector through an angle of π/4.

Exercise 5.4.34 Find the matrix for the linear transformation which reflects every vector in R2 across the y axis and then rotates every vector through an angle of π/4.

Exercise 5.4.35 Find the matrix for the linear transformation which reflects every vector in R2 across the x axis and then rotates every vector through an angle of π/6.

Exercise 5.4.36 Find the matrix for the linear transformation which reflects every vector in R2 across the y axis and then rotates every vector through an angle of π/6.

Exercise 5.4.37 Find the matrix for the linear transformation which rotates every vector in R2 through an angle of 5π/12. Hint: Note that 5π/12 = 2π/3 − π/4.
Exercise 5.4.38 Find the matrix of the linear transformation which rotates every vector in R3 counter clockwise about the z axis when viewed from the positive z axis through an angle of 30° and then reflects through the xy plane.
Exercise 5.4.39 Let ~u = [a, b]^T be a unit vector in R2. Find the matrix which reflects all vectors across this vector, as shown in the following picture.

[Figure: reflection of vectors across the line determined by ~u]

Hint: Notice that [a, b]^T = [cos θ, sin θ]^T for some θ. First rotate through −θ. Next reflect through the x axis. Finally rotate through θ.
6. Complex Numbers
6.1 Complex Numbers
Outcomes
A. Understand the geometric significance of a complex number as a point in the plane.
B. Prove algebraic properties of addition and multiplication of complex numbers, and apply
these properties. Understand the action of taking the conjugate of a complex number.
C. Understand the absolute value of a complex number and how to find it as well as its geometric
significance.
Although very powerful, the real numbers are inadequate to solve equations such as x² + 1 = 0, and this is where complex numbers come in. We define the number i as the imaginary number such that i² = −1, and define complex numbers as those of the form z = a + bi where a and b are real numbers. We call this
the standard form, or Cartesian form, of the complex number z. Then, we refer to a as the real part of z,
and b as the imaginary part of z. It turns out that such numbers not only solve the above equation, but
in fact also solve any polynomial of degree at least 1 with complex coefficients. This property, called the
Fundamental Theorem of Algebra, is sometimes referred to by saying C is algebraically closed. Gauss is
usually credited with giving a proof of this theorem in 1797 but many others worked on it and the first
completely correct proof was due to Argand in 1806.
Just as a real number can be considered as a point on the line, a complex number z = a + bi can be
considered as a point (a, b) in the plane whose x coordinate is a and whose y coordinate is b. For example,
in the following picture, the point z = 3 + 2i can be represented as the point in the plane with coordinates
(3, 2) .
[Figure: the point z = 3 + 2i plotted at (3, 2) in the plane]
Addition of complex numbers is defined as follows: (a + bi) + (c + di) = (a + c) + (b + d) i
This addition obeys all the usual properties as the following theorem indicates.

Commutative Law of Addition: z + w = w + z
Additive Identity: z + 0 = z
Existence of Additive Inverse: for each z there is −z with z + (−z) = 0
Associative Law of Addition: (z + w) + v = z + (w + v)
Commutative Law of Multiplication: zw = wz
Associative Law of Multiplication: (zw) v = z (wv)
Multiplicative Identity: 1z = z
Distributive Law: z (w + v) = zw + zv

Multiplication of complex numbers proceeds by the distributive law together with i² = −1; for example, (3 + 6i)(5 + i) = 15 + 3i + 30i + 6i² = 9 + 33i.
You may wish to verify some of these statements. The real numbers also satisfy the above axioms, and in general any mathematical structure which satisfies these axioms is called a field. There are many other fields, in particular even finite ones, which are particularly useful in cryptography, and the reason for specifying these axioms is that linear algebra is all about fields and we can do just about anything in this subject using any field. Here, though, the fields of most interest will be the familiar field of real numbers, denoted as R, and the field of complex numbers, denoted as C.
An important construction regarding complex numbers is the complex conjugate, denoted by a horizontal line above the number: z̄. It is defined as follows: if z = a + bi, then

z̄ = a − bi

Geometrically, the action of the conjugate is to reflect a given complex number across the x axis. Algebraically, it changes the sign on the imaginary part of the complex number. Therefore, for a real number a, ā = a.
For example, the conjugate of 2 + 5i is 2 − 5i, the conjugate of i is −i, and the conjugate of 7 is 7.

Consider the product of a complex number with its conjugate: if z = a + bi, then z z̄ = (a + bi)(a − bi) = a² + b². Notice that there is no imaginary part in the product; thus multiplying a complex number by its conjugate results in a real number.

The conjugate also satisfies the following properties:

The conjugate of z ± w is z̄ ± w̄.
The conjugate of zw is z̄ w̄.
The conjugate of z̄ is z.
The conjugate of z/w is z̄/w̄.
Interestingly every nonzero complex number a + bi has a unique multiplicative inverse. In other words, for a nonzero complex number z, there exists a number z^{−1} (or 1/z) so that z z^{−1} = 1. Note that z = a + bi is nonzero exactly when a² + b² ≠ 0, and its inverse can be written in standard form as

z^{−1} = 1/(a + bi) = (a − bi)/(a² + b²) = a/(a² + b²) − i b/(a² + b²)

Note that we may write z^{−1} as 1/z. Both notations represent the multiplicative inverse of the complex number z. Consider now an example.
Another important construction of complex numbers is that of the absolute value, also called the mod-
ulus. Consider the following definition.
The absolute value, or modulus, of z = a + bi is defined to be |z| = √(a² + b²). Equivalently, |z| = (z z̄)^{1/2}.
Also from the definition, if z = a + bi and w = c + di are two complex numbers, then |zw| = |z| |w| .
Take a moment to verify this.
The triangle inequality is an important property of the absolute value of complex numbers. There are two useful versions which we present here, although the first one is officially called the triangle inequality:

|z + w| ≤ |z| + |w|

||z| − |w|| ≤ |z − w|

The first version follows by viewing z and w as vectors in the plane and noting that the length of one side of a triangle is no more than the sum of the lengths of the other two. To obtain the second version, write

z = z − w + w,  w = w − z + z

and apply the first version to each: |z| ≤ |z − w| + |w| and |w| ≤ |w − z| + |z|. Hence, both |z| − |w| and |w| − |z| are no larger than |z − w|. This proves the second version because ||z| − |w|| is one of |z| − |w| or |w| − |z|.
With this definition, it is important to note the following. You may wish to take the time to verify this remark.

Let z = a + bi and w = c + di. Then |z − w| = √((a − c)² + (b − d)²). Thus the distance between the point in the plane determined by the ordered pair (a, b) and the ordered pair (c, d) equals |z − w| where z and w are as just described.

For example, consider the distance between (2, 5) and (1, 8). Letting z = 2 + 5i and w = 1 + 8i, z − w = 1 − 3i, so (z − w)(z̄ − w̄) = (1 − 3i)(1 + 3i) = 10 and |z − w| = √10.
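Python's built-in complex numbers reproduce this arithmetic; a sketch (not part of the text):

    z, w = 2 + 5j, 1 + 8j
    print(abs(z - w))                             # sqrt(10), about 3.1623
    print(((z - w) * (z - w).conjugate()).real)   # 10.0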
Recall that we refer to z = a + bi as the standard form of the complex number. In the next section, we
examine another form in which we can express the complex number.
Exercises
(a) z + w
(b) z − 2w
(c) zw
(d) w/z

(a) z̄
(b) z^{−1}
(c) |z|

(a) zw
(b) |zw|
(c) z^{−1} w
Exercise 6.1.4 If z is a complex number, show there exists a complex number w with |w| = 1 and wz = |z| .
Exercise 6.1.5 If z, w are complex numbers, prove that the conjugate of zw equals z̄ w̄, and then show by induction that the conjugate of z1 · · · zm equals z̄1 · · · z̄m. Also verify that the conjugate of ∑_{k=1}^m zk equals ∑_{k=1}^m z̄k. In words, this says the conjugate of a product equals the product of the conjugates and the conjugate of a sum equals the sum of the conjugates.
Exercise 6.1.6 Suppose p(x) = an x^n + a_{n−1} x^{n−1} + · · · + a1 x + a0 where all the ak are real numbers. Suppose also that p(z) = 0 for some z ∈ C. Show it follows that p(z̄) = 0 also.
Exercise 6.1.7 I claim that 1 = −1. Here is why: −1 = i · i = √(−1) √(−1) = √((−1)²) = √1 = 1. This is clearly a remarkable result but is there something wrong with it? If so, what is wrong?
6.2 Polar Form
Outcomes
A. Convert a complex number from standard form to polar form, and from polar form to standard
form.
In the previous section, we identified a complex number z = a + bi with a point (a, b) in the coordinate
plane. There is another form in which we can express the same number, called the polar form. The polar
form is the focus of this section. It will turn out to be very useful if not crucial for certain calculations as
we shall soon see.
Suppose z = a + bi is a complex number, and let r = √(a² + b²) = |z|. Recall that r is the modulus of z. Note first that

(a/r)² + (b/r)² = (a² + b²)/r² = 1

and so (a/r, b/r) is a point on the unit circle. Therefore, there exists an angle θ (in radians) such that

cos θ = a/r,  sin θ = b/r

In other words θ is an angle such that a = r cos θ and b = r sin θ, that is θ = cos^{−1}(a/r) and θ = sin^{−1}(b/r). We call this angle the argument of z.
We often speak of the principal argument of z. This is the unique angle θ ∈ (−π, π] such that

cos θ = a/r,  sin θ = b/r
The polar form of the complex number z = a + bi = r(cos θ + i sin θ) is for convenience written as:

z = re^{iθ}

where r = √(a² + b²) and θ is the argument of z.
When given z = re^{iθ}, the identity e^{iθ} = cos θ + i sin θ will convert z back to standard form. Here we think of e^{iθ} as a shortcut for cos θ + i sin θ. This is all we will need in this course, but in reality e^{iθ} can be considered as the complex equivalent of the exponential function where this turns out to be a true equality.
[Figure: the point z = a + bi = re^{iθ} in the plane, at distance r = √(a² + b²) from the origin and angle θ from the positive x axis]
Thus we can convert any complex number in the standard (Cartesian) form z = a + bi into its polar
form. Consider the following example.
Consider the following example, in which we write z = 2 + 2i in the polar form z = re^{iθ}.
Solution. First, find r. By the above discussion, r = √(a² + b²) = |z|. Therefore,

r = √(2² + 2²) = √8 = 2√2

Now, to find θ, we plot the point (2, 2) and find the angle from the positive x axis to the line between this point and the origin. In this case, θ = 45° = π/4. That is, we found the unique angle θ such that θ = cos^{−1}(1/√2) and θ = sin^{−1}(1/√2).
Note that in polar form, we always express angles in radians, not degrees.
Hence, we can write z as

z = 2√2 e^{iπ/4}
Notice that the standard and polar forms are completely equivalent. That is, not only can we transform a complex number from standard form to its polar form, we can also take a complex number in polar form
and convert it back to standard form.
Next we consider the reverse direction: given z = 2e^{2πi/3} in polar form, write it in the standard form z = a + bi.

Solution. Let z = 2e^{2πi/3} be the polar form of a complex number. Recall that e^{iθ} = cos θ + i sin θ. Therefore, using standard values of sin and cos we get:

z = 2 cos(2π/3) + 2i sin(2π/3) = −1 + √3 i
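Both conversions can be sketched with Python's cmath module (not part of the text):

    import cmath

    print(cmath.polar(2 + 2j))                 # (2.828..., 0.785...): r = 2*sqrt(2), theta = pi/4
    print(cmath.rect(2, 2 * cmath.pi / 3))     # about (-1 + 1.732j): standard form of 2e^{2*pi*i/3}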
Exercises
Exercise 6.2.8 Let z = 3 + 3i be a complex number written in standard form. Convert z to polar form, and write it in the form z = re^{iθ}.

Exercise 6.2.9 Let z = 2i be a complex number written in standard form. Convert z to polar form, and write it in the form z = re^{iθ}.
Exercise 6.2.10 Let z = 4e^{(2π/3)i} be a complex number written in polar form. Convert z to standard form, and write it in the form z = a + bi.

Exercise 6.2.11 Let z = 1e^{(π/6)i} be a complex number written in polar form. Convert z to standard form, and write it in the form z = a + bi.
Exercise 6.2.12 If z and w are two complex numbers and the polar form of z involves the angle θ while the polar form of w involves the angle φ, show that in the polar form for zw the angle involved is θ + φ.
6.3 Roots of Complex Numbers
Outcomes
A. Understand De Moivre's theorem and be able to use it to find the roots of a complex number.

A fundamental identity is the formula of De Moivre with which we begin this section: for n a positive integer,

(r(cos θ + i sin θ))^n = r^n (cos nθ + i sin nθ)
Proof. The proof is by induction on n. It is clear the formula holds if n = 1. Suppose it is true for n. Then, consider n + 1:

(r(cos θ + i sin θ))^{n+1} = (r(cos θ + i sin θ))^n (r(cos θ + i sin θ))

which by induction equals

r^{n+1} (cos nθ + i sin nθ)(cos θ + i sin θ)
= r^{n+1} ((cos nθ cos θ − sin nθ sin θ) + i(sin nθ cos θ + cos nθ sin θ))
= r^{n+1} (cos(n + 1)θ + i sin(n + 1)θ)

by the formulas for the cosine and sine of the sum of two angles.
The process used in the previous proof, called mathematical induction, is very powerful in Mathematics and Computer Science and is explored in more detail in the Appendix.
Now, consider a corollary of Theorem 6.15: for k a positive integer, every nonzero complex number z has exactly k distinct kth roots in C.
Proof. Let z = a + bi and let z = |z|(cos θ + i sin θ) be the polar form of the complex number. By De Moivre's theorem, a complex number w = r(cos α + i sin α) is a kth root of z if and only if

w^k = r^k (cos kα + i sin kα) = |z|(cos θ + i sin θ)

This requires r^k = |z| and so r = |z|^{1/k}. Also, both cos(kα) = cos θ and sin(kα) = sin θ. This can only happen if

kα = θ + 2ℓπ

for ℓ an integer. Thus

α = (θ + 2ℓπ)/k,  ℓ = 0, 1, 2, . . . , k − 1

and so the kth roots of z are of the form

|z|^{1/k} ( cos((θ + 2ℓπ)/k) + i sin((θ + 2ℓπ)/k) ),  ℓ = 0, 1, 2, . . . , k − 1

Since the cosine and sine are periodic of period 2π, there are exactly k distinct numbers which result from this formula.
The procedure for finding the n nth roots of z ∈ C is as follows.

1. Write z in polar form, z = se^{iφ}, and write a candidate root in polar form as w = re^{iα}.

2. The equation w^n = z then becomes the pair of equations

r^n = s,   e^{inα} = e^{iφ}   (6.1)

3. The solutions to r^n = s are given by r = ⁿ√s.

4. The solutions to e^{inα} = e^{iφ} are given by

nα = φ + 2πℓ, for ℓ = 0, 1, 2, . . . , n − 1

or

α = φ/n + (2π/n)ℓ, for ℓ = 0, 1, 2, . . . , n − 1

5. Using the solutions r, α to the equations given in (6.1), construct the nth roots of the form w = re^{iα}.
Notice that once the roots are obtained in the final step, they can then be converted to standard form if necessary. Let's consider an example of this concept: finding the cube roots of i, that is, solving z³ = i. Note that according to Corollary 6.16, there are exactly 3 cube roots of a complex number.
Solution. First, convert each number to polar form: z = re^{iθ} and i = 1e^{iπ/2}. The equation now becomes

(re^{iθ})³ = r³ e^{3iθ} = 1e^{iπ/2}

Therefore, the two equations that we need to solve are r³ = 1 and 3iθ = iπ/2. Given that r ∈ R and r³ = 1, it follows that r = 1.

Solving the second equation is as follows. First divide by i. Then, since the argument of i is not unique, we write 3θ = π/2 + 2πℓ for ℓ = 0, 1, 2:

3θ = π/2 + 2πℓ for ℓ = 0, 1, 2
θ = π/6 + (2π/3)ℓ for ℓ = 0, 1, 2

For ℓ = 0:

θ = π/6 + (2π/3)(0) = π/6

For ℓ = 1:

θ = π/6 + (2π/3)(1) = 5π/6

For ℓ = 2:

θ = π/6 + (2π/3)(2) = 3π/2

Therefore, the three roots are given by

1e^{iπ/6},  1e^{i 5π/6},  1e^{i 3π/2}
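A sketch of the same computation in code (not from the text), generating the three cube roots of i from the procedure above:

    import cmath

    z = 1j
    s, phi = cmath.polar(z)        # s = 1, phi = pi/2
    roots = [cmath.rect(s ** (1/3), (phi + 2 * cmath.pi * l) / 3) for l in range(3)]
    print(roots)                   # angles pi/6, 5*pi/6, 3*pi/2 on the unit circle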
Next, we use cube roots to factor x³ − 27.

Solution. First find the cube roots of 27. By the above procedure, these cube roots are

3,  3(−1/2 + i(√3/2)),  and  3(−1/2 − i(√3/2))

You may wish to verify this using the above steps.
Therefore,

x³ − 27 = (x − 3) ( x − 3(−1/2 + i(√3/2)) ) ( x − 3(−1/2 − i(√3/2)) )

Note also that

( x − 3(−1/2 + i(√3/2)) ) ( x − 3(−1/2 − i(√3/2)) ) = x² + 3x + 9

and so

x³ − 27 = (x − 3)(x² + 3x + 9)

where the quadratic polynomial x² + 3x + 9 cannot be factored without using complex numbers.
Note that even though the polynomial x³ − 27 has all real coefficients, it has some complex zeros, 3(−1/2 + i(√3/2)) and 3(−1/2 − i(√3/2)). These zeros are complex conjugates of each other. It is always the case that if a polynomial has real coefficients and a complex root, it will also have a root equal to the complex conjugate.
Exercises
Exercise 6.3.16 De Moivre's theorem says [r(cos t + i sin t)]^n = r^n (cos nt + i sin nt) for n a positive integer. Does this formula continue to hold for all integers n, even negative integers? Explain.
Exercise 6.3.17 Factor x³ + 8 as a product of linear factors. Hint: Use the result of 6.3.14.

Exercise 6.3.18 Write x³ + 27 in the form (x + 3)(x² + ax + b) where x² + ax + b cannot be factored any more using only real numbers.

Exercise 6.3.19 Completely factor x⁴ + 16 as a product of linear factors. Hint: Use the result of 6.3.15.

Exercise 6.3.20 Factor x⁴ + 16 as the product of two quadratic polynomials each of which cannot be factored further without using complex numbers.
Exercise 6.3.21 If n is an integer, is it always true that (cos θ − i sin θ)^n = cos(nθ) − i sin(nθ)? Explain.
Exercise 6.3.22 Suppose p(x) = an x^n + a_{n−1} x^{n−1} + · · · + a1 x + a0 is a polynomial and it has n zeros,

z1, z2, . . . , zn

listed according to multiplicity. (z is a root of multiplicity m if the polynomial f(x) = (x − z)^m divides p(x) but (x − z) f(x) does not.) Show that

p(x) = an (x − z1)(x − z2) · · · (x − zn)
6.4 The Quadratic Formula
Outcomes
A. Use the Quadratic Formula to find the complex roots of a quadratic equation.
The roots (or solutions) of a quadratic equation ax² + bx + c = 0 where a, b, c are real numbers are obtained by solving the familiar quadratic formula given by

x = (−b ± √(b² − 4ac)) / (2a)
When working with real numbers, we cannot solve this formula if b² − 4ac < 0. However, complex numbers allow us to find square roots of negative numbers, and the quadratic formula remains valid for finding roots of the corresponding quadratic equation. In this case there are exactly two distinct (complex) square roots of b² − 4ac, which are i√(4ac − b²) and −i√(4ac − b²).

Here is an example: find the solutions to x² + 2x + 5 = 0.
Solution. In terms of the quadratic equation above, a = 1, b = 2, and c = 5. Therefore, we can use the quadratic formula with these values, which becomes

x = (−b ± √(b² − 4ac)) / (2a) = (−2 ± √(2² − 4(1)(5))) / (2(1)) = (−2 ± √(−16)) / 2 = −1 ± 2i

We can verify that these are solutions of the original equation. We will show x = −1 + 2i and leave x = −1 − 2i as an exercise. Substituting into the equation,

(−1 + 2i)² + 2(−1 + 2i) + 5 = (−3 − 4i) + (−2 + 4i) + 5 = 0

Hence x = −1 + 2i is a solution.
What if the coefficients of the quadratic equation are actually complex numbers? Does the formula
hold even in this case? The answer is yes. This is a hint on how to do Problem 6.4.26 below, a special case
of the fundamental theorem of algebra, and an ingredient in the proof of some versions of this theorem.
Consider the following example: find the solutions to x² − 2ix − 5 = 0.
Solution. In terms of the quadratic equation above, a = 1, b = −2i, and c = −5. Therefore, we can use the quadratic formula with these values, which becomes

x = (−b ± √(b² − 4ac)) / (2a) = (2i ± √((−2i)² − 4(1)(−5))) / (2(1))

Solving this equation, we see that the solutions are given by

x = (2i ± √(−4 + 20)) / 2 = (2i ± 4) / 2 = i ± 2

We can verify that these are solutions of the original equation. We will show x = i + 2 and leave x = i − 2 as an exercise. Substituting into the equation,

(i + 2)² − 2i(i + 2) − 5 = (3 + 4i) + (2 − 4i) − 5 = 0

Hence x = i + 2 is a solution.
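Both examples can be reproduced with a small quadratic-formula helper; cmath.sqrt handles real and complex coefficients alike. A sketch (not part of the text):

    import cmath

    def quad_roots(a, b, c):
        d = cmath.sqrt(b * b - 4 * a * c)
        return (-b + d) / (2 * a), (-b - d) / (2 * a)

    print(quad_roots(1, 2, 5))       # (-1+2j) and (-1-2j)
    print(quad_roots(1, -2j, -5))    # (2+1j) and (-2+1j), i.e. i +/- 2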
We conclude this section by stating an essential theorem, the Fundamental Theorem of Algebra: every polynomial of degree at least 1 with complex (or real) coefficients has at least one complex root.
Exercises
Exercise 6.4.24 Give the solutions to the following quadratic equations having real coefficients.
(a) x² − 2x + 2 = 0

(b) 3x² + x + 3 = 0

(c) x² − 6x + 13 = 0

(d) x² + 4x + 9 = 0

(e) 4x² + 4x + 5 = 0
Exercise 6.4.25 Give the solutions to the following quadratic equations having complex coefficients.

(a) x² + 2x + 1 + i = 0

(d) x² − 4ix − 5 = 0

(e) 3x² + (1 − i)x + 3i = 0
Exercise 6.4.26 Prove the fundamental theorem of algebra for quadratic polynomials having coefficients in C. That is, show that an equation of the form ax² + bx + c = 0, where a, b, c are complex numbers and a ≠ 0, has a complex solution. Hint: Consider the fact, noted earlier, that the expressions given by the quadratic formula do in fact serve as solutions.
7. Spectral Theory
7.1 Eigenvalues and Eigenvectors of a Matrix
Outcomes
A. Describe eigenvalues geometrically and algebraically.
Spectral Theory refers to the study of eigenvalues and eigenvectors of a matrix. It is of fundamental
importance in many areas and is the subject of our study for this chapter.
In this section, we will work with the entire set of complex numbers, denoted by C. Recall that the real
numbers, R are contained in the complex numbers, so the discussions in this section apply to both real and
complex numbers.
To illustrate the idea behind what will be discussed, consider the following example.
In this case, the product AX resulted in a vector which is equal to 10 times the vector X . In other
words, AX = 10X .
Let's see what happens in the next product. Compute AX for the vector

X = [1, 0, 0]^T
In this case, the product AX resulted in a vector equal to 0 times the vector X , AX = 0X .
Perhaps this matrix is such that AX results in kX, for every vector X. However, consider

[ 0   5  −10 ] [ 1 ]   [ −5 ]
[ 0  22   16 ] [ 1 ] = [ 38 ]
[ 0  −9   −2 ] [ 1 ]   [ −11 ]
In this case, AX did not result in a vector of the form kX for some scalar k.

There is something special about the first two products calculated in Example 7.1. Notice that for each, AX = kX where k is some scalar. When this equation holds for some X and k, we call the scalar k an eigenvalue of A. We often use the special symbol λ instead of k when referring to eigenvalues. In Example 7.1, the values 10 and 0 are eigenvalues for the matrix A and we can label these as λ1 = 10 and λ2 = 0.

When AX = λX for some X ≠ 0, we call such an X an eigenvector of the matrix A. The eigenvectors of A are associated to an eigenvalue. Hence, if λ1 is an eigenvalue of A and AX = λ1 X, we can label this eigenvector as X1. Note again that in order to be an eigenvector, X must be nonzero.
There is also a geometric significance to eigenvectors. When you have a nonzero vector which, when
multiplied by a matrix results in another vector which is parallel to the first or equal to 0, this vector is
called an eigenvector of the matrix. This is the meaning when the vectors are in Rn .
The formal definition of eigenvalues and eigenvectors is as follows.
\[ AX = \lambda X \tag{7.1} \]
for some scalar \(\lambda\). Then \(\lambda\) is called an eigenvalue of the matrix \(A\) and \(X\) is called an eigenvector of \(A\) associated with \(\lambda\), or a \(\lambda\)-eigenvector of \(A\).
The set of all eigenvalues of an \(n \times n\) matrix \(A\) is denoted by \(\sigma(A)\) and is referred to as the spectrum of \(A\).
The eigenvectors of a matrix A are those vectors X for which multiplication by A results in a vector in
the same direction or opposite direction to X . Since the zero vector 0 has no direction this would make no
sense for the zero vector. As noted above, 0 is never allowed to be an eigenvector.
Let's look at eigenvectors in more detail. Suppose \(X\) satisfies 7.1. Then
\[ AX - \lambda X = 0 \]
or
\[ (A - \lambda I)X = 0 \]
for some \(X \neq 0\). Equivalently you could write \((\lambda I - A)X = 0\), which is more commonly used. Hence, when we are looking for eigenvectors, we are looking for nontrivial solutions to this homogeneous system of equations!
Recall that the solutions to a homogeneous system of equations consist of basic solutions, and the linear combinations of those basic solutions. In this context, we call the basic solutions of the equation \((\lambda I - A)X = 0\) basic eigenvectors. It follows that any (nonzero) linear combination of basic eigenvectors is again an eigenvector.
Suppose the matrix \((\lambda I - A)\) is invertible, so that \((\lambda I - A)^{-1}\) exists. Then the following equation would be true.
\[ X = IX = (\lambda I - A)^{-1}(\lambda I - A)X = (\lambda I - A)^{-1}\bigl((\lambda I - A)X\bigr) = (\lambda I - A)^{-1}0 = 0 \]
This claims that \(X = 0\). However, we have required that \(X \neq 0\). Therefore \((\lambda I - A)\) cannot have an inverse!
Recall that if a matrix is not invertible, then its determinant is equal to 0. Therefore we can conclude that
\[ \det(\lambda I - A) = 0 \tag{7.2} \]
Note that this is equivalent to \(\det(A - \lambda I) = 0\).
The expression \(\det(xI - A)\) is a polynomial (in the variable \(x\)) called the characteristic polynomial of \(A\), and \(\det(xI - A) = 0\) is called the characteristic equation. For this reason we may also refer to the eigenvalues of \(A\) as characteristic values, but the former is often used for historical reasons.
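For readers who wish to experiment, here is a minimal Python (NumPy) sketch of this point of view, using an example matrix of our own choosing: the roots of the characteristic polynomial of \(A\) coincide with the eigenvalues NumPy computes directly.
\begin{verbatim}
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])          # our own example, not from the text

coeffs = np.poly(A)          # coefficients of det(xI - A): [1., -4., 3.]
print(np.roots(coeffs))      # [3. 1.]  roots of the characteristic polynomial
print(np.linalg.eigvals(A))  # [3. 1.]  the same spectrum, computed directly
\end{verbatim}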
The following theorem claims that the roots of the characteristic polynomial are the eigenvalues of A.
Thus when 7.2 holds, A has a nonzero eigenvector.
Proof. For \(A\) an \(n \times n\) matrix, the method of Laplace Expansion demonstrates that \(\det(\lambda I - A)\) is a polynomial of degree \(n\). As such, the equation 7.2 has a solution \(\lambda \in \mathbb{C}\) by the Fundamental Theorem of Algebra. The fact that \(\lambda\) is an eigenvalue is left as an exercise.
Now that eigenvalues and eigenvectors have been defined, we will study how to find them for a matrix A.
First, consider the following definition.
For example, suppose the characteristic polynomial of \(A\) is given by \((x - 2)^2\). Solving for the roots of this polynomial, we set \((x - 2)^2 = 0\) and solve for \(x\). We find that \(\lambda = 2\) is a root that occurs twice. Hence, in this case, \(\lambda = 2\) is an eigenvalue of \(A\) of multiplicity equal to 2.
We will now look at how to find the eigenvalues and eigenvectors for a matrix A in detail. The steps
used are summarized in the following procedure.
2. For each \(\lambda\), find the basic eigenvectors \(X \neq 0\) by finding the basic solutions to \((\lambda I - A)X = 0\).
To verify your work, make sure that \(AX = \lambda X\) for each \(\lambda\) and associated eigenvector \(X\).
Solution. We will use Procedure 7.5. First we find the eigenvalues of \(A\) by solving the equation
\[ \det(xI - A) = 0 \]
This gives
\[ \det\left( x \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix} \right) = 0 \]
\[ \det \begin{bmatrix} x + 5 & -2 \\ 7 & x - 4 \end{bmatrix} = 0 \]
\[ x^2 + x - 6 = 0 \]
Solving this equation, we find the eigenvalues \(\lambda_1 = 2\) and \(\lambda_2 = -3\).
First we find the basic eigenvector for \(\lambda_1 = 2\), that is, the solutions to \((2I - A)X = 0\). The augmented matrix for this system and corresponding reduced row-echelon form are given by
\[ \begin{bmatrix} 7 & -2 & 0 \\ 7 & -2 & 0 \end{bmatrix} \rightarrow \cdots \rightarrow \begin{bmatrix} 1 & -\frac{2}{7} & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
The solution is any vector of the form \(t \begin{bmatrix} \frac{2}{7} \\ 1 \end{bmatrix}\). Multiplying this vector by 7 we obtain a simpler description for the solution to this system, given by
\[ t \begin{bmatrix} 2 \\ 7 \end{bmatrix} \]
To check, compute
\[ \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 7 \end{bmatrix} = \begin{bmatrix} 4 \\ 14 \end{bmatrix} = 2 \begin{bmatrix} 2 \\ 7 \end{bmatrix} \]
This is what we wanted, so we know this basic eigenvector is correct.
Next we will repeat this process to find the basic eigenvector for \(\lambda_2 = -3\). We wish to find all vectors \(X \neq 0\) such that \(AX = -3X\). These are the solutions to \(((-3)I - A)X = 0\).
\[ \left( (-3) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} - \begin{bmatrix} -5 & 2 \\ -7 & 4 \end{bmatrix} \right) \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
\[ \begin{bmatrix} 2 & -2 \\ 7 & -7 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
The augmented matrix for this system and corresponding reduced row-echelon form are given by
\[ \begin{bmatrix} 2 & -2 & 0 \\ 7 & -7 & 0 \end{bmatrix} \rightarrow \cdots \rightarrow \begin{bmatrix} 1 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
Solution. We will use Procedure 7.5. First we need to find the eigenvalues of \(A\). Recall that they are the solutions of the equation
\[ \det(xI - A) = 0 \]
which becomes
\[ \det \begin{bmatrix} x - 5 & 10 & 5 \\ -2 & x - 14 & -2 \\ 4 & 8 & x - 6 \end{bmatrix} = 0 \]
Using Laplace Expansion, compute this determinant and simplify. The result is the following equation.
\[ (x - 5)\left( x^2 - 20x + 100 \right) = 0 \]
Solving this equation, we find that the eigenvalues are \(\lambda_1 = 5\), \(\lambda_2 = 10\) and \(\lambda_3 = 10\). Notice that 10 is a root of multiplicity two due to
\[ x^2 - 20x + 100 = (x - 10)^2 \]
Therefore, \(\lambda_2 = 10\) is an eigenvalue of multiplicity two.
Now that we have found the eigenvalues for A, we can compute the eigenvectors.
First we will find the basic eigenvectors for \(\lambda_1 = 5\). In other words, we want to find all non-zero vectors \(X\) so that \(AX = 5X\). This requires that we solve the equation \((5I - A)X = 0\) for \(X\) as follows.
\[ \left( 5 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \right) \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
By now this is a familiar problem. You set up the augmented matrix and row reduce to get the solution. Thus the matrix you must row reduce is
\[ \begin{bmatrix} 0 & 10 & 5 & 0 \\ -2 & -9 & -2 & 0 \\ 4 & 8 & -1 & 0 \end{bmatrix} \]
The solution is any vector of the form
\[ s \begin{bmatrix} \frac{5}{4} \\ -\frac{1}{2} \\ 1 \end{bmatrix} \]
where \(s \in \mathbb{R}\). If we multiply this vector by 4, we obtain a simpler description for the solution to this system, as given by
\[ t \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} \tag{7.3} \]
where \(t \in \mathbb{R}\). Here, the basic eigenvector is given by
\[ X_1 = \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} \]
Notice that we cannot let \(t = 0\) here, because this would result in the zero vector and eigenvectors are never equal to 0! Other than this value, every other choice of \(t\) in 7.3 results in an eigenvector.
It is a good idea to check your work! To do so, we will take the original matrix and multiply by the basic eigenvector \(X_1\). We check to see if we get \(5X_1\).
\[ \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} = \begin{bmatrix} 25 \\ -10 \\ 20 \end{bmatrix} = 5 \begin{bmatrix} 5 \\ -2 \\ 4 \end{bmatrix} \]
Taking any (nonzero) linear combination of \(X_2\) and \(X_3\) will also result in an eigenvector for the eigenvalue \(\lambda = 10\). As in the case for \(\lambda = 5\), always check your work! For the first basic eigenvector, we can check \(AX_2 = 10X_2\) as follows.
\[ \begin{bmatrix} 5 & -10 & -5 \\ 2 & 14 & 2 \\ -4 & -8 & 6 \end{bmatrix} \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -10 \\ 0 \\ 10 \end{bmatrix} = 10 \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \]
This is what we wanted. Checking the second basic eigenvector, X3 , is left as an exercise.
It is important to remember that for any eigenvector X , X 6= 0. However, it is possible to have eigen-
values equal to zero. This is illustrated in the following example.
This reduces to \(x^3 - 6x^2 + 8x = 0\). You can verify that the solutions are \(\lambda_1 = 0\), \(\lambda_2 = 2\), \(\lambda_3 = 4\). Notice that while eigenvectors can never equal 0, it is possible to have an eigenvalue equal to 0.
Now we will find the basic eigenvectors. For \(\lambda_1 = 0\), we need to solve the equation \((0I - A)X = 0\). This equation becomes \(AX = 0\), and so the augmented matrix for finding the solutions is given by
\[ \begin{bmatrix} 2 & 2 & -2 & 0 \\ 1 & 3 & -1 & 0 \\ -1 & 1 & 1 & 0 \end{bmatrix} \]
We can verify that this eigenvector is correct by checking that the equation \(AX_1 = 0X_1\) holds. The product \(AX_1\) is given by
\[ AX_1 = \begin{bmatrix} 2 & 2 & -2 \\ 1 & 3 & -1 \\ -1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
This clearly equals \(0X_1\), so the equation holds. Hence, \(AX_1 = 0X_1\) and so 0 is an eigenvalue of \(A\).
Computing the other basic eigenvectors is left as an exercise.
In the following sections, we examine ways to simplify this process of finding eigenvalues and eigen-
vectors by using properties of special types of matrices.
There are three special kinds of matrices which we can use to simplify the process of finding eigenvalues
and eigenvectors. Throughout this section, we will discuss similar matrices, elementary matrices, as well
as triangular matrices.
We begin with a definition.
\[ A = P^{-1}BP \]
It turns out that we can use the concept of similar matrices to help us find the eigenvalues of matrices.
Consider the following lemma.
Proof. We need to show two things. First, we need to show that if \(A = P^{-1}BP\), then \(A\) and \(B\) have the same eigenvalues. Secondly, we show that if \(A\) and \(B\) have the same eigenvalues, then \(A = P^{-1}BP\).
Here is the proof of the first statement. Suppose \(A = P^{-1}BP\) and \(\lambda\) is an eigenvalue of \(A\), that is \(AX = \lambda X\) for some \(X \neq 0\). Then
\[ P^{-1}BPX = \lambda X \]
and so
\[ BPX = \lambda PX \]
Since \(P\) is one to one and \(X \neq 0\), it follows that \(PX \neq 0\). Here, \(PX\) plays the role of the eigenvector in this equation. Thus \(\lambda\) is also an eigenvalue of \(B\). One can similarly verify that any eigenvalue of \(B\) is also an eigenvalue of \(A\), and thus both matrices have the same eigenvalues as desired.
Proving the second statement is similar and is left as an exercise.
Note that this proof also demonstrates that the eigenvectors of \(A\) and \(B\) will (generally) be different. We see in the proof that \(AX = \lambda X\), while \(B(PX) = \lambda(PX)\). Therefore, for an eigenvalue \(\lambda\), \(A\) will have the eigenvector \(X\) while \(B\) will have the eigenvector \(PX\).
The second special type of matrices we discuss in this section is elementary matrices. Recall from
Definition 2.43 that an elementary matrix E is obtained by applying one row operation to the identity
matrix.
It is possible to use elementary matrices to simplify a matrix before searching for its eigenvalues and
eigenvectors. This is illustrated in the following example.
Solution. This matrix has big numbers and therefore we would like to simplify as much as possible before
computing the eigenvalues.
We will do so using row operations. First, add 2 times the second row to the third row. To do so, left multiply \(A\) by \(E(2,2)\). Then right multiply \(A\) by the inverse of \(E(2,2)\) as illustrated.
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} \begin{bmatrix} 33 & 105 & 105 \\ 10 & 28 & 30 \\ -20 & -60 & -62 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix} \]
By Lemma 7.10, the resulting matrix has the same eigenvalues as \(A\) where here, the matrix \(E(2,2)\) plays the role of \(P\).
We do this step again, as follows. In this step, we use the elementary matrix obtained by adding \(-3\) times the second row to the first row.
\[ \begin{bmatrix} 1 & -3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & 3 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix} \tag{7.4} \]
Again by Lemma 7.10, this resulting matrix has the same eigenvalues as A. At this point, we can easily
find the eigenvalues. Let
\[ B = \begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix} \]
Then, we find the eigenvalues of B (and therefore of A) by solving the equation det (xI B) = 0. You
should verify that this equation becomes
\[ (x + 2)(x + 2)(x - 3) = 0 \]
Solving this equation results in eigenvalues of \(\lambda_1 = -2\), \(\lambda_2 = -2\), and \(\lambda_3 = 3\). Therefore, these are also the eigenvalues of \(A\).
Through using elementary matrices, we were able to create a matrix for which finding the eigenvalues was easier than for \(A\). At this point, you could go back to the original matrix \(A\) and solve \((\lambda I - A)X = 0\) to obtain the eigenvectors of \(A\).
Notice that when you multiply on the right by an elementary matrix, you are doing the column op-
eration defined by the elementary matrix. In 7.4 multiplication by the elementary matrix on the right merely involves taking 3 times the first column and adding to the second. Thus, without referring to the elementary matrices, the transition to the new matrix in 7.4 can be illustrated by
\[ \begin{bmatrix} 33 & -105 & 105 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix} \rightarrow \begin{bmatrix} 3 & -9 & 15 \\ 10 & -32 & 30 \\ 0 & 0 & -2 \end{bmatrix} \rightarrow \begin{bmatrix} 3 & 0 & 15 \\ 10 & -2 & 30 \\ 0 & 0 & -2 \end{bmatrix} \]
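A quick numerical check of Lemma 7.10 for this example (a sketch; entries as reconstructed above, with \(E\) the first elementary matrix used):
\begin{verbatim}
import numpy as np

A = np.array([[ 33.0, 105.0, 105.0],
              [ 10.0,  28.0,  30.0],
              [-20.0, -60.0, -62.0]])
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 2.0, 1.0]])

B = E @ A @ np.linalg.inv(E)            # a matrix similar to A
print(np.sort(np.linalg.eigvals(A)))    # [-2. -2.  3.]
print(np.sort(np.linalg.eigvals(B)))    # the same eigenvalues
\end{verbatim}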
The third special type of matrix we will consider in this section is the triangular matrix. Recall Defi-
nition 3.12 which states that an upper (lower) triangular matrix contains all zeros below (above) the main
diagonal. Remember that finding the determinant of a triangular matrix is a simple procedure of taking the product of the entries on the main diagonal. It turns out that there is also a simple way to find the
eigenvalues of a triangular matrix.
In the next example we will demonstrate that the eigenvalues of a triangular matrix are the entries on
the main diagonal.
Exercises
Exercise 7.1.1 If \(A\) is an invertible \(n \times n\) matrix, compare the eigenvalues of \(A\) and \(A^{-1}\). More generally, for \(m\) an arbitrary integer, compare the eigenvalues of \(A\) and \(A^m\).
Exercise 7.1.2 If \(A\) is an \(n \times n\) matrix and \(c\) is a nonzero constant, compare the eigenvalues of \(A\) and \(cA\).
Exercise 7.1.3 Let \(A, B\) be invertible \(n \times n\) matrices which commute. That is, \(AB = BA\). Suppose \(X\) is an eigenvector of \(B\). Show that then \(AX\) must also be an eigenvector for \(B\).
Exercise 7.1.4 Suppose \(A\) is an \(n \times n\) matrix and it satisfies \(A^m = A\) for some \(m\) a positive integer larger than 1. Show that if \(\lambda\) is an eigenvalue of \(A\) then \(|\lambda|\) equals either 0 or 1.
\[ A(kX + pY) = \lambda(kX + pY) \]
One eigenvalue is 2.
One eigenvalue is 1.
One eigenvalue is 3.
One eigenvalue is 2.
Exercise 7.1.15 If \(A\) is the matrix of a linear transformation which rotates all vectors in \(\mathbb{R}^2\) through \(60^\circ\), explain why \(A\) cannot have any real eigenvalues. Is there an angle such that rotation through this angle would have a real eigenvalue? What eigenvalues would be obtainable in this way?
Exercise 7.1.16 Let \(A\) be the \(2 \times 2\) matrix of the linear transformation which rotates all vectors in \(\mathbb{R}^2\) through an angle of \(\theta\). For which values of \(\theta\) does \(A\) have a real eigenvalue?
Exercise 7.1.17 Let T be the linear transformation which reflects vectors about the x axis. Find a matrix
for T and then find its eigenvalues and eigenvectors.
Exercise 7.1.18 Let \(T\) be the linear transformation which rotates all vectors in \(\mathbb{R}^2\) counterclockwise through an angle of \(\pi/2\). Find a matrix of \(T\) and then find its eigenvalues and eigenvectors.
Exercise 7.1.19 Let T be the linear transformation which reflects all vectors in R3 through the xy plane.
Find a matrix for T and then obtain its eigenvalues and eigenvectors.
7.2 Diagonalization
Outcomes
A. Determine when it is possible to diagonalize a matrix.
The most important theorem about diagonalizability is the following major result.
Proof. Suppose \(P\) is given as above as an invertible matrix whose columns are eigenvectors of \(A\). Then \(P^{-1}\) is of the form
\[ P^{-1} = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix} \]
where \(W_k^T X_j = \delta_{kj}\), which is Kronecker's symbol defined by
\[ \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]
Then
\[ P^{-1}AP = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix} \begin{bmatrix} AX_1 & AX_2 & \cdots & AX_n \end{bmatrix} = \begin{bmatrix} W_1^T \\ W_2^T \\ \vdots \\ W_n^T \end{bmatrix} \begin{bmatrix} \lambda_1 X_1 & \lambda_2 X_2 & \cdots & \lambda_n X_n \end{bmatrix} = \begin{bmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_n \end{bmatrix} \]
Solution. By Theorem 7.13 we use the eigenvectors of \(A\) as the columns of \(P\), and the corresponding eigenvalues of \(A\) as the diagonal entries of \(D\).
First, we will find the eigenvalues of \(A\). To do so, we solve \(\det(xI - A) = 0\) as follows.
\[ \det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \right) = 0 \]
This computation is left as an exercise, and you should verify that the eigenvalues are \(\lambda_1 = 2\), \(\lambda_2 = 2\), and \(\lambda_3 = 6\).
Next, we need to find the eigenvectors. We first find the eigenvectors for \(\lambda_1, \lambda_2 = 2\). Solving \((2I - A)X = 0\) to find the eigenvectors, we find that the eigenvectors are
\[ t \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]
where \(t, s\) are scalars. Hence there are two basic eigenvectors which are given by
\[ X_1 = \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \quad X_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \]
You can verify that the basic eigenvector for \(\lambda_3 = 6\) is \(X_3 = \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix}\).
Then, we construct the matrix \(P\) as follows.
\[ P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix} = \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix} \]
That is, the columns of \(P\) are the basic eigenvectors of \(A\). Then, you can verify that
\[ P^{-1} = \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix} \]
Thus,
\[ P^{-1}AP = \begin{bmatrix} -\frac{1}{4} & \frac{1}{2} & \frac{1}{4} \\ \frac{1}{2} & 1 & \frac{1}{2} \\ \frac{1}{4} & \frac{1}{2} & -\frac{1}{4} \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix} \]
You can see that the result here is a diagonal matrix where the entries on the main diagonal are the
eigenvalues of A. We expected this based on Theorem 7.13. Notice that eigenvalues on the main diagonal
must be in the same order as the corresponding eigenvectors in P.
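The same diagonalization can be checked in a few lines of NumPy (a sketch; matrices as reconstructed above):
\begin{verbatim}
import numpy as np

A = np.array([[ 2.0,  0.0,  0.0],
              [ 1.0,  4.0, -1.0],
              [-2.0, -4.0,  4.0]])
P = np.array([[-2.0, 1.0,  0.0],
              [ 1.0, 0.0,  1.0],
              [ 0.0, 1.0, -2.0]])

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))   # diag(2, 2, 6), in the order of P's columns
\end{verbatim}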
Consider the next important theorem.
The corollary that follows from this theorem gives a useful tool in determining if A is diagonalizable.
It is possible that a matrix \(A\) cannot be diagonalized. In other words, we cannot find an invertible matrix \(P\) so that \(P^{-1}AP = D\).
Consider the following example.
Solution. Through the usual procedure, we find that the eigenvalues of \(A\) are \(\lambda_1 = 1\), \(\lambda_2 = 1\). To find the eigenvectors, we solve the equation \((\lambda I - A)X = 0\). The matrix \((\lambda I - A)\) is given by
\[ \begin{bmatrix} \lambda - 1 & -1 \\ 0 & \lambda - 1 \end{bmatrix} \]
Substituting \(\lambda = 1\), solving the equation \((\lambda I - A)X = 0\) involves carrying the following augmented matrix to its reduced row-echelon form.
\[ \begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \rightarrow \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
Then the eigenvectors are of the form
\[ t \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
and the basic eigenvector is
\[ X_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
In this case, the matrix \(A\) has one eigenvalue of multiplicity two, but only one basic eigenvector. In order to diagonalize \(A\), we need to construct an invertible \(2 \times 2\) matrix \(P\). However, because \(A\) only has one basic eigenvector, we cannot construct this \(P\). Notice that if we were to use \(X_1\) as both columns of \(P\), \(P\) would not be invertible. For this reason, we cannot repeat eigenvectors in \(P\).
Hence this matrix cannot be diagonalized.
The idea that a matrix may not be diagonalizable suggests that conditions exist to determine when it
is possible to diagonalize a matrix. We saw earlier in Corollary 7.16 that an n n matrix with n distinct
eigenvalues is diagonalizable. It turns out that there are other useful diagonalizability tests.
First we need the following definition.
In other words, the eigenspace \(E_\lambda(A)\) is all \(X\) such that \(AX = \lambda X\). Notice that this set can be written \(E_\lambda(A) = \operatorname{null}(\lambda I - A)\), showing that \(E_\lambda(A)\) is a subspace of \(\mathbb{R}^n\).
Recall that the multiplicity of an eigenvalue is the number of times that it occurs as a root of the
characteristic polynomial.
Consider now the following lemma.
This result tells us that if \(\lambda\) is an eigenvalue of \(A\), then the number of linearly independent \(\lambda\)-eigenvectors is never more than the multiplicity of \(\lambda\). We now use this fact to provide a useful diagonalizability condition.
In some applications, a matrix may have eigenvalues which are complex numbers. For example, this often
occurs in differential equations. These questions are approached in the same way as above.
Consider the following example.
Solution. We will first find the eigenvalues as usual by solving the following equation.
\[ \det\left( x \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \right) = 0 \]
This reduces to \((x - 1)\left( x^2 - 4x + 5 \right) = 0\). The solutions are \(\lambda_1 = 1\), \(\lambda_2 = 2 + i\) and \(\lambda_3 = 2 - i\).
There is nothing new about finding the eigenvectors for \(\lambda_1 = 1\) so this is left as an exercise.
Consider now the eigenvalue \(\lambda_2 = 2 + i\). As usual, we solve the equation \((\lambda I - A)X = 0\) as given by
\[ \left( (2 + i) \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -1 \\ 0 & 1 & 2 \end{bmatrix} \right) X = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
In other words, we need to solve the system represented by the augmented matrix
\[ \begin{bmatrix} 1 + i & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & -1 & i & 0 \end{bmatrix} \]
We now use our row operations to solve the system. Divide the first row by \((1 + i)\) and then take \(-i\) times the second row and add to the third row. This yields
\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & i & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Now multiply the second row by \(-i\) to obtain the reduced row-echelon form, given by
\[ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & -i & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Therefore, the eigenvectors are of the form
\[ t \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix} \]
and the basic eigenvector is given by
\[ X_2 = \begin{bmatrix} 0 \\ i \\ 1 \end{bmatrix} \]
Exercises
Exercise 7.2.26 Suppose \(A\) is an \(n \times n\) matrix and let \(V\) be an eigenvector such that \(AV = \lambda V\). Also suppose the characteristic polynomial of \(A\) is
\[ \det(xI - A) = x^n + a_{n-1}x^{n-1} + \cdots + a_1 x + a_0 \]
Explain why
\[ \left( A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I \right) V = 0 \]
If \(A\) is diagonalizable, give a proof of the Cayley-Hamilton theorem based on this. This theorem says \(A\) satisfies its characteristic equation,
\[ A^n + a_{n-1}A^{n-1} + \cdots + a_1 A + a_0 I = 0 \]
Exercise 7.2.27 Suppose the characteristic polynomial of an \(n \times n\) matrix \(A\) is \(1 - X^n\). Find \(A^{mn}\) where \(m\) is an integer.
One eigenvalue is 2. Diagonalize if possible. Hint: This one has some complex eigenvalues.
One eigenvalue is 2. Diagonalize if possible. Hint: This one has some complex eigenvalues.
One eigenvalue is 1. Diagonalize if possible. Hint: This one has some complex eigenvalues.
One eigenvalue is 3. Diagonalize if possible. Hint: This one has some complex eigenvalues.
Exercise 7.2.32 Suppose \(A\) is an \(n \times n\) matrix consisting entirely of real entries but \(a + ib\) is a complex eigenvalue having the eigenvector, \(X + iY\). Here \(X\) and \(Y\) are real vectors. Show that then \(a - ib\) is also an eigenvalue with the eigenvector, \(X - iY\). Hint: You should remember that the conjugate of a product of complex numbers equals the product of the conjugates. Here \(a + ib\) is a complex number whose conjugate equals \(a - ib\).
Outcomes
A. Use diagonalization to find a high power of a matrix.
Suppose we have a matrix A and we want to find A50 . One could try to multiply A with itself 50 times, but
this is computationally extremely intensive (try it!). However diagonalization allows us to compute high
Similarly,
\[ A^3 = \left( PDP^{-1} \right)^3 = PDP^{-1}PDP^{-1}PDP^{-1} = PD^3P^{-1} \]
In general,
\[ A^n = \left( PDP^{-1} \right)^n = PD^nP^{-1} \]
Therefore, we have reduced the problem to finding Dn . In order to compute Dn , then because D is
diagonal we only need to raise every entry on the main diagonal of D to the power of n.
Through this method, we can compute large powers of matrices. Consider the following example.
Solution. We will first diagonalize \(A\). The steps are left as an exercise and you may wish to verify that the eigenvalues of \(A\) are \(\lambda_1 = 1\), \(\lambda_2 = 1\), and \(\lambda_3 = 2\).
The basic eigenvectors corresponding to \(\lambda_1, \lambda_2 = 1\) are
\[ X_1 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad X_2 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} \]
You can verify that the basic eigenvector corresponding to \(\lambda_3 = 2\) is \(X_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\), and we let \(P = \begin{bmatrix} X_1 & X_2 & X_3 \end{bmatrix}\). Then also
\[ P^{-1} = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} \]
which you may wish to verify.
Then,
\[ P^{-1}AP = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} \begin{bmatrix} 2 & 1 & 0 \\ 0 & 1 & 0 \\ -1 & -1 & 1 \end{bmatrix} \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} = D \]
Therefore,
\[ A^{50} = PD^{50}P^{-1} = \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^{50} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} \]
It follows that
\[ A^{50} = \begin{bmatrix} 0 & -1 & -1 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1^{50} & 0 & 0 \\ 0 & 1^{50} & 0 \\ 0 & 0 & 2^{50} \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ -1 & -1 & 0 \end{bmatrix} = \begin{bmatrix} 2^{50} & -1 + 2^{50} & 0 \\ 0 & 1 & 0 \\ 1 - 2^{50} & 1 - 2^{50} & 1 \end{bmatrix} \]
Through diagonalization, we can efficiently compute a high power of A. Without this, we would be
forced to multiply this by hand!
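Here is a NumPy sketch comparing this result with direct matrix powering (matrix as reconstructed above; 64-bit integers hold \(2^{50}\) exactly):
\begin{verbatim}
import numpy as np

A = np.array([[ 2, 1, 0],
              [ 0, 1, 0],
              [-1,-1, 1]], dtype=np.int64)

direct = np.linalg.matrix_power(A, 50)   # exact integer arithmetic
print(direct[0, 0] == 2**50)             # True
print(direct[0, 1] == -1 + 2**50)        # True
print(direct[2, 0] == 1 - 2**50)         # True
\end{verbatim}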
The next section explores another interesting application of diagonalization.
We already have seen how to use matrix diagonalization to compute powers of matrices. This requires
computing eigenvalues of the matrix A, and finding an invertible matrix of eigenvectors P such that P1 AP
is diagonal. In this section we will see that if the matrix \(A\) is symmetric (see Definition 2.29), then we can actually find such a matrix \(P\) that is an orthogonal matrix of eigenvectors. Thus \(P^{-1}\) is simply its transpose \(P^T\), and \(P^TAP\) is diagonal. When this happens we say that \(A\) is orthogonally diagonalizable.
In fact this happens if and only if A is a symmetric matrix as shown in the following important theorem.
1. A is symmetric.
2. A has an orthonormal set of eigenvectors.
3. A is orthogonally diagonalizable.
Proof. The complete proof is beyond this course, but to give an idea assume that \(A\) has an orthonormal set of eigenvectors, and let \(P\) consist of these eigenvectors as columns. Then \(P^{-1} = P^T\), and \(P^TAP = D\), a diagonal matrix. But then \(A = PDP^T\), and
\[ A^T = \left( PDP^T \right)^T = PD^TP^T = PDP^T = A \]
so \(A\) is symmetric.
Now given a symmetric matrix A, one shows that eigenvectors corresponding to different eigenvalues
are always orthogonal. So it suffices to apply the Gram-Schmidt process on the set of basic eigenvectors
of each eigenvalue to obtain an orthonormal set of eigenvectors.
We demonstrate this in the following example.
Solution. In this case, verify that the eigenvalues are 2 and 1. First we will find an eigenvector for the eigenvalue 2. This involves row reducing the following augmented matrix.
\[ \begin{bmatrix} 2 - 1 & 0 & 0 & 0 \\ 0 & 2 - \frac{3}{2} & -\frac{1}{2} & 0 \\ 0 & -\frac{1}{2} & 2 - \frac{3}{2} & 0 \end{bmatrix} \]
and so an eigenvector is
\[ \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \]
Finally to obtain an eigenvector of length one (a unit eigenvector) we simply divide this vector by its length to yield:
\[ \begin{bmatrix} 0 \\ 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \]
Next consider the case of the eigenvalue 1. To obtain basic eigenvectors, the matrix which needs to be row reduced in this case is
\[ \begin{bmatrix} 1 - 1 & 0 & 0 & 0 \\ 0 & 1 - \frac{3}{2} & -\frac{1}{2} & 0 \\ 0 & -\frac{1}{2} & 1 - \frac{3}{2} & 0 \end{bmatrix} \]
Proceeding as before and normalizing the resulting basic eigenvectors, one obtains an orthogonal matrix \(P\) of unit eigenvectors, and you can verify that
\[ P^TAP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \]
which is the desired diagonal matrix.
We can now apply this technique to efficiently compute high powers of a symmetric matrix.
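In NumPy this is packaged as np.linalg.eigh, which for a symmetric matrix returns an orthogonal matrix of eigenvectors directly. A sketch, using the matrix of the example above:
\begin{verbatim}
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.5, 0.5],
              [0.0, 0.5, 1.5]])

vals, P = np.linalg.eigh(A)             # columns of P: orthonormal eigenvectors
print(vals)                             # [1. 1. 2.]
print(np.allclose(P.T @ P, np.eye(3)))  # True: P is orthogonal, P^{-1} = P^T
print(np.round(P.T @ A @ P, 10))        # diag(1, 1, 2)
\end{verbatim}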
There are applications of great importance which feature a special type of matrix. Matrices whose columns
consist of non-negative numbers that sum to one are called Markov matrices. An important application
Solution. The columns of A are comprised of non-negative numbers which sum to 1. Hence, A is a Markov
matrix.
Now, consider the entries ai j of A in terms of population. The entry a11 = .4 is the proportion of
residents in location one which stay in location one in a given time period. Entry a21 = .6 is the proportion
of residents in location 1 which move to location 2 in the same time period. Entry a12 = .2 is the proportion
of residents in location 2 which move to location 1. Finally, entry a22 = .8 is the proportion of residents
in location 2 which stay in location 2 in this time period.
Considered as a Markov matrix, these numbers are usually identified with probabilities. Hence, we
can say that the probability that a resident of location one will stay in location one in the time period is .4.
Observe that in Example 7.27 if there were initially, say, 15 thousand people in location 1 and 10 thousand in location 2, then after one year there would be \(.4(15) + .2(10) = 8\) thousand people in location 1, and similarly there would be \(.6(15) + .8(10) = 17\) thousand people in location 2 the following year.
More generally let \(X_n = \begin{bmatrix} x_{1n} & \cdots & x_{mn} \end{bmatrix}^T\) where \(x_{in}\) is the population of location \(i\) at time period \(n\). We call \(X_n\) the state vector at period \(n\). In particular, we call \(X_0\) the initial state vector. Letting \(A\) be the migration matrix, we compute the population in each location \(i\) one time period later by \(AX_n\). In order to find the population of location \(i\) after \(k\) years, we compute the \(i^{th}\) component of \(A^kX_0\). This discussion is summarized in the following theorem.
The sum of the entries of \(X_n\) will equal the sum of the entries of the initial vector \(X_0\). Since the columns of \(A\) sum to 1, this sum is preserved for every multiplication by \(A\) as demonstrated below.
\[ \sum_i \sum_j a_{ij}x_j = \sum_j x_j \left( \sum_i a_{ij} \right) = \sum_j x_j \]
Solution. Using Theorem 7.28 we can find the population in each location using the equation Xn+1 = AXn .
For the population after 1 unit, we calculate \(X_1 = AX_0\) as follows.
\[ X_1 = AX_0 \]
\[ \begin{bmatrix} x_{11} \\ x_{21} \\ x_{31} \end{bmatrix} = \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix} \begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix} = \begin{bmatrix} 100 \\ 180 \\ 420 \end{bmatrix} \]
Therefore after one time period, location 1 has 100 residents, location 2 has 180, and location 3 has 420.
Notice that the total population is unchanged, it simply migrates within the given locations. We find the
locations after two time periods in the same way.
\[ X_2 = AX_1 \]
\[ \begin{bmatrix} x_{12} \\ x_{22} \\ x_{32} \end{bmatrix} = \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix} \begin{bmatrix} 100 \\ 180 \\ 420 \end{bmatrix} = \begin{bmatrix} 102 \\ 164 \\ 434 \end{bmatrix} \]
We could progress in this manner to find the populations after 10 time periods. However from our above discussion, we can simply calculate \(\left( A^nX_0 \right)_i\), where \(n\) denotes the number of time periods which have passed. Therefore, we compute the populations in each location after 10 units of time as follows.
\[ X_{10} = A^{10}X_0 \]
\[ \begin{bmatrix} x_{1,10} \\ x_{2,10} \\ x_{3,10} \end{bmatrix} = \begin{bmatrix} .6 & 0 & .1 \\ .2 & .8 & 0 \\ .2 & .2 & .9 \end{bmatrix}^{10} \begin{bmatrix} 100 \\ 200 \\ 400 \end{bmatrix} = \begin{bmatrix} 115.0858\ldots \\ 120.1306\ldots \\ 464.7834\ldots \end{bmatrix} \]
Since we are speaking about populations, we would need to round these numbers to provide a logical
answer. Therefore, we can say that after 10 units of time, there will be 115 residents in location one, 120
in location two, and 465 in location three.
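The whole example fits in a few lines of NumPy (a sketch; matrix and initial state vector as above):
\begin{verbatim}
import numpy as np

A = np.array([[0.6, 0.0, 0.1],
              [0.2, 0.8, 0.0],
              [0.2, 0.2, 0.9]])
X0 = np.array([100.0, 200.0, 400.0])

print(A @ X0)                                  # [100. 180. 420.]
print(np.linalg.matrix_power(A, 10) @ X0)      # approx [115.09 120.13 464.78]
print((np.linalg.matrix_power(A, 10) @ X0).sum())  # 700: total is preserved
\end{verbatim}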
A second important application of Markov matrices is the concept of random walks. Suppose a walker
has \(m\) locations to choose from, denoted \(1, 2, \ldots, m\). Let \(a_{ij}\) refer to the probability that the person will travel to location \(i\) from location \(j\). Again, this requires that
\[ \sum_{i=1}^{m} a_{ij} = 1 \]
In this context, the vector \(X_n = \begin{bmatrix} x_{1n} & \cdots & x_{mn} \end{bmatrix}^T\) contains the probabilities \(x_{in}\) that the walker ends up in location \(i\), \(1 \leq i \leq m\), at time \(n\).
\[ X_1 = AX_0 = \begin{bmatrix} 0.4 & 0.1 & 0.5 \\ 0.4 & 0.6 & 0.1 \\ 0.2 & 0.3 & 0.4 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0.4 \\ 0.4 \\ 0.2 \end{bmatrix} \]
\[ X_2 = AX_1 = \begin{bmatrix} 0.4 & 0.1 & 0.5 \\ 0.4 & 0.6 & 0.1 \\ 0.2 & 0.3 & 0.4 \end{bmatrix} \begin{bmatrix} 0.4 \\ 0.4 \\ 0.2 \end{bmatrix} = \begin{bmatrix} 0.3 \\ 0.42 \\ 0.28 \end{bmatrix} \]
This gives the probabilities that our walker ends up in locations 1, 2, and 3. For this example we are interested in location 3, with a probability of 0.28.
Returning to the context of migration, suppose we wish to know how many residents will be in a
certain location after a very long time. It turns out that if some power of the migration matrix has all positive entries, then there is a vector \(X_s\) such that \(A^nX_0\) approaches \(X_s\) as \(n\) becomes very large. Hence as more time passes and \(n\) increases, \(A^nX_0\) will become closer to the vector \(X_s\).
Consider Theorem 7.28. Let \(n\) increase so that \(X_n\) approaches \(X_s\). As \(X_n\) becomes closer to \(X_s\), so too does \(X_{n+1}\). For sufficiently large \(n\), the statement \(X_{n+1} = AX_n\) can be written as \(X_s = AX_s\).
This discussion motivates the following theorem.
\[ X_s = AX_s \]
where \(X_s\) has positive entries which have the same sum as the entries of \(X_0\).
As \(n\) increases, the state vectors \(X_n\) will approach \(X_s\).
Note that the condition in Theorem 7.31 can be written as \((I - A)X_s = 0\), representing a homogeneous system of equations.
Consider the following example. Notice that it is the same example as the Example 7.29 but here it
will involve a longer time frame.
Solution. By Theorem 7.31 the steady state vector Xs can be found by solving the system (I A)Xs = 0.
The augmented matrix and the resulting reduced row-echelon form are given by
\[ \begin{bmatrix} 0.4 & 0 & -0.1 & 0 \\ -0.2 & 0.2 & 0 & 0 \\ -0.2 & -0.2 & 0.1 & 0 \end{bmatrix} \rightarrow \cdots \rightarrow \begin{bmatrix} 1 & 0 & -0.25 & 0 \\ 0 & 1 & -0.25 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
Therefore, the solutions are of the form \(t \begin{bmatrix} 0.25 \\ 0.25 \\ 1 \end{bmatrix}\). Since the entries of \(X_s\) must sum to the initial total of 700, we take \(t\) so that \(0.25t + 0.25t + t = 700\), that is \(t = 466.\overline{6}\).
Again, because we are working with populations, these values need to be rounded. The steady state vector \(X_s\) is given by
\[ \begin{bmatrix} 117 \\ 117 \\ 466 \end{bmatrix} \]
We can see that the numbers we calculated in Example 7.29 for the populations after the 10th unit of
time are not far from the long term values.
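Equivalently, the steady state vector is an eigenvector for the eigenvalue 1, which suggests the following NumPy sketch (same migration matrix as above):
\begin{verbatim}
import numpy as np

A = np.array([[0.6, 0.0, 0.1],
              [0.2, 0.8, 0.0],
              [0.2, 0.2, 0.9]])

vals, vecs = np.linalg.eig(A)
k = np.argmin(np.abs(vals - 1.0))   # locate the eigenvalue 1
Xs = np.real(vecs[:, k])
Xs = Xs / Xs.sum() * 700            # rescale so the entries sum to 700
print(Xs)                           # approx [116.67 116.67 466.67]
\end{verbatim}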
Consider another example.
Find the comparison between the populations in the three locations after a long time.
Solution. In order to compare the populations in the long term, we want to find the steady state vector \(X_s\). Solve
\[ \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} - \begin{bmatrix} \frac{1}{5} & \frac{1}{2} & \frac{1}{5} \\ \frac{1}{4} & \frac{1}{4} & \frac{1}{2} \\ \frac{11}{20} & \frac{1}{4} & \frac{3}{10} \end{bmatrix} \right) \begin{bmatrix} x_{1s} \\ x_{2s} \\ x_{3s} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \]
The augmented matrix and the resulting reduced row-echelon form are given by
\[ \begin{bmatrix} \frac{4}{5} & -\frac{1}{2} & -\frac{1}{5} & 0 \\ -\frac{1}{4} & \frac{3}{4} & -\frac{1}{2} & 0 \\ -\frac{11}{20} & -\frac{1}{4} & \frac{7}{10} & 0 \end{bmatrix} \rightarrow \cdots \rightarrow \begin{bmatrix} 1 & 0 & -\frac{16}{19} & 0 \\ 0 & 1 & -\frac{18}{19} & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
and so an eigenvector is
\[ \begin{bmatrix} 16 \\ 18 \\ 19 \end{bmatrix} \]
Therefore, the proportion of the population in location 2 to the population in location 1 is given by \(\frac{18}{16}\). The proportion of the population in location 3 to the population in location 2 is given by \(\frac{19}{18}\).
Proof. Remember that the determinant of a matrix always equals that of its transpose. Therefore,
\[ \det(xI - A) = \det\left( (xI - A)^T \right) = \det\left( xI - A^T \right) \]
because \(I^T = I\). Thus the characteristic equation for \(A\) is the same as the characteristic equation for \(A^T\). Consequently, \(A\) and \(A^T\) have the same eigenvalues. We will show that 1 is an eigenvalue for \(A^T\) and then it will follow that 1 is an eigenvalue for \(A\).
Remember that for a migration matrix, \(\sum_i a_{ij} = 1\). Therefore, if \(A^T = \left( b_{ij} \right)\) with \(b_{ij} = a_{ji}\), it follows that
\[ \sum_j b_{ij} = \sum_j a_{ji} = 1 \]
The migration matrices discussed above give an example of a discrete dynamical system. We call them
discrete because they involve discrete values taken at a sequence of points rather than on a continuous
interval of time.
An example of a situation which can be studied in this way is a predator prey model. Consider the following model where \(x\) is the number of prey and \(y\) the number of predators in a certain area at a certain time. These are functions of \(n \in \mathbb{N}\) where \(n = 1, 2, \ldots\) are the ends of intervals of time which may be of interest in the problem. In other words, \(x(n)\) is the number of prey at the end of the \(n^{th}\) interval of time. An example of this situation may be modeled by the following equation
\[ \begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} 2 & -3 \\ 1 & 4 \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix} \]
This says that from time period \(n\) to \(n+1\), \(x\) increases if there are more \(x\) and decreases as there are more \(y\). In the context of this example, this means that as the number of predators increases, the number of prey decreases. As for \(y\), it increases if there are more \(y\) and also if there are more \(x\).
This is an example of a matrix recurrence which we define now.
In this section, we will examine how to find solutions to a dynamical system given certain initial
conditions. This process involves several concepts previously studied, including matrix diagonalization
and Markov matrices. The procedure is given as follows. Recall that when diagonalized, we can write \(A^n = PD^nP^{-1}\).
Given initial conditions \(x_0\) and \(y_0\), the solutions to the system are found as follows:
1. Express the dynamical system in the form \(V_{n+1} = AV_n\).
2. Diagonalize \(A\) so that \(A = PDP^{-1}\).
3. Then \(V_n = PD^nP^{-1}V_0\) where \(V_0\) is the vector containing the initial conditions.
4. If given specific values for \(n\), substitute into this equation. Otherwise, find a general solution for \(n\).
Express this system as a matrix recurrence and find solutions to the dynamical system for initial
conditions x0 = 20, y0 = 10.
\[ V_{n+1} = AV_n \]
\[ \begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix} \]
Then
\[ A = \begin{bmatrix} 1.5 & -0.5 \\ 1.0 & 0 \end{bmatrix} \]
You can verify that the eigenvalues of \(A\) are 1 and .5. By diagonalizing, we can write \(A\) in the form
\[ PDP^{-1} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & .5 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \]
Then, we can find solutions for various values of \(n\). Here are the solutions for values of \(n\) between 1 and 5.
\[ n=1: \begin{bmatrix} 25.0 \\ 20.0 \end{bmatrix}, \quad n=2: \begin{bmatrix} 27.5 \\ 25.0 \end{bmatrix}, \quad n=3: \begin{bmatrix} 28.75 \\ 27.5 \end{bmatrix}, \quad n=4: \begin{bmatrix} 29.375 \\ 28.75 \end{bmatrix}, \quad n=5: \begin{bmatrix} 29.688 \\ 29.375 \end{bmatrix} \]
Notice that as \(n\) increases, we approach the vector given by
\[ \begin{bmatrix} 2x_0 - y_0 \\ 2x_0 - y_0 \end{bmatrix} = \begin{bmatrix} 2(20) - 10 \\ 2(20) - 10 \end{bmatrix} = \begin{bmatrix} 30 \\ 30 \end{bmatrix} \]
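Iterating the recurrence directly shows the same convergence (a NumPy sketch; matrix and initial conditions as above):
\begin{verbatim}
import numpy as np

A = np.array([[1.5, -0.5],
              [1.0,  0.0]])
V = np.array([20.0, 10.0])

for n in range(1, 6):
    V = A @ V
    print(n, V)   # (25,20), (27.5,25), (28.75,27.5), ... toward (30,30)
\end{verbatim}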
The following example demonstrates another system which exhibits some interesting behavior. When
we graph the solutions, it is possible for the ordered pairs to spiral around the origin.
Solution. Let
\[ A = \begin{bmatrix} 0.7 & 0.7 \\ -0.7 & 0.7 \end{bmatrix} \]
To find solutions, we must diagonalize \(A\). You can verify that the eigenvalues of \(A\) are complex and are given by \(\lambda_1 = .7 + .7i\) and \(\lambda_2 = .7 - .7i\). The eigenvector for \(\lambda_1 = .7 + .7i\) is
\[ \begin{bmatrix} 1 \\ i \end{bmatrix} \]
and that the eigenvector for \(\lambda_2 = .7 - .7i\) is
\[ \begin{bmatrix} 1 \\ -i \end{bmatrix} \]
Thus the matrix \(A\) can be written in the form
\[ \begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix} \begin{bmatrix} .7 + .7i & 0 \\ 0 & .7 - .7i \end{bmatrix} \begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix} \]
and so,
\[ V_n = PD^nP^{-1}V_0 \]
\[ \begin{bmatrix} x(n) \\ y(n) \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ i & -i \end{bmatrix} \begin{bmatrix} (.7 + .7i)^n & 0 \\ 0 & (.7 - .7i)^n \end{bmatrix} \begin{bmatrix} \frac{1}{2} & -\frac{1}{2}i \\ \frac{1}{2} & \frac{1}{2}i \end{bmatrix} \begin{bmatrix} x_0 \\ y_0 \end{bmatrix} \]
In this picture, the dots are the values and the dashed line is to help to picture what is happening. These points are getting gradually closer to the origin, but they are circling the origin in the clockwise direction as they do so. As \(n\) increases, the vector \(\begin{bmatrix} x(n) \\ y(n) \end{bmatrix}\) approaches \(\begin{bmatrix} 0 \\ 0 \end{bmatrix}\).
This type of behavior along with complex eigenvalues is typical of the deviations from an equilibrium
point in the Lotka Volterra system of differential equations which is a famous model for predator-prey
interactions. These differential equations are given by
\[ x' = x(a - by) \]
\[ y' = -y(c - dx) \]
where \(a, b, c, d\) are positive constants. For example, you might have \(x\) be the population of moose and \(y\) the population of wolves on an island.
Note that these equations make logical sense. The top says that the rate at which the moose population increases would be \(ax\) if there were no predators \(y\). However, this is modified by multiplying instead by \((a - by)\) because if there are predators, these will militate against the population of moose. The more predators there are, the more pronounced is this effect. As to the predator equation, you can see that the equations predict that if there are many prey around, then the rate of growth of the predators would seem to be high. However, this is modified by the term \(-cy\) because if there are many predators, there would be competition for the available food supply and this would tend to decrease \(y\).
The behavior near an equilibrium point, which is a point where the right side of the differential equations equals zero, is of great interest. In this case, the equilibrium point is
\[ x = \frac{c}{d}, \quad y = \frac{a}{b} \]
Then one defines new variables which measure the deviation from the equilibrium point, replacing \(x\) with \(x + \frac{c}{d}\) and \(y\) with \(y + \frac{a}{b}\).
Note that most of the time, the eigenvalues of the new matrix will be complex.
You can also notice that the upper right corner will be negative by considering higher powers of the matrix. Thus letting \(1, 2, 3, \ldots\) denote the ends of discrete intervals of time, the desired discrete dynamical system is of the form
\[ \begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} a & -b \\ c & d \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix} \]
where a, b, c, d are positive constants and the matrix will likely have complex eigenvalues because it is a
power of a matrix which has complex eigenvalues.
You can see from the above discussion that if the eigenvalues of the matrix used to define the dynamical system are less than 1 in absolute value, then the origin is stable in the sense that as \(n \to \infty\), the solution converges to the origin. If either eigenvalue is larger than 1 in absolute value, then the solutions to the dynamical system will usually be unbounded, unless the initial condition is chosen very carefully. The next example exhibits the case where one eigenvalue is larger than 1 and the other is smaller than 1.
The following example demonstrates a familiar concept as a dynamical system.
The following example demonstrates a familiar concept as a dynamical system.
\[ 1, 1, 2, 3, 5, \ldots \]
Solution. This sequence is extremely important in the study of reproducing rabbits. It can be considered
as a dynamical system as follows. Let \(y(n) = x(n+1)\). Then the above recurrence relation can be written as
\[ \begin{bmatrix} x(n+1) \\ y(n+1) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x(n) \\ y(n) \end{bmatrix}, \quad \begin{bmatrix} x(0) \\ y(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
Let
\[ A = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \]
The eigenvalues of the matrix \(A\) are \(\lambda_1 = \frac{1}{2} - \frac{1}{2}\sqrt{5}\) and \(\lambda_2 = \frac{1}{2}\sqrt{5} + \frac{1}{2}\). The corresponding eigenvectors are, respectively,
\[ X_1 = \begin{bmatrix} -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix}, \quad X_2 = \begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 \end{bmatrix} \]
You can see from a short computation that one of the eigenvalues is smaller than 1 in absolute value while the other is larger than 1 in absolute value. Now, diagonalizing \(A\) gives us
\[ \begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{5} + \frac{1}{2} & 0 \\ 0 & \frac{1}{2} - \frac{1}{2}\sqrt{5} \end{bmatrix} \]
Then it follows that for a given initial condition, the solution to this dynamical system is of the form
\[ \begin{bmatrix} x(n) \\ y(n) \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{5} - \frac{1}{2} & -\frac{1}{2}\sqrt{5} - \frac{1}{2} \\ 1 & 1 \end{bmatrix} \begin{bmatrix} \left( \frac{1}{2}\sqrt{5} + \frac{1}{2} \right)^n & 0 \\ 0 & \left( \frac{1}{2} - \frac{1}{2}\sqrt{5} \right)^n \end{bmatrix} \begin{bmatrix} \frac{1}{5}\sqrt{5} & \frac{1}{10}\sqrt{5} + \frac{1}{2} \\ -\frac{1}{5}\sqrt{5} & \frac{1}{2} - \frac{1}{10}\sqrt{5} \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
It follows that
\[ x(n) = \left( \frac{1}{2}\sqrt{5} + \frac{1}{2} \right)^n \left( \frac{1}{10}\sqrt{5} + \frac{1}{2} \right) + \left( \frac{1}{2} - \frac{1}{2}\sqrt{5} \right)^n \left( \frac{1}{2} - \frac{1}{10}\sqrt{5} \right) \]
Here is a picture of the ordered pairs \((x(n), y(n))\) for \(n = 0, 1, 2, \ldots\)
There is so much more that can be said about dynamical systems. It is a major topic of study in
differential equations and what is given above is just an introduction.
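The following Python sketch compares the matrix-power description with the closed form for \(x(n)\) just derived (the values of \(n\) are chosen arbitrarily):
\begin{verbatim}
import numpy as np

A = np.array([[0, 1],
              [1, 1]], dtype=np.int64)
V0 = np.array([1, 1], dtype=np.int64)

phi, psi = (1 + 5**0.5) / 2, (1 - 5**0.5) / 2
for n in [5, 10, 20]:
    x_n = (np.linalg.matrix_power(A, n) @ V0)[0]
    closed = phi**n * (5**0.5/10 + 0.5) + psi**n * (0.5 - 5**0.5/10)
    print(n, x_n, round(closed))   # the two descriptions agree
\end{verbatim}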
Exercises
Exercise 7.3.33 Let \(A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}\). Diagonalize \(A\) to find \(A^{10}\).
Exercise 7.3.34 Let \(A = \begin{bmatrix} 1 & 4 & 1 \\ 0 & 2 & 5 \\ 0 & 0 & 5 \end{bmatrix}\). Diagonalize \(A\) to find \(A^{50}\).
Exercise 7.3.35 Let \(A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 1 & 1 \\ 2 & 3 & 1 \end{bmatrix}\). Diagonalize \(A\) to find \(A^{100}\).
Exercise 7.3.36 The following is a Markov (migration) matrix for three locations
\[ \begin{bmatrix} \frac{7}{10} & \frac{1}{9} & \frac{1}{5} \\ \frac{1}{10} & \frac{7}{9} & \frac{2}{5} \\ \frac{1}{5} & \frac{1}{9} & \frac{2}{5} \end{bmatrix} \]
(a) Initially, there are 90 people in location 1, 81 in location 2, and 85 in location 3. How many are in each location after one time period?
(b) The total number of individuals in the migration process is 256. After a long time, how many are in each location?
Exercise 7.3.37 The following is a Markov (migration) matrix for three locations
\[ \begin{bmatrix} \frac{1}{5} & \frac{1}{5} & \frac{2}{5} \\ \frac{2}{5} & \frac{2}{5} & \frac{1}{5} \\ \frac{2}{5} & \frac{2}{5} & \frac{2}{5} \end{bmatrix} \]
(a) Initially, there are 130 individuals in location 1, 300 in location 2, and 70 in location 3. How many are in each location after two time periods?
(b) The total number of individuals in the migration process is 500. After a long time, how many are in each location?
Exercise 7.3.38 The following is a Markov (migration) matrix for three locations
\[ \begin{bmatrix} \frac{3}{10} & \frac{3}{8} & \frac{1}{3} \\ \frac{1}{10} & \frac{3}{8} & \frac{1}{3} \\ \frac{3}{5} & \frac{1}{4} & \frac{1}{3} \end{bmatrix} \]
The total number of individuals in the migration process is 480. After a long time, how many are in each location?
Exercise 7.3.39 The following is a Markov (migration) matrix for three locations
\[ \begin{bmatrix} \frac{3}{10} & \frac{1}{3} & \frac{1}{5} \\ \frac{3}{10} & \frac{1}{3} & \frac{7}{10} \\ \frac{2}{5} & \frac{1}{3} & \frac{1}{10} \end{bmatrix} \]
The total number of individuals in the migration process is 1155. After a long time, how many are in each location?
Exercise 7.3.40 The following is a Markov (migration) matrix for three locations
\[ \begin{bmatrix} \frac{2}{5} & \frac{1}{10} & \frac{1}{8} \\ \frac{3}{10} & \frac{2}{5} & \frac{5}{8} \\ \frac{3}{10} & \frac{1}{2} & \frac{1}{4} \end{bmatrix} \]
The total number of individuals in the migration process is 704. After a long time, how many are in each location?
Exercise 7.3.41 A person sets off on a random walk with three possible locations. The Markov matrix of probabilities \(A = \left[ a_{ij} \right]\) is given by
\[ \begin{bmatrix} 0.1 & 0.3 & 0.7 \\ 0.1 & 0.3 & 0.2 \\ 0.8 & 0.4 & 0.1 \end{bmatrix} \]
If the walker starts in location 2, what is the probability of ending back in location 2 at time \(n = 3\)?
Exercise 7.3.42 A person sets off on a random walk with three possible locations. The Markov matrix of probabilities \(A = \left[ a_{ij} \right]\) is given by
\[ \begin{bmatrix} 0.5 & 0.1 & 0.6 \\ 0.2 & 0.9 & 0.2 \\ 0.3 & 0 & 0.2 \end{bmatrix} \]
It is unknown where the walker starts, but the probability of starting in each location is given by
\[ X_0 = \begin{bmatrix} 0.2 \\ 0.25 \\ 0.55 \end{bmatrix} \]
Exercise 7.3.43 You own a trailer rental company in a large city and you have four locations, one in
the South East, one in the North East, one in the North West, and one in the South West. Denote these
locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.
\[ \begin{array}{c|cccc} & SE & NE & NW & SW \\ \hline SE & \frac{1}{3} & \frac{1}{10} & \frac{1}{10} & \frac{1}{5} \\ NE & \frac{1}{3} & \frac{7}{10} & \frac{1}{5} & \frac{1}{10} \\ NW & \frac{2}{9} & \frac{1}{10} & \frac{3}{5} & \frac{1}{5} \\ SW & \frac{1}{9} & \frac{1}{10} & \frac{1}{10} & \frac{1}{2} \end{array} \]
In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer
starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location
after a long time if the total number of trailers is 413?
Exercise 7.3.44 You own a trailer rental company in a large city and you have four locations, one in
the South East, one in the North East, one in the North West, and one in the South West. Denote these
locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.
\[ \begin{array}{c|cccc} & SE & NE & NW & SW \\ \hline SE & \frac{1}{7} & \frac{1}{4} & \frac{1}{10} & \frac{1}{5} \\ NE & \frac{2}{7} & \frac{1}{4} & \frac{1}{5} & \frac{1}{10} \\ NW & \frac{1}{7} & \frac{1}{4} & \frac{3}{5} & \frac{1}{5} \\ SW & \frac{3}{7} & \frac{1}{4} & \frac{1}{10} & \frac{1}{2} \end{array} \]
In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer
starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location
after a long time if the total number of trailers is 1469.
Exercise 7.3.45 The following table describes the transition probabilities between the states rainy, partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off p.c. it ends up sunny the next day with probability \(\frac{1}{5}\). If it starts off sunny, it ends up sunny the next day with probability \(\frac{2}{5}\) and so forth.
\[ \begin{array}{c|ccc} & \text{rains} & \text{sunny} & \text{p.c.} \\ \hline \text{rains} & \frac{1}{5} & \frac{1}{5} & \frac{1}{3} \\ \text{sunny} & \frac{1}{5} & \frac{2}{5} & \frac{1}{3} \\ \text{p.c.} & \frac{3}{5} & \frac{2}{5} & \frac{1}{3} \end{array} \]
Given this information, what are the probabilities that a given day is rainy, sunny, or partly cloudy?
Given this information, what are the probabilities that a given day is rainy, sunny, or partly cloudy?
Exercise 7.3.46 The following table describes the transition probabilities between the states rainy, partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off p.c. it ends up sunny the next day with probability \(\frac{1}{10}\). If it starts off sunny, it ends up sunny the next day with probability \(\frac{2}{5}\) and so forth.
\[ \begin{array}{c|ccc} & \text{rains} & \text{sunny} & \text{p.c.} \\ \hline \text{rains} & \frac{1}{5} & \frac{1}{5} & \frac{1}{3} \\ \text{sunny} & \frac{1}{10} & \frac{2}{5} & \frac{4}{9} \\ \text{p.c.} & \frac{7}{10} & \frac{2}{5} & \frac{2}{9} \end{array} \]
Given this information, what are the probabilities that a given day is rainy, sunny, or partly cloudy?
Exercise 7.3.47 You own a trailer rental company in a large city and you have four locations, one in
the South East, one in the North East, one in the North West, and one in the South West. Denote these
locations by SE, NE, NW, and SW respectively. Suppose that the following table is observed to take place.
\[ \begin{array}{c|cccc} & SE & NE & NW & SW \\ \hline SE & \frac{5}{11} & \frac{1}{10} & \frac{1}{10} & \frac{1}{5} \\ NE & \frac{1}{11} & \frac{7}{10} & \frac{1}{5} & \frac{1}{10} \\ NW & \frac{2}{11} & \frac{1}{10} & \frac{3}{5} & \frac{1}{5} \\ SW & \frac{3}{11} & \frac{1}{10} & \frac{1}{10} & \frac{1}{2} \end{array} \]
In this table, the probability that a trailer starting at NE ends in NW is 1/10, the probability that a trailer
starting at SW ends in NW is 1/5, and so forth. Approximately how many will you have in each location
after a long time if the total number of trailers is 407?
Exercise 7.3.48 The University of Poohbah offers three degree programs, scouting education (SE), dance
appreciation (DA), and engineering (E). It has been determined that the probabilities of transferring from
one program to another are as in the following table.
\[ \begin{array}{c|ccc} & SE & DA & E \\ \hline SE & .8 & .1 & .3 \\ DA & .1 & .7 & .5 \\ E & .1 & .2 & .2 \end{array} \]
where the number indicates the probability of transferring from the top program to the program on the
left. Thus the probability of going from DA to E is .2. Find the probability that a student is enrolled in the
various programs.
Exercise 7.3.49 In the city of Nabal, there are three political persuasions, republicans (R), democrats (D),
and neither one (N). The following table shows the transition probabilities between the political parties,
the top row being the initial political party and the side row being the political affiliation the following
year.
\[ \begin{array}{c|ccc} & R & D & N \\ \hline R & \frac{1}{5} & \frac{1}{6} & \frac{2}{7} \\ D & \frac{1}{5} & \frac{1}{3} & \frac{4}{7} \\ N & \frac{3}{5} & \frac{1}{2} & \frac{1}{7} \end{array} \]
Find the probabilities that a person will be identified with the various political persuasions. Which party
will end up being most important?
Exercise 7.3.50 The following table describes the transition probabilities between the states rainy, partly cloudy and sunny. The symbol p.c. indicates partly cloudy. Thus if it starts off p.c. it ends up sunny the next day with probability \(\frac{1}{5}\). If it starts off sunny, it ends up sunny the next day with probability \(\frac{2}{7}\) and so forth.
\[ \begin{array}{c|ccc} & \text{rains} & \text{sunny} & \text{p.c.} \\ \hline \text{rains} & \frac{1}{5} & \frac{2}{7} & \frac{5}{9} \\ \text{sunny} & \frac{1}{5} & \frac{2}{7} & \frac{1}{3} \\ \text{p.c.} & \frac{3}{5} & \frac{3}{7} & \frac{1}{9} \end{array} \]
Given this information, what are the probabilities that a given day is rainy, sunny, or partly cloudy?
A. Some Prerequisite Topics
The topics presented in this section are important concepts in mathematics and therefore should be exam-
ined.
A set is a collection of things called elements. For example \(\{1, 2, 3, 8\}\) would be a set consisting of the elements 1, 2, 3, and 8. To indicate that 3 is an element of \(\{1, 2, 3, 8\}\), it is customary to write \(3 \in \{1, 2, 3, 8\}\). We can also indicate when an element is not in a set, by writing \(9 \notin \{1, 2, 3, 8\}\) which says that 9 is not an element of \(\{1, 2, 3, 8\}\). Sometimes a rule specifies a set. For example you could specify a set as all integers larger than 2. This would be written as \(S = \{x \in \mathbb{Z} : x > 2\}\). This notation says: \(S\) is the set of all integers, \(x\), such that \(x > 2\).
Suppose \(A\) and \(B\) are sets with the property that every element of \(A\) is an element of \(B\). Then we say that \(A\) is a subset of \(B\). For example, \(\{1, 2, 3, 8\}\) is a subset of \(\{1, 2, 3, 4, 5, 8\}\). In symbols, we write \(\{1, 2, 3, 8\} \subseteq \{1, 2, 3, 4, 5, 8\}\). It is sometimes said that "\(A\) is contained in \(B\)" or even "\(B\) contains \(A\)". The same statement about the two sets may also be written as \(\{1, 2, 3, 4, 5, 8\} \supseteq \{1, 2, 3, 8\}\).
We can also talk about the union of two sets, which we write as \(A \cup B\). This is the set consisting of everything which is an element of at least one of the sets, \(A\) or \(B\). As an example of the union of two sets, consider \(\{1, 2, 3, 8\} \cup \{3, 4, 7, 8\} = \{1, 2, 3, 4, 7, 8\}\). This set is made up of the numbers which are in at least one of the two sets.
In general
\[ A \cup B = \{x : x \in A \text{ or } x \in B\} \]
Notice that an element which is in both \(A\) and \(B\) is also in the union, as well as elements which are in only one of \(A\) or \(B\).
Another important set is the intersection of two sets \(A\) and \(B\), written \(A \cap B\). This set consists of everything which is in both of the sets. Thus \(\{1, 2, 3, 8\} \cap \{3, 4, 7, 8\} = \{3, 8\}\) because 3 and 8 are those elements the two sets have in common. In general,
\[ A \cap B = \{x : x \in A \text{ and } x \in B\} \]
If \(A\) and \(B\) are two sets, \(A \setminus B\) denotes the set of things which are in \(A\) but not in \(B\). Thus
\[ A \setminus B = \{x \in A : x \notin B\} \]
For example, if \(A = \{1, 2, 3, 8\}\) and \(B = \{3, 4, 7, 8\}\), then \(A \setminus B = \{1, 2, 3, 8\} \setminus \{3, 4, 7, 8\} = \{1, 2\}\).
A special set which is very important in mathematics is the empty set denoted by \(\emptyset\). The empty set, \(\emptyset\), is defined as the set which has no elements in it. It follows that the empty set is a subset of every set. This is true because if it were not so, there would have to exist a set \(A\), such that \(\emptyset\) has something in it which is not in \(A\). However, \(\emptyset\) has nothing in it and so it must be that \(\emptyset \subseteq A\).
We can also use brackets to denote sets which are intervals of numbers. Let \(a\) and \(b\) be real numbers. Then
\[ [a, b] = \{x \in \mathbb{R} : a \leq x \leq b\} \]
\[ [a, b) = \{x \in \mathbb{R} : a \leq x < b\} \]
\[ (a, b] = \{x \in \mathbb{R} : a < x \leq b\} \]
\[ [a, \infty) = \{x \in \mathbb{R} : x \geq a\} \]
\[ (-\infty, a] = \{x \in \mathbb{R} : x \leq a\} \]
These sorts of sets of real numbers are called intervals. The two points \(a\) and \(b\) are called endpoints, or bounds, of the interval. In particular, \(a\) is the lower bound while \(b\) is the upper bound of the above intervals, where applicable. Other intervals such as \((-\infty, b)\) are defined by analogy to what was just explained. In general, the curved parenthesis, \((\), indicates the end point is not included in the interval, while the square parenthesis, \([\), indicates this end point is included. The reason that there will always be a curved parenthesis next to \(\infty\) or \(-\infty\) is that these are not real numbers and cannot be included in the interval in the way a real number can.
To illustrate the use of this notation relative to intervals consider three examples of inequalities. Their
solutions will be written in the interval notation just described.
Solution. We need to find \(x\) such that \(2x + 4 \leq x - 8\). Solving for \(x\), we see that \(x \leq -12\) is the answer. This is written in terms of an interval as \((-\infty, -12]\).
Consider the following example.
Solution. This inequality is true for any value of \(x\) where \(x\) is a real number. We can write the solution as \(\mathbb{R}\) or \((-\infty, \infty)\).
In the next section, we examine another important mathematical concept.
Another example:
\[ a_{11} + a_{12} + a_{13} = \sum_{i=1}^{3} a_{1i} \]
\[ \mathbb{N} = \{1, 2, \ldots\} \]
is well ordered.
Consider the following proposition.
This proposition claims that if a set has a lower bound which is a real number, then this set is well
ordered.
Further, this proposition implies the principle of mathematical induction. The symbol \(\mathbb{Z}\) denotes the set of all integers. Note that if \(a\) is an integer, then there are no integers between \(a\) and \(a + 1\).
Proof. Let \(T\) consist of all integers larger than or equal to \(a\) which are not in \(S\). The theorem will be proved if \(T = \emptyset\). If \(T \neq \emptyset\) then by the well ordering principle, there would have to exist a smallest element of \(T\), denoted as \(b\). It must be the case that \(b > a\) since by definition, \(a \notin T\). Thus \(b \geq a + 1\), and so \(b - 1 \geq a\) and \(b - 1 \notin S\) because if \(b - 1 \in S\), then \(b - 1 + 1 = b \in S\) by the assumed property of \(S\). Therefore, \(b - 1 \in T\) which contradicts the choice of \(b\) as the smallest element of \(T\). (\(b - 1\) is smaller.) Since a contradiction is obtained by assuming \(T \neq \emptyset\), it must be the case that \(T = \emptyset\) and this says that every integer at least as large as \(a\) is also in \(S\).
Mathematical induction is a very useful device for proving theorems about the integers. The procedure is as follows.
1. Show that \(S_1\) is true.
2. Assume \(S_n\) is true for some \(n\), which is the induction hypothesis. Then, using this assumption, show that \(S_{n+1}\) is true.
Solution. By Procedure A.8, we first need to show that this statement is true for \(n = 1\). When \(n = 1\), the statement says that
\[ \sum_{k=1}^{1} k^2 = \frac{1(1+1)(2(1)+1)}{6} = \frac{6}{6} = 1 \]
The sum on the left hand side also equals 1, so this equation is true for \(n = 1\).
Now suppose this formula is valid for some \(n \geq 1\) where \(n\) is an integer. Hence, the following equation is true.
\[ \sum_{k=1}^{n} k^2 = \frac{n(n+1)(2n+1)}{6} \tag{1.1} \]
We want to show that this is true for \(n + 1\).
Suppose we add \((n+1)^2\) to both sides of equation 1.1.
\[ \sum_{k=1}^{n+1} k^2 = \sum_{k=1}^{n} k^2 + (n+1)^2 = \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \]
The step going from the first to the second line is based on the assumption that the formula is true for \(n\). Now simplify the expression in the second line,
\[ \frac{n(n+1)(2n+1)}{6} + (n+1)^2 \]
This equals
\[ (n+1)\left( \frac{n(2n+1)}{6} + (n+1) \right) \]
and
\[ \frac{n(2n+1)}{6} + (n+1) = \frac{6(n+1) + 2n^2 + n}{6} = \frac{(n+2)(2n+3)}{6} \]
Therefore,
\[ \sum_{k=1}^{n+1} k^2 = \frac{(n+1)(n+2)(2n+3)}{6} = \frac{(n+1)((n+1)+1)(2(n+1)+1)}{6} \]
showing the formula holds for \(n+1\) whenever it holds for \(n\). This proves the formula by mathematical induction. In other words, this formula is true for all \(n = 1, 2, \ldots\)
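The formula can also be spot-checked numerically, which is of course no substitute for the induction proof. A short Python sketch:
\begin{verbatim}
# Verify sum_{k=1}^{n} k^2 = n(n+1)(2n+1)/6 for many values of n.
def lhs(n):
    return sum(k * k for k in range(1, n + 1))

def rhs(n):
    return n * (n + 1) * (2 * n + 1) // 6

print(all(lhs(n) == rhs(n) for n in range(1, 200)))   # True
\end{verbatim}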
Consider another example.
Solution. Again we will use the procedure given in Procedure A.8 to prove that this statement is true for all \(n\). Suppose \(n = 1\). Then the statement says
\[ \frac{1}{2} < \frac{1}{\sqrt{3}} \]
which is true.
Suppose then that the inequality holds for \(n\). In other words,
\[ \frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} < \frac{1}{\sqrt{2n+1}} \]
is true.
Now multiply both sides of this inequality by \(\frac{2n+1}{2n+2}\). This yields
\[ \frac{1}{2} \cdot \frac{3}{4} \cdots \frac{2n-1}{2n} \cdot \frac{2n+1}{2n+2} < \frac{1}{\sqrt{2n+1}} \cdot \frac{2n+1}{2n+2} = \frac{\sqrt{2n+1}}{2n+2} \]
The theorem will be proved if this last expression is less than \(\frac{1}{\sqrt{2n+3}}\). This happens if and only if
\[ \left( \frac{1}{\sqrt{2n+3}} \right)^2 = \frac{1}{2n+3} > \frac{2n+1}{(2n+2)^2} \]
which occurs if and only if \((2n+2)^2 > (2n+3)(2n+1)\) and this is clearly true which may be seen from expanding both sides. This proves the inequality.
Let's review the process just used. If \(S\) is the set of integers at least as large as 1 for which the formula holds, the first step was to show \(1 \in S\) and then that whenever \(n \in S\), it follows \(n + 1 \in S\). Therefore, by the principle of mathematical induction, \(S\) contains \([1, \infty) \cap \mathbb{Z}\), all positive integers. In doing an inductive proof of this sort, the set \(S\) is normally not mentioned. One just verifies the steps above.
B. Selected Exercise Answers
1.1.1 \(\begin{array}{c} x + 3y = 1 \\ 4x - y = 3 \end{array}\), Solution is: \(x = \frac{10}{13}\), \(y = \frac{1}{13}\).
1.1.2 \(\begin{array}{c} 3x + y = 3 \\ x + 2y = 1 \end{array}\), Solution is: \([x = 1, y = 0]\)
1.2.4 \(\begin{array}{c} x + 3y = 1 \\ 4x - y = 3 \end{array}\), Solution is: \(x = \frac{10}{13}\), \(y = \frac{1}{13}\)
1.2.5 \(\begin{array}{c} x + 3y = 1 \\ 4x - y = 3 \end{array}\), Solution is: \(x = \frac{10}{13}\), \(y = \frac{1}{13}\)
1.2.6 \(\begin{array}{c} x + 2y = 1 \\ 2x - y = 1 \\ 4x + 3y = 3 \end{array}\), Solution is: \(x = \frac{3}{5}\), \(y = \frac{1}{5}\)
1.2.7 No solution exists. You can see this by writing the augmented matrix and doing row operations.
\[ \begin{bmatrix} 1 & 1 & 3 & 2 \\ 2 & 1 & 1 & 1 \\ 3 & 2 & 2 & 0 \end{bmatrix}, \text{ row echelon form: } \begin{bmatrix} 1 & 0 & 4 & 0 \\ 0 & 1 & 7 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]
Thus one of the equations says \(0 = 1\) in an equivalent system of equations.
1.2.8 \(\begin{array}{c} 4g - I = 150 \\ 4I - 17g = -660 \\ 4g + s = 290 \\ g + I + s - b = 0 \end{array}\), Solution is: \(\{g = 60, I = 90, b = 200, s = 50\}\)
1.2.14 These can have a solution. For example, x + y = 1, 2x + 2y = 2, 3x + 3y = 3 even has an infinite set
of solutions.
1.2.15 h = 4
1.2.18 If \(h \neq 2\) there will be a unique solution for any \(k\). If \(h = 2\) and \(k \neq 4\), there are no solutions. If \(h = 2\) and \(k = 4\), then there are infinitely many solutions.
1.2.19 If \(h \neq 4\), then there is exactly one solution. If \(h = 4\) and \(k \neq 4\), then there are no solutions. If \(h = 4\) and \(k = 4\), then there are infinitely many solutions.
1.2.47 The last column must not be a pivot column. The remaining columns must each be pivot columns.
1.2.48 You need
\[ \begin{array}{c} \frac{1}{4}(20 + 30 + w + x) - y = 0 \\ \frac{1}{4}(y + 30 + 0 + z) - w = 0 \\ \frac{1}{4}(20 + y + z + 10) - x = 0 \\ \frac{1}{4}(x + w + 0 + 10) - z = 0 \end{array} \]
Solution is: \([w = 15, x = 15, y = 20, z = 10]\).
1.2.60 It is because you cannot have more than min (m, n) nonzero rows in the reduced row-echelon form.
Recall that the number of pivot columns is the same as the number of nonzero rows from the description
of this reduced row-echelon form.
1.2.61 (a) This says \(B\) is in the span of four of the columns. Thus the columns are not independent. Infinite solution set.
(b) This surely can't happen. If you add in another column, the rank does not get smaller.
(c) This says \(B\) is in the span of the columns and the columns must be independent. You can't have the rank equal 4 if you only have two columns.
(d) This says \(B\) is not in the span of the columns. In this case, there is no solution to the system of equations represented by the augmented matrix.
(e) In this case, there is a unique solution since the columns of \(A\) are independent.
1.2.62 These are not legitimate row operations. They do not preserve the solution set of the system.
2.1.3 To get \(-A\), just replace every entry of \(A\) with its additive inverse. The 0 matrix is the one which has all zeros in it.
\[ -A = -A + (A + B) = (-A + A) + B = 0 + B = B \]
2.1.8 \(A + (-1)A = (1 + (-1))A = 0A = 0\). Therefore, from the uniqueness of the additive inverse proved in the above Problem 2.1.5, it follows that \(-A = (-1)A\).
2.1.9 (a) \(\begin{bmatrix} 3 & 6 & 9 \\ 6 & 3 & 21 \end{bmatrix}\)
(b) \(\begin{bmatrix} 8 & 5 & 3 \\ 11 & 5 & 4 \end{bmatrix}\)
(c) Not possible
(d) \(\begin{bmatrix} 3 & 3 & 4 \\ 6 & 1 & 7 \end{bmatrix}\)
(e) Not possible
(h) \(\begin{bmatrix} 2 \\ 5 \end{bmatrix}\)
2.1.11 (a) \(\begin{bmatrix} 3 & 0 & 4 \\ 4 & 1 & 6 \\ 5 & 1 & 6 \end{bmatrix}\)
(b) \(\begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}\)
(c) Not possible
(d) \(\begin{bmatrix} 4 & 6 \\ 5 & 3 \\ 1 & 2 \end{bmatrix}\)
(e) \(\begin{bmatrix} 8 & 1 & 3 \\ 7 & 6 & 6 \end{bmatrix}\)
2.1.12
\[ \begin{bmatrix} 1 & -1 \\ -3 & 3 \end{bmatrix} \begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} x - z & y - w \\ -3x + 3z & -3y + 3w \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
Solution is: \(w = y\), \(x = z\) so the matrices are of the form \(\begin{bmatrix} x & y \\ x & y \end{bmatrix}\).
2.1.13 \(X^TY = 1\), \(XY^T = \begin{bmatrix} 0 & -1 & -2 \\ 0 & -1 & -2 \\ 0 & 1 & 2 \end{bmatrix}\)
2.1.14
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix} = \begin{bmatrix} 7 & 2k + 2 \\ 15 & 4k + 6 \end{bmatrix} \]
\[ \begin{bmatrix} 1 & 2 \\ 3 & k \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 7 & 10 \\ 3k + 3 & 4k + 6 \end{bmatrix} \]
Thus you must have \(3k + 3 = 15\) and \(2k + 2 = 10\), Solution is: \([k = 4]\)
2.1.15
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix} = \begin{bmatrix} 3 & 2k + 2 \\ 7 & 4k + 6 \end{bmatrix} \]
\[ \begin{bmatrix} 1 & 2 \\ 1 & k \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 7 & 10 \\ 3k + 1 & 4k + 2 \end{bmatrix} \]
However, \(7 \neq 3\) and so there is no possible choice of \(k\) which will make these matrices commute.
2.1.16 Let \(A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}\), \(B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\), \(C = \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix}\).
\[ \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
\[ \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 2 \\ 2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
2.1.18 Let \(A = \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix}\), \(B = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\).
\[ \begin{bmatrix} 1 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
2.1.20 Let \(A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\), \(B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\).
\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} \]
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 4 & 3 \end{bmatrix} \]
2.1.21 \(A = \begin{bmatrix} 1 & 1 & 2 & 0 \\ 1 & 0 & 2 & 0 \\ 0 & 0 & 3 & 0 \\ 1 & 3 & 0 & 3 \end{bmatrix}\)
2.1.22 \(A = \begin{bmatrix} 1 & 3 & 2 & 0 \\ 1 & 0 & 2 & 0 \\ 0 & 0 & 6 & 0 \\ 1 & 3 & 0 & 1 \end{bmatrix}\)
2.1.23 \(A = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 2 & 0 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 3 \end{bmatrix}\)
2.1.28 Show that \(\frac{1}{2}\left( A^T + A \right)\) is symmetric and then consider using this as one of the matrices:
\[ A = \frac{A + A^T}{2} + \frac{A - A^T}{2} \]
2.1.29 If \(A\) is skew symmetric then \(A = -A^T\). It follows that \(a_{ii} = -a_{ii}\) and so each \(a_{ii} = 0\).
2.1.36 \(\begin{bmatrix} 0 & 1 \\ 5 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} -\frac{3}{5} & \frac{1}{5} \\ 1 & 0 \end{bmatrix}\)
2.1.37 \(\begin{bmatrix} 2 & 1 \\ 3 & 0 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & \frac{1}{3} \\ 1 & -\frac{2}{3} \end{bmatrix}\)
2.1.38 \(\begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix}^{-1}\) does not exist. The reduced row-echelon form of this matrix is \(\begin{bmatrix} 1 & \frac{1}{2} \\ 0 & 0 \end{bmatrix}\)
2.1.39 \(\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} \frac{d}{ad - bc} & -\frac{b}{ad - bc} \\ -\frac{c}{ad - bc} & \frac{a}{ad - bc} \end{bmatrix}\)
1
1 2 3 2 4 5
2.1.40 2 1 4 = 0 1 2
1 0 2 1 2 3
2.1.41 $\begin{bmatrix} 1 & 0 & 3 \\ 2 & 3 & 4 \\ 1 & 0 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} -2 & 0 & 3 \\ 0 & \frac{1}{3} & -\frac{2}{3} \\ 1 & 0 & -1 \end{bmatrix}$
2.1.42 The reduced row-echelon form is $\begin{bmatrix} 1 & 0 & \frac{5}{3} \\ 0 & 1 & \frac{2}{3} \\ 0 & 0 & 0 \end{bmatrix}$. There is no inverse.
2.1.43
\[
\begin{bmatrix} 1 & 2 & 0 & 2 \\ 1 & 1 & 2 & 0 \\ 2 & 1 & -3 & 2 \\ 1 & 2 & 1 & 2 \end{bmatrix}^{-1} = \begin{bmatrix} -1 & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ 3 & \frac{1}{2} & -\frac{1}{2} & -\frac{5}{2} \\ -1 & 0 & 0 & 1 \\ -2 & -\frac{3}{4} & \frac{1}{4} & \frac{9}{4} \end{bmatrix}
\]
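For a matrix this size it is worth confirming the arithmetic by machine. A numpy sketch (our addition):

```python
import numpy as np

A = np.array([[1, 2, 0, 2],
              [1, 1, 2, 0],
              [2, 1, -3, 2],
              [1, 2, 1, 2]], dtype=float)
Ainv = np.linalg.inv(A)
print(np.allclose(A @ Ainv, np.eye(4)))  # True
print(Ainv * 4)  # the fractional entries above, scaled to integers
```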
2.1.45 (a) $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ -\frac{2}{3} \\ 0 \end{bmatrix}$
(b) $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 12 \\ 1 \\ 5 \end{bmatrix}$
(c) $\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3c - 2a \\ \frac{1}{3}b - \frac{2}{3}c \\ a - c \end{bmatrix}$
2.1.49 You need to show that $\left(A^{-1}\right)^T$ acts like the inverse of $A^T$ because from uniqueness in the above problem, this will imply it is the inverse. From properties of the transpose,
\[
A^T\left(A^{-1}\right)^T = \left(A^{-1}A\right)^T = I^T = I
\qquad
\left(A^{-1}\right)^T A^T = \left(AA^{-1}\right)^T = I^T = I
\]
Hence $\left(A^{-1}\right)^T = \left(A^T\right)^{-1}$ and this last matrix exists.
2.1.50 $(AB)\left(B^{-1}A^{-1}\right) = A\left(BB^{-1}\right)A^{-1} = AA^{-1} = I$ and $\left(B^{-1}A^{-1}\right)(AB) = B^{-1}\left(A^{-1}A\right)B = B^{-1}IB = B^{-1}B = I$.
2.1.51 The proof of this exercise follows from the previous one.
2.1.52 $A^2\left(A^{-1}\right)^2 = AAA^{-1}A^{-1} = AIA^{-1} = AA^{-1} = I$ and $\left(A^{-1}\right)^2A^2 = A^{-1}A^{-1}AA = A^{-1}IA = A^{-1}A = I$.
2.1.53 $A^{-1}A = AA^{-1} = I$ and so by uniqueness, $\left(A^{-1}\right)^{-1} = A$.
3.1.4
\[ \begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix} = 6 \]
3.1.5
\[ \begin{vmatrix} 1 & 2 & 1 \\ 1 & 0 & 1 \\ 2 & 1 & 1 \end{vmatrix} = 2 \]
3.1.6
\[ \begin{vmatrix} 1 & 2 & 1 \\ 2 & 1 & 3 \\ 2 & 1 & 1 \end{vmatrix} = 6 \]
3.1.7
\[ \begin{vmatrix} 1 & 0 & 0 & 1 \\ 2 & 1 & 1 & 0 \\ 0 & 0 & 0 & 2 \\ 2 & 1 & 3 & 1 \end{vmatrix} = -4 \]
3.1.9 It does not change the determinant. This was just taking the transpose.
3.1.10 In this case two rows were switched and so the resulting determinant is $-1$ times the first.
3.1.11 The determinant is unchanged. It was just the first row added to the second.
3.1.12 The second row was multiplied by 2 so the determinant of the result is 2 times the original deter-
minant.
3.1.13 In this case the two columns were switched so the determinant of the second is $-1$ times the determinant of the first.
3.1.14 If the determinant is nonzero, then it will remain nonzero with row operations applied to the matrix.
However, by assumption, you can obtain a row of zeros by doing row operations. Thus the determinant
must have been zero after all.
3.1.15 $\det(aA) = \det(aIA) = \det(aI)\det(A) = a^n\det(A)$. The matrix which has $a$ down the main diagonal has determinant equal to $a^n$.
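This scaling rule is easy to confirm numerically for a random matrix (a numpy sketch, our addition):

```python
import numpy as np

A = np.random.rand(3, 3)
a = 2.0
# det(aA) = a**n * det(A), with n = 3 here
print(np.isclose(np.linalg.det(a * A), a**3 * np.linalg.det(A)))  # True
```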
3.1.16
\[
\det\left(\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 5 & 6 \end{bmatrix}\right) = 8
\qquad
\det\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\det\begin{bmatrix} 1 & 2 \\ 5 & 6 \end{bmatrix} = (-2)(-4) = 8
\]
3.1.17 This is not true at all. Consider $A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $B = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$.
3.1.18 It must be 0 because $0 = \det(0) = \det\left(A^k\right) = \left(\det(A)\right)^k$.
3.1.19 You would need $\det\left(AA^T\right) = \det(A)\det\left(A^T\right) = \det(A)^2 = 1$ and so $\det(A) = 1$ or $-1$.
3.1.20 $\det(A) = \det\left(S^{-1}BS\right) = \det\left(S^{-1}\right)\det(B)\det(S) = \det(B)\det\left(S^{-1}S\right) = \det(B)$.
3.1.21 (a) False. Consider $\begin{bmatrix} 1 & 1 & 2 \\ -1 & 5 & 4 \\ 0 & 3 & 3 \end{bmatrix}$
(b) True.
(c) False.
(d) False.
(e) True.
(f) True.
(g) True.
(h) True.
(i) True.
(j) True.
3.1.22
\[ \begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 2 \\ -4 & 1 & 2 \end{vmatrix} = -6 \]
3.1.23
\[ \begin{vmatrix} 2 & 1 & 3 \\ 2 & 4 & 2 \\ 1 & 4 & -5 \end{vmatrix} = -32 \]
3.1.24 One can row reduce this using only row operation 3 to
\[ \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 5 & 5 & 3 \\ 0 & 0 & 2 & \frac{9}{5} \\ 0 & 0 & 0 & \frac{63}{10} \end{bmatrix} \]
3.1.25 One can row reduce this using only row operation 3 to
\[ \begin{bmatrix} 1 & 4 & 1 & 2 \\ 0 & 10 & 5 & 3 \\ 0 & 0 & 2 & \frac{19}{5} \\ 0 & 0 & 0 & \frac{211}{20} \end{bmatrix} \]
\[ = \begin{bmatrix} \frac{1}{13} & \frac{3}{13} & \frac{4}{13} \\ \frac{3}{13} & \frac{9}{13} & \frac{1}{13} \\ \frac{6}{13} & \frac{5}{13} & \frac{2}{13} \end{bmatrix} \]
3.2.27 $\det\begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 1 \\ 3 & 1 & 1 \end{bmatrix} = 7$ so it has an inverse. This inverse is
\[
\frac{1}{7}\begin{bmatrix} 1 & 3 & -6 \\ -2 & 1 & 5 \\ 2 & -1 & 2 \end{bmatrix}^T = \begin{bmatrix} \frac{1}{7} & -\frac{2}{7} & \frac{2}{7} \\ \frac{3}{7} & \frac{1}{7} & -\frac{1}{7} \\ -\frac{6}{7} & \frac{5}{7} & \frac{2}{7} \end{bmatrix}
\]
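The cofactor computation can be automated. A sketch of the adjugate formula in numpy (function name ours):

```python
import numpy as np

def adjugate_inverse(A):
    # A^{-1} = (1/det A) adj(A), where adj(A) is the transposed cofactor matrix.
    n = A.shape[0]
    C = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T / np.linalg.det(A)

A = np.array([[1, 2, 0], [0, 2, 1], [3, 1, 1]], dtype=float)
print(adjugate_inverse(A) * 7)  # [[1 -2 2], [3 1 -1], [-6 5 2]]
```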
3.2.28
\[ \det\begin{bmatrix} 1 & 3 & 3 \\ 2 & 4 & 1 \\ 0 & 1 & 1 \end{bmatrix} = 3 \]
so it has an inverse which is
\[ \begin{bmatrix} 1 & 0 & -3 \\ -\frac{2}{3} & \frac{1}{3} & \frac{5}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \end{bmatrix} \]
3.2.30
\[ \det\begin{bmatrix} 1 & 0 & 3 \\ 1 & 0 & 1 \\ 3 & 1 & 0 \end{bmatrix} = 2 \]
and so it has an inverse. The inverse turns out to equal
\[ \begin{bmatrix} -\frac{1}{2} & \frac{3}{2} & 0 \\ \frac{3}{2} & -\frac{9}{2} & 1 \\ \frac{1}{2} & -\frac{1}{2} & 0 \end{bmatrix} \]
3.2.31 (a) $\begin{vmatrix} 1 & 1 \\ 1 & 2 \end{vmatrix} = 1$
(b) $\begin{vmatrix} 1 & 2 & 3 \\ 0 & 2 & 1 \\ 4 & 1 & 1 \end{vmatrix} = -15$
(c) $\begin{vmatrix} 1 & 2 & 1 \\ 2 & 3 & 0 \\ 0 & 1 & 2 \end{vmatrix} = 0$
3.2.33
\[ \det\begin{bmatrix} 1 & t & t^2 \\ 0 & 1 & 2t \\ t & 0 & 2 \end{bmatrix} = t^3 + 2 \]
and so it has no inverse when $t = -\sqrt[3]{2}$.
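Parametric determinants like this one are convenient to check symbolically. A sympy sketch (our tool choice):

```python
import sympy as sp

t = sp.symbols('t')
A = sp.Matrix([[1, t, t**2], [0, 1, 2*t], [t, 0, 2]])
print(sp.expand(A.det()))      # t**3 + 2
print(sp.real_roots(A.det()))  # the single real root, t = -2**(1/3)
```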
3.2.34
\[ \det\begin{bmatrix} e^t & \cosh t & \sinh t \\ e^t & \sinh t & \cosh t \\ e^t & \cosh t & \sinh t \end{bmatrix} = 0 \]
and so this matrix fails to have a nonzero determinant at any value of $t$.
3.2.35
\[ \det\begin{bmatrix} e^t & e^{-t}\cos t & e^{-t}\sin t \\ e^t & -e^{-t}\cos t - e^{-t}\sin t & -e^{-t}\sin t + e^{-t}\cos t \\ e^t & 2e^{-t}\sin t & -2e^{-t}\cos t \end{bmatrix} = 5e^{-t} \neq 0 \]
and so this matrix is always invertible.
3.2.36 If $\det(A) \neq 0$, then $A^{-1}$ exists and so you could multiply on both sides on the left by $A^{-1}$ and obtain that $X = 0$.
3.2.37 You have $1 = \det(A)\det(B)$. Hence both $A$ and $B$ have inverses. Letting $X$ be given,
\[ A(BA - I)X = (AB)AX - AX = AX - AX = 0 \]
and so it follows from the above problem that $(BA - I)X = 0$. Since $X$ is arbitrary, it follows that $BA = I$.
3.2.38
\[ \det\begin{bmatrix} e^t & 0 & 0 \\ 0 & e^t\cos t & e^t\sin t \\ 0 & e^t\cos t - e^t\sin t & e^t\cos t + e^t\sin t \end{bmatrix} = e^{3t} \]
Hence the inverse is
\[ e^{-3t}\begin{bmatrix} e^{2t} & 0 & 0 \\ 0 & e^{2t}\cos t + e^{2t}\sin t & e^{2t}\sin t - e^{2t}\cos t \\ 0 & -e^{2t}\sin t & e^{2t}\cos t \end{bmatrix}^T = \begin{bmatrix} e^{-t} & 0 & 0 \\ 0 & e^{-t}(\cos t + \sin t) & -e^{-t}\sin t \\ 0 & e^{-t}(\sin t - \cos t) & e^{-t}\cos t \end{bmatrix} \]
3.2.39
\[ \begin{bmatrix} e^t & \cos t & \sin t \\ e^t & -\sin t & \cos t \\ e^t & -\cos t & -\sin t \end{bmatrix}^{-1} = \begin{bmatrix} \frac{1}{2}e^{-t} & 0 & \frac{1}{2}e^{-t} \\ \frac{1}{2}\cos t + \frac{1}{2}\sin t & -\sin t & \frac{1}{2}\sin t - \frac{1}{2}\cos t \\ \frac{1}{2}\sin t - \frac{1}{2}\cos t & \cos t & -\frac{1}{2}\cos t - \frac{1}{2}\sin t \end{bmatrix} \]
3.2.40 The given condition is what it takes for the determinant to be nonzero. Recall that the determinant of an upper triangular matrix is just the product of the entries on the main diagonal.
3.2.41 This follows because det (ABC) = det (A) det (B) det (C) and if this product is nonzero, then each
determinant in the product is nonzero and so each of these matrices is invertible.
3.2.42 False.
4.2.3
\[ \begin{bmatrix} 4 \\ 4 \\ 3 \end{bmatrix} = 2\begin{bmatrix} 3 \\ 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 \\ -2 \\ -1 \end{bmatrix} \]
4.7.17 This formula says that $\vec u\cdot\vec v = \|\vec u\|\|\vec v\|\cos\theta$ where $\theta$ is the included angle between the two vectors. Thus
\[ |\vec u\cdot\vec v| = \|\vec u\|\|\vec v\||\cos\theta| \le \|\vec u\|\|\vec v\| \]
and equality holds if and only if $\theta = 0$ or $\pi$. This means that the two vectors either point in the same direction or opposite directions. Hence one is a multiple of the other.
4.7.18 This follows from the Cauchy Schwarz inequality and the proof of Theorem 4.29 which only used
the properties of the dot product. Since this new product has the same properties the Cauchy Schwarz
inequality holds for it as well.
4.7.22 Since this is true for all $\vec x$, it follows that, in particular, it holds for
\[ \vec x = B^TA^T\vec y - (AB)^T\vec y \]
and so $B^TA^T\vec y - (AB)^T\vec y = \vec 0$. However, this is true for all $\vec y$ and so $B^TA^T - (AB)^T = 0$.
4.7.23
\[ \cos\theta = \frac{\begin{bmatrix} 3 & -1 & -1 \end{bmatrix}^T\cdot\begin{bmatrix} 1 & 4 & 2 \end{bmatrix}^T}{\sqrt{9 + 1 + 1}\sqrt{1 + 16 + 4}} = \frac{-3}{\sqrt{11}\sqrt{21}} = -0.19739 \]
Therefore we need to solve $\cos\theta = -0.19739$. Thus $\theta = 1.7695$ radians.
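The same angle, computed numerically (a numpy sketch, our addition):

```python
import numpy as np

u = np.array([3, -1, -1])
v = np.array([1, 4, 2])
c = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
print(c, np.arccos(c))  # -0.19739...  1.7695... (radians)
```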
4.7.24 $\cos\theta = \frac{1 + 4 - 10}{\sqrt{1 + 4 + 4}\,\sqrt{9}} = -0.55555$. Therefore we need to solve $\cos\theta = -0.55555$, which gives $\theta = 2.0313$ radians.
4.7.25 $\frac{\vec u\cdot\vec v}{\vec u\cdot\vec u}\vec u = \frac{5}{14}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} \frac{5}{14} \\ \frac{5}{7} \\ \frac{15}{14} \end{bmatrix}$
4.7.26 $\frac{\vec u\cdot\vec v}{\vec u\cdot\vec u}\vec u = \frac{5}{10}\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ 0 \\ \frac{3}{2} \end{bmatrix}$
4.7.27 $\frac{\vec u\cdot\vec v}{\vec u\cdot\vec u}\vec u = \frac{\begin{bmatrix} 1 & 2 & -2 & 1 \end{bmatrix}^T\cdot\begin{bmatrix} 1 & 2 & 3 & 0 \end{bmatrix}^T}{1 + 4 + 9}\begin{bmatrix} 1 \\ 2 \\ 3 \\ 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{14} \\ -\frac{1}{7} \\ -\frac{3}{14} \\ 0 \end{bmatrix}$
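These projections all follow one formula, which is easy to wrap in a helper (a numpy sketch, our addition):

```python
import numpy as np

def proj(u, v):
    # proj_u(v) = (u.v / u.u) u
    return (u @ v) / (u @ u) * u

u = np.array([1.0, 2.0, 3.0, 0.0])
v = np.array([1.0, 2.0, -2.0, 1.0])
print(proj(u, v))  # [-1/14, -1/7, -3/14, 0] as decimals
```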
4.7.30 No, it does not. The $\vec 0$ vector has no direction. The formula for $\mathrm{proj}_{\vec 0}(\vec w)$ doesn't make sense either.
4.7.31
\[ 0 \le \left(\vec u - \frac{\vec u\cdot\vec v}{\|\vec v\|^2}\vec v\right)\cdot\left(\vec u - \frac{\vec u\cdot\vec v}{\|\vec v\|^2}\vec v\right) = \|\vec u\|^2 - 2\frac{(\vec u\cdot\vec v)^2}{\|\vec v\|^2} + \frac{(\vec u\cdot\vec v)^2}{\|\vec v\|^2} \]
And so
\[ \|\vec u\|^2\|\vec v\|^2 \ge (\vec u\cdot\vec v)^2 \]
You get equality exactly when $\vec u = \mathrm{proj}_{\vec v}\,\vec u = \frac{\vec u\cdot\vec v}{\|\vec v\|^2}\vec v$; in other words, when $\vec u$ is a multiple of $\vec v$.
4.7.32
4.9.34 If $\vec a \neq \vec 0$, then the condition says that $\|\vec a\times\vec u\| = \|\vec a\|\sin\theta = 0$ for all angles $\theta$. Hence $\vec a = \vec 0$ after all.
4.9.35 $\begin{bmatrix} 3 \\ 0 \\ -3 \end{bmatrix}\times\begin{bmatrix} -4 \\ 0 \\ -2 \end{bmatrix} = \begin{bmatrix} 0 \\ 18 \\ 0 \end{bmatrix}$. So the area is $9$.
4.9.36 $\begin{bmatrix} 3 \\ 1 \\ -3 \end{bmatrix}\times\begin{bmatrix} -4 \\ 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ 18 \\ 7 \end{bmatrix}$. The area is given by
\[ \frac{1}{2}\sqrt{1 + (18)^2 + 49} = \frac{1}{2}\sqrt{374} \]
4.9.37 $\begin{bmatrix} 1 & 1 & 1 \end{bmatrix}\times\begin{bmatrix} 2 & 2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}$. The area is $0$. It means the three points are on the same line.
4.9.38 $\begin{bmatrix} 1 \\ 2 \\ -3 \end{bmatrix}\times\begin{bmatrix} -3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 8 \\ 8 \\ 8 \end{bmatrix}$. The area is $8\sqrt{3}$.
4.9.39 $\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix}\times\begin{bmatrix} 4 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} -6 \\ 11 \\ 2 \end{bmatrix}$. The area is $\sqrt{36 + 121 + 4} = \sqrt{161}$.
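numpy's cross product confirms these area computations (a sketch, our addition):

```python
import numpy as np

u = np.array([1, 0, 3])
v = np.array([4, 2, 1])
w = np.cross(u, v)
print(w)                  # [-6 11  2]
print(np.linalg.norm(w))  # sqrt(161) ~ 12.69, the parallelogram's area
```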
4.9.40 $\left(\vec i\times\vec j\right)\times\vec j = \vec k\times\vec j = -\vec i$. However, $\vec i\times\left(\vec j\times\vec j\right) = \vec 0$ and so the cross product is not associative.
4.9.41 Verify directly from the coordinate description of the cross product that the right hand rule applies to the vectors $\vec i, \vec j, \vec k$. Next verify that the distributive law holds for the coordinate description of the cross product. This gives another way to approach the cross product. First define it in terms of coordinates and then get the geometric properties from this. However, this approach does not yield the right hand rule property very easily. From the coordinate description,
\[ \left(\vec a\times\vec b\right)\cdot\vec a = \varepsilon_{ijk}a_jb_ka_i = -\varepsilon_{jik}a_jb_ka_i = -\varepsilon_{jik}b_ka_ia_j = -\left(\vec a\times\vec b\right)\cdot\vec a \]
so $\left(\vec a\times\vec b\right)\cdot\vec a = 0$, and $\|\vec a\times\vec b\| = \|\vec a\|\|\vec b\|\sin\theta$, the area of the parallelogram determined by $\vec a, \vec b$. Only the right hand rule is a little problematic. However, you can see right away from the component definition that the right hand rule holds for each of the standard unit vectors. Thus $\vec i\times\vec j = \vec k$ etc.
\[ \begin{vmatrix} \vec i & \vec j & \vec k \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \vec k \]
4.9.43
\[ \begin{vmatrix} 1 & 7 & 5 \\ 1 & 2 & 6 \\ 3 & -2 & -3 \end{vmatrix} = 113 \]
4.9.44 Yes. It will involve sums of products of integers and so it will be an integer.
4.9.45 It means that if you place them so that they all have their tails at the same point, the three will lie
in the same plane.
4.9.46 $\vec x\cdot\left(\vec a\times\vec b\right) = 0$
4.9.48 Here $[\vec v, \vec w, \vec z]$ denotes the box product. Consider the cross product term. From the above,
\[ (\vec v\times\vec w)\times(\vec w\times\vec z) = [\vec v, \vec w, \vec z]\,\vec w \]
Thus it reduces to
\[ (\vec u\times\vec v)\cdot[\vec v, \vec w, \vec z]\,\vec w = [\vec v, \vec w, \vec z][\vec u, \vec v, \vec w] \]
4.9.49
\[ \|\vec u\times\vec v\|^2 = \varepsilon_{ijk}u_jv_k\varepsilon_{irs}u_rv_s = \left(\delta_{jr}\delta_{ks} - \delta_{kr}\delta_{js}\right)u_rv_su_jv_k = u_jv_ku_jv_k - u_kv_ju_jv_k = \|\vec u\|^2\|\vec v\|^2 - (\vec u\cdot\vec v)^2 \]
It follows that the expression reduces to 0. You can also do the following.
\[ \|\vec u\times\vec v\|^2 = \|\vec u\|^2\|\vec v\|^2\sin^2\theta = \|\vec u\|^2\|\vec v\|^2\left(1 - \cos^2\theta\right) = \|\vec u\|^2\|\vec v\|^2 - \|\vec u\|^2\|\vec v\|^2\cos^2\theta = \|\vec u\|^2\|\vec v\|^2 - (\vec u\cdot\vec v)^2 \]
which implies the expression equals 0.
4.9.50 We will show it using the summation convention and permutation symbol.
\[ \left((\vec u\times\vec v)'\right)_i = \left((\vec u\times\vec v)_i\right)' = \left(\varepsilon_{ijk}u_jv_k\right)' = \varepsilon_{ijk}u_j'v_k + \varepsilon_{ijk}u_jv_k' = \left(\vec u\,'\times\vec v + \vec u\times\vec v\,'\right)_i \]
and so $(\vec u\times\vec v)' = \vec u\,'\times\vec v + \vec u\times\vec v\,'$.
4.10.57 The velocity is the sum of two vectors: $50\vec i + \frac{300}{\sqrt 2}\left(\vec i + \vec j\right) = \left(50 + \frac{300}{\sqrt 2}\right)\vec i + \frac{300}{\sqrt 2}\vec j$. The component in the direction of North is then $\frac{300}{\sqrt 2} = 150\sqrt 2$ and the velocity relative to the ground is
\[ \left(50 + \frac{300}{\sqrt 2}\right)\vec i + \frac{300}{\sqrt 2}\vec j \]
4.10.60 Velocity of plane for the first hour: $\begin{bmatrix} 0 \\ 150 \end{bmatrix} + \begin{bmatrix} 40 \\ 0 \end{bmatrix} = \begin{bmatrix} 40 \\ 150 \end{bmatrix}$. After one hour it is at $(40, 150)$. Next the velocity of the plane is $150\begin{bmatrix} \frac{1}{2} \\ \frac{1}{2}\sqrt 3 \end{bmatrix} + \begin{bmatrix} 40 \\ 0 \end{bmatrix}$ in miles per hour. After two hours it is then at
\[ (40, 150) + 150\left(\tfrac{1}{2}, \tfrac{1}{2}\sqrt 3\right) + (40, 0) = \left(155, 75\sqrt 3 + 150\right) = (155.0, 279.9) \]
4.10.61 Wind: $(0, 50)$. Direction it needs to travel: $(3, 5)\frac{1}{\sqrt{34}}$. Then you need $250(a, b) + (0, 50)$ to have this direction, where $(a, b)$ is an appropriate unit vector. Thus you need
\[ a^2 + b^2 = 1, \qquad \frac{250b + 50}{250a} = \frac{5}{3} \]
Thus $a = \frac{3}{5}$, $b = \frac{4}{5}$. The velocity of the plane relative to the ground is $(150, 250)$. The speed of the plane relative to the ground is given by
\[ \sqrt{(150)^2 + (250)^2} = 291.55 \text{ miles per hour} \]
It has to go a distance of $\sqrt{(300)^2 + (500)^2} = 583.10$ miles. Therefore, it takes
\[ \frac{583.1}{291.55} = 2 \text{ hours} \]
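The whole computation, checked numerically (a numpy sketch; the heading $(3/5, 4/5)$ is the one derived above):

```python
import numpy as np

heading = np.array([3/5, 4/5])             # unit vector solving the two conditions
ground = 250 * heading + np.array([0, 50])
print(ground)                              # [150. 250.]
speed = np.linalg.norm(ground)             # 291.55 mph
print(np.linalg.norm([300, 500]) / speed)  # 2.0 hours
```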
4.10.62 Water: $\begin{bmatrix} 2 \\ 0 \end{bmatrix}$. Swimmer: $\begin{bmatrix} 0 \\ 3 \end{bmatrix}$. Speed relative to earth: $\begin{bmatrix} 2 \\ 3 \end{bmatrix}$. It takes him $1/6$ of an hour to get across. Therefore, he ends up travelling $\frac{1}{6}\sqrt{4 + 9} = \frac{1}{6}\sqrt{13}$ miles. He ends up $1/3$ mile down stream.
4.10.63 Man: $3\begin{bmatrix} a \\ b \end{bmatrix}$. Water: $\begin{bmatrix} 2 \\ 0 \end{bmatrix}$. Then you need $3a = -2$ and so $a = -2/3$ and hence $b = \sqrt{5}/3$. The vector is then $\begin{bmatrix} -\frac{2}{3} \\ \frac{\sqrt 5}{3} \end{bmatrix}$.
In the second case, he could not do it. You would need to have a unit vector $\begin{bmatrix} a \\ b \end{bmatrix}$ such that $2a = -3$, which is not possible.
4.10.67 $\mathrm{proj}_{\vec D}\left(\vec F\right) = \frac{\vec F\cdot\vec D}{\|\vec D\|^2}\vec D = \|\vec F\|\cos\theta\,\frac{\vec D}{\|\vec D\|} = \|\vec F\|\cos\theta\,\vec u$
4.10.68 $40\cos\left(\frac{20\pi}{180}\right)100 = 3758.8$
4.10.69 $20\cos\left(\frac{\pi}{6}\right)200 = 3464.1$
4.10.70 $20\cos\left(\frac{\pi}{4}\right)300 = 4242.6$
4.10.71 $200\cos\left(\frac{\pi}{6}\right)20 = 3464.1$
4.10.72 $\begin{bmatrix} 4 \\ 3 \\ 4 \end{bmatrix}\cdot\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}10 = 30.$ You can consider the resultant of the two forces because of the properties of the dot product.
4.10.73
\[ \vec F_1\cdot\begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \\ 0 \end{bmatrix}10 + \vec F_2\cdot\begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \\ 0 \end{bmatrix}10 = \left(\vec F_1 + \vec F_2\right)\cdot\begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \\ 0 \end{bmatrix}10 = \begin{bmatrix} 6 \\ 4 \\ 4 \end{bmatrix}\cdot\begin{bmatrix} \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \\ 0 \end{bmatrix}10 = 50\sqrt 2 \]
4.10.74 $\begin{bmatrix} 1 \\ 3 \\ -4 \end{bmatrix}\cdot\begin{bmatrix} 0 \\ \frac{1}{\sqrt 2} \\ \frac{1}{\sqrt 2} \end{bmatrix}20 = -10\sqrt 2$
5.1.2
\[ T_{\vec u}(a\vec v + b\vec w) = \frac{(a\vec v + b\vec w)\cdot\vec u}{\|\vec u\|^2}\vec u = a\frac{\vec v\cdot\vec u}{\|\vec u\|^2}\vec u + b\frac{\vec w\cdot\vec u}{\|\vec u\|^2}\vec u = aT_{\vec u}(\vec v) + bT_{\vec u}(\vec w) \]
5.1.3 Linear transformations take $\vec 0$ to $\vec 0$ which $T$ does not. Also $T_{\vec a}(\vec u + \vec v) \neq T_{\vec a}\vec u + T_{\vec a}\vec v$.
5.2.4 (a) The matrix of T is the elementary matrix which multiplies the jth diagonal entry of the identity
matrix by b.
(b) The matrix of T is the elementary matrix which takes b times the jth row and adds to the ith row.
(c) The matrix of T is the elementary matrix which switches the ith and the jth rows where the two
components are in the ith and jth positions.
5.2.5 Suppose
\[ \begin{bmatrix} \vec c_1^T \\ \vdots \\ \vec c_n^T \end{bmatrix} = \begin{bmatrix} \vec a_1 & \cdots & \vec a_n \end{bmatrix}^{-1} \]
Thus $\vec c_i^T\vec a_j = \delta_{ij}$. Therefore,
\[ \begin{bmatrix} \vec b_1 & \cdots & \vec b_n \end{bmatrix}\begin{bmatrix} \vec a_1 & \cdots & \vec a_n \end{bmatrix}^{-1}\vec a_i = \begin{bmatrix} \vec b_1 & \cdots & \vec b_n \end{bmatrix}\begin{bmatrix} \vec c_1^T \\ \vdots \\ \vec c_n^T \end{bmatrix}\vec a_i = \begin{bmatrix} \vec b_1 & \cdots & \vec b_n \end{bmatrix}\vec e_i = \vec b_i \]
Thus $T\vec a_i = \begin{bmatrix} \vec b_1 & \cdots & \vec b_n \end{bmatrix}\begin{bmatrix} \vec a_1 & \cdots & \vec a_n \end{bmatrix}^{-1}\vec a_i = A\vec a_i$. If $\vec x$ is arbitrary, then since the matrix $\begin{bmatrix} \vec a_1 & \cdots & \vec a_n \end{bmatrix}$ is invertible, there exists a unique $\vec y$ such that $\begin{bmatrix} \vec a_1 & \cdots & \vec a_n \end{bmatrix}\vec y = \vec x$. Hence
\[ T\vec x = T\left(\sum_{i=1}^n y_i\vec a_i\right) = \sum_{i=1}^n y_iT\vec a_i = \sum_{i=1}^n y_iA\vec a_i = A\left(\sum_{i=1}^n y_i\vec a_i\right) = A\vec x \]
5.2.6
\[ \begin{bmatrix} 5 & 1 & 5 \\ 1 & 1 & 3 \\ 3 & 5 & -2 \end{bmatrix}\begin{bmatrix} 3 & 2 & 1 \\ 2 & 2 & 1 \\ 4 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 37 & 17 & 11 \\ 17 & 7 & 5 \\ 11 & 14 & 6 \end{bmatrix} \]
5.2.7
\[ \begin{bmatrix} 1 & 2 & 6 \\ 3 & 4 & 1 \\ 1 & 1 & -1 \end{bmatrix}\begin{bmatrix} 6 & 3 & 1 \\ 5 & 3 & 1 \\ 6 & 2 & 1 \end{bmatrix} = \begin{bmatrix} 52 & 21 & 9 \\ 44 & 23 & 8 \\ 5 & 4 & 1 \end{bmatrix} \]
5.2.8
\[ \begin{bmatrix} -3 & 1 & 5 \\ 1 & 3 & 3 \\ -3 & 3 & 3 \end{bmatrix}\begin{bmatrix} 2 & 2 & 1 \\ 1 & 2 & 1 \\ 4 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 15 & 1 & 3 \\ 17 & 11 & 7 \\ 9 & 3 & 3 \end{bmatrix} \]
5.2.9
\[ \begin{bmatrix} 3 & 1 & 1 \\ 3 & 2 & 3 \\ 3 & 3 & -1 \end{bmatrix}\begin{bmatrix} 6 & 2 & 1 \\ 5 & 2 & 1 \\ 6 & 1 & 1 \end{bmatrix} = \begin{bmatrix} 29 & 9 & 5 \\ 46 & 13 & 8 \\ 27 & 11 & 5 \end{bmatrix} \]
5.2.10
\[ \begin{bmatrix} 5 & 3 & 2 \\ 2 & 3 & 5 \\ 5 & 5 & -2 \end{bmatrix}\begin{bmatrix} 11 & 4 & 1 \\ 10 & 4 & 1 \\ 12 & 3 & 1 \end{bmatrix} = \begin{bmatrix} 109 & 38 & 10 \\ 112 & 35 & 10 \\ 81 & 34 & 8 \end{bmatrix} \]
5.2.14 Recall that $\mathrm{proj}_{\vec u}(\vec v) = \frac{\vec v\cdot\vec u}{\|\vec u\|^2}\vec u$ and so the desired matrix has $i$th column equal to $\mathrm{proj}_{\vec u}(\vec e_i)$. Therefore, the matrix desired is
\[ \frac{1}{14}\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 3 & 6 & 9 \end{bmatrix} \]
5.2.15
\[ \frac{1}{35}\begin{bmatrix} 1 & 5 & 3 \\ 5 & 25 & 15 \\ 3 & 15 & 9 \end{bmatrix} \]
5.2.16
\[ \frac{1}{10}\begin{bmatrix} 1 & 0 & 3 \\ 0 & 0 & 0 \\ 3 & 0 & 9 \end{bmatrix} \]
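All three matrices are instances of $\vec u\vec u^T/(\vec u\cdot\vec u)$, so a short generator covers them (a numpy sketch, our addition):

```python
import numpy as np

def proj_matrix(u):
    # Matrix of v -> proj_u(v); its i-th column is proj_u(e_i).
    u = np.asarray(u, dtype=float).reshape(-1, 1)
    return (u @ u.T) / (u ** 2).sum()

print(proj_matrix([1, 2, 3]) * 14)  # [[1 2 3], [2 4 6], [3 6 9]]
```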
5.4.27 $\begin{bmatrix} \cos\frac{2\pi}{3} & -\sin\frac{2\pi}{3} \\ \sin\frac{2\pi}{3} & \cos\frac{2\pi}{3} \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & -\frac{1}{2}\sqrt{3} \\ \frac{1}{2}\sqrt{3} & -\frac{1}{2} \end{bmatrix}$
5.4.28
\[ \begin{bmatrix} \cos\frac{\pi}{3} & -\sin\frac{\pi}{3} \\ \sin\frac{\pi}{3} & \cos\frac{\pi}{3} \end{bmatrix}\begin{bmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{bmatrix} = \begin{bmatrix} \frac{1}{4}\sqrt{2} - \frac{1}{4}\sqrt{2}\sqrt{3} & -\frac{1}{4}\sqrt{2}\sqrt{3} - \frac{1}{4}\sqrt{2} \\ \frac{1}{4}\sqrt{2}\sqrt{3} + \frac{1}{4}\sqrt{2} & \frac{1}{4}\sqrt{2} - \frac{1}{4}\sqrt{2}\sqrt{3} \end{bmatrix} \]
5.4.29
\[ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \cos\frac{2\pi}{3} & -\sin\frac{2\pi}{3} \\ \sin\frac{2\pi}{3} & \cos\frac{2\pi}{3} \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & -\frac{1}{2}\sqrt{3} \\ -\frac{1}{2}\sqrt{3} & \frac{1}{2} \end{bmatrix} \]
5.4.30
\[ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \cos\frac{\pi}{3} & -\sin\frac{\pi}{3} \\ \sin\frac{\pi}{3} & \cos\frac{\pi}{3} \end{bmatrix} = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2}\sqrt{3} \\ -\frac{1}{2}\sqrt{3} & -\frac{1}{2} \end{bmatrix} \]
5.4.31
\[ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \\ -\frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \end{bmatrix} \]
5.4.32
\[ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \cos\frac{\pi}{6} & -\sin\frac{\pi}{6} \\ \sin\frac{\pi}{6} & \cos\frac{\pi}{6} \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ -\frac{1}{2} & -\frac{1}{2}\sqrt{3} \end{bmatrix} \]
5.4.33
\[ \begin{bmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \end{bmatrix} \]
5.4.34
\[ \begin{bmatrix} \cos\left(-\frac{\pi}{4}\right) & -\sin\left(-\frac{\pi}{4}\right) \\ \sin\left(-\frac{\pi}{4}\right) & \cos\left(-\frac{\pi}{4}\right) \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \\ -\frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \end{bmatrix} \]
5.4.35
\[ \begin{bmatrix} \cos\frac{\pi}{6} & -\sin\frac{\pi}{6} \\ \sin\frac{\pi}{6} & \cos\frac{\pi}{6} \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2}\sqrt{3} \end{bmatrix} \]
5.4.36
\[ \begin{bmatrix} \cos\left(-\frac{\pi}{6}\right) & -\sin\left(-\frac{\pi}{6}\right) \\ \sin\left(-\frac{\pi}{6}\right) & \cos\left(-\frac{\pi}{6}\right) \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} \\ -\frac{1}{2} & -\frac{1}{2}\sqrt{3} \end{bmatrix} \]
5.4.37
\[ \begin{bmatrix} \cos\frac{2\pi}{3} & -\sin\frac{2\pi}{3} \\ \sin\frac{2\pi}{3} & \cos\frac{2\pi}{3} \end{bmatrix}\begin{bmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{bmatrix} = \begin{bmatrix} -\frac{1}{4}\sqrt{2}\sqrt{3} - \frac{1}{4}\sqrt{2} & \frac{1}{4}\sqrt{2} - \frac{1}{4}\sqrt{2}\sqrt{3} \\ \frac{1}{4}\sqrt{2}\sqrt{3} - \frac{1}{4}\sqrt{2} & -\frac{1}{4}\sqrt{2}\sqrt{3} - \frac{1}{4}\sqrt{2} \end{bmatrix} \]
Note that it doesn't matter about the order in this case.
5.4.38
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} \cos\frac{\pi}{6} & -\sin\frac{\pi}{6} & 0 \\ \sin\frac{\pi}{6} & \cos\frac{\pi}{6} & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{2}\sqrt{3} & -\frac{1}{2} & 0 \\ \frac{1}{2} & \frac{1}{2}\sqrt{3} & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
5.4.39
\[ \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} \cos(-\theta) & -\sin(-\theta) \\ \sin(-\theta) & \cos(-\theta) \end{bmatrix} = \begin{bmatrix} \cos^2\theta - \sin^2\theta & 2\cos\theta\sin\theta \\ 2\cos\theta\sin\theta & \sin^2\theta - \cos^2\theta \end{bmatrix} \]
Now to write in terms of $(a, b)$, note that $a/\sqrt{a^2 + b^2} = \cos\theta$, $b/\sqrt{a^2 + b^2} = \sin\theta$. Now plug this in to the above. The result is
\[ \begin{bmatrix} \frac{a^2 - b^2}{a^2 + b^2} & \frac{2ab}{a^2 + b^2} \\ \frac{2ab}{a^2 + b^2} & \frac{b^2 - a^2}{a^2 + b^2} \end{bmatrix} = \frac{1}{a^2 + b^2}\begin{bmatrix} a^2 - b^2 & 2ab \\ 2ab & b^2 - a^2 \end{bmatrix} \]
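The final $(a, b)$ form of the reflection is convenient to code up (a numpy sketch; names ours):

```python
import numpy as np

def reflection_about_line(a, b):
    # Reflection across the line through the origin containing (a, b)^T.
    return np.array([[a*a - b*b, 2*a*b],
                     [2*a*b, b*b - a*a]]) / (a*a + b*b)

print(reflection_about_line(1, 1))  # [[0 1], [1 0]]: reflects across y = x
print(reflection_about_line(1, 0))  # [[1 0], [0 -1]]: reflects across the x axis
```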
6.1.1 (a) $z + w = 5 - i$
(b) $z - 2w = -4 + 23i$
(c) $zw = 62 + 5i$
(d) $\frac{w}{z} = -\frac{50}{53} - \frac{37}{53}i$
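Python's built-in complex numbers reproduce these answers, assuming $z = 2 + 7i$ and $w = 3 - 8i$ (operands consistent with all four results, since the exercise statement is not reproduced here):

```python
z, w = 2 + 7j, 3 - 8j
print(z + w)      # (5-1j)
print(z - 2 * w)  # (-4+23j)
print(z * w)      # (62+5j)
print(w / z)      # -50/53 - (37/53)i, about (-0.943-0.698j)
```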
6.1.4 If $z = 0$, let $w = 1$. If $z \neq 0$, let $w = \dfrac{\bar z}{|z|}$.
6.1.5 $\overline{(a + bi)(c + di)} = \overline{ac - bd + (ad + bc)i} = (ac - bd) - (ad + bc)i$, while $(a - bi)(c - di) = ac - bd - (ad + bc)i$, which is the same thing. Thus it holds for a product of two complex numbers. Now suppose you have that it is true for the product of $n$ complex numbers. Then $\overline{z_1\cdots z_{n+1}} = \overline{z_1\cdots z_n}\;\overline{z_{n+1}}$ and the result follows by induction. In particular, for a polynomial $p$ with real coefficients,
\[ \overline{p(z)} = \overline{a_nz^n + a_{n-1}z^{n-1} + \cdots + a_1z + a_0} = \overline{a_n}\,\bar z^n + \overline{a_{n-1}}\,\bar z^{n-1} + \cdots + \overline{a_1}\,\bar z + \overline{a_0} = a_n\bar z^n + a_{n-1}\bar z^{n-1} + \cdots + a_1\bar z + a_0 = p(\bar z) \]
6.1.7 The problem is that there is no single $\sqrt{-1}$.
6.2.12 You have $z = |z|(\cos\theta + i\sin\theta)$ and $w = |w|(\cos\phi + i\sin\phi)$. Then when you multiply these, you get
\[ zw = |z||w|\left(\cos\theta\cos\phi - \sin\theta\sin\phi + i(\cos\theta\sin\phi + \sin\theta\cos\phi)\right) = |z||w|\left(\cos(\theta + \phi) + i\sin(\theta + \phi)\right) \]
6.3.16 Yes, it holds for all integers. First of all, it clearly holds if $n = 0$. Suppose now that $n$ is a negative integer. Then $-n > 0$ and so
\[ [r(\cos t + i\sin t)]^n = \frac{1}{[r(\cos t + i\sin t)]^{-n}} = \frac{1}{r^{-n}(\cos(-nt) + i\sin(-nt))} = r^n(\cos(nt) + i\sin(nt)) \]
6.3.18 $x^3 + 27 = (x + 3)\left(x^2 - 3x + 9\right)$
6.3.20 $x^4 + 16 = \left(x^2 - 2\sqrt{2}x + 4\right)\left(x^2 + 2\sqrt{2}x + 4\right)$. You can use the information in the preceding problem. Note that $(x - z)(x - \bar z)$ has real coefficients.
6.3.22 $p(x) = (x - z_1)q(x) + r(x)$ where $r(x)$ is a nonzero constant or equal to 0. However, $r(z_1) = 0$ and so $r(x) = 0$. Now do to $q(x)$ what was done to $p(x)$ and continue until the degree of the resulting $q(x)$ equals 0. Then you have the above factorization.
6.4.23
\[ (x - (1 + i))(x - (2 + i)) = x^2 - (3 + 2i)x + 1 + 3i \]
(b) Solution is: $x = 1 + \frac{1}{2}i$, $x = 1 - \frac{1}{2}i$
(c) Solution is: $x = \frac{1}{2}$, $x = \frac{1}{2}i$
7.1.2 Say $AX = \lambda X$. Then $cAX = c\lambda X$ and so the eigenvalues of $cA$ are just $c\lambda$ where $\lambda$ is an eigenvalue of $A$.

$\lambda^m = \lambda$. Hence if $\lambda \neq 0$, then $\lambda^{m-1} = 1$ and so $|\lambda| = 1$.
7.1.5 The formula follows from properties of matrix multiplications. However, this vector might not be
an eigenvector because it might equal 0 and eigenvectors cannot equal 0.
7.1.14 Yes. $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ works.
7.1.16 When you think of this geometrically, it is clear that the only two values of $\theta$ are $0$ and $\pi$, or these added to integer multiples of $2\pi$.
7.1.17 The matrix of $T$ is $\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. The eigenvectors and eigenvalues are:
\[ \left\{\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right\} \leftrightarrow -1, \qquad \left\{\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right\} \leftrightarrow 1 \]
7.1.18 The matrix of $T$ is $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$. The eigenvectors and eigenvalues are:
\[ \left\{\begin{bmatrix} -i \\ 1 \end{bmatrix}\right\} \leftrightarrow -i, \qquad \left\{\begin{bmatrix} i \\ 1 \end{bmatrix}\right\} \leftrightarrow i \]
7.1.19 The matrix of $T$ is $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$. The eigenvectors and eigenvalues are:
\[ \left\{\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}\right\} \leftrightarrow -1, \qquad \left\{\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}\right\} \leftrightarrow 1 \]
7.2.20 The eigenvalues are $1, 1, 1$. The eigenvectors corresponding to the eigenvalues are:
\[ \left\{\begin{bmatrix} 10 \\ 2 \\ 3 \end{bmatrix}\right\} \leftrightarrow 1, \qquad \left\{\begin{bmatrix} 7 \\ 2 \\ 2 \end{bmatrix}\right\} \leftrightarrow 1 \]
Therefore this matrix is not diagonalizable.
7.2.27 The eigenvalues are distinct because they are the $n$th roots of 1. Hence if $X$ is a given vector with
\[ X = \sum_{j=1}^n a_jV_j \]
then
\[ A^{nm}X = A^{nm}\sum_{j=1}^n a_jV_j = \sum_{j=1}^n a_jA^{nm}V_j = \sum_{j=1}^n a_jV_j = X \]
so $A^{nm} = I$.
$A\overline{X} = (a - ib)\overline{X}$
Letting $x_3 = t$ and using the fact that there are a total of 480 individuals, we must solve
\[ \frac{5}{6}t + \frac{2}{3}t + t = 480 \]
We find that $t = 192$. Therefore after a long time, there are 160 people in location 1, 128 in location 2, and 192 in location 3.
7.3.41
\[ X_3 = \begin{bmatrix} 0.38 \\ 0.18 \\ 0.44 \end{bmatrix} \]
Therefore the probability of ending up back in location 2 is 0.18.
7.3.42
\[ X_2 = \begin{bmatrix} 0.367 \\ 0.4625 \\ 0.1705 \end{bmatrix} \]
Therefore the probability of ending up in location 1 is 0.367.
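State vectors like these come from iterating $X_{n+1} = PX_n$. A generic sketch (the transition matrix below is hypothetical, since this answer key does not reproduce the exercise's $P$):

```python
import numpy as np

P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.4, 0.3],
              [0.3, 0.3, 0.5]])  # hypothetical column-stochastic matrix
X0 = np.array([0.0, 1.0, 0.0])   # start in location 2
X3 = np.linalg.matrix_power(P, 3) @ X0
print(X3, X3.sum())              # a probability vector; entries sum to 1
```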
Index
union, 269
unit vector, 137
variable
basic, 25
free, 25
vector, 127
addition, 128
addition, geometric meaning, 127
components, 127
corresponding unit vector, 137
length, 137
orthogonal, 151
perpendicular, 151
points and vectors, 126
projection, 152, 153
scalar multiplication, 129
subtraction, 129
vectors, 46
column, 46
row vector, 46
velocity, 173
zero matrix, 42
zero transformation, 181
zero vector, 129