Further linear algebra. Chapter VI.
Inner product spaces.
Andrei Yafaev
1 Geometry of Inner Product Spaces
Definition 1.1 Let $V$ be a vector space over $\mathbb{R}$ and let $\langle\cdot,\cdot\rangle$ be a symmetric bilinear form on $V$. We shall call the form positive definite if for all non-zero vectors $v \in V$ we have
$$\langle v, v\rangle > 0.$$
Notice that a symmetric bilinear form is positive definite if and only if its canonical form (over $\mathbb{R}$) is $I_n$. Indeed, the form $x_1^2 + \dots + x_n^2$ is clearly positive definite on $\mathbb{R}^n$. Conversely, suppose the form is positive definite and $B$ is a basis such that the matrix with respect to $B$ is the canonical form. For any basis vector $b_i$, the diagonal entry satisfies $\langle b_i, b_i\rangle > 0$ and hence $\langle b_i, b_i\rangle = 1$, so the canonical form is $I_n$.
Definition 1.2 Let $V$ be a vector space over $\mathbb{C}$. A Hermitian form on $V$ is a function $\langle\cdot,\cdot\rangle : V \times V \to \mathbb{C}$ such that:
For all $u, v, w \in V$ and all $\lambda \in \mathbb{C}$,
$$\langle \lambda u + v, w\rangle = \lambda\langle u, w\rangle + \langle v, w\rangle;$$
For all $u, v \in V$,
$$\langle u, v\rangle = \overline{\langle v, u\rangle}.$$
Example 1.1 The simplest example is the following: take $V = \mathbb{C}$; then $\langle z, w\rangle = z\bar{w}$ is a Hermitian form on $\mathbb{C}$.
A matrix $A \in M_n(\mathbb{C})$ is called a Hermitian matrix if $A^t = \bar{A}$. Here $\bar{A}$ is the matrix obtained from $A$ by applying complex conjugation to the entries.
If $A$ is a Hermitian matrix then the following is a Hermitian form on $\mathbb{C}^n$:
$$\langle v, w\rangle = v^t A \bar{w}.$$
In fact every Hermitian form on $\mathbb{C}^n$ is one of these.
To see why, suppose we are given a Hermitian form $\langle\cdot,\cdot\rangle$. Choose a basis $B = (b_1, \dots, b_n)$. Let $v = \sum_i \lambda_i b_i$ and $w = \sum_j \mu_j b_j$. We calculate
$$\langle v, w\rangle = \Big\langle \sum_i \lambda_i b_i, \sum_j \mu_j b_j\Big\rangle = \sum_{i,j} \lambda_i \bar{\mu}_j \langle b_i, b_j\rangle = [v]_B^t\,A\,\overline{[w]_B},$$
where $A = (\langle b_i, b_j\rangle)$. Of course $A^t = \bar{A}$ because $\langle b_i, b_j\rangle = \overline{\langle b_j, b_i\rangle}$. A matrix $A$ satisfying $A^t = \bar{A}$ is called Hermitian.
Example 1.2 If $V = \mathbb{R}^n$, then $\langle\cdot,\cdot\rangle$ defined by $\langle (x_1, \dots, x_n), (y_1, \dots, y_n)\rangle = \sum_i x_i y_i$ is called the standard inner product.
If $V = \mathbb{C}^n$, then $\langle\cdot,\cdot\rangle$ defined by $\langle (z_1, \dots, z_n), (w_1, \dots, w_n)\rangle = \sum_i z_i \bar{w}_i$ is called the standard (Hermitian) inner product.
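As a side note (not part of the original notes), these two standard inner products are easy to compute numerically; this NumPy sketch follows the convention above of conjugating the second argument.

```python
import numpy as np

# Standard inner product on R^n: <x, y> = sum_i x_i y_i
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 2.0])
print(x @ y)                      # 8.0

# Standard Hermitian inner product on C^n: <z, w> = sum_i z_i conj(w_i)
# (conjugate-linear in the second argument, as in the notes)
z = np.array([1 + 1j, 2j])
w = np.array([3 - 1j, 1 + 1j])
print(z @ np.conj(w))             # <z, w>
print(z @ np.conj(z))             # <z, z> = |z_1|^2 + |z_2|^2 = 6, real and > 0
```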
Note that a Hermitian form is conjugate-linear in the second variable, i.e.
$$\langle u, \lambda v + w\rangle = \bar{\lambda}\langle u, v\rangle + \langle u, w\rangle.$$
Note also that by the second axiom
$$\langle u, u\rangle \in \mathbb{R}.$$
Definition 1.3 A Hermitian form is positive definite if for all non-zero vectors $v$ we have
$$\langle v, v\rangle > 0.$$
In other words, $\langle v, v\rangle \geq 0$ for all $v$, and $\langle v, v\rangle = 0$ if and only if $v = 0$.
Clearly, the form $\langle z, w\rangle = z\bar{w}$ is positive definite.
Definition 1.4 By an inner product space we shall mean one of the following:
either a finite dimensional vector space $V$ over $\mathbb{R}$ with a positive definite symmetric bilinear form;
or a finite dimensional vector space $V$ over $\mathbb{C}$ with a positive definite Hermitian form.
We shall often write $K$ to mean the field $\mathbb{R}$ or $\mathbb{C}$, depending on which is relevant.
Example 1.3 Consider the vector space $V$ of all continuous functions $[0, 1] \to \mathbb{C}$. Then we can define
$$\langle f, g\rangle = \int_0^1 f(x)\overline{g(x)}\,dx.$$
This defines an inner product on $V$ (easy exercise).
Another example: let $V = M_n(\mathbb{R})$ be the vector space of $n \times n$ matrices with real entries. Then
$$\langle A, B\rangle = \mathrm{tr}(AB^t)$$
is an inner product on $V$.
Similarly, if $V = M_n(\mathbb{C})$, then $\langle A, B\rangle = \mathrm{tr}(A\bar{B}^t)$ is an inner product.
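A minimal numerical sketch (ours, not from the notes) of the trace form on $M_n(\mathbb{R})$, checking symmetry and that $\langle A, A\rangle$ is a sum of squares of the entries:

```python
import numpy as np

def trace_form(A, B):
    # <A, B> = tr(A B^t) on M_n(R)
    return np.trace(A @ B.T)

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

print(np.isclose(trace_form(A, B), trace_form(B, A)))   # True: the form is symmetric
print(np.isclose(trace_form(A, A), np.sum(A**2)))        # True: <A, A> = sum of squared entries > 0
```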
Definition 1.5 Let $V$ be an inner product space. We define the norm of a vector $v \in V$ by
$$\|v\| = \sqrt{\langle v, v\rangle}.$$
Lemma 1.4 For $\lambda \in K$ we have $\lambda\bar{\lambda} = |\lambda|^2$, and for $v \in V$ we have $\|\lambda v\| = |\lambda|\,\|v\|$.
The proof is obvious.
Theorem 1.5 (Cauchy–Schwarz inequality) If $V$ is an inner product space then
$$\forall u, v \in V \quad |\langle u, v\rangle| \leq \|u\|\,\|v\|.$$
Proof. If $v = 0$ then the result holds, so suppose $v \neq 0$. We have, for all $\lambda \in K$,
$$\langle u - \lambda v, u - \lambda v\rangle \geq 0.$$
Expanding this out we have:
$$\|u\|^2 - \lambda\langle v, u\rangle - \bar{\lambda}\langle u, v\rangle + |\lambda|^2\|v\|^2 \geq 0.$$
Setting $\lambda = \dfrac{\langle u, v\rangle}{\|v\|^2}$ we have:
$$\|u\|^2 - 2\frac{\langle u, v\rangle\langle v, u\rangle}{\|v\|^2} + \frac{|\langle u, v\rangle|^2}{\|v\|^2} \geq 0.$$
Multiplying by $\|v\|^2$ we get
$$\|u\|^2\|v\|^2 - 2|\langle u, v\rangle|^2 + |\langle u, v\rangle|^2 \geq 0.$$
Hence
$$\|u\|^2\|v\|^2 \geq |\langle u, v\rangle|^2.$$
Taking the square root of both sides we get the result.
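A quick numerical sanity check of the Cauchy–Schwarz inequality for the standard Hermitian inner product (an illustration of ours, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(5) + 1j * rng.standard_normal(5)
v = rng.standard_normal(5) + 1j * rng.standard_normal(5)

inner = u @ np.conj(v)                    # <u, v>
norm = lambda x: np.sqrt((x @ np.conj(x)).real)
print(abs(inner) <= norm(u) * norm(v))    # True: |<u, v>| <= ||u|| ||v||
```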
Theorem 1.6 (Triangle inequality) If $V$ is an inner product space with norm $\|\cdot\|$ then
$$\forall u, v \in V \quad \|u + v\| \leq \|u\| + \|v\|.$$
Proof. We have
$$\|u + v\|^2 = \langle u + v, u + v\rangle = \|u\|^2 + 2\Re\langle u, v\rangle + \|v\|^2.$$
Notice that $\Re(\langle u, v\rangle) \leq |\langle u, v\rangle|$, hence
$$\|u + v\|^2 \leq \|u\|^2 + 2|\langle u, v\rangle| + \|v\|^2.$$
So the Cauchy–Schwarz inequality implies that
$$\|u + v\|^2 \leq \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2.$$
Hence
$$\|u + v\| \leq \|u\| + \|v\|.$$
Definition 1.6 Two vectors $v, w$ in an inner product space are called orthogonal if $\langle v, w\rangle = 0$.
Theorem 1.7 (Pythagoras' Theorem) Let $(V, \langle\cdot,\cdot\rangle)$ be an inner product space. If $v, w \in V$ are orthogonal, then
$$\|v\|^2 + \|w\|^2 = \|v + w\|^2.$$
Proof. Since
$$\|v + w\|^2 = \langle v + w, v + w\rangle = \|v\|^2 + 2\Re\langle v, w\rangle + \|w\|^2,$$
we have
$$\|v\|^2 + \|w\|^2 = \|v + w\|^2$$
whenever $\langle v, w\rangle = 0$.
2 Gram–Schmidt Orthogonalisation
Definition 2.1 Let $V$ be an inner product space. We shall call a basis $B$ of $V$ an orthonormal basis if $\langle b_i, b_j\rangle = \delta_{i,j}$.
Proposition 2.1 If $B$ is an orthonormal basis then for $v, w \in V$ we have:
$$\langle v, w\rangle = [v]_B^t\,\overline{[w]_B}.$$
Proof. If the basis $B = (b_1, \dots, b_n)$ is orthonormal, then the matrix of $\langle\cdot,\cdot\rangle$ in this basis is the identity $I_n$. The proposition follows.
Theorem 2.2 (Gram–Schmidt Orthogonalisation) Let $B = (b_1, \dots, b_n)$ be any basis. Then the basis $C = (c_1, \dots, c_n)$ defined by
$$c_1 = b_1,$$
$$c_2 = b_2 - \frac{\langle b_2, c_1\rangle}{\langle c_1, c_1\rangle}c_1,$$
$$c_3 = b_3 - \frac{\langle b_3, c_1\rangle}{\langle c_1, c_1\rangle}c_1 - \frac{\langle b_3, c_2\rangle}{\langle c_2, c_2\rangle}c_2,$$
$$\vdots$$
$$c_n = b_n - \sum_{r=1}^{n-1}\frac{\langle b_n, c_r\rangle}{\langle c_r, c_r\rangle}c_r,$$
is orthogonal. Furthermore the basis $D$ defined by
$$d_r = \frac{1}{\|c_r\|}c_r$$
is orthonormal.
Proof. Clearly each $b_i$ is a linear combination of elements of $C$, so $C$ spans $V$. As the cardinality of $C$ is $\dim V$, $C$ is a basis. It follows also that $D$ is a basis. We'll prove by induction that $\{c_1, \dots, c_r\}$ is orthogonal. Clearly any single vector is orthogonal. Suppose $\{c_1, \dots, c_{r-1}\}$ are orthogonal. Then for $s < r$ we have
$$\langle c_r, c_s\rangle = \langle b_r, c_s\rangle - \sum_{t=1}^{r-1}\frac{\langle b_r, c_t\rangle}{\langle c_t, c_t\rangle}\langle c_t, c_s\rangle.$$
By the inductive hypothesis (notice that $\langle c_t, c_s\rangle = 0$ unless $t = s$) we have
$$\langle c_r, c_s\rangle = \langle b_r, c_s\rangle - \frac{\langle b_r, c_s\rangle}{\langle c_s, c_s\rangle}\langle c_s, c_s\rangle = \langle b_r, c_s\rangle - \langle b_r, c_s\rangle = 0.$$
This shows that $\{c_1, \dots, c_r\}$ are orthogonal. Hence $C$ is an orthogonal basis. It follows easily that $D$ is orthonormal.
This theorem shows in particular that an orthonormal basis always exists. Indeed, take any basis and turn it into an orthonormal one by applying the Gram–Schmidt process to it.
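As an aside, here is a sketch of the Gram–Schmidt process of Theorem 2.2 for the standard (Hermitian) inner product on $\mathbb{C}^n$; the function names are our own, and the normalisation is folded into each step so the output is directly the orthonormal basis $D$.

```python
import numpy as np

def inner(u, v):
    # standard (Hermitian) inner product, conjugate-linear in the second argument
    return u @ np.conj(v)

def gram_schmidt(basis):
    """Turn a list of linearly independent vectors into an orthonormal list."""
    ortho = []
    for b in basis:
        c = b.astype(complex)
        for e in ortho:
            c = c - inner(b, e) * e            # subtract the component of b along e (||e|| = 1)
        ortho.append(c / np.sqrt(inner(c, c).real))
    return ortho

B = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])]
D = gram_schmidt(B)
for i, di in enumerate(D):
    for j, dj in enumerate(D):
        assert np.isclose(inner(di, dj), float(i == j))   # orthonormality check
```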
Proposition 2.3 If $V$ is an inner product space with an orthonormal basis $E = \{e_1, \dots, e_n\}$, then any $v \in V$ can be written as $v = \sum_{i=1}^n \langle v, e_i\rangle e_i$.
Proof. We have $v = \sum_{i=1}^n \lambda_i e_i$ and $\langle v, e_j\rangle = \sum_{i=1}^n \lambda_i\langle e_i, e_j\rangle = \lambda_j$.
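A short self-contained check of Proposition 2.3 (our own example, not from the notes): the coordinates of $v$ in an orthonormal basis are exactly the inner products $\langle v, e_i\rangle$.

```python
import numpy as np

# An orthonormal basis of R^3 (a rotation of the standard basis) and a test vector.
E = [np.array([1.0, 1.0, 0.0]) / np.sqrt(2),
     np.array([1.0, -1.0, 0.0]) / np.sqrt(2),
     np.array([0.0, 0.0, 1.0])]
v = np.array([2.0, -1.0, 3.0])

coords = [v @ e for e in E]                       # lambda_i = <v, e_i>
v_rebuilt = sum(c * e for c, e in zip(coords, E))
print(np.allclose(v_rebuilt, v))                  # True: v = sum_i <v, e_i> e_i
```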
Definition 2.2 Let $S$ be a subspace of an inner product space $V$. The orthogonal complement of $S$ is defined to be
$$S^{\perp} = \{v \in V : \forall w \in S,\ \langle v, w\rangle = 0\}.$$
Theorem 2.4 If $(V, \langle\cdot,\cdot\rangle)$ is an inner product space and $W$ is a subspace of $V$ then
$$V = W \oplus W^{\perp},$$
and hence any $v \in V$ can be written as
$$v = w + w^{\perp},$$
for unique $w \in W$ and $w^{\perp} \in W^{\perp}$.
Proof. We show first that $V = W + W^{\perp}$.
Let $E = \{e_1, \dots, e_n\}$ be an orthonormal basis for $V$ such that $\{e_1, \dots, e_r\}$ is a basis for $W$. This can be constructed by Gram–Schmidt orthogonalisation: choose a basis $\{b_1, \dots, b_r\}$ for $W$ and complete it to a basis $\{b_1, \dots, b_n\}$ of $V$, then apply the Gram–Schmidt process. Notice that in the Gram–Schmidt process the vector $c_k$ lies in the space generated by $c_1, \dots, c_{k-1}, b_k$. It follows that the process gives an orthonormal basis $e_1, \dots, e_n$ such that $e_1, \dots, e_r$ is an orthonormal basis of $W$.
If $v \in V$ then
$$v = \sum_{i=1}^r \lambda_i e_i + \sum_{i=r+1}^n \lambda_i e_i.$$
Now
$$\sum_{i=1}^r \lambda_i e_i \in W.$$
If $w \in W$ then there exist $\mu_i \in K$ such that
$$w = \sum_{i=1}^r \mu_i e_i.$$
So
$$\Big\langle w, \sum_{j=r+1}^n \lambda_j e_j\Big\rangle = \sum_{i=1}^r\sum_{j=r+1}^n \mu_i\bar{\lambda}_j\langle e_i, e_j\rangle = 0.$$
Hence
$$\sum_{i=r+1}^n \lambda_i e_i \in W^{\perp}.$$
Therefore
$$V = W + W^{\perp}.$$
Next suppose $v \in W \cap W^{\perp}$. Then $\langle v, v\rangle = 0$ and so $v = 0$.
Hence $V = W \oplus W^{\perp}$ and so any vector $v \in V$ can be expressed uniquely as
$$v = w + w^{\perp},$$
where $w \in W$ and $w^{\perp} \in W^{\perp}$.
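A small sketch (our own example, under the assumptions of Theorem 2.4) of the decomposition $v = w + w^{\perp}$ when an orthonormal basis of $W$ is known: project onto $W$ and take the remainder.

```python
import numpy as np

# W = span of one unit vector in R^3 (so here dim W = 1)
e1 = np.array([1.0, 2.0, 2.0]) / 3.0             # orthonormal basis of W
v = np.array([1.0, 0.0, 4.0])

w = (v @ e1) * e1                                 # component of v lying in W
w_perp = v - w                                    # the remainder
print(np.isclose(w_perp @ e1, 0.0))               # True: w_perp is orthogonal to W
print(np.allclose(w + w_perp, v))                 # True: v = w + w_perp
```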
3 Adjoints.
Definition 3.1 An adjoint of a linear map $T : V \to V$ is a linear map $T^*$ such that $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for all $u, v \in V$.
Theorem 3.1 (Existence and uniqueness) Every $T : V \to V$ has a unique adjoint. If $T$ is represented by $A$ (w.r.t. an orthonormal basis) then $T^*$ is represented by $\bar{A}^t$.
Proof. (Existence) Let $T^*$ be the linear map represented by $\bar{A}^t$. We'll prove that it is an adjoint of $T$:
$$\langle Tv, w\rangle = (A[v])^t\,\overline{[w]} = [v]^t A^t\,\overline{[w]} = [v]^t\,\overline{\bar{A}^t[w]} = \langle v, T^*w\rangle.$$
Notice that here we have used that the basis is orthonormal: we said that the matrix of $\langle\cdot,\cdot\rangle$ in this basis is the identity.
(Uniqueness) Let $T^*$, $T'$ be two adjoints. Then we have
$$\langle u, (T^* - T')v\rangle = 0$$
for all $u, v \in V$. In particular, let $u = (T^* - T')v$; then $\|(T^* - T')v\| = 0$, hence $T^*(v) = T'(v)$ for all $v \in V$. Therefore $T^* = T'$.
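A quick numerical check (ours, not from the notes) that the conjugate transpose really does act as the adjoint with respect to the standard Hermitian inner product on $\mathbb{C}^n$:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A_star = A.conj().T                               # matrix of the adjoint T*

u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)

inner = lambda x, y: x @ np.conj(y)               # <x, y>, conjugate-linear in y
print(np.isclose(inner(A @ u, v), inner(u, A_star @ v)))   # True: <Tu, v> = <u, T*v>
```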
Example 3.2 Consider $V = \mathbb{C}^2$ with the standard orthonormal basis and let $T$ be represented by
$$A = \begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix}.$$
Then $T^* = T$ (such a linear map is called self-adjoint).
Notice that $T$ being self-adjoint is equivalent to the matrix representing it (in an orthonormal basis) being Hermitian.
On the other hand, if $T$ is represented by
$$A = \begin{pmatrix} 2i & 1+i \\ 1+i & i \end{pmatrix}$$
then $T^* \neq T$.
We also see that $T^{**} = T$ (using that $T^*$ is represented by $\bar{A}^t$).
4 Isometries.
Theorem 4.1 Let $T : V \to V$ be a linear map of an inner product space $V$. Then the following are equivalent.
(i) $T^*T = \mathrm{Id}$ (i.e. $T^* = T^{-1}$).
(ii) $\forall u, v \in V,\ \langle Tu, Tv\rangle = \langle u, v\rangle$ (i.e. $T$ preserves the inner product).
(iii) $\forall v \in V,\ \|Tv\| = \|v\|$ (i.e. $T$ preserves the norm).
Definition 4.1 If T satisfies any of the above (and so all of them) then T
is called an isometry.
Proof. (i) $\Rightarrow$ (ii): Let $u, v \in V$; then
$$\langle Tu, Tv\rangle = \langle u, T^*Tv\rangle = \langle u, v\rangle,$$
since $T^* = T^{-1}$.
(ii) $\Rightarrow$ (iii): If $v \in V$ then
$$\|Tv\|^2 = \langle Tv, Tv\rangle,$$
so by (ii)
$$\|Tv\|^2 = \langle v, v\rangle = \|v\|^2.$$
Hence $\|Tv\| = \|v\|$, so (iii) holds.
(iii) $\Rightarrow$ (ii): We first show that the form can be recovered from the norm. We have
$$2\Re\langle u, v\rangle = \|u + v\|^2 - \|u\|^2 - \|v\|^2, \qquad \Im\langle v, w\rangle = \Re\langle v, iw\rangle.$$
For the second equality, notice that
$$2\Re\langle v, iw\rangle = \langle v, iw\rangle + \overline{\langle v, iw\rangle} = -i\langle v, w\rangle + i\overline{\langle v, w\rangle} = \frac{1}{i}\big(\langle v, w\rangle - \overline{\langle v, w\rangle}\big) = 2\Im\langle v, w\rangle.$$
Now suppose $\|Tv\| = \|v\|$ for all $v$. Take $u, v \in V$. We have $\|T(u+v)\| = \|u+v\|$, $\|T(u)\| = \|u\|$, and $\|T(v)\| = \|v\|$. It follows that
$$2\Re\langle T(u), T(v)\rangle = \|T(u)+T(v)\|^2 - \|T(u)\|^2 - \|T(v)\|^2 = \|u+v\|^2 - \|u\|^2 - \|v\|^2 = 2\Re\langle u, v\rangle,$$
and the second equality shows that
$$\Im\langle T(u), T(v)\rangle = \Re\langle T(u), iT(v)\rangle = \Re\big(\langle T(u), T(iv)\rangle\big) = \Re\big(\langle u, iv\rangle\big) = \Im\langle u, v\rangle.$$
Hence
$$\langle T(u), T(v)\rangle = \langle u, v\rangle.$$
(ii) $\Rightarrow$ (i):
$$\langle T^*Tu, v\rangle = \langle Tu, Tv\rangle = \langle u, v\rangle.$$
Therefore $\langle (T^*T - I)u, v\rangle = 0$ for all $v$. In particular, take $v = (T^*T - I)u$; then $\langle (T^*T - I)u, (T^*T - I)u\rangle = 0$. Therefore $T^*T = I$.
Notice that in an orthonormal basis, an isometry is represented by a matrix $A$ such that $\bar{A}^t = A^{-1}$.
We let $O_n(\mathbb{R})$ be the set of $n \times n$ real matrices satisfying $AA^t = I_n$ (in other words $A^t = A^{-1}$). If $A \in O_n(\mathbb{R})$ then $\det A = \pm 1$. Indeed, $A^t = A^{-1}$, so
$$\det A = \det A^t = \det(A^{-1}) = (\det A)^{-1}.$$
Therefore $(\det A)^2 = 1$ and $\det A = \pm 1$.
Theorem 4.2 The following are equivalent.
(i) $A \in O_n(\mathbb{R})$.
(ii) The columns of $A$ form an orthonormal basis for $\mathbb{R}^n$ (for the standard inner product on $\mathbb{R}^n$).
(iii) The rows of $A$ form an orthonormal basis for $\mathbb{R}^n$.
Proof. We prove (i) $\Leftrightarrow$ (ii) (the proof of (i) $\Leftrightarrow$ (iii) is identical).
Consider $A^tA$. If $A = [C_1, \dots, C_n]$, so the $j$th column of $A$ is $C_j$, then the $(i, j)$th entry of $A^tA$ is $C_i^tC_j$. So
$$A^tA = I_n \iff C_i^tC_j = \delta_{i,j} \iff \langle C_i, C_j\rangle = \delta_{i,j} \iff \{C_1, \dots, C_n\} \text{ is an orthonormal basis for } \mathbb{R}^n.$$
For example take the matrix:
$$\begin{pmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{pmatrix}.$$
This matrix is in $O_2(\mathbb{R})$.
In fact it is the matrix of rotation by angle $\pi/4$.
Theorem 4.3 Let $V$ be a Euclidean space with orthonormal basis $E = \{e_1, \dots, e_n\}$. If $F = \{f_1, \dots, f_n\}$ is a basis for $V$ and $P$ is the transition matrix from $E$ to $F$, then
$$P \in O_n(\mathbb{R}) \iff F \text{ is an orthonormal basis for } V.$$
Proof. The $j$th column of $P$ is $[f_j]_E$, so
$$f_j = \sum_{k=1}^n p_{k,j}e_k.$$
Hence
$$\langle f_i, f_j\rangle = \Big\langle \sum_{k=1}^n p_{k,i}e_k, \sum_{l=1}^n p_{l,j}e_l\Big\rangle = \sum_{k=1}^n\sum_{l=1}^n p_{k,i}p_{l,j}\langle e_k, e_l\rangle = \sum_{k=1}^n p_{k,i}p_{k,j} = (P^tP)_{i,j}.$$
So $F$ is an orthonormal basis for $V$ $\iff$ $\langle f_i, f_j\rangle = \delta_{i,j}$ $\iff$ $P^tP = I_n$ $\iff$ $P \in O_n(\mathbb{R})$.
Notice that it is NOT true that matrices in $O_n(\mathbb{R})$ are diagonalisable (over $\mathbb{R}$).
Indeed, take
$$\begin{pmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{pmatrix}$$
where $\theta$ is not a multiple of $\pi$.
The characteristic polynomial is $x^2 - 2\cos(\theta)x + 1$. Then, as $\cos(\theta)^2 - 1 < 0$, there are no real eigenvalues and the matrix is not diagonalisable over $\mathbb{R}$.
Notice that for a given matrix $A$, it is easy to check whether the columns are orthonormal. If that is the case, then $A$ is in $O_n(\mathbb{R})$ and it is easy to calculate its inverse: $A^{-1} = A^t$.
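A sketch of that check in NumPy (our own example, using the rotation matrix from the example above):

```python
import numpy as np

theta = np.pi / 4
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])      # rotation by pi/4, which lies in O_2(R)

print(np.allclose(A.T @ A, np.eye(2)))               # True: the columns are orthonormal
print(np.allclose(np.linalg.inv(A), A.T))            # True: A^{-1} = A^t
print(np.isclose(abs(np.linalg.det(A)), 1.0))        # True: det A = +-1
```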
5 Orthogonal Diagonalisation.
Definition 5.1 Let $V$ be an inner product space. A linear map $T : V \to V$ is self-adjoint if
$$T^* = T.$$
Notice that in an orthonormal basis, $T$ is represented by a matrix $A$ such that $\bar{A}^t = A$. In particular if $V$ is real, then $A$ is symmetric.
Theorem 5.1 If $A \in M_n(\mathbb{C})$ is Hermitian then all the eigenvalues of $A$ are real.
Proof. Recall that Hermitian means that $\bar{A}^t = A$ and that this implies that $\langle Au, v\rangle = \langle u, Av\rangle$ for all $u, v$ (with respect to the standard Hermitian inner product). Let $\lambda$ be an eigenvalue of $A$ and let $v \neq 0$ be a corresponding eigenvector. Then
$$Av = \lambda v.$$
It follows that
$$\lambda\langle v, v\rangle = \langle Av, v\rangle = \langle v, Av\rangle = \bar{\lambda}\langle v, v\rangle.$$
As $v \neq 0$, we have $\langle v, v\rangle = \|v\|^2 \neq 0$, hence we can divide by it. It follows that $\lambda = \bar{\lambda}$, i.e. $\lambda$ is real.
In particular a real symmetric matrix always has a real eigenvalue: take a complex eigenvalue (one always exists!); then by the above theorem it is real.
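As an illustration (ours, not part of the notes), NumPy confirms this for the Hermitian matrix of Example 3.2; `eigvalsh` assumes a Hermitian input and returns real eigenvalues.

```python
import numpy as np

A = np.array([[1, -1j],
              [1j, 1]])                 # Hermitian: conj(A).T == A
print(np.allclose(A.conj().T, A))       # True
print(np.linalg.eigvalsh(A))            # [0., 2.]: real eigenvalues, as Theorem 5.1 predicts
```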
Theorem 5.2 (Spectral theorem) Let $T : V \to V$ be a self-adjoint linear map of an inner product space $V$. Then $V$ has an orthonormal basis of eigenvectors.
Proof. This is rather similar to Theorem 5.4.
We use induction on $\dim(V) = n$. The result is true for $n = 1$, so suppose it holds for $n - 1$ and let $\dim(V) = n$.
Since $T$ is self-adjoint, if $E$ is an orthonormal basis for $V$ and $A$ is the matrix representing $T$ in $E$ then
$$\bar{A}^t = A.$$
So $A$ is Hermitian. Hence by Theorem 5.1, $A$ has a real eigenvalue $\lambda$.
So there is a vector $e_1 \in V \setminus \{0\}$ such that $Te_1 = \lambda e_1$. Normalizing (dividing by $\|e_1\|$) we can assume that $\|e_1\| = 1$.
Let $W = \mathrm{Span}\{e_1\}$; then by Theorem 2.4 we have $V = W \oplus W^{\perp}$. Now
$$n = \dim(V) = \dim(W) + \dim(W^{\perp}) = 1 + \dim(W^{\perp}),$$
so $\dim(W^{\perp}) = n - 1$.
We claim that $T : W^{\perp} \to W^{\perp}$, i.e. $T(W^{\perp}) \subseteq W^{\perp}$. Let $w = \mu e_1 \in W$ with $\mu \in K$, and let $v \in W^{\perp}$. Then
$$\langle w, Tv\rangle = \langle T^*w, v\rangle = \langle Tw, v\rangle = \langle T(\mu e_1), v\rangle = \langle \mu Te_1, v\rangle = \mu\lambda\langle e_1, v\rangle = 0,$$
since $e_1 \in W$ and $v \in W^{\perp}$. Hence $T : W^{\perp} \to W^{\perp}$.
By induction there exists an orthonormal basis of eigenvectors $\{e_2, \dots, e_n\}$ for $W^{\perp}$. But $V = W \oplus W^{\perp}$, so $E = \{e_1, \dots, e_n\}$ is a basis for $V$, with $\langle e_1, e_i\rangle = 0$ for $2 \leq i \leq n$ and $\|e_1\| = 1$. Hence $E$ is an orthonormal basis of eigenvectors for $V$.
Theorem 5.3 Let $T : V \to V$ be a self-adjoint linear map of an inner product space $V$. If $\lambda, \mu$ are distinct eigenvalues of $T$, with eigenspaces $V_1(\lambda)$ and $V_1(\mu)$, then
$$\forall u \in V_1(\lambda),\ \forall v \in V_1(\mu), \quad \langle u, v\rangle = 0.$$
Proof. If $u \in V_1(\lambda)$ and $v \in V_1(\mu)$ then
$$\lambda\langle u, v\rangle = \langle \lambda u, v\rangle = \langle Tu, v\rangle = \langle u, Tv\rangle = \langle u, \mu v\rangle = \bar{\mu}\langle u, v\rangle = \mu\langle u, v\rangle,$$
using that $\mu$ is real. So $(\lambda - \mu)\langle u, v\rangle = 0$, with $\lambda \neq \mu$. Hence $\langle u, v\rangle = 0$.
Example 5.4 Let
$$A = \begin{pmatrix} 1 & -i \\ i & 1 \end{pmatrix}.$$
This matrix is self-adjoint.
One calculates the characteristic polynomial and finds $t(t - 2)$ (in particular the minimal polynomial is the same, hence you know that the matrix is diagonalisable for other reasons than being self-adjoint). For eigenvalue $0$, one finds eigenvector
$$\begin{pmatrix} i \\ 1 \end{pmatrix}.$$
For eigenvalue $2$, one finds
$$\begin{pmatrix} -i \\ 1 \end{pmatrix}.$$
Then we normalise the vectors: $v_1 = \frac{1}{\sqrt{2}}\begin{pmatrix} i \\ 1 \end{pmatrix}$ and $v_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} -i \\ 1 \end{pmatrix}$. We let
$$P = \frac{1}{\sqrt{2}}\begin{pmatrix} i & -i \\ 1 & 1 \end{pmatrix}$$
and then
$$P^{-1}AP = \begin{pmatrix} 0 & 0 \\ 0 & 2 \end{pmatrix}.$$
In general the procedure for orthogonal diagonalisation is as follows.
Let $A$ be an $n \times n$ self-adjoint matrix.
Find the eigenvalues $\lambda_i$ and eigenspaces $V_1(\lambda_i)$. Because $A$ is diagonalisable, you will have:
$$V = V_1(\lambda_1) \oplus \dots \oplus V_1(\lambda_r).$$
Choose a basis for $V$ as a union of bases of the $V_1(\lambda_i)$. Apply Gram–Schmidt to it to get an orthonormal basis. (By Theorem 5.3, eigenvectors for distinct eigenvalues are already orthogonal, so Gram–Schmidt only mixes vectors within each eigenspace and the resulting orthonormal basis still consists of eigenvectors.)
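As an aside (not part of the notes), for a real symmetric matrix this whole procedure is performed by `numpy.linalg.eigh`, which returns the eigenvalues together with a matrix whose columns form an orthonormal basis of eigenvectors:

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                              # a random symmetric (self-adjoint) matrix

eigenvalues, P = np.linalg.eigh(A)             # columns of P: an orthonormal basis of eigenvectors
print(np.allclose(P.T @ P, np.eye(4)))         # True: P is orthogonal
print(np.allclose(P.T @ A @ P, np.diag(eigenvalues)))   # True: P^t A P = P^{-1} A P is diagonal
```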
For example:
$$A = \begin{pmatrix} 1 & 2 & 2 \\ 2 & 4 & 4 \\ 2 & 4 & 4 \end{pmatrix}.$$
This matrix is symmetric hence self-adjoint.
One calculates the characteristic polynomial and finds $\lambda^2(\lambda - 9)$ (up to sign).
For $V_1(9)$, one finds $v_1 = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix}$. To make this orthonormal, divide by the norm, i.e. replace $v_1$ by $\frac{1}{3}v_1$.
For $V_1(0)$, one finds $V_1(0) = \mathrm{Span}(v_2, v_3)$ with
$$v_2 = \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix} \quad\text{and}\quad v_3 = \begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix}.$$
By the Gram–Schmidt process we replace $v_2$ by
$$\frac{1}{\sqrt{5}}\begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}$$
and $v_3$ by
$$\frac{1}{3\sqrt{5}}\begin{pmatrix} -2 \\ -4 \\ 5 \end{pmatrix}.$$
Let
$$P = \begin{pmatrix} 1/3 & -2/\sqrt{5} & -2/(3\sqrt{5}) \\ 2/3 & 1/\sqrt{5} & -4/(3\sqrt{5}) \\ 2/3 & 0 & 5/(3\sqrt{5}) \end{pmatrix}.$$
We have
$$P^{-1}AP = \begin{pmatrix} 9 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
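A numerical check (ours) of this worked example: the matrix $P$ above is orthogonal, so $P^{-1} = P^t$, and $P^tAP = \mathrm{diag}(9, 0, 0)$.

```python
import numpy as np

A = np.array([[1.0, 2.0, 2.0],
              [2.0, 4.0, 4.0],
              [2.0, 4.0, 4.0]])

s5 = np.sqrt(5.0)
P = np.array([[1/3, -2/s5, -2/(3*s5)],
              [2/3,  1/s5, -4/(3*s5)],
              [2/3,  0.0,   5/(3*s5)]])

print(np.allclose(P.T @ P, np.eye(3)))                     # True: P is orthogonal, so P^{-1} = P^t
print(np.allclose(P.T @ A @ P, np.diag([9.0, 0.0, 0.0])))  # True: P^{-1} A P = diag(9, 0, 0)
```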