The Elements of

Real Analysis

Second Edition

Robert G. Bartle

Professor of Mathematics
University of Illinois
Urbana-Champaign

John Wiley & Sons, New York · London · Sydney · Toronto
Copyright © 1964, 1976 by John Wiley & Sons, Inc.

All rights reserved. Published simultaneously in Canada.

No part of this book may be reproduced by any means, nor
transmitted, nor translated into a machine language without
the written permission of the publisher.

Library of Congress Cataloging in Publication Data:


Bartle, Robert Gardner, 1927-
The elements of real analysis.
Bibliography: p.
Includes index.
1. Mathematical analysis. I. Title.
QA300.B29 1975 515'.8 75-15979

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1
To my parents
Preface

At one time an undergraduate student of advanced mathematics was
expected to develop technique in solving problems that involved considerable
computation; however, he was not expected to master “theoretical
subtleties” such as uniform convergence or uniform continuity. He was
required to be able to use the Implicit Function Theorem, but not to know
its hypotheses. This situation has changed; it is now considered important
that all advanced students of mathematics—future mathematicians, com-
puter scientists, physicists, engineers, or economists—grasp the basic
theoretical nature of the subject. They will then understand both the
power and the limitation of the general theory more fully.
This textbook developed from my experience in teaching real analysis at
the University of Illinois since 1955. My audience often ranges from
unusually well-prepared freshmen to graduate students. Most of them are
usually not mathematics majors, but they have studied at least the equival-
ent of three semesters of (nonrigorous) calculus, including partial deriva-
tives, multiple integrals, line integrals, and infinite series. It is desirable
for all of the students to have a semester of linear or modern algebra to
prepare the way for this course in which analytic theorems are proved.
However, since many of the students I encounter do not possess this
background, I begin the study of analysis with a few algebraic proofs to
start them on their way.
In this edition, I introduce the algebraic and order properties of the real
number system in Sections 4 and 5 in a simpler manner than was used in
the first edition. In addition, I introduce the definitions of a vector space
and a normed space in Section 8, since these notions occur so frequently in
modern mathematics. I also shortened several sections to make the material
more readily available and to provide additional flexibility in using this
book as a text. I added many new exercises and projects but tried to keep
the book at the same level of sophistication as the first edition. There have
been only minor changes in the first part of the book. However, since
experience has shown that the discussion of differentiation and integration
in R^p was too brief in the first edition, I assembled the theory of functions
of one variable into a single chapter, and expanded considerably the
treatment of functions of several variables.
In Sections 1 to 3, I present the set-theoretic terminology and notation
that is employed subsequently and introduce a few basic concepts. However,
these sections do not give a systematic presentation of set theory.
(Such a presentation is not needed or desirable at this stage.) These
sections should be examined briefly and consulted later if necessary. The
text really starts with Section 4, and Section 6 introduces “analysis.” It is
possible to cover Sections 4 to 12, 14 to 17, 20 to 24.1, and most of 27 to
31 in one semester. I would exercise the instructor’s prerogative and
briefly introduce certain other topics (such as series) at the expense of
soft-pedaling (or even omitting) various results that are not essential to the
later material. Since the entire text provides a little more material than
can usually be covered in one year at this level, the instructor probably will
limit discussion of some sections. However, it will be useful to the student
to have the additional material available for future reference. Most of the
topics generally associated with courses in “advanced calculus” are dealt
with here. The main exception is the subject of line and surface integrals and
Stokes’s Theorem; this topic is not discussed, since an intuitive treatment is
properly a part of calculus, and a rigorous treatment requires a rather
extensive discussion in order to be useful.

[Diagram: logical dependence of the sections. Recoverable labels: Sections 4 to 10; 27; 14, 15, 16; 29, 30, 31.]

The logical dependence of the various sections of this textbook is indicated
by the adjoining diagram. A solid line in this diagram indicates a
dependence on the preceding section; a dotted line indicates a slight
dependence. All definitions, theorems, corollaries, and lemmas, for in-
stance, are numbered consecutively according to the section number. I
assigned names to the more important theorems whenever a name seemed
appropriate. The proofs are set off from the text by the heading PROOF and
end with Q.E.D.
It is not possible to over-emphasize the importance of the exercises and
projects; only by applying serious and concerted efforts to their solution
can one hope to master the material in this book. The projects develop a
specific topic in a connected sequence of exercises; we believe they convey
to the student at least a taste of the pleasure (and torment!) of doing
research in mathematics. I hope that no student will fail to try his hand on
several of these projects, for I believe that they are a particularly valuable
feature of this book.
In writing this book, I have drawn from my classroom experience and
have been influenced by many sources. I benefited by discussions with
students and colleagues and, since the publication of the first edition, I
have had an extensive correspondence with students and teachers at other
institutions. I thank everyone who has made comments and suggestions.
Their interest in improving the book encouraged me to undertake this
revision. Professors K. W. Anderson, W. G. Bade, and A. L. Peressini
read the manuscript of the first edition and offered useful suggestions. I
particularly thank my colleague, Professor B. C. Berndt, for his extensive
and incisive comments and corrections. I am also grateful to Carolyn J.
Bloemker for her patience and painstaking typing of the revised manus-
cript under a variety of circumstances. Finally, I appreciate the
assistance and cooperation of the Wiley staff.

23 June 1975 Robert G. Bartle


Urbana-Champaign, Illinois
Chapter Summaries

Introduction: A Glimpse at Set Theory

1. The Algebra of Sets
Equality of sets, intersection, union, Cartesian product
2. Functions
Tabular representation, transformations, restrictions and extensions,
composition, injective and inverse functions, surjective
and bijective functions, direct and inverse images
3. Finite and Infinite Sets
Finite, countable, and uncountable sets, the uncountability of R
and I

I. The Real Numbers

4. The Algebraic Properties of R
The field properties of R, irrationality of √2
5. The Order Properties of R
Order properties, absolute value
6. The Completeness Property of R
Suprema and infima, Archimedean Property, the existence of √2
7. Cuts, Intervals, and the Cantor Set
The Cut Property, cells and intervals, Nested Property, the
Cantor set, models for R

II. The Topology of Cartesian Spaces

8. Vector and Cartesian Spaces
Vector spaces, inner product spaces, normed spaces, the Schwarz
Inequality, the Cartesian space R^p
9. Open and Closed Sets
Open sets, closed sets, neighborhoods
10. The Nested Cells and Bolzano-Weierstrass Theorem
Nested Cells Theorem, cluster points, Bolzano-Weierstrass
Theorem
11. The Heine-Borel Theorem
Compactness, the Heine-Borel Theorem, Cantor Intersection
Theorem, Lebesgue Covering Theorem
12. Connected Sets
The connectedness of intervals in R, polygonally connected open
sets are connected, connected sets in R are intervals
13. The Complex Number System
Definition and elementary properties

III. Convergence

14. Introduction to Sequences
Convergence, uniqueness of the limit, examples
15. Subsequences and Combinations
Subsequences, algebraic combinations of sequences
16. Two Criteria for Convergence
Monotone Convergence Theorem, Bolzano-Weierstrass
Theorem, Cauchy sequences, the Cauchy Criterion
17. Sequences of Functions
Convergence, uniform convergence, the uniform norm, Cauchy
Criterion for Uniform Convergence
18. The Limit Superior
The limit superior and inferior of a sequence in R, unbounded
sequences, infinite limits
19. Some Extensions
Order of magnitude, Cesàro summation, double sequences,
iterated limits

IV. Continuous Functions

20. Local Properties of Continuous Functions
Continuity at a point and on a set, the Discontinuity Criterion,
combinations of functions
21. Linear Functions
Linear functions, matrix representation, the norm
22. Global Properties of Continuous Functions
Global Continuity Theorem, Preservation of Compactness,
Preservation of Connectedness, Continuity of the Inverse Function
Theorem, bounded continuous functions
23. Uniform Continuity and Fixed Points
Uniform continuity, Lipschitz condition, Fixed Point Theorem
for Contractions, Brouwer Fixed Point Theorem
24. Sequences of Continuous Functions
Interchange of limit and continuity, approximation by step and
piecewise linear functions, Bernstein polynomials, the Bernstein
and Weierstrass Approximation Theorems
25. Limits of Functions
Deleted and non-deleted limits, the deleted and non-deleted
limit inferior, semi-continuity
26. Some Further Results
The Stone and Stone-Weierstrass Approximation Theorems,
Polynomial Approximation Theorem, Tietze’s Extension
Theorem, equicontinuity, Arzelà-Ascoli Theorem

V. Functions of One Variable

27. The Mean Value Theorem
The derivative, Interior Maximum Theorem, Rolle’s Theorem,
Mean Value Theorem
28. Further Applications of the Mean Value Theorem
Applications, L’Hospital’s Rules, interchange of limit and
derivative, Taylor’s Theorem
29. The Riemann-Stieltjes Integral
Riemann-Stieltjes sums and the integral, Cauchy Criterion for
Integrability, properties of the integral, integration by parts,
modification of the integral
30. Existence of the Integral
Riemann Criterion for Integrability, the integrability of continuous
functions, Mean Value Theorems, Differentiation Theorem,
Fundamental Theorem of Integral Calculus, Change of Variable
Theorem
31. Further Properties of the Integral
Interchange of limit and integral, Bounded Convergence
Theorem, Monotone Convergence Theorem, integral form of the
remainder, integrals depending on a parameter, Leibniz’s formula,
Interchange Theorem, Riesz Representation Theorem
32. Improper and Infinite Integrals
Improper integrals of unbounded functions, infinite integrals,
Cauchy Criterion, Comparison Test, Limit Comparison Test,
Dirichlet’s Test, absolute convergence
33. Uniform Convergence and Infinite Integrals
Cauchy Criterion for uniform convergence, Weierstrass M-Test,
Dirichlet’s Test, infinite integrals depending on a parameter,
Dominated Convergence Theorem, iterated infinite integrals

VI. Infinite Series

34. Convergence of Infinite Series
Convergence of series, Cauchy Criterion, absolute convergence,
Rearrangement Theorem
35. Tests for Absolute Convergence
Comparison Test, Limit Comparison Test, Root Test, Ratio
Test, Raabe’s Test, Integral Test
36. Further Results for Series
Abel’s Lemma, Dirichlet’s Test, Abel’s Test, Alternating Series
Test, double series, Cauchy multiplication
37. Series of Functions
Absolute and uniform convergence, Cauchy Criterion, Weierstrass
M-Test, Dirichlet’s Test, Abel’s Test, power series,
Cauchy-Hadamard Theorem, Differentiation Theorem, Uniqueness
Theorem, Multiplication Theorem, Bernstein’s Theorem,
Abel’s Theorem, Tauber’s Theorem
38. Fourier Series
Bessel’s Inequality, Riemann-Lebesgue Lemma, Pointwise Convergence
Theorem, Uniform Convergence Theorem, Norm Convergence
Theorem, Parseval’s Equality, Fejér’s Theorem,
Weierstrass Approximation Theorem

VII. Differentiation in R^p

39. The Derivative in R^p
Partial derivatives, directional derivatives, the derivative of
f : R^p → R^q, the Jacobian
40. The Chain Rule and Mean Value Theorems
Chain Rule, Mean Value Theorem, interchange of the order of
differentiation, higher derivatives, Taylor’s Theorem
41. Mapping Theorems and Implicit Functions
Class C^1, Approximation Lemma, Injective Mapping Theorem,
Surjective Mapping Theorem, Open Mapping Theorem, Inversion
Theorem, Implicit Function Theorem, Parametrization
Theorem, Rank Theorem
42. Extremum Problems
Relative extrema, Second Derivative Test, extremum problems
with constraints, Lagrange’s Theorem, inequality constraints

VIII. Integration in R^p

43. The Integral in R^p
Content zero, Riemann sums and the integral, Cauchy Criterion,
properties of the integral, Integrability Theorem
44. Content and the Integral
Sets with content, characterization of the content function,
further properties of the integral, Mean Value Theorem, iterated
integrals
45. Transformation of Sets and Integrals
Images of sets with content under C^1 maps, transformations by
linear maps, transformations by non-linear maps, the Jacobian
Theorem, Change of Variables Theorem, polar and spherical
coordinates, strong form of the Change of Variables Theorem

References

Hints for Selected Exercises

Index
INTRODUCTION: A GLIMPSE AT SET THEORY

The idea of a set is basic to all of mathematics, and all mathematical
objects and constructions ultimately go back to set theory. In view of the
fundamental importance of set theory, we shall present here a brief résumé
of the set-theoretic notions that will be used frequently in this text.
However, since the aim of this book is to present the elements (rather than
the foundations) of real analysis, we adopt a rather pragmatic and naive
point of view. We shall be content with an informal discussion and shall
regard the word “set” as understood and synonymous with the words
“class,” “collection,” “aggregate,” and “ensemble.” No attempt will be
made to define these terms or to present a list of axioms for set theory. A
reader who is sophisticated enough to be troubled by our informal develop-
ment should consult the references on set theory that are given at the end
of this text. There he will learn how this material can be put on an
axiomatic basis. He will find this axiomatization to be an interesting
development in the foundations of mathematics. However, since we re-
gard it to be outside the subject area of the present book, we shall not go
through the details here.
The reader is strongly urged to read this introduction quickly to absorb
the notations we shall employ. Unlike the later chapters, which must be
studied, this introduction is to be considered background material. One
should not spend much time on it.

Section 1. The Algebra of Sets


If A denotes a set and if x is an element, it is often convenient to write
x ∈ A
as an abbreviation for the statement that x is an element of A, or that x is a
member of the set A, or that the set A contains the element x, or that x is
in A. We shall not examine the nature of this property of being an element
of a set any further. For most purposes it is possible to employ the naive
meaning of ‘‘membership,” and an axiomatic characterization of this rela-
tion is not necessary.
If A is a set and x is an element which does not belong to A, we shall
often write
x ∉ A.

In accordance with our naive conception of a set, we shall require that
exactly one of the two possibilities
x ∈ A,  x ∉ A,
holds for an element x and a set A.


If A and B are two sets and x is an element, then there are, in principle,
four possibilities (see Figure 1.1):
(1) x ∈ A and x ∈ B;  (2) x ∈ A and x ∉ B;
(3) x ∉ A and x ∈ B;  (4) x ∉ A and x ∉ B.
If the second case cannot occur (that is, if every element of A is also an
element of B), then we shall say that A is contained in B, or that B
contains A, or that A is a subset of B, and we shall write
A ⊆ B or B ⊇ A.
If A ⊆ B and there exists an element in B which is not in A, we say that A
is a proper subset of B.
It should be noted that the statement that A ⊆ B does not automatically
preclude the possibility that A exhausts all of B. When this is true the sets
A and B are “equal” in the sense we now define.
1.1 DEFINITION. Two sets are equal if they contain the same
elements. If the sets A and B are equal, we write A = B.

Figure 1.1

Thus in order to show that the sets A and B are equal we must show that the
possibilities (2) and (3) mentioned above cannot occur. Equivalently, we must
show that both A ⊆ B and B ⊆ A.
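
As an informal illustration for finite sets, the double-inclusion test can be sketched in Python. The helper names is_subset and sets_equal and the sample sets are our own choices for this sketch; they are not notation used elsewhere in the text.

```python
# Hypothetical sample sets; any finite sets would serve.
A = {1, 2, 3}
B = {3, 2, 1, 1}  # listing order and repetition do not matter in a set


def is_subset(X, Y):
    """Return True if every element of X is also an element of Y."""
    return all(x in Y for x in X)


def sets_equal(X, Y):
    """Equality by double inclusion: X is a subset of Y and Y is a subset of X."""
    return is_subset(X, Y) and is_subset(Y, X)


# The double-inclusion test agrees with Python's built-in set equality.
print(sets_equal(A, B), A == B)
```

Checking the two inclusions separately mirrors the way such equalities are proved in the text: one inclusion at a time.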

The word “property” is not easy to define precisely. However, we shall
not hesitate to use it in the usual (informal) fashion. If P denotes a
property that is meaningful for a collection of elements, then we agree to
write

{x : P(x)}
for the set of all elements x for which the property P holds. We usually
read this as “the set of all x such that P(x).” It is often worthwhile to
specify which elements we are testing for the property P. Hence we shall
often write

{x ∈ S : P(x)}
for the subset of S for which the property P holds.

EXAMPLES. (a) If N = {1, 2, 3, ...} denotes the set of natural numbers,
then the set
{x ∈ N : x² − 3x + 2 = 0}
consists of those natural numbers satisfying the stated equation. Now the
only solutions of the quadratic equation x² − 3x + 2 = 0 are x = 1 and
x = 2. Hence, instead of writing the above expression (since we have detailed
information concerning all of the elements in the set under examination)
we shall ordinarily denote this set by {1, 2} thereby listing the elements of
the set.
(b) Sometimes a formula can be used to abbreviate the description of a
set. For example, the set of all even natural numbers could be denoted by
{2x : x ∈ N}, instead of the more cumbersome {y ∈ N : y = 2x, x ∈ N}.
(c) The set {x ∈ N : 6 < x < 9} can be written explicitly as {7, 8}, thereby
exhibiting the elements of the set. Of course, there are many other
possible descriptions of this set. For example:
{x ∈ N : 40 < x² < 80},
{x ∈ N : x² − 15x + 56 = 0},
{7 + x : x = 0 or x = 1}.

(d) In addition to the set of natural numbers (consisting of the elements
denoted by 1, 2, 3, ...) which we shall systematically denote by N, there
are a few other sets for which we introduce a standard notation. The set of
integers is
Z = {0, 1, −1, 2, −2, 3, −3, ...}.

The set of rational numbers is

Q = {m/n : m, n ∈ Z and n ≠ 0}.


We shall treat the sets N, Z, and Q as if they are well understood and shall
not re-examine their properties in much detail. Of basic importance for
our later study is the set R of all real numbers which will be examined in
Sections 4-6. A particular subset of R that will be useful is the unit
interval
I = {x ∈ R : 0 < x < 1}.
Finally, we denote the set of complex numbers by C. A more detailed
definition of C and a brief description of some of its properties will be
given in Section 13.
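
The set-builder descriptions in Examples (a) to (c) can be imitated by Python set comprehensions. Since N is infinite, the sketch below searches only up to an arbitrary bound; the name BOUND and its value are assumptions of the sketch, chosen large enough to contain every solution that occurs here.

```python
# Assumption: a finite search bound standing in for the infinite set N.
BOUND = 100
N = range(1, BOUND + 1)

# Example (a): {x in N : x^2 - 3x + 2 = 0}
example_a = {x for x in N if x**2 - 3*x + 2 == 0}

# Example (b): {2x : x in N} and the more cumbersome {y in N : y = 2x, x in N},
# both truncated by the bound.
evens = {2 * x for x in N}
evens_long = {y for y in range(1, 2 * BOUND + 1) if any(y == 2 * x for x in N)}

# Example (c): four descriptions of the same set.
example_c = {x for x in N if 6 < x < 9}
c_alt1 = {x for x in N if 40 < x**2 < 80}
c_alt2 = {x for x in N if x**2 - 15*x + 56 == 0}
c_alt3 = {7 + x for x in (0, 1)}

print(example_a, example_c, c_alt1, c_alt2, c_alt3)
```

The comprehension `{x for x in N if P(x)}` plays the role of {x ∈ N : P(x)}, while `{2 * x for x in N}` corresponds to describing a set by a formula.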

Set Operations
We now introduce some methods of constructing new sets from given ones.
1.2 DEFINITION. If A and B are sets, then their intersection is the set of
all elements that belong to both A and B. We shall denote the
intersection of the sets A, B by the symbol A ∩ B, which is read “A
intersect B.” (See Figure 1.2.)
1.3 DEFINITION. If A and B are sets, then their union is the set of all
elements which belong either to A or to B or to both A and B. We shall
denote the union of the sets A, B by the symbol A ∪ B, which is read “A
union B.” (See Figure 1.2.)
We could also define A ∩ B and A ∪ B by
A ∩ B = {x : x ∈ A and x ∈ B},
A ∪ B = {x : x ∈ A or x ∈ B}.

In connection with the latter, it is important to realize that the word “or” is
being used in the inclusive sense that is customary in mathematics and
logic. In legal terminology this inclusive sense is sometimes indicated by
“and/or.”

We have tacitly assumed that the intersection and the union of two sets is again a
set. Among other things this requires that there must exist a set which has no elements
at all (for if A and B have no common elements, their intersection has no elements).

1.4 DEFINITION. The set which has no elements is called the empty or
the void set and will be denoted by the symbol ∅. If A and B are sets with no
common elements (that is, if A ∩ B = ∅), then we say that A and B are disjoint
or that they are non-intersecting.
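
For finite sets, Definitions 1.2 to 1.4 can be tried out directly. The following Python sketch uses sample sets of our own choosing and contrasts the inclusive “or” of the union with the exclusive “or”, which is shown only for comparison.

```python
# Hypothetical sample sets for illustration.
A = {1, 2, 3}
B = {3, 4}
C = {7, 8}

candidates = list(A) + list(B)

# Intersection and union written out from the set-builder descriptions.
intersection = {x for x in A if x in B}
union = {x for x in candidates if x in A or x in B}          # inclusive "or"
exclusive = {x for x in candidates if (x in A) != (x in B)}  # exclusive "or"

# Python's built-in operators give the same sets.
print(intersection == (A & B), union == (A | B))

# A and C have no common elements, so their intersection is the empty set.
print(A & C == set(), A.isdisjoint(C))
```

Note that the element 3, which lies in both A and B, belongs to the union but not to the set built with the exclusive “or”.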

Figure 1.2. The intersection and union of two sets.

The next result gives some of the algebraic properties of the operations on
sets that we have just defined. Since the proofs of these assertions are
routine, we shall leave most of them to the reader as exercises.

1.5 THEOREM. Let A, B, C, be any sets, then

(a) A ∩ A = A,  A ∪ A = A;
(b) A ∩ B = B ∩ A,  A ∪ B = B ∪ A;
(c) (A ∩ B) ∩ C = A ∩ (B ∩ C),  (A ∪ B) ∪ C = A ∪ (B ∪ C);
(d) A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C),
    A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C).

These equalities are sometimes referred to as the idempotent, the commutative,
the associative, and the distributive properties, respectively, of the
operations of intersection and union of sets.
In order to give a sample proof, we shall prove the first equation in (d). Let x be an
element of A ∩ (B ∪ C), then x ∈ A and x ∈ B ∪ C. This means that x ∈ A, and either
x ∈ B or x ∈ C. Hence we either have (i) x ∈ A and x ∈ B, or we have (ii) x ∈ A and
x ∈ C. Therefore, either x ∈ A ∩ B or x ∈ A ∩ C, so x ∈ (A ∩ B) ∪ (A ∩ C). This
shows that A ∩ (B ∪ C) is a subset of (A ∩ B) ∪ (A ∩ C).
Conversely, let y be an element of (A ∩ B) ∪ (A ∩ C). Then, either (iii) y ∈ A ∩ B,
or (iv) y ∈ A ∩ C. It follows that y ∈ A, and either y ∈ B or y ∈ C. Therefore, y ∈ A
and y ∈ B ∪ C so that y ∈ A ∩ (B ∪ C). Hence (A ∩ B) ∪ (A ∩ C) is a subset of
A ∩ (B ∪ C). In view of Definition 1.1, we conclude that the sets A ∩ (B ∪ C) and
(A ∩ B) ∪ (A ∩ C) are equal.
As an indication of an alternate method, we note that there are, in principle, a total of
8 (= 2³) possibilities for an element x relative to three sets A, B, C (see Figure 1.3);
namely:

(1) x ∈ A, x ∈ B, x ∈ C;  (2) x ∈ A, x ∈ B, x ∉ C;
(3) x ∈ A, x ∉ B, x ∈ C;  (4) x ∈ A, x ∉ B, x ∉ C;
(5) x ∉ A, x ∈ B, x ∈ C;  (6) x ∉ A, x ∈ B, x ∉ C;
(7) x ∉ A, x ∉ B, x ∈ C;  (8) x ∉ A, x ∉ B, x ∉ C.

The proof consists in showing that both sides of the first equation in (d) contain those
and only those elements x belonging to the cases (1), (2), or (3).
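
The case-by-case method just described lends itself to a mechanical check. In the Python sketch below, each of the 8 cases is encoded as a triple of truth values recording whether x belongs to A, to B, and to C; the function names lhs and rhs are our own labels for the two sides of the first equation in (d).

```python
from itertools import product

# Each triple (a, b, c) records whether x is in A, in B, in C.
# product() yields (True, True, True), (True, True, False), ...,
# (False, False, False), so the cases are numbered 1 to 8 in that order.
cases = list(product([True, False], repeat=3))


def lhs(a, b, c):
    """Membership of x in A ∩ (B ∪ C)."""
    return a and (b or c)


def rhs(a, b, c):
    """Membership of x in (A ∩ B) ∪ (A ∩ C)."""
    return (a and b) or (a and c)


in_lhs = [i for i, case in enumerate(cases, start=1) if lhs(*case)]
in_rhs = [i for i, case in enumerate(cases, start=1) if rhs(*case)]

# Both sides pick out the same cases, namely the first three.
print(in_lhs, in_rhs)
```

The same enumeration, with lhs and rhs changed, would serve for the other identities in Theorem 1.5.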

In view of the relations in Theorem 1.5(c), we usually drop the parentheses
and write merely
A ∩ B ∩ C,  A ∪ B ∪ C.

It is possible to show that if {A_1, A_2, ..., A_n} is a collection of sets, then there
is a uniquely defined set A consisting of all elements which belong to at least
one of the sets A_j, j = 1, 2, ..., n; and there exists a uniquely defined set B
consisting of all elements which belong to all of the sets A_j, j = 1, 2, ..., n.
Dropping the use of parentheses, we write

A = A_1 ∪ A_2 ∪ ··· ∪ A_n,  B = A_1 ∩ A_2 ∩ ··· ∩ A_n.

Figure 1.3

Sometimes, in order to save space, we mimic the notation used in calculus for
sums and employ a more condensed notation, such as

A = ⋃_{j=1}^{n} A_j = ⋃{A_j : j = 1, 2, ..., n},

B = ⋂_{j=1}^{n} A_j = ⋂{A_j : j = 1, 2, ..., n}.

Similarly, if for each j in a set J there is a set A_j, then ⋃{A_j : j ∈ J} denotes
the set of all elements which belong to at least one of the sets A_j. In the same
way, ⋂{A_j : j ∈ J} denotes the set of all elements which belong to all of the sets
A_j for j ∈ J.
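
For a finite index set J the indexed union and intersection can be computed directly. The family used in the Python sketch below (A_j taken to be the multiples of j between 1 and 12) is a made-up example, and the sketch assumes that J is non-empty.

```python
from functools import reduce

# Hypothetical indexed family: A_j = multiples of j among 1, 2, ..., 12.
J = [2, 3, 4]
family = {j: {k for k in range(1, 13) if k % j == 0} for j in J}

# Union over J: elements belonging to at least one of the sets A_j.
union_all = set().union(*family.values())

# Intersection over J: elements belonging to all of the sets A_j.
# (reduce needs at least one set, hence the assumption that J is non-empty.)
intersection_all = reduce(lambda X, Y: X & Y, family.values())

print(sorted(union_all), sorted(intersection_all))
```

Folding the binary operation over the family is exactly the "dropping of parentheses" permitted by the associative property in Theorem 1.5(c).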
We now introduce another method of constructing a new set from two given
ones.
1.6 DEFINITION. If A and B are sets, then the complement of B
relative to A is the set of all elements of A which do not belong to B. We
shall denote this set by A \ B (read “A minus B”), although the related
notations A − B and A ~ B are sometimes used by other authors. (See
Figure 1.4.)
In the notation introduced above, we have

A \ B = {x ∈ A : x ∉ B}.

Sometimes the set A is understood and does not need to be mentioned
explicitly. In this situation we refer simply to the complement of B and
denote A \ B by 𝒞(B).
Returning to Figure 1.1, we note that the elements x which satisfy (1)
belong to A ∩ B; those which satisfy (2) belong to A \ B; and those which
satisfy (3) belong to B \ A. We shall now show that A is the union of the sets
A ∩ B and A \ B.

Figure 1.4. The relative complement.

1.7 THEOREM. The sets A ∩ B and A \ B are non-intersecting and
A = (A ∩ B) ∪ (A \ B).
PROOF. Suppose x ∈ A ∩ B and x ∈ A \ B. The latter asserts that x ∈ A
and x ∉ B which contradicts the relation x ∈ A ∩ B. Hence the sets are
disjoint.
If x ∈ A, then either x ∈ B or x ∉ B. In the former case x ∈ A and x ∈ B so
that x ∈ A ∩ B. In the latter situation, x ∈ A and x ∉ B so that x ∈ A \ B.
This shows that A is a subset of (A ∩ B) ∪ (A \ B). Conversely, if y ∈
(A ∩ B) ∪ (A \ B), then either y ∈ A ∩ B, or y ∈ A \ B. In either case we
have y ∈ A, showing that (A ∩ B) ∪ (A \ B) is a subset of A. Q.E.D.
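
A concrete instance may help to fix the ideas. The Python sketch below uses two sample sets of our own choosing, names the three regions corresponding to cases (1), (2), and (3) of Figure 1.1, and checks the two assertions of Theorem 1.7 for them. A check on one example is, of course, no substitute for the proof.

```python
# Hypothetical sample sets.
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

region1 = A & B  # case (1): elements in both A and B
region2 = A - B  # case (2): the complement of B relative to A
region3 = B - A  # case (3): the complement of A relative to B

# The relative complement written out as {x in A : x not in B}.
relative_complement = {x for x in A if x not in B}

# Theorem 1.7 for these sets: the two pieces are disjoint and make up A.
pieces_disjoint = region1.isdisjoint(region2)
pieces_cover_A = (region1 | region2) == A

print(region1, region2, region3, pieces_disjoint, pieces_cover_A)
```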
We shall now state the De Morgan† laws for three sets; a more general
formulation will be given in the exercises.

1.8 THEOREM. If A, B, C, are any sets, then

A \ (B ∪ C) = (A \ B) ∩ (A \ C),
A \ (B ∩ C) = (A \ B) ∪ (A \ C).
PROOF. We shall carry out a demonstration of the first relation, leaving
the second one to the reader. To establish the equality of the sets, we show
that every element in A \ (B ∪ C) is contained in both (A \ B) and
(A \ C) and conversely.
If x is in A \ (B ∪ C), then x is in A but x is not in B ∪ C. Hence x is in A,
but x is neither in B nor in C. (Why?) Therefore, x is in A but not B, and x is
in A but not C. That is, x ∈ A \ B and x ∈ A \ C, showing that x ∈
(A \ B) ∩ (A \ C).
Conversely, if x ∈ (A \ B) ∩ (A \ C), then x ∈ (A \ B) and x ∈ (A \ C).
Thus x ∈ A and both x ∉ B and x ∉ C. It follows that x ∈ A and x ∉ (B ∪ C),
so that x ∈ A \ (B ∪ C).
Since the sets (A \ B) ∩ (A \ C) and A \ (B ∪ C) contain the same elements,
they are equal by Definition 1.1. Q.E.D.
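
Both relations in Theorem 1.8 can be tested exhaustively on a small universe. The Python sketch below runs through every triple of subsets of a three-element set; the universe U and the helper subsets are assumptions made for this sketch. Such a finite check illustrates the laws but does not replace the proof, which holds for arbitrary sets.

```python
from itertools import combinations, product


def subsets(universe):
    """Return a list of all subsets of a finite set."""
    items = list(universe)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Hypothetical small universe; it has 8 subsets, hence 8 * 8 * 8 = 512 triples.
U = {0, 1, 2}
all_subsets = subsets(U)

first_law = all(
    A - (B | C) == (A - B) & (A - C)
    for A, B, C in product(all_subsets, repeat=3)
)
second_law = all(
    A - (B & C) == (A - B) | (A - C)
    for A, B, C in product(all_subsets, repeat=3)
)

print(first_law, second_law)
```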

Cartesian Product

We now define the Cartesian‡ product of two sets.

1.9 DEFINITION. If A and B are two non-void sets, then the Cartesian
product A × B of A and B is the set of all ordered pairs (a, b) with a ∈ A and
b ∈ B. (See Figure 1.5.)

† AUGUSTUS DE MORGAN (1806-1873) taught at University College, London. He was a
mathematician and logician and helped prepare the way for modern mathematical logic.
‡ RENÉ DESCARTES (1596-1650), the creator of analytic geometry, was a French gentleman,
soldier, mathematician, and one of the greatest philosophers of all time.

Figure 1.5. The Cartesian product.

(The definition just given is somewhat informal as we have not defined what is
meant by an “ordered pair.” We shall not examine the matter further except to
mention that the ordered pair (a, b) could be defined to be the set whose sole
elements are {a}, {a, b}. It can then be shown that the ordered pairs (a, b) and (a′, b′)
are equal if and only if a = a′ and b = b′. This is the fundamental property of
ordered pairs.)
Thus if A = {1, 2, 3} and B = {4, 5}, then the set A × B is the set whose
elements are the ordered pairs

(1, 4), (1, 5), (2, 4), (2, 5), (3, 4), (3, 5).
We may visualize the set A x B as the set of six points in the plane with the
coordinates which we have just listed.
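
The example above, together with the set-theoretic definition of an ordered pair mentioned in the parenthetical remark, can be sketched in Python. The function name kuratowski is our own label for the construction of (a, b) as the set whose elements are {a} and {a, b}; Python's built-in tuples are used for the product itself.

```python
from itertools import product

A = {1, 2, 3}
B = {4, 5}

# The Cartesian product as a set of ordered pairs (tuples).
AxB = set(product(A, B))
print(sorted(AxB))


def kuratowski(a, b):
    """Model the ordered pair (a, b) as the set {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


# The order of the members matters, unlike in the set {a, b}.
print(kuratowski(1, 2) == kuratowski(2, 1))  # the two pairs differ
print({1, 2} == {2, 1})                      # the two sets coincide
```

The frozenset type is used because ordinary Python sets cannot be elements of other sets.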
We often draw a diagram (such as Figure 1.5) to indicate the Cartesian
product of two sets A, B. However, it should be realized that this diagram
may be somewhat of a simplification. For example, if A =
{x ∈ R : 1 < x < 2} and B = {x ∈ R : 0 < x < 1 or 2 < x < 3}, then instead of
a rectangle, we should have a drawing like Figure 1.6.

Figure 1.6. The Cartesian product.

Exercises

1.A. Draw a diagram to represent each of the sets mentioned in Theorem 1.5.
1.B. Prove part (c) of Theorem 1.5.
1.C. Prove the second part of (d) of Theorem 1.5.
1.D. Prove that A ⊆ B if and only if A ∩ B = A.
1.E. Show that the set D of all elements that belong either to A or to B but not to
both is given by
D = (A \ B) ∪ (B \ A).

This set D is often called the symmetric difference of A and B. Represent it by a
diagram.
1.F. Show that the symmetric difference D, defined in the preceding exercise, is
also given by D = (A ∪ B) \ (A ∩ B).
1.G. If B ⊆ A, show that B = A \ (A \ B).
1.H. If A and B are any sets, show that A ∩ B = A \ (A \ B).
1.I. If {A_1, A_2, ..., A_n} is a collection of sets, and if E is any set, show that

E ∩ ⋃_{j=1}^{n} A_j = ⋃_{j=1}^{n} (E ∩ A_j),  E ∪ ⋃_{j=1}^{n} A_j = ⋃_{j=1}^{n} (E ∪ A_j).

1.J. If {A_1, A_2, ..., A_n} is a collection of sets, and if E is any set, show that

E ∩ ⋂_{j=1}^{n} A_j = ⋂_{j=1}^{n} (E ∩ A_j),  E ∪ ⋂_{j=1}^{n} A_j = ⋂_{j=1}^{n} (E ∪ A_j).

1.K. Let E be a set and {A_1, A_2, ..., A_n} be a collection of sets. Establish the De
Morgan laws:

E \ ⋂_{j=1}^{n} A_j = ⋃_{j=1}^{n} (E \ A_j),  E \ ⋃_{j=1}^{n} A_j = ⋂_{j=1}^{n} (E \ A_j).

Note that if E \ A_j is denoted by 𝒞(A_j), these relations take the form

𝒞(⋂_{j=1}^{n} A_j) = ⋃_{j=1}^{n} 𝒞(A_j),  𝒞(⋃_{j=1}^{n} A_j) = ⋂_{j=1}^{n} 𝒞(A_j).

1.L. Let J be any set and, for each j ∈ J, let A_j be contained in X. Show that

𝒞(⋂{A_j : j ∈ J}) = ⋃{𝒞(A_j) : j ∈ J},
𝒞(⋃{A_j : j ∈ J}) = ⋂{𝒞(A_j) : j ∈ J}.
1.M. If B_1 and B_2 are subsets of B and if B = B_1 ∪ B_2, then

A × B = (A × B_1) ∪ (A × B_2).

Section 2. Functions
We now turn to a discussion of the fundamental notion of a function or
mapping. It will be seen that a function is a special kind of a set, although
there are other visualizations which are often suggestive. All of the later
sections will be concerned with various types of functions, but they will
usually be of less abstract nature than considered in the present introductory
section.
To the mathematician of a century ago the word “function” ordinarily
meant a definite formula, such as

f(x) = x² + 3x − 5,

which associates to each real number x another real number f(x). The fact
that certain formulas, such as

g(x) = √(x − 5),

do not give rise to real numbers for all real values of x was, of course,
well-known but was not regarded as sufficient grounds to require an
extension of the notion of function. Probably one could arouse controversy
among those mathematicians as to whether the absolute value

h(x) = |x|
of a real number is an “honest function” or not. For, after all, the definition
of |x| is given “in pieces” by

|x| =  x, if x ≥ 0,
|x| = −x, if x < 0.
As mathematics developed, it became increasingly clear that the require-
ment that a function be a formula was unduly restrictive and that a more
general definition would be useful. It also became evident that it is important
to make a clear distinction between the function itself and the values of the
function. The reader probably finds himself in the position of the mathematician
of a century ago in these two respects due to no fault of his own. We
propose to bring him up to date with the current usage, but we shall do so in
two steps. Our first revised definition of a function would be:
A function f from a set A to a set B is a rule of correspondence that assigns
to each x in a certain subset D of A, a uniquely determined element f(x)
of B.

Certainly, the explicit formulas of the type mentioned above are included in
this tentative definition. The proposed definition allows the possibility that
the function might not be defined for certain elements of A and also allows the
consideration of functions for which the sets A and B are not necessarily real
numbers (but might even be desks and chairs—or cats and dogs).
However suggestive the proposed definition may be, it has a significant
defect: it is not clear. There remains the difficulty of interpreting the phrase
“rule of correspondence.” Doubtless the reader can think of phrases that
will satisfy him better than the above one, but it is not likely that he can dispel
the fog entirely. The most satisfactory solution seems to be to define
“function” entirely in terms of sets and the notions introduced in the
preceding section. This has the disadvantage of being more artificial and
losing some of the intuitive content of the earlier description, but the gain in
clarity outweighs these disadvantages.
The key idea is to think of the graph of the function: that is, a collection of
ordered pairs. We notice that an arbitrary collection of ordered pairs
cannot be the graph of a function, for once the first member of the ordered
pair is named, the second is uniquely determined.

2.1 DEFINITION. Let A and B be sets (which are not necessarily
distinct). A function from A to B is a set f of ordered pairs in A × B with
the property that if (a, b) and (a, b′) are elements of f, then b = b′. The set
of all elements of A that can occur as first members of elements in f is
called the domain of f and will be denoted by D(f). The set of all elements of
B that can occur as second members of elements in f is called the range of f
(or the set of values of f) and will be denoted by R(f). In case D(f) = A,
we often say that f maps A into B (or is a mapping of A into B) and write
f: A → B.

If (a, b) is an element of a function f, then it is customary to write

b = f(a)    or    f: a ↦ b

Figure 2.1. A function as a graph.


instead of (a, b) ∈ f. We often refer to the element b as the value of f at the
point a, or the image under f of the point a.
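For a finite set of ordered pairs, the condition in Definition 2.1 can be checked mechanically. The following is only an illustrative sketch: the helper names `is_function`, `domain`, and `range_of` are hypothetical and do not come from the text.

```python
def is_function(pairs):
    """Return True if no first member is paired with two different second members."""
    seen = {}
    for a, b in pairs:
        if a in seen and seen[a] != b:
            return False  # (a, b) and (a, b') with b != b' both occur
        seen[a] = b
    return True


def domain(f):
    """D(f): all first members of pairs in f."""
    return {a for a, _ in f}


def range_of(f):
    """R(f): all second members of pairs in f."""
    return {b for _, b in f}


f = {(1, "x"), (2, "y"), (3, "x")}   # a function: each first member occurs once
not_f = {(1, "x"), (1, "y")}         # not a function: 1 is paired with x and y

print(is_function(f), is_function(not_f))
print(domain(f), range_of(f))
```

Note that `f` takes the value "x" twice; this is allowed by the definition, which only forbids one first member from having two different second members.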

Tabular Representation

One way of visualizing a function is as a graph. Another way which is
important and widely used is as a table. Consider Table 2.1, which might be
found in the sports page of the Foosland Bugle-Gazette.
The domain of this free-throw function f consists of the nine players
D(f) ={Anderson, Bade, Bateman, Hochschild, Kakutani,
Kovalevsky, Osborn, Peressini, Rosenberg},

while the range of the function consists of the six numbers

R(f) ={0, 1, 2, 4, 5, 8}.

The actual elements of the function are ordered pairs


(Anderson, 2), (Bade, 0), (Bateman, 5),
(Hochschild, 1), (Kakutani, 4), (Kovalevsky, 8),
(Osborn, 0), (Peressini, 2), (Rosenberg, 4).
In such tabular representations, we ordinarily write down only the domain of
the function in the left-hand column (for there is no need to mention the
members of the team that did not play). We could say that the value of this
free-throw function f at Anderson is 2 and write f(Anderson) = 2, or
Anderson ↦ 2, and so on.
We are all familiar with such use of tables to convey information. They are
important examples of functions and are usually of a nature that would be
difficult to express in terms of a formula.
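A table of this kind corresponds closely to a dictionary in a programming language. The sketch below stores the free-throw function using the ordered pairs listed above; the variable names are illustrative only.

```python
# The free-throw function f, stored as a table (player -> free throws made).
free_throws = {
    "Anderson": 2, "Bade": 0, "Bateman": 5,
    "Hochschild": 1, "Kakutani": 4, "Kovalevsky": 8,
    "Osborn": 0, "Peressini": 2, "Rosenberg": 4,
}

players = set(free_throws)            # the domain D(f)
values = set(free_throws.values())    # the range R(f)
pairs = set(free_throws.items())      # f itself, as a set of ordered pairs

print(len(players), sorted(values))
print(free_throws["Anderson"])        # the value of f at Anderson
```

The range has only six elements although the domain has nine, because several players share the same value.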

TABLE 2.1

Player        Free Throws Made
Anderson      2
Bade          0
Bateman       5
Hochschild    1
Kakutani      4
Kovalevsky    8
Osborn        0
Peressini     2
Rosenberg     4
Figure 2.2. A function as a transformation.

Transformations and Machines

There is another way of visualizing a function: as a transformation of part of
the set A into part of B. In this phraseology, when (a, b) ∈ f, we think of f as
taking the element a from the subset D(f) of A and “transforming” or
“mapping” it into an element b = f(a) in the subset R(f) of B. We often draw
a diagram such as Figure 2.2. We frequently use this geometrical representation
of a function even when the sets A and B are not subsets of the plane.
There is another way of visualizing a function: namely, as a machine which
will accept elements of D(f) as inputs and yield corresponding elements of
R(f) as outputs. If we take an element x from D(f) and put it into f, then out
comes the corresponding value f(x). If we put a different element y of D(f)
into f, we get f(y) (which may or may not differ from f(x)). If we try to insert
something which does not belong to D(f) into f, we find that it is not accepted,
for f can operate only on elements belonging to D(f). (See Figure 2.3.)

Figure 2.3. A function as a machine.
This last visualization makes clear the distinction between f and f(x): the
first is the machine, the second is the output of the machine when we put x into
it. Certainly it is useful to distinguish between a machine and its outputs.
Only a fool would confuse a meat grinder with ground meat; however, enough
people have confused functions with their values that it is worthwhile to make
a modest effort to distinguish between them notationally.

Restrictions and Extensions of Functions

If f is a function with domain D(f) and D₁ is a subset of D(f), it is often useful
to define a new function f₁ with domain D₁ by f₁(x) = f(x) for all x ∈ D₁.
This function f₁ is called the restriction of f to the set D₁. In terms of
Definition 2.1, we have
f₁ = {(a, b) ∈ f : a ∈ D₁}.
Sometimes we write f₁ = f | D₁ to denote the restriction of the function f to the
set D₁.
A similar construction (that appears less artificial) is the notion of an
“extension.” If g is a function with domain D(g) and D₂ ⊇ D(g), then any
function g₂ with domain D₂ such that g₂(x) = g(x) for all x ∈ D(g) is called an
extension of g to the set D₂.
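For functions with finite domains, restriction and extension can be illustrated directly. The sketch below is an assumption-laden illustration: it represents a function as a Python dictionary, and the helpers `restrict` and `is_extension` are hypothetical names.

```python
def restrict(f, d1):
    """The restriction f | D1: keep only the pairs (a, b) of f with a in D1."""
    return {a: b for a, b in f.items() if a in d1}


def is_extension(g2, g):
    """True if g2 agrees with g at every point of D(g), so that D(g2) contains D(g)."""
    return all(a in g2 and g2[a] == g[a] for a in g)


F = {x: x * x for x in range(-3, 4)}   # x -> x^2 on {-3, ..., 3}
f1 = restrict(F, {0, 1, 2, 3})         # restriction to the non-negative points

print(f1)
print(is_extension(F, f1))             # F extends its own restriction
```

A restriction is uniquely determined by f and D₁, whereas a given function generally has many different extensions to a larger set.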

Composition of Functions

We now want to “compose” two functions by first applying f to each x in
D(f) and then applying g to f(x) whenever possible (that is, when f(x) belongs
to D(g)). In doing so, some care needs to be exercised concerning the
domain of the resulting function. For example, if f is defined on R by
f(x) = x³ and if g is defined for x ≥ 0 by g(x) = √x, then the composition
g∘f can be defined only for x ≥ 0, and for these real numbers it is to have
the value √(x³).

2.2 DEFINITION. Let f be a function with domain D(f) in A and range
R(f) in B and let g be a function with domain D(g) in B and range R(g) in C.
(See Figure 2.4.) The composition g∘f (note the order!) is the function
from A to C given by
g∘f = {(a, c) ∈ A × C : there exists an element b ∈ B
such that (a, b) ∈ f and (b, c) ∈ g}.

2.3 THEOREM. If f and g are functions, the composition g∘f is a
function with
D(g∘f) = {x ∈ D(f) : f(x) ∈ D(g)},
R(g∘f) = {g(f(x)) : x ∈ D(g∘f)}.
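The description of D(g∘f) and R(g∘f) in Theorem 2.3 can be seen at work on finite functions. In this sketch the functions are stored as dictionaries and `compose` is a hypothetical helper, not notation from the text.

```python
def compose(g, f):
    """g o f: defined exactly at those x in D(f) for which f(x) lies in D(g)."""
    return {x: g[f[x]] for x in f if f[x] in g}


f = {1: "a", 2: "b", 3: "c"}       # D(f) = {1, 2, 3}
g = {"a": 10, "b": 20, "z": 30}    # D(g) = {"a", "b", "z"}

gf = compose(g, f)
print(gf)                  # 3 is dropped because f(3) = "c" is not in D(g)
print(set(gf))             # D(g o f)
print(set(gf.values()))    # R(g o f)
```

Here D(g∘f) is a proper subset of D(f), which is the situation the theorem is designed to handle.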

Figure 2.4. Composition of functions.

2.4 EXAMPLES. (a) Let f, g be functions whose values at the real number
x are the real numbers given by†

f(x) = 2x,    g(x) = 3x² − 1.

Since D(g) is the set R of all real numbers and R(f) ⊆ D(g), the domain
D(g∘f) is also R and g∘f(x) = 3(2x)² − 1 = 12x² − 1. On the other hand,
D(f∘g) = R, but f∘g(x) = 2(3x² − 1) = 6x² − 2.
(b) If h is the function with D(h) = {x ∈ R : x ≥ 1} defined by

h(x) = √(x − 1),

and if f is as in part (a), then D(h∘f) = {x ∈ R : 2x ≥ 1} = {x ∈ R : x ≥ 1/2} and
h∘f(x) = √(2x − 1). Also D(f∘h) = {x ∈ R : x ≥ 1} and f∘h(x) = 2√(x − 1).
If g is the function in part (a), then D(h∘g) = {x ∈ R : 3x² − 1 ≥ 1} =
{x ∈ R : x ≤ −√(2/3) or x ≥ √(2/3)} and h∘g(x) = √(3x² − 2). Also D(g∘h) =
{x ∈ R : x ≥ 1} and g∘h(x) = 3x − 4. (Note that the formula expressing g∘h
has meaning for values of x other than those in the domain of g∘h.)
(c) Let F, G be the functions with domains D(F) = {x ∈ R : x ≥ 0}
and D(G) = R, such that the values of F and G at a point x in their
domains are

F(x) = √x,    G(x) = −x² − 1.

Then D(G∘F) = {x ∈ R : x ≥ 0} and G∘F(x) = −x − 1, whereas D(F∘G) =
{x ∈ D(G) : G(x) ∈ D(F)}. This last set is void as G(x) < 0 for all x ∈ D(G).
Hence the function F∘G is not defined at any point, so F∘G is the “void
function.”
† We also denote this by writing f: x ↦ 2x and g: x ↦ 3x² − 1 for x ∈ R.
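The formulas and domains in Examples 2.4(a) and (b) can be spot-checked numerically. The sketch below is only an illustration: it models "x is not in the domain" by raising `ValueError`, and the helper `in_domain` is a hypothetical name.

```python
import math


def f(x):
    return 2 * x


def g(x):
    return 3 * x**2 - 1


def h(x):
    """h(x) = sqrt(x - 1), defined only for x >= 1."""
    if x < 1:
        raise ValueError("x is not in D(h)")
    return math.sqrt(x - 1)


def in_domain(func, x):
    """True if func accepts x, False if it rejects x as outside its domain."""
    try:
        func(x)
        return True
    except ValueError:
        return False


def h_of_f(x):
    return h(f(x))


# (a): g o f(x) = 12x^2 - 1 and f o g(x) = 6x^2 - 2 on a few integer points.
print(all(g(f(x)) == 12 * x**2 - 1 for x in range(-5, 6)))
print(all(f(g(x)) == 6 * x**2 - 2 for x in range(-5, 6)))

# (b): D(h o f) = {x : x >= 1/2}.
print(in_domain(h_of_f, 0.5), in_domain(h_of_f, 0.4))
```

A finite spot check of this kind is of course not a proof; it only confirms that the stated formulas agree at the sampled points.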
Injective and Inverse Functions


We now give a way of constructing a new function from a given one in case
the original function does not take on the same value twice.

2.5 DEFINITION. Let f be a function with domain D(f) in A and range
R(f) in B. We say that f is injective or one-one if, whenever (a, b) and
(a′, b) are elements of f, then a = a′. If f is injective we may say that f is
an injection.
In other words, f is injective if and only if the two relations f(a) = b, f(a′) = b
imply that a = a′. Alternatively, f is injective if and only if when a, a′ are
in D(f) and a ≠ a′, then f(a) ≠ f(a′).
We claim that if f is injective from A to B, then the set of ordered pairs in
B × A obtained by interchanging the first and second members of ordered
pairs in f yields a function g which is also injective.
We omit the proof of this assertion, leaving it as an exercise; it is a good test
for the reader. The connections between f and g are:
D(g) = R(f),    R(g) = D(f),
(a, b) ∈ f if and only if (b, a) ∈ g.
This last statement can be written in the more usual form:

b = f(a) if and only if a = g(b).


2.6 DEFINITION. Let f be an injection with domain D(f) in A and range
R(f) in B. If g = {(b, a) ∈ B × A : (a, b) ∈ f}, then g is an injection with
domain D(g) = R(f) in B and with range R(g) = D(f) in A. The function g
is called the function inverse to f and is denoted by f⁻¹.
The inverse function can be interpreted from the mapping point of view. (See
Figure 2.5.) If f is injective, it maps distinct elements of D(f) into distinct elements
of R(f). Thus, each element b of R(f) is the image under f of a unique element a in
D(f). The inverse function f⁻¹ maps the element b into this unique element a.
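For a function with a finite domain, the construction in Definition 2.6 amounts to swapping the members of each ordered pair. The following sketch uses dictionaries; `is_injective` and `inverse` are hypothetical helper names.

```python
def is_injective(f):
    """True if f never takes the same value at two different points."""
    return len(set(f.values())) == len(f)


def inverse(f):
    """Swap first and second members; only meaningful when f is injective."""
    if not is_injective(f):
        raise ValueError("f is not injective, so it has no inverse function")
    return {b: a for a, b in f.items()}


F = {x: x * x for x in range(-2, 3)}   # not injective: F(-2) = F(2) = 4
f = {x: x * x for x in range(0, 3)}    # the restriction of F to {0, 1, 2}

print(is_injective(F), is_injective(f))
g = inverse(f)
print(g)
print(all(g[f[a]] == a for a in f))    # b = f(a) if and only if a = g(b)
```

The same squaring rule gives an injective or a non-injective function depending on the domain chosen, which is why the domain must be part of the data.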

Figure 2.5. The inverse function.


2.7 EXAMPLES. (a) Let F: x ↦ x² be the function with domain
D(F) = R, the set of all real numbers, and range in R such that the value of F
at the real number x is F(x) = x². (In other words, F is the function
{(x, x²) : x ∈ R}.) It is readily seen that F is not one-one; in fact, the
ordered pairs (2, 4), (−2, 4) both belong to F. Since F is not one-one, it
does not have an inverse.
(b) Let f be the function with domain D(f) = {x ∈ R : x ≥ 0} and R(f) ⊆
R whose value at x in D(f) is f(x) = x². Note that f is the restriction to
D(f) of the function F in part (a). In terms of ordered pairs, f =
{(x, x²) : x ∈ R, x ≥ 0}. Unlike the function F in part (a), f is injective, for if
x² = y² with x, y in D(f), then x = y. (Why?) Therefore, f has an
inverse function g with D(g) = R(f) = {x ∈ R : x ≥ 0} and R(g) = D(f) =
{x ∈ R : x ≥ 0}. Furthermore, y = x² = f(x) if and only if x = g(y). This
inverse function g is ordinarily called the positive square root function and
is denoted by
g(y) = √y,    y ∈ R, y ≥ 0.
(c) If f₁ is the function {(x, x²) : x ∈ R, x ≤ 0}, then as in (b), f₁ is one-one and
has domain D(f₁) = {x ∈ R : x ≤ 0} and range R(f₁) = {x ∈ R : x ≥ 0}. Note
that f₁ is the restriction to D(f₁) of the function F of part (a). The function
g₁ inverse to f₁ is called the negative square root function and is denoted by

g₁(y) = −√y,    y ∈ R, y ≥ 0,

so that g₁(y) ≤ 0.
(d) The sine function F introduced in trigonometry with D(F) = R and
R(F) = {y ∈ R : −1 ≤ y ≤ +1} is well known not to be injective (for example,
sin 0 = sin 2π = 0). However, if we let f be its restriction to the set D(f) =
{x ∈ R : −π/2 ≤ x ≤ +π/2}, then f is injective. It therefore has an inverse
function g with D(g) = R(f) and R(g) = D(f). Also, y = sin x with x ∈
D(f) if and only if x = g(y). The function g is called the (principal branch
of the) inverse sine function and is often denoted by

g(y) = Arc sin y    or    g(y) = Sin⁻¹ y.
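The role of the restricted domain in part (d) can be observed numerically. The sketch below uses `math.asin` from the Python standard library, which returns values in the interval from −π/2 to +π/2; the wrapper name `arc_sin` is an illustrative choice.

```python
import math


def arc_sin(y):
    """Principal branch of the inverse sine, defined for -1 <= y <= 1."""
    if not -1.0 <= y <= 1.0:
        raise ValueError("y is not in D(g) = R(f)")
    return math.asin(y)


inside = 0.3    # lies in D(f), the interval from -pi/2 to +pi/2
outside = 2.0   # lies outside D(f), since pi/2 is about 1.5708

# On D(f), g undoes the sine: g(sin x) = x.
print(math.isclose(arc_sin(math.sin(inside)), inside))

# Outside D(f) it does not: sin 2 = sin(pi - 2), and pi - 2 is the point of D(f)
# with that sine value, so g(sin 2) = pi - 2 rather than 2.
print(math.isclose(arc_sin(math.sin(outside)), math.pi - outside))
```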

Surjective and Bijective Functions


2.8 DEFINITION. Let f be a function with domain D(f) ⊆ A and range
R(f) ⊆ B. We say that f is surjective, or that f maps onto B, in case the
range R(f) = B. If f is surjective, we may say that f is a surjection.
In defining a function it is important to specify the domain of the function
and the set in which the values are taken. Once this has been done it is
possible to inquire whether or not the function is surjective.
2.9 DEFINITION. A function f with domain D(f) ⊆ A and range R(f) ⊆
B is said to be bijective if it is injective (that is, it is one-one) and
surjective (that is, it maps D(f) onto B). If f is bijective, we may say that f
is a bijection.

Direct and Inverse Images


Let f be an arbitrary function with domain D(f) in A and range R(f) in B.
We do not assume that f is injective.

2.10 DEFINITION. If E is a subset of A, then the direct image of E under
f is the subset of R(f) given by
{f(x) : x ∈ E ∩ D(f)}.
We usually denote the direct image of a set E under f by the notation f(E).
(See Figure 2.6.)
It will be observed that if E ∩ D(f) = ∅, then f(E) = ∅. If E contains
the single point p in D(f), then the set f(E) contains the single point
f(p). Certain properties of sets are preserved under the direct image, as
we now show.
2.11 THEOREM. Let f be a function with domain in A and range in B and
let E, F be subsets of A.
(a) If E ⊆ F, then f(E) ⊆ f(F).    (b) f(E ∩ F) ⊆ f(E) ∩ f(F).
(c) f(E ∪ F) = f(E) ∪ f(F).        (d) f(E \ F) ⊆ f(E).
PROOF. (a) If x ∈ E, then x ∈ F and hence f(x) ∈ f(F). Since this is
true for all x ∈ E, we infer that f(E) ⊆ f(F).
(b) Since E ∩ F ⊆ E, it follows from part (a) that f(E ∩ F) ⊆ f(E);
likewise, f(E ∩ F) ⊆ f(F). Therefore, we conclude that f(E ∩ F) ⊆
f(E) ∩ f(F).

Figure 2.6. Direct images.


(c) Since E ⊆ E ∪ F and F ⊆ E ∪ F, it follows from part (a) that f(E) ∪
f(F) ⊆ f(E ∪ F). Conversely, if y ∈ f(E ∪ F), then there exists an element
x ∈ E ∪ F such that y = f(x). Since x ∈ E or x ∈ F, it follows that either
y = f(x) ∈ f(E) or that y ∈ f(F). Therefore, we conclude that f(E ∪ F) ⊆
f(E) ∪ f(F), which completes the proof of part (c).
(d) Part (d) follows immediately from (a). Q.E.D.
It will be seen in Exercise 2.J that, in general, it is not possible to replace the
inclusion sign in (b) by equality.
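A small finite example shows why the inclusion in part (b) can be strict while part (c) is an equality. The sketch below stores a function as a dictionary; `direct_image` is a hypothetical helper, and the sets E and F are chosen for illustration.

```python
def direct_image(f, E):
    """f(E) = {f(x) : x in E and x in D(f)}."""
    return {f[x] for x in E if x in f}


f = {x: x * x for x in range(-2, 3)}   # x -> x^2 on {-2, ..., 2}
E = {-2, -1}
F = {1, 2}

print(direct_image(f, E & F))                      # E and F are disjoint
print(direct_image(f, E) & direct_image(f, F))     # yet their images coincide
print(direct_image(f, E | F) == direct_image(f, E) | direct_image(f, F))
```

The strictness comes from f not being injective: distinct points of E and F share the same values.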
We now introduce the notion of the inverse image of a set under a function.
Note that it is not required that the function be injective.
2.12 DEFINITION. If H is a subset of B, then the inverse image of H
under f is the subset of D(f) given by
{x : f(x) ∈ H}.
We usually denote the inverse image of a set H under f by the symbol f⁻¹(H).
(See Figure 2.7.)
Once again, we emphasize that f need not be injective so that the inverse function f⁻¹
need not exist. (However, if f⁻¹ does exist, then f⁻¹(H) is the direct image of H
under f⁻¹.)

2.13 THEOREM. Let f be a function with domain in A and range in B and
let G, H be subsets of B.
(a) If G ⊆ H, then f⁻¹(G) ⊆ f⁻¹(H).    (b) f⁻¹(G ∩ H) = f⁻¹(G) ∩ f⁻¹(H).
(c) f⁻¹(G ∪ H) = f⁻¹(G) ∪ f⁻¹(H).      (d) f⁻¹(G \ H) = f⁻¹(G) \ f⁻¹(H).
PROOF. (a) Suppose that x ∈ f⁻¹(G); then f(x) ∈ G ⊆ H and hence x ∈
f⁻¹(H).
(b) Since G ∩ H is a subset of G and H, it follows from part (a) that

f⁻¹(G ∩ H) ⊆ f⁻¹(G) ∩ f⁻¹(H).

Figure 2.7. Inverse Images.


Conversely, if x ∈ f⁻¹(G) ∩ f⁻¹(H), then f(x) ∈ G and f(x) ∈ H. Therefore,
f(x) ∈ G ∩ H and x ∈ f⁻¹(G ∩ H).
(c) Since G and H are subsets of G ∪ H, it follows from part (a) that
f⁻¹(G ∪ H) ⊇ f⁻¹(G) ∪ f⁻¹(H).
Conversely, if x ∈ f⁻¹(G ∪ H), then f(x) ∈ G ∪ H. It follows that either
f(x) ∈ G, whence x ∈ f⁻¹(G), or f(x) ∈ H, in which case x ∈ f⁻¹(H). Hence

f⁻¹(G ∪ H) ⊆ f⁻¹(G) ∪ f⁻¹(H).

(d) If x ∈ f⁻¹(G \ H), then f(x) ∈ G \ H. Therefore, x ∈ f⁻¹(G) and
x ∉ f⁻¹(H), whence it follows that
f⁻¹(G \ H) ⊆ f⁻¹(G) \ f⁻¹(H).
Conversely, if w ∈ f⁻¹(G) \ f⁻¹(H), then f(w) ∈ G and f(w) ∉ H. Hence
f(w) ∈ G \ H and it follows that

f⁻¹(G) \ f⁻¹(H) ⊆ f⁻¹(G \ H).

Q.E.D.
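In contrast with direct images, inverse images respect intersections, unions, and differences exactly, even when f is not injective. The sketch below checks this on one finite example; `inverse_image` is a hypothetical helper and the sets G, H are illustrative.

```python
def inverse_image(f, H):
    """The inverse image of H under f: {x in D(f) : f(x) in H}."""
    return {x for x in f if f[x] in H}


f = {x: x * x for x in range(-2, 3)}   # x -> x^2 on {-2, ..., 2}; not injective
G = {0, 1}
H = {1, 4}

inv_G, inv_H = inverse_image(f, G), inverse_image(f, H)
print(inverse_image(f, G & H) == inv_G & inv_H)   # part (b)
print(inverse_image(f, G | H) == inv_G | inv_H)   # part (c)
print(inverse_image(f, G - H) == inv_G - inv_H)   # part (d)
```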

Exercises

2.A. Prove that Definition 2.2 actually yields a function and not just a subset.
2.B. Let A = B = R and consider the subset C = {(x, y) : x² + y² = 1} of A × B. Is
this set a function with domain in R and range in R?
2.C. Consider the subset of R × R defined by D = {(x, y) : |x| + |y| = 1}. Describe
this set in words. Is it a function?
2.D. Give an example of two functions f, g on R to R such that f ≠ g, but such that
f∘g = g∘f.
2.E. Prove that if f is an injection from A to B, then f⁻¹ = {(b, a) : (a, b) ∈ f} is a
function. Then prove it is an injection.
2.F. Suppose f is an injection. Show that f⁻¹∘f(x) = x for all x in D(f) and
f∘f⁻¹(y) = y for all y in R(f).
2.G. Let f and g be functions and suppose that g∘f(x) = x for all x in D(f).
Show that f is an injection and that R(f) ⊆ D(g) and R(g) ⊇ D(f).
2.H. Let f, g be functions such that

g∘f(x) = x for all x in D(f),
f∘g(y) = y for all y in D(g).

Prove that g = f⁻¹.
2.I. Show that the direct image f(E) is empty if and only if E ∩ D(f) = ∅.
2.J. Let f be the function on R to R given by f(x) = x², and let E =
{x ∈ R : −1 ≤ x ≤ 0} and F = {x ∈ R : 0 ≤ x ≤ 1}. Then E ∩ F = {0} and f(E ∩ F) = {0}
while f(E) = f(F) = {y ∈ R : 0 ≤ y ≤ 1}. Hence f(E ∩ F) is a proper subset of
f(E) ∩ f(F). Now delete 0 from E and F.
2.K. If f, E, F are as in Exercise 2.J, then E \ F = {x ∈ R : −1 ≤ x < 0} and
f(E) \ f(F) = ∅. Hence, it does not follow that
f(E \ F) ⊆ f(E) \ f(F).
2.L. Show that if f is an injection of D(f) into R(f) and if H is a subset of R(f), then
the inverse image of H under f coincides with the direct image of H under the inverse
function f⁻¹.
2.M. If f and g are as in Definition 2.2, then D(g∘f) = f⁻¹(D(g)).

Section 3 Finite and Infinite Sets

The purpose of this section is very restricted: it is to introduce the terms
“finite,” “countable,” and “infinite.” It provides a basis for the study of
cardinal numbers, but it does not pursue this study. Although the theories of
cardinal and ordinal numbers are fascinating in their own right, it turns out
that very little exposure to these topics is really essential for the material
in this text.†
We shall assume familiarity with the set of natural numbers. We shall
denote this set by the symbol N; the elements of N are denoted by the
familiar symbols
1,2,3,....

If n, m ∈ N, we all have an intuitive idea of what is meant by saying that n is less
than or equal to m. We now borrow this notion, realizing that complete
precision requires more analysis than we have given. We assume that every
non-empty subset of N has a least element. This is an important property of N;
we sometimes say that N is well-ordered, meaning that N has this property.
This Well-Ordering Property is equivalent to mathematical induction. We
shall feel free to make use of arguments based on mathematical induction,
which we suppose to be familiar to the reader.
By an initial segment of N is meant a set consisting of all of the natural
numbers which are less than or equal to some fixed element of N. Thus an
initial segment Sₙ of N determines and is determined by an element n of N
as follows:
An element x of N belongs to Sₙ if and only if x ≤ n.
For example: the subset S₂ = {1, 2} is the initial segment of N determined by
the natural number 2; the subset S₄ = {1, 2, 3, 4} is the initial segment of N
determined by the natural number 4; but the subset {1, 3, 5} of N is not an
initial segment of N, since it contains 3 but not 2, and 5 but not 4.
† A reader wishing to learn about these topics should consult the book of Halmos cited in the
References.
3.1 DEFINITION. A set B is finite if it is empty or if there is a bijection
with domain B and range in an initial segment of N. If there is no such
function, the set is infinite. If there is a bijection of B onto N, then the set B is
denumerable (or enumerable). If a set is either finite or denumerable, it
is said to be countable.
When there is an injective (or one-one) function with domain B and range
C, we sometimes say that B can be put into one-one correspondence with C.
By using this terminology, we rephrase Definition 3.1 and say that a set B is
finite if it is empty or can be put into one-one correspondence with a subset of
an initial segment of N. We say that B is denumerable if it can be put into
one-one correspondence with all of N.
It will be noted that, by definition, a set B is either finite or infinite. However,
owing to the description of the set, it may not be a trivial matter to decide
whether the given set B is finite or infinite.
The subsets of N denoted by {1, 3, 5}, {2, 4, 5, 8, 10}, {2, 3, ..., 100}, are finite since,
although they are not initial segments of N, they are contained in initial segments of N
and hence can be put into one-one correspondence with subsets of initial segments of
N. The set E of even natural numbers

E ={2,4,6,8,...}

and the set O of odd natural numbers

O={1,3,5,7,...}

are not initial segments of N. However, since they can be put into one-one
correspondence with all of N (how?), they are both denumerable.
Even though the set Z of all integers

Z={...,—2,-1,0,1,2,...},
contains the set N, it may be seen that Z is a denumerable set. (How?)
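One possible choice of the correspondences asked for above can be written down explicitly. The formulas in this sketch are only one of many valid answers, and the function names are illustrative; the reader may prefer to attempt the question before reading on.

```python
def even(n):
    """A bijection of N onto E: 1 -> 2, 2 -> 4, 3 -> 6, ..."""
    return 2 * n


def odd(n):
    """A bijection of N onto O: 1 -> 1, 2 -> 3, 3 -> 5, ..."""
    return 2 * n - 1


def integer(n):
    """A bijection of N onto Z: 1 -> 0, 2 -> 1, 3 -> -1, 4 -> 2, 5 -> -2, ..."""
    return n // 2 if n % 2 == 0 else -(n // 2)


print([even(n) for n in range(1, 6)])
print([odd(n) for n in range(1, 6)])
print([integer(n) for n in range(1, 8)])
```

The map onto Z alternates between positive and negative integers, so that every integer is reached after finitely many steps.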

We now state some theorems without proof. At first reading it is probably
best to accept them without further examination; on a later reading, however,
the reader will do well to attempt to provide proofs for these statements. In
doing so, he will find the inductive property of the set N of natural numbers to
be useful.†

3.2 THEOREM. A set B is countable if and only if there is an injection
with domain B and range in N.
3.3 THEOREM. Any subset of a finite set is finite. Any subset of a
countable set is countable.
3.4 THEOREM. The union of a finite collection of finite sets is a finite
set. The union of a countable collection of countable sets is a countable set.
† See the books of Halmos and Hamilton-Landin cited in the References.
It is a consequence of the second part of Theorem 3.4 that the set Q of all
rational numbers forms a countable set. (We recall that a rational number is a
fraction m/n, where m and n are integers and n ≠ 0.) To see that Q is a
countable set we form the sets

A₀ = {0},
A₁ = {1/1, −1/1, 2/1, −2/1, 3/1, −3/1, ...},
A₂ = {1/2, −1/2, 2/2, −2/2, 3/2, −3/2, ...},
...
Aₙ = {1/n, −1/n, 2/n, −2/n, 3/n, −3/n, ...},
...

Note that each of the sets Aₙ is countable and that their union is all of Q.
Hence Theorem 3.4 asserts that Q is countable. In fact, we can enumerate Q
by the “diagonal procedure”:

0, 1/1, −1/1, 1/2, 2/1, −1/2, 1/3, ....

By using this type of argument, the reader should be able to construct a
proof of Theorem 3.4. See also Exercise 3.K.
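The diagonal procedure can be carried out mechanically. The sketch below is one possible reading of it: it walks the array whose n-th row lists Aₙ along the diagonals n + k = constant, and (as an added assumption, not stated in the text) skips any fraction whose value has already appeared. The names `entry` and `rationals` are hypothetical.

```python
from fractions import Fraction
from itertools import islice


def entry(n, k):
    """The k-th element (k = 1, 2, ...) of A_n = {1/n, -1/n, 2/n, -2/n, ...}."""
    m = (k + 1) // 2
    sign = 1 if k % 2 == 1 else -1
    return Fraction(sign * m, n)


def rationals():
    """Enumerate Q: first 0, then the rows A_1, A_2, ... along diagonals n + k = d."""
    yield Fraction(0)
    seen = {Fraction(0)}
    d = 2
    while True:
        for n in range(1, d):
            q = entry(n, d - n)
            if q not in seen:      # e.g. 2/2 repeats 1/1 and is skipped
                seen.add(q)
                yield q
        d += 1


first = list(islice(rationals(), 7))
print(first)
```

Every rational m/n sits at a finite position in row n, and each diagonal is finite, so every rational is reached after finitely many steps.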

The Uncountability of R and I

Despite the fact that the set of rational numbers is countable, the entire
set R of real numbers is not countable. In fact, the set I of real numbers x
satisfying 0 ≤ x ≤ 1 is not countable. To demonstrate this, we shall use the
elegant “diagonal” argument of G. Cantor.† We assume it is known that
every real number x with 0 ≤ x ≤ 1 has a decimal representation in the
form x = 0.a₁a₂a₃ ···, where each aₖ denotes one of the digits 0, 1, 2, 3, 4,
5, 6, 7, 8, 9. It is to be realized that certain real numbers have two
representations in this form (for example, the rational number 1/10 has the
two representations

0.1000 ··· and 0.0999 ···).

We could decide in favor of one of these two representations, but it is not
necessary to do so. Since there are infinitely many rational numbers in the
interval 0 ≤ x ≤ 1 (why?), the set I cannot be finite. We shall now show that it
is not denumerable. Suppose that there is an enumeration x₁, x₂, x₃, ... of all
real numbers satisfying 0 ≤ x ≤ 1 given by

x₁ = 0.a₁a₂a₃ ···
x₂ = 0.b₁b₂b₃ ···
x₃ = 0.c₁c₂c₃ ···
. . . . . . . . .

Now let y₁ be a digit different from 0, a₁, and 9; let y₂ be a digit different from 0,
b₂, and 9; let y₃ be a digit different from 0, c₃, and 9, etc. Consider the number
y with decimal representation

y = 0.y₁y₂y₃ ···

which clearly satisfies 0 < y < 1. The number y is not one of the numbers
with two decimal representations, since yₙ ≠ 0, 9. At the same time y ≠ xₙ for
any n (since the nth digits in the decimal representations for y and xₙ are
different). Therefore, any denumerable collection of real numbers in this
interval will omit at least one real number belonging to this interval.
Therefore, this interval is not a countable set.
† GEORG CANTOR (1845-1918) was born in St. Petersburg, studied in Berlin with Weierstrass,
and taught at Halle. He is best known for his work on set theory, which he developed during
the years 1874-1895.
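The digit-by-digit construction of y can be imitated on a finite list of finite digit strings. This is only a toy illustration of the idea, not the argument itself (which concerns infinite decimals); the helper `diagonal` and the particular choice of replacement digits 5 and 4 are assumptions made for the sketch.

```python
def diagonal(rows):
    """Return a digit string whose n-th digit differs from the n-th digit of rows[n].

    Only the digits 4 and 5 are used, so the result never contains 0 or 9.
    """
    out = []
    for n, row in enumerate(rows):
        out.append("5" if row[n] != "5" else "4")
    return "".join(out)


rows = ["1415", "5000", "3333", "9995"]   # digits after the decimal point
y = diagonal(rows)

print(y)
print(all(y[n] != rows[n][n] for n in range(len(rows))))
```

Because y differs from the n-th string in its n-th digit, it cannot equal any string in the list, which is exactly the mechanism used for the enumeration x₁, x₂, x₃, ... above.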
Suppose that a set A is infinite; we shall suppose that there is a one-one
correspondence between a subset of A and all of N. In other words, we assume
that every infinite set contains a denumerable subset. This assertion is a weak
form of the so-called ‘‘Axiom of Choice,”’ which is one of the usual axioms of
set theory. After the reader has digested the contents of this book, he may
turn to an axiomatic treatment of the foundations which we have been
discussing in a somewhat informal fashion. However, for the moment he
would do well to take the above statement as a temporary axiom. It can be
replaced later by a more far-reaching axiom of set theory.

Exercises

3.A. Exhibit a one-one correspondence between the set E of even natural numbers
and N.
3.B. Exhibit a one-one correspondence between the set O of odd natural numbers
and N.
3.C. Exhibit a one-one correspondence between N and a proper subset of N.
3.D. If A is contained in some initial segment of N, use the well-ordering
property of N to define a bijection of A onto some initial segment of N.
3.E. Give an example of a countable collection of finite sets whose union is not
finite.
3.F. Use the fact that every infinite set has a denumerable subset to show that every
infinite set can be put into one-one correspondence with a proper subset of itself.
3.G. Show that if the set A can be put into one-one correspondence with a set B,
then B can be put into one-one correspondence with A.
3.H. Show that if the set A can be put into one-one correspondence with a set B, and
if B can be put into one-one correspondence with a set C, then A can be put into
one-one correspondence with C.
3.I. Using induction on n ∈ N, show that the initial segment determined by n cannot
be put into one-one correspondence with the initial segment determined by m ∈ N, if
m < n.
3.J. Prove that N cannot be put into one-one correspondence with any initial
segment of N.
3.K. For each n ∈ N let Aₙ = {aₙⱼ : j ∈ N}, and suppose that Aₙ ∩ Aₘ = ∅ for n ≠ m,
n, m ∈ N. Show that the function f(n, j) = ½(n + j − 2)(n + j − 1) + n gives an
enumeration of ∪{Aₙ : n ∈ N}.
I
THE REAL NUMBERS

In this chapter we shall discuss the properties of the real number
system. Although it would be possible to construct this system from a
more primitive set (such as the set N of the natural numbers or the set Q of
rational numbers), we shall not do so. Instead, we shall exhibit a list of
properties that are associated with the real number system and show how
other properties can be deduced from the ones assumed.
For the sake of clarity we prefer not to state all the properties of the real
number system at once. Instead, we shall introduce first, in Section 4, the
“algebraic properties’ based on the two operations of addition and multi-
plication and discuss briefly some of their consequences. Next, we intro-
duce the ‘‘order properties.” In Section 6, we make the final step by
adding the “‘completeness property.” There are several reasons for this
somewhat piecemeal procedure. First, there are a number of properties to
be considered, and it is well to take a few at a time. Furthermore, the
proofs required in the preliminary algebraic stages are more natural at first
than some of the later proofs. Finally, since there are several other
interesting methods of adding the “completeness property,” we wish to
have it isolated from the other assumptions.
Part of the purpose of Sections 4 and 5 is to provide examples of proofs
of elementary theorems which are derived from explicitly stated assumptions.
It is our experience that students who have not had much exposure
to rigorous proofs can grasp the arguments presented in these sections
readily and can then proceed into Section 6. However, students who are
familiar with the axiomatic method and the technique of proofs can go into
Section 6 after a cursory look at Sections 4 and 5.
In Section 7, we introduce the notion of a cut in the real number system,
and define various types of cells and intervals. The important Nested Cells
Property of R is established, and the Cantor set is briefly discussed.

Section 4 The Algebraic Properties of R

In this section we shall give the ‘‘algebraic” structure of the real number
system. Briefly expressed, the real numbers form a ‘“‘field’’ in the sense
of abstract algebra. We shall now explain what that means.
By a binary operation in a set F we mean a function B with domain F × F
and range in F. Instead of using the notation B(a, b) to denote the value of
the binary operation B at the point (a, b) in F × F, it is conventional to use
a notation such as aBb, or a + b, or a · b.
4.1 ALGEBRAIC PROPERTIES OF R. In the set R of real numbers
there are two binary operations (denoted by + and · and called addition and
multiplication, respectively) satisfying the following† properties:
(A1) a + b = b + a for all a, b in R;
(A2) (a + b) + c = a + (b + c) for all a, b, c in R;
(A3) there exists an element 0 in R with 0 + a = a and a + 0 =
a for all a in R;
(A4) for each element a in R there is an element −a in R such that
a + (−a) = 0 and (−a) + a = 0;
(M1) a · b = b · a for all a, b in R;
(M2) (a · b) · c = a · (b · c) for all a, b, c in R;
(M3) the element 1 in R is distinct from 0 and has the property that
1 · a = a and a · 1 = a for all a in R;
(M4) for each element a ≠ 0 in R there is an element 1/a in R such that
a · (1/a) = 1 and (1/a) · a = 1;
(D) a · (b + c) = (a · b) + (a · c) and (b + c) · a = (b · a) + (c · a) for
all a, b, c in R.
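These properties can be spot-checked with exact arithmetic. The sketch below uses Python's `fractions.Fraction`, so it tests a handful of rational numbers rather than arbitrary real numbers; the function name and the sample values are illustrative assumptions, and a finite check of this kind is not a proof.

```python
from fractions import Fraction
from itertools import product


def satisfies_field_properties(xs):
    """Check (A1)-(A4), (M1)-(M4), and (D) on every triple drawn from xs."""
    zero, one = Fraction(0), Fraction(1)
    for a, b, c in product(xs, repeat=3):
        ok = (
            a + b == b + a and (a + b) + c == a + (b + c)       # (A1), (A2)
            and zero + a == a and a + (-a) == zero               # (A3), (A4)
            and a * b == b * a and (a * b) * c == a * (b * c)    # (M1), (M2)
            and one * a == a                                      # (M3)
            and (a == zero or a * (1 / a) == one)                 # (M4), a != 0 only
            and a * (b + c) == a * b + a * c                      # (D)
        )
        if not ok:
            return False
    return True


samples = [Fraction(n, d) for n in (-3, 0, 1, 2) for d in (1, 2, 7)]
print(satisfies_field_properties(samples))
```

Exact rational arithmetic is used deliberately: floating-point numbers do not satisfy (A2) or (D) exactly, so they would be a misleading model here.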
These properties are certainly familiar to the reader. We will now
obtain a few easy (but important) consequences of them. First of all we
shall prove that 0 is the only element of R that satisfies (A3), and 1 is the
only element that satisfies (M3).
4.2 THEOREM. (a) If z and a are elements of R such that z + a = a,
then z = 0.
(b) If w and b ≠ 0 are elements of R such that w · b = b, then w = 1.
PROOF. (a) The hypothesis is that z + a = a. Add −a to both sides and
use (A4), (A2), (A4), and (A3) to obtain

0 = a + (−a) = (z + a) + (−a) = z + (a + (−a))
  = z + 0 = z.

The proof of part (b) is left as an exercise. Note that it uses the
hypothesis that b ≠ 0. Q.E.D.
† This list is not intended to be “minimal.” Thus the second assertions in (A3) and (A4)
follow from the first assertions by using (A1).
We now show that the elements −a and 1/a (when a ≠ 0) are uniquely
determined by the properties given in (A4) and (M4).
4.3 THEOREM. (a) If a and b are elements of R and a + b = 0, then
b = −a.
(b) If a ≠ 0 and b are elements of R and a · b = 1, then b = 1/a.
PROOF. (a) If a + b = 0, add −a to both sides to obtain (−a) + (a + b) =
−a + 0. Now use (A2) on the left and (A3) on the right side to obtain
((−a) + a) + b = −a.
If we use (A4) and (A3) on the left side, we obtain b = −a.
The proof of (b) is left as an exercise. Note that it uses the hypothesis
that a ≠ 0. Q.E.D.

Properties (A4) and (M4) guarantee the possibility of solving the equations

a + x = 0,    a · x = 1 (a ≠ 0),

for x, and Theorem 4.3 implies the uniqueness of the solutions. We now
show that the right-hand sides of these equations can be arbitrary elements
of R.

4.4 THEOREM. (a) Let a, b be arbitrary elements of R. Then the
equation a + x = b has the unique solution x = (−a) + b.
(b) Let a ≠ 0 and b be arbitrary elements of R. Then the equation a · x = b
has the unique solution x = (1/a) · b.
PROOF. Since a + ((−a) + b) = (a + (−a)) + b = 0 + b = b, it is clear that
x = (−a) + b is a solution of the equation a + x = b. To establish that it is
the only solution, let x₁ be any solution of this equation; hence
a + x₁ = b.
We add −a to both sides to obtain
(−a) + (a + x₁) = (−a) + b.

If we employ (A3), (A4), and (A2), we get

x₁ = 0 + x₁ = (−a + a) + x₁
   = (−a) + (a + x₁) = (−a) + b.
Hence x₁ = (−a) + b.
The proof of part (b) is left as an exercise. Q.E.D.
4.5 THEOREM. If a and b are any elements of R, then
(a) a · 0 = 0;                    (b) −a = (−1) · a;
(c) −(a + b) = (−a) + (−b);       (d) −(−a) = a;
(e) (−1) · (−1) = 1.
PROOF. (a) From (M3), we know that a · 1 = a. Hence
a + a · 0 = a · 1 + a · 0 = a · (1 + 0)
          = a · 1 = a.
If we apply Theorem 4.2(a), we infer that a · 0 = 0.
(b) It is seen that
a + (−1) · a = 1 · a + (−1) · a = (1 + (−1)) · a
             = 0 · a = 0.

It follows from Theorem 4.3(a) that (−1) · a = −a.
(c) We have
−(a + b) = (−1) · (a + b) = (−1) · a + (−1) · b
         = (−a) + (−b).

(d) By (A4) we have (−a) + a = 0. According to the uniqueness assertion
of Theorem 4.3(a), it follows that a = −(−a).
(e) In part (b), substitute a = −1. We have

−(−1) = (−1) · (−1).
Hence the assertion follows from part (d) with a = 1. Q.E.D.

4.6 THEOREM. (a) If a ∈ R and a ≠ 0, then 1/a ≠ 0 and 1/(1/a) = a.
(b) If a, b ∈ R and a · b = 0, then either a = 0 or b = 0.
(c) If a, b ∈ R, then (−a) · (−b) = a · b.
(d) If a ∈ R and a ≠ 0, then 1/(−a) = −(1/a).

PROOF. (a) If a ≠ 0, then 1/a ≠ 0, for otherwise 1 = a · (1/a) = a · 0 = 0,
contrary to (M3). Since (1/a) · a = 1, it follows from Theorem 4.3(b) that
a = 1/(1/a).
(b) Suppose that a · b = 0 and that a ≠ 0. If we multiply by 1/a, we
obtain

b = 1 · b = ((1/a) · a) · b = (1/a) · (a · b)
  = (1/a) · 0 = 0.

A similar argument holds if b ≠ 0.

(c) From Theorem 4.5, we have −a = (−1) · a and −b = (−1) · b; hence

(−a) · (−b) = ((−1) · a) · ((−1) · b)
            = (a · (−1)) · ((−1) · b)
            = a · ((−1) · (−1)) · b = a · 1 · b
            = a · b.
(d) If a ≠ 0, then 1/a ≠ 0 and −a ≠ 0. Since a · (1/a) = 1, it follows from
part (c) that (−a) · (−(1/a)) = 1. If we apply Theorem 4.3(b), we deduce
that 1/(−a) = −(1/a), as claimed. Q.E.D.

Rational Numbers

From now on we shall generally drop the use of the dot to denote
multiplication and write ab for a · b. As usual we shall write a² for aa,
a³ for aaa = (a²)a, and if n ∈ N we define a^(n+1) = (a^n)a. It follows by use
of mathematical induction that if m, n ∈ N, then
(*) a^(m+n) = a^m a^n

for any a ∈ R. Similarly we shall write 2 for 1 + 1, 3 for 2 + 1 = (1 + 1) + 1,
and so forth. In addition we shall generally write b − a instead of
(−a) + b = b + (−a) and, if a ≠ 0, we shall generally write

b/a, or b written over a as a stacked fraction,

instead of (1/a) · b = b · (1/a). We shall also write a^(−1) for 1/a, and a^(−n) for
1/a^n. It can then be shown that formula (*) above holds for m, n ∈ Z when
a ≠ 0.
Elements of R which are of the form
b/a or −b/a
for a, b ∈ N, a ≠ 0, are said to be rational numbers, and the set of all
rational numbers in R will be denoted by the standard notation Q. All of
the elements of R which are not rational numbers are said to be irrational
numbers. Although this terminology is unfortunate, it is also quite
standard and we shall adopt it.
We shall close this section with a proof of the fact that there does not
exist a rational number whose square is 2.

4.7 THEOREM. There does not exist a rational number r such that
r² = 2.

PROOF. Suppose, on the contrary, that (p/q)² = 2, where p and q are
integers. We may, without loss of generality, suppose that p and q have no
common integral factors. (Why?) Since p² = 2q², it follows that p must be
an even integer (for if p = 2k + 1 is odd, then p² = 4k² + 4k + 1 =
2(2k² + 2k) + 1 is odd). Therefore p = 2k for some integer k and hence
4k² = 2q². It follows that q² = 2k², whence q must also be even.
Therefore both p and q are divisible by 2, contrary to our hypothesis. Q.E.D.
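Theorem 4.7 concerns all rational numbers, so no computation can prove it, but a finite search can illustrate it. The sketch below uses exact rational arithmetic to look for a fraction p/q, with denominator at most a chosen bound, whose square equals a given integer. The function name `has_rational_square_root` and the parameter `max_q` are illustrative choices, not part of the text.

```python
from fractions import Fraction
from math import isqrt


def has_rational_square_root(target: int, max_q: int) -> bool:
    """Return True if some p/q with 1 <= q <= max_q satisfies (p/q)**2 == target.

    This is only a finite search over small denominators; it illustrates
    Theorem 4.7 but does not prove it. `target` must be a non-negative integer.
    """
    for q in range(1, max_q + 1):
        # If (p/q)^2 = target, then p^2 = target * q^2, so p must be the
        # integer square root of target * q^2 (when that is a perfect square).
        p = isqrt(target * q * q)
        if Fraction(p, q) ** 2 == target:
            return True
    return False
```

Calling `has_rational_square_root(2, 1000)` finds no fraction, in agreement with the theorem, whereas `has_rational_square_root(4, 1000)` succeeds at once with 2/1.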

Exercises

4.A. Prove part (b) of Theorem 4.2.


4.B. Prove part (b) of Theorem 4.3.
4.C. Prove part (b) of Theorem 4.4.
4.D. Using mathematical induction, show that if a ∈ R and m, n ∈ N, then
a^(m+n) = a^m a^n.
4.E. Show that if a ∈ R, a ≠ 0, and m, n ∈ Z, then a^(m+n) = a^m a^n.
4.F. Use the argument in Theorem 4.7 to show that there does not exist a
rational number s such that s² = 6.
4.G. Modify the argument in Theorem 4.7 to show that there does not exist a
rational number t such that t² = 3.
4.H. If ξ ∈ R is irrational and r ∈ R, r ≠ 0, is rational, show that r + ξ and rξ are
irrational.

Section 5 The Order Properties of R

The purpose of this section is to introduce the important "order"
properties of R, which will play a very important role in subsequent sections. The
simplest way to introduce the notion of order is to make use of the notion
of "strict positivity," which we now explain.

5.1 THE ORDER PROPERTIES OF R. There is a non-empty subset P of
R, called the set of strictly positive real numbers, satisfying the properties:
(i) If a, b belong to P, then a + b belongs to P.
(ii) If a, b belong to P, then ab belongs to P.
(iii) If a belongs to R, then precisely one of the following relations
holds: a ∈ P, a = 0, −a ∈ P.

Condition (iii) is sometimes called the property of trichotomy. It implies
that the set N = {−a : a ∈ P}, sometimes called the set of strictly negative
real numbers, has no elements in common with P. In fact the entire set R
is the union of the three disjoint sets P, {0}, N.

5.2 DEFINITION. If a ∈ P, we say that a is a strictly positive real
number and write a > 0. If a is either in P or is 0, we say that a is a
positive real number and write a ≥ 0. If −a ∈ P, we say that a is a strictly
negative real number and write a < 0. If −a is either in P or is 0, we say
that a is a negative real number and write a ≤ 0.

It should be noted that, according to the terminology just introduced, the number
0 is both positive and negative; it is the only number with this dual status. This
terminology may seem a bit strange at first, but it will prove to be a convenience.
Some authors reserve the term "positive" for the elements of the set P and use the
term "non-negative" for the elements of P ∪ {0}.

We now introduce the order relations.

5.3 DEFINITION. Let a, b be elements of R. If a − b ∈ P, then we
write a > b. If −(a − b) ∈ P, then we write a < b. If a − b ∈ P ∪ {0}, then
we write a ≥ b. If −(a − b) ∈ P ∪ {0}, then we write a ≤ b.

As usual, it is often convenient to turn the signs around and write

b < a, b > a, b ≤ a, b ≥ a,

respectively. In addition, if a < b and b < c, then we often write

a < b < c or c > b > a.

If a ≤ b and b < c, then we often write

a ≤ b < c or c > b ≥ a.

Properties of the Order

We shall now establish the basic properties of the order relation in R.
These are the familiar "laws" for inequalities which the reader has met in
earlier courses. They will be frequently used in later sections and are of
great importance.

5.4 THEOREM. Let a, b, c be elements of R.
(a) If a > b and b > c, then a > c.
(b) Exactly one of the following holds: a > b, a = b, a < b.
(c) If a ≥ b and b ≥ a, then a = b.

PROOF. (a) If a − b and b − c belong to P, then from 5.1(i) we infer
that a − c = (a − b) + (b − c) also belongs to P. Hence a > c.
(b) By 5.1(iii) exactly one of the following possibilities takes place:
a − b ∈ P, a − b = 0, b − a = −(a − b) ∈ P.
(c) If a ≠ b, then from part (b) we must have either a − b or b − a in P.
Hence, either a > b or b > a. In either case one of the hypotheses is
contradicted. Q.E.D.

5.5 THEOREM. (a) If 0 ≠ a ∈ R, then a² > 0.
(b) 1 > 0.
(c) If n ∈ N, then n > 0.
PROOF. (a) Either a or −a belongs to P. If a ∈ P, then by 5.1(ii)
we have a² = aa ∈ P. If −a ∈ P, then by Theorem 4.6(c) and 5.1(ii) we have
a² = (−a)(−a) ∈ P. Hence, in either case, a² ∈ P.
(b) Since 1 = (1)², the conclusion follows from (a).
(c) We use mathematical induction. The validity of the assertion
with n = 1 is part (b). If the assertion is true for the natural number k
(that is, supposing k ∈ P), then since 1 ∈ P, it follows from 5.1(i) that
k + 1 ∈ P. Hence the assertion is true for all natural numbers. Q.E.D.
The next properties are probably familiar to the reader.
5.6 THEOREM. Let a, b, c, d be elements of R.
(a) If a > b, then a + c > b + c.
(b) If a > b and c > d, then a + c > b + d.
(c) If a > b and c > 0, then ac > bc.
(c′) If a > b and c < 0, then ac < bc.
(d) If a > 0, then 1/a > 0.
(d′) If a < 0, then 1/a < 0.
PROOF. (a) Observe that (a + c) − (b + c) = a − b.
(b) If a − b and c − d belong to P, then by 5.1(i) we conclude that
(a + c) − (b + d) = (a − b) + (c − d) also belongs to P.
(c) If a − b and c belong to P, then by 5.1(ii) ac − bc = (a − b)c also
belongs to P.
(c′) If a − b and −c belong to P, then by 5.1(ii) bc − ac = (a − b)(−c) also
belongs to P.
(d) If a > 0, then by 5.1(iii) a ≠ 0 so that the element 1/a exists. If
1/a = 0, then 1 = a(1/a) = 0, a contradiction. If 1/a < 0, then part (c′) with
c = 1/a implies that 1 = a(1/a) < 0, contradicting 5.5(b). Therefore we
must have 1/a > 0, since the other two possibilities have been excluded.
(d′) This can be proved by an argument similar to that in (d).
Alternatively, we can invoke Theorem 4.6(d) and use (d) directly. Q.E.D.
5.7 THEOREM. If a > b, then a > ½(a + b) > b.
PROOF. Since a > b it follows from Theorem 5.6(a) with c = a that
2a > a + b and from Theorem 5.6(a) with c = b that a + b > 2b. By
Theorem 5.5(c) we know that 2 > 0 and from Theorem 5.6(d) that ½ > 0.
Applying Theorem 5.6(c) with c = ½, we deduce that a > ½(a + b) and
½(a + b) > b. Hence a > ½(a + b) > b, as claimed. Q.E.D.
The theorem just proved (with b = 0) implies that given any strictly
positive number a, there is another strictly smaller and strictly positive
number (namely ½a). Thus, there is no smallest strictly positive real
number.
We have already seen that if a>0 and b>0, then ab>0. Also if a<0
and b<0, then ab>0. We now show that the converse is true.
5.8 THEOREM. If ab > 0, then we either have a > 0 and b > 0, or we
have a < 0 and b < 0.
PROOF. If ab > 0, then a ≠ 0 and b ≠ 0. (Why?) If a > 0, then from
Theorem 5.6(d) we infer that 1/a > 0 and from 5.6(c) that b = ((1/a)a)b =
(1/a)(ab) > 0. On the other hand, if a < 0, then from Theorem 5.6(d′) we
infer that 1/a < 0 and from 5.6(c′) that b = ((1/a)a)b = (1/a)(ab) < 0. Q.E.D.
5.9 COROLLARY. If ab < 0, then we either have a > 0 and b < 0, or we
have a < 0 and b > 0.
The proof of this assertion is left as an exercise.

Absolute Value

The trichotomy property 5.1(iii) assures that if a ≠ 0, then one of the
numbers a and −a is strictly positive. The absolute value of a ≠ 0 is
defined to be the strictly positive one of the pair {a, −a}, and the absolute
value of 0 is defined to be 0.
5.10 DEFINITION. If a ∈ R, the absolute value of a is denoted by |a|
and is defined by
|a| = a    if a ≥ 0,
    = −a   if a < 0.

Thus the domain of the absolute value function is all of R, its range is
P ∪ {0}, and it maps the elements a, −a into the same element.
5.11 THEOREM. (a) |a| = 0 if and only if a = 0.
(b) |−a| = |a| for all a ∈ R.
(c) |ab| = |a| |b| for all a, b ∈ R.
(d) If c ≥ 0, then |a| ≤ c if and only if −c ≤ a ≤ c.
(e) −|a| ≤ a ≤ |a| for all a ∈ R.
PROOF. (a) If a = 0, then by definition |0| = 0. If a ≠ 0, then also
−a ≠ 0, so that |a| ≠ 0.
(b) If a = 0, then |0| = 0 = |−0|. If a > 0, then |a| = a = |−a|. If a < 0,
then |a| = −a = |−a|.
(c) If a > 0 and b > 0, then ab > 0 so that |ab| = ab = |a| |b|. If a < 0
and b > 0, then ab < 0 so that |ab| = −(ab) = (−a)b = |a| |b|. The other
cases are handled similarly.
(d) If |a| ≤ c, then both a ≤ c and −a ≤ c. (Why?) From the latter and
Theorem 5.6(c′) we infer that −c ≤ a so that −c ≤ a ≤ c. Conversely, if
this relation holds then both a ≤ c and −a ≤ c, whence |a| ≤ c.
(e) Use part (d) with c = |a| ≥ 0. Q.E.D.
The next result will be used very frequently in the sequel. (Recall that
a ± b means both a + b and a − b.)
5.12 THE TRIANGLE INEQUALITY. If a, b are any real numbers, then

| |a| − |b| | ≤ |a ± b| ≤ |a| + |b|.


PROOF. According to Theorem 5.11(e), we have −|a| ≤ a ≤ |a| and
−|b| ≤ ±b ≤ |b|. Employing 5.6(b) we infer that
−(|a| + |b|) = −|a| − |b| ≤ a ± b ≤ |a| + |b|.
From Theorem 5.11(d) it follows that |a ± b| ≤ |a| + |b|, proving the second
part of the inequality.
Since |a| = |(a − b) + b| ≤ |a − b| + |b| (why?), it follows that |a| − |b| ≤
|a − b|. Similarly |b| − |a| ≤ |a − b|. (Why?) Combining these two
inequalities, we deduce that | |a| − |b| | ≤ |a − b|, which is the first part of the
inequality with the minus sign. To obtain the inequality with the plus sign,
replace b by −b. Q.E.D.
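A small numerical instance may help fix the four quantities appearing in the Triangle Inequality. The values a = 3 and b = −5 are chosen here purely for illustration. They show that either bound can be attained, depending on the sign taken in a ± b.

```latex
\begin{align*}
a = 3,\; b = -5: \qquad
  \bigl|\,|a| - |b|\,\bigr| &= |3 - 5| = 2, \qquad |a| + |b| = 3 + 5 = 8, \\
  |a + b| &= |-2| = 2, \qquad\text{so } 2 \le 2 \le 8, \\
  |a - b| &= |8| = 8, \qquad\text{so } 2 \le 8 \le 8.
\end{align*}
```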

5.13 COROLLARY. If a_1, a_2, ..., a_n are any n real numbers, then
|a_1 + a_2 + ··· + a_n| ≤ |a_1| + |a_2| + ··· + |a_n|.
PROOF. If n = 2, the conclusion is precisely 5.12. If n > 2, we use
mathematical induction and the fact that

|a_1 + a_2 + ··· + a_k + a_(k+1)| = |(a_1 + a_2 + ··· + a_k) + a_(k+1)|
                                  ≤ |a_1 + a_2 + ··· + a_k| + |a_(k+1)|. Q.E.D.

Exercises

5.A. If a, b ∈ R and a² + b² = 0, show that a = b = 0.
5.B. If n ∈ N, show that n² ≥ n and hence 1/n² ≤ 1/n.
5.C. If a > −1, a ∈ R, show that (1 + a)^n ≥ 1 + na for all n ∈ N. This inequality is
called Bernoulli's Inequality.† (Hint: use mathematical induction.)
5.D. If c > 1, c ∈ R, show that c^n ≥ c for all n ∈ N. (Hint: c = 1 + a with a > 0.)
5.E. If c > 1, c ∈ R, show that c^m ≥ c^n for m ≥ n, m, n ∈ N.
5.F. Suppose that 0 < c < 1. If m ≥ n, m, n ∈ N, show that 0 < c^m ≤ c^n < 1.
5.G. Show that n < 2^n for all n ∈ N. Hence 1/2^n < 1/n for all n ∈ N.
5.H. If a and b are positive real numbers and n ∈ N, then a^n < b^n if and only if
a < b.
5.I. Show that if a ≤ x ≤ b and a ≤ y ≤ b, then |x − y| ≤ b − a. Interpret this
geometrically.
5.J. Let δ > 0, a ∈ R. Show that a − δ < x < a + δ if and only if |x − a| < δ.
Similarly, a − δ ≤ x ≤ a + δ if and only if |x − a| ≤ δ.
5.K. If a, b ∈ R and b ≠ 0, show that |a/b| = |a|/|b|.
5.L. Show that if a, b ∈ R, then |a + b| = |a| + |b| if and only if ab ≥ 0.
5.M. Sketch the points (x, y) in the plane R × R for which |y| = |x|.
5.N. Sketch the points (x, y) in the plane R × R for which |x| + |y| = 1.
5.O. If x, y, z belong to R, then x ≤ y ≤ z if and only if |x − y| + |y − z| = |x − z|.
5.P. If 0 < a < 1, then 0 < a² < a < 1, while if 1 < a, then 1 < a < a².

† JACOB BERNOULLI (1654-1705) was a member of a Swiss family that produced several
mathematicians who played an important role in the development of calculus.

Section 6 The Completeness Property of R

In this section we shall present one more property of the real number
system which is often called the "completeness property" since it
guarantees the existence of elements in R when certain hypotheses are satisfied.
There are various versions of this completeness property, but we choose to
give here what is probably the most efficient method by assuming that
bounded sets in R have a supremum.

Suprema and Infima

We now introduce the notion of an upper bound of a set of real numbers.
This idea will be of utmost importance in later sections.
6.1 DEFINITION. Let S be a subset of R.
(a) An element u ∈ R is said to be an upper bound of S if s ≤ u for all
s ∈ S.
(b) An element w ∈ R is said to be a lower bound of S if w ≤ s for all
s ∈ S.
We note that a subset S ⊆ R may not have an upper bound (for example,
take S = R). However, if it has one upper bound, then it has infinitely many
(for if u is an upper bound of S, then u + n is also an upper bound of S for
any n ∈ N). Again, the set S_1 = {x ∈ R : 0 < x < 1} has 1 for an upper bound;
in fact, any number u ≥ 1 is an upper bound of S_1. Similarly the set
S_2 = {x ∈ R : 0 ≤ x ≤ 1} has the same upper bounds as S_1. Note, however,
that S_2 contains the upper bound 1, while S_1 does not contain any of its
upper bounds. (Why can no number c < 1 be an upper bound of S_1?)
To show that a number u ∈ R is not an upper bound of S ⊆ R we must produce an
element s_0 ∈ S such that u < s_0. If S = ∅, the empty set, that cannot be done.
Hence the empty set has the unusual property that every real number is an upper
bound; also every real number is a lower bound of ∅. This may seem artificial, but
it is a logical consequence of our definitions, so we must accept it.

[Figure 6.1. Suprema and infima: a number line marking the lower bounds for S and the upper bounds for S.]

As a matter of terminology, when a set has an upper bound, we shall say
that it is bounded above, and when a set has a lower bound, we shall say
that it is bounded below. If a set has both an upper and a lower bound,
we shall say that it is bounded. If a set lacks either an upper or a lower
bound, we shall say that it is unbounded. Thus the sets S_1 and S_2 above
are both bounded. However the subset P = {x ∈ R : x > 0} of R is
unbounded since it does not have an upper bound. Similarly, the set R is
unbounded since it does not have either an upper or a lower bound.
6.2 DEFINITION. Let S be a subset of R.
(a) If S is bounded above, then an upper bound of S is said to be a
supremum (or a least upper bound) of S if it is less than any other upper
bound of S.
(b) If S is bounded below, then a lower bound of S is said to be an
infimum (or a greatest lower bound) of S if it is greater than any other
lower bound of S. (See Figure 6.1.)
Expressed differently, a number u ∈ R is a supremum of a subset S of R if it
satisfies the two conditions:
(i) s ≤ u for all s ∈ S;
(ii) if s ≤ v for all s ∈ S, then u ≤ v.
Indeed, the condition (i) makes u an upper bound of S, and (ii) shows that u is less
than any other upper bound of S.
It is apparent that there can be only one supremum for a given subset S
of R. For, if u_1 and u_2 are suprema of S, then they are both upper bounds
of S. Since u_1 is a supremum of S and u_2 is an upper bound of S, we must
have u_1 ≤ u_2. A similar argument shows that we must have u_2 ≤ u_1.
Therefore u_1 = u_2. In a similar manner one shows that there can be only
one infimum for a given subset S of R. Where these numbers exist, we
shall denote them by
sup S and inf S.
It is often convenient to have another characterization of the supremum of
a subset of R.
6.3 LEMMA. A number u ∈ R is the supremum of a non-empty subset
S ⊆ R if and only if it has the following properties:
(i) There are no elements s ∈ S with u < s.
(ii) If v < u, then there is an element s_v ∈ S such that v < s_v.

PROOF. Suppose that u satisfies (i) and (ii). The condition (i) implies
that u is an upper bound of S. If v is any number with v < u, then property
(ii) shows that v cannot be an upper bound of S. Hence u is the supremum
of S.
Conversely, let u be the supremum of S. Since u is an upper bound of
S, condition (i) holds. If v < u, then v is not an upper bound of S.
Therefore, there exists an element s_v ∈ S such that v < s_v. Q.E.D.
The reader should convince himself that the number 1 is the supremum
of both of the sets S_1 and S_2 which were defined after Definition 6.1. We
note that S_2 contains its supremum, but that S_1 does not contain its
supremum. Thus, when we say that a set has a supremum, we are
making no statement as to whether the set contains the supremum as an
element or not.
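As an illustration of how Lemma 6.3 is applied, here is one possible verification that 1 is the supremum of the open set S_1 = {x ∈ R : 0 < x < 1} introduced after Definition 6.1. The choice of the witness s_v below is ours; any element of S_1 larger than v would serve equally well.

```latex
\begin{align*}
&S_1 = \{x \in \mathbb{R} : 0 < x < 1\}, \qquad u = 1. \\
&\text{(i) If } s \in S_1 \text{, then } s < 1 \text{, so no element of } S_1 \text{ exceeds } u. \\
&\text{(ii) Let } v < 1 \text{ and put }
   s_v = \begin{cases} \tfrac12, & v \le 0, \\[2pt] \tfrac12 (v + 1), & 0 < v < 1. \end{cases} \\
&\text{In the first case } v \le 0 < \tfrac12 < 1;\
 \text{in the second, Theorem 5.7 gives } v < \tfrac12 (v + 1) < 1. \\
&\text{Either way } s_v \in S_1 \text{ and } v < s_v \text{, so } \sup S_1 = 1 \text{ by Lemma 6.3.}
\end{align*}
```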
It is a deep and fundamental property of the real number system that
every non-empty subset of R which is bounded above has a supremum. We
shall make frequent and essential use of this property, which we take as
our final assumption about R.
6.4 SUPREMUM PROPERTY. Every non-empty set of real numbers
which has an upper bound has a supremum.
The analogous property of infima can be readily established from the
Supremum Property.
6.5 INFIMUM PROPERTY. Every non-empty set of real numbers which
has a lower bound has an infimum.
PROOF. Let S be bounded below and let S_1 = {−s : s ∈ S} so that S_1 is
bounded above. The Supremum Property assures that S_1 has a supremum
u. We leave it to the reader to show that −u is the infimum of S. Q.E.D.
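The step left to the reader in the proof of 6.5 can be sketched as follows. This is one way to organize the verification, written with S_1 = {−s : s ∈ S} and u = sup S_1 as in the proof; it uses only the definitions of upper and lower bound and the reversal of inequalities under multiplication by −1 (Theorem 5.6(c′)).

```latex
\begin{align*}
&S_1 = \{-s : s \in S\}, \qquad u = \sup S_1. \\
&\text{Lower bound: } s \in S \implies -s \in S_1 \implies -s \le u \implies -u \le s. \\
&\text{Greatest: if } w \le s \text{ for all } s \in S, \text{ then } -s \le -w \text{ for all } s \in S, \\
&\qquad \text{so } -w \text{ is an upper bound of } S_1, \text{ whence } u \le -w \text{, that is, } w \le -u. \\
&\text{Hence } -u = \inf S.
\end{align*}
```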

The Archimedean† Property

One important consequence of the Supremum Property is that the subset
N of natural numbers is not bounded above in R. In particular this means
that given any real number x, there exists a natural number n_x which is
greater than x (otherwise x would be an upper bound for N). We shall
now prove this assertion.
6.6 ARCHIMEDEAN PROPERTY. If x ∈ R, there is a natural number n_x ∈ N
such that x < n_x.

PROOF. If the conclusion fails, then x is an upper bound for N.
Therefore, by the Supremum Property, N has a supremum u. Since x is
an upper bound for N, it follows that u ≤ x. Since u − 1 < u, it follows
from Lemma 6.3(ii) that there exists n_1 ∈ N such that u − 1 < n_1. Therefore
u < n_1 + 1, but since n_1 + 1 ∈ N this contradicts the assumption that u is an
upper bound of N. Q.E.D.

† This property of R is named after Archimedes (287-212 B.C.), who has been called "the
greatest intellect of antiquity," and was one of the founders of the scientific method.
6.7 COROLLARY. Let y and z be strictly positive real numbers.
(a) There is a natural number n such that ny > z.
(b) There is a natural number n such that 0 < 1/n < z.
(c) There is a natural number n such that n − 1 ≤ y < n.
PROOF. (a) Since y and z are strictly positive, then x = z/y is also
strictly positive. Let n ∈ N be such that z/y = x < n. Then z < ny, as
claimed.
(b) Let n ∈ N be such that 0 < 1/z < n. Then 0 < 1/n < z.
(c) The Archimedean Property assures that there exist natural numbers
m such that y < m. Let n be the least such natural number (see Section 3).
Then n − 1 ≤ y < n. Q.E.D.
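For rational y and z the natural numbers promised by Corollary 6.7 can be written down explicitly, which gives a concrete feel for parts (a) and (c). The sketch below is restricted to exact rational inputs so that no rounding occurs; the function names `archimedean_multiple` and `bracketing_natural` are illustrative and do not come from the text.

```python
from fractions import Fraction
from math import floor


def archimedean_multiple(y: Fraction, z: Fraction) -> int:
    """Return a natural number n with n*y > z, for y > 0 and z > 0 (Corollary 6.7(a))."""
    y, z = Fraction(y), Fraction(z)
    if y <= 0 or z <= 0:
        raise ValueError("y and z must be strictly positive")
    # floor(z/y) <= z/y < floor(z/y) + 1, so n = floor(z/y) + 1 exceeds z/y.
    return floor(z / y) + 1


def bracketing_natural(y: Fraction) -> int:
    """Return the natural number n with n - 1 <= y < n, for y > 0 (Corollary 6.7(c))."""
    y = Fraction(y)
    if y <= 0:
        raise ValueError("y must be strictly positive")
    # floor(y) <= y < floor(y) + 1, and floor(y) + 1 >= 1 because y > 0.
    return floor(y) + 1
```

For instance, with y = 1/1000 and z = 7 the first function returns a natural number n for which n/1000 exceeds 7, however small y is relative to z.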
We noted after Theorem 5.7 that there is no smallest strictly positive real
number. Corollary 6.7(b) shows that given any z >0 there is a rational
number of the form 1/n with 0<1/n<z. Sometimes one says “there are
arbitrarily small rational numbers of the form 1/n.”

The Existence of √2
One important feature of the Supremum Property is that, as we have
said before, it assures the existence of certain real numbers. We shall
make use of it many times in this way. At the moment we will show that it
guarantees the existence of a positive real number x such that x² = 2; that
is, a positive square root of 2. This result complements Theorem 4.7.
6.8 THEOREM. There exists a positive number x ∈ R such that x² = 2.
PROOF. Let S = {y ∈ R : 0 ≤ y, y² ≤ 2}. The set S is non-empty (it
contains 1) and is bounded above by 2; for, if not, then there exists an
element s ∈ S such that 2 < s, whence it follows that 4 < s² ≤ 2, a
contradiction. By the Supremum Property the set S has a supremum and we
let x = sup S. Clearly x > 0, since 1 ∈ S.
We claim that x² = 2. If not, then either x² < 2 or x² > 2. If x² < 2, let
n ∈ N be chosen such that 1/n < (2 − x²)/(2x + 1). In this case

(x + 1/n)² = x² + 2x/n + 1/n² ≤ x² + (2x + 1)/n
           < x² + (2 − x²) = 2,

which means that x + 1/n ∈ S, contrary to the fact that x is an upper bound
of S.
If x² > 2, we choose m ∈ N such that 1/m < (x² − 2)/2x. Since x = sup S,
there exists an s_0 ∈ S with x − 1/m < s_0. But this implies that

2 < x² − 2x/m < x² − 2x/m + 1/m² = (x − 1/m)² < s_0².

Hence s_0² > 2, contrary to the fact that s_0 ∈ S.
Since we have excluded the possibilities that x² < 2 and x² > 2, we must
have x² = 2. Q.E.D.
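The first half of the argument in Theorem 6.8 is constructive in spirit: from any x > 0 with x² < 2 it produces a strictly larger number x + 1/n whose square is still less than 2. The sketch below carries out that step for rational x, choosing n so that 1/n < (2 − x²)/(2x + 1) exactly as in the proof. The function name `step_toward_sqrt2` is an illustrative choice, and the restriction to rational inputs is an assumption made so that the arithmetic is exact.

```python
from fractions import Fraction
from math import floor


def step_toward_sqrt2(x: Fraction) -> Fraction:
    """Given rational x > 0 with x*x < 2, return x + 1/n with (x + 1/n)**2 < 2.

    n is chosen as in the proof of Theorem 6.8: 1/n < (2 - x^2)/(2x + 1).
    """
    x = Fraction(x)
    if x <= 0 or x * x >= 2:
        raise ValueError("need x > 0 and x^2 < 2")
    bound = (2 - x * x) / (2 * x + 1)  # strictly positive by the hypotheses
    n = floor(1 / bound) + 1           # n > 1/bound, hence 1/n < bound
    return x + Fraction(1, n)
```

Starting from x = 1 and applying the function repeatedly gives an increasing sequence of elements of S, which illustrates why no x with x² < 2 can be an upper bound of S.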
By modifying the argument in Theorem 6.8 very slightly, the reader can show
that if a ≥ 0, then there is a unique number b ≥ 0 such that b² = a. We call b the
positive square root of a and denote it by

b = √a or b = a^(1/2).

We now know that there exists at least one irrational element, namely,
√2 (the positive square root of 2). Actually there are "more" irrational
numbers than rational numbers in the sense that (as we have seen in
Section 3) the set of rational numbers is countable while the set of
irrational numbers is not countable. We shall now show that there are
arbitrarily small irrational numbers; this result complements Corollary 6.7.
6.9 COROLLARY. Let ξ > 0 be an irrational number and let z > 0.
Then there exists a natural number m such that the irrational number ξ/m
satisfies 0 < ξ/m < z.
PROOF. Since ξ > 0, z > 0, it follows from Theorem 5.6(d) and 5.6(c)
that ξ/z > 0. By the Archimedean Property there exists a natural number
m such that 0 < ξ/z < m. Therefore 0 < ξ/m < z and it is an exercise to
show that ξ/m is irrational. Q.E.D.
We now show that between any two distinct real numbers there is a
rational number and an irrational number. (In fact, there are infinitely
many of both kinds!)
6.10 THEOREM. Let x and y be real numbers with x < y.
(a) Then there is a rational number r such that x < r < y.
(b) If ξ > 0 is any irrational number, then there is a rational number s such
that the irrational number sξ satisfies x < sξ < y.
PROOF. It is no loss of generality to assume that 0 < x. (Why?)
(a) Since y − x > 0, it follows from Corollary 6.7(b) that there is a natural
number m such that 0 < 1/m < y − x. From Corollary 6.7(a) there is a
natural number k such that

k/m = k · (1/m) > x,

and we let n be the least such natural number. Therefore

(n − 1)/m ≤ x < n/m.

We must also have n/m < y, for otherwise

(n − 1)/m ≤ x < y ≤ n/m,

which implies that y − x ≤ 1/m, contrary to the choice of m. Therefore
x < n/m < y.
(b) Supposing that 0 < x < y and ξ > 0, we have x/ξ < y/ξ. By part
(a) there exists a rational number s such that x/ξ < s < y/ξ. Therefore
x < sξ < y. (Show that sξ is irrational.) Q.E.D.
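The proof of Theorem 6.10(a) is effectively an algorithm: pick m with 1/m < y − x, then take the least n with n/m > x. The sketch below follows those two steps for rational endpoints, an assumption made here so that the computation is exact; the theorem itself covers arbitrary real x < y. The function name `rational_between` is illustrative.

```python
from fractions import Fraction
from math import floor


def rational_between(x: Fraction, y: Fraction) -> Fraction:
    """Return n/m with x < n/m < y, assuming 0 < x < y, following Theorem 6.10(a)."""
    x, y = Fraction(x), Fraction(y)
    if not 0 < x < y:
        raise ValueError("need 0 < x < y")
    m = floor(1 / (y - x)) + 1  # m > 1/(y - x), so 0 < 1/m < y - x
    n = floor(m * x) + 1        # least natural number n with n/m > x
    return Fraction(n, m)
```

With x = 1/3 and y = 1/2 the construction gives m = 7 and n = 3, so the rational number produced is 3/7.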

Exercises

6.A. Prove that a non-empty finite set of real numbers has a supremum and an
infimum.
6.B. If a subset S of R contains an upper bound, then this upper bound is the
supremum of S.
6.C. Give an example of a set of rational numbers which is bounded but does not
have a rational supremum.
6.D. Give an example of a set of irrational numbers that has a rational sup-
remum.
6.E. Prove that the union of two bounded sets is bounded.
6.F. Give an example of a countable collection of bounded sets whose union is
bounded, and an example where the union is unbounded.
6.G. If S is a bounded set in R and if S_0 is a non-empty subset of S, then show
that
inf S ≤ inf S_0 ≤ sup S_0 ≤ sup S.

Sometimes it is more convenient to express this in another way. Let D ≠ ∅ and let
f : D → R have bounded range. If D_0 is a non-empty subset of D, then

inf {f(x) : x ∈ D} ≤ inf {f(x) : x ∈ D_0} ≤ sup {f(x) : x ∈ D_0} ≤ sup {f(x) : x ∈ D}.

6.H. Let X and Y be non-empty sets and let f : X × Y → R have bounded range
in R. Let
f_1(x) = sup {f(x, y) : y ∈ Y}, f_2(y) = sup {f(x, y) : x ∈ X}.

Establish the Principle of Iterated Suprema:

sup {f(x, y) : x ∈ X, y ∈ Y} = sup {f_1(x) : x ∈ X}
                             = sup {f_2(y) : y ∈ Y}.
We sometimes express this in symbols by:

sup_(x,y) f(x, y) = sup_x sup_y f(x, y) = sup_y sup_x f(x, y).
6.I. Let f and f_1 be as in the preceding exercise and let

g_2(y) = inf {f(x, y) : x ∈ X}.

Prove that
sup {g_2(y) : y ∈ Y} ≤ inf {f_1(x) : x ∈ X}.

Show that strict inequality can hold. We sometimes express this inequality by

sup_y inf_x f(x, y) ≤ inf_x sup_y f(x, y).

6.J. Let X be a non-empty set and let f : X → R have bounded range in R. If
a ∈ R, show that
sup {a + f(x) : x ∈ X} = a + sup {f(x) : x ∈ X},
inf {a + f(x) : x ∈ X} = a + inf {f(x) : x ∈ X}.

6.K. Let X be a non-empty set and let f and g be defined on X and have
bounded ranges in R. Show that

inf {f(x) : x ∈ X} + inf {g(x) : x ∈ X} ≤ inf {f(x) + g(x) : x ∈ X}
    ≤ inf {f(x) : x ∈ X} + sup {g(x) : x ∈ X}
    ≤ sup {f(x) + g(x) : x ∈ X}
    ≤ sup {f(x) : x ∈ X} + sup {g(x) : x ∈ X}.

Give examples to show that each inequality can be strict.

6.L. If z > 0 show that there exists n ∈ N such that 1/2^n < z.
6.M. Modify the argument given in Theorem 6.8 to show that if a > 0, then the
number
b = sup {y ∈ R : 0 ≤ y, y² ≤ a}

exists and has the property that b² = a. This number will be denoted by √a or a^(1/2)
and is called the positive square root of a.
6.N. Use Exercise 5.P to show that if 0 < a < 1, then 0 < a < √a < 1, while if
1 < a, then 1 < √a < a.

Projects†

† The projects are intended to be somewhat more challenging to the reader, but they differ
considerably in difficulty. We have put these three (rather difficult) projects here because
they belong here logically. The reader should return to them later after he has
accumulated more experience with suprema.

6.α. If a and b are strictly positive real numbers and if n ∈ N, we have defined a^n
and b^n. It follows by mathematical induction that if m, n ∈ N, then
(i) a^m a^n = a^(m+n);
(ii) (a^m)^n = a^(mn);
(iii) (ab)^n = a^n b^n;
(iv) a < b if and only if a^n < b^n.

We shall adopt the convention that a^0 = 1 and a^(−n) = 1/a^n. Thus we have defined a^x
for x in Z and it is readily checked that properties (i)-(iii) remain valid.
We wish to define a^x for rational numbers x in such a way that (i)-(iii) hold. The
following steps can be used as an outline. Throughout we shall assume that a and b
are real numbers exceeding 1.
(a) If r is a rational number given by r = m/n, where m and n are integers and
n > 0, we define S_r(a) = {x ∈ R : 0 ≤ x^n ≤ a^m}. Show that S_r(a) is a bounded
non-empty subset of R and define a^r = sup S_r(a).
(b) Prove that z = a^r is the unique positive root of the equation z^n = a^m. (Hint:
there is a constant K such that if 0 < ε < 1, then (1 + ε)^n < 1 + Kε. Hence if
x^n < a^m < y^n, there exists an ε > 0 such that

x^n (1 + ε)^n < a^m < y^n/(1 + ε)^n.)

(c) Show that the value of a^r given in part (a) does not depend on the
representation of r in the form m/n. Also show that if r is an integer, then the new definition
of a^r gives the same value as the old one.
(d) Show that if r, s ∈ Q, then a^r a^s = a^(r+s) and (a^r)^s = a^(rs).
(e) Show that a^r b^r = (ab)^r.
(f) If r ∈ Q, r > 0, then a < b if and only if a^r < b^r.
(g) If r, s ∈ Q, then r < s if and only if a^r < a^s.
(h) If c is a real number satisfying 0 < c < 1, we define c^r = (1/c)^(−r). Show that
parts (d) and (e) hold and that a result similar to (g), but with the inequality
reversed, holds.
6.β. Now that a^x has been defined for rational numbers x, we wish to define it for
real x. In doing so, make free use of the results of the preceding project. As
before, let a and b be real numbers exceeding 1. If u ∈ R, let

T_u(a) = {a^r : r ∈ Q, r ≤ u}.

Show that T_u(a) is a bounded non-empty subset of R and define

a^u = sup T_u(a).

Prove that this definition yields the same result as the previous one when u is
rational. Establish the properties that correspond to the statements given in parts
(d)-(g) of the preceding project. The very important function which has been
defined on R in this project is called the exponential function (to the base a).
Some alternative definitions will be given in later sections. Sometimes it is
convenient to denote this function by the symbol

exp_a
and denote its value at the real number u by exp_a(u) instead of a^u.
6.γ. Making use of the properties of the exponential function that were
established in the preceding project, show that exp_a is an injective function with domain
R and range {y ∈ R : y > 0}. Under our standing assumption that a > 1, this
exponential function is strictly increasing in the sense that if x < u, then exp_a(x) <
exp_a(u). Therefore, the inverse function exists with domain {v ∈ R : v > 0} and
range R. We call this inverse function the logarithm (to the base a) and denote it
by
log_a.

Show that log_a is a strictly increasing function and that

exp_a(log_a(v)) = v for v > 0, log_a(exp_a(u)) = u for u ∈ R.

Also show that log_a(1) = 0, log_a(a) = 1, and that

log_a(v) < 0 for v < 1, log_a(v) > 0 for v > 1.

Prove that if v, w > 0, then

log_a(vw) = log_a(v) + log_a(w).

Moreover, if v > 0 and x ∈ R, then

log_a(v^x) = x log_a(v).

Section 7 Cuts, Intervals, and the Cantor Set

Another method of completing the rational numbers to obtain R was
devised by Dedekind†; it is based on the notion of a "cut."
7.1 DEFINITION. An ordered pair (A, B) of non-void subsets of R is
said to form a cut if A ∩ B = ∅, A ∪ B = R, and a < b for all a ∈ A and all
b ∈ B.
A typical example of a cut in R is obtained for a fixed element ξ ∈ R by
defining
A = {x ∈ R : x ≤ ξ}, B = {x ∈ R : x > ξ}.
Alternatively, we could take
A_1 = {x ∈ R : x < ξ}, B_1 = {x ∈ R : x ≥ ξ}.
It is an important property of R that every cut in R is determined by
some real number. We shall now establish this property.

[Figure 7.1. A Dedekind cut.]

† RICHARD DEDEKIND (1831-1916) was a student of Gauss. He contributed to number
theory, but is best known for his work on the foundations of the real number system.

7.2 CUT PROPERTY. If (A, B) is a cut in R, then there exists a unique
number ξ ∈ R such that a ≤ ξ for all a ∈ A and ξ ≤ b for all b ∈ B.
PROOF. By hypothesis, the sets A and B are non-void. Any element
of B is an upper bound of A. Hence A has a supremum which we denote
by ξ. Since ξ is an upper bound of A, then a ≤ ξ for all a ∈ A.
If b ∈ B, then from the definition of a cut a < b for all a ∈ A. Hence b is
an upper bound of A and so ξ ≤ b. Thus the existence of a number with
the stated properties is demonstrated.
To establish the uniqueness of ξ, let η ∈ R be such that a ≤ η for all
a ∈ A and η ≤ b for all b ∈ B. It follows that η is an upper bound of A;
hence ξ ≤ η. If ξ < η, then there exists a number ζ = (ξ + η)/2 such that
ξ < ζ < η. Now either ζ ∈ A or ζ ∈ B. If ζ ∈ A, we have a contradiction of
the fact that a ≤ ξ for all a ∈ A. If ζ ∈ B, we have a contradiction of the
fact that η ≤ b for all b ∈ B. Therefore we must have ξ = η. Q.E.D.
Actually, what Dedekind did was, in essence, to define a real number to be a cut
in the rational number system. This procedure enables one "to construct" the real
number system R from the set Q of rational numbers.

Cells and Intervals

If ae€ R, then the sets

{xe R:x <a}, {xe R:x>a}


are called the open rays determined by a. Similarly, the sets
{xER:x <a}, {xER:x =a}

are called the closed rays determined by a. The point a is called the end
point of these rays. These sets are often denoted by the notations

(—%, a), (a, +o), (—%, a], [a, +o),

respectively; here —© and + are merely symbols and are not to be


considered elements of R.
If a, b¢ R and a <= b, then the set
{xe R:a<x<b}
is called the open cell determined by a and b and is often denoted by
(a,b). The set
{xeR:a<x <b}
is called the closed cell determined by a and b and is denoted by [a, b].
The sets

{xeR:a=<x<b}, {xe R:a<x<b}


7. CUTS, INTERVALS, AND THE CANTOR SETS 47

are called the half-open (or half-closed) cells determined by a and b and
are denoted by

[a,b), (a, b],


respectively. The points a, b are called the end points of these cells.
By an interval in $\mathbf{R}$, we mean either a ray, or a cell, or all of $\mathbf{R}$. Thus
there are ten different kinds of intervals in $\mathbf{R}$; namely,

$$\emptyset, \quad (-\infty, a), \quad (-\infty, a], \quad [a, b], \quad [a, b),$$

$$(a, b], \quad (a, b), \quad [b, +\infty), \quad (b, +\infty), \quad \mathbf{R},$$

where $a, b \in \mathbf{R}$ and $a < b$. Five of these intervals are bounded. Two are
bounded above but not below, and two are bounded below but not above.
The unit cell (or the unit interval) is the set $[0, 1] = \{x \in \mathbf{R} : 0 \le x \le 1\}$. It
will be denoted by the standard notation $\mathbf{I}$.
We shall say that a sequence of intervals $I_n$, $n \in \mathbf{N}$, is nested in case the
chain

$$I_1 \supseteq I_2 \supseteq I_3 \supseteq \cdots \supseteq I_n \supseteq I_{n+1} \supseteq \cdots$$

of inclusions holds. It is important to notice that a nested sequence of
intervals does not need to have a common point. Indeed, it is an exercise
to show that if $I_n = (n, +\infty)$, $n \in \mathbf{N}$, then the sequence of intervals obtained
is nested but has no common point. Similarly, if $J_n = (0, 1/n)$, $n \in \mathbf{N}$, then
the sequence is nested but has no common point.
However, it is a very important property of $\mathbf{R}$ that every nested
sequence of closed cells has a common point. We shall now prove that fact.
7.3 NESTED CELLS PROPERTY. If $n \in \mathbf{N}$, let $I_n$ be a non-void closed
cell in $\mathbf{R}$ and suppose that this sequence is nested in the sense that

$$I_1 \supseteq I_2 \supseteq \cdots \supseteq I_n \supseteq \cdots.$$

Then there exists an element which belongs to all of these cells.

PROOF. Suppose that $I_n = [a_n, b_n]$, where $a_n \le b_n$ for all $n \in \mathbf{N}$. We
note that $I_n \subseteq I_1$ for all $n$, hence $a_n \le b_1$ for all $n$. Hence the set $\{a_n : n \in \mathbf{N}\}$
is bounded above. We let $\xi$ be its supremum; hence $a_n \le \xi$ for all $n$.
We claim that $\xi \le b_n$ for all $n \in \mathbf{N}$. If not, there exists some $m \in \mathbf{N}$ such
that $b_m < \xi$. Since $\xi$ is the supremum of $\{a_n : n \in \mathbf{N}\}$ there must exist $a_p$
such that $b_m < a_p$. Now let $q$ be the larger of the natural numbers $m$ and
$p$. Since $a_1 \le a_2 \le \cdots \le a_n \le \cdots$ and $b_1 \ge b_2 \ge \cdots \ge b_n \ge \cdots$, we infer
that $b_q \le b_m < a_p \le a_q$. But this implies that $b_q < a_q$, contrary to the
assumption that $I_q = [a_q, b_q]$ is a non-void closed cell. Therefore $\xi \le b_n$ for
all $n \in \mathbf{N}$. Since $a_n \le \xi \le b_n$, we infer that $\xi \in I_n = [a_n, b_n]$ for all $n \in \mathbf{N}$.
Q.E.D.

We note that, under the hypotheses of 7.3, there may be more than one
common element. In fact, if we let $\eta = \inf \{b_n : n \in \mathbf{N}\}$, it is an exercise to
show that

$$[\xi, \eta] = \bigcap_{n \in \mathbf{N}} I_n.$$

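
The contrast between nested closed cells and nested open intervals can be checked on a concrete family. The following Python sketch is an illustration only: the closed cells $[1 - 1/n,\ 2 + 1/n]$ are a hypothetical example (not taken from the text), the open intervals are the $J_n = (0, 1/n)$ mentioned above, and all helper names are assumptions. Exact rational arithmetic avoids rounding questions.

```python
import math
from fractions import Fraction

def closed_cell(n):
    """Hypothetical nested closed cells I_n = [1 - 1/n, 2 + 1/n]."""
    return (1 - Fraction(1, n), 2 + Fraction(1, n))

def is_nested(cells):
    """Check that each cell (a, b) contains the next one in a finite list."""
    return all(a1 <= a2 and b2 <= b1
               for (a1, b1), (a2, b2) in zip(cells, cells[1:]))

def first_open_cell_missing(x):
    """Smallest n with x not in J_n = (0, 1/n), for 0 < x < 1."""
    return math.ceil(1 / x)

cells = [closed_cell(n) for n in range(1, 501)]
print(is_nested(cells))
# Every point of [1, 2] lies in all of the closed cells examined.
print(all(a <= 1 and 2 <= b for a, b in cells))
# A positive number, however small, is eventually left out of the J_n.
print(first_open_cell_missing(Fraction(1, 100)))
```

For this family the suprema of the left end points and the infima of the right end points are 1 and 2, so the common part is the cell $[1, 2]$, in line with the remark above; no such common point survives for the open intervals $J_n$.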
The Cantor Set

We shall now introduce a subset of the unit cell $\mathbf{I}$ which is of considerable
interest and is frequently useful in constructing examples and
counterexamples. We shall denote this set by $\mathbf{F}$ and refer to it as the Cantor set
(although it is also sometimes called Cantor’s ternary set or the Cantor
discontinuum).
One way of describing $\mathbf{F}$ is as the set of real numbers in $\mathbf{I}$ which have a
ternary (= base 3) expansion using only the digits 0, 2. However, we
choose to define it in different terms. In a sense that will be made more
precise, $\mathbf{F}$ consists of those points in $\mathbf{I}$ that remain after “middle third”
intervals have been successively removed.
To be more explicit: if we remove the open middle third of $\mathbf{I}$ we obtain
the set

$$F_1 = [0, \tfrac{1}{3}] \cup [\tfrac{2}{3}, 1].$$

If we remove the open middle third of each of the two closed intervals in
$F_1$, we obtain the set

$$F_2 = [0, \tfrac{1}{9}] \cup [\tfrac{2}{9}, \tfrac{1}{3}] \cup [\tfrac{2}{3}, \tfrac{7}{9}] \cup [\tfrac{8}{9}, 1].$$

Hence $F_2$ is the union of $4\ (= 2^2)$ closed intervals all of which are of the
form $[k/3^2, (k + 1)/3^2]$. We now obtain the set $F_3$ by removing the open
middle third of each of these sets. In general, if $F_n$ has been constructed
and consists of the union of $2^n$ intervals of the form $[k/3^n, (k + 1)/3^n]$,
then we obtain $F_{n+1}$ by removing the open middle third of each of
these intervals. The Cantor set is what remains after this process has been
carried out for each $n$ in $\mathbf{N}$.
7.4 DEFINITION. The Cantor set $\mathbf{F}$ is the intersection of the sets $F_n$,
$n \in \mathbf{N}$, obtained by successive removal of open middle thirds.
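
The successive removal of middle thirds is easy to carry out mechanically. The Python sketch below is an illustration, not part of the formal development; the function names are hypothetical, and intervals are stored as pairs of exact rationals. It builds the closed intervals making up $F_n$ and prints their number and total length.

```python
from fractions import Fraction

def remove_middle_thirds(intervals):
    """Replace each closed interval [a, b] by its two outer closed thirds."""
    result = []
    for a, b in intervals:
        third = (b - a) / 3
        result.append((a, a + third))
        result.append((b - third, b))
    return result

def cantor_stage(n):
    """Return the list of closed intervals whose union is F_n (F_0 = [0, 1])."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = remove_middle_thirds(intervals)
    return intervals

F2 = cantor_stage(2)
print([(str(a), str(b)) for a, b in F2])

F5 = cantor_stage(5)
print(len(F5), sum(b - a for a, b in F5))
```

The first line of output lists the four intervals of $F_2$ displayed above; the second shows that $F_5$ consists of $2^5$ intervals whose lengths add up to $(2/3)^5$.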
At first glance, it may appear that every point is ultimately removed by this
process. However, this is evidently not the case since the points $0$, $\tfrac{1}{3}$, $\tfrac{2}{3}$, $1$ belong to
all the sets $F_n$, $n \in \mathbf{N}$, and hence to the Cantor set $\mathbf{F}$. In fact, it is easily seen that
there are an infinite number of points in F, even though F is relatively thin in some
other respects. Indeed, it is not difficult to show that there are a non-denumerable
number of elements of F and that the points of F can be put into one-one
correspondence with the points of I. Hence the set F contains a large number of
elements.
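
That $\mathbf{F}$ contains points other than the end points of removed intervals can be explored with a small membership test for the sets $F_n$. The Python sketch below is an illustration under stated assumptions: it relies on the observation that the part of $F_n$ in $[0, \tfrac13]$ is a copy of $F_{n-1}$ scaled by $\tfrac13$, and the part in $[\tfrac23, 1]$ is a second such copy shifted by $\tfrac23$; the name `in_stage` is hypothetical.

```python
from fractions import Fraction

def in_stage(x, n):
    """Return True if the rational number x lies in F_n (with F_0 = [0, 1])."""
    for _ in range(n):
        if 0 <= x <= Fraction(1, 3):
            x = 3 * x          # rescale the left third onto [0, 1]
        elif Fraction(2, 3) <= x <= 1:
            x = 3 * x - 2      # rescale the right third onto [0, 1]
        else:
            return False       # x is outside F_1 at this scale, so it was removed
    return 0 <= x <= 1

print(in_stage(Fraction(1, 4), 60))   # 1/4 survives 60 removal steps
print(in_stage(Fraction(1, 2), 1))    # 1/2 is removed at the first step
print(in_stage(Fraction(4, 27), 1), in_stage(Fraction(4, 27), 2))
```

The number $\tfrac14$ is sent to $\tfrac34$ and back to $\tfrac14$ by the two rescalings, so it is never removed, although it is not an end point of any of the intervals making up the sets $F_n$.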
Figure 7.2. The Cantor set (the sets $F_1$, $F_2$, $F_3$).

We now give two senses in which F is “thin.” First we observe that F does not
contain any non-void interval. For if x belongs to F and (a, b) is an open interval
containing x, then (a, b) contains some middle thirds that were removed to obtain
F. (Why?) Hence (a, b) is not a subset of the Cantor set, but contains infinitely
many points in its complement $\mathcal{C}(\mathbf{F})$.
A second sense in which $\mathbf{F}$ is thin refers to “length.” While it is not possible to
define length for arbitrary subsets of $\mathbf{R}$, it is easy to convince oneself that $\mathbf{F}$ cannot
have positive length. For, the length of $F_1$ is $\tfrac{2}{3}$, that of $F_2$ is $\tfrac{4}{9}$, and, in general, the
length of $F_n$ is $(\tfrac{2}{3})^n$. Since $\mathbf{F}$ is a subset of $F_n$, it cannot have length exceeding that of
$F_n$. Since this must be true for each $n$ in $\mathbf{N}$, we conclude that $\mathbf{F}$, although
uncountable, cannot have positive length.
As strange as the Cantor set may seem, it is relatively well-behaved in many
respects. It provides us with a bit of insight into how complicated subsets of R can
be and how little our intuition guides us. It also serves as a test for the concepts
that we will introduce in later sections and whose import is not fully grasped in
terms of intervals and other very elementary subsets.

Models for R
In Sections 4—6, we have introduced R axiomatically in the sense that we
have listed some properties that we assume it to have. This approach
raises the question as to whether such a set actually exists and to what
extent it is uniquely determined. While we shall not settle these questions,
a few remarks about them are certainly appropriate.
The existence of a set which is a complete ordered field can be demon-
strated by actual construction. If one feels sufficiently familiar with the
rational field Q, one can define real numbers to be special subsets of Q and
define addition, multiplication, and order relations between these subsets
in such a way as to obtain a complete ordered field. There are two
standard procedures that are used in doing this: one is Dedekind’s method
of “cuts” which is discussed in the book of Rudin that is cited in the
References. The second way is Cantor’s method of “Cauchy sequences”
which is discussed in the book of Hamilton and Landin.

In the last paragraph we have asserted that it is possible to construct a
model of R from Q (in at least two different ways). It is also possible to
construct a model of R from the set N of natural numbers and this is often
taken as the starting point by those who, like Kronecker,† regard the
natural numbers as given by God. However, since even the set of natural
numbers has its subtleties (such as the Well-ordering Property), we feel that
the most satisfactory procedure is to go through the process of first con-
structing the set N from primitive set theoretic concepts, then developing
the set Z of integers, next constructing the field Q of rationals, and finally
the set R. This procedure is not particularly difficult to follow and it is
edifying; however, it is rather lengthy. Since it is presented in detail in the
book of Hamilton and Landin, it will not be given here.
From the remarks already made, it is clear that complete ordered fields
can be constructed in different ways. Thus we cannot say that there is a
unique complete ordered field. In a sense, all of the methods of construction
suggested above lead to complete ordered fields that are “isomorphic.”
(This means that if $R_1$ and $R_2$ are complete ordered fields obtained
by these constructions, then there exists a one-one mapping $\varphi$ of $R_1$ onto
$R_2$ such that (i) $\varphi$ sends a rational element of $R_1$ into the corresponding
rational element of $R_2$, (ii) $\varphi$ sends $a + b$ into $\varphi(a) + \varphi(b)$, (iii) $\varphi$ sends $ab$
into $\varphi(a)\varphi(b)$, and (iv) $\varphi$ sends a positive element of $R_1$ into a positive
element of $R_2$.) Within naive set theory, we can provide an argument
showing that any two complete ordered fields are isomorphic in the sense
described. Whether this argument can be formalized within a given sys-
tem of logic depends on the rules of inference employed in the system.
Thus the question of the extent to which the real number system can be
regarded as being uniquely determined is a rather delicate logical issue.
However, for our purposes this uniqueness (or lack of it) is not important,
for we can choose any particular complete ordered field as our model for
the real number system.

Exercises
7.A. If $(A, B)$ is a cut in $\mathbf{R}$, show that $\sup A = \inf B$.
7.B. If the cuts $(A, B)$ and $(A', B')$ determine the real numbers $\xi$ and $\xi'$,
respectively, show that $\xi < \xi'$ implies that $A \subseteq A'$, $A \ne A'$.
7.C. Is the converse of the preceding exercise true?
7.D. Let $A = \{x \in \mathbf{R} : x \le 0 \text{ or } x^2 < 2\}$ and $B = \{x \in \mathbf{R} : x > 0 \text{ and } x^2 \ge 2\}$. Show
that $(A, B)$ is a cut in $\mathbf{R}$.

† LEOPOLD KRONECKER (1823-1891) studied with Dirichlet in Berlin and Kummer in Bonn.
After making a fortune before he was thirty, he returned to mathematics. He is known for his
work in algebra and number theory and for his personal opposition to the ideas of Cantor on
set theory.

7.E. Let $I_n = (n, +\infty)$ for $n \in \mathbf{N}$. Show that the sequence of intervals is nested,
but that there is no common point.
7.F. Let $J_n = (0, 1/n)$ for $n \in \mathbf{N}$. Show that this sequence of intervals is nested,
but that there is no common point.
7.G. If $I_n = [a_n, b_n]$, $n \in \mathbf{N}$, is a nested sequence of closed cells, show that

$$a_1 \le a_2 \le \cdots \le a_n \le \cdots \le b_m \le \cdots \le b_2 \le b_1.$$

If we put $\xi = \sup \{a_n : n \in \mathbf{N}\}$ and $\eta = \inf \{b_m : m \in \mathbf{N}\}$, show that
$[\xi, \eta] = \bigcap_{n \in \mathbf{N}} I_n$.
7.H. Show that every number in the Cantor set has a ternary (= base 3)
expansion using only the digits 0, 2.
7.I. Show that the collection of “right hand” end points in $\mathbf{F}$ is denumerable.
Show that if all these end points are deleted from $\mathbf{F}$, then what remains can be put
into one-one correspondence with all of $[0, 1)$. Conclude that the set $\mathbf{F}$ is not
countable.
7.J. Every open interval $(a, b)$ which contains a point of $\mathbf{F}$ also contains an entire
“middle third” set which belongs to $\mathcal{C}(\mathbf{F})$. Hence $\mathbf{F}$ does not contain any non-void
open interval.
7.K. By removing sets with ever decreasing length, show that we can construct a
“Cantor-like” set which has positive length. How large can we make the length of
this set?
7.L. Show that $\mathbf{F}$ is not the union of a countable collection of closed intervals.
II
THE TOPOLOGY
OF CARTESIAN SPACES

The sections of Chapter I were devoted to developing the algebraic
properties, the order properties, and the completeness property of the real
number system. Considerable use of these properties will be made in this
and later chapters.
Although it would be possible to turn immediately to a discussion of
sequences of real numbers and continuous real functions, we prefer to
delay the study of these topics a little longer. Indeed, we shall insert here
the definitions of a vector space, a normed space, and an inner product
space. We do so because these notions are easily grasped and because
such spaces arise throughout all of analysis (to say nothing of its applications
to geometry, physics, engineering, economics, etc.). Of course, the Cartesian
spaces $\mathbf{R}^p$ will be of especial interest to us. Fortunately, our intuition for
$\mathbf{R}^2$ and $\mathbf{R}^3$ usually carries over without much change to the space $\mathbf{R}^p$, and a
knowledge of these spaces is of help in analyzing more general spaces.

Section 8 Vector and Cartesian Spaces


A “vector space” is a set in which one can add two elements, and can
multiply an element by a real number, in such a way that certain familiar
properties hold. We shall now be more precise.
8.1 DEFINITION. A vector space is a set $V$ (whose elements are called
vectors) equipped with two binary operations, called vector addition and
scalar multiplication.
If $x, y \in V$ there is an element $x + y$ in $V$, called the vector sum of $x$ and
$y$. This vector addition operation satisfies the following properties:
(A1) $x + y = y + x$ for all $x, y$ in $V$;
(A2) $(x + y) + z = x + (y + z)$ for all $x, y, z$ in $V$;
(A3) there exists an element $0$ in $V$ such that $0 + x = x$ and
$x + 0 = x$ for all $x$ in $V$;
(A4) given $x$ in $V$ there is an element $-x$ in $V$ such that $x + (-x) = 0$
and $(-x) + x = 0$.
If $a \in \mathbf{R}$ and $x \in V$ there is an element $ax$ in $V$, called the multiple of $a$
and $x$. This scalar multiplication operation satisfies the following
properties:
(M1) $1x = x$ for all $x \in V$;
(M2) $a(bx) = (ab)x$ for all $a, b \in \mathbf{R}$ and $x \in V$;
(D) $a(x + y) = ax + ay$ and $(a + b)x = ax + bx$ for all $a, b \in \mathbf{R}$
and $x, y \in V$.
We shall now give some elementary, but important, examples of vector
spaces.
8.2 EXAMPLES. (a) The real number system is a vector space where the
addition and scalar multiplication operations are the usual addition and
multiplication of real numbers.
(b) Let $\mathbf{R}^2$ denote the Cartesian product $\mathbf{R} \times \mathbf{R}$. Hence $\mathbf{R}^2$ consists of all
ordered pairs $(x_1, x_2)$ of real numbers. If we define vector addition and
scalar multiplication by

$$(x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2),$$
$$a(x_1, x_2) = (ax_1, ax_2),$$

then it can readily be checked that the properties in Definition 8.1 are
satisfied. [Here $0 = (0, 0)$ and $-(x_1, x_2) = (-x_1, -x_2)$.] Hence $\mathbf{R}^2$ is a
vector space under these operations.
(c) Let $p \in \mathbf{N}$ and let $\mathbf{R}^p$ denote the collection of all ordered “$p$-tuples”

$$(x_1, x_2, \ldots, x_p)$$

with $x_i \in \mathbf{R}$ for $i = 1, \ldots, p$. If we define vector addition and scalar
multiplication by

$$(x_1, x_2, \ldots, x_p) + (y_1, y_2, \ldots, y_p) = (x_1 + y_1, x_2 + y_2, \ldots, x_p + y_p),$$
$$a(x_1, x_2, \ldots, x_p) = (ax_1, ax_2, \ldots, ax_p),$$

then it can readily be checked that $\mathbf{R}^p$ is a vector space under these
operations. [Here it is seen that $0 = (0, 0, \ldots, 0)$ and $-(x_1, x_2, \ldots, x_p) =
(-x_1, -x_2, \ldots, -x_p)$.]
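
The coordinatewise operations of Example (c) translate directly into code. The Python sketch below is an illustration only: vectors are modelled as tuples of exact rationals, the names `vec_add` and `scalar_mul` are hypothetical, and the properties of Definition 8.1 are merely spot-checked on a few sample vectors, which of course proves nothing in general.

```python
from fractions import Fraction

def vec_add(x, y):
    """Coordinatewise sum of two p-tuples."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of coordinates")
    return tuple(xi + yi for xi, yi in zip(x, y))

def scalar_mul(a, x):
    """Multiple of the p-tuple x by the real number a."""
    return tuple(a * xi for xi in x)

x = (Fraction(1), Fraction(-2), Fraction(1, 2))
y = (Fraction(3, 4), Fraction(0), Fraction(5))
z = (Fraction(-1), Fraction(7, 3), Fraction(2))
zero = (Fraction(0),) * 3
a, b = Fraction(3), Fraction(-1, 4)

checks = {
    "A1": vec_add(x, y) == vec_add(y, x),
    "A2": vec_add(vec_add(x, y), z) == vec_add(x, vec_add(y, z)),
    "A3": vec_add(zero, x) == x and vec_add(x, zero) == x,
    "A4": vec_add(x, scalar_mul(-1, x)) == zero,
    "M1": scalar_mul(1, x) == x,
    "M2": scalar_mul(a, scalar_mul(b, x)) == scalar_mul(a * b, x),
    "D": (scalar_mul(a, vec_add(x, y)) == vec_add(scalar_mul(a, x), scalar_mul(a, y))
          and scalar_mul(a + b, x) == vec_add(scalar_mul(a, x), scalar_mul(b, x))),
}
print(checks)
```

Each entry of the printed dictionary corresponds to one of the properties (A1) to (A4), (M1), (M2), (D); every one of them reduces to the matching property of the real numbers applied in each coordinate.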
(d) Let $S$ be any set and let $\mathbf{R}^S$ denote the collection of all functions $u$
with domain $S$ and range in $\mathbf{R}$. (Hence $\mathbf{R}^S$ is the collection of all
real-valued functions defined in $S$.) If we define $u + v$ and $au$ by

$$(u + v)(s) = u(s) + v(s),$$
$$(au)(s) = a\,u(s),$$

for all $s \in S$, then it can readily be checked that $\mathbf{R}^S$ is a vector space under
these operations. [Here $0$ is the function identically equal to zero, and $-u$
is the function whose value at $s \in S$ is $-u(s)$.]
In later sections, we shall encounter many other vector spaces.
Generally we shall write $x - y$ instead of $x + (-y)$.

Inner Products and Norms

The reader will note that the scalar multiplication in a vector space $V$ is a
function with domain $\mathbf{R} \times V$ and range $V$. Many vector spaces are also
equipped with a function with domain $V \times V$ and range $\mathbf{R}$ that is of
importance.
8.3 DEFINITION. If $V$ is a vector space, then an inner product (or dot
product) is a function on $V \times V$ to $\mathbf{R}$, denoted by $(x, y) \mapsto x \cdot y$, satisfying
the properties:
(i) $x \cdot x \ge 0$ for all $x \in V$;
(ii) $x \cdot x = 0$ if and only if $x = 0$;
(iii) $x \cdot y = y \cdot x$ for all $x, y \in V$;
(iv) $x \cdot (y + z) = x \cdot y + x \cdot z$ and $(x + y) \cdot z = x \cdot z + y \cdot z$ for all $x, y,
z \in V$;
(v) $(ax) \cdot y = a(x \cdot y) = x \cdot (ay)$ for all $a \in \mathbf{R}$, and $x, y \in V$.
A vector space in which an inner product has been defined is called an
inner product space.
It is possible for different inner products to be defined in the same vector
space (cf. Exercise 8.D).
8.4 EXAMPLES. (a) The ordinary multiplication in $\mathbf{R}$ satisfies the
above properties, so $\mathbf{R}$ is an inner product space.
(b) In $\mathbf{R}^2$, we define

$$(x_1, x_2) \cdot (y_1, y_2) = x_1 y_1 + x_2 y_2.$$

It is easy to check that this defines an inner product on $\mathbf{R}^2$.
(c) In $\mathbf{R}^p$, we define

$$(x_1, x_2, \ldots, x_p) \cdot (y_1, y_2, \ldots, y_p) = x_1 y_1 + x_2 y_2 + \cdots + x_p y_p.$$

It is easy to check that this defines an inner product on $\mathbf{R}^p$.
8.5 DEFINITION. If $V$ is a vector space, then a norm on $V$ is a
function on $V$ to $\mathbf{R}$ denoted by $x \mapsto \|x\|$ satisfying the properties:
(i) $\|x\| \ge 0$ for all $x \in V$;
(ii) $\|x\| = 0$ if and only if $x = 0$;
(iii) $\|ax\| = |a|\,\|x\|$ for all $a \in \mathbf{R}$, $x \in V$;
(iv) $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in V$.
A vector space in which a norm has been defined is called a normed space.

As we will see in the exercises, the same vector space can have several
interesting norms.

8.6 EXAMPLES. (a) The absolute value function on $\mathbf{R}$ satisfies the
properties in 8.5.
(b) In $\mathbf{R}^2$, we define

$$\|(x_1, x_2)\| = (x_1^2 + x_2^2)^{1/2}.$$

Properties (i), (ii), and (iii) are very easily checked. Property (iv) is a bit
more complicated.
(c) In $\mathbf{R}^p$, we define

$$\|(x_1, x_2, \ldots, x_p)\| = (x_1^2 + x_2^2 + \cdots + x_p^2)^{1/2}.$$

Again, properties (i), (ii), and (iii) are easy.

We shall now give a theorem which asserts that an inner product can
always be used to define a norm in a very natural way.
8.7 THEOREM. Let $V$ be an inner product space and define $\|x\|$ by

$$\|x\| = \sqrt{x \cdot x} \quad \text{for } x \in V.$$

Then $x \mapsto \|x\|$ is a norm on $V$ and satisfies the property that

$$(*) \qquad x \cdot y \le \|x\|\,\|y\|.$$

Moreover, if $x$ and $y$ are non-zero, then the equality holds in $(*)$ if and only if
there is some strictly positive real number $c$ such that $x = cy$.
PROOF. Since $x \cdot x \ge 0$ for all $x \in V$, then the square root of $x \cdot x$ exists,
so $\|x\|$ is well-defined. The first three properties of the norm are direct
consequences of 8.3(i), (ii), and (v). To prove $(*)$, let $a, b \in \mathbf{R}$, $x, y \in V$, and
let $z = ax - by$. If we use properties 8.3(i), (iii), (iv), and (v), we get

$$0 \le z \cdot z = a^2\, x \cdot x - 2ab\, x \cdot y + b^2\, y \cdot y.$$

Now take $a = \|y\|$ and $b = \|x\|$, to get

$$0 \le \|y\|^2 \|x\|^2 - 2\,\|y\|\,\|x\|\, x \cdot y + \|x\|^2 \|y\|^2
= 2\,\|x\|\,\|y\|\,(\|x\|\,\|y\| - x \cdot y).$$

If $x = 0$ or $y = 0$, both sides of $(*)$ vanish; otherwise we may divide by
$2\,\|x\|\,\|y\| > 0$. Hence the inequality $(*)$ holds.
If $x = cy$ with $c > 0$, then $\|x\| = c\,\|y\|$ and so

$$x \cdot y = (cy) \cdot y = c\,(y \cdot y) = c\,\|y\|^2 = \|x\|\,\|y\|,$$

so that equality holds in $(*)$. Conversely, if $x \cdot y = \|x\|\,\|y\|$ the calculation in
the preceding paragraph shows that $z = \|y\|\,x - \|x\|\,y$ has the property that
$z \cdot z = 0$. Therefore $z = 0$ and, since $x$ and $y$ are non-zero vectors, we can
take $c = \|x\|/\|y\|$. To establish 8.5(iv), we use $(*)$ to show that

$$\|x + y\|^2 = (x + y) \cdot (x + y) = x \cdot x + x \cdot y + y \cdot x + y \cdot y$$
$$= \|x\|^2 + 2\,(x \cdot y) + \|y\|^2 \le \|x\|^2 + 2\,\|x\|\,\|y\| + \|y\|^2 = (\|x\| + \|y\|)^2,$$

whence it follows that $\|x + y\| \le \|x\| + \|y\|$ for all $x, y \in V$. Q.E.D.
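
The two inequalities just proved can be tried out numerically in $\mathbf{R}^p$ with the inner product of Example 8.4(c). The Python sketch below is an illustration only: the helper names are hypothetical, the sample vectors are random, and the small tolerance `tol` is an assumption made to absorb floating-point rounding.

```python
import math
import random

def dot(x, y):
    """Inner product of Example 8.4(c)."""
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    """Norm induced by the inner product, as in Theorem 8.7."""
    return math.sqrt(dot(x, x))

def schwarz_holds(x, y, tol=1e-9):
    return dot(x, y) <= norm(x) * norm(y) + tol

def triangle_holds(x, y, tol=1e-9):
    s = tuple(xi + yi for xi, yi in zip(x, y))
    return norm(s) <= norm(x) + norm(y) + tol

random.seed(0)
samples = [tuple(random.uniform(-5, 5) for _ in range(4)) for _ in range(200)]
print(all(schwarz_holds(x, y) and triangle_holds(x, y)
          for x in samples[:20] for y in samples))

# Equality case: x = c*y with c = 3 > 0.
y = (1.0, 2.0, 2.0)
x = tuple(3 * yi for yi in y)
print(dot(x, y), norm(x) * norm(y))
```

In the last line both printed numbers coincide, as the theorem predicts for $x = cy$ with $c > 0$; for the random pairs the inequality is typically strict.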
We leave the proof of the following corollary as an exercise.
8.8 COROLLARY. If $x, y$ are elements of $V$, then

$$(**) \qquad |x \cdot y| \le \|x\|\,\|y\|.$$

Moreover, if $y \ne 0$ then the equality can hold in $(**)$ if and only if there is a
real number $c$ such that $x = cy$.
Both of the inequalities $(*)$ and $(**)$ are called the Schwarz Inequality, or
the Cauchy-Bunyakovskii-Schwarz Inequality.† They will be frequently
used. The inequality 8.5(iv) is called the Triangle Inequality. We leave it
to the reader to show that

$$\bigl|\, \|x\| - \|y\| \,\bigr| \le \|x \pm y\| \le \|x\| + \|y\|$$

for any $x, y$ in a normed space.

The Cartesian Space $\mathbf{R}^p$

By the $p$-dimensional real Cartesian space we mean the set $\mathbf{R}^p$ equipped
with the vector addition and scalar multiplication defined in Example
8.2(c), and the inner product defined in Example 8.4(c). As we have seen,
this inner product induces the norm

$$\|(x_1, x_2, \ldots, x_p)\| = \sqrt{x_1^2 + x_2^2 + \cdots + x_p^2}.$$


† AUGUSTIN-LOUIS CAUCHY (1789-1857) was the founder of modern analysis but also made
profound contributions to other mathematical areas. He served as an engineer under
Napoleon, followed Charles X into self-imposed exile, and was excluded from his position at
the Collège de France during the years of the July monarchy because he would not take a
loyalty oath. Despite his political and religious activities, he found time to write 789
mathematical papers.
VICTOR BUNYAKOVSKII (1804-1889), a professor at St. Petersburg, established a
generalization of the Cauchy Inequality for integrals in 1859. His contribution was overlooked by
western writers and was later discovered independently by Schwarz.
HERMANN AMANDUS SCHWARZ (1843-1921) was a student and successor of Weierstrass at
Berlin. He made numerous contributions, especially to complex analysis.

The real numbers $x_1, x_2, \ldots, x_p$ are called the first, second, ..., $p$-th
coordinates (or components) of the vector $x = (x_1, x_2, \ldots, x_p)$.
In $\mathbf{R}^p$, the real number $\|x\|$ can be thought of either as the “length” of $x$
or as the distance from $x$ to $0$. More generally, we think of $\|x - y\|$ as the
distance from $x$ to $y$. With this interpretation, property 8.5(ii) asserts that
the distance from $x$ to $y$ is zero if and only if $x = y$. Property 8.5(iii) with
$a = -1$ asserts that $\|x - y\| = \|y - x\|$, which means that the distance from $x$
to $y$ is equal to the distance from $y$ to $x$. The Triangle Inequality implies
that

$$\|x - y\| \le \|x - z\| + \|z - y\|,$$

which means that the distance from $x$ to $y$ is no greater than the sum of the
distance from $x$ to $z$ and the distance from $z$ to $y$.
8.9 DEFINITION. Let $x \in \mathbf{R}^p$ and let $r > 0$. Then the set $\{y \in \mathbf{R}^p :
\|x - y\| < r\}$ is called the open ball with center $x$ and radius $r$. The
set $\{y \in \mathbf{R}^p : \|x - y\| \le r\}$ is called the closed ball with center $x$ and
radius $r$. The set $\{y \in \mathbf{R}^p : \|x - y\| = r\}$ is called the sphere with center $x$
and radius $r$.
The notion of a ball depends on the norm. It will be seen in the
exercises that some balls are not very “round.”
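
How much the ball depends on the norm can be seen by testing a few points of $\mathbf{R}^2$ against the unit balls of three norms: the Euclidean norm of 8.6(b) and the two norms introduced in Exercises 8.F and 8.G. The Python sketch below is an illustration only; the function names are hypothetical and the sample points are arbitrary choices.

```python
import math

def norm_1(x):
    """Sum of absolute values (the norm of Exercise 8.F)."""
    return sum(abs(xi) for xi in x)

def norm_2(x):
    """Euclidean norm of Example 8.6."""
    return math.sqrt(sum(xi * xi for xi in x))

def norm_inf(x):
    """Largest absolute value of a coordinate (the norm of Exercise 8.G)."""
    return max(abs(xi) for xi in x)

def in_open_ball(y, center, r, norm):
    """Test whether y lies in the open ball of Definition 8.9 for the given norm."""
    diff = tuple(yi - ci for yi, ci in zip(y, center))
    return norm(diff) < r

origin = (0.0, 0.0)
for point in [(0.6, 0.6), (0.9, 0.9), (0.4, 0.4)]:
    print(point, [in_open_ball(point, origin, 1.0, n)
                  for n in (norm_1, norm_2, norm_inf)])
```

The point $(0.6, 0.6)$ lies in the Euclidean unit ball and in the one for the sup norm but not in the one for the sum norm, while $(0.9, 0.9)$ lies only in the ball for the sup norm, which is a square rather than a disc.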
It is often convenient to have relations between the norm of a vector in
$\mathbf{R}^p$ and the magnitude of its components.
8.10 THEOREM. If $x = (x_1, x_2, \ldots, x_p)$ is any element of $\mathbf{R}^p$, then

$$|x_i| \le \|x\| \le \sqrt{p}\, \sup \{|x_1|, |x_2|, \ldots, |x_p|\}.$$

PROOF. Since $\|x\|^2 = x_1^2 + x_2^2 + \cdots + x_p^2$, it is plain that $|x_i| \le \|x\|$ for all
$i$. Similarly, if $M = \sup \{|x_1|, |x_2|, \ldots, |x_p|\}$, then $\|x\|^2 \le pM^2$, so that
$\|x\| \le \sqrt{p}\,M$. Q.E.D.
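
The two-sided estimate of Theorem 8.10 is easy to check on particular vectors. The Python sketch below is an illustration only; the helper names are hypothetical and the tolerance `tol` is an assumption made to absorb floating-point rounding.

```python
import math

def euclidean_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def sup_abs(x):
    """The number M = sup{|x_1|, ..., |x_p|} from the proof of Theorem 8.10."""
    return max(abs(xi) for xi in x)

def bounds_hold(x, tol=1e-12):
    """Check M <= ||x|| <= sqrt(p) * M, which is equivalent to Theorem 8.10."""
    n, m = euclidean_norm(x), sup_abs(x)
    return m <= n + tol and n <= math.sqrt(len(x)) * m + tol

print(bounds_hold((3.0, -4.0)), sup_abs((3.0, -4.0)), euclidean_norm((3.0, -4.0)))
# The right-hand bound is attained when all coordinates have the same magnitude.
print(euclidean_norm((1.0, 1.0, 1.0, 1.0)), math.sqrt(4) * sup_abs((1.0, 1.0, 1.0, 1.0)))
```

For $x = (3, -4)$ the chain reads $4 \le 5 \le \sqrt{2} \cdot 4$; for $x = (1, 1, 1, 1)$ the right-hand inequality becomes an equality, so the factor $\sqrt{p}$ cannot be improved.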

Figure 8.1. An open ball with center $x$; a closed ball with center $x$.

The inequality just established asserts, in a quantitative fashion, that if
the norm of $x$ is small, then the lengths of its components are small, and
conversely.

Exercises

8.A. If V is a vector space and if x + z =x for some x and z in V, show that z=0.
Hence the zero element in V is unique.
8.B. If x+y =0 for some x and y in V, show that y=—x.
8.C. Let $S = \{1, 2, \ldots, p\}$, for some $p \in \mathbf{N}$. Show that the vector space $\mathbf{R}^S$ is
“essentially the same” as the space $\mathbf{R}^p$.
8.D. If $w_1$ and $w_2$ are strictly positive, show that the definition

$$(x_1, x_2) \cdot (y_1, y_2) = x_1 y_1 w_1 + x_2 y_2 w_2$$

yields an inner product on $\mathbf{R}^2$. Generalize this to $\mathbf{R}^p$.
8.E. The definition

$$(x_1, x_2) \cdot (y_1, y_2) = x_1 y_1$$

is not an inner product on $\mathbf{R}^2$. Why?
8.F. If $x = (x_1, x_2, \ldots, x_p) \in \mathbf{R}^p$, define $\|x\|_1$ by

$$\|x\|_1 = |x_1| + |x_2| + \cdots + |x_p|.$$

Prove that $x \mapsto \|x\|_1$ is a norm on $\mathbf{R}^p$.
8.G. If $x = (x_1, x_2, \ldots, x_p) \in \mathbf{R}^p$, define $\|x\|_\infty$ by

$$\|x\|_\infty = \sup \{|x_1|, |x_2|, \ldots, |x_p|\}.$$

Prove that $x \mapsto \|x\|_\infty$ is a norm on $\mathbf{R}^p$.
8.H. In the set $\mathbf{R}^2$, describe the sets

$$S_1 = \{x \in \mathbf{R}^2 : \|x\|_1 < 1\}, \qquad S_\infty = \{x \in \mathbf{R}^2 : \|x\|_\infty < 1\}.$$

8.I. If $x, y \in \mathbf{R}^p$, the norm defined in 8.6(c) satisfies the Parallelogram Identity:

$$\|x + y\|^2 + \|x - y\|^2 = 2\,(\|x\|^2 + \|y\|^2).$$

Prove this and show that it can be interpreted as saying that the sum of the squares of
the lengths of the four sides of a parallelogram equals the sum of the squares of the
diagonals.
8.J. Show that the norms defined in Exercises 8.F and 8.G do not satisfy the
Parallelogram Identity.
8.K. Show that there exist positive constants $a$, $b$ such that

$$a\,\|x\|_1 \le \|x\| \le b\,\|x\|_1 \quad \text{for all } x \in \mathbf{R}^p.$$

Find the largest constant $a$ and the smallest constant $b$ with this property.
8.L. Show that there exist positive constants $a$, $b$ such that

$$a\,\|x\|_1 \le \|x\|_\infty \le b\,\|x\|_1 \quad \text{for all } x \in \mathbf{R}^p.$$

Find the largest constant $a$ and the smallest constant $b$ with this property.

8.M. If $x, y$ belong to $\mathbf{R}^p$, is it true that

$$|x \cdot y| \le \|x\|_1\,\|y\|_1 \quad \text{and} \quad |x \cdot y| \le \|x\|_\infty\,\|y\|_\infty\ ?$$

8.N. If $x, y$ belong to $\mathbf{R}^p$, then is it true that the relation

$$\|x + y\| = \|x\| + \|y\|$$

holds if and only if $x = cy$ or $y = cx$ with $c \ge 0$?
8.O. Let $x, y$ belong to $\mathbf{R}^p$, then is it true that the relation

$$\|x + y\|_\infty = \|x\|_\infty + \|y\|_\infty$$

holds if and only if $x = cy$ or $y = cx$ with $c \ge 0$?
8.P. If $x, y$ belong to $\mathbf{R}^p$, then

$$\|x + y\|^2 = \|x\|^2 + \|y\|^2$$

holds if and only if $x \cdot y = 0$. In this case, one says that $x$ and $y$ are orthogonal or
perpendicular.
8.Q. A subset $K$ of $\mathbf{R}^p$ is said to be convex if, whenever $x, y$ belong to $K$ and $t$ is a
real number such that $0 \le t \le 1$, then the point

$$(1 - t)x + ty = x + t(y - x)$$

also belongs to $K$. Interpret this condition geometrically and show that the subsets

$$K_1 = \{x \in \mathbf{R}^2 : \|x\| \le 1\},$$
$$K_2 = \{(\xi, \eta) \in \mathbf{R}^2 : 0 < \xi < \eta\},$$
$$K_3 = \{(\xi, \eta) \in \mathbf{R}^2 : 0 \le \eta \le \xi \le 1\},$$

are convex but that the subset

$$K_4 = \{x \in \mathbf{R}^2 : \|x\| = 1\}$$

is not convex.
8.R. The intersection of any collection of convex subsets of $\mathbf{R}^p$ is convex. The
union of two convex subsets of $\mathbf{R}^p$ may not be convex.
8.S. If $M$ is any set, then a function $d : M \times M \to \mathbf{R}$ is called a metric on $M$ if it
satisfies:
(i) $d(x, y) \ge 0$ for all $x, y$ in $M$;
(ii) $d(x, y) = 0$ if and only if $x = y$;
(iii) $d(x, y) = d(y, x)$ for all $x, y$ in $M$;
(iv) $d(x, y) \le d(x, z) + d(z, y)$ for all $x, y, z$ in $M$.
Show that if $x \mapsto \|x\|$ is any norm on a vector space $V$ and if we define $d$ by
$d(x, y) = \|x - y\|$, for $x, y \in V$, then $d$ is a metric on $V$.
8.T. Suppose that $d$ is a metric on a set $M$. By using Definition 8.9 as a model,
define an open ball with center $x \in M$ and radius $r$. Interpret the sets $S_1$ and $S_\infty$ in
Exercise 8.H as open balls in $\mathbf{R}^2$ with respect to two different metrics. Interpret
Exercise 8.K as saying that a ball with center $0$ relative to the metric $d_2$ (derived from
the norm in 8.6(b)) contains and is contained in balls with center $0$ relative to the
metric $d_1$ derived from $\|\ \|_1$. Make similar interpretations of Exercise 8.L and
Theorem 8.10.
8.U. Let $M$ be any set and let $d$ be defined on $M \times M$ by the requirement that

$$d(x, y) = \begin{cases} 0 & \text{if } x = y, \\ 1 & \text{if } x \ne y. \end{cases}$$

Show that $d$ gives a metric on $M$ in the sense defined in Exercise 8.S. If $x$ is any point
in $M$, then the open ball with center $x$ and radius 1 (relative to the metric $d$) consists
of precisely one point. However, the open ball with center $x$ and radius 2 (relative to
$d$) consists of all of $M$. This metric $d$ is sometimes called the discrete metric on the
set $M$.

Projects

8.α. In this project we develop some important inequalities.
(a) Let $a$ and $b$ be positive real numbers. Show that

$$ab \le (a^2 + b^2)/2,$$

and that the equality holds if and only if $a = b$. (Hint: consider $(a - b)^2$.)
(b) Let $a_1$ and $a_2$ be positive real numbers. Show that

$$\sqrt{a_1 a_2} \le (a_1 + a_2)/2$$

and that the equality holds if and only if $a_1 = a_2$.
(c) Let $a_1, a_2, \ldots, a_m$ be $m = 2^n$ positive real numbers. Show that

$$(*) \qquad (a_1 a_2 \cdots a_m)^{1/m} \le (a_1 + a_2 + \cdots + a_m)/m$$

and that the equality holds if and only if $a_1 = \cdots = a_m$.
(d) Show that the inequality $(*)$ between the geometric mean and the arithmetic
mean holds even when $m$ is not a power of 2. (Hint: if $2^{n-1} < m < 2^n$, let $b_j = a_j$ for
$j = 1, \ldots, m$ and let

$$b_j = (a_1 + a_2 + \cdots + a_m)/m$$

for $j = m + 1, \ldots, 2^n$. Now apply part (c) to the numbers $b_1, b_2, \ldots, b_{2^n}$.)
(e) Let $a_1, a_2, \ldots, a_n$ and $b_1, b_2, \ldots, b_n$ be two sets of real numbers. Prove
Lagrange’s Identity†

$$\Bigl\{\sum_{j=1}^{n} a_j b_j\Bigr\}^2 = \Bigl\{\sum_{j=1}^{n} a_j^2\Bigr\}\Bigl\{\sum_{k=1}^{n} b_k^2\Bigr\} - \frac{1}{2} \sum_{j,k=1}^{n} (a_j b_k - a_k b_j)^2.$$

(Hint: experiment with the cases $n = 2$ and $n = 3$ first.)
(f) Use part (e) to establish Cauchy’s Inequality

$$\Bigl\{\sum_{j=1}^{n} a_j b_j\Bigr\}^2 \le \Bigl\{\sum_{j=1}^{n} a_j^2\Bigr\}\Bigl\{\sum_{k=1}^{n} b_k^2\Bigr\}.$$

Show that the equality holds if and only if the ordered sets $(a_1, a_2, \ldots, a_n)$ and
$(b_1, b_2, \ldots, b_n)$ are proportional.
(g) Use part (f) and establish the Triangle Inequality

$$\Bigl\{\sum_{j=1}^{n} (a_j + b_j)^2\Bigr\}^{1/2} \le \Bigl\{\sum_{j=1}^{n} a_j^2\Bigr\}^{1/2} + \Bigl\{\sum_{j=1}^{n} b_j^2\Bigr\}^{1/2}.$$

† JOSEPH-LOUIS LAGRANGE (1736-1813) was born in Turin, where he became professor at the
age of nineteen. He later went to Berlin for twenty years as successor to Euler and then to
Paris. He is best known for his work on the calculus of variations and analytical mechanics.

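8.β. In this project, let $\{a_1, a_2, \ldots, a_n\}$, and so forth, be sets of $n$ positive real
numbers.
(a) It can be proved (for example, by using the Mean Value Theorem) that if $a$ and
$b$ are positive and $0 < \alpha < 1$, then

$$a^{\alpha} b^{1-\alpha} \le \alpha a + (1 - \alpha) b$$

and that the equality holds if and only if $a = b$. Assuming this, let $r > 1$ and let $s$
satisfy

$$\frac{1}{r} + \frac{1}{s} = 1$$

(so that $s > 1$ and $r + s = rs$). Show that if $A$ and $B$ are positive, then

$$AB \le \frac{A^r}{r} + \frac{B^s}{s}$$

and that the equality holds if and only if $A^r = B^s$.
(b) Let $\{a_1, \ldots, a_n\}$ and $\{b_1, \ldots, b_n\}$ be positive real numbers. If $r, s > 1$ and
$(1/r) + (1/s) = 1$, establish Hölder’s Inequality†

$$\sum_{j=1}^{n} a_j b_j \le \Bigl\{\sum_{j=1}^{n} a_j^r\Bigr\}^{1/r} \Bigl\{\sum_{j=1}^{n} b_j^s\Bigr\}^{1/s}.$$

(Hint: Let $A = \{\sum a_j^r\}^{1/r}$ and $B = \{\sum b_j^s\}^{1/s}$ and apply part (a) to $a_j/A$ and $b_j/B$.)
(c) Using Hölder’s Inequality, establish the Minkowski Inequality‡

$$\Bigl\{\sum_{j=1}^{n} (a_j + b_j)^r\Bigr\}^{1/r} \le \Bigl\{\sum_{j=1}^{n} a_j^r\Bigr\}^{1/r} + \Bigl\{\sum_{j=1}^{n} b_j^r\Bigr\}^{1/r}.$$

(Hint: $(a + b)^r = (a + b)(a + b)^{r/s} = a(a + b)^{r/s} + b(a + b)^{r/s}$.)
(d) Using Hölder’s Inequality, prove that

$$\frac{1}{n} \sum_{j=1}^{n} a_j \le \Bigl\{\frac{1}{n} \sum_{j=1}^{n} a_j^r\Bigr\}^{1/r}.$$

(e) If $a_1 \le a_2$ and $b_1 \le b_2$, then $(a_1 - a_2)(b_1 - b_2) \ge 0$ and hence

$$a_1 b_1 + a_2 b_2 \ge a_1 b_2 + a_2 b_1.$$

Show that if $a_1 \le a_2 \le \cdots \le a_n$ and $b_1 \le b_2 \le \cdots \le b_n$, then

$$n \sum_{j=1}^{n} a_j b_j \ge \Bigl\{\sum_{j=1}^{n} a_j\Bigr\}\Bigl\{\sum_{j=1}^{n} b_j\Bigr\}.$$

(f) Suppose that $0 \le a_1 \le a_2 \le \cdots \le a_n$ and $0 \le b_1 \le b_2 \le \cdots \le b_n$ and $r \ge 1$.
Establish the Chebyshev Inequality†

$$\Bigl\{\frac{1}{n} \sum_{j=1}^{n} a_j^r\Bigr\}^{1/r} \Bigl\{\frac{1}{n} \sum_{j=1}^{n} b_j^r\Bigr\}^{1/r} \le \Bigl\{\frac{1}{n} \sum_{j=1}^{n} (a_j b_j)^r\Bigr\}^{1/r}.$$

Show that this inequality must be reversed if $\{a_j\}$ is increasing and $\{b_j\}$ is decreasing.

† OTTO HÖLDER (1859-1937) studied at Göttingen and taught at Leipzig. He worked in both
algebra and analysis.
‡ HERMANN MINKOWSKI (1864-1909) was professor at Königsberg and Göttingen. He is best
known for his work on convex sets and the “geometry of numbers.”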

Section 9 Open and Closed Sets

Many of the deepest properties of real analysis depend on certain
topological notions. In the next few sections we shall introduce the basic concepts
and derive some of the most crucial topological properties of the space $\mathbf{R}^p$.
These results will be frequently used in the following chapters.
9.1 DEFINITION. A set $G$ in $\mathbf{R}^p$ is said to be open in $\mathbf{R}^p$ (or merely
open) if, for each point $x$ in $G$, there is a real number $r > 0$ such that
every point $y$ in $\mathbf{R}^p$ satisfying $\|x - y\| < r$ also belongs to the set $G$. (See
Figure 9.1.)
By using Definition 8.9, we can rephrase this definition by saying that a set
$G$ is open if every point in $G$ is the center of some open ball entirely
contained in $G$.
9.2 EXAMPLES. (a) The entire set $\mathbf{R}^p$ is open, since we can take $r = 1$
for any $x$.

Figure 9.1. An open set.

† PAFNUTI L. CHEBYSHEV (1821-1894) was a professor at St. Petersburg. He made many
contributions to mathematics, but his most important work was in number theory, probability,
and approximation theory.

(b) The set $G = \{x \in \mathbf{R} : 0 < x < 1\}$ is open in $\mathbf{R} = \mathbf{R}^1$. The set $F =
\{x \in \mathbf{R} : 0 \le x \le 1\}$ is not open in $\mathbf{R}$. (Why?)
(c) The sets $G = \{(x, y) \in \mathbf{R}^2 : x^2 + y^2 < 1\}$ and $H = \{(x, y) : 0 < x^2 + y^2 < 1\}$
are open, but the set $F = \{(x, y) : x^2 + y^2 \le 1\}$ is not open in $\mathbf{R}^2$. (Why?)
(d) The set $G = \{(x, y) \in \mathbf{R}^2 : 0 < x < 1,\ y = 0\}$ is not open in $\mathbf{R}^2$.
[Compare this with (b).] The set $H = \{(x, y) \in \mathbf{R}^2 : 0 < y < 1\}$ is open, but the set
$K = \{(x, y) \in \mathbf{R}^2 : 0 \le y < 1\}$ is not open in $\mathbf{R}^2$.
(e) The set $G = \{(x, y, z) \in \mathbf{R}^3 : z > 0\}$ is open in $\mathbf{R}^3$ as is the set $H =
\{(x, y, z) \in \mathbf{R}^3 : x > 0,\ y > 0,\ z > 0\}$. On the other hand, the set $F =
\{(x, y, z) \in \mathbf{R}^3 : x = y = z\}$ is not open.
(f) The empty set $\emptyset$ is open in $\mathbf{R}^p$, since it contains no points at all, and
hence the requirement in Definition 9.1 is trivially satisfied.
(g) If $B$ is the open ball with center $z$ and radius $a > 0$ and if $x \in B$, then
the ball with center $x$ and radius $a - \|z - x\|$ is contained in $B$. Thus $B$ is
open in $\mathbf{R}^p$.
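
The radius $a - \|z - x\|$ in Example (g) comes from the Triangle Inequality: if $\|x - y\| < a - \|z - x\|$, then $\|z - y\| \le \|z - x\| + \|x - y\| < a$. The Python sketch below illustrates this in $\mathbf{R}^2$ by random sampling; it is an illustration only, the helper names and the particular center, radius and point are assumptions, and sampling is of course no substitute for the argument.

```python
import math
import random

def dist(x, y):
    """Euclidean distance ||x - y|| in R^p."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def inner_radius(z, a, x):
    """Radius a - ||z - x|| from Example 9.2(g); requires x in the open ball."""
    r = a - dist(z, x)
    if r <= 0:
        raise ValueError("x is not in the open ball with center z and radius a")
    return r

def sample_near(x, r, rng):
    """Random point y of R^2 with ||x - y|| < r."""
    t = rng.uniform(0, 0.999) * r
    theta = rng.uniform(0, 2 * math.pi)
    return (x[0] + t * math.cos(theta), x[1] + t * math.sin(theta))

rng = random.Random(1)
z, a = (0.0, 0.0), 1.0          # the ball B
x = (0.6, 0.3)                  # a point of B
r = inner_radius(z, a, x)
ok = all(dist(z, sample_near(x, r, rng)) < a for _ in range(1000))
print(r, ok)
```

Every sampled point of the small ball around $x$ lands inside $B$; the closer $x$ is to the sphere $\|z - x\| = a$, the smaller the admissible radius becomes.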

We now state the basic properties of open sets in $\mathbf{R}^p$. In courses on
topology this next result is summarized by saying that the open sets, as
defined in Definition 9.1, form a topology for $\mathbf{R}^p$.

9.3 OPEN SET PROPERTIES. (a) The empty set $\emptyset$ and the entire space
$\mathbf{R}^p$ are open in $\mathbf{R}^p$.
(b) The intersection of any two open sets is open in $\mathbf{R}^p$.
(c) The union of any collection of open sets is open in $\mathbf{R}^p$.

PROOF. We have already commented on the open character of the sets
$\emptyset$ and $\mathbf{R}^p$.
To prove (b), let $G_1$, $G_2$ be open and let $G_3 = G_1 \cap G_2$. To show that $G_3$ is
open, let $x \in G_3$. Since $x$ belongs to the open set $G_1$, there exists $r_1 > 0$ such
that if $\|x - z\| < r_1$, then $z \in G_1$. Similarly, there exists $r_2 > 0$ such that if
$\|x - w\| < r_2$, then $w \in G_2$. Choosing $r_3$ to be the minimum of $r_1$ and $r_2$, we
conclude that if $y \in \mathbf{R}^p$ is such that $\|x - y\| < r_3$, then $y$ belongs to both $G_1$ and
$G_2$. Hence such elements $y$ belong to $G_3 = G_1 \cap G_2$, showing that $G_3$ is
open in $\mathbf{R}^p$.
To prove (c), let $\{G_\alpha, G_\beta, \ldots\}$ be a collection of sets which are open and let
$G$ be their union. To show that $G$ is open, let $x \in G$. By definition of the
union, it follows that for some set, say for $G_\lambda$, we have $x \in G_\lambda$. Since $G_\lambda$ is
open, there exists a ball with center $x$ which is entirely contained in $G_\lambda$.
Since $G_\lambda \subseteq G$, this ball is entirely contained in $G$, showing that $G$ is open
in $\mathbf{R}^p$. Q.E.D.
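
The choice $r_3 = \min\{r_1, r_2\}$ in the proof of (b) can be followed on two open intervals in $\mathbf{R}$. The Python sketch below is an illustration only: the intervals $(0, 2)$ and $(1, 3)$, the point $x = \tfrac54$, and the helper name `radius_in_interval` are all assumptions chosen for the example.

```python
from fractions import Fraction

def radius_in_interval(x, a, b):
    """Largest r such that (x - r, x + r) is contained in the open interval (a, b)."""
    if not a < x < b:
        raise ValueError("x must lie in (a, b)")
    return min(x - a, b - x)

G1 = (Fraction(0), Fraction(2))
G2 = (Fraction(1), Fraction(3))
x = Fraction(5, 4)

r1 = radius_in_interval(x, *G1)
r2 = radius_in_interval(x, *G2)
r3 = min(r1, r2)

# The intersection of (0, 2) and (1, 3) is (1, 2); the ball (x - r3, x + r3)
# must fit inside it.
lower, upper = max(G1[0], G2[0]), min(G1[1], G2[1])
print(r1, r2, r3, lower <= x - r3 and x + r3 <= upper)
```

Here $r_1 = \tfrac34$ and $r_2 = \tfrac14$, so the ball of radius $r_3 = \tfrac14$ around $x$ is the interval $(1, \tfrac32)$, which lies in both sets.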

By induction, it follows from property (b) above that the intersection of any finite collection of sets which are open is also open in $\mathbf{R}^p$. That the intersection of an infinite collection of open sets may not be open can be seen from the example

(9.1)    $G_n = \{x \in \mathbf{R} : -\tfrac{1}{n} < x < 1 + \tfrac{1}{n}\}, \qquad n \in \mathbf{N}.$

The intersection of the sets $G_n$ is the set $F = \{x \in \mathbf{R} : 0 \le x \le 1\}$, which is not open.
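Example (9.1) can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names and the search bound `n_max` are assumptions, not part of the text. It finds the first index n at which a given point drops out of G_n, and reports None when the point survives every G_n up to the bound.

```python
from fractions import Fraction

def in_G(n, x):
    """Membership in G_n = (-1/n, 1 + 1/n)."""
    return Fraction(-1, n) < x < 1 + Fraction(1, n)

def first_excluding_index(x, n_max=1000):
    """Smallest n <= n_max with x not in G_n, or None if x lies in G_1, ..., G_{n_max}."""
    for n in range(1, n_max + 1):
        if not in_G(n, x):
            return n
    return None

# The endpoints 0 and 1 lie in every G_n that is examined ...
print(first_excluding_index(Fraction(0)), first_excluding_index(Fraction(1)))
# ... but points just outside [0, 1] are eventually excluded.
print(first_excluding_index(Fraction(-1, 100)), first_excluding_index(1 + Fraction(1, 50)))
```

Every ball about 0 contains some point -1/m that is excluded from G_m, which is why the intersection [0, 1] fails to be open at 0.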

Closed Sets

We now introduce the important notion of a closed set in $\mathbf{R}^p$.

9.4 DEFINITION. A set $F$ in $\mathbf{R}^p$ is said to be closed in $\mathbf{R}^p$ (or merely closed) in case its complement $\mathcal{C}(F) = \mathbf{R}^p \setminus F$ is open in $\mathbf{R}^p$.

9.5 EXAMPLES. (a) The entire set $\mathbf{R}^p$ is closed in $\mathbf{R}^p$, since its complement is the empty set, which was seen in 9.2(f) to be open in $\mathbf{R}^p$.
(b) The empty set $\emptyset$ is closed in $\mathbf{R}^p$, since its complement in $\mathbf{R}^p$ is all of $\mathbf{R}^p$, which was seen in 9.2(a) to be open in $\mathbf{R}^p$.
(c) The set $F = \{x \in \mathbf{R} : 0 \le x \le 1\}$ is closed in $\mathbf{R}$. One way of seeing this is by noting that the complement of $F$ in $\mathbf{R}$ is the union of the two sets $\{x \in \mathbf{R} : x < 0\}$, $\{x \in \mathbf{R} : x > 1\}$, each of which is open. Similarly, the set $\{x \in \mathbf{R} : 0 \le x\}$ is closed.
(d) The set $F = \{(x, y) \in \mathbf{R}^2 : x^2 + y^2 \le 1\}$ is closed, since its complement in $\mathbf{R}^2$ is the set
$\{(x, y) \in \mathbf{R}^2 : x^2 + y^2 > 1\},$
which is seen to be open.
(e) The set $H = \{(x, y, z) \in \mathbf{R}^3 : x \ge 0\}$ is closed in $\mathbf{R}^3$, as is the set $F = \{(x, y, z) \in \mathbf{R}^3 : x = y = z\}$.
(f) The closed ball $B$ with center $x$ in $\mathbf{R}^p$ and radius $r > 0$ is a closed set of $\mathbf{R}^p$. For, if $z \notin B$, then the open ball with center $z$ and radius $\|z - x\| - r$ is contained in $\mathcal{C}(B)$. Therefore, $\mathcal{C}(B)$ is open and $B$ is closed in $\mathbf{R}^p$.
In ordinary parlance, when applied to doors, windows, and minds, the words "open" and "closed" are antonyms. However, when applied to subsets of $\mathbf{R}^p$, these words are not antonyms. For example, we noted above that the sets $\emptyset$, $\mathbf{R}^p$ are both open and closed in $\mathbf{R}^p$. (The reader will probably be relieved to learn that there are no other subsets of $\mathbf{R}^p$ which have both properties.) In addition, there are many subsets of $\mathbf{R}^p$ which are neither open nor closed; in fact, most subsets of $\mathbf{R}^p$ have this neutral character. As a simple example, we cite the set

(9.2)    $A = \{x \in \mathbf{R} : 0 \le x < 1\}.$

This set $A$ fails to be open in $\mathbf{R}$, since it contains the point 0. Similarly, it fails to be closed in $\mathbf{R}$, because its complement in $\mathbf{R}$ is the set $\{x \in \mathbf{R} : x < 0 \text{ or } x \ge 1\}$, which is not open since it contains the point 1. The reader should construct other examples of sets which are neither open nor closed in $\mathbf{R}^p$.

We now state the fundamental properties of closed sets. The proof of this
result follows directly from Theorem 9.3 by using DeMorgan’s laws
(Theorem 1.8 and Exercise 1.K).

9.6 CLOSED SET PROPERTIES. (a) The empty set $\emptyset$ and the entire space $\mathbf{R}^p$ are closed in $\mathbf{R}^p$.
(b) The union of any two closed sets is closed in $\mathbf{R}^p$.
(c) The intersection of any collection of closed sets is closed in $\mathbf{R}^p$.

Neighborhoods

We now introduce some additional topological notions that will be useful and permit us to characterize open and closed sets in other terms.

9.7 DEFINITION. (a) If $x \in \mathbf{R}^p$, then any set which contains an open set containing $x$ is called a neighborhood of $x$.
(b) A point $x \in \mathbf{R}^p$ is called an interior point of a set $A \subseteq \mathbf{R}^p$ in case there is a neighborhood of $x$ which is entirely contained in $A$.
(c) A point $x \in \mathbf{R}^p$ is called a boundary point of a set $A \subseteq \mathbf{R}^p$ in case every neighborhood of $x$ contains points in $A$ and points in $\mathcal{C}(A)$.
(d) A point $x \in \mathbf{R}^p$ is called an exterior point of a set $A \subseteq \mathbf{R}^p$ in case there exists a neighborhood of $x$ which is entirely contained in $\mathcal{C}(A)$.

It should be noted that given $x \in \mathbf{R}^p$ and $A \subseteq \mathbf{R}^p$, there are three mutually exclusive possibilities: (i) $x$ is an interior point of $A$, (ii) $x$ is a boundary point of $A$, or (iii) $x$ is an exterior point of $A$.

9.8 EXAMPLES. (a) A set $U$ is a neighborhood of a point $x$ if and only if there exists a ball with center $x$ entirely contained in $U$.
(b) A point $x$ is an interior point of $A$ if and only if there exists a ball with center $x$ entirely contained in $A$.
(c) A point $x$ is a boundary point of $A$ if and only if for each natural number $n$ there exist points $a_n \in A$ and $b_n \in \mathcal{C}(A)$ such that $\|x - a_n\| < 1/n$ and $\|x - b_n\| < 1/n$.
(d) Every point of the interval $(0, 1) \subseteq \mathbf{R}$ is an interior point. The points 0, 1 are the boundary points of $(0, 1)$.
(e) Let $A = [0, 1] \subseteq \mathbf{R}$. Then the interior points of $A$ are the points in the open interval $(0, 1)$. The points 0, 1 are the boundary points of $A$.
(f) The boundary points of the open and the closed balls with center $x \in \mathbf{R}^p$ and radius $r > 0$ are the points in the sphere with center $x$ and radius $r$. (See Definition 8.9.)

We now characterize open sets in terms of neighborhoods and interior points.

9.9 THEOREM. If $B \subseteq \mathbf{R}^p$, then the following statements are equivalent:
(a) $B$ is open;
(b) every point of $B$ is an interior point of $B$;
(c) $B$ is a neighborhood of each of its points.

PROOF. If (a) holds and $x \in B$, then the open set $B$ is a neighborhood of $x$ and $x$ is therefore an interior point of $B$.
It is trivial that (b) implies (c).
If (c) holds, then for each $x \in B$, there is an open set $G_x \subseteq B$ with $x \in G_x$. Hence $B = \bigcup \{G_x : x \in B\}$, so that it follows from Theorem 9.3(c) that $B$ is open in $\mathbf{R}^p$. Q.E.D.

It follows from what we have shown that an open set contains none of its
boundary points. Closed sets are the other extreme in this respect.

9.10 THEOREM. A set $F \subseteq \mathbf{R}^p$ is closed if and only if it contains all of its boundary points.

PROOF. Suppose that $F$ is closed and that $x$ is a boundary point of $F$. If $x \notin F$, then the open set $\mathcal{C}(F)$ contains $x$ and no points of $F$, contrary to the hypothesis that $x$ is a boundary point of $F$. Hence we must have $x \in F$.
Conversely, suppose that $F$ contains all of its boundary points. If $y \notin F$, then $y$ is neither a point of $F$ nor a boundary point of $F$; hence it is an exterior point. Therefore there exists a neighborhood $V$ of $y$ entirely contained in $\mathcal{C}(F)$. Since this is true for all $y \notin F$, we infer that $\mathcal{C}(F)$ is open, whence $F$ is closed in $\mathbf{R}^p$. Q.E.D.

Open Sets in R

We close this section by characterizing the form of an arbitrary open subset of $\mathbf{R}$.

9.11 THEOREM. A subset of $\mathbf{R}$ is open if and only if it is the union of a countable collection of open intervals.

PROOF. Since an open interval is open (why?), it follows from 9.3(c) that the union of any countable collection of open intervals is open.
Conversely, let $G \ne \emptyset$ be an open set in $\mathbf{R}$ and let $\{r_n : n \in \mathbf{N}\}$ be an enumeration of all of the rational points in $G$. For each $n \in \mathbf{N}$ let $m_n$ be the smallest natural number such that the interval $J_n = (r_n - 1/m_n,\ r_n + 1/m_n)$ is entirely contained in $G$. It follows that

$\bigcup_{n \in \mathbf{N}} J_n \subseteq G.$

Now let $x$ be an arbitrary point in $G$ and let $m \in \mathbf{N}$ be such that $(x - 2/m,\ x + 2/m) \subseteq G$. It follows from Theorem 6.10 that there exists a rational number $y$ in $(x - 1/m,\ x + 1/m)$; hence $y \in G$ and so $y = r_n$ for some natural number $n$. If $x$ does not belong to $J_n = (r_n - 1/m_n,\ r_n + 1/m_n)$, then we must have $1/m_n < 1/m$; but since it is readily seen that

$\left(r_n - \tfrac{1}{m},\ r_n + \tfrac{1}{m}\right) \subseteq \left(x - \tfrac{2}{m},\ x + \tfrac{2}{m}\right) \subseteq G,$

this contradicts the choice of $m_n$. Therefore we have $x \in J_n$ for this value of $n$. Since $x \in G$ is arbitrary, we infer that

$G \subseteq \bigcup_{n \in \mathbf{N}} J_n.$

Therefore $G$ is equal to this union. Q.E.D.

It follows from the theorem just given that a subset of $\mathbf{R}$ is closed if and only if it is the intersection of a countable collection of sets, each of which is the complement of an open interval. (Why?) It does not follow that a countable union of closed intervals must be closed, nor is every closed set a countable union of closed intervals.
A generalization of Theorem 9.11 is given in Exercise 9.G.

Exercises

9.A. Justify the assertions about the sets $G$, $F$ made in Example 9.2(b).
9.B. Justify the assertions made in Example 9.2(c).
9.C. Prove that the intersection of any finite collection of open sets is open in $\mathbf{R}^p$. (Hint: use 9.3(b) and induction.)
9.D. What are the interior, boundary, and exterior points in $\mathbf{R}$ of the set $[0, 1)$? Conclude that it is neither open nor closed.
9.E. Give an example of a set in $\mathbf{R}^2$ which is neither open nor closed. Prove your assertion.
9.F. Write out the details of the proof of Theorem 9.6.
9.G. Show that a subset of $\mathbf{R}^p$ is open if and only if it is the union of a countable collection of open balls. (Hint: the set of all points in $\mathbf{R}^p$ all of whose coordinates are rational numbers is countable.)
9.H. Every open subset of $\mathbf{R}^p$ is the union of a countable collection of closed sets.
9.I. Every closed subset of $\mathbf{R}^p$ is the intersection of a countable collection of open sets.
9.J. If $A$ is any subset of $\mathbf{R}^p$, let $A^\circ$ denote the union of all open sets which are contained in $A$; the set $A^\circ$ is called the interior of $A$. Note that $A^\circ$ is an open set; prove that it is the largest open set contained in $A$. Prove that

$A^\circ \subseteq A, \qquad (A^\circ)^\circ = A^\circ,$
$(A \cap B)^\circ = A^\circ \cap B^\circ, \qquad (\mathbf{R}^p)^\circ = \mathbf{R}^p.$

Give an example to show that $(A \cup B)^\circ = A^\circ \cup B^\circ$ may not hold.
9.K. Prove that a point belongs to $A^\circ$ if and only if it is an interior point of $A$.
9.L. If $A$ is any subset of $\mathbf{R}^p$, let $A^-$ denote the intersection of all closed sets containing $A$; the set $A^-$ is called the closure of $A$. Note that $A^-$ is a closed set; prove that it is the smallest closed set containing $A$. Prove that

$A \subseteq A^-, \qquad (A^-)^- = A^-,$
$(A \cup B)^- = A^- \cup B^-, \qquad \emptyset^- = \emptyset.$

Give an example to show that $(A \cap B)^- = A^- \cap B^-$ may not hold.
9.M. Prove that a point belongs to $A^-$ if and only if it is either an interior or a boundary point of $A$.
9.N. Give an example of a set $A$ in $\mathbf{R}^p$ such that $A^\circ = \emptyset$ and $A^- = \mathbf{R}^p$. Can such a set $A$ be countable?
9.O. Let $A$ and $B$ be subsets of $\mathbf{R}$. The Cartesian product $A \times B$ is open in $\mathbf{R}^2$ if and only if $A$ and $B$ are open in $\mathbf{R}$.
9.P. Let $A$ and $B$ be subsets of $\mathbf{R}$. The Cartesian product $A \times B$ is closed in $\mathbf{R}^2$ if and only if $A$ and $B$ are closed in $\mathbf{R}$.
9.Q. Interpret the concepts introduced in this section for the Cantor set $F$ of Definition 7.4. In particular:
(a) Show that $F$ is closed in $\mathbf{R}$.
(b) There are no interior points in $F$.
(c) There are no non-empty open sets contained in $F$.
(d) Every point of $F$ is a boundary point.
(e) The set $F$ cannot be expressed as the union of a countable collection of closed intervals.
(f) The complement of $F$ can be expressed as the union of a countable collection of open intervals.

Section 10 The Nested Cells and Bolzano-Weierstrass Theorems

In this section we shall present two very important results that will be often used in later chapters. In a sense they can be regarded as the Completeness Property for $\mathbf{R}^p$, when $p > 1$.
We recall from Section 7 that if $a \le b$, then the open cell in $\mathbf{R}$, denoted by $(a, b)$, is the set defined by
$(a, b) = \{x \in \mathbf{R} : a < x < b\}.$
It is readily seen that such a set is open in $\mathbf{R}$. Similarly, the closed cell $[a, b]$ in $\mathbf{R}$ is the set
$[a, b] = \{x \in \mathbf{R} : a \le x \le b\},$
which is closed in $\mathbf{R}$. The Cartesian product of two intervals is usually called a rectangle and the Cartesian product of three intervals is often called a parallelepiped. For simplicity, we shall employ the term "cell" regardless of the dimension of the space.
10.1 DEFINITION. An open cell $J$ in $\mathbf{R}^p$ is the Cartesian product of $p$ open cells of real numbers. Hence $J$ has the form
$J = \{x = (x_1, \ldots, x_p) \in \mathbf{R}^p : a_i < x_i < b_i \text{ for } i = 1, 2, \ldots, p\}.$
Similarly, a closed cell $I$ in $\mathbf{R}^p$ is the Cartesian product of $p$ closed cells of real numbers. Hence $I$ has the form
$I = \{x = (x_1, \ldots, x_p) \in \mathbf{R}^p : a_i \le x_i \le b_i \text{ for } i = 1, 2, \ldots, p\}.$
A subset of $\mathbf{R}^p$ is bounded if it is contained in some cell.

As an exercise, show that an open cell in $\mathbf{R}^p$ is an open set, and a closed cell is a closed set. Also, a subset of $\mathbf{R}^p$ is bounded if and only if it is contained in some ball. It will be observed that this terminology for bounded sets is consistent with that introduced in Section 6 for the case $p = 1$.

The reader will recall from Section 7 that the Supremum Property of the real number system implies that every nested sequence of non-empty closed cells in $\mathbf{R}$ has a common point. We shall now prove that this property carries over to the space $\mathbf{R}^p$.

10.2 NESTED CELLS THEOREM. Let $(I_k)$ be a sequence of non-empty closed cells in $\mathbf{R}^p$ which is nested in the sense that $I_1 \supseteq I_2 \supseteq \cdots \supseteq I_k \supseteq \cdots$. Then there exists a point in $\mathbf{R}^p$ which belongs to all of the cells.

PROOF. Suppose that $I_k$ is the cell

$I_k = \{(x_1, \ldots, x_p) : a_{k1} \le x_1 \le b_{k1},\ \ldots,\ a_{kp} \le x_p \le b_{kp}\}.$

It is easy to see that the cells $[a_{k1}, b_{k1}]$, $k \in \mathbf{N}$, form a nested sequence of non-empty closed cells of real numbers and hence, by the completeness of the real number system $\mathbf{R}$, there is a real number $y_1$ which belongs to all of these cells. Applying this argument to each coordinate, we obtain a point $y = (y_1, \ldots, y_p)$ of $\mathbf{R}^p$ such that if $j$ satisfies $j = 1, 2, \ldots, p$, then $y_j$ belongs to all the cells $\{[a_{kj}, b_{kj}] : k \in \mathbf{N}\}$. Hence the point $y$ belongs to all of the cells $(I_k)$. Q.E.D.
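The coordinate-by-coordinate argument can be sketched for a finite truncation of a nested sequence. This is an illustration under stated assumptions: the sample cells and the helper names are hypothetical, and for a genuinely infinite sequence the maximum of the left endpoints below must be replaced by their supremum, which is where completeness enters.

```python
def common_point(cells):
    """Given finitely many nested closed cells, each a list of (a, b) pairs
    (one pair per coordinate), return the point whose j-th coordinate is the
    largest left endpoint a_kj among the cells."""
    p = len(cells[0])
    return tuple(max(cell[j][0] for cell in cells) for j in range(p))

def contains(cell, y):
    """True when a_j <= y_j <= b_j in every coordinate."""
    return all(a <= yj <= b for (a, b), yj in zip(cell, y))

# Illustrative nested cells in the plane: I_k = [-1/k, 1/k] x [0, 1 + 1/k], k = 1, ..., 49.
cells = [[(-1 / k, 1 / k), (0.0, 1 + 1 / k)] for k in range(1, 50)]
y = common_point(cells)
assert all(contains(cell, y) for cell in cells)
```

Because the cells are nested, every left endpoint is at most every right endpoint in the same coordinate, which is why the largest left endpoint lies in all of the cells.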

Cluster Points and Bolzano-Weierstrass

10.3 DEFINITION. A point $x \in \mathbf{R}^p$ is a cluster point (or a point of accumulation) of a subset $A \subseteq \mathbf{R}^p$ in case every neighborhood of $x$ contains at least one point of $A$ distinct from $x$.

We shall consider some examples.

10.4 EXAMPLES. (a) A point $x \in \mathbf{R}^p$ is a cluster point of $A$ if and only if for every natural number $n$ there exists an element $a_n \in A$ such that $0 < \|x - a_n\| < 1/n$.
(b) If a boundary point of a set does not belong to the set, then it is a cluster point of the set.
(c) Every point of the unit interval $I$ of $\mathbf{R}$ is a cluster point of $I$.
(d) Let $A = (0, 1)$; then every point of $A$ is both an interior and a cluster point of $A$. The points 0, 1 are cluster points (but not interior points) of $A$.
(e) Let $B = I \cap \mathbf{Q}$ be the set of all rational numbers in the unit interval. Every point of $I$ is a cluster point of $B$ in $\mathbf{R}$, but there are no interior points of $B$.
(f) A finite subset of $\mathbf{R}^p$ has no cluster points. (Why?)
(g) The infinite set of integers $\mathbf{Z} \subseteq \mathbf{R}$ has no cluster points. (Why?)
10.5 THEOREM. A set $F \subseteq \mathbf{R}^p$ is closed if and only if it contains all of its cluster points.

PROOF. Suppose that $F$ is closed and that $x$ is a cluster point of $F$. If $x \notin F$, then the open set $\mathcal{C}(F)$ is a neighborhood of $x$ and so must contain at least one point of $F$. But this is impossible, so we conclude that $x \in F$.
Conversely, if $F$ contains all of its cluster points, we shall show that $\mathcal{C}(F)$ is open. For, if $y \in \mathcal{C}(F)$, then $y$ is not a cluster point of $F$. Therefore, there exists a neighborhood $V_y$ of $y$ such that $F \cap V_y = \emptyset$. Therefore $V_y \subseteq \mathcal{C}(F)$. Since this is true for every $y \in \mathcal{C}(F)$, we infer that $\mathcal{C}(F)$ is open in $\mathbf{R}^p$. Q.E.D.

The next result is one of the most important in this book and will be used frequently. It should be noted that the conclusion may fail if either hypothesis is removed [see Examples 10.4(f, g)].

10.6 BOLZANO-WEIERSTRASS† THEOREM. Every bounded infinite subset of $\mathbf{R}^p$ has a cluster point.

† BERNARD BOLZANO (1781-1848) was professor of the philosophy of religion at Prague, but he had deep thoughts about mathematics. Like Cauchy, he was a pioneer in introducing a higher standard of rigor in mathematical analysis. His treatise on the paradoxes of the infinite appeared after his death.
KARL WEIERSTRASS (1815-1897) was for many years a professor at Berlin and exercised a profound influence on the development of analysis. Always insisting on rigorous proof, he developed, but did not publish, an introduction to the real number system. He also made important contributions to real and complex analysis, differential equations, and the calculus of variations.

PROOF. If $B$ is a bounded set with an infinite number of elements, let $I_1$ be a closed cell containing $B$. We divide $I_1$ into $2^p$ closed cells by bisecting each of its sides. Since $I_1$ contains infinitely many points of $B$, at least one part obtained in this subdivision will also contain infinitely many points of $B$. (For if each of the $2^p$ parts contained only a finite number of points of the set $B$, then $B$ must be a finite set, contrary to hypothesis.) Let $I_2$ be one of these parts in the subdivision of $I_1$ which contains infinitely many elements of $B$. Now divide $I_2$ into $2^p$ closed cells by bisecting each of its sides. Again, one of these subcells of $I_2$ must contain an infinite number of points of $B$, for otherwise $I_2$ could contain only a finite number, contrary to its construction. Let $I_3$ be a subcell of $I_2$ containing infinitely many points of $B$. Continuing this process, we obtain a nested sequence $(I_k)$ of non-empty closed cells of $\mathbf{R}^p$. According to the Nested Cells Theorem, there is a point $y$ which belongs to all of the cells $I_k$, $k = 1, 2, \ldots$. We shall now show that $y$ is a cluster point of $B$ and this will complete the proof of the assertion.
First, we note that if $I_1 = [a_1, b_1] \times \cdots \times [a_p, b_p]$ with $a_k < b_k$ and if $l(I_1) = \sup \{b_1 - a_1, \ldots, b_p - a_p\}$, then $l(I_1) > 0$ is the length of the largest side of $I_1$. According to the above construction of the sequence $(I_k)$, we have

$0 < l(I_k) = \frac{1}{2^{k-1}}\, l(I_1)$

for $k \in \mathbf{N}$. Suppose that $V$ is any neighborhood of the common point $y$ and suppose that all points $z$ in $\mathbf{R}^p$ with $\|y - z\| < r$ belong to $V$. We now choose $k$ so large that $I_k \subseteq V$; such a choice is possible since if $w$ is any other point of $I_k$, then it follows from Theorem 8.10 that

$\|y - w\| \le \sqrt{p}\, l(I_k) = \frac{\sqrt{p}}{2^{k-1}}\, l(I_1).$

Figure 10.1

According to Corollary 6.7, it follows that if $k$ is sufficiently large, then

$\frac{\sqrt{p}}{2^{k-1}}\, l(I_1) < r.$

For such a value of $k$ we have $I_k \subseteq V$. Since $I_k$ contains infinitely many elements of $B$, it follows that $V$ contains at least one element of $B$ different from $y$. Therefore, $y$ is a cluster point of $B$. Q.E.D.
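The final estimate in the proof is a small computation: given the dimension p, the longest side l(I_1), and the radius r, one looks for a k with sqrt(p) l(I_1) / 2^(k-1) < r. The sketch below finds the smallest such k; the function name and the sample values are assumptions made only for illustration.

```python
import math

def bisections_needed(p, longest_side, r):
    """Smallest k >= 1 such that sqrt(p) * longest_side / 2**(k - 1) < r."""
    k = 1
    while math.sqrt(p) * longest_side / 2 ** (k - 1) >= r:
        k += 1
    return k

# Illustrative values: a cell in R^3 whose longest side is 2, and a ball of radius 0.01 about y.
k = bisections_needed(p=3, longest_side=2.0, r=0.01)
bound = math.sqrt(3) * 2.0 / 2 ** (k - 1)
print(k, bound)
```

The loop terminates because the bound is halved at each step, which is the content of the appeal to Corollary 6.7.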

Exercises
10.A. Let $I_n \subseteq \mathbf{R}^p$ be the open cells given by $I_n = (0, 1/n) \times \cdots \times (0, 1/n)$. Show that these cells are nested but that they do not contain any common point.
10.B. Let $J_n \subseteq \mathbf{R}^p$ be the closed intervals given by $J_n = [n, +\infty) \times \cdots \times [n, +\infty)$. Show that these intervals are nested, but that they do not contain any common point.
10.C. A point $x$ is a cluster point of a set $A \subseteq \mathbf{R}^p$ if and only if every neighborhood of $x$ contains infinitely many points of $A$.
10.D. Let $A = \{1/n : n \in \mathbf{N}\}$. Show that every point of $A$ is a boundary point in $\mathbf{R}$, but that 0 is the only cluster point of $A$ in $\mathbf{R}$.
10.E. Let $A$, $B$ be subsets of $\mathbf{R}^p$ and let $x$ be a cluster point of $A \cap B$ in $\mathbf{R}^p$. Prove that $x$ is a cluster point of both $A$ and $B$.
10.F. Let $A$, $B$ be subsets of $\mathbf{R}^p$ and let $x$ be a cluster point of $A \cup B$ in $\mathbf{R}^p$. Prove that $x$ is either a cluster point of $A$ or of $B$.
10.G. Show that every point in the Cantor set $F$ is a cluster point of both $F$ and $\mathcal{C}(F)$.
10.H. If $A$ is any subset of $\mathbf{R}^p$, then there exists a countable subset $C$ of $A$ such that if $x \in A$ and $\varepsilon > 0$, then there is an element $z \in C$ such that $\|x - z\| < \varepsilon$. Hence every element of $A$ is either in $C$ or is a cluster point of $C$.

Projects
10.α. Let $M$ be a set and $d$ be a metric on $M$ as defined in Exercise 8.S. Reexamine the definitions and theorems of Sections 9 and 10, in order to determine which carry over for sets that have a metric. It will be seen, for example, that the notions of open, closed, and bounded set carry over. The Bolzano-Weierstrass Theorem fails for suitable $M$ and $d$, however. Whenever possible, either show that the theorem extends or give a counterexample to show that it may fail.

10.β. Let $\mathcal{T}$ be a family of subsets of a set $X$ which (i) contains $\emptyset$ and $X$, (ii) contains the intersection of any finite family of sets in $\mathcal{T}$, and (iii) contains the union of any family of sets in $\mathcal{T}$. We call $\mathcal{T}$ a topology for $X$, and refer to the sets in $\mathcal{T}$ as the open sets. Reexamine the definitions and theorems of Sections 9 and 10, trying to determine which carry over for sets $X$ which have a topology $\mathcal{T}$.

Section 11 The Heine-Borel Theorem

The Nested Cells Theorem 10.2 and the Bolzano-Weierstrass Theorem 10.6 are intimately related to the very important notion of compactness, which we shall discuss in the present section. Although it is possible to obtain most of the results of the later sections without knowing the Heine-Borel Theorem, we cannot go much farther in analysis without requiring this theorem, so it is false economy to avoid exposure to this deep result.
11.1 DEFINITION. A set $K$ is said to be compact if, whenever it is contained in the union of a collection $\mathcal{G} = \{G_\alpha\}$ of open sets, then it is also contained in the union of some finite number of the sets in $\mathcal{G}$.

A collection $\mathcal{G}$ of open sets whose union contains $K$ is often called a covering of $K$. Thus the requirement that $K$ be compact is that every covering $\mathcal{G}$ of $K$ can be replaced by a finite covering of $K$, using only sets in $\mathcal{G}$. We note that in order to apply this definition to prove that a set $K$ is compact, we need to examine an arbitrary collection of open sets whose union contains $K$ and show that $K$ is contained in the union of some finite subcollection of each such collection. On the other hand, to show that a set $H$ is not compact, it is sufficient to exhibit only one covering which cannot be replaced by a finite subcollection which still covers $H$.
11.2 EXAMPLES. (a) Let $K = \{x_1, x_2, \ldots, x_m\}$ be a finite subset of $\mathbf{R}^p$. It is clear that if $\mathcal{G} = \{G_\alpha\}$ is a collection of open sets in $\mathbf{R}^p$, and if every point of $K$ belongs to some subset of $\mathcal{G}$, then at most $m$ carefully selected subsets of $\mathcal{G}$ will also have the property that their union contains $K$. Hence $K$ is a compact subset of $\mathbf{R}^p$.
(b) In $\mathbf{R}$ we consider the subset $H = \{x \in \mathbf{R} : x \ge 0\}$. Let $G_n = (-1, n)$, $n \in \mathbf{N}$, so that $\mathcal{G} = \{G_n : n \in \mathbf{N}\}$ is a collection of open subsets of $\mathbf{R}$ whose union contains $H$. If $\{G_{n_1}, G_{n_2}, \ldots, G_{n_k}\}$ is a finite subcollection of $\mathcal{G}$, let $M = \sup \{n_1, n_2, \ldots, n_k\}$ so that $G_{n_j} \subseteq G_M$ for $j = 1, 2, \ldots, k$. It follows that $G_M$ is the union of $\{G_{n_1}, G_{n_2}, \ldots, G_{n_k}\}$. However, the real number $M$ does not belong to $G_M$ and hence does not belong to

$\bigcup_{j=1}^{k} G_{n_j}.$

Therefore, no finite union of the sets $G_n$ can contain $H$, and $H$ is not compact.
(c) Let $H = (0, 1)$ in $\mathbf{R}$. If $G_n = (1/n,\ 1 - 1/n)$ for $n > 2$, then the collection $\mathcal{G} = \{G_n : n > 2\}$ of open sets is a covering of $H$. If $\{G_{n_1}, \ldots, G_{n_k}\}$ is a finite subcollection of $\mathcal{G}$, let $M = \sup \{n_1, \ldots, n_k\}$ so that $G_{n_j} \subseteq G_M$ for $j = 1, 2, \ldots, k$. It follows that $G_M$ is the union of the sets $\{G_{n_1}, \ldots, G_{n_k}\}$. However, the real number $1/M$ belongs to $H$ but does not belong to $G_M$. Therefore, no finite subcollection of $\mathcal{G}$ can form a covering of $H$, so that $H$ is not compact.
(d) Consider the set $I = [0, 1]$; we shall show that $I$ is compact. Let $\mathcal{G} = \{G_\alpha\}$ be a collection of open subsets of $\mathbf{R}$ whose union contains $I$. The real number $x = 0$ belongs to some open set in the collection $\mathcal{G}$ and so do numbers $x$ satisfying $0 \le x < \varepsilon$, for some $\varepsilon > 0$. Let $x^*$ be the supremum of those points $x$ in $I$ such that the cell $[0, x]$ is contained in the union of a finite number of sets in $\mathcal{G}$. Since $x^*$ belongs to $I$, it follows that $x^*$ is an element of some open set in $\mathcal{G}$. Hence for some $\varepsilon > 0$, the cell $[x^* - \varepsilon,\ x^* + \varepsilon]$ is contained in a set $G_0$ in the collection $\mathcal{G}$. But (by the definition of $x^*$) the cell $[0,\ x^* - \varepsilon]$ is contained in the union of a finite number of sets in $\mathcal{G}$. Hence by adding the single set $G_0$ to the finite number already needed to cover $[0,\ x^* - \varepsilon]$, we infer that the set $[0,\ x^* + \varepsilon]$ is contained in the union of a finite number of sets in $\mathcal{G}$. This gives a contradiction unless $x^* = 1$.
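The argument of Example 11.2(c) can be made concrete with exact arithmetic. In this sketch the helper names and the sample index set are hypothetical; for any finite set of indices n > 2 it produces the point 1/M of H = (0, 1) that the corresponding sets G_n = (1/n, 1 - 1/n) fail to cover.

```python
from fractions import Fraction

def in_G(n, x):
    """Membership in G_n = (1/n, 1 - 1/n), for n > 2."""
    return Fraction(1, n) < x < 1 - Fraction(1, n)

def uncovered_point(indices):
    """The point 1/M, with M the largest index, which lies in (0, 1)
    but in none of the sets G_n with n in `indices`."""
    M = max(indices)
    return Fraction(1, M)

indices = [3, 7, 20]  # an arbitrary finite subcollection, chosen for illustration
x = uncovered_point(indices)
assert 0 < x < 1
assert not any(in_G(n, x) for n in indices)
```

No choice of finitely many indices avoids this, since the largest index always supplies a point of H that is missed.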

It is usually not an easy matter to prove that a set is compact, using the definition only. We now present a remarkable and important theorem which completely characterizes compact subsets of $\mathbf{R}^p$. In fact, part of the importance of the Heine-Borel Theorem† is due to the simplicity of the conditions for compactness in $\mathbf{R}^p$.

11.3 HEINE-BOREL THEOREM. A subset of $\mathbf{R}^p$ is compact if and only if it is closed and bounded.

† EDUARD HEINE (1821-1881) studied at Berlin under Weierstrass and later taught at Bonn and Halle. In 1872 he proved that a continuous function on a closed interval is uniformly continuous.
(F. E. J.) ÉMILE BOREL (1871-1956), a student of Hermite, was professor at Paris and one of the most influential mathematicians of his day. He made numerous and deep contributions to analysis and probability. In 1895 he proved that if a countable collection of open intervals cover a closed interval, then they have a finite subcovering.

PROOF. First we show that if $K$ is compact in $\mathbf{R}^p$, then $K$ is closed. Let $x$ belong to $\mathcal{C}(K)$ and for each natural number $m$, let $G_m$ be the set defined by

$G_m = \{y \in \mathbf{R}^p : \|y - x\| > 1/m\}.$

It is readily seen that each set $G_m$, $m \in \mathbf{N}$, is open in $\mathbf{R}^p$. Also, the union of all the sets $G_m$, $m \in \mathbf{N}$, consists of all points of $\mathbf{R}^p$ except $x$. Since $x \notin K$, each point of $K$ belongs to some set $G_m$. In view of the compactness of $K$, it follows that there exists a natural number $M$ such that $K$ is contained in the union of the sets
$G_1, G_2, \ldots, G_M.$
Since the sets $G_m$ increase with $m$, then $K$ is contained in $G_M$. Hence the neighborhood $\{z \in \mathbf{R}^p : \|z - x\| < 1/M\}$ does not intersect $K$, showing that $\mathcal{C}(K)$ is open. Therefore, $K$ is closed in $\mathbf{R}^p$. (See Figure 11.1 where the closed balls complementary to the $G_m$ are depicted.)

Figure 11.1. A compact set is closed.

Next we show that if $K$ is compact in $\mathbf{R}^p$, then $K$ is bounded (that is, $K$ is contained in some set $\{x \in \mathbf{R}^p : \|x\| < r\}$ for sufficiently large $r$). In fact, for each natural number $m$, let $H_m$ be the open set defined by
$H_m = \{x \in \mathbf{R}^p : \|x\| < m\}.$
The entire space $\mathbf{R}^p$, and hence $K$, is contained in the union of the increasing sets $H_m$, $m \in \mathbf{N}$. Since $K$ is compact, there exists a natural number $M$ such that $K \subseteq H_M$. This proves that $K$ is bounded.
To complete the proof of this theorem we need to show that if $K$ is a closed and bounded set which is contained in the union of a collection $\mathcal{G} = \{G_\alpha\}$ of open sets in $\mathbf{R}^p$, then it is contained in the union of some finite number of sets in $\mathcal{G}$. Since the set $K$ is bounded, we may enclose it in a closed cell $I_1$ in $\mathbf{R}^p$. For example, we may take $I_1 = \{(x_1, \ldots, x_p) : |x_k| \le r,\ k = 1, \ldots, p\}$ for suitably large $r > 0$. For the purpose of obtaining a contradiction, we shall assume that $K$ is not contained in the union of any finite number of the sets in $\mathcal{G}$. Therefore, at least one of the $2^p$ closed cells obtained by bisecting the sides of $I_1$ contains points of $K$ and is such that the part of $K$ in it is not contained in the union of any finite number of the sets in $\mathcal{G}$. (For, if each of the $2^p$ parts of $K$ were contained in the union of a finite number of sets in $\mathcal{G}$, then $K$ would be contained in the union of a finite number of sets in $\mathcal{G}$, contrary to hypothesis.) Let $I_2$ be any one of the subcells in this subdivision of $I_1$ which is such that the non-empty set $K \cap I_2$ is not contained in the union of any finite number of sets in $\mathcal{G}$. We continue this process by bisecting the sides of $I_2$ to obtain $2^p$ closed subcells of $I_2$ and we let $I_3$ be one of these subcells such that the non-empty set $K \cap I_3$ is not contained in the union of a finite number of sets in $\mathcal{G}$, and so on.

Figure 11.2

In this way we obtain a nested sequence $(I_k)$ of non-empty cells (see Figure 11.2); according to the Nested Cells Theorem there is a point $y$ common to the $I_k$. Since each $I_k$ contains points in $K$, the common element $y$ is a cluster point of $K$. Since $K$ is closed, then $y$ belongs to $K$ and is contained in some open set $G_\lambda$ in $\mathcal{G}$. Therefore, there exists a number $\varepsilon > 0$ such that all points $w$ with $\|y - w\| < \varepsilon$ belong to $G_\lambda$. On the other hand, the cells $I_k$, $k \ge 2$, are obtained by successive bisection of the sides of the cell $I_1 = \{(x_1, \ldots, x_p) : |x_j| \le r\}$, so the length of the side of $I_k$ is $r/2^{k-2}$. It follows from Theorem 8.10 that if $w \in I_k$, then $\|y - w\| \le r\sqrt{p}/2^{k-2}$. Hence, if $k$ is chosen so large that $r\sqrt{p}/2^{k-2} < \varepsilon$, then all points in $I_k$ are contained in the single set $G_\lambda$. But this contradicts the construction of $I_k$ as a set such that $K \cap I_k$ is not contained in the union of a finite number of sets in $\mathcal{G}$. This contradiction shows that the assumption that the closed bounded set $K$ requires an infinite number of sets in $\mathcal{G}$ to enclose it is untenable. Q.E.D.

Some Applications
As a consequence of the Heine-Borel Theorem, we obtain the next result, which is due to Cantor. It is a strengthening of the Nested Cells Theorem, since general closed sets are considered here and not just closed cells.

11.4 CANTOR INTERSECTION THEOREM. Let $F_1$ be a non-empty closed, bounded subset of $\mathbf{R}^p$ and let

$F_1 \supseteq F_2 \supseteq \cdots \supseteq F_n \supseteq \cdots$

be a sequence of non-empty closed sets. Then there exists a point belonging to all of the sets $\{F_k : k \in \mathbf{N}\}$.

PROOF. Since $F_1$ is closed and bounded, it follows from the Heine-Borel Theorem that it is compact. For each $k \in \mathbf{N}$, let $G_k$ be the complement of $F_k$ in $\mathbf{R}^p$. Since $F_k$ is assumed to be closed, $G_k$ is open in $\mathbf{R}^p$. If, contrary to the theorem, there is no point belonging to all of the sets $F_k$, $k \in \mathbf{N}$, then the union of the sets $G_k$, $k \in \mathbf{N}$, contains the compact set $F_1$. Therefore, the set $F_1$ is contained in the union of a finite number of the sets $G_k$; say, in $G_1, G_2, \ldots, G_K$. Since the $G_k$ increase, we have $G_1 \cup \cdots \cup G_K = G_K$. Since $F_1 \subseteq G_K$, it follows that $F_1 \cap F_K = \emptyset$. By hypothesis $F_1 \supseteq F_K$, so $F_1 \cap F_K = F_K$. Our assumption leads to the conclusion that $F_K = \emptyset$, which contradicts the hypothesis and establishes the theorem. Q.E.D.
11.5 LEBESGUE COVERING THEOREM. Suppose that $\mathcal{G} = \{G_\alpha\}$ is a covering of a compact subset $K$ of $\mathbf{R}^p$. There exists a strictly positive number $\lambda$ such that if $x$, $y$ belong to $K$ and $\|x - y\| < \lambda$, then there is a set in $\mathcal{G}$ containing both $x$ and $y$.

PROOF. For each point $u$ in $K$, there is an open set $G_{\alpha(u)}$ in $\mathcal{G}$ containing $u$. Let $\delta(u) > 0$ be such that if $\|v - u\| < 2\delta(u)$, then $v$ belongs to $G_{\alpha(u)}$. Consider the open set $S(u) = \{v \in \mathbf{R}^p : \|v - u\| < \delta(u)\}$ and the collection $\mathcal{S} = \{S(u) : u \in K\}$ of open sets. Since $\mathcal{S}$ is a covering of the compact set $K$, then $K$ is contained in the union of a finite number of sets in $\mathcal{S}$, say in $S(u_1), \ldots, S(u_n)$. We now define $\lambda$ to be the strictly positive real number

$\lambda = \inf \{\delta(u_1), \ldots, \delta(u_n)\}.$

If $x$, $y$ belong to $K$ and $\|x - y\| < \lambda$, then $x$ belongs to $S(u_j)$ for some $j$ with $1 \le j \le n$, so $\|x - u_j\| < \delta(u_j)$. Since $\|x - y\| < \lambda$, we have $\|y - u_j\| \le \|y - x\| + \|x - u_j\| < 2\delta(u_j)$. According to the definition of $\delta(u_j)$, we infer that both $x$ and $y$ belong to the set $G_{\alpha(u_j)}$. Q.E.D.

We remark that a positive number $\lambda$ having the property stated in the theorem is sometimes called a Lebesgue† number of the covering $\mathcal{G}$.
Although we shall make use of arguments based on compactness in later sections, it seems appropriate to insert here two results which appear intuitively clear, but whose proof seems to require use of some type of compactness argument.

† HENRI LEBESGUE (1875-1941) is best known for his pioneering work on the modern theory of the integral which is named for him and which is basic to present-day analysis.

11.6 NEAREST POINT THEOREM. Let $F$ be a non-void closed subset of $\mathbf{R}^p$ and let $x$ be a point outside of $F$. Then there exists at least one point $y$ belonging to $F$ such that $\|z - x\| \ge \|y - x\|$ for all $z \in F$.

PROOF. Since $F$ is closed and $x \notin F$, then (cf. Exercise 11.H) the distance from $x$ to $F$, which is defined to be $d = \inf \{\|x - z\| : z \in F\}$, satisfies $d > 0$. Let $F_k = \{z \in F : \|x - z\| \le d + 1/k\}$ for $k \in \mathbf{N}$. According to Example 9.5(f) these sets are closed in $\mathbf{R}^p$ and it is clear that $F_1$ is bounded and that

$F_1 \supseteq F_2 \supseteq \cdots \supseteq F_k \supseteq \cdots.$

Furthermore, by the definition of $d$ and $F_k$, it is seen that $F_k$ is non-empty. It follows from the Cantor Intersection Theorem 11.4 that there is a point $y$ belonging to all $F_k$, $k \in \mathbf{N}$. It is readily seen that $\|x - y\| = d$, so that $y$ satisfies the conclusion. (See Figure 11.3.) Q.E.D.
Figure 11.3

A variant of the next theorem is of considerable importance in the theory of analytic functions. We shall state the result only for $p = 2$ and use intuitive ideas as to what it means for a set to be surrounded by a closed curve (that is, a curve which has no end points).

11.7 CIRCUMSCRIBING CONTOUR THEOREM. Let $F$ be a closed and bounded set in $\mathbf{R}^2$ and let $G$ be an open set which contains $F$. Then there exists a closed curve $C$, lying entirely in $G$ and made up of arcs of a finite number of circles, such that $F$ is surrounded by $C$.

PARTIAL PROOF. If $x$ belongs to $F \subseteq G$, there exists a number $\delta(x) > 0$ such that if $\|y - x\| < \delta(x)$, then $y$ also belongs to $G$. Now let $G(x) = \{y \in \mathbf{R}^2 : \|y - x\| < \tfrac{1}{2}\delta(x)\}$ for each $x$ in $F$. Since the collection $\mathcal{G} = \{G(x) : x \in F\}$ constitutes a covering of the compact set $F$, the union of a finite number of the sets in $\mathcal{G}$, say $G(x_1), \ldots, G(x_k)$, contains the compact set $F$. By using arcs from the circles with centers $x_j$ and radii $\tfrac{1}{2}\delta(x_j)$, we obtain the desired curve $C$. (See Figure 11.4.) The detailed construction of the curve will not be given here. Q.E.D.

Figure 11.4

Exercises

11.A. Show directly from the definition (i.e., without using the Heine-Borel
Theorem) that the open ball given by {(x, y) : x^2 + y^2 < 1} is not compact in R^2.
11.B. Show directly that the entire space R^2 is not compact.
11.C. Prove directly that if K is compact in R^p and F ⊆ K is a closed set, then F
is compact in R^p.
11.D. Prove that if K is a compact subset of R, then K is compact when
regarded as a subset of R^2.
11.E. By modifying the argument in Example 11.2(d), prove that the interval
J = {(x, y) : 0 ≤ x ≤ 1, 0 ≤ y ≤ 1} is compact in R^2.
11.F. Locate the places where the hypotheses that the set K is bounded and that
it is closed were used in the proof of the Heine-Borel Theorem.
11.G. Prove the Cantor Intersection Theorem by selecting a point x_n from F_n
and then applying the Bolzano-Weierstrass Theorem 10.6 to the set {x_n : n ∈ N}.
11.H. If F is closed in R^p and if

d(x, F) = inf {||x − z|| : z ∈ F} = 0,

then x belongs to F.
11.I. Does the Nearest Point Theorem in R imply that there is a strictly positive
real number nearest zero?
11.J. If F is a non-empty closed set in R^p and if x ∉ F, is there a unique point of F
that is nearest to x?
11.K. If K is a compact subset of R^p and x is a point of R^p, then the set
K_x = {x + y : y ∈ K} is also compact. (This set K_x is sometimes called the translation
of the set K by x.)
11.L. The intersection of two open sets is compact if and only if it is empty. Can
the intersection of an infinite collection of open sets be a non-empty compact set?
11.M. If F is a compact subset of R^2 and G is an open set which contains F, then
there exists a closed polygonal curve C lying entirely in G which surrounds F.
11.N. Let {H_n : n ∈ N} be a family of closed subsets of R^p with the property that
no set H_n contains a non-void open set. (For example, H_n is a point or a line in
R^2.) Let G ≠ ∅ be an open set.
(a) If x_1 ∈ G \ H_1, show that there exists a closed ball B_1 with center x_1 such that
B_1 ⊆ G and H_1 ∩ B_1 = ∅.
(b) If x_2 ∉ H_2 belongs to the interior of B_1, show that there exists a closed ball B_2
with center x_2 such that B_2 is contained in the interior of B_1 and H_2 ∩ B_2 = ∅.
(c) Continue this process to obtain a nested family of closed balls such that
H_n ∩ B_n = ∅. By the Cantor Intersection Theorem 11.4 there is a point x_0 common
to all of the B_n. Conclude that x_0 ∈ G \ ∪H_n, so that G cannot be contained in
∪H_n. This result is a form of what is often called "the Baire† Category Theorem."
11.O. A line in R^2 is a set of points (x, y) which satisfy an equation of the form
ax + by + c = 0 where (a, b) ≠ (0, 0). Use the preceding exercise to show that R^2 is
not the union of a countable collection of lines.
11.P. The set 𝒞(Q) of irrational numbers in R is not the union of a count-
able family of closed sets, none of which contains a non-empty open set.
11.Q. The set Q of rational numbers is not the intersection of a countable
collection of open sets in R.

Section 12 Connected Sets


We shall now introduce the notion of a connected set which will be used
occasionally in the following.
12.1 DEFINITION. A subset D ⊆ R^p is said to be disconnected if
there exist two open sets A, B such that A ∩ D and B ∩ D are disjoint,
non-empty, and have union D. In this case the pair A, B is said to form a
disconnection of D. A subset which is not disconnected is said to be
connected. (See Figure 12.1.)

12.2 EXAMPLES. (a) The set N ⊆ R is disconnected, since we can take
A = {x ∈ R : x < 3/2} and B = {x ∈ R : x > 3/2}.
(b) The set H = {1/n : n ∈ N} is disconnected.
(c) The set S consisting of all positive rational numbers is disconnected in
R since we can take A = {x ∈ R : x < √2} and B = {x ∈ R : x > √2}.

† RENÉ LOUIS BAIRE (1874-1932) was a professor at Dijon. He worked in set-theory and
real analysis.
Figure 12.1. A disconnected set.

(d) If 0 < c < 1, then the sets A = {x ∈ R : x ≤ c}, B = {x ∈ R : x > c} split
the unit interval I = {x ∈ R : 0 ≤ x ≤ 1} into disjoint, non-empty sets with
union I. However, since A is not open, this example does not show that I is
disconnected. In fact, we shall show below that the set I is connected.
12.3 THEOREM. The closed unit interval I = [0, 1] is a connected subset
of R.
PROOF. We proceed by contradiction and suppose that A, B are open
sets forming a disconnection of I. Thus A ∩ I and B ∩ I are non-empty
bounded disjoint sets whose union is I. Since A and B are open, the sets
A ∩ I and B ∩ I cannot consist of only one point. (Why?) For the sake of
definiteness, we suppose that there exist points a ∈ A, b ∈ B such that
0 < a < b < 1. Applying the Supremum Property 6.4, we let c =
sup {x ∈ A : x < b} so that 0 < c < 1; hence c ∈ A ∪ B. If c ∈ A, then c ≠ b
and since A is open there is a point a_1 ∈ A, c < a_1, such that the interval
[c, a_1] is contained in {x ∈ A : x < b}, contrary to the definition of c.
Similarly, if c ∈ B, then since B is open there is a point b_1 ∈ B, b_1 < c, such
that the interval [b_1, c] is contained in B ∩ I, contrary to the definition
of c. Hence the hypothesis that I is disconnected leads to a contradiction.
Q.E.D.
The reader should note that the same proof can be used to show that the
open interval (0, 1) is connected in R.
Figure 12.2

12.4 THEOREM. The entire space R^p is connected.
PROOF. If not, then there exist two disjoint non-empty open sets A, B
whose union is R^p. (See Figure 12.2.) Let x ∈ A and y ∈ B and consider the
line segment S joining x and y; namely,

S = {x + t(y − x) : t ∈ I}.

Let A_1 = {t ∈ R : x + t(y − x) ∈ A} and let B_1 = {t ∈ R : x + t(y − x) ∈ B}. It is
easily seen that A_1 and B_1 are disjoint non-empty open subsets of R and
provide a disconnection for I, contradicting Theorem 12.3. Q.E.D.
12.5 COROLLARY. The only subsets of R^p which are both open and
closed are ∅ and R^p.
PROOF. For if A is both open and closed in R^p, then B = R^p \ A is also.
If A is not empty and not all of R^p, then the pair A, B forms a disconnection
for R^p, contradicting the theorem. Q.E.D.

Connected Open Sets

In certain areas of analysis, connected open sets play an especially
important role. By using the definition it is easy to establish the next result.
12.6 LEMMA. An open subset of R^p is connected if and only if it cannot
be expressed as the union of two disjoint non-empty open sets.
It is sometimes useful to have another characterization of open connected
sets. In order to give such a characterization, we shall introduce some
terminology. If x and y are two points in R^p, then a polygonal curve joining
x and y is a set P obtained as the union of a finite number of ordered line
segments (L_1, L_2, ..., L_n) in R^p such that the line segment L_1 has end points
x, z_1; the line segment L_2 has end points z_1, z_2; ...; and the line segment
L_n has end points z_{n−1}, y. (See Figure 12.3.)

12.7 THEOREM. Let G be an open set in R^p. Then G is connected if
and only if any pair of points x, y in G can be joined by a polygonal curve
lying entirely in G.
PROOF. Assume that G is not connected and that A, B is a disconnection
for G. Let x ∈ A ∩ G and y ∈ B ∩ G and let P = (L_1, L_2, ..., L_n) be a
polygonal curve lying entirely in G and joining x and y. Let k be the
smallest natural number such that the end point z_{k−1} of L_k belongs to A ∩ G
and the end point z_k belongs to B ∩ G (see Figure 12.4). If we define A_1
and B_1 by

A_1 = {t ∈ R : z_{k−1} + t(z_k − z_{k−1}) ∈ A ∩ G},
B_1 = {t ∈ R : z_{k−1} + t(z_k − z_{k−1}) ∈ B ∩ G},

then it is easily seen that A_1 and B_1 are disjoint non-empty open subsets of
R. Hence the pair A_1, B_1 form a disconnection for the unit interval I,
contradicting Theorem 12.3. Therefore, if G is not connected, there exist
two points in G which cannot be joined by a polygonal curve in G.
Next, suppose that G is a connected open set in R^p and that x belongs to
G. Let G_1 be the subset of G consisting of all points in G which can be
joined to x by a polygonal curve which lies entirely in G; let G_2 consist of
all the points in G which cannot be joined to x by a polygonal curve lying
in G. It is clear that G_1 ∩ G_2 = ∅. The set G_1 is not empty since it contains
the point x. We shall now show that G_1 is open in R^p. If y belongs to G_1,
it follows from the fact that G is open that for some real number r > 0, then
||w − y|| < r implies that w ∈ G. By definition of G_1, the point y can be
joined to x by a polygonal curve and by adding a segment from y to w, we
infer that w belongs to G_1. Hence G_1 is an open subset of R^p. Similarly,
the subset G_2 is open in R^p. If G_2 is not empty, then the sets G_1, G_2 form a
disconnection of G, contrary to the hypothesis that G is connected.
Therefore, G_2 = ∅ and every point of G can be joined to x by a polygonal
curve lying entirely in G. Q.E.D.

Figure 12.3. A polygonal curve.

Figure 12.4

Connected Sets in R
We close this section by showing that the connected subsets of R are
precisely the intervals (see Section 7).
12.8 THEOREM. A subset of R is connected if and only if it is an
interval.
PARTIAL PROOF. The proof given in Theorem 12.3 can be readily
modified to establish the connectedness of an arbitrary non-void interval.
We leave the details to the reader.
Conversely, let C ⊆ R be connected and suppose that C ≠ ∅. We note
that C has the property that if a, b ∈ C and a < b, then any number c
satisfying a < c < b must also belong to C; for if c ∉ C, then the sets A =
{x ∈ R : x < c} and B = {x ∈ R : x > c} form a disconnection of C.
(i) Now suppose that C is bounded above and below, and let a = inf C
and b = sup C. We shall show that C must have one of the four forms

[a, b], [a, b), (a, b], (a, b).

Indeed, if a ∈ C and b ∈ C, then we have seen in the preceding paragraph
that [a, b] ⊆ C and the fact that C ⊆ [a, b] follows from the fact that a and b
are lower and upper bounds, respectively, of C.
If a ∈ C but b ∉ C, let b' be any number with a ≤ b' < b. Since b = sup C,
there must be an element b'' ∈ C such that a ≤ b' < b''. Therefore the
number b' must belong to C and, since b' is any number satisfying a ≤
b' < b, we infer that C = [a, b).
Similarly, if a ∉ C but b ∈ C, we infer that C = (a, b], while if a ∉ C and
b ∉ C, then we deduce that C = (a, b).
(ii) Now suppose that C is bounded below but not bounded above, and let
a = inf C, so that C ⊆ [a, +∞). If a ∈ C and if x is any real number with
a ≤ x, then since C is not bounded above there exists c ∈ C such that x < c,
whence it follows from the above property that x ∈ C. Since x is an arbitrary
number satisfying a ≤ x, we conclude that C = [a, +∞).
Similarly, if a ∉ C we conclude that C = (a, +∞).
(iii) If C is not bounded below but is bounded above and if b = sup C, then
there are the two cases C = (−∞, b] or C = (−∞, b) according as b ∈ C or
b ∉ C.
(iv) Finally, if C is neither bounded below nor bounded above, then we
have the case C = (−∞, +∞). Q.E.D.

Exercises

12.A. If A and B are connected subsets of R^p, give examples to show that A ∪ B,
A ∩ B, A \ B can be either connected or disconnected.
12.B. If C ⊆ R^p is connected and x is a cluster point of C, then C ∪ {x} is
connected.
12.C. If C ⊆ R^p is connected, show that its closure C^− (see Exercise 9.L) is also
connected.
12.E. If K ⊆ R^p is convex (see Exercise 8.Q), then K is connected.
12.F. The Cantor set F is wildly disconnected. Show that if x, y ∈ F, x ≠ y, then
there is a disconnection A, B of F such that x ∈ A, y ∈ B.
12.G. If C_1 and C_2 are connected subsets of R, then the product C_1 × C_2 is a
connected subset of R^2.
12.H. Show that the set

A = {(x, y) ∈ R^2 : 0 < y ≤ x^2, x ≠ 0} ∪ {(0, 0)}

is connected in R^2. However there does not exist a polygonal curve lying entirely in
A joining (0, 0) to other points in the set.
12.I. Show that the set

S = {(x, y) ∈ R^2 : y = sin (1/x), x ≠ 0} ∪ {(0, y) : −1 ≤ y ≤ 1}

is connected in R^2. However, it is not always possible to join two points in S by a
polygonal curve (or any "continuous" curve) lying entirely in S.

Section 13 The Complex Number System†

Once the real number system is at hand, it is a simple matter to create the
complex number system. We shall indicate in this section how the complex
field can be constructed.
As seen before, the real number system is a field which satisfies certain
additional properties. In Section 8, we constructed the Cartesian space R^p
and introduced some algebraic operations in the p-fold Cartesian product of
R. However, we did not make R^p into a field. It may come as a surprise
that it is not possible to define a multiplication which makes R^p, p ≥ 3, into a
field. Nevertheless, it is possible to define a multiplication operation in
R × R which makes this set into a field. We now introduce the desired
operations.
13.1 DEFINITION. The complex number system C consists of all or-
dered pairs (x, y) of real numbers with the operation of addition defined by

(x, y) + (x', y') = (x + x', y + y'),

and the operation of multiplication defined by

(x, y) · (x', y') = (xx' − yy', xy' + x'y).

Thus the complex number system C has the same elements as the
two-dimensional space R^2. It has the same addition operation, but it
possesses a multiplication as R^2 does not. Therefore, considered merely as
sets, C and R^2 are equal since they have the same elements; however, from
the standpoint of algebra, they are not the same since they possess different
operations.
An element of C is called a complex number and is often denoted by a
single letter such as z. If z = (x, y), then we refer to the real number x as the
real part of z and to y as the imaginary part of z, in symbols,
x = Re z, y = Im z.
The complex number z̄ = (x, −y) is called the conjugate of z = (x, y).

† This section may be omitted on a first reading.


It is an important fact that the definition of addition and multiplication
given above for elements of C makes it a "field" in the sense of abstract
algebra. That is, it satisfies the algebraic properties listed in 4.1 provided
the number 0 in (A3) is replaced by the pair (0, 0), the element corre-
sponding to −a in (A4) is the pair (−x, −y), the number 1 in (M3) is
replaced by the pair (1, 0), and the number corresponding to 1/a is the
pair

(x/(x^2 + y^2), −y/(x^2 + y^2))

when (x, y) ≠ (0, 0).
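The operations of Definition 13.1 and the inverse pair just described can be tried out directly on ordered pairs. The following is only a sketch: the function names add, mul and inv are assumptions chosen for illustration, and exact rational arithmetic is used so that the product of a pair with its inverse comes out as exactly (1, 0).

```python
from fractions import Fraction

def add(z, w):
    """(x, y) + (x', y') = (x + x', y + y')."""
    (x, y), (u, v) = z, w
    return (x + u, y + v)

def mul(z, w):
    """(x, y) . (x', y') = (xx' - yy', xy' + x'y)."""
    (x, y), (u, v) = z, w
    return (x * u - y * v, x * v + u * y)

def inv(z):
    """The pair corresponding to 1/z, defined only when z != (0, 0)."""
    x, y = z
    n = x * x + y * y
    if n == 0:
        raise ZeroDivisionError("(0, 0) has no multiplicative inverse")
    return (Fraction(x) / n, Fraction(-y) / n)

z = (3, 4)                    # hypothetical sample element of C
print(mul((0, 1), (0, 1)))    # (-1, 0)
print(inv(z))
print(mul(z, inv(z)) == (1, 0))
```

The last line compares the product with the identity element (1, 0); the comparison succeeds because the rational entries equal the integers 1 and 0.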


Sometimes it is convenient to adopt part of the notation of Section 8 and
write

az = a(x, y) = (ax, ay),

when a is a real number and z = (x, y) is in C. With this notation, it is clear
that each element in C has a unique representation in the form of a sum of a
product of a real number with (1, 0) and of the product of a real number with
(0, 1). Thus we can write

z = (x, y) = x(1, 0) + y(0, 1).

Since the element (1, 0) is the identity element of C, it is natural to denote it
by 1 (or to suppress it entirely when it is a factor). For the sake of brevity it is
convenient to introduce a symbol for (0, 1) and i is the conventional choice.
With this notation, we write

z = (x, y) = x + iy.

In addition, we have z̄ = (x, −y) = x − iy and

x = Re z = (z + z̄)/2, y = Im z = (z − z̄)/(2i).

By Definition 13.1, we have (0, 1)(0, 1) = (−1, 0) which can be written
as i^2 = −1. Thus in C the quadratic equation

z^2 + 1 = 0

has a solution. The historical reason for the development of the complex
number system was to obtain a system of "numbers" in which every
quadratic equation has a solution. It was realized that not every equation
with real coefficients has a real solution, and so complex numbers were
invented to remedy this defect. It is a well-known fact that not only do the
complex numbers suffice to produce solutions for every quadratic equation
with real coefficients, but they also suffice to guarantee solutions for any
polynomial equation of arbitrary degree and with coefficients which may be
complex numbers. This result is called the Fundamental Theorem of
Algebra and was proved first by the great Gauss† in 1799.
Although C cannot be given the order properties discussed in Section 5,
it is easy to endow it with the metric and topological structure of Sections 8
and 9. For, if z = (x, y) belongs to C, we define the absolute value of z to
be

|z| = (x^2 + y^2)^{1/2}.

It is readily seen that the absolute value just defined has the properties:
(i) |z| ≥ 0;
(ii) |z| = 0 if and only if z = 0;
(iii) |wz| = |w| |z|;
(iv) | |w| − |z| | ≤ |w ± z| ≤ |w| + |z|.
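Properties (iii) and (iv) can be spot-checked numerically for particular pairs. The sketch below is an illustration under assumed sample values, not a proof; the names absval and mul are hypothetical, and a small tolerance is allowed because floating-point arithmetic is inexact.

```python
import math

def absval(z):
    """Absolute value |z| = (x^2 + y^2)^(1/2) of the pair z = (x, y)."""
    x, y = z
    return math.hypot(x, y)

def mul(w, z):
    """Product of two pairs as in Definition 13.1."""
    (u, v), (x, y) = w, z
    return (u * x - v * y, u * y + x * v)

w, z = (1.0, 2.0), (3.0, -4.0)   # assumed sample values
tol = 1e-12

# (iii) |wz| = |w| |z|
assert math.isclose(absval(mul(w, z)), absval(w) * absval(z))

# (iv) | |w| - |z| | <= |w ± z| <= |w| + |z|, checked for both signs
for s in (1.0, -1.0):
    combo = (w[0] + s * z[0], w[1] + s * z[1])
    lower = abs(absval(w) - absval(z))
    upper = absval(w) + absval(z)
    assert lower - tol <= absval(combo) <= upper + tol

print(absval(z), mul(w, z))
```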
It will be observed that the absolute value of the complex number z = (x, y)
is precisely the same as the norm of the element (x, y) in R^2. Therefore,
all of the topological properties of the Cartesian spaces that were intro-
duced and studied in Sections 9 to 12 are meaningful and valid for C. In
particular, the notions of open and closed sets in C are exactly as for the
Cartesian space R^2. Furthermore, the Bolzano-Weierstrass Theorem
10.6, and the Heine-Borel Theorem 11.3 and its consequences, also hold
in C, as does Theorem 12.7.
The reader should keep these remarks in mind throughout the remaining
sections of this book. It will be seen that all of the succeeding material
which applies to Cartesian spaces of dimension exceeding one, applies
equally well to the complex number system. Thus most of the results to be
obtained pertaining to sequences, continuous functions, derivatives, inte-
grals, and infinite series are also valid for C without change either in state-
ment or in proof. The only exceptions to this statement are those proper-
ties which are based on the order properties of R.
In this sense complex analysis is a special case of real analysis; however,
there are a number of deep and important new features to the study of
analytic functions that have no general counterpart in the realm of real
analysis. Hence only the fairly superficial aspects of complex analysis are
subsumed in what we shall do.

Exercises

13.A. Show that the complex number iz is obtained from z by a counter-
clockwise rotation of π/2 radians (= 90°) around the origin.
13.B. If c = (cos θ, sin θ) = cos θ + i sin θ, then the number cz is obtained from z
by a counter-clockwise rotation of θ radians around the origin.
† CARL FRIEDRICH GAUSS (1777-1855), the prodigious son of a day laborer, was one of the
greatest of all mathematicians, but is also remembered for his work in astronomy, physics, and
geodesy. He became professor and director of the Observatory at Göttingen.

13.C. Describe the geometrical relation between the complex numbers z and
az + b, where a ≠ 0. Show that the mapping defined for z ∈ C by f(z) = az + b sends
circles into circles and lines into lines.
13.D. Describe the geometrical relations among the complex numbers z, z̄ and
1/z for z ≠ 0. Show that the mapping defined by g(z) = z̄ sends circles into circles
and lines into lines. Which circles and lines are left fixed under g?
13.E. Show that the inversion mapping, defined by h(z) = 1/z, sends circles and
lines into circles and lines. Which circles are sent into lines? Which lines are sent
into circles? Examine the images under h of the vertical lines given by the equation
Re z = constant, the horizontal lines Im z = constant, the circles |z| = constant.
13.F. Investigate the geometrical character of the mapping defined by g(z) = z^2.
Determine if the mapping g is one-one and if it maps C onto all of C. Examine the
inverse images under g of the lines

Re z = constant, Im z = constant,

and the circles |z| = constant.


III
CONVERGENCE

The material in the preceding two chapters should provide an adequate


understanding of the real number system and the Cartesian spaces. Now
that these algebraic and topological foundations have been laid, we are
prepared to pursue questions of a more analytic nature. We shall begin
with a study of the convergence of sequences. Some of the results in this
chapter may be familiar to the reader from other courses in analysis, but
the presentation given here is intended to be rigorous and to give certain
more profound results than are usually discussed in earlier courses.
We shall first introduce the meaning of the convergence of a sequence of
elements in R^p and establish some elementary (but useful) results about
convergent sequences. We then present some important criteria for con-
vergence. Next we study the convergence and uniform convergence of
sequences of functions. After a brief section on the limit superior we
append a final section which, though interesting, can be omitted without
loss of continuity since the results will not be applied later.
Because of the linear limitations inherent in a book, we have decided to
follow this chapter with a study of continuity, differentiation, and integra-
tion. This has the unfortunate aspect of deferring a full presentation of
series until much later. The instructor is encouraged to give at least a brief
introduction to series along with this chapter, or he can go directly to the
first part of Chapter VI after Section 16, if he prefers to do so.

Section 14 Introduction to Sequences


Although the theory of convergence can be presented on a very abstract
level, we prefer to discuss the convergence of sequences in a Cartesian
space R^p, paying special attention to the case of the real line. The reader
should interpret the ideas by drawing diagrams in R and R^2.


14.1 DEFINITION. If S is any set, a sequence in S is a function on the
set N = {1, 2, ...} of natural numbers and whose range is in S. In particu-
lar, a sequence in R^p is a function whose domain is N and whose range is
contained in R^p.
In other words, a sequence in R^p assigns to each natural number n = 1, 2, ..., a
uniquely determined element of R^p. Traditionally, the element of R^p which is
assigned to a natural number n is denoted by a symbol such as x_n and, although this
notation is at variance with that employed for most functions, we shall adhere to the
conventional symbolism. [To be consistent with earlier notation, if X : N → R^p is a
sequence, the value of X at n ∈ N should be symbolized by X(n), rather than by x_n.]
While we accept the traditional notation, we also wish to distinguish between the
function X and its values X(n) = x_n. Hence when the elements of the sequence
(that is, the values of the function) are denoted by x_n, we shall denote the function
by the notation X = (x_n) or by X = (x_n : n ∈ N). We use parentheses to indicate that
the ordering induced by that in N is a matter of importance. Thus we are
distinguishing notationally between the sequence X = (x_n : n ∈ N) and the set
{x_n : n ∈ N} of values of this sequence.
In defining sequences we often list in order the elements of the sequence,
stopping when the rule of formation seems evident. Thus we may write

(2, 4, 6, 8, ...)
for the sequence of even integers. A more satisfactory method is to specify a
formula for the general term of the sequence, such as
(2n : n ∈ N).

In practice it is often more convenient to specify the value x_1 and a method of
obtaining x_{n+1}, n ≥ 1, when x_n is known. Still more generally, we may specify x_1
and a rule for obtaining x_{n+1} from x_1, x_2, ..., x_n. We shall refer to either of these
methods as inductive definitions of the sequence. In this way we might define the
sequence of even natural numbers by the definition

x_1 = 2, x_{n+1} = x_n + 2, n ≥ 1,

or by the (apparently more complicated) definition

x_1 = 2, x_{n+1} = x_n + x_1, n ≥ 1.

Clearly, many other methods of defining this sequence are possible.
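To see the inductive definitions at work, the following sketch generates the first few even natural numbers in three ways: by the formula 2n, by the recursion x_{n+1} = x_n + 2, and by the recursion x_{n+1} = x_n + x_1 (our reading of the second inductive definition). The generator names are assumptions introduced only for this illustration.

```python
from itertools import islice

def evens_by_step():
    """x_1 = 2, x_{n+1} = x_n + 2."""
    x = 2
    while True:
        yield x
        x = x + 2

def evens_by_first_term():
    """x_1 = 2, x_{n+1} = x_n + x_1."""
    x1 = 2
    x = x1
    while True:
        yield x
        x = x + x1

first = list(islice(evens_by_step(), 6))
second = list(islice(evens_by_first_term(), 6))
formula = [2 * n for n in range(1, 7)]

print(first)
print(first == second == formula)
```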

We now introduce some methods of constructing new sequences from
given ones.
14.2 DEFINITION. If X = (x_n) and Y = (y_n) are sequences in R^p, then
we define their sum to be the sequence X + Y = (x_n + y_n) in R^p, their
difference to be the sequence X − Y = (x_n − y_n), and their inner product to
be the sequence X · Y = (x_n · y_n) in R which is obtained by taking the inner
product of corresponding terms. Similarly, if X = (x_n) is a sequence in R
and if Y = (y_n) is a sequence in R^p, we define the product of X and Y to be
the sequence in R^p denoted by XY = (x_n y_n); or, if c ∈ R and X = (x_n), we
define cX = (cx_n). Finally, if Y = (y_n) is a sequence in R with y_n ≠ 0, we
can define the quotient of a sequence X = (x_n) in R^p by Y to be the
sequence X/Y = (x_n/y_n).
For example, if X, Y are the sequences in R given by

X = (2, 4, 6, ..., 2n, ...),
Y = (1, 1/2, 1/3, ..., 1/n, ...),

then we have

X + Y = (3, 9/2, 19/3, ..., (2n^2 + 1)/n, ...),
X − Y = (1, 7/2, 17/3, ..., (2n^2 − 1)/n, ...),
XY = (2, 2, 2, ..., 2, ...),
3X = (6, 12, 18, ..., 6n, ...),
X/Y = (2, 8, 18, ..., 2n^2, ...).

Similarly, if Z denotes the sequence in R given by

Z = (1, 0, 1, ..., (1 − (−1)^n)/2, ...),

then we have defined X + Z, X − Z and XZ; but X/Z is not defined, since
some of the elements in Z are zero.
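The termwise operations in this example are easy to reproduce. The sketch below models a sequence as a function of n and uses exact fractions; the names term_X, term_Y and term_Z are assumptions made for illustration, and only the first three terms are computed.

```python
from fractions import Fraction

def term_X(n):
    return Fraction(2 * n)                 # X = (2n)

def term_Y(n):
    return Fraction(1, n)                  # Y = (1/n)

def term_Z(n):
    return Fraction(1 - (-1) ** n, 2)      # Z = ((1 - (-1)^n)/2)

ns = range(1, 4)
total = [term_X(n) + term_Y(n) for n in ns]       # X + Y
difference = [term_X(n) - term_Y(n) for n in ns]  # X - Y
product = [term_X(n) * term_Y(n) for n in ns]     # XY
quotient = [term_X(n) / term_Y(n) for n in ns]    # X/Y
zs = [term_Z(n) for n in ns]

print(total, difference, product, quotient, zs)

# X/Z is not defined: the second term of Z is zero.
try:
    term_X(2) / term_Z(2)
except ZeroDivisionError:
    print("X/Z is undefined at n = 2")
```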
We now come to the notion of the limit of a sequence.
14.3 DEFINITION. Let X = (x_n) be a sequence in R^p. An element x
of R^p is said to be a limit of X if, for each neighborhood V of x there is a
natural number K_V such that for all n ≥ K_V, then x_n belongs to V. If x is a
limit of X, we also say that X converges to x. If a sequence has a limit, we
say that the sequence is convergent. If a sequence has no limit then we say
that it is divergent.
The notation K_V is used to suggest that the choice of K will depend on
V. It is clear that a small neighborhood V will usually require a large
value of K_V in order to guarantee that x_n ∈ V for all n ≥ K_V.
We have defined the limit of a sequence X = (x_n) in terms of neighbor-
hoods. It is often convenient to use the norm in R^p to give an equivalent
definition, which we now state as a theorem.

14.4 THEOREM. Let X = (x_n) be a sequence in R^p. An element x of
R^p is a limit of X if and only if for each ε > 0 there is a natural number K(ε)
such that for all n ≥ K(ε), then ||x_n − x|| < ε.
PROOF. Suppose that x is a limit of the sequence X according to
Definition 14.3. Now let ε > 0 and consider the open ball V(ε) =
{y ∈ R^p : ||y − x|| < ε}, which is a neighborhood of x. By Definition 14.3
there is a natural number K_{V(ε)} such that if n ≥ K_{V(ε)}, then x_n ∈ V(ε).
Hence if n ≥ K_{V(ε)}, then ||x_n − x|| < ε. This shows that the stated property
holds when x is a limit of X.
Conversely, suppose that the property in the theorem holds for all ε > 0;
we must show that Definition 14.3 is satisfied. To do this, let V be any
neighborhood of x; then there is a number ε > 0 such that the open ball
V(ε) with center x and radius ε is contained in V. According to the
property in the theorem, there is a natural number K(ε) such that if
n ≥ K(ε), then ||x_n − x|| < ε. Stated differently, if n ≥ K(ε), then x_n ∈ V(ε);
hence x_n ∈ V and the requirement in Definition 14.3 is satisfied. Q.E.D.
14.5 UNIQUENESS OF LIMITS. A sequence in R^p can have at most one
limit.
PROOF. Suppose, on the contrary, that x', x'' are limits of X = (x_n) and
that x' ≠ x''. Let V', V'' be disjoint neighborhoods of x', x'', respectively,
and let K', K'' be natural numbers such that if n ≥ K' then x_n ∈ V' and if
n ≥ K'' then x_n ∈ V''. Let K = sup {K', K''} so that both x_K ∈ V' and x_K ∈
V''. We infer that x_K belongs to V' ∩ V'', contrary to the supposition that
V' and V'' are disjoint. Q.E.D.
When a sequence X = (x_n) in R^p has a limit x, we often write

x = lim X, or x = lim (x_n),

or sometimes use the symbolism x_n → x.
We say that a sequence X = (x_n) in R^p is bounded if there exists M > 0
such that ||x_n|| ≤ M for all n ∈ N.
14.6 LEMMA. A convergent sequence in R^p is bounded.
PROOF. Let x = lim (x_n) and let ε = 1. By Theorem 14.4 there exists a
natural number K = K(1) such that if n ≥ K, then ||x_n − x|| < 1. By using
the Triangle Inequality, we infer that if n ≥ K, then ||x_n|| < ||x|| + 1. If we
set M = sup {||x_1||, ||x_2||, ..., ||x_{K−1}||, ||x|| + 1}, then ||x_n|| ≤ M for all n ∈ N.
Q.E.D.
It might be suspected that the theory of convergence of sequences in R^p
is more complicated than in R, but this is not the case (except for notational
matters). In fact, the next result is important in that it shows that ques-
tions of convergence in R^p can be reduced to the identical questions in R
for each of the coordinate sequences.
Before stating this result, we recall that a typical element x in R^p is represented
in coordinate fashion by a "p-tuple"

x = (x_1, x_2, ..., x_p).

Hence each element in a sequence (x_n) in R^p has a similar representation; thus
x_n = (x_{1n}, x_{2n}, ..., x_{pn}). In this way, the sequence (x_n) generates p sequences of real
numbers; namely, (x_{1n}), (x_{2n}), ..., (x_{pn}). We shall now show that the convergence
of the sequence (x_n) is faithfully reflected by the convergence of these p sequences
of coordinates.

14.7 THEOREM. A sequence (x_n) in R^p with

x_n = (x_{1n}, x_{2n}, ..., x_{pn}), n ∈ N,

converges to an element y = (y_1, y_2, ..., y_p) if and only if the corresponding p
sequences of real numbers

(14.1) (x_{1n}), (x_{2n}), ..., (x_{pn}),

converge to y_1, y_2, ..., y_p respectively.
PROOF. If x_n → y, then ||x_n − y|| < ε for n ≥ K(ε). In view of Theorem
8.10, for each j = 1, 2, ..., p, we have

|x_{jn} − y_j| ≤ ||x_n − y|| < ε, for n ≥ K(ε).

Hence each of the p coordinate sequences must converge to the corre-
sponding real number.
Conversely, suppose that the sequences in (14.1) converge to y_j for
j = 1, 2, ..., p. Given ε > 0, there is a natural number M(ε) such that if
n ≥ M(ε), then

|x_{jn} − y_j| < ε/√p for j = 1, 2, ..., p.

From this it follows that, when n ≥ M(ε), then

||x_n − y||^2 = Σ_{j=1}^{p} |x_{jn} − y_j|^2 ≤ ε^2,

so that the sequence (x_n) converges to y. Q.E.D.
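The two estimates behind this proof can be observed numerically. In the sketch below the sequence x_n = (1/n, 1 − 1/n^2, (−1)^n/n) in R^3 and its limit y = (0, 1, 0) are assumptions chosen for illustration. For several n it checks that the largest coordinate error is at most ||x_n − y||, and that ||x_n − y|| is at most √p times the largest coordinate error (which is what the bound ε/√p in the converse part delivers).

```python
import math

def x(n):
    """Assumed sample sequence in R^3 converging to (0, 1, 0)."""
    return (1 / n, 1 - 1 / n**2, (-1) ** n / n)

y = (0.0, 1.0, 0.0)
p = len(y)
tol = 1e-12   # allowance for floating-point rounding

rows = []
for n in (1, 10, 100, 1000):
    diffs = [abs(a - b) for a, b in zip(x(n), y)]
    norm = math.sqrt(sum(d * d for d in diffs))
    # |x_jn - y_j| <= ||x_n - y|| for every coordinate j
    assert max(diffs) <= norm + tol
    # ||x_n - y|| <= sqrt(p) * max_j |x_jn - y_j|
    assert norm <= math.sqrt(p) * max(diffs) + tol
    rows.append((n, max(diffs), norm))

for row in rows:
    print(row)
```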

Some Examples

We shall now present some examples, establishing the convergence of a


sequence using only the methods we presently have available. It will be
noticed that in order to proceed, we must have “‘guessed”’ the value of the
limit by previous examination of the sequence. All of the examples to be
presented next involve some manipulative skill and ‘‘trickery,” but the
results we obtain will be very useful to us in establishing (by less tricky
procedures) the convergence of other sequences. So we are as interested
in the results as in the methods.
14.8 EXAMPLES. (a) Let (x_n) be the sequence in R where x_n = 1/n.
We shall show that lim (1/n) = 0. To do this let ε > 0; according to
Corollary 6.7(b) (of the Archimedean Property) there exists a natural
number K(ε) such that 1/K(ε) < ε. Then, if n ≥ K(ε) we have

0 < x_n = 1/n ≤ 1/K(ε) < ε,

whence it follows that |x_n − 0| < ε for n ≥ K(ε). Since ε > 0 is arbi-
trary this proves that lim (1/n) = 0.
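The argument in (a) only asserts that a suitable K(ε) exists, but for this sequence one can be written down. The sketch below uses the choice K(ε) = ⌊1/ε⌋ + 1, which is an assumption made for illustration (any natural number larger than 1/ε would do), and spot-checks the inequality on a finite range of n.

```python
import math

def K(eps):
    """Return a natural number K with 1/K < eps (one possible choice)."""
    return math.floor(1 / eps) + 1

checked = {}
for eps in (0.5, 0.1, 0.003):
    k = K(eps)
    assert 1 / k < eps
    # Finite spot check of |x_n - 0| < eps for n >= K(eps).
    assert all(abs(1 / n - 0) < eps for n in range(k, k + 1000))
    checked[eps] = k

print(checked)
```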
(b) Let a > 0 and consider the sequence X = (1/(1 + na)) in R. We shall
show that lim X = 0. First we note that

0 < 1/(1 + na) < 1/(na).

We want the dominant term to be smaller than a given ε > 0 when n is
sufficiently large. By Corollary 6.7(b) again, there exists a natural number
K(ε) such that 1/K(ε) < aε. Then if n ≥ K(ε) we have

0 < 1/(1 + na) < 1/(na) ≤ 1/(K(ε)a) < ε,

whence it follows that |1/(1 + na) − 0| < ε for n ≥ K(ε). Since ε > 0 is arbi-
trary this shows that lim X = 0.
(c) Let b ∈ R satisfy 0 < b < 1 and consider the sequence (b^n). We shall
show that lim (b^n) = 0. To do this, it is convenient to write b in the form

b = 1/(1 + a)

where a > 0, and to use Bernoulli's Inequality (1 + a)^n ≥ 1 + na for n ∈ N.
(See Exercise 5.C.) Hence

0 < b^n = 1/(1 + a)^n ≤ 1/(1 + na) < 1/(na).

As in the preceding example, if ε > 0 is given, then there is a natural
number K(ε) such that |b^n − 0| < ε when n ≥ K(ε). Therefore we have
lim (b^n) = 0.
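The Bernoulli estimate gives a usable, though far from sharp, value of K(ε). In the sketch below the values b = 0.9 and ε = 0.01 are assumptions chosen for illustration, and K is taken to be a natural number exceeding 1/(aε), where a = 1/b − 1, so that 1/(na) < ε for n ≥ K. The code also finds the first index at which b^n actually drops below ε, for comparison.

```python
import math

def K_bernoulli(b, eps):
    """A natural number K with 1/(K a) < eps, where b = 1/(1 + a)."""
    a = 1 / b - 1
    return math.floor(1 / (a * eps)) + 1

b, eps = 0.9, 0.01          # assumed sample values
k = K_bernoulli(b, eps)

# Finite spot check: b^n < eps for n >= K.
assert all(b ** n < eps for n in range(k, k + 200))

# First index at which b^n < eps; it is much smaller than K.
first = next(n for n in range(1, k + 1) if b ** n < eps)
print(k, first)
```

The gap between the two printed numbers reflects how much is given away by the inequality b^n < 1/(na); the estimate is crude but sufficient for the proof.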
(d) Let c > 0 and consider the sequence (c^{1/n}). We shall show that
lim (c^{1/n}) = 1.
First suppose that c > 1. Then c^{1/n} = 1 + d_n with d_n > 0 and hence by
Bernoulli's Inequality

c = (1 + d_n)^n ≥ 1 + nd_n.

It follows that c − 1 ≥ nd_n. Since c > 1, we have c − 1 > 0. Hence given
ε > 0, then there is a natural number K(ε) such that if n ≥ K(ε), then

0 < c^{1/n} − 1 = d_n ≤ (c − 1)/n < ε.

Therefore |c^{1/n} − 1| < ε when n ≥ K(ε), as desired.
Now suppose that 0 < c < 1 (for the case c = 1 is obvious). Then c^{1/n} =
1/(1 + h_n) with h_n > 0 and hence by Bernoulli's Inequality

c = 1/(1 + h_n)^n ≤ 1/(1 + nh_n) < 1/(nh_n).

It follows that 0 < h_n < 1/(nc). But since c > 0, given ε > 0 there is a natural
number K(ε) such that if n ≥ K(ε) then

0 < 1 − c^{1/n} = h_n/(1 + h_n) < h_n < 1/(nc) < ε.

Therefore |c^{1/n} − 1| < ε when n ≥ K(ε), as desired.
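(e) Consider the sequence $X = (n^{1/n})$; we shall show that $\lim X = 1$, a rather non-obvious fact. Write $n^{1/n} = 1 + k_n$ with $k_n > 0$ for $n > 1$; hence $n = (1 + k_n)^n$. By the Binomial Theorem, when $n > 1$ we have
\[ n = 1 + n k_n + \frac{n(n-1)}{2} k_n^2 + \cdots > \frac{n(n-1)}{2} k_n^2. \]
It follows that $k_n^2 < 2/(n-1)$, so that
\[ k_n < \sqrt{\frac{2}{n-1}}. \]
Now let $\varepsilon > 0$ be given. Then there exists $K(\varepsilon)$ such that if $n \ge K(\varepsilon)$, then $1/(n-1) < \varepsilon^2/2$; whence it follows that $0 < k_n < \varepsilon$ and so
\[ 0 < n^{1/n} - 1 = k_n < \varepsilon \]
for $n \ge K(\varepsilon)$. Since $\varepsilon > 0$ is arbitrary, this proves that $\lim (n^{1/n}) = 1$.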
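The estimate in Example (e) can be checked numerically. The following is only an illustrative sketch: the helper names `k_n` and `threshold` are hypothetical, and floating-point arithmetic stands in for exact real arithmetic.

```python
import math

def k_n(n: int) -> float:
    """Return k_n = n**(1/n) - 1, the quantity estimated in Example (e)."""
    return n ** (1.0 / n) - 1.0

def threshold(eps: float) -> int:
    """Return a natural number K with 1/(K - 1) < eps**2 / 2 (hypothetical helper)."""
    return math.floor(2.0 / eps**2) + 2

# Compare k_n with the bound sqrt(2/(n-1)) obtained from the Binomial Theorem.
for n in (2, 10, 100, 10_000):
    bound = math.sqrt(2.0 / (n - 1))
    print(n, k_n(n), bound)

# For a given eps, every n >= threshold(eps) should satisfy 0 < k_n < eps.
eps = 0.1
K = threshold(eps)
print(K, all(0 < k_n(n) < eps for n in range(K, K + 1000)))
```

The bound is far from sharp, but it is all the argument needs: it tends to 0, and so it forces $k_n$ to 0 as well.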
These examples show that a body of results which will make the in-
genuity employed here unnecessary would prove highly useful. We shall
obtain such results in the following two sections; but we close this section
with a result which is very often useful.

14.9 THEOREM. Let $X = (x_n)$ be a sequence in $\mathbf{R}^p$ and let $x \in \mathbf{R}^p$. Let $A = (a_n)$ be a sequence in $\mathbf{R}$ which is such that
(i) $\lim (a_n) = 0$,
(ii) $\|x_n - x\| \le C\,|a_n|$ for some $C > 0$ and all $n \in \mathbf{N}$.
Then $\lim (x_n) = x$.

PROOF. Let $\varepsilon > 0$ be given. Since $\lim (a_n) = 0$, there exists a natural number $K(\varepsilon)$ such that if $n \ge K(\varepsilon)$ then
\[ C\,|a_n| = C\,|a_n - 0| \le \varepsilon. \]
It follows that
\[ \|x_n - x\| \le C\,|a_n| \le \varepsilon \]
for all $n \ge K(\varepsilon)$. Since $\varepsilon > 0$ is arbitrary, we infer that $\lim (x_n) = x$. Q.E.D.
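As a small illustration of Theorem 14.9, consider a sequence in $\mathbf{R}^2$ dominated by a multiple of $a_n = 1/n$. The sequence, the constant `C = 2`, and the helper names below are assumptions chosen for the sketch, not taken from the text.

```python
import math

LIMIT = (1.0, 2.0)   # candidate limit x in R^2 (illustrative choice)
C = 2.0              # constant in hypothesis (ii), chosen generously

def x(n: int) -> tuple[float, float]:
    """Illustrative sequence x_n = (1 + 1/n, 2 - 1/n) in R^2."""
    return (1.0 + 1.0 / n, 2.0 - 1.0 / n)

def a(n: int) -> float:
    """Dominating real sequence a_n = 1/n, which converges to 0."""
    return 1.0 / n

def dist(u, v) -> float:
    """Euclidean norm of u - v in R^2."""
    return math.hypot(u[0] - v[0], u[1] - v[1])

def K(eps: float) -> int:
    """A natural number K with C * a_n < eps for all n >= K."""
    return math.floor(C / eps) + 1

# Hypothesis (ii): ||x_n - x|| <= C |a_n| (here ||x_n - x|| = sqrt(2)/n).
print(all(dist(x(n), LIMIT) <= C * a(n) for n in range(1, 1000)))

# Conclusion: ||x_n - x|| < eps once n >= K(eps).
eps = 1e-3
print(all(dist(x(n), LIMIT) < eps for n in range(K(eps), K(eps) + 1000)))
```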

Exercises

14.A. Let $b \in \mathbf{R}$; show that $\lim (b/n) = 0$.
14.B. Show that $\lim (1/n - 1/(n+1)) = 0$.
14.C. Let $X = (x_n)$ be a sequence in $\mathbf{R}^p$ which is convergent to $x$, and let $c \in \mathbf{R}$. Show that $\lim (c x_n) = cx$.
14.D. Let $X = (x_n)$ be a sequence in $\mathbf{R}^p$ which is convergent to $x$. Show that $\lim (\|x_n\|) = \|x\|$. (Hint: use the Triangle Inequality.)
14.E. Let $X = (x_n)$ be a sequence in $\mathbf{R}^p$ and let $\lim (\|x_n\|) = 0$. Show that $\lim (x_n) = 0$. However, give an example in $\mathbf{R}$ to show that the convergence of $(|x_n|)$ may not imply the convergence of $(x_n)$.
14.F. Show that $\lim (1/\sqrt{n}) = 0$. In fact, if $(x_n)$ is a sequence of positive numbers and $\lim (x_n) = 0$, then $\lim (\sqrt{x_n}) = 0$.
14.G. Let $d \in \mathbf{R}$ satisfy $d > 1$. Use Bernoulli's Inequality to show that the sequence $(d^n)$ is not bounded in $\mathbf{R}$. Hence it is not convergent.
14.H. Let $b \in \mathbf{R}$ satisfy $0 < b < 1$; show that $\lim (n b^n) = 0$. (Hint: use the Binomial Theorem as in Example 14.8(e).)
14.I. Let $X = (x_n)$ be a sequence of strictly positive real numbers such that $\lim (x_{n+1}/x_n) < 1$. Show that for some $r$ with $0 < r < 1$ and some $C > 0$, we have $0 < x_n < C r^n$ for all sufficiently large $n \in \mathbf{N}$. Use this to show that $\lim (x_n) = 0$.
14.J. Let $X = (x_n)$ be a sequence of strictly positive real numbers such that $\lim (x_{n+1}/x_n) > 1$. Show that $X$ is not a bounded sequence and hence is not convergent.
14.K. Give an example of a convergent sequence $(x_n)$ of strictly positive real numbers such that $\lim (x_{n+1}/x_n) = 1$. Give an example of a divergent sequence with this property.
14.L. Apply the results of Exercises 14.I and 14.J to the following sequences. (Here $0 < a < 1$, $1 < b$, $c > 0$.)
(a) $(a^n)$, (b) $(n a^n)$,
(c) $(b^n)$, (d) $(b^n/n)$,
(e) $(c^n/n!)$, (f) $(2^{3n}/3^{2n})$.
14.M. Let $X = (x_n)$ be a sequence of strictly positive real numbers such that $\lim (x_n^{1/n}) < 1$. Show that for some $r$ with $0 < r < 1$, we have $0 < x_n < r^n$ for all sufficiently large $n \in \mathbf{N}$. Use this to deduce that $\lim (x_n) = 0$.
14.N. Let $X = (x_n)$ be a sequence of strictly positive real numbers such that $\lim (x_n^{1/n}) > 1$. Show that $X$ is not a bounded sequence and hence is not convergent.
14.O. Give an example of a convergent sequence $(x_n)$ of strictly positive real numbers such that $\lim (x_n^{1/n}) = 1$. Give an example of a divergent sequence with this property.
14.P. Reexamine the convergence of the sequences in Exercise 14.L in the light of Exercises 14.M and 14.N.
14.Q. Examine the convergence of the following sequences in R.

0 (2, »
© (5). (@) (0).

Section 15 Subsequences and Combinations


This section gives some information about the convergence of sequences
obtained in various ways from sequences which are known to be con-
vergent. It will help to enable us to expand our collection of convergent
sequences rather extensively.
15.1 DEFINITION. If $X = (x_n)$ is a sequence in $\mathbf{R}^p$ and if $r_1 < r_2 < \cdots < r_n < \cdots$ is a strictly increasing sequence of natural numbers, then the sequence $X'$ in $\mathbf{R}^p$ given by
\[ (x_{r_1}, x_{r_2}, \ldots, x_{r_n}, \ldots) \]
is called a subsequence of $X$.
It may be helpful to connect the notion of a subsequence with that of the composition of two functions. Let $g$ be a function with domain $\mathbf{N}$ and range in $\mathbf{N}$ and let $g$ be strictly increasing in the sense that if $n < m$, then $g(n) < g(m)$. Then $g$ defines a subsequence of $X = (x_n)$ by the formula
\[ X \circ g = (x_{g(n)} : n \in \mathbf{N}). \]
Conversely, every subsequence of $X$ has the form $X \circ g$ for some strictly increasing function $g$ with $D(g) = \mathbf{N}$ and $R(g) \subseteq \mathbf{N}$.
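The description of a subsequence as a composition translates directly into code. The sketch below is illustrative only: it models a sequence as a Python function on the natural numbers, and the names `x`, `g`, `compose`, and `is_strictly_increasing` are hypothetical.

```python
def x(n: int) -> float:
    """The sequence X = (1/n), viewed as a function on N = {1, 2, ...}."""
    return 1.0 / n

def g(n: int) -> int:
    """A strictly increasing index function from N into N, here g(n) = 2n."""
    return 2 * n

def compose(seq, index):
    """Return the subsequence seq o index, i.e. n -> seq(index(n))."""
    return lambda n: seq(index(n))

def is_strictly_increasing(index, upto: int) -> bool:
    """Check index(n) < index(n + 1) for n = 1, ..., upto - 1 (a finite check only)."""
    return all(index(n) < index(n + 1) for n in range(1, upto))

sub = compose(x, g)                      # the subsequence (x_2, x_4, x_6, ...)
print([sub(n) for n in range(1, 6)])     # the first five terms of X o g
print(is_strictly_increasing(g, 100))
```

A finite check cannot prove that $g$ is strictly increasing on all of $\mathbf{N}$; it only guards against an obviously unsuitable choice of index function.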

It is clear that a given sequence has many different subsequences.


Although the next result is very elementary, it is of sufficient importance
that it must be made explicit.
15.2 LEMMA. If a sequence $X$ in $\mathbf{R}^p$ converges to an element $x$, then any subsequence of $X$ also converges to $x$.
PROOF. Let $V$ be a neighborhood of the limit element $x$; by definition, there exists a natural number $K_V$ such that for all $n \ge K_V$, then $x_n$ belongs to $V$. Now let $X'$ be a subsequence of $X$; say
\[ X' = (x_{r_1}, x_{r_2}, \ldots, x_{r_n}, \ldots). \]
Since $r_n \ge n$, if $n \ge K_V$ then $r_n \ge K_V$ and hence $x_{r_n}$ belongs to $V$. This proves that $X'$ also converges to $x$. Q.E.D.
15.3 COROLLARY. If $X = (x_n)$ is a sequence which converges to an element $x$ of $\mathbf{R}^p$ and if $m$ is any natural number, then the sequence $X' = (x_{m+1}, x_{m+2}, \ldots)$ also converges to $x$.
PROOF. Since $X'$ is a subsequence of $X$, the result follows directly from the preceding lemma. Q.E.D.

The preceding results have been mostly directed towards proving that a
sequence converges to a given point. It is also important to know precisely
what it means to say that a sequence X does not converge to x. The next
result is elementary but not trivial and its verification is an important part
of everyone’s education. Therefore, we leave its detailed proof to the
reader.

15.4 THEOREM. If $X = (x_n)$ is a sequence in $\mathbf{R}^p$, then the following statements are equivalent:
(a) $X$ does not converge to $x$.
(b) There exists a neighborhood $V$ of $x$ such that if $n$ is any natural number, then there is a natural number $m = m(n) \ge n$ such that $x_m$ does not belong to $V$.
(c) There exists a neighborhood $V$ of $x$ and a subsequence $X'$ of $X$ such that none of the elements of $X'$ belong to $V$.

15.5 EXAMPLES. (a) Let $X$ be the sequence in $\mathbf{R}$ consisting of the natural numbers
\[ X = (1, 2, \ldots, n, \ldots). \]
Let $x$ be any real number and consider the neighborhood $V$ of $x$ consisting of the open interval $(x - 1, x + 1)$. According to the Archimedean Property 6.6 there exists a natural number $k_0$ such that $x + 1 < k_0$; hence, if $n \ge k_0$, it follows that $x_n = n$ does not belong to $V$. Therefore the subsequence $X' = (k_0, k_0 + 1, \ldots)$ of $X$ has no points in $V$, showing that $X$ does not converge to $x$.
(b) Let $Y = (y_n)$ be the sequence in $\mathbf{R}$ consisting of $Y = (-1, 1, \ldots, (-1)^n, \ldots)$. We leave it to the reader to show that no point $y$, except possibly $y = \pm 1$, can be a limit of $Y$. We shall show that the point $y = -1$ is not a limit of $Y$; the consideration for $y = +1$ is entirely similar. Let $V$ be the neighborhood of $y = -1$ consisting of the open interval $(-2, 0)$. Then, if $n$ is even, the element $y_n = (-1)^n = +1$ does not belong to $V$. Therefore, the subsequence $Y'$ of $Y$ corresponding to $r_n = 2n$, $n \in \mathbf{N}$, avoids the neighborhood $V$, showing that $y = -1$ is not a limit of $Y$.
(c) Let $Z = (z_n)$ be a sequence in $\mathbf{R}$ with $z_n \ge 0$ for $n \ge 1$. We conclude that no number $z < 0$ can be a limit for $Z$. In fact, the open set $V = \{x \in \mathbf{R} : x < 0\}$ is a neighborhood of $z$ containing none of the elements of $Z$. This shows (why?) that $z$ cannot be the limit of $Z$. Hence if $Z$ has a limit, this limit must be positive.

Combinations of Sequences
The next theorem enables one to use the algebraic operations of Definition 14.2 to form new sequences whose convergence can be predicted from the convergence of the given sequences.

15.6 THEOREM. (a) Let $X$ and $Y$ be sequences in $\mathbf{R}^p$ which converge to $x$ and $y$, respectively. Then the sequences $X + Y$, $X - Y$, and $X \cdot Y$ converge to $x + y$, $x - y$, and $x \cdot y$, respectively.
(b) Let $X = (x_n)$ be a sequence in $\mathbf{R}^p$ which converges to $x$ and let $A = (a_n)$ be a sequence in $\mathbf{R}$ which converges to $a$. Then the sequence $(a_n x_n)$ in $\mathbf{R}^p$ converges to $ax$.
(c) Let $X = (x_n)$ be a sequence in $\mathbf{R}^p$ which converges to $x$ and let $B = (b_n)$ be a sequence of non-zero real numbers which converges to a non-zero number $b$. Then the sequence $(b_n^{-1} x_n)$ in $\mathbf{R}^p$ converges to $b^{-1} x$.
PROOF. (a) To show that $(x_n + y_n) \to x + y$, we need to appraise the magnitude of $\|(x_n + y_n) - (x + y)\|$. To do this, we use the Triangle Inequality to obtain
\[ \|(x_n + y_n) - (x + y)\| = \|(x_n - x) + (y_n - y)\| \le \|x_n - x\| + \|y_n - y\|. \tag{15.1} \]
By hypothesis, if $\varepsilon > 0$ we can choose $K_1$ such that if $n \ge K_1$, then $\|x_n - x\| < \varepsilon/2$, and we choose $K_2$ such that if $n \ge K_2$, then $\|y_n - y\| < \varepsilon/2$. Hence if $K_0 = \sup \{K_1, K_2\}$ and $n \ge K_0$, then we conclude from (15.1) that
\[ \|(x_n + y_n) - (x + y)\| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
Since this can be done for arbitrary $\varepsilon > 0$, we infer that $X + Y$ converges to $x + y$. Precisely the same argument can be used to show that $X - Y$ converges to $x - y$.
To prove that $X \cdot Y$ converges to $x \cdot y$, we make the estimate
\[ |x_n \cdot y_n - x \cdot y| = |(x_n \cdot y_n - x_n \cdot y) + (x_n \cdot y - x \cdot y)| \le |x_n \cdot (y_n - y)| + |(x_n - x) \cdot y|. \]
Using the Schwarz Inequality, we obtain
\[ |x_n \cdot y_n - x \cdot y| \le \|x_n\|\,\|y_n - y\| + \|x_n - x\|\,\|y\|. \tag{15.2} \]
According to Lemma 14.6, there exists a number $M > 0$ which is an upper bound for $\{\|x_n\|, \|y\|\}$. In addition, from the convergence of $X$, $Y$, we conclude that if $\varepsilon > 0$ is given, then there exist natural numbers $K_1$, $K_2$ such that if $n \ge K_1$, then $\|y_n - y\| < \varepsilon/2M$ and if $n \ge K_2$, then $\|x_n - x\| < \varepsilon/2M$. Now choose $K = \sup \{K_1, K_2\}$; then, if $n \ge K$, we infer from (15.2) that
\[ |x_n \cdot y_n - x \cdot y| \le M\,\|y_n - y\| + M\,\|x_n - x\| < M\left(\frac{\varepsilon}{2M} + \frac{\varepsilon}{2M}\right) = \varepsilon. \]
This proves that $X \cdot Y$ converges to $x \cdot y$.
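Part (b) is proved in the same way.
To prove (c), we estimate as follows:
\[ \left\| \frac{1}{b_n} x_n - \frac{1}{b} x \right\| = \left\| \left( \frac{1}{b_n} - \frac{1}{b} \right) x_n + \frac{1}{b} (x_n - x) \right\| \le \left| \frac{1}{b_n} - \frac{1}{b} \right| \|x_n\| + \frac{1}{|b|} \|x_n - x\| = \frac{|b - b_n|}{|b_n|\,|b|} \|x_n\| + \frac{1}{|b|} \|x_n - x\|. \]
Now let $M > 0$ be such that
\[ \frac{1}{M} < |b| \quad \text{and} \quad \|x\| < M. \]
It follows that there exists a natural number $K_0$ such that if $n \ge K_0$, then
\[ \frac{1}{M} < |b_n| \quad \text{and} \quad \|x_n\| < M. \]
Hence if $n \ge K_0$, the above estimate yields
\[ \left\| \frac{1}{b_n} x_n - \frac{1}{b} x \right\| \le M^3\,|b_n - b| + M\,\|x_n - x\|. \]
Therefore, if $\varepsilon > 0$ is a preassigned real number, then there are natural numbers $K_1$, $K_2$ such that if $n \ge K_1$, then $|b_n - b| < \varepsilon/2M^3$ and if $n \ge K_2$, then $\|x_n - x\| < \varepsilon/2M$. Letting $K = \sup \{K_0, K_1, K_2\}$ we conclude that if $n \ge K$, then
\[ \left\| \frac{1}{b_n} x_n - \frac{1}{b} x \right\| < M^3 \frac{\varepsilon}{2M^3} + M \frac{\varepsilon}{2M} = \varepsilon, \]
which proves that $(x_n/b_n)$ converges to $x/b$. Q.E.D.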


15.7 APPLICATIONS. Again we restrict attention to sequences in $\mathbf{R}$.
(a) Let $X = (x_n)$ be the sequence in $\mathbf{R}$ defined by
\[ x_n = \frac{2n+1}{n+5}, \qquad n \in \mathbf{N}. \]
We note that we can write $x_n$ in the form
\[ x_n = \frac{2 + 1/n}{1 + 5/n}; \]
thus $X$ can be regarded as the quotient of $Y = (2 + 1/n)$ and $Z = (1 + 5/n)$. Since the latter sequence consists of non-zero terms and has limit 1 (why?), the preceding theorem applies to allow us to conclude that
\[ \lim X = \frac{\lim Y}{\lim Z} = \frac{2}{1} = 2. \]
(b) If $X = (x_n)$ is a sequence in $\mathbf{R}$ which converges to $x$ and if $p$ is a polynomial, then the sequence defined by $(p(x_n) : n \in \mathbf{N})$ converges to $p(x)$. (Hint: use Theorem 15.6 and induction.)
(c) Let $X = (x_n)$ be a sequence in $\mathbf{R}$ which converges to $x$ and let $r$ be a rational function; that is, $r(y) = p(y)/q(y)$, where $p$ and $q$ are polynomials. Suppose that $q(x_n)$ and $q(x)$ are non-zero; then the sequence $(r(x_n) : n \in \mathbf{N})$ converges to $r(x)$. (Hint: use part (b) and Theorem 15.6.)
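Application (a) can be checked with exact rational arithmetic. This is only an illustrative sketch; the function names `x`, `y`, and `z` mirror the sequences $X$, $Y$, $Z$ above but are otherwise hypothetical.

```python
from fractions import Fraction

def x(n: int) -> Fraction:
    """x_n = (2n + 1)/(n + 5), computed exactly."""
    return Fraction(2 * n + 1, n + 5)

def y(n: int) -> Fraction:
    """Numerator sequence y_n = 2 + 1/n, which converges to 2."""
    return 2 + Fraction(1, n)

def z(n: int) -> Fraction:
    """Denominator sequence z_n = 1 + 5/n, which converges to 1."""
    return 1 + Fraction(5, n)

for n in (1, 10, 1000, 10**6):
    # x_n really is the quotient y_n / z_n, as claimed in the text.
    assert x(n) == y(n) / z(n)
    # Print n, x_n, and the distance |x_n - 2| from the predicted limit.
    print(n, float(x(n)), float(abs(x(n) - 2)))
```

A direct computation gives $|x_n - 2| = 9/(n+5)$, so the printed distances shrink roughly like $9/n$.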
We conclude this section with a result which is often useful. It is
sometimes described by saying that one “‘passes to the limit in an
inequality.”
15.8 LEMMA. Suppose that $X = (x_n)$ is a convergent sequence in $\mathbf{R}^p$ with limit $x$. If there exists an element $c$ in $\mathbf{R}^p$ and a number $r > 0$ such that $\|x_n - c\| \le r$ for $n$ sufficiently large, then $\|x - c\| \le r$.
PROOF. The set $V = \{y \in \mathbf{R}^p : \|y - c\| > r\}$ is an open subset of $\mathbf{R}^p$. If $x \in V$, then $V$ is a neighborhood of $x$ and so $x_n \in V$ for sufficiently large values of $n$, contrary to the hypothesis. Therefore $x \notin V$ and hence we have $\|x - c\| \le r$. Q.E.D.
It is important to note that we have assumed the existence of the limit in this
result, for the remaining hypotheses are not sufficient to enable us to prove its
existence.

Exercises

15.A. If $(x_n)$ and $(y_n)$ are convergent sequences of real numbers and if $x_n \le y_n$ for all $n \in \mathbf{N}$, then $\lim (x_n) \le \lim (y_n)$.
15.B. If $X = (x_n)$ and $Y = (y_n)$ are sequences of real numbers which both converge to $c$ and if $Z = (z_n)$ is a sequence such that $x_n \le z_n \le y_n$ for $n \in \mathbf{N}$, then $Z$ also converges to $c$.
15.C. For $x_n$ given by the following formulas, either establish the convergence or the divergence of the sequence $X = (x_n)$:
(a) $x_n = \dfrac{n}{n+1}$, (b) $x_n = \dfrac{(-1)^n n}{n+1}$,
(c) $x_n = \dfrac{2n}{3n^2+1}$, (d) $x_n = \dfrac{2n^2+3}{3n^2+1}$,
(e) $x_n = n^2 - n$, (f) $x_n = \sin n$.
15.D. If $X$ and $Y$ are sequences in $\mathbf{R}^p$ and if $X + Y$ converges, do $X$ and $Y$ converge and have $\lim (X + Y) = \lim X + \lim Y$?
15.E. If $X$ and $Y$ are sequences in $\mathbf{R}^p$ and if $X \cdot Y$ converges, do $X$ and $Y$ converge and have $\lim X \cdot Y = (\lim X) \cdot (\lim Y)$?
15.F. If $X = (x_n)$ is a positive sequence which converges to $x$, then $(\sqrt{x_n})$ converges to $\sqrt{x}$. (Hint: $\sqrt{x_n} - \sqrt{x} = (x_n - x)/(\sqrt{x_n} + \sqrt{x})$ when $x \ne 0$.)
15.G. If $X = (x_n)$ is a sequence of real numbers such that $Y = (x_n^2)$ converges to 0, then does $X$ converge to 0?
15.H. If $x_n = \sqrt{n+1} - \sqrt{n}$, do the sequences $X = (x_n)$ and $Y = (\sqrt{n}\,x_n)$ converge?
15.I. Let $(x_n)$ be a sequence in $\mathbf{R}^p$ such that the subsequences $(x_{2n})$ and $(x_{2n+1})$ converge to $x \in \mathbf{R}^p$. Prove that $(x_n)$ converges to $x$.
15.J. Let $(x_n)$ and $(y_n)$ be sequences in $\mathbf{R}$ such that $\lim (x_n) \ne 0$ and $\lim (x_n y_n)$ exists. Prove that $\lim (y_n)$ exists.
15.K. Does Exercise 15.J remain true in R??
15.L. If $0 < a \le b$ and if $x_n = (a^n + b^n)^{1/n}$, then $\lim (x_n) = b$.
15.M. Every irrational number in $\mathbf{R}$ is the limit of a sequence of rational numbers. Every rational number in $\mathbf{R}$ is the limit of a sequence of irrational numbers.
15.N. Let $A \subseteq \mathbf{R}^p$ and $x \in \mathbf{R}^p$. Then $x$ is a boundary point of $A$ if and only if there is a sequence $(a_n)$ of elements in $A$ and a sequence $(b_n)$ of elements in $\mathscr{C}(A)$ such that
\[ \lim (a_n) = x = \lim (b_n). \]
15.O. Let $A \subseteq \mathbf{R}^p$ and $x \in \mathbf{R}^p$. Then $x$ is a cluster point of $A$ if and only if there is a sequence $(a_n)$ of distinct elements in $A$ such that $x = \lim (a_n)$.
15.P. If $x = \lim (x_n)$ and if $\|x_n - c\| < r$ for all $n \in \mathbf{N}$, does it follow that $\|x - c\| < r$?

Projects

15.α. Let $d$ be a metric on a set $M$ in the sense of Exercise 8.8. If $X = (x_n)$ is a sequence in $M$, then an element $x \in M$ is said to be a limit of $X$ if, for each $\varepsilon > 0$ there exists a number $K(\varepsilon)$ in $\mathbf{N}$ such that for all $n \ge K(\varepsilon)$, then $d(x_n, x) < \varepsilon$. Use this definition and show that Theorems 14.5, 14.4, 15.2, 15.3, and 15.4 can be extended to metric spaces. Show that the metrics $d_1$, $d_2$, $d_\infty$ in $\mathbf{R}^p$ give rise to the same convergent sequences in $\mathbf{R}^p$. Show that if $d$ is the discrete metric on a set, then the only sequences which converge relative to $d$ are those which are "constant after some natural number."
15.β. Let $m$ denote the collection of all bounded sequences in $\mathbf{R}$; let $c$ denote the collection of all convergent sequences in $\mathbf{R}$; and let $c_0$ denote the collection of all sequences in $\mathbf{R}$ which converge to zero.
(a) With the sum $X + Y$ and product $cX$ as given in Definition 14.2, show that each of the above collections is a vector space in which the zero element is the sequence $0 = (0, 0, \ldots)$.
(b) In each of the collections $m$, $c$, $c_0$, define the norm of $X = (x_n)$ by $\|X\| = \sup \{|x_n| : n \in \mathbf{N}\}$. Show that this definition actually yields a norm.
(c) If $X$ and $Y$ belong to either $m$, $c$, or $c_0$, then the product $XY$ also belongs to it and $\|XY\| \le \|X\|\,\|Y\|$. Give an example to show that equality may hold in this last relation, and one to show that equality may fail.
(d) Show that the metric induced by the norm in part (b) in these spaces is given by $d(X, Y) = \sup \{|x_n - y_n| : n \in \mathbf{N}\}$.
(e) Show that if a sequence $(X_k)$ converges to $Y$ relative to the metric in (d), then each "coordinate sequence" converges to the corresponding coordinate of $Y$. (Warning: $X_k$ is a sequence in $\mathbf{R}$, while $(X_k)$ is a sequence in $m$, $c$, or $c_0$; that is, a "sequence of sequences" in $\mathbf{R}$.)
(f) Give an example of a sequence $(X_k)$ in $c_0$ where each coordinate sequence converges to 0, but where $d(X_k, 0)$ does not converge to 0.

Section 16 Two Criteria for Convergence

Until now the main method available for showing that a sequence is
convergent is to identify it as a subsequence or an algebraic combination of
convergent sequences. When this can be done, we are able to calculate the
limit using the results of the preceding section. However, when this
cannot be done, we have to fall back on Definition 14.3 or Theorem 14.4 in
order to establish the existence of the limit. The use of these latter tools
has the noteworthy disadvantage that we must already know (or at least
suspect) the correct value of the limit and we then verify that our suspicion
is correct.
There are many cases, however, where there is no obvious candidate for
the limit of a given sequence, even though a preliminary analysis has led to
the belief that convergence does take place. In this section we give some
results which are deeper than those in the preceding sections and which can
be used to establish the convergence of a sequence when no particular
element presents itself as the value of the limit. The first result in this
direction is very important. Although it can be generalized to $\mathbf{R}^p$, it is
convenient to restrict its statement to the case of sequences in $\mathbf{R}$.

16.1 MONOTONE CONVERGENCE THEOREM. Let $X = (x_n)$ be a sequence of real numbers which is monotone increasing in the sense that
\[ x_1 \le x_2 \le \cdots \le x_n \le x_{n+1} \le \cdots. \]
Then the sequence $X$ converges if and only if it is bounded, in which case $\lim (x_n) = \sup \{x_n\}$.
PROOF. It was seen in Lemma 14.6 that a convergent sequence is bounded. If $x = \lim (x_n)$ and $\varepsilon > 0$, then there exists a natural number $K(\varepsilon)$ such that if $n \ge K(\varepsilon)$, then
\[ x - \varepsilon \le x_n \le x + \varepsilon. \]
Since $X$ is monotone, this relation yields
\[ x - \varepsilon \le \sup \{x_n\} \le x + \varepsilon, \]
whence it follows that $|x - \sup \{x_n\}| \le \varepsilon$. Since this holds for all $\varepsilon > 0$, we infer that $\lim (x_n) = x = \sup \{x_n\}$.

[Figure 16.1]

Conversely, suppose that $X = (x_n)$ is a bounded monotone increasing sequence of real numbers. According to the Supremum Principle, the supremum $x^* = \sup \{x_n\}$ exists; we shall show that it is the limit of $X$. Since $x^*$ is an upper bound of the elements in $X$, then $x_n \le x^*$ for $n \in \mathbf{N}$. Since $x^*$ is the supremum of $X$, if $\varepsilon > 0$ the number $x^* - \varepsilon$ is not an upper bound of $X$ and there exists a natural number $K(\varepsilon)$ such that
\[ x^* - \varepsilon < x_{K(\varepsilon)}. \]
In view of the monotone character of $X$, for all $n \ge K(\varepsilon)$, then
\[ x^* - \varepsilon < x_n \le x^*, \]
whence it follows that $|x_n - x^*| < \varepsilon$. Recapitulating, the number $x^* = \sup \{x_n\}$ has the property that, given $\varepsilon > 0$ there is a natural number $K(\varepsilon)$ (depending on $\varepsilon$) such that $|x_n - x^*| < \varepsilon$ whenever $n \ge K(\varepsilon)$. This shows that $x^* = \lim X$. Q.E.D.
16.2 COROLLARY. Let $X = (x_n)$ be a sequence of real numbers which is monotone decreasing in the sense that
\[ x_1 \ge x_2 \ge \cdots \ge x_n \ge x_{n+1} \ge \cdots. \]
Then the sequence $X$ converges if and only if it is bounded, in which case $\lim (x_n) = \inf \{x_n\}$.
PROOF. Let $y_n = -x_n$ for $n \in \mathbf{N}$. Then the sequence $Y = (y_n)$ is readily seen to be a monotone increasing sequence. Moreover, $Y$ is bounded if and only if $X$ is bounded. Therefore, the conclusion follows from the theorem. Q.E.D.
16.3 EXAMPLES. (a) We return to the sequence $X = (1/n)$ discussed in Example 14.8(a). It is clear that
\[ 1 > \frac{1}{2} > \cdots > \frac{1}{n} > \frac{1}{n+1} > \cdots > 0; \]
it therefore follows from Corollary 16.2 that $X = (1/n)$ converges. We can establish the value of $\lim (1/n)$ if we can calculate $\inf \{1/n\}$. Alternatively, once the convergence of $X$ is assured we can often evaluate its limit by using Lemma 15.2 and Theorem 15.6. In the case at hand, if $X' = (1/2, 1/4, \ldots, 1/2n, \ldots)$, then it follows that
\[ \lim X = \lim X' = \tfrac{1}{2} \lim X. \]
We conclude, therefore, that $\lim X = 0$.


(b) Let $Y = (y_n)$ be the sequence in $\mathbf{R}$ defined inductively by
\[ y_1 = 1, \qquad y_{n+1} = (2y_n + 3)/4 \quad \text{for } n \in \mathbf{N}. \]
Direct calculation shows that $y_1 < y_2 < 2$. If $y_{n-1} < y_n < 2$, then
\[ 2y_{n-1} + 3 < 2y_n + 3 < 2 \cdot 2 + 3, \]
from which it follows that $y_n < y_{n+1} < 2$. By induction, the sequence $Y$ is monotone increasing and bounded above by the number 2. It follows from the Monotone Convergence Theorem that the sequence $Y$ converges to a limit which is no greater than 2. In this case it might not be so easy to evaluate $y = \lim Y$ by calculating $\sup \{y_n\}$. However, once we know that the limit exists, there is another way to calculate its value. According to Lemma 15.2, we have $y = \lim (y_n) = \lim (y_{n+1})$. Using Theorem 15.6, the limit $y$ must satisfy the relation
\[ y = (2y + 3)/4. \]
Therefore, we conclude that $y = 3/2$.
(c) Let $Z = (z_n)$ be the sequence in $\mathbf{R}$ defined by
\[ z_1 = 1, \qquad z_{n+1} = \sqrt{2 z_n} \quad \text{for } n \in \mathbf{N}. \]
It is clear that $z_1 < z_2 < 2$. If $z_n < z_{n+1} < 2$, then $2z_n < 2z_{n+1} < 4$ so that $z_{n+1} = \sqrt{2z_n} < z_{n+2} = \sqrt{2z_{n+1}} < 2 = \sqrt{4}$. This shows that $Z$ is a monotone increasing sequence which is bounded above by 2; hence $Z$ converges to a number $z$. It may be shown directly that $2 = \sup \{z_n\}$ so that the limit $z = 2$. Alternatively, we can use the method of the preceding example. Knowing that the sequence has a limit $z$, we conclude from the relation $z_{n+1} = \sqrt{2z_n}$ that $z$ must satisfy $z = \sqrt{2z}$. To find the roots of this last equation, we square to obtain $z^2 = 2z$, which has roots 0, 2. Evidently 0 cannot be the limit (why?); hence this limit must equal 2.
(d) Let $U = (u_n)$ be the sequence of real numbers defined by $u_n = (1 + 1/n)^n$ for $n \in \mathbf{N}$. Applying the Binomial Theorem, we can write
\[ u_n = 1 + \frac{n}{1} \cdot \frac{1}{n} + \frac{n(n-1)}{2!} \cdot \frac{1}{n^2} + \frac{n(n-1)(n-2)}{3!} \cdot \frac{1}{n^3} + \cdots + \frac{n(n-1) \cdots 2 \cdot 1}{n!} \cdot \frac{1}{n^n}. \]
Dividing the powers of $n$ into the numerators of the binomial coefficients, we have
\[ u_n = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n}\right) + \frac{1}{3!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right). \]
Expressing $u_{n+1}$ in the same way, we have
\[ u_{n+1} = 1 + 1 + \frac{1}{2!}\left(1 - \frac{1}{n+1}\right) + \frac{1}{3!}\left(1 - \frac{1}{n+1}\right)\left(1 - \frac{2}{n+1}\right) + \cdots + \frac{1}{n!}\left(1 - \frac{1}{n+1}\right) \cdots \left(1 - \frac{n-1}{n+1}\right) + \frac{1}{(n+1)!}\left(1 - \frac{1}{n+1}\right) \cdots \left(1 - \frac{n}{n+1}\right). \]
Note that the expression for $u_n$ contains $n + 1$ terms and that for $u_{n+1}$ contains $n + 2$ terms. An elementary examination shows that each term in $u_n$ is no greater than the corresponding term in $u_{n+1}$ and the latter has one more positive term. Therefore, we have
\[ u_1 < u_2 < \cdots < u_n < u_{n+1} < \cdots. \]
To show that the sequence is bounded, we observe that if $p = 1, 2, \ldots, n$, then $(1 - p/n) < 1$. Moreover, $2^{p-1} \le p!$ (why?) so that $1/p! \le 1/2^{p-1}$. From the above expression for $u_n$, these estimates yield
\[ 2 < u_n < 1 + 1 + \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^{n-1}} < 3, \qquad n > 2. \]
It follows that the monotone sequence $U$ is bounded above by 3. The Monotone Convergence Theorem implies that the sequence $U$ converges to a real number which is at most 3. As is probably well-known to the reader, the limit of $U$ is the fundamental number $e$. By refining our estimates we can find closer rational approximations to the value of $e$, but we cannot evaluate it exactly in this way since it is irrational, although it is possible to calculate as many decimal places as desired. (This illustrates that a result such as the Monotone Convergence Theorem, which only establishes the existence of the limit of a sequence, can be of great use even when the exact value cannot be easily obtained.)
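The monotone sequences of Examples 16.3(b), (c), and (d) are easy to tabulate. The sketch below is only a floating-point illustration, not a proof; the helper names `iterate` and `is_increasing` are hypothetical, and the number of terms (40) is an arbitrary choice.

```python
import math

def iterate(step, start: float, count: int) -> list[float]:
    """Return the first `count` terms of the sequence t_1 = start, t_{n+1} = step(t_n)."""
    terms = [start]
    for _ in range(count - 1):
        terms.append(step(terms[-1]))
    return terms

def is_increasing(terms: list[float]) -> bool:
    """Check t_1 <= t_2 <= ... on the computed terms."""
    return all(s <= t for s, t in zip(terms, terms[1:]))

y = iterate(lambda t: (2 * t + 3) / 4, 1.0, 40)      # Example (b), limit 3/2
z = iterate(lambda t: math.sqrt(2 * t), 1.0, 40)     # Example (c), limit 2
u = [(1 + 1 / n) ** n for n in range(1, 41)]         # Example (d), limit e

for name, terms in (("y", y), ("z", z), ("u", u)):
    print(name, is_increasing(terms), terms[-1])
```

The sequences $Y$ and $Z$ settle near their limits within a few dozen terms, whereas $u_{40}$ is still visibly below $e$; this is consistent with the remark that the theorem gives existence of the limit rather than a fast way to compute it.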

The Bolzano-Weierstrass Theorem

The Monotone Convergence Theorem is extraordinarily useful and important, but it has the drawback that it applies only to sequences which are monotone. It behooves us, therefore, to find a condition which will imply convergence in $\mathbf{R}$ or $\mathbf{R}^p$ without using the monotone property. This desired condition is the Cauchy Criterion, which will be introduced below. However, we shall first give a form of the Bolzano-Weierstrass Theorem 10.6 that is particularly applicable for sequences.

16.4 BOLZANO-WEIERSTRASS THEOREM. A bounded sequence in $\mathbf{R}^p$ has a convergent subsequence.
PROOF. Let $X = (x_n)$ be a bounded sequence in $\mathbf{R}^p$. If there are only a finite number of distinct values in the sequence $X$, then at least one of these values must occur infinitely often. If we define a subsequence of $X$ by selecting this element each time it appears, we obtain a convergent subsequence of $X$.
On the other hand, if the sequence $X$ contains an infinite number of distinct values in $\mathbf{R}^p$, then since these points are bounded, the Bolzano-Weierstrass Theorem 10.6 for sets implies that there is at least one cluster point, say $x^*$. Let $x_{n_1}$ be an element of $X$ such that
\[ \|x_{n_1} - x^*\| < 1. \]
Consider the neighborhood $V_2 = \{y : \|y - x^*\| < \tfrac{1}{2}\}$. Since the point $x^*$ is a cluster point of the set $S_1 = \{x_m : m \ge 1\}$, it is also a cluster point of the set $S_2 = \{x_m : m > n_1\}$ obtained by deleting a finite number of elements of $S_1$. (Why?) Therefore, there is an element $x_{n_2}$ of $S_2$ (whence $n_2 > n_1$) belonging to $V_2$. Now let $V_3$ be the neighborhood $V_3 = \{y : \|y - x^*\| < \tfrac{1}{3}\}$ and let $S_3 = \{x_m : m > n_2\}$. Since $x^*$ is a cluster point of $S_3$ there must be an element $x_{n_3}$ of $S_3$ (whence $n_3 > n_2$) belonging to $V_3$. By continuing in this way we obtain a subsequence $X' = (x_{n_1}, x_{n_2}, \ldots)$ of $X$ with
\[ \|x_{n_r} - x^*\| < 1/r, \]
so that $\lim X' = x^*$. Q.E.D.
16.5 COROLLARY. If $X = (x_n)$ is a sequence in $\mathbf{R}^p$ and $x^*$ is a cluster point of the set $\{x_n : n \in \mathbf{N}\}$, then there is a subsequence $X'$ of $X$ which converges to $x^*$.
In fact, this is what the second part of the proof of 16.4 established.

Cauchy Sequences
We now introduce the important notion of a Cauchy sequence in $\mathbf{R}^p$. It will turn out that a sequence in $\mathbf{R}^p$ is convergent if and only if it is a Cauchy sequence.
16.6 DEFINITION. A sequence $X = (x_n)$ in $\mathbf{R}^p$ is said to be a Cauchy sequence in case for every $\varepsilon > 0$ there is a natural number $M(\varepsilon)$ such that for all $m, n \ge M(\varepsilon)$, then $\|x_m - x_n\| < \varepsilon$.
In order to help motivate the notion of a Cauchy sequence, we shall show that every convergent sequence in $\mathbf{R}^p$ is a Cauchy sequence.
16.7 LEMMA. If $X = (x_n)$ is a convergent sequence in $\mathbf{R}^p$, then $X$ is a Cauchy sequence.
PROOF. If $x = \lim X$, then given $\varepsilon > 0$ there is a natural number $K(\varepsilon/2)$ such that if $n \ge K(\varepsilon/2)$, then $\|x_n - x\| < \varepsilon/2$. Thus if $M(\varepsilon) = K(\varepsilon/2)$ and if $m, n \ge M(\varepsilon)$, then
\[ \|x_m - x_n\| \le \|x_m - x\| + \|x - x_n\| < \varepsilon/2 + \varepsilon/2 = \varepsilon. \]
Hence the convergent sequence $X$ is a Cauchy sequence. Q.E.D.

In order to apply the Bolzano-Weierstrass Theorem, we shall require the following result.

16.8 LEMMA. A Cauchy sequence in $\mathbf{R}^p$ is bounded.
PROOF. Let $X = (x_n)$ be a Cauchy sequence and let $\varepsilon = 1$. If $m = M(1)$ and $n \ge M(1)$, then $\|x_m - x_n\| < 1$. From the Triangle Inequality this implies that $\|x_n\| < \|x_m\| + 1$ for $n \ge M(1)$. Therefore, if
\[ B = \sup \{\|x_1\|, \ldots, \|x_{m-1}\|, \|x_m\| + 1\}, \]
then we have $\|x_n\| \le B$ for all $n \in \mathbf{N}$. Thus the Cauchy sequence $X$ is bounded. Q.E.D.
16.9 LEMMA. If a subsequence $X'$ of a Cauchy sequence $X$ in $\mathbf{R}^p$ converges to an element $x$, then the entire sequence $X$ converges to $x$.
PROOF. Since $X = (x_n)$ is a Cauchy sequence, given $\varepsilon > 0$ there is a natural number $M(\varepsilon/2)$ such that if $m, n \ge M(\varepsilon/2)$, then
\[ \|x_m - x_n\| < \varepsilon/2. \tag{$*$} \]
If the sequence $X' = (x_{n_r})$ converges to $x$, there is a natural number $K \ge M(\varepsilon/2)$, belonging to the set $\{n_1, n_2, \ldots\}$ and such that
\[ \|x - x_K\| < \varepsilon/2. \]
Now let $n$ be any natural number such that $n \ge M(\varepsilon/2)$. It follows that ($*$) holds for this value of $n$ and for $m = K$. Thus
\[ \|x - x_n\| \le \|x - x_K\| + \|x_K - x_n\| < \varepsilon, \]
when $n \ge M(\varepsilon/2)$. Therefore, the sequence $X$ converges to the element $x$, which is the limit of the subsequence $X'$. Q.E.D.
We are now prepared to obtain the important Cauchy Criterion. Our
proof is deceptively short, but the reader will note that the work has
already been done and we are merely putting the pieces together.
16.10 CAUCHY CONVERGENCE CRITERION. A sequence in $\mathbf{R}^p$ is convergent if and only if it is a Cauchy sequence.
PROOF. It was seen in Lemma 16.7 that a convergent sequence must be a Cauchy sequence.
Conversely, suppose that $X$ is a Cauchy sequence in $\mathbf{R}^p$. It follows from Lemma 16.8 that the sequence $X$ is bounded in $\mathbf{R}^p$. According to the Bolzano-Weierstrass Theorem 16.4, the bounded sequence $X$ has a convergent subsequence $X'$. By Lemma 16.9 the entire sequence $X$ converges to the limit of $X'$. Q.E.D.
16.11 EXAMPLES. (a) Let $X = (x_n)$ be the sequence in $\mathbf{R}$ defined by
\[ x_1 = 1, \quad x_2 = 2, \quad \ldots, \quad x_n = \tfrac{1}{2}(x_{n-2} + x_{n-1}) \quad \text{for } n > 2. \]
It can be shown by induction that
\[ 1 \le x_n \le 2 \quad \text{for } n \in \mathbf{N}, \]
but the sequence $X$ is neither monotone decreasing nor increasing. (Actually the terms with odd subscript form an increasing sequence and those with even subscript form a decreasing sequence.) Since the terms in the sequence are formed by averaging, it is readily seen that
\[ |x_n - x_{n+1}| = \frac{1}{2^{n-1}} \quad \text{for } n \in \mathbf{N}. \]
Thus if $m > n$, we employ the Triangle Inequality to obtain
\[ |x_n - x_m| \le |x_n - x_{n+1}| + \cdots + |x_{m-1} - x_m| = \frac{1}{2^{n-1}} + \cdots + \frac{1}{2^{m-2}} = \frac{1}{2^{n-1}}\left(1 + \frac{1}{2} + \cdots + \frac{1}{2^{m-n-1}}\right) < \frac{1}{2^{n-2}}. \]
Given $\varepsilon > 0$, if $n$ is chosen so large that $1/2^n < \varepsilon/4$ and if $m \ge n$, it follows that
\[ |x_n - x_m| < \varepsilon. \]
Therefore, $X$ is a Cauchy sequence in $\mathbf{R}$ and, by the Cauchy Criterion, the sequence $X$ converges to a number $x$. To evaluate the limit we note that taking the limit in the rule of definition yields the valid, but uninformative, result
\[ x = \tfrac{1}{2}(x + x). \]
However, since the sequence $X$ converges, so does the subsequence with odd indices. By induction we can establish that
\[ x_1 = 1, \quad x_3 = 1 + \frac{1}{2}, \quad x_5 = 1 + \frac{1}{2} + \frac{1}{2^3}, \quad \ldots, \quad x_{2n+1} = 1 + \frac{1}{2} + \frac{1}{2^3} + \cdots + \frac{1}{2^{2n-1}}. \]
It follows that
\[ x_{2n+1} = 1 + \frac{1}{2}\left(1 + \frac{1}{4} + \cdots + \frac{1}{4^{n-1}}\right) = 1 + \frac{1}{2} \cdot \frac{1 - 1/4^n}{1 - 1/4} = 1 + \frac{2}{3}\left(1 - \frac{1}{4^n}\right). \]
Therefore, the subsequence with odd indices converges to $5/3$; hence the entire sequence has the same limit.
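(b) Let $X = (x_n)$ be the real sequence given by
\[ x_1 = \frac{1}{1!}, \quad x_2 = \frac{1}{1!} - \frac{1}{2!}, \quad \ldots, \quad x_n = \frac{1}{1!} - \frac{1}{2!} + \cdots + \frac{(-1)^{n+1}}{n!}, \quad \ldots. \]
Since this sequence is not monotone, a direct application of the Monotone Convergence Theorem is not possible. Observe that if $m > n$, then
\[ x_m - x_n = \frac{(-1)^{n+2}}{(n+1)!} + \frac{(-1)^{n+3}}{(n+2)!} + \cdots + \frac{(-1)^{m+1}}{m!}. \]
Recalling that $2^{r-1} \le r!$, we find that
\[ |x_m - x_n| \le \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots + \frac{1}{m!} \le \frac{1}{2^n} + \frac{1}{2^{n+1}} + \cdots + \frac{1}{2^{m-1}} < \frac{1}{2^{n-1}}. \]
Therefore the sequence is a Cauchy sequence in $\mathbf{R}$.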


(c) If $X = (x_n)$ is the sequence in $\mathbf{R}$ defined by
\[ x_n = \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n} \quad \text{for } n \in \mathbf{N}, \]
and if $m > n$, then
\[ x_m - x_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{m}. \]
Since each of these $m - n$ terms is at least $1/m$, this difference is at least $(m - n)/m = 1 - n/m$. In particular, if $m = 2n$, we have
\[ x_{2n} - x_n \ge \tfrac{1}{2}. \]
This shows that $X$ is not a Cauchy sequence, whence we conclude that $X$ is divergent. (We have just proved that the "harmonic series" is divergent.)
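The claims in Examples 16.11(a) and (c) can be verified exactly for finitely many terms using rational arithmetic. This is an illustrative sketch only; the function names `averaged` and `harmonic` are hypothetical, and a finite check is of course no substitute for the inductive arguments.

```python
from fractions import Fraction

def averaged(count: int) -> list[Fraction]:
    """First `count` terms of x_1 = 1, x_2 = 2, x_n = (x_{n-2} + x_{n-1})/2."""
    terms = [Fraction(1), Fraction(2)]
    while len(terms) < count:
        terms.append((terms[-2] + terms[-1]) / 2)
    return terms[:count]

def harmonic(n: int) -> Fraction:
    """Partial sum 1/1 + 1/2 + ... + 1/n of Example (c)."""
    return sum((Fraction(1, k) for k in range(1, n + 1)), Fraction(0))

x = averaged(31)   # note: x[i] holds the term x_{i+1}

# Example (a): consecutive terms differ by exactly 1/2**(n-1).
print(all(abs(x[i] - x[i + 1]) == Fraction(1, 2**i) for i in range(30)))

# Example (a): the odd-indexed terms approach 5/3.
print(float(x[30]), float(Fraction(5, 3) - x[30]))

# Example (c): the gap x_{2n} - x_n never drops below 1/2, so X is not Cauchy.
print(all(harmonic(2 * n) - harmonic(n) >= Fraction(1, 2) for n in range(1, 60)))
```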

Exercises

16.A. Let $x_1 \in \mathbf{R}$ satisfy $x_1 > 1$ and let $x_{n+1} = 2 - 1/x_n$ for $n \in \mathbf{N}$. Show that the sequence $(x_n)$ is monotone and bounded. What is its limit?
16.B. Let $y_1 = 1$ and $y_{n+1} = (2 + y_n)^{1/2}$ for $n \in \mathbf{N}$. Show that $(y_n)$ is monotone and bounded. What is its limit?
16.C. Let $a > 0$ and let $z_1 > 0$. Define $z_{n+1} = (a + z_n)^{1/2}$ for $n \in \mathbf{N}$. Show that $(z_n)$ converges.
16.D. If $a$ satisfies $0 < a < 1$, show that the sequence $X = (a^n)$ is convergent. Since $Y = (a^{2n})$ is a subsequence, we have $\lim X = \lim Y = (\lim X)^2$; deduce that $\lim X = 0$.
16.E. Show that every sequence in $\mathbf{R}$ either has a monotone increasing subsequence or a monotone decreasing subsequence.
16.F. Use Exercise 16.E to prove the Bolzano-Weierstrass Theorem for sequences in $\mathbf{R}$.
16.G. Determine the convergence or divergence of the sequence $(x_n)$, where
\[ x_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \quad \text{for } n \in \mathbf{N}. \]
16.H. Let $X = (x_n)$ and $Y = (y_n)$ be sequences in $\mathbf{R}^p$ and let $Z = (z_n)$ be the "shuffled" sequence defined by $z_1 = x_1$, $z_2 = y_1$, $\ldots$, $z_{2n-1} = x_n$, $z_{2n} = y_n$, $\ldots$. Is it true that $Z$ is convergent if and only if $X$ and $Y$ are convergent and $\lim X = \lim Y$?
16.I. Show directly that the following are Cauchy sequences:
(a) $\left(\dfrac{1}{n}\right)$, (b) $\left(\dfrac{n+1}{n}\right)$, (c) $\left(1 + \dfrac{1}{2!} + \cdots + \dfrac{1}{n!}\right)$.
16.J. Show directly that the following are not Cauchy sequences:
(a) $((-1)^n)$, (b) $(n + (-1)^n/n)$, (c) $(n^2)$.
16.K. Let $X = (x_n)$ be a sequence of strictly positive real numbers, let $\lim (x_{n+1}/x_n) = L$, and let $0 < \varepsilon < L$. Show that there exist $A > 0$, $B > 0$, and $K \in \mathbf{N}$ such that $A(L - \varepsilon)^n \le x_n \le B(L + \varepsilon)^n$ for $n \ge K$. Then show that $\lim (x_n^{1/n}) = L$.
16.L. Apply Example 16.3(d) and the preceding exercise to the sequence $(n^n/n!)$ to show that $\lim (n/(n!)^{1/n}) = e$.
16.M. Establish the convergence and the limits of the following sequences:

(a) (1+1/ny"), (b) (1+1/2n)"),


(c) ((1+2/n)"), (d) (1+1/(n+1))™).
16.N. Let $0 < a_1 < b_1$ and define, for $n \in \mathbf{N}$,
\[ a_{n+1} = (a_n b_n)^{1/2}, \qquad b_{n+1} = \tfrac{1}{2}(a_n + b_n). \]
By induction show that $a_n < b_n$. Show that $(a_n)$ and $(b_n)$ converge to the same limit.
16.O. Give a proof of the Cantor Intersection Theorem 11.4 by taking a point $x_n \in F_n$ and applying the Bolzano-Weierstrass Theorem 16.4.
16.P. Give a proof of the Nearest Point Theorem 11.6 by using the Bolzano-Weierstrass Theorem 16.4.
16.Q. Prove that if $K_1$ and $K_2$ are compact subsets of $\mathbf{R}^p$, then there exist points $x_1 \in K_1$, $x_2 \in K_2$ such that if $z_1 \in K_1$, $z_2 \in K_2$, then $\|z_1 - z_2\| \ge \|x_1 - x_2\|$.

Project
16.α. In this project, let $m$, $c$, and $c_0$ designate the collections of real sequences that were introduced in Project 15.β and let $d$ denote the metric defined in part (d) of that project.
(a) If $r \in F$ and $r = 0.r_1 r_2 \cdots r_n \cdots$ is its decimal expansion, consider the element $X_r = (r_n)$ in $m$. Conclude that there is an uncountable subset $A$ of $m$ such that if $X_r$ and $X_s$ are distinct elements of $A$, then $d(X_r, X_s) \ge 1$.
(b) Suppose that $B$ is a subset of $c$ with the property that if $X$ and $Y$ are distinct elements of $B$, then $d(X, Y) \ge 1$. Prove that $B$ is a countable set.
(c) If $j \in \mathbf{N}$, let $Z_j = (z_{nj} : n \in \mathbf{N})$ be the sequence whose first $j$ elements are 1 and whose remaining elements are 0. Observe that $Z_j$ belongs to each of the metric spaces $m$, $c$, and $c_0$ and that $d(Z_j, Z_k) = 1$ for $j \ne k$. Show that the sequence $(Z_j : j \in \mathbf{N})$ is monotone in the sense that each coordinate sequence $(z_{nj} : j \in \mathbf{N})$ is monotone. Show that the sequence $(Z_j)$ does not converge with respect to the metric $d$ in any of the three spaces.
(d) Show that there is a sequence $(X_j)$ in $m$, $c$, and $c_0$ which is bounded (in the sense that there exists a constant $K$ such that $d(X_j, 0) \le K$ for all $j \in \mathbf{N}$) but which possesses no convergent subsequence.
(e) (If $d$ is a metric on a set $M$, we say that a sequence $(X_j)$ in $M$ is a Cauchy sequence if for every $\varepsilon > 0$ there exists $K(\varepsilon) \in \mathbf{N}$ such that $d(X_j, X_k) < \varepsilon$ whenever $j, k \ge K(\varepsilon)$. We say that $M$ is complete with respect to $d$ in case every Cauchy sequence in $M$ converges to an element of $M$.) Prove that the sets $m$, $c$, and $c_0$ are complete with respect to the metric $d$ we have been considering.
(f) Let $f$ be the collection of all real sequences which have only a finite number of non-zero elements and define $d$ as before. Show that $d$ is a metric on $f$, but that $f$ is not complete with respect to $d$.

Section 17 Sequences of Functions


In the preceding three sections we have considered the convergence of
sequences of elements of R”; in the present section we shall consider
sequences of functions. After some simple preliminaries, we shall intro-
duce the rather subtle, but basic, notion of uniform convergence of a
sequence of functions.
Let $D \subseteq R^p$ be given and suppose that for each natural number $n \in N$
there is a function $f_n$ with domain $D$ and range in $R^q$; we shall say that $(f_n)$ is a
sequence of functions on $D \subseteq R^p$ to $R^q$. It should be understood that, for
any point $x$ in $D$, such a sequence of functions gives a sequence of elements
in $R^q$; namely, the sequence
(17.1)    $(f_n(x))$
which is obtained by evaluating each of the functions at x. For certain
points x in D the sequence (17.1) may converge and for other points x in D
this sequence may diverge. For each of those points $x$ for which the
sequence (17.1) converges there is, by Theorem 14.5, a uniquely determined
limit point in $R^q$. In general, the value of this limit, when it exists, will
depend on the choice of the point $x$. In this way, there arises a function
whose domain consists of all points $x$ in $D \subseteq R^p$ for which the sequence
(17.1) converges in $R^q$.
We shall now collect these introductory words in a formal definition of
convergence of a sequence of functions.
17.1 DEFINITION. Let $(f_n)$ be a sequence of functions on $D \subseteq R^p$ to
$R^q$, let $D_0$ be a subset of $D$, and let $f$ be a function with domain containing
$D_0$ and range in $R^q$. We say that the sequence $(f_n)$ converges on $D_0$ to $f$ if,
for each $x$ in $D_0$, the sequence $(f_n(x))$ converges in $R^q$ to $f(x)$. In this case
we call the function $f$ the limit on $D_0$ of the sequence $(f_n)$. When such a
function $f$ exists we say that the sequence $(f_n)$ converges to $f$ on $D_0$, or
simply that the sequence is convergent on $D_0$.
It follows from Theorem 14.5 that, except for possible change in the
domain $D_0$, the limit function is uniquely determined. Ordinarily, we
choose $D_0$ to be the largest set possible; that is, the set of all $x$ in $D$ for
which (17.1) converges. In order to symbolize that the sequence $(f_n)$
converges on $D_0$ to $f$ we sometimes write
$f = \lim (f_n)$ on $D_0$, or $f_n \to f$ on $D_0$.
We shall now consider some examples of this idea. For simplicity, we
shall treat the special case p =q = 1.
17.2 EXAMPLES. (a) For each natural number $n$, let $f_n$ be defined for
$x$ in $D = R$ by $f_n(x) = x/n$. Let $f$ be defined for all $x$ in $D = R$ by $f(x) = 0$.
(See Figure 17.1.) The statement that the sequence $(f_n)$ converges on $R$
to $f$ is equivalent to the statement that for each real number $x$ the numerical
sequence $(x/n)$ converges to 0. To see that this is the case, we apply
Example 14.8(a) and Theorem 15.6(b).
(b) Let $D = \{x \in R : 0 \le x \le 1\}$ and for each natural number $n$ let $f_n$ be
defined by $f_n(x) = x^n$ for all $x$ in $D$ and let $f$ be defined by

$$f(x) = 0, \quad 0 \le x < 1; \qquad f(x) = 1, \quad x = 1.$$

(See Figure 17.2.) It is clear that when $x = 1$, then $f_n(x) = f_n(1) = 1^n = 1$ so
that $f_n(1) \to f(1)$. We have shown in Example 14.8(c) that if $0 \le x < 1$,
then $f_n(x) = x^n \to 0$. Therefore, we conclude that $(f_n)$ converges on $D$ to $f$.
(It is not difficult to prove that if $x > 1$ then $(f_n(x))$ does not converge at
all.)
(c) Let $D = R$ and for each natural number $n$, let $f_n$ be the function
defined for $x$ in $D$ by

$$f_n(x) = \frac{x^2 + nx}{n},$$

and let $f(x) = x$. (See Figure 17.3.) Since $f_n(x) = (x^2/n) + x$, it follows from
Example 14.8(a) and Theorem 15.6(b) that $(f_n(x))$ converges to $f(x)$ for all
$x \in R$.
(d) Let $D = R$ and, for each natural number $n$, let $f_n$ be defined to be
$f_n(x) = (1/n) \sin (nx + n)$. (See Figure 17.4.) (A rigorous definition of the
sine function is not needed here; in fact, all we require is that $|\sin y| \le 1$ for
any real number $y$.) If $f$ is defined to be the zero function $f(x) = 0$, $x \in R$,
then $f = \lim (f_n)$ on $R$. Indeed, for any real number $x$, we have

$$|f_n(x) - f(x)| = \frac{1}{n} |\sin (nx + n)| \le \frac{1}{n}.$$

If $\varepsilon > 0$, there exists a natural number $K(\varepsilon)$ such that if $n \ge K(\varepsilon)$, then
$1/n < \varepsilon$. Hence for such $n$ we conclude that

$$|f_n(x) - f(x)| < \varepsilon$$

no matter what the value of $x$. Therefore, we infer that the sequence $(f_n)$
converges to $f$. (Note that by choosing $n$ sufficiently large, we can make
the differences $|f_n(x) - f(x)|$ less than $\varepsilon$ for all values of $x$ simultaneously!)
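The four sequences of Examples 17.2 can also be explored numerically. The following Python sketch is only an illustration and proves nothing: the function names, the sample points, and the sample values of n are choices made here, not part of the text. It tabulates the errors |f_n(x) - f(x)| at a fixed x for a few n.

```python
import math

# The four sequences of Examples 17.2, written as functions of (n, x).
def f_a(n, x):
    return x / n                      # limit on R: the zero function

def f_b(n, x):
    return x ** n                     # limit on [0, 1]: 0 for x < 1, 1 at x = 1

def f_c(n, x):
    return (x * x + n * x) / n        # limit on R: f(x) = x

def f_d(n, x):
    return math.sin(n * x + n) / n    # limit on R: the zero function

def limit_b(x):
    """Limit function of Example 17.2(b) on [0, 1]."""
    return 1.0 if x == 1 else 0.0

def errors(f, limit, x, ns=(1, 10, 100, 1000)):
    """Return |f_n(x) - limit(x)| for a few (arbitrarily chosen) values of n."""
    return [abs(f(n, x) - limit(x)) for n in ns]

if __name__ == "__main__":
    for x in (0.5, 0.9, 1.0):
        print("x^n at x =", x, errors(f_b, limit_b, x))
    print("x/n at x = 50:", errors(f_a, lambda x: 0.0, 50.0))
    print("(1/n) sin(nx + n) at x = 50:", errors(f_d, lambda x: 0.0, 50.0))
```

At x = 50 the errors for x/n are still large when n = 10, while the errors for (1/n) sin(nx + n) never exceed 1/n, whatever x is.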
Partly to reinforce Definition 17.1 and partly to prepare the way for the
important notion of uniform convergence, we formulate the following
restatement of Definition 17.1.
17.3 LEMMA. A sequence $(f_n)$ of functions on $D \subseteq R^p$ to $R^q$ converges
to a function $f$ on $D_0 \subseteq D$ if and only if for each $\varepsilon > 0$ and each $x$ in $D_0$ there
is a natural number $K(\varepsilon, x)$ such that for all $n \ge K(\varepsilon, x)$, then
(17.2)    $\|f_n(x) - f(x)\| < \varepsilon$.
Since this is just a reformulation of Definition 17.1, we shall not go
through the details of the proof, but leave them to the reader as an
exercise. We wish only to point out that the value of $n$ required in
inequality (17.2) will depend, in general, on both $\varepsilon > 0$ and $x \in D_0$. An
alert reader will have already noted that, in Examples 17.2(a-c), the value
of $n$ required to obtain (17.2) does depend on both $\varepsilon > 0$ and $x \in D_0$.
However, in Example 17.2(d) the inequality (17.2) can be satisfied for all
$x$ in $D_0$ provided $n$ is chosen sufficiently large but dependent on $\varepsilon$ alone.
It is precisely this rather subtle difference which distinguishes between
the notions of “ordinary” convergence of a sequence of functions (in the
sense of Definition 17.1) and “uniform” convergence, which we now
define.
17.4 DEFINITION. A sequence $(f_n)$ of functions on $D \subseteq R^p$ to $R^q$
converges uniformly on a subset $D_0$ of $D$ to a function $f$ in case for each
$\varepsilon > 0$ there is a natural number $K(\varepsilon)$ (depending on $\varepsilon$ but not on $x \in D_0$)
such that for all $n \ge K(\varepsilon)$ and $x \in D_0$, then

(17.3)    $\|f_n(x) - f(x)\| < \varepsilon$.


In this case we say that the sequence is uniformly convergent on Do. (See
Figure 17.5.)
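The difference between a threshold K(ε, x) and a threshold K(ε) can be made concrete for two of the examples. The sketch below is a hedged illustration: the helper names are invented here, and the formulas come from solving |x|/n < ε and 1/n < ε for n.

```python
import math

def smallest_K_pointwise(eps, x):
    """Smallest K with |x/n - 0| < eps for all n >= K (Example 17.2(a)).

    |x|/n < eps holds exactly when n > |x|/eps, so K depends on x.
    """
    return math.floor(abs(x) / eps) + 1

def K_uniform(eps):
    """A threshold that works for every x when f_n(x) = (1/n) sin(nx + n).

    |f_n(x)| <= 1/n < eps as soon as n > 1/eps, with no reference to x.
    """
    return math.floor(1 / eps) + 1

if __name__ == "__main__":
    eps = 0.25
    for x in (1, 10, 100, 1000):
        print("x =", x, " K(eps, x) =", smallest_K_pointwise(eps, x))
    print("K(eps) for Example 17.2(d):", K_uniform(eps))
```

For fixed ε the first threshold grows without bound as |x| grows, so no single K serves all of R; the second threshold is one number that serves every x.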
It follows immediately that if the sequence $(f_n)$ is uniformly convergent
on $D_0$ to $f$, then this sequence of functions also converges to $f$ in the sense
of Definition 17.1. That the converse is not true is seen by a careful
examination of Examples 17.2(a-c); other examples will be given below.
Before we proceed, it is useful to state a necessary and sufficient condition
for the sequence $(f_n)$ to fail to converge uniformly on $D_0$ to $f$.

17.5 LEMMA. A sequence $(f_n)$ does not converge uniformly on $D_0$ to $f$
if and only if for some $\varepsilon_0 > 0$ there is a subsequence $(f_{n_k})$ of $(f_n)$ and a
sequence $(x_k)$ in $D_0$ such that

(17.4)    $\|f_{n_k}(x_k) - f(x_k)\| \ge \varepsilon_0$ for $k \in N$.


The proof of this result merely requires that the reader negate Definition
17.4. It will be left to the reader as an essential exercise. The preceding
lemma is useful to show that Examples 17.2(a-c) do not converge uniformly
on the given sets $D_0$.
17.6 EXAMPLES. (a) We consider Example 17.2(a). If $n_k = k$ and
$x_k = k$, then $f_k(x_k) = 1$ so that
$$|f_k(x_k) - f(x_k)| = |1 - 0| = 1.$$
This shows that the sequence $(f_n)$ does not converge uniformly on $R$ to $f$.
(b) We consider Example 17.2(b). If $n_k = k$ and $x_k = (\tfrac{1}{2})^{1/k}$, then
$$|f_k(x_k) - f(x_k)| = |f_k(x_k)| = \tfrac{1}{2}.$$
Therefore, we infer that the sequence $(f_n)$ does not converge uniformly on
$[0, 1]$ to $f$.
(c) We consider Example 17.2(c). If $n_k = k$ and $x_k = k$, then
$$|f_k(x_k) - f(x_k)| = k,$$
showing that $(f_n)$ does not converge uniformly on $R$ to $f$.
(d) We consider Example 17.2(d). Then since
$$|f_n(x) - f(x)| \le 1/n$$
for all $x$ in $R$, the sequence $(f_n)$ converges uniformly on $R$ to $f$.
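The witness points used in Examples 17.6(a-c) can be checked by direct evaluation. In the Python sketch below the names gap_a, gap_b, and gap_c are hypothetical helpers introduced for this illustration; each returns |f_k(x_k) - f(x_k)| for the choice of x_k made in the corresponding example.

```python
def gap_a(k):
    """Example 17.6(a): f_k(x) = x/k, f = 0, witness x_k = k."""
    x_k = float(k)
    return abs(x_k / k - 0.0)

def gap_b(k):
    """Example 17.6(b): f_k(x) = x^k on [0, 1], witness x_k = (1/2)^(1/k).

    Here x_k < 1, so the limit function vanishes at x_k.
    """
    x_k = 0.5 ** (1.0 / k)
    return abs(x_k ** k - 0.0)

def gap_c(k):
    """Example 17.6(c): f_k(x) = (x^2 + kx)/k, f(x) = x, witness x_k = k."""
    x_k = float(k)
    return abs((x_k * x_k + k * x_k) / k - x_k)

if __name__ == "__main__":
    for k in (1, 5, 25, 125):
        print(k, gap_a(k), gap_b(k), gap_c(k))
```

In each case the gap stays bounded away from 0 (it equals 1, stays near 1/2 up to rounding, and equals k respectively), which is exactly the condition in Lemma 17.5.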

The Uniform Norm

In discussing uniform convergence it is often useful to employ a certain
norm on a vector space of functions.

If $D \subseteq R^p$ and $f : D \to R^q$, we say that $f$ is bounded in case there exists
$M > 0$ such that $\|f(x)\| \le M$ for all $x \in D$. If $f : D \to R^q$ is bounded, then it
follows that the number $\|f\|_D$ defined by

(17.5)    $\|f\|_D = \sup \{\|f(x)\| : x \in D\}$

exists in $R$. (We note that the norm on the right side of this equation is the
norm in the space $R^q$.)
17.7 DEFINITION. If $D \subseteq R^p$, then the collection of all bounded functions
on $D$ to $R^q$ is denoted by $B_{pq}(D)$ or (when $p$ and $q$ are understood) by
$B(D)$.
In the space $B_{pq}(D)$ we define the vector sum of two functions $f$, $g$ and
the scalar multiple of $c \in R$ and $f$ by

(17.6)    $(f + g)(x) = f(x) + g(x), \qquad (cf)(x) = c f(x)$

for all $x \in D$. We define the zero function to be the function $0 : D \to R^q$
defined for all $x \in D$ by $0(x) = 0$. We now connect this terminology with
the notions presented in Section 8.
17.8 LEMMA. (a) The set $B_{pq}(D)$ is a vector space under the vector
operations defined in equation (17.6).
(b) The function $f \mapsto \|f\|_D$ defined on $B_{pq}(D)$ in equation (17.5) is a norm
on $B_{pq}(D)$.
PROOF. The proof of (a) requires only routine calculations.
To prove (b), we need to establish the four properties of a norm given in
Definition 8.5. (i) It is clear from (17.5) that $\|f\|_D \ge 0$. (ii) Clearly $\|0\|_D =
\sup \{\|0(x)\| : x \in D\} = 0$. Conversely, if $\|f\|_D = 0$, then since $0 \le \|f(x)\| \le
\|f\|_D = 0$, we infer that $\|f(x)\| = 0$ and hence $f(x) = 0$ for all $x \in D$ so that
$f = 0$. (iii) The fact that $\|cf\|_D = |c| \, \|f\|_D$ is readily seen. (iv) Since
$$\|(f + g)(x)\| = \|f(x) + g(x)\| \le \|f(x)\| + \|g(x)\| \le \|f\|_D + \|g\|_D$$
for all $x \in D$, it follows that $\|f\|_D + \|g\|_D$ is an upper bound for the set
$\{\|(f + g)(x)\| : x \in D\}$. Therefore we have
$$\|f + g\|_D = \sup \{\|(f + g)(x)\| : x \in D\} \le \|f\|_D + \|g\|_D.$$
Q.E.D.
Sometimes the norm $f \mapsto \|f\|_D$ is called the uniform norm (or the supremum
norm) on $B_{pq}(D)$. We shall now show that the uniform convergence
of functions in $B_{pq}(D)$ is equivalent to convergence in the uniform
norm.
17.9 THEOREM. A sequence $(f_n)$ in $B_{pq}(D)$ converges uniformly on $D$ to
$f \in B_{pq}(D)$ if and only if

$$\|f_n - f\|_D \to 0.$$

PROOF. If the sequence $(f_n)$ converges uniformly to $f$ on $D$, then for any
$\varepsilon > 0$ there is a natural number $K(\varepsilon)$ such that if $n \ge K(\varepsilon)$ and $x \in D$ then
$\|f_n(x) - f(x)\| < \varepsilon$. This implies that
$$\|f_n - f\|_D = \sup \{\|(f_n - f)(x)\| : x \in D\} \le \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, this implies that $\|f_n - f\|_D \to 0$.
Conversely, if $\|f_n - f\|_D \to 0$, then given $\varepsilon > 0$ there exists $K(\varepsilon)$ such that
if $n \ge K(\varepsilon)$ then $\|f_n - f\|_D < \varepsilon$. This implies that if $x \in D$, then

$$\|f_n(x) - f(x)\| = \|(f_n - f)(x)\| \le \|f_n - f\|_D < \varepsilon.$$

Therefore the sequence $(f_n)$ converges uniformly on $D$ to $f$. Q.E.D.
We now illustrate the use of this theorem as a tool in examining a
sequence of functions for uniform convergence. We observe first that the
norm has been defined only for bounded functions; hence we can employ it
(directly, at least) only when the sequence consists of bounded functions.
17.10 EXAMPLES. (a) We cannot apply Theorem 17.9 to the example
considered in 17.2(a) and 17.6(a) for the reason that the functions $f_n$,
defined to be $f_n(x) = x/n$, are not bounded on $R$, which was given as the
domain. For the purpose of illustration, we change the domain to obtain a
bounded sequence on the new domain. For convenience, let us take
$E = [0, 1]$. Although the sequence $(x/n)$ did not converge uniformly to the
zero function on the domain $R$ [as was seen in Example 17.6(a)], the
convergence is uniform on $E = [0, 1]$. To see this, we calculate

$$\|f_n - f\|_E = \sup \left\{ \left| \frac{x}{n} - 0 \right| : 0 \le x \le 1 \right\} = \frac{1}{n};$$

hence $\|f_n - f\|_E = 1/n \to 0$.
(b) We now consider the sequence discussed in Examples 17.2(b) and
17.6(b). Here $D = [0, 1]$, $f_n(x) = x^n$, and the limit function $f$ is equal to 0
for $0 \le x < 1$ and equal to 1 for $x = 1$. Calculating the norm of the
difference $f_n - f$, we note that $|f_n(x) - f(x)|$ equals $x^n$ for $0 \le x < 1$ and
equals 0 for $x = 1$, so that

$$\|f_n - f\|_D = \sup \{x^n : 0 \le x < 1\} = 1 \quad \text{for } n \in N.$$

Since this norm does not converge to zero, we infer that the sequence $(f_n)$
does not converge uniformly on $D = [0, 1]$ to $f$. This bears out our earlier
considerations.
(c) We consider Example 17.2(c). Once again we cannot apply Theorem
17.9, since the functions are not bounded. Again, we choose a smaller
domain, taking $E = [0, a]$ with $a > 0$. Since

$$|f_n(x) - f(x)| = \left| \frac{x^2 + nx}{n} - x \right| = \frac{x^2}{n},$$

we have

$$\|f_n - f\|_E = \sup \{|f_n(x) - f(x)| : 0 \le x \le a\} = \frac{a^2}{n}.$$

Hence the sequence converges uniformly to $f$ on the interval $[0, a]$. (Why
does this not contradict the result obtained in Example 17.6(c)?)
(d) Referring to Example 17.2(d), we consider the function $f_n(x) =
(1/n) \sin (nx + n)$ on $D = R$. Here the limit function $f(x) = 0$ for all $x \in D$.
In order to establish the uniform convergence of this sequence, note that
$$\|f_n - f\|_D = \sup \{(1/n) |\sin (nx + n)| : x \in R\}.$$
But since $|\sin y| \le 1$, we conclude that $\|f_n - f\|_D \le 1/n$. Hence $(f_n)$
converges uniformly on $R$, as was established in Example 17.6(d).
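The norms computed in Examples 17.10 can be compared with a crude numerical estimate. The Python sketch below samples |f_n - f| on an evenly spaced grid; this yields only a lower estimate of the supremum, and the grid size and helper names are assumptions made for the illustration.

```python
def sup_norm_on_grid(g, a, b, points=10001):
    """Approximate sup{|g(x)| : a <= x <= b} by sampling an evenly spaced grid.

    The result never exceeds the true supremum and may fall well below it.
    """
    step = (b - a) / (points - 1)
    return max(abs(g(a + i * step)) for i in range(points))

def diff_a(n):
    """f_n - f for Example 17.10(a): x/n - 0 on [0, 1]."""
    return lambda x: x / n - 0.0

def diff_b(n):
    """f_n - f for Example 17.10(b): x^n minus the discontinuous limit on [0, 1]."""
    return lambda x: x ** n - (1.0 if x >= 1.0 else 0.0)

def diff_c(n):
    """f_n - f for Example 17.10(c): (x^2 + nx)/n - x on [0, a]."""
    return lambda x: (x * x + n * x) / n - x

if __name__ == "__main__":
    print(sup_norm_on_grid(diff_a(4), 0.0, 1.0))       # close to 1/4
    print(sup_norm_on_grid(diff_b(5), 0.0, 1.0))       # close to 1
    print(sup_norm_on_grid(diff_c(10), 0.0, 3.0))      # close to 9/10
    print(sup_norm_on_grid(diff_b(10**6), 0.0, 1.0))   # tiny, although the norm is 1
```

The last line is a caution: for large n the supremum of x^n on [0, 1) is approached only extremely close to 1, so a fixed grid misses it entirely. The exact calculation in Example 17.10(b) is what settles the matter.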
One of the more useful aspects of the norm is that it facilitates the
formulation of a Cauchy Criterion for the uniform convergence of a se-
quence of bounded functions.
17.11 CAUCHY CRITERION FOR UNIFORM CONVERGENCE. Let $(f_n)$
be a sequence of functions in $B_{pq}(D)$. Then there is a function $f \in B_{pq}(D)$ to
which $(f_n)$ is uniformly convergent on $D$ if and only if for each $\varepsilon > 0$ there is a
natural number $M(\varepsilon)$ such that for all $m, n \ge M(\varepsilon)$, then $\|f_m - f_n\|_D < \varepsilon$.
PROOF. Suppose that the sequence $(f_n)$ converges uniformly on $D$ to a
function $f \in B_{pq}(D)$. Then, for $\varepsilon > 0$ there is a natural number $K(\varepsilon)$ such
that if $n \ge K(\varepsilon)$, then $\|f_n - f\|_D < \varepsilon/2$. Hence if both $m, n \ge K(\varepsilon)$, we
conclude that

$$\|f_m - f_n\|_D \le \|f_m - f\|_D + \|f - f_n\|_D < \varepsilon.$$
Conversely, suppose the Cauchy Criterion is satisfied and that for $\varepsilon > 0$
there is a natural number $M(\varepsilon)$ such that $\|f_m - f_n\|_D < \varepsilon$ when $m, n \ge M(\varepsilon)$.
Now for each $x \in D$ we have

(17.6)    $\|f_m(x) - f_n(x)\| \le \|f_m - f_n\|_D < \varepsilon$ for $m, n \ge M(\varepsilon)$.

Hence the sequence $(f_n(x))$ is a Cauchy sequence in $R^q$ and so converges to
some element of $R^q$. We define $f$ for $x$ in $D$ by

$$f(x) = \lim (f_n(x)).$$

From (17.6) we conclude that if $m$ is a fixed natural number satisfying
$m \ge M(\varepsilon)$ and if $n$ is any natural number with $n \ge M(\varepsilon)$, then for all $x$ in $D$
we have
$$\|f_m(x) - f_n(x)\| < \varepsilon.$$
If we apply Lemma 15.8 it follows that if $m \ge M(\varepsilon)$ and $x \in D$, then
$\|f_m(x) - f(x)\| \le \varepsilon$. Since $f_m$ is a bounded function, it follows readily from
this (how?) that $f$ is bounded and hence belongs to $B_{pq}(D)$. Moreover we
conclude that $(f_n)$ converges uniformly to $f$ on $D$. Q.E.D.
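One way to answer the "(how?)" in the proof is sketched below. It fixes the particular value ε = 1, which is a choice made here for convenience; any positive value would serve.

```latex
% Take \varepsilon = 1 and fix m = M(1). For every x in D the triangle
% inequality and the estimate \|f_m(x) - f(x)\| \le 1 give
\[
  \|f(x)\| \le \|f(x) - f_m(x)\| + \|f_m(x)\| \le 1 + \|f_m\|_D .
\]
% Hence \|f\|_D \le 1 + \|f_m\|_D, so f is bounded and f \in B_{pq}(D).
% For uniform convergence, take the supremum over x in the estimate
% \|f_m(x) - f(x)\| \le \varepsilon, valid for m \ge M(\varepsilon):
\[
  \|f_m - f\|_D = \sup \{ \|f_m(x) - f(x)\| : x \in D \} \le \varepsilon
  \quad \text{for } m \ge M(\varepsilon).
\]
```

Since ε > 0 is arbitrary, the second display says that the norms of f_m - f tend to 0, and Theorem 17.9 then gives uniform convergence on D.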

Exercises
In these exercises you may make use of the elementary properties of the
trigonometric and exponential functions from earlier courses.
17.A. For each $n \in N$, let $f_n$ be defined for $x > 0$ by $f_n(x) = 1/(nx)$. For what
values of $x$ does $\lim (f_n(x))$ exist?
17.B. For each $n \in N$, let $g_n$ be defined for $x \ge 0$ by the formula

$$g_n(x) = nx, \quad 0 \le x \le 1/n; \qquad g_n(x) = \frac{1}{nx}, \quad 1/n < x.$$

Show that $\lim (g_n(x)) = 0$ for all $x > 0$.
17.C. Show that $\lim ((\cos \pi x)^{2n})$ exists for all values of $x$. What is its limit?
17.D. Show that, if we define $f_n$ on $R$ by

$$f_n(x) = \frac{nx}{1 + n^2 x^2},$$

then $(f_n)$ converges on $R$.


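17.E. Let $h_n$ be defined on the interval $I = [0, 1]$ by the formula

$$h_n(x) = 1 - nx, \quad 0 \le x \le 1/n; \qquad h_n(x) = 0, \quad 1/n < x \le 1.$$

Show that $\lim (h_n)$ exists on $I$.
17.F. Let $g_n$ be defined on $I$ by

$$g_n(x) = nx, \quad 0 \le x \le 1/n; \qquad g_n(x) = \frac{n}{n-1}(1 - x), \quad 1/n < x \le 1.$$

Show that $\lim (g_n)$ exists on $I$.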


17.G. Show that if $f_n$ is defined on $R$ by

$$f_n(x) = \frac{2}{\pi} \operatorname{Arc\,tan} (nx),$$

then $f = \lim (f_n)$ exists on $R$. In fact the limit is given by

$$f(x) = 1, \quad x > 0; \qquad f(x) = 0, \quad x = 0; \qquad f(x) = -1, \quad x < 0.$$

17.H. Show that $\lim (e^{-nx})$ exists for $x \ge 0$. Also consider the existence of
$\lim (x e^{-nx})$.
17.I. Suppose that $(x_n)$ is a convergent sequence of points which lies, together
with its limit $x$, in a set $D \subseteq R^p$. Suppose that $(f_n)$ converges on $D$ to the function $f$.
Is it true that $f(x) = \lim (f_n(x_n))$?
17.J. Consider the preceding exercise with the additional hypothesis that the
convergence of the (f,,) is uniform on D.
17.K. Prove that the convergence in Exercise 17.A is not uniform on the entire
set of convergence, but that it is uniform for $x \ge 1$.
17.L. Show that the convergence in Exercise 17.B is not uniform on the domain
$x \ge 0$, but that it is uniform on a set $x \ge c$, where $c > 0$.
17.M. Is the convergence in Exercise 17.D uniform on $R$?
17.N. Is the convergence in Exercise 17.E uniform on $I$?
17.O. Is the convergence in Exercise 17.F uniform on $I$? Is it uniform on $[c, 1]$
for $c > 0$?
17.P. Does the sequence $(x e^{-nx})$ converge uniformly for $x \ge 0$?
17.Q. Does the sequence $(x^2 e^{-nx})$ converge uniformly for $x \ge 0$?
17.R. Let (f.) be a sequence of functions which converges on D to a function f.
If A and B are subsets of D and it is known that the convergence is uniform on A
and also on B, show that the convergence is uniform on A U B.
17.S. Give an example of a sequence $(f_n)$ in $B_{pq}(I)$ such that $\|f_n\|_I \le 1$ for all $n \in N$
which does not have a uniformly convergent subsequence. (Hence the Bolzano-
Weierstrass Theorem does not hold in $B_{pq}(I)$.)

Section 18 The Limit Superior

In Section 6 we introduced the notion of the supremum of a non-empty
bounded set of real numbers, and we have made important use of this
notion many times. However, in dealing with a bounded infinite set $S \subseteq R$
it is sometimes also of interest to consider the largest cluster point $s^*$ of $S$.
This point $s^*$ is the infimum of all real numbers which are exceeded by at
most a finite number of elements of $S$. We shall adapt this notion to
bounded sequences in $R$ to obtain the frequently useful concept of the
“limit superior.”

18.1 DEFINITION. Let $X = (x_n)$ be a bounded sequence in $R$.
(a) The limit superior of $X$, which we denote by

$\limsup X$, $\limsup (x_n)$, or $\overline{\lim}\,(x_n)$,

is the infimum of the set $V$ of $v \in R$ such that there are at most a finite
number of $n \in N$ such that $v < x_n$.
(b) The limit inferior of $X$, which we denote by

$\liminf X$, $\liminf (x_n)$, or $\underline{\lim}\,(x_n)$,

is the supremum of the set $W$ of $w \in R$ such that there are at most a finite
number of $m \in N$ such that $x_m < w$.
While a bounded sequence need not have a limit, it always has a unique
limit superior (and a unique limit inferior). That is clear from the fact that
the number $v = \sup \{x_n : n \in N\}$ belongs to the set $V$, while the number
$\inf \{x_n : n \in N\} - 1$ is a lower bound for $V$.
There are several equivalent ways, which are often useful, that one can
define the limit superior of a bounded sequence. (The reader is strongly
urged to attempt to prove this result before reading the proof.)
18.2 THEOREM. If $X = (x_n)$ is a bounded sequence in $R$, then the
following statements are equivalent for a real number $x^*$.
(a) $x^* = \limsup (x_n)$.
(b) If $\varepsilon > 0$, there are at most a finite number of $n \in N$ such that $x^* + \varepsilon < x_n$,
but there are an infinite number such that $x^* - \varepsilon < x_n$.
(c) If $v_m = \sup \{x_n : n \ge m\}$, then $x^* = \inf \{v_m : m \in N\}$.
(d) If $v_m = \sup \{x_n : n \ge m\}$, then $x^* = \lim (v_m)$.
(e) If $L$ is the set of $v \in R$ such that there exists a subsequence of $X$ which
converges to $v$, then $x^* = \sup L$.
PROOF. Let $x^* = \limsup (x_n)$ and let $\varepsilon > 0$. By Definition 18.1, there
exists a $v \in V$ with $x^* \le v < x^* + \varepsilon$. Therefore $x^* + \varepsilon$ also belongs to $V$, so
there can be at most a finite number of $n \in N$ such that $x^* + \varepsilon < x_n$. On the
other hand $x^* - \varepsilon$ is not in $V$, so there are an infinite number of $n \in N$ such
that $x^* - \varepsilon < x_n$. Hence (a) implies (b).
If (b) holds, given $\varepsilon > 0$, then for all sufficiently large $m$ we have $v_m \le
x^* + \varepsilon$; therefore $\inf \{v_m : m \in N\} \le x^* + \varepsilon$. But since there is an infinite
number of $n \in N$ such that $x^* - \varepsilon < x_n$, then $x^* - \varepsilon < v_m$ for all $m \in N$ and
hence $x^* - \varepsilon \le \inf \{v_m : m \in N\}$. Since $\varepsilon > 0$ is arbitrary, we deduce that
$x^* = \inf \{v_m : m \in N\}$ and (c) holds.
If the sequence $(v_m)$ is defined as in (c), then it is monotone decreasing
and hence $\inf (v_m) = \lim (v_m)$, so that (c) implies (d).
Now suppose that $x^*$ satisfies (d) and $X' = (x_{n_k})$ is a convergent subsequence
of $X$; since $n_k \ge k$ we have $x_{n_k} \le v_k$ and hence $\lim X' \le \lim (v_k) = x^*$.
Conversely, note that there exists $n_1 \in N$ such that $v_1 - 1 < x_{n_1} \le v_1$.
Inductively choose $n_{k+1} > n_k$ such that
$$v_{n_k + 1} - \frac{1}{k+1} < x_{n_{k+1}} \le v_{n_k + 1};$$
this is possible because $v_{n_k + 1}$ is the supremum of the terms $x_n$ with $n > n_k$.
Since $\lim (v_m) = x^*$, it follows that $x^* = \lim (x_{n_k})$. Therefore (d) implies (e).
Finally, let $w = \sup L$. If $\varepsilon > 0$ is given, then there can be at most a finite
number of $n \in N$ with $w + \varepsilon < x_n$ (by the Bolzano-Weierstrass Theorem
16.4). Therefore $w + \varepsilon \in V$ and $\limsup X \le w + \varepsilon$. On the other hand
there exists a subsequence $X'$ converging to some number exceeding $w - \varepsilon$;
hence $w - \varepsilon$ is not in $V$ and so $w - \varepsilon \le \limsup X$. Since $\varepsilon > 0$ is arbitrary,
we infer that $w = \limsup X$. Therefore (e) implies (a). Q.E.D.
Both of the characterizations (d) and (e) can be regarded as justifying the term
“limit superior.” There are corresponding characterizations for the limit inferior
of a bounded sequence which the reader should write out and prove.
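Characterization (d) suggests a simple numerical experiment: compute the tail suprema of a long but finite initial segment and watch them decrease. The Python sketch below does this for the sequence x_n = (-1)^n (1 + 1/n), chosen here purely as an illustration; a finite truncation can only suggest the limit superior, and the function name is hypothetical.

```python
def tail_suprema(xs):
    """Given xs = [x_1, ..., x_N], return [v_1, ..., v_N] where
    v_m = max{x_n : m <= n <= N}, a truncated stand-in for sup{x_n : n >= m}.
    """
    v = []
    running = float("-inf")
    for x in reversed(xs):          # scan from the tail towards the front
        running = max(running, x)
        v.append(running)
    v.reverse()
    return v

if __name__ == "__main__":
    N = 1000
    xs = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]
    v = tail_suprema(xs)
    print(v[0], v[2], v[99], v[-1])   # decreasing towards 1
```

For this sequence the even-indexed terms 1 + 1/n decrease to 1, so the tail suprema decrease to 1, which is the limit superior; the odd-indexed terms play the same role for the limit inferior, which is -1.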

We now establish the basic algebraic properties of the limit superior and
the limit inferior of bounded sequences.

18.3 THEOREM. Let $X = (x_n)$ and $Y = (y_n)$ be bounded sequences of
real numbers. Then the following relations hold:
(a) $\liminf (x_n) \le \limsup (x_n)$.
(b) If $c \ge 0$, then $\liminf (c x_n) = c \liminf (x_n)$ and $\limsup (c x_n) =
c \limsup (x_n)$.
(b') If $c \le 0$, then $\liminf (c x_n) = c \limsup (x_n)$ and $\limsup (c x_n) =
c \liminf (x_n)$.
(c) $\liminf (x_n) + \liminf (y_n) \le \liminf (x_n + y_n)$.
(d) $\limsup (x_n + y_n) \le \limsup (x_n) + \limsup (y_n)$.
(e) If $x_n \le y_n$ for all $n$, then $\liminf (x_n) \le \liminf (y_n)$ and also
$\limsup (x_n) \le \limsup (y_n)$.

PROOF. (a) If $w < \liminf (x_n)$ and $v > \limsup (x_n)$, then there are infinitely
many $n \in N$ such that $w \le x_n$, while there are only a finite number
such that $v < x_n$. Therefore, we must have $w \le v$, which implies (a).
(b) If $c \ge 0$, then multiplication by $c$ preserves all inequalities of the
form $w \le x_n$, etc.
(b') If $c \le 0$, then multiplication by $c$ reverses inequalities and converts
the limit superior into the limit inferior, and conversely.
Statement (c) is dual to (d) and can be derived directly from (d) or proved
by using the same type of argument. To prove (d), let $v > \limsup (x_n)$ and
$u > \limsup (y_n)$; by definition there are only a finite number of $n \in N$ such
that $v < x_n$ and a finite number such that $u < y_n$. Therefore there can be
only a finite number of $n$ such that $v + u < x_n + y_n$, showing that
$\limsup (x_n + y_n) \le v + u$. This proves statement (d).
We now prove the second assertion in (e). If $u > \limsup (y_n)$, then there
can be only a finite number of natural numbers $n$ such that $u < y_n$. Since
$x_n \le y_n$, then $\limsup (x_n) \le u$, and so $\limsup (x_n) \le \limsup (y_n)$. Q.E.D.

Each of the equivalent conditions given in Theorem 18.2 can be used to
prove the parts of Theorem 18.3. It is suggested that some of these
alternative proofs be written out as an exercise.
It might be asked whether the inequalities in Theorem 18.3 can be replaced by
equalities. In general, the answer is no. For, if $X = ((-1)^n)$, then $\liminf X = -1$
and $\limsup X = +1$. If $Y = ((-1)^{n+1})$, then $X + Y = (0)$ so that

$$\liminf X + \liminf Y = -2 < 0 = \liminf (X + Y),$$
$$\limsup (X + Y) = 0 < 2 = \limsup X + \limsup Y.$$
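The counterexample can be verified mechanically. The sketch below relies on one fact not stated in the text but easy to check from Definition 18.1: in a periodic sequence every value recurs infinitely often, so the limit superior is the largest value in one period and the limit inferior is the smallest. The helper names are hypothetical.

```python
def limsup_periodic(period):
    """Limit superior of the sequence obtained by repeating `period` forever."""
    return max(period)

def liminf_periodic(period):
    """Limit inferior of the sequence obtained by repeating `period` forever."""
    return min(period)

X = [-1, 1]                          # x_n = (-1)^n for n = 1, 2, then repeating
Y = [1, -1]                          # y_n = (-1)^(n+1)
S = [x + y for x, y in zip(X, Y)]    # x_n + y_n = 0 for every n

if __name__ == "__main__":
    print(liminf_periodic(X) + liminf_periodic(Y), liminf_periodic(S))
    print(limsup_periodic(S), limsup_periodic(X) + limsup_periodic(Y))
```

Both inequalities of Theorem 18.3(c) and (d) are strict for this pair, as the text asserts.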
We have seen that the limit inferior and the limit superior exist for any
bounded sequence, regardless of whether the sequence is convergent. We
now show that the existence of lim X is equivalent to the equality of
lim inf X and lim sup X.
18.4 LEMMA. Let $X$ be a bounded sequence of real numbers. Then $X$
is convergent if and only if $\liminf X = \limsup X$, in which case $\lim X$ is the
common value.
PROOF. If $x = \lim X$, then for each $\varepsilon > 0$ there is a natural number $N(\varepsilon)$
such that
$$x - \varepsilon < x_n < x + \varepsilon, \qquad n \ge N(\varepsilon).$$
The second inequality shows that $\limsup X \le x + \varepsilon$ and the first inequality
shows that $x - \varepsilon \le \liminf X$. Hence $0 \le \limsup X - \liminf X \le 2\varepsilon$, and
since $\varepsilon > 0$ is arbitrary, we have the stated equality.
Conversely, suppose that $x = \liminf X = \limsup X$. If $\varepsilon > 0$, it follows
from Theorem 18.2(b) that there exists a natural number $N_1(\varepsilon)$ such that if
$n \ge N_1(\varepsilon)$, then $x_n < x + \varepsilon$. Similarly, there exists a natural number $N_2(\varepsilon)$
such that if $n \ge N_2(\varepsilon)$, then $x - \varepsilon < x_n$. Let $N(\varepsilon) = \sup \{N_1(\varepsilon), N_2(\varepsilon)\}$; if
$n \ge N(\varepsilon)$, then $|x_n - x| < \varepsilon$, showing that $x = \lim X$. Q.E.D.

Unbounded Sequences

Sometimes it is convenient to have the limit superior and the limit
inferior defined for arbitrary (that is, not necessarily bounded) sequences in
$R$. To do this we need to introduce the symbols $+\infty$ and $-\infty$, but it is to be
emphasized that we do not consider them to be real numbers; they are
merely convenient symbols.
If $S$ is a non-empty set in $R$ which is not bounded above, we define
$\sup S = +\infty$; if $T$ is a non-empty set in $R$ which is not bounded below, we
define $\inf T = -\infty$. As remarked after Definition 6.1, every real number is
an upper bound of the empty set $\emptyset$, so we define $\sup \emptyset = -\infty$. Similarly,
every real number is a lower bound of $\emptyset$, so we define $\inf \emptyset = +\infty$.
Now let $X = (x_n)$ be a sequence in $R$ which is not bounded above; then
the set $V$ of numbers $v \in R$ such that there are at most a finite number of
$n \in N$ such that $v < x_n$ is empty. Hence $\inf V = +\infty$. Thus if $X = (x_n)$ is
a sequence in $R$ which is not bounded above, then we have
$$\limsup (x_n) = +\infty.$$
Similarly, if $Y = (y_n)$ is a sequence in $R$ which is not bounded below, then
we have
$$\liminf (y_n) = -\infty.$$
We note that if $X = (x_n)$ is a sequence in $R$ that is not bounded above,
then the sets $\{x_n : n \ge m\}$ are not bounded above and so
$$v_m = \sup \{x_n : n \ge m\} = +\infty$$
for all $m \in N$.

Infinite Limits

If $X = (x_n)$ is a sequence in $R$, we say that $X = (x_n)$ diverges to $+\infty$, and
write $\lim (x_n) = +\infty$, if for every $\alpha \in R$ there is a $K(\alpha) \in N$ such that if
$n \ge K(\alpha)$ then $x_n > \alpha$.
Similarly we say that $X = (x_n)$ diverges to $-\infty$, and write $\lim (x_n) = -\infty$, in
case for every $\alpha \in R$ there is a $K(\alpha) \in N$ such that if $n \ge K(\alpha)$ then $x_n < \alpha$.
It is an exercise to show that $X = (x_n)$ diverges to $+\infty$ if and only if
$$\liminf (x_n) = \limsup (x_n) = +\infty,$$
and that $X = (x_n)$ diverges to $-\infty$ if and only if
$$\liminf (x_n) = \limsup (x_n) = -\infty.$$
Exercises

18.A. Determine the limit superior and the limit inferior of the following
bounded sequences in $R$.

(a) $((-1)^n)$, (b) $((-1)^n/n)$,
(c) $((-1)^n + 1/n)$, (d) $(\sin n)$.
18.B. If X =(x,) is a bounded sequence in R, show that there is a subsequence of
X which converges to lim inf X.
18.C. Formulate and prove directly the theorem corresponding to Theorem 18.2
for the limit inferior.
18.D. Give a direct proof of Theorem 18.3(c).
18.E. Prove Theorem 18.3(d) by using 18.2(b) as the definition of the limit
superior. Do the same using 18.2(d) and 18.2(e).
18.F. If $X = (x_n)$ is a bounded sequence of strictly positive elements in $R$, show
that $\limsup (x_n^{1/n}) \le \limsup (x_{n+1}/x_n)$.
18.G. Determine the limit superior and the limit inferior of the following sequences
in $R$.
(a) $((-1)^n n)$, (b) $(n \sin n)$,
(c) $(n (\sin n)^2)$, (d) $(n \tan n)$.
18.H. Show that the sequence $X = (x_n)$ in $R$ diverges to $+\infty$ if and only if
$\liminf X = +\infty$.
18.I. Show that $\limsup X = +\infty$ if and only if there is a subsequence $X'$ of $X$ such
that $\lim X' = +\infty$.
18.J. Interpret Theorem 18.3 for unbounded sequences.

Section 19 Some Extensions

It is frequently important in analysis to estimate the “order of magnitude”
of a sequence or to compare two sequences relative to their
magnitude. In doing so we discard terms which make no “essential
contribution.” For example, if $x_n = 2n + 17$, then when $n \in N$ is large, the
dominant contribution comes from the term $2n$. If $y_n = n^2 - 5n$, then when
$n \in N$ is large, the dominant term is $n^2$. And, although the first few terms
of $(y_n)$ are smaller than those of $(x_n)$, the terms of this sequence ultimately
grow more rapidly than those of $(x_n)$.
We shall now introduce some terminology to make this idea more
precise and some notation, due to Landau,† that is often useful.
19.1 DEFINITION. Let $X = (x_n)$ and $Y = (y_n)$ be sequences in $R$ and
suppose that $y_n \ne 0$ for all sufficiently large $n \in N$. We say that $X$ and $Y$
are equivalent and write
$X \sim Y$ or $(x_n) \sim (y_n)$
in case $\lim (x_n/y_n) = 1$. We say that $X$ is of a lower order of magnitude than
$Y$ and write
$X = o(Y)$ or $x_n = o(y_n)$
in case $\lim (x_n/y_n) = 0$. We say that $X$ is dominated by $Y$ and write
$X = O(Y)$ or $x_n = O(y_n)$
in case the sequence $(x_n/y_n)$ is bounded.

It is clear that either $X \sim Y$ or $X = o(Y)$ implies that $X = O(Y)$.
Various properties of these notations will be given in the exercises.
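The two sequences from the opening paragraph of this section, x_n = 2n + 17 and y_n = n^2 - 5n, give a quick illustration of these notations. The Python sketch below uses exact rational arithmetic; the helper names and the sample indices are choices made here. Note that y_5 = 0, which is harmless because the definition only asks that y_n be non-zero for all sufficiently large n.

```python
from fractions import Fraction

def x(n):
    return 2 * n + 17

def y(n):
    return n * n - 5 * n         # vanishes at n = 5, non-zero for n >= 6

def two_n(n):
    return 2 * n

def ratio(u, v, n):
    """Exact value of u(n)/v(n); the caller must make sure v(n) != 0."""
    return Fraction(u(n), v(n))

if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, float(ratio(x, y, n)), float(ratio(x, two_n, n)))
```

The first column of ratios shrinks towards 0, in line with X = o(Y); the second column approaches 1, in line with (2n + 17) ~ (2n). The computation only suggests these limits; the proofs are one-line applications of Theorem 15.6.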

Cesaro Summation

We have already defined what is meant by the convergence of a sequence
$X = (x_n)$ in $R^p$ to an element $x$. However, it may be possible to attach
$x$ to the sequence $X$ as a sort of “generalized limit,” even though the
sequence $X$ does not converge to $x$ in the sense of Definition 14.3. There
are many ways in which one can generalize the idea of the limit of a
sequence, and to give very much of an account of some of them would take
us far beyond the scope of this book. However, there is a method which is
both elementary in nature and useful in applications to oscillatory sequences.
Since it is of some importance and the proof of the main result
is typical of many analytical arguments, we inject here a brief introduction
to the theory of Cesaro‡ summability.

† EDMUND (G. H.) LANDAU (1877-1938) was a professor at Göttingen and is known for his
research and books on number theory and analysis. These books are noted for their rigor and
brevity of style (and their elementary German).
19.2 DEFINITION. If $X = (x_n)$ is a sequence of elements in $R^p$, then
the sequence $S = (\sigma_n)$ defined by

$$\sigma_1 = x_1, \quad \sigma_2 = \frac{x_1 + x_2}{2}, \quad \ldots, \quad \sigma_n = \frac{x_1 + x_2 + \cdots + x_n}{n}, \quad \ldots$$

is called the sequence of arithmetic means of $X$.

In other words, the elements of $S$ are found by averaging the terms in $X$. Since
this average tends to smooth out occasional fluctuations in $X$, it is reasonable to
expect that the sequence $S$ has more chance of converging than the original
sequence $X$. In case the sequence $S$ of arithmetic means converges to an element
$y$, we say that the sequence $X$ is Cesaro summable to $y$, or that $y$ is the $(C, 1)$-limit
of the sequence $X$.
For example, let $X$ be the non-convergent real sequence $X = (1, 0, 1, 0, \ldots)$; it is
readily seen that if $n$ is an even natural number, then $\sigma_n = \tfrac{1}{2}$ and if $n$ is odd then
$\sigma_n = (n + 1)/2n$. Since $\tfrac{1}{2} = \lim (\sigma_n)$, the sequence $X$ is Cesaro summable to $\tfrac{1}{2}$, which
is not the limit of $X$ but seems like the most natural “generalized limit” we might
try to attach to X.
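The arithmetic means of the sequence (1, 0, 1, 0, ...) are easy to compute exactly. The following Python sketch uses rational arithmetic so that the two formulas quoted above can be compared term by term; the function name and the choice of 100 terms are assumptions of the illustration.

```python
from fractions import Fraction
from itertools import accumulate

def arithmetic_means(xs):
    """Return [sigma_1, ..., sigma_N] with sigma_n = (x_1 + ... + x_n)/n."""
    return [Fraction(s, n) for n, s in enumerate(accumulate(xs), start=1)]

X = [1, 0] * 50                  # the first 100 terms of 1, 0, 1, 0, ...
sigma = arithmetic_means(X)

if __name__ == "__main__":
    for n in (1, 2, 3, 4, 99, 100):
        print(n, sigma[n - 1])
```

The even-indexed means all equal 1/2 and the odd-indexed means (n + 1)/2n decrease to 1/2, so the oscillation of X is damped out by the averaging.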
It seems reasonable, in generalizing the notion of the limit of a sequence,
to require that the generalized limit give the usual value of the limit
whenever the sequence is convergent. We now show that the Cesaro
method has this property.
19.3 THEOREM. If the sequence $X = (x_n)$ converges to $x$, then the sequence
$S = (\sigma_n)$ of arithmetic means also converges to $x$.
PROOF. We need to estimate the magnitude of

(19.1)    $\sigma_n - x = \frac{1}{n}(x_1 + x_2 + \cdots + x_n) - x = \frac{1}{n}\{(x_1 - x) + (x_2 - x) + \cdots + (x_n - x)\}$.

Since $x = \lim (x_n)$, then given $\varepsilon > 0$ there is a natural number $N(\varepsilon)$ such
that if $m \ge N(\varepsilon)$, then $\|x_m - x\| < \varepsilon$. Also, since the sequence $X = (x_n)$ is
convergent, there is a real number $A$ such that $\|x_k - x\| < A$ for all $k$. If
$n \ge N = N(\varepsilon)$, we break the sum on the right side of (19.1) into a sum from
$k = 1$ to $k = N$ plus a sum from $k = N + 1$ to $k = n$. We apply the estimate
$\|x_k - x\| < \varepsilon$ to the latter $n - N$ terms to obtain

$$\|\sigma_n - x\| \le \frac{NA}{n} + \frac{n - N}{n}\,\varepsilon \quad \text{for } n \ge N(\varepsilon).$$

If $n$ is sufficiently large, then $NA/n < \varepsilon$ and since $(n - N)/n < 1$, we find
that $\|\sigma_n - x\| < 2\varepsilon$ for $n$ sufficiently large. Hence $x = \lim (\sigma_n)$. Q.E.D.

‡ ERNESTO CESARO (1859-1906) studied in Rome and taught at Naples. He did work in
geometry and algebra as well as analysis.
We shall not pursue the theory of summability any further, but refer the reader to
books on divergent series and summability. For example, see the book by Knopp
listed in the References. One of the most interesting elementary applications of
Cesàro summability is the celebrated theorem of Fejér which asserts that a
continuous function can be recovered from its Fourier series by the process of
Cesàro summability, even though it cannot always be recovered from this series by
ordinary convergence. (See Theorem 38.12.)

Double and Iterated Sequences


We recall that a sequence in $R^p$ is a function defined on the set N of
natural numbers and with range in $R^p$. A double sequence in $R^p$ is a
function X with domain $N \times N$ consisting of all ordered pairs of natural
numbers and range in $R^p$. In other words, at each ordered pair (m, n) of
natural numbers the value of the double sequence X is an element of $R^p$
which we shall typically denote by $x_{mn}$. Generally we shall use a symbolism
such as $X = (x_{mn})$ to represent X, but sometimes it is convenient to list the
elements in an array such as

$$
\begin{array}{ccccc}
x_{11} & x_{12} & \cdots & x_{1n} & \cdots \\
x_{21} & x_{22} & \cdots & x_{2n} & \cdots \\
\vdots & \vdots & & \vdots & \\
x_{m1} & x_{m2} & \cdots & x_{mn} & \cdots \\
\vdots & \vdots & & \vdots &
\end{array}
\tag{19.2}
$$

Observe that, in this array, the first index refers to the row in which the
element $x_{mn}$ appears and the second index refers to the column.
19.4 DEFINITION. If $X = (x_{mn})$ is a double sequence in $R^p$, then an
element x is said to be a limit (or a double limit) of X if for each positive
number $\varepsilon$ there is a natural number $N(\varepsilon)$ such that for all $m, n \geq N(\varepsilon)$
then $\|x_{mn} - x\| < \varepsilon$. In this case we say that the double sequence converges
to x and write
$$x = \lim_{mn} (x_{mn}) \quad \text{or} \quad x = \lim X.$$
Much of the elementary theory of limits of sequences carries over with
little change to double sequences. In particular, the fact that the double
limit is uniquely determined (when it exists) is proved in exactly the same
manner as in Theorem 14.5. Similarly, one can define algebraic opera-
tions for double sequences and obtain results exactly parallel to those
discussed in Theorem 15.6. There is also a Cauchy Criterion for the
convergence of a double sequence which we will state, but whose proof we
leave to the reader.
19.5 CAUCHY CRITERION. If $X = (x_{mn})$ is a double sequence in $R^p$,
then X is convergent if and only if for each $\varepsilon > 0$ there is a natural number
$M(\varepsilon)$ such that for all $m, n, r, s \geq M(\varepsilon)$, then

$$\|x_{mn} - x_{rs}\| < \varepsilon.$$
We shall not pursue in any more detail that part of the theory of double
sequences which is parallel to the theory of (single) sequences. Rather, we
propose to look briefly at the relation between the limit as defined in 19.4
and the “iterated” limits.
To begin with, we note that a double sequence can be regarded (in at
least two ways) as giving a sequence of sequences! On one hand, we can
regard each row in the array given in (19.2) as a sequence in $R^p$. Thus the
first row in the array (19.2) yields the sequence $Y_1 = (x_{1n} : n \in N) =
(x_{11}, x_{12}, \ldots, x_{1n}, \ldots)$; the second row in (19.2) yields the sequence $Y_2 =
(x_{2n} : n \in N)$; etc. It makes perfectly good sense to consider the limits of
the row sequences $Y_1, Y_2, \ldots, Y_m, \ldots$ (when these limits exist).
Supposing that these limits exist and denoting them by $y_1, y_2, \ldots, y_m, \ldots$,
we obtain a sequence of elements in $R^p$ which might well be examined for
convergence. Thus we are considering the existence of $y = \lim (y_m)$.
Since the elements $y_m$ are given by $y_m = \lim Y_m$ where $Y_m = (x_{mn} : n \in N)$,
we are led to denote the limit $y = \lim (y_m)$ (when it exists) by the expression
$$y = \lim_m \lim_n (x_{mn}).$$
We shall refer to y as an iterated limit of the double sequence (or more
precisely as the row iterated limit of this double sequence).
What has been done for rows can equally well be done for columns.
Thus we form the sequences
$$Z_1 = (x_{m1} : m \in N), \quad Z_2 = (x_{m2} : m \in N),$$

and so forth. Supposing that the limits $z_1 = \lim Z_1$, $z_2 = \lim Z_2, \ldots$, exist,
we can then consider $z = \lim (z_n)$. When this latter limit exists, we denote
it by

$$z = \lim_n \lim_m (x_{mn}),$$

and refer to z as an iterated limit, or the column iterated limit of the double
sequence $X = (x_{mn})$.
The first question we might ask is: if the double limit of the sequence
X =(Xmn) exists, then do the iterated limits exist? The answer to this
question may come as a surprise to the reader; it is negative. To see this,
let X be the double sequence in R which is given by $x_{mn} =
(-1)^{m+n}(1/m + 1/n)$; then it is readily seen that the double limit of this
sequence exists and is 0. However, it is also readily verified that none of
the sequences
$$Y_1 = (x_{1n} : n \in N), \ldots, Y_m = (x_{mn} : n \in N), \ldots$$
has a limit. Hence neither iterated limit can possibly exist, since none of
the “‘inner’’ limits exists.
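The behaviour of this example can be observed numerically. The sketch below is only illustrative (the function name `x` and the particular cut-offs are our choices): it checks that the terms with both indices at least N are bounded by 2/N, while a fixed row keeps alternating in sign with magnitude larger than 1/m.

```python
def x(m, n):
    """x_mn = (-1)^(m+n) * (1/m + 1/n)."""
    return (-1) ** (m + n) * (1.0 / m + 1.0 / n)


# Double limit: |x_mn| <= 1/m + 1/n <= 2/N whenever m, n >= N.
N = 1000
tail_bound = max(abs(x(m, n)) for m in range(N, N + 50) for n in range(N, N + 50))

# Row m = 3: far along the row, consecutive terms still have opposite
# signs and magnitude greater than 1/3, so the row cannot converge.
row = [x(3, n) for n in range(10_000, 10_006)]
alternates = all(row[i] * row[i + 1] < 0 for i in range(len(row) - 1))
stays_large = all(abs(t) > 1.0 / 3 for t in row)
```

The computation samples only finitely many terms, so it suggests rather than proves the two claims made in the text.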
The next question is: if the double limit exists and if one of the iterated
limits exists, then does this iterated limit equal the double limit? This time
the answer is affirmative. In fact, we shall now establish a somewhat
stronger result.
19.6 DOUBLE LIMIT THEOREM. If the double limit $x = \lim_{mn} (x_{mn})$
exists, and if for each natural number m the limit $y_m = \lim_n (x_{mn})$ exists, then
the iterated limit $\lim_m \lim_n (x_{mn})$ exists and equals x.
PROOF. By hypothesis, given $\varepsilon > 0$ there is a natural number $N(\varepsilon)$ such
that if $m, n \geq N(\varepsilon)$, then $\|x_{mn} - x\| < \varepsilon$. Again by hypothesis, the limits
$y_m = \lim_n (x_{mn})$ exist, and from the above inequality and Lemma 15.8 it
follows that $\|y_m - x\| \leq \varepsilon$ for all $m \geq N(\varepsilon)$. Therefore, we conclude that
$x = \lim (y_m)$. Q.E.D.
The preceding result shows that if the double limit exists, then the only
thing that can prevent the iterated limits from existing and being equal to
the double limit is that the “inner’’ limits may not exist. More precisely,
we have the following result.
19.7 COROLLARY. Suppose the double limit exists and that the limits
$$y_m = \lim_n (x_{mn}), \quad z_n = \lim_m (x_{mn})$$
exist for all natural numbers m, n. Then the iterated limits
$$\lim_m \lim_n (x_{mn}), \quad \lim_n \lim_m (x_{mn})$$
exist and equal the double limit.


We next inquire as to whether the existence and equality of the two
iterated limits implies the existence of the double limit. The answer is no.
This is seen by examining the double sequence $X = (x_{mn})$ in R defined by
$x_{mn} = 1$ when $m \neq n$ and $x_{mn} = 0$ when $m = n$. Here both iterated limits
exist and are equal, but the double limit does not exist. However, under
some additional conditions, we can establish the existence of the double
limit from the existence of one of the iterated limits.
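A small computation makes the failure of the double limit visible. The following sketch is ours (the helper `eventually_constant_value` and the cut-offs are illustrative assumptions): every row and every column is eventually equal to 1, yet beyond any index N the array still contains both the value 0 (on the diagonal) and the value 1.

```python
def x(m, n):
    """x_mn = 1 when m != n and x_mn = 0 when m == n."""
    return 0 if m == n else 1


def eventually_constant_value(seq, start, count=1000):
    """Return the common value of seq(k) for start <= k < start + count, or None."""
    values = {seq(k) for k in range(start, start + count)}
    return values.pop() if len(values) == 1 else None


# Inner limits: for fixed m, x_mn = 1 for every n > m (and symmetrically for columns).
row_limits = [eventually_constant_value(lambda n, m=m: x(m, n), m + 1) for m in range(1, 20)]
col_limits = [eventually_constant_value(lambda m, n=n: x(m, n), n + 1) for n in range(1, 20)]

# Beyond any N, both 0 and 1 occur among the x_mn with m, n >= N.
N = 500
tail_values = {x(m, n) for m in range(N, N + 5) for n in range(N, N + 5)}
```

Since the inner limits are all 1, both iterated limits equal 1, while the set `tail_values` shows why no single number can serve as the double limit.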

19.8 DEFINITION. For each natural number m, let $Y_m = (x_{mn})$ be a
sequence in $R^p$ which converges to $y_m$. We say that the sequences
$\{Y_m : m \in N\}$ are uniformly convergent if, for each $\varepsilon > 0$ there is a natural
number $N(\varepsilon)$ such that if $n \geq N(\varepsilon)$, then $\|x_{mn} - y_m\| < \varepsilon$ for all natural
numbers m.

The reader will do well to compare this definition with Definition 17.4
and observe that they are of the same character. Partly in order to
motivate Theorem 19.10 to follow, we show that if each of the sequences
$Y_m$ is convergent, then the existence of the double limit implies that the
sequences $\{Y_m : m \in N\}$ are uniformly convergent.
19.9 LEMMA. If the double limit of the double sequence $X = (x_{mn})$
exists and if, for each natural number m, the sequence $Y_m = (x_{mn} : n \in N)$ is
convergent, then this collection is uniformly convergent.
PROOF. Since the double limit exists, given $\varepsilon > 0$ there is a natural
number $N(\varepsilon)$ such that if $m, n \geq N(\varepsilon)$, then $\|x_{mn} - x\| < \varepsilon$. By hypothesis,
the sequence $Y_m = (x_{mn} : n \in N)$ converges to an element $y_m$ and, applying
Lemma 15.8, we find that if $m \geq N(\varepsilon)$, then $\|y_m - x\| \leq \varepsilon$. Thus if $m, n \geq
N(\varepsilon)$, we infer that

$$\|x_{mn} - y_m\| \leq \|x_{mn} - x\| + \|x - y_m\| < 2\varepsilon.$$

In addition, for $m = 1, 2, \ldots, N(\varepsilon) - 1$ the sequence $Y_m$ converges to $y_m$;
hence there is a natural number $K(\varepsilon)$ such that if $n \geq K(\varepsilon)$, then
$$\|x_{mn} - y_m\| < \varepsilon, \quad m = 1, 2, \ldots, N(\varepsilon) - 1.$$

Letting $M(\varepsilon) = \sup \{N(\varepsilon), K(\varepsilon)\}$, we conclude that if $n \geq M(\varepsilon)$, then for
any value of m we have
$$\|x_{mn} - y_m\| < 2\varepsilon.$$
This establishes the uniformity of the convergence of the sequences
$\{Y_m : m \in N\}$. Q.E.D.
The preceding lemma shows that, under the hypothesis that the sequences
$Y_m$ converge, the uniform convergence of this collection of
sequences is a necessary condition for the existence of the double limit.
We now establish a result in the converse direction.

19.10 ITERATED LIMIT THEOREM. Suppose that the single limits

$$y_m = \lim_n (x_{mn}), \quad z_n = \lim_m (x_{mn}),$$

exist for all $m, n \in N$, and that the convergence of one of these collections is
uniform. Then both iterated limits and the double limit exist and all three are
equal.
PROOF. Suppose that the convergence of the collection $\{Y_m : m \in N\}$ is
uniform. Hence given $\varepsilon > 0$, there is a natural number $N(\varepsilon)$ such that if
$n \geq N(\varepsilon)$, then
$$\|x_{mn} - y_m\| < \varepsilon \tag{19.3}$$
for all natural numbers m. To show that $\lim (y_m)$ exists, take a fixed
number $q \geq N(\varepsilon)$. Since $z_q = \lim (x_{rq} : r \in N)$ exists, we know that if $r, s \geq
R(\varepsilon, q)$, then

$$\|y_r - y_s\| \leq \|y_r - x_{rq}\| + \|x_{rq} - x_{sq}\| + \|x_{sq} - y_s\| < 3\varepsilon.$$

Therefore, $(y_r)$ is a Cauchy sequence and converges to an element y in $R^p$.
This establishes the existence of the iterated limit
$$y = \lim_m (y_m) = \lim_m \lim_n (x_{mn}).$$

We now show that the double limit exists. Since $y = \lim (y_m)$, given
$\varepsilon > 0$ there is an $M(\varepsilon)$ such that if $m \geq M(\varepsilon)$, then $\|y_m - y\| < \varepsilon$. Letting
$K(\varepsilon) = \sup \{N(\varepsilon), M(\varepsilon)\}$, we again use (19.3) to conclude that if $m, n \geq
K(\varepsilon)$, then

$$\|x_{mn} - y\| \leq \|x_{mn} - y_m\| + \|y_m - y\| < 2\varepsilon.$$

This proves that the double limit exists and equals y.
Finally, to show that the other iterated limit exists and equals y, we make
use of Theorem 19.6 or its corollary. Q.E.D.

It might be conjectured that, although the proof just given makes use of the
existence of both collections of single limits and the uniformity of one of them, the
conclusion may follow with the existence (and uniformity) of just one collection of
single limits. We leave it to the reader to investigate the truth or falsity of this
conjecture.

Exercises

19.A. Establish the following relations:

(a) $(n^2 + 2) \sim (n^2 - 3)$, (b) $(n^2 + 2) = o(n^3)$,
(c) $((-1)^n n^2) = O(n^2)$, (d) $((-1)^n n^2) = o(n^3)$,
(e) $(\sqrt{n+1} - \sqrt{n}) \sim (1/2\sqrt{n})$, (f) $(\sin n) = O(1)$.
19.B. Let X, Y, and Z be sequences with non-zero elements. Show that:
(a) $X \sim X$.
(b) If $X \sim Y$, then $Y \sim X$.
(c) If $X \sim Y$ and $Y \sim Z$, then $X \sim Z$.
19.C. If $X_1 = O(Y)$ and $X_2 = O(Y)$, we conclude that $X_1 + X_2 = O(Y)$ and summarize
this in the “equation”
(a) $O(Y) + O(Y) = O(Y)$. Give similar interpretations for and prove that
(b) $o(Y) + o(Y) = o(Y)$.
(c) If $c \neq 0$, then $o(cY) = o(Y)$ and $O(cY) = O(Y)$.
(d) $O(o(Y)) = o(Y)$, $o(O(Y)) = o(Y)$.
(e) $O(X)O(Y) = O(XY)$, $O(X)o(Y) = o(XY)$, $o(X)o(Y) = o(XY)$.
19.D. Show that $X = o(Y)$ and $Y = o(X)$ cannot hold simultaneously. Give an
example of sequences such that $X = O(Y)$ but $Y \neq O(X)$.
19.E. If X is a monotone sequence in R, show that the sequence of arithmetic
means is monotone.
19.F. If $X = (x_n)$ is a sequence in R and $(\sigma_n)$ is the sequence of arithmetic means,
then $\limsup (\sigma_n) \leq \limsup (x_n)$. Give an example where inequality holds.
19.G. If $X = (x_n)$ is a sequence of positive real numbers, then is $(\sigma_n)$ monotone
increasing?
19.H. If a sequence $X = (x_n)$ in $R^p$ is Cesàro summable, then $X = o(n)$. (Hint:
$x_n = n\sigma_n - (n-1)\sigma_{n-1}$.)
19.I. Let X be a monotone sequence in R. Is it true that X is Cesàro summable
if and only if it is convergent?
19.J. Give a proof of Theorem 19.5.
19.K. Consider the existence of the double and the iterated limits of the double
sequences $(x_{mn})$, where $x_{mn}$ is given by

(a) CI", ) —, @ 444,


@—,
m
@ cor(s+2),
nfl 1
© mn

19.L. Is a convergent double sequence bounded?


19.M. If $X = (x_{mn})$ is a convergent double sequence of real numbers, and if for
each $m \in N$, the limit $y_m = \limsup_n (x_{mn})$ exists, then we have $\lim_{mn} (x_{mn}) = \lim_m (y_m)$.
19.N. Which of the double sequences in Exercise 19.K are such that the collec-
tion {Y,, = lim (x,,,):m €N} is uniformly convergent?
19.O. Let $X = (x_{mn})$ be a bounded double sequence in R with the property that
for each $m \in N$ the sequence $Y_m = (x_{mn} : n \in N)$ is monotone increasing and for each
$n \in N$ the sequence $Z_n = (x_{mn} : m \in N)$ is monotone increasing. Is it true that the
iterated limits exist and are equal? Does the double limit need to exist?
19.P. Discuss the problem posed in the final paragraph of this section.
IV
CONTINUOUS
FUNCTIONS

We now begin our study of the most important class of functions in
analysis: the continuous functions. In this chapter, we shall blend the
results of Chapters II and III and reap a rich harvest of theorems which
have considerable depth and utility.
Section 20 introduces and examines the notion of continuity. In Section
21 we introduce the important class of linear functions. The fundamental
Section 22 studies the properties of continuous functions on compact and
connected sets, and Section 23 discusses the notion of uniform continuity.
The results of these four sections will be used repeatedly throughout the
remainder of the book. Sequences of continuous functions are studied in
Section 24, and upper and lower limits are studied in 25. The final section
presents some interesting and important results, but these results will not
be applied in later sections.
It is not assumed that the reader has any previous familiarity with a
rigorous treatment of continuous functions. However, in some of the
examples and exercises, we use the exponential, the logarithm, and the
trigonometric functions in order to give some non-trivial examples.

Section 20 Local Properties of Continuous Functions

We shall suppose that f is a function with domain D(f) contained in $R^p$
and with range R(f) contained in $R^q$. In general we shall not require that
$D(f) = R^p$ or that $p = q$. We shall first define continuity in terms of
neighborhoods and then mention a few equivalent conditions.
20.1 DEFINITION. If $a \in D(f)$, then we say that f is continuous at a
if for every neighborhood V of f(a) there exists a neighborhood U (depending
on V) of a such that if x is any element of $U \cap D(f)$, then f(x) is an
element of V. (See Figure 20.1.) If $A \subseteq D(f)$, then we say that f is
continuous on A in case it is continuous at every point of A.

[Figure 20.1]

Sometimes it is said that a continuous function is one which ‘‘sends neighboring
points into neighboring points.’ This intuitive phrase is to be avoided if it leads
one to believe that the image of a neighborhood of a need be a neighborhood of
f(a). (Consider x -> |x| at x =0.)

We now give two equivalent statements which could have been used as
the definition.
20.2 THEOREM. Let a be a point in the domain D(f) of the function f.
The following statements are equivalent:
(a) f is continuous at a.
(b) If $\varepsilon > 0$, there exists a number $\delta(\varepsilon) > 0$ such that if $x \in D(f)$ and
$\|x - a\| < \delta(\varepsilon)$, then $\|f(x) - f(a)\| < \varepsilon$.
(c) If $(x_n)$ is any sequence of elements of D(f) which converges to a, then
the sequence $(f(x_n))$ converges to f(a).
PROOF. Suppose that (a) holds and that $\varepsilon > 0$; then the ball $V_\varepsilon =
\{y \in R^q : \|y - f(a)\| < \varepsilon\}$ is a neighborhood of the point f(a). By Definition
20.1 there is a neighborhood U of a, such that if $x \in U \cap D(f)$, then
$f(x) \in V_\varepsilon$. Since U is a neighborhood of a, there is a positive real number
$\delta(\varepsilon)$ such that the open ball with radius $\delta(\varepsilon)$ and center a is contained
in U. Therefore, condition (a) implies (b).
Suppose that (b) holds and let $(x_n)$ be a sequence of elements in D(f)
which converges to a. Let $\varepsilon > 0$ and invoke condition (b) to obtain a
$\delta(\varepsilon) > 0$ with the property stated in (b). Because of the convergence of
$(x_n)$ to a, there exists a natural number $N(\delta(\varepsilon))$ such that if $n \geq N(\delta(\varepsilon))$,
then $\|x_n - a\| < \delta(\varepsilon)$. Since each $x_n \in D(f)$ it follows from (b) that
$\|f(x_n) - f(a)\| < \varepsilon$, proving that (c) holds.
Finally, we shall argue indirectly and show that if condition (a) does not
hold, then condition (c) does not hold. If (a) fails, then there exists a
neighborhood $V_0$ of f(a) such that for any neighborhood U of a, there is an
element $x_U$ belonging to $D(f) \cap U$ but such that $f(x_U)$ does not belong to
$V_0$. For each natural number n consider the neighborhood $U_n$ of a defined
by $U_n = \{x \in R^p : \|x - a\| < 1/n\}$; from the preceding sentence, for each n in
N there is an element $x_n$ belonging to $D(f) \cap U_n$ but such that $f(x_n)$ does
not belong to $V_0$. The sequence $(x_n)$ just constructed belongs to D(f) and
converges to a, yet none of the elements of the sequence $(f(x_n))$ belongs to
the neighborhood $V_0$ of f(a). Hence we have constructed a sequence for
which the condition (c) does not hold. This shows that part (c) implies (a).
Q.E.D.
The following useful discontinuity criterion is a consequence of what we
have just done.
20.3 DISCONTINUITY CRITERION. The function f is not continuous at
a point a in D(f) if and only if there is a sequence $(x_n)$ of elements in D(f)
which converges to a but such that the sequence $(f(x_n))$ of images does not
converge to f(a).
The next result is a simple reformulation of the definition. We recall
from Definition 2.12 that the inverse image $f^{-1}(H)$ of a subset H of $R^q$
under f is defined by
$$f^{-1}(H) = \{x \in D(f) : f(x) \in H\}.$$
20.4 THEOREM. The function f is continuous at a point a in D(f) if
and only if for every neighborhood V of f(a) there is a neighborhood $V_1$ of a
such that

$$V_1 \cap D(f) = f^{-1}(V). \tag{20.1}$$

PROOF. If $V_1$ is a neighborhood of a satisfying this equation, then we
can take $U = V_1$. Conversely, if Definition 20.1 is satisfied, then we take
$V_1 = U \cup f^{-1}(V)$ to obtain equation (20.1). Q.E.D.
Before we push the theory any further, we shall pause to give some
examples. For simplicity, most of the examples are for the case where
$R^p = R^q = R$.
20.5 EXAMPLES. (a) Let $D(f) = R$ and let f be the “constant” function
defined to be equal to the real number c for all real numbers x. Then f
is continuous at every point of R; in fact, we can take the neighborhood U
of Definition 20.1 to be equal to R for any point a in D(f). Similarly, the
function g defined by
$$g(x) = \begin{cases} 1, & 0 \leq x \leq 1, \\ 2, & 2 \leq x \leq 3, \end{cases}$$
is continuous at each point in its domain.
(b) Let $D(f) = R$ and let f be the “identity” function defined by $f(x) = x$,
$x \in R$. (See Figure 20.2.) If a is a given real number, let $\varepsilon > 0$ and let
$\delta(\varepsilon) = \varepsilon$. Then, if $|x - a| < \delta(\varepsilon)$, we have $|f(x) - f(a)| = |x - a| < \varepsilon$.

[Figure 20.2]
(c) Let $D(f) = R$ and let f be the “squaring” function defined by $f(x) =
x^2$, $x \in R$. Let a belong to R and let $\varepsilon > 0$; then $|f(x) - f(a)| = |x^2 - a^2| =
|x - a|\,|x + a|$. We wish to make the above expression less than $\varepsilon$ by
making $|x - a|$ sufficiently small. If $a = 0$, then we choose $\delta(\varepsilon) = \sqrt{\varepsilon}$. If
$a \neq 0$, then we want to obtain a bound for $|x + a|$ on a neighborhood of a.
For example, if $|x - a| < |a|$, then $0 < |x| < 2|a|$ and $|x + a| \leq |x| + |a|
< 3|a|$. Hence
$$|f(x) - f(a)| \leq 3|a|\,|x - a|, \tag{20.2}$$
provided that $|x - a| < |a|$. Thus if we define $\delta(\varepsilon) = \inf \{|a|, \varepsilon/3|a|\}$, then
when $|x - a| < \delta(\varepsilon)$, the inequality (20.2) holds and we have $|f(x) - f(a)| <
\varepsilon$.
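The choice of $\delta(\varepsilon)$ made in (c) can be tested by random sampling. The sketch below is a numerical check under our own conventions (the names `delta` and `check`, the number of trials and the shrink factor 0.999 are illustrative assumptions); it draws points x with $|x - a| < \delta(\varepsilon)$ and verifies that $|x^2 - a^2| < \varepsilon$ for each of them.

```python
import random


def delta(a, eps):
    """delta(eps) = inf{|a|, eps/(3|a|)} for a != 0, and sqrt(eps) for a == 0."""
    if a == 0:
        return eps ** 0.5
    return min(abs(a), eps / (3 * abs(a)))


def check(a, eps, trials=10_000, seed=0):
    """Sample x with |x - a| < delta(eps) and test |x^2 - a^2| < eps."""
    rng = random.Random(seed)
    d = delta(a, eps)
    for _ in range(trials):
        x = a + 0.999 * rng.uniform(-d, d)  # keeps |x - a| strictly below delta
        if abs(x * x - a * a) >= eps:
            return False
    return True
```

For instance, `check(2.0, 0.1)` samples points around a = 2. Passing such a check does not replace the estimate (20.2); it only confirms that the stated value of $\delta(\varepsilon)$ behaves as claimed on the sampled points.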
(d) We consider the same function as in (c) but use a slightly different
technique. Instead of factoring $x^2 - a^2$, we write it as a polynomial in
$x - a$. Thus

$$x^2 - a^2 = (x^2 - 2ax + a^2) + (2ax - 2a^2) = (x - a)^2 + 2a(x - a).$$

Using the Triangle Inequality, we obtain

$$|f(x) - f(a)| \leq |x - a|^2 + 2|a|\,|x - a|.$$

If $\delta \leq 1$ and $|x - a| < \delta$, then $|x - a|^2 < \delta^2 \leq \delta$ and the term on the right side
is dominated by $\delta + 2|a|\delta = \delta(1 + 2|a|)$. Hence we are led to choose
$$\delta(\varepsilon) = \inf \left\{1, \frac{\varepsilon}{1 + 2|a|}\right\}.$$

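(e) Consider $D(f) = \{x \in R : x \neq 0\}$ and let f be defined by $f(x) = 1/x$,
$x \in D(f)$. If $a \in D(f)$, then

$$|f(x) - f(a)| = |1/x - 1/a| = \frac{|x - a|}{|ax|}.$$

Again we wish to find a bound for the coefficient of $|x - a|$ which is valid in
a neighborhood of $a \neq 0$. We note that if $|x - a| < \tfrac{1}{2}|a|$, then $\tfrac{1}{2}|a| < |x|$, and
we have

$$|f(x) - f(a)| \leq \frac{2}{|a|^2}\,|x - a|.$$

Thus we are led to take $\delta(\varepsilon) = \inf \{\tfrac{1}{2}|a|, \tfrac{1}{2}\varepsilon |a|^2\}$.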
(f) Let f be defined for $D(f) = R$ by

$$f(x) = \begin{cases} 0, & x \leq 0, \\ 1, & x > 0. \end{cases}$$

It may be seen that f is continuous at all points $a \neq 0$. We shall show that f
is not continuous at 0 by using the Discontinuity Criterion 20.3. In fact, if
$x_n = 1/n$, then the sequence $(f(1/n)) = (1)$ does not converge to f(0). (See
Figure 20.3.)
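The use of the Discontinuity Criterion in (f) can be traced in a few lines of code. This is a minimal sketch (the variable names are ours): it forms the sequence $x_n = 1/n$, which converges to 0, and observes that the images are all equal to 1 while f(0) = 0.

```python
def f(x):
    """f(x) = 0 for x <= 0 and f(x) = 1 for x > 0."""
    return 0 if x <= 0 else 1


xs = [1 / n for n in range(1, 10_001)]  # x_n = 1/n, a sequence converging to 0
images = [f(x) for x in xs]             # the constant sequence (1)
gap = abs(images[-1] - f(0))            # distance from f(x_n) to f(0) stays equal to 1
```

Since the gap does not shrink as n grows, the sequence of images cannot converge to f(0), which is exactly the situation described by Criterion 20.3.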

(g) Let $D(f) = R$ and let f be Dirichlet’s† discontinuous function defined by

$$f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases}$$

If a is a rational number, let $X = (x_n)$ be a sequence of irrational numbers converging
to a. (Theorem 6.10 assures us of the existence of such a sequence.) Since
$f(x_n) = 0$ for all $n \in N$, the sequence $(f(x_n))$ does not converge to $f(a) = 1$ and f is not
continuous at the rational number a. On the other hand, if b is an irrational
number, then there exists a sequence $Y = (y_n)$ of rational numbers converging to b.
The sequence $(f(y_n))$ does not converge to f(b), so f is not continuous at b.
Therefore, Dirichlet’s function is not continuous at any point.

† PETER GUSTAV LEJEUNE DIRICHLET (1805-1859) was born in the Rhineland and taught at
Berlin for almost thirty years before going to Göttingen as Gauss’ successor. He made
fundamental contributions to number theory and analysis.

[Figure 20.3]

(h) Let $D(f) = \{x \in R : x > 0\}$. For any irrational number $x > 0$, we define $f(x) =
0$; for a rational number of the form m/n, with the natural numbers m, n having no
common factor except 1, we define $f(m/n) = 1/n$. We shall show that f is continuous
at every irrational number in D(f) and discontinuous at every rational number
in D(f). The latter statement follows by taking a sequence of irrational numbers
converging to the given rational number and using the Discontinuity Criterion.
Let a be an irrational number and $\varepsilon > 0$; then there is a natural number n such that
$1/n < \varepsilon$. If $\delta$ is chosen so small that the interval $(a - \delta, a + \delta)$ contains no rational
number with denominator less than n, then it follows that for x in this interval we
have $|f(x) - f(a)| = |f(x)| \leq 1/n < \varepsilon$. Thus f is continuous at the irrational number
a. Therefore, this function is continuous precisely at the irrational points in its
domain.
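The choice of $\delta$ in (h) can be made concrete. In the sketch below, which is only illustrative, the floating-point value of $\sqrt{2}$ stands in for an irrational number a (an assumption, since every floating-point number is in fact rational), and the names `thomae` and `safe_delta` are ours. For a given n, `safe_delta` returns the distance from a to the nearest fraction with denominator less than n, so every rational strictly closer to a has denominator at least n.

```python
from fractions import Fraction
import math


def thomae(x):
    """Value of f at a positive rational x = m/n in lowest terms: f(m/n) = 1/n."""
    return Fraction(1, x.denominator)


def safe_delta(a, n):
    """Distance from a to the nearest fraction p/q with 1 <= q < n (requires n >= 2)."""
    return min(abs(a - round(a * q) / q) for q in range(1, n))


a = math.sqrt(2)      # stand-in for an irrational point
n = 10
d = safe_delta(a, n)

# Some rationals close to a; Fraction reduces them to lowest terms automatically.
candidates = [Fraction(round(a * q), q) for q in range(n, 200)]
near = [r for r in candidates if abs(a - float(r)) < d]
```

Every element of `near` lies in $(a - \delta, a + \delta)$ and so has denominator at least n, which gives `thomae(r) <= 1/n` as in the text.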
(i) This time, let $D(f) = R^2$ and let f be the function on $R^2$ with values in $R^2$
defined by

$$f(x, y) = (2x + y,\; x - 3y).$$

Let (a, b) be a fixed point in $R^2$; we shall show that f is continuous at this point. To
do this, we need to show that we can make the expression

$$\|f(x, y) - f(a, b)\| = \{(2x + y - 2a - b)^2 + (x - 3y - a + 3b)^2\}^{1/2}$$

arbitrarily small by choosing (x, y) sufficiently close to (a, b). Since $\{p^2 + q^2\}^{1/2} \leq
\sqrt{2}\, \sup \{|p|, |q|\}$, it is evidently enough to show that we can make the terms
$$|2x + y - 2a - b|, \quad |x - 3y - a + 3b|,$$
arbitrarily small by choosing (x, y) sufficiently close to (a, b) in $R^2$. In fact, by the
Triangle Inequality,
$$|2x + y - 2a - b| = |2(x - a) + (y - b)| \leq 2|x - a| + |y - b|.$$
Now $|x - a| \leq \{(x - a)^2 + (y - b)^2\}^{1/2} = \|(x, y) - (a, b)\|$, and similarly for $|y - b|$; hence
we have

$$|2x + y - 2a - b| \leq 3\,\|(x, y) - (a, b)\|.$$

Similarly,

$$|x - 3y - a + 3b| \leq |x - a| + 3|y - b| \leq 4\,\|(x, y) - (a, b)\|.$$

Therefore, if $\varepsilon > 0$, we can take $\delta(\varepsilon) = \varepsilon/(4\sqrt{2})$ and be certain that if
$\|(x, y) - (a, b)\| < \delta(\varepsilon)$, then $\|f(x, y) - f(a, b)\| < \varepsilon$, although a larger value of $\delta$ can
be attained by a more refined analysis (for example, by using the Schwarz Inequality
8.7).
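The estimate in (i) lends itself to a sampling check. The following sketch assumes nothing beyond the formulas above; the helper names (`dist`, `delta`, `check`), the number of trials and the factor 0.999 are our own illustrative choices. It draws points within distance $\delta(\varepsilon) = \varepsilon/(4\sqrt{2})$ of (a, b) and tests whether their images lie within $\varepsilon$ of f(a, b).

```python
import math
import random


def f(x, y):
    """The map f(x, y) = (2x + y, x - 3y) of Example 20.5(i)."""
    return (2 * x + y, x - 3 * y)


def dist(p, q):
    """Euclidean distance between two points of the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def delta(eps):
    """The value delta(eps) = eps / (4 * sqrt(2)) chosen in the text."""
    return eps / (4 * math.sqrt(2))


def check(a, b, eps, trials=10_000, seed=0):
    """Sample (x, y) with ||(x, y) - (a, b)|| < delta(eps) and test the images."""
    rng = random.Random(seed)
    d = delta(eps)
    for _ in range(trials):
        r = 0.999 * d * rng.random()       # radius strictly below delta
        t = rng.uniform(0, 2 * math.pi)    # direction
        x, y = a + r * math.cos(t), b + r * math.sin(t)
        if dist(f(x, y), f(a, b)) >= eps:
            return False
    return True
```

For example, `check(1.0, -2.0, 0.01)` tests the point (a, b) = (1, -2). Such a check is consistent with the remark that a larger $\delta$ would also work, since the sampled images stay well inside the $\varepsilon$-ball.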
(j) Again let $D(f) = R^2$ and let f be defined by

$$f(x, y) = (x^2 + y^2,\; 2xy).$$

If (a, b) is a fixed point in $R^2$, then

$$\|f(x, y) - f(a, b)\| = \{(x^2 + y^2 - a^2 - b^2)^2 + (2xy - 2ab)^2\}^{1/2}.$$

As in (i), we examine the two terms on the right side separately. It will be seen that
we need to obtain elementary estimates of magnitude. From the Triangle Inequality,
we have

$$|x^2 + y^2 - a^2 - b^2| \leq |x^2 - a^2| + |y^2 - b^2|.$$

If the point (x, y) is within a distance of 1 of (a, b), then $|x| \leq |a| + 1$ whence
$|x + a| \leq 2|a| + 1$ and $|y| \leq |b| + 1$ so that $|y + b| \leq 2|b| + 1$. Thus we have
$$
\begin{aligned}
|x^2 + y^2 - a^2 - b^2| &\leq |x - a|\,(2|a| + 1) + |y - b|\,(2|b| + 1) \\
&\leq 2(|a| + |b| + 1)\,\|(x, y) - (a, b)\|.
\end{aligned}
$$
In a similar fashion, we have

$$
\begin{aligned}
|2xy - 2ab| = 2\,|xy - xb + xb - ab| &\leq 2|x|\,|y - b| + 2|b|\,|x - a| \\
&\leq 2(|a| + |b| + 1)\,\|(x, y) - (a, b)\|.
\end{aligned}
$$
Therefore, we set

$$\delta(\varepsilon) = \inf \left\{1, \frac{\varepsilon}{2\sqrt{2}\,(|a| + |b| + 1)}\right\};$$

if $\|(x, y) - (a, b)\| < \delta(\varepsilon)$, then we have $\|f(x, y) - f(a, b)\| < \varepsilon$, proving that f is
continuous at the point (a, b).

Combinations of Functions

The next result is a direct consequence of Theorems 15.6 and 20.2(c), so
we shall not write out the details. Alternatively, it could be proved
directly by using arguments quite parallel to those employed in the proof of
Theorem 15.6. We recall that if f and g are functions with domains D(f)
and D(g) in $R^p$ and ranges in $R^q$, then we define their sum f + g, their
difference f - g and their inner product $f \cdot g$ for each x in $D(f) \cap D(g)$ by
the formulas
$$f(x) + g(x), \quad f(x) - g(x), \quad f(x) \cdot g(x).$$
Similarly, if c is a real number and if $\varphi$ is a function with domain $D(\varphi)$ in
$R^p$ and range in R, we define the products cf for x in D(f) and $\varphi f$ for x in
$D(\varphi) \cap D(f)$ by the formulas

$$cf(x), \quad \varphi(x) f(x).$$

In particular, if $\varphi(x) \neq 0$ for $x \in D_0$, then we can define the quotient $f/\varphi$ for
x in $D(f) \cap D_0$ by
$$f(x)/\varphi(x).$$
With these definitions, we now state the result.
20.6 THEOREM. If the functions f, g, $\varphi$ are continuous at a point,
then the algebraic combinations

$$f + g, \quad f - g, \quad f \cdot g, \quad cf, \quad \varphi f \quad \text{and} \quad f/\varphi$$

are also continuous at this point.


There is another algebraic combination that is often useful. If f is
defined on D(f) in $R^p$ to R, we define the absolute value |f| of f to be the
function with range in the real numbers R whose value at x in D(f) is given
by |f(x)|.
20.7 THEOREM. If f is continuous at a point, then |f| is also continuous
there.
PROOF. From the Triangle Inequality, we have

$$\bigl|\,|f(x)| - |f(a)|\,\bigr| \leq |f(x) - f(a)|,$$

from which the result is immediate. Q.E.D.

We recall the notion of the composition of two functions. Let f have
domain D(f) in $R^p$ and range in $R^q$ and let g have domain D(g) in $R^q$ and
range in $R^r$. In Definition 2.2, we defined the composition $h = g \circ f$ to have
domain $D(h) = \{x \in D(f) : f(x) \in D(g)\}$ and for x in D(h) we set $h(x) =
g[f(x)]$. Thus $h = g \circ f$ is a function mapping D(h), which is a subset of
$D(f) \subseteq R^p$, into $R^r$. We now establish the continuity of this function.

20.8 THEOREM. If f is continuous at a and g is continuous at $b = f(a)$,
then the composition $g \circ f$ is continuous at a.
PROOF. Let W be a neighborhood of the point $c = g(b)$. Since g is
continuous at b, there is a neighborhood V of b such that if y belongs to
$V \cap D(g)$, then $g(y) \in W$. Since f is continuous at a, there exists a neighborhood
U of a such that if x belongs to $U \cap D(f)$, then f(x) is in V.
Therefore, if x belongs to $U \cap D(g \circ f)$, then f(x) is in $V \cap D(g)$ and $g[f(x)]$
belongs to W. (See Figure 20.4.) This shows that $h = g \circ f$ is continuous at
a. Q.E.D.

Figure 20.4

Exercises

20.A. Prove that if f is defined for $x \geq 0$ by $f(x) = \sqrt{x}$, then f is continuous at
every point of its domain.
20.B. Show that a “polynomial function”; that is, a function f with the form

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_1 x + a_0 \quad \text{for } x \in R,$$

is continuous at every point of R.
20.C. Show that a “rational function” (that is, the quotient of two polynomial
functions) is continuous at every point where it is defined.
20.D. Use the Schwarz Inequality to show that one can take $\delta(\varepsilon) = \varepsilon/\sqrt{15}$ in
Example 20.5(i).
20.E. Let f be the function on R to R defined by
$$f(x) = \begin{cases} x, & x \text{ irrational}, \\ 1 - x, & x \text{ rational}. \end{cases}$$
Show that f is continuous at $x = \tfrac{1}{2}$ and discontinuous elsewhere.
20.F. Let f be continuous on R to R. Show that if $f(x) = 0$ for rational x then
$f(x) = 0$ for all x in R.
20.G. Let f and g be continuous on R to R. Is it true that $f(x) = g(x)$ for $x \in R$
if and only if $f(y) = g(y)$ for all rational numbers y in R?
20.H. Use the inequality $|\sin x| \leq |x|$ for $x \in R$ to show that the sine function is
continuous at $x = 0$. Use this fact, together with the identity
$$\sin x - \sin u = 2 \sin \tfrac{1}{2}(x - u) \cos \tfrac{1}{2}(x + u),$$
to prove that the sine function is continuous at any point of R.
20.I. Using the results of the preceding exercise, show that the function g,
defined on R to R by
$$g(x) = \begin{cases} x \sin (1/x), & x \neq 0, \\ 0, & x = 0, \end{cases}$$
is continuous at every point. Sketch a graph of this function.
20.J. Let h be defined for $x \neq 0$, $x \in R$, by
$$h(x) = \sin (1/x), \quad x \neq 0.$$
Show that no matter how h is defined at $x = 0$, it will be discontinuous at $x = 0$.
20.K. Let $F : R^2 \to R$ be defined by

$$F(x, y) = \begin{cases} x^2 + y^2, & \text{if both } x, y \in Q, \\ 0, & \text{otherwise}. \end{cases}$$

Determine the points where F is continuous.
20.L. We say that a function f on R to R is additive if it satisfies

$$f(x + y) = f(x) + f(y)$$
for all $x, y \in R$. Show that an additive function which is continuous at $x = 0$ is
continuous at any point of R. Show that a monotone additive function is continuous
at every point.
20.M. Suppose that f is a continuous additive function on R. If $c = f(1)$, show
that $f(x) = cx$ for all x in R. (Hint: first show that if r is a rational number, then
$f(r) = cr$.)
20.N. Let $g : R \to R$ satisfy the relation

$$g(x + y) = g(x)g(y) \quad \text{for } x, y \in R.$$
Show that, if g is continuous at $x = 0$, then g is continuous at every point. Also, if
$g(a) = 0$ for some $a \in R$, then $g(x) = 0$ for all $x \in R$.
20.O. If |f| is continuous at a point, then is it true that f is also continuous at this
point?
20.P. Let $f, g : R^p \to R$ be continuous at a point $a \in R^p$ and let h, k be defined on
$R^p$ to R by
$$h(x) = \sup \{f(x), g(x)\}, \quad k(x) = \inf \{f(x), g(x)\}.$$
Show that h and k are continuous at a. (Hint: note that $\sup \{b, c\} = \tfrac{1}{2}(b + c + |b - c|)$
and $\inf \{b, c\} = \tfrac{1}{2}(b + c - |b - c|)$.)
20.Q. If $x \in R$, we often define [x] to be the greatest integer $n \in Z$ such that
$n \leq x$. The map $x \mapsto [x]$ is called the greatest integer function. Sketch the graphs
and determine the points of continuity of the functions defined for $x \in R$ by

(a) $f(x) = [x]$, (b) $g(x) = x - [x]$,


(c) h(x) =[2 sin x], (d) k(x) =sin3[x].
20.R. A function f defined on an interval $I \subseteq R$ to R is said to be increasing on I
if $x \leq x'$, $x, x' \in I$ imply that $f(x) \leq f(x')$. It is said to be strictly increasing on I if
$x < x'$, $x, x' \in I$ imply that $f(x) < f(x')$. Similar definitions can be given for decreasing
and strictly decreasing functions. A function which is either increasing or
decreasing on an interval is said to be monotone on this interval.
(a) If f is increasing on I, then f is continuous at an interior point $c \in I$ if and only
if for every $\varepsilon > 0$ there are points $x_1, x_2 \in I$, $x_1 < c < x_2$, such that $f(x_2) - f(x_1) < \varepsilon$.
(b) If f is increasing on I, then f is continuous at an interior point $c \in I$ if and only
if
$$\sup \{f(x) : x < c\} = f(c) = \inf \{f(x) : x > c\}.$$

20.S. Suppose that f is increasing on $I = [a, b]$ in the sense of the preceding
exercise. Let

$$j_c = \inf \{f(x) : x > c\} - \sup \{f(x) : x < c\}.$$

If $j_c > 0$, we say that f has a jump of $j_c$ at the point c.
(a) If $n \in N$, show that there can only be a finite set of points in I at which f has a
jump exceeding 1/n.
(b) Show that an increasing function can have at most a countable set of points of
discontinuity.

Projects

20.α. Let g be a function on R to R which is not identically zero and which
satisfies the functional equation

$$g(x + y) = g(x)g(y) \quad \text{for } x, y \in R. \tag{*}$$

The purpose of this project is to show that g must be an “exponential function.”
(a) Show that g is continuous at every point of R if and only if it is continuous at
one point of R.
(b) Show that $g(x) > 0$ for all $x \in R$.
(c) Prove that $g(0) = 1$. If $a = g(1)$, then $a > 0$ and $g(r) = a^r$ for all $r \in Q$.
(d) The function g is strictly increasing, is constant, or is strictly decreasing
according as $g(1) > 1$, $g(1) = 1$, or $0 < g(1) < 1$.
(e) If $g(x) > 1$ for x in some interval $(0, \delta)$, $\delta > 0$, then g is strictly increasing and
continuous on R.
(f) If $a > 0$, then there exists at most one continuous function g satisfying (*) such
that $g(1) = a$.
(g) Suppose that $a > 1$. Referring to Project 6.β, show that there exists a unique
continuous function satisfying (*) such that $g(1) = a$.

20.β. Let $P = \{x \in R : x > 0\}$ and let $h : P \to R$ be a function not identically zero
which satisfies the functional equation

$$h(xy) = h(x) + h(y) \quad \text{for } x, y \in P. \tag{†}$$

The purpose of this project is to show that h must be a “logarithmic function.”
(a) Show that h is continuous at every point of P if and only if it is continuous at
one point of P.
(b) Show that h cannot be defined at $x = 0$ to satisfy (†) for $\{x \in R : x \geq 0\}$.
(c) Prove that $h(1) = 0$. If $x > 0$ and $r \in Q$, then $h(x^r) = r\,h(x)$.
(d) Show that if $h(x) \geq 0$ on some interval in $\{x \in R : x \geq 1\}$, then h is strictly
increasing and continuous on P.
(e) If h is continuous, show that $h(x) \neq 0$ for $x \neq 1$. Also, either $h(x) > 0$ for
$x > 1$, or $h(x) < 0$ for $x > 1$.
(f) If $b > 1$, show that there exists at most one continuous function on P which
satisfies (†) and is such that $h(b) = 1$.
(g) Suppose that $b > 1$. Referring to Project 6.γ, show that there exists a unique
continuous function satisfying (†) such that $h(b) = 1$.

Section 21. Linear Functions

The preceding discussion pertained to arbitrary functions defined on a
part of $R^p$ to $R^q$. Before we continue that discussion we want to introduce
a relatively simple but extremely important class of functions, namely the
“linear functions,” which arise in very many applications.
21.1 DEFINITION. A function f with domain $R^p$ and range in $R^q$ is
said to be linear in case
(21.1) $f(ax + by) = af(x) + bf(y)$
for all a, b in R and x, y in $R^p$.
It follows from (21.1) by induction that if a, b, ..., c are n real
numbers and x, y, ..., z are n elements of $R^p$, then

$f(ax + by + \cdots + cz) = af(x) + bf(y) + \cdots + cf(z).$
It is readily seen that the functions in Examples 20.6(b) and 20.6(i) are
linear functions for the case p = q = 1 and p = q = 2, respectively. In fact it
is not difficult to characterize the most general linear function from $R^p$ to
$R^q$.
21.2 THEOREM. If f is a linear function with domain $R^p$ and range in
$R^q$, then there are pq real numbers $(c_{ij})$, $1 \le i \le q$, $1 \le j \le p$, such that if
$x = (x_1, x_2, \ldots, x_p)$ is any point in $R^p$, and if $y = (y_1, y_2, \ldots, y_q) = f(x)$ is its
image under f, then

(21.2)
$$\begin{aligned}
y_1 &= c_{11}x_1 + c_{12}x_2 + \cdots + c_{1p}x_p,\\
y_2 &= c_{21}x_1 + c_{22}x_2 + \cdots + c_{2p}x_p,\\
&\;\;\vdots\\
y_q &= c_{q1}x_1 + c_{q2}x_2 + \cdots + c_{qp}x_p.
\end{aligned}$$

Conversely, if $(c_{ij})$ is a collection of pq real numbers, then the function which
assigns to x in $R^p$ the element y in $R^q$ according to the equations (21.2) is a
linear function with domain $R^p$ and range in $R^q$.
PROOF. Let $e_1, e_2, \ldots, e_p$ be the elements of $R^p$ given by $e_1 =
(1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, ..., $e_p = (0, 0, \ldots, 1)$. We examine the
images of these vectors under the linear function f. Suppose that

(21.3)
$$\begin{aligned}
f(e_1) &= (c_{11}, c_{21}, \ldots, c_{q1}),\\
f(e_2) &= (c_{12}, c_{22}, \ldots, c_{q2}),\\
&\;\;\vdots\\
f(e_p) &= (c_{1p}, c_{2p}, \ldots, c_{qp}).
\end{aligned}$$

Thus the real number $c_{ij}$ is the ith coordinate of the point $f(e_j)$.

An arbitrary element $x = (x_1, x_2, \ldots, x_p)$ of $R^p$ can be expressed simply
in terms of the vectors $e_1, e_2, \ldots, e_p$; in fact,

$$x = x_1e_1 + x_2e_2 + \cdots + x_pe_p.$$

Since f is linear, it follows that

$$f(x) = x_1f(e_1) + x_2f(e_2) + \cdots + x_pf(e_p).$$

If we use the equations (21.3), we have

$$\begin{aligned}
f(x) &= x_1(c_{11}, c_{21}, \ldots, c_{q1}) + x_2(c_{12}, c_{22}, \ldots, c_{q2})
        + \cdots + x_p(c_{1p}, c_{2p}, \ldots, c_{qp})\\
     &= (c_{11}x_1, c_{21}x_1, \ldots, c_{q1}x_1) + (c_{12}x_2, c_{22}x_2, \ldots, c_{q2}x_2)
        + \cdots + (c_{1p}x_p, c_{2p}x_p, \ldots, c_{qp}x_p)\\
     &= (c_{11}x_1 + c_{12}x_2 + \cdots + c_{1p}x_p,\;
         c_{21}x_1 + c_{22}x_2 + \cdots + c_{2p}x_p,\;
         \ldots,\;
         c_{q1}x_1 + c_{q2}x_2 + \cdots + c_{qp}x_p).
\end{aligned}$$

This shows that the coordinates of f(x) are given by the relations (21.2),
as asserted.
Conversely, it is easily verified by direct calculation that if the relations
(21.2) are used to obtain the coordinates $y_i$ of y from the coordinates $x_j$ of
x, then the resulting function satisfies the relation (21.1) and so is linear.
We shall omit this calculation, since it is straightforward. Q.E.D.
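
The construction in the proof can be tried numerically. The following Python sketch reads off the coefficients $c_{ij}$ from the images $f(e_j)$ and then evaluates the equations (21.2). The names `matrix_of`, `apply_matrix`, and `example_map` are illustrative assumptions, and the particular map from $R^2$ to $R^3$ is a hypothetical example, not one taken from the text.

```python
def matrix_of(f, p):
    """Return the q x p matrix (as a list of rows) whose j-th column is f(e_j)."""
    columns = []
    for j in range(p):
        # e_j has a 1 in position j and 0 elsewhere
        e_j = [1.0 if k == j else 0.0 for k in range(p)]
        columns.append(list(f(e_j)))
    q = len(columns[0])
    # c_ij is the i-th coordinate of f(e_j)
    return [[columns[j][i] for j in range(p)] for i in range(q)]


def apply_matrix(matrix, x):
    """Evaluate equations (21.2): y_i = c_i1 x_1 + ... + c_ip x_p."""
    return [sum(c * xj for c, xj in zip(row, x)) for row in matrix]


def example_map(x):
    # Hypothetical linear map from R^2 to R^3, chosen only for illustration.
    x1, x2 = x
    return [2 * x1 + x2, x1, 3 * x2]


C = matrix_of(example_map, 2)
y = apply_matrix(C, [1.0, 3.0])
```

Here `apply_matrix(C, x)` should agree with `example_map(x)` for every x, which is the content of the first half of Theorem 21.2 for this one map.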

It should be mentioned that the rectangular array of numbers

(21.4)
$$\begin{bmatrix}
c_{11} & c_{12} & \cdots & c_{1p}\\
c_{21} & c_{22} & \cdots & c_{2p}\\
\vdots & \vdots &        & \vdots\\
c_{q1} & c_{q2} & \cdots & c_{qp}
\end{bmatrix},$$

consisting of q rows and p columns, is often called the matrix corresponding
to the linear function f. There is a one-one correspondence between
linear functions of $R^p$ into $R^q$ and $q \times p$ matrices of real numbers. As we
have seen, the action of f is completely described in terms of its matrix.
We shall not find it necessary to develop any of the extensive theory of
matrices, however, but will regard the matrix (21.4) as being shorthand for
a more elaborate description of the linear function f.
We shall now prove that a linear function from $R^p$ to $R^q$ is automatically
continuous. To do this, we first restate the Schwarz Inequality in the form

$$|a_1b_1 + a_2b_2 + \cdots + a_pb_p|^2 \le \{a_1^2 + a_2^2 + \cdots + a_p^2\}\{b_1^2 + b_2^2 + \cdots + b_p^2\}.$$

We apply this inequality to each expression in equation (21.2) to obtain,
for $1 \le i \le q$, the estimate

$$|y_i|^2 \le \big(|c_{i1}|^2 + |c_{i2}|^2 + \cdots + |c_{ip}|^2\big)\,\|x\|^2 = \sum_{j=1}^{p} |c_{ij}|^2\,\|x\|^2.$$

Adding these inequalities, we have

$$\|y\|^2 \le \Big\{\sum_{i=1}^{q}\sum_{j=1}^{p} |c_{ij}|^2\Big\}\,\|x\|^2,$$

from which we conclude that

(21.5) $$\|y\| \le \Big\{\sum_{i=1}^{q}\sum_{j=1}^{p} |c_{ij}|^2\Big\}^{1/2}\,\|x\|.$$


21.3 THEOREM. If f is a linear function with domain $R^p$ and range in
$R^q$, then there exists a positive constant A such that if u, v are any two vectors
in $R^p$, then

(21.6) $\|f(u) - f(v)\| \le A\,\|u - v\|.$

Therefore, a linear function on $R^p$ to $R^q$ is continuous at every point.
PROOF. We have seen, in deriving formula (21.5), that there exists a
constant A such that if x is any element of $R^p$ then $\|f(x)\| \le A\,\|x\|$. Now
let x = u - v and use the linearity of f to obtain f(x) = f(u - v) =
f(u) - f(v). Therefore, the formula (21.6) results. It is clear that this
relation implies the continuity of f, for we can make $\|f(u) - f(v)\| < \varepsilon$ by
taking $\|u - v\| < \varepsilon/A$ if A > 0. Q.E.D.
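
As a numerical sanity check of (21.5) and (21.6), the following Python sketch takes A to be the square root of the sum of the squares of the matrix entries and compares it with the ratio $\|f(u) - f(v)\| / \|u - v\|$ at randomly chosen points. The matrix `C`, the helper names, and the sampling scheme are assumptions made for illustration; a random sample can only fail to contradict the inequality, it does not prove it.

```python
import math
import random


def entry_norm(matrix):
    """Square root of the sum of squared entries: the constant in (21.5)."""
    return math.sqrt(sum(c * c for row in matrix for c in row))


def norm(v):
    """Euclidean norm of a vector given as a list."""
    return math.sqrt(sum(t * t for t in v))


def apply_matrix(matrix, x):
    """Evaluate the linear map with the given matrix at x, as in (21.2)."""
    return [sum(c * xj for c, xj in zip(row, x)) for row in matrix]


C = [[2.0, 1.0], [1.0, 0.0], [0.0, 3.0]]  # hypothetical 3 x 2 matrix
A = entry_norm(C)

random.seed(0)
worst_ratio = 0.0
for _ in range(1000):
    u = [random.uniform(-5, 5) for _ in range(2)]
    v = [random.uniform(-5, 5) for _ in range(2)]
    d = norm([a - b for a, b in zip(u, v)])
    if d > 0:
        fu, fv = apply_matrix(C, u), apply_matrix(C, v)
        ratio = norm([a - b for a, b in zip(fu, fv)]) / d
        worst_ratio = max(worst_ratio, ratio)
```

In this sketch `worst_ratio` should never exceed `A`; in general the constant from (21.5) is not the smallest one that works, which is the point of Exercises 21.L through 21.O.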
It is an exercise to show that if f and g are linear functions on $R^p$ to $R^q$,
then f + g is a linear function on $R^p$ to $R^q$. Similarly, if $c \in R$, then cf is a
linear function. We leave it to the reader to show that the collection
$\mathcal{L}(R^p, R^q)$ of all linear functions on $R^p$ to $R^q$ is a vector space under these
vector operations. In the exercises we will show how to define a norm on
this vector space.

Exercises

21.A. Show that $f : R^p \to R^q$ is a linear function if and only if $f(ax) = af(x)$ and
$f(x + y) = f(x) + f(y)$ for all $a \in R$ and all $x, y \in R^p$.
21.B. If f is a linear function of $R^p$ into $R^q$, show that the columns of the matrix
representation (21.4) of f indicate the elements in $R^q$ into which the elements
$e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, ..., $e_p = (0, 0, \ldots, 1)$ of $R^p$ are mapped by f.
21.C. Let f be a linear function of $R^2$ into $R^3$ which sends the elements $e_1 =
(1, 0)$, $e_2 = (0, 1)$ of $R^2$ into the vectors $f(e_1) = (2, 1, 0)$, $f(e_2) = (1, 0, -0)$ of $R^3$.
Give the matrix representation of f. What vectors in $R^3$ are the images under f of
the elements (2, 0), (1, 1), and (1, 3)?
21.D. If f denotes the linear function of Exercise 21.C, show that not every
vector in $R^3$ is the image under f of a vector in $R^2$.
21.E. Let g be any linear function on $R^2$ to $R^3$. Show that not every element of
$R^3$ is the image under g of a vector in $R^2$.
21.F. Let h be any linear function on $R^3$ to $R^2$. Show that there exist non-zero
vectors in $R^3$ which are mapped into the zero vector of $R^2$ by h.
21.G. Let f be a linear function on $R^2$ to $R^2$ and let the matrix representation of
f be given by

$$\begin{bmatrix} a & b\\ c & d \end{bmatrix}.$$

Show that $f(x) \ne 0$ when $x \ne 0$ if and only if $\Delta = ad - bc \ne 0$.

21.H. Let f be as in Exercise 21.G. Show that f maps $R^2$ onto $R^2$ if and only if
$\Delta = ad - bc \ne 0$. Show that if $\Delta \ne 0$, then the inverse function $f^{-1}$ is linear and has
the matrix representation

$$\begin{bmatrix} d/\Delta & -b/\Delta\\ -c/\Delta & a/\Delta \end{bmatrix}.$$

21.I. Let g be a linear function from $R^p$ to $R^q$. Show that g is one-one if and only
if g(x) = 0 implies that x = 0.
21.J. If h is a one-one linear function from $R^p$ onto $R^p$, show that the inverse
$h^{-1}$ is a linear function from $R^p$ onto $R^p$.
21.K. Show that the sum and the composition of two linear functions are linear
functions.
21.L. If f is a linear map on $R^p$ to $R^q$, define

$$\|f\|_{pq} = \sup\{\|f(x)\| : x \in R^p,\ \|x\| \le 1\}.$$

Show that the mapping $f \mapsto \|f\|_{pq}$ defines a norm on the vector space $\mathcal{L}(R^p, R^q)$ of
all linear functions on $R^p$ to $R^q$. Show that $\|f(x)\| \le \|f\|_{pq}\,\|x\|$ for all $x \in R^p$.
21.M. If f is a linear map on $R^p$ to $R^q$, define

$$M(f) = \inf\{M > 0 : \|f(x)\| \le M\,\|x\|,\ x \in R^p\}.$$

Show that $M(f) = \|f\|_{pq}$.
21.N. If f and g are in $\mathcal{L}(R^p, R^p)$, show that $f \circ g$ is also in $\mathcal{L}(R^p, R^p)$ and
that $\|f \circ g\|_{pp} \le \|f\|_{pp}\,\|g\|_{pp}$. Show that the inequality can be strict for certain f and g.
21.O. Give an example of a linear map f in $\mathcal{L}(R^p, R^q)$ with matrix representation
$[c_{ij}]$ where we have

$$\|f\|_{pq} < \Big\{\sum_{i=1}^{q}\sum_{j=1}^{p} |c_{ij}|^2\Big\}^{1/2}.$$

21.P. If (21.4) gives the matrix for f, show that $|c_{ij}| \le \|f\|_{pq}$ for all i, j.

Section 22 Global Properties of Continuous Functions

In Section 20 we considered “local” continuity; that is, we were concerned
with continuity at a point. In this section we shall be concerned
with establishing some deeper properties of continuous functions. Here
we shall be concerned with “global” continuity in the sense that we will
assume that the functions are continuous at every point of their domain.
Unless there is a special mention to the contrary, f will denote a function
with domain D(f) contained in $R^p$ and with range in $R^q$. We recall that if
B is a subset of the range space $R^q$, the inverse image of B under f is the set
$f^{-1}(B) = \{x \in D(f) : f(x) \in B\}.$
Observe that $f^{-1}(B)$ is automatically a subset of D(f) even though B is not
necessarily a subset of the range of f.
In topology courses, where one is more concerned with global than local
continuity, the next result is often taken as the definition of (global) continuity. Its
importance will soon be evident.
22.1 GLOBAL CONTINUITY THEOREM. The following statements are
equivalent:
(a) f is continuous on its domain D(f).
(b) If G is any open set in $R^q$, then there exists an open set $G_1$ in $R^p$ such
that $G_1 \cap D(f) = f^{-1}(G)$.
(c) If H is any closed set in $R^q$, then there exists a closed set $H_1$ in $R^p$ such
that $H_1 \cap D(f) = f^{-1}(H)$.
PROOF. First, we shall suppose that (a) holds and let G be an open
subset of $R^q$. If a belongs to $f^{-1}(G)$, then since G is a neighborhood of
f(a), it follows from the continuity of f at a that there is an open set U(a)
such that if $x \in D(f) \cap U(a)$, then $f(x) \in G$. Select U(a) for each a in
$f^{-1}(G)$ and let $G_1$ be the union of the sets U(a). By Theorem 9.3(c), the
set $G_1$ is open and it is plain that $G_1 \cap D(f) = f^{-1}(G)$. Hence (a) implies
(b).
We shall now show that (b) implies (a). If a is an arbitrary point of D(f)
and G is an open neighborhood of f(a), then condition (b) implies that
there exists an open set $G_1$ in $R^p$ such that $G_1 \cap D(f) = f^{-1}(G)$. Since
$f(a) \in G$, it follows that $a \in G_1$, so $G_1$ is a neighborhood of a. If $x \in
G_1 \cap D(f)$, then $f(x) \in G$, whence f is continuous at a. This proves that
condition (b) implies (a).
We now prove the equivalence of conditions (b) and (c). First we
observe that if B is any subset of $R^q$ and if $C = R^q \setminus B$, then we have
$f^{-1}(B) \cap f^{-1}(C) = \emptyset$ and
(22.1) $D(f) = f^{-1}(B) \cup f^{-1}(C).$
If $B_1$ is a subset of $R^p$ such that $B_1 \cap D(f) = f^{-1}(B)$ and $C_1 = R^p \setminus B_1$,
then $C_1 \cap f^{-1}(B) = \emptyset$ and
(22.2) $D(f) = (B_1 \cap D(f)) \cup (C_1 \cap D(f)) = f^{-1}(B) \cup (C_1 \cap D(f)).$
The formulas (22.1) and (22.2) are two representations of D(f) as the
union of $f^{-1}(B)$ with another set with which it has no common points.
Therefore, we have $C_1 \cap D(f) = f^{-1}(C)$.
Suppose that (b) holds and that H is closed in $R^q$. Apply the argument
just completed in the case where $B = R^q \setminus H$ and C = H. Then B and $B_1$
are open sets in $R^q$ and $R^p$, respectively, so $C_1 = R^p \setminus B_1$ is closed in $R^p$.
This shows that (b) implies (c).
To see that (c) implies (b), use the above argument with $B = R^q \setminus G$,
where G is an open set in $R^q$. Q.E.D.
In the case where $D(f) = R^p$, the preceding result simplifies to some
extent.

22.2 COROLLARY. Let f be defined on all of $R^p$ and with range in $R^q$.
Then the following statements are equivalent:
(a) f is continuous on $R^p$;
(b) if G is open in $R^q$, then $f^{-1}(G)$ is open in $R^p$;
(c) if H is closed in $R^q$, then $f^{-1}(H)$ is closed in $R^p$.
It should be emphasized that the Global Continuity Theorem 22.1 does
not say that if f is continuous and if G is an open set in $R^p$, then the direct
image $f(G) = \{f(x) : x \in G\}$ is open in $R^q$. In general, a continuous function
need not send open sets to open sets or closed sets to closed sets. For
example, the function f on R to R, defined by

$$f(x) = \frac{1}{1 + x^2},$$

is continuous on R. [In fact, it was seen in Examples 20.5(a) and (c) that the
functions $f_1(x) = 1$ and $f_2(x) = x^2$, for $x \in R$, are continuous at every
point. From Theorem 15.6, it follows that

$$f_3(x) = 1 + x^2, \qquad x \in R,$$

is continuous at every point and, since $f_3$ never vanishes, this same theorem
implies that the function f given above is continuous on R.] If G is the open
set G = (-1, 1), then f(G) = (1/2, 1], which is not open in R. Similarly, if H is
the closed set $H = \{x \in R : x \ge 1\}$, then f(H) = (0, 1/2], which is not closed in
R. Similarly, the function f maps the set R, which is both open and closed in
R, into the set f(R) = (0, 1], which is neither open nor closed in R.
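
The behavior of this f on G = (-1, 1) can be observed numerically. The Python sketch below samples points of G that approach the endpoint 1 and records the values of f. The sample points are an arbitrary choice made for illustration, and a finite sample only suggests, rather than proves, that the value 1/2 is approached but not attained on G.

```python
def f(x):
    """The function f(x) = 1 / (1 + x^2) from the example."""
    return 1.0 / (1.0 + x * x)


# Points of G = (-1, 1) creeping toward the endpoint 1.
samples = [1 - 10.0 ** (-k) for k in range(1, 8)]
values = [f(x) for x in samples]

# The value 1 is attained at the interior point 0.
value_at_zero = f(0.0)
```

Every entry of `values` stays above 1/2 while getting close to it, and `value_at_zero` equals 1, which is consistent with f(G) = (1/2, 1].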

The moral of the preceding remarks is that the property of a set being open or
closed is not necessarily preserved under mapping by a continuous function.
However, there are important properties of a set which are preserved under
continuous mapping. For example, we shall now show that the properties of
connectedness and compactness of sets have this character.

Preservation of Connectedness

We recall from Definition 12.1 that a set H in $R^p$ is disconnected if there
exist open sets A, B in $R^p$ such that $A \cap H$ and $B \cap H$ are disjoint
non-empty sets whose union is H. A set is connected if it is not disconnected.

22.3 PRESERVATION OF CONNECTEDNESS. If $H \subseteq D(f)$ is connected
in $R^p$ and f is continuous on H, then f(H) is connected in $R^q$.
PROOF. Let h be the restriction of f to the set H so that D(h) = H and
h(x) = f(x) for all $x \in H$. We note that f(H) = h(H) and that h is
continuous on H.
If f(H) = h(H) is disconnected in $R^q$, then there exist open sets A, B in $R^q$
such that $A \cap h(H)$ and $B \cap h(H)$ are disjoint non-empty sets whose union
is h(H). By the Global Continuity Theorem 22.1, there exist open sets $A_1$,
$B_1$ in $R^p$ such that

$$A_1 \cap H = h^{-1}(A), \qquad B_1 \cap H = h^{-1}(B).$$

These intersections are non-empty and their disjointness follows from the
disjointness of the sets $A \cap h(H)$ and $B \cap h(H)$. The assumption that the
union of $A \cap h(H)$ and $B \cap h(H)$ is h(H) implies that the union of $A_1 \cap H$
and $B_1 \cap H$ is H. Therefore, the disconnectedness of f(H) = h(H) implies
the disconnectedness of H. Q.E.D.

The very word “continuous” suggests that there are no sudden “breaks”
in the graph of the function; hence the next result is by no means
unexpected. However, the reader is invited to attempt to provide a
different proof of this theorem and he will come to appreciate its depth.
22.4 BOLZANO'S INTERMEDIATE VALUE THEOREM. Let $H \subseteq D(f)$
be a connected subset of $R^p$ and let f be continuous on H and with values in R.
If k is any real number satisfying
$\inf\{f(x) : x \in H\} < k < \sup\{f(x) : x \in H\},$
then there is at least one point of H where f takes the value k.
PROOF. If $k \notin f(H)$, then the sets $A = \{t \in R : t < k\}$, $B = \{t \in R : t > k\}$
form a disconnection of f(H), contrary to the previous theorem. Q.E.D.
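
In the special case where H is a closed interval [a, b] in R, a point where f takes the value k can be approximated by repeated halving. The following Python sketch uses bisection, which is a different technique from the connectedness argument in the proof above and is offered only as an illustration; the function name `bisect_value`, the tolerance, and the sample function are assumptions.

```python
def bisect_value(f, a, b, k, tol=1e-12):
    """Approximate c in [a, b] with f(c) = k.

    Assumes f is continuous on [a, b] and k lies between f(a) and f(b).
    """
    # Orient the bracket so that f(lo) <= k <= f(hi).
    lo, hi = (a, b) if f(a) < k else (b, a)
    if not (f(lo) <= k <= f(hi)):
        raise ValueError("k must lie between f(a) and f(b)")
    while abs(hi - lo) > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < k:
            lo = mid   # the value k is still ahead of mid
        else:
            hi = mid   # the value k is at or behind mid
    return (lo + hi) / 2.0


# Hypothetical example: f(x) = x^3 + x takes the value 5 somewhere in [0, 2].
c = bisect_value(lambda x: x ** 3 + x, 0.0, 2.0, 5.0)
```

The invariant f(lo) <= k <= f(hi) is what the Intermediate Value Theorem turns into the existence of a point between `lo` and `hi` where f equals k.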

Preservation of Compactness
We now demonstrate that the important property of compactness is
preserved under continuous mapping. We recall that it is a consequence
of the important Heine-Borel Theorem 11.3 that a subset K of $R^p$ is compact
if and only if it is both closed and bounded in $R^p$. Thus the next result could
be rephrased by saying that if K is closed and bounded in $R^p$ and if f
is continuous on K and with range in $R^q$, then f(K) is closed and bounded
in $R^q$.
22.5 PRESERVATION OF COMPACTNESS. If $K \subseteq D(f)$ is compact
and f is continuous on K, then f(K) is compact.
FIRST PROOF. We assume that K is closed and bounded in $R^p$ and shall
show that f(K) is closed and bounded in $R^q$. If f(K) is not bounded, for each
$n \in N$ there exists a point $x_n$ in K with $\|f(x_n)\| \ge n$. Since K is bounded, the
sequence $X = (x_n)$ is bounded; hence it follows from the Bolzano-Weierstrass
Theorem 16.4 that there is a subsequence of X which converges
to an element x. Since $x_n \in K$ for $n \in N$, the point x belongs to the closed set
K. Hence f is continuous at x, so f is bounded by $\|f(x)\| + 1$ on a neighborhood
of x. Since this contradicts the assumption that $\|f(x_n)\| \ge n$, the set
f(K) is bounded.
We shall prove that f(K) is closed by showing that any cluster point y of
f(K) must be contained in this set. In fact, if n is a natural number, there is a
point $z_n$ in K such that $\|f(z_n) - y\| < 1/n$. By the Bolzano-Weierstrass
Theorem 16.4, the sequence $Z = (z_n)$ has a subsequence $Z' = (z_{n(k)})$ which
converges to an element z. Since K is closed, then $z \in K$ and f is continuous
at z. Therefore,

$$f(z) = \lim_k \big(f(z_{n(k)})\big) = y,$$

which proves that y belongs to f(K). Hence f(K) is closed.
SECOND PROOF. By restricting f to K we may assume that D(f) = K.
Now assume that $\mathcal{G} = \{G_\alpha\}$ is a family of open sets in $R^q$ whose union
contains f(K). By the Global Continuity Theorem 22.1, for each set $G_\alpha$ in $\mathcal{G}$
there is an open subset $C_\alpha$ of $R^p$ such that $C_\alpha \cap D(f) = f^{-1}(G_\alpha)$. The family
$\mathcal{C} = \{C_\alpha\}$ consists of open subsets of $R^p$; we claim that the union of these sets
contains K. For, if $x \in K$, then f(x) is contained in f(K); hence f(x) belongs
to some set $G_\alpha$ and by construction x belongs to the corresponding set $C_\alpha$.
Since K is compact, it is contained in the union of a finite number of sets in $\mathcal{C}$
and its image f(K) is contained in the union of the corresponding finite
number of sets in $\mathcal{G}$. Since this holds for an arbitrary family $\mathcal{G}$ of open sets
covering f(K), the set f(K) is compact in $R^q$. Q.E.D.
When the range of the function is R, the next theorem is sometimes
reformulated by saying that a continuous real-valued function on a compact
set attains its maximum and minimum values.
22.6 MAXIMUM AND MINIMUM VALUE THEOREM. Let $K \subseteq D(f)$ be
compact in $R^p$ and let f be a continuous real valued function. Then there are
points $x^*$ and $x_*$ in K such that

$$f(x^*) = \sup\{f(x) : x \in K\}, \qquad f(x_*) = \inf\{f(x) : x \in K\}.$$

FIRST PROOF. Since K is compact in $R^p$, it follows from the preceding
theorem that f(K) is bounded in R. Let M = sup f(K) and let $(x_n)$ be a
sequence in K such that
$f(x_n) \ge M - 1/n, \qquad n \in N.$
By the Bolzano-Weierstrass Theorem 16.4, some subsequence $(x_{n(k)})$
converges to a limit $x^* \in K$. Since f is continuous at $x^*$, we must have
$f(x^*) = \lim\big(f(x_{n(k)})\big) = M$. The proof of the existence of $x_*$ is quite similar.

SECOND PROOF. By restricting f to K, we may assume that D(f) = K.
We set M = sup f(K). Then for each $n \in N$, let $G_n = \{u \in R : u < M - 1/n\}$.
Since $G_n$ is open, it follows from the Global Continuity Theorem 22.1 that
there exists an open set $C_n$ in $R^p$ such that
$C_n \cap K = \{x \in K : f(x) < M - 1/n\}.$

Now if the value M is not attained, then the union of the family $\mathcal{C} = \{C_n\}$ of
open sets contains all of K. Since K is compact and the family $\{C_n \cap K\}$ is
increasing, there is an $r \in N$ such that $K \subseteq C_r$. But then we have $f(x) <
M - 1/r$ for all $x \in K$, contrary to the fact that M = sup f(K). Q.E.D.

If f has range in $R^q$ with q > 1, the following corollary is sometimes useful.

22.7 COROLLARY. Let f be a function on $D(f) \subseteq R^p$ to $R^q$ and let
$K \subseteq D(f)$ be compact. If f is continuous on K, then there are points $x^*$ and
$x_*$ in K such that

$$\|f(x^*)\| = \sup\{\|f(x)\| : x \in K\}, \qquad \|f(x_*)\| = \inf\{\|f(x)\| : x \in K\}.$$

It follows from Theorem 21.2 that if $f : R^p \to R^q$ is linear, then there exists
a constant M > 0 such that $\|f(x)\| \le M\,\|x\|$ for all $x \in R^p$. However it is not
always true that there exists a constant m > 0 such that $\|f(x)\| \ge m\,\|x\|$ for all
$x \in R^p$. We now show that this is the case if and only if f is an injective linear
function.
22.8 COROLLARY. Let $f : R^p \to R^q$ be a linear function. Then f is
injective if and only if there exists m > 0 such that $\|f(x)\| \ge m\,\|x\|$ for all
$x \in R^p$.
PROOF. Suppose that f is injective, and let $S = \{x \in R^p : \|x\| = 1\}$ be the
compact unit sphere in $R^p$.
By Corollary 22.7 there exists $x_* \in S$ such that $\|f(x_*)\| = m =
\inf\{\|f(x)\| : x \in S\}$. Since f is injective, $m = \|f(x_*)\| > 0$. Hence $\|f(x)\| \ge m >
0$ for all $x \in S$. Now, if $u \in R^p$, $u \ne 0$, then $u/\|u\|$ belongs to S and by the
linearity of f we have

$$\frac{1}{\|u\|}\,\|f(u)\| = \Big\|f\Big(\frac{u}{\|u\|}\Big)\Big\| \ge m,$$

whence it follows that $\|f(u)\| \ge m\,\|u\|$ for all $u \in R^p$ (since the result is trivial
for u = 0).
Conversely, suppose $\|f(x)\| \ge m\,\|x\|$ for all $x \in R^p$. If $f(x_1) = f(x_2)$, then we
have

$$0 = \|f(x_1) - f(x_2)\| = \|f(x_1 - x_2)\| \ge m\,\|x_1 - x_2\|,$$

which implies that $x_1 = x_2$. Therefore f is injective. Q.E.D.

One of the most striking consequences of Theorem 22.5 is that if f is
continuous and injective on a compact domain, then the inverse function $f^{-1}$
is automatically continuous.
22.9 CONTINUITY OF THE INVERSE FUNCTION. Let K be a compact
subset of $R^p$ and let f be a continuous injective function with domain K and
range f(K) in $R^q$. Then the inverse function is continuous with domain f(K)
and range K.
PROOF. We observe that since K is compact, then Theorem 22.5 implies
that f(K) is compact and hence closed. Since f is injective by hypothesis, the
inverse function $g = f^{-1}$ is defined. Let H be any closed set in $R^p$ and
consider $H \cap K$; since this set is bounded and closed (by Theorem 9.6(c)),
the Heine-Borel Theorem assures that $H \cap K$ is a compact subset of $R^p$. By
Theorem 22.5, we conclude that $H_1 = f(H \cap K)$ is compact and hence closed
in $R^q$. Now if $g = f^{-1}$, then
$H_1 = f(H \cap K) = g^{-1}(H).$

Since $H_1$ is a subset of f(K) = D(g), we can write this last equation as

$H_1 \cap D(g) = g^{-1}(H).$
From the Global Continuity Theorem 22.1(c), we infer that $g = f^{-1}$ is
continuous. Q.E.D.

We shall close this section with the introduction of some notation that will
be convenient.
22.10 DEFINITION. If $D \subseteq R^p$, then the collection of all continuous
functions on D to $R^q$ is denoted by $C_{pq}(D)$. The collection of all bounded
continuous functions on D to $R^q$ is denoted by $BC_{pq}(D)$. When p and q are
understood, we will denote these collections merely by C(D) and BC(D).

The first part of the following result is a consequence of Theorem 20.6,
and the second part is proved in the same way that Lemma 17.8 was proved.

22.11 THEOREM. (a) The spaces $C_{pq}(D)$ and $BC_{pq}(D)$ are vector
spaces under the vector operations
$(f + g)(x) = f(x) + g(x), \qquad (cf)(x) = c\,f(x) \qquad \text{for } x \in D.$
(b) The space $BC_{pq}(D)$ is a normed space under the norm

$$\|f\|_D = \sup\{\|f(x)\| : x \in D\}.$$

Of course, in the special case where D is a compact subset of $R^p$, then
$C_{pq}(D) = BC_{pq}(D)$.

Exercises

22.A. Interpret the Global Continuity Theorem 22.1 for the real-valued functions
$f(x) = x^2$ and $g(x) = 1/x$, $x \ne 0$. Take various open and closed sets and
determine their inverse images under f and g.
22.B. Let $h : R \to R$ be defined by

h(x) = 1, $0 \le x \le 1$,
     = 0, otherwise.

Exhibit an open set G such that $h^{-1}(G)$ is not open in R, and a closed set F such that
$h^{-1}(F)$ is not closed in R.
22.C. If f is bounded and continuous on $R^p$ to R and if $f(x_0) > 0$, show that f is
strictly positive on some neighborhood of $x_0$. Does the same conclusion hold if f is
merely continuous at $x_0$?
22.D. If $p : R^2 \to R$ is a polynomial and $c \in R$, show that the set
$\{(x, y) : p(x, y) < c\}$ is open in $R^2$.
22.E. If $f : R^p \to R$ is continuous on $R^p$ and $\alpha < \beta$, show that the set
$\{x \in R^p : \alpha \le f(x) \le \beta\}$ is closed in $R^p$.
22.F. A subset $D \subseteq R^p$ is disconnected if and only if there exists a continuous
function $f : D \to R$ such that f(D) = {0, 1}.
22.G. Let f be continuous on $R^2$ to $R^q$. Define the functions $g_1$, $g_2$ on R to $R^q$ by

$g_1(t) = f(t, 0), \qquad g_2(t) = f(0, t).$

Show that $g_1$ and $g_2$ are continuous.
22.H. Let f, $g_1$, $g_2$ be related by the formulas in the preceding exercise. Show that
from the continuity of $g_1$ and $g_2$ at t = 0 one cannot prove the continuity of f at (0, 0).
22.I. Give an example of a function on I = [0, 1] to R which is not bounded.
22.J. Give an example of a bounded function f on I to R which does not take
on either of the numbers $\sup\{f(x) : x \in I\}$ or $\inf\{f(x) : x \in I\}$.
22.K. Give an example of a bounded and continuous function g on R to R
which does not take on either of the numbers $\sup\{g(x) : x \in R\}$ or $\inf\{g(x) : x \in R\}$.
22.L. Show that every polynomial of odd degree and real coefficients has a real
root. Show that the polynomial $p(x) = x^4 + 7x^3 - 9$ has at least two real roots.
22.M. If c > 0 and n is a natural number, there exists a unique positive number b
such that $b^n = c$.
22.N. Let f be continuous on I to R with f(0) < 0 and f(1) > 0. If $N =
\{x \in I : f(x) < 0\}$ and if c = sup N, show that f(c) = 0.
22.O. Let f be a continuous function on R to R which is strictly increasing (in the
sense that if $x' < x''$ then $f(x') < f(x'')$). Prove that f is injective and that its inverse
function $f^{-1}$ is continuous and strictly increasing.
22.P. Let f be a continuous function on R to R which does not take on any of its
values twice. Is it true that f must either be strictly increasing or strictly decreasing?
22.Q. Let g be a function on I to R. Prove that if g takes on each of its values
exactly twice, then g cannot be continuous at every point of I.
22.R. Let f be continuous on the interval $[0, 2\pi]$ to R and such that $f(0) = f(2\pi)$.
Prove that there exists a point c in this interval such that $f(c) = f(c + \pi)$. (Hint:
consider $g(x) = f(x) - f(x + \pi)$.) Conclude that there are, at any time, antipodal
points on the equator of the earth which have the same temperature.
22.S. Let $\varphi : [0, 2\pi) \to R^2$ be defined by $\varphi(t) = (\cos t, \sin t)$ for $t \in [0, 2\pi)$. Then
$\varphi$ is an injective continuous map of $[0, 2\pi)$ onto the unit circle $S =
\{(x, y) \in R^2 : x^2 + y^2 = 1\}$. Show that $\varphi^{-1} : S \to [0, 2\pi)$ cannot be continuous. (We
conclude that Theorem 22.9 may fail if the domain is not compact.)

Project
22.α. The purpose of this project is to show that many of the results of Section 22
hold for continuous functions whose domains and ranges are contained in metric
spaces. (In establishing these results we may either observe that earlier definitions
apply to metric spaces or can be reformulated to do so.)
(a) Show that Theorem 20.2 can be reformulated for a function from one metric
space to another one.
(b) Show that the Global Continuity Theorem 22.1 holds without change.
(c) Prove that the Preservation of Connectedness Theorem 22.3 holds.
(d) Prove that the Preservation of Compactness Theorem 22.5 holds.

Section 23 Uniform Continuity and Fixed Points

Let f be defined on a subset D(f) of $R^p$ to $R^q$. Then it is readily seen that
the following statements are equivalent:
(i) f is continuous at every point in D(f).
(ii) Given $\varepsilon > 0$ and $u \in D(f)$, there is a $\delta(\varepsilon, u) > 0$ such that if x belongs
to D(f) and $\|x - u\| \le \delta$, then $\|f(x) - f(u)\| \le \varepsilon$.
The thing that is to be noted here is that the $\delta$ depends, in general, on both $\varepsilon$
and u. That $\delta$ depends on u is a reflection of the fact that the function f may
change its values rapidly near certain points and slowly near others.
Now it can happen that a function is such that the number $\delta$ can be chosen
to be independent of the point u in D(f) and depending only on $\varepsilon$. For
example, if f(x) = 2x, then

$$|f(x) - f(u)| = 2\,|x - u|$$

and so we can choose $\delta(\varepsilon, u) = \varepsilon/2$ for all values of u.
On the other hand, if g(x) = 1/x for x > 0, then

$$g(x) - g(u) = \frac{u - x}{ux}.$$

If $0 < \delta < u$ and $|x - u| \le \delta$, then we leave it to the reader to show that

$$|g(x) - g(u)| \le \frac{\delta}{u(u - \delta)}$$

and that this inequality cannot be improved, since equality actually holds for
$x = u - \delta$. If we want to make $|g(x) - g(u)| \le \varepsilon$, then the largest value of $\delta$
we can select is

$$\delta(\varepsilon, u) = \frac{\varepsilon u^2}{1 + \varepsilon u}.$$

Thus if u > 0, then g is continuous at u because we can select $\delta(\varepsilon, u) =
\varepsilon u^2/(1 + \varepsilon u)$, and this is the largest value we can choose. Since

$$\inf\Big\{\frac{\varepsilon u^2}{1 + \varepsilon u} : u > 0\Big\} = 0,$$

we cannot obtain a $\delta(\varepsilon, u) > 0$ which is independent of the choice of u
for all points u > 0.
We shall now restrict g to a smaller domain. In fact, let a > 0 and define
h(x) = 1/x for $x \ge a$. Then the analysis just made shows that we can use the
same value of $\delta(\varepsilon, u)$. However, this time the domain is smaller and

$$\inf\Big\{\frac{\varepsilon u^2}{1 + \varepsilon u} : u \ge a\Big\} = \frac{\varepsilon a^2}{1 + \varepsilon a} > 0.$$

Hence if we define $\delta(\varepsilon) = \varepsilon a^2/(1 + \varepsilon a)$, then we can use this number for all
points $u \ge a$.
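
The dependence of $\delta(\varepsilon, u)$ on u can be tabulated directly. The following Python sketch evaluates the formula $\varepsilon u^2/(1 + \varepsilon u)$ for a fixed $\varepsilon$ and several values of u, and checks that the worst case at $x = u - \delta$ gives a difference equal to $\varepsilon$ up to rounding. The value of `eps`, the sample points, and the cutoff `a` are arbitrary choices made for illustration.

```python
def delta(eps, u):
    """Largest delta with |1/x - 1/u| <= eps whenever |x - u| <= delta (u > 0)."""
    return eps * u * u / (1.0 + eps * u)


eps = 0.1
rows = []
for u in (1.0, 0.1, 0.01, 0.001):
    d = delta(eps, u)
    # The bound is attained at x = u - d.
    worst = abs(1.0 / (u - d) - 1.0 / u)
    rows.append((u, d, worst))

# On the restricted domain {u >= a} a single delta works for every point.
a = 0.5
uniform_delta = delta(eps, a)
```

The second entries of `rows` shrink toward 0 as u does, which is the numerical face of the statement that the infimum over u > 0 is 0, whereas `uniform_delta` is a positive number that serves every u >= a.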
In order to help fix these ideas, the reader should look over Examples 20.5
and determine in which examples the $\delta$ was chosen to depend on the point in
question and in which ones it was chosen independently of the point.
With these preliminaries we now introduce the formal definition.
23.1 DEFINITION. Let f have domain D(f) in $R^p$ and range in $R^q$. We
say that f is uniformly continuous on a set $A \subseteq D(f)$ if for each $\varepsilon > 0$ there is a
$\delta(\varepsilon) > 0$ such that if x and u belong to A and $\|x - u\| \le \delta(\varepsilon)$, then
$\|f(x) - f(u)\| \le \varepsilon$.
It is clear that if f is uniformly continuous on A, then it is continuous at
every point of A. In general, however, the converse does not hold. It is
useful to have in mind what is meant by saying that a function is not
uniformly continuous, so we state such a criterion, leaving its proof to the
reader.
23.2 LEMMA. A necessary and sufficient condition that the function f is
not uniformly continuous on $A \subseteq D(f)$ is that there exist $\varepsilon_0 > 0$ and two
sequences $X = (x_n)$, $Y = (y_n)$ in A such that if $n \in N$, then $\|x_n - y_n\| \le 1/n$ and
$\|f(x_n) - f(y_n)\| > \varepsilon_0$.
As an exercise the reader should apply this criterion to show that
g(x) = 1/x is not uniformly continuous on $D(g) = \{x : x > 0\}$.
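
One possible choice of sequences for this exercise, offered as a sketch (the particular sequences are our own choice and are not taken from the text), is $x_n = 1/n$ and $y_n = 1/(2n)$ with $\varepsilon_0 = 1/2$. Both sequences lie in $\{x : x > 0\}$, and

```latex
\[
|x_n - y_n| = \frac{1}{n} - \frac{1}{2n} = \frac{1}{2n} \le \frac{1}{n},
\qquad
|g(x_n) - g(y_n)| = |n - 2n| = n \ge 1 > \tfrac{1}{2} = \varepsilon_0
\qquad (n \in N).
\]
```

so the criterion of Lemma 23.2 is met.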
We now present a very useful result which assures that a continuous
function is automatically uniformly continuous on any compact set in its
domain.
23.3 UNIFORM CONTINUITY THEOREM. Let f be a continuous function
with domain D(f) in $R^p$ and range in $R^q$. If $K \subseteq D(f)$ is compact, then f is
uniformly continuous on K.
FIRST PROOF. Suppose that f is not uniformly continuous on K. By
Lemma 23.2 there exist $\varepsilon_0 > 0$ and two sequences $(x_n)$ and $(y_n)$ in K such
that if $n \in N$, then
(23.1) $\|x_n - y_n\| \le 1/n, \qquad \|f(x_n) - f(y_n)\| > \varepsilon_0.$
Since K is compact in $R^p$, the sequence $X = (x_n)$ is bounded; by the Bolzano-Weierstrass
Theorem 16.4, there is a subsequence $(x_{n(k)})$ of $(x_n)$ which
converges to an element z. Since K is closed, the limit z belongs to K and f
is continuous at z. It is clear that the corresponding subsequence $(y_{n(k)})$ of $Y = (y_n)$
also converges to z.
It follows from Theorem 20.2(c) that both sequences $(f(x_{n(k)}))$ and $(f(y_{n(k)}))$
converge to f(z). Therefore, when k is sufficiently great, we have $\|f(x_{n(k)}) -
f(y_{n(k)})\| < \varepsilon_0$. But this contradicts the second relation in (23.1).
SECOND PROOF. (A shorter proof could be based on the Lebesgue
Covering Theorem 11.5, but we prefer to use the definition of
compactness.) Suppose that f is continuous at every point of the compact
set K. According to Theorem 20.2(b), given $\varepsilon > 0$ and u in K there is a
number $\delta(\tfrac12\varepsilon, u) > 0$ such that if $x \in K$ and $\|x - u\| < \delta(\tfrac12\varepsilon, u)$, then
$\|f(x) - f(u)\| < \tfrac12\varepsilon$. For each u in K, form the open ball $G(u) =
\{x \in R^p : \|x - u\| < \tfrac12\delta(\tfrac12\varepsilon, u)\}$; then the set K is certainly contained in the
union of the family $\mathcal{G} = \{G(u) : u \in K\}$, since to each point u in K there is an
open ball G(u) which contains it. Since K is compact, it is contained in the
union of a finite number of sets in the family $\mathcal{G}$, say, $G(u_1), \ldots, G(u_N)$. We
now define

$$\delta(\varepsilon) = \tfrac12 \inf\{\delta(\tfrac12\varepsilon, u_1), \ldots, \delta(\tfrac12\varepsilon, u_N)\},$$

and we shall show that $\delta(\varepsilon)$ has the desired property. For, suppose that x, u
belong to K and that $\|x - u\| \le \delta(\varepsilon)$. Then there exists a natural number k
with $1 \le k \le N$ such that x belongs to the set $G(u_k)$; that is, $\|x - u_k\| <
\tfrac12\delta(\tfrac12\varepsilon, u_k)$. Since $\delta(\varepsilon) \le \tfrac12\delta(\tfrac12\varepsilon, u_k)$, it follows that

$$\|u - u_k\| \le \|u - x\| + \|x - u_k\| < \delta(\tfrac12\varepsilon, u_k).$$

Therefore, we have the relations

$$\|f(x) - f(u_k)\| < \tfrac12\varepsilon, \qquad \|f(u) - f(u_k)\| < \tfrac12\varepsilon,$$

whence $\|f(x) - f(u)\| < \varepsilon$. We have shown that if x, u are any two points of K
for which $\|x - u\| \le \delta(\varepsilon)$, then $\|f(x) - f(u)\| < \varepsilon$. Q.E.D.
In later sections we shall make use of the idea of uniform continuity on
many occasions, so we shall not give any applications here. However, we
shall introduce here another property which is often available and is
sufficient to guarantee uniform continuity.
23.4 DEFINITION. If f has domain D(f) contained in $R^p$ and range in
$R^q$, we say that f satisfies a Lipschitz† condition if there exists a constant
A > 0 such that

(23.2) $\|f(x) - f(u)\| \le A\,\|x - u\|$

for all points x, u in D(f). In case the inequality (23.2) holds with a constant
A < 1, the function is called a contraction.
It is clear that if relation (23.2) holds, then on setting $\delta(\varepsilon) = \varepsilon/A$ one can
establish the uniform continuity of f on D(f). Therefore, if f satisfies a
Lipschitz condition, then f is uniformly continuous. The converse, however,
is not true, as may be seen by considering the function defined for
D(f) = I by $f(x) = \sqrt{x}$. If (23.2) holds, then setting u = 0 one must have
$|f(x)| \le A\,|x|$ for some constant A, but it is readily seen that the latter
inequality cannot hold.
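
With $f(x) = \sqrt{x}$ on I = [0, 1], the failure of $|f(x)| \le A\,|x|$ can be checked in one line. The following worked step is a sketch supplied for illustration, with A standing for any proposed positive Lipschitz constant:

```latex
\[
\sqrt{x} \le A\,x
\iff 1 \le A\sqrt{x}
\iff x \ge \frac{1}{A^{2}}
\qquad (x > 0,\ A > 0),
\]
\[
\text{hence } \sqrt{x} > A\,x \ \text{ whenever } \ 0 < x < \min\{1,\ 1/A^{2}\}.
\]
```

No constant A can therefore serve on all of I, although the function is uniformly continuous there by Theorem 23.3, since I is compact.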
By recalling Theorem 21.3, we see that a linear function with domain $R^p$
and range in $R^q$ satisfies a Lipschitz condition. Moreover, it will be seen in
Section 27 that any real function with a bounded derivative also satisfies a
Lipschitz condition.

Fixed Point Theorems

If f is a function with domain D(f) and range in the same space $R^p$, then a
point u in D(f) is said to be a fixed point of f in case f(u) = u. A number of
important results can be proved on the basis of the existence of fixed points
of functions so it is of importance to have some affirmative criteria in this
direction. The first theorem we give is elementary in character, yet it is
often useful and has the important advantage that it provides a construction
of the fixed point. For simplicity, we shall first state the result when the
domain of the function is the entire space.

† RUDOLPH LIPSCHITZ (1832-1903) was a professor at Bonn. He made contributions to
algebra, number theory, differential geometry, and analysis.

23.5 FIXED POINT THEOREM FOR CONTRACTIONS. Let f be a
contraction with domain $R^p$ and range contained in $R^p$. Then f has a
unique fixed point.
PROOF. We are supposing that there exists a constant C with $0 < C < 1$
such that $\|f(x) - f(y)\| \le C\,\|x - y\|$ for all x, y in $R^p$. Let $x_1$ be an arbitrary
point in $R^p$ and set $x_2 = f(x_1)$; inductively, set

(23.3) $x_{n+1} = f(x_n)$, $\quad n \in N$.

We shall show that the sequence $(x_n)$ converges to a unique fixed point u of f
and estimate the rapidity of the convergence.
To do this, we observe that

$\|x_3 - x_2\| = \|f(x_2) - f(x_1)\| \le C\,\|x_2 - x_1\|$,

and, inductively, that

(23.4) $\|x_{n+1} - x_n\| = \|f(x_n) - f(x_{n-1})\| \le C\,\|x_n - x_{n-1}\| \le C^{n-1}\,\|x_2 - x_1\|$.

If $m \ge n$, then repeated use of (23.4) yields

$\|x_m - x_n\| \le \|x_m - x_{m-1}\| + \|x_{m-1} - x_{m-2}\| + \cdots + \|x_{n+1} - x_n\|$
$\le \{C^{m-2} + C^{m-3} + \cdots + C^{n-1}\}\,\|x_2 - x_1\|$.

Hence it follows that, for $m \ge n$,

(23.5) $\|x_m - x_n\| \le \dfrac{C^{n-1}}{1 - C}\,\|x_2 - x_1\|$.

Since $0 < C < 1$, the sequence $(C^{n-1})$ converges to zero. Therefore, $(x_n)$ is a
Cauchy sequence. If $u = \lim (x_n)$, then it is clear from (23.3) and the continuity
of f that u is a fixed point of f. From (23.5) and Lemma 15.8, we obtain the estimate

(23.6) $\|u - x_n\| \le \dfrac{C^{n-1}}{1 - C}\,\|x_2 - x_1\|$

for the rapidity of the convergence.


Finally, we show that there is only one fixed point for f. In fact, if u, v are
two distinct fixed points of f, then

$\|u - v\| = \|f(u) - f(v)\| \le C\,\|u - v\|$.

Since $u \ne v$, then $\|u - v\| \ne 0$, so this relation implies that $1 \le C$, contrary to
the hypothesis that $C < 1$. Q.E.D.
It will be observed that we have actually established the following result.

23.6 COROLLARY. If f is a contraction with constant $C < 1$, if $x_1$ is an
arbitrary point in $R^p$, and if the sequence $X = (x_n)$ is defined by equation
(23.3), then X converges to the unique fixed point u of f with the rapidity
estimated by (23.6).
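The constructive character of this result can be sketched in a few lines of Python for the case p = 1. The function name `iterate_contraction`, the tolerance, and the sample contraction $f(x) = \tfrac12 \cos x$ are our own choices made for illustration; the stopping rule uses the a priori estimate (23.6), so the contraction constant C must be supplied by the user.

```python
import math

def iterate_contraction(f, x1, C, tol=1e-12, max_steps=1000):
    """Iterate x_{n+1} = f(x_n) until the a priori bound (23.6) is at most tol.

    Returns (x_n, n, bound) with bound = C**(n-1) / (1 - C) * |x_2 - x_1|.
    """
    d = abs(f(x1) - x1)       # |x_2 - x_1|
    x, n = x1, 1
    bound = d / (1.0 - C)     # estimate (23.6) with n = 1
    while bound > tol and n < max_steps:
        x = f(x)              # x now holds x_{n+1}
        n += 1
        bound = C ** (n - 1) / (1.0 - C) * d
    return x, n, bound

# Assumed example: f(x) = cos(x)/2 is a contraction on R with C = 1/2,
# since its derivative is bounded in absolute value by 1/2.
f = lambda x: 0.5 * math.cos(x)
u, n, bound = iterate_contraction(f, x1=0.0, C=0.5)
print(n, u, bound)
```

The estimate (23.6) is usually pessimistic: the iterates in this example settle much faster than the bound suggests, because the derivative of f near the fixed point is smaller than C.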

In case the function f is not defined on all of $R^p$, then somewhat more care
needs to be exercised to assure that the iterative definition (23.3) of the
sequence can be carried out and that the points remain in the domain of f.
Although some other formulations are possible, we shall content ourselves
with the following one.

23.7 THEOREM. Suppose that f is a contraction with constant C which is
defined for $D(f) = \{x \in R^p : \|x\| \le B\}$ and that $\|f(0)\| \le B(1 - C)$. Then the
sequence

$x_1 = 0,\ x_2 = f(x_1),\ \ldots,\ x_{n+1} = f(x_n),\ \ldots$

converges to the unique fixed point of f which lies in the set D(f).
PROOF. Indeed, if $x \in D = D(f)$, then $\|f(x) - f(0)\| \le C\,\|x - 0\| \le CB$,
whence it follows that

$\|f(x)\| \le \|f(0)\| + CB \le (1 - C)B + CB = B$.

Therefore $f(D) \subseteq D$. Thus the sequence $(x_n)$ can be defined and remains in
D, so the previous proof applies. Q.E.D.
The Contraction Theorem established above has certain advantages: it is
constructive, the error of approximation can be estimated, and it guarantees
a unique fixed point. However, it has the disadvantage that the requirement
that f be a contraction is a very severe restriction. It is a deep and important
fact, first proved in 1910 by L. E. J. Brouwer,† that any continuous function
with domain $D = \{x \in R^p : \|x\| \le B\}$ and range contained in D must have at
least one fixed point.
23.8 BROUWER FIXED POINT THEOREM. Let $B > 0$ and let $D =
\{x \in R^p : \|x\| \le B\}$. Then any continuous function with domain D and range
contained in D has at least one fixed point.
The proof of this result when p = 1 will be given as an exercise. For the
case p> 1, however, the proof would take us too far afield. For a proof
based on elementary notions only, see Dunford-Schwartz, pages 467-470.
For a more systematic account of fixed point and related theorems, consult
the book of Lefschetz.
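Although the Brouwer theorem is not constructive in general, in the case p = 1 a fixed point can be located numerically by bisection. The following Python sketch is an illustration under our own assumptions (the function name `fixed_point_by_bisection`, the tolerance, and the sample map cos on [-1, 1] are not from the text). It uses the auxiliary function $g(x) = f(x) - x$, which satisfies $g(-B) \ge 0 \ge g(B)$ whenever f maps [-B, B] into itself.

```python
import math

def fixed_point_by_bisection(f, B, tol=1e-10):
    """Locate a fixed point of a continuous f mapping [-B, B] into itself.

    Maintains g(lo) >= 0 and g(hi) <= 0 for g(x) = f(x) - x.
    """
    lo, hi = -B, B
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) - mid >= 0:
            lo = mid      # keep g(lo) >= 0
        else:
            hi = mid      # keep g(hi) < 0
    return 0.5 * (lo + hi)

# Assumed example: cos maps [-1, 1] into [cos 1, 1], a subset of [-1, 1].
u = fixed_point_by_bisection(math.cos, B=1.0)
print(u, math.cos(u))
```

The sketch only approximates a fixed point; that one exists is the content of the theorem for p = 1.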
Exercises

23.A. Examine each of the functions in Example 20.5 and either show that the
function is uniformly continuous on its domain or that it is not.
23.B. Give a proof of the Uniform Continuity Theorem 23.3 by using the
Lebesgue Covering Theorem 11.5.

† L. E. J. BROUWER (1881-1966) was professor at Amsterdam and dean of the Dutch school of
mathematics. In addition to his early contributions to topology, he is noted for his work on
the foundations of mathematics.

23.C. If B is bounded in $R^p$ and $f : B \to R^q$ is uniformly continuous, show that f is
bounded on B. Show that this conclusion fails if B is not bounded in $R^p$.
23.D. Show that the functions, defined for $x \in R$ by

$f(x) = \dfrac{1}{1 + x^2}$, $\quad g(x) = \sin x$,

are uniformly continuous on R.
23.E. Show that the functions, defined for $D = \{x \in R : x \ge 0\}$, by

$h(x) = x$, $\quad k(x) = e^{-x}$,

are uniformly continuous on D.
23.F. Show that the following functions are not uniformly continuous on their
domains.
(a) $f(x) = x^2$, $\quad D(f) = \{x \in R : x > 0\}$,
(b) $g(x) = \tan x$, $\quad D(g) = \{x \in R : 0 \le x < \pi/2\}$,
(c) $h(x) = e^x$, $\quad D(h) = R$.
(d) k(x) =sin (1/x), D(k)={x € R:x > 0}.
23.G. A function $g : R \to R^q$ is periodic if there exists a number $p > 0$ such that
$g(x + p) = g(x)$ for all $x \in R$. Show that a continuous periodic function is bounded
and uniformly continuous on R.
23.H. Let f be defined on $D \subseteq R^p$ to $R^q$, and suppose that f is uniformly
continuous on D. If $(x_n)$ is a Cauchy sequence in D, show that $(f(x_n))$ is a Cauchy
sequence in $R^q$.
23.I. Suppose that $f : (0, 1) \to R$ is uniformly continuous on (0, 1). Show that f
can be defined at x = 0 and x = 1 in such a way that it becomes continuous on [0, 1].
23.J. Let $D = \{x \in R^p : \|x\| < 1\}$. Show that $f : D \to R^q$ can be extended to a
continuous function on $D_1 = \{x \in R^p : \|x\| \le 1\}$ to $R^q$ if and only if f is uniformly
continuous on D.
23.K. If f and g are uniformly continuous on R to R, show that f + g is uniformly
continuous on R, but that fg need not be uniformly continuous on R even when one
of f and g is bounded.
23.L. If $f : I \to I$ is continuous, show that f has a fixed point in I. (Hint: Consider
$g(x) = f(x) - x$.)
23.M. Give an example of a function $f : R^p \to R^p$ such that $\|f(x) - f(u)\| \le \|x - u\|$
for all $x, u \in R^p$ which does not have a fixed point. (Why does this not contradict
Theorem 23.5?)
23.N. Let f and g be continuous functions on [a, b] such that the range $R(f) \subseteq
R(g) = [0, 1]$. Prove that there exists a point $c \in [a, b]$ such that $f(c) = g(c)$.

Project
23.α. This project introduces the notion of the “oscillation” of a function on a set
and at a point. Let $I = [a, b] \subseteq R$ and let $f : I \to R$ be bounded. If $A \subseteq I$ we define
the oscillation of f on A to be the number

$\Omega_f(A) = \sup \{f(x) - f(y) : x, y \in A\}$.

(a) Show that $0 \le \Omega_f(A) \le 2 \sup \{|f(x)| : x \in A\}$. If $A \subseteq B \subseteq I$, then $\Omega_f(A) \le
\Omega_f(B)$.
(b) If $c \in I$ we define the oscillation of f at c to be the number

$\omega_f(c) = \inf_{\delta > 0} \Omega_f(N_\delta)$,

where $N_\delta = \{x \in I : |x - c| < \delta\}$. Show that

$\omega_f(c) = \lim_{\delta \to 0} \Omega_f(N_\delta)$.

Also, if $\omega_f(c) < \alpha$, then there exists $\delta > 0$ such that $\Omega_f(N_\delta) < \alpha$.
(c) Show that f is continuous at $c \in I$ if and only if $\omega_f(c) = 0$.
(d) If $\alpha > 0$ and if $\omega_f(x) < \alpha$ for all $x \in I$, then there exists $\delta > 0$ such that if $A \subseteq I$ is
such that its diameter $d(A) = \sup \{|x - y| : x, y \in A\}$ is less than $\delta$, then $\Omega_f(A) < \alpha$.
(e) If $\alpha > 0$, then the set $D_\alpha = \{x \in I : \omega_f(x) \ge \alpha\}$ is a closed set in R. Show that

$D = \bigcup_{\alpha > 0} D_\alpha = \bigcup_{n \in N} D_{1/n}$

is the set of points at which f is discontinuous. Hence the set of points of
discontinuity of a function is the union of a countable family of closed sets. (Such a
set is called an $F_\sigma$-set.)
(f) Extend these definitions and results to a function defined on a closed cell in
$R^p$.

Section 24 Sequences of Continuous Functions

There are many times when one needs to consider a sequence of continu-
ous functions. In this section we shall present several interesting and
important theorems about such sequences. Theorem 24.1 will be used
very often in the following and is a key result. The remaining theorems
will not be used frequently, but the reader should be familiar with their
statements, at least.
In this section the importance of uniform convergence should become
clearer. We recall that a sequence $(f_n)$ of functions on a subset D of $R^p$ to
$R^q$ is said to converge uniformly on D to f if for every $\varepsilon > 0$ there is an $N(\varepsilon)$
such that if $n \ge N(\varepsilon)$ and $x \in D$, then $\|f_n(x) - f(x)\| < \varepsilon$. We recall from
Theorem 17.9 that this is true if and only if $\|f_n - f\|_D \to 0$, when $(f_n)$ is a
bounded sequence.

Interchange of Limit and Continuity


We observe that the limit of a sequence of continuous functions may not
be continuous. It is very easy to see this; for $n \in N$ and $x \in I$, let $f_n(x) = x^n$.
We have seen, in Example 17.2(b), that the sequence $(f_n)$ converges on I to
the function f defined by

$f(x) = 0$ for $0 \le x < 1$, $\quad f(x) = 1$ for $x = 1$.

Thus, despite the simple character of the continuous functions $f_n$, the limit
function is not continuous at the point x = 1.
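The fact that this convergence cannot be uniform is easy to observe numerically. In the following Python sketch (the names `f_n` and `f_limit` and the choice of test points are ours, made for illustration), we evaluate $f_n(x) = x^n$ at the point $x_n = (1/2)^{1/n}$, which lies in [0, 1) and at which $f_n$ takes the value 1/2 while the limit function vanishes.

```python
def f_n(n, x):
    """The function f_n(x) = x**n on I = [0, 1]."""
    return x ** n

def f_limit(x):
    """Pointwise limit of (f_n) on I: 0 on [0, 1) and 1 at x = 1."""
    return 1.0 if x == 1.0 else 0.0

# At x_n = (1/2)**(1/n) the gap |f_n(x_n) - f(x_n)| stays near 1/2 for every n,
# so the supremum of |f_n - f| over I cannot tend to zero.
gaps = []
for n in (1, 10, 100, 1000):
    x_n = 0.5 ** (1.0 / n)
    gaps.append(abs(f_n(n, x_n) - f_limit(x_n)))
print(gaps)
```

At any fixed x in [0, 1), by contrast, the values $f_n(x)$ do tend to 0; it is only the supremum over I that fails to become small.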
Although the extent of the discontinuity of the limit function in the
example just given is not very great, it is evident that more complicated
examples can be constructed which will produce more extensive dis-
continuity. It would be interesting to investigate exactly how discontinu-
ous the limit of a sequence of continuous functions can be, but this
investigation would take us too far afield. Furthermore, for most applica-
tions it is more important to find additional conditions which will guarantee
that the limit function is continuous.
We shall now establish the important fact that uniform convergence of a
sequence of continuous functions is sufficient to guarantee the continuity of
the limit function.
24.1 THEOREM. Let $F = (f_n)$ be a sequence of continuous functions with
domain D in $R^p$ and range in $R^q$ and let this sequence converge uniformly on
D to a function f. Then f is continuous on D.
PROOF. Since $(f_n)$ converges uniformly on D to f, given $\varepsilon > 0$ there is a
natural number $N = N(\varepsilon/3)$ such that $\|f_N(x) - f(x)\| < \varepsilon/3$ for all x in D. To
show that f is continuous at a point a in D, we note that

(24.1) $\|f(x) - f(a)\| \le \|f(x) - f_N(x)\| + \|f_N(x) - f_N(a)\| + \|f_N(a) - f(a)\|$
$< \varepsilon/3 + \|f_N(x) - f_N(a)\| + \varepsilon/3$.

Since $f_N$ is continuous, there exists a number $\delta = \delta(\varepsilon/3, a, f_N) > 0$ such
that if $\|x - a\| < \delta$ and $x \in D$, then $\|f_N(x) - f_N(a)\| < \varepsilon/3$. (See Figure 24.1.)
Therefore, for such x we have $\|f(x) - f(a)\| < \varepsilon$. This establishes the
continuity of the limit function f at the arbitrary point a in D. Q.E.D.
Figure 24.1

We remark that, although the uniform convergence of the sequence of
continuous functions is sufficient for the continuity of the limit function, it is
not necessary. Thus if $(f_n)$ is a sequence of continuous functions which
converges to a continuous function f, then it does not follow that the
convergence is uniform (see Exercise 24.A).
As we have seen in Theorem 17.9, uniform convergence on a set D of a
sequence of functions is implied by convergence in the uniform norm on
D. Hence Theorem 24.1 has the following formulation.
24.2 THEOREM. If $(f_n)$ is a sequence of functions in $BC_{pq}(D)$ such that
$\|f_n - f\|_D \to 0$, then $f \in BC_{pq}(D)$.

Approximation Theorems
For many applications it is convenient to “approximate” continuous
functions by functions of an elementary nature. Although there are several
reasonable definitions that one can use to make the word “‘approximate”’
more precise, one of the most natural as well as one of the most important is
to require that at every point of the given domain the approximating
function shall not differ from the given function by more than the preassigned
error. This sense is sometimes referred to as “uniform approximation”
and it is intimately connected with uniform convergence. We suppose
that f is a given function with domain D = D(f) contained in $R^p$ and range in
$R^q$. We say that a function g approximates f uniformly on D to within $\varepsilon > 0$,
if

$\|g(x) - f(x)\| \le \varepsilon$ for all $x \in D$;

or, what amounts to the same thing, if

$\|g - f\|_D = \sup \{\|g(x) - f(x)\| : x \in D\} \le \varepsilon$.

Here we have used the norm which was introduced in equation (17.5). We
say that the function f can be uniformly approximated on D by functions in a
class $\mathcal{G}$ if, for each number $\varepsilon > 0$ there is a function $g_\varepsilon$ in $\mathcal{G}$ such that
$\|g_\varepsilon - f\|_D < \varepsilon$; or, equivalently, if there exists a sequence of functions in $\mathcal{G}$
which converges uniformly on D to f.
24.3 DEFINITION. A function g with domain $R^p$ and range in $R^q$ is
called a step function if it assumes only a finite number of distinct values in
$R^q$, each non-zero value being taken on an interval in $R^p$.

For example, if p = q = 1, then the function g defined explicitly by

$g(x) = 0$, $\quad x \le -2$,
$\quad = 1$, $\quad -2 < x \le 0$,
$\quad = 3$, $\quad 0 < x < 1$,
$\quad = -5$, $\quad 1 \le x \le 3$,
$\quad = 0$, $\quad x > 3$,

is a step function.

We now show that a continuous function whose domain is a compact cell
can be uniformly approximated by step functions.
24.4 THEOREM. Let f be a continuous function whose domain D is a
compact cell in $R^p$ and whose values belong to $R^q$. Then f can be uniformly
approximated on D by step functions.
PROOF. Let $\varepsilon > 0$ be given; since f is uniformly continuous (Theorem
23.3), there is a number $\delta(\varepsilon) > 0$ such that if x, y belong to D and
$\|x - y\| < \delta(\varepsilon)$, then $\|f(x) - f(y)\| < \varepsilon$. Divide the domain D of f into disjoint
cells $I_1, \ldots, I_n$ such that if x, y belong to $I_k$, then $\|x - y\| < \delta(\varepsilon)$. (How?)
Let $x_k$ be any point belonging to the cell $I_k$, $k = 1, \ldots, n$, and define
$g_\varepsilon(x) = f(x_k)$ for $x \in I_k$ and $g_\varepsilon(x) = 0$ for $x \notin D$. Then it is clear that
$\|g_\varepsilon(x) - f(x)\| < \varepsilon$ for $x \in D$ so that $g_\varepsilon$ approximates f uniformly on D to
within $\varepsilon$. (See Figure 24.2.) Q.E.D.
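The construction in this proof is easily carried out in the case p = q = 1. The following Python sketch is one possible realization under our own assumptions: the cells are taken of equal width, the chosen point $x_k$ is the left endpoint of the kth cell, and the names `step_approximation`, `eps`, and `worst` are ours. The sample function sin satisfies $|\sin x - \sin y| \le |x - y|$, so that any cell width below $\varepsilon$ will serve.

```python
import math

def step_approximation(f, a, b, n):
    """Return g with g(x) = f(left endpoint of the cell containing x) on [a, b]
    and g(x) = 0 outside [a, b]; the n cells have equal width (b - a) / n."""
    h = (b - a) / n
    def g(x):
        if x < a or x > b:
            return 0.0
        k = min(int((x - a) / h), n - 1)   # index of the cell containing x
        return f(a + k * h)
    return g

eps = 0.01
n = 400                      # cell width 3/400 = 0.0075 < eps
g = step_approximation(math.sin, 0.0, 3.0, n)
# Largest deviation observed on a fine grid of sample points in [0, 3].
worst = max(abs(g(x) - math.sin(x)) for x in (3.0 * i / 5000 for i in range(5001)))
print(worst)
```

The grid maximum is only a sample of the supremum, but for this function the Lipschitz estimate guarantees that the true supremum is at most the cell width.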
It is natural to expect that a continuous function can be uniformly
approximated by simple functions which are also continuous (as the step
functions are not). For simplicity, we shall establish the next result only in
the case where p= q=1 although there evidently is a generalization for
higher dimensions.
We say that a function g defined on a compact cell J = [a, b] of R with
values in R is piecewise linear if there are a finite number of points $c_k$ with
$a = c_0 < c_1 < c_2 < \cdots < c_n = b$ and corresponding real numbers $A_k$, $B_k$, $k =
1, \ldots, n$, such that when x satisfies the relation $c_{k-1} < x < c_k$ the function g
has the form

$g(x) = A_k x + B_k$, $\quad k = 1, \ldots, n$.

If g is continuous on J, then the constants $A_k$, $B_k$ must satisfy certain
relations, of course.

Figure 24.2. Approximation by a step function.

24.5 THEOREM. Let f be a continuous function whose domain is a
compact cell J in R. Then f can be uniformly approximated on J by continuous
piecewise linear functions.
PROOF. As before, f is uniformly continuous on the compact set J.
Therefore, given $\varepsilon > 0$, we divide J = [a, b] into cells by adding intermediate
points $c_k$, $k = 0, 1, \ldots, n$, with $a = c_0 < c_1 < c_2 < \cdots < c_n = b$ so that
$c_k - c_{k-1} < \delta(\varepsilon)$. Connect the points $(c_k, f(c_k))$ by line segments, and define
the resulting continuous piecewise linear function $g_\varepsilon$. It is clear that $g_\varepsilon$
approximates f uniformly on J within $\varepsilon$. Q.E.D.

Approximation by Polynomials
We shall now prove a deeper, more useful, and more interesting result
concerning the approximation by polynomials. First, we prove the Weier-
strass Approximation Theorem for p = q = 1, by using the polynomials of S.
Bernstein.
24.6 DEFINITION. Let f be a function with domain I = [0, 1] and range
in R. The nth Bernstein polynomial for f is defined to be

(24.2) $B_n(x) = B_n(x; f) = \sum_{k=0}^{n} f\!\left(\dfrac{k}{n}\right) \binom{n}{k} x^k (1 - x)^{n-k}$.

These Bernstein polynomials are not as terrifying as they look at first glance. A
reader with some experience with probability should see the Binomial Distribution
lurking in the background. Even without such experience, the reader should note
that the value $B_n(x; f)$ of the polynomial at the point x is calculated from the values
$f(0), f(1/n), f(2/n), \ldots, f(1)$, with certain non-negative weight factors $\varphi_k(x) =
\binom{n}{k} x^k (1 - x)^{n-k}$ which may be seen to be very small for those values of k for which
k/n is far from x. In fact, the function $\varphi_k$ is non-negative on I and takes its maximum
value at the point k/n. Moreover, as we shall see below, the sum of all the $\varphi_k(x)$,
$k = 0, 1, \ldots, n$, is 1 for each x in I.
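For readers who wish to experiment, the definition (24.2) translates directly into a short Python sketch. The function names `bernstein_weights` and `bernstein` and the sample values n = 20, x = 0.3 are our own; `math.comb` supplies the binomial coefficient.

```python
from math import comb

def bernstein_weights(n, x):
    """Weights phi_k(x) = C(n, k) x^k (1 - x)^(n - k) for k = 0, ..., n."""
    return [comb(n, k) * x ** k * (1.0 - x) ** (n - k) for k in range(n + 1)]

def bernstein(f, n, x):
    """Value B_n(x; f) of the nth Bernstein polynomial of f at x in [0, 1]."""
    return sum(f(k / n) * w for k, w in enumerate(bernstein_weights(n, x)))

weights = bernstein_weights(20, 0.3)
total = sum(weights)                                     # should be close to 1
largest = max(range(21), key=lambda k: weights[k])       # index k of the largest weight at x = 0.3
print(total, largest, largest / 20)
print(bernstein(lambda t: abs(t - 0.5), 20, 0.3))
```

For fixed x = 0.3 the largest weight occurs at the index k with k/n = 0.3, in keeping with the remark that the weights are small when k/n is far from x.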

We recall that the Binomial Theorem asserts that

(24.3) $(s + t)^n = \sum_{k=0}^{n} \binom{n}{k} s^k t^{n-k}$,

where $\binom{n}{k}$ denotes the binomial coefficient

$\binom{n}{k} = \dfrac{n!}{k!\,(n - k)!}$.
† SERGE N. BERNSTEIN (1880-1968) made profound contributions to analysis, approximation
theory, and probability. He was born in Odessa and was a professor in Leningrad and
Moscow.

By direct inspection we observe that

(24.4) $\binom{n-1}{k-1} = \dfrac{(n-1)!}{(k-1)!\,(n-k)!} = \dfrac{k}{n} \binom{n}{k}$,

(24.5) $\binom{n-2}{k-2} = \dfrac{(n-2)!}{(k-2)!\,(n-k)!} = \dfrac{k(k-1)}{n(n-1)} \binom{n}{k}$.

Now let s = x and t = 1 - x in (24.3), to obtain

(24.6) $1 = \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k}$.

Writing (24.6) with n replaced by n - 1 and k by j, we have

$1 = \sum_{j=0}^{n-1} \binom{n-1}{j} x^j (1 - x)^{n-1-j}$.

Multiply this last relation by x and apply the identity (24.4) to obtain

$x = \sum_{j=0}^{n-1} \dfrac{j+1}{n} \binom{n}{j+1} x^{j+1} (1 - x)^{n-(j+1)}$.

Now let k = j + 1, whence

$x = \sum_{k=1}^{n} \dfrac{k}{n} \binom{n}{k} x^k (1 - x)^{n-k}$.

We also note that the term corresponding to k = 0 can be included, since it
vanishes. Hence we have

(24.7) $x = \sum_{k=0}^{n} \dfrac{k}{n} \binom{n}{k} x^k (1 - x)^{n-k}$.

A similar calculation, based on (24.6) with n replaced by n - 2 and identity
(24.5), shows that

$(n^2 - n)\,x^2 = \sum_{k=0}^{n} (k^2 - k) \binom{n}{k} x^k (1 - x)^{n-k}$.

Therefore we conclude that

(24.8) $\left(1 - \dfrac{1}{n}\right) x^2 + \dfrac{1}{n}\,x = \sum_{k=0}^{n} \left(\dfrac{k}{n}\right)^2 \binom{n}{k} x^k (1 - x)^{n-k}$.

Multiplying (24.6) by $x^2$, (24.7) by $-2x$, and adding them to (24.8), we obtain

(24.9) $\dfrac{1}{n}\,x(1 - x) = \sum_{k=0}^{n} \left(x - \dfrac{k}{n}\right)^2 \binom{n}{k} x^k (1 - x)^{n-k}$,

which is an estimate that will be needed below.
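The identities (24.6) through (24.9) can be checked numerically for particular values of n and x. The following Python sketch is offered only as a sanity check, not as a proof; the function name `moment_sums` and the sample values n = 12, x = 0.37 are our own.

```python
from math import comb

def moment_sums(n, x):
    """Return the right-hand sums of (24.6), (24.7), (24.8) and (24.9)."""
    phi = [comb(n, k) * x ** k * (1.0 - x) ** (n - k) for k in range(n + 1)]
    s0 = sum(phi)                                                  # (24.6)
    s1 = sum((k / n) * p for k, p in enumerate(phi))               # (24.7)
    s2 = sum((k / n) ** 2 * p for k, p in enumerate(phi))          # (24.8)
    var = sum((x - k / n) ** 2 * p for k, p in enumerate(phi))     # (24.9)
    return s0, s1, s2, var

n, x = 12, 0.37
s0, s1, s2, var = moment_sums(n, x)
print(s0, 1.0)
print(s1, x)
print(s2, (1 - 1 / n) * x ** 2 + x / n)
print(var, x * (1 - x) / n)
```

Each printed pair should agree up to rounding error. A reader familiar with probability will recognize the last quantity as the variance of k/n under the Binomial Distribution.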


Examining Definition 24.6, formula (24.6) says that the nth Bernstein
polynomial for the constant function $f_0(x) = 1$ coincides with $f_0$. Formula
(24.7) says the same thing for the function $f_1(x) = x$. Formula (24.8) asserts
that the nth Bernstein polynomial for the function $f_2(x) = x^2$ is

$B_n(x; f_2) = (1 - 1/n)\,x^2 + (1/n)\,x$,

which converges uniformly on I to $f_2$. We shall now prove that if f is any
continuous function on I to R, then the sequence of Bernstein polynomials
has the property that it converges uniformly on I to f. This will give us a
constructive proof of the Weierstrass Approximation Theorem. In the
process of proving this theorem we shall need formula (24.9).
24.7 BERNSTEIN APPROXIMATION THEOREM. Let f be continuous on
I with values in R. Then the sequence of Bernstein polynomials for f,
defined in equation (24.2), converges uniformly on I to f.
PROOF. On multiplying formula (24.6) by f(x), we get

$f(x) = \sum_{k=0}^{n} f(x) \binom{n}{k} x^k (1 - x)^{n-k}$.

Therefore, we obtain the relation

$f(x) - B_n(x) = \sum_{k=0}^{n} \{f(x) - f(k/n)\} \binom{n}{k} x^k (1 - x)^{n-k}$,

from which it follows that

(24.10) $|f(x) - B_n(x)| \le \sum_{k=0}^{n} |f(x) - f(k/n)| \binom{n}{k} x^k (1 - x)^{n-k}$.

Now f is bounded, say by M, and also uniformly continuous. Note that if k
is such that k/n is near x, then the corresponding term in the sum (24.10) is
small because of the continuity of f at x; on the other hand, if k/n is far
from x, the factor involving f can only be said to be less than 2M and any
smallness must arise from the other factors. We are led, therefore, to
break (24.10) into two parts: those values of k where x —k/n is small and
those for which x —k/n is large.
Let $\varepsilon > 0$ and let $\delta(\varepsilon)$ be as in the definition of uniform continuity for f.
It turns out to be convenient to choose n so large that

(24.11) $n \ge \sup \{(\delta(\varepsilon))^{-4},\ M^2/\varepsilon^2\}$,

and break (24.10) into two sums. The sum taken over those k for which
$|x - k/n| < n^{-1/4} \le \delta(\varepsilon)$ yields the estimate

$\sum_{k} \varepsilon \binom{n}{k} x^k (1 - x)^{n-k} \le \varepsilon \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} = \varepsilon$.

The sum taken over those k for which $|x - k/n| \ge n^{-1/4}$, that is, $(x - k/n)^2 \ge
n^{-1/2}$, can be estimated by using formula (24.9). Since $1 \le \sqrt{n}\,(x - k/n)^2$ for
these k, for this part of the sum in (24.10) we obtain the upper bound

$\sum_{k} 2M \binom{n}{k} x^k (1 - x)^{n-k} \le 2M\sqrt{n} \sum_{k=0}^{n} (x - k/n)^2 \binom{n}{k} x^k (1 - x)^{n-k}$
$\le 2M\sqrt{n} \left\{\dfrac{1}{n}\,x(1 - x)\right\} \le \dfrac{M}{2\sqrt{n}}$,

since $x(1 - x) \le \tfrac14$ on the interval I. Recalling the determination (24.11)
for n, we conclude that each of these two parts of (24.10) is bounded above
by $\varepsilon$. Hence, for n chosen in (24.11) we have

$|f(x) - B_n(x)| < 2\varepsilon$,

independently of the value of x. This shows that the sequence $(B_n)$
converges uniformly on I to f. Q.E.D.
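The uniform convergence asserted by the theorem can be observed in a small experiment. The Python sketch below rests on our own choices: the test function $f(x) = |x - \tfrac12|$, the degrees 10, 40, 160, and a grid of 201 sample points standing in for the supremum over I. The names `bernstein` and `sup_error` are hypothetical helpers, not notation from the text.

```python
from math import comb

def bernstein(f, n, x):
    """Value of the nth Bernstein polynomial of f at x in [0, 1]."""
    return sum(f(k / n) * comb(n, k) * x ** k * (1.0 - x) ** (n - k)
               for k in range(n + 1))

def sup_error(f, n, grid_size=200):
    """Largest value of |f(x) - B_n(x; f)| over an equally spaced grid in [0, 1]."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)

f = lambda x: abs(x - 0.5)
errors = [sup_error(f, n) for n in (10, 40, 160)]
print(errors)
```

The observed errors decrease as n grows, though slowly. This is consistent with the proof, whose bound on the second part of (24.10) decays only like $1/\sqrt{n}$.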

As a direct corollary of the theorem of Bernstein, we have the following
important result.
24.8 WEIERSTRASS APPROXIMATION THEOREM. Let f be a continuous
function on a compact interval of R and with values in R. Then f can be
uniformly approximated by polynomials.
PROOF. If f is defined on [a, b], then the function g defined on I = [0, 1]
by

$g(t) = f((b - a)t + a)$, $\quad t \in I$,

is continuous. Hence g can be uniformly approximated by Bernstein
polynomials and a simple change of variable yields a polynomial approximation
to f. Q.E.D.
We have chosen to go through the details of the Bernstein Theorem 24.7
because it gives a constructive method of finding a sequence of polynomials
which converges uniformly on I to the given continuous function. In
addition, the method of proof of Theorem 24.7 is characteristic of many
analytic arguments and it is important to develop an understanding of such
arguments. Finally, although we shall establish more general approxima-
tion results in Section 26, in order to do so we shall need to know that the
absolute value function can be uniformly approximated on a compact
interval by polynomials. Although it would be possible to show this
special case directly, the argument is not so simple. For a more complete
discussion of approximation the reader is referred to the book of E.
Cheney listed in the References.

Exercises

24.A. Give an example of a sequence of continuous functions which converges
to a continuous function but where the convergence is not uniform.
24.B. Give an example of a sequence of everywhere discontinuous functions
which converges uniformly to a continuous function.
24.C. Give an example of a sequence of continuous functions which converges
on a compact set to a function that has an infinite number of discontinuities.
24.D. Let $(f_n)$ be a sequence of continuous functions on $D \subseteq R^p$ to $R^q$ such that
$(f_n)$ converges uniformly to f on D, and let $(x_n)$ be a sequence of elements in D
which converges to $x \in D$. Does it follow that $(f_n(x_n))$ converges to $f(x)$?
24.E. Consider the sequences $(f_n)$ defined on $D = \{x \in R : x \ge 0\}$ to R by the
following formulas:
x . .
(a) =, 0) © a
x" x" Xn
@ fo. © a, () Xe
Discuss the convergence and the uniform convergence of these sequences and the
continuity of the limit functions. In case of non-uniform convergence on D,
consider appropriate intervals in D.
24.F. Let $(f_n)$ be a sequence on $D \subseteq R^p$ to $R^q$ which converges on D to f.
Suppose that each $f_n$ is continuous at c and that the sequence converges uniformly
on some neighborhood of c. Prove that f is continuous at c.
24.G. Let $(f_n)$ be a sequence of continuous functions on $D \subseteq R^p$ to R which is
monotone decreasing in the sense that if $x \in D$, then

$f_1(x) \ge f_2(x) \ge \cdots \ge f_n(x) \ge f_{n+1}(x) \ge \cdots$

If $\lim (f_n(c)) = 0$ for some $c \in D$ and $\varepsilon > 0$, show that there exists $m \in N$ and a
neighborhood U of c such that if $n \ge m$ and $x \in U \cap D$, then $f_n(x) < \varepsilon$.
24.H. Use the preceding exercise to prove the following result of U. Dini.† If
$(f_n)$ is a monotone sequence of continuous functions which converges at each point
of a compact set K in $R^p$ to a function f which is continuous on K, then the
convergence is uniform on K.
24.I. Show, by examples, that Dini’s Theorem fails if we drop either of the
hypotheses that K is compact or that f is continuous.
24.J. Prove the following theorem of G. Pólya.‡ If for each $n \in N$ the function $f_n$
on I to R is monotone increasing and if $f(x) = \lim (f_n(x))$ is continuous on I, then
the convergence is uniform on I. (Observe that it is not assumed that $f_n$ is
continuous.)

† ULISSE DINI (1845-1918) studied and taught at Pisa. He worked on geometry and analysis,
particularly Fourier series.
‡ GEORGE PÓLYA (1887- ) was born in Budapest and taught at Zürich and Stanford. He
is widely known for his work in complex analysis, probability, number theory, and the theory
of inference.

24.K. Let $(f_n)$ be a sequence of continuous functions on $D \subseteq R^p$ to $R^q$ and let
$f(x) = \lim (f_n(x))$ for $x \in D$. Show that f is continuous at a point $c \in D$ if and only if
for each $\varepsilon > 0$ there exists $m \in N$ and a neighborhood U of c such that if $x \in D \cap U$,
then $\|f_m(x) - f(x)\| < \varepsilon$.
24.L. Suppose that $f : R \to R$ is uniformly continuous on R and for $n \in N$, let
$f_n(x) = f(x + 1/n)$ for $x \in R$. Show that $(f_n)$ converges uniformly on R to f.
24.M. If $f_2(x) = x^2$ for $x \in I$, how large must n be so that the nth Bernstein
polynomial $B_n$ for $f_2$ satisfies $|f_2(x) - B_n(x)| \le 1/1000$ for all $x \in I$?
24.N. If $f_3(x) = x^3$ for $x \in I$, calculate the nth Bernstein polynomial for $f_3$. Show
directly that this sequence of polynomials converges uniformly to $f_3$ on I.
24.O. Differentiate equation (24.3) once with respect to s and substitute $s = x$,
$t = 1 - x$ to give another derivation of equation (24.7).
24.P. Differentiate equation (24.3) twice with respect to s to give another
derivation of equation (24.8).
24.Q. (a) Let J be a compact interval in R, and let $a \in R$, $c \in J$. Draw a graph of
the function g:J > R defined by p(x) =a tix —cl+x+c).
(b) Show that every continuous piecewise linear function can be written as the
sum of a finite number of functions $\varphi_1, \ldots, \varphi_n$ having the form given in part (a).
(c) Assuming that, on any compact interval, the absolute value function A(x) =
|x| is the uniform limit of a sequence of polynomials in x, use the observation in part
(b) to give another proof of the Weierstrass Approximation Theorem. (This
method of proof is due to Lebesgue.)
24.R. Prove that the function $x \mapsto e^x$ on R is not the uniform limit on R of a
sequence of polynomials. Hence the Weierstrass Approximation Theorem may
fail for infinite intervals.
24.S. Show that the Weierstrass Approximation Theorem fails for bounded
open intervals.

Section 25 Limits of Functions

Although it is not possible to give a precise definition, the field of
“mathematical analysis” is generally understood to be that body of
mathematics in which systematic use is made of various limiting concepts.
If this is a reasonably accurate statement, it may seem odd to the reader
that we have waited this long before inserting a section dealing with limits.
There are several reasons for this delay, the main one being that elemen-
tary analysis deals with several different types of limit operations. We
have already discussed the convergence of sequences and the limiting
implicit in the study of continuity. In the next chapters, we shall bring in
the limiting operations connected with the derivative and the integral.
Although all of these limit notions are special cases of a more general one,
the general notion is of a rather abstract character. For that reason, we
prefer to introduce and discuss the notions separately, rather than to
develop the general limiting idea first and then specialize. Once the
special cases are well understood it is not difficult to comprehend the
abstract notion. For an excellent exposition of this abstract limit, see
the expository article of E. J. McShane cited in the References.
In this section we shall be concerned with the limit of a function at a
point and some slight extensions of this idea. Often this idea is studied
before continuity; in fact, the very definition of a continuous function is
sometimes expressed in terms of this limit instead of using the definition we
have given in Section 20. One of the reasons why we have chosen to study
continuity separately from the limit is that we shall introduce two slightly
different definitions of the limit of a function at a point. Since both
definitions are widely used, we shall present them both and attempt to
relate them to each other.
Unless there is specific mention to the contrary, we shall let f be a
function with domain D contained in $R^p$ and values in $R^q$ and we shall
consider the limiting character of f at a cluster point c of D. Therefore,
every neighborhood of c contains infinitely many points of D.
25.1 DEFINITION. (i) An element b of $R^q$ is said to be the deleted
limit of f at c if for every neighborhood V of b there is a neighborhood U
of c such that if x belongs to $U \cap D$ and $x \ne c$, then f(x) belongs to V. In
this case we write

(25.1) $b = \lim_{c} f$ or $b = \lim_{x \to c} f(x)$.

(ii) An element b of $R^q$ is said to be the non-deleted limit of f at c if for
every neighborhood V of b there is a neighborhood U of c such that if x
belongs to $U \cap D$, then f(x) belongs to V. In this case we write

(25.2) $b = \operatorname{Lim}_{c} f$ or $b = \operatorname{Lim}_{x \to c} f(x)$.

It is important to observe that the difference between these two notions
centers on whether the value f(c), when it exists, is considered or not.
Note also the rather subtle notational distinction we have introduced in
equations (25.1) and (25.2). It should be realized that most authors
introduce only one of these notions, in which case they refer to it merely as
“the limit’? and generally employ the notation in (25.1). Since the deleted
limit is the more popular, we have chosen to preserve the conventional
symbolism in referring to it.
The uniqueness of either limit, when it exists, is readily established. We
content ourselves with the following statement.
25.2 LEMMA. (a) If either of the limits $\lim_{c} f$ and $\operatorname{Lim}_{c} f$ exists, then it is
uniquely determined.
(b) If the non-deleted limit exists, then the deleted limit exists and
$\lim_{c} f = \operatorname{Lim}_{c} f$.
(c) If c does not belong to the domain D of f, then the deleted limit exists if
and only if the non-deleted limit exists.
Part (b) of the lemma just stated shows that the notion of the non-
deleted limit is somewhat more restrictive than that of the deleted limit.
Part (c) shows that they can be different only in the case where c belongs to
D. To give an example where these notions differ, consider the function f
on R to R defined by

(25.3) $f(x) = 0$ for $x \ne 0$, $\quad f(x) = 1$ for $x = 0$.

If c = 0, then the deleted limit of f at c = 0 exists and equals 0, while the
non-deleted limit does not exist.
We now state some necessary and sufficient conditions for the existence
of the limits, leaving their proof to the reader. It should be realized that in
part (c) of both results the limit refers to the limit of a sequence, which was
discussed in Section 14.

25.3 THEOREM. The following statements, pertaining to the deleted
limit, are equivalent.
(a) The deleted limit $b = \lim_{c} f$ exists.
(b) If $\varepsilon > 0$, there is a $\delta > 0$ such that if $x \in D$ and $0 < \|x - c\| < \delta$, then
$\|f(x) - b\| < \varepsilon$.
(c) If $(x_n)$ is any sequence in D such that $x_n \ne c$ and $c = \lim (x_n)$, then
$b = \lim (f(x_n))$.
25.4 THEOREM. The following statements, pertaining to the non-deleted
limit, are equivalent.
(a) The non-deleted limit $b = \operatorname{Lim}_{c} f$ exists.
(b) If $\varepsilon > 0$, there is a $\delta > 0$ such that if $x \in D$ and $\|x - c\| < \delta$, then
$\|f(x) - b\| < \varepsilon$.
(c) If $(x_n)$ is any sequence in D such that $c = \lim (x_n)$, then we have
$b = \lim (f(x_n))$.
The next result yields an instructive connection between these two limits
and continuity of f at c.
25.5 THEOREM. If c is a cluster point belonging to the domain D of f,
then the following statements are equivalent.
(a) The function f is continuous at c.
(b) The deleted limit $\lim_{c} f$ exists and equals f(c).
(c) The non-deleted limit $\operatorname{Lim}_{c} f$ exists.
PROOF. If (a) holds and V is a neighborhood of f(c), then there exists a
neighborhood U of c such that if x belongs to $U \cap D$, then f(x) belongs to
V. Clearly, this implies that Lim f exists at c and equals f(c). Similarly,
f(x) belongs to V for all $x \ne c$ for which $x \in U \cap D$, in which case lim f
exists and equals f(c). Conversely, statements (b) and (c) are readily seen
to imply (a). Q.E.D.
If f and g are two functions which have deleted (respectively, non-deleted)
limits at a cluster point c of $D(f + g) = D(f) \cap D(g)$, then their
sum f + g has a deleted (respectively, non-deleted) limit at c and

$\lim_{c} (f + g) = \lim_{c} f + \lim_{c} g$,

(respectively, $\operatorname{Lim}_{c} (f + g) = \operatorname{Lim}_{c} f + \operatorname{Lim}_{c} g$).

Similar results hold for other algebraic combinations of functions, as is
easily seen. The following result, concerning the composition of two
functions, is slightly deeper and is a place where the non-deleted limit is
simpler than the deleted limit.

25.6 THEOREM. Suppose that f has domain D(f) in $R^p$ and range in
$R^q$ and that g has domain D(g) in $R^q$ and range in $R^r$. Let $g \circ f$ be the
composition of g and f and let c be a cluster point of $D(g \circ f)$.
(a) If the deleted limits $b = \lim_{c} f$ and $a = \lim_{b} g$ both exist and if either g is
continuous at b or $f(x) \ne b$ for x in a neighborhood of c, then the deleted limit
of $g \circ f$ exists at c and $a = \lim_{c} g \circ f$.
(b) If the non-deleted limits $b = \operatorname{Lim}_{c} f$ and $a = \operatorname{Lim}_{b} g$ both exist, then the
non-deleted limit of $g \circ f$ exists at c and

$a = \operatorname{Lim}_{c}\, g \circ f$.

PROOF. (a) Let W be a neighborhood of a in $R^r$; since $a = \lim g$ at b,
there is a neighborhood V of b such that if y belongs to $V \cap D(g)$ and
$y \ne b$, then $g(y) \in W$. Since $b = \lim f$ at c, there is a neighborhood U of c
such that if x belongs to $U \cap D(f)$ and $x \ne c$, then $f(x) \in V$. Hence, if x
belongs to the possibly smaller set $U \cap D(g \circ f)$, and $x \ne c$, then
$f(x) \in V \cap D(g)$. If $f(x) \ne b$ on some neighborhood $U_1$ of c, it follows that for
$x \ne c$ in $(U_1 \cap U) \cap D(g \circ f)$, we have $(g \circ f)(x) \in W$, so that a is the deleted limit
of $g \circ f$ at c. If g is continuous at b, then $(g \circ f)(x) \in W$ for x in $U \cap D(g \circ f)$
and $x \ne c$.
To prove part (b), we note that the exceptions made in the proof of (a)
are no longer necessary. Hence if x belongs to $U \cap D(g \circ f)$, then
$f(x) \in V \cap D(g)$ and, therefore, $(g \circ f)(x) \in W$. Q.E.D.
The conclusion in part (a) of the preceding theorem may fail if we drop
the condition that g is continuous at b or that $f(x) \ne b$ on a neighborhood
of c. To substantiate this remark, let f be the function on R to R defined
in formula (25.3) and let g = f and c = 0. Then $g \circ f$ is given by

    $(g \circ f)(x) = 1$, $x \ne 0$,
    $(g \circ f)(x) = 0$, $x = 0$.

Furthermore, we have $\lim_{x \to 0} f(x) = 0$ and $\lim_{y \to 0} g(y) = 0$, whereas it is
clear that $\lim_{x \to 0} (g \circ f)(x) = 1$. (Note that the non-deleted limits do not
exist for these functions.)
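The counterexample can be checked numerically. The sketch below assumes that formula (25.3), which lies outside this passage, defines f(x) = 0 for x ≠ 0 and f(0) = 1; this is the form consistent with the values of g∘f stated above. The helper names are illustrative only.

```python
def f(x):
    # Assumed form of the function in formula (25.3):
    # 0 away from the origin, 1 at the origin.
    return 1.0 if x == 0 else 0.0

g = f  # the text takes g = f

def compose(x):
    # (g o f)(x)
    return g(f(x))

# Approach c = 0 through nonzero points only, as the deleted limit requires.
xs = [10.0 ** (-k) for k in range(1, 8)]
values_f = [f(x) for x in xs]          # tends to the deleted limit b = 0
values_gf = [compose(x) for x in xs]   # tends to 1, not to a = lim_{y->0} g(y) = 0
```

Here f takes the value b = 0 at every x ≠ 0, and g is not continuous at 0, so neither hypothesis of Theorem 25.6(a) holds.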

Upper Limits at a Point


For the remainder of the present section, we shall consider the case
where q = 1. Thus f is a function with domain D in $R^p$ and values in R
and the point c in $R^p$ is a cluster point of D. We shall define the limit
superior or the upper limit of f at c. Again there are two possibilities
depending on whether deleted or non-deleted neighborhoods are con-
sidered, and we shall discuss both possibilities. It is clear that we can define
the limit inferior in a similar fashion. One thing to be noted here is that,
although the existence of the limit in R (deleted or not) is a relatively
delicate matter, the limits superior to be defined have the virtue that if f is
bounded, then their existence is guaranteed.
The ideas in this part are parallel to the notion of the limit superior of a
sequence in R which was introduced in Section 19. However, we shall
not assume familiarity with what was done there, except in some of the
exercises.
25.7 DEFINITION. Suppose that f is bounded on a neighborhood of
the point c. If r > 0, define $\varphi(r)$ and $\Phi(r)$ by

(a) $\varphi(r) = \sup \{f(x) : 0 < \|x - c\| < r,\ x \in D\}$,
(b) $\Phi(r) = \sup \{f(x) : \|x - c\| < r,\ x \in D\}$,

and set

(c) $\limsup_{x \to c} f = \inf \{\varphi(r) : r > 0\}$,
(d) $\operatorname{Lim\,sup}_{x \to c} f = \inf \{\Phi(r) : r > 0\}$.

These quantities are called the deleted limit superior and the non-deleted
limit superior of f at c, respectively.
Since these quantities are defined as the infima of the image under f of
ever-decreasing neighborhoods of c, it is probably not clear that they
deserve the terms ‘‘limit superior.” The next lemma indicates a justifica-
tion for the terminology.
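As a rough numerical illustration of Definition 25.7, the sketch below approximates the two suprema by sampling. The function f (equal to |x| away from 0, with f(0) = 1), the point c = 0, and the names `phi` and `Phi` are assumptions chosen for the example; sampling only approximates a supremum from below.

```python
def f(x):
    # Hypothetical example: |x| away from the origin, with an isolated value 1 at 0.
    return 1.0 if x == 0 else abs(x)

def phi(r, samples=1000):
    """Approximate sup{f(x) : 0 < |x| < r} by sampling (deleted neighborhood)."""
    pts = [r * k / (samples + 1) for k in range(1, samples + 1)]
    return max(max(f(x), f(-x)) for x in pts)

def Phi(r, samples=1000):
    """Approximate sup{f(x) : |x| < r}; the point c = 0 itself is now included."""
    return max(phi(r, samples), f(0.0))

radii = [1.0, 0.5, 0.1, 0.001]
deleted = [phi(r) for r in radii]       # decreases toward 0 as r decreases
non_deleted = [Phi(r) for r in radii]   # stays at 1 because f(0) = 1
```

For this f the deleted limit superior at 0 is 0 while the non-deleted one is 1, and both sampled quantities decrease as r decreases.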
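25.8 LEMMA. If $\varphi$, $\Phi$ are as defined above, then

(a) $\limsup_{x \to c} f = \lim_{r \to 0+} \varphi(r)$,
(b) $\operatorname{Lim\,sup}_{x \to c} f = \lim_{r \to 0+} \Phi(r)$.

PROOF. We observe that if 0 < r < s, then

    $\limsup_{x \to c} f \le \varphi(r) \le \varphi(s)$.

Furthermore, by 25.7(c), if $\varepsilon > 0$ there exists an $r_\varepsilon > 0$ such that

    $\varphi(r_\varepsilon) < \limsup_{x \to c} f + \varepsilon$.

Therefore, if r satisfies $0 < r < r_\varepsilon$, we have $|\varphi(r) - \limsup_{x \to c} f| < \varepsilon$, which
proves (a). The proof of (b) is similar and will be omitted. Q.E.D.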

25.9 LEMMA. (a) If $M > \limsup_{x \to c} f$, then there exists a neighborhood
U of c such that
    $f(x) < M$ for $c \ne x \in D \cap U$.
(b) If $M > \operatorname{Lim\,sup}_{x \to c} f$, then there exists a neighborhood U of c such that
    $f(x) < M$ for $x \in D \cap U$.
PROOF. (a) By 25.7(c), we have $\inf \{\varphi(r) : r > 0\} < M$. Hence there
exists a real number $r_1 > 0$ such that $\varphi(r_1) < M$ and we can take
$U = \{x \in R^p : \|x - c\| < r_1\}$. The proof of (b) is similar. Q.E.D.
25.10 LEMMA. Let f and g be bounded on a neighborhood of c and
suppose that c is a cluster point of D(f+g). Then

(a) $\limsup_{x \to c} (f+g) \le \limsup_{x \to c} f + \limsup_{x \to c} g$,
(b) $\operatorname{Lim\,sup}_{x \to c} (f+g) \le \operatorname{Lim\,sup}_{x \to c} f + \operatorname{Lim\,sup}_{x \to c} g$.

PROOF. In view of the relation

    $\sup \{f(x) + g(x) : x \in A\} \le \sup \{f(x) : x \in A\} + \sup \{g(x) : x \in A\}$,

it is clear that, using notation as in Definition 25.7, we have

    $\varphi_{f+g}(r) \le \varphi_f(r) + \varphi_g(r)$.

Now use Lemma 25.8 and let $r \to 0$ to obtain (a). Q.E.D.
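The inequality in Lemma 25.10 can be strict. The following sketch is an illustration under assumed choices, not part of the text: it takes f(x) = sin(1/x) and g = -f near c = 0, and estimates the deleted suprema by sampling, which only approximates them from below.

```python
import math

def f(x):
    return math.sin(1.0 / x)

def g(x):
    return -math.sin(1.0 / x)

def f_plus_g(x):
    return f(x) + g(x)

def deleted_sup(h, r, samples=20000):
    """Approximate sup{h(x) : 0 < |x| < r} on an evenly spaced sample."""
    pts = [r * k / (samples + 1) for k in range(1, samples + 1)]
    return max(max(h(x), h(-x)) for x in pts)

r = 0.1
sup_f = deleted_sup(f, r)           # close to 1
sup_g = deleted_sup(g, r)           # close to 1
sup_sum = deleted_sup(f_plus_g, r)  # exactly 0, since f + g vanishes identically
```

In this example the deleted limit superior of f + g at 0 is 0, while the sum of the two separate limits superior is 2.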
Results concerning other algebraic combinations will be found in
Exercise 25.F.
Although we shall have no occasion to pursue these matters, in some


areas of analysis it is useful to have the following generalization of the
notion of continuity.
25.11 DEFINITION. A function f on D to R is said to be upper
semi-continuous at a point c in D in case
(25.4)    $f(c) = \operatorname{Lim\,sup}_{x \to c} f$.

It is said to be upper semi-continuous on D if it is upper semi-continuous at
every point of D.
Instead of defining upper semi-continuity by means of equation (25.4)
we could require the equivalent, but less elegant, condition
(25.5)    $f(c) \ge \limsup_{x \to c} f$.

One of the keys to the importance and the utility of upper semi-continuous
functions is suggested by the following lemma, which may be compared
with the Global Continuity Theorem 22.1.
25.12 LEMMA. Let f be an upper semi-continuous function with
domain D in $R^p$ and let k be an arbitrary real number. Then there exists an
open set G and a closed set F such that
(25.6)    $G \cap D = \{x \in D : f(x) < k\}$,    $F \cap D = \{x \in D : f(x) \ge k\}$.
PROOF. Suppose that c is a point in D such that f(c) < k. According to
Definition 25.11 and Lemma 25.9(b), there is a neighborhood U(c) of c
such that f(x) < k for all x in $D \cap U(c)$. Without loss of generality we can
select U(c) to be an open neighborhood; setting
    $G = \bigcup \{U(c) : c \in D,\ f(c) < k\}$,
we have an open set with the property stated in (25.6). If F is the
complement of G, then F is closed in $R^p$ and satisfies the stated condition.
Q.E.D.
It is possible to show, using the lemma just proved (cf. Exercise 25.M),
that if K is a compact subset of $R^p$ and f is upper semi-continuous on K,
then f is bounded above on K and there exists a point in K where f attains its
supremum. Thus upper semi-continuous functions on compact sets possess
some of the properties we have established for continuous functions,
even though an upper semi-continuous function can have many points of
discontinuity.
It will have occurred to the reader that it is possible to extend the notion
of limit superior at a point to the case where the function is not bounded by
using ideas along the lines given at the end of Section 18. Similarly one
can define the limit superior as $x \to \pm\infty$. These ideas are useful, but we
will leave them as exercises.

Exercises

25.A. Discuss the existence of both the deleted and the non-deleted limits of the
following functions at the point x = 0.
(a) $f(x) = |x|$, (b) $f(x) = 1/x$, $x \ne 0$,
(c) $f(x) = x \sin (1/x)$, $x \ne 0$, (d) $f(x) = \sin (1/x)$, $x \ne 0$,
x sin Oho. x#0, _ f0, x=0,
(©) f= { x=0, (p fod= {0 x >0.
25.B. Prove Lemma 25.2.
25.C. If f denotes the function defined in equation (25.3), show that the deleted
limit at x = 0 equals 0 and that the non-deleted limit at x = 0 does not exist.
Discuss the existence of these two limits for the composition $f \circ f$.
25.D. Prove Lemma 25.4.
25.E. Show that statements 25.5(b) and 25.5(c) imply statement 25.5(a).
25.F. Show that if f and g have deleted limits at a cluster point c of the set
$D(f) \cap D(g)$, then the sum f + g has a deleted limit at c and

    $\lim_c (f + g) = \lim_c f + \lim_c g$.

Under the same hypotheses, the inner product $f \cdot g$ has a deleted limit at c and

    $\lim_c (f \cdot g) = (\lim_c f) \cdot (\lim_c g)$.

25.G. Let f be defined on a subset D(f) of R into $R^q$. If c is a cluster point of
the set $V = \{x \in R : x \in D(f),\ x > c\}$, and if $f_1$ is the restriction of f to V, then we
define the right-hand deleted limit of f at c to be $\lim_c f_1$, whenever this limit
exists. Sometimes the limit is denoted by $\lim_{x \to c+} f$ or by f(c+0). Formulate and
establish a result analogous to Theorem 25.3 for the right-hand deleted limit. (A
similar definition can be given for the right-hand non-deleted limit and both
left-hand limits at c.)
25.H. Let f be defined on $D = \{x \in R : x \ge 0\}$ to R. We say that a number L is
the limit of f at $+\infty$ if for each $\varepsilon > 0$ there exists a real number $m(\varepsilon)$ such that if
$x \ge m(\varepsilon)$, then $|f(x) - L| < \varepsilon$. In this case we write $L = \lim_{x \to +\infty} f$. Formulate and
prove a result analogous to Theorem 25.3 for this limit.
25.I. If f is defined on a set D(f) in R to R and if c is a cluster point of D(f),
then we say that $f(x) \to +\infty$ as $x \to c$, or that

    $\lim_{x \to c} f = +\infty$

in case for each positive number M there exists a neighborhood U of c such that if
$x \in U \cap D(f)$, $x \ne c$, then f(x) > M. Formulate and establish a result analogous to
Theorem 25.3 for this limit.
25.J. In view of Exercises 25.H and 25.I, give a definition of what is meant by
the expressions:
    $\lim_{x \to +\infty} f = +\infty$,    $\lim_{x \to c} f = -\infty$.
25.K. Establish Lemma 25.8 for the non-deleted limit superior. Give the proof
of Lemma 25.9(b).
25.L. Define what is meant by $\limsup_{x \to +\infty} f = L$, and $\liminf_{x \to +\infty} f = -\infty$.
25.M. Show that if f is an upper semi-continuous function on a compact subset K
of $R^p$ with values in R, then f is bounded above and attains its supremum on K.
25.N. Show that an upper semi-continuous function on a compact set may not be
bounded below and may not attain its infimum.
25.O. Show that if A is an open subset of $R^p$ and if f is defined on $R^p$ to R by
f(x) = 1 for $x \in A$, and f(x) = 0 for $x \notin A$, then f is a lower semi-continuous
function. If A is a closed subset of $R^p$, show that f is upper semi-continuous.
25.P. Give an example of an upper semi-continuous function which has an
infinite number of points of discontinuity.
25.Q. Is it true that a function on $R^p$ to R is continuous at a point if and only if it
is both upper and lower semi-continuous at this point?
25.R. If $(f_n)$ is a bounded sequence of continuous functions on $R^p$ to R and if $f^*$
is defined on $R^p$ by $f^*(x) = \sup \{f_n(x) : n \in N\}$ for $x \in R^p$, then is it true that $f^*$ is
upper semi-continuous on $R^p$?
25.S. If $(f_n)$ is a bounded sequence of continuous functions on $R^p$ to R and if $f_*$ is
defined on $R^p$ by $f_*(x) = \inf \{f_n(x) : n \in N\}$ for $x \in R^p$, then is it true that $f_*$ is upper
semi-continuous on $R^p$?
25.T. Let f be defined on a subset D of $R^p \times R^q$ and with values in $R^r$. Let
(a, b) be a cluster point of D. By analogy with Definition 19.4, define the double
and the two iterated limits of f at (a, b). Show that the existence of the double and
the iterated limits implies their equality. Show that the double limit can exist
without either iterated limit existing and that both iterated limits can exist and be
equal without the double limit existing.
25.U. Let f be as in the preceding exercise. By analogy with Definitions 17.4
and 19.8, define what it means to say that

    $g(y) = \lim_{x \to a} f(x, y)$

uniformly for y in a set. Formulate and prove a result analogous to Theorem
19.10.
25.V. Let f be as in Definition 25.1 and suppose that the deleted limit at c exists
and that for some element A in $R^q$ and r > 0 the inequality $\|f(x) - A\| < r$ holds on
some neighborhood of c. Prove that $\|\lim_c f - A\| \le r$. Does the same conclusion
hold for the non-deleted limit?
25.W. Discuss the upper and lower semi-continuity of the functions in parts (g)
and (h) of Example 20.5.
25.X. If $f : [0, +\infty) \to R$ is continuous on $[0, +\infty)$ and $\lim_{x \to +\infty} f(x) = 0$, show that f
is uniformly continuous on $[0, +\infty)$.

Section 26 Some Further Results

We shall present some theorems in this section that will not be applied
later in this book, but which are often useful in topology and analysis. The
first results are far-reaching extensions of the Weierstrass Approximation
Theorem, next is a theorem giving conditions under which a continuous
function has a continuous extension, and the final result is analogous to the
Bolzano-Weierstrass Theorem in the space $C_{pq}(K)$ of continuous functions on a
compact set K.

The Stone-Weierstrass Theorem

To facilitate our discussion, we introduce the following terminology. If f
and g are functions with domain D in $R^p$ and with values in R, then the
functions h and k defined for x in D by

    $h(x) = \sup \{f(x), g(x)\}$,    $k(x) = \inf \{f(x), g(x)\}$,

are called the supremum and infimum, respectively, of the functions f and
g. If f and g are continuous on D, then both h and k are also continuous.
This follows from Theorem 20.7 and the observation that if a, b are real
numbers, then

    $\sup \{a, b\} = \tfrac{1}{2}\{a + b + |a - b|\}$,
    $\inf \{a, b\} = \tfrac{1}{2}\{a + b - |a - b|\}$.
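The two identities are easy to confirm on sample values. In the sketch below the function names `sup2`, `inf2`, and `pointwise_sup` are illustrative, and the test pairs are chosen to be exactly representable in floating point so that the comparison with `max` and `min` is exact.

```python
def sup2(a, b):
    # sup{a, b} = (a + b + |a - b|) / 2
    return 0.5 * (a + b + abs(a - b))

def inf2(a, b):
    # inf{a, b} = (a + b - |a - b|) / 2
    return 0.5 * (a + b - abs(a - b))

def pointwise_sup(f, g):
    """The function h = sup{f, g}, built from sums, differences and absolute values only."""
    return lambda x: sup2(f(x), g(x))

pairs = [(3.0, 5.0), (-2.5, 1.0), (4.0, 4.0), (0.0, -7.0)]
h = pointwise_sup(lambda x: x, lambda x: -x)   # h(x) = |x|
```

Because h is assembled from f + g, f - g, and the absolute value, continuity of f and g carries over to h, which is the point of the observation above.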
We now state one form of Stone's† generalization of the Weierstrass
Approximation Theorem. Despite its recent discovery it has already
become "classical" and should be a part of the background of every
student of mathematics. The reader should refer to the article by Stone
listed in the References for extensions, applications, and a much fuller
discussion than is presented here.

26.1 STONE APPROXIMATION THEOREM. Let K be a compact subset
of $R^p$ and let $\mathcal{L}$ be a collection of continuous functions on K to R with the
properties:
(a) If f, g belong to $\mathcal{L}$, then $\sup \{f, g\}$ and $\inf \{f, g\}$ belong to $\mathcal{L}$.
(b) If $a, b \in R$ and $x \ne y \in K$, then there exists a function f in $\mathcal{L}$ such that
f(x) = a, f(y) = b.
Then any continuous function on K to R can be uniformly approximated
on K by functions in $\mathcal{L}$.
PROOF. Let F be a continuous function on K to R. If x, y belong to K,
let $g_{xy} \in \mathcal{L}$ be such that $g_{xy}(x) = F(x)$ and $g_{xy}(y) = F(y)$. Since the functions
F, $g_{xy}$ are continuous and have the same value at y, given $\varepsilon > 0$, there is an

† MARSHALL H. STONE (1903- ) studied at Harvard and has taught at Harvard and the
Universities of Chicago and Massachusetts. The son of a chief justice, he has made basic
contributions to modern analysis, especially to the theories of Hilbert space and Boolean
algebras.
open neighborhood U(y) of y such that if z belongs to $K \cap U(y)$, then
(26.1)    $g_{xy}(z) > F(z) - \varepsilon$.
Hold x fixed and for each $y \in K$, select an open neighborhood U(y) with
this property. From the compactness of K, it follows that K is contained in
a finite number of such neighborhoods: $U(y_1), \ldots, U(y_n)$. If
$h_x = \sup \{g_{xy_1}, \ldots, g_{xy_n}\}$, then it follows from relation (26.1) that
(26.2)    $h_x(z) > F(z) - \varepsilon$ for $z \in K$.

Since $g_{xy_i}(x) = F(x)$, it is seen that $h_x(x) = F(x)$ and hence there is an open
neighborhood V(x) of x such that if z belongs to $K \cap V(x)$, then
(26.3)    $h_x(z) < F(z) + \varepsilon$.
Use the compactness of K once more to obtain a finite number of
neighborhoods $V(x_1), \ldots, V(x_m)$ and set $h = \inf \{h_{x_1}, \ldots, h_{x_m}\}$. Then h belongs
to $\mathcal{L}$ and it follows from (26.2) that

    $h(z) > F(z) - \varepsilon$ for $z \in K$,

and from (26.3) that

    $h(z) < F(z) + \varepsilon$ for $z \in K$.

Combining these results, we have $|h(z) - F(z)| < \varepsilon$, $z \in K$, which yields the
desired approximation. Q.E.D.

The reader will have observed that the preceding result made no use of
the Weierstrass Approximation Theorem. In the next result, we replace
condition (a) above by three algebraic conditions on the set of functions.
Here we make use of the classical Weierstrass Theorem 24.8 for the special
case of the absolute value function $\varphi$ defined for t in R by $\varphi(t) = |t|$, to
conclude that $\varphi$ can be approximated by polynomials on every compact set
of real numbers.
26.2 STONE-WEIERSTRASS THEOREM. Let K be a compact subset of
$R^p$ and let $\mathcal{A}$ be a collection of continuous functions on K to R with the
properties:
(a) The constant function e(x) = 1, $x \in K$, belongs to $\mathcal{A}$.
(b) If f, g belong to $\mathcal{A}$, then $\alpha f + \beta g$ belongs to $\mathcal{A}$ for all $\alpha, \beta$ in R.
(c) If f, g belong to $\mathcal{A}$, then fg belongs to $\mathcal{A}$.
(d) If $x \ne y$ are two points of K, there exists a function f in $\mathcal{A}$ such that
$f(x) \ne f(y)$.
Then any continuous function on K to R can be uniformly approximated
on K by functions in $\mathcal{A}$.
PROOF. Let $a, b \in R$ and $x \ne y$ belong to K. According to (d), there is a
function f in $\mathcal{A}$ such that $f(x) \ne f(y)$. Since e(x) = 1 = e(y), it follows that
there are real numbers $\alpha, \beta$ such that
    $\alpha f(x) + \beta e(x) = a$,    $\alpha f(y) + \beta e(y) = b$.

Therefore, by (b) there exists a function $g \in \mathcal{A}$ such that g(x) = a and
g(y) = b.
Now let $\mathcal{L}$ be the collection of all continuous functions on K which can
be uniformly approximated by functions in $\mathcal{A}$. Obviously $\mathcal{A} \subseteq \mathcal{L}$, so $\mathcal{L}$ has
property (b) of the Stone Approximation Theorem 26.1. We shall now
show that if $h \in \mathcal{L}$, then $|h| \in \mathcal{L}$. Since
    $\sup \{f, g\} = \tfrac{1}{2}(f + g + |f - g|)$,
    $\inf \{f, g\} = \tfrac{1}{2}(f + g - |f - g|)$,
this will imply that $\mathcal{L}$ has property 26.1(a) and hence that every continuous
function on K to R belongs to $\mathcal{L}$.
Since h is continuous and K is compact, it follows that there exists an
M > 0 such that $\|h\|_K \le M$. Since $h \in \mathcal{L}$, there is a sequence $(h_n)$ of functions
in $\mathcal{A}$ which converge uniformly to h on K and we may suppose that
$\|h_n\|_K \le M + 1$ for all $n \in N$. (Why?) If $\varepsilon > 0$ is given we now apply the
Weierstrass Approximation Theorem 24.8 to the absolute value function
on the interval $[-(M+1), M+1]$ to get a polynomial $p_\varepsilon$ such that

    $\bigl|\, |t| - p_\varepsilon(t) \,\bigr| \le \tfrac{1}{2}\varepsilon$ for $|t| \le M + 1$.

It therefore follows that

    $\bigl|\, |h_n(x)| - p_\varepsilon(h_n(x)) \,\bigr| \le \tfrac{1}{2}\varepsilon$ for $x \in K$.

Now $p_\varepsilon \circ h_n$ belongs to $\mathcal{A}$ because of our hypotheses (a), (b), and (c). Since

    $\bigl|\, |h(x)| - |h_n(x)| \,\bigr| \le |h(x) - h_n(x)| \le \|h - h_n\|_K$,

it follows that if n is sufficiently large, then we have
    $\bigl|\, |h(x)| - p_\varepsilon \circ h_n(x) \,\bigr| \le \varepsilon$ for $x \in K$.
Since $\varepsilon > 0$ is arbitrary, we infer that $|h| \in \mathcal{L}$ and the result now follows
from the preceding theorem. Q.E.D.
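The key analytic input of this proof is that |t| is a uniform limit of polynomials on a compact interval. The sketch below illustrates that fact with a different, well-known construction than the Weierstrass Theorem 24.8 cited in the text: the recursion p_0 = 0, p_{k+1}(t) = p_k(t) + (t^2 - p_k(t)^2)/2, whose iterates are polynomials in t and satisfy 0 ≤ |t| - p_n(t) < 2/(n+1) on [-1, 1]. The function names are illustrative.

```python
def abs_poly(t, n):
    """Value at t of p_n, where p_0 = 0 and p_{k+1}(t) = p_k(t) + (t*t - p_k(t)**2) / 2."""
    p = 0.0
    for _ in range(n):
        p = p + 0.5 * (t * t - p * p)
    return p

def max_error(n, grid=2001):
    """Largest sampled value of | |t| - p_n(t) | on an even grid in [-1, 1]."""
    ts = [-1.0 + 2.0 * k / (grid - 1) for k in range(grid)]
    return max(abs(abs(t) - abs_poly(t, n)) for t in ts)

err_50 = max_error(50)
err_400 = max_error(400)
```

To approximate |t| on [-(M+1), M+1] as in the proof, one can rescale and use (M+1) p_n(t/(M+1)).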
We now obtain, as a special case of the Stone-Weierstrass Theorem, a
stronger form of Theorem 24.8. This result strengthens the latter result in
two ways: (i) it permits the domain to be an arbitrary compact subset of $R^p$
and not just a compact cell in R, and (ii) it permits the range to lie in any
space $R^q$, and not just R. To understand the statement, we recall that a
function f with domain D in $R^p$ and range in $R^q$ can be regarded as q
functions on D to R by the coordinate representation:
(26.4)    $f(x) = (f_1(x), \ldots, f_q(x))$ for $x \in D$.
If each coordinate function $f_j$ is a polynomial in the p coordinates
$(x_1, \ldots, x_p)$, then we say that f is a polynomial function.
26.3 POLYNOMIAL APPROXIMATION THEOREM. Let f be a continuous
function whose domain K is a compact subset of $R^p$ and whose range
belongs to $R^q$ and let $\varepsilon > 0$. Then there exists a polynomial function p on $R^p$
to $R^q$ such that $\|f(x) - p(x)\| < \varepsilon$ for $x \in K$.
PROOF. Represent f by its q coordinate functions, as in (26.4). Since f
is continuous on K, each of the coordinate functions $f_j$ is continuous on K
to R. The polynomial functions defined on $R^p$ to R evidently satisfy the
properties of the Stone-Weierstrass Theorem. Hence the coordinate function
$f_j$ can be uniformly approximated on K within $\varepsilon / \sqrt{q}$ by a polynomial
function $p_j$. Letting p be defined by
    $p(x) = (p_1(x), \ldots, p_q(x))$,
we obtain a polynomial function from $R^p$ to $R^q$ which yields the desired
approximation on K to the given function f. Q.E.D.

Extension of Continuous Functions


Sometimes it is desirable to extend the domain of a continuous function
to a larger set without changing the values on the original domain. This
can always be done in a trivial way by defining the function to be 0 outside
the original domain, but in general this method of extension does not yield
a continuous function. After some reflection, the reader should see that it
is not always possible to obtain a continuous extension. For example, if
$D = \{x \in R : x \ne 0\}$ and if f is defined for $x \in D$ to be f(x) = 1/x, then it is
not possible to extend f in such a way as to obtain a continuous function on
all of R. However, it is important to know that an extension is always
possible when the domain is a closed set. Furthermore, it is not necessary
to increase the bound of the function (if it is bounded).
Before we prove this extension theorem, we observe that if A and B are
two disjoint closed subsets of $R^p$, then there exists a continuous function $\varphi$
defined on $R^p$ with values in R such that
    $\varphi(x) = 0$, $x \in A$;    $\varphi(x) = 1$, $x \in B$;    $0 \le \varphi(x) \le 1$, $x \in R^p$.

In fact, if $d(x, A) = \inf \{\|x - y\| : y \in A\}$ and $d(x, B) = \inf \{\|x - y\| : y \in B\}$,
then we can define $\varphi$ for $x \in R^p$ by the equation

    $\varphi(x) = \dfrac{d(x, A)}{d(x, A) + d(x, B)}$.
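The formula for the separating function can be tried out directly on the real line. In this sketch the sets A and B are assumed, for illustration only, to be finite unions of closed intervals, so that the distance to each set has a simple closed form; the names `dist_to_intervals` and `separator` are hypothetical.

```python
def dist_to_intervals(x, intervals):
    """d(x, S) for S a finite union of closed intervals [lo, hi] in R."""
    return min(max(lo - x, 0.0, x - hi) for lo, hi in intervals)

def separator(x, A, B):
    """phi(x) = d(x, A) / (d(x, A) + d(x, B)) for disjoint closed sets A and B."""
    dA = dist_to_intervals(x, A)
    dB = dist_to_intervals(x, B)
    # The denominator is positive because A and B are disjoint and closed.
    return dA / (dA + dB)

A = [(-2.0, -1.0)]
B = [(1.0, 2.0), (4.0, 5.0)]
grid = [-3.0 + 0.01 * k for k in range(901)]
values = [separator(x, A, B) for x in grid]
```

The function vanishes exactly on A, equals 1 exactly on B, and takes intermediate values elsewhere, for instance 1/2 at the point 0, which is equidistant from the two sets.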
26.4 TIETZE† EXTENSION THEOREM. Let f be a bounded continuous
function defined on a closed subset D of $R^p$ and with values in R. Then
† HEINRICH TIETZE (1880-1964) was professor at Munich and contributed to topology,
geometry, and algebra. This extension theorem goes back to 1914.
there exists a continuous function g on $R^p$ to R such that g(x) = f(x) for x in
D and such that $\sup \{\|g(x)\| : x \in R^p\} = \sup \{\|f(x)\| : x \in D\}$.
PROOF. Let $M = \sup \{|f(x)| : x \in D\}$ and consider
$A_1 = \{x \in D : f(x) \le -M/3\}$ and $B_1 = \{x \in D : f(x) \ge M/3\}$. From the continuity of
f and the fact that D is closed, it follows from Theorem 22.1(c) that $A_1$
and $B_1$ are closed subsets of $R^p$. According to the observation preceding
the statement of the theorem, there is a continuous function $\varphi_1$ on $R^p$ to R
such that
    $\varphi_1(x) = -\tfrac{1}{3}M$, $x \in A_1$;    $\varphi_1(x) = \tfrac{1}{3}M$, $x \in B_1$;
    $-\tfrac{1}{3}M \le \varphi_1(x) \le \tfrac{1}{3}M$, $x \in R^p$.
We now set $f_2 = f - \varphi_1$ and note that $f_2$ is continuous on D and that
$\sup \{|f_2(x)| : x \in D\} \le \tfrac{2}{3}M$.
Proceeding, we define $A_2 = \{x \in D : f_2(x) \le -\tfrac{1}{3}\tfrac{2}{3}M\}$ and
$B_2 = \{x \in D : f_2(x) \ge \tfrac{1}{3}\tfrac{2}{3}M\}$ and obtain a continuous function $\varphi_2$ on $R^p$ to R such
that

    $\varphi_2(x) = -\tfrac{1}{3}\tfrac{2}{3}M$, $x \in A_2$;    $\varphi_2(x) = \tfrac{1}{3}\tfrac{2}{3}M$, $x \in B_2$;
    $-\tfrac{1}{3}\tfrac{2}{3}M \le \varphi_2(x) \le \tfrac{1}{3}\tfrac{2}{3}M$, $x \in R^p$.

Having done this, we set $f_3 = f_2 - \varphi_2$ and note that $f_3 = f - \varphi_1 - \varphi_2$ is continuous
on D and that $\sup \{|f_3(x)| : x \in D\} \le (\tfrac{2}{3})^2 M$.
By proceeding in this manner, we obtain a sequence $(\varphi_n)$ of functions
defined on $R^p$ to R such that, for each n,

(26.5)    $|f(x) - [\varphi_1(x) + \varphi_2(x) + \cdots + \varphi_n(x)]| \le (\tfrac{2}{3})^n M$,

for all x in D and such that

(26.6)    $|\varphi_n(x)| \le (\tfrac{1}{3})(\tfrac{2}{3})^{n-1} M$ for $x \in R^p$.

Let $g_n$ be defined on $R^p$ to R by $g_n = \varphi_1 + \varphi_2 + \cdots + \varphi_n$, whence it follows
that $g_n$ is continuous. From inequality (26.6) we infer that if $m \ge n$ and
$x \in R^p$, then

    $|g_m(x) - g_n(x)| = |\varphi_{n+1}(x) + \cdots + \varphi_m(x)| \le (\tfrac{1}{3})(\tfrac{2}{3})^n M [1 + \tfrac{2}{3} + (\tfrac{2}{3})^2 + \cdots] = (\tfrac{2}{3})^n M$,

which proves that the sequence $(g_n)$ converges uniformly on $R^p$ to a
function we shall denote by g. Since each $g_n$ is continuous on $R^p$, then
Theorem 24.1 implies that g is continuous at every point of $R^p$. Also, it is
seen from the inequality (26.5) that $|f(x) - g_n(x)| \le (\tfrac{2}{3})^n M$ for $x \in D$. We
conclude, therefore, that f(x) = g(x) for x in D. Finally, inequality (26.6)
implies that for any x in $R^p$ we have
    $|g_n(x)| \le \tfrac{1}{3}M [1 + \tfrac{2}{3} + \cdots + (\tfrac{2}{3})^{n-1}] \le M$,
which establishes the final statement of the theorem. Q.E.D.
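The iteration in this proof can be followed numerically. The sketch below makes several simplifying assumptions that are not in the text: the closed set D is a finite set of real numbers, f is given as a table of values with M > 0, and an empty A_n or B_n is handled by treating its distance as infinite. All function names are illustrative.

```python
INF = float("inf")

def dist(x, pts):
    """Distance from x to a finite set of reals; infinite for the empty set."""
    return min((abs(x - p) for p in pts), default=INF)

def separator(x, A, B):
    """d(x, A) / (d(x, A) + d(x, B)), with conventions for empty A or B."""
    dA, dB = dist(x, A), dist(x, B)
    if dA == INF and dB == INF:
        return 0.5
    if dA == INF:
        return 1.0
    if dB == INF:
        return 0.0
    return dA / (dA + dB)

def tietze_steps(D, f, n_steps):
    """Record (c_k, A_k, B_k) for k = 1..n_steps, following the proof; assumes M > 0."""
    M = max(abs(f[x]) for x in D)
    residual = dict(f)              # values of f_k on D
    c = M / 3.0                     # c_k = (1/3)(2/3)^(k-1) M
    steps = []
    for _ in range(n_steps):
        A = [x for x in D if residual[x] <= -c]
        B = [x for x in D if residual[x] >= c]
        steps.append((c, A, B))
        for x in D:                 # f_{k+1} = f_k - phi_k on D
            residual[x] -= c * (2.0 * separator(x, A, B) - 1.0)
        c *= 2.0 / 3.0
    return M, steps

def extension(x, steps):
    """g_n(x) = phi_1(x) + ... + phi_n(x), where phi_k takes values in [-c_k, c_k]."""
    return sum(c * (2.0 * separator(x, A, B) - 1.0) for c, A, B in steps)

D = [0.0, 1.0, 2.0, 3.5]
f = {0.0: 1.0, 1.0: -1.0, 2.0: 0.5, 3.5: -0.25}
M, steps = tietze_steps(D, f, 30)
grid = [-2.0 + 0.01 * k for k in range(801)]
```

After 30 steps the partial sum agrees with f on D to within (2/3)^30 M, as in (26.5), and its absolute value stays at most M on the whole sampled grid.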
26.5 COROLLARY. Let f be a bounded continuous function defined on
a closed subset D of $R^p$ and with values in $R^q$. Then there exists a
continuous function g on $R^p$ to $R^q$ with g(x) = f(x) for x in D and such that

    $\sup \{\|g(x)\| : x \in R^p\} \le \sqrt{q}\, \sup \{\|f(x)\| : x \in D\}$.

PROOF. This result has just been proved for q = 1. In the general case,
we note that f defines q continuous real-valued coordinate functions on D,
say,
    $f(x) = (f_1(x), f_2(x), \ldots, f_q(x))$.
Since each of the $f_j$, $1 \le j \le q$, has a continuous extension $g_j$ on $R^p$ to R, we
define g on $R^p$ to $R^q$ by $g(x) = (g_1(x), g_2(x), \ldots, g_q(x))$. The function g is
seen to have the required properties. Q.E.D.

Equicontinuity
We have made frequent use of the Bolzano-Weierstrass Theorem 10.6
for sets (which asserts that every bounded infinite subset of $R^p$ has a cluster
point) and the corresponding Theorem 16.4 for sequences (which asserts
that every bounded sequence in $R^p$ has a convergent subsequence). We
now present a theorem which is entirely analogous to the Bolzano-
Weierstrass Theorem except that it pertains to sets of continuous functions
and not sets of points. For the sake of brevity and simplicity, we shall
present here only the sequential form of this theorem.
In what follows we let K be a fixed compact subset of $R^p$, and we shall be
concerned with functions which are continuous on K and have their range
in $R^q$. In view of Theorem 22.5, each such function is bounded, and hence
$C_{pq}(K) = BC_{pq}(K)$. We say that a set $\mathcal{F}$ in $C_{pq}(K)$ is bounded (or uniformly
bounded) on K if there exists a constant M such that $\|f\|_K \le M$ for all f in
$\mathcal{F}$. It is clear that any finite set $\mathcal{F}$ of such functions is bounded; for if
$\mathcal{F} = \{f_1, f_2, \ldots, f_n\}$, then we can set

    $M = \sup \{\|f_1\|_K, \|f_2\|_K, \ldots, \|f_n\|_K\}$.

In general, an infinite set of continuous functions on K to $R^q$ will not be
bounded. However, a uniformly convergent sequence of continuous functions
is bounded. (Cf. Exercise 26.M.)
If f is a continuous function on the compact set K of $R^p$, then Theorem
23.3 implies that it is uniformly continuous. Hence, if $\varepsilon > 0$ there exists
$\delta(\varepsilon) > 0$, such that if x, y belong to K and $\|x - y\| < \delta(\varepsilon)$, then
$\|f(x) - f(y)\| < \varepsilon$. Of course, the value of $\delta$ may depend on the function f as
well as on $\varepsilon$ and so we often write $\delta(\varepsilon, f)$. (When we are dealing with
more than one function it is well to indicate this dependence explicitly.)
We notice that if $\mathcal{F} = \{f_1, \ldots, f_n\}$ is a finite set in $C_{pq}(K)$, then, by setting

    $\delta(\varepsilon, \mathcal{F}) = \inf \{\delta(\varepsilon, f_1), \ldots, \delta(\varepsilon, f_n)\}$,

we obtain a $\delta$ which "works" for all the functions in this finite set.
26.6 DEFINITION. A set $\mathcal{F}$ of functions on K to $R^q$ is said to be
uniformly equicontinuous on K if, for each real number $\varepsilon > 0$ there is
a number $\delta(\varepsilon) > 0$ such that if x, y belong to K and $\|x - y\| < \delta(\varepsilon)$ and f is a
function in $\mathcal{F}$, then $\|f(x) - f(y)\| < \varepsilon$.
It has been seen that a finite set of continuous functions on K is
equicontinuous. It is also true that a sequence of continuous functions
which converges uniformly on K is also equicontinuous. (Cf. Exercise
26.N.)
It follows that, in order for a sequence in $C_{pq}(K)$ to be uniformly
convergent on K, it is necessary that the sequence be bounded and
uniformly equicontinuous on K. We shall now show that these two properties
are necessary and sufficient for a set $\mathcal{F}$ in $C_{pq}(K)$ to have the property
that every sequence of functions from $\mathcal{F}$ has a subsequence which
converges uniformly on K. This may be regarded as a generalization of the
Bolzano-Weierstrass Theorem to sets of continuous functions and plays an
important role in the theory of differential and integral equations.
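To get a feeling for Definition 26.6, the sketch below compares two families on K = [0, 1]: the powers f_n(x) = x^n, which appear in Exercise 26.Q(b), and the shifted sines f_n(x) = sin(x + n), which are an assumed example added for contrast. The helper `worst_gap` samples pairs at distance exactly delta, so it only gives a lower estimate of the true supremum.

```python
import math

def worst_gap(family, delta, n_max, grid=200):
    """Largest sampled |f_n(x + delta) - f_n(x)| over n <= n_max and x in [0, 1 - delta]."""
    xs = [k * (1.0 - delta) / grid for k in range(grid + 1)]
    return max(abs(family(n, x + delta) - family(n, x))
               for n in range(1, n_max + 1) for x in xs)

def power(n, x):
    return x ** n

def shifted_sine(n, x):
    return math.sin(x + n)

delta = 0.01
gap_power = worst_gap(power, delta, 2000)        # stays near 1 however small delta is
gap_sine = worst_gap(shifted_sine, delta, 2000)  # bounded by delta for every n
```

The shifted sines all satisfy |f_n(x) - f_n(y)| ≤ |x - y|, so one δ serves the whole family; for the powers no single δ works once n is allowed to grow.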
26.7 ARZELÀ-ASCOLI† THEOREM. Let K be a compact subset of $R^p$
and let $\mathcal{F}$ be a collection of functions which are continuous on K and have
values in $R^q$. The following properties are equivalent:
(a) The family $\mathcal{F}$ is bounded and uniformly equicontinuous on K.
(b) Every sequence from $\mathcal{F}$ has a subsequence which is uniformly
convergent on K.
PROOF. First we shall show that if condition (a) is false, then so is
condition (b). If $\mathcal{F}$ is not bounded, then there exists a sequence $(f_n)$ in $\mathcal{F}$
such that $\|f_n\|_K \ge n$ for $n \in N$. But then no subsequence of $(f_n)$ can be
uniformly convergent. Also if the set $\mathcal{F}$ is not uniformly equicontinuous,
then for some $\varepsilon_0 > 0$ there exists (why?) a sequence $(f_n)$ in $\mathcal{F}$ and sequences
$(x_n)$ and $(y_n)$ in K with $\|x_n - y_n\| < 1/n$ but such that $\|f_n(x_n) - f_n(y_n)\| > \varepsilon_0$.
But then no subsequence of $(f_n)$ can be uniformly convergent on K.
We now show that, if the set $\mathcal{F}$ satisfies (a), then given any sequence $(f_n)$
in $\mathcal{F}$ there is a subsequence which converges uniformly on K. To do this
we notice that it follows from Exercise 10.H that there exists a countable
† CESARE ARZELÀ (1847-1912) was a professor at Bologna. He gave necessary and sufficient
conditions for the limit of a sequence of continuous functions on a closed interval to be
continuous, and he studied related topics.
GIULIO ASCOLI (1843-1896), a professor at Milan, formulated the definition of equicontinuity
in a geometrical setting. He also made contributions to Fourier series.
set C in K such that if $y \in K$ and $\varepsilon > 0$, then there exists an element x in C
such that $\|x - y\| < \varepsilon$. If $C = \{x_1, x_2, \ldots\}$, then the sequence $(f_n(x_1))$ is
bounded in $R^q$. It follows from the Bolzano-Weierstrass Theorem 16.4
that there is a subsequence
    $(f_1^1(x_1), f_2^1(x_1), \ldots, f_n^1(x_1), \ldots)$
of $(f_n(x_1))$ which is convergent. Next we note that the sequence
$(f_k^1(x_2) : k \in N)$ is bounded in $R^q$; hence it has a subsequence
    $(f_1^2(x_2), f_2^2(x_2), \ldots, f_n^2(x_2), \ldots)$

which is convergent. Again, the sequence $(f_n^2(x_3) : n \in N)$ is bounded in
$R^q$, so some subsequence

    $(f_1^3(x_3), f_2^3(x_3), \ldots, f_n^3(x_3), \ldots)$

is convergent. We proceed in this way and then set $g_n = f_n^n$ so that $g_n$ is the
nth function in the nth subsequence. It is clear from the construction that
the sequence $(g_n)$ converges at each point of C.
We shall now prove that the sequence $(g_n)$ converges at each point of K
and that the convergence is uniform. To do this, let $\varepsilon > 0$ and let $\delta(\varepsilon)$ be
as in Definition 26.6. Let $C_1 = \{y_1, \ldots, y_k\}$ be a finite subset of C such that
every point in K is within $\delta(\varepsilon)$ of some point in $C_1$. Since the sequences
    $(g_n(y_1)), (g_n(y_2)), \ldots, (g_n(y_k))$
converge, there exists a natural number M such that if $m, n \ge M$, then

    $\|g_m(y_i) - g_n(y_i)\| < \varepsilon$ for $i = 1, 2, \ldots, k$.

Given $x \in K$, there exists a $y_j \in C_1$ such that $\|x - y_j\| < \delta(\varepsilon)$. Hence, by the
uniform equicontinuity, we have $\|g_n(x) - g_n(y_j)\| < \varepsilon$ for all $n \in N$; in
particular, this inequality holds for $n \ge M$. Therefore, we have

    $\|g_n(x) - g_m(x)\| \le \|g_n(x) - g_n(y_j)\| + \|g_n(y_j) - g_m(y_j)\| + \|g_m(y_j) - g_m(x)\| < \varepsilon + \varepsilon + \varepsilon = 3\varepsilon$,

provided $m, n \ge M$. This shows that
    $\|g_n - g_m\|_K \le 3\varepsilon$ for $m, n \ge M$,
so the uniform convergence of the sequence $(g_n)$ on K follows from the
Cauchy Criterion for uniform convergence, given in 17.11. Q.E.D.

In the proof of this result, we constructed a sequence of subsequences of
functions and then selected the "diagonal" sequence $(g_n)$, where $g_n = f_n^n$. Such a
construction is often called a "diagonal process" or "Cantor's diagonal method"
and is frequently useful. The reader should recall that a similar type of argument
was used in Section 3 to prove that the real numbers do not form a countable set.
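The bookkeeping behind the diagonal process can be sketched with index sequences alone. In the toy model below, which is an assumption made purely for illustration, row k lists the indices m·2^k, so each row is a subsequence of the one before it, in the same way that each successive subsequence of functions in the proof is extracted from the previous one.

```python
def nested_subsequences(levels, length):
    """Row k (k = 1..levels) holds the indices m * 2**k for m = 1..length."""
    return [[m * 2 ** k for m in range(1, length + 1)]
            for k in range(1, levels + 1)]

def diagonal(rows):
    """Pick the nth entry of the nth row, as in g_n = f_n^n."""
    return [rows[n][n] for n in range(len(rows))]

rows = nested_subsequences(6, 6)
diag = diagonal(rows)
```

From its kth term on, the diagonal sequence is a subsequence of the kth row. This is why the diagonal sequence of functions converges at every point of the countable set C.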
Exercises

26.A. Show that condition (a) of Theorem 26.1 is equivalent to the condition:
(a') If f belongs to $\mathcal{L}$, then |f| belongs to $\mathcal{L}$.
26.B. Show that every continuous real-valued function on the interval $[0, \pi]$ is
the uniform limit of a sequence of "polynomials in cos x" (that is, of functions $(P_n)$,
where $P_n(x) = p_n(\cos x)$ for some polynomial $p_n$).
26.C. Show that every continuous real-valued function on $[0, \pi]$ is the uniform
limit of a sequence of functions of the form

    $x \mapsto a_0 + a_1 \cos x + a_2 \cos 2x + \cdots + a_n \cos nx$.

26.D. Explain why the result in Exercise 26.B fails if cos kx is replaced by sin kx,
$k \in N$.
26.E. Use Exercise 26.C to show that every continuous real-valued function f on
$[0, \pi]$ with $f(0) = f(\pi)$ is the uniform limit of a sequence of functions of the form

    $x \mapsto b_0 + b_1 \sin x + b_2 \sin 2x + \cdots + b_n \sin nx$.

26.F. Use Exercises 26.C and 26.E to show that every continuous real-valued
function f on $[-\pi, \pi]$ with $f(-\pi) = f(\pi)$ is the uniform limit of a sequence of
functions of the form

    $x \mapsto a_0 + a_1 \cos x + b_1 \sin x + \cdots + a_n \cos nx + b_n \sin nx$.

[Hint: split f into the sum $f = f_e + f_o$ of an even function $f_e(x) = \tfrac{1}{2}(f(x) + f(-x))$ and an
odd function $f_o(x) = \tfrac{1}{2}(f(x) - f(-x))$.]
26.G. Give a proof of the preceding exercise based upon Theorem 26.3 applied
to the unit circle $T = \{(x, y) \in R^2 : x^2 + y^2 = 1\}$ and the observation that there is a
one-one correspondence between continuous functions on T to R and continuous
functions on $[-\pi, \pi]$ to R which satisfy $f(-\pi) = f(\pi)$.
26.H. Let $J \subseteq R$ be a compact interval and let $\mathcal{A}$ be a collection of continuous
functions on $J \to R$ which satisfy the properties of the Stone-Weierstrass Theorem
26.2. Show that any continuous function on $J \times J$ (in $R^2$) to R can be uniformly
approximated by functions of the form

    $f_1(x) g_1(y) + \cdots + f_n(x) g_n(y)$,

where $f_i$, $g_i$ belong to $\mathcal{A}$.
26.I. Show that Tietze's Theorem 26.4 may fail if the domain is not closed.
26.J. Use Tietze's Theorem 26.4 to show that if $D \subseteq R^p$ is closed and if f is an
unbounded continuous function on $D \to R$, then there exists a continuous extension
of f to all of $R^p$. [Hint: consider the composition $\varphi \circ f$, where $\varphi(x) = \operatorname{Arc\,tan} x$
or $\varphi(x) = x/(1 + |x|)$.]
26.K. Let $\mathcal{F}$ be a collection of functions on $D \subseteq R^p$ to $R^q$. Consider the
property at the point $c \in D$: if $\varepsilon > 0$ there is a $\delta(c, \varepsilon) > 0$ such that if $x \in D$ and
$\|x - c\| < \delta(c, \varepsilon)$ then $\|f(x) - f(c)\| < \varepsilon$ for all $f \in \mathcal{F}$. Show that $\mathcal{F}$ has this property at
$c \in D$ if and only if for each sequence $(x_n)$ in D with $c = \lim (x_n)$, we have
$f(c) = \lim (f(x_n))$ uniformly for $f \in \mathcal{F}$. (Sometimes we say that $\mathcal{F}$ is equicontinuous at
$c \in D$ when this property is satisfied.)
26.L. Let $\mathcal{F}$ be as in Exercise 26.K. If D is compact and the property in
Exercise 26.K is satisfied for all $c \in D$, show that $\mathcal{F}$ is uniformly equicontinuous in
the sense of Definition 26.6.
26.M. If $K \subseteq R^p$ is compact and $(f_n)$ is a sequence of continuous functions on K
to $R^q$ which is uniformly convergent on K, show that the family $\{f_n\}$ is bounded on
K (in the sense that there exists M > 0 such that $\|f_n(x)\| \le M$ for all $x \in K$, $n \in N$, or
$\|f_n\|_K \le M$ for $n \in N$).
26.N. If $K \subseteq R^p$ is compact and $(f_n)$ is a sequence of continuous functions on K
to $R^q$ which is uniformly convergent on K, show that the family $\{f_n\}$ is uniformly
equicontinuous on K in the sense of Definition 26.6.
26.O. Let $\mathcal{F}$ be a bounded and uniformly equicontinuous collection of functions
on $D \subseteq R^p$ to R and let $f^*$ be defined on $D \to R$ by

    $f^*(x) = \sup \{f(x) : f \in \mathcal{F}\}$.

Show that $f^*$ is continuous on D to R.
26.P. Show that the conclusion of the preceding exercise may fail if it is not
assumed that $\mathcal{F}$ is uniformly equicontinuous.
26.Q. Consider the following sequences of functions which show that the
Arzelà-Ascoli Theorem 26.7 may fail if the various hypotheses are dropped.
(a) $f_n(x) = x + n$ for $x \in [0, 1]$;
(b) $f_n(x) = x^n$ for $x \in [0, 1]$;
(c) $f_n(x) = \dfrac{1}{1 + (x - n)^2}$ for $x \in [0, +\infty)$.
26.R. Let $(f_n)$ be a sequence of continuous functions on R to $R^q$ which converges
at each point of the set Q of rationals. If the set $\{f_n\}$ is uniformly equicontinuous
on R, show that the sequence converges at every point of R and that the
convergence is uniform on every compact set of R, but not necessarily uniform
on R.
V
FUNCTIONS OF
ONE VARIABLE

We shall now commence the study of the differentiation and integration
of functions. In doing so it will be convenient to treat first the case of
functions of one variable; in Chapters VII and VIII we shall return to the
study of functions of several variables. It will be seen, in comparing these
chapters, that the case of functions of several variables is quite similar in
outline to what we shall do here, but that certain complications arise.
Furthermore, since the general theory makes use of results from the case
of one variable, it is convenient to have made a study of this case previ-
ously.
In Sections 27 and 28 we introduce the derivative of a function defined
on a real interval and establish the important Mean Value Theorem and
some of its corollaries. In Section 29 we shall introduce the definition of
the Riemann (and the Riemann-Stieltjes) integral of bounded functions on
an interval [a, b]. The basic properties of the integral are established in
this section and in Sections 30 and 31. In the last two sections, we dis-
cuss “improper” and infinite integrals. Although the results of these sec-
tions are used very little in the following portions of this book, they are
important for many applications.

Section 27 The Mean Value Theorem

Since the reader is assumed to be already familiar with the connection
between the derivative of a function on R to R and the slope of its graph,
and with the notion of instantaneous rate of change, we shall focus our
attention entirely on the mathematical aspects of the derivative and not go
into its applications to physics, economics, et cetera. In this and the next
section we shall consider a function f with domain D and range contained in
R. Although we are primarily interested in the derivative at an interior
point, we shall define the derivative somewhat more generally so that an
end point of an interval, for example, can be considered. However, we do
require that the point at which the derivative is being defined is a cluster
point of D and belongs to D.
27.1 DEFINITION. If c is a cluster point of D and belongs to D, we say
that a real number L is the derivative of f at c if for every number ε > 0
there is a number δ(ε) > 0 such that if x belongs to D and if 0 < |x − c| <
δ(ε), then

\[ \left| \frac{f(x) - f(c)}{x - c} - L \right| < \varepsilon. \tag{27.1} \]

In this case we write f'(c) for L.


Alternatively, we could define f'(c) as the limit

\[ \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \qquad (x \in D,\ x \neq c). \]

It is to be noted that if c is an interior point of D, then in (27.1) we consider the
points x both to the left and the right of the point c. On the other hand, if D is an
interval and c is the left end point of D, then in relation (27.1) we can only take x to
the right of c.
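
As a numerical illustration of Definition 27.1, the following sketch tabulates the quotient appearing in (27.1) for f(x) = x², first at the interior point c = 1 (approaching from both sides) and then at the left end point c = 0 of D = [0, 1] (approaching from the right only). The function names and the step sizes are our own choices for illustration and are not part of the text.

```python
def difference_quotient(f, c, x):
    """Return (f(x) - f(c)) / (x - c), the quotient appearing in (27.1)."""
    if x == c:
        raise ValueError("x must differ from c")
    return (f(x) - f(c)) / (x - c)


def square(x):
    return x * x


# Interior point c = 1: x may approach c from the left and from the right.
for h in (1e-1, 1e-3, 1e-5):
    right = difference_quotient(square, 1.0, 1.0 + h)
    left = difference_quotient(square, 1.0, 1.0 - h)
    print(h, left, right)

# Left end point c = 0 of D = [0, 1]: only points x > c belong to D.
for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(square, 0.0, h))
```

In both cases the printed quotients approach the value of the derivative (2 at c = 1 and 0 at c = 0), which is what the definition requires of L.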

Whenever the derivative of f at c exists, we denote its value by f'(c). In
this way we obtain a function f' whose domain is a subset of the domain of
f. We now show that continuity of f at c is a necessary condition for the
existence of the derivative at c.
27.2 LEMMA. If f has a derivative at c, then f is continuous there.
PROOF. Let ε = 1 and take δ = δ(1) such that

\[ \left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| < 1, \]

for all x ∈ D satisfying 0 < |x − c| < δ. From the Triangle Inequality, we
infer that for these values of x we have

\[ |f(x) - f(c)| \le |x - c| \, \{ |f'(c)| + 1 \}. \]

The left side of this expression can be made less than ε if we take x in D
with |x − c| < inf {δ, ε/(|f'(c)| + 1)}. Q.E.D.
It is easily seen that continuity at c is not a sufficient condition for the derivative
to exist at c. For example, if D = R and f(x) = |x|, then f is continuous at every
point of R but has a derivative at a point c if and only if c ≠ 0. By taking simple
algebraic combinations, it is easy to construct continuous functions which do not
have a derivative at a finite or even a countable number of points. In 1872,
Weierstrass shocked the mathematical world by giving an example of a function
which is continuous at every point but whose derivative does not exist anywhere. (In
fact, the function defined by the series

\[ f(x) = \sum_{n=0}^{\infty} \frac{1}{2^n} \cos (3^n x) \]

can be proved to have this property. We shall not go through the details, but refer
the reader to the books of Titchmarsh and Boas for further details and references.)

27.3 LEMMA. (a) If f has a derivative at c and f'(c) > 0, there exists a
number δ > 0 such that if x ∈ D and c < x < c + δ, then f(c) < f(x).
(b) If f'(c) < 0, there exists a number δ > 0 such that if x ∈ D and c − δ <
x < c, then f(c) < f(x).
PROOF. (a) Let ε_0 be such that 0 < ε_0 < f'(c) and let δ = δ(ε_0) correspond
to ε_0 as in Definition 27.1. If x ∈ D and c < x < c + δ, then we have

\[ -\varepsilon_0 < \frac{f(x) - f(c)}{x - c} - f'(c). \]

Since x − c > 0, this relation implies that

\[ 0 < (f'(c) - \varepsilon_0)(x - c) < f(x) - f(c), \]

which proves the assertion in (a). The proof of (b) is similar. Q.E.D.
We recall that the function f is said to have a relative maximum at a
point c in D if there exists a δ > 0 such that f(x) ≤ f(c) when x ∈ D satisfies
|x − c| < δ. A similar definition applies to the term relative minimum.
The next result provides the theoretical justification for the familiar
process of finding points at which f has relative maxima and minima by
examining the zeros of the derivative. It is to be noted that this procedure
applies only to interior points of the interval. In fact, if f(x)=x on
D =[0, 1], then the end point x = 0 yields the unique relative minimum and
the end point x = 1 yields the unique relative maximum of f, but neither is a
root of the derivative. For simplicity, we shall state this result only for
relative maxima, leaving the formulation of the corresponding result for
relative minima to the reader.
27.4 INTERIOR MAXIMUM THEOREM. Let c be an interior point of D at
which f has a relative maximum. If the derivative of f at c exists, then it
must be equal to zero.
PROOF. If f'(c) > 0, then from Lemma 27.3(a) there is a δ > 0 such that
if c < x < c + δ and x ∈ D, then f(c) < f(x). This contradicts the assumption
that f has a relative maximum at c. If f'(c) < 0, we use Lemma 27.3(b).
Q.E.D.

Figure 27.1

27.5 ROLLE'S THEOREM.† Suppose that f is continuous on a closed
interval J = [a, b], that the derivative f' exists in the open interval (a, b),
and that f(a) = f(b) = 0. Then there exists a point c in (a, b) such that
f'(c) = 0.
PROOF. If f vanishes identically on J, we can take c = (a + b)/2. Hence
we suppose that f does not vanish identically; replacing f by −f, if necessary,
we may suppose that f assumes some positive values. By the Maximum
Value Theorem 22.7, the function f attains the value sup {f(x) : x ∈ J}
at some point c of J. Since f(a) = f(b) = 0, the point c satisfies a < c < b.
By hypothesis f'(c) exists and, since f has a relative maximum point at c,
the Interior Maximum Theorem implies that f'(c) = 0. Q.E.D.
As a consequence of Rolle’s Theorem, we obtain the very fundamental
Mean Value Theorem.
27.6 MEAN VALUE THEOREM. Suppose that f is continuous on a closed
interval J = [a, b] and has a derivative in the open interval (a, b). Then
there exists a point c in (a, b) such that

\[ f(b) - f(a) = f'(c)(b - a). \]


PROOF. Consider the function φ defined on J by

\[ \varphi(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}\,(x - a). \]

[It is easily seen that φ is the difference of f and the function whose graph
consists of the line segment passing through the points (a, f(a)) and (b, f(b));
see Figure 27.2.] It follows from the hypotheses that φ is continuous on
J = [a, b] and it is easily checked that φ has a derivative in (a, b).
Furthermore, we have φ(a) = φ(b) = 0. Applying Rolle's Theorem, there
exists a point c inside J such that

\[ 0 = \varphi'(c) = f'(c) - \frac{f(b) - f(a)}{b - a}, \]

from which the result follows. Q.E.D.

Figure 27.2. The mean value theorem.

† This theorem is generally attributed to MICHEL ROLLE (1652-1719), a member of the
French Academy, who made contributions to analytic geometry and the early work leading to
calculus.

27.7 COROLLARY. If f has a derivative on J = [a, b], then there exists a
point c in (a, b) such that

\[ f(b) - f(a) = f'(c)(b - a). \]
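
The Mean Value Theorem asserts only that a point c exists; it does not say how to find it. As an illustration, the following sketch locates such a point for f(x) = x³ on [0, 2] by bisection applied to f'(x) − (f(b) − f(a))/(b − a). The use of bisection, the helper name `bisect`, and the tolerance are our own assumptions; the method works here because f' happens to be continuous and the difference changes sign on [a, b], which the theorem itself does not guarantee in general.

```python
import math


def bisect(g, lo, hi, tol=1e-12):
    """Find a zero of g in [lo, hi], assuming g(lo) and g(hi) have opposite signs."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("g must change sign on [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid              # a zero lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # a zero lies in [mid, hi]
    return (lo + hi) / 2


def f(x):
    return x ** 3


def f_prime(x):
    return 3 * x ** 2


a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)          # slope of the chord, here 4
c = bisect(lambda x: f_prime(x) - slope, a, b)
print(c, 2 / math.sqrt(3))               # numerical c and the exact value 2/sqrt(3)
```

For this particular f the equation 3c² = 4 can be solved by hand, so the exact value 2/√3 serves as a check on the computed point.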


Sometimes it is convenient to have a more general version of the Mean
Value Theorem involving two functions.
27.8 CAUCHY MEAN VALUE THEOREM. Let f, g be continuous on J =
[a, b] and have derivatives inside (a, b). Then there exists a point c in (a, b)
such that

\[ f'(c)\,[g(b) - g(a)] = g'(c)\,[f(b) - f(a)]. \]

PROOF. When g(b) = g(a) the result is immediate if we take c so that
g'(c) = 0 (such a point exists by Rolle's Theorem applied to the function
g − g(a)). If g(b) ≠ g(a), consider the function φ defined on J by

\[ \varphi(x) = f(x) - f(a) - \frac{f(b) - f(a)}{g(b) - g(a)}\,[g(x) - g(a)]. \]

Applying Rolle's Theorem to φ, we obtain the desired result. Q.E.D.
Although the derivative of a function need not be continuous, there is an
elementary but striking theorem due to Darboux† asserting that the derivative
f' attains every value between f'(a) and f'(b) on the interval [a, b].
(See Exercise 27.H.)

† GASTON DARBOUX (1842-1917) was a student of Hermite and a professor at the Collège de
France. Although he is known primarily as a geometer, he made important contributions to
analysis as well.

It is easy to remember the statement of the Mean Value Theorem by
drawing appropriate diagrams. While this should not be discouraged, it
tends to suggest that its importance is geometrical in nature, which is quite
misleading. In fact the Mean Value Theorem is a wolf in sheep’s clothing
and is the Fundamental Theorem of the Differential Calculus. We close
this section with a few elementary consequences of this result. More will
be given in the next section, and still others will appear later.
27.9 THEOREM. Suppose that f is continuous on J = [a, b] and that its
derivative exists in (a, b).
(i) If f'(x) = 0 for a < x < b, then f is constant on J.
(ii) If f'(x) = g'(x) for a < x < b, then f and g differ on J by a constant.
(iii) If f'(x) ≥ 0 for a < x < b and if x_1 ≤ x_2 belong to J, then f(x_1) ≤ f(x_2).
(iv) If f'(x) > 0 for a < x < b and if x_1 < x_2 belong to J, then f(x_1) < f(x_2).
(v) If f'(x) ≥ 0 for a < x < a + δ, then a is a relative minimum point of f.
(vi) If f'(x) ≥ 0 for b − δ < x < b, then b is a relative maximum point of f.
(vii) If |f'(x)| ≤ M for a < x < b, then f satisfies the Lipschitz condition:
|f(x_1) − f(x_2)| ≤ M |x_1 − x_2| for x_1, x_2 in J.
We leave the proof to the reader.

Exercises

27.A. Using the definition, calculate the derivative (when it exists) of the
functions given by the expressions:

(a) f(x) = x² for x ∈ R,
(b) g(x) = x^n for x ∈ R,
(c) h(x) = √x for x ≥ 0,
(d) F(x) = 1/x for x ≠ 0,
(e) G(x) = |x| for x ∈ R,
(f) L(x) = 1/x² for x ≠ 0.
27.B. If f and g are real-valued functions defined on an interval J, and if they are
differentiable at a point c, show that their product h, defined by h(x) = f(x)g(x), for
x ∈ J, is differentiable at c and

\[ h'(c) = f'(c)g(c) + f(c)g'(c). \]
27.C. Show that the function defined for x # 0 by

f(x) = sin (1/x)


is differentiable at each non-zero real number. Show that its derivative is not
bounded on a neighborhood of x = 0. (You may make use of trigonometric iden-
tities, the continuity of the sine and cosine functions, and the elementary limiting
relation (sin u)/u— 1 as u— 0.)
27.D. Show that the function defined by

\[ g(x) = \begin{cases} x^2 \sin (1/x), & x \neq 0, \\ 0, & x = 0, \end{cases} \]

is differentiable for all real numbers, but that g' is not continuous at x = 0.
27.E. The function h: R → R defined by h(x) = x² for x ∈ Q and h(x) = 0 for
x ∉ Q is continuous at exactly one point. Is it differentiable there?
27.F. Let c ∈ D be a cluster point of D and let f: D → R. Show that f'(c) exists if
and only if for every sequence (x_n) in D with x_n ≠ c for n ∈ N such that lim (x_n) = c,
the limit of the sequence

\[ \left( \frac{f(x_n) - f(c)}{x_n - c} \right) \]

exists. In this case the limits of all such sequences are equal to f'(c).
27.G. If f: D → R is differentiable at c ∈ D and if c + 1/n ∈ D for all n ∈ N, show
that

\[ f'(c) = \lim \bigl( n \{ f(c + 1/n) - f(c) \} \bigr). \]


However show that the existence of the limit of this sequence does not imply the
existence of the derivative.
27.H. (Darboux) If f is differentiable on [a, b], if f'(a) =A, f’(b) =B, and if C
lies between A and B, then there exists a point c in (a, b) for which f’(c) = C. (Hint:
consider the lower bound of the function g(x) = f(x) — C(x — a).)
27.I. If g(x) = 0 for x < 0 and g(x) = 1 for x ≥ 0, prove that there does not exist a
function f: R → R such that f'(x) = g(x) for all x ∈ R.
27.J. Give an example of a continuous function with a unique relative maximum
point but such that the derivative does not exist at this point.
27.K. Give an example of a uniformly continuous function which is differenti-
able on (0, 1) but is such that its derivative is not bounded on (0, 1).
27.L. Let f: [a, b] → R be differentiable at c ∈ [a, b]. Show that for every
ε > 0, there is a δ(ε) > 0 such that if 0 < |x − y| < δ(ε) and a ≤ x ≤ c ≤ y ≤ b, then

\[ \left| \frac{f(x) - f(y)}{x - y} - f'(c) \right| < \varepsilon. \]

27.M. Let f: [a, b] → R be differentiable on [a, b]. Show that f' is continuous
on [a, b] if and only if for every ε > 0 there is a δ(ε) > 0 such that if 0 < |x − y| < δ(ε),
x, y ∈ [a, b], then

\[ \left| \frac{f(x) - f(y)}{x - y} - f'(x) \right| < \varepsilon. \]

27.N. Let f: [a, b] → R be continuous on [a, b] and differentiable in (a, b). If
lim_{x→a} f'(x) = A, show that f'(a) exists and equals A.
27.O. If f: R → R and f'(a) exists, show that

\[ f'(a) = \lim_{h \to 0} \frac{f(a + h) - f(a - h)}{2h}. \]

However, give an example to show that the existence of this limit does not imply
the existence of the derivative.
27.P. A function f: R → R is said to be even if f(−x) = f(x) for all x ∈ R, and to
be odd if f(−x) = −f(x) for all x ∈ R. If f is differentiable on R and even
(respectively, odd), show that f' is odd (respectively, even).
27.Q. Let f: (a, b) → R and c ∈ (a, b). We put f(c+) = lim_{x→c, x>c} f(x) (the
right-hand limit of f at c). If the right-hand limit

\[ A_+ = \lim_{\substack{x \to c \\ x > c}} \frac{f(x) - f(c+)}{x - c} \]

exists in R, we say that f has a right-hand derivative at c and denote A_+ by f'_+(c).
Similarly for left-hand derivatives.
Show that if f is continuous at c, then f'(c) exists if and only if f'_+(c) and f'_−(c)
exist and are equal. Show that we can have g'_+(c) = g'_−(c) without g'(c) existing.
27.R. Let I and J be intervals in R and let f: I → R and g: J → R be such that g
is differentiable at a point b ∈ J and f is differentiable at an interior point a = g(b) of
I. Show that the composition h = f∘g defined for {x ∈ J : g(x) ∈ I} is differentiable
at b and that h'(b) = f'(a)g'(b). [Hint: let H be defined on D(h) by

\[ H(x) = \begin{cases} \dfrac{f(g(x)) - f(g(b))}{g(x) - g(b)} & \text{if } g(x) \neq g(b), \\[2ex] f'(a) & \text{if } g(x) = g(b). \end{cases} \]

Show that lim_{x→b} H(x) = f'(a). Then use the fact that (g(x) − g(b))H(x) =
f(g(x)) − f(g(b)) for all x in D(h).]
27.S. Let f: [0, +∞) → R be differentiable on (0, +∞).
(a) If f'(x) → b ∈ R as x → +∞, show that for any h > 0 we have

\[ \lim_{x \to +\infty} \frac{f(x + h) - f(x)}{h} = b. \]

(b) If f(x) → a ∈ R and f'(x) → b ∈ R as x → +∞, then b = 0.
(c) If f'(x) → b ∈ R as x → +∞, then f(x)/x → b as x → +∞.
27.T. Let f: [a, b] → R be differentiable with 0 < m ≤ f'(x) ≤ M for x ∈ [a, b]
and let f(a) < 0 < f(b). Given x_1 ∈ [a, b], define the sequence (x_n) by

\[ x_{n+1} = x_n - \frac{1}{M} f(x_n), \qquad n \in N. \]

Prove that this sequence is well-defined and converges to the unique root x̄ of the
equation f(x) = 0 in [a, b] and that

\[ |x_{n+1} - \bar{x}| \le \frac{|f(x_1)|}{m} \left( 1 - \frac{m}{M} \right)^n \]

for n ∈ N. (Hint: let φ: [a, b] → R be defined by φ(x) = x − f(x)/M. Show that φ
is increasing and a contraction (see 23.4) with constant 1 − m/M.)
27.U. Let f: R → R have a continuous derivative and be such that f(a) = b and
f'(a) ≠ 0. Let δ > 0 be such that if |x − a| ≤ δ then |f'(x) − f'(a)| ≤ ½|f'(a)|, and let
η = ½δ|f'(a)|. Prove that if |y − b| ≤ η, then the sequence (x_n) defined by x_1 = a and

\[ x_{n+1} = x_n - \frac{f(x_n) - y}{f'(a)}, \qquad n \in N, \]

converges to the unique point x in [a − δ, a + δ] such that f(x) = y. (Hint: show
that the function defined by φ(x) = x − (f(x) − y)/f'(a) is a contraction with constant
½ on the interval [a − δ, a + δ].)

Section 28 Further Applications of the Mean Value Theorem

It is hardly possible to overemphasize the importance of the Mean Value
Theorem, for it plays a crucial role in many theoretical considerations. At
the same time it is very useful in many practical matters. In 27.9 we
indicated some immediate consequences of the Mean Value Theorem
which are often useful. We shall now suggest some other areas in which it
can be applied; in doing so we shall draw more freely than before on the
past experience of the reader and his knowledge concerning the derivatives
of certain well-known functions.
28.1 APPLICATION. Rolle's Theorem can be used for the location of
roots of a function. For, if a function g can be identified as the derivative
of a function f, then between any two roots of f there is at least one root of
g. For example, let g(x) = cos x; then g is known to be the derivative of
f(x) = sin x. Hence, between any two roots of sin x there is at least one root
of cos x. On the other hand, g'(x) = −sin x = −f(x), so another application
of Rolle's Theorem tells us that between any two roots of cos x there is at
least one root of sin x. Therefore, we conclude that the roots of sin x and
cos x interlace each other. This conclusion is probably not news to the
reader; however, the same type of argument can be applied to Bessel†
functions J_n of order n = 0, 1, 2, ... by using the relations

\[ [x^n J_n(x)]' = x^n J_{n-1}(x), \qquad [x^{-n} J_n(x)]' = -x^{-n} J_{n+1}(x) \qquad \text{for } x > 0. \]

The details of this argument should be supplied by the reader.

† FRIEDRICH WILHELM BESSEL (1784-1846) was an astronomer and mathematician. A
close friend of Gauss, he is best known for the differential equation which bears his name.

28.2 APPLICATION. We can apply the Mean Value Theorem for approximate
calculations and to obtain error estimates. For example, suppose
it is desired to evaluate √105. We employ the Mean Value Theorem
with f(x) = √x, a = 100, b = 105 to obtain

\[ \sqrt{105} - \sqrt{100} = \frac{5}{2\sqrt{c}}, \]

for some number c with 100 < c < 105. Since 10 < √c < √105 < √121 =
11, we can assert that

\[ \frac{5}{2(11)} < \sqrt{105} - 10 < \frac{5}{2(10)}, \]

whence it follows that 10.22 < √105 < 10.25. This estimate may not be as
sharp as desired. It is clear that the estimate √c < √105 < √121 was
wasteful and can be improved by making use of our conclusion that
√105 < 10.25. Thus, √c < 10.25 and we easily determine that

\[ 0.243 < \frac{5}{2(10.25)} < \sqrt{105} - 10. \]

Our improved estimate is 10.243 < √105 < 10.250 and more accurate estimates
can be obtained in this way.
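
The two rounds of estimation in 28.2 can be reproduced mechanically. The sketch below is only an illustration of that arithmetic: the function name `sqrt105_bounds` is ours, and the call to `math.sqrt` is used solely to compare the bounds with the machine value.

```python
import math


def sqrt105_bounds(upper_on_sqrt_c):
    """Bounds for sqrt(105) from sqrt(105) - 10 = 5 / (2 sqrt(c)),
    using 10 < sqrt(c) < upper_on_sqrt_c."""
    lower = 10 + 5 / (2 * upper_on_sqrt_c)  # sqrt(c) replaced by its upper bound
    upper = 10 + 5 / (2 * 10)               # sqrt(c) replaced by its lower bound 10
    return lower, upper


first = sqrt105_bounds(11.0)        # uses sqrt(c) < sqrt(121) = 11
second = sqrt105_bounds(first[1])   # uses the improved bound sqrt(c) < 10.25
print(first)
print(second)
print(math.sqrt(105))
```

Note that only the lower bound improves in the second round; the upper bound 10.25 comes from √c > 10 and is unaffected by a sharper upper estimate for √c.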
28.3 APPLICATION. The Mean Value Theorem and its corollaries can
be used to establish inequalities and to extend inequalities that are known
for integral or rational values to real values.
For example, we recall that Bernoulli's Inequality 5.C asserts that if
1 + x > 0 and n ∈ N, then (1 + x)^n ≥ 1 + nx. We shall show that this
inequality holds for any real exponent r ≥ 1. To do so, let f(x) = (1 + x)^r, so
that f'(x) = r(1 + x)^{r−1}. If −1 < x < 0, then f'(x) < r, while if x > 0, then
f'(x) > r. If we apply the Mean Value Theorem to both of these cases, we
obtain the result

\[ (1 + x)^r \ge 1 + rx, \]

when 1 + x > 0 and r ≥ 1. Moreover, if r > 1, then the equality occurs if
and only if x = 0.
As a similar result, let α be a real number satisfying 0 < α < 1 and let
g(x) = αx − x^α for x ≥ 0. Then g'(x) = α(1 − x^{α−1}), so that g'(x) < 0 for
0 < x < 1 and g'(x) > 0 for x > 1. Consequently, if x ≥ 0, then g(x) ≥ g(1)
and g(x) = g(1) if and only if x = 1. Therefore, if x ≥ 0 and 0 < α < 1, then
we have

\[ x^\alpha \le \alpha x + (1 - \alpha). \]

If a ≥ 0 and b > 0 and if we let x = a/b and multiply by b, we obtain the
inequality

\[ a^\alpha b^{1-\alpha} \le \alpha a + (1 - \alpha) b, \]

where equality holds if and only if a = b. This inequality is often the
starting point in establishing the important Hölder Inequality (cf. Project
8.B).
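
A quick numerical spot check of the last inequality can be reassuring, although it proves nothing. In the sketch below the function name `weighted_gap`, the sampling ranges, and the fixed random seed are our own arbitrary choices.

```python
import random


def weighted_gap(a, b, alpha):
    """Return alpha*a + (1 - alpha)*b - a**alpha * b**(1 - alpha)."""
    return alpha * a + (1 - alpha) * b - a ** alpha * b ** (1 - alpha)


random.seed(0)
worst = min(
    weighted_gap(
        random.uniform(0.0, 10.0),    # a >= 0
        random.uniform(0.1, 10.0),    # b > 0
        random.uniform(0.01, 0.99),   # 0 < alpha < 1
    )
    for _ in range(10_000)
)
print(worst)                       # smallest gap found among the samples
print(weighted_gap(3.0, 3.0, 0.4))  # the case a = b, where equality is expected
```

The smallest gap found should be nonnegative up to rounding error, and the gap for a = b should vanish up to rounding error, in agreement with the statement about equality.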
28.4 APPLICATION. The familiar rules of L'Hospital† on the evaluation
of "indeterminate forms" can be established by means of the Cauchy
Mean Value Theorem. For example, suppose that f, g are continuous on
[a, b] and have derivatives in (a, b), that f(a) = g(a) = 0, but that g, g' do
not vanish for x ≠ a. Then there exists a point c with a < c < b such that

\[ \frac{f(b)}{g(b)} = \frac{f'(c)}{g'(c)}. \]

It follows that if lim_{x→a} f'(x)/g'(x) exists, then

\[ \lim_{x \to a} \frac{f(x)}{g(x)} = \lim_{x \to a} \frac{f'(x)}{g'(x)}. \]

The case where the functions become infinite at x = a, or where the point
at which the limit is taken is infinite, or where we have an "indeterminate"
of some other form, can often be treated by taking logarithms, exponentials
or some similar manipulation.

† GUILLAUME FRANÇOIS L'HOSPITAL (1661-1704) was a student of Johann Bernoulli
(1667-1748). The Marquis de L'Hospital published his teacher's lectures on differential
calculus in 1696, thereby presenting the first textbook on calculus to the world.

For example, if a = 0 and we wish to evaluate the limit of h(x) = x log x
as x → 0, we cannot apply the above argument. We write h(x) in the form
f(x)/g(x) where f(x) = log x and g(x) = 1/x, x > 0. It is seen that

\[ \frac{f'(x)}{g'(x)} = \frac{1/x}{-1/x^2} = -x \to 0, \qquad \text{as } x \to 0. \]

Let ε > 0 and choose a fixed number 0 < x_1 < 1 such that if 0 < x < x_1, then
|f'(x)/g'(x)| < ε. Applying the Cauchy Mean Value Theorem, we have

\[ \left| \frac{f(x) - f(x_1)}{g(x) - g(x_1)} \right| = \left| \frac{f'(x_2)}{g'(x_2)} \right| < \varepsilon, \]

with x_2 satisfying 0 < x < x_2 < x_1. Since f(x) ≠ 0 and g(x) ≠ 0 for 0 < x < x_1,
we can write the quantity appearing on the left side in the more convenient
form

\[ \frac{f(x)}{g(x)} \left\{ \frac{1 - \dfrac{f(x_1)}{f(x)}}{1 - \dfrac{g(x_1)}{g(x)}} \right\}. \]

Holding x_1 fixed, we let x → 0. Since the quantity in braces converges to
1, it exceeds ½ for x sufficiently small. We infer from the above that

\[ |h(x)| = \left| \frac{f(x)}{g(x)} \right| < 2\varepsilon, \]

for x sufficiently near 0. Thus the limit at x = 0 of h is 0.
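
The conclusion that x log x tends to 0 can be observed numerically. The following sketch (our own illustration; the sample points 10^(−k) are arbitrary) tabulates h(x) = x log x next to the ratio f'(x)/g'(x) = −x used in the argument.

```python
import math


def h(x):
    """h(x) = x log x for x > 0."""
    return x * math.log(x)


for k in range(1, 8):
    x = 10.0 ** (-k)
    # f'(x) / g'(x) = (1/x) / (-1/x**2) = -x is the ratio of derivatives
    print(x, h(x), -x)
```

Both columns shrink toward 0, although h(x) does so more slowly than −x because of the logarithmic factor.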

Interchange of Limit and Derivative


Let (f,) be a sequence of functions defined on an interval J of R and with
values in R. It is easy to give an example of a sequence of functions which
have derivatives at every point of J and which converges on J to a function
f which does not have a derivative at some points of J. (Do so!)
Moreover, the example of Weierstrass mentioned before can be used to
give an example of a sequence of functions possessing derivatives at every
point of R and converging uniformly on R to a continuous function which
has a derivative at no point. Thus it is not permissible, in general, to
differentiate the limit of a convergent sequence of functions possessing
derivatives even when the convergence is uniform.
We shall now show that if the sequence of derivatives is uniformly
convergent, then all is well. If one adds the hypothesis that the derivatives
are continuous, then it is possible to give a short proof based on the
Riemann integral. However, if the derivatives are not assumed to be
continuous, a somewhat more delicate argument is required.
28.5 THEOREM. Let (f_n) be a sequence of functions defined on a finite
interval J of R and with values in R. Suppose that there is a point x_0 in J at
which the sequence (f_n(x_0)) converges, that the derivatives f_n' exist on J, and
that the sequence (f_n') converges uniformly on J to a function g. Then the
sequence (f_n) converges uniformly on J to a function f which has a derivative
at every point of J and f' = g.
PROOF. Suppose the end points of J are a < b and let x be any point of
J. If m, n are natural numbers, we apply the Mean Value Theorem to the
difference f_m − f_n on the interval with end points x_0, x to conclude that
there exists a point y (depending on m, n) such that

\[ f_m(x) - f_n(x) = f_m(x_0) - f_n(x_0) + (x - x_0)\{ f_m'(y) - f_n'(y) \}. \]

Hence we infer that

\[ \| f_m - f_n \|_J \le | f_m(x_0) - f_n(x_0) | + (b - a) \| f_m' - f_n' \|_J, \]

so the sequence (f_n) converges uniformly on J to a function we shall denote
by f. Since the f_n are continuous and the convergence of (f_n) to f is
uniform, then f is continuous on J.
To establish the existence of the derivative of f at a point c in J, we apply
the Mean Value Theorem to the difference f_m − f_n on an interval with end
points c, x to infer that there exists a point z (depending on m, n) such that

\[ \{ f_m(x) - f_n(x) \} - \{ f_m(c) - f_n(c) \} = (x - c)\{ f_m'(z) - f_n'(z) \}. \]

We infer that, when c ≠ x, then

\[ \left| \frac{f_m(x) - f_m(c)}{x - c} - \frac{f_n(x) - f_n(c)}{x - c} \right| \le \| f_m' - f_n' \|_J. \]

In virtue of the uniform convergence of the sequence (f_n'), the right hand
side is dominated by ε when m, n ≥ M(ε). Taking the limit with respect
to m, we infer from Lemma 15.8 that

\[ \left| \frac{f(x) - f(c)}{x - c} - \frac{f_n(x) - f_n(c)}{x - c} \right| \le \varepsilon \]

when n ≥ M(ε). Since g(c) = lim (f_n'(c)), there exists an N(ε) such that if
n ≥ N(ε), then |f_n'(c) − g(c)| < ε. Now let K = sup {M(ε), N(ε)}. In view
of the existence of f_K'(c), if 0 < |x − c| < δ_K(ε), then

\[ \left| \frac{f_K(x) - f_K(c)}{x - c} - f_K'(c) \right| < \varepsilon. \]

Therefore, it follows that if 0 < |x − c| < δ_K(ε), then

\[ \left| \frac{f(x) - f(c)}{x - c} - g(c) \right| < 3\varepsilon. \]

This shows that f'(c) exists and equals g(c). Q.E.D.

Taylor’s Theorem

If the derivative f'(x) of f exists at every point x of a set D, we can
consider the existence of the derivative of the function f' at a point c ∈ D.
In case f' has a derivative at c, we refer to the resulting number as the
second derivative of f at c and shall ordinarily denote this number by f''(c),
or by f^(2)(c). In a similar fashion we define the third derivative f'''(c) = f^(3)(c), ...,
and the nth derivative f^(n)(c), ..., whenever these derivatives exist.
We shall now obtain the celebrated theorem attributed to Brook
Taylor,† which plays an important role in many investigations and can be
regarded as an extension of the Mean Value Theorem.
28.6 TAYLOR'S THEOREM. Suppose that n is a natural number, that f
and its derivatives f', f'', ..., f^(n−1) are defined and continuous on J = [a, b],
and that f^(n) exists in (a, b). If α, β belong to J, then there exists a number γ
between α and β such that

\[ f(\beta) = f(\alpha) + \frac{f'(\alpha)}{1!}(\beta - \alpha) + \frac{f''(\alpha)}{2!}(\beta - \alpha)^2 + \cdots + \frac{f^{(n-1)}(\alpha)}{(n-1)!}(\beta - \alpha)^{n-1} + \frac{f^{(n)}(\gamma)}{n!}(\beta - \alpha)^n. \]

PROOF. Let P be the real number defined by the relation

\[ \frac{(\beta - \alpha)^n}{n!}\,P = f(\beta) - \left\{ f(\alpha) + \frac{f'(\alpha)}{1!}(\beta - \alpha) + \cdots + \frac{f^{(n-1)}(\alpha)}{(n-1)!}(\beta - \alpha)^{n-1} \right\}, \tag{28.1} \]

and consider the function φ defined on J by

\[ \varphi(x) = f(\beta) - \left\{ f(x) + \frac{f'(x)}{1!}(\beta - x) + \cdots + \frac{f^{(n-1)}(x)}{(n-1)!}(\beta - x)^{n-1} + \frac{P}{n!}(\beta - x)^n \right\}. \]

Clearly, φ is continuous on J and has a derivative on (a, b). It is evident
that φ(β) = 0 and it follows from the definition of P that φ(α) = 0. By
Rolle's Theorem, there exists a point γ between α and β such that
φ'(γ) = 0. On calculating the derivative φ' (using the usual formula for the
derivative of a sum and product of two functions), we obtain the telescoping
sum

\[ \varphi'(x) = -\left\{ f'(x) - f'(x) + \frac{f''(x)}{1!}(\beta - x) - \frac{f''(x)}{1!}(\beta - x) + \cdots + \frac{f^{(n)}(x)}{(n-1)!}(\beta - x)^{n-1} - \frac{P}{(n-1)!}(\beta - x)^{n-1} \right\} = \frac{P - f^{(n)}(x)}{(n-1)!}(\beta - x)^{n-1}. \]

Since φ'(γ) = 0, then P = f^(n)(γ), proving the assertion. Q.E.D.

† BROOK TAYLOR (1685-1731) was an early English mathematician. In 1715 he gave the
infinite series expansion, but, true to the spirit of the time, did not discuss convergence.
The remainder was supplied by Lagrange.



REMARK. The remainder term

\[ R_n = \frac{f^{(n)}(\gamma)}{n!}(\beta - \alpha)^n \tag{28.2} \]

given above is often called the Lagrange form of the remainder. There are
many other expressions for the remainder, but for the present, we mention
only the Cauchy form which asserts that for some number θ with 0 < θ < 1,
then

\[ R_n = (1 - \theta)^{n-1}\, \frac{f^{(n)}((1 - \theta)\alpha + \theta\beta)}{(n-1)!}\,(\beta - \alpha)^n. \tag{28.3} \]

This form can be established as above, except that on the left side of
equation (28.1) we put (β − α)Q/(n − 1)! and we define φ as above except
its last term is (β − x)Q/(n − 1)!. We leave the details as an exercise. (In
Section 31 we shall obtain another form involving use of the integral to
evaluate the remainder term.)
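
To see the Lagrange form (28.2) at work, one may take f(x) = e^x, for which every derivative is again e^x, so that |R_n| is at most the largest value of e^x between α and β times |β − α|^n/n!. The sketch below compares the actual remainder with this bound; the function names and the choice α = 0, β = 1 are ours, and `math.exp` stands in for the exponential function.

```python
import math


def taylor_exp(beta, n, alpha=0.0):
    """Sum of f^(k)(alpha)/k! * (beta - alpha)**k for k = 0, ..., n-1, with f = exp."""
    return sum(
        math.exp(alpha) * (beta - alpha) ** k / math.factorial(k)
        for k in range(n)
    )


def lagrange_bound(beta, n, alpha=0.0):
    """Bound on |R_n| from (28.2): sup of exp between alpha and beta,
    times |beta - alpha|**n / n!."""
    return math.exp(max(alpha, beta)) * abs(beta - alpha) ** n / math.factorial(n)


beta = 1.0
for n in (2, 4, 8, 12):
    remainder = math.exp(beta) - taylor_exp(beta, n)
    print(n, remainder, lagrange_bound(beta, n))
```

In each row the remainder is smaller in absolute value than the bound, and both decrease rapidly as n grows.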

Exercises

28.A. Using the formulas in 28.1, show that if n = 0, 1, 2, ..., then the roots of
the Bessel functions J_n and J_{n+1} on (0, +∞) interlace each other.
28.B. Show that if x > 0, then

\[ 1 + \tfrac{1}{2}x - \tfrac{1}{8}x^2 \le \sqrt{1 + x} \le 1 + \tfrac{1}{2}x. \]

28.C. Calculate √1.2 and √2. What is the best accuracy you can be sure of?
28.D. Get estimates similar to those in Exercise 28.B for (1 + x)^{1/3} on the interval
[0, 7]. Use these to calculate ∛1.5 and ∛2.
28.E. Suppose that 0 < r < 1 and −1 < x. Show that we have (1 + x)^r ≤ 1 + rx
and that the equality holds if and only if x = 0.
28.F. A root x_0 of a polynomial p is said to be simple (or have multiplicity one) if
p'(x_0) ≠ 0, and to have multiplicity n if p(x_0) = p'(x_0) = ⋯ = p^(n−1)(x_0) = 0, but
p^(n)(x_0) ≠ 0.
If a < b are consecutive roots of a polynomial, then there are an odd number
(counting multiplicities) of roots of its derivative in (a, b).
28.G. Show that if the roots of the polynomial p are all real, then the roots of p'
are all real. If, in addition, the roots of p are all simple, then the roots of p' are all
simple.
28.H. If f(x) = (x² − 1)^n and if p is the nth derivative of f, then p is a polynomial
of degree n whose roots are simple and lie in the open interval (−1, 1).
28.I. Establish the Cauchy form of the remainder term R_n in Taylor's Theorem
given in formula (28.3).
28.J. A proof of Taylor's Theorem 28.6 using the Cauchy Mean Value Theorem
can be given by letting

\[ R(x) = f(x) - \left\{ f(\alpha) + \frac{f'(\alpha)}{1!}(x - \alpha) + \cdots + \frac{f^{(n-1)}(\alpha)}{(n-1)!}(x - \alpha)^{n-1} \right\}. \]

Show that R(α) = R'(α) = ⋯ = R^(n−1)(α) = 0 and R^(n)(x) = f^(n)(x). Note that there
exists γ_1 between α and β such that

\[ \frac{R(\beta)}{(\beta - \alpha)^n} = \frac{R(\beta) - R(\alpha)}{(\beta - \alpha)^n - 0^n} = \frac{R'(\gamma_1)}{n(\gamma_1 - \alpha)^{n-1}}. \]

Continue this to find that R(β) = (β − α)^n f^(n)(γ_n)/n! for some γ_n between α and β.
28.K. If f(x) = e^x, show that the remainder term in Taylor's Theorem converges
to zero as n → ∞ for each fixed α, β.
28.L. If f(x) = sin x, show that the remainder term in Taylor's Theorem converges
to zero as n → ∞ for each fixed α, β.
28.M. If f(x) = (1 + x)^m where m ∈ Q, |x| < 1, the usual differentiation formulas
from calculus and Taylor's Theorem lead to the expression

\[ (1 + x)^m = 1 + \binom{m}{1} x + \binom{m}{2} x^2 + \cdots + \binom{m}{n-1} x^{n-1} + R_n, \]

where R_n can be given in Lagrange's form by R_n = x^n f^(n)(θ_n x)/n! where 0 < θ_n < 1.
Show that if 0 ≤ x < 1, then lim (R_n) = 0. Show that if −1 < x < 0, then we cannot
use the same argument to show that lim (R_n) = 0.
28.N. In the preceding exercise, use Cauchy's form of the remainder to obtain

\[ R_n = \frac{m(m-1)\cdots(m-n+1)}{1 \cdot 2 \cdots (n-1)}\, \frac{(1 - \theta_n)^{n-1} x^n}{(1 + \theta_n x)^{n-m}}, \]

where 0 < θ_n < 1. When |x| < 1 show that |(1 − θ_n)/(1 + θ_n x)| < 1, and prove that
lim (R_n) = 0.
28.O. If f: R → R, if f'(x) exists for x ∈ R, and if f''(a) exists, show that

\[ f''(a) = \lim_{h \to 0} \frac{f(a + h) - 2f(a) + f(a - h)}{h^2}. \]

Give an example where this limit exists, but the function does not have a second
derivative at a.
28.P. Let f_n(x) = |x|^{1+1/n} for x in [−1, 1]. Show that each f_n is differentiable on
[−1, 1] and that (f_n) converges uniformly on [−1, 1] to f(x) = |x|.

Projects
28.α. In this project we consider the exponential function from the point of view
of differential calculus.
(a) Suppose that a function E on J = (a, b) to R has a derivative at every point of
J and that E'(x) = E(x) for all x ∈ J. Observe that E has derivatives of all orders on
J and they all equal E.
(b) If E(α) = 0 for some α ∈ J, apply Taylor's Theorem 28.6 and Exercise 14.L to
show that E(x) = 0 for all x ∈ J.
(c) Show that there exists at most one function E on R to R which satisfies

\[ E'(x) = E(x) \quad \text{for } x \in R, \qquad E(0) = 1. \]

(d) Prove that if E satisfies the conditions in part (c), then it also satisfies the
functional equation

\[ E(x + y) = E(x)E(y) \quad \text{for } x, y \in R. \]

(Hint: if f(x) = E(x + y)/E(y), then f'(x) = f(x) and f(0) = 1.)
(e) Let (E_n) be the sequence of functions defined on R by

\[ E_1(x) = 1 + x, \qquad E_n(x) = E_{n-1}(x) + x^n/n!. \]

Let A be any positive number; if |x|= A and if m>=n>2A, then

Hence the sequence (E,,) converges uniformly for |x| =< A.



(f) If $(E_n)$ is the sequence of functions defined in part (e), then

\[ E_n'(x) = E_{n-1}(x) \quad \text{for } x \in \mathbf{R}. \]

Show that the sequence $(E_n)$ converges on $\mathbf{R}$ to a function $E$ with the properties
displayed in part (c). Therefore, $E$ is the unique function with these properties.
(g) Let $E$ be the function with $E' = E$ and $E(0) = 1$. If we define $e$ to be the
number

\[ e = E(1), \]

then $e$ lies between $2\tfrac{2}{3}$ and $2\tfrac{3}{4}$. (Hint: $1 + 1 + \tfrac12 + \tfrac16 < e < 1 + 1 + \tfrac12 + \tfrac16 + \tfrac1{12}$. More
precisely, we can show that $2.708 < 2 + \tfrac{17}{24} < e < 2 + \tfrac{13}{18} < 2.723$.)
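
The bounds on $e$ in part (g) can be checked with exact rational arithmetic. The following Python sketch is not part of the original text; the helper names `partial_sum` and `tail_bound` are illustrative, and the tail estimate $E(1) - E_n(1) \le 2/(n+1)!$ for $n > 2$ is an assumption obtained by comparing the tail of the series with a geometric series of ratio less than $1/2$.

```python
from fractions import Fraction
from math import e, factorial


def partial_sum(n: int) -> Fraction:
    """E_n(1) = 1 + 1 + 1/2! + ... + 1/n!, computed exactly."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))


def tail_bound(n: int) -> Fraction:
    """Assumed bound 2/(n+1)! on E(1) - E_n(1), used here for A = 1 and n > 2."""
    return Fraction(2, factorial(n + 1))


lower = partial_sum(3)                  # 8/3, that is 2 2/3
upper = partial_sum(3) + tail_bound(3)  # 8/3 + 1/12 = 11/4, that is 2 3/4
print(lower, upper, float(lower) < e < float(upper))
```

Taking a larger index in `partial_sum` tightens the enclosure, at the cost of larger numerators and denominators.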

28.β. In this project, you may use the results of the preceding one. Let $E$
denote the unique function on $\mathbf{R}$ such that

\[ E' = E \quad \text{and} \quad E(0) = 1, \]

and let $e = E(1)$.
(a) Show that $E$ is strictly increasing and has range $P = \{x \in \mathbf{R} : x > 0\}$.
(b) Let $L$ be the inverse function of $E$, so that the domain of $L$ is $P$ and its range
is all of $\mathbf{R}$. Prove that $L$ is strictly increasing on $P$, that $L(1) = 0$, and that $L(e) = 1$.
(c) Show that $L(xy) = L(x) + L(y)$ for all $x, y$ in $P$.
(d) If $0 < x < y$, then

\[ \frac{1}{y}(y - x) < L(y) - L(x) < \frac{1}{x}(y - x). \]

(Hint: apply the Mean Value Theorem to $E$.)
(e) The function $L$ has a derivative for $x > 0$ and $L'(x) = 1/x$.
(f) The number $e$ satisfies

\[ e = \lim \left( \left(1 + \frac{1}{n}\right)^{n} \right). \]

(Hint: evaluate $L'(1)$ by using the sequence $((1 + 1/n))$ and the continuity of $E$.)

28.γ. In this project we shall introduce the sine and cosine.
(a) Let $h$ be defined on an interval $J = (a, b)$ to $\mathbf{R}$ and satisfy

\[ h''(x) + h(x) = 0 \]

for all $x$ in $J$. Show that $h$ has derivatives of all orders and that if there is a point $\alpha$
in $J$ such that $h(\alpha) = 0$, $h'(\alpha) = 0$, then $h(x) = 0$ for all $x \in J$. (Hint: use Taylor's
Theorem 28.6.)
(b) Show that there exists at most one function $C$ on $\mathbf{R}$ satisfying the conditions

\[ C'' + C = 0, \qquad C(0) = 1, \qquad C'(0) = 0, \]

and at most one function $S$ on $\mathbf{R}$ satisfying

\[ S'' + S = 0, \qquad S(0) = 0, \qquad S'(0) = 1. \]

(c) We define a sequence $(C_n)$ by

\[ C_1(x) = 1 - x^2/2, \qquad C_n(x) = C_{n-1}(x) + (-1)^n \frac{x^{2n}}{(2n)!}. \]

Let $A$ be any positive number; if $|x| \le A$ and if $m \ge n > A$, then

\[ |C_m(x) - C_n(x)| \le \frac{A^{2n+2}}{(2n+2)!}\left[1 + \left(\frac{A}{2n}\right)^{2} + \cdots + \left(\frac{A}{2n}\right)^{2m-2n}\right] < \frac{4}{3}\,\frac{A^{2n+2}}{(2n+2)!}. \]

Hence the sequence $(C_n)$ converges uniformly for $|x| \le A$. Show also that
$C_n'' = -C_{n-1}$, and $C_n(0) = 1$ and $C_n'(0) = 0$. Prove that the limit $C$ of the sequence
$(C_n)$ is the unique function with the properties in part (b).
(d) Let $(S_n)$ be defined by

\[ S_1(x) = x, \qquad S_n(x) = S_{n-1}(x) + (-1)^{n-1} \frac{x^{2n-1}}{(2n-1)!}. \]

Show that $(S_n)$ converges uniformly for $|x| \le A$ to the unique function $S$ with the
properties in part (b).
(e) Prove that $S' = C$ and $C' = -S$.
(f) Establish the Pythagorean Identity $S^2 + C^2 = 1$. (Hint: calculate the derivative
of $S^2 + C^2$.)

28.δ. This project continues the discussion of the sine and cosine functions. Free
use may be made of the properties established in the preceding project.
(a) Suppose that $h$ is a function on $\mathbf{R}$ which satisfies the equation

\[ h'' + h = 0. \]

Show that there exist constants $\alpha$, $\beta$ such that $h = \alpha C + \beta S$. (Hint: $\alpha = h(0)$,
$\beta = h'(0)$.)
(b) The function $C$ is even and $S$ is odd in the sense that

\[ C(-x) = C(x) \quad \text{and} \quad S(-x) = -S(x) \quad \text{for all } x \text{ in } \mathbf{R}. \]

(c) Show that the "addition formulas"

\[ C(x+y) = C(x)C(y) - S(x)S(y), \]
\[ S(x+y) = S(x)C(y) + C(x)S(y), \]

hold for all $x, y$ in $\mathbf{R}$. (Hint: let $y$ be fixed, define $h(x) = C(x+y)$, and show that
$h'' + h = 0$.)
(d) Show that the "duplication formulas"

\[ C(2x) = 2[C(x)]^2 - 1 = 1 - 2[S(x)]^2, \]
\[ S(2x) = 2S(x)C(x), \]

hold for all $x$ in $\mathbf{R}$.

(e) Prove that $C$ satisfies the inequality

\[ C_1(x) = 1 - \frac{x^2}{2} \le C(x) \le 1 - \frac{x^2}{2} + \frac{x^4}{24} = C_2(x). \]

Therefore, the smallest positive root $\gamma$ of $C$ lies between the positive root of
$x^2 - 2 = 0$ and the smallest positive root of $x^4 - 12x^2 + 24 = 0$. Using this, prove
that $\sqrt{2} < \gamma < \sqrt{3}$.

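(f) We define $\pi$ to be the smallest positive root of $S$. Prove that $\pi = 2\gamma$ and
hence that $2\sqrt{2} < \pi < 2\sqrt{3}$.
(g) Prove that both $C$ and $S$ are periodic functions with period $2\pi$ in the sense
that $C(x + 2\pi) = C(x)$ and $S(x + 2\pi) = S(x)$ for all $x$ in $\mathbf{R}$. Also show that

\[ S(x) = C\left(\frac{\pi}{2} - x\right) = -C\left(x + \frac{\pi}{2}\right), \]
\[ C(x) = S\left(\frac{\pi}{2} - x\right) = S\left(x + \frac{\pi}{2}\right), \]

for all $x$ in $\mathbf{R}$.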

28.ε. Following the model of the preceding two projects, introduce the hyperbolic
cosine and sine as functions satisfying

\[ c'' = c, \qquad c(0) = 1, \qquad c'(0) = 0, \]
\[ s'' = s, \qquad s(0) = 0, \qquad s'(0) = 1, \]

respectively. Establish the existence and the uniqueness of these functions and
show that $c^2 - s^2 = 1$.

Prove results similar to (a)-(d) of Project 28.δ and show that, if the exponential
function is denoted by $E$, then

\[ c(x) = \tfrac12 (E(x) + E(-x)), \qquad s(x) = \tfrac12 (E(x) - E(-x)). \]


28.ζ. A function $\varphi$ on an interval $J$ of $\mathbf{R}$ to $\mathbf{R}$ is said to be (midpoint) convex
in case

\[ \varphi\left(\frac{x+y}{2}\right) \le \frac12 (\varphi(x) + \varphi(y)) \]

for each $x, y$ in $J$. (In geometrical terms: the midpoint of any chord of the curve
$y = \varphi(x)$ lies above or on the curve.) In this project we shall suppose that $\varphi$ is a
continuous convex function.
(a) If $n = 2^m$ and if $x_1, \ldots, x_n$ belong to $J$, then

\[ \varphi\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) \le \frac{1}{n} (\varphi(x_1) + \cdots + \varphi(x_n)). \]

(b) If $n < 2^m$ and if $x_1, \ldots, x_n$ belong to $J$, let $x_j$ for $j = n+1, \ldots, 2^m$ be equal to

\[ \bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}. \]

Show that the same inequality holds as in part (a).
(c) Since $\varphi$ is continuous, show that if $x, y$ belong to $J$ and $t \in I = [0, 1]$, then

\[ \varphi((1-t)x + ty) \le (1-t)\varphi(x) + t\varphi(y). \]

(In geometrical terms: the entire chord lies above or on the curve.)
(d) Suppose that $\varphi$ has a second derivative on $J$. Then a necessary and sufficient
condition that $\varphi$ be convex on $J$ is that $\varphi''(x) \ge 0$ for $x \in J$. (Hint: to prove the
necessity, use Exercise 28.O. To prove the sufficiency, use Taylor's Theorem and
expand about $\bar{x} = (x + y)/2$.)

(e) If $\varphi$ is a continuous convex function on $J$ and if $x < y \le z$ belong to $J$, show
that

\[ \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(z) - \varphi(x)}{z - x}. \]

Therefore, if $w < x \le y < z$ belong to $J$, then

\[ \frac{\varphi(x) - \varphi(w)}{x - w} \le \frac{\varphi(z) - \varphi(y)}{z - y}. \]

(f) Prove that a continuous convex function $\varphi$ on $J$ has a left-hand derivative and
a right-hand derivative at every point. Furthermore, the subset where $\varphi'$ does not
exist is countable.

Section 29 The Riemann-Stieltjes Integral

In this section we shall define the Riemann-Stieltjes† integral of
bounded functions on a compact interval of R. Since we assume that the
reader is acquainted, at least informally, with the integral from a calculus
course, we will not provide an extensive motivation for it.
The reader who continues his study of mathematical analysis will want to
become familiar with the more general Lebesgue integral at an early date.
However, since the Riemann and the Riemann-Stieltjes integrals are
adequate for many purposes and are more familiar to the reader, we prefer
to treat them here and leave the more advanced Lebesgue theory for a
later course.
We shall consider bounded real-valued functions on closed intervals of
the real number system, define the integral of one such function with
respect to another, and derive the main properties of this integral. The
type of integration considered here is somewhat more general than that
considered in earlier courses and the added generality makes it very useful
in certain applications, especially in statistics. At the same time, there is
little additional complication to the theoretical machinery that a rigorous
discussion of the ordinary Riemann integral requires. Therefore, it is
worthwhile to develop this type of integration theory as far as its most
frequent applications require.

† (GEORG FRIEDRICH) BERNHARD RIEMANN (1826-1866) was the son of a poor country
minister and was born near Hanover. He studied at Göttingen and Berlin and taught at
Göttingen. He was one of the founders of the theory of analytic functions, but also made
fundamental contributions to geometry, number theory, and mathematical physics.
THOMAS JOANNES STIELTJES (1856-1894) was a Dutch astronomer and mathematician. He
studied in Paris with Hermite and obtained a professorship at Toulouse. His most famous
work was a memoir on continued fractions, the moment problem, and the Stieltjes integral,
which was published in the last year of his short life.

Let $f$ and $g$ denote real-valued functions defined on a closed interval
$J = [a, b]$ of the real line. We shall suppose that both $f$ and $g$ are bounded
on $J$; this standing hypothesis will not be repeated. A partition of $J$ is a
finite collection of non-overlapping intervals whose union is $J$. Usually,
we describe a partition $P$ by specifying a finite set of real numbers
$(x_0, x_1, \ldots, x_n)$ such that

\[ a = x_0 \le x_1 \le \cdots \le x_n = b \]

and such that the subintervals occurring in the partition $P$ are the intervals
$[x_{k-1}, x_k]$, $k = 1, 2, \ldots, n$. More properly, we refer to the end points $x_k$,
$k = 0, 1, \ldots, n$ as the partition points corresponding to $P$. However, in
practice it is often convenient and can cause no confusion to use the word
"partition" to denote either the collection of subintervals or the collection
of end points of these subintervals. Hence we write $P = (x_0, x_1, \ldots, x_n)$.
If $P$ and $Q$ are partitions of $J$, we say that $Q$ is a refinement of $P$ or that $Q$
is finer than $P$ in case every subinterval in $Q$ is contained in some subinterval
in $P$. This is equivalent to the requirement that every partition point in $P$ is
also a partition point in $Q$. For this reason, we write $P \subseteq Q$ when $Q$ is a
refinement of $P$.
29.1 DEFINITION. If $P$ is a partition of $J$, then a Riemann-Stieltjes sum
of $f$ with respect to $g$ and corresponding to $P = (x_0, x_1, \ldots, x_n)$ is a real
number $S(P; f, g)$ of the form

\[ S(P; f, g) = \sum_{k=1}^{n} f(\xi_k)\{g(x_k) - g(x_{k-1})\}. \tag{29.1} \]

Here we have selected numbers $\xi_k$ satisfying

\[ x_{k-1} \le \xi_k \le x_k \quad \text{for } k = 1, 2, \ldots, n. \]

Note that if the function $g$ is given by $g(x) = x$, then the expression in equation
(29.1) reduces to

\[ \sum_{k=1}^{n} f(\xi_k)(x_k - x_{k-1}). \tag{29.2} \]

The sum (29.2) is usually called a Riemann sum of $f$ corresponding to the partition $P$
and can be interpreted as the area of the union of rectangles with sides $[x_{k-1}, x_k]$ and
heights $f(\xi_k)$. (See Figure 29.1.) Thus if the partition $P$ is very fine, it is expected
that the Riemann sum (29.2) yields an approximation to the "area under the graph of
$f$." For a general function $g$, the reader should interpret the Riemann-Stieltjes sum
(29.1) as being similar to the Riemann sum (29.2), except that, instead of considering
the length $x_k - x_{k-1}$ of the subinterval $[x_{k-1}, x_k]$, we are considering some other
measure of magnitude of this subinterval, namely the difference $g(x_k) - g(x_{k-1})$.
Thus if $g(x)$ is the total "mass" or "charge" on the interval $[a, x]$, then
$g(x_k) - g(x_{k-1})$ denotes the "mass" or "charge" on the subinterval $[x_{k-1}, x_k]$. The idea is that
we want to be able to consider measures of magnitude of an interval other than
length, so we allow for the slightly more general sums (29.1).
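
As an illustration of Definition 29.1, the following Python sketch (not part of the original text) evaluates a Riemann-Stieltjes sum for given partition points and intermediate points. The function names `rs_sum` and `uniform_partition` and the choice of midpoints as intermediate points are assumptions made for the example only.

```python
def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(xi_k) * (g(x_k) - g(x_{k-1}))."""
    if len(tags) != len(points) - 1:
        raise ValueError("need exactly one intermediate point per subinterval")
    total = 0.0
    for left, right, xi in zip(points, points[1:], tags):
        if not left <= xi <= right:
            raise ValueError("intermediate point outside its subinterval")
        total += f(xi) * (g(right) - g(left))
    return total


def uniform_partition(a, b, n):
    """Partition points a = x_0 <= x_1 <= ... <= x_n = b with equal spacing."""
    return [a + (b - a) * k / n for k in range(n + 1)]


points = uniform_partition(0.0, 1.0, 1000)
midpoints = [(left + right) / 2 for left, right in zip(points, points[1:])]

# With g(x) = x the sum is an ordinary Riemann sum of f(x) = x^2 on [0, 1].
riemann = rs_sum(lambda x: x * x, lambda x: x, points, midpoints)

# With g(x) = x^2 each subinterval is weighted by g(x_k) - g(x_{k-1}) instead of its length.
stieltjes = rs_sum(lambda x: x, lambda x: x * x, points, midpoints)

print(riemann, stieltjes)
```

For this fine partition the first value is close to 1/3 and the second is close to 2/3.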

Figure 29.1. The Riemann sum as an area.

It will be noted that both of the sums (29.1) and (29.2) depend upon the choice of
the "intermediate points"; that is, upon the numbers $\xi_k$, $1 \le k \le n$. Thus it might be
thought advisable to introduce a notation displaying the choice of these numbers.
However, by introducing a finer partition, it can always be assumed that the
intermediate points $\xi_k$ are partition points. In fact, if we introduce the partition
$Q = (x_0, \xi_1, x_1, \xi_2, \ldots, \xi_n, x_n)$ and the sum $S(Q; f, g)$ where we take the intermediate
points to be alternately the right and the left end points of the subinterval, then the
sum $S(Q; f, g)$ yields the same value as the sum in (29.1). We could always assume
that the partition divides the interval into an even number of subintervals and the
intermediate points are alternately the right and left end points of these subintervals.
However, we shall not find it necessary to require this "standard" partitioning
process, nor shall we find it necessary to display these intermediate points.

29.2 DEFINITION. We say that $f$ is integrable with respect to $g$ on $J$ if
there exists a real number $I$ such that for every number $\varepsilon > 0$ there is a
partition $P_\varepsilon$ of $J$ such that if $P$ is any refinement of $P_\varepsilon$ and $S(P; f, g)$ is any
Riemann-Stieltjes sum corresponding to $P$, then

\[ |S(P; f, g) - I| < \varepsilon. \tag{29.3} \]

In this case the number $I$ is uniquely determined and is denoted by

\[ I = \int_a^b f\,dg = \int_a^b f(t)\,dg(t); \]

it is called the Riemann-Stieltjes integral of $f$ with respect to $g$ over
$J = [a, b]$. We call the function $f$ the integrand, and $g$ the integrator.
Sometimes we say that $f$ is $g$-integrable if $f$ is integrable with respect to $g$.
In the special case $g(x) = x$, if $f$ is integrable with respect to $g$, we usually
say that $f$ is Riemann integrable.

Before we develop any of the properties of the Riemann-Stieltjes integral,
we shall consider some examples. In order to keep the calculations simple,
some of these examples are chosen to be extreme cases; more typical
examples are found by combining the ones given below.
29.3 EXAMPLES. (a) We have already noted that if $g(x) = x$, then the
integral reduces to the ordinary Riemann integral of elementary calculus.
(b) If $g$ is constant on the interval $[a, b]$, then any function $f$ is integrable
with respect to $g$ and the value of the integral is 0.
(c) Let $g$ be defined on $J = [a, b]$ by

\[ g(x) = \begin{cases} 0, & x = a, \\ 1, & a < x \le b. \end{cases} \]

We leave it as an exercise to show that a function $f$ is integrable with respect
to $g$ if and only if $f$ is continuous at $a$ and that in this case the value of the
integral is $f(a)$.
(d) Let $c$ be an interior point of the interval $J = [a, b]$ and let $g$ be defined
by

\[ g(x) = \begin{cases} 0, & a \le x \le c, \\ 1, & c < x \le b. \end{cases} \]

It is an exercise to show that a function $f$ is integrable with respect to $g$ if and
only if it is continuous at $c$ from the right (in the sense that for every $\varepsilon > 0$
there exists $\delta(\varepsilon) > 0$ such that if $c \le x < c + \delta(\varepsilon)$ and $x \in J$, then we have
$|f(x) - f(c)| < \varepsilon$). If $f$ satisfies this condition, then the value of the integral
is $f(c)$. (Observe that the integrator function $g$ is continuous at $c$ from the
left.)
(e) Modifying the preceding example, let $h$ be defined by

\[ h(x) = \begin{cases} 0, & a \le x < c, \\ 1, & c \le x \le b. \end{cases} \]

Then $h$ is continuous at $c$ from the right and a function $f$ is integrable with
respect to $h$ if and only if $f$ is continuous at $c$ from the left. In this case the
value of the integral is $f(c)$.
(f) Let $c_1 < c_2$ be interior points of $J = [a, b]$ and let $g$ be defined by

\[ g(x) = \begin{cases} \alpha_1, & a \le x \le c_1, \\ \alpha_2, & c_1 < x \le c_2, \\ \alpha_3, & c_2 < x \le b. \end{cases} \]

If $f$ is continuous at the points $c_1$, $c_2$, then $f$ is integrable with respect to $g$ and

\[ \int_a^b f\,dg = (\alpha_2 - \alpha_1) f(c_1) + (\alpha_3 - \alpha_2) f(c_2). \]



By taking more points we can obtain a sum involving the values of $f$ at points
in $J$, weighted by the values of the jumps of $g$ at these points.
(g) Let the function $f$ be Dirichlet's discontinuous function [cf. Example
20.5(g)] defined by

\[ f(x) = \begin{cases} 1, & \text{if } x \text{ is rational}, \\ 0, & \text{if } x \text{ is irrational}, \end{cases} \]

and let $g(x) = x$. Consider these functions on $I = [0, 1]$. If a partition $P$
consists of $n$ equal subintervals, then by selecting $k$ of the intermediate
points in the sum $S(P; f, g)$ to be rational and the remaining ones to be irrational,
we obtain $S(P; f, g) = k/n$. It follows that $f$ is not Riemann integrable.
(h) Let $f$ be the function defined on $I$ by $f(0) = 1$, $f(x) = 0$ for $x$ irrational,
and $f(m/n) = 1/n$ when $m$ and $n$ are natural numbers with no common
factors except 1. It was seen in Example 20.5(h) that $f$ is continuous at every
irrational number and discontinuous at every rational number. If $g(x) = x$,
then it is an exercise to show that $f$ is integrable with respect to $g$ and that the
value of the integral is 0.
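
The jump formula of Example 29.3(f) can be observed numerically. The sketch below is an illustration only, not part of the original text: the interval $[0, 1]$, the jump points 0.25 and 0.75, the levels 1, 3, 2.5, the integrand $\cos$, and the helper name `rs_sum` are all assumed choices. The partition contains both jump points, and the intermediate points are chosen at random.

```python
import math
import random


def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(xi_k) * (g(x_k) - g(x_{k-1}))."""
    return sum(f(xi) * (g(right) - g(left))
               for left, right, xi in zip(points, points[1:], tags))


C1, C2 = 0.25, 0.75          # jump points (assumed values)
ALPHA = (1.0, 3.0, 2.5)      # levels of the step integrator (assumed values)


def g(x):
    """Step function equal to ALPHA[0] on [0, C1], ALPHA[1] on (C1, C2], ALPHA[2] on (C2, 1]."""
    if x <= C1:
        return ALPHA[0]
    if x <= C2:
        return ALPHA[1]
    return ALPHA[2]


n = 1024                                  # k/n is exact in binary, so C1 and C2 are partition points
points = [k / n for k in range(n + 1)]
rng = random.Random(0)
tags = [rng.uniform(left, right) for left, right in zip(points, points[1:])]

approx = rs_sum(math.cos, g, points, tags)
exact = (ALPHA[1] - ALPHA[0]) * math.cos(C1) + (ALPHA[2] - ALPHA[1]) * math.cos(C2)
print(approx, exact)
```

Only the two subintervals immediately to the right of the jump points contribute to the sum, which is why continuity of the integrand at those points is what matters.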
29.4 CAUCHY CRITERION FOR INTEGRABILITY. The function $f$ is integrable
with respect to $g$ over $J = [a, b]$ if and only if for each number $\varepsilon > 0$ there is a
partition $Q_\varepsilon$ of $J$ such that if $P$ and $Q$ are refinements of $Q_\varepsilon$ and if $S(P; f, g)$ and
$S(Q; f, g)$ are any corresponding Riemann-Stieltjes sums, then

\[ |S(P; f, g) - S(Q; f, g)| < \varepsilon. \tag{29.4} \]

PROOF. If $f$ is integrable, there is a partition $P_\varepsilon$ such that if $P$, $Q$ are
refinements of $P_\varepsilon$, then any corresponding Riemann-Stieltjes sums satisfy
$|S(P; f, g) - I| < \varepsilon/2$ and $|S(Q; f, g) - I| < \varepsilon/2$. By using the Triangle
Inequality, we obtain (29.4).
Conversely, suppose the criterion is satisfied. To show that $f$ is integrable
with respect to $g$, we need to produce the value of its integral and use
Definition 29.2. Let $Q_1$ be a partition of $J$ such that if $P$ and $Q$ are
refinements of $Q_1$, then $|S(P; f, g) - S(Q; f, g)| < 1$. Inductively, we choose
$Q_n$ to be a refinement of $Q_{n-1}$ such that if $P$ and $Q$ are refinements of $Q_n$,
then

\[ |S(P; f, g) - S(Q; f, g)| < 1/n. \tag{29.5} \]

Consider a sequence $(S(Q_n; f, g))$ of real numbers obtained in this way.
Since $Q_n$ is a refinement of $Q_m$ when $n \ge m$, this sequence of sums is a
Cauchy sequence of real numbers, regardless of how the intermediate points
are chosen. By Theorem 16.10, the sequence converges to some real
number $L$. Hence, if $\varepsilon > 0$, there is an integer $N$ such that $2/N < \varepsilon$ and

\[ |S(Q_N; f, g) - L| < \varepsilon/2. \]

If $P$ is a refinement of $Q_N$, then it follows from the construction of $Q_N$ that

\[ |S(P; f, g) - S(Q_N; f, g)| < 1/N < \varepsilon/2. \]

Hence, for any refinement $P$ of $Q_N$ and any corresponding Riemann-Stieltjes
sum, we have

\[ |S(P; f, g) - L| < \varepsilon. \tag{29.6} \]

This shows that $f$ is integrable with respect to $g$ over $J$ and that the value of
this integral is $L$. Q.E.D.

Some Properties of the Integral


The next property is sometimes referred to as the bilinearity of the
Riemann-Stieltjes integral.
29.5 THEOREM. (a) If $f_1$, $f_2$ are integrable with respect to $g$ on $J$ and $\alpha$, $\beta$
are real numbers, then $\alpha f_1 + \beta f_2$ is integrable with respect to $g$ on $J$ and

\[ \int_a^b (\alpha f_1 + \beta f_2)\,dg = \alpha \int_a^b f_1\,dg + \beta \int_a^b f_2\,dg. \tag{29.7} \]

(b) If $f$ is integrable with respect to $g_1$ and $g_2$ on $J$ and $\alpha$, $\beta$ are real numbers,
then $f$ is integrable with respect to $g = \alpha g_1 + \beta g_2$ on $J$ and

\[ \int_a^b f\,dg = \alpha \int_a^b f\,dg_1 + \beta \int_a^b f\,dg_2. \tag{29.8} \]

PROOF. (a) Let $I_1$, $I_2$ denote the integrals of $f_1$, $f_2$ with respect to $g$. Let $\varepsilon > 0$
and let $P_1 = (x_0, x_1, \ldots, x_n)$ and $P_2 = (y_0, y_1, \ldots, y_m)$ be partitions of $J = [a, b]$
such that if $Q$ is a refinement of
both $P_1$ and $P_2$, then for any corresponding Riemann-Stieltjes sums, we have

\[ |I_1 - S(Q; f_1, g)| < \varepsilon, \qquad |I_2 - S(Q; f_2, g)| < \varepsilon. \]

Let $P_\varepsilon$ be a partition of $J$ which is a refinement of both $P_1$ and $P_2$ (for
example, all the partition points in $P_1$ and $P_2$ are combined to form $P_\varepsilon$). If $Q$
is a partition of $J$ such that $P_\varepsilon \subseteq Q$, then both of the relations above still hold.
When the same intermediate points are used, we evidently have

\[ S(Q; \alpha f_1 + \beta f_2, g) = \alpha S(Q; f_1, g) + \beta S(Q; f_2, g). \]

It follows from this and the preceding inequalities that

\[ |\alpha I_1 + \beta I_2 - S(Q; \alpha f_1 + \beta f_2, g)| = |\alpha\{I_1 - S(Q; f_1, g)\} + \beta\{I_2 - S(Q; f_2, g)\}| \le (|\alpha| + |\beta|)\varepsilon. \]

This proves that $\alpha I_1 + \beta I_2$ is the integral of $\alpha f_1 + \beta f_2$ with respect to $g$. This
establishes part (a); the proof of part (b) is similar and will be left to the
reader. Q.E.D.

There is another useful additivity property possessed by the Riemann-Stieltjes
integral; namely, with respect to the interval over which the integral is extended. It
is in order to obtain the next result that we employed the type of limiting introduced
in Definition 29.2. A more restrictive type of limiting would be to require inequality
(29.3) for any Riemann-Stieltjes sum corresponding to a partition
$P = (x_0, x_1, \ldots, x_n)$ which is such that

\[ \|P\| = \sup \{x_1 - x_0, x_2 - x_1, \ldots, x_n - x_{n-1}\} < \delta(\varepsilon). \]

This type of limiting is generally used in defining the Riemann integral and
sometimes used in defining the Riemann-Stieltjes integral. However, many authors
employ the definition we introduced, which is due to S. Pollard, for it enlarges slightly
the class of integrable functions. As a result of this enlargement, the next result is
valid without any additional restriction. See Exercises 29.P-R.
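
The difference between the two types of limiting can be seen in a small exact computation. The Python sketch below is an illustration, not part of the original text. It assumes the step functions of Exercise 29.Q on $[0, 1]$: the integrator $g$ equal to 0 on $[0, \tfrac12]$ and 1 on $(\tfrac12, 1]$, and the integrand $h$ equal to 0 on $[0, \tfrac12)$ and 1 on $[\tfrac12, 1]$. The helper name `rs_sum` is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def g(x):
    """Integrator: 0 on [0, 1/2], 1 on (1/2, 1]."""
    return 0 if x <= HALF else 1


def h(x):
    """Integrand: 0 on [0, 1/2), 1 on [1/2, 1]."""
    return 0 if x < HALF else 1


def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(xi_k) * (g(x_k) - g(x_{k-1}))."""
    return sum(f(xi) * (g(right) - g(left))
               for left, right, xi in zip(points, points[1:], tags))


# A partition of small norm that does NOT contain the point 1/2 (n is odd).
n = 1001
coarse = [Fraction(k, n) for k in range(n + 1)]
low = rs_sum(h, g, coarse, coarse[:-1])    # left end points as intermediate points
high = rs_sum(h, g, coarse, coarse[1:])    # right end points as intermediate points

# Any refinement that contains 1/2 gives the same value for both choices.
refined = sorted(set(coarse) | {HALF})
low_refined = rs_sum(h, g, refined, refined[:-1])
high_refined = rs_sum(h, g, refined, refined[1:])

print(low, high, low_refined, high_refined)
```

However small the norm of the first partition, the two sums differ by 1, whereas every refinement containing the point $\tfrac12$ yields the value 1. This is the behavior that the refinement-based Definition 29.2 accommodates and the norm-based definition does not.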
29.6 THEOREM. (a) Suppose that $a < c < b$ and that $f$ is integrable with
respect to $g$ over both of the subintervals $[a, c]$ and $[c, b]$. Then $f$ is integrable
with respect to $g$ on the interval $[a, b]$ and

\[ \int_a^b f\,dg = \int_a^c f\,dg + \int_c^b f\,dg. \tag{29.9} \]

(b) Let $f$ be integrable with respect to $g$ on the interval $[a, b]$ and let $c$ satisfy
$a \le c \le b$. Then $f$ is integrable with respect to $g$ on the subintervals $[a, c]$ and
$[c, b]$ and formula (29.9) holds.
PROOF. (a) If $\varepsilon > 0$, let $P_\varepsilon'$ be a partition of $[a, c]$ such that if $P'$ is a
refinement of $P_\varepsilon'$, then inequality (29.3) holds for any Riemann-Stieltjes
sum. Let $P_\varepsilon''$ be a corresponding partition of $[c, b]$. If $P_\varepsilon$ is the partition of
$[a, b]$ formed by using the partition points in both $P_\varepsilon'$ and $P_\varepsilon''$, and if $P$ is a
refinement of $P_\varepsilon$, then

\[ S(P; f, g) = S(P'; f, g) + S(P''; f, g), \]

where $P'$, $P''$ denote the partitions of $[a, c]$, $[c, b]$ induced by $P$ and where the
corresponding intermediate points are used. Therefore, we have

\[ \left| \int_a^c f\,dg + \int_c^b f\,dg - S(P; f, g) \right| \le \left| \int_a^c f\,dg - S(P'; f, g) \right| + \left| \int_c^b f\,dg - S(P''; f, g) \right| < 2\varepsilon. \]

It follows that $f$ is integrable with respect to $g$ over $[a, b]$ and that the value
of its integral is

\[ \int_a^c f\,dg + \int_c^b f\,dg. \]

(b) We shall use the Cauchy Criterion 29.4 to prove that $f$ is integrable
over $[a, c]$. Since $f$ is integrable over $[a, b]$, given $\varepsilon > 0$ there is a partition

$Q_\varepsilon$ of $[a, b]$ such that if $P$, $Q$ are refinements of $Q_\varepsilon$, then relation (29.4) holds
for any corresponding Riemann-Stieltjes sums. It is clear that we may
suppose that the point $c$ belongs to $Q_\varepsilon$, and we let $Q_\varepsilon'$ be the partition of
$[a, c]$ consisting of those points of $Q_\varepsilon$ which belong to $[a, c]$. Suppose that
$P'$ and $Q'$ are partitions of $[a, c]$ which are refinements of $Q_\varepsilon'$ and extend
them to partitions $P$ and $Q$ of $[a, b]$ by using the points in $Q_\varepsilon$ which belong
to $[c, b]$. Since $P$, $Q$ are refinements of $Q_\varepsilon$, relation (29.4) holds.
However, it is clear from the fact that $P$, $Q$ are identical on $[c, b]$ that, if
we use the same intermediate points, then

\[ |S(P'; f, g) - S(Q'; f, g)| = |S(P; f, g) - S(Q; f, g)| < \varepsilon. \]

Therefore, the Cauchy Criterion establishes the integrability of $f$ with
respect to $g$ over the subinterval $[a, c]$ and a similar argument also applies to
the interval $[c, b]$. Once this integrability is known, part (a) yields the
validity of formula (29.9). Q.E.D.

Thus far we have not interchanged the roles of the integrand f and the
integrator g, and it may not have occurred to the reader that it might be
possible to do so. Although the next result is not exactly the same as the
“integration by parts formula” of calculus, the relation is close and this result
is usually referred to by that name.
29.7 INTEGRATION BY PARTS. A function $f$ is integrable with respect to $g$
over $[a, b]$ if and only if $g$ is integrable with respect to $f$ over $[a, b]$. In this
case,

\[ \int_a^b f\,dg + \int_a^b g\,df = f(b)g(b) - f(a)g(a). \tag{29.10} \]

PROOF. We shall suppose that $f$ is integrable with respect to $g$. Let $\varepsilon > 0$
and let $P_\varepsilon$ be a partition of $[a, b]$ such that if $Q$ is a refinement of $P_\varepsilon$ and
$S(Q; f, g)$ is any corresponding Riemann-Stieltjes sum, then

\[ \left| S(Q; f, g) - \int_a^b f\,dg \right| < \varepsilon. \tag{29.11} \]

Now let $P$ be a refinement of $P_\varepsilon$ and consider a Riemann-Stieltjes sum
$S(P; g, f)$ given by

\[ S(P; g, f) = \sum_{k=1}^{n} g(\xi_k)\{f(x_k) - f(x_{k-1})\}, \]

where $x_{k-1} \le \xi_k \le x_k$. Let $Q = (y_0, y_1, \ldots, y_{2n})$ be the partition of $[a, b]$
obtained by using both the $\xi_k$ and $x_k$ as partition points; hence $y_{2k} = x_k$ and
$y_{2k-1} = \xi_k$. Add and subtract the terms $f(y_{2k})g(y_{2k})$ for $k = 0, 1, \ldots, n$, to
$S(P; g, f)$ and rearrange to obtain

\[ S(P; g, f) = f(b)g(b) - f(a)g(a) - \sum_{k=1}^{2n} f(\eta_k)\{g(y_k) - g(y_{k-1})\}, \]

where the intermediate points $\eta_k$ are selected to be the points $x_j$. Thus we
have

\[ S(P; g, f) = f(b)g(b) - f(a)g(a) - S(Q; f, g), \]

where the partition $Q = (y_0, y_1, \ldots, y_{2n})$ is a refinement of $P_\varepsilon$. In view of
formula (29.11),

\[ \left| S(P; g, f) - \left\{ f(b)g(b) - f(a)g(a) - \int_a^b f\,dg \right\} \right| < \varepsilon \]

provided $P$ is a refinement of $P_\varepsilon$. This proves that $g$ is integrable with
respect to $f$ over $[a, b]$ and establishes formula (29.10). Q.E.D.
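
The rearrangement used in this proof is an exact algebraic identity between finite sums, so it can be checked with rational arithmetic. The following Python sketch is illustrative only and not part of the original text: the functions $f(x) = x^2$ and $g(x) = x^3 + 1$, the partition of $[0, 1]$, the random intermediate points, and the helper name `rs_sum` are assumptions made for the example.

```python
import random
from fractions import Fraction


def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(xi_k) * (g(x_k) - g(x_{k-1}))."""
    return sum(f(xi) * (g(right) - g(left))
               for left, right, xi in zip(points, points[1:], tags))


def f(x):
    return x * x


def g(x):
    return x ** 3 + 1


n = 8
rng = random.Random(0)
p_points = [Fraction(k, n) for k in range(n + 1)]                  # partition P
p_tags = [p_points[k] + Fraction(rng.randint(0, 100), 100 * n)     # xi_k in [x_{k-1}, x_k]
          for k in range(n)]

# Partition Q = (x_0, xi_1, x_1, xi_2, ..., xi_n, x_n); its intermediate points
# are the neighbouring points x_j, as in the proof.
q_points = [p_points[0]]
q_tags = []
for k in range(n):
    q_points += [p_tags[k], p_points[k + 1]]
    q_tags += [p_points[k], p_points[k + 1]]

a, b = p_points[0], p_points[-1]
lhs = rs_sum(g, f, p_points, p_tags)                               # S(P; g, f)
rhs = f(b) * g(b) - f(a) * g(a) - rs_sum(f, g, q_points, q_tags)   # boundary term minus S(Q; f, g)
print(lhs == rhs)
```

Because `Fraction` arithmetic is exact, the two sides agree exactly and not merely up to rounding.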

Modification of the Integral


When the integrator function g has a continuous derivative, it is possible
and often convenient to replace the Riemann-Stieltjes integral by a
Riemann integral. We now establish the validity of this reduction.

29.8 THEOREM. If the derivative $g'$ exists and is continuous on $J$ and if $f$
is integrable with respect to $g$, then the product $fg'$ is Riemann integrable and

\[ \int_a^b f\,dg = \int_a^b fg'. \tag{29.12} \]

PROOF. The hypothesis implies that $g'$ is uniformly continuous on $J$. If
$\varepsilon > 0$, let $P = (x_0, x_1, \ldots, x_n)$ be a partition of $J$ such that if $\xi_k$ and $\zeta_k$ belong
to $[x_{k-1}, x_k]$ then $|g'(\xi_k) - g'(\zeta_k)| < \varepsilon$. We consider the difference of the
Riemann-Stieltjes sum $S(P; f, g)$ and the Riemann sum $S(P; fg')$, using the
same intermediate points $\xi_k$. In doing so we have a sum of terms of the form

\[ f(\xi_k)\{g(x_k) - g(x_{k-1})\} - f(\xi_k)g'(\xi_k)\{x_k - x_{k-1}\}. \]

If we apply the Mean Value Theorem 27.6 to $g$, we can write this difference
in the form

\[ f(\xi_k)\{g'(\zeta_k) - g'(\xi_k)\}(x_k - x_{k-1}), \]

where $\zeta_k$ is some point in the interval $[x_{k-1}, x_k]$. Since this term is
dominated by $\varepsilon \|f\| (x_k - x_{k-1})$, we conclude that

\[ |S(P; f, g) - S(P; fg')| \le \varepsilon \|f\| (b - a), \]

provided the partition $P$ is sufficiently fine. Since the integral on the left side
of (29.12) exists and is the limit of the Riemann-Stieltjes sums $S(P; f, g)$, we
infer that the integral on the right side of (29.12) also exists and that the
equality holds. Q.E.D.
For an extension of this result, see Theorem 30.13.
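
The estimate in the proof of Theorem 29.8 can be observed numerically. The Python sketch below is an illustration and not part of the original text; the functions $f(x) = x$ and $g(x) = x^2$ on $[0, 1]$, the uniform partition, the random intermediate points, and the helper name `rs_sum` are assumptions. Here $g'(x) = 2x$, so two values of $g'$ on a subinterval of length $1/n$ differ by at most $2/n$, which plays the role of $\varepsilon$.

```python
import random


def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(xi_k) * (g(x_k) - g(x_{k-1}))."""
    return sum(f(xi) * (g(right) - g(left))
               for left, right, xi in zip(points, points[1:], tags))


def f(x):
    return x


def g(x):
    return x * x


def dg(x):
    """Derivative of g."""
    return 2 * x


n = 100
points = [k / n for k in range(n + 1)]
rng = random.Random(1)
tags = [rng.uniform(left, right) for left, right in zip(points, points[1:])]

stieltjes = rs_sum(f, g, points, tags)                                # S(P; f, g)
riemann = rs_sum(lambda x: f(x) * dg(x), lambda x: x, points, tags)   # S(P; f g')

gap = abs(stieltjes - riemann)
bound = (2 / n) * 1.0 * (1.0 - 0.0)   # epsilon * sup|f| * (b - a) with epsilon = 2/n
print(gap, bound)
```

Both sums approximate the common value of the two integrals in (29.12), which is 2/3 for this choice of functions.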

29.9 EXAMPLES. (a) It follows from results to be proved in Section 30
that $f(x) = x$ is integrable with respect to $g(x) = x^2$ on $J = [0, 1]$. Assuming
this, Theorem 29.8 shows that

\[ \int_0^1 x\,d(x^2) = \int_0^1 x \cdot 2x\,dx = \tfrac{2}{3} x^3 \Big|_0^1 = \tfrac{2}{3}. \]

(Here we have made use of results from calculus that will be proved in
Section 30.)
(b) If we apply Theorem 29.7 on Integration by Parts to the functions in
(a), we get

\[ \int_0^1 x\,d(x^2) = x^3 \Big|_0^1 - \int_0^1 x^2\,dx = 1 - \tfrac{1}{3} x^3 \Big|_0^1 = \tfrac{2}{3}. \]

(c) It follows from results to be proved in Section 30 that $f(x) = \sin x$ is
integrable with respect to $f$ on $J = [0, \pi/2]$. Assuming this, we have

\[ \int_0^{\pi/2} \sin x\,d(\sin x) = \int_0^{\pi/2} \sin x \cos x\,dx = \tfrac12 (\sin x)^2 \Big|_0^{\pi/2} = \tfrac12. \]

(d) If we apply Theorem 29.7 on Integration by Parts to part (c) we get

\[ \int_0^{\pi/2} \sin x\,d(\sin x) = (\sin x)^2 \Big|_0^{\pi/2} - \int_0^{\pi/2} \sin x\,d(\sin x), \]

whence it follows that

\[ \int_0^{\pi/2} \sin x\,d(\sin x) = \tfrac12 (\sin x)^2 \Big|_0^{\pi/2} = \tfrac12. \]

(e) We introduce the greatest integer function on $\mathbf{R}$ to $\mathbf{R}$, denoted by the
special symbol $[\,\cdot\,]$ and defined by requiring that if $x \in \mathbf{R}$, then $[x]$ is the
largest integer less than or equal to $x$. Hence $[\pi] = 3$, $[e] = 2$, $[-2.5] = -3$.
The reader should draw a graph of this function and note that it is
continuous from the right, with jumps equal to 1 at the integers. It follows
that if $f$ is continuous on $[0, 5]$, then $f$ is integrable with respect to $g(x) = [x]$,
$x \in [0, 5]$, and that

\[ \int_0^5 f(x)\,d([x]) = \sum_{k=1}^{5} f(k). \]

(f) It follows from results in Section 30 that $f(x) = x^2$ is integrable with
respect to both $g_1(x) = x$ and $g_2(x) = [x]$ on $[0, 5]$. Therefore it is integrable
with respect to $g(x) = x + [x]$ and we have

\[ \int_0^5 x^2\,d(x + [x]) = \int_0^5 x^2\,dx + \int_0^5 x^2\,d([x]) = \tfrac13 5^3 + 1^2 + 2^2 + 3^2 + 4^2 + 5^2. \]
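
The value in Example 29.9(f) can be approximated by a Riemann-Stieltjes sum. The Python sketch below is an illustration only and not part of the original text; the uniform partition with 2000 subintervals per unit length, the use of right end points as intermediate points, and the helper name `rs_sum` are assumptions made for the example.

```python
import math


def rs_sum(f, g, points, tags):
    """Riemann-Stieltjes sum: sum of f(xi_k) * (g(x_k) - g(x_{k-1}))."""
    return sum(f(xi) * (g(right) - g(left))
               for left, right, xi in zip(points, points[1:], tags))


def f(x):
    return x * x


def g(x):
    """Integrator x + [x], where [x] is the greatest integer function."""
    return x + math.floor(x)


per_unit = 2000                                   # subintervals per unit length
points = [k / per_unit for k in range(5 * per_unit + 1)]
tags = points[1:]                                 # right end points as intermediate points

approx = rs_sum(f, g, points, tags)
expected = 5 ** 3 / 3 + sum(k * k for k in range(1, 6))
print(approx, expected)
```

Each integer $k = 1, \ldots, 5$ is a partition point, and the subinterval ending at $k$ picks up the unit jump of $[x]$, which contributes the term $k^2$ to the sum.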

Exercises

29.A. If $f$ is constant on the interval $[a, b]$, then it is integrable with respect to any
function $g$ and

\[ \int_a^b f\,dg = f(a)\{g(b) - g(a)\}. \]

29.B. If $g$ is as in Example 29.3(c), show that $f$ is integrable with respect to $g$ if and
only if $f$ is continuous at $a$.
29.C. Let $g$ be defined on $I = [0, 1]$ by $g(x) = 0$ for $0 \le x \le \tfrac12$ and $g(x) = 1$ for
$\tfrac12 < x \le 1$. Show that $f$ is integrable with respect to $g$ on $I$ if and only if it is
continuous at $\tfrac12$ from the right. In this case the value of the integral is $f(\tfrac12)$.
29.D. Show that the function $f$ given in Example 29.3(h) is Riemann integrable
on $I$ and that the value of its integral is 0.
29.E. If $f$ is integrable on $[a, b]$ with respect to $f$, then

\[ \int_a^b f\,df = \tfrac12 \{(f(b))^2 - (f(a))^2\}. \]

(a) Prove this by examining the two Riemann-Stieltjes sums for a partition
$P = (x_0, x_1, \ldots, x_n)$ obtained by taking $\xi_k = x_{k-1}$ and $\xi_k = x_k$.
(b) Prove this by using the Integration by Parts Theorem 29.7.
29.F. Show directly that if $f$ is the greatest integer function $f(x) = [x]$ defined in
Example 29.9(e), then $f$ is not integrable with respect to $f$ on the interval $[0, 2]$.
29.G. If $f$ is Riemann integrable on $[0, 1]$, then

\[ \int_0^1 f = \lim \left( \frac{1}{n} \sum_{k=1}^{n} f(k/n) \right). \]

29.H. Show that if $g$ is not integrable on $[0, 1]$, then the sequence of averages

\[ \left( \frac{1}{n} \sum_{k=1}^{n} g(k/n) \right) \]

may or may not be convergent.
29.I. Show that the function $h$, defined on $I$ by $h(x) = x$ for $x$ rational and $h(x) = 0$
for $x$ irrational, is not Riemann integrable on $I$.
29.J. Suppose that $f$ is Riemann integrable on $[a, b]$. If $f_1$ is a function on $[a, b]$ to
$\mathbf{R}$ such that $f_1(x) = f(x)$ except for a finite number of points in $[a, b]$, show that $f_1$ is
Riemann integrable and that

\[ \int_a^b f_1 = \int_a^b f. \]

(Thus we can change the value of a Riemann integrable function, or leave it
undefined, at a finite number of points.)
29.K. Give an example to show that the conclusion of the preceding exercise may
fail if the number of exceptional points is infinite.
29.L. Let $c \in (a, b)$ and let $k$ be defined on $[a, b]$ by $k(c) = 1$ and $k(x) = 0$
for $x \in [a, b]$, $x \ne c$. If $f: [a, b] \to \mathbf{R}$ is continuous at $c$ show directly that $f$ is

$k$-integrable, that $k$ is $f$-integrable, and that

\[ \int_a^b f\,dk = \int_a^b k\,df = 0. \]

29.M. Suppose that $f$ is $g$-integrable on $[a, b]$. If $g_1: [a, b] \to \mathbf{R}$ is such that
$g_1(x) = g(x)$ except for a finite number of points in $(a, b)$ at which $f$ is continuous,
then $f$ is $g_1$-integrable and

\[ \int_a^b f\,dg_1 = \int_a^b f\,dg. \]

29.N. Suppose that $g$ is continuous on $[a, b]$, that $x \mapsto g'(x)$ exists and is
continuous on $[a, b] \setminus \{c\}$, and that the one-sided limits

\[ g'(c-) = \lim_{x \to c,\ x < c} g'(x), \qquad g'(c+) = \lim_{x \to c,\ x > c} g'(x) \]

exist. If $f$ is integrable with respect to $g$ on $[a, b]$, then $fg'$ can be defined at $c$ to be
Riemann integrable on $[a, b]$ and such that

\[ \int_a^b f\,dg = \int_a^b fg'. \]

(Hint: consider Exercise 27.N.)

29.O. If $f$ is Riemann integrable on $[-5, 5]$, show that $f$ is integrable with respect
to $g(x) = |x|$ and

\[ \int_{-5}^{5} f\,dg = \int_0^5 f - \int_{-5}^0 f. \]

29.P. If $P = (x_0, x_1, \ldots, x_n)$ is a partition of $J = [a, b]$, let $\|P\|$ be defined to be

\[ \|P\| = \sup \{x_j - x_{j-1} : j = 1, 2, \ldots, n\}; \]

we call $\|P\|$ the norm of the partition $P$. Define $f$ to be (*)-integrable with respect
to $g$ on $J$ in case there exists a number $A$ with the property: if $\varepsilon > 0$ then there is a
$\delta(\varepsilon) > 0$ such that if $\|P\| < \delta(\varepsilon)$ and if $S(P; f, g)$ is any corresponding
Riemann-Stieltjes sum, then $|S(P; f, g) - A| < \varepsilon$. If this is satisfied the number $A$ is called
the (*)-integral of $f$ with respect to $g$ on $J$. Show that if $f$ is (*)-integrable with
respect to $g$ on $J$, then $f$ is integrable with respect to $g$ (in the sense of Definition
29.2) and that the values of these integrals are equal.
29.Q. Let $g$ be defined on $I$ as in Exercise 29.C. Show that a bounded function
$f$ is (*)-integrable with respect to $g$ in the sense of the preceding exercise if and
only if $f$ is continuous at $\tfrac12$, in which case the value of the (*)-integral is $f(\tfrac12)$. If $h$ is defined
by

\[ h(x) = \begin{cases} 0, & 0 \le x < \tfrac12, \\ 1, & \tfrac12 \le x \le 1, \end{cases} \]

then $h$ is (*)-integrable with respect to $g$ on $[0, \tfrac12]$ and on $[\tfrac12, 1]$ but it is not
(*)-integrable with respect to $g$ on $[0, 1]$. Hence Theorem 29.6(a) may fail for the
(*)-integral.

29.R. Let $g(x) = x$ for $x \in J$. Show that for this integrator, a function $f$ is
integrable in the sense of Definition 29.2 if and only if it is (*)-integrable in the
sense of Exercise 29.P.
29.S. Let $f$ be Riemann integrable on $J$ and let $f(x) \ge 0$ for $x \in J$. If $f$ is
continuous at a point $c \in J$ and if $f(c) > 0$, then

\[ \int_a^b f > 0. \]

29.T. Let $f$ be Riemann integrable on $J$ and let $f(x) > 0$ for $x \in J$. Show that

\[ \int_a^b f > 0. \]

(Hint: for each $n \in \mathbf{N}$, let $H_n$ be the closure of the set of points $x \in J$ such that
$f(x) > 1/n$ and apply Exercise 11.N.)

Projects

29.α. The following outline is sometimes used as an approach to the Riemann-Stieltjes
integral when the integrator function $g$ is monotone increasing on the
interval $J$. [This development has the advantage that it permits the definition of
upper and lower integrals which always exist for a bounded function $f$. However, it
has the disadvantage that it puts an additional restriction on $g$ and tends to blemish
somewhat the symmetry of the Riemann-Stieltjes integral given by the Integration
by Parts Theorem 29.7.] If $P = (x_0, x_1, \ldots, x_n)$ is a partition of $J = [a, b]$ and $f$ is a
bounded function on $J$, let $m_j$, $M_j$ be defined to be the infimum and the supremum
of $\{f(x) : x_{j-1} \le x \le x_j\}$, respectively. Corresponding to the partition $P$, define the
lower and the upper sums of $f$ with respect to $g$ to be

\[ L(P; f, g) = \sum_{j=1}^{n} m_j \{g(x_j) - g(x_{j-1})\}, \]
\[ U(P; f, g) = \sum_{j=1}^{n} M_j \{g(x_j) - g(x_{j-1})\}. \]

(a) If $S(P; f, g)$ is any Riemann-Stieltjes sum corresponding to $P$, then

\[ L(P; f, g) \le S(P; f, g) \le U(P; f, g). \]

(b) If $\varepsilon > 0$ then there exists a Riemann-Stieltjes sum $S_1(P; f, g)$ corresponding
to $P$ such that

\[ S_1(P; f, g) \le L(P; f, g) + \varepsilon, \]

and there exists a Riemann-Stieltjes sum $S_2(P; f, g)$ corresponding to $P$ such that

\[ U(P; f, g) - \varepsilon \le S_2(P; f, g). \]

(c) If $P$ and $Q$ are partitions of $J$ and if $Q$ is a refinement of $P$ (that is, $P \subseteq Q$),
then

\[ L(P; f, g) \le L(Q; f, g) \le U(Q; f, g) \le U(P; f, g). \]


(d) If $P_1$ and $P_2$ are any partitions of J, then $L(P_1; f, g) \le U(P_2; f, g)$. [Hint: let Q be a partition which is a refinement of both $P_1$ and $P_2$ and apply (c).]
(e) Define the lower and the upper integral of f with respect to g to be, respectively,

$$L(f, g) = \sup \{L(P; f, g)\},$$
$$U(f, g) = \inf \{U(P; f, g)\};$$

here the supremum and the infimum are taken over all partitions P of J. Show that $L(f, g) \le U(f, g)$.
(f) Prove that f is integrable with respect to the increasing function g if and only if the lower and upper integrals introduced in (e) are equal. In this case the common value of these integrals equals

$$\int_a^b f \, dg.$$

Show that f is integrable with respect to g if and only if the following Riemann condition is satisfied: for every $\varepsilon > 0$ there exists a partition P such that $U(P; f, g) - L(P; f, g) < \varepsilon$.
(g) If $f_1$ and $f_2$ are bounded on J, then the lower and upper integrals of $f_1 + f_2$ satisfy

$$L(f_1 + f_2, g) \ge L(f_1, g) + L(f_2, g),$$
$$U(f_1 + f_2, g) \le U(f_1, g) + U(f_2, g).$$

Show that strict inequality can hold in these relations.
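The lower and upper sums of this project can be explored numerically. The following is a minimal sketch, not part of the original text: the helper names `lower_upper_sums` and `uniform_partition` are hypothetical, the choices $f = \sin$ and $g(x) = x^2$ on [0, 1] are illustrative assumptions, and the infimum and supremum on each subinterval are only estimated by sampling.

```python
import math

def lower_upper_sums(f, g, partition, samples=200):
    """Approximate L(P; f, g) and U(P; f, g) for an increasing integrator g.

    m_j and M_j are estimated by sampling f on a grid in each subinterval,
    so the returned values are approximations of the true sums.
    """
    lower = upper = 0.0
    for left, right in zip(partition, partition[1:]):
        values = [f(left + (right - left) * i / samples) for i in range(samples + 1)]
        dg = g(right) - g(left)          # non-negative because g is increasing
        lower += min(values) * dg
        upper += max(values) * dg
    return lower, upper

def uniform_partition(a, b, n):
    """Partition of [a, b] into n subintervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

f = math.sin                             # bounded integrand (assumed example)
g = lambda x: x * x                      # increasing integrator on [0, 1]

coarse = lower_upper_sums(f, g, uniform_partition(0.0, 1.0, 4))
fine = lower_upper_sums(f, g, uniform_partition(0.0, 1.0, 8))   # a refinement
print(coarse, fine)
```

For this choice the integral can be computed by hand as $2(\sin 1 - \cos 1)$, and the refinement inequality of part (c) should be visible in the printed pairs.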

29.β. This (and the next two) projects introduce and study the important class of functions which have "bounded variation" on a compact interval. Let $f : [a, b] \to R$ be given; if $P = (a = x_0 < x_1 < \cdots < x_n = b)$ is a partition of [a, b], let $v_f(P)$ be defined by

$$v_f(P) = \sum_{k=1}^{n} |f(x_k) - f(x_{k-1})|.$$

If the set $\{v_f(P) : P \text{ a partition of } [a, b]\}$ is bounded, we say that f has bounded variation on [a, b]. The collection of all functions having bounded variation on [a, b] is denoted by BV([a, b]) or by BV[a, b]. If $f \in BV[a, b]$, then we define

$$V_f[a, b] = \sup \{v_f(P) : P \text{ a partition of } [a, b]\}.$$

We call the number $V_f[a, b]$ the total variation of f on [a, b]. Show that $V_f[a, b] = 0$ if and only if f is a constant function on [a, b].
(a) If $f : [a, b] \to R$, if P and Q are partitions of [a, b], and if $P \supseteq Q$, show that $v_f(P) \ge v_f(Q)$. If $f \in BV[a, b]$, show that there exists a sequence $(P_n)$ of partitions of [a, b] such that $V_f[a, b] = \lim (v_f(P_n))$.
(b) If f is monotone increasing on [a, b], show that $f \in BV[a, b]$ and that $V_f[a, b] = f(b) - f(a)$. What if f is monotone decreasing on [a, b]?
(c) If $g : [a, b] \to R$ satisfies the Lipschitz condition $|g(x) - g(y)| \le M |x - y|$ for all x, y in [a, b], show that $g \in BV[a, b]$ and that $V_g[a, b] \le M(b - a)$. If $|h'(x)| \le M$ for all $x \in [a, b]$, then $h \in BV[a, b]$ and $V_h[a, b] \le M(b - a)$. However, consider $k(x) = \sqrt{x}$ on [0, 1].
(d) Let $f : [0, 1] \to R$ be defined by $f(x) = 0$ for $x = 0$ and $f(x) = \sin(1/x)$ for $0 < x \le 1$. Show that f does not have bounded variation on [0, 1]. If g is defined by $g(x) = x f(x)$ for $x \in [0, 1]$, show that g is continuous but does not have bounded variation on [0, 1]. However, if h is defined by $h(x) = x^2 f(x)$ for $x \in [0, 1]$, show that h does have bounded variation on [0, 1].
(e) If $f \in BV[a, b]$, show that $|f(x)| \le |f(a)| + V_f[a, b]$ for all $x \in [a, b]$, so that f is bounded on $I = [a, b]$ and $\|f\|_I \le |f(a)| + V_f[a, b]$.
(f) If $f, g \in BV[a, b]$ and $\alpha \in R$, show that $\alpha f$ and $f + g$ belong to BV[a, b] and that

$$V_{\alpha f}[a, b] = |\alpha| \, V_f[a, b],$$
$$V_{f+g}[a, b] \le V_f[a, b] + V_g[a, b].$$

Hence BV[a, b] is a vector space of functions.
(g) If $f, g \in BV[a, b]$, show that the product fg belongs to BV[a, b] and that

$$V_{fg}[a, b] \le \|f\|_I \, V_g[a, b] + \|g\|_I \, V_f[a, b].$$

Show that the quotient of two functions in BV[a, b] may not belong to BV[a, b].
(h) Show that the mapping $f \mapsto V_f[a, b]$ is not a norm on the vector space BV[a, b], but that the mapping

$$f \mapsto \|f\|_{BV} = |f(a)| + V_f[a, b]$$

is a norm on this space.
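Part (d) can be illustrated numerically by evaluating the variation sums $v_f(P)$ on partitions that pass through the points where $\sin(1/x) = \pm 1$. This is a rough sketch only; the helper names `variation_sum` and `oscillation_partition` are hypothetical and the particular partitions are an assumption made for the illustration.

```python
import math

def variation_sum(func, partition):
    """v_f(P): sum of |f(x_k) - f(x_{k-1})| over consecutive partition points."""
    return sum(abs(func(b) - func(a)) for a, b in zip(partition, partition[1:]))

def oscillation_partition(n):
    """Partition of [0, 1] through 0, the points 2/((2k+1)pi) for k = n, ..., 0, and 1."""
    # k decreases from n to 0, so the points increase toward 2/pi
    peaks = [2.0 / ((2 * k + 1) * math.pi) for k in range(n, -1, -1)]
    return [0.0] + peaks + [1.0]

def f(x):
    return 0.0 if x == 0 else math.sin(1.0 / x)

def g(x):
    return x * f(x)

def h(x):
    return x * x * f(x)

for n in (10, 100, 1000):
    P = oscillation_partition(n)
    print(n, variation_sum(f, P), variation_sum(g, P), variation_sum(h, P))
```

On these partitions the sums for f grow roughly like 2n, the sums for g grow slowly (like a harmonic series), and the sums for h stay bounded, which is consistent with the three claims of part (d) without proving them.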

29.γ. We continue our study of functions of bounded variation on an interval $[a, b] \subseteq R$.
(a) If $f \in BV[a, b]$ and if $c \in (a, b)$, show that the restrictions of f to [a, c] and [c, b] have bounded variation on these intervals and that

$$V_f[a, b] = V_f[a, c] + V_f[c, b].$$

Conversely, if $g : [a, b] \to R$ is such that for some $c \in (a, b)$ we have $g \in BV[a, c]$ and $g \in BV[c, b]$, then $g \in BV[a, b]$.
(b) If $f \in BV[a, b]$, then we define $p_f(x) = V_f[a, x]$ for $x \in (a, b]$, and $p_f(a) = 0$. Show that $p_f$ is an increasing function on [a, b].
(c) Note that if $a \le x \le y \le b$, then

$$f(y) - f(x) \le V_f[x, y].$$

Show that if we define $n_f(x) = p_f(x) - f(x)$ for $x \in [a, b]$, then $n_f$ is an increasing function.
(d) Show that a function $f : [a, b] \to R$ belongs to BV[a, b] if and only if it is the difference of two increasing functions.
(e) If $f \in BV[a, b]$ is continuous from the right at a point $c \in [a, b)$, and if $\varepsilon > 0$, show that there exists $\delta > 0$ such that if $Q = (c < x_1 < \cdots < x_n = b)$ is a sufficiently fine partition of [c, b] with $x_1 - c < \delta$, then

$$V_f[c, b] - \tfrac{1}{2}\varepsilon \le \tfrac{1}{2}\varepsilon + \sum_{k=2}^{n} |f(x_k) - f(x_{k-1})| \le \tfrac{1}{2}\varepsilon + V_f[x_1, b],$$

whence it follows that

$$V_f[c, x_1] = V_f[c, b] - V_f[x_1, b] \le \varepsilon.$$

Show that f is continuous at $c \in [a, b]$ if and only if $p_f$ is continuous at c.
(f) Deduce that a continuous function $f : [a, b] \to R$ belongs to BV[a, b] if and only if it is the difference of two increasing continuous functions.
29.δ. It was proved by Lebesgue that a function with bounded variation has a derivative at every point except possibly for a set of "measure zero." The proof of this result is rather difficult and will not be outlined here, but we shall obtain some further properties of such functions.
(a) If $f \in BV[a, b]$ and if $c \in (a, b)$, then the left- and right-hand limits of f at c exist. These limits are equal except possibly for a countable collection of points in [a, b].
(b) If $(f_n)$ is a sequence of continuous functions in BV[a, b] which is uniformly convergent on [a, b] to a function f, show that it does not follow that f belongs to BV[a, b].
(c) Let $(f_n)$ be a sequence in BV[a, b] which converges at every point of [a, b] to a function f, and suppose that for some $M > 0$ we have $V_{f_n}[a, b] \le M$ for all $n \in N$. Show that f belongs to BV[a, b] and that $V_f[a, b] \le M$.
(d) Let $(f_n)$ be a sequence in BV[a, b] such that $\|f_m - f_n\|_{BV} \to 0$ as $m, n \to \infty$. Show that there exists a function $f \in BV[a, b]$ such that $\|f_n - f\|_{BV} \to 0$ as $n \to \infty$.
(e) Let $(f_n)$ be a sequence of monotone increasing functions defined on $I = [a, b]$ such that $\|f_n\|_I \le M$ for all $n \in N$. Use the diagonal process to obtain a subsequence $(g_k)$ of $(f_n)$ which converges for each rational number r in [a, b]. Define $g(r) = \lim (g_k(r))$ for $r \in Q \cap [a, b]$. Show that g is increasing on $Q \cap [a, b]$. We define g for $x \in [a, b)$ as the right-hand limit $g(x) = \lim_{r \to x+} g(r)$. Show that if $c \in [a, b)$ is a point of continuity of g, then $g(c) = \lim_k g_k(c)$. Since g has at most a countable collection of points of discontinuity, a further application of the diagonal process can be made to get a subsequence $(h_m)$ of $(g_k)$ which converges everywhere on [a, b].
(f) Making use of part (e), establish the following result, called the Helly Selection Theorem: Let $(f_n)$ be a sequence of functions in BV[a, b] such that $\|f_n\|_{BV} \le M$ for all $n \in N$. Then there exists a subsequence of $(f_n)$ which converges at every point of [a, b] to a function $f \in BV[a, b]$ for which $\|f\|_{BV} \le M$.

Section 30 Existence of the Integral

In the preceding section we established some useful properties of the Riemann-Stieltjes integral. However, we have not yet shown the existence of the integral for very many functions.
In this section we shall focus our attention on monotone increasing integrator functions, although much of what we do can be extended to functions g which have bounded variation on an interval $J = [a, b]$ in the sense that there exists a constant $M > 0$ such that if $P = (x_0, x_1, \dots, x_n)$ is any partition of J, then

(30.1)   $\sum_{j=1}^{n} |g(x_j) - g(x_{j-1})| \le M.$

It is clear that, if g is monotone increasing, then the sum in (30.1) telescopes and one can take $M = g(b) - g(a)$. Hence a monotone increasing function has bounded variation. Conversely, it can be shown that every function with bounded variation is the difference of two increasing functions. (See Project 29.γ.)
We shall first establish a very powerful result.
30.1 RIEMANN CRITERION FOR INTEGRABILITY. Let $J = [a, b]$ and let g be monotone increasing on J. A function $f : J \to R$ is integrable with respect to g on J if and only if for every $\varepsilon > 0$ there exists a partition $P_\varepsilon$ of J such that if $P = (x_0, x_1, \dots, x_n)$ is a refinement of $P_\varepsilon$, then

(30.2)   $\sum_{j=1}^{n} (M_j - m_j)\{g(x_j) - g(x_{j-1})\} < \varepsilon,$

where $M_j = \sup \{f(x) : x \in [x_{j-1}, x_j]\}$ and $m_j = \inf \{f(x) : x \in [x_{j-1}, x_j]\}$ for $j = 1, \dots, n$.
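PROOF. If f is integrable with respect to g and $\varepsilon > 0$ is given, let $P_\varepsilon$ be a partition of J such that if $P = (x_0, x_1, \dots, x_n)$ is a refinement of $P_\varepsilon$, then

$$\Big| S(P; f, g) - \int_a^b f \, dg \Big| < \varepsilon$$

for any Riemann-Stieltjes sum corresponding to P. Now choose $y_j$ and $z_j$ in $[x_{j-1}, x_j]$ such that

$$M_j - \varepsilon < f(y_j), \qquad f(z_j) < m_j + \varepsilon.$$

This implies that $M_j - m_j < f(y_j) - f(z_j) + 2\varepsilon$ and hence

$$\sum_{j=1}^{n} (M_j - m_j)\{g(x_j) - g(x_{j-1})\} \le \sum_{j=1}^{n} f(y_j)\{g(x_j) - g(x_{j-1})\} - \sum_{j=1}^{n} f(z_j)\{g(x_j) - g(x_{j-1})\} + 2\varepsilon \{g(b) - g(a)\}.$$

Now the right side of this inequality contains two Riemann-Stieltjes sums corresponding to P, which cannot differ by more than $2\varepsilon$. Hence the condition (30.2) is satisfied with $\varepsilon$ replaced by $2\varepsilon\{1 + g(b) - g(a)\}$.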
Conversely suppose $\varepsilon > 0$ is given and $P_\varepsilon$ is a partition such that (30.2) holds for any partition $P = (x_0, x_1, \dots, x_n)$ refining $P_\varepsilon$. Let $Q = (y_0, y_1, \dots, y_m)$ be a refinement of P; we shall estimate the difference $S(P; f, g) - S(Q; f, g)$ of two corresponding sums. Since every point in P belongs to Q, we can express both of these sums in the form

$$S(P; f, g) = \sum_{k=1}^{m} f(u_k)\{g(y_k) - g(y_{k-1})\},$$
$$S(Q; f, g) = \sum_{k=1}^{m} f(v_k)\{g(y_k) - g(y_{k-1})\}.$$

To write $S(P; f, g)$ in terms of the points in Q, we must permit repetitions of the points $u_k$ and do not require $u_k$ to belong to $[y_{k-1}, y_k]$. However, both $u_k$ and $v_k$ do belong to some interval $[x_{j-1}, x_j]$ and in this case $|f(u_k) - f(v_k)| \le M_j - m_j$. Multiplying by $g(y_k) - g(y_{k-1}) \ge 0$ and adding, we obtain

$$|S(P; f, g) - S(Q; f, g)| \le \sum_{j=1}^{n} (M_j - m_j)\{g(x_j) - g(x_{j-1})\} < \varepsilon.$$

Finally, let P and P' be arbitrary refinements of $P_\varepsilon$ and let Q be a common refinement of both P and P'. Since the preceding argument applies to both P and P', we deduce that any sums $S(P; f, g)$ and $S(P'; f, g)$ could differ by at most $2\varepsilon$. Hence the Cauchy Criterion 29.4 applies to yield the integrability of f. Q.E.D.
30.2 INTEGRABILITY THEOREM. If f is continuous and g is monotone increasing on J, then f is integrable with respect to g on J.
PROOF. Since f is uniformly continuous on J, given $\varepsilon > 0$, there exists a $\delta(\varepsilon) > 0$ such that if $x, y \in J$ and $|x - y| < \delta(\varepsilon)$, then $|f(x) - f(y)| < \varepsilon$. Let $P_\varepsilon = (z_0, z_1, \dots, z_r)$ be a partition such that $\sup \{z_k - z_{k-1}\} < \delta(\varepsilon)$. If $P = (x_0, x_1, \dots, x_n)$ is a refinement of $P_\varepsilon$, then also $\sup \{x_j - x_{j-1}\} < \delta(\varepsilon)$ and so $M_j - m_j < \varepsilon$, whence it follows that

$$\sum_{j=1}^{n} (M_j - m_j)\{g(x_j) - g(x_{j-1})\} \le \varepsilon \{g(b) - g(a)\}.$$

Since $\varepsilon > 0$ is arbitrary, the Riemann Criterion applies. Q.E.D.
30.3 COROLLARY. If f is monotone and g is continuous on J, then f is integrable with respect to g on J.
PROOF. Apply the preceding theorem and Theorem 29.7 to $\pm f$. Q.E.D.
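The Integrability Theorem places no continuity requirement on the integrator. As an illustration only (the names `rs_sum`, `f`, and `g` below are hypothetical and the example is an assumption, not taken from the text), the following sketch forms Riemann-Stieltjes sums for the continuous integrand $f(x) = x^2$ against an increasing integrator with a unit jump at 1/2. For this g the integral works out by hand to $\int_0^1 x^2 \, dx + f(1/2) = 1/3 + 1/4$.

```python
def rs_sum(f, g, a, b, n):
    """Riemann-Stieltjes sum of f with respect to g on a uniform partition of
    [a, b], using the midpoint of each subinterval as the intermediate point."""
    total = 0.0
    for j in range(1, n + 1):
        left = a + (b - a) * (j - 1) / n
        right = a + (b - a) * j / n
        total += f(0.5 * (left + right)) * (g(right) - g(left))
    return total

f = lambda x: x * x                            # continuous integrand
g = lambda x: x + (1.0 if x >= 0.5 else 0.0)   # increasing, unit jump at 1/2

exact = 1.0 / 3.0 + 0.25                       # value computed by hand for this g
approximations = [rs_sum(f, g, 0.0, 1.0, n) for n in (10, 100, 1000, 10000)]
for value in approximations:
    print(value, abs(value - exact))
```

The sums settle toward the hand-computed value as the partition is refined, as the theorem leads one to expect.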
The Riemann Criterion enables us to show that the absolute value and the product of integrable functions are integrable.
30.4 THEOREM. Let g be monotone increasing on $J = [a, b]$.
(a) If $f : J \to R$ is integrable, then |f| is integrable with respect to g on J.
(b) If $f_1$ and $f_2$ are integrable, then the product $f_1 f_2$ is integrable with respect to g on J.
PROOF. Let $M_j$ and $m_j$ have the meaning given in the Riemann Criterion and observe that

$$M_j - m_j = \sup \{f(x) - f(y) : x, y \in [x_{j-1}, x_j]\}.$$

To prove (a), note that $\big| |f(x)| - |f(y)| \big| \le |f(x) - f(y)|$, so the Riemann Criterion implies that |f| is integrable when f is.
We also observe that if $|f(x)| \le K$ for $x \in J$, then $|(f(x))^2 - (f(y))^2| \le 2K |f(x) - f(y)|$, so the Riemann Criterion implies that $f^2$ is integrable when f is. To prove that $f_1 f_2$ is integrable when $f_1$ and $f_2$ are, note that

$$2 f_1 f_2 = (f_1 + f_2)^2 - f_1^2 - f_2^2.$$   Q.E.D.


30.5 LEMMA. Let g be monotone increasing on $J = [a, b]$ and suppose that f is integrable with respect to g on J. Then

(30.3)   $\Big| \int_a^b f \, dg \Big| \le \int_a^b |f| \, dg \le \|f\|_J \{g(b) - g(a)\}.$

If $m \le f(x) \le M$ for all $x \in J$, then

(30.4)   $m\{g(b) - g(a)\} \le \int_a^b f \, dg \le M\{g(b) - g(a)\}.$

PROOF. It follows from Theorem 30.4 that |f| is integrable with respect to g. If $P = (x_0, x_1, \dots, x_n)$ is a partition of J and $(z_j)$ is a set of intermediate points, then for $j = 1, 2, \dots, n$, we have

$$-\|f\|_J \le -|f(z_j)| \le f(z_j) \le |f(z_j)| \le \|f\|_J.$$

Multiply by $g(x_j) - g(x_{j-1}) \ge 0$ and sum to obtain the estimate

$$-\|f\|_J \{g(b) - g(a)\} \le -S(P; |f|, g) \le S(P; f, g) \le S(P; |f|, g) \le \|f\|_J \{g(b) - g(a)\},$$

whence it follows that $|S(P; f, g)| \le S(P; |f|, g) \le \|f\|_J \{g(b) - g(a)\}$, which implies the validity of (30.3). The proof of (30.4) is similar and will be omitted. Q.E.D.

Evaluation of the Integral


The next two results are useful in their own right, but also lead to the Fundamental Theorem which is the primary tool for evaluating Riemann integrals.
30.6 FIRST MEAN VALUE THEOREM. If g is increasing on $J = [a, b]$ and f is continuous on J to R, then there exists a number c in J such that

(30.5)   $\int_a^b f \, dg = f(c) \int_a^b dg = f(c)\{g(b) - g(a)\}.$

PROOF. If $m = \inf \{f(x) : x \in J\}$ and $M = \sup \{f(x) : x \in J\}$, it was seen in the preceding lemma that

$$m\{g(b) - g(a)\} \le \int_a^b f \, dg \le M\{g(b) - g(a)\}.$$

If $g(b) = g(a)$, then the relation (30.5) is trivial; if $g(b) > g(a)$, then it follows from Bolzano's Intermediate Value Theorem 22.4 that there exists a number c in J such that

$$f(c) = \Big\{ \int_a^b f \, dg \Big\} \Big/ \{g(b) - g(a)\}.$$   Q.E.D.


30.7 DIFFERENTIATION THEOREM. Suppose that f is continuous on J and that g is increasing on J and has a derivative at a point c in J. Then the function F, defined for x in J by

$$F(x) = \int_a^x f \, dg,$$

has a derivative at c and $F'(c) = f(c) g'(c)$.
PROOF. If $h > 0$ is such that $c + h$ belongs to J, then it follows from Theorem 29.6 and the preceding result that

$$F(c + h) - F(c) = \int_a^{c+h} f \, dg - \int_a^c f \, dg = \int_c^{c+h} f \, dg = f(c_1)\{g(c + h) - g(c)\}$$

for some $c_1$ with $c \le c_1 \le c + h$. A similar relation holds if $h < 0$. Since f is continuous and g has a derivative at c, then $F'(c)$ exists and equals $f(c) g'(c)$. Q.E.D.
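The conclusion $F'(c) = f(c) g'(c)$ can be checked numerically in a particular case. The sketch below is an illustration under stated assumptions: the helper `rs_integral` is hypothetical, and $f = \cos$, $g(x) = x^3$, $c = 1/2$ are arbitrary choices satisfying the hypotheses on [0, 1]. It compares a symmetric difference quotient of F with the predicted value.

```python
import math

def rs_integral(f, g, a, b, n=2000):
    """Midpoint-tagged Riemann-Stieltjes sum approximating the integral of f dg on [a, b]."""
    total = 0.0
    for j in range(1, n + 1):
        left = a + (b - a) * (j - 1) / n
        right = a + (b - a) * j / n
        total += f(0.5 * (left + right)) * (g(right) - g(left))
    return total

f = math.cos
g = lambda x: x ** 3             # increasing on [0, 1], with g'(x) = 3 x^2
c, h = 0.5, 1e-3

# F(c + h) - F(c - h) equals the integral of f dg over [c - h, c + h] (Theorem 29.6).
difference_quotient = rs_integral(f, g, c - h, c + h) / (2 * h)
predicted = f(c) * 3 * c ** 2    # f(c) g'(c)
print(difference_quotient, predicted)
```

The two printed numbers should agree to several decimal places; the remaining gap comes from the finite step h and from the quadrature.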

Specializing this theorem to the Riemann case, we obtain the result which provides the basis for the familiar method of evaluating integrals in calculus.
30.8 FUNDAMENTAL THEOREM OF INTEGRAL CALCULUS. Let f be continuous on $J = [a, b]$. A function F on J satisfies

(30.6)   $F(x) - F(a) = \int_a^x f$   for $x \in J$,

if and only if $F' = f$ on J.
PROOF. If relation (30.6) holds and $c \in J$, then it is seen from the preceding theorem that $F'(c) = f(c)$.
Conversely, let $F_a$ be defined for x in J by

$$F_a(x) = \int_a^x f.$$

The preceding theorem asserts that $F_a' = f$ on J. If F is such that $F' = f$, then it follows from Theorem 27.9(ii) that there exists a constant C such that $F(x) = F_a(x) + C$ for $x \in J$. Since $F_a(a) = 0$, then $C = F(a)$, whence it follows that if $F' = f$ on J, then

$$F(x) - F(a) = \int_a^x f.$$   Q.E.D.
Note. If F is a function defined on J such that F' = f on J, then we sometimes say that F is an indefinite integral, an anti-derivative, or a primitive of f. In this terminology, the Differentiation Theorem 30.7 asserts that every continuous function has a primitive. Sometimes the Fundamental Theorem of Integral Calculus is formulated in ways differing from that given in 30.8, but it always includes the assertion that, under suitable hypotheses, the Riemann integral of f can be calculated by evaluating any primitive of f at the end points of the interval of integration. We have given the above formulation, which yields a necessary and sufficient condition for a function to be a primitive of a continuous function. A somewhat more general result, not requiring the continuity of the integrand, will be found in Exercise 30.J.
It should not be supposed that the Fundamental Theorem asserts that if
the derivative f of a function F exists at every point of J, then f is integrable
and (30.6) holds. In fact, it may happen that f is not Riemann integrable
(see Exercise 30.K). Similarly, a function f may be Riemann integrable but
not have a primitive (see Exercise 30.L).
As a consequence of the Fundamental Theorem and Theorem 29.8, we
obtain the following variant of the First Mean Value Theorem 30.6, here
stated for Riemann integrals.
30.9 FIRST MEAN VALUE THEOREM. If f and p are continuous on $J = [a, b]$ and $p(x) \ge 0$ for all $x \in J$, then there exists a point $c \in J$ such that

(30.7)   $\int_a^b f(x) p(x) \, dx = f(c) \int_a^b p(x) \, dx.$

PROOF. Let $g : J \to R$ be defined for $x \in J$ by

$$g(x) = \int_a^x p(t) \, dt.$$

Since $p(x) \ge 0$, it is seen that g is increasing and it follows from the Differentiation Theorem 30.7 that $g' = p$. By Theorem 29.8, we conclude that

$$\int_a^b f \, dg = \int_a^b f p,$$

and from the First Mean Value Theorem 30.6, we infer that for some c in J,

$$\int_a^b f \, dg = f(c) \int_a^b p.$$   Q.E.D.
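A point c as in (30.7) can be located numerically in a simple case. The following sketch is illustrative only: the helper `midpoint_integral` is hypothetical, and the choices $f(x) = x^2$, $p(x) = 1 + x$ on [0, 1] are assumptions. Bisection is used to solve $f(c) = \int f p / \int p$, which works here because this particular f is increasing; the theorem itself does not require that.

```python
def midpoint_integral(func, a, b, n=10000):
    """Midpoint-rule approximation to the Riemann integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (k + 0.5) * width) for k in range(n))

f = lambda x: x * x          # continuous, and increasing on [0, 1]
p = lambda x: 1.0 + x        # continuous and non-negative on [0, 1]
a, b = 0.0, 1.0

# The weighted mean that f(c) must equal according to (30.7).
target = midpoint_integral(lambda x: f(x) * p(x), a, b) / midpoint_integral(p, a, b)

# Bisection for f(c) = target; valid because f is increasing on [a, b].
lo, hi = a, b
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if f(mid) < target:
        lo = mid
    else:
        hi = mid
c = 0.5 * (lo + hi)
print(target, c)
```

By hand, $\int_0^1 f p = 7/12$ and $\int_0^1 p = 3/2$, so the computed c should be close to $\sqrt{7/18}$.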

As a second application of Theorem 29.8 we shall reformulate Theorem 29.7, which is concerned with integration by parts, in a more traditional form. The proof will be left to the reader.
30.10 INTEGRATION BY PARTS. If f and g have continuous derivatives on [a, b], then

$$\int_a^b f g' = f(b) g(b) - f(a) g(a) - \int_a^b f' g.$$
The next result is often useful.
30.11 SECOND MEAN VALUE THEOREM. (a) If f is increasing and g is continuous on $J = [a, b]$, then there exists a point c in J such that

(30.8)   $\int_a^b f \, dg = f(a) \int_a^c dg + f(b) \int_c^b dg.$

(b) If f is increasing and h is continuous on J, then there exists a point c in J such that

(30.9)   $\int_a^b f h = f(a) \int_a^c h + f(b) \int_c^b h.$

(c) If $\varphi$ is non-negative and increasing and h is continuous on J, then there exists a point c in J such that

$$\int_a^b \varphi h = \varphi(b) \int_c^b h.$$
PROOF. The hypotheses, together with the Integrability Theorem 30.2, imply that g is integrable with respect to f on J. Furthermore, by the First Mean Value Theorem 30.6,

$$\int_a^b g \, df = g(c)\{f(b) - f(a)\}.$$

After using Theorem 29.7 concerning integration by parts, we conclude that f is integrable with respect to g and

$$\int_a^b f \, dg = \{f(b) g(b) - f(a) g(a)\} - g(c)\{f(b) - f(a)\}$$
$$= f(a)\{g(c) - g(a)\} + f(b)\{g(b) - g(c)\}$$
$$= f(a) \int_a^c dg + f(b) \int_c^b dg,$$

which establishes part (a). To prove (b) let g be defined on J by

$$g(x) = \int_a^x h,$$

so that $g' = h$. The conclusion then follows from part (a) by using Theorem 29.8. To prove (c) define F to be equal to $\varphi$ for x in (a, b] and define $F(a) = 0$. We now apply part (b) to F. Q.E.D.

Part (c) of the preceding theorem is frequently called the Bonnet† form of the Second Mean Value Theorem. It is evident that there is a corresponding result for a decreasing function (cf. Exercise 30.N).

Change of Variable
We shall now establish a theorem justifying the familiar formula relating to the "change of variable" in a Riemann integral.
30.12 CHANGE OF VARIABLE THEOREM. Let $\varphi$ be defined on an interval $[\alpha, \beta]$ to R with a continuous derivative and suppose that $a = \varphi(\alpha)$ and $b = \varphi(\beta)$. If f is continuous on the range of $\varphi$, then

(30.10)   $\int_a^b f(x) \, dx = \int_\alpha^\beta f(\varphi(t)) \varphi'(t) \, dt.$

PROOF. Let $I = \varphi([\alpha, \beta])$ and let F be defined by

$$F(\xi) = \int_a^\xi f(x) \, dx \quad \text{for } \xi \in I,$$

and consider the function H defined by $H(t) = F(\varphi(t))$ for $\alpha \le t \le \beta$. Observe that $H(\alpha) = F(a) = 0$. If we differentiate with respect to t and use the fact that $F' = f$ (why?), we obtain

$$H'(t) = F'(\varphi(t)) \varphi'(t) = f(\varphi(t)) \varphi'(t).$$

We now apply the Fundamental Theorem to infer that

$$\int_a^b f(x) \, dx = F(b) = H(\beta) = \int_\alpha^\beta f(\varphi(t)) \varphi'(t) \, dt.$$   Q.E.D.
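Formula (30.10) is easy to test numerically. The sketch below is only an illustration: `midpoint_integral` is a hypothetical helper, and the choices $f = \cos$ and $\varphi(t) = t^2$ on [0, 1] (so that a = 0 and b = 1) are assumptions.

```python
import math

def midpoint_integral(func, a, b, n=10000):
    """Midpoint-rule approximation to the Riemann integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (k + 0.5) * width) for k in range(n))

f = math.cos
phi = lambda t: t * t            # substitution with continuous derivative
dphi = lambda t: 2.0 * t         # phi'(t)
alpha, beta = 0.0, 1.0
a, b = phi(alpha), phi(beta)     # a = 0, b = 1

left_side = midpoint_integral(f, a, b)
right_side = midpoint_integral(lambda t: f(phi(t)) * dphi(t), alpha, beta)
print(left_side, right_side, math.sin(1.0))
```

Both sides should agree with $\sin 1$, the value obtained from the primitive of the cosine.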


Modification of the Integral
The next result is often useful in reducing a Riemann-Stieltjes integral to a Riemann integral.
30.13 THEOREM. If g' exists and f and g' are Riemann integrable on [a, b], then f is Riemann-Stieltjes integrable with respect to g and

(30.11)   $\int_a^b f \, dg = \int_a^b f g'.$


PROOF. Let $M > 0$ be such that $|f(x)| \le M$ for $x \in [a, b]$ and let $\varepsilon > 0$. It follows from Theorem 30.4 that fg' is Riemann integrable. Therefore there exists a partition $P_\varepsilon$ of [a, b] such that if $P = (x_0, x_1, \dots, x_n)$ is any

† OSSIAN BONNET (1819-1892) is primarily known for his work in differential geometry.

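refinement of $P_\varepsilon$ and if $\xi_j \in [x_{j-1}, x_j]$ for $j = 1, \dots, n$, then

(30.12)   $\Big| \sum_{j=1}^{n} f(\xi_j) g'(\xi_j)(x_j - x_{j-1}) - \int_a^b f g' \Big| < \varepsilon.$

Since g' is Riemann integrable we may also suppose (in virtue of the Riemann Criterion 30.1) that $P_\varepsilon$ has been chosen so that

(30.13)   $\sum_{j=1}^{n} (M_j - m_j)(x_j - x_{j-1}) < \varepsilon,$

where $M_j = \sup \{g'(x) : x \in [x_{j-1}, x_j]\}$ and $m_j = \inf \{g'(x) : x \in [x_{j-1}, x_j]\}$. If we use the Mean Value Theorem 27.6, we obtain points $\zeta_j \in (x_{j-1}, x_j)$ such that

$$\Big| \sum_{j=1}^{n} f(\xi_j)\{g(x_j) - g(x_{j-1})\} - \int_a^b f g' \Big|$$
$$= \Big| \sum_{j=1}^{n} f(\xi_j) g'(\zeta_j)(x_j - x_{j-1}) - \int_a^b f g' \Big|$$
$$\le \Big| \sum_{j=1}^{n} f(\xi_j)\{g'(\zeta_j) - g'(\xi_j)\}(x_j - x_{j-1}) \Big| + \Big| \sum_{j=1}^{n} f(\xi_j) g'(\xi_j)(x_j - x_{j-1}) - \int_a^b f g' \Big|.$$

Now since $|g'(\zeta_j) - g'(\xi_j)| \le M_j - m_j$, it follows from (30.12) and (30.13) that the preceding expression is dominated by

$$M \sum_{j=1}^{n} (M_j - m_j)(x_j - x_{j-1}) + \varepsilon \le (M + 1)\varepsilon.$$

Since $\varepsilon > 0$ and the choice of $\xi_j \in [x_{j-1}, x_j]$ are arbitrary, it follows that f is integrable with respect to g and that (30.11) holds. Q.E.D.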
REMARK. The proof can be modified to apply to the case where f is
bounded and g is continuous on [a, b], and where g has a derivative except
at a finite number of points at which g’ can be defined so that g’ and fg’ are
Riemann integrable on [a, b].
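Formula (30.11) can be compared on both sides in a concrete case. The following is a minimal sketch with hypothetical helper names, assuming $f(x) = x$ and $g = \sin$ on $[0, \pi/2]$, for which integration by parts gives $\int_0^{\pi/2} x \cos x \, dx = \pi/2 - 1$.

```python
import math

def rs_sum(f, g, a, b, n):
    """Midpoint-tagged Riemann-Stieltjes sum of f with respect to g on [a, b]."""
    total = 0.0
    for j in range(1, n + 1):
        left = a + (b - a) * (j - 1) / n
        right = a + (b - a) * j / n
        total += f(0.5 * (left + right)) * (g(right) - g(left))
    return total

def midpoint_integral(func, a, b, n):
    """Midpoint-rule approximation to the Riemann integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (k + 0.5) * width) for k in range(n))

f = lambda x: x
g = math.sin
dg = math.cos                    # g' for this choice of g
a, b = 0.0, math.pi / 2

stieltjes = rs_sum(f, g, a, b, 2000)                             # left side of (30.11)
riemann = midpoint_integral(lambda x: f(x) * dg(x), a, b, 2000)  # right side of (30.11)
print(stieltjes, riemann, math.pi / 2 - 1)
```

The Stieltjes sum uses only values of g, while the Riemann sum uses g'; the theorem says the two limits coincide.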

Exercises

30.A. Show that a bounded function which has at most a finite number of
discontinuities is Riemann integrable.
30.B. If f:[a, b] > R is discontinuous at some point of the interval, then there
exists a monotone increasing function g such that f is not g-integrable.
30.C. Show that the Integrability Theorem 30.2 holds when g is a function of
bounded variation on J.
30.D. Give an example of a function f which is not Riemann integrable over J but such that |f| and $f^2$ are Riemann integrable over J.
30.E. Let f be positive and continuous on $J = [a, b]$ and let $M = \sup \{f(x) : x \in J\}$. Show that

$$M = \lim_{n} \Big( \int_a^b (f(x))^n \, dx \Big)^{1/n}.$$


30.F. Show that the First Mean Value Theorem 30.6 may fail if f is not
continuous.
30.G. Show that the Differentiation Theorem 30.7 holds if it is assumed that f is
integrable on J with respect to an increasing function g, that f is continuous at c, and
that g is differentiable at c.
30.H. Suppose that f is integrable with respect to an increasing function g on $J = [a, b]$ and let F be defined for $x \in J$ by

$$F(x) = \int_a^x f \, dg.$$

Prove that (a) if g is continuous at c, then F is continuous at c, and (b) if f is positive, then F is increasing.
30.I. Give an example of a Riemann integrable function f on J such that the function F, defined for $x \in J$ by

$$F(x) = \int_a^x f,$$

does not have a derivative at some points of J. Can you find an integrable function f such that F is not continuous on J?
30.J. If f is Riemann integrable on $J = [a, b]$ and if $F' = f$ on J, then

$$F(b) - F(a) = \int_a^b f.$$

Hint: if $P = (x_0, x_1, \dots, x_n)$ is a partition of J, write

$$F(b) - F(a) = \sum_{j=1}^{n} \{F(x_j) - F(x_{j-1})\}.$$

30.K. Let F be defined by

$$F(x) = x^2 \sin(1/x^2), \quad 0 < x \le 1,$$
$$F(x) = 0, \quad x = 0.$$

Then F has a derivative at every point of I. However F' is not integrable on I and so F is not the integral of its derivative.
30.L. Let f be defined by $f(x) = [x]$ for $x \in [0, 2]$. Then f is Riemann integrable on [0, 2], but it is not the derivative of any function.
30.M. In the First Mean Value Theorem 30.9, assume that p is Riemann
integrable (instead of continuous). Show that the conclusion still holds.
30.N. If $\varphi$ is non-negative and decreasing and h is continuous on [a, b], then there exists a point $\xi \in [a, b]$ such that

$$\int_a^b \varphi h = \varphi(a) \int_a^\xi h.$$

30.O. Let f be continuous on $I = [0, 1]$, let $f_0 = f$, and let $f_{n+1}$ be defined by

$$f_{n+1}(x) = \int_0^x f_n(t) \, dt$$

for $n \in N$, $x \in I$. By induction, show that $|f_n(x)| \le (M/n!) x^n \le M/n!$, where $M = \sup \{|f(x)| : x \in I\}$. It follows that the sequence $(f_n)$ converges uniformly on I to the zero function.
30.P. If f is integrable with respect to g on $J = [a, b]$, if $\varphi$ is continuous and strictly increasing on [c, d], and if $\varphi(c) = a$, $\varphi(d) = b$, then $f \circ \varphi$ is integrable with respect to $g \circ \varphi$ and

$$\int_a^b f \, dg = \int_c^d (f \circ \varphi) \, d(g \circ \varphi).$$


30.Q. If f is continuous on [a, b] and if

$$\int_a^b f h = 0$$

for all continuous functions h, then $f(x) = 0$ for all x.
30.R. If f is integrable on [a, b] and if

$$\int_a^b f h = 0$$

for all continuous functions h, then $f(x) = 0$ for all points of continuity of f.
30.S. Let p be continuous and positive on [a, b] and let $c > 0$. If

$$p(x) \le c \int_a^x p(t) \, dt$$

for all $x \in [a, b]$, show that $p(x) = 0$ for all x.
30.T. Let f be continuous and such that $f(x) \ge 0$ for all $x \in [a, b]$. If g is strictly increasing on [a, b], show that

$$\int_a^b f \, dg = 0$$

if and only if $f(x) = 0$ for all $x \in [a, b]$.


30.U. Show that if g is strictly increasing on [a, b], then in the First Mean Value
Theorem 30.6 one can take c € (a, b). Make a similar modification of the parts (a)
and (b) of the Second Mean Value Theorem 30.11.
30.V. Evaluate the following Riemann-Stieltjes integrals. (Here x +>[x] de-
notes the greatest integer function.)

(@) [rae yf) xadeb,


[x adxp, @ [ated
(e) [/c08 x d(sin x), (f) i cos
x d(|sin x]).

Projects

30.α. The purpose of this project is to develop the logarithm by using an integral as its definition. Let $P = \{x \in R : x > 0\}$.
(a) If $x \in P$, define L(x) to be

$$L(x) = \int_1^x \frac{1}{t} \, dt.$$

Hence $L(1) = 0$. Prove that L is differentiable and that $L'(x) = 1/x$.
(b) Show that $L(x) < 0$ for $0 < x < 1$ and $L(x) > 0$ for $x > 1$. In fact,

$$1 - 1/x \le L(x) \le x - 1 \quad \text{for } x > 0.$$

(c) Prove that $L(xy) = L(x) + L(y)$ for x, y in P. Hence $L(1/x) = -L(x)$ for x in P. (Hint: if $y \in P$, let $L_1$ be defined on P by $L_1(x) = L(xy)$ and show that $L_1' = L'$.)
(d) Show that if $n \in N$, $n \ge 2$, then

$$\frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < L(n) < 1 + \frac{1}{2} + \cdots + \frac{1}{n-1}.$$

(e) Prove that L is a one-one function mapping P onto all of R. Letting e denote the unique number such that $L(e) = 1$, and using the fact that $L'(1) = 1$, show that $e = \lim ((1 + 1/n)^n)$.
(f) Let r be any positive rational number; then $\lim_{x \to +\infty} L(x)/x^r = 0$.
(g) Observe that

$$L(1 + x) = \int_1^{1+x} \frac{dt}{t} = \int_0^x \frac{dt}{1 + t}.$$

Write $(1 + t)^{-1}$ as a finite geometric series to obtain

$$L(1 + x) = \sum_{k=1}^{n} \frac{(-1)^{k-1}}{k} x^k + R_n(x).$$

Show that $|R_n(x)| \le 1/(n + 1)$ for $0 \le x \le 1$ and

$$|R_n(x)| \le \frac{|x|^{n+1}}{(n + 1)(1 + x)} \quad \text{for } -1 < x < 0.$$
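Several parts of this project can be sanity-checked numerically before they are proved. The sketch below is illustrative only: the function `L` approximates the defining integral by the midpoint rule (an assumption of this example, not the project's method), and the sample points are arbitrary.

```python
import math

def L(x, n=20000):
    """Midpoint-rule approximation to the integral of 1/t from 1 to x, for x > 0."""
    width = (x - 1.0) / n        # negative when 0 < x < 1, so L(x) < 0 there
    return width * sum(1.0 / (1.0 + (k + 0.5) * width) for k in range(n))

# Part (c): L(xy) - (L(x) + L(y)) should be close to zero.
x, y = 2.0, 3.0
additivity_gap = L(x * y) - (L(x) + L(y))

# Part (b): 1 - 1/t <= L(t) <= t - 1 at a few sample points.
bounds_hold = all(1 - 1 / t <= L(t) <= t - 1 for t in (0.25, 0.5, 2.0, 5.0))

# Part (g): partial sum of the series for L(1 + x) at x = 1/2 with n = 10 terms.
series = sum((-1) ** (k - 1) * 0.5 ** k / k for k in range(1, 11))
remainder = L(1.5) - series

print(additivity_gap, bounds_hold, remainder)
```

Comparing `L(2.0)` with `math.log(2.0)` from the standard library gives one more independent check that the integral definition produces the familiar logarithm.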

30.β. This project develops the trigonometric functions starting with an integral.
(a) Let A be defined for x in R by

$$A(x) = \int_0^x \frac{dt}{1 + t^2}.$$

Then A is an odd function (that is, $A(-x) = -A(x)$), it is strictly increasing, and it is bounded by 2. Define $\pi$ by $\pi/2 = \sup \{A(x) : x \in R\}$.
(b) Let T be the inverse of A, so that T is a strictly increasing function with domain $(-\pi/2, \pi/2)$. Show that T has a derivative and that $T' = 1 + T^2$.
(c) Define C and S on $(-\pi/2, \pi/2)$ by the formulas

$$C = \frac{1}{(1 + T^2)^{1/2}}, \qquad S = \frac{T}{(1 + T^2)^{1/2}}.$$

Hence C is even and S is odd on $(-\pi/2, \pi/2)$. Show that $C(0) = 1$ and $S(0) = 0$ and $C(x) \to 0$ and $S(x) \to 1$ as $x \to \pi/2$.
(d) Prove that $C'(x) = -S(x)$ and $S'(x) = C(x)$ for x in $(-\pi/2, \pi/2)$. Therefore, both C and S satisfy the differential equation

$$h'' + h = 0$$

on the interval $(-\pi/2, \pi/2)$.
(e) Define $C(\pi/2) = 0$ and $S(\pi/2) = 1$ and define C, S, T outside the interval $(-\pi/2, \pi/2)$ by the equations

$$C(x + \pi) = -C(x), \qquad S(x + \pi) = -S(x),$$
$$T(x + \pi) = T(x).$$

If this is done successively, then C and S are defined for all R and have period $2\pi$. Similarly, T is defined except at odd multiples of $\pi/2$ and has period $\pi$.
(f) Show that the functions C and S, as defined on R in the preceding part, are differentiable at every point of R and that they continue to satisfy the relations

$$C' = -S, \qquad S' = C$$

everywhere on R.

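30.γ. This project develops the well-known Wallis† product formula. Throughout it we shall let

$$S_n = \int_0^{\pi/2} (\sin x)^n \, dx.$$

(a) If $n \ge 2$, then $S_n = [(n - 1)/n] S_{n-2}$. (Hint: integrate by parts.)
(b) Establish the formulas

$$S_{2n} = \frac{1 \cdot 3 \cdot 5 \cdots (2n - 1)}{2 \cdot 4 \cdot 6 \cdots (2n)} \cdot \frac{\pi}{2}, \qquad S_{2n+1} = \frac{2 \cdot 4 \cdots (2n)}{1 \cdot 3 \cdot 5 \cdots (2n + 1)}.$$

(c) Show that the sequence $(S_n)$ is monotone decreasing. (Hint: $0 \le \sin x \le 1$.)
(d) Let $W_n$ be defined by

$$W_n = \frac{2 \cdot 2 \cdot 4 \cdot 4 \cdot 6 \cdot 6 \cdots (2n)(2n)}{1 \cdot 3 \cdot 3 \cdot 5 \cdot 5 \cdot 7 \cdots (2n - 1)(2n + 1)}.$$

Prove that $\lim (W_n) = \pi/2$. (This is Wallis's product.)
(e) Prove that $\lim \big( (n!)^2 2^{2n} / ((2n)! \sqrt{n}) \big) = \sqrt{\pi}$.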
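The limits in parts (d) and (e) converge slowly, which a short computation makes visible. This is a sketch under stated assumptions: the function names `wallis` and `central_ratio` are hypothetical, and the ratio in part (e) is evaluated through `math.lgamma` (the logarithm of the gamma function, with `lgamma(n + 1)` equal to log n!) to avoid overflow.

```python
import math

def wallis(n):
    """W_n = product over k = 1..n of (2k)(2k) / ((2k - 1)(2k + 1))."""
    product = 1.0
    for k in range(1, n + 1):
        product *= (2.0 * k) * (2.0 * k) / ((2.0 * k - 1.0) * (2.0 * k + 1.0))
    return product

def central_ratio(n):
    """(n!)^2 2^(2n) / ((2n)! sqrt(n)), computed through logarithms."""
    log_value = (2 * math.lgamma(n + 1) + 2 * n * math.log(2)
                 - math.lgamma(2 * n + 1) - 0.5 * math.log(n))
    return math.exp(log_value)

for n in (10, 100, 10000):
    print(n, wallis(n), central_ratio(n))
print(math.pi / 2, math.sqrt(math.pi))
```

Each factor of $W_n$ exceeds 1, so the computed products increase toward $\pi/2$ from below.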
30.δ. This project develops the important Stirling‡ formula, which estimates the magnitude of n!.

† JOHN WALLIS (1616-1703), the Savilian professor of geometry at Oxford for sixty years, was a precursor of Newton. He helped to lay the groundwork for the development of calculus.
‡ JAMES STIRLING (1692-1770) was a Scottish mathematician of the Newtonian school. The formula attributed to Stirling was actually established earlier by ABRAHAM DE MOIVRE (1667-1754), a French Huguenot who settled in London and was a friend of Newton.

(a) By comparing the area under the hyperbola $y = 1/x$ and the area of a trapezoid inscribed in it, show that

$$\frac{2}{2n + 1} < \log \Big( 1 + \frac{1}{n} \Big).$$

From this, show that $e < (1 + 1/n)^{n + 1/2}$.
(b) Show that

$$\int_1^n \log x \, dx = n \log n - n + 1 = \log (n/e)^n + 1.$$

Consider the figure F made up of rectangles with bases $[1, \tfrac{3}{2}]$, $[n - \tfrac{1}{2}, n]$ and heights 2, log n, respectively, and with trapezoids with bases $[k - \tfrac{1}{2}, k + \tfrac{1}{2}]$, $k = 2, 3, \dots, n - 1$, and with slant heights passing through the points (k, log k). Show that the area of F is

$$1 + \log 2 + \cdots + \log (n - 1) + \tfrac{1}{2} \log n = 1 + \log (n!) - \log \sqrt{n}.$$

(c) Comparing the two areas in part (b), show that

$$u_n = \frac{(n/e)^n \sqrt{n}}{n!} < 1, \quad n \in N.$$

(d) Show that the sequence $(u_n)$ is monotone increasing. (Hint: consider $u_{n+1}/u_n$.)
(e) By considering $u_n^2 / u_{2n}$ and making use of the result of part (e) of the preceding project, show that $\lim (u_n) = (2\pi)^{-1/2}$.
(f) Obtain Stirling's formula

$$\lim \Big( \frac{(n/e)^n \sqrt{2\pi n}}{n!} \Big) = 1.$$
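Parts (c), (d), and (f) can be observed numerically. The following sketch is illustrative: the function `log_u` is a hypothetical helper that evaluates $\log u_n$ with `math.lgamma` (using `lgamma(n + 1)` for log n!), which is a computational convenience assumed here and not part of the project's argument.

```python
import math

def log_u(n):
    """Logarithm of u_n = (n/e)^n sqrt(n) / n!."""
    return n * math.log(n) - n + 0.5 * math.log(n) - math.lgamma(n + 1)

u = [math.exp(log_u(n)) for n in range(1, 51)]       # u_1, ..., u_50
limit = 1.0 / math.sqrt(2.0 * math.pi)               # claimed limit in part (e)

# Part (f): (n/e)^n sqrt(2 pi n) / n! evaluated at n = 1000.
stirling_ratio = math.exp(log_u(1000)) * math.sqrt(2.0 * math.pi)

print(u[0], u[9], u[49], limit, stirling_ratio)
```

The printed values of $u_n$ stay below 1 and increase with n, and the ratio in part (f) is already close to 1 at n = 1000.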

Section 31 Further Properties of the Integral

In this section we shall present some further properties of the Riemann-Stieltjes (and the Riemann) integral that are often useful.
We first consider the possibility of "taking the limit under the integral sign"; that is, the integrability of the limit of a sequence of integrable functions.
Suppose that g is monotone increasing on an interval $J = [a, b]$ and that $(f_n)$ is a sequence of functions which are integrable with respect to g and which converges at every point of J to a function f. It is quite natural to expect that the limit function f is integrable and that

(31.1)   $\int_a^b f \, dg = \lim \int_a^b f_n \, dg.$

However, this need not be the case even for very nice functions.
Figure 31.1. Graph of $f_n$ (peak at the point (1/n, n); horizontal axis marked at 0, 1/n, 2/n, 1).

31.1 EXAMPLE. Let $J = [0, 1]$, let $g(x) = x$, and let $f_n$ be defined for $n \ge 2$ by

$$f_n(x) = n^2 x, \quad 0 \le x \le 1/n,$$
$$f_n(x) = -n^2 (x - 2/n), \quad 1/n \le x \le 2/n,$$
$$f_n(x) = 0, \quad 2/n \le x \le 1.$$

It is clear that for each n the function $f_n$ is continuous on J, and hence it is integrable with respect to g. (See Figure 31.1.) Either by means of a direct calculation or referring to the significance of the integral as an area, we obtain

$$\int_0^1 f_n(x) \, dx = 1, \quad n \ge 2.$$

In addition, the sequence $(f_n)$ converges at every point of J to 0; hence the limit function f vanishes identically, is integrable, and

$$\int_0^1 f(x) \, dx = 0.$$

Therefore, equation (31.1) does not hold in this case even though both sides have a meaning.
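The behaviour in this example is easy to reproduce. The sketch below is an illustration with hypothetical helper names: it evaluates the triangular functions of the example, approximates their integrals by the midpoint rule, and tracks the values at one fixed point (x = 0.05, an arbitrary choice).

```python
def f_n(n, x):
    """The triangular peak of the example on [0, 1], for n >= 2."""
    if x <= 1.0 / n:
        return n * n * x
    if x <= 2.0 / n:
        return -n * n * (x - 2.0 / n)
    return 0.0

def midpoint_integral(func, a, b, n=100000):
    """Midpoint-rule approximation to the Riemann integral of func over [a, b]."""
    width = (b - a) / n
    return width * sum(func(a + (k + 0.5) * width) for k in range(n))

# The integrals stay near 1 for every n ...
integrals = {n: midpoint_integral(lambda x, n=n: f_n(n, x), 0.0, 1.0) for n in (2, 10, 100)}

# ... while the values at a fixed point are eventually 0 (here once 2/n < 0.05).
values_at_fixed_point = [f_n(n, 0.05) for n in (2, 10, 100, 1000)]

print(integrals)
print(values_at_fixed_point)
```

The peak height n grows while its base 2/n shrinks, so the area is preserved even though the functions tend to 0 pointwise; the convergence is not uniform.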

Since equation (31.1) is very convenient, we inquire if there are any simple
additional conditions that will imply it. We now show that, if the con-
vergence is uniform, then this relation holds.

31.2 THEOREM. Let g be a monotone increasing function on J and let $(f_n)$ be a sequence of functions which are integrable with respect to g over J. Suppose that the sequence $(f_n)$ converges uniformly on J to a limit function f. Then f is integrable with respect to g and

(31.1)   $\int_a^b f \, dg = \lim \int_a^b f_n \, dg.$

PROOF. Let $\varepsilon > 0$ and let N be such that $\|f_N - f\|_J < \varepsilon$. Now let $P_N$ be a partition of J such that if P, Q are refinements of $P_N$, then $|S(P; f_N, g) - S(Q; f_N, g)| < \varepsilon$ for any choice of the intermediate points. If we use the same intermediate points for f and $f_N$, then

$$|S(P; f_N, g) - S(P; f, g)| \le \sum_{k=1}^{n} \|f_N - f\|_J \{g(x_k) - g(x_{k-1})\} = \|f_N - f\|_J \{g(b) - g(a)\} \le \varepsilon \{g(b) - g(a)\}.$$

Since a similar estimate holds for the partition Q, then for refinements P, Q of $P_N$ and corresponding Riemann-Stieltjes sums, we have

$$|S(P; f, g) - S(Q; f, g)| \le |S(P; f, g) - S(P; f_N, g)| + |S(P; f_N, g) - S(Q; f_N, g)| + |S(Q; f_N, g) - S(Q; f, g)| \le \varepsilon (1 + 2\{g(b) - g(a)\}).$$

According to the Cauchy Criterion 29.4, the limit function f is integrable with respect to g.
To establish (31.1), we employ Lemma 30.5:

$$\Big| \int_a^b f \, dg - \int_a^b f_n \, dg \Big| = \Big| \int_a^b (f - f_n) \, dg \Big| \le \|f - f_n\|_J \{g(b) - g(a)\}.$$

Since $\lim \|f - f_n\|_J = 0$, the desired conclusion follows. Q.E.D.
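The final estimate of the proof can be watched at work. In the following sketch (illustrative only; the helper `rs_sum` and the family $f_n(x) = x + \sin(nx)/n$ are assumptions chosen so that $\|f_n - f\|_J \le 1/n$ on J = [0, 1], with $g(x) = x^2$), the gap between the approximate integrals is compared with the bound $\|f - f_n\|_J \{g(b) - g(a)\}$.

```python
import math

def rs_sum(f, g, a, b, n=5000):
    """Midpoint-tagged Riemann-Stieltjes sum of f with respect to g on [a, b]."""
    total = 0.0
    for j in range(1, n + 1):
        left = a + (b - a) * (j - 1) / n
        right = a + (b - a) * j / n
        total += f(0.5 * (left + right)) * (g(right) - g(left))
    return total

g = lambda x: x * x              # increasing integrator on [0, 1], g(1) - g(0) = 1
f = lambda x: x                  # uniform limit of the sequence below

def f_n(n):
    """f_n(x) = x + sin(n x)/n, so that sup |f_n - f| <= 1/n on [0, 1]."""
    return lambda x: x + math.sin(n * x) / n

limit_integral = rs_sum(f, g, 0.0, 1.0)
gaps = {n: abs(rs_sum(f_n(n), g, 0.0, 1.0) - limit_integral) for n in (1, 10, 100)}
print(limit_integral, gaps)
```

Here $\int_0^1 x \, d(x^2) = 2/3$, and each gap should not exceed the corresponding bound 1/n.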
The hypothesis made in Theorem 31.2 that the convergence of the sequence $(f_n)$ is uniform is rather severe and restricts the utility of this result. We shall now state a result which does not restrict the convergence so heavily, but requires the integrability of the limit function. We shall not prove this result here, since the most natural proof would require an excursion into "measure theory." (However, the reader may consult the article of Luxemburg listed in the References.)
31.3 BOUNDED CONVERGENCE THEOREM. Let $(f_n)$ be a sequence of functions which are integrable with respect to a monotone increasing function g on $J = [a, b]$ to R. Suppose that there exists $B > 0$ such that $|f_n(x)| \le B$ for all $n \in N$, $x \in J$. If the function $f(x) = \lim (f_n(x))$, $x \in J$, exists and is integrable with respect to g on J, then

(31.1)   $\int_a^b f \, dg = \lim \int_a^b f_n \, dg.$


The following consequence of the Bounded Convergence Theorem is
frequently useful, so we shall state it formally.
31.4 MONOTONE CONVERGENCE THEOREM. Let $(f_n)$ be a monotone sequence
of functions which are integrable with respect to a monotone increasing
function g on J = [a, b] to R. If the function $f(x) = \lim (f_n(x))$, $x \in J$, is
integrable with respect to g on J, then

(31.1)  $\int_a^b f \, dg = \lim_n \int_a^b f_n \, dg.$

PROOF. Suppose that $f_1(x) \le f_2(x) \le \cdots \le f(x)$ for all $x \in J$. Then
$|f_n(x)| \le B$, where $B = \|f_1\|_J + \|f\|_J$, so we can apply 31.3. Q.E.D.
The main source of power of the Lebesgue (and Lebesgue-Stieltjes)
theory of integration is that it enlarges the class of integrable functions so
that equation (31.1) holds under weaker assumptions than given in the
preceding theorems. See the author’s Elements of Integration, listed in
the References.
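
As a numerical illustration of the Bounded Convergence Theorem 31.3, the following Python sketch takes g(x) = x on J = [0, 1] and the sequence $f_n(x) = x^n$. This example and the helper names `midpoint_sum` and `power` are our own choices for illustration, not taken from the text. The sequence is bounded by B = 1 and converges pointwise, but not uniformly, to a function whose integral is 0, and the computed integrals should decrease toward 0.

```python
def midpoint_sum(f, a, b, m=20000):
    """Midpoint Riemann sum of f over [a, b] using m equal subintervals."""
    h = (b - a) / m
    return h * sum(f(a + (k + 0.5) * h) for k in range(m))


def power(n):
    """Return f_n(x) = x**n; on [0, 1] we have |f_n(x)| <= 1, so B = 1."""
    return lambda x: x ** n


# Pointwise limit: f(x) = 0 for 0 <= x < 1 and f(1) = 1, whose integral is 0.
# The convergence is not uniform, since sup |f_n - f| over [0, 1) equals 1.
integrals = {n: midpoint_sum(power(n), 0.0, 1.0) for n in (1, 10, 100, 1000)}

for n, value in integrals.items():
    # The exact value of the integral of x**n over [0, 1] is 1 / (n + 1).
    print(n, value, 1.0 / (n + 1))
```

Theorem 31.2 does not apply to this sequence, since the convergence is not uniform; it is the boundedness hypothesis of 31.3 that justifies passing to the limit.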

Integral Form for the Remainder


The reader will recall Taylor's Theorem 28.6, which enables one to
calculate the value f(b) in terms of the values $f(a), f'(a), \ldots, f^{(n-1)}(a)$ and a
remainder term which involves $f^{(n)}$ evaluated at a point between a and b.
For many applications it is more convenient to be able to express the
remainder term as an integral involving $f^{(n)}$.
31.5 TAYLOR'S THEOREM. Suppose that f and its derivatives $f', f'', \ldots, f^{(n)}$
are continuous on [a, b] to R. Then

$f(b) = f(a) + \frac{f'(a)}{1!}(b - a) + \cdots + \frac{f^{(n-1)}(a)}{(n-1)!}(b - a)^{n-1} + R_n,$

where the remainder is given by

(31.2)  $R_n = \frac{1}{(n-1)!} \int_a^b (b - t)^{n-1} f^{(n)}(t) \, dt.$


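PROOF. Integrate $R_n$ by parts to obtain

$R_n = \frac{1}{(n-1)!} \left\{ (b - t)^{n-1} f^{(n-1)}(t) \Big|_{t=a}^{t=b} + (n-1) \int_a^b (b - t)^{n-2} f^{(n-1)}(t) \, dt \right\}$

$\quad = -\frac{f^{(n-1)}(a)}{(n-1)!}(b - a)^{n-1} + \frac{1}{(n-2)!} \int_a^b (b - t)^{n-2} f^{(n-1)}(t) \, dt.$

Continuing to integrate by parts in this way, we obtain the stated formula.
Q.E.D.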
Instead of the formula (31.2), it is often convenient to make the change of
variable t = (1 - s)a + sb, for s in [0, 1], and to obtain the formula

(31.3)  $R_n = \frac{(b - a)^n}{(n-1)!} \int_0^1 (1 - s)^{n-1} f^{(n)}[a + (b - a)s] \, ds.$

This form of the remainder can be extended to the case where f has domain
in $R^p$ and range in $R^q$.
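
The two forms (31.2) and (31.3) of the remainder can be compared numerically. The sketch below is only an illustration under our own assumptions: it takes f = exp (so that every derivative is again exp), a = 0, b = 1, n = 4, and approximates the integrals by a hand-written composite Simpson rule; the names `simpson` and `taylor_check` are hypothetical helpers.

```python
import math


def simpson(f, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def taylor_check(a, b, n):
    """Return f(b) minus the Taylor polynomial, and the remainders (31.2), (31.3)."""
    # f = exp, so f^(k)(a) = exp(a) for every k.
    poly = sum(math.exp(a) * (b - a) ** k / math.factorial(k) for k in range(n))
    # Remainder in the form (31.2).
    r_312 = simpson(lambda t: (b - t) ** (n - 1) * math.exp(t), a, b)
    r_312 /= math.factorial(n - 1)
    # Remainder in the form (31.3), after the substitution t = (1 - s)a + sb.
    r_313 = simpson(lambda s: (1 - s) ** (n - 1) * math.exp(a + (b - a) * s), 0.0, 1.0)
    r_313 *= (b - a) ** n / math.factorial(n - 1)
    return math.exp(b) - poly, r_312, r_313


direct, r_312, r_313 = taylor_check(0.0, 1.0, 4)
print(direct, r_312, r_313)
```

All three printed numbers should agree up to the quadrature error of the Simpson rule.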

Integrals Depending on a Parameter


It is often important to consider integrals in which the integrands depend
on a parameter. In such cases one desires to have conditions assuring the
continuity, the differentiability, and the integrability of the resulting func-
tion. The next few results are useful in this connection.
Let D be the rectangle in R × R given by

$D = \{(x, t) : a \le x \le b, \ c \le t \le d\},$

and suppose that f is continuous on D to R. Then it is easily seen (cf.
Exercise 22.G) that, for each fixed t in [c, d], the function which sends x into
f(x, t) is continuous on [a, b] and, therefore, Riemann integrable. We
define F for t in [c, d] by the formula

(31.4)  $F(t) = \int_a^b f(x, t) \, dx.$


It will first be proved that F is continuous.

31.6 THEOREM. If f is continuous on D to R and if F is defined by (31.4),
then F is continuous on [c, d] to R.
PROOF. The Uniform Continuity Theorem 23.3 implies that if $\varepsilon > 0$, then
there exists a $\delta(\varepsilon) > 0$ such that if t and $t_0$ belong to [c, d] and $|t - t_0| < \delta(\varepsilon)$,
then

$|f(x, t) - f(x, t_0)| < \varepsilon,$

for all x in [a, b]. It follows from Lemma 30.5 that

$|F(t) - F(t_0)| = \left| \int_a^b \{f(x, t) - f(x, t_0)\} \, dx \right| \le \int_a^b |f(x, t) - f(x, t_0)| \, dx \le \varepsilon (b - a),$

which establishes the continuity of F. Q.E.D.


In the next two results, we shall make use of the notion of the partial
derivative of a function of two real variables. This concept, familiar to the
reader from calculus, will be discussed further in Chapter VII.
31.7 THEOREM. If f and its partial derivative $f_t$ are continuous on D to R,
then the function F defined by (31.4) has a derivative on [c, d] and

(31.5)  $F'(t) = \int_a^b f_t(x, t) \, dx.$


PROOF. From the uniform continuity of $f_t$ on D we infer that if $\varepsilon > 0$, then
there is a $\delta(\varepsilon) > 0$ such that if $|t - t_0| < \delta(\varepsilon)$, then

$|f_t(x, t) - f_t(x, t_0)| < \varepsilon$

for all x in [a, b]. Let t, $t_0$ satisfy this condition and apply the Mean Value
Theorem 27.6 to obtain a $t_1$ (which may depend on x and lies between t and
$t_0$) such that

$f(x, t) - f(x, t_0) = (t - t_0) f_t(x, t_1).$

Combining these two relations, we infer that if $0 < |t - t_0| < \delta(\varepsilon)$, then

$\left| \frac{f(x, t) - f(x, t_0)}{t - t_0} - f_t(x, t_0) \right| < \varepsilon,$

for all x in [a, b]. By applying Lemma 30.5, we obtain the estimate

$\left| \frac{F(t) - F(t_0)}{t - t_0} - \int_a^b f_t(x, t_0) \, dx \right| \le \int_a^b \left| \frac{f(x, t) - f(x, t_0)}{t - t_0} - f_t(x, t_0) \right| dx \le \varepsilon (b - a),$

which establishes equation (31.5). Q.E.D.
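
Formula (31.5) lends itself to a quick numerical check. The following sketch assumes, purely for illustration, the integrand f(x, t) = cos(xt) on [a, b] = [0, 1]; it compares a central difference quotient of F with the integral of $f_t(x, t) = -x \sin(xt)$. The helper names and the Simpson quadrature are our own choices.

```python
import math


def simpson(f, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def F(t):
    """F(t) as in (31.4) for the assumed integrand f(x, t) = cos(x t) on [0, 1]."""
    return simpson(lambda x: math.cos(x * t), 0.0, 1.0)


def F_prime_by_31_5(t):
    """Right side of (31.5): the integral of f_t(x, t) = -x sin(x t) over [0, 1]."""
    return simpson(lambda x: -x * math.sin(x * t), 0.0, 1.0)


t0, h = 1.3, 1e-5
difference_quotient = (F(t0 + h) - F(t0 - h)) / (2 * h)
print(difference_quotient, F_prime_by_31_5(t0))
```

For this integrand F(t) = sin(t)/t in closed form, so both printed values can also be compared with $(t \cos t - \sin t)/t^2$ at t = 1.3.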
Sometimes the parameter ¢ enters in the limits of integration as well as in
the integrand. The next result considers this possibility. In its proof we
shall make use of a very special case of the Chain Rule (to be discussed in
Chapter VII) which will be familiar to the reader.
31.8 LEIBNIZ'S FORMULA. Suppose that f and $f_t$ are continuous on D to R
and that $\alpha$ and $\beta$ are functions which are differentiable on the interval [c, d]
and have values in [a, b]. If $\varphi$ is defined on [c, d] by

(31.6)  $\varphi(t) = \int_{\alpha(t)}^{\beta(t)} f(x, t) \, dx,$

then $\varphi$ has a derivative for each t in [c, d] which is given by

(31.7)  $\varphi'(t) = f(\beta(t), t)\beta'(t) - f(\alpha(t), t)\alpha'(t) + \int_{\alpha(t)}^{\beta(t)} f_t(x, t) \, dx.$

PROOF. Let H be defined for (u, v, t) by

$H(u, v, t) = \int_v^u f(x, t) \, dx,$

when u, v belong to [a, b] and t belongs to [c, d]. The function $\varphi$ defined in
(31.6) is the composition given by $\varphi(t) = H(\beta(t), \alpha(t), t)$. Applying the
Chain Rule, we have

$\varphi'(t) = H_u(\beta(t), \alpha(t), t)\beta'(t) + H_v(\beta(t), \alpha(t), t)\alpha'(t) + H_t(\beta(t), \alpha(t), t).$

According to the Differentiation Theorem 30.7,

$H_u(u, v, t) = f(u, t), \qquad H_v(u, v, t) = -f(v, t),$

and from the preceding theorem, we have

$H_t(u, v, t) = \int_v^u f_t(x, t) \, dx.$

If we substitute $u = \beta(t)$ and $v = \alpha(t)$, then we obtain the formula (31.7).
Q.E.D.
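
Leibniz's Formula (31.7) can be tested numerically in the same spirit. In the sketch below every concrete choice is an assumption made for illustration: f(x, t) = sin(xt), $\alpha(t) = t$, $\beta(t) = t^2$, and the point t = 1.2. The derivative computed from (31.7) is compared with a central difference quotient of $\varphi$.

```python
import math


def simpson(f, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def f(x, t):
    return math.sin(x * t)


def f_t(x, t):
    return x * math.cos(x * t)      # partial derivative of f with respect to t


def alpha(t):
    return t                        # lower limit, with alpha'(t) = 1


def beta(t):
    return t * t                    # upper limit, with beta'(t) = 2t


def phi(t):
    """phi(t) as in (31.6)."""
    return simpson(lambda x: f(x, t), alpha(t), beta(t))


def phi_prime_leibniz(t):
    """Right side of (31.7): two boundary terms plus the integral of f_t."""
    boundary = f(beta(t), t) * (2 * t) - f(alpha(t), t) * 1.0
    interior = simpson(lambda x: f_t(x, t), alpha(t), beta(t))
    return boundary + interior


t0, h = 1.2, 1e-5
difference_quotient = (phi(t0 + h) - phi(t0 - h)) / (2 * h)
print(difference_quotient, phi_prime_leibniz(t0))
```

The boundary terms come from $H_u$ and $H_v$ in the proof, and the integral term from $H_t$.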
If f is continuous on D to R and if F is defined by formula (31.4), then it
was proved in Theorem 31.6 that F is continuous and hence Riemann
integrable on the interval [c, d]. We now show that this hypothesis of
continuity is sufficient to insure that we may interchange the order of
integration. In formulas, this may be expressed as

(31.8)  $\int_c^d \left\{ \int_a^b f(x, t) \, dx \right\} dt = \int_a^b \left\{ \int_c^d f(x, t) \, dt \right\} dx.$


31.9 INTERCHANGE THEOREM. If f is continuous on D with values in R,
then formula (31.8) is valid.
PROOF. Theorem 31.6 and the Integrability Theorem 30.2 imply that
both of the iterated integrals appearing in (31.8) exist; it remains only to
establish their equality. Since f is uniformly continuous on D, if $\varepsilon > 0$ there
exists a $\delta(\varepsilon) > 0$ such that if $|x - x'| < \delta(\varepsilon)$ and $|t - t'| < \delta(\varepsilon)$, then
$|f(x, t) - f(x', t')| < \varepsilon$. Let n be chosen so large that $(b - a)/n < \delta(\varepsilon)$ and
$(d - c)/n < \delta(\varepsilon)$ and divide D into $n^2$ equal rectangles by dividing [a, b] and
[c, d] each into n equal parts. For j = 0, 1, ..., n, we let

$x_j = a + (b - a)j/n, \qquad t_j = c + (d - c)j/n.$

We can write the integral on the left of (31.8) in the form of the sum

$\sum_{k=1}^{n} \sum_{j=1}^{n} \int_{t_{k-1}}^{t_k} \left\{ \int_{x_{j-1}}^{x_j} f(x, t) \, dx \right\} dt.$

Applying the First Mean Value Theorem 30.6 twice, we infer that there
exists a number $x_j'$ in $[x_{j-1}, x_j]$ and a number $t_k'$ in $[t_{k-1}, t_k]$ such that

$\int_{t_{k-1}}^{t_k} \left\{ \int_{x_{j-1}}^{x_j} f(x, t) \, dx \right\} dt = f(x_j', t_k')(x_j - x_{j-1})(t_k - t_{k-1}).$

Hence we have

$\int_c^d \left\{ \int_a^b f(x, t) \, dx \right\} dt = \sum_{k=1}^{n} \sum_{j=1}^{n} f(x_j', t_k')(x_j - x_{j-1})(t_k - t_{k-1}).$

The same line of reasoning, applied to the integral on the right of (31.8),
yields the existence of numbers $x_j''$ in $[x_{j-1}, x_j]$ and $t_k''$ in $[t_{k-1}, t_k]$ such that

$\int_a^b \left\{ \int_c^d f(x, t) \, dt \right\} dx = \sum_{k=1}^{n} \sum_{j=1}^{n} f(x_j'', t_k'')(x_j - x_{j-1})(t_k - t_{k-1}).$

Since both $x_j'$ and $x_j''$ belong to $[x_{j-1}, x_j]$ and $t_k'$, $t_k''$ belong to $[t_{k-1}, t_k]$, we
conclude from the uniform continuity of f that the two double sums, and
therefore the two iterated integrals, differ by at most $\varepsilon (b - a)(d - c)$. Since
$\varepsilon > 0$ is arbitrary, the equality of these integrals is confirmed. Q.E.D.
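
The equality (31.8) can be observed numerically by evaluating both iterated integrals. The sketch below assumes, for illustration only, f(x, t) = x cos(xt) on D = [0, 1] × [1, 2], for which the inner t-integral can be done by hand and gives a closed form for comparison; `simpson` is a hypothetical helper.

```python
import math


def simpson(f, a, b, m=200):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for k in range(1, m):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def f(x, t):
    return x * math.cos(x * t)


a, b, c, d = 0.0, 1.0, 1.0, 2.0

# Left side of (31.8): integrate with respect to x first, then t.
left = simpson(lambda t: simpson(lambda x: f(x, t), a, b), c, d)

# Right side of (31.8): integrate with respect to t first, then x.
right = simpson(lambda x: simpson(lambda t: f(x, t), c, d), a, b)

# Closed form: the t-integral of x cos(x t) over [c, d] is sin(x d) - sin(x c),
# and the integral of sin(x d) over [0, 1] is (1 - cos d) / d.
exact = (1 - math.cos(d)) / d - (1 - math.cos(c)) / c

print(left, right, exact)
```

With a fixed product quadrature rule the two orders amount to the same weighted double sum, which mirrors the comparison of double sums in the proof.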

The Riesz Representation Theorem†


We shall conclude this section with a deep theorem which, although it will
not be applied below, plays a very important role in functional analysis.
First it will be convenient to collect some results which have already been
established or are direct consequences of what we had done.
Let J = [a, b] be a closed cell in R, let C(J) denote the vector space of all
continuous functions on J to R, and let $\|f\|_J$ be the norm on C(J) defined by

$\|f\|_J = \sup \{|f(x)| : x \in J\}.$

A linear functional on C(J) is a linear function $G : C(J) \to R$ defined on the
vector space C(J); hence

$G(\alpha f_1 + \beta f_2) = \alpha G(f_1) + \beta G(f_2)$

for all $\alpha$, $\beta$ in R and $f_1$, $f_2$ in C(J). A linear functional G on C(J) is said to be
positive if for each $f \in C(J)$ with $f(x) \ge 0$, $x \in J$, we have

$G(f) \ge 0.$

A linear functional G on C(J) is said to be bounded if there exists $M \ge 0$
such that

$|G(f)| \le M \|f\|_J$

for all $f \in C(J)$.
31.10 LEMMA. If g is a monotone increasing function on J and if G is
defined for f in C(J) by

(31.9)  $G(f) = \int_a^b f \, dg,$

then G is a bounded positive linear functional on C(J).
PROOF. It follows from Theorem 29.5(a) and Theorem 30.2 that G is a
linear function on C(J) and from Lemma 30.5 that G is bounded with
M = g(b) - g(a). If f belongs to C(J) and $f(x) \ge 0$ for $x \in J$, then taking
m = 0 in formula (30.4) we conclude that $G(f) \ge 0$. Q.E.D.

† The rest of this section may be omitted on a first reading.

We shall now show that, conversely, every bounded positive linear
functional on C(J) is generated by the Riemann-Stieltjes integral with
respect to some monotone increasing function g. This is a form of the
celebrated “Riesz Representation Theorem,” which is one of the keystones
for the subject of “functional analysis” and has many far-reaching generali-
zations and applications. The theorem was proved by the great Hungarian
mathematician Frederic Riesz.†
31.11 RIESZ REPRESENTATION THEOREM. If G is a bounded positive
linear functional on C(J), then there exists a monotone increasing function g
on J such that

(31.9)  $G(f) = \int_a^b f \, dg$

for every f in C(J).
PROOF. We shall first define a monotone increasing function g and then
show that (31.9) holds.
There exists a constant M such that if $0 \le f_1(x) \le f_2(x)$ for all x in J, then
$0 \le G(f_1) \le G(f_2) \le M \|f_2\|_J$. If t is any real number such that a < t < b, and
if n is a sufficiently large natural number, we let $\varphi_{t,n}$ be the function (see
Figure 31.2) in C(J) defined by

(31.10)  $\varphi_{t,n}(x) = 1$ for $a \le x \le t$,
         $\varphi_{t,n}(x) = 1 - n(x - t)$ for $t < x \le t + 1/n$,
         $\varphi_{t,n}(x) = 0$ for $t + 1/n < x \le b$.

Figure 31.2. Graph of $\varphi_{t,n}$.

It is readily seen that if $n \le m$, then for each t with a < t < b,

$0 \le \varphi_{t,m}(x) \le \varphi_{t,n}(x) \le 1,$

so that the sequence $(G(\varphi_{t,n}) : n \in N)$ is a bounded decreasing sequence of
real numbers which converges to a real number. We define g(t) to be equal
to this limit. If $a < t \le s < b$ and $n \in N$, then

$0 \le \varphi_{t,n}(x) \le \varphi_{s,n}(x) \le 1,$

whence it follows that $g(t) \le g(s)$. We define g(a) = 0 and if $\varphi_{b,n}$ denotes
the function $\varphi_{b,n}(x) = 1$, $x \in J$, then we set $g(b) = G(\varphi_{b,n})$. If a < t < b and n
is sufficiently large, then for all x in J we have

$0 \le \varphi_{t,n}(x) \le \varphi_{b,n}(x) = 1,$

so that $g(a) = 0 \le G(\varphi_{t,n}) \le G(\varphi_{b,n}) = g(b)$. This shows that $g(a) \le g(t) \le g(b)$
and completes the construction of the monotone increasing function g.

† FREDERIC RIESZ (1880-1955), a brilliant Hungarian mathematician, was one of the founders
of topology and functional analysis. He also made beautiful contributions to potential,
ergodic, and integration theory.

If f is continuous on J and $\varepsilon > 0$, there is a $\delta(\varepsilon) > 0$ such that if
$|x - y| < \delta(\varepsilon)$ and $x, y \in J$, then $|f(x) - f(y)| < \varepsilon$. Since f is integrable with respect to g,
there exists a partition $P_\varepsilon$ of J such that if Q is a refinement of $P_\varepsilon$, then for any
Riemann-Stieltjes sum, we have

$\left| \int_a^b f \, dg - S(Q; f, g) \right| < \varepsilon.$

Now let $P = (t_0, t_1, \ldots, t_m)$ be a partition of J into distinct points which is a
refinement of $P_\varepsilon$ such that $\sup \{t_k - t_{k-1}\} < \tfrac{1}{2} \delta(\varepsilon)$ and let n be a natural
number so large that

$2/n < \inf \{t_k - t_{k-1}\}.$

Then only consecutive intervals

(31.11)  $[t_0, t_1 + 1/n], \ \ldots, \ [t_{k-1}, t_k + 1/n], \ \ldots, \ [t_{m-1}, t_m]$

have any points in common. (See Figure 31.3.) For each k = 1, ..., m, the
decreasing sequence $(G(\varphi_{t_k,n}))$ converges to $g(t_k)$ and hence we may suppose
that n is so large that

(31.12)  $g(t_k) \le G(\varphi_{t_k,n}) \le g(t_k) + \varepsilon / (m \|f\|_J).$


We now consider the function f* defined on J by

(31.13)  $f^*(x) = f(t_1)\varphi_{t_1,n}(x) + \sum_{k=2}^{m} f(t_k)\{\varphi_{t_k,n}(x) - \varphi_{t_{k-1},n}(x)\}.$

Figure 31.3. Graph of $\varphi_{t_k,n} - \varphi_{t_{k-1},n}$.

An element x in J either belongs to one or two intervals in (31.11). If it
belongs to one interval, then we must have $t_0 \le x < t_1$ and $f^*(x) = f(t_1)$ or we
have $t_{k-1} + (1/n) < x < t_k$ for some k = 1, 2, ..., m in which case
$f^*(x) = f(t_k)$. (See Figure 31.4.) Hence

$|f(x) - f^*(x)| < \varepsilon.$

Figure 31.4. Graphs of f and f*.

If the x belongs to two intervals in (31.11), then $t_k \le x \le t_k + 1/n$ for some
k = 1, ..., m - 1 and we infer that

$f^*(x) = f(t_k)\varphi_{t_k,n}(x) + f(t_{k+1})\{1 - \varphi_{t_k,n}(x)\}.$

If we refer to the definition of the $\varphi$'s in (31.10), we have

$f^*(x) = f(t_k)(1 - n(x - t_k)) + f(t_{k+1}) n(x - t_k).$

Since $|x - t_k| < \delta(\varepsilon)$ and $|x - t_{k+1}| < \delta(\varepsilon)$, we conclude that

$|f(x) - f^*(x)| \le |f(x) - f(t_k)|(1 - n(x - t_k)) + |f(x) - f(t_{k+1})| n(x - t_k) < \varepsilon \{1 - n(x - t_k) + n(x - t_k)\} = \varepsilon.$

Consequently, we have the estimate

$\|f - f^*\|_J = \sup \{|f(x) - f^*(x)| : x \in J\} \le \varepsilon.$


Since G is a bounded linear functional on C(J), it follows that

(31.14)  $|G(f) - G(f^*)| \le M \varepsilon.$

In view of relation (31.12) we see that

$\left| \{G(\varphi_{t_k,n}) - G(\varphi_{t_{k-1},n})\} - \{g(t_k) - g(t_{k-1})\} \right| \le \varepsilon / (m \|f\|_J)$

for k = 2, 3, ..., m. Applying G to the function f* defined by equation
(31.13) and recalling that $g(t_0) = 0$, we obtain

$\left| G(f^*) - \sum_{k=1}^{m} f(t_k)\{g(t_k) - g(t_{k-1})\} \right| \le \varepsilon.$

But the second term on the left side is a Riemann-Stieltjes sum S(P; f, g) for
f with respect to g corresponding to the partition P which is a refinement of
$P_\varepsilon$. Hence we have

$\left| \int_a^b f \, dg - G(f^*) \right| \le \left| \int_a^b f \, dg - S(P; f, g) \right| + |S(P; f, g) - G(f^*)| < 2\varepsilon.$

Finally, using relation (31.14), we find that

(31.15)  $\left| \int_a^b f \, dg - G(f) \right| < (M + 2)\varepsilon.$

Since $\varepsilon > 0$ is arbitrary and the left side of (31.15) does not depend on it, we
conclude that

$G(f) = \int_a^b f \, dg.$  Q.E.D.
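
The construction of g in the proof can be carried out by machine for simple functionals. The sketch below is an illustration under our own assumptions: J = [0, 1], and two sample functionals that do not come from the text, namely the ordinary integral G(f) = $\int_0^1 f(x)\,dx$ and the point evaluation G(f) = f(1/2). It evaluates $G(\varphi_{t,n})$ for increasing n, where $\varphi_{t,n}$ is the function of (31.10).

```python
def phi(t, n):
    """The function phi_{t,n} of (31.10) on J = [0, 1]."""
    def value(x):
        if x <= t:
            return 1.0
        if x <= t + 1.0 / n:
            return 1.0 - n * (x - t)
        return 0.0
    return value


def G_integral(f, m=20000):
    """Sample functional G(f) = integral of f over [0, 1] (midpoint rule)."""
    h = 1.0 / m
    return h * sum(f((k + 0.5) * h) for k in range(m))


def G_point(f):
    """Sample functional G(f) = f(1/2), which is bounded, positive and linear."""
    return f(0.5)


for n in (10, 100, 1000):
    # For the integral, G(phi_{t,n}) = t + 1/(2n), which decreases to g(t) = t.
    # For the point evaluation the limit is 0 if t < 1/2 and 1 if t >= 1/2.
    print(n, G_integral(phi(0.3, n)), [G_point(phi(t, n)) for t in (0.4, 0.5, 0.6)])
```

In the first case the limit function is g(t) = t, and in the second it is a unit step at t = 1/2 which is continuous from the right.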


For some purposes it is important to know that there is a one-one
correspondence between bounded positive linear functionals on C(J) and
certain normalized monotone increasing functions. Our construction can
be checked to show that it yields an increasing function g such that g(a) = 0
and g is continuous from the right at every interior point of J. With these
additional conditions, there is a one-one correspondence between positive
functionals and increasing functions.

Exercises
31.A. If a > 0, show directly that

$\lim_n \int_0^a e^{-nx} \, dx = 0.$

Which of the results of this section apply?
31.B. If 0 < a < 2, show that

$\lim_n \int_a^2 e^{-nx^2} \, dx = 0.$

What happens if a = 0?
31.C. Discuss $\lim_n \int_0^1 nx(1 - x)^n \, dx$.
31.D. If a > 0, show that

$\lim_n \int_a^\pi \frac{\sin nx}{nx} \, dx = 0.$

What happens if a = 0?
31.E. Let $f_n(x) = nx(1 + nx)^{-1}$ for $x \in [0, 1]$, and let f(x) = 0 for x = 0 and f(x) = 1
for $x \in (0, 1]$. Show that $f_n(x) \to f(x)$ for all $x \in [0, 1]$ and that

$\int_0^1 f_n(x) \, dx \to \int_0^1 f(x) \, dx.$

31.F. Let $h_n(x) = nxe^{-nx^2}$ for $x \in [0, 1]$ and let h(x) = 0. Show that

$0 = \int_0^1 h(x) \, dx \ne \lim_n \int_0^1 h_n(x) \, dx = \tfrac{1}{2}.$

31.G. Let $(g_n)$ be a sequence of increasing functions on [a, b] which converges
uniformly to a function g on [a, b]. If an increasing function f is integrable with
respect to $g_n$ for all $n \in N$, show that f is integrable with respect to g and

$\int_a^b f \, dg = \lim_n \int_a^b f \, dg_n.$

31.H. Give an example to show that the conclusion in the preceding exercise may
fail if the convergence is not uniform.
31.I. If $\alpha > 0$, show that $\int_0^1 t^\alpha (\log t)^2 \, dt = 2/(\alpha + 1)^3$.
31.J. Suppose that f and its partial derivative $f_t$ are continuous for (x, t) in
[a, b] × [c, d]. Apply the Interchange Theorem 31.9 to

$\int_c^t \left\{ \int_a^b f_t(x, t) \, dx \right\} dt, \qquad c \le t \le d,$

and differentiate to obtain another proof of Theorem 31.7.
31.K. Use the Fundamental Theorem 30.8 to show that if a sequence $(f_n)$ of
functions converges on J to a function f and if the derivatives $(f_n')$ are continuous
and converge uniformly on J to a function g, then f' exists and equals g. (This
result is less general than Theorem 28.5, but it is easier to establish.)
31.L. Let $\{r_1, r_2, \ldots, r_n, \ldots\}$ be an enumeration of the rational numbers in I. Let
$f_n$ be defined to be 1 if $x \in \{r_1, \ldots, r_n\}$ and to be 0 otherwise. Then $f_n$ is Riemann
integrable on I and the sequence $(f_n)$ converges monotonely to the Dirichlet
discontinuous function f (which equals 1 on $I \cap Q$ and equals 0 on $I \setminus Q$). Hence the
monotone limit of a sequence of Riemann integrable functions does not need to be
Riemann integrable.
31.M. Let g be a fixed monotone increasing function on J = [a, b]. If f is any
function which is integrable with respect to g on J, then we define $\|f\|_1$ by

$\|f\|_1 = \int_a^b |f| \, dg.$

Show that the following "norm properties" are satisfied:
(a) $\|f\|_1 \ge 0$;
(b) If f(x) = 0 for all $x \in J$, then $\|f\|_1 = 0$;
(c) If $c \in R$, then $\|cf\|_1 = |c| \, \|f\|_1$;
(d) $\left| \|f\|_1 - \|h\|_1 \right| \le \|f \pm h\|_1 \le \|f\|_1 + \|h\|_1$.
However, it is possible to have $\|f\|_1 = 0$ without having f(x) = 0 for all $x \in J$. (Can this
occur when g(x) = x?)
31.N. If g is monotone increasing on J, and if f and $f_n$, $n \in N$, are functions which
are integrable with respect to g, then we say that the sequence $(f_n)$ converges in
mean (with respect to g) in case

$\|f_n - f\|_1 \to 0.$

(The notation here is the same as in the preceding exercise.) Show that if $(f_n)$
converges in mean to f, then

$\int_a^b f_n \, dg \to \int_a^b f \, dg.$

Prove that if a sequence $(f_n)$ of integrable functions converges uniformly on J to f,
then it also converges in mean to f. In fact,

$\|f_n - f\|_1 \le \{g(b) - g(a)\} \|f_n - f\|_J.$

However, if $f_n$ denotes the function in Example 31.1, and if $g_n = (1/n) f_n$, then the
sequence $(g_n)$ converges in mean [with respect to g(x) = x] to the zero function, but
the convergence is not uniform on I.
31.O. Let g(x) = x on J = [0, 2] and let $(I_n)$ be a sequence of closed intervals in J
such that (i) the length of $I_n$ is 1/n, (ii) $I_n \cap I_{n+1} = \emptyset$, and (iii) every point x in J
belongs to infinitely many of the $I_n$. Let $f_n$ be defined by

$f_n(x) = 1$ for $x \in I_n$,
$f_n(x) = 0$ for $x \notin I_n$.

Prove that the sequence $(f_n)$ converges in mean [with respect to g(x) = x] to the
zero function on J, but that the sequence $(f_n)$ does not converge uniformly. Indeed,
the sequence $(f_n)$ does not converge at any point!
31.P. Let g be monotone increasing on J = [a, b]. If f and h are integrable with
respect to g on J to R, we define the inner product (f, h) of f and h by the formula

$(f, h) = \int_a^b f(x) h(x) \, dg(x).$

Verify that all of the properties of Definition 8.3 are satisfied except (ii). If f = h is
the zero function on J, then (f, f) = 0; however, it may happen that (f, f) = 0 for a
function f which does not vanish everywhere on J.
31.Q. Define $\|f\|_2$ to be

$\|f\|_2 = \left\{ \int_a^b |f(x)|^2 \, dg(x) \right\}^{1/2},$

so that $\|f\|_2 = (f, f)^{1/2}$. Establish the Schwarz Inequality

$|(f, h)| \le \|f\|_2 \, \|h\|_2$

(Theorem 8.7 and 8.8). Show that the Norm Properties 8.5 hold, except that
$\|f\|_2 = 0$ does not imply that f(x) = 0 for all x in J. Show that
$\|f\|_1 \le \{g(b) - g(a)\}^{1/2} \|f\|_2$.
31.R. Let f and $f_n$, $n \in N$, be integrable on J with respect to an increasing
function g. We say that the sequence $(f_n)$ converges in mean square (with respect
to g on J) to f if $\|f_n - f\|_2 \to 0$.
(a) Show that if the sequence is uniformly convergent on J, then it also converges
in mean square to the same function.
(b) Show that if the sequence converges in mean square, then it converges in
mean to the same function.
(c) Show that Exercise 31.O proves that convergence in mean square does not
imply convergence at any point of J.
(d) If, in Exercise 31.O, we take $I_n$ to have length $1/n^2$ and if we set $h_n = n f_n$, then
the sequence $(h_n)$ converges in mean, but does not converge in mean square, to the
zero function.
31.S. Show that, if the nth derivative $f^{(n)}$ is continuous on [a, b], then the Integral
Form of Taylor's Theorem 31.5 and the First Mean Value Theorem 30.9 can be
used to obtain the Lagrange form of the remainder given in 28.6.
31.T. If $J_1 = [a, b]$, $J_2 = [c, d]$, and if f is continuous on $J_1 \times J_2$ to R and g is
Riemann integrable on $J_1$, then the function F, defined on $J_2$ by

$F(t) = \int_a^b f(x, t) g(x) \, dx,$

is continuous on $J_2$.
31.U. Let g be an increasing function on $J_1 = [a, b]$ to R and for each fixed t in
$J_2 = [c, d]$, suppose that the integral

$F(t) = \int_a^b f(x, t) \, dg(x)$

exists. If the partial derivative $f_t$ is continuous on $J_1 \times J_2$, then the derivative F'
exists on $J_2$ and is given by

$F'(t) = \int_a^b f_t(x, t) \, dg(x).$
31.V. Let $J_1 = [a, b]$ and $J_2 = [c, d]$. Assume that the real valued function g is
monotone on $J_1$, that h is monotone on $J_2$, and that f is continuous on $J_1 \times J_2$.
Define G on $J_2$ and H on $J_1$ by

$G(t) = \int_a^b f(x, t) \, dg(x), \qquad H(x) = \int_c^d f(x, t) \, dh(t).$

Show that G is integrable with respect to h on $J_2$, that H is integrable with respect
to g on $J_1$, and that

$\int_c^d G(t) \, dh(t) = \int_a^b H(x) \, dg(x).$

We can write this last equation in the form

$\int_c^d \left\{ \int_a^b f(x, t) \, dg(x) \right\} dh(t) = \int_a^b \left\{ \int_c^d f(x, t) \, dh(t) \right\} dg(x).$


31.W. Let f, $J_1$, and $J_2$ be as in Exercise 31.V. If $\varphi$ is in $C(J_1)$ (that is, $\varphi$ is a
continuous function on $J_1$ to R), let $T(\varphi)$ be the function defined on $J_2$ by the
formula

$T(\varphi)(t) = \int_a^b f(x, t) \varphi(x) \, dx.$

Show that T is a linear transformation of $C(J_1)$ into $C(J_2)$ in the sense that if $\varphi$, $\psi$
belong to $C(J_1)$, then
(a) $T(\varphi)$ belongs to $C(J_2)$,
(b) $T(\varphi + \psi) = T(\varphi) + T(\psi)$,
(c) $T(c\varphi) = cT(\varphi)$ for $c \in R$.
If $M = \sup \{|f(x, t)| : (x, t) \in J_1 \times J_2\}$, then T is bounded in the sense that
(d) $\|T(\varphi)\|_{J_2} \le M \|\varphi\|_{J_1}$ for $\varphi \in C(J_1)$.
31.X. Continuing the notation of the preceding exercise, show that if r > 0, then
T sends the collection

$B_r = \{\varphi \in C(J_1) : \|\varphi\|_{J_1} \le r\}$

into a uniformly equicontinuous set of functions in $C(J_2)$ (see Definition 28.6).
Therefore, if $(\varphi_n)$ is any sequence of functions in $B_r$, there is a subsequence $(\varphi_{n_k})$
such that the sequence $(T(\varphi_{n_k}))$ converges uniformly on $J_2$.
31.Y. Let $J_1$ and $J_2$ be as before and let f be continuous on $R \times J_2$ into R. If $\varphi$ is
in $C(J_1)$, let $S(\varphi)$ be the function defined on $J_2$ by the formula

$S(\varphi)(t) = \int_a^b f(\varphi(x), t) \, dx.$

Show that $S(\varphi)$ belongs to $C(J_2)$, but that, in general, S is not a linear transformation
in the sense of Exercise 31.W. However, show that S sends the collection $B_r$
of Exercise 31.X into a uniformly equicontinuous set of functions in $C(J_2)$. Also, if
$(\varphi_n)$ is any sequence in $B_r$, there is a subsequence such that $(S(\varphi_{n_k}))$ converges
uniformly on $J_2$. (This result is important in the theory of non-linear integral
equations.)
31.Z. Show that if we define $G_0$, $G_1$, $G_2$ for f in C(I) by

$G_0(f) = f(0), \qquad G_1(f) = 2 \int_0^{1/2} f(x) \, dx, \qquad G_2(f) = \tfrac{1}{2}\{f(0) + f(1)\},$

then $G_0$, $G_1$, and $G_2$ are bounded positive linear functionals on C(I). Give
monotone increasing functions $g_0$, $g_1$, $g_2$ which represent these linear functionals as
Riemann-Stieltjes integrals. Show that the choice of these $g_j$ is not uniquely
determined unless one requires that $g_j(0) = 0$ and that $g_j$ is continuous from the
right at each interior point of I.

Project
31.α. This project establishes the existence of a unique solution of a first order
differential equation under the presence of a Lipschitz condition. Let $\Omega \subseteq R^2$ be
open and let $f : \Omega \to R$ be continuous and satisfy the Lipschitz condition:
$|f(x, y) - f(x, y')| \le K |y - y'|$ for all points (x, y), (x, y') in $\Omega$. Let I be a closed cell

$I = \{(x, y) : |x - a| \le \alpha, \ |y - b| \le \beta\}$

contained in $\Omega$ and suppose that $M\alpha \le \beta$, where $|f(x, y)| \le M$ for $(x, y) \in I$.
(a) If $J = [a - \alpha, a + \alpha]$, define $\varphi_0(x) = b$ for $x \in J$ and, if $n \in N$, define

$\varphi_n(x) = b + \int_a^x f(t, \varphi_{n-1}(t)) \, dt$

for $x \in J$. Prove by induction that the sequence $(\varphi_n)$ is well-defined on J and that
(i) $|\varphi_n(x) - b| \le \beta$,
(ii) $|\varphi_{n+1}(x) - \varphi_n(x)| \le \frac{M K^n}{(n+1)!} |x - a|^{n+1}$
for all $x \in J$.
(b) Show that each of the functions $\varphi_n$ is continuous on J and that the sequence
$(\varphi_n)$ converges uniformly on J to a function $\varphi$.
(c) Conclude that the function $\varphi$ is continuous on J, satisfies $\varphi(a) = b$, and

$\varphi(x) = b + \int_a^x f(t, \varphi(t)) \, dt$

for all $x \in J$. Deduce that $\varphi$ is differentiable on J and satisfies

$\varphi'(x) = f(x, \varphi(x))$ for $x \in J$.

(d) If $\psi$ is continuous on J and satisfies

$\psi(a) = b, \qquad \psi'(x) = f(x, \psi(x))$

for all $x \in J$, show that

$\psi(x) = b + \int_a^x f(t, \psi(t)) \, dt$ for $x \in J$.

(e) If $\varphi$ is as in (c) and $\psi$ is as in (d), show by induction that

$|\varphi(x) - \psi(x)| \le K \left| \int_a^x |\varphi(t) - \psi(t)| \, dt \right| \le \frac{K^n}{n!} \|\varphi - \psi\|_J \, |x - a|^n.$

Hence $\|\varphi - \psi\|_J \le \|\varphi - \psi\|_J \, K^n \alpha^n / n!$, whence it follows that $\varphi(x) = \psi(x)$ for all $x \in J$.
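
The successive approximations $\varphi_n$ of part (a) can be computed numerically. The following sketch is only an illustration under stated assumptions: it works on the right half $[a, a + \alpha]$ of J, replaces the integral by a cumulative trapezoid rule on a uniform grid, and uses the sample equation y' = y with a = 0, b = 1, $\alpha$ = 1/2 (so that K = 1, and with $\beta$ = 1 one has M = 2 and $M\alpha \le \beta$). The function name `picard` is hypothetical.

```python
import math


def picard(f, a, b, alpha, iterations, m=1000):
    """Return the grid on [a, a + alpha] and the values of phi_n on that grid.

    phi_0 is the constant b, and phi_n(x) = b + integral from a to x of
    f(t, phi_{n-1}(t)) dt, with the integral replaced by a trapezoid rule.
    """
    h = alpha / m
    xs = [a + k * h for k in range(m + 1)]
    phi = [b] * (m + 1)                       # phi_0(x) = b
    for _ in range(iterations):
        values = [f(x, y) for x, y in zip(xs, phi)]
        new_phi = [b]                         # phi_n(a) = b
        for k in range(1, m + 1):
            new_phi.append(new_phi[-1] + 0.5 * h * (values[k - 1] + values[k]))
        phi = new_phi
    return xs, phi


# Sample problem (an assumption for illustration): y' = y, y(0) = 1.
xs, phi = picard(lambda x, y: y, 0.0, 1.0, 0.5, 15)
error = max(abs(p - math.exp(x)) for x, p in zip(xs, phi))
print(error)
```

For this equation the exact iterates are the Taylor polynomials of exp, so the error that remains after many iterations is essentially the discretization error of the trapezoid rule.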

Section 32. Improper and Infinite Integrals

In the preceding three sections we have had two standing assumptions:
we required the functions to be bounded and we required the domain of
integration to be compact. If either of these hypotheses is dropped, the
foregoing integration theory does not apply without some change. Since
there are a number of important applications where it is desirable to permit
one or both of these new phenomena, we shall indicate here the changes
that are to be made.

Unbounded Functions

Let J = [a, b] be an interval in R and let f be a real-valued function
which is defined at least for x satisfying $a < x \le b$. If f is Riemann
integrable on the interval [c, b] for each c satisfying $a < c \le b$, let

(32.1)  $I_c = \int_c^b f.$

We shall define the improper integral of f over J = [a, b] to be the limit of $I_c$
as $c \to a$.
32.1 DEFINITION. Suppose that the Riemann integral in (32.1) exists
for each c in (a, b]. Suppose that there exists a real number I such that for
every $\varepsilon > 0$ there is a $\delta(\varepsilon) > 0$ such that if $a < c < a + \delta(\varepsilon)$ then $|I_c - I| < \varepsilon$.
In this case we say that I is the improper integral of f over J = [a, b] and
we sometimes denote the value I of this improper integral by

(32.2)  $\int_{a+}^b f$  or by  $\int_{a+}^b f(x) \, dx,$

although it is more usual not to write the plus signs in the lower limit.
32.2 EXAMPLES. (a) Suppose the function f is defined on (a, b] and is
bounded on this interval. If f is Riemann integrable on every interval
[c, b] with $a < c \le b$, then it is easily seen (Exercise 32.A) that the
improper integral (32.2) exists. Thus the function f(x) = sin (1/x) has an
improper integral on the interval [0, 1].
(b) If f(x) = 1/x for x on (0, 1] and if c is in (0, 1] then it follows from the
Fundamental Theorem 30.8 and the fact that f is the derivative of the
logarithm that

$I_c = \int_c^1 f = \log 1 - \log c = -\log c.$

Since log c becomes unbounded as $c \to 0$, the improper integral of f on
[0, 1] does not exist.
(c) Let $f(x) = x^\alpha$ for x in (0, 1]. If $\alpha < 0$, the function is continuous but
not bounded on (0, 1]. If $\alpha \ne -1$, then f is the derivative of

$g(x) = \frac{1}{\alpha + 1} x^{\alpha + 1}.$

It follows from the Fundamental Theorem 30.8 that

$\int_c^1 x^\alpha \, dx = \frac{1}{\alpha + 1} (1 - c^{\alpha + 1}).$

If $\alpha$ satisfies $-1 < \alpha < 0$, then $c^{\alpha + 1} \to 0$ as $c \to 0$, and f has an improper
integral. On the other hand, if $\alpha < -1$, then $c^{\alpha + 1}$ does not have a (finite)
limit as $c \to 0$, and hence f does not have an improper integral.
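
The behaviour described in Examples 32.2(b) and (c) can be tabulated directly from the closed forms for $I_c$. The short Python sketch below does this for three sample exponents; the function name `partial_integral` and the chosen values of $\alpha$ and c are ours, for illustration only.

```python
import math


def partial_integral(alpha, c):
    """I_c = integral of x**alpha over [c, 1] for 0 < c <= 1, in closed form."""
    if alpha == -1.0:
        return -math.log(c)                              # Example 32.2(b)
    return (1.0 - c ** (alpha + 1.0)) / (alpha + 1.0)    # Example 32.2(c)


for alpha in (-0.5, -1.0, -2.0):
    # Only for -1 < alpha < 0 do the values settle down as c decreases to 0.
    print(alpha, [partial_integral(alpha, c) for c in (1e-2, 1e-4, 1e-8)])
```

For $\alpha = -1/2$ the values approach $1/(\alpha + 1) = 2$, while for $\alpha = -1$ and $\alpha = -2$ they grow without bound.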
The preceding discussion pertained to a function which is not defined or
not bounded at the left end point of the interval. It is obvious how to treat
analogous behavior at the right end point. Somewhat more interesting is
the case where the function is not defined or not bounded at an interior
point of the interval. Suppose that p is an interior point of [a, b] and that f
is defined at every point of [a, b] except perhaps p. If both of the improper
integrals

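$\int_a^{p-} f, \qquad \int_{p+}^b f$

exist, then we define the improper integral of f over [a, b] to be their sum.
In the limit notation, we define the improper integral of f over [a, b] to be

(32.3)  $\lim_{\varepsilon \to 0+} \int_a^{p - \varepsilon} f(x) \, dx + \lim_{\delta \to 0+} \int_{p + \delta}^b f(x) \, dx.$

It is clear that if those two limits exist, then the single limit

(32.4)  $\lim_{\varepsilon \to 0+} \left\{ \int_a^{p - \varepsilon} f(x) \, dx + \int_{p + \varepsilon}^b f(x) \, dx \right\}$
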

also exists and has the same value. However, the existence of the limit
(32.4) does not imply the existence of (32.3). For example, if f is defined
for $x \in [-1, 1]$, $x \ne 0$, by $f(x) = 1/x^3$, then it is easily seen that

$\int_{-1}^{-\varepsilon} \frac{1}{x^3} \, dx + \int_{\varepsilon}^{1} \frac{1}{x^3} \, dx = \left( \frac{1}{2} - \frac{1}{2\varepsilon^2} \right) + \left( \frac{1}{2\varepsilon^2} - \frac{1}{2} \right) = 0$

for all $\varepsilon$ satisfying $0 < \varepsilon < 1$. However, we have seen in Example 32.2(c)
that if $\alpha = -3$, then the improper integrals

$\int_{-1}^{0-} \frac{1}{x^3} \, dx, \qquad \int_{0+}^{1} \frac{1}{x^3} \, dx$

do not exist.
The preceding comments show that the limit in (32.4) may exist without
the limit in (32.3) existing. We defined the improper integral (which is
sometimes called the Cauchy integral) of f to be given by (32.3). The limit
in (32.4) is also of interest and is called the Cauchy principal value of the
integral and denoted by

(CPV) $\int_a^b f(x) \, dx.$


It is clear that a function which has a finite number of points where it is not
defined or bounded can be treated by breaking the interval into subintervals
with these points as end points.
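
The distinction between (32.3) and (32.4) is easy to see numerically for $f(x) = 1/x^3$ on [-1, 1]. The sketch below uses the antiderivative $-1/(2x^2)$ to evaluate the two one-sided pieces; the function names and the choice $\delta = 2\varepsilon$ for the unsymmetric case are assumptions made for illustration.

```python
def left_part(eps):
    """Integral of 1/x**3 over [-1, -eps], from the antiderivative -1/(2 x**2)."""
    return 0.5 - 1.0 / (2.0 * eps ** 2)


def right_part(delta):
    """Integral of 1/x**3 over [delta, 1]."""
    return 1.0 / (2.0 * delta ** 2) - 0.5


for eps in (1e-1, 1e-2, 1e-3):
    symmetric = left_part(eps) + right_part(eps)          # the sum inside (32.4)
    lopsided = left_part(eps) + right_part(2.0 * eps)     # delta = 2 eps, as (32.3) allows
    print(eps, symmetric, lopsided)
```

The symmetric sums vanish for every $\varepsilon$, so the Cauchy principal value is 0, whereas the unsymmetric sums equal $-3/(8\varepsilon^2)$ and are unbounded.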

Infinite Integrals
It is important to extend the integral to certain functions which are
defined on unbounded sets. For example, if f is defined on $\{x \in R : x \ge a\}$
to R and is Riemann integrable over [a, c] for every c > a, we let $I_c$ be the
partial integral given by

(32.5)  $I_c = \int_a^c f.$

We shall now define the "infinite integral" of f for $x \ge a$ to be the limit of
$I_c$ as c increases.
32.3 DEFINITION. If f is Riemann integrable over [a, c] for each c > a,
let $I_c$ be the partial integral given by (32.5). A real number I is said to be
the infinite integral of f over $\{x : x \ge a\}$ if for every $\varepsilon > 0$, there exists a real
number $M(\varepsilon)$ such that if $c > M(\varepsilon)$ then $|I - I_c| < \varepsilon$. In this case we denote
I by

(32.6)  $\int_a^{+\infty} f$  or  $\int_a^{+\infty} f(x) \, dx.$

It should be remarked that infinite integrals are sometimes called
"improper integrals of the first kind." We prefer the present terminology,
which is due to Hardy,† for it is both simpler and parallel to the
terminology used in connection with infinite series.
32.4 EXAMPLES. (a) If f(x) = 1/x for $x \ge a > 0$, then the partial
integrals are

$I_c = \int_a^c \frac{1}{x} \, dx = \log c - \log a.$

Since log c becomes unbounded as $c \to +\infty$, the infinite integral of f does
not exist.
(b) Let $f(x) = x^\alpha$ for $x \ge a > 0$ and $\alpha \ne -1$. Then

$I_c = \int_a^c x^\alpha \, dx = \frac{1}{\alpha + 1} (c^{\alpha + 1} - a^{\alpha + 1}).$

If $\alpha > -1$, then $\alpha + 1 > 0$ and the infinite integral does not exist. However,
if $\alpha < -1$, then

$\int_a^{+\infty} x^\alpha \, dx = -\frac{a^{\alpha + 1}}{\alpha + 1}.$

(c) Let $f(x) = e^{-x}$ for $x \ge 0$. Then

$\int_0^c e^{-x} \, dx = -(e^{-c} - 1);$

hence the infinite integral of f over $\{x : x \ge 0\}$ exists and equals 1.
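
The partial integrals in Examples 32.4 are given by closed formulas, so their behaviour for large c can be tabulated at once. In the following sketch the function names and the sample values a = 1, $\alpha = -2$ are our own choices for illustration.

```python
import math


def I_reciprocal(a, c):
    """Partial integral of 1/x over [a, c], Example 32.4(a)."""
    return math.log(c) - math.log(a)


def I_power(alpha, a, c):
    """Partial integral of x**alpha over [a, c] for alpha != -1, Example 32.4(b)."""
    return (c ** (alpha + 1.0) - a ** (alpha + 1.0)) / (alpha + 1.0)


def I_exp(c):
    """Partial integral of exp(-x) over [0, c], Example 32.4(c)."""
    return -(math.exp(-c) - 1.0)


for c in (1e1, 1e3, 1e6):
    print(c, I_reciprocal(1.0, c), I_power(-2.0, 1.0, c), I_exp(c))
```

The first column of values is unbounded, the second approaches $-a^{\alpha+1}/(\alpha+1) = 1$, and the third approaches 1.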


It is also possible to consider the integral of a function defined on all of
R. In this case we require that f be Riemann integrable over every
finite interval in R and consider the limits

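(32.7a)  $\int_{-\infty}^a f(x) \, dx = \lim_{b \to -\infty} \int_b^a f(x) \, dx,$

(32.7b)  $\int_a^{+\infty} f(x) \, dx = \lim_{c \to +\infty} \int_a^c f(x) \, dx.$

It is easily seen that if both of these limits exist for one value of a, then they
both exist for all values of a. In this case we define the infinite integral of f
over R to be the sum of these two infinite integrals:

(32.8)  $\int_{-\infty}^{+\infty} f(x) \, dx = \lim_{b \to -\infty} \int_b^a f(x) \, dx + \lim_{c \to +\infty} \int_a^c f(x) \, dx.$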

† GEOFFREY H. HARDY (1877-1947) was professor at Cambridge and long-time dean of
British mathematics. He made many and deep contributions to mathematical analysis.

As in the case of the improper integral, the existence of both of the limits in
(32.8) implies the existence of the limit

(32.9)  $\lim_{c \to +\infty} \left\{ \int_{-c}^a f(x) \, dx + \int_a^c f(x) \, dx \right\} = \lim_{c \to +\infty} \int_{-c}^c f(x) \, dx,$

and the equality of (32.8) and (32.9). The limit in (32.9), when it exists, is
often called the Cauchy principal value of the infinite integral over R and is
denoted by

(32.10)  (CPV) $\int_{-\infty}^{+\infty} f(x) \, dx.$

However, the existence of the Cauchy principal value does not imply the
existence of the infinite integral (32.8). This is seen by considering f(x) = x,
whence

$\int_{-c}^c x \, dx = \tfrac{1}{2}(c^2 - c^2) = 0$

for all c. Thus the Cauchy principal value of the infinite integral for
f(x) = x exists and equals 0, but the infinite integral of this function does
not exist, since neither of the infinite integrals in (32.7) exists.

Existence of the Infinite Integral

We now obtain a few conditions for the existence of the infinite integral
over the set $\{x : x \ge a\}$. These results can also be applied to give
conditions for the infinite integral over R, since the latter involves consideration
of infinite integrals over the sets $\{x : x \le a\}$ and $\{x : x \ge a\}$. First we state
the Cauchy Criterion.
32.5 CAUCHY CRITERION. Suppose that f is integrable over [a, c] for all
$c \ge a$. Then the infinite integral

$\int_a^{+\infty} f$

exists if and only if for every $\varepsilon > 0$ there exists a $K(\varepsilon)$ such that if
$b \ge c \ge K(\varepsilon)$, then

(32.11)  $\left| \int_c^b f \right| < \varepsilon.$

PROOF. The necessity of the condition is established in the usual
manner. Suppose that the condition is satisfied and let $I_n$ be the partial integral
defined for $n \in N$ by

$I_n = \int_a^{a+n} f.$

It is seen that $(I_n)$ is a Cauchy sequence of real numbers. If $I = \lim (I_n)$ and
$\varepsilon > 0$, then there exists $N(\varepsilon)$ such that if $n \ge N(\varepsilon)$, then $|I - I_n| < \varepsilon$. Let
$M(\varepsilon) = \sup \{K(\varepsilon), a + N(\varepsilon)\} + 1$ and let $c \ge M(\varepsilon)$. Then there exists a
natural number $n \ge N(\varepsilon)$ such that $K(\varepsilon) \le a + n \le c$. Therefore the
partial integral $I_c$ is given by

$I_c = \int_a^c f = \int_a^{a+n} f + \int_{a+n}^c f,$

whence it follows that $|I - I_c| < 2\varepsilon$. Q.E.D.
In the important case where f(x) ≥ 0 for all x ≥ a, the next result
provides a useful test.
32.6 THEOREM. Suppose that f(x) ≥ 0 for all x ≥ a and that f is integrable
over [a, c] for all c ≥ a. Then the infinite integral of f exists if and
only if the set {I_c : c ≥ a} is bounded. In this case

    \[ \int_a^{+\infty} f = \sup\left\{\int_a^c f : c \ge a\right\}. \]

PROOF. The hypothesis that f(x) ≥ 0 implies that I_c is a monotone
increasing function of c. Therefore, the existence of lim I_c is equivalent
to the boundedness of {I_c : c ≥ a}. Q.E.D.

32.7 COMPARISON TEST. Suppose that f and g are integrable over [a, c]
for all c ≥ a and that |f(x)| ≤ g(x) for all x ≥ a. If the infinite integral
of g exists, then the infinite integral of f exists and

    \[ \left|\int_a^{+\infty} f\right| \le \int_a^{+\infty} g. \]
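PROOF. If a ≤ c ≤ b, then it follows from Lemma 30.5 that |f| is integrable
on [c, b] and that

    \[ \left|\int_c^b f\right| \le \int_c^b |f| \le \int_c^b g. \]

It follows from the Cauchy Criterion 32.5 that the infinite integrals of f and
|f| exist. Moreover, we have

    \[ \left|\int_a^{+\infty} f\right| \le \int_a^{+\infty} |f| \le \int_a^{+\infty} g. \]

Q.E.D.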
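
The Comparison Test can be illustrated numerically. The following is a minimal sketch in Python, not part of the original text: it approximates the partial integrals over [1, c] of f(x) = 1/(1+x²) and g(x) = 1/x² and lets one observe that the former stay below the latter, which never exceed 1. The helper names `simpson` and `partial_integrals`, the use of a composite Simpson rule, and the step count are assumptions made only for this illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def partial_integrals(c):
    """Approximate the partial integrals over [1, c] of f = 1/(1+x^2) and g = 1/x^2."""
    f = lambda x: 1.0 / (1.0 + x * x)
    g = lambda x: 1.0 / (x * x)
    return simpson(f, 1.0, c), simpson(g, 1.0, c)


if __name__ == "__main__":
    for c in (2.0, 10.0, 100.0):
        i_f, i_g = partial_integrals(c)
        # Direct evaluation of the partial integral of f: Arc tan c - Arc tan 1.
        exact = math.atan(c) - math.atan(1.0)
        print(f"c={c:6.1f}  I_c(f)={i_f:.6f}  I_c(g)={i_g:.6f}  exact={exact:.6f}")
```

A finite computation of this kind proves nothing about convergence; it only makes the monotone, bounded behaviour of the partial integrals visible.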

32.8 LIMIT COMPARISON TEST. Suppose that f and g are non-negative
and integrable over [a, c] for all c ≥ a and that

(32.12)    \[ \lim_{x\to+\infty} \frac{f(x)}{g(x)} \ne 0. \]

Then both or neither of the infinite integrals $\int_a^{+\infty} f$, $\int_a^{+\infty} g$ exist.



PROOF. In view of the relation (32.12) we infer that there exist positive
numbers A < B and K ≥ a such that

    \[ A\,g(x) \le f(x) \le B\,g(x) \quad\text{for } x \ge K. \]

The Comparison Test 32.7 and this relation show that both or neither of
the infinite integrals $\int_K^{+\infty} f$, $\int_K^{+\infty} g$ exist. Since both f and g are
integrable on [a, K], the statement follows. Q.E.D.
32.9 DIRICHLET'S TEST. Suppose that f is continuous for x ≥ a, that the
partial integrals

    \[ I_c = \int_a^c f, \qquad c \ge a, \]

are bounded, and that φ is monotone decreasing to zero as x → +∞. Then
the infinite integral $\int_a^{+\infty} f\varphi$ exists.
PROOF. Let A be a bound for the set {|I_c| : c ≥ a}. If ε > 0, let K(ε) be
such that if x ≥ K(ε), then 0 ≤ φ(x) < ε/(2A). If b ≥ c ≥ K(ε), then it
follows from Exercise 30.N that there exists a number ξ in [c, b] such that

    \[ \int_c^b f\varphi = \varphi(c)\int_c^{\xi} f. \]

In view of the estimate

    \[ \left|\int_c^{\xi} f\right| = |I_\xi - I_c| \le 2A, \]

it follows that

    \[ \left|\int_c^b f\varphi\right| < \varepsilon \]

when b ≥ c both exceed K(ε). We can then apply the Cauchy Criterion
32.5. Q.E.D.
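
The estimate in the proof of Dirichlet's Test can be checked numerically in a special case. The sketch below, which is an illustration and not part of the original text, takes f(x) = sin x and φ(x) = 1/x^p, for which the partial integrals of f differ by at most 2 in absolute value, so the proof gives |∫_c^b fφ| ≤ 2φ(c). The function names `simpson`, `tail_piece`, and `dirichlet_bound`, and the Simpson-rule quadrature, are assumptions introduced here.

```python
import math


def simpson(f, a, b, n=20000):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def tail_piece(p, c, b):
    """Approximate the integral of sin(x) / x**p over [c, b]."""
    return simpson(lambda x: math.sin(x) / x ** p, c, b)


def dirichlet_bound(p, c):
    """Bound phi(c) * 2 from the proof, since |cos c - cos xi| <= 2."""
    return 2.0 / c ** p


if __name__ == "__main__":
    for p in (0.5, 1.0):
        for c, b in ((10.0, 50.0), (50.0, 200.0)):
            print(p, c, b, abs(tail_piece(p, c, b)), dirichlet_bound(p, c))
```

The bound depends only on c, which is what makes the Cauchy Criterion applicable even when 0 < p ≤ 1 and the integrand is not absolutely integrable.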
32.10 EXAMPLES. (a) If f(x) = 1/(1+x²) and g(x) = 1/x² for x ≥ a > 0,
then 0 < f(x) ≤ g(x). Since we have already seen in Example 32.4(b) that
the infinite integral $\int_1^{+\infty} (1/x^2)\,dx$ exists, it follows from the Comparison
Test 32.7 that the infinite integral $\int_1^{+\infty} (1/(1+x^2))\,dx$ also exists. (This
could be shown directly by noting that

    \[ \int_1^c \frac{1}{1+x^2}\,dx = \operatorname{Arc\,tan} c - \operatorname{Arc\,tan} 1 \]

and that Arc tan c → π/2 as c → +∞.)


(b) If h(x) = e^{−x²} and g(x) = e^{−x}, then 0 ≤ h(x) ≤ g(x) for x ≥ 1. It was
seen in Example 32.4(c) that the infinite integral $\int_0^{+\infty} e^{-x}\,dx$ exists, whence it
follows from the Comparison Test 32.7 that the infinite integral $\int_0^{+\infty} e^{-x^2}\,dx$

also exists. This time, a direct evaluation of the partial integrals is not
possible using elementary functions. However, it will be seen later that
this infinite integral equals ½√π.
(c) Let p > 0 and consider the existence of the infinite integral

    \[ \int_1^{+\infty} \frac{\sin x}{x^p}\,dx. \]

If p > 1, then the integrand is dominated by 1/x^p, which was seen in
Example 32.4(b) to be convergent. In this case the Comparison Test
implies that the infinite integral converges. If 0 < p ≤ 1, this argument
fails; however, if we set f(x) = sin x and φ(x) = 1/x^p, then Dirichlet's Test
32.9 shows that the infinite integral exists.
(d) Let f(x) = sin x² for x ≥ 0 and consider the Fresnel† Integral

    \[ \int_0^{+\infty} \sin x^2\,dx. \]

It is clear that the integral over [0, 1] exists, so we shall examine only the
integral over {x : x ≥ 1}. If we make the substitution t = x² and apply the
Change of Variable Theorem 30.12, we obtain

    \[ \int_1^c \sin x^2\,dx = \frac12\int_1^{c^2} \frac{\sin t}{\sqrt{t}}\,dt. \]

The preceding example shows that the integral on the right converges when
c → +∞; hence it follows that $\int_1^{+\infty} \sin x^2\,dx$ exists. (It should be observed
that the integrand does not converge to 0 as x → +∞.)
(e) Suppose that α ≥ 1 and let Γ(α) be defined by the integral

(32.13)    \[ \Gamma(\alpha) = \int_0^{+\infty} e^{-x} x^{\alpha-1}\,dx. \]

In order to see that this infinite integral exists, consider the function
g(x) = 1/x² for x ≥ 1. Since

    \[ \lim_{x\to+\infty} \frac{e^{-x}x^{\alpha-1}}{x^{-2}} = \lim_{x\to+\infty} e^{-x}x^{\alpha+1} = 0, \]

it follows that if ε > 0 then there exists K(ε) such that

    \[ 0 < e^{-x}x^{\alpha-1} \le \varepsilon\,x^{-2} \quad\text{for } x \ge K(\varepsilon). \]

Since the infinite integral $\int_K^{+\infty} x^{-2}\,dx$ exists, we infer that the integral (32.13)
also converges. The important function defined for α ≥ 1 by formula
(32.13) is called the Gamma function. It will be quickly seen that if α < 1,

† AUGUSTIN FRESNEL (1788-1827), a French mathematical physicist, helped to reestablish the
undulatory theory of light which was introduced earlier by Huygens.

then the integrand e^{−x}x^{α−1} becomes unbounded near x = 0. However, if α
satisfies 0 < α < 1, then we have seen in Example 32.2(c) that the function
x^{α−1} has an improper integral over the interval [0, 1]. Since 0 < e^{−x} ≤ 1 for
all x ≥ 0, it is readily established that the improper integral

    \[ \int_{0+}^{1} e^{-x}x^{\alpha-1}\,dx \]

exists when 0 < α < 1. Hence we can extend the definition of the Gamma
function to be given for all α > 0 by an integral of the form of (32.13)
provided it is interpreted as a sum

    \[ \int_{0+}^{1} e^{-x}x^{\alpha-1}\,dx + \int_{1}^{+\infty} e^{-x}x^{\alpha-1}\,dx \]

of an improper integral and an infinite integral.

Absolute Convergence
If f is Riemann integrable on [a, c] for every c ≥ a, then it follows from
Theorem 30.4(a) that |f|, the absolute value of f, is also Riemann integrable
on [a, c] for c ≥ a. It follows from the Comparison Test 32.7 that
if the infinite integral

(32.14)    \[ \int_a^{+\infty} |f(x)|\,dx \]

exists, then the infinite integral

(32.15)    \[ \int_a^{+\infty} f(x)\,dx \]

also exists and is bounded in absolute value by (32.14).
32.11 DEFINITION. If the infinite integral (32.14) exists, then we say
that f is absolutely integrable over {x : x ≥ a}, or that the infinite integral
(32.15) is absolutely convergent.
We have remarked that if f is absolutely integrable over {x : x ≥ a}, then
the infinite integral (32.15) exists. The converse is not true, however, as
may be seen by considering the integral

    \[ \int_{\pi}^{+\infty} \frac{\sin x}{x}\,dx. \]

The convergence of this integral was established in Example 32.10(c).
However, it is easily seen that in each interval [kπ, (k + 1)π], k ∈ N, there
is a subinterval of length b > 0 on which

    \[ |\sin x| \ge \tfrac12. \]

(In fact, we can take b = 2π/3.) Therefore, we have

    \[ \int_{\pi}^{k\pi} \left|\frac{\sin x}{x}\right| dx \ge \frac{b}{2}\left\{\frac{1}{2\pi} + \frac{1}{3\pi} + \cdots + \frac{1}{k\pi}\right\}, \]

whence it follows (see 16.11(c)) that the function f(x) = sin x/x is not
absolutely integrable over {x : x ≥ π}.
We observe that the Comparison Test 32.7 in fact establishes the absolute
convergence of the infinite integral of f over the interval [a, +∞).
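
The failure of absolute integrability of sin x/x can be made visible numerically. The following sketch is an illustration only and is not part of the original text: it approximates the integral of |sin x/x| over [π, kπ], one interval [jπ, (j+1)π] at a time, and compares it with the harmonic-type lower bound (b/2){1/(2π) + ... + 1/(kπ)} with b = 2π/3. The names `simpson`, `abs_integral`, and `lower_bound`, and the quadrature parameters, are assumptions.

```python
import math


def simpson(f, a, b, n=200):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def abs_integral(k):
    """Approximate the integral of |sin x / x| over [pi, k*pi], interval by interval."""
    total = 0.0
    for j in range(1, k):
        total += simpson(lambda x: abs(math.sin(x)) / x, j * math.pi, (j + 1) * math.pi)
    return total


def lower_bound(k):
    """(b/2) * (1/(2 pi) + ... + 1/(k pi)) with b = 2 pi / 3."""
    b = 2.0 * math.pi / 3.0
    return (b / 2.0) * sum(1.0 / (j * math.pi) for j in range(2, k + 1))


if __name__ == "__main__":
    for k in (5, 50, 500):
        print(k, abs_integral(k), lower_bound(k))
```

Both columns grow roughly like a multiple of log k, in line with the divergence of the harmonic series.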

Exercises

32.A. Suppose that f is a bounded real-valued function on J = [a, b] and that f is
integrable over [c, b] for all c > a. Prove that the improper integral of f over J
exists.
32.B. Suppose that f is integrable over [c, b] for all c > a and that the improper
integral $\int_{a+}^{b} |f|$ exists. Show that the improper integral $\int_{a+}^{b} f$ exists, but that the
converse may not be true.
32.C. Suppose that f and g are integrable on [c, b] for all c ∈ (a, b). If |f(x)| ≤
g(x) for x ∈ J = [a, b] and if g has an improper integral on J, then so does f.
32.D. Discuss the convergence or the divergence of the following improper
integrals:

© | ‘dx
aitan. ) | ‘dx
gyn
' xdx ‘logx
(c) i Gx)’ (d) i Te dx,

© [2Pee © [ae
x dx

32.E. Determine the values of p and q for which the following integrals con-
verge:

(a) [ea-a dx, (b) [xin x)" dx,


(c) [dog x)? dx, (d) [108 x)? dx.

32.F. Discuss the convergence or the divergence of the following integrals.


Which are absolutely convergent?

(a) lao (b) [s “as,

() [eC as, @ {~css dx,


“x sin x * sin x sin 2x
(e) [ T+x? dx, (f) { xy dx.

32.G. For what values of p and q are the following integrals convergent? For
what values are they absolutely convergent?
“= xP ™ sinx
(a) { Tan & (b) { xi dx,

* sin x? * 1—cosx
(c) << dx, (d) i x dx.

32.H. If f is integrable on any interval [0, c] for c > 0, show that the infinite
integral $\int_0^{+\infty} f$ exists if and only if the infinite integral $\int_5^{+\infty} f$ exists.
32.I. Give an example where the infinite integral $\int_0^{+\infty} f$ exists but where f is not
bounded on the set {x : x ≥ 0}.
32.J. If f is monotone and the infinite integral $\int_0^{+\infty} f$ exists, then x f(x) → 0 as
x → +∞.

Section 33. Uniform Convergence and Infinite Integrals

In many applications it is important to consider infinite integrals in which
the integrand depends on a parameter. In order to handle this situation
easily, the notion of uniform convergence of the integral relative to the
parameter is of prime importance. We shall first treat the case that the
parameter belongs to an interval J = [α, β].
33.1 DEFINITION. Let f be a real-valued function, defined for (x, t)
satisfying x ≥ a and α ≤ t ≤ β. Suppose that for each t in J = [α, β] the
infinite integral

(33.1)    \[ F(t) = \int_a^{+\infty} f(x, t)\,dx \]

exists. We say that this convergence is uniform on J if for every ε > 0
there exists a number M(ε) such that if c ≥ M(ε) and t ∈ J, then

    \[ \left| F(t) - \int_a^{c} f(x, t)\,dx \right| < \varepsilon. \]

The distinction between ordinary convergence of the infinite integrals
given in (33.1) and uniform convergence is that M(ε) can be chosen to be
independent of the value of t in J. We leave it to the reader to write out
the definition of uniform convergence of the infinite integrals when the
parameter t belongs to the set {t : t ≥ α} or to the set N.
It is useful to have some tests for uniform convergence of the infinite
integral.
33.2 CAUCHY CRITERION. Suppose that for each t ∈ J, the infinite integral
(33.1) exists. Then the convergence is uniform on J if and only if for

each ε > 0 there is a number K(ε) such that if b ≥ c ≥ K(ε) and t ∈ J, then

(33.2)    \[ \left| \int_c^b f(x, t)\,dx \right| < \varepsilon. \]

We leave the proof as an exercise.


33.3 WEIERSTRASS M-TEST. Suppose that f is Riemann integrable over
[a, c] for all c ≥ a and all t ∈ J. Suppose that there exists a positive function
M defined for x ≥ a such that

    \[ |f(x, t)| \le M(x) \quad\text{for } x \ge a,\ t \in J, \]

and such that the infinite integral $\int_a^{+\infty} M(x)\,dx$ exists. Then, for each t ∈ J,
the integral in (33.1) is (absolutely) convergent and the convergence is uniform
on J.
PROOF. The convergence of

    \[ \int_a^{+\infty} |f(x, t)|\,dx \quad\text{for } t \in J, \]

is an immediate consequence of the Comparison Test and the hypotheses.
Therefore, the integral yielding F(t) is absolutely convergent for t ∈ J. If
we use the Cauchy Criterion together with the estimate

    \[ \left| \int_c^b f(x, t)\,dx \right| \le \int_c^b |f(x, t)|\,dx \le \int_c^b M(x)\,dx, \]

we can readily establish the uniform convergence on J. Q.E.D.


The Weierstrass M-test is useful when the convergence is absolute as
well as uniform, but it is not delicate enough to handle the case of
non-absolute uniform convergence. For this, we turn to an analogue of
Dirichlet's Test 32.9.
33.4 DIRICHLET'S TEST. Let f be continuous in (x, t) for x ≥ a and t in J
and suppose that there exists a constant A such that

    \[ \left| \int_a^c f(x, t)\,dx \right| \le A \quad\text{for } c \ge a,\ t \in J. \]

Suppose that for each t ∈ J, the function φ(x, t) is monotone decreasing for
x ≥ a, and converges to 0 as x → +∞ uniformly for t ∈ J. Then the integral

    \[ F(t) = \int_a^{+\infty} f(x, t)\,\varphi(x, t)\,dx \]

converges uniformly on J.
PROOF. Let ε > 0 and choose K(ε) such that if x ≥ K(ε) and t ∈ J, then
φ(x, t) < ε/(2A). If b ≥ c ≥ K(ε), then it follows from Exercise 30.N that

for each t ∈ J, there exists a number ξ(t) in [c, b] such that

    \[ \int_c^b f(x, t)\,\varphi(x, t)\,dx = \varphi(c, t)\int_c^{\xi(t)} f(x, t)\,dx. \]

Therefore, if b ≥ c ≥ K(ε) and t ∈ J, we have

    \[ \left| \int_c^b f(x, t)\,\varphi(x, t)\,dx \right| \le \varphi(c, t)\,2A < \varepsilon, \]

so the uniformity of the convergence follows from the Cauchy Criterion
33.2. Q.E.D.
33.5 EXAMPLES. (a) If f is given by

    \[ f(x, t) = \frac{\cos tx}{1+x^2}, \qquad x \ge 0,\ t \in \mathbf{R}, \]

and if we define M(x) = (1+x²)^{−1}, then |f(x, t)| ≤ M(x). Since the infinite
integral of M on [0, +∞) exists, it follows from the Weierstrass M-test that
the infinite integral

    \[ \int_0^{+\infty} \frac{\cos tx}{1+x^2}\,dx \]

converges uniformly for t ∈ R.


(b) Let f(x, t) = e^{−x}x^t for x ≥ 0, t ≥ 0. It is seen that the integral

    \[ \int_0^{+\infty} e^{-x}x^{t}\,dx \]

converges uniformly for t in an interval [0, β] for any β > 0. However, it
does not converge uniformly on {t ∈ R : t ≥ 0}. (See Exercise 33.A.)
(c) If f(x, t) = e^{−tx} sin x for x ≥ 0 and t ≥ γ > 0, then

    \[ |f(x, t)| \le e^{-tx} \le e^{-\gamma x}. \]

If we set M(x) = e^{−γx}, then the Weierstrass M-test implies that the integral

    \[ \int_0^{+\infty} e^{-tx}\sin x\,dx \]

converges uniformly for t ≥ γ > 0 and an elementary calculation shows that
it converges to (1 + t²)^{−1}. (Note that if t = 0, then the integral no longer
converges.)
(d) Consider the infinite integral

    \[ \int_0^{+\infty} e^{-tx}\,\frac{\sin x}{x}\,dx \quad\text{for } t \ge 0, \]

where we interpret the integrand to be 1 for x = 0. Since the integrand is

dominated by 1, it suffices to show that the integral over ε ≤ x converges
uniformly for t ≥ 0. The Weierstrass M-test does not apply to this integrand.
However, if we take f(x, t) = sin x and φ(x, t) = e^{−tx}/x, then the
hypotheses of Dirichlet's Test are satisfied.
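
The uniform convergence in Example 33.5(c) can be examined with a short computation. The sketch below is an illustration and not part of the original text. It uses the closed form of the partial integral of e^{−tx} sin x over [0, c], obtained by integrating by parts twice, and compares the largest tail over sample values t ≥ γ with the bound e^{−γc}/γ supplied by M(x) = e^{−γx}. The function names and the sample grid are assumptions made for the illustration.

```python
import math


def partial_integral(t, c):
    """Closed form of the integral of exp(-t x) sin x over [0, c]."""
    return (1.0 - math.exp(-t * c) * (t * math.sin(c) + math.cos(c))) / (1.0 + t * t)


def limit_value(t):
    """Value of the infinite integral for t > 0, namely 1 / (1 + t^2)."""
    return 1.0 / (1.0 + t * t)


def worst_tail(c, ts):
    """Largest |F(t) - I_c(t)| over the sample parameter values ts."""
    return max(abs(limit_value(t) - partial_integral(t, c)) for t in ts)


if __name__ == "__main__":
    gamma, c = 0.5, 20.0
    ts = [gamma + 0.1 * k for k in range(200)]
    print("worst tail for t >= gamma:", worst_tail(c, ts))
    print("M-test bound exp(-gamma c)/gamma:", math.exp(-gamma * c) / gamma)
    # For t close to 0 the tail at the same c is still far from small.
    print("tail at t = 0.001:", worst_tail(c, [1e-3]))
```

The last line suggests why the restriction t ≥ γ > 0 matters: for parameter values near 0 no single c makes the tail small.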

Infinite Integrals Depending on a Parameter


Suppose that f is a continuous function of (x, t) defined for x ≥ a and for
t in J = [α, β]. Furthermore, suppose that the infinite integral

(33.1)    \[ F(t) = \int_a^{+\infty} f(x, t)\,dx \]

exists for each t ∈ J. We shall now show that if this convergence is uniform,
then F is continuous on J and its integral can be calculated by interchanging
the order of integration. A similar result will be established for the
derivative.

33.6 THEOREM. Suppose that f is continuous in (x, t) for x ≥ a and t in
J = [α, β] and that the convergence in (33.1) is uniform on J. Then F is
continuous on J.
PROOF. If n ∈ N, let F_n be defined on J by

    \[ F_n(t) = \int_a^{a+n} f(x, t)\,dx. \]

It follows from Theorem 31.6 that F_n is continuous on J. Since the
sequence (F_n) converges to F uniformly on J, it follows from Theorem 24.1
that F is continuous on J. Q.E.D.

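33.7 THEOREM. Under the hypotheses of the preceding theorem,

    \[ \int_\alpha^\beta F(t)\,dt = \int_a^{+\infty}\left\{\int_\alpha^\beta f(x, t)\,dt\right\} dx, \]

which can be written in the form

(33.3)    \[ \int_\alpha^\beta\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt = \int_a^{+\infty}\left\{\int_\alpha^\beta f(x, t)\,dt\right\} dx. \]

PROOF. If F_n is defined as in the preceding proof, then it follows from
Theorem 31.9 that

    \[ \int_\alpha^\beta F_n(t)\,dt = \int_a^{a+n}\left\{\int_\alpha^\beta f(x, t)\,dt\right\} dx. \]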



Since (F_n) converges to F uniformly on J, then Theorem 31.2 implies that

    \[ \int_\alpha^\beta F(t)\,dt = \lim_n \int_\alpha^\beta F_n(t)\,dt. \]

Combining the last two relations, we obtain (33.3). Q.E.D.


33.8 THEOREM. Suppose that f and its partial derivative f_t are continuous
in (x, t) for x ≥ a and t in J = [α, β]. Suppose that (33.1) exists for all
t ∈ J and that

    \[ G(t) = \int_a^{+\infty} f_t(x, t)\,dx \]

is uniformly convergent on J. Then F is differentiable on J and F′ = G. In
symbols:

    \[ \frac{d}{dt}\int_a^{+\infty} f(x, t)\,dx = \int_a^{+\infty} \frac{\partial f}{\partial t}(x, t)\,dx. \]

PROOF. If F_n is defined for t ∈ J to be

    \[ F_n(t) = \int_a^{a+n} f(x, t)\,dx, \]

then it follows from Theorem 31.7 that F_n is differentiable and that

    \[ F_n'(t) = \int_a^{a+n} f_t(x, t)\,dx. \]

By hypothesis, the sequence (F_n) converges on J to F and the sequence
(F_n′) converges uniformly on J to G. It follows from Theorem 28.5 that F
is differentiable on J and that F′ = G. Q.E.D.

33.9 EXAMPLES. (a) We observe that if t > 0, then

    \[ \frac{1}{t} = \int_0^{+\infty} e^{-tx}\,dx \]

and that the convergence is uniform for t ≥ t_0 > 0. If we integrate both
sides of this relation with respect to t over an interval [α, β] where 0 < α <
β, and use Theorem 33.7, we obtain the formula

    \[ \log(\beta/\alpha) = \int_\alpha^\beta \frac{1}{t}\,dt = \int_0^{+\infty}\left\{\int_\alpha^\beta e^{-tx}\,dt\right\} dx = \int_0^{+\infty} \frac{e^{-\alpha x} - e^{-\beta x}}{x}\,dx. \]

(Observe that the last integrand can be defined to be continuous at x = 0.)



(b) Instead of integrating with respect to t, we differentiate and formally
obtain

    \[ \frac{1}{t^2} = \int_0^{+\infty} x\,e^{-tx}\,dx. \]

Since this latter integral converges uniformly with respect to t, provided
t ≥ t_0 > 0, the formula holds for t > 0. By induction we obtain

    \[ \frac{n!}{t^{n+1}} = \int_0^{+\infty} x^{n}e^{-tx}\,dx \quad\text{for } t > 0. \]

Referring to the definition of the Gamma function, given in Example
32.10(e), we see that Γ(n + 1) = n!.
(c) If α > 1 is a real number and x > 0, then x^{α−1} = e^{(α−1) log x}. Hence
f(α, x) = x^{α−1} is a continuous function of (α, x). Moreover, it is seen that
there exists a neighborhood of α on which the integral

    \[ \Gamma(\alpha) = \int_0^{+\infty} x^{\alpha-1}e^{-x}\,dx \]

is uniformly convergent. It follows from Theorem 33.6 that the Gamma
function is continuous at least for α > 1. (If 0 < α ≤ 1, the same conclusion
can be drawn, but the fact that the integral is improper at x = 0 must be
considered.)
(d) Let t ≥ 0 and u ≥ 0 and let F be defined by

    \[ F(u) = \int_0^{+\infty} e^{-tx}\,\frac{\sin ux}{x}\,dx. \]

If t > 0, then this integral is uniformly convergent for u ≥ 0 and so is the
integral

    \[ F'(u) = \int_0^{+\infty} e^{-tx}\cos ux\,dx. \]

Moreover, integration by parts shows that

    \[ \int_0^{A} e^{-tx}\cos ux\,dx = \left[\frac{e^{-tx}\,(u\sin ux - t\cos ux)}{t^2+u^2}\right]_{x=0}^{x=A}. \]

If we let A → +∞, we obtain the formula

    \[ F'(u) = \int_0^{+\infty} e^{-tx}\cos ux\,dx = \frac{t}{t^2+u^2}, \qquad u \ge 0. \]

Therefore, there exists a constant C such that

    \[ F(u) = \operatorname{Arc\,tan}(u/t) + C \quad\text{for } u \ge 0. \]

In order to evaluate the constant C, we use the fact that F(0) = 0 and

Arc tan (0) = 0 and infer that C = 0. Hence, if t > 0 and u ≥ 0, then

    \[ \operatorname{Arc\,tan}(u/t) = \int_0^{+\infty} e^{-tx}\,\frac{\sin ux}{x}\,dx. \]

(e) Now hold u > 0 fixed in the last formula and observe, as in Example
33.5(d), that the integral converges uniformly for t ≥ 0 so that the limit is
continuous for t ≥ 0. If we let t → 0+, we obtain the important formula

(33.4)    \[ \frac{\pi}{2} = \int_0^{+\infty} \frac{\sin ux}{x}\,dx, \qquad u > 0. \]
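
The formula of Example 33.9(a) lends itself to a quick numerical check. The following sketch is an illustration and not part of the original text: it truncates the integral of (e^{−αx} − e^{−βx})/x at a finite cutoff and compares the result with log(β/α). The names `simpson`, `integrand`, and `truncated_integral`, the cutoff 60, and the step count are assumptions made for the illustration; the integrand is extended to x = 0 by its limiting value β − α.

```python
import math


def simpson(f, a, b, n=6000):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def integrand(x, alpha, beta):
    """(exp(-alpha x) - exp(-beta x)) / x, extended by continuity to x = 0."""
    if x == 0.0:
        return beta - alpha
    return (math.exp(-alpha * x) - math.exp(-beta * x)) / x


def truncated_integral(alpha, beta, cutoff=60.0):
    """Approximate the integral over [0, cutoff]; the neglected tail is tiny here."""
    return simpson(lambda x: integrand(x, alpha, beta), 0.0, cutoff)


if __name__ == "__main__":
    for alpha, beta in ((1.0, 2.0), (0.5, 3.0)):
        print(alpha, beta, truncated_integral(alpha, beta), math.log(beta / alpha))
```

For the sample values used, the neglected tail is smaller than e^{−αc}/(αc) with c the cutoff, so truncation does not affect the comparison at the displayed precision.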

Infinite Integrals of Sequences


Let (f_n) be a sequence of real-valued functions which are defined for
x ≥ a. We shall suppose that the infinite integrals $\int_a^{+\infty} f_n$ all exist and that
the limit f(x) = lim (f_n(x)) exists for each x ≥ a. We would like to be able
to conclude that the infinite integral of f exists and that

(33.5)    \[ \int_a^{+\infty} f = \lim_n \int_a^{+\infty} f_n. \]


In Theorem 31.2 it was proved that if a sequence (f_n) of Riemann integrable
functions converges uniformly on an interval [a, c] to a function f,
then f is Riemann integrable and the integral of f is the limit of the integrals
of the f_n. The corresponding result is not necessarily true for infinite
integrals; it will be seen in Exercise 33.J that the limit function need not possess
an infinite integral. Moreover, even if the infinite integral does exist and
both sides of (33.5) have a meaning, the equality may fail (cf. Exercise
33.K). Similarly, the obvious extension of the Bounded Convergence
Theorem 31.3 may fail for infinite integrals. However, there are two
important and useful results which give conditions under which equation
(33.5) holds. In proving them we shall make use of the Bounded Convergence
Theorem 31.3. The first result is a special case of a celebrated
theorem due to Lebesgue. (Since we are dealing with infinite Riemann
integrals, we need to add the hypothesis that the limit function is integrable.
In the more general Lebesgue theory of integration, this
additional hypothesis is not required.)
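
The failure of (33.5) mentioned above can be seen with exact arithmetic for the functions of Exercise 33.K, g_n(x) = 1/n for 0 ≤ x ≤ n² and g_n(x) = 0 for x > n². The sketch below is an illustration and not part of the original text; the function names are assumptions. Each g_n is bounded by 1/n, so the sequence converges to the zero function, while the infinite integral of g_n equals n.

```python
from fractions import Fraction


def g(n, x):
    """g_n(x) = 1/n on [0, n^2] and 0 for x > n^2."""
    return Fraction(1, n) if 0 <= x <= n * n else Fraction(0)


def infinite_integral_of_g(n):
    """Exact infinite integral of g_n over x >= 0: height 1/n times width n^2."""
    return Fraction(1, n) * (n * n)


if __name__ == "__main__":
    for n in (1, 10, 100):
        # The supremum of g_n tends to 0, but the integrals grow without bound.
        print(n, g(n, 0), infinite_integral_of_g(n))
```

The limit function has integral 0, whereas the integrals of the g_n are unbounded, so no dominating function M with a finite infinite integral can exist for this sequence.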

33.10 DOMINATED CONVERGENCE THEOREM. Suppose that (f_n) is a
bounded sequence of real-valued functions, that f(x) = lim (f_n(x)) for all
x ≥ a, and that f and f_n, n ∈ N, are Riemann integrable over [a, c] for all
c > a. Suppose that there exists a function M which has an integral over
x ≥ a and that

    \[ |f_n(x)| \le M(x) \quad\text{for } x \ge a,\ n \in \mathbf{N}. \]



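Then f has an integral over x ≥ a and

(33.5)    \[ \int_a^{+\infty} f = \lim_n \int_a^{+\infty} f_n. \]

PROOF. It follows from the Comparison Test 32.7 that the infinite
integrals

    \[ \int_a^{+\infty} f, \qquad \int_a^{+\infty} f_n, \quad n \in \mathbf{N}, \]

exist. If ε > 0, let K be chosen such that $\int_K^{+\infty} M < \varepsilon$, from which it follows
that

    \[ \left|\int_K^{+\infty} f\right| < \varepsilon \quad\text{and}\quad \left|\int_K^{+\infty} f_n\right| < \varepsilon, \quad n \in \mathbf{N}. \]

Since f(x) = lim (f_n(x)) for all x ∈ [a, K] it follows from the Bounded
Convergence Theorem 31.3 that $\int_a^K f = \lim_n \int_a^K f_n$. Therefore, we have

    \[ \left|\int_a^{+\infty} f - \int_a^{+\infty} f_n\right| \le \left|\int_a^{K} f - \int_a^{K} f_n\right| + 2\varepsilon, \]

which is less than 3ε for sufficiently large n. Q.E.D.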


33.11 MONOTONE CONVERGENCE THEOREM. Suppose that (f_n) is a
bounded sequence of positive functions on {x : x ≥ a} which is monotone
increasing in the sense that f_n(x) ≤ f_{n+1}(x) for n ∈ N and x ≥ a, and such that
f and each f_n has an integral over [a, c] for all c > a. Then the limit function
f has an integral over {x : x ≥ a} if and only if the set $\{\int_a^{+\infty} f_n : n \in \mathbf{N}\}$ is
bounded. In this case

    \[ \int_a^{+\infty} f = \sup_n\left\{\int_a^{+\infty} f_n\right\} = \lim_n \int_a^{+\infty} f_n. \]

PROOF. Since the sequence (f_n) is monotone increasing, we infer that
the sequence $(\int_a^{+\infty} f_n : n \in \mathbf{N})$ is also monotone increasing. If f has an
integral over {x : x ≥ a}, then the Dominated Convergence Theorem (with
M = f) shows that

    \[ \int_a^{+\infty} f = \lim_n \int_a^{+\infty} f_n. \]

Conversely, suppose that the set of infinite integrals is bounded and let S
be the supremum of this set. If c > a, then the Monotone Convergence
Theorem 31.4 implies that

    \[ \int_a^{c} f = \lim_n \int_a^{c} f_n = \sup_n\left\{\int_a^{c} f_n\right\}. \]

Since f_n ≥ 0, it follows that $\int_a^{c} f_n \le \int_a^{+\infty} f_n \le S$, and hence that $\int_a^{c} f \le S$. By

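Theorem 32.6 the infinite integral of f exists and

    \[ \int_a^{+\infty} f = \sup_c \int_a^{c} f = \sup_c\left\{\sup_n \int_a^{c} f_n\right\} = \sup_n\left\{\sup_c \int_a^{c} f_n\right\} = \sup_n \int_a^{+\infty} f_n. \]

Q.E.D.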

Iterated Infinite Integrals


In Theorem 33.7 we obtained a result which justifies the interchange of
the order of integration over the region {(x, t) : a ≤ x, α ≤ t ≤ β}. It is also
desirable to be able to interchange the order of integration of an iterated
infinite integral. That is, we wish to establish the equality

(33.6)    \[ \int_\alpha^{+\infty}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt = \int_a^{+\infty}\left\{\int_\alpha^{+\infty} f(x, t)\,dt\right\} dx, \]

under suitable hypotheses. It turns out that a simple condition can be given
which will also imply absolute convergence of the integrals. However, in
order to treat iterated infinite integrals which are not necessarily absolutely
convergent, a more complicated set of conditions is required.
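33.12 THEOREM. Suppose that f is a positive function defined for (x, t)
satisfying x ≥ a, t ≥ α. Suppose that

(33.7)    \[ \int_a^{b}\left\{\int_\alpha^{+\infty} f(x, t)\,dt\right\} dx = \int_\alpha^{+\infty}\left\{\int_a^{b} f(x, t)\,dx\right\} dt \]

for each b ≥ a and that

(33.7′)    \[ \int_\alpha^{\beta}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt = \int_a^{+\infty}\left\{\int_\alpha^{\beta} f(x, t)\,dt\right\} dx \]

for each β ≥ α. Then, if one of the iterated integrals in equation (33.6) exists,
the other also exists and they are equal.
PROOF. Suppose that the integral on the left side of (33.6) exists. Since f
is positive,

    \[ \int_a^{b} f(x, t)\,dx \le \int_a^{+\infty} f(x, t)\,dx \]

for each b ≥ a and t ≥ α. Therefore, it follows from the Comparison Test
32.7 that

    \[ \int_\alpha^{+\infty}\left\{\int_a^{b} f(x, t)\,dx\right\} dt \le \int_\alpha^{+\infty}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt. \]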



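Employing relation (33.7), we conclude that

    \[ \int_a^{b}\left\{\int_\alpha^{+\infty} f(x, t)\,dt\right\} dx \le \int_\alpha^{+\infty}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt \]

for each b ≥ a. An application of Theorem 32.6 shows that we can take the
limit as b → +∞, so the other iterated integral exists and

    \[ \int_a^{+\infty}\left\{\int_\alpha^{+\infty} f(x, t)\,dt\right\} dx \le \int_\alpha^{+\infty}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt. \]

If we repeat this argument and apply equation (33.7′), we obtain the reverse
inequality. Therefore, the equality must hold. Q.E.D.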
33.13 THEOREM. Suppose f is continuous for x ≥ a, t ≥ α, and that there
exist positive functions M and N such that the infinite integrals $\int_a^{+\infty} M$ and
$\int_\alpha^{+\infty} N$ exist. If the inequality

(33.8)    \[ |f(x, t)| \le M(x)N(t), \qquad x \ge a,\ t \ge \alpha, \]

holds, then the iterated integrals in (33.6) both exist and are equal.
PROOF. Let g be defined for x ≥ a, t ≥ α by g(x, t) = f(x, t) + M(x)N(t)
so that

    \[ 0 \le g(x, t) \le 2M(x)N(t). \]

Since N is bounded on each interval [α, β], it follows from the inequality
(33.8) and the Weierstrass M-test 33.3 that the integral

    \[ \int_a^{+\infty} g(x, t)\,dx \]

exists uniformly for t ∈ [α, β]. By applying Theorem 33.7, we observe that
equation (33.7′) holds (with f replaced by g) for each β ≥ α. Similarly,
(33.7) holds (with f replaced by g) for each b ≥ a. Also the Comparison
Test 32.7 implies that the iterated integrals in (33.6) exist (with f replaced by
g). We deduce from Theorem 33.12 that these iterated integrals of g are
equal. But this implies that the iterated integrals of f exist and are equal.
Q.E.D.
The preceding results deal with the case that the iterated integrals are
absolutely convergent. We now present a result which treats the case of
non-absolute convergence.
33.14 THEOREM. Suppose that the real-valued function f is continuous
in (x, t) for x ≥ a and t ≥ α and that the infinite integrals

(33.9)    \[ \int_a^{+\infty} f(x, t)\,dx, \qquad \int_\alpha^{+\infty} f(x, t)\,dt \]



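converge uniformly for t ≥ α and x ≥ a, respectively. In addition, let F be
defined for x ≥ a, β ≥ α, by

    \[ F(x, \beta) = \int_\alpha^{\beta} f(x, t)\,dt \]

and suppose that the infinite integral

(33.10)    \[ \int_a^{+\infty} F(x, \beta)\,dx \]

converges uniformly for β ≥ α. Then both iterated infinite integrals exist and
are equal.
PROOF. Since the infinite integral (33.10) is uniformly convergent for
β ≥ α, if ε > 0 there exists a number A_ε ≥ a such that if A ≥ A_ε, then

(33.11)    \[ \left|\int_a^{A} F(x, \beta)\,dx - \int_a^{+\infty} F(x, \beta)\,dx\right| < \varepsilon \]

for all β ≥ α. Also we observe that

    \[ \int_a^{A} F(x, \beta)\,dx = \int_a^{A}\left\{\int_\alpha^{\beta} f(x, t)\,dt\right\} dx = \int_\alpha^{\beta}\left\{\int_a^{A} f(x, t)\,dx\right\} dt. \]

From Theorem 33.7 and the uniform convergence of the second integral in
(33.9), we infer that

    \[ \lim_{\beta\to+\infty} \int_a^{A} F(x, \beta)\,dx = \int_a^{A}\left\{\int_\alpha^{+\infty} f(x, t)\,dt\right\} dx. \]

Hence there exists a number B ≥ α such that if β_2 ≥ β_1 ≥ B, then

(33.12)    \[ \left|\int_a^{A} F(x, \beta_2)\,dx - \int_a^{A} F(x, \beta_1)\,dx\right| < \varepsilon. \]

By combining (33.11) and (33.12), it is seen that if β_2 ≥ β_1 ≥ B, then

    \[ \left|\int_a^{+\infty} F(x, \beta_2)\,dx - \int_a^{+\infty} F(x, \beta_1)\,dx\right| < 3\varepsilon, \]

whence it follows that the limit of $\int_a^{+\infty} F(x, \beta)\,dx$ exists as β → +∞. After
applying Theorem 33.7 to the uniform convergence of the first integral in
(33.9), we have

    \[ \lim_{\beta\to+\infty} \int_a^{+\infty} F(x, \beta)\,dx = \lim_{\beta\to+\infty} \int_a^{+\infty}\left\{\int_\alpha^{\beta} f(x, t)\,dt\right\} dx = \lim_{\beta\to+\infty} \int_\alpha^{\beta}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt = \int_\alpha^{+\infty}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt. \]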

Since both terms on the left side of (33.11) have limits as β → +∞ we
conclude, on passing to the limit, that

    \[ \left|\int_a^{A}\left\{\int_\alpha^{+\infty} f(x, t)\,dt\right\} dx - \int_\alpha^{+\infty}\left\{\int_a^{+\infty} f(x, t)\,dx\right\} dt\right| \le \varepsilon. \]

If we let A → +∞, we obtain the equality of the iterated improper integrals.
Q.E.D.
The theorems given above justifying the interchange of the order of
integration are often useful, but they still leave ample room for ingenuity.
Frequently they are used in conjunction with the Dominated or Monotone
Convergence Theorems 33.10 and 33.11.
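33.15 EXAMPLES. (a) If f(x, t) = e^{−(x+t)} sin xt, then we can take M(x) =
e^{−x} and N(t) = e^{−t} and apply Theorem 33.13 to infer that

    \[ \int_0^{+\infty}\left\{\int_0^{+\infty} e^{-(x+t)}\sin xt\,dx\right\} dt = \int_0^{+\infty}\left\{\int_0^{+\infty} e^{-(x+t)}\sin xt\,dt\right\} dx. \]

(b) If g(x, t) = e^{−xt}, for x ≥ 0 and t ≥ 0, then we are in trouble on the lines
x = 0 and t = 0. However, if a > 0, α > 0, and x ≥ a and t ≥ α, then we
observe that

    \[ e^{-xt} = e^{-xt/2}\,e^{-xt/2} \le e^{-\alpha x/2}\,e^{-at/2}. \]

If we set M(x) = e^{−αx/2} and N(t) = e^{−at/2}, then Theorem 33.13 implies that

    \[ \int_\alpha^{+\infty}\left\{\int_a^{+\infty} e^{-xt}\,dx\right\} dt = \int_a^{+\infty}\left\{\int_\alpha^{+\infty} e^{-xt}\,dt\right\} dx. \]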

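(c) Consider the function f(x, y) = x e^{−x²(1+y²)} for x ≥ a > 0 and y ≥ 0. If
we put M(x) = x e^{−x²} and N(y) = e^{−a²y²}, then we can invert the order of
integration over a ≤ x and 0 ≤ y. Since we have

    \[ \int_a^{+\infty} x\,e^{-(1+y^2)x^2}\,dx = \left[\frac{-e^{-(1+y^2)x^2}}{2(1+y^2)}\right]_{x=a}^{x=+\infty} = \frac{e^{-a^2(1+y^2)}}{2(1+y^2)}, \]

it follows that

    \[ \frac12\,e^{-a^2}\int_0^{+\infty} \frac{e^{-a^2y^2}}{1+y^2}\,dy = \int_a^{+\infty} e^{-x^2}\left\{\int_0^{+\infty} x\,e^{-x^2y^2}\,dy\right\} dx. \]

If we introduce the change of variable t = xy, we find that

    \[ \int_0^{+\infty} x\,e^{-x^2y^2}\,dy = \int_0^{+\infty} e^{-t^2}\,dt = I. \]

It follows that

    \[ \int_0^{+\infty} \frac{e^{-a^2y^2}}{1+y^2}\,dy = 2e^{a^2}\,I\int_a^{+\infty} e^{-x^2}\,dx. \]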
0

If we let a → 0, the expression on the right side converges to 2I². On the left
hand side, we observe that the integrand is dominated by the integrable
function (1 + y²)^{−1}. Applying the Dominated Convergence Theorem, we
have

    \[ \frac{\pi}{2} = \int_0^{+\infty} \frac{dy}{1+y^2} = \lim_{a\to 0}\int_0^{+\infty} \frac{e^{-a^2y^2}}{1+y^2}\,dy = 2I^2. \]

Therefore I² = π/4, which yields a derivation of the formula

    \[ \int_0^{+\infty} e^{-x^2}\,dx = \tfrac12\sqrt{\pi}. \]
(d) If we integrate by parts twice, we obtain the formula

(33.13)    \[ \int_a^{+\infty} e^{-xy}\sin x\,dx = \frac{e^{-ay}}{1+y^2}\cos a + \frac{y\,e^{-ay}}{1+y^2}\sin a. \]

If x ≥ a > 0 and y ≥ α > 0, we can argue as in Example (b) to show that

    \[ \int_\alpha^{+\infty} \frac{e^{-ay}\cos a}{1+y^2}\,dy + \int_\alpha^{+\infty} \frac{y\,e^{-ay}\sin a}{1+y^2}\,dy = \int_a^{+\infty}\left\{\int_\alpha^{+\infty} e^{-xy}\sin x\,dy\right\} dx = \int_a^{+\infty} \frac{e^{-\alpha x}\sin x}{x}\,dx. \]

We want to take the limit as a → 0. In the last integral this can evidently
be done, and we obtain $\int_0^{+\infty} (e^{-\alpha x}\sin x/x)\,dx$. In view of the fact that
e^{−ay} cos a is dominated by 1 for y ≥ 0, and the integral $\int_\alpha^{+\infty} (1/(1+y^2))\,dy$
exists, we can use the Dominated Convergence Theorem 33.10 to conclude
that

    \[ \lim_{a\to 0}\int_\alpha^{+\infty} \frac{e^{-ay}\cos a}{1+y^2}\,dy = \int_\alpha^{+\infty} \frac{dy}{1+y^2}. \]

The second integral is a bit more troublesome as the same type of estimate
shows that

    \[ \left|\frac{y\,e^{-ay}\sin a}{1+y^2}\right| \le \frac{y}{1+y^2}, \]

and the dominant function is not integrable; hence we must do better. Since
u ≤ e^u and |sin u| ≤ u for u ≥ 0, we infer that |e^{−ay} sin a| ≤ 1/y, whence we
obtain the sharper estimate

    \[ \left|\frac{y\,e^{-ay}\sin a}{1+y^2}\right| \le \frac{1}{1+y^2}. \]

We can now employ the Dominated Convergence Theorem to take the limit

under the integral sign, to obtain

    \[ \lim_{a\to 0}\int_\alpha^{+\infty} \frac{y\,e^{-ay}\sin a}{1+y^2}\,dy = 0. \]

We have arrived at the formula

    \[ \frac{\pi}{2} - \operatorname{Arc\,tan}\alpha = \int_\alpha^{+\infty} \frac{dy}{1+y^2} = \int_0^{+\infty} \frac{e^{-\alpha x}\sin x}{x}\,dx. \]

We now want to take the limit as α → 0. This time we cannot use the
Dominated Convergence Theorem, since $\int_0^{+\infty} x^{-1}\sin x\,dx$ is not absolutely
convergent. Although the convergence of e^{−αx} to 1 as α → 0 is monotone,
the fact that sin x takes both signs implies that the convergence of the entire
integrand is not monotone. Fortunately, we have already seen in Example
33.5(d) that the convergence of the integral is uniform for α ≥ 0.
According to Theorem 33.6, the integral is continuous for α ≥ 0 and hence
we once more obtain the formula

(33.14)    \[ \int_0^{+\infty} \frac{\sin x}{x}\,dx = \tfrac12\pi. \]
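
The identity obtained in Example 33.15(c) before letting a → 0 can be tested numerically. The sketch below is an illustration and not part of the original text: for a few values of a it approximates both sides of the relation between the integral of e^{−a²y²}/(1+y²) and 2e^{a²} I ∫_a^{+∞} e^{−x²} dx, computing I and the tail integral by quadrature rather than assuming the value ½√π. The names `simpson`, `gauss`, and `both_sides`, the cutoff, and the step count are assumptions.

```python
import math


def simpson(f, a, b, n=4000):
    """Composite Simpson rule for the integral of f over [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3


def gauss(x):
    """The integrand exp(-x^2)."""
    return math.exp(-x * x)


def both_sides(a, cutoff=10.0):
    """Return (left, right) of the identity in Example 33.15(c) for a > 0."""
    gauss_i = simpson(gauss, 0.0, cutoff)   # approximates I
    tail = simpson(gauss, a, cutoff)        # approximates the integral over x >= a
    y_max = cutoff / a                      # beyond y_max, exp(-(a y)^2) is negligible
    left = simpson(lambda y: math.exp(-(a * y) ** 2) / (1.0 + y * y), 0.0, y_max)
    right = 2.0 * math.exp(a * a) * gauss_i * tail
    return left, right


if __name__ == "__main__":
    for a in (2.0, 1.0, 0.5):
        print(a, both_sides(a))
    print("I =", simpson(gauss, 0.0, 10.0), " sqrt(pi)/2 =", math.sqrt(math.pi) / 2)
```

Agreement of the two sides for several a, together with the computed value of I, is consistent with the limit 2I² = π/2, although of course it does not replace the dominated convergence argument.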

Exercises

33.A. Show that the integral $\int_0^{+\infty} x^{t}e^{-x}\,dx$ converges uniformly for t in an interval
[0, β] but that it does not converge uniformly for t ≥ 0.
33.B. Show that the integral

    \[ \int_0^{+\infty} \frac{\sin(tx)}{x}\,dx \]

is uniformly convergent for t ≥ 1, but that it is not absolutely convergent for any of
these values of t.
33.C. For what values of t do the following infinite integrals converge uniformly?

@) "dx
[oes w [,
*° dx

(c) { e-* cos tx dx, (d) | x"e cos tx dx,


0 0

() { er dy,
0
(f) { Fee dy,
o (OX

33.D. Use formula (33.14) to show that Γ(½) = √π.
33.E. Use formula (33.14) to show that $\int_0^{+\infty} e^{-tx^2}\,dx = \tfrac12\sqrt{\pi/t}$ for t > 0. Justify the
differentiation and show that

    \[ \int_0^{+\infty} x^{2n}e^{-x^2}\,dx = \frac{1\cdot 3\cdots(2n-1)}{2^{n+1}}\sqrt{\pi}. \]

33.F. Establish the existence of the integral $\int_0^{+\infty} (1 - e^{-x^2})\,x^{-2}\,dx$. (Note that the

integrand can be defined to be continuous at x = 0.) Evaluate this integral by
(a) replacing e^{−x²} by e^{−tx²} and differentiating with respect to t;
(b) integrating $\int_0^{+\infty} e^{-tx^2}\,dx$ with respect to t. Justify all of the steps.
33.G. Let F be given for t ∈ R by

    \[ F(t) = \int_0^{+\infty} e^{-x^2}\cos tx\,dx. \]

Differentiate with respect to t and integrate by parts to prove that F′(t) = (−1/2) t F(t).
Then find F(t) and, after a change of variable, establish the formula

    \[ \int_0^{+\infty} e^{-cx^2}\cos tx\,dx = \tfrac12\sqrt{\pi/c}\;e^{-t^2/4c}, \qquad c > 0. \]

33.H. Let G be defined for t > 0 by

    \[ G(t) = \int_0^{+\infty} e^{-x^2 - t^2/x^2}\,dx. \]

Differentiate and change variables to show that G′(t) = −2G(t). Then find G(t) and
establish the formula

    \[ \int_0^{+\infty} e^{-x^2 - t^2/x^2}\,dx = \tfrac12\sqrt{\pi}\;e^{-2|t|}. \]

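33.I. Use formula (33.4), elementary trigonometric formulas, and manipulations
to show that

(a)    \[ \frac{2}{\pi}\int_0^{+\infty} \frac{\sin ax}{x}\,dx = \begin{cases} 1, & a > 0, \\ 0, & a = 0, \\ -1, & a < 0; \end{cases} \]

(b)    \[ \frac{2}{\pi}\int_0^{+\infty} \frac{\sin x\cos ax}{x}\,dx = \begin{cases} 1, & |a| < 1, \\ \tfrac12, & |a| = 1, \\ 0, & |a| > 1; \end{cases} \]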
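(c)    \[ \frac{2}{\pi}\int_0^{+\infty} \frac{\sin x\sin ax}{x}\,dx = \begin{cases} \dfrac{1}{\pi}\log\dfrac{a+1}{1-a}, & |a| < 1, \\[2mm] \dfrac{1}{\pi}\log\dfrac{a+1}{a-1}, & |a| > 1; \end{cases} \]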
co 2 [em Pacn
33.J. For n ∈ N let f_n be defined by

    \[ f_n(x) = \begin{cases} 1/x, & 1 \le x \le n, \\ 0, & x > n. \end{cases} \]

Each f_n has an integral for x ≥ 1 and the sequence (f_n) is bounded, monotone
increasing, and converges uniformly to a continuous function which is not integrable
over {x ∈ R : x ≥ 1}.

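33.K. Let g_n be defined by

    \[ g_n(x) = \begin{cases} 1/n, & 0 \le x \le n^2, \\ 0, & x > n^2. \end{cases} \]

Each g_n has an integral over x ≥ 0 and the sequence (g_n) is bounded and converges to
a function g which has an integral over x ≥ 0, but it is not true that

    \[ \lim_n \int_0^{+\infty} g_n = \int_0^{+\infty} g. \]

Is the convergence monotone?
33.L. If f(x, t) = (x − t)/(x + t)³, show that

    \[ \int_1^{A}\left\{\int_1^{+\infty} f(x, t)\,dx\right\} dt > 0 \quad\text{for each } A > 1; \]

    \[ \int_1^{B}\left\{\int_1^{+\infty} f(x, t)\,dt\right\} dx < 0 \quad\text{for each } B > 1. \]

Hence, show that

    \[ \int_1^{+\infty}\left\{\int_1^{+\infty} f(x, t)\,dx\right\} dt \ne \int_1^{+\infty}\left\{\int_1^{+\infty} f(x, t)\,dt\right\} dx. \]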

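33.M. Using an argument similar to that in Example 33.15(c) and formulas from
Exercises 33.G and 33.H, show that

    \[ \int_0^{+\infty} \frac{\cos ty}{1+y^2}\,dy = \frac{\pi}{2}\,e^{-|t|}. \]

33.N. By considering the iterated integrals of e^{−(a+y)x} sin y over the quadrant
x ≥ 0, y ≥ 0, establish the formula

    \[ \int_0^{+\infty} \frac{e^{-ax}}{1+x^2}\,dx = \int_0^{+\infty} \frac{\sin y}{a+y}\,dy, \qquad a > 0. \]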

Projects

33.α. This project treats the Gamma function, which was introduced in Example
32.10(e). Recall that Γ is defined for x in P = {x ∈ R : x > 0} by the integral

    \[ \Gamma(x) = \int_{0+}^{+\infty} e^{-t}t^{x-1}\,dt. \]

We have already seen that this integral converges for x ∈ P and that Γ(½) = √π.
(a) Show that Γ is continuous on P.
(b) Prove that Γ(x + 1) = xΓ(x) for x ∈ P. (Hint: integrate by parts on the interval
[ε, c].)
(c) Show that Γ(n + 1) = n! for n ∈ N.
(d) Show that lim_{x→0+} xΓ(x) = 1. Hence it follows that Γ is not bounded to the
right of x = 0.
(e) Show that Γ is differentiable on P and that the second derivative is always
positive. (Hence Γ is a convex function on P.)

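(f) By changing the variable t, show that

    \[ \Gamma(x) = 2\int_{0+}^{+\infty} e^{-s^2}s^{2x-1}\,ds = u^{x}\int_{0+}^{+\infty} e^{-us}s^{x-1}\,ds. \]

33.β. We introduce the Beta function of Euler. Let B(x, y) be defined for x, y in
P = {x ∈ R : x > 0} by

    \[ B(x, y) = \int_{0+}^{1-} t^{x-1}(1-t)^{y-1}\,dt. \]

If x ≥ 1 and y ≥ 1, this integral is proper, but if 0 < x < 1 or 0 < y < 1, the integral is
improper.
(a) Establish the convergence of the integral for x, y in P.
(b) Prove that B(x, y) = B(y, x).
(c) Show that if x, y belong to P, then

    \[ B(x, y) = 2\int_{0+}^{(\pi/2)-} (\sin t)^{2x-1}(\cos t)^{2y-1}\,dt \]

and

    \[ B(x, y) = \int_{0+}^{+\infty} \frac{u^{x-1}}{(1+u)^{x+y}}\,du. \]

(d) By integrating the positive function

    \[ f(t, u) = e^{-t^2-u^2}\,t^{2x-1}u^{2y-1} \]

over {(t, u) : t² + u² ≤ R², t ≥ 0, u ≥ 0} and comparing this integral with the integral
over inscribed and circumscribed squares, derive the important formula

    \[ B(x, y) = \frac{\Gamma(x)\Gamma(y)}{\Gamma(x+y)}. \]

(e) Establish the integration formulas

    \[ \int_0^{\pi/2} (\sin x)^{2n}\,dx = \frac{\sqrt{\pi}\,\Gamma(n+\tfrac12)}{2\,\Gamma(n+1)} = \frac{1\cdot 3\cdot 5\cdots(2n-1)}{2\cdot 4\cdot 6\cdots(2n)}\,\frac{\pi}{2}, \]

    \[ \int_0^{\pi/2} (\sin x)^{2n+1}\,dx = \frac{\sqrt{\pi}\,\Gamma(n+1)}{2\,\Gamma(n+\tfrac32)} = \frac{2\cdot 4\cdot 6\cdots(2n)}{1\cdot 3\cdot 5\cdot 7\cdots(2n+1)}. \]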

33.γ. This and the next project present a few of the properties of the Laplace†
transform, which is important both for theoretical and applied mathematics. To
simplify the discussion, we shall restrict our attention to continuous functions $f$
defined on $\{t \in R : t \ge 0\}$ to $R$. The Laplace transform of $f$ is the function $\hat f$ defined at
the real number $s$ by the formula

\[ \hat f(s) = \int_0^{+\infty} e^{-st} f(t)\,dt, \]

whenever this integral converges. Sometimes we denote $\hat f$ by $\mathcal{L}(f)$.
† PIERRE-SIMON LAPLACE (1749-1827), the son of a Norman farmer, became professor at the
Military School in Paris and was elected to the Academy of Sciences. He is famous for his work
on celestial mechanics and probability.

(a) Suppose there exists a real number $c$ such that $|f(t)| \le e^{ct}$ for sufficiently large $t$.
Then the integral defining the Laplace transform $\hat f$ converges for $s > c$. Moreover,
it converges uniformly for $s \ge c + \delta$ if $\delta > 0$.
(b) If $f$ satisfies the boundedness condition in part (a), then $\hat f$ is continuous and has
a derivative for $s > c$ given by the formula

\[ \hat f'(s) = \int_0^{+\infty} e^{-st} (-t) f(t)\,dt. \]

[Thus the derivative of the Laplace transform of $f$ is the Laplace transform of the
function $g(t) = -t f(t)$.]
(c) By induction, show that under the boundedness condition in (a), $\hat f$ has
derivatives of all orders for $s > c$ and that

\[ \hat f^{(n)}(s) = \int_0^{+\infty} e^{-st} (-t)^n f(t)\,dt. \]

(d) Suppose $f$ and $g$ are continuous functions whose Laplace transforms $\hat f$ and $\hat g$
converge for $s > s_0$, and that $a$ and $b$ are real numbers. Then the function $af + bg$ has a
Laplace transform converging for $s > s_0$ and which equals $a\hat f + b\hat g$.
(e) If $a > 0$ and $g(t) = f(at)$, then $\hat g$ converges for $s > a s_0$ and

\[ \hat g(s) = \frac{1}{a} \hat f(s/a). \]

Similarly, if $h(t) = (1/a) f(t/a)$, then $\hat h$ converges for $s > s_0/a$ and

\[ \hat h(s) = \hat f(as). \]

(f) Suppose that the Laplace transform $\hat f$ of $f$ exists for $s > s_0$ and let $f$ be defined
for $t < 0$ to be equal to 0. If $b > 0$ and if $g(t) = f(t - b)$, then $\hat g$ converges for $s > s_0$ and

\[ \hat g(s) = e^{-bs} \hat f(s). \]

Similarly, if $h(t) = e^{bt} f(t)$ for any real $b$, then $\hat h$ converges for $s > s_0 + b$ and

\[ \hat h(s) = \hat f(s - b). \]
33.δ. This project continues the preceding one and makes use of its results.
(a) Establish the following short table of Laplace transforms.

f(t)            \hat f(s)              Interval of Convergence

1               1/s                    s > 0,
t^n             n!/s^{n+1}             s > 0,
e^{at}          (s - a)^{-1}           s > a,
t^n e^{at}      n!/(s - a)^{n+1}       s > a,
sin at          a/(s^2 + a^2)          s > 0,
cos at          s/(s^2 + a^2)          s > 0,
sinh at         a/(s^2 - a^2)          s > a,
cosh at         s/(s^2 - a^2)          s > a,
(sin t)/t       Arc tan (1/s)          s > 0.

(b) Suppose that $f$ and $f'$ are continuous for $t \ge 0$, that $\hat f$ converges for $s > s_0$ and
that $e^{-st} f(t) \to 0$ as $t \to +\infty$ for all $s > s_0$. Then the Laplace transform of $f'$ exists for
$s > s_0$ and

\[ \widehat{f'}(s) = s\hat f(s) - f(0). \]

(Hint: integrate by parts.)
(c) Suppose that $f$, $f'$ and $f''$ are continuous for $t \ge 0$ and that $\hat f$ converges for $s > s_0$.
In addition, suppose that $e^{-st} f(t)$ and $e^{-st} f'(t)$ approach 0 as $t \to +\infty$ for all $s > s_0$.
Then the Laplace transform of $f''$ exists for $s > s_0$ and

\[ \widehat{f''}(s) = s^2 \hat f(s) - s f(0) - f'(0). \]

(d) When all or part of an integrand is seen to be a Laplace transform, the integral
can sometimes be evaluated by changing the order of integration. Use this method
to evaluate the integral

\[ \int_0^{+\infty} \frac{\sin s}{s}\,ds. \]

(e) It is desired to solve the differential equation

\[ y'(t) + 2y(t) = 4e^t, \qquad y(0) = 1. \]

Assume that this equation has a solution $y$ such that the Laplace transforms of $y$ and
$y'$ exist for sufficiently large $s$. In this case the transform of $y$ must satisfy the
equation

\[ s\hat y(s) - y(0) + 2\hat y(s) = 4/(s - 1), \qquad s > 1, \]

from which it follows that

\[ \hat y(s) = \frac{s + 3}{(s + 2)(s - 1)}. \]

Use partial fractions and the table in (a) to obtain $y(t) = \tfrac43 e^t - \tfrac13 e^{-2t}$, which can be
directly verified to be a solution.
(f) Find the solution of the equation

\[ y'' + y' = 0, \qquad y(0) = a, \qquad y'(0) = b, \]

by using the Laplace transform.
(g) Show that a linear homogeneous differential equation with constant coeffi-
cients can be solved by using the Laplace transform and the technique of decompos-
ing a rational function into partial fractions.
VI
INFINITE SERIES

This chapter is concerned with establishing the most important theorems
in the theory of infinite series. Although a few peripheral results are
included here, our attention is directed to the basic propositions. The
reader is referred to more extensive treatises for advanced results and
applications.
In the first section we shall present the main theorems concerning the
convergence of infinite series in $R^p$. We shall obtain some results of a
general nature which serve to establish the convergence of series and
justify certain manipulations with series.
In Section 35 we shall give some familiar “tests” for absolute con-
vergence of series. In addition to guaranteeing the convergence of the
series to which the tests are applicable, each of these tests yields a
quantitative estimate concerning the rapidity of the convergence.
The next section provides some useful tests for conditional convergence,
and gives a brief discussion of double series and the multiplication of series.
In Section 37 we introduce the study of series of functions and establish
the basic properties of power series. In the final section of this chapter we
shall establish some of the main results from the theory of Fourier series.

Section 34 Convergence of Infinite Series

In elementary texts, an infinite series is sometimes “defined” to be “an
expression of the form

(34.1)  \[ x_1 + x_2 + \cdots + x_n + \cdots. \]”

This “definition” lacks clarity, however, since there is no particular value
that we can attach a priori to this array of symbols which calls for an infinite
number of additions to be performed. Although there are other definitions
that are suitable, we shall take an infinite series to be the same as the
sequence of partial sums.
34.1 DEFINITION. If $X = (x_n)$ is a sequence in $R^p$, then the infinite
series (or simply the series) generated by $X$ is the sequence $S = (s_k)$ defined
by

\[
\begin{aligned}
s_1 &= x_1, \\
s_2 &= s_1 + x_2 \quad (= x_1 + x_2), \\
&\;\;\vdots \\
s_k &= s_{k-1} + x_k \quad (= x_1 + x_2 + \cdots + x_k), \\
&\;\;\vdots
\end{aligned}
\]

If $S$ converges, we refer to $\lim S$ as the sum of the infinite series. The
elements $x_n$ are called the terms and the elements $s_k$ are called the partial
sums of this infinite series.
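As an informal illustration of Definition 34.1, the following Python sketch builds the sequence of partial sums from a sequence of terms by the recursion $s_k = s_{k-1} + x_k$. The names `partial_sums` and `geometric_terms`, and the choice of the sample sequence $x_n = (1/2)^n$, are our own assumptions and are not part of the text.

```python
from itertools import accumulate, islice

def partial_sums(terms):
    """Return an iterator over s_1, s_2, ... with s_k = s_{k-1} + x_k."""
    return accumulate(terms)

def geometric_terms(a):
    """Hypothetical sample sequence x_n = a**n for n = 1, 2, 3, ..."""
    n = 1
    while True:
        yield a ** n
        n += 1

# First five partial sums of the series generated by x_n = (1/2)**n.
first_five = list(islice(partial_sums(geometric_terms(0.5)), 5))
print(first_five)
```

The series is the whole sequence produced by `partial_sums`; its sum, when it exists, is the limit of that sequence and is not computed by any finite amount of such arithmetic.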
It is conventional to use the expression (34.1) or one of the symbols

\[ \sum (x_n), \qquad \sum_{n=1}^{\infty} (x_n), \qquad \sum_{n=1}^{\infty} x_n \]

both to denote the infinite series generated by the sequence $X = (x_n)$ and also to
denote $\lim S$ in the case that this infinite series is convergent. In actual practice,
the double use of these notations does not lead to confusion, provided it is
understood that the convergence of the series must be established.
The reader should guard against confusing the words “sequence” and “series.”
In non-mathematical language, these words are interchangeable; in mathematics,
however, they are not synonyms. According to our definition, an infinite series is a
sequence $S$ obtained from a given sequence $X$ according to a special procedure that
was stated above. There are many other ways of generating new sequences and
attaching “sums” to the given sequence $X$. The reader should consult books on
divergent series, asymptotic series, and the summability of series for examples of
such theories.
A final word on notational matters. Although we generally index the elements
of the series by natural numbers, it is sometimes more convenient to start with
$n = 0$, with $n = 5$, or with $n = k$. When such is the case, we shall denote the
resulting series or their sums by notations such as

\[ \sum_{n=0}^{\infty} (x_n), \qquad \sum_{n=5}^{\infty} (x_n), \qquad \sum_{n=k}^{\infty} (x_n). \]

In Definition 14.2, we defined the sum and difference of two sequences
$X$, $Y$ in $R^p$. Similarly, if $c$ is a real number and if $w$ is an element in $R^p$, we
defined the sequences $cX = (c x_n)$ and $(w \cdot x_n)$ in $R^p$ and $R$, respectively.
We now examine the series generated by these sequences.

34.2 THEOREM. (a) If the series $\sum (x_n)$ and $\sum (y_n)$ converge, then the
series $\sum (x_n + y_n)$ converges and the sums are related by the formula

\[ \sum (x_n + y_n) = \sum (x_n) + \sum (y_n). \]

A similar result holds for the series generated by $X - Y$.
(b) If the series $\sum (x_n)$ is convergent, $c$ is a real number, and $w$ is a fixed
element of $R^p$, then the series $\sum (c x_n)$ and $\sum (w \cdot x_n)$ converge and

\[ \sum (c x_n) = c \sum (x_n), \qquad \sum (w \cdot x_n) = w \cdot \sum (x_n). \]

PROOF. This result follows directly from Theorem 15.6 and Definition
34.1. Q.E.D.

It might be expected that if the sequences $X = (x_n)$ and $Y = (y_n)$ generate
convergent series, then the sequence $X \cdot Y = (x_n \cdot y_n)$ also generates a
convergent series. That this is not always true may be seen by taking
$X = Y = ((-1)^n/\sqrt{n})$ in $R$.
We now present a very simple necessary condition for convergence of
a series. It is far from sufficient, however.
34.3 LEMMA. If $\sum (x_n)$ converges in $R^p$, then $\lim (x_n) = 0$.

PROOF. By definition, the convergence of $\sum (x_n)$ means that $\lim (s_n)$
exists. But, since $x_n = s_n - s_{n-1}$, then $\lim (x_n) = \lim (s_n) - \lim (s_{n-1}) = 0$.
Q.E.D.
The next result, although limited in scope, is of great importance.
34.4 THEOREM. Let $(x_n)$ be a sequence of positive real numbers. Then
$\sum (x_n)$ converges if and only if the sequence $S = (s_k)$ of partial sums is
bounded. In this case,

\[ \sum x_n = \lim (s_k) = \sup \{s_k\}. \]

PROOF. Since $x_n \ge 0$, the sequence of partial sums is monotone
increasing:

\[ s_1 \le s_2 \le \cdots \le s_k \le \cdots. \]

According to the Monotone Convergence Theorem 16.1, the sequence $S$
converges if and only if it is bounded. Q.E.D.
Since the following Cauchy Criterion is precisely a reformulation of
Theorem 16.10, we shall omit its proof.

34.5 CAUCHY CRITERION FOR SERIES. The series $\sum (x_n)$ in $R^p$ converges
if and only if for each number $\varepsilon > 0$ there is a natural number
$M(\varepsilon)$ such that if $m \ge n \ge M(\varepsilon)$, then

\[ \|s_m - s_n\| = \|x_{n+1} + x_{n+2} + \cdots + x_m\| < \varepsilon. \]
The notion of absolute convergence is often of great importance in
treating series, as we shall show later.
34.6 DEFINITION. Let $X = (x_n)$ be a sequence in $R^p$. We say that the
series $\sum (x_n)$ is absolutely convergent if the series $\sum (\|x_n\|)$ is convergent in
$R$. A series is said to be conditionally convergent if it is convergent but not
absolutely convergent.
It is stressed that for series whose elements are positive real numbers,
there is no distinction between ordinary convergence and absolute
convergence. However, for other series there may be a difference.
34.7 THEOREM. If a series in $R^p$ is absolutely convergent, then it is
convergent.
PROOF. By hypothesis, the series $\sum (\|x_n\|)$ converges. Therefore, it
follows from the necessity of the Cauchy Criterion 34.5 that given $\varepsilon > 0$
there is a natural number $M(\varepsilon)$ such that if $m \ge n \ge M(\varepsilon)$, then

\[ \|x_{n+1}\| + \|x_{n+2}\| + \cdots + \|x_m\| < \varepsilon. \]

According to the Triangle Inequality, the left-hand side of this relation
dominates

\[ \|x_{n+1} + x_{n+2} + \cdots + x_m\|. \]

We apply the sufficiency of the Cauchy Criterion to conclude that $\sum (x_n)$
must converge. Q.E.D.

34.8 EXAMPLES. (a) We consider the real sequence $X = (a^n)$, which
generates the geometric series

(34.2)  \[ a + a^2 + \cdots + a^n + \cdots. \]

A necessary condition for convergence is that $\lim (a^n) = 0$, which requires
that $|a| < 1$. If $m \ge n$, then

(34.3)  \[ a^{n+1} + a^{n+2} + \cdots + a^m = \frac{a^{n+1} - a^{m+1}}{1 - a}, \]

as can be verified by multiplying both sides by $1 - a$ and noticing the
telescoping on the left side. Hence the partial sums satisfy

\[ |s_m - s_n| = |a^{n+1} + \cdots + a^m| \le \frac{|a^{n+1}| + |a^{m+1}|}{|1 - a|}, \qquad m \ge n. \]

If $|a| < 1$, then $|a^{n+1}| \to 0$ so the Cauchy Criterion implies that the
geometric series (34.2) converges if and only if $|a| < 1$. Letting $n = 0$ in
(34.3) and passing to the limit with respect to $m$ we find that (34.2)
converges to the limit $a/(1 - a)$ when $|a| < 1$.
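The identity (34.3) and the limit $a/(1 - a)$ can be checked numerically. The sketch below is only a sanity check under our own choices: the function names `geometric_block` and `closed_form` and the sample values of $a$, $n$, $m$ are assumptions made for illustration.

```python
def geometric_block(a, n, m):
    """Left side of (34.3): a^(n+1) + a^(n+2) + ... + a^m, summed term by term."""
    return sum(a ** k for k in range(n + 1, m + 1))

def closed_form(a, n, m):
    """Right side of (34.3); requires a != 1."""
    return (a ** (n + 1) - a ** (m + 1)) / (1 - a)

a = -0.6
# With n = 0 and m large, the block is a partial sum of (34.2),
# which should be close to a / (1 - a) because |a| < 1.
print(geometric_block(a, 0, 60), a / (1 - a))
```

A finite computation of this kind can only suggest the limit; the convergence itself rests on the Cauchy Criterion argument given above.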
(b) Consider the harmonic series $\sum (1/n)$, which is well-known to
diverge. Since $\lim (1/n) = 0$, we cannot use Lemma 34.3 to establish this
divergence, but must carry out a more delicate argument, which we shall
base on Theorem 34.4. We shall show that a subsequence of the partial
sums is not bounded. In fact, if $k_1 = 2$, then

\[ s_{k_1} = \frac11 + \frac12, \]

and if $k_2 = 2^2$, then

\[ s_{k_2} = \frac11 + \frac12 + \frac13 + \frac14 = s_{k_1} + \frac13 + \frac14 > s_{k_1} + 2\left(\frac14\right) = 1 + \frac22. \]

By mathematical induction, we establish that if $k_r = 2^r$, then

\[ s_{k_r} > s_{k_{r-1}} + 2^{r-1}\left(\frac{1}{2^r}\right) = s_{k_{r-1}} + \frac12 \ge 1 + \frac r2. \]

Therefore, the subsequence $(s_{k_r})$ is not bounded and the harmonic series
does not converge.
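The lower bound $s_{2^r} \ge 1 + r/2$ obtained in Example 34.8(b) can be observed for small $r$ with exact rational arithmetic. This is a minimal sketch; the helper name `harmonic_partial_sum` and the range of $r$ are our own choices.

```python
from fractions import Fraction

def harmonic_partial_sum(k):
    """Exact value of 1/1 + 1/2 + ... + 1/k."""
    return sum(Fraction(1, n) for n in range(1, k + 1))

for r in range(1, 8):
    s = harmonic_partial_sum(2 ** r)
    # The block argument of Example 34.8(b) predicts s >= 1 + r/2.
    assert s >= 1 + Fraction(r, 2)
    print(r, float(s), 1 + r / 2)
```

The partial sums grow very slowly, which is why divergence cannot be detected by inspecting a few terms and the block estimate is needed.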
(c) We now treat the p-series $\sum (1/n^p)$ where $0 < p \le 1$ and use the
elementary inequality $n^p \le n$, for $n \in N$. From this it follows that, when
$0 < p \le 1$, then

\[ \frac1n \le \frac{1}{n^p}, \qquad n \in N. \]

Since the partial sums of the harmonic series are not bounded, this
inequality shows that the partial sums of $\sum (1/n^p)$ are not bounded for
$0 < p \le 1$. Hence the series diverges for these values of $p$.
(d) Consider the p-series for $p > 1$. Since the partial sums are
monotone, it is sufficient to show that some subsequence remains bounded
in order to establish the convergence of the series. If $k_1 = 2^1 - 1 = 1$, then
$s_{k_1} = 1$. If $k_2 = 2^2 - 1 = 3$, we have

\[ s_{k_2} = \frac11 + \left(\frac{1}{2^p} + \frac{1}{3^p}\right) < 1 + \frac{2}{2^p} = 1 + \frac{1}{2^{p-1}}, \]

and if $k_3 = 2^3 - 1$, we have

\[ s_{k_3} = s_{k_2} + \left(\frac{1}{4^p} + \frac{1}{5^p} + \frac{1}{6^p} + \frac{1}{7^p}\right) < s_{k_2} + \frac{4}{4^p} < 1 + \frac{1}{2^{p-1}} + \frac{1}{4^{p-1}}. \]

Let $a = 1/2^{p-1}$; since $p > 1$, it is seen that $0 < a < 1$. By mathematical
induction, we find that if $k_r = 2^r - 1$, then

\[ 0 < s_{k_r} \le 1 + a + a^2 + \cdots + a^{r-1}. \]

Hence the number $1/(1 - a)$ is an upper bound for the partial sums of the
p-series when $1 < p$. From Theorem 34.4 it follows that for such values of
$p$, the p-series converges.
(e) Consider the series $\sum (1/(n^2 + n))$. By using partial fractions, we can
write

\[ \frac{1}{k^2 + k} = \frac{1}{k(k + 1)} = \frac1k - \frac{1}{k + 1}. \]

This expression shows that the partial sums are telescoping and hence

\[ s_n = \frac{1}{1 \cdot 2} + \frac{1}{2 \cdot 3} + \cdots + \frac{1}{n(n + 1)} = \frac11 - \frac{1}{n + 1}. \]

It follows that the sequence $(s_n)$ is convergent to 1.
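The telescoping formula $s_n = 1 - 1/(n + 1)$ of Example 34.8(e) can be confirmed exactly for particular $n$. The function name `telescoping_partial_sum` and the sampled values of $n$ are our own illustrative choices.

```python
from fractions import Fraction

def telescoping_partial_sum(n):
    """Exact partial sum of 1/(k^2 + k) for k = 1, ..., n."""
    return sum(Fraction(1, k * k + k) for k in range(1, n + 1))

for n in (1, 5, 50):
    # Compare the term-by-term sum with the closed form 1 - 1/(n + 1).
    assert telescoping_partial_sum(n) == 1 - Fraction(1, n + 1)
    print(n, telescoping_partial_sum(n))
```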

Rearrangements of Series
Loosely speaking, a rearrangement of a series is another series which is
obtained from the given one by using all of the terms exactly once, but
scrambling the order in which the terms are taken. For example, the
harmonic series

\[ \frac11 + \frac12 + \frac13 + \cdots + \frac1n + \cdots \]

has rearrangements

\[ \frac12 + \frac11 + \frac14 + \frac13 + \cdots + \frac{1}{2n} + \frac{1}{2n - 1} + \cdots, \]

\[ \frac11 + \frac12 + \frac14 + \frac13 + \frac15 + \frac17 + \cdots. \]

The first rearrangement is obtained by interchanging the first and second
terms, the third and fourth terms, and so forth. The second rearrangement
is obtained from the harmonic series by taking one “odd term,” two “even
terms,” three “odd terms,” and so on. It is evident that there are infinitely
many other possible rearrangements of the harmonic series.
34.9 DEFINITION. A series $\sum (y_m)$ in $R^p$ is a rearrangement of a series
$\sum (x_n)$ if there exists a bijection $f$ of $N$ onto $N$ such that $y_m = x_{f(m)}$ for all
$m \in N$.

There is a remarkable observation due to Riemann, that if $\sum (x_n)$ is a series in $R$
which is conditionally convergent (that is, it is convergent but not absolutely
convergent) and if $c$ is an arbitrary real number, then there exists a rearrangement
of $\sum (x_n)$ which converges to $c$. The idea of the proof of this assertion is very
elementary: we take positive terms until we obtain a partial sum exceeding $c$, then
we take negative terms from the given series until we obtain a partial sum of terms
less than $c$, etc. Since $\lim (x_n) = 0$, it is not difficult to see that a rearrangement
which converges to $c$ can be constructed.
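The construction just described can be tried on a concrete conditionally convergent series. The sketch below applies it to the alternating harmonic series $\sum ((-1)^{n+1}/n)$, which is our own choice of example; the function name `rearranged_partial_sums`, the targets, and the number of steps are likewise assumptions made for illustration.

```python
def rearranged_partial_sums(c, steps):
    """Greedy rearrangement of sum((-1)**(n+1)/n) aimed at the target c.

    Positive terms 1/1, 1/3, 1/5, ... are taken while the running sum is
    at most c; negative terms -1/2, -1/4, ... are taken while it exceeds c.
    Returns the list of partial sums of the rearranged series.
    """
    next_odd, next_even = 1, 2   # denominators of the next unused terms
    total, sums = 0.0, []
    for _ in range(steps):
        if total <= c:
            total += 1.0 / next_odd    # next unused positive term
            next_odd += 2
        else:
            total -= 1.0 / next_even   # next unused negative term
            next_even += 2
        sums.append(total)
    return sums

for target in (2.0, 0.3, -1.0):
    print(target, rearranged_partial_sums(target, 200000)[-1])
```

Each term of the original series is used exactly once because both the positive and the negative parts diverge, so the procedure switches sides infinitely often; the distance to $c$ is then controlled by the size of the most recently used terms, which tend to 0.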

In our manipulations with series, we generally find it convenient to be
sure that rearrangements will not affect the convergence or the value of
the limit.
34.10 REARRANGEMENT THEOREM. Let $\sum (x_n)$ be an absolutely convergent
series in $R^p$. Then any rearrangement of $\sum (x_n)$ converges absolutely
to the same value.
PROOF. Let $x = \sum (x_n)$, let $\sum (y_m)$ be a rearrangement of $\sum (x_n)$, and let
$K$ be an upper bound for the partial sums of $\sum (\|x_n\|)$. Clearly, if
$t_r = y_1 + \cdots + y_r$ is a partial sum of $\sum (y_m)$, then

\[ \|y_1\| + \cdots + \|y_r\| \le K, \]

whence it follows that $\sum (y_m)$ is absolutely convergent to some element $y$ of
$R^p$. We wish to show that $x = y$. If $\varepsilon > 0$, let $N(\varepsilon)$ be such that if
$m > n \ge N(\varepsilon)$, and $s_n = x_1 + \cdots + x_n$, then $\|x - s_n\| < \varepsilon$ and

\[ \sum_{k=n+1}^{m} \|x_k\| < \varepsilon. \]

Choose a partial sum $t_r$ of $\sum (y_m)$ such that $\|y - t_r\| < \varepsilon$ and such that each
$x_1, x_2, \ldots, x_n$ occurs in $t_r$. Having done this, choose $m > n$ so large that
every $y_k$ appearing in $t_r$ also appears in $s_m$. Therefore

\[ \|x - y\| \le \|x - s_m\| + \|s_m - t_r\| + \|t_r - y\| < \varepsilon + \sum_{k=n+1}^{m} \|x_k\| + \varepsilon < 3\varepsilon. \]

Since $\varepsilon > 0$ is arbitrary, we infer that $x = y$. Q.E.D.

Exercises
34.A. Let $\sum (a_n)$ be a given series and let $\sum (b_n)$ be one in which the terms are the
same as those in $\sum (a_n)$, except those for which $a_n = 0$ have been omitted. Show
that $\sum (a_n)$ converges to a number $A$ if and only if $\sum (b_n)$ converges to $A$.
34.B. Show that the convergence of a series is not affected by changing a finite
number of its terms. (Of course, the sum may well be changed.)
34.C. Show that grouping the terms of a convergent series by introducing
parentheses containing a finite number of terms does not destroy the convergence or
the value of the limit. However, grouping terms in a divergent series can produce
convergence.
34.D. Show that if a convergent series of real numbers contains only a finite
number of negative terms, then it is absolutely convergent.
34.E. Show that if a series of real numbers is conditionally convergent, then the
series of positive terms is divergent and the series of negative terms is divergent.
34.F. By using partial fractions, show that

(a) $\displaystyle \sum_{n=0}^{\infty} \frac{1}{(a + n)(a + n + 1)} = \frac1a$ if $a > 0$,

(b) $\displaystyle \sum_{n=1}^{\infty} \frac{1}{n(n + 1)(n + 2)} = \frac14$.

34.G. If $\sum (a_n)$ is a convergent series of real numbers, then is $\sum (a_n^2)$ always
convergent? If $a_n \ge 0$, then is it true that $\sum (\sqrt{a_n})$ is always convergent?
34.H. If $\sum (a_n)$ is convergent and $a_n \ge 0$, then is $\sum (\sqrt{a_n a_{n+1}})$ convergent?
34.I. Let $\sum (a_n)$ be a series of strictly positive numbers and let $b_n$, $n \in N$, be
defined to be $b_n = (a_1 + a_2 + \cdots + a_n)/n$. Show that $\sum (b_n)$ always diverges.
34.J. Let $\sum (a_n)$ be convergent and let $c_n$, $n \in N$, be defined to be the weighted
means

\[ c_n = \frac{a_1 + 2a_2 + \cdots + n a_n}{n(n + 1)}. \]

Then $\sum (c_n)$ converges and equals $\sum (a_n)$.
34.K. Let $\sum (a_n)$ be a series of monotone decreasing positive numbers. Prove
that $\sum_{n=1}^{\infty} (a_n)$ converges if and only if the series

\[ \sum_{n=1}^{\infty} 2^n a_{2^n} \]

converges. This result is often called the Cauchy Condensation Test. (Hint: group
the terms into blocks as in Examples 34.8(b, d).)
34.L. Use the Cauchy Condensation Test to discuss the convergence of the
p-series $\sum (1/n^p)$.
34.M. Use the Cauchy Condensation Test to show that the series

\[ \sum \frac{1}{n \log n}, \qquad \sum \frac{1}{n (\log n)(\log \log n)}, \qquad \sum \frac{1}{n (\log n)(\log \log n)(\log \log \log n)} \]

are divergent.
34.N. Show that if $c > 1$, the series

\[ \sum \frac{1}{n (\log n)^c}, \qquad \sum \frac{1}{n (\log n)(\log \log n)^c} \]

are convergent.
34.O. Suppose that $(a_n)$ is a monotone decreasing sequence of positive
numbers. Show that if the series $\sum (a_n)$ converges, then $\lim (n a_n) = 0$. Is the
converse true?
34.P. If $\lim (a_n) = 0$, then $\sum (a_n)$ and $\sum (a_n + 2a_{n+1})$ are both convergent or both
divergent.

Section 35 Tests for Absolute Convergence

In the preceding section we obtained some results concerning the
manipulation of infinite series, especially in the important case where the
series are absolutely convergent. However, except for the Cauchy Criter-
ion and the fact that the terms of a convergent series converge to zero, we
did not establish any necessary or sufficient conditions for convergence of
infinite series.
We shall now give some results which can be used to establish the
convergence or divergence of infinite series. In view of its importance, we
shall pay special attention to absolute convergence. Since the absolute
convergence of the series $\sum (x_n)$ in $R^p$ is equivalent with the convergence of
the series $\sum (\|x_n\|)$ of positive elements of $R$, it is clear that results
establishing the convergence of positive real series have particular interest.
Our first test shows that if the terms of a positive real series are
dominated by the corresponding terms of a convergent series, then the first
series is convergent. It yields a test for absolute convergence that the
reader should formulate.

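35.1 COMPARISON TEST. Let $X = (x_n)$ and $Y = (y_n)$ be positive real
sequences and suppose that for some natural number $K$,

(35.1)  \[ x_n \le y_n \quad \text{for } n \ge K. \]

Then the convergence of $\sum (y_n)$ implies the convergence of $\sum (x_n)$.
PROOF. Let $\varepsilon > 0$ and let $M(\varepsilon)$ be the number given by the Cauchy
Criterion 34.5 for the convergent series $\sum (y_n)$. If $m \ge n \ge \sup \{K, M(\varepsilon)\}$, then

\[ x_{n+1} + \cdots + x_m \le y_{n+1} + \cdots + y_m < \varepsilon, \]

from which the assertion is evident. Q.E.D.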

35.2 LIMIT COMPARISON TEST. Suppose that $X = (x_n)$ and $Y = (y_n)$
are positive real sequences.
(a) If the relation

(35.2)  \[ \lim (x_n / y_n) \ne 0 \]

holds, then $\sum (x_n)$ is convergent if and only if $\sum (y_n)$ is convergent.
(b) If the limit in (35.2) is zero and $\sum (y_n)$ is convergent, then $\sum (x_n)$ is
convergent.
PROOF. It follows from (35.2) that for some real number $c > 1$ and
some natural number $K$,

\[ (1/c) y_n \le x_n \le c y_n \quad \text{for } n \ge K. \]

If we apply the Comparison Test 35.1 twice, we obtain the assertion in part
(a). The proof of (b) is similar and will be omitted. Q.E.D.

The Root and Ratio Tests

We now give an important test due to Cauchy.

35.3 ROOT TEST. (a) If $X = (x_n)$ is a sequence in $R^p$ and there exists a
positive number $r < 1$ and a natural number $K$ such that

(35.3)  \[ \|x_n\|^{1/n} \le r \quad \text{for } n \ge K, \]

then the series $\sum (x_n)$ is absolutely convergent.
(b) If there exists a number $r > 1$ and a natural number $K$ such that

(35.4)  \[ \|x_n\|^{1/n} \ge r \quad \text{for } n \ge K, \]

then the series $\sum (x_n)$ is divergent.
PROOF. (a) If (35.3) holds, then we have $\|x_n\| \le r^n$. Now for $0 \le r < 1$,
the series $\sum (r^n)$ is convergent, as was seen in Example 34.8(a). Hence it
follows from the Comparison Test that $\sum (x_n)$ is absolutely convergent.
(b) If (35.4) holds, then $\|x_n\| \ge r^n$. However, since $r > 1$, it is false that
$\lim (\|x_n\|) = 0$. Q.E.D.
In addition to establishing the convergence of >) (x,), the root test can be
used to obtain an estimate of the rapidity of convergence. This estimate is
useful in numerical computations and in some theoretical estimates as well.

35.4 COROLLARY. If $r$ satisfies $0 < r < 1$ and if the sequence $X = (x_n)$
satisfies (35.3), then the partial sums $s_n$, $n \ge K$, approximate the sum
$s = \sum (x_n)$ according to the estimate

(35.5)  \[ \|s - s_n\| \le \frac{r^{n+1}}{1 - r} \quad \text{for } n \ge K. \]

PROOF. If $m \ge n \ge K$, we have

\[ \|s_m - s_n\| = \|x_{n+1} + \cdots + x_m\| \le \|x_{n+1}\| + \cdots + \|x_m\| \le r^{n+1} + \cdots + r^m < \frac{r^{n+1}}{1 - r}. \]

Now take the limit with respect to $m$ to obtain (35.5). Q.E.D.
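To see the estimate (35.5) at work, consider the real series $\sum (n/3^n)$, which is our own choice of example. Since $n^{1/n}$ is largest at $n = 3$ among the natural numbers, condition (35.3) holds with $K = 3$ and $r = 3^{1/3}/3$; the sum is $3/4$ by the formula $\sum n x^n = x/(1 - x)^2$ for $|x| < 1$, which we assume known. The names below are hypothetical.

```python
def term(n):
    """Sample term x_n = n / 3**n (our own example)."""
    return n / 3 ** n

def partial_sum(n):
    return sum(term(k) for k in range(1, n + 1))

K = 3
r = 3 ** (1 / 3) / 3   # x_n**(1/n) = n**(1/n) / 3 <= r for every n >= K
s = 0.75               # closed-form sum of n / 3**n, assumed known

def root_test_bound(n):
    """Right side of (35.5)."""
    return r ** (n + 1) / (1 - r)

for n in range(K, 12):
    error = abs(s - partial_sum(n))
    assert error <= root_test_bound(n)
    print(n, error, root_test_bound(n))
```

The bound is not sharp, but it is computable from $r$ and $n$ alone, which is what makes it useful when the sum is not known in advance.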

It is often convenient to make use of the following variant of the Root
Test.
35.5 COROLLARY. Let $X = (x_n)$ be a sequence in $R^p$ and set

(35.6)  \[ r = \lim (\|x_n\|^{1/n}), \]

whenever this limit exists. Then $\sum (x_n)$ is absolutely convergent when $r < 1$
and is divergent when $r > 1$.

PROOF. It follows that if the limit in (35.6) exists and is less than 1, then
there is a real number $r_1$ with $r < r_1 < 1$ and a natural number $K$ such that

\[ \|x_n\|^{1/n} \le r_1 \quad \text{for } n \ge K. \]

In this case the series is absolutely convergent. If this limit exceeds 1, then
there is a real number $r_2 > 1$ and a natural number $K$ such that

\[ \|x_n\|^{1/n} \ge r_2 \quad \text{for } n \ge K, \]

in which case the series is divergent. Q.E.D.

This corollary can be generalized by using the limit superior instead of
the limit. We leave the details as an exercise. The next test is due to
D'Alembert.†

35.6 RATIO TEST. (a) If $X = (x_n)$ is a sequence of non-zero elements
of $R^p$ and there is a positive number $r < 1$ and a natural number $K$ such that

(35.7)  \[ \frac{\|x_{n+1}\|}{\|x_n\|} \le r \quad \text{for } n \ge K, \]

then the series $\sum (x_n)$ is absolutely convergent.
(b) If there exists a number $r \ge 1$ and a natural number $K$ such that

(35.8)  \[ \frac{\|x_{n+1}\|}{\|x_n\|} \ge r \quad \text{for } n \ge K, \]

then the series $\sum (x_n)$ is divergent.
PROOF. (a) If (35.7) holds, then an elementary induction argument
shows that $\|x_{K+m}\| \le r^m \|x_K\|$ for $m \ge 1$. It follows that for $n \ge K$ the terms
of $\sum (x_n)$ are dominated by a fixed multiple of the terms of the geometric
series $\sum (r^n)$ with $0 \le r < 1$. From the Comparison Test 35.1, we infer that
$\sum (x_n)$ is absolutely convergent.
† JEAN LE ROND D'ALEMBERT (1717-1783) was a son of the Chevalier Destouches. He
became the secretary of the French Academy and the leading mathematician of the
Encyclopedists. He contributed to dynamics and differential equations.

(b) If (35.8) holds, then an elementary induction argument shows that
$\|x_{K+m}\| \ge r^m \|x_K\|$ for $m \ge 1$. Since $r \ge 1$, it is impossible to have
$\lim (\|x_n\|) = 0$, so the series cannot converge. Q.E.D.
35.7 COROLLARY. If $r$ satisfies $0 < r < 1$ and if the sequence $X = (x_n)$
satisfies (35.7) for $n \ge K$, then the partial sums approximate the sum
$s = \sum (x_n)$ according to the estimate

(35.9)  \[ \|s - s_n\| \le \frac{r}{1 - r} \|x_n\| \quad \text{for } n \ge K. \]

PROOF. The relation (35.7) implies that $\|x_{n+k}\| \le r^k \|x_n\|$ when $n \ge K$.
Therefore, if $m \ge n \ge K$, we have

\[ \|s_m - s_n\| = \|x_{n+1} + \cdots + x_m\| \le \|x_{n+1}\| + \cdots + \|x_m\| \le (r + r^2 + \cdots + r^{m-n}) \|x_n\| < \frac{r}{1 - r} \|x_n\|. \]

Again we take the limit with respect to $m$ to obtain (35.9). Q.E.D.
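The estimate (35.9) can be illustrated with the series $\sum_{n \ge 1} (1/n!)$, an example of our own choosing whose sum is $e - 1$. Here $x_{n+1}/x_n = 1/(n + 1)$, so (35.7) holds for $n \ge K$ with $r = 1/(K + 1)$. The function names and the choice $K = 2$ are assumptions made for this sketch.

```python
import math

def partial_sum(n):
    """s_n = 1/1! + 1/2! + ... + 1/n!."""
    return sum(1 / math.factorial(k) for k in range(1, n + 1))

def ratio_test_bound(n, K):
    """Right side of (35.9) with r = 1/(K + 1), valid for n >= K."""
    r = 1 / (K + 1)    # x_{k+1}/x_k = 1/(k+1) <= r for all k >= K
    return r / (1 - r) / math.factorial(n)

s = math.e - 1         # known sum, used only to measure the true error
for n in range(2, 12):
    error = abs(s - partial_sum(n))
    assert error <= ratio_test_bound(n, K=2)
    print(n, error, ratio_test_bound(n, K=2))
```

Choosing a larger $K$ gives a smaller $r$ and hence a tighter bound, at the price of the estimate applying only from the index $K$ onward.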

35.8 COROLLARY. Let $X = (x_n)$ be a sequence in $R^p$ and set

\[ r = \lim \left( \frac{\|x_{n+1}\|}{\|x_n\|} \right), \]

whenever the limit exists. Then the series $\sum (x_n)$ is absolutely convergent
when $r < 1$ and divergent when $r > 1$.
PROOF. Suppose that the limit exists and $r < 1$. If $r_1$ satisfies $r < r_1 < 1$,
then there is a natural number $K$ such that

\[ \frac{\|x_{n+1}\|}{\|x_n\|} < r_1 \quad \text{for } n \ge K. \]

In this case Theorem 35.6 establishes the absolute convergence of the
series. If $r > 1$, and if $r_2$ satisfies $1 < r_2 < r$, then there is a natural number $K$
such that

\[ \frac{\|x_{n+1}\|}{\|x_n\|} > r_2 \quad \text{for } n \ge K, \]

and in this case there is divergence. Q.E.D.

Raabe’s Test
If $r = 1$, both the Ratio and the Root Tests fail and either convergence or
divergence may take place. (See Example 35.13(d).) For some purposes
it is useful to have a more delicate form of the Ratio Test for the case when
$r = 1$. The next result, which is attributed to Raabe†, is usually adequate.
35.9 RAABE'S TEST. (a) If $X = (x_n)$ is a sequence of non-zero
elements of $R^p$ and there is a real number $a > 1$ and a natural number $K$ such
that

(35.10)  \[ \frac{\|x_{n+1}\|}{\|x_n\|} \le 1 - \frac an \quad \text{for } n \ge K, \]

then the series $\sum (x_n)$ is absolutely convergent.
(b) If there is a real number $a \le 1$ and a natural number $K$ such that

(35.11)  \[ \frac{\|x_{n+1}\|}{\|x_n\|} \ge 1 - \frac an \quad \text{for } n \ge K, \]

then the series $\sum (x_n)$ is not absolutely convergent.


PROOF. (a) Assuming that relation (35.10) holds, we have

\[ k \|x_{k+1}\| \le (k - 1) \|x_k\| - (a - 1) \|x_k\| \quad \text{for } k \ge K. \]

It follows that

(35.12)  \[ (k - 1) \|x_k\| - k \|x_{k+1}\| \ge (a - 1) \|x_k\| > 0 \quad \text{for } k \ge K, \]

from which it follows that the sequence $(k \|x_{k+1}\|)$ is decreasing for $k \ge K$.
On adding the relation (35.12) for $k = K, \ldots, n$ and noting that the left
side telescopes, we find that

\[ (K - 1) \|x_K\| - n \|x_{n+1}\| \ge (a - 1)(\|x_K\| + \cdots + \|x_n\|). \]

This shows that the partial sums of $\sum (\|x_n\|)$ are bounded and establishes the
absolute convergence of $\sum (x_n)$.
(b) If the relation (35.11) holds for $n \ge K$ then, since $a \le 1$,

\[ n \|x_{n+1}\| \ge (n - a) \|x_n\| \ge (n - 1) \|x_n\|. \]

Therefore, the sequence $(n \|x_{n+1}\|)$ is increasing for $n \ge K$, and there exists a
number $c > 0$ such that

\[ \|x_{n+1}\| > c/n, \qquad n \ge K. \]

Since the harmonic series $\sum (1/n)$ diverges, then $\sum (x_n)$ cannot be absolutely
convergent. Q.E.D.

We can also use Raabe’s Test to obtain information on the rapidity of


the convergence.

† JOSEPH L. RAABE (1801-1859) was born in the Ukraine and taught at Zürich. He worked in
both geometry and analysis.

35.10 COROLLARY. If $a > 1$ and if the sequence $X = (x_n)$ satisfies
(35.10), then the partial sums approximate the sum $s$ of $\sum (x_k)$ according to
the estimate

(35.13)  \[ \|s - s_n\| \le \frac{n}{a - 1} \|x_{n+1}\| \quad \text{for } n \ge K. \]

PROOF. Let $m > n \ge K$ and add the inequalities obtained from (35.12)
for $k = n + 1, \ldots, m$ to obtain

\[ n \|x_{n+1}\| - m \|x_{m+1}\| \ge (a - 1)(\|x_{n+1}\| + \cdots + \|x_m\|). \]

Hence we have

\[ \|s_m - s_n\| \le \|x_{n+1}\| + \cdots + \|x_m\| \le \frac{n}{a - 1} \|x_{n+1}\|; \]

taking the limit with respect to $m$, we obtain (35.13). Q.E.D.

In the application of Raabe's Test, it may be convenient to use the
following less sharp limiting form.
35.11 COROLLARY. Let $X = (x_n)$ be a sequence of non-zero elements
of $R^p$ and set

(35.14)  \[ a = \lim \left( n \left( 1 - \frac{\|x_{n+1}\|}{\|x_n\|} \right) \right), \]

whenever this limit exists. Then $\sum (x_n)$ is absolutely convergent when $a > 1$
and is not absolutely convergent when $a < 1$.
PROOF. Suppose the limit (35.14) exists and satisfies $a > 1$. If $a_1$ is any
number with $a > a_1 > 1$, then there exists a natural number $K$ such that

\[ a_1 < n \left( 1 - \frac{\|x_{n+1}\|}{\|x_n\|} \right) \quad \text{for } n \ge K. \]

Therefore, it follows that

\[ \frac{\|x_{n+1}\|}{\|x_n\|} < 1 - \frac{a_1}{n} \quad \text{for } n \ge K \]

and Theorem 35.9 assures the absolute convergence of the series. The
case where $a < 1$ is handled similarly and will be omitted. Q.E.D.

The Integral Test


We now present a powerful test, due to Maclaurin†, for a series of
positive numbers.
† COLIN MACLAURIN (1698-1746) was a student of Newton's and professor at Edinburgh.
He was the leading British mathematician of his time and contributed both to geometry and
mathematical physics.

35.12 INTEGRAL TEST. Let $f$ be a positive, decreasing, continuous
function on $\{t : t \ge 1\}$. Then the series $\sum (f(n))$ converges if and only if the
infinite integral

\[ \int_1^{+\infty} f(t)\,dt = \lim_n \left( \int_1^n f(t)\,dt \right) \]

exists. In the case of convergence, the partial sum $s_n = \sum_{k=1}^{n} (f(k))$ and the
sum $s = \sum_{k=1}^{\infty} (f(k))$ satisfy the estimate

(35.15)  \[ \int_{n+1}^{+\infty} f(t)\,dt \le s - s_n \le \int_n^{+\infty} f(t)\,dt. \]

PROOF. Since $f$ is positive, continuous, and decreasing on the interval
$[k - 1, k]$, it follows that

(35.16)  \[ f(k) \le \int_{k-1}^{k} f(t)\,dt \le f(k - 1). \]

By summing this inequality for $k = 2, 3, \ldots, n$, we obtain the relation

\[ s_n - f(1) \le \int_1^n f(t)\,dt \le s_{n-1}, \]

which shows that both or neither of the limits

\[ \lim (s_n), \qquad \lim \left( \int_1^n f(t)\,dt \right) \]

exist. If they exist, we obtain on summing relation (35.16) for $k = n + 1, \ldots, m$, that

\[ s_m - s_n \le \int_n^m f(t)\,dt \le s_{m-1} - s_{n-1}, \]

whence it follows that

\[ \int_{n+1}^{m+1} f(t)\,dt \le s_m - s_n \le \int_n^m f(t)\,dt. \]

If we take the limit with respect to $m$ in this last inequality, we obtain
(35.15). Q.E.D.
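For $f(t) = 1/t^2$ the integrals in (35.15) can be evaluated by hand, giving $1/(n + 1) \le s - s_n \le 1/n$. The sketch below checks this against the value $s = \pi^2/6$, which we assume known and use only for comparison; the function names and sampled values of $n$ are our own.

```python
import math

def partial_sum(n):
    """s_n = 1/1^2 + 1/2^2 + ... + 1/n^2."""
    return sum(1 / k ** 2 for k in range(1, n + 1))

def tail_bounds(n):
    """Lower and upper bounds from (35.15) for f(t) = 1/t**2."""
    return 1 / (n + 1), 1 / n

s = math.pi ** 2 / 6   # known sum of 1/k^2, used only for checking
for n in (1, 10, 100, 1000):
    lower, upper = tail_bounds(n)
    assert lower <= s - partial_sum(n) <= upper
    print(n, lower, s - partial_sum(n), upper)
```

The two bounds differ by $1/(n(n + 1))$, so the estimate locates the remainder much more precisely than the mere fact of convergence would.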

We shall show how the results in Theorems 35.1 to 35.12 can be applied
to the p-series, which were introduced in Example 34.8(c).
35.13 EXAMPLES. (a) First we shall apply the Comparison Test.
Knowing that the harmonic series $\sum (1/n)$ diverges, it is seen that if $p \le 1$,
then $n^p \le n$ and hence

\[ \frac1n \le \frac{1}{n^p}. \]

After using the Comparison Test 35.1, we conclude that the p-series
$\sum (1/n^p)$ diverges for $p \le 1$.
(b) Now consider the case $p = 2$; that is, the series $\sum (1/n^2)$. We
compare the series with the convergent series $\sum [1/n(n + 1)]$ of Example
34.8(e). Since the relation

\[ \frac{1}{n(n + 1)} < \frac{1}{n^2} \]

holds and the terms on the left form a convergent series, we cannot apply
the Comparison Theorem directly. However, we could apply this theorem
if we compared the nth term of $\sum [1/n(n + 1)]$ with the (n + 1)st term of
$\sum (1/n^2)$. Instead, we choose to apply the Limit Comparison Test 35.2 and
note that

\[ \frac{1}{n(n + 1)} \div \frac{1}{n^2} = \frac{n^2}{n(n + 1)} = \frac{n}{n + 1}. \]

Since the limit of this quotient is 1 and $\sum [1/n(n + 1)]$ converges, then so
does the series $\sum (1/n^2)$.
(c) Now consider the case $p \ge 2$. If we note that $n^p \ge n^2$ for $p \ge 2$, then

\[ \frac{1}{n^p} \le \frac{1}{n^2}; \]

a direct application of the Comparison Test assures that $\sum (1/n^p)$ converges
for $p \ge 2$. Alternatively, we could apply the Limit Comparison Test and
note that

\[ \frac{1}{n^p} \div \frac{1}{n^2} = \frac{n^2}{n^p} = \frac{1}{n^{p-2}}. \]

If $p > 2$, this expression converges to 0, whence it follows from the Limit
Comparison Test 35.2(b) that the series $\sum (1/n^p)$ converges for $p \ge 2$.
By using the Comparison Test, we cannot gain any information concerning
the p-series for $1 < p < 2$ unless we can find a series whose convergence
character is known and which can be compared to the series in this range.
(d) We demonstrate the Root and the Ratio Tests as applied to the
p-series. Note that

\[ \left(\frac{1}{n^p}\right)^{1/n} = (n^{-p})^{1/n} = (n^{1/n})^{-p}. \]

Now it is known (see Example 14.8(e)) that the sequence $(n^{1/n})$ converges
to 1. Hence we have

\[ \lim \left( \left(\frac{1}{n^p}\right)^{1/n} \right) = 1, \]

so that the Root Test (in the form of Corollary 35.5) does not apply.
In the same way, since

\[ \frac{1}{(n + 1)^p} \div \frac{1}{n^p} = \frac{n^p}{(n + 1)^p} = \frac{1}{(1 + 1/n)^p}, \]

and since the sequence $((1 + 1/n)^p)$ converges to 1, the Ratio Test (in the
form of Corollary 35.8) does not apply.
(e) In desperation, we apply Raabe's Test to the p-series for integral
values of $p$. First, we attempt to use Corollary 35.11. Observe that

\[ n\left(1 - \frac{(n + 1)^{-p}}{n^{-p}}\right) = n\left(1 - \frac{n^p}{(n + 1)^p}\right) = n\left(1 - \left(1 - \frac{1}{n + 1}\right)^p\right). \]

If $p$ is an integer, then we can use the Binomial Theorem to obtain an
estimate for the last term. In fact,

\[ n\left(1 - \left(1 - \frac{1}{n + 1}\right)^p\right) = n\left(1 - 1 + \frac{p}{n + 1} - \frac{p(p - 1)}{2(n + 1)^2} + \cdots\right). \]

If we take the limit with respect to $n$, we obtain $p$. Hence this corollary to
Raabe's Test shows that the series converges for integral values of $p \ge 2$
(and, if the Binomial Theorem is known for non-integral values of $p$, this
could be improved).
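The limit in Corollary 35.11 for the p-series can be observed numerically, including for a non-integral exponent. This is only a sketch: the function name `raabe_quotient` and the sampled values of $p$ and $n$ are our own choices, and floating-point evaluation suggests the limit without proving it.

```python
def raabe_quotient(p, n):
    """n * (1 - x_{n+1}/x_n) for the p-series term x_n = n**(-p)."""
    return n * (1 - (n / (n + 1)) ** p)

for p in (1, 2, 3, 2.5):
    values = [raabe_quotient(p, n) for n in (10, 1000, 100000)]
    # The values should approach p as n grows.
    print(p, values)
```

For $p = 1$ the quotient equals $n/(n + 1)$, which tends to 1 from below, so the limiting form gives no decision there; this is consistent with the corollary being silent when $a = 1$.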
(f) Finally, we apply the Integral Test to the p-series. Let $f(t) = t^{-p}$ and
recall that
\[ \int_1^n \frac{1}{t}\,dt = \log(n) - \log(1), \]
\[ \int_1^n \frac{1}{t^p}\,dt = \frac{1}{1-p}\,(n^{1-p} - 1) \quad \text{for } p \ne 1. \]
From these relations we see that the p-series converges if $p > 1$ and
diverges if $p \le 1$.
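As a rough numerical companion to (f), the sketch below compares partial sums of the p-series with the closed-form integrals quoted above. It assumes the standard sandwich for a positive decreasing function, $\int_1^{n+1} f \le \sum_{k=1}^n f(k) \le f(1) + \int_1^n f$; the helper names are illustrative and do not come from the text.

```python
import math


def partial_sum(p, n):
    """Return 1/1**p + 1/2**p + ... + 1/n**p."""
    return sum(1.0 / k ** p for k in range(1, n + 1))


def integral(p, n):
    """Integral of t**(-p) over [1, n], using the closed forms in the text."""
    if p == 1:
        return math.log(n)
    return (n ** (1 - p) - 1) / (1 - p)


if __name__ == "__main__":
    for p in (0.5, 1, 2):
        for n in (10, 1000, 100000):
            lower = integral(p, n + 1)
            upper = 1.0 + integral(p, n)
            # The partial sum is trapped between the two integral bounds.
            print(p, n, lower, partial_sum(p, n), upper)
```

For $p > 1$ the upper bound stays below $1 + 1/(p-1)$, so the partial sums are bounded; for $p \le 1$ the lower bound grows without limit.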

Exercises

35.A. Establish the convergence or the divergence of the series whose nth term
is given by
(a) $\dfrac{1}{(n+1)(n+2)}$, (b) $\dfrac{n}{(n+1)(n+2)}$,
(c) $2^{-1/n}$, (d) $n/2^n$,
(e) $[n(n+1)]^{-1/2}$, (f) $[n^2(n+1)]^{-1/2}$,
(g) $n!/n^n$, (h) $(-1)^n n/(n+1)$.
35.B. For each of the series in Exercise 35.A which converge, estimate the
remainder if only four terms are taken. If we wish to determine the sum within
1/1000, how many terms should we take?
35.C. Discuss the convergence or the divergence of the series with nth term (for
sufficiently large n) given by
(a) $[\log n]^{-p}$, (b) $[\log n]^{-n}$,
(c) $[\log n]^{-\log n}$, (d) $[\log n]^{-\log\log n}$,
(e) $[n \log n]^{-1}$, (f) $[n(\log n)(\log\log n)^2]^{-1}$.
35.D. Discuss the convergence or the divergence of the series with nth term
(a) $2^n e^{-n}$, (b) $n^n e^{-n}$,
(c) $e^{-\log n}$, (d) $(\log n)\, e^{-\sqrt{n}}$,
(e) $n!\, e^{-n}$, (f) $n!\, e^{-n^2}$.
35.E. Show that the series

is convergent, but that both the Ratio and the Root Tests fail to apply.
35.F. If a and b are positive numbers, then
\[ \sum \frac{1}{(an+b)^p} \]
converges if $p > 1$ and diverges if $p \le 1$.


35.G. Discuss the series whose nth term is

n! (nt
@ 55-7 @ntty’ ©) Qnyl
2+4+++(2n) 2-4-+-(2n)
© 35 ant ) San +3)
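35.H. The series given by
\[ \left(\frac{1}{2}\right)^p + \left(\frac{1\cdot 3}{2\cdot 4}\right)^p + \left(\frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\right)^p + \cdots \]
converges for $p > 2$ and diverges for $p \le 2$.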


35.I. Let $X = (x_n)$ be a sequence in $\mathbf{R}^p$ and let r be given by
\[ r = \limsup\, (\|x_n\|^{1/n}). \]
Then $\sum (x_n)$ is absolutely convergent if $r < 1$ and divergent if $r > 1$. [The limit
superior $u = \limsup\,(b_n)$ of a bounded sequence of real numbers was defined in
Section 18. It is the unique number u with the properties that (i) if $u < v$ then
$b_n \le v$ for all sufficiently large $n \in \mathbf{N}$, and (ii) if $w < u$, then $w \le b_n$ for infinitely
many $n \in \mathbf{N}$.]

35.J. Let $X = (x_n)$ be a sequence of non-zero elements of $\mathbf{R}^p$ and let r be given
by $r = \limsup\, (\|x_{n+1}\| / \|x_n\|)$.
(a) Show that if $r < 1$, then the series $\sum (x_n)$ is absolutely convergent.
(b) Give an example of an absolutely convergent series with $r > 1$.
(c) If $\liminf\, (\|x_{n+1}\| / \|x_n\|) > 1$, show that the series $\sum (x_n)$ is not absolutely
convergent.
35.K. Let $X = (x_n)$ be a sequence of non-zero elements of $\mathbf{R}^p$ and let a be given
by $a = \limsup\, (n(1 - \|x_{n+1}\| / \|x_n\|))$.
(a) If $a < 1$, show that the series $\sum (x_n)$ is not absolutely convergent.
(b) Give an example of a divergent series with $a > 1$.
(c) If $\liminf\, (n(1 - \|x_{n+1}\| / \|x_n\|)) > 1$, show that the series $\sum (x_n)$ is absolutely
convergent.
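35.L. Let $X = (x_n)$ be such that $x_n > 0$ for $n \in \mathbf{N}$. Show that the series $\sum (x_n)$ is
divergent if
\[ \limsup \left( (\log n)\left[ n\left(1 - \frac{x_{n+1}}{x_n}\right) - 1 \right] \right) < 1. \]
35.M. Let $x_n > 0$ for $n \in \mathbf{N}$ and suppose that $n(1 - x_{n+1}/x_n) = a + k_n/n^p$, where
$p > 0$ and $(k_n)$ is bounded. Then the series $\sum (x_n)$ converges if $a > 1$ and diverges if
$a \le 1$.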
35.N. If $p > 0$, $q > 0$, then the series
\[ \sum \frac{(p+1)(p+2)\cdots(p+n)}{(q+1)(q+2)\cdots(q+n)} \]
converges for $q > p+1$ and diverges for $q \le p+1$.
35.O. Show that the series $\sum (2^n n!)^2/(2n+1)!$ is divergent.
35.P. Let $x_n > 0$ and let $r = \liminf\, (-\log x_n / \log n)$. Show that $\sum (x_n)$ converges if
$r > 1$ and diverges if $r < 1$.
35.Q. Suppose that none of the numbers a, b, c is a negative integer or zero.
Prove that the hypergeometric series
\[ \frac{ab}{1!\,c} + \frac{a(a+1)\,b(b+1)}{2!\,c(c+1)} + \frac{a(a+1)(a+2)\,b(b+1)(b+2)}{3!\,c(c+1)(c+2)} + \cdots \]
is absolutely convergent for $c > a+b$ and divergent for $c \le a+b$.
35.R. Let $a_n > 0$ and suppose that $\sum (a_n)$ converges. Construct a convergent
series $\sum (b_n)$ with $b_n > 0$ such that $\lim\,(a_n/b_n) = 0$; hence $\sum (b_n)$ converges less rapidly
than $\sum (a_n)$. (Hint: let $(A_n)$ be the partial sums of $\sum (a_n)$ and A its limit. Define
$r_0 = A$, $r_n = A - A_n$ and $b_n = \sqrt{r_{n-1}} - \sqrt{r_n}$.)
35.S. Let a, >0 and suppose that ¥ (a,) diverges. Construct a divergent series
¥ (b,) with b, >0 such that lim (b,/a,) =0; hence ¥ (b,) diverges less rapidly than
¥(a,). (Hint: Let b,=Va, and b, = Va,_;-Va,, n> 1.)
35.T. Let $\{n_1, n_2, \ldots\}$ denote the collection of natural numbers that do not use
the digit 6 in their decimal expansion. Show that the series $\sum (1/n_k)$ converges to a
number less than 90. If $\{m_1, m_2, \ldots\}$ is the collection that ends in 6, then $\sum (1/m_k)$
diverges.

Project
35.α. Although infinite products do not occur as frequently as infinite series,
they are of importance in many investigations and applications. For simplicity,
we shall restrict attention here to infinite products with terms $a_n > 0$. If $A = (a_n)$ is
a sequence of strictly positive real numbers, then the infinite product, or the
sequence of partial products, generated by A is the sequence $P = (p_n)$ defined by
\[ p_1 = a_1, \quad p_2 = p_1 a_2 \ (= a_1 a_2), \ \ldots, \quad p_n = p_{n-1} a_n \ (= a_1 a_2 \cdots a_{n-1} a_n), \ \ldots. \]
If the sequence P is convergent to a non-zero number, then we call lim P the
product of the infinite product generated by A. In this case we say that the infinite
product is convergent and write either
\[ \prod_{n=1}^{\infty} a_n, \qquad \prod (a_n), \qquad \text{or} \qquad a_1 a_2 a_3 \cdots a_n \cdots \]
to denote both P and lim P.
(Note: the requirement that $\lim P \ne 0$ is not essential but is conventional, since it
insures that certain properties of finite products carry over to infinite products.)
(a) Show that a necessary condition for the convergence of the infinite product is
that $\lim\,(a_n) = 1$.
(b) Prove that a necessary and sufficient condition for the convergence of
$\prod_{n=1}^{\infty} a_n$, $a_n > 0$, is the convergence of $\sum_{n=1}^{\infty} \log a_n$.
(c) Infinite products often have terms of the form $a_n = 1 + u_n$. In keeping with
our standing restriction, we suppose $u_n > -1$ for all $n \in \mathbf{N}$. If $u_n \ge 0$, show that a
necessary and sufficient condition for the convergence of the infinite product is the
convergence of the infinite series $\sum (u_n)$. (Hint: use the Limit Comparison Test
35.2.)
(d) Let $u_n > -1$. Show that if the infinite series $\sum (u_n)$ is absolutely convergent,
then the infinite product $\prod (1 + u_n)$ is convergent.
(e) Suppose that $u_n > -1$ and that the series $\sum (u_n)$ is convergent. Then a
necessary and sufficient condition for the convergence of the infinite product
$\prod (1 + u_n)$ is the convergence of the infinite series $\sum (u_n^2)$. (Hint: use Taylor’s
Theorem and show that there exist positive constants A and B such that if $|u| < \tfrac{1}{2}$,
then $Au^2 \le u - \log(1+u) \le Bu^2$.)

Section 36 Further Results for Series

The tests given in Section 35 all have the character that they guarantee
that, if certain hypotheses are fulfilled, then the series $\sum (x_n)$ is absolutely
convergent. Now it is known that absolute convergence implies ordinary
convergence, but it is readily seen from an examination of special series,
such as
\[ \sum \frac{(-1)^n}{n}, \qquad \sum \frac{(-1)^n}{\sqrt{n}}, \]
that convergence may take place even though absolute convergence fails.
It is desired, therefore, to have a test which yields information about
ordinary convergence. There are many such tests which apply to special
types of series. Perhaps the ones with most general applicability are
those due to Abel† and Dirichlet.
To establish these tests, we need a lemma which is sometimes called the
partial summation formula, since it corresponds to the familiar integration
by parts formula. In most applications, the sequences X and Y are both
sequences in $\mathbf{R}$, but the results hold when X and Y are sequences in $\mathbf{R}^p$
and the inner product is used or when one of X and Y is a real sequence
and the other is in $\mathbf{R}^p$.
36.1 ABEL’S LEMMA. Let $X = (x_n)$ in $\mathbf{R}$ and $Y = (y_n)$ in $\mathbf{R}^p$ be
sequences and let the partial sums of $\sum (y_n)$ be denoted by $(s_k)$. If $m \ge n$,
then
\[ \sum_{j=n}^{m} x_j y_j = (x_{m+1} s_m - x_n s_{n-1}) + \sum_{j=n}^{m} (x_j - x_{j+1}) s_j. \tag{36.1} \]
PROOF. A proof of this result may be given by noting that $y_j = s_j - s_{j-1}$
and by matching the terms on each side of the equality. We shall leave the
details to the reader. Q.E.D.
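Since the proof is left to the reader, a quick numerical check of (36.1) may be reassuring. The sketch below handles real sequences only and uses the convention $s_0 = 0$; the function name `abel_sides` and the random test data are illustrative assumptions, not part of the text.

```python
import random


def abel_sides(x, y, n, m):
    """Return (lhs, rhs) of formula (36.1) for 1 <= n <= m.

    x and y are Python lists holding x_1, x_2, ... and y_1, y_2, ...
    (stored 0-indexed); x needs at least m + 1 entries, y at least m.
    """
    # s[k] = y_1 + ... + y_k, with s[0] = 0.
    s = [0.0]
    for term in y:
        s.append(s[-1] + term)

    lhs = sum(x[j - 1] * y[j - 1] for j in range(n, m + 1))
    boundary = x[m] * s[m] - x[n - 1] * s[n - 1]  # x_{m+1} s_m - x_n s_{n-1}
    interior = sum((x[j - 1] - x[j]) * s[j] for j in range(n, m + 1))
    return lhs, boundary + interior


if __name__ == "__main__":
    random.seed(0)
    x = [random.uniform(-1, 1) for _ in range(30)]
    y = [random.uniform(-1, 1) for _ in range(30)]
    print(abel_sides(x, y, 5, 20))
```

The two returned numbers agree up to rounding error, mirroring the term-by-term matching suggested in the proof.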
We apply Abel’s Lemma to conclude that the series $\sum (x_n y_n)$ is convergent
in a case where both of the series $\sum (x_n)$ and $\sum (y_n)$ may be
divergent.
36.2 DIRICHLET’S TEST. Suppose the partial sums of $\sum (y_n)$ are
bounded. (a) If the sequence $X = (x_n)$ converges to zero, and if
\[ \sum |x_n - x_{n+1}| \tag{36.2} \]
is convergent, then the series $\sum (x_n y_n)$ is convergent.
(b) In particular, if $X = (x_n)$ is a decreasing sequence of positive real
numbers which converges to zero, then the series $\sum (x_n y_n)$ is convergent.

† NIELS HENRIK ABEL (1802-1829) was the son of a poor Norwegian minister. When only
twenty-two he proved the impossibility of solving the general quintic equation by radicals.
This self-taught genius also did outstanding work on series and elliptic functions before his
early death from tuberculosis.

PROOF. (a) Suppose that $\|s_j\| < B$ for all j. Using (36.1), we have the
estimate
\[ \Big\| \sum_{j=n}^{m} x_j y_j \Big\| \le \Big\{ |x_{m+1}| + |x_n| + \sum_{j=n}^{m} |x_j - x_{j+1}| \Big\} B. \tag{36.3} \]
If $\lim\,(x_n) = 0$, the first two terms on the right side can be made arbitrarily
small by taking m and n sufficiently large. Also if the series (36.2)
converges, then the Cauchy Criterion assures that the final term on this
side can be made less than $\varepsilon$ by taking $m \ge n \ge M(\varepsilon)$. Hence the Cauchy
Criterion implies that the series $\sum (x_n y_n)$ is convergent.
(b) If $x_1 \ge x_2 \ge \cdots$, then the series in (36.2) is telescoping and
convergent. Q.E.D.
36.3 COROLLARY. In part (b), we have the error estimate
\[ \Big\| \sum_{j=1}^{\infty} x_j y_j - \sum_{j=1}^{n} x_j y_j \Big\| \le 2 x_{n+1} B, \]
where B is an upper bound for the partial sums of $\sum (y_j)$.
PROOF. This is readily obtained from relation (36.3). Q.E.D.

The next test strengthens the hypothesis on the series $\sum (y_n)$, but relaxes
the hypothesis on $\sum (x_n)$.
36.4 ABEL’S TEST. Suppose that the series $\sum (y_n)$ converges in $\mathbf{R}^p$.
(a) If the sequence $X = (x_n)$ in $\mathbf{R}$ is such that
\[ \sum |x_n - x_{n+1}| \tag{36.2} \]
is convergent, then the series $\sum (x_n y_n)$ is convergent.
(b) In particular, if the sequence $X = (x_n)$ is monotone and convergent to x
in $\mathbf{R}$, then the series $\sum (x_n y_n)$ is convergent.
PROOF. (a) By hypothesis, the partial sums $s_k$ of $\sum (y_n)$ converge to
some element s in $\mathbf{R}^p$. Hence there is a bound B for $\{\|s_k\| : k \in \mathbf{N}\}$ and,
given $\varepsilon > 0$ there is $N_1(\varepsilon)$ such that if $n \ge N_1(\varepsilon)$, then $\|s_n - s\| < \varepsilon$.
Now the hypothesis that (36.2) is convergent implies that if $n \in \mathbf{N}$, then
\[ |x_n| = |x_1 + (x_2 - x_1) + \cdots + (x_n - x_{n-1})| \le |x_1| + \sum_{k=1}^{n-1} |x_k - x_{k+1}|, \]
so that $|x_n| \le A$ for some $A > 0$. Moreover, there exists $N_2(\varepsilon)$ such that if
$m \ge n \ge N_2(\varepsilon)$, then
\[ |x_{m+1} - x_n| \le \sum_{j=n}^{m} |x_j - x_{j+1}| < \varepsilon. \tag{36.4} \]
Now let $N_3(\varepsilon) = \sup\,\{N_1(\varepsilon), N_2(\varepsilon)\}$ so that if $m \ge n > N_3(\varepsilon)$, then we have
\[ \|x_{m+1} s_m - x_n s_{n-1}\| \le \|x_{m+1} s_m - x_{m+1} s\| + \|x_{m+1} s - x_n s\| + \|x_n s - x_n s_{n-1}\| \]
\[ \le |x_{m+1}|\, \|s_m - s\| + |x_{m+1} - x_n|\, \|s\| + |x_n|\, \|s - s_{n-1}\| \]
\[ \le A\varepsilon + \varepsilon B + A\varepsilon = (2A + B)\varepsilon. \]
Therefore, by Abel’s Lemma 36.1, if $m \ge n > N_3(\varepsilon)$, then we have
\[ \Big\| \sum_{j=n}^{m} x_j y_j \Big\| \le (2A + B)\varepsilon + \Big\| \sum_{j=n}^{m} (x_j - x_{j+1}) s_j \Big\| \]
\[ \le (2A + B)\varepsilon + \Big( \sum_{j=n}^{m} |x_j - x_{j+1}| \Big) B \]
\[ < 2(A + B)\varepsilon, \]
where we have used (36.4) in the last step. Since $\varepsilon > 0$ is arbitrary, the
convergence of $\sum (x_j y_j)$ is established.
(b) If the sequence $(x_n)$ is monotone and converges to x, then the series
(36.2) is telescoping and converges either to $x - x_1$ or to $x_1 - x$. Q.E.D.

If we use the same type of argument we can establish the following error
estimate.
36.5 COROLLARY. In part (b), we have the error estimate
\[ \Big\| \sum_{j=1}^{\infty} x_j y_j - \sum_{j=1}^{n} x_j y_j \Big\| \le |x_{n+1}|\, \|s - s_n\| + 2B\, |x - x_{n+1}|. \]
Alternating Series
There is a particularly important class of conditionally convergent real
series, namely those whose terms are alternately positive and negative.
36.6 DEFINITION. A sequence $X = (x_n)$ of non-zero real numbers is
alternating if the terms $(-1)^n x_n$, $n = 1, 2, \ldots$, are all positive (or all
negative) real numbers. If a sequence $X = (x_n)$ is alternating, we say that
the series $\sum (x_n)$ it generates is an alternating series.
It is useful to set $x_n = (-1)^n z_n$ and require that $z_n > 0$ (or $z_n < 0$) for all
$n = 1, 2, \ldots$. The convergence of alternating series is easily treated when
the next result, proved by Leibniz, can be applied.
36.7 ALTERNATING SERIES TEST. Let $Z = (z_n)$ be a decreasing sequence
of strictly positive numbers with $\lim\,(z_n) = 0$. Then the alternating
series $\sum ((-1)^n z_n)$ is convergent. Moreover, if s is the sum of this series and $s_n$
is the nth partial sum, then we have the estimate
\[ |s - s_n| \le z_{n+1} \tag{36.5} \]
for the rapidity of convergence.
PROOF. This follows immediately from Dirichlet’s Test 36.2(b) if we
take $y_n = (-1)^n$, but the error estimate given in Corollary 36.3 is not as
sharp as (36.5). We can also proceed directly and show by mathematical
induction that if $m \ge n$, then
\[ |s_m - s_n| = |z_{n+1} - z_{n+2} + \cdots + (-1)^{m-n-1} z_m| \le |z_{n+1}|. \]
This yields both the convergence and the estimate (36.5). Q.E.D.
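The estimate (36.5) is easy to test on a concrete series. The sketch below uses $z_n = 1/n$, so the series is $\sum (-1)^n/n$, whose sum is $-\log 2$ (a standard fact quoted here, not derived in this section); the function name `partial_sum` is an illustrative choice.

```python
import math


def partial_sum(n):
    """Return s_n = sum of (-1)**k / k for k = 1, ..., n."""
    return sum((-1) ** k / k for k in range(1, n + 1))


if __name__ == "__main__":
    s = -math.log(2)  # sum of the full series (standard fact, assumed here)
    for n in (1, 10, 100, 1000):
        error = abs(s - partial_sum(n))
        bound = 1.0 / (n + 1)  # z_{n+1}, the bound from (36.5)
        print(n, error, bound)
```

In each printed row the observed error stays below $z_{n+1} = 1/(n+1)$, as (36.5) predicts.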
36.8 EXAMPLES. (a) The series $\sum ((-1)^n/n)$, which is sometimes
called the alternating harmonic series, is not absolutely convergent. However,
it follows from the Alternating Series Test that it is convergent.
(b) Similarly, the series $\sum ((-1)^n/\sqrt{n})$ is convergent, but not absolutely
convergent.
(c) Let $x \in \mathbf{R}$ and let $k \in \mathbf{Z}$. Then, since
\[ 2 \cos kx \, \sin \tfrac{1}{2}x = \sin (k + \tfrac{1}{2})x - \sin (k - \tfrac{1}{2})x, \]
it follows that
\[ 2 \sin \tfrac{1}{2}x \, [\cos x + \cdots + \cos nx] = \sin (n + \tfrac{1}{2})x - \sin \tfrac{1}{2}x. \]
Hence, if x is not an integer multiple of $2\pi$, then
\[ \cos x + \cdots + \cos nx = \frac{\sin (n + \tfrac{1}{2})x - \sin \tfrac{1}{2}x}{2 \sin \tfrac{1}{2}x}. \tag{36.6} \]
Therefore, if $x \notin \{2k\pi : k \in \mathbf{Z}\}$, then
\[ |\cos x + \cdots + \cos nx| \le \frac{1}{|\sin \tfrac{1}{2}x|}. \]
We can then apply Dirichlet’s Test 36.2(b) to conclude that the series
$\sum (1/n) \cos nx$ converges for all $x \notin \{2k\pi : k \in \mathbf{Z}\}$. We note that this series
diverges when $x = 2k\pi$ for some $k \in \mathbf{Z}$.
(d) Let $x \in \mathbf{R}$ and let $k \in \mathbf{Z}$. Then since
\[ 2 \sin kx \, \sin \tfrac{1}{2}x = \cos (k - \tfrac{1}{2})x - \cos (k + \tfrac{1}{2})x, \]
it follows that
\[ 2 \sin \tfrac{1}{2}x \, [\sin x + \cdots + \sin nx] = \cos \tfrac{1}{2}x - \cos (n + \tfrac{1}{2})x. \]
Hence, if x is not an integer multiple of $2\pi$, then
\[ \sin x + \cdots + \sin nx = \frac{\cos \tfrac{1}{2}x - \cos (n + \tfrac{1}{2})x}{2 \sin \tfrac{1}{2}x}. \]
Therefore, if $x \notin \{2k\pi : k \in \mathbf{Z}\}$, then
\[ |\sin x + \cdots + \sin nx| \le \frac{1}{|\sin \tfrac{1}{2}x|}. \]
As before, Dirichlet’s Test implies the convergence of the series
$\sum (1/n) \sin nx$ for all $x \notin \{2k\pi : k \in \mathbf{Z}\}$. We note that this series also
converges when $x = 2k\pi$ for $k \in \mathbf{Z}$.
(e) Let $Y = (y_n)$ be the sequence in $\mathbf{R}^2$ whose elements are
\[ y_1 = (1, 0), \quad y_2 = (0, 1), \quad y_3 = (-1, 0), \quad y_4 = (0, -1), \quad \ldots, \quad y_{n+4} = y_n, \quad \ldots. \]
It is readily seen that the series $\sum (y_n)$ does not converge, but its partial
sums $s_k$ are bounded; in fact, we have $\|s_k\| \le \sqrt{2}$. Hence Dirichlet’s Test
shows that the series $\sum (1/n) y_n$ is convergent in $\mathbf{R}^2$.
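The bounds on the trigonometric partial sums in (c) and (d) are the hypothesis that Dirichlet’s Test needs, and they can be spot-checked numerically. The sketch below is only an illustration; the function names and the sample points are assumptions made for the example.

```python
import math


def cos_sum(x, n):
    """Return cos x + cos 2x + ... + cos nx."""
    return sum(math.cos(k * x) for k in range(1, n + 1))


def sin_sum(x, n):
    """Return sin x + sin 2x + ... + sin nx."""
    return sum(math.sin(k * x) for k in range(1, n + 1))


def bound(x):
    """Return 1 / |sin(x/2)|, valid when x is not a multiple of 2*pi."""
    return 1.0 / abs(math.sin(0.5 * x))


if __name__ == "__main__":
    for x in (0.1, 1.0, 3.0, 6.0):
        worst_cos = max(abs(cos_sum(x, n)) for n in range(1, 301))
        worst_sin = max(abs(sin_sum(x, n)) for n in range(1, 301))
        print(x, worst_cos, worst_sin, bound(x))
```

The bound deteriorates as x approaches a multiple of $2\pi$, which is consistent with the divergence of $\sum (1/n) \cos nx$ at those points.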

Double Series

Sometimes it is necessary to consider infinite sums depending on two
integral indices. The theory of such double series is developed by reducing
them to double sequences; thus all of the results in Section 19 dealing with
double sequences can be interpreted for double series. However, we shall
not draw from the results of Section 19; instead, we shall restrict our
attention to absolutely convergent double series, since those are the type of
double series that arise most often.
Suppose that to every pair $(i, j)$ in $\mathbf{N} \times \mathbf{N}$ one has an element $x_{ij}$ in $\mathbf{R}^p$.
One defines the (m, n)th partial sum $s_{mn}$ to be
\[ s_{mn} = \sum_{j=1}^{n} \sum_{i=1}^{m} x_{ij}. \]
By analogy with Definition 34.1, we shall say that the double series $\sum (x_{ij})$
converges to an element x in $\mathbf{R}^p$ if for every $\varepsilon > 0$ there exists a natural
number $M(\varepsilon)$ such that if $m \ge M(\varepsilon)$ and $n \ge M(\varepsilon)$ then
\[ \|x - s_{mn}\| < \varepsilon. \]
By analogy with Definition 34.6, we shall say that the double series $\sum (x_{ij})$ is
absolutely convergent if the double series $\sum (\|x_{ij}\|)$ in $\mathbf{R}$ is convergent.
It is an exercise to show that if a double series is absolutely convergent,
then it is convergent. Moreover, a double series is absolutely convergent if
and only if the set
\[ \Big\{ \sum_{j=1}^{n} \sum_{i=1}^{m} \|x_{ij}\| : m, n \in \mathbf{N} \Big\} \tag{36.7} \]
is a bounded set of real numbers.
We wish to relate double series with iterated series, but we shall discuss
only absolutely convergent series. The next result is very elementary, but
it gives a useful criterion for the absolute convergence of the double series.
36.9 LEMMA. Suppose that the iterated series $\sum_{j=1}^{\infty} \sum_{i=1}^{\infty} \|x_{ij}\|$ converges.
Then the double series $\sum (x_{ij})$ is absolutely convergent.
PROOF. By hypothesis each series $\sum_{i=1}^{\infty} \|x_{ij}\|$ converges to a positive
number $a_j$, $j \in \mathbf{N}$. Moreover, the series $\sum (a_j)$ converges to a number A. It
is clear that A is an upper bound for the set (36.7). Q.E.D.

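36.10 THEOREM. Suppose that the double series $\sum (x_{ij})$ converges
absolutely to x in $\mathbf{R}^p$. Then both of the iterated series
\[ \sum_{j=1}^{\infty} \sum_{i=1}^{\infty} x_{ij}, \qquad \sum_{i=1}^{\infty} \sum_{j=1}^{\infty} x_{ij} \tag{36.8} \]
also converge to x.
PROOF. By hypothesis there exists a positive real number A which is an
upper bound for the set in (36.7). If n is fixed, we observe that
\[ \sum_{i=1}^{m} \|x_{in}\| \le \sum_{j=1}^{n} \sum_{i=1}^{m} \|x_{ij}\| \le A, \]
for each m in $\mathbf{N}$. It thus follows that, for each $n \in \mathbf{N}$, the single series
$\sum_{i=1}^{\infty} (x_{in})$ is absolutely convergent to an element $y_n$ in $\mathbf{R}^p$.
If $\varepsilon > 0$, let $M(\varepsilon)$ be such that if $m, n \ge M(\varepsilon)$, then
\[ \|s_{mn} - x\| < \varepsilon. \tag{36.9} \]
In view of the relation
\[ s_{mn} = \sum_{i=1}^{m} x_{i1} + \sum_{i=1}^{m} x_{i2} + \cdots + \sum_{i=1}^{m} x_{in}, \]
we infer that
\[ \lim_{m}\, (s_{mn}) = \sum_{i=1}^{\infty} x_{i1} + \sum_{i=1}^{\infty} x_{i2} + \cdots + \sum_{i=1}^{\infty} x_{in} = y_1 + y_2 + \cdots + y_n. \]
If we pass to the limit in (36.9) with respect to m, we obtain the relation
\[ \Big\| \sum_{j=1}^{n} y_j - x \Big\| \le \varepsilon, \qquad n \ge M(\varepsilon). \]
This proves that the first iterated sum in (36.8) exists and equals x. An
analogous proof applies to the second iterated sum. Q.E.D.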
There is one additional method of summing double series that we shall
consider, namely along the diagonals $i + j = n$.
36.11 THEOREM. Suppose that the double series $\sum (x_{ij})$ converges
absolutely to x in $\mathbf{R}^p$. If we define
\[ t_k = \sum_{i+j=k} x_{ij} = x_{1,k-1} + x_{2,k-2} + \cdots + x_{k-1,1}, \]
then the series $\sum (t_k)$ converges absolutely to x.
PROOF. Let A be the supremum of the set in (36.7). We observe that
\[ \sum_{k=2}^{n} \|t_k\| \le \sum_{j=1}^{n} \sum_{i=1}^{n} \|x_{ij}\| \le A. \]
Hence the series $\sum (t_k)$ is absolutely convergent; it remains to show that it
converges to x. Let $\varepsilon > 0$ and let M be such that
\[ A - \varepsilon < \sum_{j=1}^{M} \sum_{i=1}^{M} \|x_{ij}\| \le A. \]
If $m, n \ge M$, then it follows that $\|s_{mn} - s_{MM}\|$ is no greater than the sum
$\sum (\|x_{ij}\|)$ extended over all pairs $(i, j)$ satisfying either $M < i \le m$ or $M < j \le n$.
Hence $\|s_{mn} - s_{MM}\| < \varepsilon$, when $m, n \ge M$. It follows from this that
$\|x - s_{MM}\| \le \varepsilon$. A similar argument shows that if $n \ge 2M$, then
\[ \Big\| \sum_{k=2}^{n} t_k - s_{MM} \Big\| < \varepsilon, \]
whence it follows that $x = \sum t_k$. Q.E.D.
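To see Theorems 36.10 and 36.11 at work, one can sum a simple absolutely convergent double series in two ways. The sketch below assumes the illustrative terms $x_{ij} = 2^{-i} 3^{-j}$ for $i, j \ge 1$ (not an example from the text), whose sum is $(\sum 2^{-i})(\sum 3^{-j}) = 1 \cdot \tfrac{1}{2} = \tfrac{1}{2}$.

```python
def x(i, j):
    """Illustrative double sequence x_ij = 2**(-i) * 3**(-j), i, j >= 1."""
    return 2.0 ** (-i) * 3.0 ** (-j)


def square_sum(n):
    """Return the partial sum s_nn over 1 <= i, j <= n."""
    return sum(x(i, j) for i in range(1, n + 1) for j in range(1, n + 1))


def diagonal_sum(n):
    """Return t_2 + ... + t_n, where t_k sums x_ij over i + j = k."""
    return sum(x(i, k - i) for k in range(2, n + 1) for i in range(1, k))


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, square_sum(n), diagonal_sum(n))
```

Both columns approach 0.5; the diagonal sums lag slightly because $t_2 + \cdots + t_n$ covers only the triangle $i + j \le n$ rather than the full square.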

Cauchy Multiplication
In the process of multiplying two power series and collecting the terms
according to the powers, there arises very naturally a new method of
generating a series from two given ones. In this connection it is notationally
useful to have the terms of the series indexed by 0, 1, 2, ....
36.12 DEFINITION. If $\sum_{i=0}^{\infty} (y_i)$ and $\sum_{j=0}^{\infty} (z_j)$ are infinite series in $\mathbf{R}^p$,
their Cauchy product is the series $\sum_{k=0}^{\infty} (x_k)$, where
\[ x_k = y_0 \cdot z_k + y_1 \cdot z_{k-1} + \cdots + y_k \cdot z_0. \]
Here the dot denotes the inner product in $\mathbf{R}^p$. In like manner we can
define the Cauchy product of a series in $\mathbf{R}$ and a series in $\mathbf{R}^p$.
It is perhaps a bit surprising that the Cauchy product of two convergent series
may fail to converge. However, it is seen that the series
\[ \sum_{n=0}^{\infty} \frac{(-1)^n}{\sqrt{n+1}} \]
is convergent, but the nth term of the Cauchy product of this series with itself is
\[ (-1)^n \left[ \frac{1}{\sqrt{1}\,\sqrt{n+1}} + \frac{1}{\sqrt{2}\,\sqrt{n}} + \cdots + \frac{1}{\sqrt{n+1}\,\sqrt{1}} \right]. \]
Since there are $n + 1$ terms in the bracket and each term exceeds $1/(n+2)$, the
terms in the Cauchy product do not converge to zero. Hence this Cauchy product
cannot converge.
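The failure just described can be reproduced directly from Definition 36.12. In the sketch below, `a` and `cauchy_term` are illustrative names; `cauchy_term(n)` computes the nth term of the Cauchy product of $\sum (-1)^n/\sqrt{n+1}$ with itself.

```python
import math


def a(n):
    """Return the nth term (-1)**n / sqrt(n + 1), n = 0, 1, 2, ..."""
    return (-1) ** n / math.sqrt(n + 1)


def cauchy_term(n):
    """Return the nth term a_0 a_n + a_1 a_{n-1} + ... + a_n a_0."""
    return sum(a(k) * a(n - k) for k in range(n + 1))


if __name__ == "__main__":
    for n in (0, 1, 10, 100, 1000):
        # The terms alternate in sign but do not shrink toward zero.
        print(n, cauchy_term(n), (n + 1) / (n + 2))
```

The absolute values of the printed terms stay above $(n+1)/(n+2)$, the lower bound obtained in the text from $n + 1$ summands each exceeding $1/(n+2)$.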

36.13 THEOREM. If the series $\sum_{i=0}^{\infty} y_i$ and $\sum_{j=0}^{\infty} z_j$ converge absolutely to
y, z in $\mathbf{R}^p$, then their Cauchy product converges absolutely to $y \cdot z$.
PROOF. If $i, j = 0, 1, 2, \ldots$, let $x_{ij} = y_i \cdot z_j$. The hypotheses imply that
the iterated series $\sum_{j=0}^{\infty} \sum_{i=0}^{\infty} |x_{ij}|$ converges. By Lemma 36.9, the double
series $\sum (x_{ij})$ is absolutely convergent to a real number x. By applying
Theorems 36.10 and 36.11, we infer that both of the series
\[ \sum_{j=0}^{\infty} \sum_{i=0}^{\infty} x_{ij}, \qquad \sum_{k=0}^{\infty} \sum_{i+j=k} x_{ij} \]
converge to x. It is readily checked that the iterated series converges to
$y \cdot z$ and that the diagonal series is the Cauchy product of $\sum (y_i)$ and $\sum (z_j)$.
Q.E.D.
In the case $p = 1$, it was proved by Mertens† that the absolute convergence
of one of the series is sufficient to imply the convergence of the
Cauchy product. In addition, Cesàro showed that the arithmetic means of
the partial sums of the Cauchy product converge to yz. (See Exercises
36.O, P.)

Exercises
36.A. Consider the series
\[ 1 - \frac{1}{2} - \frac{1}{3} + \frac{1}{4} + \frac{1}{5} - \frac{1}{6} - \frac{1}{7} + + - - \cdots, \]
where the signs come in pairs. Does it converge?


† FRANZ (C. J.) MERTENS (1840-1927) studied at Berlin and taught at Cracow and Vienna.
He contributed primarily to geometry, number theory, and algebra.

36.B. Let $a_n \in \mathbf{R}$ for $n \in \mathbf{N}$ and let $p < q$. If the series $\sum (a_n/n^p)$ is convergent,
then the series $\sum (a_n/n^q)$ is also convergent.
36.C. If p and q are strictly positive numbers, then
\[ \sum (-1)^n \frac{(\log n)^p}{n^q} \]
is a convergent series.
36.D. Discuss the series whose nth term is

n n”
(a) (-1) ape (b) @ep?

() yr Se @ oe
36.E. Suppose that $\sum (a_n)$ is a convergent series of real numbers. Either prove
that $\sum (b_n)$ converges or give a counter-example, when we define $b_n$ by
(a) $a_n/n$, (b) $\sqrt{a_n}/n$ $(a_n \ge 0)$,
(c) $a_n \sin n$, (d) $\sqrt{a_n/n}$ $(a_n \ge 0)$,
(e) $n^{1/n} a_n$, (f) $a_n/(1 + |a_n|)$.
36.F. Show that the series
\[ 1 + \frac{1}{2} - \frac{1}{3} + \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + + - \cdots \]
is divergent.
36.G. If the hypothesis that (z,) is decreasing is dropped, show that the
Alternating Series Test 36.7 may fail.
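36.H. For $n \in \mathbf{N}$, let $c_n$ be defined by
\[ c_n = \frac{1}{1} + \frac{1}{2} + \cdots + \frac{1}{n} - \log n. \]
Show that $(c_n)$ is a decreasing sequence of positive numbers. The limit C of this
sequence is called Euler’s Constant and is approximately equal to 0.577. Show
that if we put
\[ b_n = \frac{1}{1} - \frac{1}{2} + \frac{1}{3} - \cdots - \frac{1}{2n}, \]
then the sequence $(b_n)$ converges to log 2. (Hint: $b_n = c_{2n} - c_n + \log 2$.)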
36.I. Let $\sum (a_{mn})$ be the double series given by
\[ a_{mn} = \begin{cases} +1, & \text{if } m - n = 1, \\ -1, & \text{if } m - n = -1, \\ \phantom{+}0, & \text{otherwise.} \end{cases} \]
Show that both iterated sums exist, but are unequal, and the double sum does not
exist. However, if $(s_{mn})$ denote the partial sums, then $\lim\,(s_{nn})$ exists.
36.J. Show that if the double and the iterated series of $\sum (a_{mn})$ exist, then they
are all equal. Show that the existence of the double series does not imply the
existence of the iterated series; in fact the existence of the double series does not
even imply that $\lim_n\,(a_{mn}) = 0$ for each m.
36.K. Show that if $p > 1$ and $q > 1$, then the double series
\[ \sum \left( \frac{1}{m^p n^q} \right) \qquad \text{and} \qquad \sum \left( \frac{1}{(m^2 + n^2)^p} \right) \]
are convergent.
36.L. By separating } (1/n’) into odd and even parts, show that
— 1 — ‘ 1
Lan Lan 3Gnai"
n=l

36.M. If $|a| < 1$ and $|b| < 1$, prove that the series $a + b + a^2 + b^2 + a^3 + b^3 + \cdots$
converges. What is the limit?
36.N. If $\sum (a_n^2)$ and $\sum (b_n^2)$ are convergent, then $\sum (a_n b_n)$ is absolutely convergent
and
\[ \sum a_n b_n \le \Big\{ \sum a_n^2 \Big\}^{1/2} \Big\{ \sum b_n^2 \Big\}^{1/2}. \]
In addition, $\sum (a_n + b_n)^2$ converges and
\[ \Big\{ \sum (a_n + b_n)^2 \Big\}^{1/2} \le \Big\{ \sum a_n^2 \Big\}^{1/2} + \Big\{ \sum b_n^2 \Big\}^{1/2}. \]
36.O. Prove Mertens’ Theorem: If $\sum (a_n)$ converges absolutely to A and $\sum (b_n)$
converges to B, then their Cauchy product converges to AB. (Hint: Let the partial
sums be denoted by $A_n$, $B_n$, $C_n$, respectively. Show that $\lim\,(C_{2n} - A_n B_n) = 0$ and
$\lim\,(C_{2n+1} - A_n B_n) = 0$.)
36.P. Prove Cesàro’s Theorem: Let $\sum (a_n)$ converge to A and $\sum (b_n)$ converge to
B, and let $\sum (c_n)$ be their Cauchy product. If $(C_n)$ is the sequence of partial sums of
$\sum (c_n)$, then
\[ \frac{1}{n}(C_1 + C_2 + \cdots + C_n) \to AB. \]
(Hint: write $C_1 + \cdots + C_n = A_1 B_n + \cdots + A_n B_1$; break this sum into three parts; and
use the fact that $A_n \to A$ and $B_n \to B$.)

Section 37. Series of Functions

Because of their frequent appearance and importance, we now present a
discussion of infinite series of functions. Since the convergence of an
infinite series is handled by examining the sequence of partial sums,
questions concerning series of functions are answered by examining
corresponding questions for sequences of functions. For this reason, a
portion of the present section is merely a translation of facts already
established for sequences of functions into series terminology. This is the
case, for example, for the portion of the section dealing with series of
general functions. However, in the second part of the section, where we
discuss power series, some new features arise merely because of the special
character of the functions involved.
37.1 DEFINITION. If $(f_n)$ is a sequence of functions defined on a
subset D of $\mathbf{R}^p$ with values in $\mathbf{R}^q$, the sequence of partial sums $(s_n)$ of the
infinite series $\sum (f_n)$ is defined for x in D by
\[ s_1(x) = f_1(x), \]
\[ s_2(x) = s_1(x) + f_2(x) \quad [= f_1(x) + f_2(x)], \]
and so on. In case the sequence $(s_n)$ converges on D to a function f, we say that the
infinite series of functions $\sum (f_n)$ converges to f on D. We shall often
write
\[ \sum (f_n), \qquad \sum_{n=1}^{\infty} (f_n), \qquad \text{or} \qquad \sum_{n=1}^{\infty} f_n \]
to denote either the series or the limit function, when it exists.
If the series $\sum (\|f_n(x)\|)$ converges for each x in D, then we say that $\sum (f_n)$
is absolutely convergent on D. If the sequence $(s_n)$ is uniformly convergent
on D to f, then we say that $\sum (f_n)$ is uniformly convergent on D, or
that it converges to f uniformly on D.

One of the main reasons for the interest in uniformly convergent series
of functions is the validity of the following results which give conditions
justifying the change of order of the summation and other limiting
operations.

37.2 THEOREM. If $f_n$ is continuous on $D \subseteq \mathbf{R}^p$ to $\mathbf{R}^q$ for each $n \in \mathbf{N}$ and
if $\sum (f_n)$ converges to f uniformly on D, then f is continuous on D.
This is a direct translation of Theorem 24.1 for series. The next result is
a translation of Theorem 31.2.
37.3 THEOREM. Suppose that the real-valued functions $f_n$, $n \in \mathbf{N}$, are
Riemann-Stieltjes integrable with respect to a monotone function g on the
interval $J = [a, b]$. If the series $\sum (f_n)$ converges to f uniformly on J, then f is
Riemann-Stieltjes integrable with respect to g and
\[ \int_a^b f \, dg = \sum_{n=1}^{\infty} \int_a^b f_n \, dg. \tag{37.1} \]
We now recast the Monotone Convergence Theorem 31.4 into series
form.
37.4 THEOREM. If the $f_n$ are positive Riemann integrable functions on
$J = [a, b]$ and if their sum $f = \sum (f_n)$ is Riemann integrable, then
\[ \int_a^b f = \sum_{n=1}^{\infty} \int_a^b f_n. \tag{37.2} \]
Next we turn to the corresponding theorem pertaining to
differentiation. Here we assume the uniform convergence of the series
obtained after term-by-term differentiation of the given series. This result
is an immediate consequence of Theorem 28.5.
37.5 THEOREM. For each $n \in \mathbf{N}$, let $f_n$ be a real-valued function on
$J = [a, b]$ which has a derivative $f_n'$ on J. Suppose that the infinite series
$\sum (f_n)$ converges for at least one point of J and that the series of derivatives
$\sum (f_n')$ converges uniformly on J. Then there exists a real-valued function f
on J such that $\sum (f_n)$ converges uniformly on J to f. In addition, f has a
derivative on J and
\[ f' = \sum f_n'. \tag{37.3} \]
Tests for Uniform Convergence


Since we have stated some consequences of uniform convergence of
series, we shall now present a few tests which can be used to establish
uniform convergence.
37.6 CAUCHY CRITERION. Let $(f_n)$ be a sequence of functions on
$D \subseteq \mathbf{R}^p$ to $\mathbf{R}^q$. The infinite series $\sum (f_n)$ is uniformly convergent on D if and
only if for every $\varepsilon > 0$ there exists an $M(\varepsilon)$ such that if $m \ge n \ge M(\varepsilon)$, then
\[ \|f_n + f_{n+1} + \cdots + f_m\|_D < \varepsilon. \tag{37.4} \]
The proof of this result is immediate from 17.11, which is the corresponding
Cauchy Criterion for the uniform convergence of sequences.
37.7 WEIERSTRASS M-TEST. Let $(M_n)$ be a sequence of non-negative
real numbers such that $\|f_n\|_D \le M_n$ for each $n \in \mathbf{N}$. If the infinite series
$\sum (M_n)$ is convergent, then $\sum (f_n)$ is uniformly convergent on D.
PROOF. If $m > n$, we have the relation
\[ \|f_n + \cdots + f_m\|_D \le \|f_n\|_D + \cdots + \|f_m\|_D \le M_n + \cdots + M_m. \]
The assertion follows from the Cauchy Criteria 34.5 and 37.6 and the
convergence of $\sum (M_n)$. Q.E.D.
The next two results are very useful in establishing uniform convergence
when the convergence is not absolute. Their proofs are obtained by
modifying the proofs of 36.2 and 36.4 and will be left as exercises.
37.8 DIRICHLET’S TEST. Let $(f_n)$ be a sequence of functions on $D \subseteq \mathbf{R}^p$
to $\mathbf{R}^q$ such that the partial sums
\[ s_n = \sum_{j=1}^{n} f_j, \qquad n \in \mathbf{N}, \]
are all bounded in D-norm. Let $(\varphi_n)$ be a decreasing sequence of functions
on D to $\mathbf{R}$ which converges uniformly on D to zero. Then the series $\sum (\varphi_n f_n)$
converges uniformly on D.
37.9 ABEL’S TEST. Let $\sum (f_n)$ be a series of functions on $D \subseteq \mathbf{R}^p$ to $\mathbf{R}^q$
which is uniformly convergent on D. Let $(\varphi_n)$ be a monotone sequence of
real-valued functions on D which is bounded in D-norm. Then the series
$\sum (\varphi_n f_n)$ converges uniformly on D.
37.10 EXAMPLES. (a) Consider the series $\sum_{n=1}^{\infty} (x^n/n^2)$. If $|x| \le 1$,
then $|x^n/n^2| \le 1/n^2$. Since the series $\sum (1/n^2)$ is convergent, it follows from
the Weierstrass M-test that the given series is uniformly convergent on the
interval $[-1, 1]$.
(b) The series obtained after term-by-term differentiation of the series
in (a) is $\sum_{n=1}^{\infty} (x^{n-1}/n)$. The Weierstrass M-test does not apply on the
interval $[-1, 1]$ so we cannot apply Theorem 37.5. In fact, it is clear that
this series of derivatives is not convergent for $x = 1$. However, if $0 < r < 1$,
then the geometric series $\sum (r^{n-1})$ converges. Since
\[ \left| \frac{x^{n-1}}{n} \right| \le r^{n-1} \]
for $|x| \le r$, it follows from the M-test that the differentiated series is
uniformly convergent on the interval $[-r, r]$.
(c) A direct application of the M-test (with $M_n = 1/n^2$) shows that
$\sum_{n=1}^{\infty} (1/n^2) \sin nx$ is uniformly convergent for all x in $\mathbf{R}$.
(d) Since the harmonic series $\sum (1/n)$ diverges, we cannot apply the
M-test to
\[ \sum_{n=1}^{\infty} (1/n) \sin nx. \tag{37.5} \]
However, it follows from the discussion in Example 36.8(d) that if the
interval $J = [a, b]$ is contained in the open interval $(0, 2\pi)$, then the partial
sums $s_n(x) = \sum_{k=1}^{n} \sin kx$ are uniformly bounded on J. Since the sequence
$(1/n)$ decreases to zero, Dirichlet’s Test 37.8 implies that the series (37.5)
is uniformly convergent on J.
(e) Consider $\sum_{n=1}^{\infty} ((-1)^n/n)\, e^{-nx}$ on the interval $I = [0, 1]$. Since the norm
of the nth term on I is $1/n$, we cannot apply the Weierstrass Test.
Dirichlet’s Test can be applied if we can show that the partial sums of
$\sum ((-1)^n e^{-nx})$ are bounded. Alternatively, Abel’s Test applies since
$\sum ((-1)^n/n)$ is convergent and the bounded sequence $(e^{-nx})$ is monotone
decreasing on I (but not uniformly convergent to zero).
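The mechanism behind the M-test in Example (c) can be illustrated numerically: the difference of two partial sums of $\sum (1/n^2) \sin nx$ is bounded, at every x simultaneously, by the corresponding block of $\sum (1/n^2)$. The sketch below samples x on a finite grid, which only approximates the D-norm; all names and the grid are assumptions made for illustration.

```python
import math


def partial_sum(x, n):
    """Return the nth partial sum of sum (1/k**2) sin(kx) at the point x."""
    return sum(math.sin(k * x) / k ** 2 for k in range(1, n + 1))


def tail_bound(n, m):
    """Return M_{n+1} + ... + M_m with M_k = 1/k**2."""
    return sum(1.0 / k ** 2 for k in range(n + 1, m + 1))


def sup_difference(n, m, grid):
    """Largest |s_m(x) - s_n(x)| over the sample points in grid."""
    return max(abs(partial_sum(x, m) - partial_sum(x, n)) for x in grid)


if __name__ == "__main__":
    grid = [i * 0.01 for i in range(-700, 701)]  # sample points in [-7, 7]
    for n, m in ((5, 10), (10, 50), (50, 200)):
        print(n, m, sup_difference(n, m, grid), tail_bound(n, m))
```

The first printed value in each row never exceeds the second, and the second tends to zero as n grows, which is exactly the Cauchy Criterion 37.6 in action.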

Power Series

We shall now turn to a discussion of power series. This is an important
class of series of functions and enjoys properties that are not valid for
general series of functions.
37.11 DEFINITION. A series of real functions $\sum (f_n)$ is said to be a
power series around $x = c$ if the function $f_n$ has the form
\[ f_n(x) = a_n (x - c)^n, \]
where $a_n$ and c belong to $\mathbf{R}$ and where $n = 0, 1, 2, \ldots$.
For the sake of simplicity of our notation, we shall treat only the case
where $c = 0$. This is no loss of generality, however, since the translation
$x' = x - c$ reduces a power series around c to a power series around 0.
Thus whenever we refer to a power series, we shall mean a series of the
form
\[ \sum_{n=0}^{\infty} a_n x^n = a_0 + a_1 x + \cdots + a_n x^n + \cdots. \tag{37.6} \]
Even though the functions appearing in (37.6) are defined over all of $\mathbf{R}$,
it is not to be expected that the series (37.6) will converge for all x in $\mathbf{R}$.
For example, by using the Ratio Test 35.8, we can show that the series
\[ \sum_{n=0}^{\infty} n!\, x^n, \qquad \sum_{n=0}^{\infty} x^n, \qquad \sum_{n=0}^{\infty} x^n/n!, \]
converge for x in the sets
\[ \{0\}, \qquad \{x \in \mathbf{R} : |x| < 1\}, \qquad \mathbf{R}, \]
respectively. Thus the set on which a power series converges may be
small, medium, or large. However, an arbitrary subset of $\mathbf{R}$ cannot be the
precise set on which a power series converges, as we shall show.
If $(b_n)$ is a bounded sequence of non-negative real numbers, then we
define the limit superior of $(b_n)$ to be the infimum of those numbers v such
that $b_n \le v$ for all sufficiently large $n \in \mathbf{N}$. This infimum is uniquely
determined and is denoted by
\[ \limsup\, (b_n). \]
Some other characterizations and properties of the limit superior of a
sequence were given in Section 18, but the only thing we need to know is (i)
that if $v > \limsup\,(b_n)$, then $b_n \le v$ for all sufficiently large $n \in \mathbf{N}$, and (ii)
that if $w < \limsup\,(b_n)$, then $w \le b_n$ for infinitely many $n \in \mathbf{N}$.
37.12 DEFINITION. Let $\sum (a_n x^n)$ be a power series. If the sequence
$(|a_n|^{1/n})$ is bounded, we set $\rho = \limsup\,(|a_n|^{1/n})$; if this sequence is not
bounded we set $\rho = +\infty$. We define the radius of convergence of $\sum (a_n x^n)$
to be given by
\[ R = \begin{cases} 0, & \text{if } \rho = +\infty, \\ 1/\rho, & \text{if } 0 < \rho < +\infty, \\ +\infty, & \text{if } \rho = 0. \end{cases} \]
The interval of convergence is the open interval $(-R, R)$.


We shall now justify the term “radius of convergence.”
37.13 CAUCHY-HADAMARD† THEOREM. If R is the radius of convergence
of the power series $\sum (a_n x^n)$, then the series is absolutely convergent
if $|x| < R$ and divergent if $|x| > R$.
PROOF. We shall treat only the case where $0 < R < +\infty$, leaving the cases
$R = 0$, $R = +\infty$, as exercises. If $0 < |x| < R$, then there exists a positive
number $c < 1$ such that $|x| < cR$. Therefore $\rho < c/|x|$ and so it follows that
if n is sufficiently large, then $|a_n|^{1/n} \le c/|x|$. This is equivalent to the
statement that
\[ |a_n x^n| \le c^n \tag{37.7} \]
for all sufficiently large n. Since $c < 1$, the absolute convergence of
$\sum (a_n x^n)$ follows from the Comparison Test 35.1.
If $|x| > R = 1/\rho$, then there are infinitely many $n \in \mathbf{N}$ for which we have
$|a_n|^{1/n} > 1/|x|$. Therefore, $|a_n x^n| > 1$ for infinitely many n, so that the
sequence $(a_n x^n)$ does not converge to zero. Q.E.D.

† JACQUES HADAMARD (1865-1963), long-time dean of French mathematicians, was admitted
to the Ecole Polytechnique with the highest score attained during its first century. He was
Henri Poincaré’s successor in the Academy of Sciences and proved the Prime Number
Theorem in 1896, although this theorem had been conjectured by Gauss many years before.
Hadamard made other contributions to number theory, complex analysis, partial differential
equations, and even psychology.

It will be noted that the Cauchy-Hadamard Theorem makes no statement as to
whether the power series converges when $|x| = R$. Indeed, anything can happen, as
the examples

$$\sum x^n, \qquad \sum \frac{1}{n}\,x^n, \qquad \sum \frac{1}{n^2}\,x^n, \tag{37.8}$$

show. Since $\lim (n^{1/n}) = 1$ (cf. 14.8(e)), each of these power series has radius of
convergence equal to 1. The first power series converges at neither of the points
$x = -1$ and $x = +1$; the second series converges at $x = -1$ but diverges at $x = +1$;
and the third power series converges at both $x = -1$ and $x = +1$. (Find a power
series with $R = 1$ which converges at $x = +1$ but diverges at $x = -1$.)
It is an exercise to show that the radius of convergence of $\sum (a_n x^n)$ is also
given by

$$\lim \left( \frac{|a_n|}{|a_{n+1}|} \right), \tag{37.9}$$

provided this limit exists. Frequently, it is more convenient to use (37.9)
than Definition 37.12.
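
As a small illustration of (37.9), the sketch below evaluates the ratio $|a_n|/|a_{n+1}|$ exactly for the coefficients $a_n = 2^n/n$, $n \ge 1$. This sequence is an assumption chosen for the example and does not come from the text; the function names are likewise hypothetical.

```python
from fractions import Fraction

def coefficient(n):
    """a_n = 2^n / n for n >= 1 (an illustrative choice, not from the text)."""
    return Fraction(2 ** n, n)

def ratio(n):
    """|a_n| / |a_{n+1}|, the quantity whose limit appears in (37.9)."""
    return abs(coefficient(n)) / abs(coefficient(n + 1))

# Exact values (n + 1) / (2n) for n = 1, 10, 100, 1000; they approach 1/2.
ratios = [ratio(n) for n in (1, 10, 100, 1000)]
```

Here the ratio equals $(n+1)/(2n)$, so the limit in (37.9) exists and the radius of convergence is $1/2$, in agreement with $\limsup\,(|a_n|^{1/n}) = 2$.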
The argument used in the proof of the Cauchy-Hadamard Theorem
yields the uniform convergence of the power series on any fixed compact
subset in the interval of convergence (—R, R).

37.14 THEOREM. Let $R$ be the radius of convergence of $\sum (a_n x^n)$ and
let $K$ be a compact subset of the interval of convergence $(-R, R)$. Then the
power series converges uniformly on $K$.
PROOF. The compactness of $K \subseteq (-R, R)$ implies that there exists a
positive constant $c < 1$ such that $|x| < cR$ for all $x \in K$. (Why?) By the
argument in 37.13, we infer that for sufficiently large $n$, the estimate (37.7)
holds for all $x \in K$. Since $c < 1$, the uniform convergence of $\sum (a_n x^n)$ on $K$
is a direct consequence of the Weierstrass M-test with $M_n = c^n$. Q.E.D.

37.15 THEOREM. The limit of a power series is continuous on the
interval of convergence. A power series can be integrated term-by-term over
any compact interval contained in the interval of convergence.
PROOF. If $|x_0| < R$, then the preceding result asserts that $\sum (a_n x^n)$
converges uniformly on any compact neighborhood of $x_0$ contained in
$(-R, R)$. The continuity at $x_0$ then follows from Theorem 37.2, and the
term-by-term integration is justified by Theorem 37.3. Q.E.D.

We now show that a power series can be differentiated term-by-term.


Unlike the situation for general series, we do not need to assume that the
differentiated series is uniformly convergent. Hence this result is stronger
than the corresponding result for the differentiation of infinite series.

37.16 DIFFERENTIATION THEOREM. A power series can be differentiated
term-by-term within the interval of convergence. In fact, if

$$f(x) = \sum_{n=0}^{\infty} (a_n x^n), \qquad \text{then} \qquad f'(x) = \sum_{n=1}^{\infty} (n a_n x^{n-1}).$$

Both series have the same radius of convergence.
PROOF. Since $\lim (n^{1/n}) = 1$, the sequence $(|n a_n|^{1/n})$ is bounded if and
only if the sequence $(|a_n|^{1/n})$ is bounded. Moreover, it is easily seen that

$$\limsup\,(|n a_n|^{1/n}) = \limsup\,(|a_n|^{1/n}).$$

Therefore, the radius of convergence of the two series is the same, so the
formally differentiated series is uniformly convergent on each compact
subset of the interval of convergence. We can then apply Theorem 37.5 to
conclude that the formally differentiated series converges to the derivative
of the given series. Q.E.D.
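
The Differentiation Theorem can be checked numerically on a familiar case. The sketch below assumes the geometric series $f(x) = \sum x^n = 1/(1-x)$ on $(-1, 1)$, an example chosen here for illustration, and compares partial sums of the term-by-term derivative with $f'(x) = 1/(1-x)^2$. The function names are hypothetical.

```python
def geometric_partial_sum(x, terms):
    """Sum of x^n for n = 0, ..., terms - 1."""
    return sum(x ** n for n in range(terms))

def differentiated_partial_sum(x, terms):
    """Sum of n * x^(n-1) for n = 1, ..., terms - 1 (term-by-term derivative)."""
    return sum(n * x ** (n - 1) for n in range(1, terms))

x = 0.5
approx_f = geometric_partial_sum(x, 200)
approx_df = differentiated_partial_sum(x, 200)
exact_f = 1.0 / (1.0 - x)          # closed form of the series
exact_df = 1.0 / (1.0 - x) ** 2    # derivative of the closed form
```

A numerical agreement at one point is of course no proof; it only illustrates what the theorem asserts inside the interval of convergence.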
It is to be observed that the theorem makes no assertion about the end points of
the interval of convergence. If a series is convergent at an end point, then the
differentiated series may or may not be convergent at this point. For example, the
series $\sum_{n=1}^{\infty} (x^n/n^2)$ converges at both end points $x = -1$ and $x = +1$. However, the
differentiated series

$$\sum_{n=1}^{\infty} \frac{x^{n-1}}{n}$$

converges at $x = -1$ but diverges at $x = +1$.

By repeated application of the preceding result, we conclude that if $k$ is
any natural number, then the power series $\sum_{n=0}^{\infty} (a_n x^n)$ can be differentiated
term-by-term $k$ times to obtain

$$\sum_{n=k}^{\infty} \frac{n!}{(n-k)!}\, a_n x^{n-k}. \tag{37.10}$$

Moreover, this series converges absolutely to $f^{(k)}$ for $|x| < R$ and uniformly
over any compact subset of the interval of convergence.
If we substitute $x = 0$ in (37.10), we obtain the important formula

$$f^{(k)}(0) = k!\,a_k. \tag{37.11}$$

37.17 UNIQUENESS THEOREM. If $\sum (a_n x^n)$ and $\sum (b_n x^n)$ converge on
some interval $(-r, r)$, $r > 0$, to the same function $f$, then

$$a_n = b_n \quad \text{for all } n \in \mathbf{N}.$$

PROOF. Our preceding remarks show that $n!\,a_n = f^{(n)}(0) = n!\,b_n$ for
$n \in \mathbf{N}$. Q.E.D.

Some Additional Results*


There are a number of results concerning various algebraic combinations
of power series, but those involving substitution and inversion are more
naturally proved using arguments from complex analysis. For this reason
we shall not go into these questions but content ourselves with one result in
this direction. Fortunately, it is one of the most useful.
37.18 MULTIPLICATION THEOREM. If $f$ and $g$ are given on the interval
$(-r, r)$ by the power series

$$f(x) = \sum_{n=0}^{\infty} a_n x^n, \qquad g(x) = \sum_{n=0}^{\infty} b_n x^n,$$

then their product is given on this interval by the series $\sum (c_n x^n)$, where the
coefficients $(c_n)$ are

$$c_n = \sum_{k=0}^{n} a_k b_{n-k} \quad \text{for } n = 0, 1, 2, \ldots.$$

PROOF. We have seen in 37.13 that if $|x| < r$, then the series giving $f(x)$
and $g(x)$ are absolutely convergent. If we apply Theorem 36.13, we obtain
the desired conclusion. Q.E.D.
The Multiplication Theorem asserts that the radius of convergence of the
product is at least r. It can be larger, however, as is easily shown.
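
The coefficient formula of the Multiplication Theorem is easy to try out on truncated coefficient lists. The sketch below is an illustration under assumptions made here: the helper name `cauchy_product` and the two example series are not from the text. The second example also exhibits a product whose radius of convergence is larger than that of one factor, since $(1 - x) \cdot 1/(1 - x) = 1$.

```python
def cauchy_product(a, b):
    """Coefficients c_n = sum_{k=0}^{n} a_k * b_{n-k} for n < min(len(a), len(b))."""
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

N = 8
geometric = [1] * N                      # coefficients of 1/(1 - x), radius 1
one_minus_x = [1, -1] + [0] * (N - 2)    # the polynomial 1 - x, radius +infinity

# Coefficients of 1/(1 - x)^2, namely 1, 2, 3, ...
squared = cauchy_product(geometric, geometric)

# Coefficients of (1 - x) * 1/(1 - x): the constant series 1, whose radius is +infinity.
collapsed = cauchy_product(one_minus_x, geometric)
```

Only the first `N` coefficients are computed, and each of them is exact because $c_n$ depends only on $a_0, \ldots, a_n$ and $b_0, \ldots, b_n$.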
We have seen that, in order for a function $f$ to be given by a power series
on an interval $(-r, r)$, $r > 0$, it is necessary that all of the derivatives of $f$
exist on this interval. It might be suspected that this condition is also
sufficient; however, things are not quite so simple. For example, the
function $f$, given by

$$f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0, \\ 0, & x = 0, \end{cases} \tag{37.12}$$

can be shown (see Exercise 37.N) to possess derivatives of all orders and
$f^{(n)}(0) = 0$ for $n = 0, 1, 2, \ldots$. If $f$ can be given on an interval $(-r, r)$ by a
power series around $x = 0$, then it follows from the Uniqueness Theorem
37.17 that the series must vanish identically, contrary to the fact that
$f(x) \ne 0$ for $x \ne 0$.
Nevertheless, there are some useful sufficient conditions that can be
given in order to guarantee that $f$ can be given by a power series. As an
example, we observe that it follows from Taylor’s Theorem 28.6 that if
there exists a constant $B > 0$ such that

$$|f^{(n)}(x)| \le B \tag{37.13}$$

for all $|x| < r$ and $n = 0, 1, 2, \ldots$, then $\sum_{n=0}^{\infty} f^{(n)}(0)\,x^n/n!$ converges to $f(x)$ for
$|x| < r$. Similar (but less stringent) conditions on the magnitude of the
derivatives can be given which yield the same conclusion.

* The rest of this section can be omitted on the first reading.
As an example, we present an elegant and useful result due to Serge
Bernstein concerning the one-sided expansion of a function in a power
series.
37.19 BERNSTEIN’S THEOREM. Let $f$ be defined and possess derivatives
of all orders on an interval $[0, r]$ and suppose that $f$ and all of its derivatives
are positive on the interval $[0, r]$. If $0 \le x < r$, then $f(x)$ is given by the
expansion

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, x^n.$$

PROOF. We shall make use of the integral form for the remainder in
Taylor’s Theorem given by the relation (31.3). If $0 \le x \le r$, then

$$f(x) = \sum_{k=0}^{n-1} \frac{f^{(k)}(0)}{k!}\, x^k + R_n, \tag{37.14}$$

where we have the formula

$$R_n = \frac{x^{n}}{(n-1)!} \int_0^1 (1-s)^{n-1} f^{(n)}(sx)\, ds.$$

Since all the terms in the sum in (37.14) are positive, taking $x = r$ gives

$$f(r) \ge \frac{r^{n}}{(n-1)!} \int_0^1 (1-s)^{n-1} f^{(n)}(sr)\, ds. \tag{37.15}$$

Since $f^{(n+1)}$ is positive, $f^{(n)}$ is increasing on $[0, r]$; therefore, if $x$ is in this
interval, then

$$0 \le R_n \le \frac{x^{n}}{(n-1)!} \int_0^1 (1-s)^{n-1} f^{(n)}(sr)\, ds. \tag{37.16}$$

By combining (37.15) and (37.16), we have $0 \le R_n \le (x/r)^{n} f(r)$. Hence,
if $0 \le x < r$, then $\lim (R_n) = 0$. Q.E.D.
We have seen in Theorem 37.14 that a power series converges uniformly
on every compact subset of its interval of convergence. However, there is
no a priori reason to believe that this result can be extended to the end
points of the interval of convergence. However, there is a theorem of
Abel that, if convergence does take place at one of the end points, then the
series converges uniformly out to this end point.
In order to simplify our notation, we shall suppose that the radius of
convergence of the series is equal to 1. This is no loss of generality and can
always be attained by letting x’ = x/R, which is merely a change of scale.
37.20 ABEL’S THEOREM. Suppose that the power series $\sum_{n=0}^{\infty} (a_n x^n)$
converges to $f(x)$ for $|x| < 1$ and that $\sum_{n=0}^{\infty} (a_n)$ converges to $A$. Then the
power series converges uniformly in $I = [0, 1]$ and

$$\lim_{x \to 1-} f(x) = A. \tag{37.17}$$

PROOF. Abel’s Test 37.9, with $f_n(x) = a_n$ and $\varphi_n(x) = x^n$, applies to give
the uniform convergence of $\sum (a_n x^n)$ on $I$. Hence the limit is continuous
on $I$; since it agrees with $f(x)$ for $0 \le x < 1$, the limit relation (37.17)
follows. Q.E.D.
One of the most interesting things about this result is that it suggests a
method of attaching a limit to series which may not be convergent. Thus,
if $\sum_{n=1}^{\infty} (b_n)$ is an infinite series, we can form the corresponding power series
$\sum (b_n x^n)$. If the $b_n$ do not increase too rapidly, this power series converges
to a function $B(x)$ for $|x| < 1$. If $B(x) \to \beta$ as $x \to 1-$, we say that the
series $\sum (b_n)$ is Abel summable to $\beta$. This type of summation is similar to
(but more powerful than) the Cesàro method of arithmetic means mentioned
in Section 19 and has deep and interesting consequences. The
content of Abel’s Theorem 37.20 is similar to Theorem 19.3; it asserts that
if a series is already convergent, then it is Abel summable to the same
limit. The converse is not true, however, for the series $\sum_{n=0}^{\infty} (-1)^n$ is not
convergent but since

$$\frac{1}{1+x} = \sum_{n=0}^{\infty} (-1)^n x^n,$$

it follows that $\sum (-1)^n$ is Abel summable to $\tfrac{1}{2}$.
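
The Abel summability of $\sum (-1)^n$ can be observed numerically. The sketch below is only an illustration: `abel_mean` is a hypothetical helper, and the truncation length `N` is an assumption chosen so that $x^N$ is negligible for the sample points used.

```python
def abel_mean(coeffs, x):
    """Evaluate the finite sum of b_n * x^n, a truncation of B(x)."""
    return sum(b * x ** n for n, b in enumerate(coeffs))

# b_n = (-1)^n; N is large enough that x^N is negligible for x <= 0.999.
N = 50_000
alternating = [(-1) ** n for n in range(N)]

# B(x) = 1 / (1 + x) on (-1, 1), so these values drift toward 1/2 as x -> 1-.
values = {x: abel_mean(alternating, x) for x in (0.9, 0.99, 0.999)}
```

The partial sums of the series itself oscillate between 1 and 0, while the values of $B(x)$ settle near $\tfrac{1}{2}$ as $x$ increases toward 1.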


It sometimes happens that if a series is known to be Abel summable, and
if certain other conditions are satisfied, then it can be proved that the series
is actually convergent. Theorems of this nature are called Tauberian
theorems and are often very deep and difficult to prove. They are also
useful because they enable one to go from a weaker type of convergence to
a stronger type, provided certain additional hypotheses are satisfied.
Our final theorem is the first result of this type and was proved by A.
Tauber† in 1897. It provides a partial converse to Abel’s Theorem.
37.21 TAUBER’S THEOREM. Suppose that the power series $\sum (a_n x^n)$
converges to $f(x)$ for $|x| < 1$ and that $\lim (n a_n) = 0$. If $\lim f(x) = A$ as
$x \to 1-$, then the series $\sum (a_n)$ converges to $A$.

† ALFRED TAUBER (1866-circa 1947) was a professor at Vienna. He contributed primarily to
analysis.

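PROOF. It is desired to estimate differences such as $\sum_{n=0}^{N} (a_n) - A$. To do
this, we write

$$\sum_{n=0}^{N} a_n - A = \Big\{ \sum_{n=0}^{N} a_n - f(x) \Big\} + \{ f(x) - A \} = \sum_{n=0}^{N} a_n (1 - x^n) - \sum_{n=N+1}^{\infty} a_n x^n + \{ f(x) - A \}. \tag{37.18}$$

Since $0 \le x < 1$, we have $1 - x^n = (1 - x)(1 + x + \cdots + x^{n-1}) \le n(1 - x)$, so
we can dominate the first term on the right side by the expression
$(1 - x) \sum_{n=0}^{N} n |a_n|$.
By hypothesis $\lim (n a_n) = 0$; hence Theorem 19.3 implies that

$$\lim_{m} \Big( \frac{1}{m+1} \sum_{n=0}^{m} n |a_n| \Big) = 0.$$

In addition, we have the relation $A = \lim f(x)$ as $x \to 1-$.
Now let $\varepsilon > 0$ be given and choose a fixed natural number $N$ which is so
large that

(i) $\sum_{n=0}^{N} n |a_n| < (N+1)\varepsilon$;

(ii) $|a_n| < \dfrac{\varepsilon}{N+1}$ for all $n \ge N$;

(iii) $|f(x_0) - A| < \varepsilon$ for $x_0 = 1 - \dfrac{1}{N+1}$.

We shall assess the magnitude of (37.18) for this value of $N$ and $x_0$.
From (i), (ii), (iii) and the fact that $(1 - x_0)(N+1) = 1$, we derive the
estimate

$$\Big| \sum_{n=0}^{N} a_n - A \Big| \le (1 - x_0)(N+1)\varepsilon + \frac{\varepsilon}{N+1} \cdot \frac{x_0^{N+1}}{1 - x_0} + \varepsilon < 3\varepsilon.$$

Since this can be done for each $\varepsilon > 0$, the convergence of $\sum (a_n)$ to $A$ is
established. Q.E.D.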

Exercises

37.A. Discuss the convergence and the uniform convergence of the series $\sum (f_n)$,
where $f_n(x)$ is given by
(a) $(x^2 + n^2)^{-1}$, (b) $(nx)^{-2}$, $x \ne 0$,
(c) $\sin (x/n^2)$, (d) $(x^n + 1)^{-1}$, $x \ge 0$,
(e) $x^n (x^n + 1)^{-1}$, $x \ge 0$, (f) $(-1)^n (n + x)^{-1}$, $x \ge 0$.
37.B. If $\sum (a_n)$ is an absolutely convergent series, then the series $\sum (a_n \sin nx)$ is
absolutely and uniformly convergent.
37.C. Let $(c_n)$ be a decreasing sequence of positive numbers. If $\sum (c_n \sin nx)$ is
uniformly convergent, then $\lim (n c_n) = 0$.
37.D. Give the details of the proof of Dirichlet’s Test 37.8.
37.E. Give the details of the proof of Abel’s Test 37.9.
37.F. Discuss the cases $R = 0$, $R = +\infty$ in the Cauchy-Hadamard Theorem
37.13.
37.G. Show that the radius of convergence $R$ of the power series $\sum (a_n x^n)$ is
given by $\lim (|a_n|/|a_{n+1}|)$ whenever this limit exists. Give an example of a power
series where this limit does not exist.
37.H. Determine the radius of convergence of the series $\sum (a_n x^n)$, where $a_n$ is
given by
(a) $1/n^n$, (b) $n^{\alpha}/n!$,
(c) $n^n/n!$, (d) $(\log n)^{-1}$, $n \ge 2$,
(e) $(n!)^2/(2n)!$, (f) $n^{-\sqrt{n}}$.
37.I. If $a_n = 1$ when $n$ is the square of a natural number and $a_n = 0$ otherwise,
find the radius of convergence of $\sum (a_n x^n)$. If $b_n = 1$ when $n = m!$ for $m \in \mathbf{N}$ and
$b_n = 0$ otherwise, find the radius of convergence of $\sum (b_n x^n)$.
37.J. Prove in detail that $\limsup\,(|n a_n|^{1/n}) = \limsup\,(|a_n|^{1/n})$.
37.K. If $0 < p \le |a_n| \le q$ for all $n \in \mathbf{N}$, find the radius of convergence of $\sum (a_n x^n)$.
37.L. Let $f(x) = \sum (a_n x^n)$ for $|x| < R$. If $f(x) = f(-x)$ for all $|x| < R$, show that
$a_n = 0$ for all odd $n$.
37.M. Prove that if $f$ is defined for $|x| < r$ and if there exists a constant $B$ such
that $|f^{(n)}(x)| \le B$ for all $|x| < r$ and $n \in \mathbf{N}$, then the Taylor series expansion

$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!}\, x^n$$

converges to $f(x)$ for $|x| < r$.
37.N. Prove by induction that the function given in formula (37.12) has
derivatives of all orders at every point and that all of these derivatives vanish at
$x = 0$. Hence this function is not given by its Taylor expansion about $x = 0$.
37.O. Give an example of a function which is equal to its Taylor series
expansion about $x = 0$ for $x \ge 0$, but which is not equal to this expansion for $x < 0$.
37.P. The argument outlined in Exercise 28.M shows that the Lagrange form of
the remainder can be used to justify the general Binomial Expansion

$$(1 + x)^m = \sum_{n=0}^{\infty} \binom{m}{n} x^n$$

when $x$ is in the interval $0 \le x < 1$. Similarly Exercise 28.N validates this
expansion for $-1 < x < 0$, but the argument is based on the Cauchy form of the
remainder and is somewhat more involved. To obtain an alternative proof of this
second case, apply Bernstein’s Theorem to $g(x) = (1 - x)^m$ for $0 \le x < 1$.
37.Q. Consider the Binomial Expansion at the end points $x = \pm 1$. Show that if
$x = -1$, then the series converges absolutely for $m \ge 0$ and diverges for $m < 0$. At
$x = +1$, the series converges absolutely for $m \ge 0$, converges conditionally for
$-1 < m < 0$, and diverges for $m \le -1$.
37.R. Let $f(x) = \tan x$ for $|x| < \pi/2$. Use the fact that $f$ is odd and Bernstein’s
Theorem to show that $f$ is given on this interval by its Taylor series expansion about
$x = 0$.
37.S. Use Abel’s Theorem to prove that if $f(x) = \sum (a_n x^n)$ for $|x| < R$, then

$$\int_0^R f(x)\, dx = \sum_{n=0}^{\infty} \frac{a_n}{n+1}\, R^{n+1},$$

provided that the series on the right side is convergent even though the original
series may not converge at $x = R$. Hence it follows that

$$\log 2 = \sum_{n=1}^{\infty} \frac{(-1)^{n+1}}{n}, \qquad \frac{\pi}{4} = \sum_{n=0}^{\infty} \frac{(-1)^n}{2n+1}.$$

37.T. By using Abel’s Theorem, prove that if the series $\sum (a_n)$ and $\sum (b_n)$
converge and if their Cauchy product $\sum (c_n)$ converges, then we have
$\sum (c_n) = \sum (a_n) \cdot \sum (b_n)$.
37.U. Suppose that $a_n \ge 0$ and that $f(x) = \sum (a_n x^n)$ has radius of convergence 1.
If $\sum (a_n)$ diverges, prove that $f(x) \to +\infty$ as $x \to 1-$. Use this result to prove the
elementary Tauberian theorem: If $a_n \ge 0$ and if

$$A = \lim_{x \to 1-} \sum a_n x^n,$$

then $\sum (a_n)$ converges to $A$.


37.V. Let $\sum_{n=0}^{\infty} (p_n)$ be a divergent series of positive numbers such that the radius
of convergence of $\sum (p_n x^n)$ is 1. Prove Appell’s† Theorem: If $s = \lim (a_n/p_n)$, then
the radius of convergence of $\sum (a_n x^n)$ is also 1 and

$$\lim_{x \to 1-} \frac{\sum a_n x^n}{\sum p_n x^n} = s.$$

(Hint: it is sufficient to treat the case $s = 0$. Also use the fact that
$\lim_{x \to 1-} \big[ \sum (p_n x^n) \big]^{-1} = 0$.)
37.W. Apply Appell’s Theorem with $p(x) = \sum_{n=0}^{\infty} (x^n)$ to obtain Abel’s Theorem.
37.X. If $(a_n)$ is a sequence of real numbers and $a_0 = 0$, let $s_n = a_1 + \cdots + a_n$ and
let $\sigma_n = (s_1 + \cdots + s_n)/n$. Prove Frobenius’‡ Theorem: If $s = \lim (\sigma_n)$ then

$$s = \lim_{x \to 1-} \sum_{n=0}^{\infty} a_n x^n.$$

REMARK. In the terminology of summability theory, this result says that if a
sequence $(a_n)$ is Cesàro summable to $s$, then it is also Abel summable to $s$. (Hint:
apply Appell’s Theorem to $p(x) = (1 - x)^{-2} = \sum_{n=0}^{\infty} (n x^{n-1})$ and note that
$\sum (n \sigma_n x^n) = p(x) \sum (a_n x^n)$.)

† PAUL APPELL (1855-1930) was a student of Hermite at the Sorbonne. He did research in
complex analysis.
‡ GEORG FROBENIUS (1849-1917) was professor at Berlin. He is known for his work both in
algebra and analysis.

Projects

37.α. The theory of power series presented in the text extends to complex power
series.
(a) In view of the observations in Section 13, all of the definitions and theorems
that are meaningful and valid for series in $\mathbf{R}^2$ are also valid for series with elements
in $\mathbf{C}$. In particular the results pertaining to absolute convergence extend readily.
(b) Examine the results pertaining to rearrangements and the Cauchy product to
see if they extend to $\mathbf{C}$.
(c) Show that the Comparison, Root, and Ratio Tests extend to $\mathbf{C}$.
(d) Let $R$ be the radius of convergence of a complex power series

$$\sum_{n=0}^{\infty} a_n z^n.$$

Prove that the series converges absolutely for $|z| < R$ and uniformly on any
compact subset of $\{z \in \mathbf{C} : |z| < R\}$.
(e) Let $f$ and $g$ be functions defined for $D = \{z \in \mathbf{C} : |z| < r\}$ with values in $\mathbf{C}$
which are the limits on $D$ of two power series. Show that if $f$ and $g$ agree on
$D \cap \mathbf{R}$, then they agree on all of $D$.
(f) Show that two power series in $\mathbf{C}$ can be multiplied together within their
common circle of convergence.
37.β. In this project we define the exponential function in terms of power
series. In doing so, we shall define it for complex numbers as well as real.
(a) Let $E$ be defined for $z \in \mathbf{C}$ by the series

$$E(z) = \sum_{n=0}^{\infty} \frac{z^n}{n!}.$$

Show that the series is absolutely convergent for all $z \in \mathbf{C}$ and that it is uniformly
convergent on any bounded subset of $\mathbf{C}$.
(b) Prove that $E$ is a continuous function on $\mathbf{C}$ to $\mathbf{C}$, that $E(0) = 1$, and that

$$E(z + w) = E(z)E(w)$$

for $z$, $w$ in $\mathbf{C}$. (Hint: the Binomial Theorem for $(z + w)^n$ holds when $z, w \in \mathbf{C}$ and
$n \in \mathbf{N}$.)
(c) If $x$ and $y$ are real numbers, define $E_1$ and $E_2$ by $E_1(x) = E(x)$, $E_2(y) = E(iy)$;
hence $E(x + iy) = E_1(x)E_2(y)$. Show that $E_1$ takes on only real values but that $E_2$
has some non-real values. Define $C$ and $S$ on $\mathbf{R}$ to $\mathbf{R}$ by

$$C(y) = \operatorname{Re} E_2(y), \qquad S(y) = \operatorname{Im} E_2(y)$$

for $y \in \mathbf{R}$, and show that

$$C(y_1 + y_2) = C(y_1)C(y_2) - S(y_1)S(y_2),$$
$$S(y_1 + y_2) = S(y_1)C(y_2) + C(y_1)S(y_2).$$

(d) Prove that $C$ and $S$, as defined in (c), have the series expansions

$$C(y) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\, y^{2n}, \qquad S(y) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, y^{2n+1}.$$

(e) Show that $C' = -S$ and $S' = C$. Hence $(C^2 + S^2)' = 2CC' + 2SS' = 0$ which
implies that $C^2 + S^2$ is identically equal to 1. In particular, this implies that both $C$
and $S$ are bounded in absolute value by 1.
(f) Infer that the function $E_2$ on $\mathbf{R}$ to $\mathbf{C}$ satisfies $E_2(0) = 1$, $E_2(y_1 + y_2) =
E_2(y_1)E_2(y_2)$. Hence $E_2(-y) = 1/E_2(y)$ and $|E_2(y)| = 1$ for all $y$ in $\mathbf{R}$.

Section 38 Fourier Series

We shall now give the definition of the Fourier† series of a piecewise
continuous function with period $2\pi$. Although our discussion will be brief,
we shall present the main convergence theorems relating to Fourier
series. These theorems are of considerable importance in analysis, and its
applications to physics.
In the following we shall suppose that $f : \mathbf{R} \to \mathbf{R}$ has period $2\pi$; that is,
that $f(x + 2\pi) = f(x)$ for all $x \in \mathbf{R}$. We also suppose that $f$ is piecewise
continuous; that is, $f$ is continuous except possibly for a finite number of
points $x_1, \ldots, x_k$ in any interval of length $2\pi$, at which $f$ has left and right
hand limits:

$$f(x_j -) = \lim_{h \to 0,\ h > 0} f(x_j - h), \qquad f(x_j +) = \lim_{h \to 0,\ h > 0} f(x_j + h).$$

The set of all functions $f : \mathbf{R} \to \mathbf{R}$ which have period $2\pi$ and are piecewise
continuous will be denoted by $PC(2\pi)$. It is readily seen that this set is a
vector space under the operations:

$$(f + g)(x) = f(x) + g(x), \qquad (cf)(x) = c\,f(x), \qquad x \in \mathbf{R}.$$

Because of the periodicity of $f \in PC(2\pi)$ it is only necessary to investigate $f$
on an interval of length $2\pi$; for example, we have

$$\int_{-\pi}^{\pi} f(x)\, dx = \int_{c}^{c + 2\pi} f(x)\, dx$$

for any $c \in \mathbf{R}$.
On the space $PC(2\pi)$ we shall be interested in the two norms

$$\|f\|_{\infty} = \sup \{ |f(x)| : x \in [-\pi, \pi] \}, \qquad \|f\|_2 = \Big( \int_{-\pi}^{\pi} (f(x))^2\, dx \Big)^{1/2},$$

which are well defined because a function in $PC(2\pi)$ is bounded and
Riemann integrable. It is an elementary exercise to show that if $f \in PC(2\pi)$, then

$$\|f\|_2 \le \sqrt{2\pi}\, \|f\|_{\infty}. \tag{38.1}$$

It follows from this inequality that convergence in the norm $\|\cdot\|_{\infty}$ (that is,
uniform convergence) implies convergence in the norm $\|\cdot\|_2$ (that is, mean
square convergence). However, the converse is not true. (See Exercises
31.H and 38.L.)

† (J.-B.) JOSEPH FOURIER (1768-1830) was the son of a French tailor. Educated in a
monastery, he left to engage in mathematical and revolutionary activities. He accompanied
Napoleon to Egypt in 1798 and was later appointed prefect of the Department of Isère in
southern France. During this time he worked on his most famous accomplishment: the
mathematical theory of heat. His work was a landmark in mathematical physics and has had
a towering influence on both subjects to the present day.

38.1 DEFINITION. If $f \in PC(2\pi)$, then the Fourier coefficients of $f$ are
the numbers $a_0, a_1, a_2, \ldots, b_1, b_2, \ldots$ defined by

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos nt\, dt, \qquad b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin nt\, dt. \tag{38.2}$$

By the Fourier series of $f$ we mean the series

$$\tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx). \tag{38.3}$$

To indicate the association of the Fourier series (38.3) to the function $f$,
we often write

$$f(x) \sim \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx).$$

However it is to be emphasized that this notation is not intended to suggest
that the Fourier series converges to $f(x)$ at any particular point $x$. Indeed,
there exist continuous functions with period $2\pi$ whose Fourier series are
divergent at infinitely many points. (See Burkill and Burkill, page 317,
and Hewitt and Ross, page 300.)
38.2 EXAMPLES. (a) Let $f_1 \in PC(2\pi)$ be defined on $(-\pi, \pi]$ by
$f_1(x) = -1$ for $-\pi < x < 0$ and $f_1(x) = +1$ for $0 \le x \le \pi$. It is an exercise
to show that the Fourier series for $f_1$ is given by

$$\frac{4}{\pi} \Big[ \frac{\sin x}{1} + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots \Big].$$

It will be proved below that this Fourier series does indeed converge to $f_1$
for $0 < |x| < \pi$, but it does not converge to $f_1$ at $x = 0, \pm\pi$ (why?). Note
that $f_1$ is piecewise continuous, but is not continuous at the points in
$\{n\pi : n \in \mathbf{Z}\}$.
(b) Let $f_2 \in PC(2\pi)$ be defined on $(-\pi, \pi]$ by $f_2(x) = |x|$. It is an exercise
to show that the Fourier series for $f_2$ is given by

$$\frac{\pi}{2} - \frac{4}{\pi} \Big[ \frac{\cos x}{1^2} + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots \Big].$$

It is clear that this series converges uniformly on $\mathbf{R}$ and it will be proved
below that it converges to $f_2$.
(c) Let $f \in PC(2\pi)$ be even; that is, $f(-x) = f(x)$ for all $x \in \mathbf{R}$. For such a
function the Fourier coefficients $b_n = 0$ for $n = 1, 2, \ldots$, while

$$a_n = \frac{2}{\pi} \int_0^{\pi} f(t) \cos nt\, dt, \qquad n = 0, 1, 2, \ldots.$$

(Note that the function in (b) is even.)
(d) Let $g \in PC(2\pi)$ be odd; that is, $g(-x) = -g(x)$ for all $x \in \mathbf{R}$. For
such a function the Fourier coefficients $a_n = 0$ for $n = 0, 1, 2, \ldots$, while

$$b_n = \frac{2}{\pi} \int_0^{\pi} g(t) \sin nt\, dt, \qquad n = 1, 2, \ldots.$$

(Note that the function in (a) is odd.)
(e) Let $f$ be continuous on $\mathbf{R}$ with period $2\pi$ and let its derivative $f'$ be
piecewise continuous on $\mathbf{R}$ (and with period $2\pi$). We shall relate the
Fourier coefficients $a_n$, $b_n$ of $f$ with the Fourier coefficients $a_n'$, $b_n'$ of $f'$ for
$n = 1, 2, \ldots$. In fact, integrating by parts, we have

$$a_n' = \frac{1}{\pi} \int_{-\pi}^{\pi} f'(t) \cos nt\, dt = \frac{1}{\pi} \Big\{ f(t) \cos nt \Big|_{t=-\pi}^{t=\pi} - \int_{-\pi}^{\pi} f(t)(-n) \sin nt\, dt \Big\}.$$

If we use the fact that $t \mapsto f(t) \cos nt$ has period $2\pi$ the first term is seen to
vanish and so $a_n' = n b_n$ for $n = 1, 2, \ldots$. Similarly it is shown that
$b_n' = -n a_n$ for $n = 1, 2, \ldots$. (We note that if $f_1$, $f_2$ are the functions in (a)
and (b), then $f_1(x) = f_2'(x)$ for $x \notin \{n\pi : n \in \mathbf{Z}\}$, and that the Fourier
coefficients for $f_1$ and $f_2$ for $n = 1, 2, \ldots$ satisfy the above relationships.)
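
The coefficients quoted in Examples 38.2(a) and (b) can be checked by numerical integration of formula (38.2). The following is a rough sketch under assumptions made here: it uses a midpoint rule with an even number of sample points (so that the jump of $f_1$ at $t = 0$ falls on a cell boundary), and the function names are hypothetical.

```python
import math

def fourier_coefficients(f, n, samples=20000):
    """Midpoint-rule approximation of (a_n, b_n) from formula (38.2)."""
    h = 2.0 * math.pi / samples
    a_n = b_n = 0.0
    for j in range(samples):
        t = -math.pi + (j + 0.5) * h
        a_n += f(t) * math.cos(n * t)
        b_n += f(t) * math.sin(n * t)
    return a_n * h / math.pi, b_n * h / math.pi

def square_wave(t):
    """f_1 of Example 38.2(a) on (-pi, pi]."""
    return -1.0 if t < 0.0 else 1.0

def absolute_value(t):
    """f_2 of Example 38.2(b) on (-pi, pi]."""
    return abs(t)

b3_square = fourier_coefficients(square_wave, 3)[1]    # series in (a) gives 4 / (3 pi)
a3_abs = fourier_coefficients(absolute_value, 3)[0]    # series in (b) gives -4 / (9 pi)
a0_abs = fourier_coefficients(absolute_value, 0)[0]    # (1/2) a_0 = pi / 2, so a_0 = pi
```

The two third-order coefficients also illustrate the relation $b_n' = -n a_n$ of Example 38.2(e), since $f_1 = f_2'$ away from the points $n\pi$.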
In the next lemma, we shall calculate the square of the distance relative
to the norm $\|\cdot\|_2$ from $f$ in $PC(2\pi)$ to an arbitrary function $T_n$ of the form

$$T_n(x) = \tfrac{1}{2} \alpha_0 + \sum_{k=1}^{n} (\alpha_k \cos kx + \beta_k \sin kx); \tag{38.4}$$

such a function is sometimes called a trigonometric polynomial of degree
$n$. In making this calculation it is useful to have the relations

$$\int_{-\pi}^{\pi} (\cos kx)^2\, dx = \int_{-\pi}^{\pi} (\sin kx)^2\, dx = \pi, \qquad k \in \mathbf{N},$$

$$\int_{-\pi}^{\pi} \sin kx \sin nx\, dx = \int_{-\pi}^{\pi} \cos kx \cos nx\, dx = 0, \qquad k, n \in \mathbf{N},\ k \ne n,$$

$$\int_{-\pi}^{\pi} \sin kx \cos mx\, dx = 0, \qquad k, m = 0, 1, 2, \ldots.$$

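38.3 LEMMA. If $f \in PC(2\pi)$ and $T_n$ is a trigonometric polynomial of
degree $n$ (that is, $T_n$ has the form (38.4)), then

$$\|f - T_n\|_2^2 = \|f\|_2^2 - \pi \Big\{ \tfrac{1}{2} a_0^2 + \sum_{k=1}^{n} (a_k^2 + b_k^2) \Big\} + \pi \Big\{ \tfrac{1}{2} (\alpha_0 - a_0)^2 + \sum_{k=1}^{n} \big[ (\alpha_k - a_k)^2 + (\beta_k - b_k)^2 \big] \Big\}, \tag{38.5}$$

where $a_k$, $b_k$ denote the Fourier coefficients of $f$.
PROOF. We have

$$\|f - T_n\|_2^2 = \int_{-\pi}^{\pi} [f(t) - T_n(t)]^2\, dt = \int_{-\pi}^{\pi} [f(t)]^2\, dt - 2 \int_{-\pi}^{\pi} f(t) T_n(t)\, dt + \int_{-\pi}^{\pi} [T_n(t)]^2\, dt.$$

Now it is easily seen that

$$\int_{-\pi}^{\pi} f(t) T_n(t)\, dt = \tfrac{1}{2} \alpha_0 \int_{-\pi}^{\pi} f(t)\, dt + \sum_{k=1}^{n} \alpha_k \int_{-\pi}^{\pi} f(t) \cos kt\, dt + \sum_{k=1}^{n} \beta_k \int_{-\pi}^{\pi} f(t) \sin kt\, dt = \pi \Big\{ \tfrac{1}{2} \alpha_0 a_0 + \sum_{k=1}^{n} (\alpha_k a_k + \beta_k b_k) \Big\}.$$

Moreover, using the relations cited above it is seen that

$$\int_{-\pi}^{\pi} [T_n(t)]^2\, dt = \pi \Big\{ \tfrac{1}{2} \alpha_0^2 + \sum_{k=1}^{n} (\alpha_k^2 + \beta_k^2) \Big\}.$$

If we insert these two relations into the first formula and add and subtract
$\pi \{ \tfrac{1}{2} a_0^2 + \sum_{k=1}^{n} (a_k^2 + b_k^2) \}$, we obtain formula (38.5). Q.E.D.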
Lemma 38.3 has the following important “geometrical” interpretation:
among all trigonometric polynomials $T_n$ of degree $n$, the one which
minimizes the expression $\|f - T_n\|_2^2$ is uniquely determined and is obtained
by choosing the coefficients $\alpha_k$, $\beta_k$ to be the Fourier coefficients $a_k$, $b_k$ of $f$,
$k = 0, 1, \ldots, n$. If we denote this (unique) minimizing trigonometrical
polynomial by $S_n(f)$, then

$$S_n(f)(x) = \tfrac{1}{2} a_0 + \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \tag{38.6}$$

is the $n$th partial sum of the Fourier series for $f$ and formula (38.5) implies

$$\|f - S_n(f)\|_2^2 = \|f\|_2^2 - \pi \Big\{ \tfrac{1}{2} a_0^2 + \sum_{k=1}^{n} (a_k^2 + b_k^2) \Big\}. \tag{38.7}$$

If we make use of Exercise 26.F we can show that

$$\lim_{n} \|f - S_n(f)\|_2 = 0 \tag{38.8}$$

for each continuous function with period $2\pi$. However, since that exercise
is the result of considerable analysis, we prefer to derive this result more
directly. In doing so we shall need the following two results.
38.4 BESSEL’S INEQUALITY. If $f \in PC(2\pi)$, then

$$\tfrac{1}{2} a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) \le \frac{1}{\pi} \|f\|_2^2. \tag{38.9}$$

PROOF. If $n \in \mathbf{N}$ is arbitrary, then it follows from (38.7) that

$$\tfrac{1}{2} a_0^2 + \sum_{k=1}^{n} (a_k^2 + b_k^2) \le \frac{1}{\pi} \|f\|_2^2.$$

Hence the partial sums of the series on the left side of (38.9) are bounded
above. Since the terms are all non-negative, this series is convergent and (38.9)
holds. Q.E.D.

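The next result is a special case of what is usually called the Riemann-
Lebesgue Lemma.
38.5 RIEMANN-LEBESGUE LEMMA. If $g \in PC(2\pi)$, then

$$\lim_{n \to \infty} \int_{-\pi}^{\pi} g(t) \sin (n + \tfrac{1}{2})t\, dt = 0.$$

PROOF. Since $\sin (n + \tfrac{1}{2})t = \sin nt \cos \tfrac{1}{2}t + \cos nt \sin \tfrac{1}{2}t$, we have

$$\int_{-\pi}^{\pi} g(t) \sin (n + \tfrac{1}{2})t\, dt = \frac{1}{\pi} \int_{-\pi}^{\pi} [\pi g(t) \cos \tfrac{1}{2}t] \sin nt\, dt + \frac{1}{\pi} \int_{-\pi}^{\pi} [\pi g(t) \sin \tfrac{1}{2}t] \cos nt\, dt.$$

Since $g \in PC(2\pi)$, it follows that the functions defined for $t \in (-\pi, \pi]$ by

$$g_1(t) = \pi g(t) \cos \tfrac{1}{2}t, \qquad g_2(t) = \pi g(t) \sin \tfrac{1}{2}t,$$

have extensions to $\mathbf{R}$ which belong to $PC(2\pi)$. Therefore the integrals in
the right side of the above formula give Fourier coefficients for $g_1$ and $g_2$;
hence, by Bessel’s Inequality, these integrals converge to 0 as $n \to \infty$.
Q.E.D.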
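38.6 LEMMA. If $f \in PC(2\pi)$, then the partial sum $S_n(f)$ of its Fourier
series is given by

$$S_n(f)(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + t) D_n(t)\, dt, \tag{38.10}$$

where $D_n$ is the $n$th Dirichlet kernel, defined by

$$D_n(t) = \tfrac{1}{2} + \sum_{k=1}^{n} \cos kt = \begin{cases} \dfrac{\sin (n + \tfrac{1}{2})t}{2 \sin \tfrac{1}{2}t}, & 0 < |t| \le \pi, \\[2mm] n + \tfrac{1}{2}, & t = 0. \end{cases}$$

PROOF. It follows from the formulas (38.2) and (38.6) that

$$S_n(f)(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\, dt + \frac{1}{\pi} \int_{-\pi}^{\pi} \sum_{k=1}^{n} f(t) \{ \cos kx \cos kt + \sin kx \sin kt \}\, dt = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \Big\{ \tfrac{1}{2} + \sum_{k=1}^{n} \cos k(x - t) \Big\}\, dt.$$

If we let $t = x + s$ and use the fact that the cosine is an even function and
that the integrand has period $2\pi$, we have

$$S_n(f)(x) = \frac{1}{\pi} \int_{-\pi - x}^{\pi - x} f(x + s) \Big\{ \tfrac{1}{2} + \sum_{k=1}^{n} \cos ks \Big\}\, ds = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x + s) \Big\{ \tfrac{1}{2} + \sum_{k=1}^{n} \cos ks \Big\}\, ds.$$

We now apply formula (36.6) to obtain (38.10). Q.E.D.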
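
The two expressions for the Dirichlet kernel in Lemma 38.6 can be compared numerically. The sketch below is a sanity check only; the function names and the sample points are assumptions chosen for the illustration.

```python
import math

def dirichlet_sum(n, t):
    """D_n(t) written as 1/2 + sum_{k=1}^{n} cos(kt)."""
    return 0.5 + sum(math.cos(k * t) for k in range(1, n + 1))

def dirichlet_closed(n, t):
    """D_n(t) in closed form; the value n + 1/2 is used at t = 0."""
    if t == 0.0:
        return n + 0.5
    return math.sin((n + 0.5) * t) / (2.0 * math.sin(0.5 * t))

# Largest discrepancy between the two forms over a few orders and points of [-pi, pi].
max_gap = max(
    abs(dirichlet_sum(n, t) - dirichlet_closed(n, t))
    for n in (0, 1, 5, 20)
    for t in (-3.0, -1.0, -0.1, 0.0, 0.1, 1.0, 3.0, math.pi)
)
```

The discrepancy is at the level of rounding error, as the identity (36.6) predicts.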


Before we proceed, we recall (see Exercise 27.Q) that, by the right-hand
derivative of a function $f : \mathbf{R} \to \mathbf{R}$ at a point $c \in \mathbf{R}$ where $f$ has a right-hand
limit $f(c+)$, we mean the limit

$$f_+'(c) = \lim_{t \to 0,\ t > 0} \frac{f(c + t) - f(c+)}{t}$$

whenever this limit exists. Similarly, the left-hand derivative of $f$ at $c$ is
the limit

$$f_-'(c) = \lim_{t \to 0,\ t < 0} \frac{f(c + t) - f(c-)}{t}.$$
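38.7 POINTWISE CONVERGENCE THEOREM. Suppose that $f \in PC(2\pi)$
and that $f$ has right- and left-hand derivatives at $c$. Then, the Fourier series
for $f$ converges to $\tfrac{1}{2} \{ f(c-) + f(c+) \}$ at the point $c$. In symbols,

$$\tfrac{1}{2} \{ f(c-) + f(c+) \} = \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} (a_n \cos nc + b_n \sin nc). \tag{38.11}$$

PROOF. It follows from (36.6) that if $\sin \tfrac{1}{2}t \ne 0$, then

$$\tfrac{1}{2} + \sum_{k=1}^{n} \cos kt = \frac{\sin (n + \tfrac{1}{2})t}{2 \sin \tfrac{1}{2}t}.$$

Multiply by $(1/\pi) f(c+)$ and integrate with respect to $t$ over $[0, \pi]$. Since
$\int_0^{\pi} \cos kt\, dt = 0$ for $k \in \mathbf{N}$, we obtain

$$\tfrac{1}{2} f(c+) = \frac{1}{\pi} \int_0^{\pi} f(c+)\, \frac{\sin (n + \tfrac{1}{2})t}{2 \sin \tfrac{1}{2}t}\, dt.$$

Similarly, if we multiply the above expression by $(1/\pi) f(c-)$ and integrate
with respect to $t$ over $[-\pi, 0]$, we obtain

$$\tfrac{1}{2} f(c-) = \frac{1}{\pi} \int_{-\pi}^{0} f(c-)\, \frac{\sin (n + \tfrac{1}{2})t}{2 \sin \tfrac{1}{2}t}\, dt.$$

If we subtract these expressions from formula (38.10), we get

$$S_n(f)(c) - \tfrac{1}{2} \{ f(c-) + f(c+) \} = \frac{1}{\pi} \int_{-\pi}^{0} \frac{f(c + t) - f(c-)}{2 \sin \tfrac{1}{2}t}\, \sin (n + \tfrac{1}{2})t\, dt + \frac{1}{\pi} \int_{0}^{\pi} \frac{f(c + t) - f(c+)}{2 \sin \tfrac{1}{2}t}\, \sin (n + \tfrac{1}{2})t\, dt. \tag{*}$$

Now since we have

$$\lim_{t \to 0,\ t > 0} \frac{f(c + t) - f(c+)}{2 \sin \tfrac{1}{2}t} = \lim_{t \to 0,\ t > 0} \Big\{ \frac{f(c + t) - f(c+)}{t} \cdot \frac{t}{2 \sin \tfrac{1}{2}t} \Big\} = f_+'(c) \cdot 1 = f_+'(c),$$

it follows that the function

$$F_+(t) = \begin{cases} \dfrac{f(c + t) - f(c+)}{2 \sin \tfrac{1}{2}t} & \text{for } t \in (0, \pi], \\[2mm] f_+'(c) & \text{for } t = 0, \\ 0 & \text{for } t \in (-\pi, 0), \end{cases}$$

is piecewise continuous on $(-\pi, \pi]$. Hence, by the Riemann-Lebesgue Lemma 38.5,
the second integral in (*) converges to 0 as $n \to \infty$.
Similarly, the first integral in (*) converges to 0 as $n \to \infty$. Therefore the
stated conclusion follows. Q.E.D.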
38.8 EXAMPLES. (a) The function $f_1$ in Example 38.2(a) is in
$PC(2\pi)$, with $f_1(c-) = f_1(c) = f_1(c+)$ for $c \in [-\pi, \pi]$, $c \ne -\pi, 0, +\pi$, where
we have $f_1(-\pi-) = +1$, $f_1(-\pi+) = -1$, $f_1(0-) = -1$, $f_1(0+) = 1$, $f_1(\pi-) = 1$,
$f_1(\pi+) = -1$. Since one-sided derivatives exist everywhere (and equal 0),
it follows from the Pointwise Convergence Theorem 38.7 that the Fourier
series for $f_1$ converges to $f_1(c)$ provided $c \in [-\pi, \pi]$, $c \ne -\pi, 0, \pi$, and that
at these three points the Fourier series for $f_1$ converges to 0.
(b) The function $f_2$ in Example 38.2(b) is continuous, has period $2\pi$ and
has one-sided derivatives everywhere. Therefore the Fourier series for $f_2$
converges at every point to $f_2$ and, as we have seen, the convergence is
uniform. We note that the (two-sided) derivative of $f_2$ exists in $[-\pi, \pi]$
except at the points $0, \pm\pi$, and that $f_2'$ agrees with the piecewise continuous
function $f_1$ for $x \notin \{n\pi : n \in \mathbf{Z}\}$.
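
The behaviour described in Example 38.8(a) can be observed on partial sums of the series for $f_1$ quoted in Example 38.2(a). The sketch below is illustrative only; the function name and the number of terms are assumptions chosen here.

```python
import math

def square_wave_partial_sum(x, terms):
    """(4/pi) * sum_{k=0}^{terms-1} sin((2k+1)x) / (2k+1)."""
    return (4.0 / math.pi) * sum(
        math.sin((2 * k + 1) * x) / (2 * k + 1) for k in range(terms)
    )

# At the jump x = 0 every term vanishes, so the partial sums equal the mean value
# (1/2){f_1(0-) + f_1(0+)} = 0.
at_jump = square_wave_partial_sum(0.0, 2000)

# At x = pi/2, a point of continuity, the partial sums approach f_1(pi/2) = 1.
at_quarter = square_wave_partial_sum(math.pi / 2, 2000)
```

At $x = \pi/2$ the series reduces to $(4/\pi)(1 - \tfrac{1}{3} + \tfrac{1}{5} - \cdots)$, so the convergence is slow: the error after $m$ terms is bounded by the first omitted term, which is of order $1/m$.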

We remark that it follows from the Mean Value Theorem (see Exercise
27.N) that if $f' \in PC(2\pi)$, then the left- and right-hand derivatives of $f$ exist
at the points of discontinuity of $f'$. We now show that for a function $f$ with
period $2\pi$ and such that $f' \in PC(2\pi)$, the Fourier series for $f$ is uniformly
convergent to $f$.
38.9 UNIFORM CONVERGENCE THEOREM. Let $f$ be continuous, have
period $2\pi$, and suppose that $f' \in PC(2\pi)$. Then the Fourier series for $f$
converges uniformly to $f$ on $\mathbf{R}$.
PROOF. Since $f$ is continuous and the one-sided derivatives of $f$ exist
at every point, it follows from the Pointwise Convergence Theorem 38.7
that the Fourier series for $f$ converges to $f$ at every point. It remains to
show that the convergence is uniform. In view of the inequality

$$\Big| \sum_{k=1}^{n} (a_k \cos kx + b_k \sin kx) \Big| \le \sum_{k=1}^{n} (|a_k| + |b_k|),$$

it is enough to establish the convergence of the latter series. In fact, if we
apply Bessel’s Inequality to $f'$, we know that the series $\sum (|a_k'|^2 + |b_k'|^2)$ is
convergent. But, as we have seen in Example 38.2(e), $a_k = -b_k'/k$ and
$b_k = a_k'/k$. If we apply the Schwarz Inequality we have

$$\sum_{k=1}^{m} |a_k| = \sum_{k=1}^{m} \frac{1}{k} |b_k'| \le \Big( \sum_{k=1}^{m} \frac{1}{k^2} \Big)^{1/2} \Big( \sum_{k=1}^{m} |b_k'|^2 \Big)^{1/2}.$$

Since a similar inequality holds for $\sum |b_k|$, the desired assertion follows.
Q.E.D.
We now show that the partial sums of the Fourier series for any function
f in PC(2z) converges to f in the norm ||-|,._ While this does not guarantee
that we can recover the value of f at any particular preassigned point, it can
be interpreted as giving f in a certain ‘statistical’ sense. For some
applications this type of convergence is as useful as pointwise convergence,
and there is the advantage that we do not have to impose differentiability
restrictions.
38.10 NORM CONVERGENCE THEOREM. If $f \in PC(2\pi)$ and if $(S_n(f))$
is the sequence of partial sums of the Fourier series for f, then

$$\lim_{n} \|f - S_n(f)\|_2 = 0.$$

PROOF. Let $f \in PC(2\pi)$ and let $\varepsilon > 0$ be given. It is an exercise to show
that there exists a continuous function $f_1$ with period $2\pi$ such that
$\|f - f_1\|_2 < \varepsilon/7$. By Theorem 24.5 there is a continuous, piecewise linear
function $f_2$, which can be chosen to have period $2\pi$, such that $\|f_1 - f_2\|_\infty < \varepsilon/7$.
It follows from the Uniform Convergence Theorem 38.9 that if n is
sufficiently large, then $\|f_2 - S_n(f_2)\|_\infty < \varepsilon/7$. From formula (38.1) we have
$\|g\|_2 \le \sqrt{2\pi}\,\|g\|_\infty \le 3\,\|g\|_\infty$ for any $g \in PC(2\pi)$; hence we deduce that

$$\|f - S_n(f_2)\|_2 \le \|f - f_1\|_2 + \|f_1 - f_2\|_2 + \|f_2 - S_n(f_2)\|_2 < \frac{\varepsilon}{7} + \frac{3\varepsilon}{7} + \frac{3\varepsilon}{7} = \varepsilon.$$

Now $S_n(f_2)$ is a trigonometric polynomial of degree n approximating f
within $\varepsilon$ (with respect to $\|\cdot\|_2$). Since it was established in Lemma 38.3 that
the partial sum $S_n(f)$ is the trigonometric polynomial of degree n that gives
the best such approximation, we infer that $\|f - S_n(f)\|_2 < \varepsilon$. Since $\varepsilon > 0$ is
arbitrary, we conclude that $\lim \|f - S_n(f)\|_2 = 0$. Q.E.D.
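The difference between norm convergence and uniform convergence can be seen numerically. The sketch below uses the function that equals x on $(-\pi, \pi)$, with the series $2[\sin x - \tfrac12 \sin 2x + \tfrac13 \sin 3x - \cdots]$ of Exercise 38.F(a). The midpoint-rule quadrature and the grid size are assumptions of this sketch, so the printed values are approximations only.

```python
import math

def sawtooth_partial(x, n):
    """n-th partial sum 2 * sum_{k=1}^{n} (-1)^(k+1) sin(kx)/k of the series for x."""
    return 2 * sum((-1) ** (k + 1) * math.sin(k * x) / k for k in range(1, n + 1))

def midpoints(m):
    """Midpoints of m equal subintervals of [-pi, pi]."""
    h = 2 * math.pi / m
    return [-math.pi + (i + 0.5) * h for i in range(m)], h

def l2_error(n, m=2000):
    """Midpoint-rule approximation of ||f - S_n(f)||_2 for f(x) = x."""
    xs, h = midpoints(m)
    return math.sqrt(sum((x - sawtooth_partial(x, n)) ** 2 for x in xs) * h)

def grid_sup_error(n, m=2000):
    """Largest pointwise error on the same grid."""
    xs, _ = midpoints(m)
    return max(abs(x - sawtooth_partial(x, n)) for x in xs)

l2 = {n: l2_error(n) for n in (5, 20, 40)}
sup40 = grid_sup_error(40)
print(l2, sup40)
```

The norm error decreases with n, while the pointwise error near $x = \pm\pi$ stays large, since the periodic function jumps there and every partial sum vanishes at $\pm\pi$.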
As a corollary of this result and Lemma 38.3 we obtain the following
strengthening of Bessel's Inequality for $f \in PC(2\pi)$.

38.11 PARSEVAL'S EQUALITY. If $f \in PC(2\pi)$, then

$$\frac{1}{\pi}\,\|f\|_2^2 = \tfrac12 a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2), \tag{38.12}$$

where the $a_k$, $b_k$ are the Fourier coefficients of f.
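Parseval's Equality can be checked on a concrete function. The sketch below assumes the normalization $\|f\|_2^2 = \int_{-\pi}^{\pi} f(t)^2\,dt$ and takes the function equal to x on $(-\pi, \pi)$, whose coefficients are $a_k = 0$ and $b_k = 2(-1)^{k+1}/k$ (Exercise 38.F(a)); the variable names are illustrative only.

```python
import math

# Left side of (38.12): (1/pi) * integral of x^2 over [-pi, pi] = 2*pi^2/3.
lhs = (1 / math.pi) * (2 * math.pi ** 3 / 3)

def rhs_partial(n):
    """Partial sum of sum_k (a_k^2 + b_k^2) with a_k = 0 and b_k^2 = 4/k^2."""
    return sum(4 / k ** 2 for k in range(1, n + 1))

for n in (10, 100, 100000):
    print(n, rhs_partial(n), lhs)
```

Every partial sum lies below the left side, which is Bessel's Inequality; Parseval's Equality is the statement that the gap closes as n grows.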


We shall end this section with a proof of the theorem of Fejér† on the
Cesàro summability of the Fourier series of a continuous function. If
$S_n(f)$, $n = 0, 1, 2, \ldots$, denote the partial sums of the Fourier series
corresponding to f, let $\Gamma_n(f)$ denote the Cesàro means:

$$\Gamma_n(f) = \frac{1}{n}\bigl[S_0(f) + S_1(f) + \cdots + S_{n-1}(f)\bigr].$$

Now let $D_n$, $n = 0, 1, 2, \ldots$, be as in Lemma 38.6. If we make use of the
elementary formula
$$2 \sin (k - \tfrac12)t \,\sin \tfrac12 t = \cos (k-1)t - \cos kt, \qquad k = 0, 1, 2, \ldots,$$
we can show that

$$\frac{1}{n}\bigl[D_0(t) + \cdots + D_{n-1}(t)\bigr] = \begin{cases} \dfrac{1}{2n}\Bigl(\dfrac{\sin \tfrac12 nt}{\sin \tfrac12 t}\Bigr)^2, & 0 < |t| \le \pi, \\[1ex] \tfrac12 n, & t = 0, \end{cases} \tag{38.13}$$

and we let $K_n$ be this function, which is called the nth Fejér kernel. Clearly
$K_n(t) \ge 0$ and since

$$\frac{1}{\pi}\int_{-\pi}^{\pi} D_k(t)\,dt = 1$$
for $k = 0, 1, 2, \ldots$, it follows that

$$\frac{1}{\pi}\int_{-\pi}^{\pi} K_n(t)\,dt = 1. \tag{38.14}$$

† LEOPOLD FEJÉR (1880-1959) studied and taught in Budapest. He made many interesting
contributions to various areas of real and complex analysis.


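Also, if $0 < \delta < \pi$, it follows from the fact that $\sin \theta \ge 2\theta/\pi$ for $0 \le \theta \le \pi/2$
that we have

$$0 \le K_n(t) \le \frac{1}{2n}\Bigl(\frac{\pi}{\delta}\Bigr)^2 \qquad \text{for } \delta \le |t| \le \pi. \tag{38.15}$$

Finally, we note that it follows from Lemma 38.6 that we can express the
Cesàro means by the formula

$$\Gamma_n(f)(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x+t)\,K_n(t)\,dt. \tag{38.16}$$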
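The properties of the Fejér kernel listed above can be checked numerically. The sketch assumes that $D_k(t) = \tfrac12 + \cos t + \cdots + \cos kt$ (the form consistent with $(1/\pi)\int D_k = 1$) and that the kernel has the closed form $(1/2n)(\sin \tfrac12 nt / \sin \tfrac12 t)^2$; the function names and the midpoint-rule quadrature are choices made for this sketch.

```python
import math

def dirichlet(k, t):
    """D_k(t) = 1/2 + cos t + ... + cos kt (assumed form of the Dirichlet kernel)."""
    return 0.5 + sum(math.cos(j * t) for j in range(1, k + 1))

def fejer_average(n, t):
    """(1/n) * (D_0(t) + ... + D_{n-1}(t))."""
    return sum(dirichlet(k, t) for k in range(n)) / n

def fejer_closed(n, t):
    """Closed form of the n-th Fejer kernel."""
    if t == 0.0:
        return n / 2
    return (math.sin(n * t / 2) / math.sin(t / 2)) ** 2 / (2 * n)

def mean_integral(n, m=2000):
    """Midpoint-rule value of (1/pi) * integral of K_n over [-pi, pi]."""
    h = 2 * math.pi / m
    total = sum(fejer_closed(n, -math.pi + (i + 0.5) * h) for i in range(m))
    return total * h / math.pi

n = 7
for t in (0.0, 0.3, 1.1, -2.5, 3.0):
    print(t, fejer_average(n, t), fejer_closed(n, t))
print(mean_integral(n))
```

The two expressions agree at the sample points, the kernel is nonnegative, and the computed mean is 1 up to rounding, in line with the closed form, the sign property, and (38.14).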


We are now prepared to prove Fejér’s Theorem.

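38.12 FEJÉR'S THEOREM. If f is continuous and has period $2\pi$, then
the Cesàro means of the Fourier series for f converge uniformly to f on R.
PROOF. It follows from (38.14) that

$$f(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,K_n(t)\,dt.$$

Subtracting this from (38.16) we obtain

$$\Gamma_n(f)(x) - f(x) = \frac{1}{\pi}\int_{-\pi}^{\pi} \{f(x+t) - f(x)\}\,K_n(t)\,dt.$$

Since $K_n(t) \ge 0$ for all t, we have
$$|\Gamma_n(f)(x) - f(x)| \le \frac{1}{\pi}\int_{-\pi}^{\pi} |f(x+t) - f(x)|\,K_n(t)\,dt.$$
Let $\varepsilon > 0$ be given; since f is uniformly continuous on R, there exists a
number $\delta$ with $0 < \delta < \pi$ such that if $|t| < \delta$, then
$$|f(x+t) - f(x)| < \varepsilon \qquad \text{for all } x \in [-\pi, \pi].$$
Hence we have

$$\frac{1}{\pi}\int_{-\delta}^{\delta} |f(x+t) - f(x)|\,K_n(t)\,dt \le \frac{\varepsilon}{\pi}\int_{-\delta}^{\delta} K_n(t)\,dt \le \frac{\varepsilon}{\pi}\int_{-\pi}^{\pi} K_n(t)\,dt = \varepsilon.$$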
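
On the other hand, in view of (38.15) we have

$$\frac{1}{\pi}\int_{\delta}^{\pi} |f(x+t) - f(x)|\,K_n(t)\,dt \le \frac{1}{\pi}\,\bigl(2\|f\|_\infty\bigr)\,\frac{1}{2n}\Bigl(\frac{\pi}{\delta}\Bigr)^2 (\pi - \delta) \le \frac{1}{n}\,\|f\|_\infty\,\frac{\pi^2}{\delta^2},$$

which can be made less than $\varepsilon$ by taking n sufficiently large. Since a
similar estimate holds for the integral over $[-\pi, -\delta]$, it follows that

$$\|\Gamma_n(f) - f\|_\infty < 3\varepsilon$$
for n sufficiently large. Q.E.D.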
Since the function $\Gamma_n(f)$ is readily seen to be a trigonometric polynomial (of
degree $n - 1$), we have another proof of the following theorem of Weierstrass.
38.13 WEIERSTRASS APPROXIMATION THEOREM. If f is continuous
and has period $2\pi$, then it can be uniformly approximated by trigonometric
polynomials.

Exercises

38.A. Let g be a real-valued function defined on a cell J in R with end points
$a < b$. We say that g is piecewise continuous on J if (i) g has a right-hand limit at a,
(ii) g has a left-hand limit at b, and (iii) g is continuous at all interior points of J
except, possibly, for a finite number of points at which g has right- and left-hand
limits.
(a) Show that if g is piecewise continuous on $(-\pi, \pi]$, then there exists a unique
function G in $PC(2\pi)$ such that $G(x) = g(x)$ for all $x \in (-\pi, \pi]$.
(b) The function g has a left-hand (respectively, right-hand, two-sided) derivative
at $c \in (-\pi, \pi)$ if and only if G does.
(c) The function g has a right-hand derivative at $-\pi$ (respectively, a left-hand
derivative at $\pi$) if and only if G does.
(d) The one-sided derivatives $g'_+(-\pi)$, $g'_-(\pi)$ exist and are equal if and only if G
has a derivative at $\pm\pi$.
38.B. (a) If $f \in PC(2\pi)$ and the derivative $f'(x)$ exists for all $x \in R$, then $f'$ has
period $2\pi$.
(b) If $f \in PC(2\pi)$ and $c \in R$, define $F: R \to R$ by $F(x) = \int_c^x f(t)\,dt$, so that F is
continuous. Show that F has period $2\pi$ if and only if the mean of f is zero;
that is,
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(t)\,dt = 0.$$

38.C. (a) Let $f \in PC(2\pi)$ be odd. Then $f(\pm\pi) = 0$. If f is continuous at 0, then
$f(0) = 0$.
(b) Let $g \in PC(2\pi)$ be even; then $g(0+) = g(0-)$. If the derivative $g'(x)$ exists
for all $x \in R$, then (see Exercise 27.P) $g'$ is odd, has period $2\pi$, and $g'(0) = g'(\pm\pi) = 0$.

38.D. Let F and f belong to $PC(2\pi)$ and have Fourier coefficients $A_n$, $B_n$ and $a_n$,
$b_n$, respectively. If $\alpha, \beta \in R$ and if $h = \alpha F + \beta f$, show that h belongs to $PC(2\pi)$ and
has Fourier coefficients $\alpha A_n + \beta a_n$, $\alpha B_n + \beta b_n$. (Hence the Fourier coefficients of a
function depend linearly on the function.)
38.E. (a) Let $f_1$ be the function in Example 38.2(a). Calculate the Fourier
series for $f_1$, and show that this Fourier series does not converge uniformly on
$(-\pi, \pi]$.
(b) Let $f_2$ be the function in Example 38.2(b). Calculate the Fourier series for
$f_2$, and show that the term-by-term derivative of the Fourier series for $f_2$ coincides
with the Fourier series for $f_1$.
(c) Using the fact that the Fourier series for $f_2$ converges to $f_2$, deduce that

$$\frac{\pi^2}{8} = \frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \cdots.$$

(d) Let $f_3(x) = \tfrac12\pi - f_2(x)$ so that $f_3(x) = \tfrac12\pi - |x|$ for $x \in (-\pi, \pi]$. Use Exercise
38.D to show that the Fourier series for $f_3$ is given by

$$\frac{4}{\pi}\Bigl[\frac{\cos x}{1^2} + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots\Bigr].$$

38.F. (a) Let $g_1 \in PC(2\pi)$ be such that $g_1(x) = x$ for $x \in (-\pi, \pi)$ and $g_1(\pi) = 0$.
Show that $g_1$ is an odd function and that its Fourier series is given by

$$2\Bigl[\frac{\sin x}{1} - \frac{\sin 2x}{2} + \frac{\sin 3x}{3} - \cdots\Bigr].$$

Note that this Fourier series converges to 0 at $x = \pm\pi$. Use the Pointwise
Convergence Theorem 38.7 to show that this Fourier series converges to $g_1(x)$ for
every point $x \in [-\pi, \pi]$.
(b) Let $g_2 \in PC(2\pi)$ be such that $g_2(x) = x^2$ for $x \in (-\pi, \pi]$. Show that $g_2$ is an
even function and that its Fourier series is given by

$$\frac{\pi^2}{3} - 4\Bigl[\frac{\cos x}{1^2} - \frac{\cos 2x}{2^2} + \frac{\cos 3x}{3^2} - \cdots\Bigr].$$

Show that this Fourier series converges uniformly to $g_2$ on $[-\pi, \pi]$, and that its
term-by-term derivative is twice the Fourier series for $g_1$.
(c) Show that

wi Tart
11ga
72°4172

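(d) Let $h(x) = \tfrac13\pi^2 - g_2(x)$ so that $h(x) = \tfrac13\pi^2 - x^2$ for $x \in (-\pi, \pi]$. Then the
Fourier series for h is given by

$$4\Bigl[\frac{\cos x}{1^2} - \frac{\cos 2x}{2^2} + \frac{\cos 3x}{3^2} - \cdots\Bigr].$$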

38.G. (a) Let $k(x) = x^3$ for all $x \in R$. Show that k is continuous and odd on R.
However, the function $k_1$ in $PC(2\pi)$ which coincides with k on $(-\pi, \pi]$ is not
continuous.
(b) Let $h(x) = x^3 - \pi^2 x$ so that h is continuous and odd on R. Let $h_1$ be the
function in $PC(2\pi)$ which coincides with h on $(-\pi, \pi]$. Show that $h_1$ is continuous
on R and that $h_1'(x) = 3x^2 - \pi^2$ for $x \in (-\pi, \pi]$.
(c) Use Exercise 27.P, Example 38.2(e), and Exercise 38.F(d) to show that the
Fourier series for $h_1$ is given by

$$-12\Bigl[\frac{\sin x}{1^3} - \frac{\sin 2x}{2^3} + \frac{\sin 3x}{3^3} - \cdots\Bigr].$$

38.H. Let $f: [0, \pi] \to R$ be piecewise continuous and let $f_e \in PC(2\pi)$ be defined
by

$$f_e(x) = \begin{cases} f(x) & \text{for } x \in [0, \pi], \\ f(-x) & \text{for } x \in [-\pi, 0). \end{cases}$$

(a) Show that $f_e$ is an even function; it is called the even extension of f with
period $2\pi$.
(b) The Fourier series of $f_e$ is called the (Fourier) cosine series of f. Show that it
is given by

$$\tfrac12 a_0 + \sum_{n=1}^{\infty} a_n \cos nx,$$

where

$$a_n = \frac{2}{\pi}\int_0^{\pi} f(t) \cos nt\,dt, \qquad n = 0, 1, 2, \ldots.$$

(c) Show that if $c \in (0, \pi)$ and f has left- and right-hand derivatives at c, then the
cosine series for f converges to $\tfrac12[f(c-) + f(c+)]$. Also if f has a right-hand
derivative at 0, then the cosine series for f converges to $f(0+)$. If f has a left-hand
derivative at $\pi$, then the cosine series for f converges to $f(\pi-)$.
38.I. For each of the following functions defined on $[0, \pi]$, calculate the cosine
series and determine the limit of this series at each point.
(a) f(@x)=x; (b) f(x)=sinx;
(c) f(x)=1 for O=x <}n, (d) f(x)=3a0-x for 0<x <)n,
=0 for ja <x<7. =0 for ja <x <7.
(e) f(x)=x(a—x).
38.J. Let $f: [0, \pi] \to R$ be piecewise continuous and let $f_o \in PC(2\pi)$ be defined
by
$$f_o(x) = \begin{cases} f(x) & \text{for } x \in (0, \pi), \\ 0 & \text{for } x = 0, \pi, \\ -f(-x) & \text{for } x \in (-\pi, 0). \end{cases}$$

(a) Show that $f_o$ is an odd function; it is called the odd extension of f with period
$2\pi$.
(b) The Fourier series of $f_o$ is called the (Fourier) sine series of f. Show that it is
given by

$$\sum_{n=1}^{\infty} b_n \sin nx,$$

where

$$b_n = \frac{2}{\pi}\int_0^{\pi} f(t) \sin nt\,dt, \qquad n = 1, 2, \ldots.$$

(c) Show that if $c \in (0, \pi)$ and if f has left- and right-hand derivatives at c, then
the sine series for f converges to $\tfrac12[f(c-) + f(c+)]$. In any case, the sine series for f
converges to 0 at $x = 0, \pi$.
38.K. For each of the following functions defined on $[0, \pi]$, calculate the sine
series and determine the limit of this series at each point.
(a) f(x) =1; (b) f(x) =cos x;
(c) f(x)=1 for0 <x <7, (d) f(x)=a7-x;
=0 forja<x<7;
(e) f(x) =x(ar—x).
38.L. Let $f_n \in PC(2\pi)$ be the function such that $f_n(x) = n^{1/4}$ for $0 < x \le 1/n$ and
$f_n(x) = 0$ for other $x \in (-\pi, \pi]$. Show that $\|f_n\|_2 = 1/n^{1/4}$ so that the sequence $(f_n)$
converges to the zero function in the norm $\|\cdot\|_2$ but, since it is unbounded, the
convergence is not uniform.
38.M. If $f \in PC(2\pi)$ and if $\varepsilon > 0$, show that there exists a continuous function $f_1$
with period $2\pi$ such that $\|f - f_1\|_2 < \varepsilon$.
38.N. Use Parseval’s Equality 38.11 to establish the following formulas.
2 2 ca
roy l TL 1
(a) ecw (b) 8 =) Gao

(©) 007 moet


Ua (d) ae
9457 2,Sdae
38.O. If f and F belong to $PC(2\pi)$ and have Fourier coefficients $a_n$, $b_n$ and $A_n$,
$B_n$, respectively, show that

$$\frac{1}{\pi}\int_{-\pi}^{\pi} f(t)F(t)\,dt = \tfrac12 a_0 A_0 + \sum_{n=1}^{\infty} (a_n A_n + b_n B_n).$$

(Hint: apply Parseval's Equality to $f + F$.)
38.P. Use Dirichlet's Test 36.2 and Example 36.8 to show that the trigonometrical
series

$$\sum_{n=1}^{\infty} \frac{\sin nx}{n^{1/2}}$$

converges for all x. Show, however, that this series cannot be the Fourier series of
any function in $PC(2\pi)$.

38.Q. Let $L > 0$ and let $PC(2L)$ be the vector space of all functions $f: R \to R$
which have period 2L and are piecewise continuous.
(a) If we define $f \cdot g = \int_{-L}^{L} f(t)g(t)\,dt$ for $f, g \in PC(2L)$, show that the map
$(f, g) \mapsto f \cdot g$ is an inner product (in the sense of Definition 8.3) on $PC(2L)$.
Moreover the norm induced by this inner product (see 8.7) is

$$\|f\|_2 = \Bigl[\int_{-L}^{L} (f(t))^2\,dt\Bigr]^{1/2}.$$
(b) We let $C_0$, $C_n$, $S_n$, $n \in N$, be the functions in $PC(2L)$ given by

$$C_0(x) = \frac{1}{\sqrt{2L}}, \qquad C_n(x) = \frac{1}{\sqrt{L}} \cos \frac{n\pi x}{L}, \qquad S_n(x) = \frac{1}{\sqrt{L}} \sin \frac{n\pi x}{L}.$$

Show that this set of functions is orthonormal in the sense that

$$C_n \cdot S_m = 0, \qquad C_n \cdot C_m = \delta_{nm}, \qquad S_n \cdot S_m = \delta_{nm},$$

where $\delta_{nm} = 1$ if $n = m$ and $\delta_{nm} = 0$ if $n \ne m$. (Hint: if $L = \pi$, these are the relations
given before 38.3.)
(c) If $f \in PC(2L)$, we define the Fourier series of f on $[-L, L]$ to be the series

$$\tfrac12 a_0 + \sum_{n=1}^{\infty} \Bigl(a_n \cos \frac{n\pi x}{L} + b_n \sin \frac{n\pi x}{L}\Bigr),$$

where we have

$$a_0 = \frac{1}{L}\int_{-L}^{L} f(t)\,dt, \qquad a_n = \frac{1}{L}\int_{-L}^{L} f(t) \cos \frac{n\pi t}{L}\,dt, \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(t) \sin \frac{n\pi t}{L}\,dt$$

for $n = 1, 2, \ldots$.
(d) Reformulate the Convergence Theorems 38.7, 38.9, and 38.10 for Fourier
series of functions in $PC(2L)$. (Hint: make a change of variable.)
(e) If $f \in PC(2L)$, then Parseval's Equality becomes

$$\frac{1}{L}\,\|f\|_2^2 = \tfrac12 a_0^2 + \sum_{n=1}^{\infty} (a_n^2 + b_n^2),$$

where the norm of f is as in part (a) and the Fourier coefficients are as in part (c).
38.R. For each of the following functions on the specified interval, calculate the
Fourier series on this interval and determine the limit of this series at each point.
(@) f(x)=x on (2,2);
(b) f(x) =0 for —-4<x<0,
=x for O=x=4;
(c) f(x) =0 for —-3<x<0,
=1 for O<x<1l,
=0 for <x <3.
38.S. Let f be continuous and have period $2\pi$. Show that if the Fourier series
for f converges at $c \in [-\pi, \pi]$ to some number, then it converges to f(c).

38.T. Let f belong to $PC(2\pi)$ and suppose that $c \in [-\pi, \pi]$. If $\Gamma_n(f)$ denotes the
nth Fejér mean, defined in (38.16), show that

$$\lim_n \Gamma_n(f)(c) = \tfrac12[f(c-) + f(c+)].$$
38.U. Suppose that f and $f'$ are continuous with period $2\pi$ and that
$f'' \in PC(2\pi)$. (a) Show that the Fourier coefficients $a_n$, $b_n$ of f are such that the series

$$\sum n^4 (a_n^2 + b_n^2)$$
is convergent. Hence, there exists a constant $M > 0$ such that $|a_n| \le M/n^2$ and
$|b_n| \le M/n^2$ for all $n \in N$.
(b) Show that the Fourier series for $f'$ is the term-by-term derivative of the
Fourier series for f.
38.V. (a) If $k \in PC(2\pi)$ and if $x_0, x \in [-\pi, \pi]$, use the Schwarz Inequality to
show that

$$\Bigl|\int_{x_0}^{x} k(t)\,dt\Bigr| \le \|k\|_2\,|x - x_0|^{1/2} \le \|k\|_2 \sqrt{2\pi}.$$

(b) Use part (a) and the Norm Convergence Theorem 38.10 to show that if
$f \in PC(2\pi)$ and $x_0 \in [-\pi, \pi]$, then the Fourier series for f can be integrated
term-by-term:

$$\int_{x_0}^{x} f(t)\,dt = \tfrac12 a_0 (x - x_0) + \sum_{n=1}^{\infty} \int_{x_0}^{x} (a_n \cos nt + b_n \sin nt)\,dt,$$

and the resulting series is uniformly convergent for $x \in [-\pi, \pi]$.


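38.W. (a) Suppose that $a > 0$ is not an integer. Show that
$$\cos ax = \frac{2a \sin a\pi}{\pi}\Bigl[\frac{1}{2a^2} - \frac{\cos x}{a^2 - 1^2} + \frac{\cos 2x}{a^2 - 2^2} - \frac{\cos 3x}{a^2 - 3^2} + \cdots\Bigr]$$

for all $x \in [-\pi, \pi]$.

(b) Use part (a) to show that if $x \notin Z$, then

$$\cot \pi x = \frac{1}{\pi x} + \frac{2x}{\pi} \sum_{n=1}^{\infty} \frac{1}{x^2 - n^2}, \qquad \csc \pi x = \frac{1}{\pi x} + \frac{2x}{\pi} \sum_{n=1}^{\infty} \frac{(-1)^n}{x^2 - n^2}.$$

(c) Differentiate the first series in (b) term-by-term (justify this) to show that if
$x \notin Z$, then

$$\frac{\pi^2}{(\sin \pi x)^2} = \lim_{m \to \infty} \sum_{n=-m}^{m} \frac{1}{(x - n)^2}.$$
(d) Integrate the first series in (b) term-by-term (justify this) to show that if $x \notin Z$,
then

$$\frac{\sin \pi x}{\pi x} = \lim_{m \to \infty} \Bigl[\Bigl(1 - \frac{x^2}{1^2}\Bigr)\Bigl(1 - \frac{x^2}{2^2}\Bigr) \cdots \Bigl(1 - \frac{x^2}{m^2}\Bigr)\Bigr].$$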
VII
DIFFERENTIATION
IN $R^p$

In this chapter we shall present the theory of differentiable functions in
$R^p$ where $p > 1$. Although the theory is parallel to that presented in
Sections 27 and 28, there are several complications and new features that
arise. Some of these complications are due purely to the inevitable
notational complexity, but most arise because it is possible to approach a
point $c \in R^p$ from "many directions," so some new phenomena can occur.
In Section 27 we defined the derivative of a function $f: R \to R$ at a point
$c \in R$ in the traditional manner; namely, as the number $L \in R$ such that

$$L = \lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$

when this limit exists. Equivalently, we could have defined this derivative
to be the number L such that
$$\lim_{x \to c} \frac{|f(x) - f(c) - L(x - c)|}{|x - c|} = 0.$$
This limiting relation can be regarded as making precise the sense in which
we approximate the values f(x), for x sufficiently near c, by the values of
the affine† map

$$x \mapsto f(c) + L(x - c),$$
whose graph yields the line tangent to the graph of f at the point (c, f(c)).
It is this approach to the derivative that we shall use for functions on $R^p$
to $R^q$. Thus the derivative of a function f defined on a neighborhood of a
point $c \in R^p$ with values in $R^q$ will be a linear map $L: R^p \to R^q$ such that

$$\lim_{x \to c} \frac{\|f(x) - f(c) - L(x - c)\|}{\|x - c\|} = 0.$$
Hence we are approximating f(x), for x sufficiently near c, by the affine
map
$$x \mapsto f(c) + L(x - c)$$
of $R^p$ into $R^q$. [The reader should note that if $p = 1$, then the notation
$L(x - c)$ denotes the product of the real numbers L and $x - c$; however, if
$p > 1$, then $L(x - c)$ denotes the value of the linear map L at the vector
$x - c$.]

† In elementary courses, such a map is called "linear." However, to be consistent with the
more restricted use of the term "linear" introduced in Section 21, we shall use the term
"affine" to refer to a function obtained by adding a constant to a linear function.
Section 39 presents the definition and relates the derivative with the
various ‘‘partial’’ derivatives. In Section 40 we obtain the Chain Rule and
the Mean Value Theorem which are of central importance. Section 41
gives a penetrating analysis of the mapping properties of differentiable
functions, leading to the important Inversion and Implicit Function
Theorems, and culminating with the Parametrization and Rank
Theorems. The final section deals with extremum properties of real-valued
functions on $R^p$.

Section 39 The Derivative in $R^p$

Section 27 considered the derivative of a function with domain and
range in R. In this section we shall consider a function defined on a subset
of $R^p$ and with values in $R^q$ from a similar point of view.
If the reader will review Definition 27.1, he will note that it applies
equally well to a function defined on an interval J in R and with values in
the Cartesian space $R^q$. Of course, in this case L is a vector in $R^q$. The
only change required for this extension is to replace the absolute value in
equation (27.1) by the norm in the space $R^q$. Except for this, Definition
27.1 applies verbatim to this more general situation. That this situation is
worthy of study should be clear when it is realized that a function f on J to
$R^q$ can be regarded as being a curve in the space $R^q$ and that the derivative
(when it exists) of this function at the point $x = c$ yields a tangent vector to
the curve at the point f(c). Alternatively, if we think of x as denoting
time, then the function f is the trajectory of a point in $R^q$ and the
derivative $f'(c)$ denotes the velocity vector of the point at time $x = c$.
A fuller investigation of these lines of thought would take us farther into
differential geometry and dynamics than is desirable at present. Our aims
are more modest: we wish to organize the analytical machinery that would
make a satisfactory investigation possible and to remove the restriction
that the domain is in a one-dimensional space and allow the domain to
belong to the Cartesian space $R^p$. We shall now proceed to do this.
An analysis of Definition 27.1 shows that the only place where it is
necessary for the domain to consist of a subset of R is in equation (27.1),
where a quotient appears. Since we have no meaning for the quotient of a
vector in $R^q$ by a vector in $R^p$, we cannot interpret equation (27.1) as it
stands. We are led, therefore, to find reformulations of this equation.
One possibility which is of considerable interest is to take one-dimensional
"slices" passing through the point c in the domain. For simplicity it will be
supposed that c is an interior point of the domain D of the function; then
for any u in $R^p$, the point $c + tu$ belongs to D for sufficiently small real
numbers t.
39.1 DEFINITION. Let f be defined on a subset A of $R^p$ and have
values in $R^q$, let c be an interior point of A, and let u be any point in $R^p$.
A vector $L_u \in R^q$ is said to be the partial derivative of f at c with respect to
u if for each number $\varepsilon > 0$ there is a $\delta(\varepsilon) > 0$ such that for all $t \in R$
satisfying $0 < |t| < \delta(\varepsilon)$, we have

$$\Bigl\|\frac{1}{t}\{f(c + tu) - f(c)\} - L_u\Bigr\| < \varepsilon. \tag{39.1}$$

It is readily seen that the partial derivative $L_u$ defined in (39.1) is uniquely
determined when it exists. Alternatively, we can define $L_u$ as the limit

$$\lim_{t \to 0} \frac{1}{t}\{f(c + tu) - f(c)\},$$

or as the derivative at $t = 0$ of the function F defined by $F(t) = f(c + tu)$ for $|t|$
sufficiently small, and having values in $R^q$.
We shall write $D_u f(c)$ or $f_u(c)$ for the partial derivative $L_u$ of f at c with
respect to u. The first notation is greatly to be preferred when, as is often
the case, the symbol denoting the function has a subscript. We denote the
function $c \mapsto D_u f(c) = f_u(c)$ by $D_u f$ or $f_u$; it is defined for those interior
points c of A for which the required limit exists, and has values in $R^q$.
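The difference quotient in (39.1) is easy to evaluate numerically. The sketch below uses an illustrative map $f(x, y) = (x^2 y,\; x + y^3)$ from $R^2$ to $R^2$, which is an assumption of this example and not taken from the text, together with $c = (1, 2)$ and $u = (3, -1)$. Differentiating $F(t) = f(c + tu)$ by hand at $t = 0$ gives the value $(11, -9)$ used for comparison.

```python
def f(x, y):
    """Illustrative map R^2 -> R^2 (hypothetical example)."""
    return (x * x * y, x + y ** 3)

def difference_quotient(func, c, u, t):
    """(1/t) * (func(c + t*u) - func(c)), computed coordinate by coordinate."""
    base = func(*c)
    moved = func(*(ci + t * ui for ci, ui in zip(c, u)))
    return tuple((m - b) / t for m, b in zip(moved, base))

c, u = (1.0, 2.0), (3.0, -1.0)
exact = (11.0, -9.0)  # F'(0) for F(t) = f(c + t*u), computed by hand

for t in (1e-2, 1e-4, 1e-6):
    print(t, difference_quotient(f, c, u, t))
```

As t shrinks, the quotient approaches $(11, -9)$, which is the vector $L_u$ of Definition 39.1 for this choice of f, c, and u.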
It is clear that if f is real valued (so that $q = 1$) and if u is the vector
$e_1 = (1, 0, \ldots, 0)$ in $R^p$, then the partial derivative of f with respect to $e_1$
coincides with what is usually called the partial derivative of f with respect
to its first variable, which is often denoted by

$$D_1 f, \qquad f_{x_1}, \qquad \text{or} \qquad \frac{\partial f}{\partial x_1}.$$

In the same way, taking $e_2 = (0, 1, \ldots, 0), \ldots, e_p = (0, 0, \ldots, 1)$, we obtain
the partial derivatives of f with respect to the other variables:

$$D_2 f = f_{x_2} = \frac{\partial f}{\partial x_2}, \quad \ldots, \quad D_p f = f_{x_p} = \frac{\partial f}{\partial x_p}.$$

In case the symbol denoting the function has a subscript, we shall sometimes
insert a comma to indicate a partial derivative; thus, $D_j f_i = f_{i,j}$.
It should be observed that the partial derivative of a function at a point
with respect to one vector may exist, yet the partial derivative with respect
to another vector need not exist (see Exercise 39.A). It is also plain that,
under appropriate hypotheses, there are algebraic relations between
partial derivatives of sums and products of functions, etc. We shall not
bother to obtain these relations since they are either special cases of what
we shall do below, or can be proved similarly.
A word about terminology is in order. If u is a unit vector in $R^p$, then
the partial derivative $D_u f(c) = f_u(c)$ is often called the directional derivative
of f at c in the direction of u.

The Derivative

The main drawback of the partial derivative of a function f at a point c
with respect to a vector u is that it only gives a picture of the behavior of f
near c on the one-dimensional set $\{c + tu : t \in R\}$. In order to obtain more
complete information about f in a neighborhood of $c \in R^p$, we shall
introduce the notion of the derivative of f at c, which is a linear map from
$R^p$ to $R^q$.
39.2 DEFINITION. Let f have domain A in $R^p$ and range in $R^q$, and
let c be an interior point of A. We say that f is differentiable at c if there
exists a linear function $L: R^p \to R^q$ such that for every $\varepsilon > 0$ there exists
$\delta(\varepsilon) > 0$ such that if $x \in R^p$ is any vector satisfying $\|x - c\| \le \delta(\varepsilon)$, then
$x \in A$ and

$$\|f(x) - f(c) - L(x - c)\| \le \varepsilon\,\|x - c\|. \tag{39.2}$$

Alternatively, (39.2) can be rephrased by requiring that for any $\varepsilon > 0$ there exists
$\delta(\varepsilon) > 0$ such that if $u \in R^p$ and $\|u\| \le \delta(\varepsilon)$, then

$$\|f(c + u) - f(c) - L(u)\| \le \varepsilon\,\|u\|, \tag{39.3}$$

which, in turn, can be expressed more compactly by writing

$$\lim_{u \to 0} \frac{\|f(c + u) - f(c) - L(u)\|}{\|u\|} = 0. \tag{39.4}$$

We will see below that such a linear function L is uniquely determined
when it exists. It is called† the derivative of f at c and we shall often
denote it by Df(c) instead of L. We shall often write Df(c)(u) for L(u),
and $Df(c)(x - c)$ for $L(x - c)$.

† The reader is warned that L is sometimes called the Fréchet derivative, or the differential, of
f at c, and is sometimes denoted by df(c) or $f'(c)$, etc.

From an analytic point of view, the existence of the derivative of f at c
reflects the possibility of approximating the mapping $x \mapsto f(x)$ by the
mapping $x \mapsto f(c) + L(x - c)$. Inequality (39.2) gives a measure of the
closeness of this approximation when x is near to c. Because of the
linearity of L, we have

$$f(c) + L(x - c) = (f(c) - L(c)) + L(x).$$
Hence we are approximating $x \mapsto f(x)$ by a function of the form
$x \mapsto y_0 + L(x)$, where $y_0$ is fixed. Such functions are called affine mappings of
$R^p$ into $R^q$; they are merely translations of linear mappings and so have a
very simple character.
From a geometric point of view, the existence of the derivative of f at c
reflects the existence of a tangent plane to the surface $\{(x, f(x)) : x \in A\}$ in
$R^p \times R^q$ at the point (c, f(c)); namely, the plane given by the graph

$$\{(x, f(c) + L(x - c)) : x \in R^p\}. \tag{39.5}$$
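The quality of the affine approximation can be observed numerically through the quotient in (39.4). The sketch below again assumes the illustrative map $f(x, y) = (x^2 y,\; x + y^3)$ at $c = (1, 2)$, and takes as candidate derivative the linear map with matrix rows (4, 1) and (1, 12), obtained by hand from the partial derivatives; none of these choices comes from the text.

```python
import math

def f(x, y):
    """Illustrative map R^2 -> R^2 (hypothetical example)."""
    return (x * x * y, x + y ** 3)

C = (1.0, 2.0)
L = ((4.0, 1.0), (1.0, 12.0))  # candidate derivative at C, computed by hand

def remainder_ratio(u):
    """||f(C + u) - f(C) - L(u)|| / ||u||, the quotient appearing in (39.4)."""
    fc = f(*C)
    fu = f(C[0] + u[0], C[1] + u[1])
    lu = tuple(row[0] * u[0] + row[1] * u[1] for row in L)
    num = math.hypot(fu[0] - fc[0] - lu[0], fu[1] - fc[1] - lu[1])
    return num / math.hypot(u[0], u[1])

# Shrink u along a fixed direction and watch the ratio tend to 0.
ratios = [remainder_ratio((3 * s, -s)) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratio falls roughly in proportion to $\|u\|$, which is the behavior (39.4) requires of the derivative.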

We shall now establish the uniqueness of the derivative.


39.3 LEMMA. The function f has at most one derivative at a point.
PROOF. Suppose $L_1$, $L_2$ are linear functions from $R^p$ to $R^q$ and satisfy
(39.3) for $\|u\| \le \delta(\varepsilon)$. Then we have

$$0 \le \|L_1(u) - L_2(u)\| \le \|f(c + u) - f(c) - L_1(u)\| + \|f(c + u) - f(c) - L_2(u)\| \le 2\varepsilon\,\|u\|.$$

Therefore we have $0 \le \|L_1(u) - L_2(u)\| \le 2\varepsilon\,\|u\|$ for all $u \in R^p$ with
$\|u\| \le \delta(\varepsilon)$. If $L_1 \ne L_2$, there exists $z \in R^p$ with $L_1(z) \ne L_2(z)$, whence $z \ne 0$.
Now let $z_0 = (\delta(\varepsilon)/\|z\|)z$ so that we have $\|z_0\| = \delta(\varepsilon)$ and hence
$0 \le \|L_1(z_0) - L_2(z_0)\| \le 2\varepsilon\,\|z_0\|$. Hence, by the linearity of $L_1$ and $L_2$,
$\|L_1(z) - L_2(z)\| \le 2\varepsilon\,\|z\|$ for all $\varepsilon > 0$, so
$L_1(z) = L_2(z)$, a contradiction. Therefore $L_1 = L_2$. Q.E.D.
39.4 EXAMPLES. (a) Let $A \subseteq R^p$, let $y_0 \in R^q$, and let $f_0: A \to R^q$ be
the "constant function" defined by $f_0(x) = y_0$ for $x \in A$. If c is an interior
point of A and $x \in A$, then $f_0(x) - f_0(c) = 0$. It follows that $f_0$ is differentiable
at c and that the derivative $Df_0(c) = 0$, the "zero linear function" that
maps every element of $R^p$ into the zero element of $R^q$. Hence the
derivative at any point of a constant function is the zero linear function.
(b) Let $A = R^p$ and let $f_1: A \to R^q$ be a linear function. If $c \in A$ and
$x \in A$, then $f_1(x) - f_1(c) - f_1(x - c) = 0$. It follows from this that $f_1$ is
differentiable at c and that $Df_1(c) = f_1$. Hence the derivative at any point of
a linear function is the linear function itself.

39.5 LEMMA. If $f: A \to R^q$ is differentiable at $c \in A$, then there exist
strictly positive numbers $\delta$, K such that if $\|x - c\| \le \delta$, then

$$\|f(x) - f(c)\| \le K\,\|x - c\|. \tag{39.6}$$

In particular it follows that f is continuous at $x = c$.
PROOF. By Definition 39.2 it follows that there exists $\delta > 0$ such that if
$0 < \|x - c\| \le \delta$, then (39.2) holds with $\varepsilon = 1$. If we use the Triangle
Inequality, we have

$$\|f(x) - f(c)\| \le \|L(x - c)\| + \|x - c\|$$

for $0 < \|x - c\| \le \delta$. By Theorem 21.3 there exists $B > 0$ such that
$\|L(x - c)\| \le B\,\|x - c\|$ for all $x \in R^p$. Therefore, if $0 < \|x - c\| \le \delta$ we obtain

$$\|f(x) - f(c)\| \le (B + 1)\,\|x - c\|,$$

and this inequality remains true also for $x = c$. Q.E.D.

We now show that the existence of the derivative at a point implies the
existence of all of the partial derivatives at that point.
39.6 THEOREM. If $A \subseteq R^p$, if $f: A \to R^q$ is differentiable at a point
$c \in A$, and if u is any element of $R^p$, then the partial derivative $D_u f(c)$ of f at c
with respect to u exists. Moreover,

$$D_u f(c) = Df(c)(u). \tag{39.7}$$

PROOF. Since f is differentiable at c, given $\varepsilon > 0$ there exists $\delta(\varepsilon) > 0$
such that

$$\|f(c + tu) - f(c) - Df(c)(tu)\| \le \varepsilon\,\|tu\|$$

provided $\|tu\| \le \delta(\varepsilon)$. If $u = 0$, then the partial derivative with respect to 0
is readily seen to be $0 = Df(c)(0)$; hence we suppose $u \ne 0$. Thus if
$0 < |t| \le \delta(\varepsilon)/\|u\|$, we have

$$\Bigl\|\frac{1}{t}\{f(c + tu) - f(c)\} - Df(c)(u)\Bigr\| \le \varepsilon\,\|u\|.$$

This shows that Df(c)(u) is the partial derivative of f at c with respect to u,
as claimed. Q.E.D.

39.7 COROLLARY. Let $A \subseteq R^p$, let $f: A \to R$ and let c be an interior
point of A. If the derivative Df(c) exists, then each of the partial derivatives
$D_1 f(c), \ldots, D_p f(c)$ exists in R and if $u = (u_1, \ldots, u_p) \in R^p$, then

$$Df(c)(u) = u_1 D_1 f(c) + \cdots + u_p D_p f(c). \tag{39.8}$$

PROOF. The theorem implies that for each of the vectors $e_1, \ldots, e_p$ the
partial derivatives $D_1 f(c), \ldots, D_p f(c)$ exist and equal $Df(c)(e_1), \ldots,
Df(c)(e_p)$. However, since Df(c) is linear and $u = u_1 e_1 + \cdots + u_p e_p$ we
deduce that

$$Df(c)(u) = \sum_{j=1}^{p} u_j\,Df(c)(e_j) = \sum_{j=1}^{p} u_j\,D_j f(c).$$
Q.E.D.
REMARKS. (a) The converse of Corollary 39.7 is not always true, for the partial
derivatives of f may exist without the derivative existing. For example, let
$f: R^2 \to R$ be defined by

$$f(x, y) = \begin{cases} 0 & \text{for } (x, y) = (0, 0), \\[0.5ex] \dfrac{xy^2}{x^2 + y^2} & \text{for } (x, y) \ne (0, 0). \end{cases}$$

It is an exercise to show that the partial derivative of f with respect to the vector
(a, b) at (0, 0) is given by

$$D_{(a,b)} f(0, 0) = \frac{ab^2}{a^2 + b^2}, \qquad (a, b) \ne (0, 0). \tag{39.9}$$

In particular, $D_1 f(0, 0) = 0$ and $D_2 f(0, 0) = 0$. If the derivative Df existed at (0, 0),
Corollary 39.7 would imply that

$$D_{(a,b)} f(0, 0) = Df(0, 0)(a, b) = a \cdot 0 + b \cdot 0 = 0,$$

contrary to (39.9).
(b) We shall see below that if $A \subseteq R^p$ and if the partial derivatives of $f: A \to R^q$
are continuous at c, then Df(c) exists.
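The failure of differentiability in Remark (a) can be seen numerically. The sketch below takes the function to be $f(x, y) = xy^2/(x^2 + y^2)$ away from the origin and $f(0, 0) = 0$, an assumption chosen to be consistent with formula (39.9); the step size t is arbitrary because the difference quotient at the origin does not depend on it.

```python
def f(x, y):
    """f(x, y) = x*y^2/(x^2 + y^2) away from the origin, 0 at the origin (assumed form)."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * y / (x * x + y * y)

def directional(a, b, t=1e-6):
    """Difference quotient (f(t*a, t*b) - f(0, 0)) / t at the origin."""
    return (f(t * a, t * b) - f(0.0, 0.0)) / t

d1 = directional(1.0, 0.0)    # partial derivative with respect to e_1
d2 = directional(0.0, 1.0)    # partial derivative with respect to e_2
d11 = directional(1.0, 1.0)   # derivative with respect to (1, 1)
print(d1, d2, d11)
```

Both coordinate partial derivatives vanish, yet the derivative with respect to (1, 1) is close to 1/2 rather than the value $1 \cdot 0 + 1 \cdot 0 = 0$ that formula (39.8) would force if Df(0, 0) existed.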

39.8 EXAMPLES. (a) Let $A \subseteq R$ and let $f: A \to R$. Then f is
differentiable at an interior point c of A in the sense of Definition 39.2 if
and only if the ordinary derivative

$$f'(c) = \lim_{t \to 0} \frac{f(c + t) - f(c)}{t}$$

exists. In this case the derivative Df(c) is the linear function of R into R
defined by
$$u \mapsto f'(c)u.$$

Thus Df(c) maps $u \in R$ into the product of $f'(c)$ and u. (In matrix
terminology, the derivative Df(c) is the linear mapping represented by the
$1 \times 1$ matrix whose only element is $f'(c)$.)
Traditionally, instead of writing u for the real number on which the linear
function Df(c) acts, one writes the somewhat peculiar symbol dx (here the "d"
plays the role of a prefix and has no other significance). When this is done and the
Leibniz† notation for the derivative is used, the formula $Df(c)(u) = f'(c)u$ becomes

$$Df(c)(dx) = \frac{df}{dx}(c)\,dx.$$

(b) Let $A \subseteq R$ and let $f: A \to R^q$ ($q > 1$). Hence f can be represented
by the "coordinate functions"

$$f(x) = (f_1(x), \ldots, f_q(x)), \qquad x \in A.$$

As an exercise the reader should prove that f is differentiable at an interior
point c of A if and only if each of the real-valued functions $f_1, \ldots, f_q$ has a
derivative at c. In this case, the derivative Df(c) is the linear function of R
into $R^q$ given by

$$u \mapsto u\,(f_1'(c), \ldots, f_q'(c)), \qquad u \in R.$$

Hence Df(c) maps a real number u into the product of u and a fixed vector
$f'(c) = (f_1'(c), \ldots, f_q'(c))$. When f is thought of as being a "curve," this
vector is called the "tangent vector" to f at the point f(c).
(c) Let $A \subseteq R^p$ ($p > 1$) and let $f: A \to R$. Then it follows from Corollary
39.7 that if the derivative Df(c) exists at a point c interior to A, then
each of the partial derivatives $D_1 f(c), \ldots, D_p f(c)$ must exist and that Df(c)
is the linear mapping of $u = (u_1, \ldots, u_p) \in R^p$ into R given by

$$Df(c)(u) = u_1 D_1 f(c) + \cdots + u_p D_p f(c).$$

Although the mere existence of these partial derivatives does not imply the
existence of the derivative, we shall see below that their continuity at c
does guarantee its existence.
Sometimes, instead of $u = (u_1, \ldots, u_p)$ we write $dx = (dx_1, \ldots, dx_p)$ for the point
in $R^p$ on which the derivative is to act. When this and Leibniz' notation are
employed for the partial derivatives, the above formula becomes

$$Df(c)(dx) = \frac{\partial f}{\partial x_1}(c)\,dx_1 + \cdots + \frac{\partial f}{\partial x_p}(c)\,dx_p.$$

(d) Now let $A \subseteq R^p$ and $f: A \to R^q$ where both $p > 1$, $q > 1$. In this
case we can represent $y = f(x)$ by a system

$$y_1 = f_1(x_1, \ldots, x_p), \quad \ldots, \quad y_q = f_q(x_1, \ldots, x_p)$$

of q functions of p arguments. If f is differentiable at a point
$c = (c_1, \ldots, c_p)$ in A, then it is an exercise to show that each of the partial
derivatives $D_j f_i(c)$ $(= f_{i,j}(c))$ must exist at c. (Again this latter condition is
not sufficient, in general, for the differentiability of f at c.) When Df(c)
exists, it is the linear function which maps the point $u = (u_1, \ldots, u_p)$ of $R^p$
into the point $w = (w_1, \ldots, w_q)$ of $R^q$ given by
$$\begin{aligned} w_1 &= D_1 f_1(c)u_1 + D_2 f_1(c)u_2 + \cdots + D_p f_1(c)u_p, \\ &\;\;\vdots \\ w_q &= D_1 f_q(c)u_1 + D_2 f_q(c)u_2 + \cdots + D_p f_q(c)u_p. \end{aligned} \tag{39.10}$$
The derivative Df(c) is the linear mapping of $R^p$ into $R^q$ determined by
the $q \times p$ matrix whose elements are

$$\begin{bmatrix} D_1 f_1(c) & D_2 f_1(c) & \cdots & D_p f_1(c) \\ D_1 f_2(c) & D_2 f_2(c) & \cdots & D_p f_2(c) \\ \vdots & \vdots & & \vdots \\ D_1 f_q(c) & D_2 f_q(c) & \cdots & D_p f_q(c) \end{bmatrix} = \begin{bmatrix} f_{1,1}(c) & f_{1,2}(c) & \cdots & f_{1,p}(c) \\ f_{2,1}(c) & f_{2,2}(c) & \cdots & f_{2,p}(c) \\ \vdots & \vdots & & \vdots \\ f_{q,1}(c) & f_{q,2}(c) & \cdots & f_{q,p}(c) \end{bmatrix}. \tag{39.11}$$

We have already remarked (see Theorem 21.2) that such an array of real
numbers determines a linear function on $R^p$ to $R^q$. The matrix (39.11) is
called the Jacobian matrix of the system (39.10) at the point c. When $p = q$, the
determinant of the matrix (39.11) is called the Jacobian determinant, or
simply the Jacobian of the system (39.10) at the point c. Frequently, this
Jacobian† determinant is denoted by

$$\frac{\partial(f_1, f_2, \ldots, f_p)}{\partial(x_1, x_2, \ldots, x_p)}\bigg|_{x=c} \qquad \text{or} \qquad J_f(c).$$

† GOTTFRIED WILHELM LEIBNIZ (1646-1716) is, with ISAAC NEWTON (1642-1727), one of the
coinventors of calculus. Leibniz spent most of his life serving the dukes of Hanover and was a
universal genius. He contributed greatly to mathematics, law, philosophy, theology, linguistics,
and history.
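A Jacobian matrix can be approximated entry by entry with difference quotients. The sketch below uses the polar-coordinate map $(r, \theta) \mapsto (r \cos\theta,\; r \sin\theta)$ as an illustrative example that is not taken from the text; its Jacobian determinant is known to equal r. The helper name `jacobian` and the central-difference step h are choices made for this sketch.

```python
import math

def polar(r, theta):
    """Illustrative map (r, theta) -> (r cos theta, r sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(func, c, h=1e-6):
    """q x p matrix whose (i, j) entry approximates D_j f_i(c) by central differences."""
    p = len(c)
    columns = []
    for j in range(p):
        up, down = list(c), list(c)
        up[j] += h
        down[j] -= h
        f_up, f_down = func(*up), func(*down)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_up, f_down)])
    q = len(columns[0])
    # Entry (i, j): i indexes the coordinate function, j the variable.
    return [[columns[j][i] for j in range(p)] for i in range(q)]

J = jacobian(polar, (2.0, math.pi / 6))
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
print(J, det)
```

At $(r, \theta) = (2, \pi/6)$ the computed determinant is close to 2, and the rows of J list the partial derivatives of each coordinate function in the same arrangement as the matrix described above.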

Existence of the Derivative

It was proved in Theorem 39.6 that the existence of the derivative at a


point implies the existence of all the partial derivatives at that point. It
was seen in the remark after Corollary 39.7 that the mere existence of the
partial derivatives does not imply the existence of the derivative even when
p=2, q=1. We shall now show that the continuity of the partial
derivatives at c is sufficient for the existence of the derivative at c.

† CARL (G. J.) JACOBI (1804-1851) was professor at Königsberg and Berlin. His main work
was concerned with elliptic functions, but he is also known for his work in determinants and
dynamics.

39.9 THEOREM. Let ACR?, let f:A — R‘, and let c be an interior
point of A. If the partial derivatives Df ((=1,...,q,,=1,..., p) exist
in a neighborhood of c and are continuous at c, then f is differentiable’
atc. Moreover Df(c) is represented by the q Xp matrix (39.11).
PROOF. We shall treat the case q = 1 in detail. If ¢ >0 let (ce) >0 be
such that if ||y—cl| =< 6(e) and j=1,2,..., p, then
(39.12) ID,f
— Dif(c)|<e.
(y)
Tf x = (x1, X2,..., Xp) and c = (ci, €2,..., Cp), let z:, Z2,..., Zp-1 denote the
points
Z1 = (C1, X2,.. +, Xp), Zo
= (C1, €2, X3,.-- Xp),
wey Zp = (c1, Cay. . 64 Cpa, Xp)

and let zo=x and z,=c. If ||x—cl|<=6(e), then it is easily seen that
||z; -¢l| = 8(e) for 7] =0,1,...,p. We write the difference f(x)—f(c) asa
telescoping sum:

fle) fle) = ¥ fe) fled)


If we apply the Mean Value Theorem 27.6 to the jth term of this sum, we
obtain a point Z,, lying on the line segment joining z;-, and z;, such that
(2-1) — fla) = (4 — ¢) Dif (Z).
Therefore, we obtain

f(x)~ fle) ~ ¥ (Dil) = & ie MDF) - flO


In view of the inequality (39.12), each quantity appearing in braces in the
last formula is dominated by $\varepsilon$. Applying the Schwarz Inequality to this
last sum, we obtain the estimate

\[ \Bigl| f(x) - f(c) - \sum_{j=1}^{p} (x_j - c_j)\, D_j f(c) \Bigr| \le (\varepsilon \sqrt{p})\, \|x - c\| \]

whenever $\|x - c\| \le \delta(\varepsilon)$.
We have proved that $f$ is differentiable at $c$ and that its derivative $Df(c)$
is the linear function from $R^p$ to $R$ given by

\[ u = (u_1, \ldots, u_p) \mapsto Df(c)(u) = \sum_{j=1}^{p} u_j\, D_j f(c). \]


In the case where $f$ takes values in $R^q$ with $q > 1$, we apply the same
argument to each of the real-valued functions $f_i$, $i = 1, 2, \ldots, q$, which
occur in the coordinate representation of the mapping $f$. We shall leave
the details of this argument as an exercise. Q.E.D.
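The conclusion of Theorem 39.9 can be checked numerically on a concrete function. The following is a minimal sketch, assuming a sample function $f(x, y) = x^2 y + \sin y$ and a sample point $c = (1, 2)$, neither of which comes from the text. It evaluates the quotient $|f(c+u) - f(c) - \sum_j u_j D_j f(c)| / \|u\|$ for shrinking $u$; since the partial derivatives of this $f$ are continuous, the theorem predicts that the quotient tends to 0.

```python
import math


def f(x, y):
    # Hypothetical sample function (not from the text); its partial
    # derivatives are continuous everywhere.
    return x * x * y + math.sin(y)


def partials(x, y):
    # Exact partial derivatives D_1 f and D_2 f of the sample function.
    return (2.0 * x * y, x * x + math.cos(y))


def remainder_ratio(c, u):
    # |f(c + u) - f(c) - Df(c)(u)| / ||u||, with Df(c)(u) = sum_j u_j D_j f(c).
    d1, d2 = partials(*c)
    linear = u[0] * d1 + u[1] * d2
    diff = f(c[0] + u[0], c[1] + u[1]) - f(*c) - linear
    return abs(diff) / math.hypot(u[0], u[1])


c = (1.0, 2.0)
# Shrink the increment u = (h, -h) with h = 10^-1, ..., 10^-5.
ratios = [remainder_ratio(c, (10.0 ** -k, -(10.0 ** -k))) for k in range(1, 6)]
for k, r in enumerate(ratios, start=1):
    print(f"h = 1e-{k}: ratio = {r:.3e}")
```

The printed ratios should shrink roughly in proportion to h, which is the behavior the definition of the derivative requires.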

Exercises
39.4. Let f:R’—R be defined by

fl y=) for y~0,


=0 for
y =0.

Show that the partial derivatives D,f(0, 0), D.f(0, 0) exist and equal 0. However,
the derivative of f at (0,0) with respect to a vector u=(a,b) does not exist if
ab#0. Show that f is not continuous at (0, 0); indeed, f is not even bounded on a
neighborhood of (0, 0).
39.B. Let $g : R^2 \to R$ be defined by

\[ g(x, y) = \begin{cases} 0 & \text{for } xy = 0, \\ 1 & \text{for } xy \neq 0. \end{cases} \]

Show that the partial derivatives $D_1 g(0, 0)$, $D_2 g(0, 0)$ exist and equal 0. However,
the derivative of $g$ at $(0, 0)$ with respect to a vector $u = (a, b)$ does not exist if
$ab \neq 0$. Show that $g$ is not continuous at $(0, 0)$; however, $g$ is bounded on a
neighborhood of $(0, 0)$.
39.C. Let $h : R^2 \to R$ be defined by

\[ h(x, y) = \begin{cases} 0 & \text{for } (x, y) = (0, 0), \\ \dfrac{xy}{x^2 + y^2} & \text{for } (x, y) \neq (0, 0). \end{cases} \]

Show that the partial derivatives $D_1 h(0, 0)$, $D_2 h(0, 0)$ exist and equal 0. However,
the derivative of $h$ at $(0, 0)$ with respect to a vector $u = (a, b)$ does not exist if
$ab \neq 0$. Show that $h$ is not continuous at $(0, 0)$.
39.D. Let $k : R^2 \to R$ be defined by

\[ k(x, y) = \begin{cases} 0 & \text{for } (x, y) = (0, 0), \\ \dfrac{xy^2}{x^2 + y^4} & \text{for } (x, y) \neq (0, 0). \end{cases} \]

Show that the partial derivative of $k$ at $(0, 0)$ with respect to any vector $u = (a, b)$
exists and that

\[ D_u k(0, 0) = \frac{b^2}{a} \quad \text{if } a \neq 0. \]

Show that $k$ is not continuous and hence not differentiable at $(0, 0)$.
39.E. Let f:R’— R be defined by

f(x, yy=0 for (x, y) = (0, 0),


x 2
__*y
“Say for (x,
y) 4 (0, 0).

Show that the partial derivative of f at (0,0) with respect to any vector u = (a, b)
exists and that

ab?
D.,f(O, 0) = woab? if (a, b) # (0, 0).

Show that f is continuous but not differentiable at (0, 0).


39.F. Let $F : R^2 \to R$ be defined by

\[ F(x, y) = \begin{cases} x^2 + y^2 & \text{if both } x, y \text{ are rational}, \\ 0 & \text{otherwise}. \end{cases} \]

Show that $F$ is continuous only at the point $(0, 0)$ and that it is differentiable there.
39.G. Let $G : R^2 \to R$ be defined by

\[ G(x, y) = \begin{cases} (x^2 + y^2) \sin \dfrac{1}{x^2 + y^2} & \text{for } (x, y) \neq (0, 0), \\ 0 & \text{for } (x, y) = (0, 0). \end{cases} \]

Show that $G$ is differentiable at every point of $R^2$ but the partial derivatives $D_1 G$,
$D_2 G$ are not bounded (and hence not continuous) on a neighborhood of $(0, 0)$.
39.H. Let H: R?— R® be defined by

H(, y) = (x?+27 sin, y) for x40,

=(0, y) forx =0.

Show that D,H exists at every point and that D,H exists and is continuous on a
neighborhood of (0,0). Show that H is differentiable at (0, 0).
39.I. Let $A \subseteq R^p$, let $f : A \to R^q$ be differentiable at a point $c$ interior to $A$, and
let $v \in R^q$. If we define $g : A \to R$ by $g(x) = f(x) \cdot v$ for all $x \in A$, show that $g$ is
differentiable at $c$ and that

\[ Dg(c)(u) = (Df(c)(u)) \cdot v \quad \text{for } u \in R^p. \]
39.J. Let $c$ be an interior point of $A \subseteq R^p$ and let $f : A \to R$.
(a) If $f$ is differentiable at $c$, show that there exists a unique vector $v_c \in R^p$ such
that

\[ D_u f(c) = v_c \cdot u = Df(c)(u) \quad \text{for all } u \in R^p. \]

The vector $v_c$ is called the gradient of $f$ at $c$ and is denoted by $\nabla_c f$, or by grad $f(c)$.
Show that

\[ \nabla_c f = (D_1 f(c), \ldots, D_p f(c)). \]

(b) Use the Schwarz Inequality to show that if $u \in R^p$ and $\|u\| = 1$, then the
function $u \mapsto D_u f(c)$ has a maximum value when $u$ is a positive multiple of $\nabla_c f$.
Hence the direction in which the directional derivative of $f$ at $c$ is maximum is that
of the gradient of $f$ at $c$.
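39.K. Let $c$ be an interior point of $A \subseteq R^p$, let $f, g : A \to R$ be differentiable at $c$,
and let $\alpha \in R$. Show that

\[ \nabla_c(\alpha f) = \alpha \nabla_c f, \qquad \nabla_c(f + g) = \nabla_c f + \nabla_c g, \]
\[ \nabla_c(fg) = f(c)\, \nabla_c g + g(c)\, \nabla_c f. \]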
39.L. Find the gradients of the following functions at an arbitrary point in R’.
(a) filx, y, Zz) =x? ty? +275
(b) fax, y, z) =x? yz +275
(c) falx, y, 2) = xyz.
39.M. Find the directional derivatives of each of the functions in 39.L at the
point (0, 1, 2) in the direction toward the point (0, 2, 3).
39.N. Let ACR? and let a function f:A—R represent a surface S,; in R®
explicitly as its graph:

S, ={x, ys f(y): y)€ A}.


If f is differentiable at an interior point (xo, yo) of A, then the tangent plane to S, ai
the point (Xo, yo, f(X0, Yo)) is given by the graph of the affine map A,,,,.:R?—>R
defined by

A coxol% ¥) = f(Xo, Yo) + Df (Xo, Yo(% — Xa, ¥— Yo)-

Show that the tangent plane to S, at this point is

{x y, z) eR? : 2 = f(a, Yo) + Dif(Xos Vox — X0) + Daf (Xo, yoy — Yo).
39.0. Find the tangent planes to the surfaces in R® represented as graphs of the
following functions of the points specified. Draw a sketch.
(a) f(x, y)=x?+y? at (0,0) and at (1, 2).
(b) f.(x, y)=xy at (0, 0) and at (1, 2).
(c) falx, y) =(4—(x?+ y?))'” at (0, 0) and at (1, 1).
39.P. Let J&R be an interval and let g:J— R® represent a curve C, in R®
parametrically:

C, ={(a.(t), g(t), g(t): t © J}.


If g is differentiable at an interior point t, of J, then the tangent space to C, at the
point g(t.)=(gilto), g2(to), gs(to))ER° is given parametrically by the affine map
A,:R— R? defined by

Ag(t) = g (ta) + Dg (ta)(t— th).


Show that the tangent space to C, at this point is

{(x, y, Z)E Reix= &i(to) + gilto)(t —to)s

Y = Balto) + grltol(t—to), —-Z = Balto)+ gs(to)(t — to).


If gi(ts), 2(to), ga(t.) are not all zero, then this tangent space is a line in R° and is
called the tangent line.
39.Q, Find parametric equations for the tangent lines to the following curves in

R’ at the specified points:


(a) git yy, z2=@Ge,t°)
at the points corresponding to t=0 and t=1.
(b) g:tr (x, y, z)=(t-1, #’, 2)
at the points corresponding to t=0 and t=1.
(c) git (x, y, z)=(2 cost, 2 sin t, t)
at the points corresponding to t= 7/2 and t=.
39.R. Let A ¢R’ and let h: A > R’ represent a surface S, in R’ parametrically:

S, = {(hils, t), hols, t), ha(s, t)):(s, t) € A}.


If h is differentiable at an interior point (so, 1.) of A, then the tangent space to S, at
the point h(so, to) = (hilSo, 0), He(So, to), ha(So, to))€ R’ is given parametrically by the
affine map A Qo,.):.R — R? defined by

A Go. 8, t) = W(S0, to) + Dh (So, to)(S — So, t— to).

Show that the tangent space to S, at this point is

{(x, y, Z)E R?:x = hy(S0, to} + Diha(So, to)(S — So) + DahalSo, to)(t — to),
yr ha(So, to) + D,hi(so, to)(s _ Si) + D,hi(so, to)(t _ to),

Z= hA(So, ty) + D,h,(so, to)(s _ So) + Dyha(so, to)(t _ to)}.

If the vectors (Djhi(So, to), Diho(o, toe), Dihs(So, to)) and (Dh, (So, to), Drha(So, to),
D,h,(So, to)) in R* are not multiples of each other, then this tangent space is a plane
in R° and is called the tangent plane.
39.8. Find parametric equations for the tangent planes to the following surfaces
in R° at the specified points.
(a) h:(s, t)> (x, y, z)=(s, tf, s7+127) at the points corresponding to (s, t) = (0, 0)
and (1, 1).
(b) h:(s, t)> (x, y, z)=(s+4, s—t, s?—t”) at the points corresponding to (s, i) =
(0, 0) and (1, 2).
(c) h:(s, thw (x, y, z)=(s cost, s sin t, t) at the points corresponding to (s, t)=
(1, 0) and (2, 7/2).
(d) h:(s, t) > (x, y, z) = (cos s sin t, sins sin t, cos t) at the points corresponding
to (s, t)=(0, 0), (0, 2/2) and (7/4, 27/4).
39.T. If ACR?’ and f: A > R is such that the partial derivatives D,f,...,D,f
exist and are bounded on some neighborhood of c € A, then f is continuous at c.
(Hint: argue as in the proof of Theorem 39.9.)
39.U. Let f be defined on a neighborhood of a point c ¢ R’ with values in R.
Suppose that D.f exists and is continuous on a neighborhood of c and that D.f
exists at c. Show that f is differentiable at c.
39.V. Let ACR? and let f: A > R‘ and g:A —R’ be given. If F:A > R'X
R’ = R*™ is defined by F(x) = (f(x), g(x)) for x € A, show that F is differentiable at
an interior point c € A if and only iff and g are differentiable at c. In this case we
have

DF(c)(u)
= (Df (c)(u), Dee (u)) for ue R’.

39.W. Let ACR’ and BCR‘ and let G:AXB—R' be differentiable at a


point (a,b) in AX B. We define g,:A — R' and g,:B —R’ to be the “partial
maps” at (a, b) given by

gi(x)= G(x, b), — galy)= Gla, y)


for all xe A, yeB. Show that g, and g, are differentiable at a and },
respectively, and that

Dg.(a)(u) = DG(a, bu, 0), Dg.(b)(v) = DG(a, b)(0, v),


for all ue R’, ve R*. Moreover, we have

DG(a, b)(u, v) = Dg:(a)(u) + Dg.(b)(v).


[Sometimes Dg,(a)e £(R’, R') and Dg.(b)<¢ £(R°,R') are called the “block
partial derivatives” of G at (a, b) and are denoted by D,,G(a, b) and D,.,G(a, b).]

Section 40 The Chain Rule and Mean Value Theorems

We shall first establish the basic algebraic relations concerning the
derivative. These properties, which are the same as those for real-valued
functions of one variable, will be used frequently in the following.
40.1 THEOREM. Let $A \subseteq R^p$ and let $c$ be an interior point of $A$.
(a) If $f$ and $g$ are defined on $A$ to $R^q$ and are differentiable at $c$, and if $\alpha$,
$\beta \in R$, then the function $h = \alpha f + \beta g$ is differentiable at $c$ and
\[ Dh(c) = \alpha\, Df(c) + \beta\, Dg(c). \]
(b) If $\varphi : A \to R$ and $f : A \to R^q$ are differentiable at $c$, then the product
function $k = \varphi f : A \to R^q$ is differentiable at $c$ and
\[ Dk(c)(u) = \{ D\varphi(c)(u) \} f(c) + \varphi(c) \{ Df(c)(u) \} \quad \text{for } u \in R^p. \]
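PROOF. (a) If $\varepsilon > 0$, then there exist $\delta_1(\varepsilon) > 0$ and $\delta_2(\varepsilon) > 0$ such that if
$\|x - c\| \le \inf \{ \delta_1(\varepsilon), \delta_2(\varepsilon) \}$, then

\[ \| f(x) - f(c) - Df(c)(x - c) \| \le \varepsilon\, \|x - c\|, \]
\[ \| g(x) - g(c) - Dg(c)(x - c) \| \le \varepsilon\, \|x - c\|. \]
Thus, if $\|x - c\| \le \inf \{ \delta_1(\varepsilon), \delta_2(\varepsilon) \}$, then
\[ \| h(x) - h(c) - \{ \alpha\, Df(c)(x - c) + \beta\, Dg(c)(x - c) \} \| \le (|\alpha| + |\beta|)\, \varepsilon\, \|x - c\|. \]
Since $\alpha\, Df(c) + \beta\, Dg(c)$ is a linear function of $R^p$ into $R^q$, it follows that $h$ is
differentiable at $c$ and that $Dh(c) = \alpha\, Df(c) + \beta\, Dg(c)$.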
(b) A simple calculation shows that

\[ \begin{aligned} k(x) - k(c) &- \{ D\varphi(c)(x - c)\, f(c) + \varphi(c)\, Df(c)(x - c) \} \\ &= \{ \varphi(x) - \varphi(c) - D\varphi(c)(x - c) \}\, f(x) \\ &\quad + D\varphi(c)(x - c) \{ f(x) - f(c) \} + \varphi(c) \{ f(x) - f(c) - Df(c)(x - c) \}. \end{aligned} \]
Since $Df(c)$ exists, we infer from Lemma 39.5 that $f$ is continuous at $c$;
hence there exists a constant $M$ such that $\|f(x)\| \le M$ for $\|x - c\| \le \delta$. From
this it can be seen that all the terms on the right side of the last equation
can be made arbitrarily small by choosing $\|x - c\|$ small enough. This
establishes (b). Q.E.D.
The next result, which is very important, asserts that the derivative of the
composition of two differentiable functions is the composition of their
derivatives.
40.2 CHAIN RULE. Let $f$ have domain $A \subseteq R^p$ and range in $R^q$, and
let $g$ have domain $B \subseteq R^q$ and range in $R^r$. Suppose that $f$ is differentiable at
$c$ and that $g$ is differentiable at $b = f(c)$. Then the composition $h = g \circ f$ is
differentiable at $c$ and

\[ Dh(c) = Dg(b) \circ Df(c). \tag{40.1} \]

Alternatively, we write

\[ D(g \circ f)(c) = Dg(f(c)) \circ Df(c). \tag{40.2} \]

PROOF. The hypothesis implies that $c$ is an interior point of the domain
of $h = g \circ f$. (Why?) Let $\varepsilon > 0$ and let $\delta(\varepsilon, f)$ and $\delta(\varepsilon, g)$ be as in Definition
39.2. It follows from Lemma 39.5 that there exist strictly positive
numbers $\gamma$, $K$ such that if $\|x - c\| \le \gamma$, then $f(x) \in B$ and
\[ \| f(x) - f(c) \| \le K\, \|x - c\|. \tag{40.3} \]
For simplicity, we write $L_f = Df(c)$ and $L_g = Dg(b)$. By Theorem 21.3
there is a constant $M$ such that

\[ \| L_g(u) \| \le M\, \|u\| \quad \text{for } u \in R^q. \tag{40.4} \]

If $\|x - c\| \le \inf \{ \gamma, (1/K)\, \delta(\varepsilon, g) \}$, then (40.3) implies that $\|f(x) - f(c)\| \le \delta(\varepsilon, g)$,
which means that

\[ \| g(f(x)) - g(f(c)) - L_g(f(x) - f(c)) \| \le \varepsilon\, \| f(x) - f(c) \| \le \varepsilon K\, \|x - c\|. \tag{40.5} \]
If we also require that $\|x - c\| \le \delta(\varepsilon, f)$, then we infer from (40.4) that

\[ \| L_g \{ f(x) - f(c) - L_f(x - c) \} \| \le \varepsilon M\, \|x - c\|. \]

If we combine this last relation with (40.5), we infer that if
$\delta_1 = \inf \{ \gamma, (1/K)\, \delta(\varepsilon, g), \delta(\varepsilon, f) \}$ and if $x \in A$ and $\|x - c\| \le \delta_1$, then
\[ \| g(f(x)) - g(f(c)) - L_g(L_f(x - c)) \| \le \varepsilon (K + M)\, \|x - c\|, \]
which means that

\[ \| g \circ f(x) - g \circ f(c) - L_g \circ L_f(x - c) \| \le \varepsilon (K + M)\, \|x - c\|. \]

We conclude that $Dh(c) = L_g \circ L_f$. Q.E.D.
Maintaining the notation of the proof of the theorem, $L_f = Df(c)$ is a
linear function of $R^p$ into $R^q$ and $L_g = Dg(b)$ is a linear function of $R^q$ into
$R^r$. The composition $L_g \circ L_f$ is a linear function of $R^p$ into $R^r$, as is
required, since $h = g \circ f$ is a function defined on part of $R^p$ with values in
$R^r$. We now consider some examples of this result.
40.3 EXAMPLES. (a) Let $p = q = r = 1$; then the derivative $Df(c)$ is
the linear function which takes the real number $u$ into $f'(c)u$, and similarly
for $Dg(b)$. It follows that the derivative of $g \circ f$ sends the real number $u$
into $g'(b) f'(c) u$.
(b) Let $p > 1$, $q = r = 1$. According to Example 39.8(c), the derivative
of $f$ at $c$ takes the point $w = (w_1, \ldots, w_p)$ of $R^p$ into the real number

\[ D_1 f(c) w_1 + \cdots + D_p f(c) w_p \]

and so the derivative of $g \circ f$ at $c$ takes this point of $R^p$ into the real number

\[ g'(b) [ D_1 f(c) w_1 + \cdots + D_p f(c) w_p ]. \]
(c) Let $q > 1$, $p = r = 1$. According to Examples 39.8(b), (c) the derivative
$Df(c)$ takes the real number $u$ into the point
\[ Df(c)(u) = u f'(c) = (f_1'(c) u, \ldots, f_q'(c) u) \quad \text{in } R^q, \]
and the derivative $Dg(b)$ takes the point $w = (w_1, \ldots, w_q)$ in $R^q$ into the
real number
\[ D_1 g(b) w_1 + \cdots + D_q g(b) w_q. \]

It follows that the derivative of $h = g \circ f$ takes the real number $u$ into the
real number

\[ Dh(c) u = \{ D_1 g(b) f_1'(c) + \cdots + D_q g(b) f_q'(c) \} u = u \{ Dg(b)(f'(c)) \}. \]

The quantity in the braces, which is $h'(c) = (g \circ f)'(c)$, is sometimes denoted by the
less precise symbolism

\[ \frac{\partial g}{\partial y_1} \frac{df_1}{dx} + \cdots + \frac{\partial g}{\partial y_q} \frac{df_q}{dx}. \]

In this connection, it must be understood that the derivatives are to be evaluated at
appropriate points.
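(d) We consider the case where $p = q = 2$ and $r = 3$. For simplicity in
notation, we denote the coordinate variables in $R^p$ by $(x, y)$, in $R^q$ by
$(w, z)$, and in $R^r$ by $(r, s, t)$. Then a function $f$ on $R^p$ to $R^q$ can be
expressed in the form

\[ w = W(x, y), \qquad z = Z(x, y) \]
and a function $g$ on $R^q$ to $R^r$ can be expressed in the form
\[ r = R(w, z), \qquad s = S(w, z), \qquad t = T(w, z). \]

The derivative $Df(c)$ sends $(\xi, \eta)$ into $(\omega, \zeta)$ according to the formulas

\[ \begin{aligned} \omega &= W_x(c) \xi + W_y(c) \eta, \\ \zeta &= Z_x(c) \xi + Z_y(c) \eta. \end{aligned} \tag{40.6} \]
Here we write $W_x$ for $D_x W = D_1 W$, etc. Also the derivative $Dg(b)$ sends
$(\omega, \zeta)$ into $(\rho, \sigma, \tau)$ according to the relations

\[ \begin{aligned} \rho &= R_w(b) \omega + R_z(b) \zeta, \\ \sigma &= S_w(b) \omega + S_z(b) \zeta, \\ \tau &= T_w(b) \omega + T_z(b) \zeta. \end{aligned} \tag{40.7} \]

A routine calculation shows that the derivative of $g \circ f$ sends $(\xi, \eta)$ into
$(\rho, \sigma, \tau)$ by
\[ \begin{aligned} \rho &= \{ R_w(b) W_x(c) + R_z(b) Z_x(c) \} \xi + \{ R_w(b) W_y(c) + R_z(b) Z_y(c) \} \eta, \\ \sigma &= \{ S_w(b) W_x(c) + S_z(b) Z_x(c) \} \xi + \{ S_w(b) W_y(c) + S_z(b) Z_y(c) \} \eta, \\ \tau &= \{ T_w(b) W_x(c) + T_z(b) Z_x(c) \} \xi + \{ T_w(b) W_y(c) + T_z(b) Z_y(c) \} \eta. \end{aligned} \tag{40.8} \]
A more classical notation would be to write $dx$, $dy$ instead of $\xi$, $\eta$; $dw$, $dz$ instead of
$\omega$, $\zeta$; and $dr$, $ds$, $dt$ instead of $\rho$, $\sigma$, $\tau$. If we denote the values of the partial
derivative $W_x$ at the point $c$ by $\partial w / \partial x$, etc., then (40.6) becomes

\[ dw = \frac{\partial w}{\partial x}\, dx + \frac{\partial w}{\partial y}\, dy, \qquad dz = \frac{\partial z}{\partial x}\, dx + \frac{\partial z}{\partial y}\, dy; \]
similarly, (40.7) becomes

\[ dr = \frac{\partial r}{\partial w}\, dw + \frac{\partial r}{\partial z}\, dz, \qquad ds = \frac{\partial s}{\partial w}\, dw + \frac{\partial s}{\partial z}\, dz, \qquad dt = \frac{\partial t}{\partial w}\, dw + \frac{\partial t}{\partial z}\, dz; \]

and (40.8) is written in the form

\[ \begin{aligned} dr &= \Bigl( \frac{\partial r}{\partial w} \frac{\partial w}{\partial x} + \frac{\partial r}{\partial z} \frac{\partial z}{\partial x} \Bigr) dx + \Bigl( \frac{\partial r}{\partial w} \frac{\partial w}{\partial y} + \frac{\partial r}{\partial z} \frac{\partial z}{\partial y} \Bigr) dy, \\ ds &= \Bigl( \frac{\partial s}{\partial w} \frac{\partial w}{\partial x} + \frac{\partial s}{\partial z} \frac{\partial z}{\partial x} \Bigr) dx + \Bigl( \frac{\partial s}{\partial w} \frac{\partial w}{\partial y} + \frac{\partial s}{\partial z} \frac{\partial z}{\partial y} \Bigr) dy, \\ dt &= \Bigl( \frac{\partial t}{\partial w} \frac{\partial w}{\partial x} + \frac{\partial t}{\partial z} \frac{\partial z}{\partial x} \Bigr) dx + \Bigl( \frac{\partial t}{\partial w} \frac{\partial w}{\partial y} + \frac{\partial t}{\partial z} \frac{\partial z}{\partial y} \Bigr) dy. \end{aligned} \]

In these last three sets of formulas it is important to realize that all of the indicated
partial derivatives are to be evaluated at appropriate points. Hence the coefficients
of $dx$, $dy$, etc., turn out to be real numbers.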

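We can express equation (40.6) in matrix terminology by saying that the
mapping $Df(c)$ of $(\xi, \eta)$ into $(\omega, \zeta)$ is given by the $2 \times 2$ matrix

\[ \begin{bmatrix} W_x(c) & W_y(c) \\ Z_x(c) & Z_y(c) \end{bmatrix} = \begin{bmatrix} \dfrac{\partial w}{\partial x}(c) & \dfrac{\partial w}{\partial y}(c) \\[1ex] \dfrac{\partial z}{\partial x}(c) & \dfrac{\partial z}{\partial y}(c) \end{bmatrix}. \tag{40.9} \]

Similarly, (40.7) asserts that the mapping $Dg(b)$ of $(\omega, \zeta)$ into $(\rho, \sigma, \tau)$ is
given by the $3 \times 2$ matrix

\[ \begin{bmatrix} R_w(b) & R_z(b) \\ S_w(b) & S_z(b) \\ T_w(b) & T_z(b) \end{bmatrix} = \begin{bmatrix} \dfrac{\partial r}{\partial w}(b) & \dfrac{\partial r}{\partial z}(b) \\[1ex] \dfrac{\partial s}{\partial w}(b) & \dfrac{\partial s}{\partial z}(b) \\[1ex] \dfrac{\partial t}{\partial w}(b) & \dfrac{\partial t}{\partial z}(b) \end{bmatrix}. \tag{40.10} \]

Finally, relation (40.8) asserts that the mapping $D(g \circ f)(c)$ of $(\xi, \eta)$ into
$(\rho, \sigma, \tau)$ is given by the $3 \times 2$ matrix

\[ \begin{bmatrix} R_w(b) W_x(c) + R_z(b) Z_x(c) & R_w(b) W_y(c) + R_z(b) Z_y(c) \\ S_w(b) W_x(c) + S_z(b) Z_x(c) & S_w(b) W_y(c) + S_z(b) Z_y(c) \\ T_w(b) W_x(c) + T_z(b) Z_x(c) & T_w(b) W_y(c) + T_z(b) Z_y(c) \end{bmatrix} \]

which is the product of the matrix in (40.10) with the matrix in (40.9) in
that order.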
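The matrix form of the Chain Rule lends itself to a quick numerical check. The sketch below assumes two sample maps, $f(x, y) = (xy,\, x + y^2)$ and $g(w, z) = (wz,\, \sin w,\, w + z)$, which are illustrative choices and not taken from the text. It multiplies the $3 \times 2$ matrix of $Dg(b)$ by the $2 \times 2$ matrix of $Df(c)$ and compares the product with a central-difference approximation to the matrix of $D(g \circ f)(c)$.

```python
import math


def f(x, y):
    # Hypothetical f : R^2 -> R^2, (x, y) -> (w, z).
    return (x * y, x + y * y)


def jac_f(x, y):
    # 2 x 2 matrix [[W_x, W_y], [Z_x, Z_y]], laid out as in (40.9).
    return [[y, x], [1.0, 2.0 * y]]


def g(w, z):
    # Hypothetical g : R^2 -> R^3, (w, z) -> (r, s, t).
    return (w * z, math.sin(w), w + z)


def jac_g(w, z):
    # 3 x 2 matrix [[R_w, R_z], [S_w, S_z], [T_w, T_z]], laid out as in (40.10).
    return [[z, w], [math.cos(w), 0.0], [1.0, 1.0]]


def matmul(a, b):
    # Plain matrix product of nested lists.
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def numeric_jacobian(h, point, step=1e-6):
    # Central differences; entry [i][j] approximates D_j h_i(point).
    m = len(h(*point))
    jac = [[0.0] * len(point) for _ in range(m)]
    for j in range(len(point)):
        plus, minus = list(point), list(point)
        plus[j] += step
        minus[j] -= step
        hp, hm = h(*plus), h(*minus)
        for i in range(m):
            jac[i][j] = (hp[i] - hm[i]) / (2.0 * step)
    return jac


c = (0.5, -1.5)
b = f(*c)
chain = matmul(jac_g(*b), jac_f(*c))          # matrix (40.10) times matrix (40.9)
numeric = numeric_jacobian(lambda x, y: g(*f(x, y)), c)
max_error = max(abs(chain[i][j] - numeric[i][j])
                for i in range(3) for j in range(2))
print("largest entrywise discrepancy:", max_error)
```

The order of the factors matters: the matrix of $Dg(b)$ is on the left, matching the composition $Dg(b) \circ Df(c)$.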

Mean Value Theorem

We now turn to the problem of obtaining a generalization of the Mean
Value Theorem 27.6 for differentiable functions on $R^p$ to $R^q$. It will be
seen that the direct analog of Theorem 27.6 does not hold when $q > 1$. It
might be expected that if $f$ is differentiable at every point of $R^p$ with values
in $R^q$, and if $a$, $b$ belong to $R^p$, then there exists a point $c$ (lying between $a$,
$b$) such that

\[ f(b) - f(a) = Df(c)(b - a). \tag{40.11} \]
This conclusion fails even when $p = 1$ and $q = 2$ as is seen by the function $f$
defined on $R$ to $R^2$ by the formula
\[ f(x) = (x - x^2,\; x - x^3). \]
Here $Df(c)$ is the linear function on $R$ to $R^2$ which sends the real number
$u$ into the element

\[ Df(c)(u) = ((1 - 2c) u,\; (1 - 3c^2) u). \]

Now $f(0) = (0, 0)$ and $f(1) = (0, 0)$, but there is no point $c$ such that
$Df(c)(u) = (0, 0)$ for any non-zero $u$ in $R$. Hence the formula (40.11)
cannot hold in general when $q > 1$, even when $p = 1$. However, for many
applications it is sufficient to consider the case where $q = 1$ and here it is
easy to extend the Mean Value Theorem.
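The counterexample above can be confirmed by direct computation. The sketch below uses the function $f(x) = (x - x^2, x - x^3)$ from the text; the grid search over $[0, 1]$ and the variable names are illustrative assumptions. The first component of $Df(c)(1)$ vanishes only at $c = 1/2$, where the second component equals $1/4$, so the two components never vanish together.

```python
def f(x):
    # The curve from the text: f(x) = (x - x^2, x - x^3).
    return (x - x ** 2, x - x ** 3)


def df(c):
    # Components of Df(c)(1) = (1 - 2c, 1 - 3c^2).
    return (1 - 2 * c, 1 - 3 * c ** 2)


endpoints_agree = f(0.0) == f(1.0) == (0.0, 0.0)

# The first component vanishes only at c = 1/2; look at the second one there.
second_component_at_half = df(0.5)[1]

# Illustrative grid search: the larger of the two |components| on [0, 1]
# stays bounded away from zero, so Df(c)(1) is never the zero vector.
min_sup_norm = min(max(abs(v) for v in df(k / 1000)) for k in range(1001))

print(endpoints_agree, second_component_at_half, min_sup_norm)
```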
40.4 MEAN VALUE THEOREM. Let $f$ be defined on an open subset $\Omega$ of
$R^p$ and have values in $R$. Suppose that the set $\Omega$ contains the points $a$, $b$ and
the line segment $S$ joining them, and that $f$ is differentiable at every point of
this segment. Then there exists a point $c$ on $S$ such that

\[ f(b) - f(a) = Df(c)(b - a). \tag{40.11} \]
PROOF. Let $\varphi : R \to R^p$ be defined by

\[ \varphi(t) = (1 - t) a + t b = a + t (b - a), \]

so that $\varphi(0) = a$, $\varphi(1) = b$, and $\varphi(t) \in S \subseteq \Omega$ for $t \in [0, 1]$. Since $\Omega$ is open
and $\varphi$ is continuous, there is a number $\gamma > 0$ such that $\varphi$ maps the interval
$(-\gamma, 1 + \gamma)$ into $\Omega$. Now let $F : (-\gamma, 1 + \gamma) \to R$ be defined by
\[ F(t) = f \circ \varphi(t) = f((1 - t) a + t b). \]
By the Chain Rule [see 40.3(c) and 40.P] it follows that

\[ F'(t) = Df((1 - t) a + t b)(\varphi'(t)) = Df((1 - t) a + t b)(b - a). \]
If we apply the Mean Value Theorem 27.6 to $F$, we infer that there exists
$t_0 \in (0, 1)$ such that $F(1) - F(0) = F'(t_0)$. Letting $c = \varphi(t_0) \in S$, we obtain

\[ f(b) - f(a) = F(1) - F(0) = F'(t_0) = Df(c)(b - a). \quad \text{Q.E.D.} \]
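The proof of Theorem 40.4 is constructive enough to follow on a concrete case. The sketch below assumes the sample function $f(x, y) = x^3 + y^2$ and the endpoints $a = (0, 0)$, $b = (1, 2)$, none of which come from the text. It forms $F'(t) = Df(a + t(b - a))(b - a)$ as in the proof and locates $t_0$ with $F'(t_0) = f(b) - f(a)$ by bisection. Bisection works here only because $F'$ happens to be increasing for this particular choice; the theorem itself guarantees existence, not a method for finding the point.

```python
def f(x, y):
    # Hypothetical sample function f : R^2 -> R (not from the text).
    return x ** 3 + y ** 2


def grad_f(x, y):
    # (D_1 f, D_2 f) for the sample function.
    return (3 * x ** 2, 2 * y)


a, b = (0.0, 0.0), (1.0, 2.0)
u = (b[0] - a[0], b[1] - a[1])


def phi(t):
    # The segment phi(t) = a + t(b - a) used in the proof.
    return (a[0] + t * u[0], a[1] + t * u[1])


def F_prime(t):
    # F'(t) = Df(phi(t))(b - a), by the Chain Rule.
    gx, gy = grad_f(*phi(t))
    return gx * u[0] + gy * u[1]


target = f(*b) - f(*a)          # F(1) - F(0)

# Bisection for F'(t0) = target; valid because F' is increasing on [0, 1]
# for this sample function (an assumption about the example only).
lo, hi = 0.0, 1.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if F_prime(mid) < target:
        lo = mid
    else:
        hi = mid
t0 = 0.5 * (lo + hi)
c = phi(t0)
print("t0 =", t0, "c =", c, "Df(c)(b - a) =", F_prime(t0))
```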
Although the most natural extension of the Mean Value Theorem does
not hold when the range space is $R^q$, $q > 1$, there are some extensions
which are available. One of the most useful is based on an inequality
rather than an equality.
40.5 MEAN VALUE THEOREM. Let $\Omega \subseteq R^p$ be an open set and let
$f : \Omega \to R^q$. Suppose that $\Omega$ contains the points $a$, $b$ and the line segment $S$
joining these points, and that $f$ is differentiable at every point of $S$. Then there
exists a point $c$ on $S$ such that

\[ \| f(b) - f(a) \| \le \| Df(c)(b - a) \|. \tag{40.12} \]

PROOF. If $y_0 = f(b) - f(a)$ is the zero vector in $R^q$, then the result is
trivial. If $y_0 \neq 0$, let $y_1 = y_0 / \|y_0\|$ and use the inner product in $R^q$ to define
$H : \Omega \to R$ by

\[ H(x) = f(x) \cdot y_1 \quad \text{for } x \in \Omega. \]

Evidently we have

\[ H(b) - H(a) = \{ f(b) - f(a) \} \cdot y_1 = \| f(b) - f(a) \| \]
and it is easily seen (cf. Exercise 40.H) that

\[ DH(x)(u) = \{ Df(x)(u) \} \cdot y_1 \]
for $x \in S$, $u \in R^p$. It follows from the Mean Value Theorem 40.4 that there
is a point $c$ on $S$ such that

\[ H(b) - H(a) = DH(c)(b - a) = \{ Df(c)(b - a) \} \cdot y_1. \]
If we use the Schwarz Inequality and the fact that $\|y_1\| = 1$, we have

\[ \| f(b) - f(a) \| = \{ Df(c)(b - a) \} \cdot y_1 \le \| Df(c)(b - a) \|, \]

which is the desired result. Q.E.D.

Since the exact value of the point $c$ is usually not known, the theorem is
often applied by using the following result, whose statement uses the
notion of the norm of a linear map $L$ from $R^p$ to $R^q$ that was introduced in
Exercise 21.L. It is only necessary to recall that $\|L(u)\| \le M \|u\|$ for all
$u \in R^p$, if and only if the norm $\|L\|_{pq} \le M$.
40.6 COROLLARY. Suppose the hypotheses of Theorem 40.5 are
satisfied and that there exists $M > 0$ such that $\|Df(x)\|_{pq} \le M$ for all $x \in S$.
Then we have

\[ \| f(b) - f(a) \| \le M\, \| b - a \|. \]

PROOF. Since $\|Df(c)(b - a)\| \le \|Df(c)\|_{pq}\, \|b - a\|$, and since $c \in S$, we
have

\[ \| f(b) - f(a) \| \le \| Df(c)(b - a) \| \le \| Df(c) \|_{pq}\, \| b - a \| \le M\, \| b - a \|. \]

Q.E.D.
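Corollary 40.6 is the form of the Mean Value Theorem most often used in estimates, and it is easy to test on an example. The sketch below assumes the sample map $f(x, y) = (\sin x, \cos y)$ from $R^2$ to $R^2$, which is not from the text. Its derivative is represented by the diagonal matrix with entries $\cos x$ and $-\sin y$, so $\|Df(x, y)(u)\| \le \|u\|$ and one may take $M = 1$. The random sampling of point pairs is an illustrative device only.

```python
import math
import random


def f(x, y):
    # Hypothetical sample map f : R^2 -> R^2 (not from the text).
    return (math.sin(x), math.cos(y))


# Df(x, y) is represented by diag(cos x, -sin y); both entries have absolute
# value at most 1, so ||Df(x, y)(u)|| <= ||u|| and M = 1 is an admissible bound.
M = 1.0

random.seed(0)
worst_ratio = 0.0
for _ in range(1000):
    a = (random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0))
    b = (random.uniform(-5.0, 5.0), random.uniform(-5.0, 5.0))
    lhs = math.dist(f(*a), f(*b))      # ||f(b) - f(a)||
    rhs = M * math.dist(a, b)          # M ||b - a||
    if rhs > 0.0:
        worst_ratio = max(worst_ratio, lhs / rhs)

print("largest observed ||f(b) - f(a)|| / (M ||b - a||):", worst_ratio)
```

Every observed ratio should be at most 1, in line with the corollary.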

Interchange of the Order of Differentiation

If $f$ is a function with domain in $R^p$ and range in $R$, then $f$ may have $p$
(first) partial derivatives, which we denote by

\[ D_i f \quad \text{or} \quad \frac{\partial f}{\partial x_i}, \qquad i = 1, 2, \ldots, p. \]

Each of the partial derivatives is a function with domain in $R^p$ and range in
$R$ and so each of these $p$ functions may have $p$ partial derivatives.
Following the accepted American notation, we shall refer to the resulting
$p^2$ functions (or to such ones that exist) as the second partial derivatives of $f$
and we shall denote them by

\[ D_{ji} f \quad \text{or} \quad \frac{\partial^2 f}{\partial x_j\, \partial x_i}, \qquad i, j = 1, 2, \ldots, p. \]

It should be observed that the partial derivative intended by either of the
latter symbols is the partial derivative with respect to $x_j$ of the partial
derivative of $f$ with respect to $x_i$. (In other words: first $x_i$, then $x_j$.)
In like manner, we can inquire into the existence of the third partial
derivatives and those of still higher order. In principle, a function on $R^p$
to $R$ can have as many as $p^n$ $n$th partial derivatives. However, it is a
considerable convenience that if the resulting derivatives are continuous,
then the order of differentiation is not significant. In addition to decreasing
the number of (potentially distinct) higher partial derivatives, this result
largely removes the danger from the rather subtle notational distinction
employed for different orders of differentiation.
It is enough to consider the interchange of order for second derivatives.
By holding all the other coordinates constant, we see that it is no loss of
generality to consider a function on $R^2$ to $R$. In order to simplify our
notation we let $(x, y)$ denote a point in $R^2$ and we shall show that if $D_x f$,
$D_y f$, and $D_{yx} f$ exist and if $D_{yx} f$ is continuous at a point, then the partial
derivative $D_{xy} f$ exists at this point and equals $D_{yx} f$. It will be seen in
Exercise 40.U that it is possible that both $D_{xy} f$ and $D_{yx} f$ exist at a point
and yet are not equal.
The device that will be used in this proof is to show that both of these
mixed partial derivatives at the point $(0, 0)$ are the limit of the quotient

\[ \frac{f(h, k) - f(h, 0) - f(0, k) + f(0, 0)}{hk} \]

as $(h, k)$ approaches $(0, 0)$.

40.7 LEMMA. Suppose that $f$ is defined on a neighborhood $U$ of the
origin in $R^2$ with values in $R$, that the partial derivatives $D_x f$ and $D_{yx} f$ exist in
$U$, and that $D_{yx} f$ is continuous at $(0, 0)$. If $A$ is the mixed difference

\[ A(h, k) = f(h, k) - f(h, 0) - f(0, k) + f(0, 0), \tag{40.13} \]

then we have

\[ D_{yx} f(0, 0) = \lim_{(h, k) \to (0, 0)} \frac{A(h, k)}{hk}. \]

PROOF. Let $\varepsilon > 0$ and let $\delta > 0$ be so small that if $|h| < \delta$ and $|k| < \delta$,
then the point $(h, k)$ belongs to $U$ and

\[ | D_{yx} f(h, k) - D_{yx} f(0, 0) | < \varepsilon. \tag{40.14} \]
If $|k| < \delta$, we define $B$ for $|h| < \delta$ by

\[ B(h) = f(h, k) - f(h, 0), \]
from which it follows that $A(h, k) = B(h) - B(0)$. By hypothesis, the
partial derivative $D_x f$ exists in $U$ and hence $B$ has a derivative. Applying
the Mean Value Theorem 27.6 to $B$, there exists a number $h_0$ with
$0 < |h_0| < |h|$ such that
\[ A(h, k) = B(h) - B(0) = h B'(h_0). \tag{40.15} \]
(It is noted that the value of $h_0$ depends on the value of $k$, but this will not
cause any difficulty.) Referring to the definition of $B$, we have
\[ B'(h_0) = D_x f(h_0, k) - D_x f(h_0, 0). \]
Applying the Mean Value Theorem to the right-hand side of the last
equation, there exists a number $k_0$ with $0 < |k_0| < |k|$ such that
\[ B'(h_0) = k \{ D_{yx} f(h_0, k_0) \}. \tag{40.16} \]
Combining equations (40.15) and (40.16), we conclude that if $0 < |h| < \delta$
and $0 < |k| < \delta$, then
\[ \frac{A(h, k)}{hk} = D_{yx} f(h_0, k_0), \]

where $0 < |h_0| < |h|$, $0 < |k_0| < |k|$. It follows from inequality (40.14) and
the preceding expression that

\[ \Bigl| \frac{A(h, k)}{hk} - D_{yx} f(0, 0) \Bigr| < \varepsilon \]
whenever $0 < |h| < \delta$ and $0 < |k| < \delta$. Q.E.D.
We can now obtain a useful condition (due to H. A. Schwarz) for the
equality of the mixed partials.
40.8 THEOREM. Suppose that $f$ is defined on a neighborhood $U$ of a
point $(x, y)$ in $R^2$ with values in $R$. Suppose that the partial derivatives $D_x f$,
$D_y f$, and $D_{yx} f$ exist in $U$ and that $D_{yx} f$ is continuous at $(x, y)$. Then the
partial derivative $D_{xy} f$ exists at $(x, y)$ and $D_{xy} f(x, y) = D_{yx} f(x, y)$.

PROOF. It is no loss of generality to suppose that $(x, y) = (0, 0)$ and we
shall do so. If $A$ is the function defined in the preceding lemma, then it
was seen that

\[ D_{yx} f(0, 0) = \lim_{(h, k) \to (0, 0)} \frac{A(h, k)}{hk}, \tag{40.17} \]

the existence of this double limit being part of the conclusion. By
hypothesis $D_y f$ exists in $U$, so that

\[ \lim_{k \to 0} \frac{A(h, k)}{hk} = \frac{1}{h} \{ D_y f(h, 0) - D_y f(0, 0) \}, \qquad h \neq 0. \tag{40.18} \]
If $\varepsilon > 0$, there exists a number $\delta(\varepsilon) > 0$ such that if $0 < |h| < \delta(\varepsilon)$ and
$0 < |k| < \delta(\varepsilon)$, then
\[ \Bigl| \frac{A(h, k)}{hk} - D_{yx} f(0, 0) \Bigr| < \varepsilon. \]

By taking the limit in this inequality with respect to $k$ and using (40.18), we
obtain

\[ \Bigl| \frac{1}{h} \{ D_y f(h, 0) - D_y f(0, 0) \} - D_{yx} f(0, 0) \Bigr| \le \varepsilon \]

for all $h$ satisfying $0 < |h| < \delta(\varepsilon)$. Therefore, $D_{xy} f(0, 0)$ exists and equals
$D_{yx} f(0, 0)$. Q.E.D.
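The mixed difference quotient used in Lemma 40.7 and Theorem 40.8 can be watched converging on a concrete function. The sketch below assumes the sample function $f(x, y) = e^x \sin y + x^2 y$, which is not from the text. For it, $D_{yx} f(x, y) = e^x \cos y + 2x$ is continuous, so the quotient $A(h, k)/(hk)$ should approach $D_{yx} f(0, 0) = 1$.

```python
import math


def f(x, y):
    # Hypothetical sample function with continuous second partials.
    return math.exp(x) * math.sin(y) + x * x * y


def mixed_difference(h, k):
    # A(h, k) = f(h, k) - f(h, 0) - f(0, k) + f(0, 0), as in (40.13).
    return f(h, k) - f(h, 0.0) - f(0.0, k) + f(0.0, 0.0)


# For this sample function, D_yx f(x, y) = e^x cos y + 2x, so the value
# at the origin is exactly 1.
mixed_partial_at_origin = 1.0

errors = []
for n in range(1, 5):
    h = k = 10.0 ** -n
    quotient = mixed_difference(h, k) / (h * k)
    errors.append(abs(quotient - mixed_partial_at_origin))
    print(f"h = k = 1e-{n}: A/(hk) = {quotient:.8f}")
```

Because the same quotient is symmetric in the roles of the two variables, it also approximates $D_{xy} f(0, 0)$, which is the idea behind the proof of Theorem 40.8.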

Higher Derivatives

If $f$ is a function with domain in $R^p$ and range in $R$, then the derivative
$Df(c)$ of $f$ at $c$ is the linear function on $R^p$ to $R$ such that

\[ | f(c + z) - f(c) - Df(c)(z) | \le \varepsilon\, \|z\| \]

for sufficiently small $z$. This means that $Df(c)$ is the linear function which
most closely approximates the difference $f(c + z) - f(c)$ when $z$ is small.
Any other linear function would lead to a less exact approximation for
small $z$. From this defining property, it is seen that if $Df(c)$ exists, then it is
necessarily given by the formula

\[ Df(c)(z) = D_1 f(c) z_1 + \cdots + D_p f(c) z_p, \]
where $z = (z_1, \ldots, z_p)$ in $R^p$.
Although linear approximations are particularly simple and are sufficiently
exact for many purposes, it is sometimes desirable to obtain a finer
degree of approximation than is possible by using linear functions. In such
cases it is natural to turn to quadratic functions, cubic functions, etc., to
effect closer approximations. Since our functions are to have their
domains in $R^p$, we would be led into the study of multilinear functions on
$R^p$ to $R$ for a thorough discussion of such functions. Although such a
study is not particularly difficult, it would take us rather far afield in view of
the limited applications we have in mind.
For this reason we shall define the second derivative $D^2 f(c)$ of $f$ at $c$ to
be the function on $R^p \times R^p$ to $R$ such that if $(y, z)$ belongs to this product
and $y = (y_1, \ldots, y_p)$ and $z = (z_1, \ldots, z_p)$, then

\[ D^2 f(c)(y, z) = \sum_{i, j = 1}^{p} D_{ji} f(c)\, y_i z_j. \]

In discussing the second derivative, we shall assume in the following that
the second partial derivatives of $f$ exist and are continuous on a neighborhood
of $c$. Similarly, we define the third derivative $D^3 f(c)$ of $f$ at $c$ to be
the function of $(y, z, w)$ in $R^p \times R^p \times R^p$ given by

\[ D^3 f(c)(y, z, w) = \sum_{i, j, k = 1}^{p} D_{kji} f(c)\, y_i z_j w_k. \]

In discussing the third derivative, we shall assume that all of the third
partial derivatives of $f$ exist and are continuous in a neighborhood of $c$.
By now the method of formation of the higher derivatives should be
clear. (In view of our preceding remarks concerning the interchange of
order in differentiation, if the resulting mixed partial derivatives are
continuous, then they are independent of the order of differentiation.)
One further notational device: we write

\[ \begin{aligned} D^2 f(c)(w)^2 \quad &\text{for} \quad D^2 f(c)(w, w), \\ D^3 f(c)(w)^3 \quad &\text{for} \quad D^3 f(c)(w, w, w), \\ D^n f(c)(w)^n \quad &\text{for} \quad D^n f(c)(w, w, \ldots, w). \end{aligned} \]

If $p = 2$ and if we denote an element of $R^2$ by $(x, y)$ and $w = (h, k)$, then
$D^2 f(c)(w)^2$ equals the expression
\[ D_{xx} f(c) h^2 + 2 D_{xy} f(c) hk + D_{yy} f(c) k^2; \]

similarly, $D^3 f(c)(w)^3$ equals

\[ D_{xxx} f(c) h^3 + 3 D_{xxy} f(c) h^2 k + 3 D_{xyy} f(c) h k^2 + D_{yyy} f(c) k^3, \]
and $D^n f(c)(w)^n$ equals the expression

\[ D_{x \cdots x} f(c) h^n + \binom{n}{1} D_{x \cdots xy} f(c) h^{n-1} k + \binom{n}{2} D_{x \cdots xyy} f(c) h^{n-2} k^2 + \cdots + D_{y \cdots y} f(c) k^n. \]
Now that we have introduced this notation we shall establish an
important generalization of Taylor's Theorem for functions on $R^p$ to $R$.
40.9 TAYLOR'S THEOREM. Suppose that $f$ is a function with open
domain $\Omega$ in $R^p$ and range in $R$, and suppose that $f$ has continuous partial
derivatives of order $n$ in a neighborhood of every point on a line segment $S$
joining two points $a$, $b = a + u$ in $\Omega$. Then there exists a point $c$ on $S$ such
that

\[ f(a + u) = f(a) + \frac{1}{1!} Df(a)(u) + \frac{1}{2!} D^2 f(a)(u)^2 + \cdots + \frac{1}{(n-1)!} D^{n-1} f(a)(u)^{n-1} + \frac{1}{n!} D^n f(c)(u)^n. \]
PROOF. Let $F$ be defined for $t$ in $I$ to $R$ by

\[ F(t) = f(a + tu). \]

In view of the assumed existence of the partial derivatives of $f$, it follows
that

\[ \begin{aligned} F'(t) &= Df(a + tu)(u), \\ F''(t) &= D^2 f(a + tu)(u)^2, \\ &\;\;\vdots \\ F^{(n)}(t) &= D^n f(a + tu)(u)^n. \end{aligned} \]

If we apply the one-dimensional version of Taylor's Theorem 28.6 to the
function $F$ on $I$, we infer that there exists a real number $t_0$ in $I$ such that

\[ F(1) = F(0) + \frac{1}{1!} F'(0) + \cdots + \frac{1}{(n-1)!} F^{(n-1)}(0) + \frac{1}{n!} F^{(n)}(t_0). \]

If we set $c = a + t_0 u$, then the result follows. Q.E.D.
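Taylor's Theorem 40.9 with $n = 3$ and $p = 2$ can be illustrated numerically. The sketch below assumes the sample function $f(x, y) = e^x \cos y$ and the base point $a = (0, 0)$, neither of which comes from the text. At the origin $Df(a)(u) = h$ and $D^2 f(a)(u)^2 = h^2 - k^2$ for $u = (h, k)$. Every third-order partial derivative of this $f$ is $\pm e^x \cos y$ or $\pm e^x \sin y$, so on the segment from $a$ to $a + u$ the remainder $\frac{1}{3!} D^3 f(c)(u)^3$ is bounded in absolute value by $e^{|h|} (|h| + |k|)^3 / 6$.

```python
import math


def f(x, y):
    # Hypothetical sample function (not from the text).
    return math.exp(x) * math.cos(y)


def taylor2(h, k):
    # f(a) + Df(a)(u) + (1/2!) D^2 f(a)(u)^2 at a = (0, 0), u = (h, k):
    # f(a) = 1, D_x f(a) = 1, D_y f(a) = 0,
    # D_xx f(a) = 1, D_xy f(a) = 0, D_yy f(a) = -1.
    return 1.0 + h + 0.5 * (h * h - k * k)


def remainder_bound(h, k):
    # |(1/3!) D^3 f(c)(u)^3| <= e^{|h|} (|h| + |k|)^3 / 6, because each
    # third partial of f has absolute value at most e^x <= e^{|h|} on the segment.
    return math.exp(abs(h)) * (abs(h) + abs(k)) ** 3 / 6.0


samples = [(0.1, 0.2), (-0.3, 0.1), (0.01, 0.01)]
results = []
for h, k in samples:
    error = abs(f(h, k) - taylor2(h, k))
    results.append((error, remainder_bound(h, k)))
    print(f"u = ({h}, {k}): error = {error:.3e}, bound = {remainder_bound(h, k):.3e}")
```

In each case the observed error should lie below the bound obtained from the remainder term.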

Exercises
40.A. If $f(x, y) = x^2 + y^2$ and $g(t) = (3t + 1, 2t - 3)$, let $F(t) = f \circ g(t)$. Evaluate
$F'(t)$ both directly and by using the Chain Rule.
40.B. If $f(x, y) = xy$ and $g(s, t) = (2s + 3t, 4s + t)$, let $F(s, t) = f \circ g(s, t)$. Evaluate
$D_1 F$ and $D_2 F$ both directly and by using the Chain Rule.
40.C. If $f(x, y, z) = xyz$ and $g(s, t) = (3s + st, s, t)$, let $F(s, t) = f \circ g(s, t)$. Evaluate
$D_1 F$ and $D_2 F$ both directly and by using the Chain Rule.
40.D. If $f(x, y, z) = xy + yz + zx$ and $g(s, t) = (\cos s, \sin s \cos t, \sin t)$, let $F(s, t) = f \circ g(s, t)$.
Evaluate $D_1 F$ and $D_2 F$ both directly and by using the Chain Rule.
40.E. If Cartesian axes are rotated in the plane by the angle $\theta$, then the new
coordinates $u$, $v$ of a point are related to the original coordinates $x$, $y$ by

\[ x = u \cos \theta - v \sin \theta, \qquad y = u \sin \theta + v \cos \theta. \]

Let $f : R^2 \to R$ be differentiable on $R^2$ and let $F(u, v) = f(x, y)$ for all $x$, $y$. Show
that

\[ [D_1 F(u, v)]^2 + [D_2 F(u, v)]^2 = [D_1 f(x, y)]^2 + [D_2 f(x, y)]^2. \]


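40.F. Let $f : R^2 \to R$ be differentiable on $R^2$, let $g : (0, +\infty) \times R \to R^2$ be
defined by $g(r, \theta) = (r \cos \theta, r \sin \theta)$, and let $F = f \circ g$. Calculate $D_1 F$ and $D_2 F$ and
show that

\[ [D_1 F(r, \theta)]^2 + \frac{1}{r^2} [D_2 F(r, \theta)]^2 = [D_1 f(r \cos \theta, r \sin \theta)]^2 + [D_2 f(r \cos \theta, r \sin \theta)]^2. \]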

40.G. Let $f : R \to R$ be differentiable on $R$.
(a) If $F(x, y) = f(xy)$, then $x D_1 F(x, y) = y D_2 F(x, y)$ for all $(x, y)$.
(b) If $F(x, y) = f(ax + by)$ where $a, b \in R$, then $b D_1 F(x, y) = a D_2 F(x, y)$ for all
$(x, y)$.
(c) If $F(x, y) = f(x^2 + y^2)$, then $y D_1 F(x, y) = x D_2 F(x, y)$ for all $(x, y)$.
(d) If $F(x, y) = f(x^2 - y^2)$, then $y D_1 F(x, y) + x D_2 F(x, y) = 0$ for all $(x, y)$.
40.H. Let $A \subseteq R^p$ and let $c$ be an interior point of $A$. Suppose that $f$, $g$ are
defined on $A$ to $R^q$ and are differentiable at $c$. If $h : A \to R$ is defined by
$h(x) = f(x) \cdot g(x)$ for all $x \in A$, show that $h$ is differentiable at $c$ and that if $u \in R^p$,
then

\[ Dh(c)(u) = (Df(c)(u)) \cdot g(c) + f(c) \cdot (Dg(c)(u)). \]
40.1. Express the result of Exercise 40.H in terms of the coordinate functions.
40.J. Let ACR and let c be an interior point of A. Suppose that f:A > R? is
differentiable at c and such that |f(x)|[=1 for xeA. Show that f(c)-V.f=0,
where V.f denotes the gradient of f at c (see Exercise 39.J). Interpret this
conclusion geometrically.
40.K. Let f: R’ > R be (positively) homogeneous of degree k in the sense that

f(tx) = t*f(x) for xER’,t>0.



(a) If f is differentiable on R’, show that it satisfies Euler’st Relation:

kf(x) =x,D.f(x) +- +x,D,f(x)


for all x ={x,,...,x,} in R’ with x40.
(b) Conversely, let f satisfy Euler’s Relation and let ce R’, c#0. Let g(t)=
f(tc) for t>0 and show that tg’(t)=kg(t) for t>0. Use this to prove that f is
homogeneous of degree k.
40.L. Let ACR’, f:A— R’, and let the function g: f(A) > R’ be inverse to f in
the sense that
feg(x)=x, — gef(y=y
for all xeA and yef(A). If f is differentiable at a point aeA and if g is
differentiable at b= f(a), show that the linear functions Df(a) and Dg(b) are
inverse to each other; that is, Df(a)>Dg(b) and Dg(b)° Df(a) are the identity on
R’.
40.M. Let B: R’ x R’ = R*? — R‘ be bilinear in the sense that
B(ax + bx’, y)=aB(x, y)+ bB(x’, y),
B(x, ay + by.) =aB(x, y)+ bB(x, y’))

for all a, be R and all x, x’, y,y’in R®. It can be proved that there exists M>0
such that |[B(x, y)]] = M |[x||[lyll for all x, y in R’. Assuming this, prove that B is
differentiable at every point (x, y)<¢ R’ x R’ = R” and that

DB(x, y)(u, v) = B(x, v) + By, y)


for all (u,v) in R’x R’=R”.
40.N. Let B:R’ XR’ ~ R‘ be bilinear in the sense of the preceding exercise
and let g(x) = B(x, x) for all x eR’. Show that if x, ue R’, then
(i) g(t) =2?g(x) for all te R;
(ii) Dg(x)(u) = B(x, u) + Blu, x) = Dg (u(x);
(iii) g(x + u) = g(x) + Dg(x)(u)+ g(u).
Moreover, if B is symmetric in the sense that B(x, y)= B(y, x), then
(iv) De(x)(u) = 2B(x, u).
40.0. Give a proof of Exercise 40.H using 40.M.
40.P. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$ be differentiable on $\Omega$. Let $I = (a, b)$ be an open interval in $\mathbf{R}$ and let $g : I \to \mathbf{R}^p$ be differentiable on $I$ and such that $g(I) \subseteq \Omega$. If $h = f \circ g : I \to \mathbf{R}^q$, show that

$$h'(c) = Df(g(c))(g'(c)).$$

40.Q. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$. Suppose that $\Omega$ contains the points $a$, $b$ and the line segment $S$ joining these points, and that $f$ is differentiable at every point of $S$. Show that there exists a linear mapping $L : \mathbf{R}^p \to \mathbf{R}^q$ such that

$$f(b) - f(a) = L(b - a).$$
† LEONARD EULER (1707-1783), a native of Basel, studied with Johann Bernoulli. He
resided many years at the court in St. Petersburg, but this stay was interrupted by twenty-five
years in Berlin. Despite the fact that he was the father of thirteen children and became
totally blind, he was still able to write over eight hundred papers and books and make
fundamental contributions to all branches of mathematics.

40.R. Let $\Omega \subseteq \mathbf{R}^p$ be an open connected set and let $f : \Omega \to \mathbf{R}^q$ be differentiable on $\Omega$. If $Df(x) = 0$ for all $x \in \Omega$, show that $f(x) = f(y)$ for all $x, y \in \Omega$. Show that this conclusion may fail if $\Omega$ is not connected.
40.S. Let $J \subseteq \mathbf{R}^p$ be an open cell and suppose that $f : J \to \mathbf{R}$ is differentiable on $J$. Show that if the partial derivative $D_1 f(x) = 0$ for all $x \in J$, then $f$ does not depend on the first variable in the sense that

$$f(x_1, x_2, \dots, x_p) = f(x_1', x_2, \dots, x_p)$$

for any two points in $J$ whose second, ..., $p$th coordinates are the same.
40.T. Show that the conclusion of the preceding exercise may fail if $J$ is not assumed to be a cell.
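40.U. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be defined by

$$f(x, y) = \frac{xy(x^2 - y^2)}{x^2 + y^2} \quad \text{for } (x, y) \neq (0, 0),$$
$$f(x, y) = 0 \quad \text{for } (x, y) = (0, 0).$$

Show that the second partial derivatives $D_{xy} f$ and $D_{yx} f$ exist at $(0, 0)$ but are not equal.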
40.V. Use the Mean Value Theorem to determine approximately the distance
from the point (3.2, 4.1) to the origin. Give error bounds for your estimate.
40.W. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$. Suppose that $\Omega$ contains the points $a$, $b$ and the line segment $S$ joining the points, and that $f$ has continuous partial derivatives on $S$. Show that

$$f(b) - f(a) = \int_0^1 Df(a + t(b - a))(b - a)\, dt.$$

40.X. Let $f, g : \mathbf{R} \to \mathbf{R}$ have continuous second derivatives on $\mathbf{R}$.
(a) If $c \in \mathbf{R}$ and $u(x, y) = f(x + cy) + g(x - cy)$, show that $u : \mathbf{R}^2 \to \mathbf{R}$ satisfies the "wave equation"

$$c^2 D_{xx} u(x, y) = D_{yy} u(x, y)$$

for all $(x, y)$.
(b) If $v(x, y) = f(3x + 2y) + g(x - 2y)$, show that $v : \mathbf{R}^2 \to \mathbf{R}$ satisfies the equation

$$4 D_{xx} v(x, y) - 4 D_{xy} v(x, y) - 3 D_{yy} v(x, y) = 0$$

for all $(x, y)$.
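40.Y. If $f : \mathbf{R}^2 \to \mathbf{R}$ has continuous second partial derivatives and if $F(r, \theta) = f(r \cos\theta, r \sin\theta)$ for $r > 0$, $\theta \in \mathbf{R}$, show that

$$D_{xx} f(x, y) + D_{yy} f(x, y) = D_{rr} F(r, \theta) + \frac{1}{r} D_r F(r, \theta) + \frac{1}{r^2} D_{\theta\theta} F(r, \theta)$$
$$= \frac{1}{r} D_r \big( r D_r F(r, \theta) \big) + \frac{1}{r^2} D_{\theta\theta} F(r, \theta),$$

where $x = r \cos\theta$, $y = r \sin\theta$.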
41 MAPPING THEOREMS AND IMPLICIT FUNCTIONS 375

Project

40.α. (This project is a modification of the classical Newton's Method for the location of roots when a sufficiently close approximation is known.) Let $f$ be defined and continuous on an open set containing the closed ball $B_r(x_0) = \{x \in \mathbf{R}^p : \|x - x_0\| \le r\}$ with values in $\mathbf{R}^p$. Suppose that $f$ is differentiable at every point of $B_r(x_0)$ and that there exists a number $C$, with $0 < C < 1$, and an injective linear map $\Gamma : \mathbf{R}^p \to \mathbf{R}^p$ such that $\|\Gamma \circ f(x_0)\| \le (1 - C) r$ and such that

$$\|I - \Gamma \circ Df(x)\|_{pp} \le C \quad \text{for } x \in B_r(x_0).$$

(a) Let $g : B_r(x_0) \to \mathbf{R}^p$ be defined by $g(x) = x - \Gamma \circ f(x)$ for $x \in B_r(x_0)$. Show that $g$ is differentiable at every point of $B_r(x_0)$ and that $g$ is a contraction with constant $C < 1$ (see 23.4) on $B_r(x_0)$.
(b) Define $x_1 = g(x_0)$ and $x_{n+1} = g(x_n)$ for $n \in \mathbf{N}$. Show that $\|x_{n+1} - x_n\| \le C^n \|x_1 - x_0\|$, whence it follows that $\|x_{m+1} - x_n\| \le C^n r$ for $m \ge n \ge 0$. Hence $\|x_k - x_0\| \le r$ for $k = 0, 1, 2, \dots$.
(c) Show that $(x_n)$ is a Cauchy sequence and hence converges to an element $\bar{x} \in B_r(x_0)$, which is such that $g(\bar{x}) = \bar{x}$. Moreover, we have the estimate $\|x_n - \bar{x}\| \le C^n r$.
(d) Show that $f(\bar{x}) = 0$ and $\bar{x}$ is the only element in $B_r(x_0)$ where $f$ vanishes.
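The iteration described in this project can be tried numerically. The following is only a sketch in Python and is not part of the text: the test map `f`, the starting point, and the fixed matrix used for the injective linear map (called `gamma` here) are hypothetical choices made for illustration. The map `gamma` is taken to be the inverse of the derivative of `f` at the root (1, 1), and one can check by hand that the hypotheses of the project hold at the starting point with r = 0.5 and C about 0.52.

```python
import math

def f(x):
    # Hypothetical test map R^2 -> R^2 with a root at (1, 1).
    return (x[0] ** 2 + x[1] ** 2 - 2.0, x[0] - x[1])

def gamma(v):
    # Fixed injective linear map: the inverse of Df(1, 1) = [[2, 2], [1, -1]],
    # namely [[0.25, 0.5], [0.25, -0.5]].  (An assumption of this sketch.)
    return (0.25 * v[0] + 0.5 * v[1], 0.25 * v[0] - 0.5 * v[1])

def g(x):
    # g(x) = x - gamma(f(x)); a fixed point of g is a zero of f.
    gx = gamma(f(x))
    return (x[0] - gx[0], x[1] - gx[1])

def iterate(x0, steps):
    # Returns the list x_0, x_1 = g(x_0), ..., x_steps.
    xs = [x0]
    for _ in range(steps):
        xs.append(g(xs[-1]))
    return xs

xs = iterate((1.2, 0.9), 30)
root = xs[-1]
residual = math.hypot(*f(root))
print(root, residual)
```

Note that the derivative of `f` is never re-evaluated: the same linear map is used at every step, which is the modification of Newton's Method that the project has in mind.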

Section 41 Mapping Theorems and Implicit Functions

Let $\Omega$ be an open set in $\mathbf{R}^p$ and let $f$ be a function with domain $\Omega$ and range in $\mathbf{R}^q$; unless there is specific mention, we do not assume that $p = q$. It will be shown that, under assumptions that will be stated, the "local character" of the mapping $f$ at a point $c \in \Omega$ is indicated by the linear mapping $Df(c)$. Somewhat more precisely:
(i) if $p \le q$ and $Df(c)$ is injective (= one-one), then $f$ is injective on small neighborhoods of $c$;
(ii) if $p \ge q$ and $Df(c)$ is surjective (= maps $\mathbf{R}^p$ onto $\mathbf{R}^q$), then the image under $f$ of a small neighborhood of $c$ is a neighborhood of $f(c)$; and
(iii) if $p = q$ and $Df(c)$ is bijective (= one-one and onto = invertible), then $f$ maps a neighborhood $U$ of $c$ in a one-one fashion onto a neighborhood $V$ of $f(c)$. In case (iii), there is a function defined on $V$ which is inverse to the restriction of $f$ to $U$.
As a consequence of these mapping theorems we shall obtain the Implicit Function Theorem, which is one of the fundamental theorems in analysis and geometry. We also present a useful Parametrization Theorem and the important Rank Theorem.

The Class $C^1(\Omega)$

The mere existence of the derivative is not enough for our purposes; we need also the continuity of the derivative. We recall that if $f : \Omega \to \mathbf{R}^q$ is differentiable at every point of $\Omega \subseteq \mathbf{R}^p$, then the function $x \mapsto Df(x)$ is a map of $\Omega$ into the collection $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ of all linear functions from $\mathbf{R}^p$ to $\mathbf{R}^q$. It was noted in Section 21 that this set $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ is a vector space and, in Exercise 21.L, that this space is a normed space under the norm

(41.1) $\|L\|_{pq} = \sup \{\|L(x)\| : x \in \mathbf{R}^p,\ \|x\| \le 1\}.$

41.1 DEFINITION. If $\Omega$ is open in $\mathbf{R}^p$ and $f : \Omega \to \mathbf{R}^q$, we say that $f$ belongs to Class $C^1(\Omega)$ if the derivative $Df(x)$ exists for all $x \in \Omega$ and the mapping $x \mapsto Df(x)$ of $\Omega$ into $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ is continuous under the norm (41.1).
We recall from Example 39.8(d) that for each $x \in \Omega$, the derivative $Df(x)$ can be represented by the $q \times p$ Jacobian matrix $[D_j f_i(x)]$. Hence $Df(x) - Df(y)$ is represented by the $q \times p$ matrix

$$[D_j f_i(x) - D_j f_i(y)].$$

Now it follows from the inequality (21.5) that

$$\|Df(x) - Df(y)\|_{pq} \le \Big\{ \sum_{i=1}^{q} \sum_{j=1}^{p} |D_j f_i(x) - D_j f_i(y)|^2 \Big\}^{1/2}.$$

Hence the continuity of each of the partial derivatives $D_j f_i$ on $\Omega$ implies continuity of $x \mapsto Df(x)$. We leave it to the reader to show that the converse is also true. Hence we have the following result.
41.2 THEOREM. If $\Omega \subseteq \mathbf{R}^p$ is open and $f : \Omega \to \mathbf{R}^q$ is differentiable at every point of $\Omega$, then $f$ belongs to Class $C^1(\Omega)$ if and only if the partial derivatives $D_j f_i$, $i = 1, \dots, q$, $j = 1, \dots, p$, of $f$ are continuous on $\Omega$.
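The comparison between the norm (41.1) and the square root of the sum of the squares of the matrix entries can be illustrated numerically. The sketch below is an illustration only: the 3 x 2 matrix is a hypothetical example, and the supremum in (41.1) is merely estimated from below by sampling unit vectors in the plane.

```python
import math

def apply(L, x):
    # Matrix-vector product L(x) for a matrix given as a list of rows.
    return [sum(L[i][j] * x[j] for j in range(len(x))) for i in range(len(L))]

def euclid(v):
    return math.sqrt(sum(t * t for t in v))

def entry_norm(L):
    # Square root of the sum of the squares of all matrix entries.
    return math.sqrt(sum(t * t for row in L for t in row))

def sampled_operator_norm(L, samples=3600):
    # Lower estimate of sup{||L(x)|| : ||x|| <= 1} for L : R^2 -> R^q,
    # obtained by sampling unit vectors (cos t, sin t).
    best = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * k / samples
        best = max(best, euclid(apply(L, [math.cos(t), math.sin(t)])))
    return best

L = [[1.0, 2.0], [0.0, 3.0], [-1.0, 1.0]]  # hypothetical example matrix
op = sampled_operator_norm(L)
fro = entry_norm(L)
print(op, fro)
```

For this matrix the sampled value stays below the entry norm, as the inequality requires; the two norms are not equal in general.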
We shall need the next lemma, which is a variant of the Mean Value
Theorem.

41.3 LEMMA. Let $\Omega \subseteq \mathbf{R}^p$ be an open set and let $f : \Omega \to \mathbf{R}^q$ be differentiable on $\Omega$. Suppose that $\Omega$ contains the points $a$, $b$ and the line segment $S$ joining these points, and let $x_0 \in \Omega$. Then we have

$$\|f(b) - f(a) - Df(x_0)(b - a)\| \le \|b - a\| \sup_{x \in S} \{\|Df(x) - Df(x_0)\|_{pq}\}.$$

PROOF. Let $g : \Omega \to \mathbf{R}^q$ be defined for $x \in \Omega$ by

$$g(x) = f(x) - Df(x_0)(x).$$

Since $Df(x_0)$ is linear, it follows that $Dg(x) = Df(x) - Df(x_0)$ for $x \in \Omega$. If we apply the Mean Value Theorem 40.5, we infer that there exists a point $c \in S$ such that

$$\|f(b) - f(a) - Df(x_0)(b - a)\| = \|g(b) - g(a)\|$$
$$\le \|Dg(c)(b - a)\| = \|(Df(c) - Df(x_0))(b - a)\|$$
$$\le \|b - a\| \sup_{x \in S} \{\|Df(x) - Df(x_0)\|_{pq}\}. \qquad \text{Q.E.D.}$$

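The next result is the key lemma to the mapping theorems.

41.4 APPROXIMATION LEMMA. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$ belong to Class $C^1(\Omega)$. If $x_0 \in \Omega$ and $\varepsilon > 0$, then there exists $\delta(\varepsilon) > 0$ such that if $\|x_k - x_0\| \le \delta(\varepsilon)$, $k = 1, 2$, then $x_k \in \Omega$ and
(41.2) $\|f(x_1) - f(x_2) - Df(x_0)(x_1 - x_2)\| \le \varepsilon \|x_1 - x_2\|.$
PROOF. Since $x \mapsto Df(x)$ is continuous on $\Omega$ to $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$, then given $\varepsilon > 0$ there exists $\delta(\varepsilon) > 0$ such that if $\|x - x_0\| \le \delta(\varepsilon)$ then $x \in \Omega$ and $\|Df(x) - Df(x_0)\|_{pq} \le \varepsilon$. Now let $x_1$, $x_2$ satisfy $\|x_k - x_0\| \le \delta(\varepsilon)$, whence the line segment joining $x_1$ and $x_2$ lies inside of the closed ball with center $x_0$ and radius $\delta(\varepsilon)$, and hence inside $\Omega$. Now apply Lemma 41.3 to obtain the stated conclusion. Q.E.D.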
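A numerical experiment may make the estimate (41.2) more concrete. The sketch below is an illustration under stated assumptions: the map `f` and the base point (1, 1) are hypothetical choices, and the radius `delta = eps / sqrt(5)` is obtained for this particular `f` by bounding the matrix of Df(x) - Df(x0) entry by entry. Random pairs of points in the closed ball are sampled and the worst observed ratio is compared with `eps`.

```python
import math
import random

def f(x):
    # Hypothetical C^1 map R^2 -> R^2.
    return (x[0] ** 2, x[0] * x[1])

def Df_x0(h):
    # Df(x0)(h) at x0 = (1, 1); the Jacobian matrix there is [[2, 0], [1, 1]].
    return (2.0 * h[0], h[0] + h[1])

def worst_ratio(eps, trials=2000, seed=0):
    # Largest observed value of
    #   ||f(a) - f(b) - Df(x0)(a - b)|| / ||a - b||
    # over random pairs a, b with ||a - x0|| <= delta and ||b - x0|| <= delta.
    rng = random.Random(seed)
    delta = eps / math.sqrt(5.0)  # sufficient for this particular f

    def pick():
        rad = delta * math.sqrt(rng.random())
        ang = 2.0 * math.pi * rng.random()
        return (1.0 + rad * math.cos(ang), 1.0 + rad * math.sin(ang))

    worst = 0.0
    for _ in range(trials):
        a, b = pick(), pick()
        h = (a[0] - b[0], a[1] - b[1])
        size = math.hypot(h[0], h[1])
        if size == 0.0:
            continue
        fa, fb, lin = f(a), f(b), Df_x0(h)
        err = math.hypot(fa[0] - fb[0] - lin[0], fa[1] - fb[1] - lin[1])
        worst = max(worst, err / size)
    return worst

print(worst_ratio(0.1), worst_ratio(0.01))
```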

The Injective Mapping Theorem


We shall now show that if $f$ belongs to Class $C^1(\Omega)$ and if $Df(c)$ is injective, then the restriction of $f$ to a suitable neighborhood of $c$ is an injection.
A reader familiar with the notion of the "rank" of a linear transformation will recall that $L : \mathbf{R}^p \to \mathbf{R}^q$ is injective if and only if rank $(L) = p \le q$.
41.5 INJECTIVE MAPPING THEOREM. Suppose that $\Omega \subseteq \mathbf{R}^p$ is open, that $f : \Omega \to \mathbf{R}^q$ belongs to Class $C^1(\Omega)$, and that $L = Df(c)$ is an injection. Then there exists a number $\delta > 0$ such that the restriction of $f$ to $B_\delta = \{x \in \mathbf{R}^p : \|x - c\| \le \delta\}$ is an injection. Moreover, the inverse of the restriction $f \mid B_\delta$ is a continuous function on $f(B_\delta) \subseteq \mathbf{R}^q$ to $B_\delta \subseteq \mathbf{R}^p$.
PROOF. Since the linear function $L = Df(c)$ is an injection, it follows from Corollary 22.7 that there exists an $r > 0$ such that
(41.3) $r \|u\| \le \|Df(c)(u)\|$ for $u \in \mathbf{R}^p$.
We now apply the Approximation Lemma 41.4 with $\varepsilon = \tfrac{1}{2} r$ to obtain a number $\delta > 0$ such that if $\|x_k - c\| \le \delta$, $k = 1, 2$, then

$$\|f(x_1) - f(x_2) - L(x_1 - x_2)\| \le \tfrac{1}{2} r \|x_1 - x_2\|.$$

If we apply the Triangle Inequality to the left side of this inequality, we obtain

$$\|L(x_1 - x_2)\| - \|f(x_1) - f(x_2)\| \le \tfrac{1}{2} r \|x_1 - x_2\|.$$

If we combine this and (41.3) with $u = x_1 - x_2$, we obtain

(41.4) $\tfrac{1}{2} r \|x_1 - x_2\| \le \|f(x_1) - f(x_2)\|$

for $x_k \in B_\delta$. This proves that the restriction of $f$ to $B_\delta$ is an injection; hence this restriction has an inverse function which we shall denote by $g$. If $y_k \in f(B_\delta)$, then there exist unique points $x_k = g(y_k)$ in $B_\delta$ such that $y_k = f(x_k)$. It follows from (41.4) that

$$\|g(y_1) - g(y_2)\| \le (2/r) \|y_1 - y_2\|,$$

whence it follows that $g = (f \mid B_\delta)^{-1}$ is uniformly continuous on $f(B_\delta)$ to $\mathbf{R}^p$. Q.E.D.
We note that $g$ need not be defined on a neighborhood of $f(c)$; that is, $f(c)$ need not be an interior point of $f(B_\delta)$. For that reason we can make no assertion about the differentiability of $g$. A stronger inversion theorem will be established below under additional hypotheses.
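The local nature of the theorem, and the lower bound (41.4), can be seen in a simple case with p = 1 and q = 2. The following sketch is an illustration only: the curve `f`, the center, and the radius are hypothetical choices. For this curve the derivative at every point preserves lengths, so (41.3) holds with r = 1; the code checks the inequality (41.4) on a grid of points of an interval and also shows that `f` is not injective on the whole line.

```python
import math

def f(t):
    # Hypothetical C^1 curve R -> R^2 (the unit circle).
    return (math.cos(t), math.sin(t))

def dist(u, v):
    return math.hypot(u[0] - v[0], u[1] - v[1])

def lower_bound_holds(c, delta, r, n=200):
    # Checks (1/2) r |s - t| <= ||f(s) - f(t)|| on a grid of [c - delta, c + delta].
    ts = [c - delta + 2.0 * delta * k / n for k in range(n + 1)]
    return all(
        dist(f(s), f(t)) >= 0.5 * r * abs(s - t) - 1e-12
        for s in ts
        for t in ts
    )

print(lower_bound_holds(0.0, 1.0, 1.0))    # bound holds near c = 0
print(dist(f(0.0), f(2.0 * math.pi)))      # essentially 0: f is not globally injective
```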

The Surjective Mapping Theorem


The next result is a companion to the Injective Mapping Theorem. This theorem, which is due to L. M. Graves,† asserts that if $f$ is in Class $C^1(\Omega)$ and if for some $c \in \Omega$, the linear map $Df(c)$ is a surjection of $\mathbf{R}^p$ onto $\mathbf{R}^q$, then $f$ maps a suitable neighborhood of $c$ to a neighborhood of $f(c)$. Thus every point of $\mathbf{R}^q$ which is sufficiently close to $f(c)$ is the image under $f$ of a point near $c$.

A reader familiar with the notion of the "rank" of a linear transformation will recall that $L : \mathbf{R}^p \to \mathbf{R}^q$ is surjective if and only if rank $(L) = q \le p$.
41.6 SURJECTIVE MAPPING THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$ belong to Class $C^1(\Omega)$. Suppose that for some $c \in \Omega$, the linear function $L = Df(c)$ is a surjection of $\mathbf{R}^p$ onto $\mathbf{R}^q$. Then there exist numbers $m > 0$ and $\alpha > 0$ such that if $y \in \mathbf{R}^q$ and $\|y - f(c)\| \le \alpha/2m$, then there exists an $x \in \Omega$ such that $\|x - c\| \le \alpha$ and $f(x) = y$.
PROOF. Since $L$ is a surjection, each of the standard basis vectors

$$e_1 = (1, 0, \dots, 0),\quad e_2 = (0, 1, \dots, 0),\quad \dots,\quad e_q = (0, 0, \dots, 1)$$
† LAWRENCE M. GRAVES (1896-1973) was born in Kansas, but was associated with the
University of Chicago for many years as student and professor. He is best known for his
contributions to functional analysis and the calculus of variations.

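in $\mathbf{R}^q$ is the image under $L$ of some vector in $\mathbf{R}^p$, say $u_1, u_2, \dots, u_q$. Now let $M : \mathbf{R}^q \to \mathbf{R}^p$ be the linear function mapping $e_j$ into $u_j$ for $j = 1, 2, \dots, q$; that is,

$$M\Big( \sum_{j=1}^{q} a_j e_j \Big) = \sum_{j=1}^{q} a_j u_j.$$

It follows that $L \circ M$ is the identity mapping on $\mathbf{R}^q$; that is, $L \circ M(y) = y$ for all $y \in \mathbf{R}^q$. If we let

$$m = \Big\{ \sum_{j=1}^{q} \|u_j\|^2 \Big\}^{1/2},$$

then an application of the Triangle and Schwarz Inequalities implies that if $y = \sum_{j=1}^{q} a_j e_j$, then

$$\|M(y)\| \le \sum_{j=1}^{q} |a_j| \, \|u_j\| \le \Big\{ \sum_{j=1}^{q} |a_j|^2 \Big\}^{1/2} \Big\{ \sum_{j=1}^{q} \|u_j\|^2 \Big\}^{1/2} = m \|y\|.$$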
By the Approximation Lemma 41.4 there exists a number $\alpha > 0$ such that if $\|x_k - c\| \le \alpha$, $k = 1, 2$, then $x_k \in \Omega$ and

(41.5) $\|f(x_1) - f(x_2) - L(x_1 - x_2)\| \le \dfrac{1}{2m} \|x_1 - x_2\|.$

Now let $B_\alpha = \{x \in \mathbf{R}^p : \|x - c\| \le \alpha\}$ and suppose that $y \in \mathbf{R}^q$ is such that $\|y - f(c)\| \le \alpha/2m$. We will show that there exists a vector $x$ with $x \in B_\alpha$ such that $y = f(x)$.
Let $x_0 = c$ and let $x_1 = x_0 + M(y - f(c))$ so that $\|x_1 - x_0\| \le m \|y - f(c)\| \le \tfrac{1}{2}\alpha$, whence

$$\|x_1 - x_0\| \le \frac{\alpha}{2} \quad \text{and} \quad \|x_1 - c\| \le \Big(1 - \frac{1}{2}\Big)\alpha.$$

Suppose that $c = x_0, x_1, \dots, x_n$ have been chosen inductively in $\mathbf{R}^p$ such that
(41.6) $\|x_k - x_{k-1}\| \le \alpha/2^k, \qquad \|x_k - c\| \le (1 - 1/2^k)\alpha,$
for $k = 1, \dots, n$. We now define $x_{n+1}$ $(n \ge 1)$ by

(41.7) $x_{n+1} = x_n - M[f(x_n) - f(x_{n-1}) - L(x_n - x_{n-1})].$

It follows from (41.5) that

$$\|x_{n+1} - x_n\| \le m \|f(x_n) - f(x_{n-1}) - L(x_n - x_{n-1})\| \le \tfrac{1}{2} \|x_n - x_{n-1}\|,$$

whence it follows that $\|x_{n+1} - x_n\| \le \tfrac{1}{2}(\alpha/2^n) = \alpha/2^{n+1}$ and

$$\|x_{n+1} - c\| \le \|x_{n+1} - x_n\| + \|x_n - c\|$$
$$\le (\alpha/2^{n+1}) + (1 - 1/2^n)\alpha$$
$$= (1 - 1/2^{n+1})\alpha.$$

Hence (41.6) is also established for $k = n + 1$. Therefore, we can construct a sequence $(x_n)$ in $B_\alpha$ in this way. If $m \ge n$, then we have

$$\|x_n - x_m\| \le \|x_n - x_{n+1}\| + \|x_{n+1} - x_{n+2}\| + \cdots + \|x_{m-1} - x_m\|$$
$$\le \frac{\alpha}{2^{n+1}} + \frac{\alpha}{2^{n+2}} + \cdots + \frac{\alpha}{2^m} \le \frac{\alpha}{2^n}.$$

It follows that $(x_n)$ is a Cauchy sequence in $\mathbf{R}^p$ and therefore converges to some element $x$. Since $\|x_n - c\| \le (1 - 1/2^n)\alpha$, it follows that $\|x - c\| \le \alpha$ so that $x \in B_\alpha$.
Since $x_1 - x_0 = M(y - f(c))$, it follows that

$$L(x_1 - x_0) = L \circ M(y - f(c)) = y - f(x_0).$$

Moreover, by (41.7) we have

$$L(x_{n+1} - x_n) = -L \circ M[f(x_n) - f(x_{n-1}) - L(x_n - x_{n-1})]$$
$$= -\{f(x_n) - f(x_{n-1}) - L(x_n - x_{n-1})\}$$
$$= L(x_n - x_{n-1}) - [f(x_n) - f(x_{n-1})].$$

By induction we find that

$$L(x_{n+1} - x_n) = y - f(x_n),$$

whence it follows that $y = \lim f(x_n) = f(x)$. Hence every point $y$ satisfying $\|y - f(c)\| \le \alpha/2m$ is the image under $f$ of a point $x \in \Omega$ with $\|x - c\| \le \alpha$. Q.E.D.
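The proof just given is constructive, and the iteration (41.7) can be carried out numerically. The sketch below is an illustration under stated assumptions: the function `f` from the plane to the line, the point c = (0, 0), and the particular right inverse `M` of `L` are hypothetical choices, and no attempt is made to compute the numbers m and alpha of the theorem.

```python
def f(x):
    # Hypothetical C^1 function R^2 -> R with f(0, 0) = 0.
    return x[0] + 2.0 * x[1] + x[0] * x[1]

def L(h):
    # L = Df(0, 0), a surjection of R^2 onto R.
    return h[0] + 2.0 * h[1]

def M(s):
    # A linear right inverse of L: L(M(s)) = s for all s.
    return (0.2 * s, 0.4 * s)

def solve(y, steps=40):
    # x_0 = c, x_1 = x_0 + M(y - f(c)), then the recursion (41.7):
    # x_{n+1} = x_n - M[f(x_n) - f(x_{n-1}) - L(x_n - x_{n-1})].
    x_prev = (0.0, 0.0)
    first = M(y - f(x_prev))
    x = (x_prev[0] + first[0], x_prev[1] + first[1])
    for _ in range(steps):
        h = (x[0] - x_prev[0], x[1] - x_prev[1])
        corr = M(f(x) - f(x_prev) - L(h))
        x_prev, x = x, (x[0] - corr[0], x[1] - corr[1])
    return x

x = solve(0.1)
print(x, f(x))
```

Only the derivative at the single point c enters the recursion, exactly as in the proof; this is what distinguishes the construction from Newton's Method.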
41.7 OPEN MAPPING THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$ belong to Class $C^1(\Omega)$. If for each $x \in \Omega$ the derivative $Df(x)$ is a surjection, and if $G \subseteq \Omega$ is open, then $f(G)$ is open in $\mathbf{R}^q$.
PROOF. If $b \in f(G)$, then there exists a point $c \in G$ such that $f(c) = b$. It follows from the Surjective Mapping Theorem 41.6 applied to $f \mid G$ that there exists $\delta > 0$ such that if $\|y - b\| \le \delta$ then there exists an $x \in G$ such that $y = f(x)$. Hence $f(G)$ is open in $\mathbf{R}^q$. Q.E.D.

The Inversion Theorem


We shall now combine our two mapping theorems in the case $p = q$. Here the derivative $Df(c)$ is assumed to be a bijection. This is the case if and only if the derivative $Df(c)$ has an inverse which, in turn, is true if and only if the Jacobian determinant

$$J_f(c) = \det [D_j f_i(c)] = \det [f_{i,j}(c)]$$

is different from zero.

A reader familiar with the notion of the "rank" of a linear transformation will recall that $L : \mathbf{R}^p \to \mathbf{R}^q$ is bijective if and only if rank $(L) = p = q$.

It follows from the continuity of the partial derivatives and of the determinant that if $Df(c)$ is invertible, then $Df(x)$ is invertible for $x$ sufficiently close to $c$.
41.8 INVERSION THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and suppose that $f : \Omega \to \mathbf{R}^p$ belongs to Class $C^1(\Omega)$. If $c \in \Omega$ is such that $Df(c)$ is a bijection, then there exists an open neighborhood $U$ of $c$ such that $V = f(U)$ is an open neighborhood of $f(c)$ and the restriction of $f$ to $U$ is a bijection onto $V$ with continuous inverse $g$. Moreover $g$ belongs to Class $C^1(V)$ and

$$Dg(y) = [Df(g(y))]^{-1} \quad \text{for } y \in V.$$
PROOF. By hypothesis $L = Df(c)$ is injective; hence Corollary 22.7 implies that there exists $r > 0$ such that

$$2r \|z\| \le \|Df(c)(z)\| \quad \text{for } z \in \mathbf{R}^p.$$

Since $f$ is in Class $C^1(\Omega)$, there is a neighborhood of $c$ on which $Df(x)$ is invertible and satisfies

(41.8) $r \|z\| \le \|Df(x)(z)\|$ for $z \in \mathbf{R}^p$.

We further restrict our attention to a neighborhood $U$ of $c$ on which $f$ is injective and which is contained in a ball with center $c$ and radius $\alpha$ (as in the Surjective Mapping Theorem 41.6). Then $V = f(U)$ is a neighborhood of $f(c)$, and we infer from the preceding mapping theorems that the restriction $f \mid U$ has a continuous inverse function $g : V \to \mathbf{R}^p$.
It remains to show that $g$ is differentiable at an arbitrary point $y_1 \in V$. Let $x_1 = g(y_1) \in U$; since $f$ is differentiable at $x_1$, it follows that if $x \in U$, then

$$f(x) - f(x_1) - Df(x_1)(x - x_1) = \|x - x_1\| \, u(x),$$

where $\|u(x)\| \to 0$ as $x \to x_1$. If we let $M_1$ be the inverse of the linear function $Df(x_1)$, then

$$x - x_1 = M_1[Df(x_1)(x - x_1)] = M_1[f(x) - f(x_1) - \|x - x_1\| \, u(x)].$$

If $x \in U$, then $x = g(y)$ for some $y = f(x) \in V$; moreover $y_1 = f(x_1)$, so this equation can be written in the form

$$g(y) - g(y_1) - M_1(y - y_1) = -\|x - x_1\| \, M_1(u(x)).$$

Since $Df(x_1)$ is injective, it follows as in the proof of the Injective Mapping Theorem 41.5 that

$$\|y - y_1\| = \|f(x) - f(x_1)\| \ge \tfrac{1}{2} r \|x - x_1\|$$

provided $y$ is sufficiently close to $y_1$. Moreover, it follows from (41.8) that $\|M_1(u)\| \le (1/r) \|u\|$ for all $u \in \mathbf{R}^p$. Therefore we have

$$\|g(y) - g(y_1) - M_1(y - y_1)\| \le (2/r^2) \, \|u(x)\| \, \|y - y_1\|.$$

Now as $y \to y_1$, then $x = g(y) \to g(y_1) = x_1$ and so $\|u(x)\| \to 0$. We conclude, therefore, that $Dg(y_1)$ exists and equals $M_1 = (Df(x_1))^{-1}$.
The fact that $g$ belongs to Class $C^1(V)$ follows from the relation $Dg(y) = [Df(g(y))]^{-1}$ for $y \in V$, and the continuity of the mappings

$$y \mapsto g(y), \qquad x \mapsto Df(x), \qquad L \mapsto L^{-1}$$

of $V \to U$, $U \to \mathcal{L}(\mathbf{R}^p, \mathbf{R}^p)$, and $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^p) \to \mathcal{L}(\mathbf{R}^p, \mathbf{R}^p)$, respectively. (See Exercise 41.L.) Q.E.D.
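The formula for the derivative of the inverse function can be checked numerically in a case where the inverse is known explicitly. The sketch below is an illustration only: the polar coordinate map is a hypothetical choice of `f`, its local inverse `g` is written down by hand (valid near points with positive first coordinate), and the derivative of `g` is approximated by central differences rather than computed exactly.

```python
import math

def f(p):
    # Polar coordinate map (r, theta) -> (x, y); a hypothetical example.
    r, th = p
    return (r * math.cos(th), r * math.sin(th))

def g(q):
    # Explicit local inverse of f near points with x > 0.
    x, y = q
    return (math.hypot(x, y), math.atan2(y, x))

def Df(p):
    # Jacobian matrix of f at p = (r, theta).
    r, th = p
    return [[math.cos(th), -r * math.sin(th)],
            [math.sin(th), r * math.cos(th)]]

def inverse2(A):
    # Inverse of a 2 x 2 matrix with non-zero determinant.
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def numeric_jacobian(F, q, h=1e-6):
    # Central-difference approximation to the Jacobian matrix of F at q.
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        qp, qm = list(q), list(q)
        qp[j] += h
        qm[j] -= h
        Fp, Fm = F(qp), F(qm)
        for i in range(2):
            J[i][j] = (Fp[i] - Fm[i]) / (2.0 * h)
    return J

y = f((2.0, 0.5))
lhs = numeric_jacobian(g, y)   # approximates Dg(y)
rhs = inverse2(Df(g(y)))       # [Df(g(y))]^{-1}
max_diff = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
print(max_diff)
```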

Implicit Functions

Suppose that $F$ is a function that is defined on a subset of $\mathbf{R}^p \times \mathbf{R}^q$ into $\mathbf{R}^q$. (If we make the obvious identification of $\mathbf{R}^p \times \mathbf{R}^q$ with $\mathbf{R}^{p+q}$, then we do not need to define what it means to say that $F$ is continuous, or is differentiable at a point, or is in Class $C^1$ on a set.) Suppose that $F$ takes the point $(a, b)$ into the zero vector of $\mathbf{R}^q$. The problem of implicit functions is to solve the equation

$$F(x, y) = 0$$

for one argument (say, $y$) in terms of the other in the sense that we find a function $\varphi$ defined on a subset of $\mathbf{R}^p$ with values in $\mathbf{R}^q$ such that $b = \varphi(a)$ and

$$F(x, \varphi(x)) = 0$$

for all $x$ in the domain of $\varphi$. We assume that $F$ is continuous on a neighborhood of $(a, b)$ and we hope to conclude that the "solution function" $\varphi$ is continuous on a neighborhood of $a$. It will probably be no surprise to the reader that we shall assume that $F$ belongs to Class $C^1$ on a neighborhood of $(a, b)$; however, even this hypothesis is not enough to guarantee the existence and uniqueness of a continuous solution function $\varphi$ defined on a neighborhood of $a$.

Indeed, if $p = q = 1$, then the function given by $F(x, y) = x^2 - y^2$ has two continuous solution functions $\varphi_1(x) = x$ and $\varphi_2(x) = -x$ corresponding to the point $(0, 0)$. It also has discontinuous solutions, such as

$$\varphi_3(x) = x, \quad x \text{ rational},$$
$$\varphi_3(x) = -x, \quad x \text{ irrational}.$$

The function $G(x, y) = x - y^2$ has two continuous solution functions corresponding to $(0, 0)$, but neither of them is defined on a neighborhood of the point $x = 0$. To give a more exotic example, the function $H : \mathbf{R}^2 \to \mathbf{R}$ defined by

$$H(x, y) = x, \quad y = 0,$$
$$H(x, y) = x - y^3 \sin(1/y), \quad y \neq 0,$$

belongs to Class $C^1$ on a neighborhood of $(0, 0)$, but there is no continuous solution function defined on a neighborhood of $x = 0$.

In all three of these examples the partial derivative with respect to $y$ vanishes at the point under consideration. In the case $p = q = 1$, the additional assertion needed to guarantee the existence and uniqueness of the solution function is that this partial derivative be non-zero. In the general case, we observe that $DF(a, b)$ is a continuous linear function on $\mathbf{R}^p \times \mathbf{R}^q$ to $\mathbf{R}^q$ and induces a continuous linear function $L_2 : \mathbf{R}^q \to \mathbf{R}^q$ defined by

$$L_2(v) = DF(a, b)(0, v)$$

for $v \in \mathbf{R}^q$. In a very reasonable sense, $L_2$ is the "partial derivative" of $F$ with respect to $y \in \mathbf{R}^q$ at the point $(a, b)$. The additional assumption that we shall impose is that $L_2$ be invertible.
We now wish to interpret this problem in terms of coordinates. If $x = (x_1, \dots, x_p)$ and $y = (y_1, \dots, y_q)$, then the equation $F(x, y) = 0$ takes the form of $q$ equations in the $p + q$ arguments $x_1, \dots, x_p, y_1, \dots, y_q$ given by

(41.9)
$$f_1(x_1, \dots, x_p, y_1, \dots, y_q) = 0,$$
$$\cdots\cdots\cdots$$
$$f_q(x_1, \dots, x_p, y_1, \dots, y_q) = 0.$$

For the sake of convenience, suppose that $a = 0$ and $b = 0$ so that this system is satisfied for $x_1 = 0, \dots, x_p = 0$, $y_1 = 0, \dots, y_q = 0$, and it is desired to solve for the $y_j$ in terms of the $x_i$, at least when the $|x_i|$ are sufficiently small. If the functions $f_i$ are linear, then the condition for solvability is that the determinant of coefficients of the $y_j$ should be non-zero. If the functions $f_i$ are not linear, then the condition is that the Jacobian determinant

$$\frac{\partial(f_1, \dots, f_q)}{\partial(y_1, \dots, y_q)}(a, b) \neq 0.$$

When this is the case, there are functions $\varphi_j$, $j = 1, \dots, q$, defined and continuous near $a = 0$ such that if we substitute

$$y_1 = \varphi_1(x_1, \dots, x_p),$$
$$\cdots\cdots\cdots$$
$$y_q = \varphi_q(x_1, \dots, x_p),$$

into the system (41.9), then we obtain an identity in the $x_i$.


41.9 IMPLICIT FUNCTION THEOREM. Let $\Omega \subseteq \mathbf{R}^p \times \mathbf{R}^q$ be open and let $(a, b) \in \Omega$. Suppose that $F : \Omega \to \mathbf{R}^q$ belongs to Class $C^1(\Omega)$, that $F(a, b) = 0$, and that the linear map defined by

$$L_2(v) = DF(a, b)(0, v), \quad v \in \mathbf{R}^q,$$

is a bijection of $\mathbf{R}^q$ onto $\mathbf{R}^q$.
(a) Then there exists an open neighborhood $W$ of $a \in \mathbf{R}^p$ and a unique function $\varphi : W \to \mathbf{R}^q$ belonging to Class $C^1(W)$ such that $b = \varphi(a)$ and $F(x, \varphi(x)) = 0$ for all $x \in W$.
(b) There exists an open neighborhood $U$ of $(a, b)$ in $\mathbf{R}^p \times \mathbf{R}^q$ such that the pair $(x, y) \in U$ satisfies $F(x, y) = 0$ if and only if $y = \varphi(x)$ for $x \in W$.
PROOF. It is no loss of generality to assume that $a = 0$ and $b = 0$. Let $H : \Omega \to \mathbf{R}^p \times \mathbf{R}^q$ be defined by

$$H(x, y) = (x, F(x, y)) \quad \text{for } (x, y) \in \Omega.$$

It follows readily (see Exercise 39.V) that $H$ belongs to Class $C^1(\Omega)$ and that

$$DH(x, y)(u, v) = (u, DF(x, y)(u, v))$$

for $(x, y) \in \Omega$ and $(u, v) \in \mathbf{R}^p \times \mathbf{R}^q$. We now claim that $DH(0, 0)$ is invertible on $\mathbf{R}^p \times \mathbf{R}^q$. Indeed, if we let $L_1 \in \mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ be defined by

$$L_1(u) = DF(0, 0)(u, 0) \quad \text{for } u \in \mathbf{R}^p,$$

then the fact that $DF(0, 0)(u, v) = L_1(u) + L_2(v)$ shows that the inverse of $DH(0, 0)$ is the linear mapping $K$ on $\mathbf{R}^p \times \mathbf{R}^q$ defined by

$$K(x, z) = (x, L_2^{-1}[z - L_1(x)]).$$

Hence it follows from the Inversion Theorem 41.8 that there is an open neighborhood $U$ of $(0, 0) \in \mathbf{R}^p \times \mathbf{R}^q$ such that $V = H(U)$ is an open neighborhood of $(0, 0) \in \mathbf{R}^p \times \mathbf{R}^q$ and the restriction of $H$ to $U$ is a bijection onto $V$ with a continuous inverse $\Phi : V \to U$ which belongs to Class $C^1(V)$ and with $\Phi(0, 0) = (0, 0)$. Now $\Phi$ has the form

$$\Phi(x, z) = (\varphi_1(x, z), \varphi_2(x, z)) \quad \text{for } (x, z) \in V,$$

where $\varphi_1 : V \to \mathbf{R}^p$ and $\varphi_2 : V \to \mathbf{R}^q$. Since

$$(x, z) = H \circ \Phi(x, z) = H[\varphi_1(x, z), \varphi_2(x, z)]$$
$$= [\varphi_1(x, z), F(\varphi_1(x, z), \varphi_2(x, z))],$$

we infer that $\varphi_1(x, z) = x$ for all $(x, z) \in V$. Hence $\Phi$ takes the simpler form

$$\Phi(x, z) = (x, \varphi_2(x, z)) \quad \text{for } (x, z) \in V.$$

Now if $P : \mathbf{R}^p \times \mathbf{R}^q \to \mathbf{R}^q$ is defined by $P(x, z) = z$, then $P$ is linear and continuous and $\varphi_2 = P \circ \Phi$; therefore $\varphi_2$ belongs to Class $C^1(V)$ and we have

$$z = F(x, \varphi_2(x, z)) \quad \text{for } (x, z) \in V.$$

Now let $W = \{x \in \mathbf{R}^p : (x, 0) \in V\}$ so that $W$ is an open neighborhood of $0$ in $\mathbf{R}^p$, and define $\varphi(x) = \varphi_2(x, 0)$ for $x \in W$. Evidently $\varphi(0) = 0$, and it follows from the preceding formula that

$$F(x, \varphi(x)) = 0 \quad \text{for } x \in W.$$

Moreover $D\varphi(x)(u) = D\varphi_2(x, 0)(u, 0)$ for $x \in W$, $u \in \mathbf{R}^p$, whence we conclude that $\varphi$ belongs to Class $C^1(W)$. This proves part (a).
To complete the proof of part (b), suppose that $(x, y) \in U$ satisfies $F(x, y) = 0$. Then $H(x, y) = (x, F(x, y)) = (x, 0) \in V$, whence it follows that $x \in W$. Moreover $(x, y) = \Phi(x, 0) = (x, \varphi_2(x, 0)) = (x, \varphi(x))$ so that $y = \varphi(x)$. Q.E.D.
It is sometimes useful to have an explicit formula for the derivative of $\varphi$. In order to give this it is convenient to introduce the notion of the block partial derivatives of $F$. If $(x, y) \in \Omega$, the block partial derivative $D_{(1)} F(x, y)$ is the linear function mapping $\mathbf{R}^p \to \mathbf{R}^q$ given by

$$D_{(1)} F(x, y)(u) = DF(x, y)(u, 0) \quad \text{for } u \in \mathbf{R}^p,$$

and the block partial derivative $D_{(2)} F(x, y)$ is the linear function mapping $\mathbf{R}^q \to \mathbf{R}^q$ given by

$$D_{(2)} F(x, y)(v) = DF(x, y)(0, v) \quad \text{for } v \in \mathbf{R}^q.$$

Since $(u, v) = (u, 0) + (0, v)$ it is clear that

(41.10) $DF(x, y)(u, v) = D_{(1)} F(x, y)(u) + D_{(2)} F(x, y)(v).$

Note that the maps $L_1$ and $L_2$ that entered in the preceding proof are $D_{(1)} F(0, 0)$ and $D_{(2)} F(0, 0)$, respectively.
41.10 COROLLARY. With the hypotheses of the theorem, there exists a $\gamma > 0$ such that if $\|x - a\| < \gamma$, then the derivative of $\varphi$ at $x$ is the element of $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ given by
(41.11) $D\varphi(x) = -[D_{(2)} F(x, \varphi(x))]^{-1} \circ [D_{(1)} F(x, \varphi(x))].$

PROOF. Let $K : W \to \mathbf{R}^p \times \mathbf{R}^q$ be defined by

$$K(x) = (x, \varphi(x)) \quad \text{for } x \in W.$$

Then since $F \circ K(x) = F(x, \varphi(x)) = 0$, we have that $F \circ K : W \to \mathbf{R}^q$ is a constant function. Moreover, since it is readily seen that

$$DK(x)(u) = (u, D\varphi(x)(u)) \quad \text{for } u \in \mathbf{R}^p,$$

it follows from the Chain Rule 40.2 applied to the constant function $F \circ K$ that

$$0 = D(F \circ K)(x) = DF(K(x)) \circ DK(x).$$

If we use (41.10), we have

$$DF(x, \varphi(x))(u, v) = D_{(1)} F(x, \varphi(x))(u) + D_{(2)} F(x, \varphi(x))(v).$$

It follows from this that if $u \in \mathbf{R}^p$, then

$$0 = D(F \circ K)(x)(u) = D_{(1)} F(x, \varphi(x))(u) + D_{(2)} F(x, \varphi(x))(D\varphi(x)(u))$$
$$= D_{(1)} F(x, \varphi(x))(u) + [D_{(2)} F(x, \varphi(x)) \circ D\varphi(x)](u).$$

Hence we have

$$0 = D_{(1)} F(x, \varphi(x)) + D_{(2)} F(x, \varphi(x)) \circ D\varphi(x)$$

for all $x \in W$. By hypothesis, $L_2 = D_{(2)} F(a, b)$ is invertible. Since $\varphi$ and $D_{(2)} F$ are continuous, there is a $\gamma > 0$ such that if $\|x - a\| < \gamma$, then $D_{(2)} F(x, \varphi(x))$ is also invertible. Hence equation (41.11) follows from the preceding equation. Q.E.D.
It may be useful to interpret formula (41.11) in terms of matrices. Suppose that we have the system of $q$ equations in $p + q$ arguments given by (41.9). As we have remarked, the hypothesis of the Implicit Function Theorem requires that the matrix

$$\begin{bmatrix} f_{1,p+1} & \cdots & f_{1,p+q} \\ \vdots & & \vdots \\ f_{q,p+1} & \cdots & f_{q,p+q} \end{bmatrix}$$

is invertible at the point $(a, b)$. (Recall that $f_{i,j}$ denotes the partial derivative of $f_i$ with respect to the $j$th argument.) In this case the derivative of the solution function $\varphi$ at a point $x$ is given by

$$- \begin{bmatrix} f_{1,p+1} & \cdots & f_{1,p+q} \\ \vdots & & \vdots \\ f_{q,p+1} & \cdots & f_{q,p+q} \end{bmatrix}^{-1} \begin{bmatrix} f_{1,1} & \cdots & f_{1,p} \\ \vdots & & \vdots \\ f_{q,1} & \cdots & f_{q,p} \end{bmatrix},$$

where it is understood that both matrices are evaluated at the point $(x, \varphi(x))$ near $(a, b)$.
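The formula for the derivative of the solution function can be tested in a case where the solution function is known in closed form. The sketch below is an illustration only, with p = 2 and q = 1: the function `F` (whose zero set is the unit sphere) and the explicit solution `phi` (the upper hemisphere) are hypothetical choices, and the comparison derivative is a central-difference approximation.

```python
import math

def F(x1, x2, y):
    # Hypothetical example: F(x, y) = 0 describes the unit sphere in R^3.
    return x1 * x1 + x2 * x2 + y * y - 1.0

def phi(x1, x2):
    # Explicit solution function near (0, 0, 1): the upper hemisphere.
    return math.sqrt(1.0 - x1 * x1 - x2 * x2)

def dphi_formula(x1, x2):
    # -(block partial in y)^{-1} (block partial in x), evaluated at (x, phi(x)).
    y = phi(x1, x2)
    d1F = (2.0 * x1, 2.0 * x2)   # 1 x 2 block: partials with respect to x1, x2
    d2F = 2.0 * y                # 1 x 1 block: partial with respect to y
    return tuple(-t / d2F for t in d1F)

def dphi_numeric(x1, x2, h=1e-6):
    # Central-difference approximation to the partial derivatives of phi.
    return ((phi(x1 + h, x2) - phi(x1 - h, x2)) / (2.0 * h),
            (phi(x1, x2 + h) - phi(x1, x2 - h)) / (2.0 * h))

exact = dphi_formula(0.3, 0.4)
approx = dphi_numeric(0.3, 0.4)
print(exact, approx)
```

The 1 x 1 block is invertible exactly when y is non-zero, which is why the formula breaks down on the equator of the sphere.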

The Parametrization and Rank Theorems

The Implicit Function Theorem 41.9 can be regarded as giving conditions under which the "level curve"

$$C = \{(x, y) \in \mathbf{R}^p \times \mathbf{R}^q : F(x, y) = 0\}$$

passing through the point $(a, b)$ can be parametrized, at least locally, as the graph in $\mathbf{R}^p \times \mathbf{R}^q$ of some function defined on a neighborhood $W$ of $a \in \mathbf{R}^p$ to $\mathbf{R}^q$; that is,

$$C = \{(x, \varphi(x)) : x \in W\}.$$

We shall now present another theorem which gives conditions under which the image of a function mapping an open subset of $\mathbf{R}^p$ into $\mathbf{R}^q$ can be parametrized by means of a function $\varphi$ defined on an open set in a space of lower dimension.
In presenting this theorem, we will need to use some elementary, but important, facts from linear algebra which may be familiar to the reader.† We recall that if $L : \mathbf{R}^p \to \mathbf{R}^q$ is a linear transformation then the range (or the image) $R_L$ of $L$ is the subspace of $\mathbf{R}^q$ given by

$$R_L = \{L(x) : x \in \mathbf{R}^p\},$$

and the null space (or the kernel) $N_L$ of $L$ is the subspace of $\mathbf{R}^p$ given by

$$N_L = \{x \in \mathbf{R}^p : L(x) = 0\}.$$

The dimension $r(L)$ of $R_L$ is called the rank of $L$, and the dimension $n(L)$ of $N_L$ is called the nullity of $L$. (Thus the rank of $L$ is the number of linearly independent vectors in $\mathbf{R}^q$ needed to span the range $R_L$, and the nullity of $L$ is the number of linearly independent vectors in $\mathbf{R}^p$ needed to span the null space $N_L$.) It is an exercise to prove that if $\{u_1, \dots, u_n\}$ (where $n = n(L)$) is a linearly independent set of vectors in $\mathbf{R}^p$ spanning $N_L$ to which we adjoin $p - n$ vectors $u_{n+1}, \dots, u_p$ to get a basis for $\mathbf{R}^p$, then the set $\{L(u_{n+1}), \dots, L(u_p)\}$ is a linearly independent set of vectors in $\mathbf{R}^q$ spanning $R_L$. Therefore it follows that $p = n(L) + r(L)$; hence: the dimension of the domain of $L$ is equal to the sum of the nullity and the rank of $L$.
If we represent $L$ by a $q \times p$ matrix as in (23.1), then it can be shown that the rank of $L$ is the largest number $r$ such that there is at least one $r \times r$ submatrix with non-zero determinant.
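The relation between rank and nullity is easy to check on a small example. The sketch below is an illustration only: the 3 x 4 matrix and the two null vectors are hypothetical choices, and the rank is computed by Gaussian elimination over the rationals (using the standard `fractions` module) rather than by examining determinants of submatrices.

```python
from fractions import Fraction

def rank(rows):
    # Rank of a matrix (given as a list of rows) by Gaussian elimination
    # with exact rational arithmetic.
    A = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][col] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(len(A)):
            if i != r and A[i][col] != 0:
                factor = A[i][col] / A[r][col]
                A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
    return r

# Hypothetical linear map L : R^4 -> R^3 (so p = 4, q = 3).
L = [[1, 2, 3, 4],
     [2, 4, 6, 8],
     [0, 1, 1, 0]]

r = rank(L)
nullity = len(L[0]) - r          # p - r(L)

# Two vectors in the null space, found by hand for this matrix.
kernel_basis = [[-1, -1, 1, 0], [-4, 0, 0, 1]]
images = [[sum(a * b for a, b in zip(row, v)) for row in L] for v in kernel_basis]
print(r, nullity, images)
```

Since the two null vectors are linearly independent and the nullity is 2, they span the null space of this matrix.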

† For more detail, consult the books of Hoffman and Kunze or Finkbeiner listed in the References.

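The Parametrization Theorem asserts that if $f$ is a $C^1$ mapping of an open set $\Omega \subseteq \mathbf{R}^p$ into $\mathbf{R}^q$ such that $Df(x)$ has rank equal to $r$ for all $x \in \Omega$ and if $f(a) = b \in \mathbf{R}^q$ for some $a \in \Omega$, then there is a neighborhood $V$ of $a$ such that the restriction of $f$ to $V$ can be given as a $C^1$ mapping $\varphi$ defined on a neighborhood in $\mathbf{R}^r$.
41.11 PARAMETRIZATION THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$ belong to Class $C^1(\Omega)$. Let $Df(x)$ have rank $r$ for all $x \in \Omega$ and let $f(a) = b \in \mathbf{R}^q$ for some $a \in \Omega$. Then
(i) there exists an open neighborhood $V \subseteq \Omega$ of $a$ and a function $\alpha : V \to \mathbf{R}^r$ in Class $C^1(V)$, and
(ii) there exists an open set $W \subseteq \mathbf{R}^r$ and functions $\beta : W \to \mathbf{R}^p$ and $\varphi : W \to \mathbf{R}^q$, such that
(iii) $f(x) = \varphi \circ \alpha(x)$ for all $x \in V$, and $\varphi(t) = f \circ \beta(t)$ for all $t \in W$.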
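A small example may help in reading the statement of this theorem. The sketch below is an illustration only and does not follow the construction used in the proof: the map `f` from the plane into three-space is a hypothetical example whose derivative has rank 1 at every point, and the functions `alpha`, `beta`, `phi` are written down by inspection so that the two relations in (iii) hold with r = 1.

```python
import math

def f(x):
    # Hypothetical C^1 map R^2 -> R^3 depending only on s = x_1 + x_2;
    # every row of its Jacobian matrix is a multiple of (1, 1), so the rank is 1.
    s = x[0] + x[1]
    return (s, s * s, math.sin(s))

def alpha(x):
    # alpha : R^2 -> R^1
    return x[0] + x[1]

def beta(t):
    # beta : R^1 -> R^2
    return (t, 0.0)

def phi(t):
    # phi : R^1 -> R^3, a parametrization of the image of f by one variable
    return (t, t * t, math.sin(t))

points = [(0.1, -0.2), (0.3, 0.4), (-0.5, 0.25)]
params = [-0.3, 0.0, 0.7]

factorization_ok = all(f(x) == phi(alpha(x)) for x in points)   # f = phi o alpha
param_ok = all(phi(t) == f(beta(t)) for t in params)            # phi = f o beta
print(factorization_ok, param_ok)
```

Thus the image of this map of the plane is a curve in three-space, described by the single parameter t.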

PROOF. Without any loss of generality we may assume that $a = 0 \in \mathbf{R}^p$ and $b = 0 \in \mathbf{R}^q$.
Let $L = Df(0)$ so that $L : \mathbf{R}^p \to \mathbf{R}^q$ has rank $r$, and let $\{x_1, \dots, x_p\}$ be a basis in $\mathbf{R}^p$ such that $\{x_{r+1}, \dots, x_p\}$ spans the null space of $L$. We let $X_1$ be the span of $\{x_1, \dots, x_r\}$ and $X_2 = N_L$ be the span of $\{x_{r+1}, \dots, x_p\}$. As mentioned above, it follows that $Y_1 = R_L$ is spanned by $\{y_1 = L(x_1), \dots, y_r = L(x_r)\}$. We let $\{y_{r+1}, \dots, y_q\}$ be chosen such that $\{y_1, \dots, y_q\}$ is a basis for $\mathbf{R}^q$ and let $Y_2$ be the span of $\{y_{r+1}, \dots, y_q\}$.
It follows that every vector $x \in \mathbf{R}^p$ has a unique representation in the form $x = c_1 x_1 + \cdots + c_p x_p$. We let $P_1$ and $P_2$ be the linear transformations in $\mathbf{R}^p$ defined by

$$P_1(x) = \sum_{j=1}^{r} c_j x_j, \qquad P_2(x) = \sum_{j=r+1}^{p} c_j x_j.$$

Clearly the range of $P_j$ is equal to $X_j$, $j = 1, 2$. Similarly, we let $Q_1$ and $Q_2$ be the linear transformations in $\mathbf{R}^q$ defined for $y = c_1 y_1 + \cdots + c_q y_q$ by

$$Q_1(y) = \sum_{j=1}^{r} c_j y_j, \qquad Q_2(y) = \sum_{j=r+1}^{q} c_j y_j.$$

Clearly the range of $Q_j$ is $Y_j$, $j = 1, 2$.
If $L_1$ is the restriction of $L$ to $X_1$, then $L_1$ is a bijection of $X_1$ onto $Y_1$; we let $A : Y_1 \to X_1$ be the inverse of $L_1$. We note that $A \circ L(x) = x$ for all $x \in X_1$ and $L \circ A(y) = y$ for all $y \in Y_1$. We now define $u$ on $\Omega \subseteq \mathbf{R}^p$ to $\mathbf{R}^p$ by
(41.12) $u(x) = A \circ Q_1 \circ f(x) + P_2(x), \quad x \in \Omega,$

so that $u(0) = 0$, $u$ maps $X_1 \cap \Omega$ into $X_1$, and

$$Du(x) = A \circ Q_1 \circ Df(x) + P_2, \quad x \in \Omega;$$



hence $u$ belongs to Class $C^1(\Omega)$. Since it is readily seen that $Du(0)$ is the identity map on $\mathbf{R}^p$, it follows from the Inversion Theorem 41.8 that there exists an open neighborhood $U$ of $a = 0$ such that $U' = u(U)$ is an open neighborhood of $0$, and that the restriction of $u$ to $U$ is a bijection onto $U'$ with inverse $w = u^{-1} : U' \to \mathbf{R}^p$ which belongs to Class $C^1(U')$. Further, by replacing $U$ and $U'$ by smaller sets, we may also suppose that $U'$ is convex (that is, contains the line segment joining any two of its points).
We now let $g : U' \to \mathbf{R}^q$ be defined by

$$g(z) = f(w(z)), \quad z \in U' \subseteq \mathbf{R}^p.$$

Clearly $g$ belongs to Class $C^1(U')$ and

$$Dg(z) = Df(w(z)) \circ Dw(z), \quad z \in U'.$$

Since $Df(x)$ has rank $r$ for all $x \in \Omega$ and $Dw(z)$ is invertible for $z \in U'$, it follows from a theorem in linear algebra that $Dg(z)$ has rank $r$ for all $z \in U'$. On the other hand,

$$g(z) = (Q_1 + Q_2) \circ f(w(z)) = Q_1 \circ f(w(z)) + Q_2 \circ f(w(z)).$$

Since $w = u^{-1}$, it follows from (41.12) that

$$z = u(w(z)) = A \circ Q_1 \circ f(w(z)) + P_2(w(z)), \quad z \in U'.$$

But since $L \circ A \circ Q_1 = Q_1$ on $\mathbf{R}^q$ and $L \circ P_2 = 0$ on $\mathbf{R}^p$, we have

(41.13) $L(z) = Q_1 \circ f(w(z)) = Q_1 \circ g(z),$

whence it follows that $L = Q_1 \circ Dg(z)$ for $z \in U'$. Therefore, if $z \in U'$, then the operator $Q_1$ maps the range of $Dg(z)$ (which has dimension $r$) onto the range of $L$ (which also has dimension $r$). It follows that $Q_1$ is injective on the range of $Dg(z)$ for $z \in U'$; hence, if $z \in U'$ and $x \in \mathbf{R}^p$ is such that $L(x) = 0$, then $Dg(z)(x) = 0$. Consequently, if $z \in U'$ and $z_2 \in X_2 = N_L$, then we infer that $Dg(z)(z_2) = 0$.
We will now show that $g : U' \to \mathbf{R}^q$ depends only on $z_1 \in X_1$ in the sense that if $z \in U'$ and $z_2 \in X_2$ is such that $z + z_2 \in U'$, then $g(z + z_2) = g(z)$. To see this, we apply the Mean Value Theorem 40.5 to deduce that there exists a point $z_0$ on the line segment joining $z$ and $z + z_2$ (and hence in $U'$) such that

$$0 \le \|g(z + z_2) - g(z)\| \le \|Dg(z_0)(z_2)\| = 0;$$

hence $g(z + z_2) = g(z)$, as claimed.
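We are now prepared to define the maps $\alpha$, $\beta$, $\varphi$. Let $C : \mathbf{R}^r \to \mathbf{R}^p$ be the linear transformation which maps the standard basis elements $e_1, \dots, e_r$ of $\mathbf{R}^r$ into the vectors $x_1, \dots, x_r$ which form a basis for $X_1$. Hence $C$ is a bijection of $\mathbf{R}^r$ onto $X_1$ and so $C^{-1} : X_1 \to \mathbf{R}^r$ exists. Let $W = C^{-1}(U') = C^{-1}(U' \cap X_1)$, so that $W \subseteq \mathbf{R}^r$ is an open neighborhood of $0$ in $\mathbf{R}^r$, and let $V \subseteq U$ be an open neighborhood of $a = 0$ such that $P_1 \circ u(V) \subseteq U'$. We now define $\alpha : V \to \mathbf{R}^r$ and $\beta : W \to \mathbf{R}^p$ by
(41.14) $\alpha(x) = C^{-1} \circ P_1 \circ u(x), \qquad \beta(t) = w \circ C(t)$
for $x \in V$ and $t \in W$. It is clear that $\alpha$ belongs to Class $C^1(V)$ and $\alpha(V) \subseteq W$, and that $\beta$ belongs to Class $C^1(W)$ and $\beta(W) \subseteq U$. We now define $\varphi : W \to \mathbf{R}^q$ for $t \in W$ by

$$\varphi(t) = g \circ C(t),$$

whence it follows that

$$\varphi(t) = (f \circ w) \circ C(t) = f \circ \beta(t).$$

Moreover, if $x \in V$, then

$$f(x) = f(w \circ u(x)) = (f \circ w) \circ u(x) = g \circ u(x);$$

however, we have seen that $g \circ u(x) = g \circ P_1 \circ u(x)$ so that

$$f(x) = g \circ u(x) = g \circ (C \circ C^{-1}) \circ (P_1 \circ u)(x)$$
$$= (g \circ C) \circ (C^{-1} \circ P_1 \circ u)(x)$$
$$= \varphi \circ \alpha(x).$$

Hence, $f(x) = \varphi \circ \alpha(x)$ for all $x \in V$. Q.E.D.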

In the course of this construction, we have actually established a bit more information. In this corollary we make use of the notation developed in the proof of the theorem.
41.12 COROLLARY. (a) The mapping $\varphi : W \to \mathbf{R}^q$ has the form $\varphi_1 + \varphi_2$, where $\varphi_1$ is the restriction to $W$ of the linear map of $\mathbf{R}^r \to \mathbf{R}^q$ which takes $e_j \in \mathbf{R}^r$ into $y_j = L(x_j)$, $j = 1, \dots, r$, and where $\varphi_2(W) \subseteq Y_2$.
(b) If $t \in W$, then $\alpha \circ \beta(t) = t$.
(c) If $x \in U \cap X_1$, then $x \in V$ and $\beta \circ \alpha(x) = x$.
PROOF. (a) Since $g = Q_1 \circ g + Q_2 \circ g$, it follows from (41.13) that $g = L + Q_2 \circ g$. Hence, from the definition of $\varphi$, we have $\varphi = L \circ C + Q_2 \circ g \circ C$, which has the form stated in (a).
(b) If $t \in W$, then $x = \beta(t) = w \circ C(t) \in U$ has the property that $u(x) = u \circ w \circ C(t) = C(t) \in U' \cap X_1$; hence $P_1 \circ u(x) = C(t) \in U'$ so that $x \in V$ and

$$\alpha(x) = C^{-1} \circ P_1 \circ u(x) = C^{-1} \circ C(t) = t,$$

which proves statement (b).
(c) If $x \in \Omega \cap X_1$, then it follows from (41.12) and the fact that $P_2(x) = 0$ that $u(x) \in X_1$. Hence if $x \in U \cap X_1$ it follows that $P_1 \circ u(x) = u(x) \in U' \cap X_1$, so that $x \in V$. Moreover,

$$\beta \circ \alpha(x) = (w \circ C) \circ (C^{-1} \circ P_1 \circ u)(x)$$
$$= w \circ C \circ C^{-1} \circ u(x) = w \circ u(x) = x. \qquad \text{Q.E.D.}$$
We can now use the result of the Parametrization Theorem to prove the
Rank Theorem.

41.13 RANK THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}^q$ belong
to Class $C^1(\Omega)$. Let $Df(x)$ have rank $r$ for all $x \in \Omega$ and let $f(a) = b \in \mathbf{R}^q$ for
some $a \in \Omega$. Then:
(i) there exist open neighborhoods $V$ of $a$ and $V'$ of 0 in $\mathbf{R}^p$, and a
function $\sigma : V \to V'$ in Class $C^1(V)$ which has an inverse $\sigma^{-1} : V' \to V$ in
Class $C^1(V')$;
(ii) there exist open neighborhoods $Z$ of $b$ and $Z'$ of 0 in $\mathbf{R}^q$, and a
function $\tau : Z' \to Z$ in Class $C^1(Z')$ which has an inverse $\tau^{-1} : Z \to Z'$ in
Class $C^1(Z)$;
(iii) if $x \in V$ then $f(x) = \tau \circ i_r \circ \sigma(x)$, where $i_r : \mathbf{R}^p \to \mathbf{R}^q$ is the mapping
defined by
$$i_r(c_1, \ldots, c_r, c_{r+1}, \ldots, c_p) = (c_1, \ldots, c_r, 0, \ldots, 0) \in \mathbf{R}^q.$$
PROOF. We assume that $a = 0$ and $b = 0$ and shall employ the notation
and results established during the proof of the Parametrization Theorem.
Let $B : \mathbf{R}^p \to \mathbf{R}^p$ be the linear function which maps the standard basis
elements $e_1, \ldots, e_p$ of $\mathbf{R}^p$ into the vectors $x_1, \ldots, x_p$; hence $B$ is a bijection
of $\mathbf{R}^p$ onto $\mathbf{R}^p$ and so $B^{-1}$ exists. The map $\sigma : \Omega \to \mathbf{R}^p$ defined by
$\sigma(x) = B^{-1} \circ u(x)$ belongs to Class $C^1(\Omega)$ and, since the restriction of $u$ to $U$
has an inverse $w : U' \to \mathbf{R}^p$ mapping onto $U$, it follows that the restriction
of $\sigma$ to $U$ has an inverse $\sigma^{-1} = w \circ B$ mapping $B^{-1}(U')$ onto $U$.
Let $W \subseteq \mathbf{R}^r$ and $\varphi : W \to \mathbf{R}^q$ be as in the Parametrization Theorem and
let $H : \mathbf{R}^q \to \mathbf{R}^q$ be the linear function which maps the standard basis
elements $e_1, \ldots, e_q$ of $\mathbf{R}^q$ into the vectors $y_1, \ldots, y_q$; hence $H$ is a bijection
of $\mathbf{R}^q$ onto $\mathbf{R}^q$ and so $H^{-1}$ exists. We define
$$W' = \{(c_1, \ldots, c_q) \in \mathbf{R}^q : (c_1, \ldots, c_r) \in W\}$$
and let $\tau : W' \to \mathbf{R}^q$ be defined by
$$\tau(c_1, \ldots, c_q) = \varphi(c_1, \ldots, c_r) + H(0, \ldots, 0, c_{r+1}, \ldots, c_q).$$
It follows from Corollary 41.12(a) that $D\tau(0) = H$; hence the Inversion
Theorem 41.8 implies that the restriction of $\tau$ to some neighborhood $Z'$ of
0 is a bijection onto some neighborhood $Z$ of $\tau(0) = 0$.
By further restricting $V$ if necessary, we may assume that $f(V) \subseteq Z$.
Now let $x \in V$ and consider $\sigma(x) = B^{-1} \circ u(x)$. If $i_r$ is as defined above, then
$i_r \circ \sigma(x) = (C^{-1} \circ P_1 \circ u(x), 0) = (\alpha(x), 0)$. Hence $\tau \circ i_r \circ \sigma(x) = \varphi \circ \alpha(x) = f(x)$
for all $x \in V$. Q.E.D.

Exercises
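41.A. Let $\Omega \subseteq \mathbf{R}^p$ be open and $f : \Omega \to \mathbf{R}^q$. If $Df(x)$ exists for all $x \in \Omega$ and if
$i = 1, \ldots, q$, $j = 1, \ldots, p$, then show that $|D_j f_i(x) - D_j f_i(y)| \le \|Df(x) - Df(y)\|_{pq}$.
Hence, if $f$ belongs to Class $C^1(\Omega)$ then each of the partial derivatives $D_j f_i$ is
continuous on $\Omega$.
41.B. Let $\Omega \subseteq \mathbf{R}^p$ be open and $f : \Omega \to \mathbf{R}^q$. If $f$ belongs to Class $C^1(\Omega)$ and $K \subseteq \Omega$
is compact, show that $x \mapsto Df(x)$ is uniformly continuous in the sense that for every
$\varepsilon > 0$ there exists $\delta > 0$ such that if $x, y \in K$ and $\|x - y\| < \delta$ then
$\|Df(x) - Df(y)\|_{pq} < \varepsilon$.
41.C. Let $\Omega \subseteq \mathbf{R}^p$ and $\Omega_1 \subseteq \mathbf{R}^q$ be open and let $f : \Omega \to \mathbf{R}^q$ belong to Class $C^1(\Omega)$
and $g : \Omega_1 \to \mathbf{R}^r$ belong to Class $C^1(\Omega_1)$. If $f(\Omega) \subseteq \Omega_1$, show that $g \circ f$ belongs to
Class $C^1(\Omega)$.
41.D. Let $f : \mathbf{R} \to \mathbf{R}$ be defined by $f(x) = x^3$. Show that $f$ belongs to Class $C^1(\mathbf{R})$
and that it is a bijection of $\mathbf{R}$ onto $\mathbf{R}$ with inverse $g(x) = x^{1/3}$ for all $x \in \mathbf{R}$. However,
$Df(0)$ is neither injective nor surjective. Does $g$ belong to Class $C^1(\mathbf{R})$?
41.E. Let $g : \mathbf{R} \to \mathbf{R}$ be such that $g'(x) \ne 0$ for all $x \in \mathbf{R}$. Show that $g$ is a
bijection of $\mathbf{R}$ onto $g(\mathbf{R})$.
41.F. Let $A \subseteq \mathbf{R}^p$, let $f : A \to \mathbf{R}^p$, and let $g : f(A) \to \mathbf{R}^p$ be inverse to $f$. Suppose
that $f$ is differentiable at $a \in A$ and $g$ is differentiable at $b = f(a)$. If $Df(a)$ is not
invertible, then show that $Dg(b)$ is not invertible.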
41.G. Let $f : \mathbf{R}^2 \to \mathbf{R}^2$ be given by
$$f(x, y) = (x + y,\; 2x + ay).$$
(a) Calculate $Df(x, y)$ and show that $Df(x, y)$ is invertible if and only if $a \ne 2$.
(b) Examine the image of the unit square $\{(x, y) : x, y \in [0, 1]\}$ when $a = 1, 2, 3$.
41.H. Let $f$ be the mapping of $\mathbf{R}^2$ into $\mathbf{R}^2$ which sends the point $(x, y)$ into the
point $(u, v)$ given by
$$u = x, \qquad v = xy.$$
Draw some curves $u$ = constant, $v$ = constant in the $(x, y)$-plane and some curves
$x$ = constant, $y$ = constant in the $(u, v)$-plane. Is this mapping one-one? Does $f$
map onto all of $\mathbf{R}^2$? Show that if $x \ne 0$, then $f$ maps some neighborhood of $(x, y)$ in
a one-one fashion onto a neighborhood of $(x, xy)$. Into what region in the
$(u, v)$-plane does $f$ map the rectangle $\{(x, y) : 1 \le x \le 2,\ 0 \le y \le 2\}$? What points
in the $(x, y)$-plane map under $f$ into the rectangle $\{(u, v) : 1 \le u \le 2,\ 0 \le v \le 2\}$?
41.I. Let $f$ be the mapping of $\mathbf{R}^2$ into $\mathbf{R}^2$ which sends the point $(x, y)$ into the
point $(u, v)$ given by
$$u = x^2 - y^2, \qquad v = 2xy.$$
What curves in the $(x, y)$-plane map under $f$ into the lines $u$ = constant, $v$ =
constant? Into what curves in the $(u, v)$-plane do the lines $x$ = constant, $y$ =
constant map? Show that each non-zero point $(u, v)$ is the image under $f$ of two
points. Into what region does $f$ map the square $\{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\}$?
What region is mapped by $f$ into the square $\{(u, v) : 0 \le u \le 1,\ 0 \le v \le 1\}$?

41.J. Let $h : \mathbf{R} \to \mathbf{R}$ be defined by
$$h(x) = x + 2x^2 \sin(1/x) \ \text{ for } x \ne 0, \qquad h(0) = 0.$$
Show that $h$ does not belong to Class $C^1(\mathbf{R})$ and that $h$ is not injective on a
neighborhood of 0. However, it is surjective on a neighborhood of 0 and $Dh(0)$ is
invertible.
41.K. Let $f : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $f(x, y) = (y, x + y^2)$ for $(x, y) \in \mathbf{R}^2$. Show
that $f$ belongs to Class $C^1(\mathbf{R}^2)$ and that $f$ is invertible on some neighborhood of an
arbitrary point of $\mathbf{R}^2$. Draw the image under $f$ of the lines $x = 0, \pm 1, \pm 2$ and
$y = 0, \pm 1, \pm 2$. Find the inverse $g = f^{-1} : \mathbf{R}^2 \to \mathbf{R}^2$ and show that
$Dg(f(x_0, y_0)) = Df(x_0, y_0)^{-1}$.
41.L. (This exercise assumes familiarity with the notion of the determinant of a
square matrix.) Let $L \in \mathcal{L}(\mathbf{R}^p, \mathbf{R}^p)$ and let $[c_{ij}]$ be the matrix representation of $L$
with respect to the standard basis in $\mathbf{R}^p$. It is shown in linear algebra that $L$ is
invertible if and only if $\Delta = \det[c_{ij}]$ is not zero. Furthermore, if $\Delta \ne 0$, then the
matrix of $L^{-1}$ has the form $[p_{ij}/\Delta]$, where the $p_{ij}$ are polynomials in the $c_{ij}$.
(a) Show that if $L_0$ is invertible and if $\|L - L_0\|_{pp}$ is sufficiently small, then $L$ is
invertible.
(b) Show that if $L_0$ is invertible, then the map $L \mapsto L^{-1}$ is continuous on a
neighborhood of $L_0$ with respect to the norm in $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^p)$.
(c) Let $\Omega \subseteq \mathbf{R}^p$ be open and $f : \Omega \to \mathbf{R}^p$ belong to Class $C^1(\Omega)$. If $Df(c)$ is
invertible for some $c \in \Omega$, then $Df(x)$ is invertible on some neighborhood of $c$.
41.M. Let $F : \mathbf{R}^2 \to \mathbf{R}$ be defined by $F(x, y) = y^2 - x$. Show that $F$ belongs to
Class $C^1(\mathbf{R}^2)$ but that $D_2F(0, 0) = 0$. Show that there does not exist a function $\varphi$
defined on a neighborhood $W$ of 0 such that $F(x, \varphi(x)) = 0$ for all $x \in W$.
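41.N. Let $f : \mathbf{R}^3 \to \mathbf{R}^2$ be defined by
$$f(x, y, z) = (x + y + z,\; x - y - 2xz),$$
so that $f(0, 0, 0) = (0, 0)$ and $Df(0, 0, 0)$ is given by
$$\begin{bmatrix} 1 & 1 & 1 \\ 1 & -1 & 0 \end{bmatrix}.$$
(a) Show that we can solve for $(x, y) = \varphi(z)$ near $z = 0$ and that
$$D\varphi(0) = \begin{bmatrix} -\tfrac12 \\ -\tfrac12 \end{bmatrix}.$$
(b) Carry out the explicit solution of $(x, y) = \varphi(z)$ to obtain
$$\varphi(z) = \left( \frac{z}{2(z - 1)},\; \frac{z - 2z^2}{2(z - 1)} \right) \quad \text{for } z < 1.$$
Check the result of part (a).
(c) Show that we can solve for $(y, z) = \psi(x)$ near $x = 0$ and that
$$D\psi(0) = \begin{bmatrix} 1 \\ -2 \end{bmatrix}.$$
(d) Carry out the explicit solution of $(y, z) = \psi(x)$ to obtain
$$\psi(x) = \left( -\frac{2x^2 + x}{2x - 1},\; \frac{2x}{2x - 1} \right) \quad \text{for } x < \tfrac12.$$
Check the result of part (c).
41.O. Let $F : \mathbf{R}^5 \to \mathbf{R}^2$ be defined by
$$F(u, v, w, x, y) = (uy + vx + w + x^2,\; uvw + x + y + 1),$$
and note that $F(2, 1, 0, -1, 0) = (0, 0)$.
(a) Show that we can solve $F(u, v, w, x, y) = (0, 0)$ for $(x, y)$ in terms of $(u, v, w)$
near $(2, 1, 0)$.
(b) If $(x, y) = \varphi(u, v, w)$ is the solution in part (a), show that $D\varphi(2, 1, 0)$ is given
by the matrix
$$-\begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0 & -1 & 1 \\ 0 & 0 & 2 \end{bmatrix} = \frac13 \begin{bmatrix} 0 & -1 & -3 \\ 0 & 1 & -3 \end{bmatrix}.$$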
41.P. Let $A \subseteq \mathbf{R}^3$ and let $F : A \to \mathbf{R}$ represent a surface $S_F$ in $\mathbf{R}^3$ implicitly as the
"level surface"
$$S_F = \{(x, y, z) \in A : F(x, y, z) = 0\}.$$
If $F$ is differentiable at a point $(x_0, y_0, z_0) \in S_F$ which is interior to $A$, then the
tangent space to $S_F$ at this point is the set of points
$$\{(x, y, z) \in \mathbf{R}^3 : A_{(x_0, y_0, z_0)}(x, y, z) = 0\},$$
where $A_{(x_0, y_0, z_0)}$ is the affine map of $\mathbf{R}^3 \to \mathbf{R}$ defined by
$$A_{(x_0, y_0, z_0)}(x, y, z) = F(x_0, y_0, z_0) + DF(x_0, y_0, z_0)(x - x_0, y - y_0, z - z_0) = DF(x_0, y_0, z_0)(x - x_0, y - y_0, z - z_0).$$
(a) Show that the tangent space at $(x_0, y_0, z_0)$ is given by
$$\{(x, y, z) : D_1F(x_0, y_0, z_0)(x - x_0) + D_2F(x_0, y_0, z_0)(y - y_0) + D_3F(x_0, y_0, z_0)(z - z_0) = 0\}.$$
Hence the tangent space to $S_F$ is a plane if at least one of the numbers
$D_1F(x_0, y_0, z_0)$, $D_2F(x_0, y_0, z_0)$, $D_3F(x_0, y_0, z_0)$ is different from 0. In this case the
tangent space to $S_F$ is called the tangent plane to $S_F$ at $(x_0, y_0, z_0)$.
41.Q. Let $F : \mathbf{R}^3 \to \mathbf{R}$, given below, represent a surface $S_F$ in $\mathbf{R}^3$ implicitly as the
level surface
$$S_F = \{(x, y, z) \in \mathbf{R}^3 : F(x, y, z) = 0\}.$$
In each of the following cases, determine the tangent space to $S_F$ at the indicated
points.
(a) Let $F(x, y, z) = x^2 + y^2 - z$ at the points $(1, 1, 2)$ and $(0, 2, 4)$.
(b) Let $F(x, y, z) = x^2 + y^2 + z^2 - 25$ at the points $(3, 4, 0)$ and $(3, 3, \sqrt{7})$.
(c) Let $F(x, y, z) = z - xy$ at the points $(1, 1, 1)$ and $(4, \tfrac12, 2)$.
41.R. (a) Suppose that, in addition to the hypotheses of the Inversion Theorem
41.8, it is known that the function $f$ has continuous partial derivatives of order
$m > 1$. Show that the inverse function $g : V \to \mathbf{R}^p$ has continuous partial
derivatives of order $m$.
(b) Prove the analogous result for the Implicit Function Theorem 41.9.
41.S. Let $f : \mathbf{R}^2 \to \mathbf{R}$ belong to Class $C^1(\mathbf{R}^2)$. Show that $f$ is not injective;
indeed, the restriction of $f$ to any open set of $\mathbf{R}^2$ is not injective.
41.T. Let $g : \mathbf{R} \to \mathbf{R}^2$ belong to Class $C^1(\mathbf{R})$. Show that if $c \in \mathbf{R}$, then the
restriction of $g$ to any neighborhood of $c$ is not a surjective map onto a
neighborhood of $g(c)$.
41.U. Let $L \in \mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ be injective and let $r > 0$ be such that $r\|x\| \le \|L(x)\|$ for
all $x \in \mathbf{R}^p$. Show that if $L_1 \in \mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ is such that $\|L_1 - L\|_{pq} < r$, then $L_1$ is
injective. (Hence, the set of injective maps is open in $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$.)
41.V. Let $L \in \mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ be surjective and let $m > 0$ be as in the proof of 41.6.
Show that if $L_1 \in \mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$ is such that $\|L_1 - L\|_{pq} < m/2$, then $L_1$ is surjective.
(Hence, the set of surjective maps is open in $\mathcal{L}(\mathbf{R}^p, \mathbf{R}^q)$.)
41.W. Let $g : \mathbf{R}^p \to \mathbf{R}^p$ belong to Class $C^1(\mathbf{R}^p)$ and satisfy $\|Dg(x)\|_{pp} \le \alpha < 1$ for
all $x \in \mathbf{R}^p$. If $f(x) = x + g(x)$ for $x \in \mathbf{R}^p$, show that $f$ satisfies
$$\|f(x_1) - f(x_2) - (x_1 - x_2)\| \le \alpha \|x_1 - x_2\|$$
for all $x_1, x_2$ in $\mathbf{R}^p$ and that $f$ is a bijection of $\mathbf{R}^p$ onto $\mathbf{R}^p$.

Projects

41.α. (This project gives a direct and elementary proof of the Implicit Function
Theorem.) Let $\Omega \subseteq \mathbf{R}^2$ be open and let $F : \Omega \to \mathbf{R}$ belong to Class $C^1(\Omega)$. Suppose
that $(a, b) \in \Omega$, that $F(a, b) = 0$, and that $D_2F(a, b) > 0$.
(a) Show that there exists a closed cell $Q = [a_1, a_2] \times [b_1, b_2]$ with center $(a, b)$
such that $D_2F(x, y) > 0$ for all $(x, y) \in Q$, and such that $F(x, b_1) < 0$ and $F(x, b_2) > 0$
for all $x \in [a_1, a_2]$.
(b) If $x \in [a_1, a_2]$, then the function $F_x : [b_1, b_2] \to \mathbf{R}$ defined by $F_x(y) = F(x, y)$ for
$y \in [b_1, b_2]$ is such that $F_x(b_1) < 0 < F_x(b_2)$ and $F_x'(y) > 0$ for $y \in [b_1, b_2]$.
(c) There exists a function $\varphi$ mapping $[a_1, a_2]$ into $[b_1, b_2]$ such that $F(x, \varphi(x)) = 0$
for all $x \in [a_1, a_2]$.
(d) If $x \in (a_1, a_2)$ and $|h|$ is sufficiently small, show that there exists $h_1$ with
$0 < |h_1| < |h|$ such that
$$0 = F[x + h, \varphi(x + h)] - F[x, \varphi(x)] = D_1F[x + h_1, \varphi(x + h_1)]\,h + D_2F[x + h_1, \varphi(x + h_1)]\,[\varphi(x + h) - \varphi(x)].$$


(e) Show that $\varphi$ is differentiable on $(a_1, a_2)$ and that
$\varphi'(x) = -D_1F[x, \varphi(x)] / D_2F[x, \varphi(x)]$.
(f) Modify the preceding argument for a function $F$ defined on an open set
$\Omega \subseteq \mathbf{R}^{p+1}$.
(g) Let $\Omega \subseteq \mathbf{R}^p \times \mathbf{R}^2$ be open and let $F, G : \Omega \to \mathbf{R}$ belong to Class $C^1(\Omega)$ and
suppose that for some point $(a, b) \in \mathbf{R}^p \times \mathbf{R}^2$ we have $F(a, b) = 0$, $G(a, b) = 0$.
Suppose that
$$\Delta = \det \begin{bmatrix} D_{p+1}F(a, b) & D_{p+2}F(a, b) \\ D_{p+1}G(a, b) & D_{p+2}G(a, b) \end{bmatrix} \ne 0;$$
then at least one of $D_{p+1}F(a, b)$ and $D_{p+2}F(a, b)$ does not vanish. Suppose that
$D_{p+2}F(a, b) \ne 0$ and use (f) to obtain $x_{p+2} = \varphi(x_1, \ldots, x_{p+1})$ in a neighborhood of
$(a, b_1) \in \mathbf{R}^p \times \mathbf{R}$. Hence $F(x_1, \ldots, x_{p+1}, \varphi(x_1, \ldots, x_{p+1})) = 0$ on this neighborhood.
Now put
$$H(x_1, \ldots, x_{p+1}) = G(x_1, \ldots, x_{p+1}, \varphi(x_1, \ldots, x_{p+1})).$$
By the Chain Rule
$$D_{p+1}H = D_{p+1}G + (D_{p+2}G)(D_{p+1}\varphi),$$
where these functions are evaluated at the appropriate points. Since
$D_{p+1}\varphi = -(D_{p+1}F)/(D_{p+2}F)$ we infer that $D_{p+1}H = -\Delta / D_{p+2}F$, which does not vanish at
$(a, b_1)$. Hence we can use (f) to obtain $x_{p+1} = \psi(x_1, \ldots, x_p)$ on a neighborhood of
$a \in \mathbf{R}^p$. (This establishes the Implicit Function Theorem in the case where $q = 2$;
extensions to the case of general $q$ are obtained by induction.)
41.β. (This project is parallel to Project 40.α and gives a more direct proof of
the first part of Inversion Theorem 41.8 than the one given in the text.) We
assume that $\Omega \subseteq \mathbf{R}^p$ is open, that $f : \Omega \to \mathbf{R}^p$ belongs to Class $C^1(\Omega)$, and that for
some $x_0 \in \Omega$ the linear mapping $Df(x_0)$ is a bijection. We let $\Gamma = Df(x_0)^{-1}$.
(a) Show that there exists $r > 0$ such that if $\|x - x_0\| \le r$, then $\|I - \Gamma \circ Df(x)\|_{pp} \le \tfrac12$.
(b) Let $s = \tfrac12 r / \|\Gamma\|_{pp}$ and for fixed $y$ with $\|y - f(x_0)\| \le s$, we define $F_y(x) = f(x) - y$
for $\|x - x_0\| \le r$. Then $F_y$ is differentiable, $\|\Gamma \circ F_y(x_0)\| \le \tfrac12 r$, and $\|I - \Gamma \circ DF_y(x)\|_{pp} \le \tfrac12$
for $\|x - x_0\| \le r$.
(c) If $\|y - f(x_0)\| \le s$, let $G_y$ be defined for $\|x - x_0\| \le r$ by $G_y(x) = x - \Gamma \circ F_y(x)$.
Then $G_y$ is a contraction with constant $\tfrac12$ on this ball.
(d) If $\|y - f(x_0)\| \le s$, define $\varphi_0(y) = x_0$ and $\varphi_{n+1}(y) = G_y(\varphi_n(y))$ for $n = 0, 1, 2, \ldots$.
Show that $\|\varphi_{n+1}(y) - \varphi_n(y)\| \le 2^{-n} \|\varphi_1(y) - \varphi_0(y)\| \le 2^{-n-1} r$, whence it follows that
$\|\varphi_{n+1}(y) - \varphi_m(y)\| \le 2^{-m} r$ for $n \ge m \ge 0$. In particular, $\|\varphi_n(y) - x_0\| \le r$, so that this
iteration is possible.
(e) Show that each of the functions $\varphi_n$ is continuous for $\|y - f(x_0)\| \le s$ and that
the sequence $(\varphi_n)$ is uniformly convergent to a continuous function $\varphi$ which is such
that $G_y(\varphi(y)) = \varphi(y)$ for $\|y - f(x_0)\| \le s$, whence it follows that $f(\varphi(y)) = y$ for
$\|y - f(x_0)\| \le s$. Hence the function $\varphi$ is the inverse of $f$ on the set $\{y : \|y - f(x_0)\| \le s\}$
and maps it into the set $\{x : \|x - x_0\| \le r\}$.
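The iteration in parts (c) and (d) can be tried numerically. The following is only a minimal sketch for the case $p = 1$; the sample function $f(x) = x + \tfrac12 \sin x$, the base point $x_0 = 0$, the target value $y = 0.3$, and the helper name `invert_by_contraction` are assumptions made for illustration and do not come from the text.

```python
import math

def f(x):
    # Sample C^1 function (an assumption for illustration); f'(0) = 1.5 is invertible.
    return x + 0.5 * math.sin(x)

def invert_by_contraction(func, dfunc_x0, x0, y, iterations=60):
    """Iterate G_y(x) = x - Gamma * (func(x) - y), starting from phi_0(y) = x0."""
    gamma = 1.0 / dfunc_x0                 # Gamma = Df(x0)^{-1} in one dimension
    x = x0
    for _ in range(iterations):
        x = x - gamma * (func(x) - y)      # phi_{n+1}(y) = G_y(phi_n(y))
    return x

x_star = invert_by_contraction(f, 1.5, 0.0, 0.3)
residual = abs(f(x_star) - 0.3)
print(x_star, residual)
```

Here the fixed point of $G_y$ is the value $\varphi(y)$ with $f(\varphi(y)) = y$; the number of iterations is a generous choice, not a sharp bound.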
41.γ. (This project is parallel to Projects 40.α and 41.β and gives a direct proof
of the Implicit Function Theorem.) Let $\Omega \subseteq \mathbf{R}^p \times \mathbf{R}^q$ be open and let $(x_0, y_0) \in \Omega$.
Suppose that $F : \Omega \to \mathbf{R}^q$ belongs to Class $C^1(\Omega)$, that $F(x_0, y_0) = 0$, and that the
linear map $L_2 : \mathbf{R}^q \to \mathbf{R}^q$ defined by
$$L_2(v) = DF(x_0, y_0)(0, v) \quad \text{for } v \in \mathbf{R}^q$$
is a bijection of $\mathbf{R}^q$ onto $\mathbf{R}^q$. Let $\Gamma = L_2^{-1}$.
(a) Show that there exists $r > 0$ such that if $\|x - x_0\|^2 + \|y - y_0\|^2 \le r^2$, then
$\|v - \Gamma \circ DF(x, y)(0, v)\| \le \tfrac12 \|v\|$ for $v \in \mathbf{R}^q$.
(b) Let $0 < s \le \tfrac12 r$ be such that if $\|x - x_0\| \le s$, then
$$\|F(x, y_0)\| \le \tfrac14 r / \|\Gamma\|_{qq}.$$
For each fixed $x$ with $\|x - x_0\| \le s$, we define $G_x(y) = y - \Gamma \circ F(x, y)$ for $\|y - y_0\| \le \tfrac12 r$,
with values in $\mathbf{R}^q$. Show that for each $x$ with $\|x - x_0\| \le s$, we have
$$\|G_x(y_1) - G_x(y_2)\| \le \tfrac12 \|y_1 - y_2\|$$
for all $y_1, y_2$ satisfying $\|y_i - y_0\| \le \tfrac12 r$.
(c) If $\|x - x_0\| \le s$, define $\psi_0(x) = y_0$ and $\psi_{n+1}(x) = G_x(\psi_n(x))$ for $n = 0, 1, 2, \ldots$.
Show that $\|\psi_{n+1}(x) - \psi_m(x)\| \le 2^{-m-1} r$ for $n \ge m \ge 0$. Hence $\|\psi_n(x) - y_0\| \le \tfrac12 r$, so that
this iteration is possible.
(d) Show that each of the functions $\psi_n$ is continuous for $\|x - x_0\| \le s$ and that the
sequence $(\psi_n)$ is uniformly convergent to a continuous function $\psi$ such that
$F(x, \psi(x)) = 0$ for all $\|x - x_0\| \le s$.
(e) To show that $\psi$ is differentiable for $\|x - x_0\| < s$ use Exercise 39.W and employ
an argument similar to that in (d) and (e) of Project 41.α for each component of $F$.

Section 42 Extremum Problems

In Section 27 we briefly discussed the familiar process of locating interior
points at which a real-valued differentiable function of one variable attains
relative extreme values. The question as to whether a critical point (that
is, a point at which the derivative vanishes) is actually an extreme point is
not always discussed, but often can be handled by means of Taylor's
Theorem 28.6. The analysis of extreme points which belong to the
boundary of the domain often yields to an application of the Mean Value
Theorem 27.6.
In the case of a function with domain in $\mathbf{R}^p$ ($p > 1$) and range in $\mathbf{R}$, the
situation is often considerably more complicated, and each function needs
to be examined in its own right. However, there are a few general
theorems that are useful and which will be presented here.
Let $\Omega \subseteq \mathbf{R}^p$ and let $f : \Omega \to \mathbf{R}$. A point $c \in \Omega$ is said to be a point of
relative minimum of $f$ if there exists $\delta > 0$ such that $f(c) \le f(x)$ for all $x \in \Omega$
with $\|x - c\| < \delta$. A point $c \in \Omega$ is said to be a point of relative strict
minimum of $f$ if there exists $\delta > 0$ such that $f(c) < f(x)$ for all $x \in \Omega$ with
$0 < \|x - c\| < \delta$. We define a point of relative [strict] maximum of $f$
similarly. Moreover, if $c \in \Omega$ is a point of relative [strict] minimum or
relative [strict] maximum of $f$, we say that $c$ is a point of relative [strict]
extremum of $f$, or that $f$ has a relative [strict] extremum at $c$.
The next result is very often useful.
42.1 THEOREM. Let $\Omega \subseteq \mathbf{R}^p$, and let $f : \Omega \to \mathbf{R}$. If an interior point $c$
of $\Omega$ is a point of relative extremum of $f$, and if the partial derivative $D_u f(c)$ of
$f$ with respect to a vector $u \in \mathbf{R}^p$ exists, then $D_u f(c) = 0$.
PROOF. By hypothesis the restriction of $f$ to the intersection of $\Omega$ with
the line $\{c + tu : t \in \mathbf{R}\}$ has a relative extremum at $c$. It therefore follows
from Theorem 27.4 that $D_u f(c) = 0$. Q.E.D.
42.2 COROLLARY. Let $\Omega \subseteq \mathbf{R}^p$, and let $f : \Omega \to \mathbf{R}$. If an interior point
$c$ of $\Omega$ is a point of relative extremum of $f$, and if the derivative $Df(c)$ exists,
then $Df(c) = 0$.
PROOF. It follows from Corollary 39.7 that each of the partial derivatives
$D_j f(c)$, $j = 1, \ldots, p$, exists and that if $u = (u_1, \ldots, u_p) \in \mathbf{R}^p$, then
$$Df(c)(u) = \sum_{j=1}^{p} u_j D_j f(c).$$
By the preceding theorem, $D_j f(c) = 0$ for $j = 1, \ldots, p$, whence $Df(c)(u) = 0$
for all $u \in \mathbf{R}^p$. Q.E.D.

It follows that if $\Omega \subseteq \mathbf{R}^p$, and if $f : \Omega \to \mathbf{R}$ has a relative extremum at
$c \in \Omega$ and if $Df(c)$ exists, then
(42.1) $D_1 f(c) = 0, \ \ldots, \ D_p f(c) = 0.$
An interior point $c$ at which $Df(c) = 0$ is called a critical point of $f$. We
infer that if $\Omega$ is an open set in $\mathbf{R}^p$ on which $f$ is differentiable, then the set
of critical points of $f$ will contain all of the relative extreme points of $f$. Of
course, this set of critical points may also contain points at which $f$ does not
have a relative extremum. (In addition, the function $f$ may have a relative
extremum at an interior point $c$ of $\Omega$ at which the derivative $Df(c)$ does not
exist, or $f$ may have a relative extremum at a point $c \in \Omega$ which is not an
interior point of $\Omega$; in either case, the point $c$ will not be a critical point of $f$.)
42.3 EXAMPLES. (a) Let $f_1(x) = x^3$ for $x \in [-1, 1]$. Then $Df_1(0) = 0$;
however, $f_1$ does not have an extremum at $x = 0$. On the other hand, $f_1$
does have strict extrema at the points $\pm 1$ (which are not interior points of
the domain and are not critical points).
(b) Let $f_2(x) = |x|$ for $x \in [-1, 1]$. Then $Df_2(0)$ does not exist; however,
$f_2$ has a relative strict minimum at the interior point 0. On the other hand,
$f_2$ does have relative strict extrema at the points $\pm 1$.
(c) Let $f_3 : \mathbf{R}^2 \to \mathbf{R}$ be defined by $f_3(x, y) = xy$. Then $Df_3(0, 0) = 0$ so
that the origin $(0, 0)$ is a critical point of $f_3$; however, it is not a relative
extremum of $f_3$ since
$$f_3(0, 0) < f_3(x, y) \ \text{ for } xy > 0, \qquad f_3(0, 0) > f_3(x, y) \ \text{ for } xy < 0.$$
We say that the origin $(0, 0)$ is a saddle point of $f_3$, meaning that every
neighborhood of $(0, 0)$ contains points at which $f_3$ is strictly greater than
$f_3(0, 0)$ and also contains points at which $f_3$ is strictly less than $f_3(0, 0)$.
(d) Let $f_4 : \mathbf{R}^2 \to \mathbf{R}$ be defined by $f_4(x, y) = (y - x^2)(y - 2x^2)$. Show that
$Df_4(0, 0) = 0$ and that the restriction of $f_4$ to every line passing through
$(0, 0)$ has a relative minimum at the origin. However, show that in
every neighborhood of $(0, 0)$ there are points where $f_4$ is strictly positive
and those where $f_4$ is strictly negative.
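The behavior described in Example (d) can be observed numerically. The sketch below is only an illustration, not a proof: it samples $f_4$ along eight lines through the origin (the sampling radius, the number of sample points, and the helper name `min_on_punctured_segment` are choices made here, not taken from the text) and then evaluates $f_4$ at one point between the parabolas $y = x^2$ and $y = 2x^2$ and at one point above both.

```python
import math

def f4(x, y):
    return (y - x**2) * (y - 2 * x**2)

def min_on_punctured_segment(a, b, t_max=1e-3, steps=200):
    """Smallest sampled value of f4 at the points t*(a, b) with 0 < |t| <= t_max."""
    values = []
    for k in range(1, steps + 1):
        t = t_max * k / steps
        values.append(f4(t * a, t * b))
        values.append(f4(-t * a, -t * b))
    return min(values)

# Eight directions through the origin, including the x-axis and the y-axis.
directions = [(math.cos(k * math.pi / 8), math.sin(k * math.pi / 8)) for k in range(8)]
line_minima = [min_on_punctured_segment(a, b) for a, b in directions]

x = 1e-2
value_between_parabolas = f4(x, 1.5 * x**2)   # between y = x^2 and y = 2x^2
value_above_parabolas = f4(x, 3.0 * x**2)     # above both parabolas
print(line_minima, value_between_parabolas, value_above_parabolas)
```

On each sampled line the values near the origin are strictly positive, while the point between the two parabolas gives a strictly negative value; such points exist arbitrarily close to the origin, so the origin is not a relative minimum of $f_4$.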

The Second Derivative Test

In view of the examples given above, it is convenient to have conditions
which are necessary (or are sufficient) to guarantee that a critical point is an
extremum or that it is a saddle point. The next results give conditions in
terms of the second derivative of $f$ that was introduced at the end of
Section 40.
42.4 THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $f : \Omega \to \mathbf{R}$ have continuous
second partial derivatives on $\Omega$. If $c \in \Omega$ is a point of relative minimum
[respectively, maximum] of $f$, then
(42.2) $D^2 f(c)(w)^2 = \sum_{i,j=1}^{p} D_{ij} f(c)\, w_i w_j \ge 0$
[respectively, $D^2 f(c)(w)^2 \le 0$] for all $w \in \mathbf{R}^p$.
PROOF. Let $w \in \mathbf{R}^p$, $\|w\| = 1$. If $c$ is a point of relative minimum, there
exists $\delta > 0$ such that if $|t| < \delta$ then $f(c + tw) - f(c) \ge 0$. Since $\Omega$ is open,
there exists $\delta_1 > 0$ with $\delta_1 \le \delta$ such that $c + tw$ belongs to $\Omega$ for $0 \le t < \delta_1$.
By Taylor's Theorem 40.9 there exists $t_1$ with $0 \le t_1 \le t < \delta_1$ such that if
$\bar{c} = c + t_1 w$, then
$$f(c + tw) = f(c) + Df(c)(tw) + \tfrac12 D^2 f(\bar{c})(tw)^2.$$
Since $c$ is a point of relative minimum it follows from Corollary 42.2 that
$Df(c) = 0$; hence we have
$$\tfrac12 D^2 f(\bar{c})(tw)^2 \ge 0$$
for $0 \le t < \delta_1$. It follows that $D^2 f(\bar{c})(w)^2 \ge 0$. Since $\|\bar{c} - c\| = |t_1| \le |t|$, it
follows that $\bar{c} \to c$ as $t \to 0$. Since the second partial derivatives of $f$ are
continuous, then $D^2 f(c)(w)^2 \ge 0$ for all $w \in \mathbf{R}^p$ with $\|w\| = 1$, from which
the result follows. Q.E.D.
The next result is a partial converse to Theorem 42.4. However, note
that its hypothesis is somewhat stronger than the conclusion of 42.4.
42.5 THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open, let $f : \Omega \to \mathbf{R}$ have continuous
second partial derivatives on $\Omega$, and let $c \in \Omega$ be a critical point of $f$.
(a) If $D^2 f(c)(w)^2 > 0$ for all $w \in \mathbf{R}^p$, $w \ne 0$, then $f$ has a relative strict
minimum at $c$.
(b) If $D^2 f(c)(w)^2 < 0$ for all $w \in \mathbf{R}^p$, $w \ne 0$, then $f$ has a relative strict
maximum at $c$.
(c) If $D^2 f(c)(w)^2$ takes on both strictly positive and strictly negative values
for $w \in \mathbf{R}^p$, then $f$ has a saddle point at $c$.
PROOF. (a) By hypothesis $D^2 f(c)(w)^2 > 0$ for $w$ in the compact set
$\{w \in \mathbf{R}^p : \|w\| = 1\}$. Since the map $w \mapsto D^2 f(c)(w)^2$ is continuous, there
exists $m > 0$ such that
$$D^2 f(c)(w)^2 \ge m \quad \text{for } \|w\| = 1.$$
Since the second partial derivatives of $f$ are continuous on $\Omega$, there exists
$\delta > 0$ such that if $\|x - c\| < \delta$ then $x \in \Omega$ and
$$D^2 f(x)(w)^2 \ge \tfrac12 m \quad \text{for } \|w\| = 1.$$
By Taylor's Theorem 40.9, if $\|w\| = 1$ and $0 \le t < \delta$ there is a point $\bar{c}$ on the line
segment joining $c$ and $c + tw$ such that
$$f(c + tw) = f(c) + Df(c)(tw) + \tfrac12 D^2 f(\bar{c})(tw)^2.$$
Since $c$ is a critical point, it follows that if $\|w\| = 1$ and $0 < t < \delta$, then
$$f(c + tw) - f(c) = \tfrac12 t^2 D^2 f(\bar{c})(w)^2 \ge \tfrac14 m t^2 > 0.$$
Thus $f(u) > f(c)$ for $0 < \|u - c\| < \delta$, whence $f$ has a relative strict
minimum at $c$. Thus part (a) is proved and the proof of part (b) is similar.
(c) Let $w_+$, $w_-$ be unit vectors in $\mathbf{R}^p$ such that
$$D^2 f(c)(w_+)^2 > 0, \qquad D^2 f(c)(w_-)^2 < 0.$$
It follows from Taylor's Theorem that for sufficiently small $t > 0$ we have
$$f(c + t w_+) > f(c), \qquad f(c + t w_-) < f(c).$$
Thus $c$ is a saddle point of $f$. Q.E.D.
On comparing Theorems 42.4 and 42.5, one is led to make the following
conjectures: (i) if $c \in \Omega$ is a point of relative strict minimum, then
$D^2 f(c)(w)^2 > 0$ for all $w \in \mathbf{R}^p$, $w \ne 0$, (ii) if $c \in \Omega$ is a saddle point of $f$, then
$D^2 f(c)(w)^2$ takes on both strictly positive and strictly negative values, (iii) if
$D^2 f(c)(w)^2 \ge 0$ for all $w \in \mathbf{R}^p$, then $c$ is a point of relative minimum. All of
these conjectures are false, as may be seen by examples.
In order to implement Theorem 42.5 it is necessary to know whether the
function $w \mapsto D^2 f(c)(w)^2$ is of one sign. An important and well-known
result of algebra (see the book of Hoffman and Kunze cited in the
References) can be used to determine this. For each $j = 1, 2, \ldots, p$, let
$\Delta_j$ be the determinant of the (symmetric) matrix
$$\begin{bmatrix} D_{11} f(c) & \cdots & D_{1j} f(c) \\ \vdots & & \vdots \\ D_{j1} f(c) & \cdots & D_{jj} f(c) \end{bmatrix}.$$
If the numbers $\Delta_1, \Delta_2, \ldots, \Delta_p$ are all strictly positive, then $D^2 f(c)(w)^2 > 0$
for all $w \ne 0$ and $f$ has a relative strict minimum at $c$. If the numbers
$\Delta_1, \Delta_2, \ldots, \Delta_p$ are alternately strictly negative and strictly positive, then
$D^2 f(c)(w)^2 < 0$ for all $w \ne 0$ and $f$ has a relative strict maximum at $c$. In
other cases there can be extreme or saddle points.
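The determinant criterion just described is easy to mechanize. The following is a minimal sketch, assuming the matrix of second partial derivatives at the critical point is supplied as a list of rows of exact integers; the helper names `det`, `leading_minors`, and `classify_by_minors`, and the sample matrix, are illustrative choices and not part of the text.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

def leading_minors(h):
    """Delta_1, ..., Delta_p: determinants of the upper-left j-by-j blocks of h."""
    return [det([row[:j] for row in h[:j]]) for j in range(1, len(h) + 1)]

def classify_by_minors(h):
    minors = leading_minors(h)
    if all(d > 0 for d in minors):
        return "relative strict minimum"
    # Alternately strictly negative and strictly positive, starting with Delta_1 < 0.
    if all((d < 0) if j % 2 == 1 else (d > 0) for j, d in enumerate(minors, start=1)):
        return "relative strict maximum"
    return "not decided by this criterion"

# Second partial derivatives of x^2 + y^2 + z^2 + xy at its critical point (0, 0, 0).
hessian = [[2, 1, 0], [1, 2, 0], [0, 0, 2]]
negated = [[-entry for entry in row] for row in hessian]
print(leading_minors(hessian), classify_by_minors(hessian))
print(leading_minors(negated), classify_by_minors(negated))
```

The third return value is deliberately noncommittal: when the minors fit neither pattern, the point may be an extremum or a saddle point, and further examination is required.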
In the important special case $p = 2$ a less elaborate formulation is more
convenient and a bit more information can be derived. Here we need to
examine the quadratic function
$$Q = Au^2 + 2Buv + Cv^2.$$
If $\Delta = AC - B^2 > 0$, then $A \ne 0$ (and $C \ne 0$) and we can complete the square
and write
$$Q = \frac{1}{A} \left[ (Au + Bv)^2 + (AC - B^2) v^2 \right].$$
Hence the sign of $Q$ is the same as that of $A$ (or $C$). On the other hand, if
$\Delta < 0$, then $Q$ has both strictly positive and strictly negative values. This is
obvious from the above equation if $A \ne 0$ and is also readily established if
$A = 0$.
We collect these remarks in a formal statement.
42.6 COROLLARY. Let $\Omega \subseteq \mathbf{R}^2$ be open, let $f : \Omega \to \mathbf{R}$ have continuous
second partial derivatives on $\Omega$, let $c \in \Omega$ be a critical point of $f$, and let
(42.3) $\Delta = D_{11} f(c)\, D_{22} f(c) - [D_{12} f(c)]^2.$
(a) If $\Delta > 0$ and if $D_{11} f(c) > 0$, then $f$ has a relative strict minimum at $c$.
(b) If $\Delta > 0$ and if $D_{11} f(c) < 0$, then $f$ has a relative strict maximum at $c$.
(c) If $\Delta < 0$, then $f$ has a saddle point at $c$.
Some information concerning the case where $\Delta = 0$ will be given in the
exercises.
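Corollary 42.6 amounts to a three-way case distinction, which the following minimal sketch spells out. The function name `second_derivative_test` and the sample inputs are assumptions made for illustration; the last two inputs are the second partial derivatives at the origin of the functions $f_3$ and $f_4$ of Examples 42.3(c) and (d).

```python
def second_derivative_test(d11, d12, d22):
    """Apply Corollary 42.6 to the second partial derivatives at a critical point."""
    delta = d11 * d22 - d12 ** 2
    if delta > 0 and d11 > 0:
        return "relative strict minimum"
    if delta > 0 and d11 < 0:
        return "relative strict maximum"
    if delta < 0:
        return "saddle point"
    return "no conclusion (delta = 0)"

# x^2 + y^2 at the origin: D11 = 2, D12 = 0, D22 = 2.
paraboloid = second_derivative_test(2, 0, 2)
# f3(x, y) = xy at the origin: D11 = 0, D12 = 1, D22 = 0.
saddle = second_derivative_test(0, 1, 0)
# f4(x, y) = (y - x^2)(y - 2x^2) at the origin: D11 = 0, D12 = 0, D22 = 2.
degenerate = second_derivative_test(0, 0, 2)
print(paraboloid, saddle, degenerate)
```

For $f_4$ the test is silent because $\Delta = 0$, which is consistent with the fact that $f_4$ takes both signs near the origin although the second derivative alone cannot show it.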

Extremum Problems with Constraints

Until now we have been discussing the case where the extrema of the
function $f : \Omega \to \mathbf{R}$ belong to the interior of its domain $\Omega \subseteq \mathbf{R}^p$. None of our
remarks applies to the location of the extrema on the boundary.
However, if the function is defined on the boundary of $\Omega$ and if this
boundary of $\Omega$ can be parametrized by a function $\varphi$, then the extremum
problem is reduced to an examination of the extrema of the composition
$f \circ \varphi$.
There is a related problem which leads to an interesting and elegant
procedure. Suppose that $S$ is a "surface" contained in the domain $\Omega$ of the
real-valued function $f$. It is often desired to find the values of $f$ that are
maximum or minimum among all those attained on $S$. For example, if
$\Omega = \mathbf{R}^p$ and $f(x) = \|x\|$, then the problem we have posed is concerned with
finding the points on the surface $S$ which are closest to, or farthest from,
the origin. If the surface $S$ is given parametrically, then we can treat this
problem by considering the composition of $f$ with the parametric
representation of $S$. However, it frequently is not convenient to express $S$ in this
fashion and another procedure is often more desirable.
Suppose $S$ can be given as the points $x$ in $\Omega$ satisfying a relation of the
form
$$g(x) = 0,$$
for a function $g$ defined on $\Omega$ to $\mathbf{R}$. We are attempting to find the relative
extreme values of $f$ for those points $x$ in $\Omega$ satisfying the constraint (or side
condition) $g(x) = 0$. If we assume that $f$ and $g$ are in Class $C^1(\Omega)$ and that
$Dg(c) \ne 0$, then a necessary condition that $c$ be an extreme point of $f$
relative to points $x$ satisfying $g(x) = 0$, is that the derivative $Df(c)$ is a
multiple of $Dg(c)$. In terms of partial derivatives, this condition is that
there exists a real number $\lambda$ such that
(42.4) $D_1 f(c) = \lambda D_1 g(c), \ \ldots, \ D_p f(c) = \lambda D_p g(c).$
In practice we wish to determine the $p$ coordinates of the point $c$ satisfying
this necessary condition. However, the real number $\lambda$, which is usually
called the Lagrange multiplier, is not known either. The $p$ equations given
above, together with the equation
$$g(c) = 0$$
are then solved for the $p + 1$ unknown quantities, of which the coordinates
of $c$ are of primary interest.
42.7 LAGRANGE'S THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and suppose that $f$
and $g$ are real-valued functions in Class $C^1(\Omega)$. Suppose $c \in \Omega$ is such that
$g(c) = 0$ and that there exists a neighborhood $U$ of $c$ such that
$$f(x) \le f(c) \quad [\text{or } f(x) \ge f(c)]$$
for all points $x \in U$ which satisfy $g(x) = 0$. Then there exist real numbers $\mu$,
$\lambda$, not both zero, such that
(42.5) $\mu Df(c) = \lambda Dg(c).$
Moreover, if $Dg(c) \ne 0$, we can take $\mu = 1$.
PROOF. Let $F : U \to \mathbf{R}^2$ be defined by
$$F(x) = (f(x), g(x)) \quad \text{for } x \in U,$$
so that $F$ belongs to Class $C^1(U)$ and
$$DF(x)(v) = (Df(x)(v), Dg(x)(v)), \quad x \in U, \ v \in \mathbf{R}^p.$$
Moreover, a point $x \in U$ satisfies the constraint $g(x) = 0$ if and only if
$F(x) = (f(x), 0)$.
If $f(x) \le f(c)$ for all $x \in U$ satisfying $g(x) = 0$, then the points $(r, 0)$ with
$f(c) < r$ are not in the image $F(U)$; hence $DF(c)$ is not a surjection of $\mathbf{R}^p$
onto $\mathbf{R}^2$ (for otherwise, by the Surjective Mapping Theorem, $F(U)$ would contain a
neighborhood of $F(c) = (f(c), 0)$). But since the range of the linear map $DF(c)$ is a subspace of $\mathbf{R}^2$
and does not coincide with $\mathbf{R}^2$, the range of $DF(c)$ is contained in some line in
$\mathbf{R}^2$ passing through $(0, 0)$. Therefore, there exists a point $(\lambda, \mu) \ne (0, 0)$ such
that the range of $DF(c)$ is contained in the line through $(0, 0)$ and $(\lambda, \mu)$.
Hence we have
(42.6) $\mu Df(c)(v) = \lambda Dg(c)(v) \quad \text{for all } v \in \mathbf{R}^p,$
whence equation (42.5) follows.
Finally, suppose that $Dg(c) \ne 0$. If $\mu = 0$, then equation (42.5) implies
that $\lambda = 0$, which contradicts the fact that $(\mu, \lambda) \ne (0, 0)$. Therefore in this
case we must have $\mu \ne 0$ and can divide by $\mu$ and replace $\lambda/\mu$ by $\lambda$. Q.E.D.
Since $U \subseteq \mathbf{R}^p$, equation (42.6) with $v = e_1, \ldots, e_p$ yields the system of $p$
equations:
$$\mu D_1 f(c) = \lambda D_1 g(c), \ \ldots, \ \mu D_p f(c) = \lambda D_p g(c).$$
If not all of the $D_i g(c)$, $i = 1, \ldots, p$, vanish, then we can take $\mu = 1$ to
obtain the system (42.4).
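The need for the multiplier $\mu$ when $Dg(c) = 0$ can be seen in a simple example, which is our own illustration and does not come from the text: take $f(x, y) = x$ with the constraint $g(x, y) = y^2 - x^3 = 0$. The constraint set is the curve $(t^2, t^3)$, on which $f$ has a minimum at the origin, yet $Dg(0, 0) = 0$ while $Df(0, 0) \ne 0$, so (42.5) can hold only with $\mu = 0$. A minimal numerical sketch:

```python
def f(x, y):
    return x

def g(x, y):
    return y**2 - x**3

def grad_f(x, y):
    return (1.0, 0.0)

def grad_g(x, y):
    return (-3.0 * x**2, 2.0 * y)

# Sample the constraint curve (t^2, t^3) for t in [-1, 1].
curve = [((k / 100) ** 2, (k / 100) ** 3) for k in range(-100, 101)]
on_constraint = all(abs(g(x, y)) < 1e-12 for x, y in curve)
minimum_at_origin = all(f(x, y) >= f(0.0, 0.0) for x, y in curve)

# With mu = 0 and lambda = 1 the relation mu*Df(c) = lambda*Dg(c) holds at c = (0, 0).
mu, lam = 0.0, 1.0
residual = tuple(mu * df - lam * dg
                 for df, dg in zip(grad_f(0.0, 0.0), grad_g(0.0, 0.0)))
print(on_constraint, minimum_at_origin, residual)
```

With $\mu = 1$ the first component of (42.5) at the origin would read $1 = \lambda \cdot 0$, which has no solution; this is why the theorem allows $\mu = 0$ unless $Dg(c) \ne 0$.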
It should be emphasized that Lagrange's Theorem yields a necessary condition
only, and that the points obtained by solving the equations (which is often difficult
to do!) may be relative maxima, relative minima, or neither. However, Corollary
42.13 below often can be used to test for relative maxima or minima. Furthermore,
in many applications the determination of whether the points are actually
extrema can be based on geometrical or physical considerations.

42.8 EXAMPLES. (a) We wish to find a point on the plane
$\{(x, y, z) : 2x + 3y - z = 5\}$ in $\mathbf{R}^3$ which is nearest to the origin. To attack
this problem, we shall minimize the function which gives the square of
the distance to the origin:
$$f(x, y, z) = x^2 + y^2 + z^2,$$
under the constraint
$$g(x, y, z) = 2x + 3y - z - 5 = 0.$$
Since $Dg(c) \ne 0$ for all $c \in \mathbf{R}^3$, Lagrange's Theorem leads to the system
$$2x = 2\lambda, \qquad 2y = 3\lambda, \qquad 2z = -\lambda, \qquad 2x + 3y - z - 5 = 0.$$
Hence, on eliminating $x$, $y$, $z$, we get
$$2\lambda + 3(\tfrac32 \lambda) - (-\tfrac12 \lambda) - 5 = 0,$$
or $14\lambda = 4\lambda + 9\lambda + \lambda = 10$, whence $\lambda = 5/7$. We are led to the single
point $(5/7, 15/14, -5/14)$. From geometrical considerations we deduce
that this is the point on the plane nearest $(0, 0, 0)$.
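The arithmetic of Example (a) can be checked exactly. The sketch below assumes a plane of the form $a \cdot x = d$ with integer data; the helper name `nearest_point_on_plane` is hypothetical. It uses the same Lagrange system, $2x_i = \lambda a_i$ together with $a \cdot x = d$, which gives $\lambda = 2d / \|a\|^2$.

```python
from fractions import Fraction

def nearest_point_on_plane(a, d):
    """Solve 2*x_i = lam*a_i together with a.x = d, using exact rational arithmetic."""
    norm_sq = sum(Fraction(ai) ** 2 for ai in a)
    lam = Fraction(2 * d) / norm_sq             # from a.x = lam*|a|^2/2 = d
    point = tuple(lam * ai / 2 for ai in a)     # x_i = lam*a_i/2
    return lam, point

# The plane 2x + 3y - z = 5 of the example.
lam, point = nearest_point_on_plane((2, 3, -1), 5)
constraint_value = 2 * point[0] + 3 * point[1] - point[2]
print(lam, point, constraint_value)
```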
(b) Find the dimensions of the rectangular box, open at the top, with
maximum volume and given surface area $A$. Let $x$, $y$, $z$ be the dimensions
of the box, with $z$ as the height. Then we wish to maximize the function
$$V(x, y, z) = xyz$$
subject to the constraint
$$g(x, y, z) = xy + 2xz + 2yz - A = 0.$$
Since the desired point will have strictly positive coordinates, Lagrange's
Theorem leads to the system
$$yz = \lambda(y + 2z), \qquad xz = \lambda(x + 2z), \qquad xy = \lambda(2x + 2y), \qquad xy + 2xz + 2yz - A = 0.$$
If we multiply the first three equations by $x$, $y$, and $z$, respectively, equate,
and divide by $\lambda$ (why is $\lambda \ne 0$?), we are led to
$$xy + 2xz = xy + 2yz = 2xz + 2yz.$$
The first equality implies $x = y$, and the second implies $y = 2z$. Hence the
ratio of the sides is $2 : 2 : 1$ and it follows from the last equation that
$4z^2 + 4z^2 + 4z^2 = A$, which implies that $z = \tfrac12 (A/3)^{1/2}$. Therefore the volume
of this box is $\tfrac12 (A/3)^{3/2}$.
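The result of Example (b) can be sanity-checked numerically. In the sketch below the surface area $A = 12$ and the helper names `optimal_open_box` and `volume_with_base` are assumptions chosen for illustration; the second helper solves the constraint $xy + 2xz + 2yz = A$ for the height $z$ so that nearby admissible boxes can be compared with the claimed optimum.

```python
import math

def optimal_open_box(area):
    """Dimensions x = y = 2z with z = (1/2)*(area/3)^(1/2), as in the example."""
    z = 0.5 * math.sqrt(area / 3.0)
    return 2.0 * z, 2.0 * z, z

def volume_with_base(x, y, area):
    """Volume of the open box with base x by y whose height satisfies the constraint."""
    z = (area - x * y) / (2.0 * (x + y))
    return x * y * z

area = 12.0
x, y, z = optimal_open_box(area)
best_volume = x * y * z
nearby_volumes = [volume_with_base(x + dx, y + dy, area)
                  for dx, dy in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (-0.1, 0.1), (0.1, 0.1)]]
print((x, y, z), best_volume, nearby_volumes)
```

For $A = 12$ the formula gives the box $2 \times 2 \times 1$ with volume 4, and each of the sampled neighboring boxes has a smaller volume. This is a spot check of a few points, not a proof of maximality.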

Frequently there is more than one constraint; in this case the following
result is useful.
42.9 THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and suppose that $f$ and $g_1, \ldots, g_k$
are real-valued functions in $C^1(\Omega)$. Suppose that $c \in \Omega$ satisfies the
constraints
$$g_1(x) = 0, \ \ldots, \ g_k(x) = 0,$$
and that there exists an open neighborhood $U$ of $c$ such that $f(x) \le f(c)$ [or
$f(x) \ge f(c)$] for all $x \in U$ satisfying these constraints. Then there exist real
numbers $\mu, \lambda_1, \ldots, \lambda_k$ not all zero such that
(42.7) $\mu Df(c) = \lambda_1 Dg_1(c) + \cdots + \lambda_k Dg_k(c).$
PROOF. Let $F : U \to \mathbf{R}^{k+1}$ be defined by
$$F(x) = (f(x), g_1(x), \ldots, g_k(x)) \quad \text{for } x \in U,$$
and argue as in the proof of Theorem 42.7. Q.E.D.
42.10 COROLLARY. In addition to the hypotheses of Theorem 42.9,
suppose that the rank of the matrix
(42.8) $\begin{bmatrix} D_1 g_1(c) & \cdots & D_1 g_k(c) \\ \vdots & & \vdots \\ D_p g_1(c) & \cdots & D_p g_k(c) \end{bmatrix}$
is equal to $k$ ($\le p$). Then there are real numbers $\lambda_1, \ldots, \lambda_k$ such
that
(42.9) $D_1 f(c) = \lambda_1 D_1 g_1(c) + \cdots + \lambda_k D_1 g_k(c), \ \ldots, \ D_p f(c) = \lambda_1 D_p g_1(c) + \cdots + \lambda_k D_p g_k(c).$
PROOF. If we apply the formula (42.7) to $e_1, \ldots, e_p \in \mathbf{R}^p$, we obtain a
system of equations with the right-hand side of (42.9) and the left-hand
side of (42.9) multiplied by $\mu$. If $\mu = 0$, then the assumption that the rank
equals $k$ implies that $\lambda_1 = \cdots = \lambda_k = 0$, contrary to the fact that
$\mu, \lambda_1, \ldots, \lambda_k$ are not all zero. Hence
$\mu \ne 0$ and we can normalize this system to obtain (42.9). Q.E.D.
42.11 EXAMPLE. Find the points on the intersection of the cylinder
$\{(x, y, z) : x^2 + y^2 = 4\}$ and the plane $\{(x, y, z) : 6x + 3y + 2z = 6\}$ which are nearest to
the origin and those which are farthest from the origin.
We shall search for relative extrema of the function
$$f(x, y, z) = x^2 + y^2 + z^2$$
subject to the constraints
$$g_1(x, y, z) = x^2 + y^2 - 4 = 0, \qquad g_2(x, y, z) = 6x + 3y + 2z - 6 = 0.$$
The matrix corresponding to (42.8) in this case is
$$\begin{bmatrix} 2x & 6 \\ 2y & 3 \\ 0 & 2 \end{bmatrix},$$
which has rank 2 except at the point $(x, y) = (0, 0)$, which does not satisfy the
constraints. Hence we can apply the corollary to obtain the system

$$\begin{aligned} 2x &= \lambda_1(2x) + \lambda_2(6), \\ 2y &= \lambda_1(2y) + \lambda_2(3), \\ 2z &= \lambda_2(2), \\ x^2 + y^2 &= 4, \\ 6x + 3y + 2z &= 6, \end{aligned}$$

of five equations in five variables. The third equation gives $\lambda_2 = z$, so we can
eliminate $\lambda_2$ from the first two equations. To eliminate $\lambda_1$, we multiply the
resulting first equation by $y$ and the second by $x$ and subtract, to get

$$0 = 6yz - 3xz = 3z(2y - x).$$

It follows that either $z = 0$ or $x = 2y$.
If $z = 0$, the fifth equation yields $2x + y = 2$. When combined with the fourth
equation this gives
$$x^2 + (2 - 2x)^2 = x^2 + 4 - 8x + 4x^2 = 4,$$
whence $5x^2 - 8x = x(5x - 8) = 0$, and hence $x = 0$ or $x = 8/5$. This case leads to the
two points $(0, 2, 0)$ and $(8/5, -6/5, 0)$, each of which has distance 2 from the origin.
On the other hand, if $x = 2y$, the fourth equation yields $5y^2 = 4$ so that $y = 2/\sqrt{5}$
(and $x = 4/\sqrt{5}$) or $y = -2/\sqrt{5}$ (and $x = -4/\sqrt{5}$). Substituting in the fifth equation we
get $z = 3(1 - \sqrt{5})$ and $z = 3(1 + \sqrt{5})$, respectively. Therefore, this case leads to the
two points $(4/\sqrt{5}, 2/\sqrt{5}, 3(1 - \sqrt{5}))$ and $(-4/\sqrt{5}, -2/\sqrt{5}, 3(1 + \sqrt{5}))$. The squares of
the distances between these points and the origin are seen to be $58 - 18\sqrt{5}$ and
$58 + 18\sqrt{5}$, respectively.
We deduce that both of the points $(0, 2, 0)$ and $(8/5, -6/5, 0)$ minimize the
distance between the origin and this intersection, and that the point $(-4/\sqrt{5}, -2/\sqrt{5},
3(1 + \sqrt{5}))$ maximizes this distance. From geometrical considerations we also see
that the point $(4/\sqrt{5}, 2/\sqrt{5}, 3(1 - \sqrt{5}))$ gives a relative maximum among points of this
intersection. (The reader should draw a diagram to help him visualize this
situation.)
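
The conclusions of Example 42.11 can be checked numerically. The following Python sketch is not part of the text; the names `squared_distance`, `scan_extremes`, `lagrange_residual` and `CANDIDATES` are introduced here only for illustration. It parametrizes the intersection curve by the angle on the cylinder, samples the squared distance along it, and also substitutes the four candidate points into the five-equation system (taking $\lambda_2 = z$ and solving one of the first two equations for $\lambda_1$).

```python
import math

def squared_distance(theta):
    """Value of f = x^2 + y^2 + z^2 at the curve point with cylinder angle theta."""
    x, y = 2.0 * math.cos(theta), 2.0 * math.sin(theta)  # on x^2 + y^2 = 4
    z = (6.0 - 6.0 * x - 3.0 * y) / 2.0                  # on 6x + 3y + 2z = 6
    return x * x + y * y + z * z

def scan_extremes(samples=20000):
    """Smallest and largest sampled value of f along the intersection curve."""
    values = [squared_distance(2.0 * math.pi * k / samples) for k in range(samples)]
    return min(values), max(values)

def lagrange_residual(x, y, z):
    """Largest absolute residual of the five equations of the example."""
    lam2 = z  # third equation
    # Solve the first (or, if x = 0, the second) equation for lambda_1.
    lam1 = 1.0 - 3.0 * z / x if x != 0 else 1.0 - 3.0 * z / (2.0 * y)
    equations = [
        2 * x - lam1 * 2 * x - lam2 * 6,
        2 * y - lam1 * 2 * y - lam2 * 3,
        2 * z - lam2 * 2,
        x * x + y * y - 4,
        6 * x + 3 * y + 2 * z - 6,
    ]
    return max(abs(e) for e in equations)

SQRT5 = math.sqrt(5.0)
CANDIDATES = [
    (0.0, 2.0, 0.0),
    (8 / 5, -6 / 5, 0.0),
    (4 / SQRT5, 2 / SQRT5, 3 * (1 - SQRT5)),
    (-4 / SQRT5, -2 / SQRT5, 3 * (1 + SQRT5)),
]

if __name__ == "__main__":
    for point in CANDIDATES:
        print(point, lagrange_residual(*point), sum(t * t for t in point))
    print(scan_extremes())
```

The sampled minimum and maximum should agree, up to the sampling error, with the values $4$ and $58 + 18\sqrt{5}$ found above.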

Inequality Constraints

In recent years, extremum problems involving constraints which are
inequalities rather than equalities have become increasingly important.
Thus we may wish to find a relative extremum of a function $f : \Omega \to \mathbf{R}$
among all points in $\Omega \subseteq \mathbf{R}^p$ satisfying the constraints
$$h_1(x) \ge 0, \quad \dots, \quad h_k(x) \ge 0.$$
We will see that such problems can also be handled by Lagrange's method.
Sometimes an extremum problem may involve both equalities and inequalities,
but since the equality $g(x) = 0$ is equivalent to the inequality $-(g(x))^2 \ge 0$, such
problems can always be reduced to one involving only inequality constraints.

42.12 THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and suppose that $f$ and
$h_1, \dots, h_k$ are real-valued functions in $C^1(\Omega)$. Suppose that $c \in \Omega$ satisfies
the inequality constraints
$$h_1(x) \ge 0, \quad \dots, \quad h_k(x) \ge 0,$$
and that there exists an open neighborhood $U$ of $c$ such that $f(x) \le f(c)$ [or
$f(x) \ge f(c)$] for all $x \in U$ satisfying these constraints. Then there exist real
numbers $\mu, \lambda_1, \dots, \lambda_k$ not all zero such that

(42.10) $$\mu Df(c) = \lambda_1 Dh_1(c) + \cdots + \lambda_k Dh_k(c).$$

Furthermore, if $h_i(c) > 0$ for some $i$, then $\lambda_i = 0$.
PROOF. Let $F : U \to \mathbf{R}^{k+1}$ be defined by
$$F(x) = (f(x), h_1(x), \dots, h_k(x)) \quad \text{for } x \in U.$$
If $c$ is a point in $U$ where the constraints are satisfied and where $f$ is either
maximized or minimized, then $DF(c)$ cannot be surjective and so (42.10)
must hold.
If $h_1(c) = 0, \dots, h_r(c) = 0$, but $h_{r+1}(c) > 0, \dots, h_k(c) > 0$, then let $U_1 \subseteq U$
be an open neighborhood of $c$ on which $h_{r+1}, \dots, h_k$ are strictly positive
and apply the theorem to the constraints $h_1(x) \ge 0, \dots, h_r(x) \ge 0$. Q.E.D.
42.13 COROLLARY. In addition to the hypotheses of Theorem 42.12,
suppose that the rank of the matrix

(42.11) $$\begin{bmatrix} D_1 h_1(c) & \cdots & D_1 h_r(c) \\ \vdots & & \vdots \\ D_p h_1(c) & \cdots & D_p h_r(c) \end{bmatrix},$$

corresponding to those $h_i$ for which $h_i(c) = 0$, is equal to $r$. Then we may take
$\mu = 1$ in (42.10). In addition, if $f(x) \le f(c)$ [respectively, $f(x) \ge f(c)$] for
all $x \in U$ satisfying the constraints and if we take $\mu = 1$ in (42.10), then
$\lambda_i \le 0$ [respectively, $\lambda_i \ge 0$] for $i = 1, \dots, r$.

PROOF. The proof that we can take $\mu = 1$ in (42.10) is similar to that of
Corollary 42.10. Suppose, then, that $\mu = 1$ and that $f(x) \le f(c)$ for all
$x \in U$ satisfying the constraints. Since the rank of the matrix (42.11) is
$r \le k$, then if $1 \le j \le r$ there exists a vector $v_j \in \mathbf{R}^p$ such that

$$Dh_i(c)(v_j) = \delta_{ij}.$$

Therefore, if $t > 0$ is sufficiently small, then there exists a point $c_t$ on the
segment joining $c$ and $c + tv_j$ such that
$$0 \ge f(c + tv_j) - f(c) = Df(c_t)(tv_j) = t\,Df(c_t)(v_j).$$
Consequently we have

$$0 \ge \lim_{t \to 0+} \frac{f(c + tv_j) - f(c)}{t} = Df(c)(v_j) = \sum_{i=1}^{r} \lambda_i Dh_i(c)(v_j) = \lambda_j.$$

Hence $\lambda_j \le 0$ for $j = 1, \dots, r$. Q.E.D.

For an elementary, but very different, proof of the theorem of Lagrange
involving inequality constraints, see the article of E. J. McShane† listed in
the References.

Exercises

42.A. Find the critical points of the following functions and determine the
nature of these points.
(a) f(x, y)=x?+4xy,
(b) f(x, y)=x*+2y*+32x—-—y+17,
(c) f(x, y) =x? +4y?— 12y?—36y,
(d) f(x, y)=x*—4xy,
(e) f(x, y)=x?+4xy + 2y?—2y,
(f) f(x, y)=x?+3y*—4y?— 12y’,
42.B. Let $\Omega \subseteq \mathbf{R}^p$ be open, let $f : \Omega \to \mathbf{R}$ have continuous second partial
derivatives on $\Omega$, let $c \in \Omega$ be a critical point of $f$, and let $\delta > 0$.
(a) Show that if $D^2 f(x)(w)^2 \ge 0$ for all $0 < \|x - c\| < \delta$ and $w \in \mathbf{R}^p$, then $c$ is a point
of relative minimum of $f$.
(b) Show that if $D^2 f(x)(w)^2 > 0$ for all $0 < \|x - c\| < \delta$ and $w \in \mathbf{R}^p$, $w \ne 0$, then $c$ is
a point of relative strict minimum of $f$.
42.C. Let $\Omega \subseteq \mathbf{R}^2$ be open, let $f : \Omega \to \mathbf{R}$ have continuous second partial
derivatives on $\Omega$, let $c \in \Omega$ be a critical point of $f$, and let

$$\Delta(x) = D_{11} f(x)\, D_{22} f(x) - (D_{12} f(x))^2$$

for $x \in \Omega$. Suppose that for some $\delta > 0$ we have $\Delta(x) \ge 0$ for all $\|x - c\| < \delta$.
(a) If $D_{11} f(x) > 0$ (or if $D_{22} f(x) > 0$) for all $x$ such that $0 < \|x - c\| < \delta$, show that $c$
is a point of relative minimum of $f$.
(b) If $D_{11} f(x) < 0$ (or if $D_{22} f(x) < 0$) for all $x$ such that $0 < \|x - c\| < \delta$, show that $c$
is a point of relative maximum of $f$.
42.D. Let $f : \mathbf{R}^p \to \mathbf{R}$ be differentiable on $\mathbf{R}^p$ and $f(x) = 0$ for all $x \in \mathbf{R}^p$ with
$\|x\| = 1$. Show that there exists a point $c \in \mathbf{R}^p$ with $\|c\| < 1$ such that $Df(c) = 0$.
(This is a version of Rolle's Theorem in $\mathbf{R}^p$.)
42.E. Use the Surjective Mapping Theorem 41.6 to establish Corollary 42.2.
42.F. Show that each of the following functions has a critical point at the origin.
Find which have relative extrema and which have saddle points at the origin.
(a) f(x, y)= xy’, (b) f(x, y)=x?—y’,
(c) fx y)=x?-y?, (d) f(x, yy=x*—x?y*+y4,
(e) f(x, yy=x?y —xy?, (f) f(x, y)=x*ty*
42.G. Show that the function f(x, y)=2x+4y—x’y* has a critical point but no
relative extreme points.
42.H. Study the behavior of the function $f(x, y) = x^3 - 3xy^2$ in a neighborhood of
the origin. The graph of this function is sometimes called a "monkey saddle."
Why?

† E. J. McShane (1904- ) received his doctor's degree from the University of Chicago.
He has long been associated with the University of Virginia and is widely known for his
contributions to integration theory, the calculus of variations, optimal control theory,
and exterior ballistics.

42.I. Find the minimum distance from the point $(2, 1, -3)$ to the plane
$2x + y - 2z = 4$.
42.J. Find the dimensions of the rectangular box, open at the top, with given
volume and minimum surface area.
42.K. Find the minimum distance between the lines $L_1 = \{(x, y, z) : x = 2 - t,
y = 3 + t, z = 1 - 2t\}$ and $L_2 = \{(x, y, z) : x = 1 - s, y = 2 - s, z = 3 + s\}$.
42.L. Give examples to show that each of the conjectures stated after Theorem
42.5 is false.
42.M. Suppose we are given $n$ points $(x_j, y_j)$, $j = 1, \dots, n$, in $\mathbf{R}^2$ and wish to find
the affine function $F : \mathbf{R} \to \mathbf{R}$ given by $F(x) = Ax + B$, such that the quantity

$$\sum_{j=1}^{n} (F(x_j) - y_j)^2$$

is minimized. Show that this leads to the equations

$$A \sum_{j=1}^{n} x_j^2 + B \sum_{j=1}^{n} x_j = \sum_{j=1}^{n} x_j y_j, \qquad A \sum_{j=1}^{n} x_j + nB = \sum_{j=1}^{n} y_j,$$

for the numbers $A$, $B$. [This function $F$ is said to be the affine function which "best
fits the $n$ points in the sense of least squares."]
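42.N. Let $f : [0, 1] \to \mathbf{R}$ be continuous on $[0, 1]$. We wish to choose real
numbers $A$, $B$, $C$ in such a way as to minimize the quantity

$$\int_0^1 [f(x) - (Ax^2 + Bx + C)]^2 \, dx.$$

Show that we should choose $A$, $B$, $C$ to satisfy the system

$$\begin{aligned} \tfrac{1}{5}A + \tfrac{1}{4}B + \tfrac{1}{3}C &= \int_0^1 x^2 f(x)\, dx, \\ \tfrac{1}{4}A + \tfrac{1}{3}B + \tfrac{1}{2}C &= \int_0^1 x f(x)\, dx, \\ \tfrac{1}{3}A + \tfrac{1}{2}B + C &= \int_0^1 f(x)\, dx. \end{aligned}$$

[The resulting function $x \mapsto Ax^2 + Bx + C$ is said to be the quadratic function which
"best fits $f$ on $[0, 1]$ in the sense of least squares."]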
42.O. Use Lagrange's Theorem to locate points on the curve $y = x^3 + x - 2$ where
the function $f(x, y) = x - y$ may have a relative extremum. Then sketch the curve
and the level curves for $f$ to show that the point(s) located are not point(s) of
relative extrema for $f$.
42.P. Let $f : \mathbf{R}^2 \to \mathbf{R}$ be the quadratic function $f(x, y) = ax^2 + 2bxy + cy^2$ for
$(x, y) \in \mathbf{R}^2$. We wish to find the relative extrema of $f$ on the unit circle
$\{(x, y) : x^2 + y^2 = 1\}$. Use Lagrange's Theorem to show that the points $(x_0, y_0)$ at which these
relative extrema are taken must satisfy the system
$$(a - \lambda)x_0 + b y_0 = 0, \qquad b x_0 + (c - \lambda) y_0 = 0,$$
where the Lagrange multiplier $\lambda$ is a root of the equation

$$\lambda^2 - (a + c)\lambda + (ac - b^2) = 0.$$

Show that the corresponding value of the multiplier $\lambda$ is equal to the extreme value
of $f$ at such a relative extremum.
42.Q. The sum of three real numbers is 9. Find these numbers if their product is
to be maximized.
42.R. Show that the volume of the largest box that can be inscribed in the
ellipsoidal region

$$\left\{(x, y, z) : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1\right\}$$

(where $a$, $b$, $c$ are strictly positive numbers) is equal to $8abc/3\sqrt{3}$.


42.S. For each of the following functions, find the maximum and minimum
values on the given set. (When appropriate, consider the signs of the multipliers.)
(a) fgy)ax*~y', Saf y)ix?ty’ = 1}.
(b) f(x, y)=x7+2x4+y’, S = {(x, y)ix?+y?
= 1}.
() fqyax2+2xty, S={Ony), |e} Ly) <1
(d) f(x, y)=(1—x’) sin y, S={(x, y), |x| <= 1,|y| < 7}.
42.T. Let f be defined for x >0, y>0 to R by f(x, y) = 1/x+cxy+1/y.
(a) Locate the critical points of f and determine their nature.
(b) If c>0, let S={(x, y):0<x, O<y, x+y<c}. Determine the relative
maximum and minimum values of f on S.
42.U. Find the extreme values of f(x, y,z)=x°+y°+z* subject to the con-
straints x*+y?+z7=1 andx+y+z=1.
42.V. Let $f$ have continuous second partial derivatives on an open set containing
the ball $\{x \in \mathbf{R}^p : \|x\| \le r\}$ to $\mathbf{R}$, and suppose there exists $c$ with $\|c\| < r$ such that

$$M = f(c) > \sup \{f(x) : \|x\| = r\} = m.$$

Let $g$ be defined by

$$g(x) = f(x) + \frac{M - m}{4r^2}\, \|x - c\|^2.$$

Show that $g(c) = M$, while $g(x) < M$ for $\|x\| = r$. Hence $g$ attains a relative
maximum at some point $c_1$ with $\|c_1\| < r$, where we have

$$\sum_{j=1}^{p} D_{jj} f(c_1) \le -\frac{p}{2r^2}(M - m) < 0.$$


42.W. Let $\Omega \subseteq \mathbf{R}^p$ be a bounded open set, let $b(\Omega)$ be the set of boundary points
of $\Omega$ (see Definition 9.7), and let $\Omega^- = \Omega \cup b(\Omega)$ be the closure of $\Omega$. A function
$f : \Omega^- \to \mathbf{R}$ is said to be harmonic on $\Omega$ if it is continuous on $\Omega^-$ and satisfies the
Laplace equation

$$\sum_{j=1}^{p} D_{jj} f(x) = 0$$

for all $x \in \Omega$.
(a) Use the argument of the preceding exercise to show that a harmonic function
on $\Omega$ attains its supremum and infimum on $b(\Omega)$.
(b) If $f$ and $g$ are harmonic on $\Omega$ and if $f(x) = g(x)$ for $x \in b(\Omega)$, then $f(x) = g(x)$
for all $x \in \Omega$.
(c) If $f$ and $g$ are harmonic on $\Omega$ and if $f(x) = \varphi(x)$, $g(x) = \psi(x)$ for $x \in b(\Omega)$, then

$$\sup \{|f(x) - g(x)| : x \in \Omega\} = \sup \{|\varphi(x) - \psi(x)| : x \in b(\Omega)\}.$$

(This conclusion can be stated by saying that "the solutions of the Dirichlet
problem for $\Omega$ depend continuously on the boundary data.")
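42.X. Show that the maximum of $f(x_1, \dots, x_p) = (x_1 \cdots x_p)^2$ subject to the
constraint $x_1^2 + \cdots + x_p^2 = 1$ is equal to $1/p^p$. Use this to obtain the inequality:

$$|y_1 \cdots y_p|^{1/p} \le \frac{\|y\|}{\sqrt{p}} \quad \text{for } y \in \mathbf{R}^p.$$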

42.Y. Show that the geometric mean of a collection of positive real numbers
$\{a_1, \dots, a_p\}$ does not exceed their arithmetic mean; that is,

$$(a_1 \cdots a_p)^{1/p} \le \frac{1}{p}(a_1 + \cdots + a_p).$$

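42.Z. (a) Let $p > 1$, $q > 1$, $1/p + 1/q = 1$. Show that the minimum of
$f(x, y) = (1/p)x^p + (1/q)y^q$ ($x > 0$, $y > 0$) subject to the constraint $xy = 1$ is equal to 1.
(b) Use part (a) to show that if $a > 0$, $b > 0$, then

$$ab \le \frac{a^p}{p} + \frac{b^q}{q}.$$

(c) Let $\{a_i\}$, $\{b_i\}$, $i = 1, \dots, n$, be positive real numbers. Prove Hölder's
Inequality:

$$\sum_{i=1}^{n} a_i b_i \le \Big(\sum_{i=1}^{n} a_i^p\Big)^{1/p} \Big(\sum_{i=1}^{n} b_i^q\Big)^{1/q}$$

by letting $A = (\sum a_i^p)^{1/p}$, $B = (\sum b_i^q)^{1/q}$, and applying part (b) to $a = a_i/A$, $b = b_i/B$.
(d) Use Hölder's Inequality in (c) to obtain Minkowski's Inequality:

$$\Big(\sum_{i=1}^{n} |a_i + b_i|^p\Big)^{1/p} \le \Big(\sum_{i=1}^{n} |a_i|^p\Big)^{1/p} + \Big(\sum_{i=1}^{n} |b_i|^p\Big)^{1/p}.$$

[Hint: $|a + b|^p = |a + b|\,|a + b|^{p/q} \le |a|\,|a + b|^{p/q} + |b|\,|a + b|^{p/q}$.]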
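
The three inequalities of Exercise 42.Z lend themselves to a quick numerical check. The Python sketch below is only an illustration and proves nothing; the function names are introduced here and do not come from the text. Each function returns the right-hand side minus the left-hand side of the corresponding inequality, with $q = p/(p-1)$ the conjugate exponent, so a nonnegative value (up to rounding) is what the exercise predicts.

```python
def p_norm(values, p):
    """(sum |v_i|^p)^(1/p) for a finite list of real numbers."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def young_gap(a, b, p):
    """a^p/p + b^q/q - ab for a, b > 0, as in part (b)."""
    q = p / (p - 1.0)
    return a ** p / p + b ** q / q - a * b

def holder_gap(a, b, p):
    """Right side minus left side of Holder's Inequality for positive lists."""
    q = p / (p - 1.0)
    return p_norm(a, p) * p_norm(b, q) - sum(x * y for x, y in zip(a, b))

def minkowski_gap(a, b, p):
    """Right side minus left side of Minkowski's Inequality."""
    total = [x + y for x, y in zip(a, b)]
    return p_norm(a, p) + p_norm(b, p) - p_norm(total, p)

if __name__ == "__main__":
    a, b = [1.0, 2.0, 3.0], [4.0, 0.5, 2.0]
    for p in (1.5, 2.0, 3.0):
        print(p, young_gap(a[0], b[0], p), holder_gap(a, b, p), minkowski_gap(a, b, p))
```

Equality in part (b) occurs when $a^p = b^q$, for instance $a = b = 1$, which is the constrained minimum found in part (a).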


VIII
INTEGRATION
IN $\mathbf{R}^p$

In this chapter we shall present the theory of integration of real-valued
functions on $\mathbf{R}^p$ where $p > 1$. The approach used here is the same as that
initiated in Section 29 for the case $p = 1$, but we shall be concerned here
only with the Riemann integral (and not the Riemann-Stieltjes integral).
It will be seen in Section 43 that, for bounded functions defined on a
closed cell in $\mathbf{R}^p$, the theory requires virtually no changes from that in $\mathbf{R}$.
However, in order to be able to integrate over more general sets in $\mathbf{R}^p$ it is
necessary to develop, as we do in Section 44, a theory of "content" (as we
shall call the $p$-dimensional notion of "area") for a suitable family of sets in
$\mathbf{R}^p$. We shall characterize the content function on this family of sets, and
show how to express integrals in $\mathbf{R}^p$ as iterated integrals. The final section
is devoted to developing important theorems on the transformations of sets
and integrals under differentiable mappings. The theoretical difficulties
are considerable, but we conclude with a very useful theorem justifying the
change of variables even in cases when the transformation may possess a
limited amount of "singular" behavior.

Section 43 The Integral in $\mathbf{R}^p$

In Sections 29-31 we discussed the integral of bounded real-valued
functions defined on a compact interval $J$ in $\mathbf{R}$. A reader with an eye for
generalization will have noticed that a considerable part of what was done
in those sections can be carried out when the values of the function lie in a
Cartesian space $\mathbf{R}^q$. Once this possibility has been recognized it is not
difficult to carry out the modifications necessary to obtain an integration
theory for functions on $J$ to $\mathbf{R}^q$.
It is also natural to ask whether we can obtain an integration theory for
functions whose domain is a subset of the space $\mathbf{R}^p$, and the reader will
recall that this was in fact done in calculus courses where one considered
"double" and "triple" integrals. In this section we shall initiate a study of
the Riemann integral of real-valued functions defined on a suitable
bounded subset of $\mathbf{R}^p$. Although many of our results can be extended to
permit the values to be in $\mathbf{R}^q$ for $q > 1$, we shall leave this extension to the
reader.

Content Zero

We recall from Section 5 that a cell in $\mathbf{R}$ is a set having one of the four
forms:

$$(a, b), \quad [a, b], \quad [a, b), \quad (a, b],$$

where $a \le b$. The numbers $a$, $b$ are called the end points of these cells. A
cell in $\mathbf{R}^p$ is the Cartesian product $J = J_1 \times \cdots \times J_p$ of $p$ cells in $\mathbf{R}$. The cell $J$
is said to be closed (respectively, open) if each of the cells $J_1, \dots, J_p$ is
closed (respectively, open) in $\mathbf{R}$. If the cells $J_i$ have end points $a_i \le b_i$
($i = 1, \dots, p$), we define the content of $J = J_1 \times \cdots \times J_p$ to be the product

$$c(J) = (b_1 - a_1) \cdots (b_p - a_p).$$

If $p = 1$, content is usually called "length"; if $p = 2$, the content is called "area"; if
$p = 3$, the content is called "volume." We shall use the word "content" because it is
free from special connotations that these other words may have.

Note that if $J = J_1 \times \cdots \times J_p$ and $K = K_1 \times \cdots \times K_p$ are cells in $\mathbf{R}^p$ such
that the end points of $J_i$ and of $K_i$ are the same for each $i = 1, \dots, p$, then
$c(J) = c(K)$. Similarly, if $a_k = b_k$ for some $k = 1, \dots, p$, then the cell $J$ has
content $c(J) = 0$; however, it is not necessary that $J = \emptyset$.
If $J_i$ is a cell with end points $a_i \le b_i$ and if $b_1 - a_1 = \cdots = b_p - a_p > 0$, then
we say that
$$J = J_1 \times \cdots \times J_p$$
is a cube. Cubes may be closed, open, or neither. We call the number
$b_1 - a_1 > 0$ the side length of the cube.
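
To make the definition of content concrete, here is a small Python sketch. It is not from the text: a cell is represented, by assumption, only by its list of end-point pairs $(a_i, b_i)$, which suffices because the content does not depend on whether the end points belong to the cell. The names `content` and `is_cube` are hypothetical.

```python
from math import prod

def content(cell):
    """Content c(J) of a cell given as a list of end-point pairs (a_i, b_i)."""
    for a, b in cell:
        if a > b:
            raise ValueError("end points must satisfy a_i <= b_i")
    return prod(b - a for a, b in cell)

def is_cube(cell):
    """True if all side lengths b_i - a_i agree and are strictly positive."""
    sides = [b - a for a, b in cell]
    return sides[0] > 0 and all(s == sides[0] for s in sides)

if __name__ == "__main__":
    print(content([(0, 2), (1, 4)]))          # a cell in R^2 with sides 2 and 3
    print(content([(0, 1), (3, 3)]))          # degenerate cell: content 0
    print(is_cube([(0, 2), (5, 7), (-1, 1)])) # a cube in R^3 with side length 2
```

The second call illustrates the remark above: a cell with $a_k = b_k$ for some $k$ has content zero even though it need not be empty.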
43.1 DEFINITION. A set $Z \subseteq \mathbf{R}^p$ has content zero if for each $\varepsilon > 0$
there exists a finite set $J_1, \dots, J_n$ of cells whose union contains $Z$ and such
that
$$c(J_1) + \cdots + c(J_n) < \varepsilon.$$

It is important that the reader show that one can require the cells
appearing in this definition to be closed, or to be open, or to be cubes, and
the notion of content zero remains exactly the same.

43.2 EXAMPLES. (a) A point in $\mathbf{R}^p$ has content zero. (Why?) More
generally, any finite subset of $\mathbf{R}^p$ has content zero.
(b) If $(z_n)_{n \in \mathbf{N}}$ is a sequence in $\mathbf{R}^p$ which converges to $z_0 \in \mathbf{R}^p$, then the set
$Z = \{z_n : n \ge 0\}$ has content zero. For, if $\varepsilon > 0$ let $J_0$ be an open cell
containing $z_0$ such that $c(J_0) < \varepsilon$. Since the sequence converges to $z_0$, there exists $k \in \mathbf{N}$ such that
$z_n \in J_0$ for all $n > k$, and we can take $J_i = \{z_i\}$ for $i = 1, \dots, k$ to get
$Z \subseteq J_0 \cup J_1 \cup \cdots \cup J_k$. Since

$$c(J_0) + c(J_1) + \cdots + c(J_k) < \varepsilon + 0 + \cdots + 0 = \varepsilon,$$
and since $\varepsilon > 0$ is arbitrary, it follows that $Z$ has content zero.
(c) Any subset of a set with content zero has content zero. The union of
a finite number of sets with content zero has content zero.
(d) In the space $\mathbf{R}^2$, the diamond-shaped set $S = \{(x, y) : |x| + |y| = 1\}$ has
content zero. For, if $n \in \mathbf{N}$ and we introduce squares with diagonals along $S$
and vertices at the points $x = y = \pm k/n$ ($k = 0, 1, \dots, n$), then we see that
we can enclose $S$ in $4n$ closed squares each having content $1/n^2$. Hence
the total content is $4/n$, which can be made arbitrarily small. (See Fig.
43.1.)
(e) The circle $S = \{(x, y) : x^2 + y^2 = 1\}$ in $\mathbf{R}^2$ has content zero. This can
be proved by modifying the argument in (d).
(f) Let $f$ be a continuous function on $J = [a, b]$ to $\mathbf{R}$. Then the graph
$$G = \{(x, f(x)) \in \mathbf{R}^2 : x \in J\}$$
has content zero. This can be proved by using the uniform continuity of $f$
and modifying the argument in (d).

Figure 43.1

(g) The set $S \subseteq \mathbf{R}^2$ consisting of all points $(x, y)$ where both $x$ and $y$
belong to $I \cap \mathbf{Q}$ is a countable set but does not have content zero. Indeed,
any finite union of cells containing $S$ must also contain the cell $I \times I$, which
has content 1.
In contrast to (f) we note that there are "continuous curves" in $\mathbf{R}^2$ which
have positive content. Indeed, there are continuous functions $f$, $g$ on
$I = [0, 1]$ to $\mathbf{R}$ such that the set

$$S = \{(f(t), g(t)) : t \in I\}$$
contains the cell $I \times I$ in $\mathbf{R}^2$. Such a curve is called a space-filling curve, or
a Peano curve. (See Exercise 43.U.)
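
The covering used in Example 43.2(d) can be written out explicitly. The Python sketch below is one possible reading of that construction, not the book's own; the names `diamond_cover`, `covers` and `total_content` are introduced here. In the first quadrant it takes the squares $[k/n, (k+1)/n] \times [(n-k-1)/n, (n-k)/n]$, whose diagonals lie on the line $x + y = 1$, and reflects them into the other three quadrants. Exact rational arithmetic is used so that the total content comes out as exactly $4/n$.

```python
from fractions import Fraction

def diamond_cover(n):
    """4n closed squares of side 1/n covering {(x, y) : |x| + |y| = 1}.

    Each square is returned as a tuple (x_low, x_high, y_low, y_high).
    """
    h = Fraction(1, n)
    squares = []
    for k in range(n):
        x0, y0 = k * h, (n - k - 1) * h       # lower-left corner, first quadrant
        for sx in (1, -1):                    # reflect in the y-axis
            for sy in (1, -1):                # reflect in the x-axis
                xs = sorted((sx * x0, sx * (x0 + h)))
                ys = sorted((sy * y0, sy * (y0 + h)))
                squares.append((xs[0], xs[1], ys[0], ys[1]))
    return squares

def covers(squares, x, y):
    """True if the point (x, y) lies in at least one of the closed squares."""
    return any(a <= x <= b and c <= y <= d for a, b, c, d in squares)

def total_content(squares):
    return sum((b - a) * (d - c) for a, b, c, d in squares)

if __name__ == "__main__":
    for n in (1, 4, 10, 100):
        print(n, total_content(diamond_cover(n)))   # exactly 4/n
```

Since the total content $4/n$ tends to zero, the diamond satisfies Definition 43.1.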

Definition of the Integral

We shall first define the integral for a bounded function $f$ defined on a
closed cell $I \subseteq \mathbf{R}^p$ and with values in $\mathbf{R}$. Let

$$I = [a_1, b_1] \times \cdots \times [a_p, b_p],$$

and, for each $k = 1, \dots, p$, let $P_k$ be a partition of $[a_k, b_k]$ into a finite
number of closed cells in $\mathbf{R}$. This induces a partition $P$ of $I$ into a finite
number of closed cells in $\mathbf{R}^p$. If $P$ and $Q$ are partitions of $I$, we say that $P$
is a refinement of $Q$ if each cell in $P$ is contained in some cell in $Q$.
(Alternatively, noting that a partition is determined by the vertices of its
cells, $P$ is a refinement of $Q$ if and only if all of the vertices contained in $Q$
are also in $P$.)
43.3 DEFINITION. A Riemann sum $S(P; f)$ corresponding to a partition
$P = \{J_1, \dots, J_n\}$ of $I$ is given by

$$S(P; f) = \sum_{k=1}^{n} f(x_k)\, c(J_k),$$
where $x_k$ is any "intermediate" point in $J_k$, $k = 1, \dots, n$. A real number $L$
is defined to be the Riemann integral of $f$ over $I$ if, for every $\varepsilon > 0$ there is a
partition $P_\varepsilon$ of $I$ such that if $P$ is any refinement of $P_\varepsilon$ and $S(P; f)$ is any
Riemann sum corresponding to $P$, then $|S(P; f) - L| < \varepsilon$. In case this
integral exists we say that $f$ is integrable over $I$.
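
A Riemann sum in the sense of Definition 43.3 is easy to compute for one particular family of partitions. The Python sketch below is an illustration under stated assumptions: the partition divides every edge $[a_k, b_k]$ into $n$ equal pieces, and the intermediate point of each cell is taken to be its midpoint. Neither choice is required by the definition, and the name `riemann_sum` is hypothetical.

```python
import itertools

def riemann_sum(f, cell, n):
    """Midpoint Riemann sum of f over the closed cell [(a_1, b_1), ..., (a_p, b_p)].

    Every edge is cut into n equal pieces, giving n**p cells of equal content.
    """
    widths = [(b - a) / n for a, b in cell]
    cell_content = 1.0
    for w in widths:
        cell_content *= w
    total = 0.0
    for index in itertools.product(range(n), repeat=len(cell)):
        # Intermediate point: the midpoint of the cell with this multi-index.
        point = [a + (i + 0.5) * w for (a, _), i, w in zip(cell, index, widths)]
        total += f(*point) * cell_content
    return total

if __name__ == "__main__":
    # f(x, y) = x*y on [0, 1] x [0, 2]; the iterated integral gives 1.
    print(riemann_sum(lambda x, y: x * y, [(0.0, 1.0), (0.0, 2.0)], 10))
    # f(x, y) = x^2 + y on [0, 1] x [0, 1]; the value is 1/3 + 1/2.
    print(riemann_sum(lambda x, y: x * x + y, [(0.0, 1.0), (0.0, 1.0)], 50))
```

Increasing `n` refines the partition, and for an integrable function the sums approach the integral $L$.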

It is a routine exercise to show that the value $L$ of the integral is uniquely
determined when it exists; we shall generally write

$$L = \int_I f;$$

however, when $p = 2$ we occasionally denote the integral by

$$\iint_I f \quad \text{or} \quad \iint_I f(x, y)\, d(x, y),$$
and when $p = 3$ we occasionally write

$$\iiint_I f \quad \text{or} \quad \iiint_I f(x, y, z)\, d(x, y, z).$$

There is a convenient Cauchy Criterion for integrability. Since its proof
is so similar to that of Theorem 29.4, we shall omit it.

43.4 CAUCHY CRITERION. A bounded function $f : I \to \mathbf{R}$ is integrable
on $I$ if and only if for every $\varepsilon > 0$ there exists a partition $Q_\varepsilon$ of $I$ such that if
$P$ and $Q$ are partitions of $I$ which are refinements of $Q_\varepsilon$ and $S(P; f)$ and
$S(Q; f)$ are any corresponding Riemann sums, then
$$|S(P; f) - S(Q; f)| \le \varepsilon.$$
We now wish to consider functions which are defined on bounded
subsets of $\mathbf{R}^p$ more general than closed cells. Let $A \subseteq \mathbf{R}^p$ be a bounded set
and let $f : A \to \mathbf{R}$ be a bounded function. Since $A$ is bounded there exists a
closed cell $I \subseteq \mathbf{R}^p$ such that $A \subseteq I$. We define $f_I : I \to \mathbf{R}$ by

$$f_I(x) = \begin{cases} f(x) & \text{for } x \in A, \\ 0 & \text{for } x \in I \setminus A. \end{cases}$$
If the function $f_I$ is integrable on $I$ in the sense of Definition 43.3, then it is
an exercise (see 43.M) to show that the value $\int_I f_I$ does not depend on the
choice of the closed cell $I$ containing $A$. Because of this we shall say that
$f$ is integrable on $A$ and define

$$\int_A f = \int_I f_I,$$

since the right-hand side depends only on $f$ and $A$. (In subsequent
arguments, we shall often denote $f_I$ simply by $f$.)
Similarly, let $A$ and $B$ be bounded subsets of $\mathbf{R}^p$ and let $f : A \to \mathbf{R}$. Let
$I$ be a closed cell containing $A \cup B$ and define $f_1 : I \to \mathbf{R}$ by
$$f_1(x) = \begin{cases} f(x) & \text{for } x \in A \cap B, \\ 0 & \text{for } x \in I \setminus (A \cap B). \end{cases}$$

Note that $f_1$ is the extension to $I$ of the restriction $f \mid A \cap B$. If $f_1$ is
integrable over $I$, we say that $f$ is integrable on $B$ and define

$$\int_B f = \int_I f_1.$$
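
The extension-by-zero device can be tried out numerically. In the Python sketch below, which is only an illustration, $A$ is taken to be the closed unit disk in $\mathbf{R}^2$ and $f(x, y) = x^2 + y^2$, for which polar coordinates give $\int_A f = \pi/2$. The function `integral_over_set` and its parameters are names introduced here; it forms a midpoint Riemann sum of $f_I$ over a square cell $I$ containing $A$. Running it with two different cells illustrates that the value does not depend on the choice of $I$.

```python
import math

def integral_over_set(f, in_A, cell, n):
    """Midpoint Riemann sum over the cell of the extension f_I of f by zero.

    cell = ((a1, b1), (a2, b2)) is a closed cell in R^2 containing A, and
    in_A(x, y) decides membership in A.
    """
    (a1, b1), (a2, b2) = cell
    hx, hy = (b1 - a1) / n, (b2 - a2) / n
    total = 0.0
    for i in range(n):
        x = a1 + (i + 0.5) * hx
        for j in range(n):
            y = a2 + (j + 0.5) * hy
            if in_A(x, y):            # f_I vanishes on I \ A
                total += f(x, y) * hx * hy
    return total

def in_unit_disk(x, y):
    return x * x + y * y <= 1.0

def radial_square(x, y):
    return x * x + y * y

if __name__ == "__main__":
    small = integral_over_set(radial_square, in_unit_disk, ((-1.0, 1.0), (-1.0, 1.0)), 200)
    large = integral_over_set(radial_square, in_unit_disk, ((-2.0, 2.0), (-2.0, 2.0)), 400)
    print(small, large, math.pi / 2)
```

Both runs use the same mesh width $0.01$, so the two sums should agree up to rounding.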

Properties of the Integral

We shall now give some of the expected properties of the integral.
Throughout, $A$ will be a bounded subset of $\mathbf{R}^p$.

43.5 THEOREM. Let $f$ and $g$ be functions on $A$ to $\mathbf{R}$ which are integrable
on $A$ and let $\alpha, \beta \in \mathbf{R}$. Then the function $\alpha f + \beta g$ is integrable on $A$ and

$$\int_A (\alpha f + \beta g) = \alpha \int_A f + \beta \int_A g.$$
PROOF. This result follows from the fact that the Riemann sums for a
partition $P$ of a cell $I \supseteq A$ satisfy
$$S(P; \alpha f + \beta g) = \alpha S(P; f) + \beta S(P; g),$$
when the same intermediate points $x_k$ are used. Q.E.D.
43.6 THEOREM. If $f : A \to \mathbf{R}$ is integrable on $A$ and if $f(x) \ge 0$ for
$x \in A$, then $\int_A f \ge 0$.
PROOF. Note that $S(P; f) \ge 0$ for any Riemann sum. Q.E.D.

43.7 THEOREM. Let $f : A \to \mathbf{R}$ be a bounded function and suppose that
$A$ has content zero. Then $f$ is integrable on $A$ and $\int_A f = 0$.
PROOF. Let $I$ be a closed cell containing $A$. If $\varepsilon > 0$ is given, let $P_\varepsilon$ be
a partition of $I$ such that those cells in $P_\varepsilon$ which contain points of $A$ have
total content less than $\varepsilon$. (Show there exists such a partition $P_\varepsilon$.) Now if
$P$ is a refinement of $P_\varepsilon$, then those cells in $P$ which contain points of $A$
will also have total content less than $\varepsilon$. Hence if $|f(x)| \le M$ for $x \in A$, we
have $|S(P; f)| \le M\varepsilon$ for any Riemann sum corresponding to $P$. Since
$\varepsilon > 0$ is arbitrary, this implies that $\int_A f = 0$. Q.E.D.
43.8 THEOREM. Let $f, g : A \to \mathbf{R}$ be bounded functions and suppose
that $f$ is integrable on $A$. Let $E \subseteq A$ have content zero and suppose that
$f(x) = g(x)$ for all $x \in A \setminus E$. Then $g$ is integrable on $A$ and

$$\int_A f = \int_A g.$$

PROOF. Extend $f$ and $g$ to functions $f_I$, $g_I$ defined on a closed cell $I$
containing $A$. The hypotheses imply that $h_I = f_I - g_I$ is bounded and
equals 0 except on $E$. By Theorem 43.7 we deduce that $h_I$ is integrable
on $I$ and the value of its integral is 0. Applying Theorem 43.5, we infer
that $g_I = f_I - h_I$ is integrable on $I$ and

$$\int_A g = \int_I g_I = \int_I (f_I - h_I) = \int_I f_I = \int_A f. \qquad \text{Q.E.D.}$$



Existence of the Integral


It is to be expected that if $f$ is continuous on a closed cell $I$ to $\mathbf{R}$, then $f$
is integrable on $I$. We shall establish a stronger result which permits $f$ to
be discontinuous on a set with content zero.

43.9 INTEGRABILITY THEOREM. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let
$f : I \to \mathbf{R}$ be bounded. If there exists a subset $E \subseteq I$ with content zero such
that $f$ is continuous on $I \setminus E$, then $f$ is integrable on $I$.
PROOF. Let $|f(x)| \le M$ for all $x \in I$ and let $\varepsilon > 0$ be given. Then there
exists (why?) a partition $P_\varepsilon$ of $I$ such that (i) the cells in $P_\varepsilon$ which contain
any points of $E$ contain them in their interior, and (ii) these cells have
total content less than $\varepsilon$. The union $C$ of the closed cells in $P_\varepsilon$ which do
not contain points of $E$ is a compact subset on which $f$ is continuous.
According to the Uniform Continuity Theorem 23.3, the restriction of $f$
is uniformly continuous on $C$. Replacing $P_\varepsilon$ by a refinement, if necessary,
we may suppose that if $J_k$ is a cell in $P_\varepsilon$ which is contained in $C$, and if $x$,
$y \in J_k$, then $|f(x) - f(y)| < \varepsilon$.
Now suppose that $P$ and $Q$ are refinements of $P_\varepsilon$. If $S'(P; f)$ and
$S'(Q; f)$ denote the portions of the Riemann sums extended over the cells
in $C$, then an argument similar to that in the second part of the proof of
Theorem 30.1 yields

$$|S'(P; f) - S'(Q; f)| \le |S'(P; f) - S'(P_\varepsilon; f)| + |S'(P_\varepsilon; f) - S'(Q; f)| \le 2\varepsilon\, c(I).$$

Figure 43.2

Similarly, if $S''(P; f)$ and $S''(Q; f)$ denote the remaining portion of the
Riemann sums, then

$$|S''(P; f) - S''(Q; f)| \le |S''(P; f)| + |S''(Q; f)| \le 2M\varepsilon.$$
It therefore follows that

$$|S(P; f) - S(Q; f)| \le \varepsilon\,(2c(I) + 2M);$$
since $\varepsilon > 0$ is arbitrary, we infer from the Cauchy Criterion that $f$ is
integrable on $I$. Q.E.D.
Necessary and sufficient conditions for integrability will be given in
Exercise 43.P and Project 44.α.
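
The Integrability Theorem can be illustrated with a function that is discontinuous exactly on a set of content zero. In the Python sketch below (an example chosen here, not taken from the text) $f$ is the step function on $[0, 1] \times [0, 1]$ equal to 1 below the diagonal $y = x$ and 0 elsewhere; it is discontinuous only on the diagonal, which has content zero by Example 43.2(f), and its integral is the area $1/2$ of the triangle. The sums use the uniform partition into $n^2$ squares with midpoints as intermediate points, an assumption made only for convenience.

```python
def midpoint_sum(f, n):
    """Midpoint Riemann sum of f over [0,1] x [0,1] with n*n equal squares."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h

def step(x, y):
    """1 below the diagonal y = x, 0 on and above it."""
    return 1.0 if y < x else 0.0

def diagonal_cells_content(n):
    """Total content of the n squares of the partition that meet the diagonal
    in more than a corner; this is where the choice of tag matters."""
    return n * (1.0 / n) ** 2

if __name__ == "__main__":
    for n in (10, 50, 200):
        print(n, midpoint_sum(step, n), diagonal_cells_content(n))
```

With midpoint tags the sum equals $\tfrac{1}{2}(1 - 1/n)$, and any other choice of intermediate points changes it by at most the total content $1/n$ of the squares along the diagonal, which is the mechanism used in the proof.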

Exercises
43.A. (a) Let $a = (a_1, \dots, a_p) \in \mathbf{R}^p$ and let $J_1 = [a_1, a_1], \dots, J_p = [a_p, a_p]$ be cells
in $\mathbf{R}$. Show that $J = J_1 \times \cdots \times J_p$ has content zero in $\mathbf{R}^p$. Hence the set $\{a\}$ has
content zero in $\mathbf{R}^p$.
(b) If we take $J_1' = (a_1, a_1)$, then the cell $J' = J_1' \times J_2 \times \cdots \times J_p$ is empty and has
content zero.
43.B. Show that a set $Z \subseteq \mathbf{R}^p$ has content zero if and only if for each $\varepsilon > 0$
there exists a finite set $K_1, \dots, K_n$ of cubes whose union contains $Z$ and such that
$c(K_1) + \cdots + c(K_n) < \varepsilon$.
43.C. Write out the details of the proof of the assertion, made in Example
43.2(f), that the graph $S \subseteq \mathbf{R}^2$ of a continuous function $f : [a, b] \to \mathbf{R}$ has content
zero.
43.D. If $J$ is a closed cell in $\mathbf{R}^2$ and $g : J \to \mathbf{R}$ is continuous, show that the
graph $\{(x, y, g(x, y)) : (x, y) \in J\} \subseteq \mathbf{R}^3$ of $g$ has content zero.
43.E. Let $A \subseteq \mathbf{R}^2$ be the set consisting of all pairs $(i/p, j/p)$ where $p$ is a prime
number, and $i, j = 1, 2, \dots, p - 1$. Show that each horizontal and each vertical
line in $\mathbf{R}^2$ intersects $A$ in a finite number (often zero) of points. Does $A$ have
content zero?
43.F. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $P = \{I_1, \dots, I_n\}$ and $Q = \{J_1, \dots, J_m\}$
be two partitions of $I$ into closed cells. Show that $R = \{I_i \cap J_j :
i = 1, \dots, n;\ j = 1, \dots, m\}$ is a partition of $I$ and that $R$ is a refinement of both
$P$ and $Q$. The partition $R$ is called the common refinement of $P$ and $Q$.
43.G. If $I \subseteq J$ are cells in $\mathbf{R}^p$ and if $P$ is a partition of $I$, show that there exists a
partition $Q$ of $J$ such that every cell in $P$ belongs to $Q$.
43.H. Let $Z \subseteq \mathbf{R}^p$ be a set with content zero and let $I$ be a closed cell containing
$Z$. If $J_1, \dots, J_n$ are cells contained in $I$ whose union contains $Z$, show that there
exists a partition $P$ of $I$ such that the closure of each $J_k$ is the union of cells in $P$.
43.I. Let $Z \subseteq \mathbf{R}^p$ be a set with content zero and let $I$ be a closed cell containing
$Z$. If $\varepsilon > 0$, show that there exists a partition $P_\varepsilon$ of $I$ such that the cells in $P_\varepsilon$ which
contain points in $Z$ have total content less than $\varepsilon$.
43.J. In the preceding exercise, show that we can choose $P_\varepsilon$ to have the
additional property that the cells in $P_\varepsilon$ which contain any points of $Z$ contain them
in their interior.

43.K. In Exercise 43.J, if $I$ is a cube, show that there exists a partition $Q_\varepsilon$ of $I$
into cubes such that the cubes in $Q_\varepsilon$ which contain points in $Z$ have total content
less than $\varepsilon$.
43.L. Let $A \subseteq \mathbf{R}^p$ be bounded and let $I$, $J$ be closed cells in $\mathbf{R}^p$ such that
$A \subseteq I \subseteq J$. If $f : A \to \mathbf{R}$ is a bounded function, define $f_I : I \to \mathbf{R}$ (respectively,
$f_J : J \to \mathbf{R}$) to be the function which agrees with $f$ on $A$ and vanishes on $I \setminus A$
(respectively, $J \setminus A$). Show that the integral of $f_I$ over $I$ exists if and only if the
integral of $f_J$ over $J$ exists, in which case these integrals are equal.
43.M. Let $A \subseteq \mathbf{R}^p$ be bounded and let $I_1$ and $I_2$ be closed cells in $\mathbf{R}^p$ such that
$A \subseteq I_1 \cap I_2$. Let $f : A \to \mathbf{R}$ be a bounded function, and, for $j = 1, 2$, define $f_j : I_j \to \mathbf{R}$ by
$f_j(x) = f(x)$ for $x \in A$ and $f_j(x) = 0$ for $x \in I_j \setminus A$. Show that the integral of $f_1$ over $I_1$
exists if and only if the integral of $f_2$ over $I_2$ exists, in which case these integrals are
equal.
43.N. Establish the uniqueness of the integral of a bounded function $f$ defined
on a closed cell $I \subseteq \mathbf{R}^p$.
43.O. Write out the details of the proof of the Cauchy Criterion 43.4.
43.P. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $f : I \to \mathbf{R}$ be bounded. Then $f$ is
integrable on $I$ if and only if for every $\varepsilon > 0$ there exists a partition $P_\varepsilon$ of $I$ such that
if $P = \{J_1, \dots, J_n\}$ is a refinement of $P_\varepsilon$, then

$$\sum_{j=1}^{n} (M_j - m_j)\, c(J_j) < \varepsilon,$$
where $M_j = \sup \{f(x) : x \in J_j\}$ and $m_j = \inf \{f(x) : x \in J_j\}$ for $j = 1, \dots, n$. This result
is called the Riemann Criterion for Integrability (cf. Theorem 30.1).
43.Q. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $f : I \to \mathbf{R}$ be bounded by $M$. If $f$ is
integrable on $I$, show that the function $|f|$ is integrable on $I$ and that $\int_I |f| \le M c(I)$.
43.R. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $f, g : I \to \mathbf{R}$ be integrable on $I$. Show
that the product function $fg$ is integrable on $I$.
43.S. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $(f_n)$ be a sequence of functions which are
integrable on $I$. If the sequence converges uniformly on $I$ to $f$, show that $f$ is
integrable on $I$ and that

$$\int_I f = \lim_n \Big(\int_I f_n\Big).$$

43.T. Let $K \subseteq \mathbf{R}^p$ be a closed cube and let $f, g : K \to \mathbf{R}$ be continuous. Show
that if $\varepsilon > 0$, then there exists a partition $P_\varepsilon = \{K_1, \dots, K_n\}$ of $K$ into cubes such that
if $x_j$, $y_j$ are any points of $K_j$, $j = 1, \dots, n$, then

$$\Big|\int_K fg - \sum_{j=1}^{n} f(x_j)\, g(y_j)\, c(K_j)\Big| < \varepsilon\, c(K).$$
43.U. (This exercise gives an example due to I. J. Schoenberg† of a space-filling
curve.) Let $\varphi : \mathbf{R} \to \mathbf{R}$ be continuous, even, with period 2, and such that
$$\varphi(t) = \begin{cases} 0 & \text{for } 0 \le t \le \tfrac{1}{3}, \\ 3t - 1 & \text{for } \tfrac{1}{3} < t < \tfrac{2}{3}, \\ 1 & \text{for } \tfrac{2}{3} \le t \le 1. \end{cases}$$
(a) Draw a sketch of the graph of $\varphi$. Note that $\|\varphi\|_{\mathbf{R}} = 1$.
(b) If $t \in I$, define $f(t)$ and $g(t)$ by
$$f(t) = \frac{1}{2}\varphi(t) + \frac{1}{2^2}\varphi(3^2 t) + \frac{1}{2^3}\varphi(3^4 t) + \cdots,$$
$$g(t) = \frac{1}{2}\varphi(3t) + \frac{1}{2^2}\varphi(3^3 t) + \frac{1}{2^3}\varphi(3^5 t) + \cdots.$$
Show that these series are uniformly convergent, so that $f$ and $g$ are continuous
on $I$.
(c) Evaluate $f(t)$ and $g(t)$, where $t$ has the ternary (base 3) expansion given by
0.2020, 0.0220, 0.0022, 0.2002.
(d) Let $(x, y)$ belong to the cube $I \times I$ and write $x$ and $y$ in their
binary (base 2) expansions:
$$x = 0.\alpha_1 \alpha_2 \alpha_3 \dots, \qquad y = 0.\beta_1 \beta_2 \beta_3 \dots,$$

where $\alpha_n$, $\beta_n$ are either 0 or 1. Let $t$ be the corresponding real number whose ternary
expansion is
$$t = 0.(2\alpha_1)(2\beta_1)(2\alpha_2)(2\beta_2) \dots$$

Show that $x = f(t)$ and $y = g(t)$. Hence every point in the cube $I \times I$ belongs to the
graph $S = \{(f(t), g(t)) : t \in I\}$.

† Isaac J. Schoenberg (1903- ) was born in Romania and educated there and in
Germany. For many years at the University of Pennsylvania, he has worked in number
theory, real and complex analysis, and the calculus of variations.
43.V. A set $Z \subseteq \mathbf{R}^p$ has measure zero if for each $\varepsilon > 0$ there is a sequence $(J_n)$
of cells whose union contains $Z$ and such that $\sum c(J_n) < \varepsilon$.
(a) Since the empty set is a cell, show that a set which has content zero also has
measure zero.
(b) Show that every countable set in $\mathbf{R}^p$ has measure zero. Hence the set in
Example 43.2(g) has measure zero (but it does not have content zero).
(c) Show that, in the definition of "measure zero" given above, we can require
the cells to be open, or to be cubes.
(d) Show that every compact set with measure zero also has content zero.
(e) The union of a countable family of sets with content zero has measure zero.
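
For readers who wish to experiment with the curve of Exercise 43.U, the following Python sketch evaluates partial sums of the two series with exact rational arithmetic. It is an illustration only: the function names are introduced here, the series are truncated after `terms` terms, and the check is limited to parameters $t$ with a finite ternary expansion in the digits 0 and 2, for which all later terms vanish and the truncated sums are exact.

```python
from fractions import Fraction

def phi(t):
    """The even, 2-periodic function of the exercise, evaluated exactly."""
    t = Fraction(t) % 2        # period 2: reduce to [0, 2)
    if t > 1:
        t = 2 - t              # evenness together with period 2
    if t <= Fraction(1, 3):
        return Fraction(0)
    if t >= Fraction(2, 3):
        return Fraction(1)
    return 3 * t - 1

def schoenberg(t, terms=20):
    """Partial sums of the series defining f(t) and g(t)."""
    t = Fraction(t)
    f = sum(Fraction(1, 2 ** (n + 1)) * phi(9 ** n * t) for n in range(terms))
    g = sum(Fraction(1, 2 ** (n + 1)) * phi(3 * 9 ** n * t) for n in range(terms))
    return f, g

def ternary_parameter(alpha, beta):
    """t = 0.(2a_1)(2b_1)(2a_2)(2b_2)... in base 3, for finite bit lists."""
    digits = []
    for a, b in zip(alpha, beta):
        digits += [2 * a, 2 * b]
    return sum(Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits))

def binary_value(bits):
    """0.b_1 b_2 ... in base 2."""
    return sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))

if __name__ == "__main__":
    alpha, beta = [1, 0, 1], [0, 1, 1]
    t = ternary_parameter(alpha, beta)
    print(t, schoenberg(t), (binary_value(alpha), binary_value(beta)))
```

For the sample bits the printed pairs should coincide, in agreement with part (d).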

Projects
43.α. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $f : I \to \mathbf{R}$ be bounded. If
$P = \{J_1, \dots, J_n\}$ is a partition of $I$, let

$$m_j = \inf \{f(x) : x \in J_j\}, \qquad M_j = \sup \{f(x) : x \in J_j\}$$
for $j = 1, \dots, n$ and define the lower and upper sums of $f$ for $P$ to be

$$L(P; f) = \sum_{j=1}^{n} m_j\, c(J_j), \qquad U(P; f) = \sum_{j=1}^{n} M_j\, c(J_j).$$

(a) If $S(P; f)$ is any Riemann sum corresponding to $P$, then $L(P; f) \le S(P; f) \le
U(P; f)$. If $\varepsilon > 0$, then there exist Riemann sums $S_1(P; f)$ and $S_2(P; f)$ corresponding
to $P$ such that
$$S_1(P; f) \le L(P; f) + \varepsilon, \qquad U(P; f) - \varepsilon \le S_2(P; f).$$

(b) If $P$ is a partition of $I$ and $Q$ is a refinement of $P$, then

$$L(P; f) \le L(Q; f) \le U(Q; f) \le U(P; f).$$

(c) If $P_1$ and $P_2$ are any partitions of $I$, then $L(P_1; f) \le U(P_2; f)$.
(d) Define the lower and the upper integral of $f$ on $I$ to be

$$L(f) = \sup \{L(P; f)\}, \qquad U(f) = \inf \{U(P; f)\},$$
respectively, where the supremum and the infimum are taken over all partitions of
$I$. Show that $L(f) \le U(f)$.
(e) Show that $f$ is integrable (in the sense of Definition 43.3) if and only if
$L(f) = U(f)$, in which case $L(f) = U(f) = \int_I f$.
(f) Show that $f$ is integrable if and only if for each $\varepsilon > 0$ there exists a partition $P$
such that $U(P; f) - L(P; f) < \varepsilon$. (This condition is sometimes called Riemann's
Condition; compare it with Exercise 43.P.)
43.β. This project develops the integral of functions on a closed cell $I \subseteq \mathbf{R}^p$ and
with values in $\mathbf{R}^q$. If $P = \{J_1, \dots, J_n\}$ is a partition of $I$, then a Riemann sum $S(P; f)$
corresponding to $P$ is a sum
$$S(P; f) = \sum_{k=1}^{n} c(J_k)\, f(x_k),$$
where $x_k \in J_k$. An element $L \in \mathbf{R}^q$ is the Riemann integral of $f$ over $I$ if, for every
$\varepsilon > 0$ there exists a partition $P_\varepsilon$ of $I$ such that if $P$ is any refinement of $P_\varepsilon$ and
$S(P; f)$ is any corresponding Riemann sum, then $\|S(P; f) - L\| < \varepsilon$. Examine which
of the theorems in this section remain true for functions with values in $\mathbf{R}^q$. Show
that $f : I \to \mathbf{R}^q$ is integrable if and only if each function $f_j = e_j \cdot f$, $j = 1, \dots, q$, is
integrable. (Here $e_1, \dots, e_q$ are the standard basis vectors in $\mathbf{R}^q$.)
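
The lower and upper sums of Project 43.α can be computed explicitly for simple functions. The Python sketch below uses an example chosen here, not one from the text: $f(x, y) = xy$ on $[0, 1] \times [0, 1]$ with the partition into $n^2$ equal closed squares. Since this $f$ is increasing in each variable on the cell, its infimum on each square is attained at the lower-left corner and its supremum at the upper-right corner; that observation is what makes $m_j$ and $M_j$ available in closed form. The name `lower_upper_sums` is hypothetical.

```python
def lower_upper_sums(n):
    """L(P; f) and U(P; f) for f(x, y) = x*y on the unit square.

    P is the partition into n*n equal closed squares. Because f is increasing
    in each variable there, m_j and M_j are the values at opposite corners.
    """
    h = 1.0 / n
    lower = upper = 0.0
    for i in range(n):
        for j in range(n):
            lower += (i * h) * (j * h) * h * h              # m_j * c(J_j)
            upper += ((i + 1) * h) * ((j + 1) * h) * h * h  # M_j * c(J_j)
    return lower, upper

if __name__ == "__main__":
    for n in (10, 50, 100):
        lower, upper = lower_upper_sums(n)
        print(n, lower, upper, upper - lower)
```

The integral of $f$ over the square is $1/4$; the computed sums bracket this value, move toward it under refinement as in part (b), and their difference tends to zero as in part (f).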

Section 44 Content and the Integral

In this section we shall introduce the collection of sets with content, and
characterize the content function as a real-valued function defined on this
collection of sets. Next we shall obtain some further properties of the
integral over sets with content, and show how the integral can be evaluated
as an "iterated integral."
44.1 DEFINITION. If $A \subseteq \mathbf{R}^p$, then we recall that a point $x \in \mathbf{R}^p$ is said
to be a boundary point of $A$ if every neighborhood of $x$ contains both
points in $A$ and points in its complement $\mathscr{C}(A)$. The boundary of $A$ is the
subset of $\mathbf{R}^p$ consisting of all the boundary points of $A$; it will be denoted
by $b(A)$.
If $A \subseteq \mathbf{R}^p$, we recall that a point of $\mathbf{R}^p$ is precisely one of the following: it is an
interior point of A, it is a boundary point of A, or it is an exterior point of A. The
interior $A^\circ$ consists of all of the interior points of A; it is an open set in $\mathbf{R}^p$. As
noted above, the boundary b(A) consists of all of the boundary points of A; it is a
closed set in $\mathbf{R}^p$. The closure $A^-$ is the union $A \cup b(A)$; it is a closed set in $\mathbf{R}^p$.
We ordinarily expect the boundary of a set to be small, but this is
because we are accustomed to thinking about rectangles, circles, and other
elementary figures. Example 43.2(g) shows that a countable set in $\mathbf{R}^2$
can have boundary equal to $I \times I$.

Sets with Content

We shall now define the content of a subset of $\mathbf{R}^p$ whose boundary has
zero content.
44.2 DEFINITION. A bounded set $A \subseteq \mathbf{R}^p$ whose boundary b(A) has
content zero is said to have content. The collection of all subsets of $\mathbf{R}^p$
which have content will be denoted by $\mathcal{D}(\mathbf{R}^p)$. If $A \in \mathcal{D}(\mathbf{R}^p)$ and if I is a
closed cell containing A, then the function $g_I$ defined by

\[ g_I(x) = 1 \ \text{ for } x \in A, \qquad g_I(x) = 0 \ \text{ for } x \in I \setminus A, \]

is continuous on I except possibly at points of b(A). Hence $g_I$ is integrable
on I and we define the content c(A) of A to be equal to $\int_I g_I$. Thus

\[ c(A) = \int_I g_I = \int_A 1. \]
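To make Definition 44.2 concrete, here is a hedged Python sketch for one set chosen only for illustration (it does not appear in the text): the closed unit disc A in R^2, enclosed in the cell I = [-1, 1] x [-1, 1]. A Riemann sum of the function equal to 1 on A and 0 on I \ A counts the content of some of the squares of a partition; choosing the intermediate points outside A whenever possible counts only the squares contained in A, while choosing them inside A whenever possible counts every square that meets A. The function name `grid_content_bounds` and the uniform grid are assumptions of the sketch.

```python
import math

def grid_content_bounds(n):
    """Smallest and largest Riemann sums of the indicator function of the
    closed unit disc A = {(x, y): x^2 + y^2 <= 1} for the partition of
    I = [-1, 1] x [-1, 1] into n*n squares of side h = 2/n.

    inner: total content of the squares contained in A
    outer: total content of the squares that contain points of A
    """
    h = 2.0 / n
    inner = 0
    outer = 0
    for i in range(n):
        for j in range(n):
            x0, x1 = -1 + i * h, -1 + (i + 1) * h
            y0, y1 = -1 + j * h, -1 + (j + 1) * h
            # squared distance from the origin to the farthest corner:
            # the square lies in A exactly when this is at most 1
            far = max(abs(x0), abs(x1)) ** 2 + max(abs(y0), abs(y1)) ** 2
            # squared distance from the origin to the nearest point of the
            # square: the square meets A exactly when this is at most 1
            nx = 0.0 if x0 <= 0.0 <= x1 else min(abs(x0), abs(x1))
            ny = 0.0 if y0 <= 0.0 <= y1 else min(abs(y0), abs(y1))
            near = nx ** 2 + ny ** 2
            if far <= 1.0:
                inner += 1
            if near <= 1.0:
                outer += 1
    return inner * h * h, outer * h * h

for n in (10, 50, 200):
    lo, hi = grid_content_bounds(n)
    print(n, lo, hi, hi - lo)  # both values approach c(A) = pi
```

The difference between the two sums is the total content of the squares that meet the boundary circle, which is why the hypothesis that b(A) has content zero is exactly what makes c(A) well defined.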

Note that if $J \subseteq \mathbf{R}^p$ is a cell, then its boundary consists of the union of a
finite number of “faces,” which are cells each having content zero. [For
example, if $J = [a, b] \times [c, d]$, then b(J) is the union of the four cells

\[ [a, b] \times [c, c], \quad [a, b] \times [d, d], \quad [a, a] \times [c, d], \quad [b, b] \times [c, d]. \]

These same four cells are also the boundary of the cell $(a, b) \times (c, d)$.] It
follows that a cell in $\mathbf{R}^p$ has content; moreover, one easily sees that if
$J = [a_1, b_1] \times \cdots \times [a_p, b_p]$, then

\[ c(J) = \int_J 1 = (b_1 - a_1) \cdots (b_p - a_p). \]

Hence the content of a cell, as given by Definition 44.2, is consistent with
the definition of the content assigned to a closed cell in Section 43. Similar
remarks apply to other cells in $\mathbf{R}^p$; in particular, it is seen that if
$K = [a_1, b_1) \times \cdots \times [a_p, b_p)$, then

\[ c(K) = \int_K 1 = (b_1 - a_1) \cdots (b_p - a_p). \]


We shall now show that the notion of content zero introduced in
Definition 43.1 is consistent with the notion of content introduced in
Definition 44.2.

44.3 LEMMA. A set $A \subseteq \mathbf{R}^p$ has content zero (in the sense of Definition
43.1) if and only if it has content (in the sense of Definition 44.2) and
c(A) = 0.
PROOF. Suppose that $A \subseteq \mathbf{R}^p$ has content zero. Then, if $\varepsilon > 0$, we can
enclose A in the union U of a finite number of closed cells with total
content less than $\varepsilon$. Since this union U is a bounded set, then A is
bounded; since U is closed, it also contains b(A). Since $\varepsilon > 0$ is arbitrary,
we infer that b(A) has content zero; hence A has content and

\[ c(A) = \int_A 1. \]

It now follows from Lemma 43.7 that c(A) = 0.
Conversely, suppose that $A \subseteq \mathbf{R}^p$ has content and that c(A) = 0. Hence
there is a closed cell I containing A and such that the function

\[ g_I(x) = 1 \ \text{ for } x \in A, \qquad g_I(x) = 0 \ \text{ for } x \in I \setminus A, \]

is integrable on I. Let $\varepsilon > 0$ be given and let $P_\varepsilon$ be a partition of I such that
any Riemann sum corresponding to $P_\varepsilon$ satisfies $0 \le S(P_\varepsilon; g_I) < \varepsilon$. If we
take the intermediate points in $S(P_\varepsilon; g_I)$ to belong to A whenever possible,
we infer that A is contained in the union of a finite number of cells in $P_\varepsilon$
with total content less than $\varepsilon$. Thus A has content zero in the sense of
Definition 43.1. Q.E.D.

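44.4 THEOREM. Let A, B belong to $\mathcal{D}(\mathbf{R}^p)$ and let $x \in \mathbf{R}^p$.
(a) The sets $A \cap B$ and $A \cup B$ belong to $\mathcal{D}(\mathbf{R}^p)$ and

\[ c(A) + c(B) = c(A \cap B) + c(A \cup B). \]

(b) The sets $A \setminus B$ and $B \setminus A$ belong to $\mathcal{D}(\mathbf{R}^p)$ and

\[ c(A \cup B) = c(A \setminus B) + c(A \cap B) + c(B \setminus A). \]

(c) If $x + A = \{x + a : a \in A\}$, then $x + A$ belongs to $\mathcal{D}(\mathbf{R}^p)$ and

\[ c(x + A) = c(A). \]

PROOF. By hypothesis, the boundaries b(A), b(B) have content zero.
We leave it as an exercise to the reader to show that the boundaries

\[ b(A \cap B), \quad b(A \cup B), \quad b(A \setminus B), \quad b(B \setminus A) \]

are contained in $b(A) \cup b(B)$. It follows from this and Example 43.2(c)
that $A \cap B$, $A \cup B$, $A \setminus B$, and $B \setminus A$ belong to $\mathcal{D}(\mathbf{R}^p)$.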
Now let I be a closed cell containing $A \cup B$ and let $f_a, f_b, f_i, f_u$ be the
functions equal to 1 on $A$, $B$, $A \cap B$, $A \cup B$, respectively, and equal to 0
elsewhere on I. Since each of these functions is continuous except on a
set with content zero, they are integrable on I. Since

\[ f_a + f_b = f_i + f_u, \]

it follows from Theorem 43.5 and the definition of content that

\[ c(A) + c(B) = \int_I f_a + \int_I f_b = \int_I (f_a + f_b) = \int_I (f_i + f_u) = \int_I f_i + \int_I f_u = c(A \cap B) + c(A \cup B). \]

This establishes the formula given in (a); the one in (b) can be proved
similarly.
To prove (c), note that if $\varepsilon > 0$ is given and if $J_1, \dots, J_n$ are cells with
total content less than $\varepsilon$ whose union contains b(A), then $x + J_1, \dots, x + J_n$
are cells with total content less than $\varepsilon$ whose union contains $b(x + A)$.
Since $\varepsilon > 0$ is arbitrary, the set $x + A$ belongs to $\mathcal{D}(\mathbf{R}^p)$. To show that
$c(x + A) = c(A)$, let I be a closed cell containing A; hence $x + I$ is a closed
cell containing $x + A$. Let $f_1 : I \to \mathbf{R}$ be such that $f_1(y) = 1$ for $y \in A$ and
$f_1(y) = 0$ for $y \in I \setminus A$, and let $f_2 : x + I \to \mathbf{R}$ be such that $f_2(z) = 1$ for
$z \in x + A$ and $f_2(z) = 0$ for $z \in (x + I) \setminus (x + A)$. Show that to each
Riemann sum for $f_1$ there corresponds a Riemann sum for $f_2$ which is equal
to it. Hence

\[ c(A) = \int_I f_1 = \int_{x+I} f_2 = c(x + A). \qquad \text{Q.E.D.} \]
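The formulas of Theorem 44.4 can be checked by hand for cells, and the following Python sketch does so in exact rational arithmetic. It is only an illustration under stated assumptions: the sets A and B are two hypothetical overlapping rectangles in R^2, each encoded as a tuple (x0, x1, y0, y1), and the helper names `area`, `intersect`, `union_area`, and `translate` are invented for this example.

```python
from fractions import Fraction as Fr

def area(r):
    """Content of the cell r = (x0, x1, y0, y1); a 'reversed' cell counts as empty."""
    x0, x1, y0, y1 = r
    return max(x1 - x0, 0) * max(y1 - y0, 0)

def intersect(r, s):
    """The cell r ∩ s (possibly empty, i.e. with a reversed interval)."""
    return (max(r[0], s[0]), min(r[1], s[1]), max(r[2], s[2]), min(r[3], s[3]))

def union_area(r, s):
    """Content of r ∪ s: cut the plane along every edge coordinate and add
    the content of each resulting piece that lies inside r or inside s."""
    xs = sorted({r[0], r[1], s[0], s[1]})
    ys = sorted({r[2], r[3], s[2], s[3]})
    total = 0
    for xa, xb in zip(xs, xs[1:]):
        for ya, yb in zip(ys, ys[1:]):
            piece = (xa, xb, ya, yb)
            inside_r = area(intersect(piece, r)) == area(piece)
            inside_s = area(intersect(piece, s)) == area(piece)
            if inside_r or inside_s:
                total += area(piece)
    return total

def translate(r, dx, dy):
    """The translate (dx, dy) + r."""
    return (r[0] + dx, r[1] + dx, r[2] + dy, r[3] + dy)

A = (Fr(0), Fr(2), Fr(0), Fr(1))
B = (Fr(1), Fr(3), Fr(1, 2), Fr(2))

print(area(A) + area(B), area(intersect(A, B)) + union_area(A, B))  # both sides of (a)
print(area(translate(A, Fr(7, 3), Fr(-5))), area(A))                # both sides of (c)
```

Here c(A) = 2, c(B) = 3, c(A ∩ B) = 1/2, and c(A ∪ B) = 9/2, so both sides of the formula in (a) equal 5.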

44.5 COROLLARY. Let A and B belong to $\mathcal{D}(\mathbf{R}^p)$.
(a) If $A \cap B = \emptyset$, then $c(A \cup B) = c(A) + c(B)$.
(b) If $A \subseteq B$, then $c(B \setminus A) = c(B) - c(A)$.

Characterization of the Content Function

We have seen that the content function $c : \mathcal{D}(\mathbf{R}^p) \to \mathbf{R}$ is non-negative,
“additive,” “invariant under translation,” and assigns the value 1 to the
“half-open” cube

\[ K_0 = [0, 1) \times [0, 1) \times \cdots \times [0, 1). \]

We shall now show that these four properties characterize c.
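44.6 THEOREM. Let $\gamma : \mathcal{D}(\mathbf{R}^p) \to \mathbf{R}$ be a function with the following
properties:
(i) $\gamma(A) \ge 0$ for all $A \in \mathcal{D}(\mathbf{R}^p)$;
(ii) if $A, B \in \mathcal{D}(\mathbf{R}^p)$ and $A \cap B = \emptyset$, then $\gamma(A \cup B) = \gamma(A) + \gamma(B)$;
(iii) if $A \in \mathcal{D}(\mathbf{R}^p)$ and $x \in \mathbf{R}^p$, then $\gamma(A) = \gamma(x + A)$;
(iv) $\gamma(K_0) = 1$.
Then we have $\gamma(A) = c(A)$ for all $A \in \mathcal{D}(\mathbf{R}^p)$.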

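PROOF. If $n \in \mathbf{N}$, let $K_n$ be the “half-open” cube

\[ K_n = [0, 2^{-n}) \times [0, 2^{-n}) \times \cdots \times [0, 2^{-n}). \]

We note that $K_0$ is the union of $2^{np}$ disjoint translates of $K_n$; hence
$1 = \gamma(K_0) = 2^{np}\, \gamma(K_n)$ and so $\gamma(K_n) = 1/2^{np} = c(K_n)$.
Let A, B belong to $\mathcal{D}(\mathbf{R}^p)$ and let $A \subseteq B$. Then we can write
$B = A \cup (B \setminus A)$; since $A \cap (B \setminus A) = \emptyset$, it follows from (i) and (ii) that

\[ \gamma(B) = \gamma(A) + \gamma(B \setminus A) \ge \gamma(A). \]

Hence $\gamma$ is monotone in the sense that if $A \subseteq B$ then $\gamma(A) \le \gamma(B)$. Now
let $A \in \mathcal{D}(\mathbf{R}^p)$. Since A is bounded, then for some $M \in \mathbf{N}$, the set A is
contained in the interior of the closed cube I with half side length $2^M$ and
center at 0. If $\varepsilon > 0$, there is a partition of I into small cubes of side length
$2^{-n}$, say, such that the content of the union of all the cubes $I_1, \dots, I_r$ which
are contained in A exceeds $c(A) - \varepsilon$, and such that the content of all the
cells $I_1, \dots, I_s$ ($r \le s$) which contain points of A does not exceed $c(A) + \varepsilon$.
(See Exercise 44.I.) Now each of these sets $I_i$ differs from a translate
$x_i + K_n$ by a set of content zero. Hence we have

\[ c(A) - \varepsilon \le c\Big(\bigcup_{i=1}^{r} (x_i + K_n)\Big) \le c(A) \le c\Big(\bigcup_{i=1}^{s} (x_i + K_n)\Big) \le c(A) + \varepsilon. \]

Now c and $\gamma$ are both invariant under translation of the set and agree on
$K_n$. Moreover c and $\gamma$ are additive over disjoint finite unions. Hence it
follows that

\[ c\Big(\bigcup_{i=1}^{r} (x_i + K_n)\Big) = \sum_{i=1}^{r} c(x_i + K_n) = \sum_{i=1}^{r} \gamma(x_i + K_n) = \gamma\Big(\bigcup_{i=1}^{r} (x_i + K_n)\Big), \]

and similarly for the union over $i = 1, \dots, s$.
It follows from this and the fact that $\gamma$ is monotone that

\[ c(A) - \varepsilon \le \gamma\Big(\bigcup_{i=1}^{r} (x_i + K_n)\Big) \le \gamma(A) \le \gamma\Big(\bigcup_{i=1}^{s} (x_i + K_n)\Big) \le c(A) + \varepsilon, \]

whence $|\gamma(A) - c(A)| \le \varepsilon$. Since $\varepsilon > 0$ is arbitrary, we infer that $\gamma(A) = c(A)$.
Q.E.D.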

44.7 COROLLARY. Let $\mu : \mathcal{D}(\mathbf{R}^p) \to \mathbf{R}$ be a function satisfying properties
(i), (ii), and (iii). Then there exists a constant $m \ge 0$ such that
$\mu(A) = m\, c(A)$ for all $A \in \mathcal{D}(\mathbf{R}^p)$.

PROOF. Since $\mu$ possesses properties (i) and (ii), it is easily seen that $\mu$ is
monotone in the sense that $A \subseteq B$ implies that $\mu(A) \le \mu(B)$. If $\mu(K_0) = 0$,
then $\mu$ of any bounded set is 0, whence it follows that $\mu(A) = 0$ for all
$A \in \mathcal{D}(\mathbf{R}^p)$, so we can take $m = 0$. If $\mu(K_0) \ne 0$, let

\[ \gamma(A) = \frac{1}{\mu(K_0)}\, \mu(A) \qquad \text{for all } A \in \mathcal{D}(\mathbf{R}^p). \]

Since it is readily seen that $\gamma$ has properties (i), (ii), (iii), and (iv) of the
theorem, it follows that $\gamma = c$. Hence we take $m = \mu(K_0)$. Q.E.D.

Further Properties of the Integral


We shall now present some additional properties of the integral that are
often useful.
44.8 THEOREM. Let $A \in \mathcal{D}(\mathbf{R}^p)$ and let $f : A \to \mathbf{R}$ be bounded and
continuous on A. Then f is integrable on A.
PROOF. Let I be a closed cell with $A \subseteq I$ and let $f_I : I \to \mathbf{R}$ be equal to f
on A and to 0 on $I \setminus A$. Since $f_I$ is bounded on I and is continuous on
$I \setminus b(A)$, it follows from the Integrability Theorem 43.9 that $f_I$ is integrable
on I. Therefore f is integrable on A. Q.E.D.
We shall now show that the integral is additive with respect to the set
over which the integral is extended.
44.9 THEOREM. (a) Let $A_1$ and $A_2$ belong to $\mathcal{D}(\mathbf{R}^p)$ and suppose that
$A_1 \cap A_2$ has content zero. If $A = A_1 \cup A_2$ and if $f : A \to \mathbf{R}$ is integrable on
$A_1$ and $A_2$, then f is integrable on A and

\[ (44.1) \qquad \int_A f = \int_{A_1} f + \int_{A_2} f. \]

(b) Let A belong to $\mathcal{D}(\mathbf{R}^p)$ and let $A_1, A_2 \in \mathcal{D}(\mathbf{R}^p)$ be such that
$A = A_1 \cup A_2$ and such that $A_1 \cap A_2$ has content zero. If $f : A \to \mathbf{R}$ is integrable
on A, and if the restrictions of f to $A_1$ and $A_2$ are integrable, then (44.1)
holds.
PROOF. (a) Let I be a closed cell containing $A = A_1 \cup A_2$ and let
$f_i : I \to \mathbf{R}$, $i = 1, 2$, be equal to f on $A_i$ and equal to 0 elsewhere on I. By
hypothesis, $f_1$ and $f_2$ are integrable on I and

\[ \int_I f_i = \int_{A_i} f, \qquad i = 1, 2. \]

It follows from Theorem 43.5 that $f_1 + f_2$ is integrable on I and that

\[ \int_I (f_1 + f_2) = \int_I f_1 + \int_I f_2. \]

Now since $f(x) = f_1(x) + f_2(x)$ provided $x \in A \setminus (A_1 \cap A_2)$, it follows from
Lemma 43.8 that f is integrable on A and that (44.1) holds.
(b) We preserve the notation of the proof of (a). By hypothesis, $f_I$ is
integrable on I. Now, $f_I(x) = f_1(x) + f_2(x)$ except for x in $A_1 \cap A_2$, a set
with content zero. It therefore follows from Theorem 43.5 and Lemma
43.8 that

\[ \int_A f = \int_I f_I = \int_I (f_1 + f_2) = \int_I f_1 + \int_I f_2 = \int_{A_1} f + \int_{A_2} f. \qquad \text{Q.E.D.} \]

We remark that if $f : A \to \mathbf{R}$ is a bounded integrable function, then the
assumption made in 44.9(b) that the restrictions of f to $A_1$ and $A_2$ are
integrable is automatically satisfied. (See Exercise 44.J.)
The next result is often useful to estimate the magnitude of an integral.
44.10 THEOREM. Let $A \in \mathcal{D}(\mathbf{R}^p)$ and let $f : A \to \mathbf{R}$ be integrable on A
and such that $|f(x)| \le M$ for all $x \in A$. Then

\[ (44.2) \qquad \Big| \int_A f \Big| \le M\, c(A). \]

More generally, if f is real valued and $m \le f(x) \le M$ for all $x \in A$, then

\[ (44.3) \qquad m\, c(A) \le \int_A f \le M\, c(A). \]

PROOF. Let $f_I$ be the extension of f to a closed cell I containing A. If
$\varepsilon > 0$ is given, then there exists a partition $P_\varepsilon = \{J_1, \dots, J_n\}$ of I such that if
$S(P_\varepsilon; f_I)$ is any corresponding Riemann sum, then

\[ S(P_\varepsilon; f_I) - \varepsilon \le \int_I f_I \le S(P_\varepsilon; f_I) + \varepsilon. \]

We note that if the intermediate points of the Riemann sum are chosen
outside of A whenever possible, we have

\[ S(P_\varepsilon; f_I) = \sum_k f(x_k)\, c(J_k), \]

where the sum is extended over those cells in $P_\varepsilon$ entirely contained in A.
Hence

\[ S(P_\varepsilon; f_I) \le M \sum_k c(J_k) \le M\, c(A). \]

Therefore we have

\[ \int_A f = \int_I f_I \le M\, c(A) + \varepsilon, \]

and since $\varepsilon > 0$ is arbitrary we obtain the right side of inequality (44.3).
The left side is established in a similar manner. Q.E.D.

As a consequence of this result, we obtain the following theorem, which
is an extension of the First Mean Value Theorem 30.6.
44.11 MEAN VALUE THEOREM. Let $A \in \mathcal{D}(\mathbf{R}^p)$ be a connected set and
let $f : A \to \mathbf{R}$ be bounded and continuous on A. Then there exists a point
$p \in A$ such that

\[ (44.4) \qquad \int_A f = f(p)\, c(A). \]

PROOF. If c(A) = 0, the conclusion is trivial; hence we suppose that
$c(A) \ne 0$. Let

\[ m = \inf\{f(x) : x \in A\}, \qquad M = \sup\{f(x) : x \in A\}; \]

it follows from the second part of the preceding theorem that

\[ (44.5) \qquad m \le \frac{1}{c(A)} \int_A f \le M. \]

If both inequalities in (44.5) are strict, the result follows from Bolzano's
Intermediate Value Theorem 22.4.
Now suppose that $\int_A f = M\, c(A)$. If the supremum M is attained at
$p \in A$, the conclusion also follows. Hence we assume that the supremum
M is not attained on A. Since $c(A) \ne 0$, there exists a closed cell $K \subseteq A$
such that $c(K) \ne 0$ (see Exercise 44.G). Since K is compact and f is
continuous on K, there exists $\varepsilon > 0$ such that $f(x) \le M - \varepsilon$ for all $x \in K$.
Since $A = K \cup (A \setminus K)$ it follows from Theorems 44.9 and 44.10 that

\[ M\, c(A) = \int_A f = \int_K f + \int_{A \setminus K} f \le (M - \varepsilon)\, c(K) + M\, c(A \setminus K) < M\, c(A), \]

a contradiction. If $\int_A f = m\, c(A)$, then a similar argument applies. Q.E.D.
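The Mean Value Theorem 44.11 can be illustrated numerically. The sketch below uses a hypothetical example that is not in the text: f(x, y) = x^2 + y on the connected set A = [0, 1] x [0, 1], for which c(A) = 1 and the integral equals 5/6. The mean value is approximated by a midpoint Riemann sum, and a point p with f(p) equal to that mean is then located by bisection along the diagonal of A, which mimics the appeal to Bolzano's Intermediate Value Theorem in the proof. The function names are invented for this illustration.

```python
def f(x, y):
    return x * x + y

def midpoint_mean(n):
    """Approximate (1/c(A)) * (integral of f over A) for A = [0,1] x [0,1],
    where c(A) = 1, by the Riemann sum using cell midpoints."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f((i + 0.5) * h, (j + 0.5) * h) * h * h
    return total

def point_on_diagonal(target, tol=1e-12):
    """Bisection for t in [0, 1] with f(t, t) = target.
    Along the diagonal, f(t, t) = t^2 + t increases from 0 to 2."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mean = midpoint_mean(200)
t = point_on_diagonal(mean)
print(mean, (t, t), f(t, t))  # f(p) agrees with the mean value at p = (t, t)
```

The point p is far from unique here; any point on the level curve x^2 + y = 5/6 inside A would serve equally well.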



The Integral as an Iterated Integral


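It is desirable to know that if f is integrable on a closed cell
$J = [a_1, b_1] \times \cdots \times [a_p, b_p]$ in $\mathbf{R}^p$ and has values in $\mathbf{R}$, then the integral $\int_J f$ can
be calculated in terms of a p-fold “iterated integral”:

\[ \int_{a_p}^{b_p} \Big\{ \cdots \Big\{ \int_{a_2}^{b_2} \Big\{ \int_{a_1}^{b_1} f(x_1, x_2, \dots, x_p)\, dx_1 \Big\}\, dx_2 \Big\} \cdots \Big\}\, dx_p. \]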

This is the method of evaluating double and triple integrals by means of
iterated integrals that is familiar to the reader from elementary calculus.
We shall give a justification of this procedure: for simplicity we shall
suppose that p=2, but clearly the result extends to higher dimensions.
44.12 THEOREM. If f is continuous on the closed cell $J = [a, b] \times [c, d]$
to $\mathbf{R}$, then

\[ \int_J f = \int_c^d \Big\{ \int_a^b f(x, y)\, dx \Big\}\, dy = \int_a^b \Big\{ \int_c^d f(x, y)\, dy \Big\}\, dx. \]

PROOF. It was seen in the Interchange Theorem 31.9 that the two
iterated integrals are equal. To show that the integral of f on J is given by
the first iterated integral, let F be defined for $y \in [c, d]$ by

\[ F(y) = \int_a^b f(x, y)\, dx. \]

Let $c = y_0 \le y_1 \le \cdots \le y_r = d$ be a partition of the interval [c, d], let
$a = x_0 \le x_1 \le \cdots \le x_s = b$ be a partition of [a, b], and let P denote the
partition of J obtained by using the cells $[x_{k-1}, x_k] \times [y_{j-1}, y_j]$. Let $y_j^*$ be
any point in $[y_{j-1}, y_j]$ and note that

\[ F(y_j^*) = \int_a^b f(x, y_j^*)\, dx = \sum_{k=1}^{s} \int_{x_{k-1}}^{x_k} f(x, y_j^*)\, dx. \]

According to the First Mean Value Theorem 30.6, for each j and k there
exists a point $x_{jk}^*$ in $[x_{k-1}, x_k]$ such that

\[ F(y_j^*) = \sum_{k=1}^{s} f(x_{jk}^*, y_j^*)(x_k - x_{k-1}). \]

We multiply by $(y_j - y_{j-1})$ and add to obtain

\[ \sum_{j=1}^{r} F(y_j^*)(y_j - y_{j-1}) = \sum_{j=1}^{r} \sum_{k=1}^{s} f(x_{jk}^*, y_j^*)(x_k - x_{k-1})(y_j - y_{j-1}). \]

Now the expression on the left side of this formula is an arbitrary Riemann
sum for the integral

\[ \int_c^d F(y)\, dy = \int_c^d \Big\{ \int_a^b f(x, y)\, dx \Big\}\, dy. \]

We have shown that this Riemann sum is equal to a particular Riemann
sum corresponding to the partition P. Since f is integrable on J, the
existence of this iterated integral and its equality with the integral over J is
established. Q.E.D.
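As an informal check of Theorem 44.12, the following Python sketch evaluates both iterated integrals numerically for one hypothetical integrand, f(x, y) = x y^2 on the cell [0, 1] x [0, 2], whose integral is (1/2)(8/3) = 4/3. The one-dimensional midpoint rule used here is merely a convenient Riemann sum; the helper name `midpoint` and the number of subintervals are assumptions of the sketch.

```python
def midpoint(g, a, b, n=200):
    """Midpoint-rule Riemann sum for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def f(x, y):
    return x * y * y

a, b, c, d = 0.0, 1.0, 0.0, 2.0

# F(y) = integral of f(x, y) dx over [a, b], then integrate F over [c, d]
dx_then_dy = midpoint(lambda y: midpoint(lambda x: f(x, y), a, b), c, d)

# the other order: integrate in y first, then in x
dy_then_dx = midpoint(lambda x: midpoint(lambda y: f(x, y), c, d), a, b)

print(dx_then_dy, dy_then_dx)  # both are close to 4/3
```

With the same midpoints used in both orders, the two computations add up the same double Riemann sum in a different order, which is the finite counterpart of the identity established in the proof.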
A minor modification of the proof given for the preceding theorem
yields the following, slightly stronger, assertion.
44.13 THEOREM. Let f be integrable on the rectangle $J = [a, b] \times [c, d]$
to $\mathbf{R}$ and suppose that, for each $y \in [c, d]$, the function $x \mapsto f(x, y)$ of [a, b]
into $\mathbf{R}$ is continuous except possibly for a finite number of points, at which it
has one-sided limits. Then

\[ \int_J f = \int_c^d \Big\{ \int_a^b f(x, y)\, dx \Big\}\, dy. \]
As a consequence of this theorem, we shall obtain a result which is often
used in evaluating integrals of functions defined on a set which is bounded
by continuous curves. For convenience, we shall state the result in the
case where the set has horizontal line segments as its top and bottom
boundaries, and continuous curves as its lateral boundaries. Clearly, a
similar result holds if the lateral boundaries are vertical line segments and
the top and bottom boundaries are curves. More complicated sets are
handled by decomposing the sets into the union of subsets of these two
types.
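44.14 THEOREM. Let $A \subseteq \mathbf{R}^2$ be given by

\[ A = \{(x, y) : \alpha(y) \le x \le \beta(y),\ c \le y \le d\}, \]

where $\alpha$ and $\beta$ are continuous functions on [c, d] with values in [a, b]. If
$f : A \to \mathbf{R}$ is continuous, then f is integrable on A and

\[ \int_A f = \int_c^d \Big\{ \int_{\alpha(y)}^{\beta(y)} f(x, y)\, dx \Big\}\, dy. \]

[Figure 44.1: the set A, bounded laterally by the curves $x = \alpha(y)$ and $x = \beta(y)$.]

PROOF. Let J be a closed cell containing A and let $f_J$ be the extension of
f to J. A variation of Example 43.2(f) shows that the boundary of A has
content zero; hence $f_J$ is integrable on J. Now for each $y \in [c, d]$ the
function $x \mapsto f_J(x, y)$ is continuous except possibly at the two points $\alpha(y)$
and $\beta(y)$, at which it has one-sided limits. It follows from the preceding
theorem that

\[ \int_A f = \int_J f_J = \int_c^d \Big\{ \int_a^b f_J(x, y)\, dx \Big\}\, dy = \int_c^d \Big\{ \int_{\alpha(y)}^{\beta(y)} f(x, y)\, dx \Big\}\, dy. \qquad \text{Q.E.D.} \]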
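A worked instance of Theorem 44.14 may help. In the sketch below, all of the data are hypothetical choices made for illustration: α(y) = y^2, β(y) = y on [c, d] = [0, 1], and f(x, y) = x, so that the iterated integral equals the integral of (y^2 - y^4)/2 over [0, 1], namely 1/15. The code compares the iterated integral with a Riemann sum over the enclosing cell J = [0, 1] x [0, 1] of the extension of f that vanishes off A, which is the function used in the proof.

```python
def midpoint(g, a, b, n=400):
    """Midpoint-rule Riemann sum for the integral of g over [a, b]."""
    h = (b - a) / n
    return sum(g(a + (k + 0.5) * h) for k in range(n)) * h

def alpha(y):
    return y * y

def beta(y):
    return y

def f(x, y):
    return x

def inner(y):
    """Integral of x -> f(x, y) over [alpha(y), beta(y)]."""
    return midpoint(lambda x: f(x, y), alpha(y), beta(y))

# iterated integral over A = {(x, y): alpha(y) <= x <= beta(y), 0 <= y <= 1}
iterated = midpoint(inner, 0.0, 1.0)

def grid_sum(n=400):
    """Riemann sum over J = [0,1] x [0,1] of the extension of f
    that equals f on A and 0 on J \\ A."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            x, y = (i + 0.5) * h, (j + 0.5) * h
            if alpha(y) <= x <= beta(y):
                total += f(x, y) * h * h
    return total

print(iterated, grid_sum(), 1.0 / 15.0)
```

The Riemann sum over J converges more slowly than the iterated integral because the extension is discontinuous along the boundary of A; the cells that meet that boundary contribute an error proportional to their total content.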
Exercises

44.A. If $A \subseteq \mathbf{R}^p$, then a point is a boundary point of A if and only if it is a
boundary point of the complement $\mathscr{C}(A)$ of A. Hence $b(A) = b(\mathscr{C}(A))$.
44.B. Let $A \subseteq \mathbf{R}^p$, and let b(A) be the boundary of A.
(a) The set b(A) is closed in $\mathbf{R}^p$.
(b) The interior $A^\circ = A \setminus b(A)$ is open in $\mathbf{R}^p$ and contains every open set G with
$G \subseteq A$.
(c) The closure $A^- = A \cup b(A)$ is closed in $\mathbf{R}^p$ and is contained in every closed
set F with $A \subseteq F$.
44.C. Let $A \subseteq \mathbf{R}^p$ and let $A^- = A \cup b(A)$ be the closure of A. Show that
$b(A^-) \subseteq b(A)$. Give an example to show that equality can hold, and an example
to show that equality can fail.
44.D. Let A, B be subsets of $\mathbf{R}^p$. Show that the boundary of each of the sets

\[ A \cap B, \qquad A \setminus B, \qquad A \cup B \]

is contained in $b(A) \cup b(B)$. (Hint: $b(A) = A^- \cap (\mathscr{C}(A))^-$.)
44.E. A set $A \subseteq \mathbf{R}^p$ is closed in $\mathbf{R}^p$ if and only if $b(A) \subseteq A$. A set $B \subseteq \mathbf{R}^p$ is open
in $\mathbf{R}^p$ if and only if $B \cap b(B) = \emptyset$.
44.F. If $A \in \mathcal{D}(\mathbf{R}^p)$, show that its interior $A^\circ = A \setminus b(A)$ and its closure
$A^- = A \cup b(A)$ also belong to $\mathcal{D}(\mathbf{R}^p)$ and that $c(A^\circ) = c(A) = c(A^-)$.
44.G. If $A \in \mathcal{D}(\mathbf{R}^p)$ and $c(A) > 0$, prove that there exists a closed cell $K \subseteq A$
such that $c(K) \ne 0$.
44.H. If $A \subseteq \mathbf{R}^p$, we define the inner and outer content of A to be

\[ c_*(A) = \sup c(U), \qquad c^*(A) = \inf c(V), \]

where the supremum is taken over the set of all finite unions of cells contained in A,
and the infimum is taken over the set of all finite unions of cells containing points
of A.
(a) Prove that $c_*(A) \le c^*(A)$ and that A has content if and only if $c_*(A) = c^*(A)$,
in which case c(A) is this common value.
(b) If A and B are disjoint subsets of $\mathbf{R}^p$, show that $c^*(A \cup B) \le c^*(A) + c^*(B)$.
(c) Give an example of disjoint sets A, B such that $0 \ne c^*(A) = c^*(B) = c^*(A \cup B)$.
44.I. Let $M \in \mathbf{N}$ and let $I_M \subseteq \mathbf{R}^p$ be the cube with half side length $2^M$ and center 0.
For $n \in \mathbf{N}$ we divide $I_M$ into a grid $G_{M,n}$ of length $2^{-n}$ formed by the collection of all
cubes in $I_M$ with side length $2^{-n}$ and dyadic rational end points (that is, end points of
the form $k/2^n$ where $k \in \mathbf{Z}$).
(a) If $J \subseteq I_M$ is a closed cell and $\varepsilon > 0$, show that there exists $n \in \mathbf{N}$ such that the
union of all the cubes in $G_{M,n}$ which are contained in J has a total content more than
$c(J) - \varepsilon$ and the union of all the cubes in $G_{M,n}$ which contain points in J has total
content less than $c(J) + \varepsilon$.
(b) If $A \subseteq I_M$ has content and $\varepsilon > 0$, show that there exists $n \in \mathbf{N}$ such that the
union of all cubes in $G_{M,n}$ which are contained in A has total content exceeding
$c(A) - \varepsilon$ and the union of all cubes in $G_{M,n}$ which contain points in A has total
content less than $c(A) + \varepsilon$.
44.J. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $f : I \to \mathbf{R}$ be integrable on I. If $A \subseteq I$ has
content, then the restriction of f to A is integrable on A. (Hint: Use Exercise
43.P.)
44.K. Let $A \in \mathcal{D}(\mathbf{R}^p)$ and suppose that f and g are integrable on A and that
$g(x) \ge 0$ for all $x \in A$. If $m = \inf f(A)$, $M = \sup f(A)$, then there exists a real
number $\mu \in [m, M]$ such that

\[ \int_A fg = \mu \int_A g. \]

44.L. If, in addition to the hypotheses of the previous exercise, we suppose that
A is connected and f is continuous on A, then there exists a point $p \in A$ such that

\[ \int_A fg = f(p) \int_A g. \]

44.M. Let $\{(x_n, y_n) : n \in \mathbf{N}\}$ be an enumeration of the points in $(0, 1) \times (0, 1)$ with
rational coordinates. For each $n \in \mathbf{N}$, let $I_n$ be an open cell in $(0, 1) \times (0, 1)$
containing $(x_n, y_n)$, and let $G = \bigcup_{n \in \mathbf{N}} I_n$. Show that G is an open set in $\mathbf{R}^2$ whose
boundary b(G) is $[0, 1] \times [0, 1] \setminus G$. Show that if $\sum c(I_n) < 1$, then the open set G
does not have content.
44.N. Using the terminology of Exercise 7.K, let $A \subseteq [0, 1]$ be a “Cantor-like”
set with length $\tfrac{1}{2}$. If $K = A \times [0, 1]$, show that K is a compact subset of $\mathbf{R}^2$, that
$b(K) = K$, and that K does not have content.
44.O. Let $a \le b$ and let $f : [a, b] \to \mathbf{R}$ be continuous and such that $f(x) \ge 0$ for all
$x \in [a, b]$. Let $S_f = \{(x, y) : a \le x \le b,\ 0 \le y \le f(x)\}$ be called the ordinate set of f.
By examining the boundary of $S_f$, show that it has content. Show that

\[ c(S_f) = \int_a^b \Big\{ \int_0^{f(x)} 1\, dy \Big\}\, dx = \int_a^b f(x)\, dx. \]


44.P. Let $A \subseteq \mathbf{R}^2$ be the set in Exercise 43.E and let f be defined on
$Q = [0, 1] \times [0, 1]$ to $\mathbf{R}$ by $f(x, y) = 1$ for $(x, y) \in A$ and $f(x, y) = 0$ otherwise. Show
that A does not have content and that f is not integrable on Q. However, the
iterated integrals exist and satisfy

\[ \int_0^1 \Big\{ \int_0^1 f(x, y)\, dx \Big\}\, dy = \int_0^1 \Big\{ \int_0^1 f(x, y)\, dy \Big\}\, dx. \]

44.Q. Let $Q = [0, 1] \times [0, 1]$ and let $f : Q \to \mathbf{R}$ be defined by $f(x, y) = 0$ if either x
or y is irrational and $f(x, y) = 1/n$ if y is rational and $x = m/n$ where m and $n > 0$
are relatively prime integers. Show that

\[ \int_Q f = \int_0^1 \Big\{ \int_0^1 f(x, y)\, dx \Big\}\, dy = 0, \]

but that $\int_0^1 f(x, y)\, dy$ does not exist for rational x.
44.R. Let $J \subseteq \mathbf{R}^2$ be an open cell containing (0, 0) and let $f : J \to \mathbf{R}$ be continuous
on J. Define $F : J \to \mathbf{R}$ by the iterated integral:

\[ F(x, y) = \int_0^x \Big\{ \int_0^y f(s, t)\, dt \Big\}\, ds. \]

Show that $D_1 D_2 F(x, y) = f(x, y) = D_2 D_1 F(x, y)$ for $(x, y) \in J$.
44.S. Let J be as in the previous exercise and let $G : J \to \mathbf{R}$ be such that $D_1 D_2 G$
is continuous on J. Use this exercise to show that $D_2 D_1 G$ exists and equals
$D_1 D_2 G$.
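44.T. Let $J = [a_1, b_1] \times \cdots \times [a_p, b_p]$ and let $f : J \to \mathbf{R}$ be continuous. Let
$J_{(1)} = [a_2, b_2] \times \cdots \times [a_p, b_p]$ in $\mathbf{R}^{p-1}$ and let $F_{(1)} : J_{(1)} \to \mathbf{R}$ be defined by

\[ F_{(1)}(x_2, \dots, x_p) = \int_{a_1}^{b_1} f(x_1, x_2, \dots, x_p)\, dx_1. \]

(a) Show that $F_{(1)}$ is continuous on $J_{(1)}$.
(b) Given $(x_2^*, \dots, x_p^*)$ in $J_{(1)}$ and any partition $a_1 = x_{1,0} < x_{1,1} < \cdots < x_{1,s} = b_1$ of
$[a_1, b_1]$, show that there exist points $x_{1,k}^*$ in $[x_{1,k-1}, x_{1,k}]$ such that

\[ F_{(1)}(x_2^*, \dots, x_p^*) = \sum_{k=1}^{s} f(x_{1,k}^*, x_2^*, \dots, x_p^*)(x_{1,k} - x_{1,k-1}). \]

(c) Prove that

\[ \int_{J_{(1)}} F_{(1)}(x_2, \dots, x_p)\, d(x_2, \dots, x_p) = \int_{J_{(1)}} \Big\{ \int_{a_1}^{b_1} f(x_1, x_2, \dots, x_p)\, dx_1 \Big\}\, d(x_2, \dots, x_p) = \int_J f(x_1, \dots, x_p)\, d(x_1, \dots, x_p). \]

(d) Extend the result to the case where for each point $(x_2, \dots, x_p)$ in $J_{(1)}$ the
function $x_1 \mapsto f(x_1, x_2, \dots, x_p)$ of $[a_1, b_1] \to \mathbf{R}$ is continuous except possibly for a
finite number of points at which it has one-sided limits.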
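44.U. (a) Let $\alpha, \beta : [a, b] \to \mathbf{R}$ be continuous with $\alpha(x) \le \beta(x)$ for all $x \in [a, b]$.
Show that the set

\[ B = \{(x, y) \in \mathbf{R}^2 : a \le x \le b,\ \alpha(x) \le y \le \beta(x)\} \]

is a compact set in $\mathbf{R}^2$ with content.
(b) Now let $\gamma, \delta : B \to \mathbf{R}$ be continuous functions with $\gamma(x, y) \le \delta(x, y)$ for all
$(x, y) \in B$. Show that the set

\[ D = \{(x, y, z) \in \mathbf{R}^3 : (x, y) \in B,\ \gamma(x, y) \le z \le \delta(x, y)\} \]

is a compact set in $\mathbf{R}^3$ with content.
(c) If $f : D \to \mathbf{R}$ is continuous, show that f is integrable on D and that

\[ \int_D f = \int_a^b \Big\{ \int_{\alpha(x)}^{\beta(x)} \Big\{ \int_{\gamma(x,y)}^{\delta(x,y)} f(x, y, z)\, dz \Big\}\, dy \Big\}\, dx. \]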


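44.V. Let $I = [a_1, b_1] \times \cdots \times [a_p, b_p]$ and for each $j = 1, \dots, p$, let $f_j : [a_j, b_j] \to \mathbf{R}$
be an integrable function.
If $\varphi : I \to \mathbf{R}$ is defined by $\varphi(x_1, \dots, x_p) = f_1(x_1) \cdots f_p(x_p)$, show that $\varphi$ is
integrable on I and that

\[ \int_I \varphi = \prod_{j=1}^{p} \Big\{ \int_{a_j}^{b_j} f_j \Big\}. \]

44.W. Use the Weierstrass Approximation Theorem to show that if
$I = [a_1, b_1] \times \cdots \times [a_p, b_p]$ and if $g : I \to \mathbf{R}$ is continuous, then

\[ \int_I g = \int_{a_p}^{b_p} \Big\{ \cdots \Big\{ \int_{a_2}^{b_2} \Big\{ \int_{a_1}^{b_1} g(x_1, x_2, \dots, x_p)\, dx_1 \Big\}\, dx_2 \Big\} \cdots \Big\}\, dx_p. \]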

44.X. Let $\varphi : [0, +\infty) \to \mathbf{R}$ with $\varphi(0) = 0$ be continuous, unbounded, and strictly
increasing, and let $\psi$ be its inverse function. Hence $\psi$ is also continuous and
strictly increasing on $[0, +\infty)$.
(a) If $\alpha, \beta$ are positive numbers, compare the area of the interval $[0, \alpha] \times [0, \beta]$
with the areas bounded by the coordinate axes and the graph of $\varphi$ to obtain
Young's Inequality:

\[ \alpha\beta \le \int_0^\alpha \varphi + \int_0^\beta \psi. \]

(b) If $p > 1$ and $q > 1$ are such that $(1/p) + (1/q) = 1$, and if $\varphi(x) = x^{p/q}$ and
$\psi(y) = y^{q/p}$, use Young's Inequality to establish the inequality

\[ \alpha\beta \le \alpha^p/p + \beta^q/q. \]

(c) If $a_i, b_i$, $i = 1, \dots, n$, are real numbers and if $A = (|a_1|^p + \cdots + |a_n|^p)^{1/p}$ and
$B = (|b_1|^q + \cdots + |b_n|^q)^{1/q}$, use the above inequality to derive Hölder's Inequality

\[ \sum_{i=1}^{n} |a_i b_i| \le AB, \]

which was obtained in Project 8.β(b).

Projects

44.α. Let $I \subseteq \mathbf{R}^p$ be a closed cell and let $f : I \to \mathbf{R}$ be bounded. For $\alpha > 0$, let
$D_\alpha = \{x \in I : \omega_f(x) \ge \alpha\}$, where $\omega_f(x)$ denotes the oscillation of f at x (see Project
23.α).
(a) Suppose that $D_\alpha$ has content zero. Let $P_\alpha = \{I_1, \dots, I_n\}$ be a partition of I
such that (i) each point of $D_\alpha$ is contained in the interior of one of the cells $I_1, \dots, I_r$
($r \le n$), (ii) $c(I_1) + \cdots + c(I_r) \le \alpha / 2\|f\|_I$, and (iii) if $x, y \in I_j$ for $j = r+1, \dots, n$, then
$|f(x) - f(y)| < \alpha$. If P is a refinement of $P_\alpha$, show that
$|S(P; f) - S(P_\alpha; f)| \le \alpha(c(I) + 1)$.
(b) Deduce that if $D_\alpha$ has content zero for each $\alpha > 0$, then f is integrable on I.
(c) Suppose that for some $\alpha > 0$, the outer content $c^*(D_\alpha) > 0$. Show that for
any partition $P = \{J_1, \dots, J_n\}$ of I we have

\[ \sum_{j=1}^{n} (M_j - m_j)\, c(J_j) \ge \alpha\, c^*(D_\alpha). \]

Deduce that f is not integrable on I.
(d) Conclude that f is integrable on I if and only if the set $D_\alpha$ has content zero
for all $\alpha > 0$.
(e) Recall that $D = \bigcup_{n \in \mathbf{N}} D_{1/n}$ is the set of points where f is discontinuous. Show
that D has measure zero (in the sense of Exercise 43.V) if and only if each set $D_{1/n}$
has content zero.
(f) Conclude that f is integrable on I if and only if its set D of points of
discontinuity has measure zero. (This result is Lebesgue's Criterion for
Integrability.)
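44.β. This project considers lower and upper integrals (introduced in Project
43.α) and their iterations. Let $I \subseteq \mathbf{R}^r$ and $J \subseteq \mathbf{R}^s$ be closed cells, $p = r + s$, and let
$K = I \times J \subseteq \mathbf{R}^p = \mathbf{R}^r \times \mathbf{R}^s$. Suppose that $f : K \to \mathbf{R}$ is bounded.
(a) For each $x \in I$, define $g_x : J \to \mathbf{R}$ by $g_x(y) = f(x, y)$ for $y \in J$. Let $\lambda : I \to \mathbf{R}$ be
defined to be the lower integral $\lambda(x) = L(g_x)$ of $g_x$, and let $\mu : I \to \mathbf{R}$ be defined to
be the upper integral $\mu(x) = U(g_x)$ of $g_x$. If R is any partition of I, and S is any
partition of J, and $P = R \times S$ is the resulting partition of K, then show that

\[ L(P; f) \le L(R; \lambda) \le U(R; \lambda) \le U(R; \mu) \le U(P; f). \]

(b) Show that

\[ L(f) \le L(\lambda) \le U(\lambda) \le U(f), \qquad L(f) \le L(\mu) \le U(\mu) \le U(f). \]

Hence, if f is integrable on K, then $\lambda$ and $\mu$ are integrable on I and

\[ \int_K f = \int_I \lambda = \int_I \mu. \]

(c) For each $y \in J$, define $h_y : I \to \mathbf{R}$ by $h_y(x) = f(x, y)$ for $x \in I$. Let $\lambda' : J \to \mathbf{R}$
and $\mu' : J \to \mathbf{R}$ be defined by

\[ \lambda'(y) = L(h_y), \qquad \mu'(y) = U(h_y). \]

Show that if f is integrable on K, then $\lambda'$ and $\mu'$ are integrable on J and

\[ \int_K f = \int_J \lambda' = \int_J \mu'. \]

(d) If $g_x : J \to \mathbf{R}$ is integrable on J for each $x \in I$, then $\lambda = \mu$ and

\[ \int_K f = \int_I \lambda = \int_I \Big\{ \int_J f(x, y)\, dy \Big\}\, dx. \]

Similarly, if $h_y : I \to \mathbf{R}$ is integrable on I for each $y \in J$, then

\[ \int_K f = \int_J \lambda' = \int_J \Big\{ \int_I f(x, y)\, dx \Big\}\, dy. \]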


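44.γ. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $\mathcal{D}(\Omega)$ be the collection of all sets $A \in \mathcal{D}(\mathbf{R}^p)$
with $A^- \subseteq \Omega$. In this project we shall introduce the notion of an “additive”
function on $\mathcal{D}(\Omega)$ and of its “strong density.” A function $G : \mathcal{D}(\Omega) \to \mathbf{R}$ is said to
be additive if

\[ G(A \cup B) = G(A) + G(B) \]

whenever $A, B \in \mathcal{D}(\Omega)$ and $A \cap B = \emptyset$.
(a) If $f : \Omega \to \mathbf{R}$ is integrable on every set in $\mathcal{D}(\Omega)$ and if we define $F : \mathcal{D}(\Omega) \to \mathbf{R}$
by

\[ F(A) = \int_A f, \]

then F is additive on $\mathcal{D}(\Omega)$.
(b) Let $G : \mathcal{D}(\Omega) \to \mathbf{R}$ be an additive function and let $g : \Omega \to \mathbf{R}$. We say that g
is a strong density for G if, for every $\varepsilon > 0$ and every set $A \in \mathcal{D}(\Omega)$, there exists
$\delta > 0$ such that if K is a closed cube with side length less than $\delta$ contained in $\Omega$, and
if $x \in A \cap K$, then

\[ \Big| \frac{G(K)}{c(K)} - g(x) \Big| < \varepsilon. \]

(c) Let $\Omega = \mathbf{R}^p$. Show that the content function $c : \mathcal{D}(\mathbf{R}^p) \to \mathbf{R}$ has strong
density identically equal to 1 on $\mathbf{R}^p$.
(d) Suppose that $\mu : \mathcal{D}(\mathbf{R}^p) \to \mathbf{R}$ is a positive additive function which is invariant
under the translation of sets [that is, $\mu(x + A) = \mu(A)$ for all $x \in \mathbf{R}^p$, $A \in \mathcal{D}(\mathbf{R}^p)$].
Show that $\mu$ has a strong density on $\mathbf{R}^p$ which is a constant on $\mathbf{R}^p$.
(e) If $f : \Omega \to \mathbf{R}$ is continuous and if F is defined as in (a), show that F has strong
density f on $\Omega$.
(f) If $G : \mathcal{D}(\Omega) \to \mathbf{R}$ is additive and has strong density $g : \Omega \to \mathbf{R}$, show that g is
continuous on $\Omega$. Hence g is uniformly continuous on every $A \in \mathcal{D}(\Omega)$.
(g) Suppose that $G : \mathcal{D}(\Omega) \to \mathbf{R}$ is additive and has strong density identically zero
on $\Omega$. Show that if K is a closed cube and if $\varepsilon > 0$, then there exists a partition of K
into cubes $\{K_1, \dots, K_r\}$ such that $|G(K_j)| \le \varepsilon\, c(K_j)$ for $j = 1, \dots, r$, whence it
follows that $|G(K)| \le \varepsilon\, c(K)$. Conclude that $G(K) = 0$ for all closed cubes $K \subseteq \Omega$.
(h) Suppose that $F_1$ and $F_2$ are additive functions on $\mathcal{D}(\Omega)$ such that for some
$M > 0$ we have $|F_j(A)| \le M\, c(A)$ for all $A \in \mathcal{D}(\Omega)$, $j = 1, 2$. If $F_1(K) = F_2(K)$ for
every cube $K \subseteq \Omega$, prove that $F_1(A) = F_2(A)$ for all $A \in \mathcal{D}(\Omega)$.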

Section 45 Transformation of Sets and Integrals

It was noted in Section 43 that continuous mappings of an interval in $\mathbf{R}$
can cover a closed cube in $\mathbf{R}^2$. We will show that this phenomenon cannot
happen if the mapping is in Class $C^1$ and shall study the mapping of sets
with content under $C^1$ maps. The case of a linear map is particularly
important and the result is satisfyingly simple. In the case of a non-linear
mapping, it will be seen that the Jacobian of the mapping indicates the
extent of the “distortion” of the transformation.
These results will be used to establish a theorem concerned with the
“change of variable” of an integral over a set in $\mathbf{R}^p$. The special cases of
polar and spherical coordinates are briefly examined, and a stronger
theorem is given that applies to many transformations that exhibit a mild
amount of singularity.
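45.1 LEMMA. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $\varphi : \Omega \to \mathbf{R}^p$ belong to Class
$C^1(\Omega)$. Let A be a bounded set with $A^- \subseteq \Omega$. Then there exists a bounded
open set $\Omega_1$ with $A^- \subseteq \Omega_1 \subseteq \Omega_1^- \subseteq \Omega$ and a constant $M > 0$ such that if A is
contained in the union of a finite number of closed cubes in $\Omega_1$ with total
content at most $\alpha$, then $\varphi(A)$ is contained in the union of a finite number of
closed cubes with total content at most $(\sqrt{p}\, M)^p \alpha$.
PROOF. If $\Omega = \mathbf{R}^p$, let $\delta = 1$; otherwise let
$\delta = \tfrac{1}{2} \inf\{\|a - x\| : a \in A,\ x \notin \Omega\}$. Since $A^-$ is compact, it follows that $\delta > 0$. (Why?)
Now let $\Omega_1 = \{y \in \mathbf{R}^p : \|y - a\| < \delta \text{ for some } a \in A\}$, so that $\Omega_1$ is open
and bounded and $A^- \subseteq \Omega_1$ and $\Omega_1^- \subseteq \Omega$. Since $\varphi \in C^1(\Omega)$ and $\Omega_1^-$ is
compact, it follows that $M = \sup\{\|D\varphi(x)\|_{pp} : x \in \Omega_1^-\}$ is finite. If
$A \subseteq I_1 \cup \cdots \cup I_n$, where the $I_j$ are closed cubes contained in $\Omega_1$, then it follows
from Corollary 40.6 that for $x, y \in I_j$ we have

\[ \|\varphi(x) - \varphi(y)\| \le M \|x - y\|. \]

Suppose the side length of $I_j$ is $2r_j$, and take x to be the center of $I_j$; then if
$y \in I_j$, we have $\|x - y\| \le \sqrt{p}\, r_j$ and so $\|\varphi(x) - \varphi(y)\| \le \sqrt{p}\, M r_j$. Thus $\varphi(I_j)$ is
contained in a closed cube of side length $2\sqrt{p}\, M r_j$. Hence it follows that
$\varphi(A)$ is contained in the union of a finite number of closed cubes with total
content at most $(\sqrt{p}\, M)^p \alpha$. Q.E.D.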
45.2 THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $\varphi : \Omega \to \mathbf{R}^p$ belong to Class
$C^1(\Omega)$. If $A \subseteq \Omega$ has content zero and if $A^- \subseteq \Omega$, then $\varphi(A)$ has content
zero.
PROOF. Apply the lemma for arbitrary $\alpha > 0$. Q.E.D.
45.3 COROLLARY. Let $r < p$, let $\Omega \subseteq \mathbf{R}^r$ be open, and let $\psi : \Omega \to \mathbf{R}^p$
belong to Class $C^1(\Omega)$. If $A \subseteq \Omega$ is a bounded set with $A^- \subseteq \Omega$, then $\psi(A)$
has content zero in $\mathbf{R}^p$.

PROOF. Let $\Omega_0 = \Omega \times \mathbf{R}^{p-r}$ so that $\Omega_0$ is open in $\mathbf{R}^p$, and define
$\varphi : \Omega_0 \to \mathbf{R}^p$ by

\[ \varphi(x_1, \dots, x_r, x_{r+1}, \dots, x_p) = \psi(x_1, \dots, x_r). \]

Evidently $\varphi \in C^1(\Omega_0)$. Let $A_0 = A \times \{(0, \dots, 0)\}$ so that $A_0^- \subseteq \Omega_0$ and $A_0$ has
content zero in $\mathbf{R}^p$. It follows that $\psi(A) = \varphi(A_0)$ has content zero in $\mathbf{R}^p$.
Q.E.D.
We note that this corollary asserts that the $C^1$ image of any bounded set of
“lower dimensionality” has content zero.
Since the boundary of a set A with content has content zero, it follows
from Theorem 45.2 that if $\varphi$ is in Class $C^1$ then $\varphi(b(A))$ has content zero.
Unfortunately $\varphi(b(A))$ need have little relation, in general, to $b(\varphi(A))$.
This observation enhances the interest of the next two results.

45.4 THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and let $\varphi : \Omega \to \mathbf{R}^p$ belong to Class
$C^1(\Omega)$. Suppose that A has content, $A^- \subseteq \Omega$, and $J_\varphi(x) \ne 0$ for all $x \in A^\circ$.
Then $\varphi(A)$ has content.

PROOF. Since $A^-$ is compact and $\varphi$ is continuous, then $\varphi(A) \subseteq \varphi(A^-)$ is
bounded. To show that $\varphi(A)$ has content we shall show that
$b(\varphi(A)) \subseteq \varphi(b(A))$ and that $\varphi(b(A))$ has content zero.
Since $\varphi(A^-)$ is compact, we have $b(\varphi(A)) \subseteq \varphi(A^-) = \varphi(A^\circ \cup b(A))$.
Hence, if $y \in b(\varphi(A))$, there exists an $x \in A^\circ \cup b(A)$ such that $y = \varphi(x)$.
If $x \in A^\circ$, then $J_\varphi(x) \ne 0$ and it follows from the Surjective Mapping
Theorem 41.6 that $y = \varphi(x)$ is an interior point of $\varphi(A^\circ) \subseteq \varphi(A)$. But this
contradicts the hypothesis that $y \in b(\varphi(A))$. Hence $x \in b(A)$, and we infer that
$b(\varphi(A)) \subseteq \varphi(b(A))$.
Now, since A has content, its boundary $b(A) \subseteq \Omega$ is a closed set with
content zero, whence it follows from Theorem 45.2 that $\varphi(b(A))$ has
content zero. Q.E.D.
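45.5 COROLLARY. Let $\Omega \subseteq \mathbf{R}^p$ be open, and let $\varphi : \Omega \to \mathbf{R}^p$ belong to
Class $C^1(\Omega)$ and be injective on $\Omega$. If A has content, $A^- \subseteq \Omega$, and $J_\varphi(x) \ne 0$
for $x \in A^\circ$, then $b(\varphi(A)) = \varphi(b(A))$.
PROOF. It suffices to show that $\varphi(b(A)) \subseteq b(\varphi(A))$, since the reverse
inclusion was established in the proof of the theorem. Let $x \in b(A)$, so
that there exists a sequence $(x_n)$ in A and a sequence $(y_n)$ in $\Omega \setminus A$, both of
which converge to x. Since $\varphi$ is continuous, then $\varphi(x_n) \to \varphi(x)$ and
$\varphi(y_n) \to \varphi(x)$. Since $\varphi$ is injective on $\Omega$, then $\varphi(y_n) \notin \varphi(A)$ and hence
$\varphi(x) \in b(\varphi(A))$. Therefore $\varphi(b(A)) \subseteq b(\varphi(A))$. Q.E.D.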

Transformations by Linear Maps

We shall now see that sets with content are mapped by a linear map in
$\mathbf{R}^p$ into sets whose content is a fixed multiple of the original content.
Moreover, this multiple is the absolute value of the determinant
corresponding to the linear map. (In this theorem we shall assume that the
notion and elementary properties of the determinant of a linear map in $\mathbf{R}^p$
are familiar to the reader.)

45.6 THEOREM. Let $L \in \mathcal{L}(\mathbf{R}^p)$. If $A \in \mathcal{D}(\mathbf{R}^p)$, then
$$c(L(A)) = |\det L|\, c(A).$$

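PROOF. If $L$ is singular (that is, if $\det L = 0$), then $L$ maps $\mathbf{R}^p$ into a
proper linear subspace of $\mathbf{R}^p$. Since this subspace can also be obtained as
the image of some $L' : \mathbf{R}^r \to \mathbf{R}^p$ with $r < p$, it follows from Corollary 45.3
that $c(L(A)) = 0$ for all $A \in \mathcal{D}(\mathbf{R}^p)$. Hence the statement is true for linear
maps which are singular.
If $L$ is not singular (that is, if $\det L \neq 0$), then Theorem 45.4 implies that
if $A \in \mathcal{D}(\mathbf{R}^p)$, then $L(A) \in \mathcal{D}(\mathbf{R}^p)$. We now define $\lambda : \mathcal{D}(\mathbf{R}^p) \to \mathbf{R}$ by
$\lambda(A) = c(L(A))$.
(i) It is clear that $\lambda(A) \geq 0$ for all $A \in \mathcal{D}(\mathbf{R}^p)$.
(ii) Suppose $A, B \in \mathcal{D}(\mathbf{R}^p)$ and $A \cap B = \emptyset$; then
$$\lambda(A \cup B) = c(L(A \cup B)) = c(L(A) \cup L(B)).$$
Since $L$ is injective, then $L(A) \cap L(B) = \emptyset$ and hence
$$c(L(A) \cup L(B)) = c(L(A)) + c(L(B)) = \lambda(A) + \lambda(B).$$
(iii) Let $x \in \mathbf{R}^p$ and $A \in \mathcal{D}(\mathbf{R}^p)$; then
$$\lambda(x + A) = c(L(x + A)) = c(L(x) + L(A)) = c(L(A)) = \lambda(A).$$
Therefore it follows from Corollary 44.7 that there exists a constant
$m_L \geq 0$ such that $\lambda(A) = m_L\, c(A)$ for all $A \in \mathcal{D}(\mathbf{R}^p)$.
We next examine how $m_L$ depends on $L \in \mathcal{L}(\mathbf{R}^p)$. Let $M \in \mathcal{L}(\mathbf{R}^p)$ be
non-singular; then if $A \in \mathcal{D}(\mathbf{R}^p)$, we have
$$m_{L \circ M}\, c(A) = c(L \circ M(A)) = c(L(M(A))) = m_L\, c(M(A)) = m_L m_M\, c(A).$$
Hence we have $m_{L \circ M} = m_L m_M$ for all non-singular $L, M \in \mathcal{L}(\mathbf{R}^p)$.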
It remains to show that $m_L = |\det L|$. To do this we shall use the fact
from linear algebra that every non-singular $L \in \mathcal{L}(\mathbf{R}^p)$ is the composition
of linear maps of the following three forms:

(a) $L_1(x_1, \ldots, x_p) = (\alpha x_1, x_2, \ldots, x_p)$ for some $\alpha \neq 0$;
(b) $L_2(x_1, \ldots, x_i, x_{i+1}, \ldots, x_p) = (x_1, \ldots, x_{i+1}, x_i, \ldots, x_p)$;
(c) $L_3(x_1, \ldots, x_p) = (x_1 + x_2, x_2, \ldots, x_p)$.

Note that if $K_0$ is the half-open cube $[0, 1) \times \cdots \times [0, 1)$ in $\mathbf{R}^p$ and if $\alpha > 0$,
then $L_1(K_0) = [0, \alpha) \times [0, 1) \times \cdots \times [0, 1)$, whence it follows that
$$\alpha = c(L_1(K_0)) = m_{L_1} c(K_0) = m_{L_1}.$$
Similarly, if $\alpha < 0$, then $L_1(K_0) = (\alpha, 0] \times [0, 1) \times \cdots \times [0, 1)$, and
$$-\alpha = c(L_1(K_0)) = m_{L_1} c(K_0) = m_{L_1}.$$
Hence, in either case we have $m_{L_1} = |\alpha| = |\det L_1|$.
Since $L_2(K_0) = K_0$, it follows that $m_{L_2} = 1 = |\det L_2|$.
Finally, let $A_1$ and $A_2$ be the two sets
$$A_1 = \{(x_1, \ldots, x_p) : 0 \leq x_i < 1,\ x_1 < x_2\},$$
$$A_2 = \{(x_1, \ldots, x_p) : 0 \leq x_i < 1,\ x_2 \leq x_1\}.$$
It is clear that $A_1 \cap A_2 = \emptyset$ and $K_0 = A_1 \cup A_2$. Since it can be seen that
$$L_3(K_0) = A_2 \cup \{(1, 0, \ldots, 0) + A_1\},$$
it follows that
$$c(L_3(K_0)) = c(A_2) + c((1, 0, \ldots, 0) + A_1) = c(A_2) + c(A_1) = c(A_1 \cup A_2) = c(K_0).$$
Hence $m_{L_3} = 1 = |\det L_3|$.
Now let the non-singular linear map $L$ be the composition of linear maps
$L_1, L_2, \ldots, L_r$, each having one of the three forms given above. Since
$$m_L = m_{L_1 \circ L_2 \circ \cdots \circ L_r} = m_{L_1} m_{L_2} \cdots m_{L_r}
= |\det L_1|\, |\det L_2| \cdots |\det L_r|
= |(\det L_1)(\det L_2) \cdots (\det L_r)|
= |\det (L_1 \circ L_2 \circ \cdots \circ L_r)| = |\det L|,$$
the theorem is proved. Q.E.D.
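
As an informal check of Theorem 45.6 in the plane (a sketch, not part of the proof), the following Python fragment composes one map of each of the three elementary forms and compares the area of the image of the unit square with $|\det L|$. The helper names (`det2`, `apply_map`, `compose`, `shoelace_area`, `image_area`) and the particular matrices are illustrative assumptions; the area is computed with the shoelace formula, which applies because a linear map sends the square to a parallelogram.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    return a * d - b * c


def apply_map(m, v):
    """Apply the 2x2 matrix m to the vector v = (x, y)."""
    (a, b), (c, d) = m
    x, y = v
    return (a * x + b * y, c * x + d * y)


def compose(m, n):
    """Matrix of the composition m o n (apply n first, then m)."""
    (a, b), (c, d) = m
    (e, f), (g, h) = n
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))


def shoelace_area(vertices):
    """Unsigned area of a polygon whose vertices are listed in order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


UNIT_SQUARE = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]


def image_area(m):
    """Area of the parallelogram that is the image of the unit square."""
    return shoelace_area([apply_map(m, v) for v in UNIT_SQUARE])


# One map of each elementary form (a), (b), (c) from the proof, with p = 2.
L1 = ((-3.0, 0.0), (0.0, 1.0))  # scaling of the first coordinate, alpha = -3
L2 = ((0.0, 1.0), (1.0, 0.0))   # interchange of the two coordinates
L3 = ((1.0, 1.0), (0.0, 1.0))   # shear (x1, x2) -> (x1 + x2, x2)
L = compose(L1, compose(L2, L3))

for name, m in [("L1", L1), ("L2", L2), ("L3", L3), ("L", L)]:
    print(name, "image area:", image_area(m), "|det|:", abs(det2(m)))
```

The multiplicative rule $m_{L \circ M} = m_L m_M$ shows up here as the area of the composite image being the product of the three elementary factors.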

Transformation by Non-linear Mappings

We shall now obtain an extension of Theorem 45.6 for $C^1$ mappings
which are not linear. Of course, in this case the content of the image of an
arbitrary set need not be a fixed multiple of the content of the given set, but
may vary from point to point. The Jacobian Theorem implies that if $K$ is a
sufficiently small cube with center $x$, then $c(\varphi(K))$ is approximately equal
to $|J_\varphi(x)|\, c(K)$. This result is crucial in order to establish the Change of
Variables Theorem. It will be technically convenient to consider first the
following special case.

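45.7 LEMMA. Let $K \subseteq \mathbf{R}^p$ be a closed cube with center 0. Let $\Omega$ be an
open set containing $K$ and let $\psi : \Omega \to \mathbf{R}^p$ belong to Class $C^1(\Omega)$ and be
injective. Suppose further that $J_\psi(x) \neq 0$ for $x \in K$ and that
$$\|\psi(x) - x\| \leq \alpha \|x\| \quad \text{for } x \in K, \tag{45.1}$$
where $\alpha$ satisfies $0 < \alpha < 1/\sqrt{p}$. Then
$$(1 - \alpha\sqrt{p})^p \leq \frac{c(\psi(K))}{c(K)} \leq (1 + \alpha\sqrt{p})^p.$$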
PROOF. It follows from Theorem 45.4 that $\psi(K)$ has content and from
Corollary 45.5 that $b(\psi(K)) = \psi(b(K))$. If the side length of $K$ is $2r$ and if
$x \in b(K)$, then (by Theorem 8.10) we have $r \leq \|x\| \leq r\sqrt{p}$. The inequality
(45.1) implies that $\psi(x)$ is within distance $\alpha r \sqrt{p}$ of $x \in b(K)$. Therefore the
compact set $\psi(b(K)) = b(\psi(K))$ does not intersect an open cube $C_i$ with
center 0 and side length $2(1 - \alpha\sqrt{p})r$. If we let $A$ (respectively, $B$) be the
set of all interior (respectively, exterior) points of $\psi(K)$, then $A$ and $B$ are
disjoint non-empty open sets with union $\mathbf{R}^p \setminus b(\psi(K))$. Since $C_i$ is
connected in $\mathbf{R}^p$, we must have either $C_i \subseteq A$ or $C_i \subseteq B$. But since
$0 \in C_i \cap A$, we infer that $C_i \subseteq A \subseteq \psi(K)$. In an analogous fashion the
reader can show that if $C_o$ is the closed cube with center 0 and side length
$2(1 + \alpha\sqrt{p})r$, then $\psi(K) \subseteq C_o$. The stated conclusion now follows from
these inclusions. Q.E.D.

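45.8 THE JACOBIAN THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and suppose that
$\varphi : \Omega \to \mathbf{R}^p$ belongs to Class $C^1(\Omega)$, is injective on $\Omega$, and that $J_\varphi(x) \neq 0$ for
$x \in \Omega$. Suppose that $A$ has content and $A^- \subseteq \Omega$. If $\varepsilon > 0$ is given, then there
exists $\gamma > 0$ such that if $K$ is a closed cube with center $x \in A$ and side length
less than $2\gamma$, then
$$|J_\varphi(x)|\,(1 - \varepsilon)^p \leq \frac{c(\varphi(K))}{c(K)} \leq |J_\varphi(x)|\,(1 + \varepsilon)^p. \tag{45.2}$$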
PROOF. Construct $\delta > 0$ and $\Omega_1$ as in the proof of Lemma 45.1. Since
$\det D\varphi(x) = J_\varphi(x) \neq 0$ for all $x \in \Omega$, it follows that $L_x = (D\varphi(x))^{-1}$ exists;
since $1 = \det (L_x \circ D\varphi(x)) = (\det L_x)(\det D\varphi(x))$, it follows that
$$\det L_x = 1/J_\varphi(x) \quad \text{for } x \in \Omega.$$
Since the entries in the standard matrix representation for $L_x$ are continuous
functions, it follows from the compactness of $\Omega_1^-$ and (21.4) that there
exists a constant $M > 0$ such that $\|L_x\|_{pp} \leq M$ for all $x \in \Omega_1$.
Now let $\varepsilon$, with $0 < \varepsilon < 1$, be given. Since the map $x \mapsto D\varphi(x)$ is
uniformly continuous on $\Omega_1$, there exists $\beta$ with $0 < \beta < \delta$ such that if $x_1,
x_2 \in \Omega_1$ and $\|x_1 - x_2\| \leq \beta$, then $\|D\varphi(x_1) - D\varphi(x_2)\|_{pp} \leq \varepsilon / (M\sqrt{p})$. We now let
$x \in A$ be given; hence if $\|z\| \leq \beta$, then $x$ and $x + z$ belong to $\Omega_1$. Hence
it follows from Lemma 41.3 that
$$\|\varphi(x + z) - \varphi(x) - D\varphi(x)(z)\| \leq \|z\| \sup_{0 \leq t \leq 1} \|D\varphi(x + tz) - D\varphi(x)\|_{pp} \leq \frac{\varepsilon}{M\sqrt{p}}\, \|z\|. \tag{45.3}$$
Let $x \in A$ and define $\psi(z)$ for $\|z\| \leq \beta$ by
$$\psi(z) = L_x[\varphi(x + z) - \varphi(x)].$$
Since $L_x = (D\varphi(x))^{-1}$, the inequality (45.3) yields
$$\|\psi(z) - z\| \leq \frac{\varepsilon}{\sqrt{p}}\, \|z\| \quad \text{for } \|z\| \leq \beta.$$
We now apply the preceding lemma with $\alpha = \varepsilon / \sqrt{p}$ to infer that if $K_1$ is any
closed cube with center 0 and contained in the open ball with radius $\beta$,
then
$$(1 - \varepsilon)^p \leq \frac{c(\psi(K_1))}{c(K_1)} \leq (1 + \varepsilon)^p.$$
It follows from the definition of $\psi$ and Theorem 45.6 that if $K = x + K_1$,
then $K$ is a closed cube with center $x$ and that $c(K) = c(K_1)$ and
$$c(\psi(K_1)) = |\det L_x|\, c(\varphi(x + K_1) - \varphi(x)) = \frac{1}{|J_\varphi(x)|}\, c(\varphi(K)).$$
Hence, if $K$ is a closed cube with center $x \in A$ and side length less than $2\gamma$
(where $\gamma = \beta / \sqrt{p}$), then inequality (45.2) holds. Q.E.D.
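
The content of the Jacobian Theorem can be observed numerically. The sketch below is an illustration under stated assumptions, not part of the argument: it uses the map $\psi(u, v) = (u^2 - v^2, uv)$ (which is injective with $J_\psi = 2(u^2 + v^2) \neq 0$ on the open first quadrant), approximates the image of a small square by a polygon through sampled boundary points, and compares the ratio $c(\psi(K))/c(K)$ with the Jacobian at the center. The function names and the sample count `n` are hypothetical choices.

```python
def psi(u, v):
    """The map (u, v) -> (u^2 - v^2, u v)."""
    return (u * u - v * v, u * v)


def jacobian_psi(u, v):
    """Jacobian determinant of psi, namely 2 (u^2 + v^2)."""
    return 2.0 * (u * u + v * v)


def shoelace_area(vertices):
    """Unsigned area of a polygon whose vertices are listed in order."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def boundary_points(cu, cv, h, n):
    """n points per side, in order, on the boundary of the square with
    center (cu, cv) and side length 2h."""
    corners = [(cu - h, cv - h), (cu + h, cv - h),
               (cu + h, cv + h), (cu - h, cv + h)]
    pts = []
    for k in range(4):
        (a0, b0), (a1, b1) = corners[k], corners[(k + 1) % 4]
        for j in range(n):
            t = j / n
            pts.append((a0 + t * (a1 - a0), b0 + t * (b1 - b0)))
    return pts


def content_ratio(cu, cv, h, n=400):
    """Approximate c(psi(K)) / c(K) for the square K described above,
    using a polygon through the images of sampled boundary points."""
    image = [psi(u, v) for (u, v) in boundary_points(cu, cv, h, n)]
    return shoelace_area(image) / (2.0 * h) ** 2


# The ratio approaches the Jacobian at the center as the cube shrinks.
for h in (0.5, 0.1, 0.01):
    print("h =", h, "ratio =", content_ratio(1.0, 1.0, h),
          "Jacobian at center =", jacobian_psi(1.0, 1.0))
```

For this particular map the ratio can also be computed by hand: integrating $2(u^2 + v^2)$ over the square gives $c(\psi(K))/c(K) = J_\psi(\text{center}) + 4h^2/3$, so the deviation from the Jacobian shrinks quadratically with the half-side $h$.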

Change of Variables
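We shall now apply the Jacobian Theorem to obtain an important
theorem which is a generalization to $\mathbf{R}^p$ of the Change of Variables
Theorem 30.12. The latter result asserts that if $\varphi : [\alpha, \beta] \to \mathbf{R}$ has a
continuous derivative and if $f$ is continuous on the range of $\varphi$, then
$$\int_{\varphi(\alpha)}^{\varphi(\beta)} f = \int_\alpha^\beta (f \circ \varphi)\, \varphi'. \tag{45.4}$$
The result we shall establish concerns an injective mapping $\varphi$ defined on an
open subset $\Omega \subseteq \mathbf{R}^p$ with values in $\mathbf{R}^p$. We shall assume that $\varphi \in C^1(\Omega)$ and
that its Jacobian determinant
$$J_\varphi(x) = \det [D_j \varphi_i(x)]$$
does not vanish on $\Omega$. It will be shown that if $A$ has content, if $A^- \subseteq \Omega$,
and if $f$ is bounded and continuous on $\varphi(A)$ to $\mathbf{R}$, then $\varphi(A)$ has content
and
$$\int_{\varphi(A)} f = \int_A (f \circ \varphi)\, |J_\varphi|. \tag{45.5}$$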


It will be observed that the hypotheses are somewhat more restrictive than in the
case $p = 1$. Indeed, in (45.4) we do not assume that $\varphi$ is injective or that $\varphi'(x) \neq 0$
for $x \in [\alpha, \beta]$. If $\varphi$ happens to be injective, we note that the exact analog for (45.5)
in the case $p = 1$ is
$$\int_A^B f = \int_\alpha^\beta (f \circ \varphi)\, |\varphi'|,$$
where $A = \inf \{\varphi(\alpha), \varphi(\beta)\}$ and $B = \sup \{\varphi(\alpha), \varphi(\beta)\}$. Of course, if $\varphi'(x) > 0$ for
$\alpha \leq x \leq \beta$, then formula (45.5) reduces to (45.4); while if $\varphi'(x) < 0$ for $\alpha \leq x \leq \beta$,
then formula (45.5) reduces to
$$\int_{\varphi(\beta)}^{\varphi(\alpha)} f = \int_\alpha^\beta (f \circ \varphi)(-\varphi'),$$
whence (45.4) also follows. The explanation for this difference is that the integral
over intervals in $\mathbf{R}$ is "oriented" in the sense that we define
$$\int_u^v f = -\int_v^u f$$
for any real numbers $u$, $v$. No such orientation has been defined for integrals over
$\mathbf{R}^p$.

The proof given here is essentially due to J. T. Schwartz.† It is
"elementary" in the sense that it does not make use of any results from
measure theory. However, the argument is very delicate and makes use of
a number of the deeper properties of continuous functions, compact and
connected sets, and the properties of the integral. Even so, the theorem
that will be proved is not quite sufficient for all the important cases that
arise, and will be augmented below with a stronger form which permits $J_\varphi$
to vanish and $f \circ \varphi$ to be discontinuous on a set with content zero.

45.9 CHANGE OF VARIABLES THEOREM. Let $\Omega \subseteq \mathbf{R}^p$ be open and suppose
that $\varphi : \Omega \to \mathbf{R}^p$ belongs to Class $C^1(\Omega)$, is injective on $\Omega$, and $J_\varphi(x) \neq 0$
for $x \in \Omega$. Suppose that $A$ has content, $A^- \subseteq \Omega$, and $f : \varphi(A) \to \mathbf{R}$ is
bounded and continuous. Then
$$\int_{\varphi(A)} f = \int_A (f \circ \varphi)\, |J_\varphi|. \tag{45.5}$$

PROOF. It follows from Theorem 45.4 that $\varphi(A)$ has content. Since the
integrands are continuous, it follows that the integrals in (45.5) exist; it
remains to establish their equality. By letting $f = f^+ - f^-$, where $f^+ =
\tfrac{1}{2}(f + |f|)$ and $f^- = \tfrac{1}{2}(|f| - f)$, and using the linearity of the integral, it is
enough to suppose that $f(y) \geq 0$ for all $y \in \varphi(A)$.

† J. T. SCHWARTZ (1930- ) was graduated from CCNY, received his doctorate at Yale
University, and is a professor at the Courant Institute of New York University. Although he
is best known for his work in functional analysis, he has also contributed to differential
equations, geometry, computer languages, various aspects of mathematical physics, and
mathematical economics.

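Now let $\Omega_1$ be as in Lemma 45.1 and let
$$M_\varphi = \sup \{\|D\varphi(x)\|_{pp} : x \in \Omega_1\},$$
$$M_f = \sup \{f(y) : y \in \varphi(A)\},$$
$$M_J = \sup \{|J_\varphi(x)| : x \in A\}.$$
Let $\varepsilon > 0$ be arbitrary except that $0 < \varepsilon < 1$, let $I$ be a closed cell containing
$A$, and let $\{K_i : i = 1, \ldots, M\}$ be a partition of $I$ into non-overlapping
closed cubes with side length less than $2\gamma$, where $\gamma$ is the constant in the
Jacobian Theorem. Let those cubes that are completely contained in $A$ be
enumerated $K_1, \ldots, K_m$; let those that have points of both $A$ and its
complement be enumerated $K_{m+1}, \ldots, K_n$, and let those cubes completely
contained in the complement of $A$ be enumerated $K_{n+1}, \ldots, K_M$. Since $A$
has content, we may assume that the partition has been chosen sufficiently
fine that
$$c(A) \leq \sum_{i=1}^{m} c(K_i) + \varepsilon, \qquad \sum_{i=m+1}^{n} c(K_i) < \varepsilon. \tag{i}$$
We let $B = K_1 \cup \cdots \cup K_m$, so that $B \subseteq A$. Since $c(A \setminus B) = c(A) - c(B) <
\varepsilon$, we have
$$\Bigl| \int_A (f \circ \varphi)\, |J_\varphi| - \int_B (f \circ \varphi)\, |J_\varphi| \Bigr|
= \Bigl| \int_{A \setminus B} (f \circ \varphi)\, |J_\varphi| \Bigr| \leq M_f M_J\, c(A \setminus B) \leq [M_f M_J]\, \varepsilon. \tag{ii}$$
It follows from Lemma 45.1 that $c(\varphi(A \setminus B)) \leq (\sqrt{p}\, M_\varphi)^p \varepsilon$, so that
$$\Bigl| \int_{\varphi(A)} f - \int_{\varphi(B)} f \Bigr| = \Bigl| \int_{\varphi(A \setminus B)} f \Bigr| \leq [M_f (\sqrt{p}\, M_\varphi)^p]\, \varepsilon. \tag{iii}$$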

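If $x_i$ is the center of $K_i$, $i = 1, \ldots, m$, then it follows from the Jacobian
Theorem that
$$|J_\varphi(x_i)|\,(1 - \varepsilon)^p \leq \frac{c(\varphi(K_i))}{c(K_i)} \leq |J_\varphi(x_i)|\,(1 + \varepsilon)^p.$$
Now since $0 < \varepsilon < 1$, it is seen that $1 - 2^p \varepsilon \leq (1 - \varepsilon)^p$ and $(1 + \varepsilon)^p \leq 1 + 2^p \varepsilon$,
so we can write this inequality in the form
$$\bigl| c(\varphi(K_i)) - |J_\varphi(x_i)|\, c(K_i) \bigr| \leq [c(K_i)\, M_J\, 2^p]\, \varepsilon. \tag{iv}$$
Now because of the continuity of the functions in the integrand on the
compact set $B$, it follows that we may assume that for any point $y_i \in K_i$, then
$$\Bigl| \int_B (f \circ \varphi)\, |J_\varphi| - \sum_{i=1}^{m} (f \circ \varphi)(y_i)\, |J_\varphi(x_i)|\, c(K_i) \Bigr| < \varepsilon\, c(B). \tag{v}$$
(For, if necessary, we can divide the cubes $K_1, \ldots, K_m$ into small cubes;
see Exercise 43.T.)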
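Since $\varphi$ is injective, two sets from $\{\varphi(K_i) : i = 1, \ldots, m\}$ intersect at most
in a set $\varphi(K_i \cap K_j)$ which has content 0 since $c(K_i \cap K_j) = 0$. Also, since
$\varphi(K_i)$ has content, then $f$ is integrable on $\varphi(K_i)$; hence it follows from
Theorem 44.9(b) that
$$\int_{\varphi(B)} f = \sum_{i=1}^{m} \int_{\varphi(K_i)} f.$$
Now since $K_i$ is connected, then $\varphi(K_i)$ is connected. Since $f$ is bounded
and continuous on $\varphi(K_i)$, it follows from the Mean Value Theorem 44.11
that there exists $p_i \in \varphi(K_i)$ such that
$$\int_{\varphi(K_i)} f = f(p_i)\, c(\varphi(K_i)), \qquad i = 1, \ldots, m.$$
Since $p_i \in \varphi(K_i)$, there exists a unique $y_i \in K_i$ with $p_i = \varphi(y_i)$, $i = 1, \ldots, m$.
Hence we have
$$\int_{\varphi(B)} f = \sum_{i=1}^{m} (f \circ \varphi)(y_i)\, c(\varphi(K_i)). \tag{vi}$$
But since $(f \circ \varphi)(y_i) \geq 0$, it follows from (iv) that
$$\Bigl| \sum_{i=1}^{m} (f \circ \varphi)(y_i)\, c(\varphi(K_i)) - \sum_{i=1}^{m} (f \circ \varphi)(y_i)\, |J_\varphi(x_i)|\, c(K_i) \Bigr|
\leq \Bigl[ M_J 2^p \sum_{i=1}^{m} (f \circ \varphi)(y_i)\, c(K_i) \Bigr] \varepsilon
\leq \Bigl[ M_J M_f 2^p \sum_{i=1}^{m} c(K_i) \Bigr] \varepsilon \leq [M_J M_f 2^p c(A)]\, \varepsilon.$$
If we combine this last relation with (v) and (vi), we get
$$\Bigl| \int_{\varphi(B)} f - \int_B (f \circ \varphi)\, |J_\varphi| \Bigr| \leq (1 + M_J M_f 2^p)\, c(A)\, \varepsilon. \tag{vii}$$
Combining (vii) with (ii) and (iii), we obtain
$$\Bigl| \int_{\varphi(A)} f - \int_A (f \circ \varphi)\, |J_\varphi| \Bigr| \leq [M_f (\sqrt{p}\, M_\varphi)^p + (1 + M_J M_f 2^p)\, c(A) + M_f M_J]\, \varepsilon.$$
Since $\varepsilon$ is an arbitrary number with $0 < \varepsilon < 1$, the equation (45.5) is
established. Q.E.D.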

Applications
The use of the theorem on the change of variables when $p > 1$ is
generally different from the application of the corresponding theorem
when $p = 1$. For example, in evaluating
$$\int_0^1 x(1 + x^2)^{1/2}\, dx$$
we usually note that if we introduce $\varphi(x) = 1 + x^2$ then $\varphi'(x) = 2x$; hence
the integrand has the form $\tfrac{1}{2}(\varphi(x))^{1/2} \varphi'(x)$ and so
$$\int_0^1 x(1 + x^2)^{1/2}\, dx = \tfrac{1}{2} \cdot \tfrac{2}{3} (\varphi(x))^{3/2} \Big|_{x=0}^{x=1}
= \tfrac{1}{3} (1 + x^2)^{3/2} \Big|_{x=0}^{x=1} = \tfrac{1}{3}(2^{3/2} - 1).$$
Thus the integration is performed by observing that the given integrand is
a composition of some function and $\varphi$, multiplied by the derivative of $\varphi$.
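
A quick numerical sanity check of this one-variable substitution can be made with a composite Simpson rule. The sketch below is only an illustration; the helper `simpson` and the number of subintervals are assumptions, not anything prescribed by the text.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson approximation of the integral of f over [a, b],
    using an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


# Left side: direct numerical integration of x (1 + x^2)^(1/2) on [0, 1].
numeric = simpson(lambda x: x * math.sqrt(1.0 + x * x), 0.0, 1.0)

# Right side: the closed form (2^(3/2) - 1) / 3 obtained by substitution.
closed_form = (2.0 ** 1.5 - 1.0) / 3.0

print("numeric:", numeric, "closed form:", closed_form)
```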
Similar applications to evaluate integrals in more than one variable are
usually possible only when the Jacobian term is constant (or very simple).
For example, an integral of the form
$$\iint_A f(x + 2y,\, 2x - 3y)\, d(x, y)$$
may be treated by introducing the linear transformation $\varphi(x, y) =
(x + 2y,\, 2x - 3y)$. Here
$$J_\varphi(x, y) = \det \begin{bmatrix} 1 & 2 \\ 2 & -3 \end{bmatrix} = -3 - 4 = -7$$
and so we have
$$\iint_A f(x + 2y,\, 2x - 3y)\, d(x, y) = \tfrac{1}{7} \iint_{\varphi(A)} f(u, v)\, d(u, v).$$
This second integral may be simpler if $f(u, v)$ is simpler [for example, if
$f(u, v) = g(u)h(v)$], or if $\varphi(A)$ is simple (for example, if it is a cell).
Otherwise, the transformation may not simplify things very much.
A more typical use of the theorem is to evaluate a multiple integral $\int_D f$
by observing that the set $D$ is the image of a simpler set $A$ (for example,
a cell) under a suitable map $\varphi$.
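
45.10 EXAMPLES. (a) Let $D$ denote the rectangle with vertices $(0, 0)$,
$(2, 2)$, $(1, 3)$, $(-1, 1)$; that is, the region bounded by the lines given by
$$y = x, \quad y = -x + 4, \quad y = x + 2, \quad y = -x.$$
If we let $u = y - x$ and $v = y + x$, these lines become
$$u = 0, \quad v = 4, \quad u = 2, \quad v = 0.$$
Hence, if $\varphi$ is the map $\varphi(u, v) = (x, y)$, then $\varphi$ maps the cell $A =
[0, 2] \times [0, 4]$ into $D$. We leave it to the reader to show that
$$\iint_D f(x, y)\, d(x, y) = \iint_A f\bigl(\tfrac{1}{2}(v - u),\, \tfrac{1}{2}(u + v)\bigr)\, \tfrac{1}{2}\, d(u, v)
= \tfrac{1}{2} \int_0^4 \Bigl\{ \int_0^2 f\bigl(\tfrac{1}{2}(v - u),\, \tfrac{1}{2}(u + v)\bigr)\, du \Bigr\}\, dv.$$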

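(b) Let $D \subseteq \mathbf{R}^2$ be the set of points in $\mathbf{R}^2$ given by
$$D = \{(u, v) : 1 \leq u^2 - v^2 \leq 9,\ 1 \leq uv \leq 4\}.$$
Hence $D$ is bounded by four hyperbolas. If we define $\psi : (u, v) \mapsto (x, y)$
by
$$x = u^2 - v^2, \qquad y = uv,$$
then it is clear that $\psi$ maps these hyperbolas in the $(u, v)$-plane into the lines
$x = 1$, $x = 9$, $y = 1$, $y = 4$ in the $(x, y)$-plane. Although $\psi$ is not injective
on all of $\mathbf{R}^2$, it is injective in the set $Q = \{(u, v) : u > 0, v > 0\}$ and
$J_\psi(u, v) = 2(u^2 + v^2)$. Moreover, $\psi(Q) = \{(x, y) : x \in \mathbf{R},\ y > 0\}$.
Hence we define $\varphi$ on $\{(x, y) : x \in \mathbf{R},\ y > 0\}$ to $Q \subseteq \mathbf{R}^2$ to be the inverse of
$\psi$. From the above it is clear that $\varphi$ maps the lines $x = 1$, $x = 9$, $y = 1$,
$y = 4$ into the hyperbolas
$$u^2 - v^2 = 1, \quad u^2 - v^2 = 9, \quad uv = 1, \quad uv = 4,$$
respectively, and that the set $D$ is the image under $\varphi$ of the cell
$A = [1, 9] \times [1, 4]$. Direct calculation shows that $\varphi$ has the form $\varphi(x, y) =
(u, v)$ where
$$u = \Bigl[ \frac{x + (x^2 + 4y^2)^{1/2}}{2} \Bigr]^{1/2}, \qquad
v = \Bigl[ \frac{-x + (x^2 + 4y^2)^{1/2}}{2} \Bigr]^{1/2}. \tag{45.6}$$
It follows from this that $u^2 + v^2 = (x^2 + 4y^2)^{1/2}$ so that $J_\varphi(x, y) =
\tfrac{1}{2}(x^2 + 4y^2)^{-1/2}$. [This fact also follows from the identity $(u^2 + v^2)^2 =
(u^2 - v^2)^2 + 4u^2 v^2 = x^2 + 4y^2$.] Thus we have
$$\iint_D f(u, v)\, d(u, v) = \iint_A \frac{f(u(x, y), v(x, y))}{2(x^2 + 4y^2)^{1/2}}\, d(x, y)$$
where $A = [1, 9] \times [1, 4]$ and where $u(x, y)$ and $v(x, y)$ are given in (45.6).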
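
The inverse formula (45.6) and the two Jacobians in Example 45.10(b) lend themselves to a short numerical check. The following sketch (function names are illustrative assumptions) verifies on a grid of points of the cell $A = [1, 9] \times [1, 4]$ that $\psi \circ \varphi$ is the identity and that $J_\varphi(x, y)\, J_\psi(\varphi(x, y)) = 1$.

```python
import math


def psi(u, v):
    """The map (u, v) -> (x, y) = (u^2 - v^2, u v)."""
    return (u * u - v * v, u * v)


def phi(x, y):
    """Inverse of psi on the open first quadrant, as in formula (45.6)."""
    s = math.sqrt(x * x + 4.0 * y * y)
    return (math.sqrt((s + x) / 2.0), math.sqrt((s - x) / 2.0))


def jacobian_psi(u, v):
    """J_psi(u, v) = 2 (u^2 + v^2)."""
    return 2.0 * (u * u + v * v)


def jacobian_phi(x, y):
    """J_phi(x, y) = (1/2) (x^2 + 4 y^2)^(-1/2)."""
    return 0.5 / math.sqrt(x * x + 4.0 * y * y)


def max_errors(n=20):
    """Largest deviation, over an (n+1) x (n+1) grid in A = [1, 9] x [1, 4],
    of psi(phi(x, y)) from (x, y) and of J_phi * J_psi from 1."""
    worst_inverse = 0.0
    worst_jacobian = 0.0
    for i in range(n + 1):
        x = 1.0 + 8.0 * i / n
        for j in range(n + 1):
            y = 1.0 + 3.0 * j / n
            u, v = phi(x, y)
            px, py = psi(u, v)
            worst_inverse = max(worst_inverse, abs(px - x), abs(py - y))
            product = jacobian_phi(x, y) * jacobian_psi(u, v)
            worst_jacobian = max(worst_jacobian, abs(product - 1.0))
    return worst_inverse, worst_jacobian


print(max_errors())
```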

Polar and Spherical Coordinates

It is often convenient to specify points in the plane $\mathbf{R}^2$ by giving their
"polar coordinates." Usually we think of the plane as possessing both the
Cartesian coordinates (given by vertical and horizontal lines) and the polar
system (given by rays through the origin and circles centered at the origin).
Alternatively, we can think of polar coordinates as a map $\varphi$ of $(r, \theta) \in \mathbf{R}^2$
into $(x, y) \in \mathbf{R}^2$ given by
$$(x, y) = \varphi(r, \theta) = (r \cos \theta,\ r \sin \theta). \tag{45.7}$$
Any pair of numbers $(r, \theta) \in \mathbf{R}^2$ such that $(x, y) = (r \cos \theta, r \sin \theta)$ is called a
set of polar coordinates of the point $(x, y)$. Usually one requires $r \geq 0$; even
so, each point $(x, y)$ in $\mathbf{R}^2$ has infinitely many sets of polar coordinates.
For example, if $(x, y) = (0, 0)$, then $(0, \theta)$ is a set of polar coordinates of $(0, 0)$ for
all $\theta \in \mathbf{R}$; if $(x, y) \neq (0, 0)$ and $(r, \theta)$ is a set of polar coordinates for $(x, y)$, then for
each $n \in \mathbf{Z}$ the pair $(r, \theta + 2n\pi)$ is also a set of polar coordinates for $(x, y)$.

If $(x, y) \neq (0, 0)$, then the unique pair $(r, \theta)$ with $r > 0$, $0 \leq \theta < 2\pi$, is
called the principal set of polar coordinates of the point $(x, y)$. Thus the
function $\varphi$ gives rise to an injective map of $(0, +\infty) \times [0, 2\pi)$ onto $\mathbf{R}^2 \setminus
\{(0, 0)\}$. It also gives a map of $[0, +\infty) \times [0, 2\pi)$ onto $\mathbf{R}^2$ but it is not
injective, since it sends all the points $(0, \theta)$, $0 \leq \theta < 2\pi$, into $(0, 0)$. Note
also that the Jacobian is given by
$$J_\varphi(r, \theta) = \det \begin{bmatrix} \cos \theta & -r \sin \theta \\ \sin \theta & r \cos \theta \end{bmatrix}
= r(\cos \theta)^2 + r(\sin \theta)^2 = r, \tag{45.8}$$
which vanishes for $r = 0$.
It is clear that $\varphi$ maps the cell $A = [0, 1] \times [0, 2\pi]$ in the $(r, \theta)$-plane into
the unit disk $D = \{(x, y) : x^2 + y^2 \leq 1\}$ but since $\varphi$ is not injective on $A$ and
since $J_\varphi$ vanishes for $r = 0$, we cannot apply the Change of Variables
Theorem 45.9 to convert integration over $D$ into integration over $A$.
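
Formula (45.8) can be compared with a finite-difference approximation of the Jacobian, and the integral of $|J_\varphi| = r$ over the cell $A$ can be compared with the area $\pi$ of the unit disk. The sketch below is only illustrative (the function names, the step `h`, and the grid size are assumptions); the passage from the disk to the cell is legitimate only by the strong form of the theorem (45.11), not by Theorem 45.9.

```python
import math


def polar(r, theta):
    """The polar coordinate map (r, theta) -> (r cos theta, r sin theta)."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian_fd(r, theta, h=1e-6):
    """Central-difference approximation of the Jacobian determinant of polar."""
    xr1, yr1 = polar(r + h, theta)
    xr0, yr0 = polar(r - h, theta)
    xt1, yt1 = polar(r, theta + h)
    xt0, yt0 = polar(r, theta - h)
    dx_dr = (xr1 - xr0) / (2.0 * h)
    dy_dr = (yr1 - yr0) / (2.0 * h)
    dx_dt = (xt1 - xt0) / (2.0 * h)
    dy_dt = (yt1 - yt0) / (2.0 * h)
    return dx_dr * dy_dt - dx_dt * dy_dr


def disk_area(n=400):
    """Midpoint-rule value of the integral of |J| = r over [0, 1] x [0, 2 pi]."""
    hr = 1.0 / n
    ht = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for _ in range(n):
            total += r  # the integrand does not depend on theta
    return total * hr * ht


print("finite-difference Jacobian at r = 0.7:", jacobian_fd(0.7, 1.3))
print("integral of r over A:", disk_area(), "pi:", math.pi)
```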
We encounter analogous difficulties with spherical coordinates in $\mathbf{R}^3$.
Recall that spherical coordinates are defined by the map $\Phi : \mathbf{R}^3 \to \mathbf{R}^3$
where
$$\Phi(r, \theta, \phi) = (r \cos \theta \sin \phi,\ r \sin \theta \sin \phi,\ r \cos \phi). \tag{45.9}$$
Any triple of numbers $(r, \theta, \phi) \in \mathbf{R}^3$ such that $(x, y, z) = \Phi(r, \theta, \phi)$ is called a
set of spherical coordinates of $(x, y, z)$. Usually one requires $r \geq 0$, but
even with this restriction each point in $\mathbf{R}^3$ has infinitely many sets of
spherical coordinates.
For example, if $(x, y, z) = (0, 0, 0)$, then $(0, \theta, \phi)$ is a set of spherical coordinates
for all $\theta \in \mathbf{R}$, $\phi \in \mathbf{R}$; if $(x, y, z) \neq (0, 0, 0)$ and $(r, \theta, \phi)$ is a set of spherical coordinates for
$(x, y, z)$, then for each $m, n \in \mathbf{Z}$, the triples $(r, \theta + 2m\pi, \phi + 2n\pi)$ and $(r, \theta +
(2m + 1)\pi, -\phi + 2n\pi)$ are sets of spherical coordinates for this point.

If $(x, y, z)$ is such that $(x, y) \neq (0, 0)$, then the unique triple $(r, \theta, \phi)$ with
$r > 0$, $0 \leq \theta < 2\pi$, $0 < \phi < \pi$, is called the principal set of spherical
coordinates of $(x, y, z)$. Thus the function $\Phi$ yields an injective map of
$(0, +\infty) \times [0, 2\pi) \times (0, \pi)$ onto $\mathbf{R}^3 \setminus \{(0, 0, z) : z \in \mathbf{R}\}$. The restriction of $\Phi$
to $[0, +\infty) \times [0, 2\pi] \times [0, \pi]$ gives a map onto all of $\mathbf{R}^3$ but it is not injective,
since it sends all points $(0, \theta, \phi)$ into $(0, 0, 0)$ and, if $\phi = 0$ or $\pi$, then all the
points $(r, \theta, \phi)$ map into $(0, 0, r \cos \phi)$. Note also that
$$J_\Phi(r, \theta, \phi) = \det \begin{bmatrix}
\cos \theta \sin \phi & -r \sin \theta \sin \phi & r \cos \theta \cos \phi \\
\sin \theta \sin \phi & r \cos \theta \sin \phi & r \sin \theta \cos \phi \\
\cos \phi & 0 & -r \sin \phi
\end{bmatrix} = -r^2 \sin \phi. \tag{45.10}$$
It is readily seen that $\Phi$ maps the cell $A = [0, 1] \times [0, 2\pi] \times [0, \pi]$ in the
$(r, \theta, \phi)$-space into the unit ball $D = \{(x, y, z) : x^2 + y^2 + z^2 \leq 1\}$, but since $\Phi$
is not injective on $A$ and $J_\Phi$ vanishes when $r^2 \sin \phi = 0$, we cannot use the
Change of Variables Theorem 45.9 to convert integration over $D$ into
integration over $A$.
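
The determinant in (45.10) and the resulting volume of the unit ball can be checked numerically. The following sketch (function names and grid size are assumptions made for illustration) evaluates the $3 \times 3$ determinant at a sample point and integrates $|J_\Phi| = r^2 \sin \phi$ over the cell $A$; since the integrand is a product, the triple integral factors into one-dimensional midpoint rules. As with polar coordinates, the justification for integrating over $A$ comes from the strong form of the theorem (45.11).

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def spherical_jacobian_matrix(r, theta, phi):
    """Matrix of partial derivatives of the spherical coordinate map,
    with columns ordered (r, theta, phi) as in (45.10)."""
    ct, st = math.cos(theta), math.sin(theta)
    cp, sp = math.cos(phi), math.sin(phi)
    return [
        [ct * sp, -r * st * sp, r * ct * cp],
        [st * sp, r * ct * sp, r * st * cp],
        [cp, 0.0, -r * sp],
    ]


def midpoint_1d(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def ball_volume():
    """Integral of r^2 sin(phi) over [0, 1] x [0, 2 pi] x [0, pi]."""
    radial = midpoint_1d(lambda r: r * r, 0.0, 1.0)
    angular = midpoint_1d(math.sin, 0.0, math.pi)
    return radial * 2.0 * math.pi * angular


r, theta, phi = 0.8, 1.1, 0.6
print("determinant:", det3(spherical_jacobian_matrix(r, theta, phi)),
      "formula:", -r * r * math.sin(phi))
print("volume:", ball_volume(), "4 pi / 3:", 4.0 * math.pi / 3.0)
```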
We shall now present a theorem which enables us to handle the
difficulties we have encountered in the use of polar and spherical
coordinates and which is often useful in other "transformations with
singularities." It will be noted that the theorem does not require $\varphi$ to be
injective on the set $A$, though it is injective on $A^0$.

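45.11 CHANGE OF VARIABLES THEOREM (STRONG FORM). Let $\Omega \subseteq \mathbf{R}^p$
be open and let $\varphi : \Omega \to \mathbf{R}^p$ belong to Class $C^1(\Omega)$. Let $\Omega_0$ be an open set
with content such that $\Omega_0^- \subseteq \Omega$ and such that $\varphi$ is injective on $\Omega_0$. Let $E \subseteq \Omega$
be a compact set with content zero such that $J_\varphi(x) \neq 0$ for $x \in \Omega_0 \setminus E$. Suppose
that $A \subseteq \Omega$ has content, $A \subseteq \Omega_0^-$, that $f : \varphi(A) \to \mathbf{R}$ is bounded, and that
$f$ is continuous on $\varphi(A \setminus E)$. Then
$$\int_{\varphi(A)} f = \int_A (f \circ \varphi)\, |J_\varphi|. \tag{45.5}$$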

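PROOF. Since $b(A)$ and $b(\Omega_0)$ are compact and have content zero, we
may assume that they are contained in $E$; therefore $A^0 \setminus E \subseteq \Omega_0 \setminus E$. Since
$A$ and $E$ have content, the set $A \setminus E$ has content; moreover, since $E$ is
closed, then $(A \setminus E)^0 = A^0 \setminus E$ so that $J_\varphi(x) \neq 0$ for $x \in (A \setminus E)^0$.
Therefore, by Theorem 45.4 applied to $A \setminus E$, we deduce that $\varphi(A \setminus E)$
has content. It follows from Theorem 45.2 that $\varphi(E)$ has content zero,
and since $\varphi(A) = \varphi((A \setminus E) \cup (A \cap E)) = \varphi(A \setminus E) \cup \varphi(A \cap E)$, that $\varphi(A)$
has content. Since $f$ is bounded on $\varphi(A)$ and continuous except on a
subset of $\varphi(E)$, we deduce that $f$ is integrable over $\varphi(A)$. Moreover, since
$f \circ \varphi$ is continuous except on a subset of $E$, we deduce that $(f \circ \varphi)\, |J_\varphi|$ is
integrable over $A$. It remains to show that these integrals are equal.
Now apply Lemma 45.1 to $E$ to obtain a bounded open set $\Omega_1$ with
$E \subseteq \Omega_1 \subseteq \Omega_1^- \subseteq \Omega$ and a constant $M_\varphi > 0$, with the property that if $E$ is
contained in a finite union of closed cubes in $\Omega_1$ with content at most $\alpha > 0$,
then $\varphi(E)$ is contained in a corresponding finite union of closed cubes with
content at most $(\sqrt{p}\, M_\varphi)^p \alpha$.
We now let $\varepsilon > 0$ be given and enclose $E$ in a finite union $U_\varepsilon$ of open
cubes in $\Omega_1$ with $c(U_\varepsilon) < \varepsilon$ and such that the union $W_\varepsilon$ of the closures of the
cubes in $U_\varepsilon$ is still contained in $\Omega_1$. Then $c(W_\varepsilon) < \varepsilon$ and it follows from
Lemma 45.1 that $c(\varphi(U_\varepsilon)) \leq c(\varphi(W_\varepsilon)) \leq (\sqrt{p}\, M_\varphi)^p \varepsilon$. We now let $B = A \setminus
U_\varepsilon$ so that $B$ has content. Now since $U_\varepsilon$ is open and contains $b(\Omega_0)$ and $E$,
we infer that $B^- \subseteq \Omega_0 \setminus E$. We now apply the Change of Variables
Theorem 45.9 to $B$, $\Omega_0 \setminus E$ in place of $A$, $\Omega$, to get
$$\int_{\varphi(B)} f = \int_B (f \circ \varphi)\, |J_\varphi|.$$
Now it is readily seen that $\varphi(A) \setminus \varphi(B) \subseteq \varphi(A \cap U_\varepsilon)$ whence
$$\Bigl| \int_{\varphi(A)} f - \int_{\varphi(B)} f \Bigr| = \Bigl| \int_{\varphi(A) \setminus \varphi(B)} f \Bigr| \leq M_f\, c(\varphi(A \cap U_\varepsilon))
\leq (\sqrt{p}\, M_\varphi)^p M_f\, \varepsilon.$$
Similarly we have that
$$\Bigl| \int_A (f \circ \varphi)\, |J_\varphi| - \int_B (f \circ \varphi)\, |J_\varphi| \Bigr| = \Bigl| \int_{A \cap U_\varepsilon} (f \circ \varphi)\, |J_\varphi| \Bigr|
\leq M_f M_J\, c(A \cap U_\varepsilon) \leq M_f M_J\, \varepsilon.$$
It follows that
$$\Bigl| \int_{\varphi(A)} f - \int_A (f \circ \varphi)\, |J_\varphi| \Bigr| \leq [(\sqrt{p}\, M_\varphi)^p M_f + M_f M_J]\, \varepsilon.$$
Since $\varepsilon > 0$ is arbitrary, the conclusion follows. Q.E.D.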


For polar coordinates, we take $\Omega_0$ to be an open set with content
contained in $(0, +\infty) \times (0, 2\pi)$. For spherical coordinates, we take $\Omega_0$ to be
an open set with content contained in $(0, +\infty) \times (0, 2\pi) \times (0, \pi)$.

Exercises

45.A. Let $\Omega \subseteq \mathbf{R}^p$ be an open set and let $f : \Omega \to \mathbf{R}^p$ satisfy a Lipschitz condition
on $\Omega$; that is, for some $M > 0$, $\|f(x) - f(y)\| \leq M \|x - y\|$ for all $x, y \in \Omega$. If $K \subseteq \Omega$ is a
cube with side length $s > 0$, show that $f(K)$ is contained in a cube with side length
$M\sqrt{p}\, s$. Show that if $A \subseteq \Omega$ is a compact set with content zero, then $f(A)$ has
content zero, and if $B \subseteq \Omega$ is a compact set with content, then $f(B)$ has content.
45.B. Consider the polar coordinate map $(x, y) = \varphi(r, \theta) = (r \cos \theta, r \sin \theta)$
defined on $\mathbf{R}^2$, and its behavior on the set $A = [0, 1] \times [0, 2\pi]$. Use Theorem 45.4 to
obtain the reassuring information that the image $D = \varphi(A)$, which is the unit disk
$D = \{(x, y) : x^2 + y^2 \leq 1\}$, has content. Investigate the manner in which $\varphi$ maps the
boundary of $A$. Show that the boundary of $D$ is the image under $\varphi$ of only one side
of $A$, and that the other three sides of $A$ get mapped into the interior of $D$.
45.C. Consider the map $(x, y) = \psi(u, v) = (\sin u, \sin v)$ defined on $\mathbf{R}^2$.
Determine the image of the boundary of B=[-—in,ia]x[-ia, ia] under &,
and the boundary of $\psi(B)$. Show that most, though not quite all, of the boundary
points of $\psi(B)$ are images of interior points of $B$.
45.D. Given that the area of the circular disk $\{(x, y) : x^2 + y^2 \leq 1\}$ is equal to $\pi$,
find the areas of the elliptical disks given by:

(a) $\{(x, y) : x^2/a^2 + y^2/b^2 \leq 1\}$;
(b) $\{(x, y) : 2x^2 + 2xy + 5y^2 \leq 1\}$.
(Hint: $2x^2 + 2xy + 5y^2 = (x + 2y)^2 + (x - y)^2$.)
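45.E. Let $B$ be the set $\{(x, y) : 0 \leq x,\ 0 \leq y,\ 1 \leq x + y \leq 2\}$. Let $u = x + y$, $v = y$
so that $B$ is the image under the map $(x, y) = \varphi(u, v) = (u - v, v)$ of the trapezoid
$C = \{(u, v) : 1 \leq u \leq 2,\ 0 \leq v \leq u\}$. Show that $\varphi$ is injective on all of $\mathbf{R}^2$ and that
$J_\varphi(u, v) = 1$. Deduce that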

[fernac
y= | fate o=3
B

45.F. Let $B = \{(u, v) : 0 \leq u + v \leq 2,\ 0 \leq v - u \leq 2\}$. By using the
transformation $(x, y) \mapsto (u, v) = (x - y, x + y)$, evaluate the integral

[Joo _ use terre d(u, v).

45.G. Evaluate the iterated integral

Uf ve}as
directly. Then use the transformation $(x, y) \mapsto (u, v) = (x, y - x^2)$ to evaluate this
integral.
45.H. Determine the area of the region bounded by the curves
$$xy = 1, \quad xy = 2, \quad y = x^2, \quad y = 2x^2$$
by introducing an appropriate change of variable.
45.I. Let $\psi : \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $(u, v) = \psi(x, y) = (x^2 - y^2,\ x^2 + y^2)$. Note that
the inverse image under $\psi$ of the line $u = a > 0$ is a hyperbola, and the inverse
image under $\psi$ of the line $v = c > 0$ is a circle. Show that $\psi$ is not injective on $\mathbf{R}^2$,
but its restriction to $Q = \{(x, y) : x > 0,\ y > 0\}$ is an injective map onto $\{(u, v) : v >
|u|\}$. Let $\varphi$ be the inverse of the restriction $\psi \mid Q$ and show that if $0 < a < b < c < d$,
then $\varphi$ maps the rectangle $A = [a, b] \times [c, d]$ into the region
$$\varphi(A) = \{(x, y) : a \leq x^2 - y^2 \leq b,\ c \leq x^2 + y^2 \leq d\}.$$
Show that if $f : Q \to \mathbf{R}$ is continuous, then
$$\iint_{\varphi(A)} f(x, y)\, d(x, y) = \iint_A \frac{(f \circ \varphi)(u, v)}{4(v^2 - u^2)^{1/2}}\, d(u, v).$$
In particular, we have
$$\iint_{\varphi(A)} xy\, d(x, y) = \tfrac{1}{8} \iint_A d(u, v) = \tfrac{1}{8}(b - a)(d - c).$$

45.J. Let $\psi : \mathbf{R}^2 \to \mathbf{R}^2$ be as in the preceding exercise. Show that $\psi$ maps the
triangular region $A = \{(x, y) : 0 \leq x \leq 1,\ 0 \leq y \leq x\}$ into the triangular region
$A_1 = \psi(A) = \{(u, v) : 0 \leq u \leq 1,\ u \leq v \leq 2 - u\}$.
Here $J_\psi(x, y) = 8xy$. If $\Omega_0 = (0, 2) \times (0, 2)$, and if $f$ is continuous on $A_1$, apply
Theorem 45.11 to show that
$$\iint_{A_1} f(u, v)\, d(u, v) = \iint_A (f \circ \psi)(x, y)\, |J_\psi(x, y)|\, d(x, y).$$
In particular show that
$$\iint_A (x^4 - y^4)\, xy\, d(x, y) = \tfrac{1}{8} \iint_{A_1} uv\, d(u, v).$$

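45.K. Let $\alpha < \beta$ belong to $[0, 2\pi]$ and let $h : [\alpha, \beta] \to \mathbf{R}$ be continuous and such
that $h(\theta) \geq 0$ for $\theta \in [\alpha, \beta]$. Let $H = \{(\theta, r) \in \mathbf{R}^2 : \alpha \leq \theta \leq \beta,\ 0 \leq r \leq h(\theta)\}$ be the
ordinate set of $h$ (see Exercise 44.O), so that $H$ has content. The polar curve
generated by $h$ is the curve in $\mathbf{R}^2$ defined by $\theta \mapsto (h(\theta) \cos \theta,\ h(\theta) \sin \theta)$, and the
polar ordinate set of this curve is the set
$$H_1 = \{(r \cos \theta,\ r \sin \theta) \in \mathbf{R}^2 : \alpha \leq \theta \leq \beta,\ 0 \leq r \leq h(\theta)\}.$$
Note that $H_1$ is the image of $H$ under the (reversed) polar map $\varphi_1(\theta, r) =
(r \cos \theta,\ r \sin \theta)$ and use Theorem 45.11 to show that
$$c(H_1) = \tfrac{1}{2} \int_\alpha^\beta (h(\theta))^2\, d\theta.$$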


45.L. Let $a < b$ and let $f : [a, b] \to \mathbf{R}$ be continuous and such that $f(x) \geq 0$ for all
$x \in [a, b]$. As in Exercise 44.O, let $S_f = \{(x, y) : a \leq x \leq b,\ 0 \leq y \leq f(x)\}$ be the
ordinate set of $f$. Let $\rho_x : \mathbf{R}^3 \to \mathbf{R}^3$ be defined by $\rho_x(x, y, \theta) = (x,\ y \cos \theta,\ y \sin \theta)$
and let $X_f$ be the image of $S_f \times [0, 2\pi]$ under $\rho_x$. (The set $X_f$ is called the "solid of
revolution obtained by revolving the ordinate set $S_f$ about the x-axis.") Use
Theorem 45.11 to show that
$$c(X_f) = \pi \int_a^b (f(x))^2\, dx.$$
45.M. Let $0 \leq a < b$ and let $f : [a, b] \to \mathbf{R}$ and $S_f$ be as in the preceding exercise.
Let $\rho_y : \mathbf{R}^3 \to \mathbf{R}^3$ be defined by $\rho_y(x, y, \theta) = (x \cos \theta,\ y,\ x \sin \theta)$ and let $Y_f$ be the
image of $S_f \times [0, 2\pi]$ under $\rho_y$. (The set $Y_f$ is called the "solid of revolution
obtained by revolving the ordinate set $S_f$ about the y-axis.") Use Theorem 45.11
to show that
$$c(Y_f) = 2\pi \int_a^b x f(x)\, dx.$$


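45.N. (a) By changing to polar coordinates, show that
$$\iint_{C_R} e^{-(x^2 + y^2)}\, d(x, y) = \frac{\pi}{4}\,(1 - e^{-R^2}),$$
where $C_R = \{(x, y) : 0 \leq x,\ 0 \leq y,\ x^2 + y^2 \leq R^2\}$.
(b) If $B_L = \{(x, y) : 0 \leq x \leq L,\ 0 \leq y \leq L\}$, show that
$$\iint_{B_L} e^{-(x^2 + y^2)}\, d(x, y) = \Bigl( \int_0^L e^{-x^2}\, dx \Bigr)^2.$$
(c) From the fact that $C_R \subseteq B_R \subseteq C_{R\sqrt{2}}$, show that
$$\lim_{R \to \infty} \Bigl( \int_0^R e^{-x^2}\, dx \Bigr)^2 = \frac{\pi}{4},$$
whence it follows that $\int_0^{+\infty} e^{-x^2}\, dx = \tfrac{1}{2}\sqrt{\pi}$.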
45.O. Let $B = \{(x, y) : 4x^2 + 9y^2 \leq 4\}$. Use an appropriate change of variables to
evaluate
$$\iint_B e^{-(4x^2 + 9y^2)}\, d(x, y) = \frac{\pi}{6}\,(1 - e^{-4}).$$

45.P. Observe that the set $\{(x, y, z) : 0 \leq x^2 + y^2 \leq \tfrac{1}{2},\ (x^2 + y^2)^{1/2} \leq z \leq
(1 - x^2 - y^2)^{1/2}\}$ is a "conical sector cut out of the unit ball" in $\mathbf{R}^3$. Obtain this set as
the image under the spherical coordinate map $\Phi$ of the cell $[0, 1] \times [0, 2\pi] \times [0, \tfrac{1}{4}\pi]$.
(a) Show that the content of this set in $\mathbf{R}^3$ equals $\pi(2 - \sqrt{2})/3$.
(b) Obtain the content of this set by using the cylindrical coordinate map
$\Gamma : (r, \theta, z) \mapsto (x, y, z) = (r \cos \theta,\ r \sin \theta,\ z)$.
45.Q. Let $a > 0$ and let $A$ be the intersection of the sets
$$\{(x, y, z) : x^2 + y^2 + z^2 \leq 4a^2\} \quad \text{and} \quad \{(x, y, z) : z \geq a\}.$$
(a) Use the spherical coordinate map to show that $c(A) = 5\pi a^3/3$.
(b) Use the cylindrical coordinate map to evaluate $c(A)$.
45.R. Let $B$ be the intersection of the sets
$$\{(x, y, z) : x^2 + y^2 + z^2 \leq 2\} \quad \text{and} \quad \{(x, y, z) : x^2 + y^2 + z^2 \leq 2z\}.$$
(a) Use the spherical coordinate map to show that $c(B) = \pi(4\sqrt{2} - 3)/3$.
(b) Use the cylindrical coordinate map to evaluate $c(B)$.
45.S. Let $B_p(r) = \{x \in \mathbf{R}^p : \|x\| \leq r\}$ be the ball with radius $r > 0$ in the space $\mathbf{R}^p$.
We shall compute the content $\omega_p(r)$ of $B_p(r)$.
(a) Use a change of variables to show that $\omega_p(r) = r^p \omega_p(1)$.
(b) If $p \geq 3$, express the integral for $\omega_p(1)$ as an iterated integral and use part (a)
to show that
$$\omega_p(1) = \omega_{p-2}(1) \int_0^{2\pi} \Bigl\{ \int_0^1 (1 - r^2)^{(p-2)/2}\, r\, dr \Bigr\}\, d\theta = \omega_{p-2}(1)\, 2\pi / p.$$
(c) Conclude that if $p = 2k$ is even, then $\omega_p(1) = \pi^k / k!$. If $p = 2k - 1$ is odd, then
$\omega_p(1) = 4^k \pi^{k-1} k! / (2k)!$. In terms of the Gamma function, we have $\omega_p(1) =
\pi^{p/2} / \Gamma(\tfrac{1}{2}p + 1)$.
(d) Obtain the remarkable result that $\lim_{p \to \infty} \omega_p(1) = 0$.
45.T. We shall obtain the result of the preceding exercise in a different way. Let
$p \in \mathbf{N}$ and let $\sigma : \mathbf{R}^p \to \mathbf{R}^p$ be defined by $\sigma(\theta) = \sigma(\theta_1, \ldots, \theta_p) = (\cos \theta_1,\
\sin \theta_1 \cos \theta_2,\ \ldots,\ \sin \theta_1 \sin \theta_2 \cdots \sin \theta_{p-1} \cos \theta_p)$.
(a) Show that $\|\sigma(\theta)\|^2 \leq 1$ and that $\|\sigma(\theta)\| = 1$ only when $\theta_j = 0$ or $\theta_j = \pi$ for some
value of $j = 1, \ldots, p$.
(b) Show that $\sigma$ is an injective map of $(0, \pi)^p = (0, \pi) \times \cdots \times (0, \pi)$ ($p$ times) onto
the interior $\{x \in \mathbf{R}^p : \|x\| < 1\}$ of the unit ball $B_p(1)$. Show also that $\sigma$ maps $[0, \pi]^p$
onto the unit ball, but that it is not injective on the boundary.
(c) Evaluating the Jacobian of $\sigma$, we obtain
$$J_\sigma(\theta) = (-1)^p (\sin \theta_1)^p (\sin \theta_2)^{p-1} \cdots (\sin \theta_{p-1})^2 (\sin \theta_p).$$
Hence, $J_\sigma(\theta) \neq 0$ for $\theta \in (0, \pi)^p$.
(d) Using the Wallis product formulas for $\int_0^{\pi/2} (\sin \theta)^k\, d\theta$ obtained in Project
30.γ, derive the expressions for $\omega_p(1)$ given in the preceding exercise.

Project

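45.α. This project is based on Project 44.γ and provides an alternative approach
to the Change of Variables Theorem 45.9. Let $\Omega \subseteq \mathbf{R}^p$ be open, and let $\varphi : \Omega \to \mathbf{R}^p$
belong to Class $C^1(\Omega)$, be injective on $\Omega$, and such that $J_\varphi(x) \neq 0$ for all $x \in \Omega$. For
simplicity, we also suppose that there exists $M > 0$ such that $\|\varphi(x) - \varphi(y)\| \leq
M \|x - y\|$ for $x, y \in \Omega$.
(a) If $\Phi : \mathcal{D}(\Omega) \to \mathbf{R}$ is defined by
$$\Phi(A) = c(\varphi(A)) \quad \text{for } A \in \mathcal{D}(\Omega),$$
then $\Phi$ is additive on $\mathcal{D}(\Omega)$ and has strong density equal to $|J_\varphi|$. Moreover, for
some $M_1 > 0$, we have $\Phi(A) \leq M_1 c(A)$ for all $A \in \mathcal{D}(\Omega)$.
(b) If $f$ is a bounded function which is integrable on every set $\varphi(A)$, for
$A \in \mathcal{D}(\Omega)$, and if $\Psi : \mathcal{D}(\Omega) \to \mathbf{R}$ is defined by
$$\Psi(A) = \int_{\varphi(A)} f,$$
then $\Psi$ is additive on $\mathcal{D}(\Omega)$. Moreover, for some $M_2 > 0$, we have $|\Psi(A)| \leq
M_2 c(A)$ for all $A \in \mathcal{D}(\Omega)$.
(c) If $f$ is a bounded and continuous function on $\varphi(\Omega)$, and if $\Psi$ is defined as in
(b), show that $\Psi$ has strong density equal to $(f \circ \varphi)\, |J_\varphi|$.
(d) If $f$ is as in (c), show that
$$\int_{\varphi(A)} f = \int_A (f \circ \varphi)\, |J_\varphi| \quad \text{for } A \in \mathcal{D}(\Omega).$$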


REFERENCES

This list includes books and articles that were cited in the text and some
additional references that will be useful for further study.

Apostol, T. M., Mathematical Analysis, Second Edition, Addison-Wesley,
Reading, Mass., 1974.
Bartle, R. G., The Elements of Integration, Wiley, New York, 1966.
Boas, R. P., Jr., A Primer of Real Functions, Carus Monograph Number 13, Math.
Assn. of America, 1960.
Bruckner, A. M., “Differentiation of Integrals,” Amer. Math. Monthly, Vol. 78,
No. 9, Part II, 1-51 (1971). (H. E. Slaught Memorial Paper, Number 12.)
Burkill, J. C., and H. Burkill, A Second Course in Mathematical Analysis,
Cambridge Univ. Press, Cambridge, 1970.
Cartan, H. P., Cours de Mathématiques, I. Calcul Différentiel; II. Formes
Différentielles, Hermann, Paris, 1967. (English translation, Houghton-Mifflin,
Boston, 1971.)
Cheney, E. W., Introduction to Approximation Theory, McGraw-Hill, New York,
1966.
Dieudonné, J., Foundations of Modern Analysis, Academic Press, New York, 1960.
Dunford, N., and J. T. Schwartz, Linear Operators, Part I, Wiley-Interscience, New
York, 1958.
Finkbeiner, D. T., II, Introduction to Matrices and Linear Transformations, Second
Edition, W. H. Freeman, San Francisco, 1966.
Gelbaum, B. R., and J. M. H. Olmsted, Counterexamples in Analysis, Holden-Day,
San Francisco, 1964.
Halmos, P. R., Naive Set Theory, Van Nostrand, Princeton, 1960. (Republished by
Springer-Verlag, New York, 1974.)
Hamilton, N. T., and J. Landin, Set Theory, Allyn-Bacon, Boston, 1961.
Hardy, G. H., J. E. Littlewood, and G. Polya, Inequalities, Second Edition,
Cambridge University Press, Cambridge, 1959.
Hewitt, E., and K. Stromberg, Real and Abstract Analysis, Springer-Verlag, New
York, 1965.

Hoffman, K., and R. Kunze, Linear Algebra, Prentice-Hall, Englewood Cliffs,
1961.
Kelley, J. L., General Topology, Van Nostrand, New York, 1955.
Knopp, K., Theory and Application of Infinite Series (English translation), Hafner,
New York, 1951.
Lefschetz, S., Introduction to Topology, Princeton University Press, Princeton,
1949.
Luxemburg, W. A. J., “Arzela’s Dominated Convergence Theorem for the
Riemann Integral,” Amer. Math. Monthly, Vol. 78, 970-979 (1971).
McShane, E. J., “A Theory of Limits,” published in MAA Studies in Mathematics,
Vol. 1, R. C. Buck, editor, Math. Assn. America, 1962.
McShane, E. J., “The Lagrange Multiplier Rule,” Amer. Math. Monthly, Vol. 80, 922-925
(1973).
Royden, H. L., Real Analysis, Second Edition, Macmillan, New York, 1968.
Rudin, W., Principles of Mathematical Analysis, Second Edition, McGraw-Hill,
New York, 1964.
Schwartz, J., “The Formula for Change of Variables in a Multiple Integral,” Amer.
Math. Monthly, Vol. 61, 81-85 (1954).
Simmons, G. F., Introduction to Topology and Modern Analysis, McGraw-Hill, New
York, 1963.
Spivak, M., Calculus on Manifolds, W. A. Benjamin, New York, 1965.
Stone, M. H., “The Generalized Weierstrass Approximation Theorem,”
Mathematics Magazine, Vol. 21, 167-184, 237-254 (1947/48). (Reprinted in
MAA Studies in Mathematics, Vol. 1, R. C. Buck, editor, Math. Assn. America,
1962.)
Suppes, P., Axiomatic Set Theory, Van Nostrand, Princeton, 1961.
Titchmarsh, E. C., The Theory of Functions, Second Edition, Oxford University
Press, London, 1939.
Varberg, D. E., “Change of Variables in Multiple Integrals,” Amer. Math.
Monthly, Vol. 78, 42-45 (1971).
Woll, J. W., Jr., Functions of Several Variables, Harcourt, Brace and World, New
York, 1966.
Wilder, R. L., The Foundations of Mathematics, Wiley, New York, 1952.
HINTS FOR
SELECTED EXERCISES

The reader is urged not to look at these hints unless he is stymied. Many of the
exercises call for proofs, and there is no single way that is correct; even if the reader
has given a totally different argument, his may be entirely correct. However, in
order to help the reader learn the material and to develop his technical skill, some
hints and a few solutions are offered. It will be observed that more detail is
presented for the early material.

Section 1

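1.D. By definition $A \cap B \subseteq A$. If $A \subseteq B$, then $A \cap B \supseteq A$ so that $A \cap B = A$.
Conversely, if $A \cap B = A$, then $A \cap B \supseteq A$ whence it follows that $B \supseteq A$.
1.E, F. The symmetric difference of $A$ and $B$ is the union of $\{x : x \in A \text{ and } x \notin B\}$
and $\{x : x \notin A \text{ and } x \in B\}$.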
1.H. If $x$ belongs to $E \cap \bigcup A_j$, then $x \in E$ and $x \in \bigcup A_j$. Therefore, $x \in E$ and
$x \in A_j$ for at least one $j$. This implies that $x \in E \cap A_j$ for at least one $j$, so that

$E \cap \bigcup A_j \subseteq \bigcup (E \cap A_j)$.

The opposite inclusion is proved by reversing these steps. The other equality is
handled similarly.
1.J. If $x \in \mathcal{C}(\bigcap \{A_j : j \in J\})$, then $x \notin \bigcap \{A_j : j \in J\}$. This implies that there exists a
$k \in J$ such that $x \notin A_k$. Therefore, $x \in \mathcal{C}(A_k)$, and hence $x \in \bigcup \{\mathcal{C}(A_j) : j \in J\}$. This
proves that $\mathcal{C}(\bigcap A_j) \subseteq \bigcup \mathcal{C}(A_j)$. The opposite inclusion is proved by reversing
these steps. The other equality is similar.

Section 2

2.A. If $(a, c)$ and $(a, c')$ belong to $g \circ f$, then there exist $b$, $b'$ in $B$ such that $(a, b)$,
$(a, b')$ belong to $f$ and $(b, c)$, $(b', c')$ belong to $g$. Since $f$ is a function, $b = b'$; since $g$
is a function, $c = c'$.
2.B. No. Both (0, 1) and (0, -1) belong to C.
2.D. Let f(x) = 2x, g(x) =3x.

2.E. If $(b, a)$, $(b, a')$ belong to $f^{-1}$, then $(a, b)$, $(a', b)$ belong to $f$. Since $f$ is
injective, then $a = a'$. Hence $f^{-1}$ is a function.
2.G. If $f(x_1) = f(x_2)$, then $x_1 = g \circ f(x_1) = g \circ f(x_2) = x_2$. Hence $f$ is injective.
2.H. Apply Exercise 2.G twice.

Section 3

3.A. Let $f(n) = n/2$, $n \in E$.
3.B. Let $f(n) = (n+1)/2$, $n \in O$.
3.C. Let $f(n) = n + 1$, $n \in \mathbf{N}$.
3.E. Let $A_n = \{n\}$, $n \in \mathbf{N}$. Then each set $A_n$ has a single point, but $\mathbf{N} =
\bigcup \{A_n : n \in \mathbf{N}\}$ is infinite.
3.F. If $A$ is infinite and $B = \{b_n : n \in \mathbf{N}\}$ is a subset of $A$, then the function defined
by

$f(x) = b_{n+1}$, for $x = b_n \in B$,
$f(x) = x$, for $x \in A \setminus B$,

is one-one and maps $A$ onto $A \setminus \{b_1\}$.
3.H. If $f$ is a one-one map of $A$ onto $B$ and $g$ is a one-one map of $B$ onto $C$, then
$g \circ f$ is a one-one map of $A$ onto $C$.

Section 4

4.G. Consider three cases: p= 3k, p=3k+1, p=3k+2.

Section 5

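5.A. Since $a^2 \ge 0$ and $b^2 \ge 0$, then $a^2 + b^2 = 0$ implies that $a^2 = b^2 = 0$.
5.D. If $c = 1 + a$ with $a > 0$, then $c^n = (1 + a)^n \ge 1 + na \ge 1 + a = c$.
5.G. Observe that $1 < 2^1 = 2$. If $k < 2^k$ for $k \ge 1$, then $k + 1 \le 2k < 2 \cdot 2^k = 2^{k+1}$.
Therefore, $n < 2^n$ for all $n \in \mathbf{N}$.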
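The induction in 5.G can be spot-checked numerically. The following Python sketch is only a sanity check under our own naming (`induction_step_holds` and `check_up_to` are hypothetical helpers, not part of the text); it does not replace the proof.

```python
def induction_step_holds(k: int) -> bool:
    """Check the chain k + 1 <= 2k < 2 * 2**k == 2**(k + 1) for one k >= 1."""
    return k + 1 <= 2 * k < 2 * 2**k == 2 ** (k + 1)


def check_up_to(n_max: int) -> bool:
    """Check n < 2**n and the induction step for every n in 1..n_max."""
    return all(
        n < 2**n and induction_step_holds(n) for n in range(1, n_max + 1)
    )
```

Calling `check_up_to(50)` exercises both the conclusion and the inductive chain for the first fifty natural numbers.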
5.H. Note that $b^n - a^n = (b - a)(b^{n-1} + \cdots + a^{n-1}) = (b - a)p$, where $p > 0$.
5.M. $\{(x, y) : y = \pm x\}$.
5.N. A square with vertices $(\pm 1, 0)$, $(0, \pm 1)$.

Section 6
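6.A. If $A = \{x_1\}$, then $x_1 = \sup A$. If $A = \{x_1, \ldots, x_n, x_{n+1}\}$ and if $u =
\sup \{x_1, \ldots, x_n\}$, show that $\sup \{u, x_{n+1}\}$ is the supremum of $A$.
6.C. Let $S = \{x \in \mathbf{Q} : x^2 < 2\}$.
6.E. In fact, $\sup (A \cup B) = \sup \{\sup A, \sup B\}$.
6.H. If $S = \sup \{f(x, y) : x \in X, y \in Y\}$, then $f(x, y) \le S$ for all $x \in X$, $y \in Y$, and so
$f_1(x) \le S$ for all $x \in X$. Hence $\sup \{f_1(x) : x \in X\} \le S$. Conversely, if $\varepsilon > 0$ there
exists $(x_0, y_0)$ such that $S - \varepsilon < f(x_0, y_0)$. Hence $S - \varepsilon < f_1(x_0)$ and therefore $S - \varepsilon <
\sup \{f_1(x) : x \in X\}$. Since $\varepsilon > 0$ is arbitrary, we infer that $S \le \sup \{f_1(x) : x \in X\}$.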
6.K. Since $f(x) \le \sup \{f(z) : z \in X\}$, it follows that

$f(x) + g(x) \le \sup \{f(z) : z \in X\} + \sup \{g(z) : z \in X\}$.

Therefore $\sup \{f(x) + g(x) : x \in X\}$ is less than or equal to the right hand side.
Similarly, if $x \in X$, then

$\inf \{f(z) : z \in X\} + g(x) \le f(x) + g(x)$.

If we use 6.J, we infer that

$\inf \{f(z) : z \in X\} + \sup \{g(x) : x \in X\} \le \sup \{f(x) + g(x) : x \in X\}$.

The other assertions are proved in a similar way.

Section 7

7.B. Let $a \in A$; if $a \notin A'$ then $a \in B'$ and so $\xi < \xi' \le a$, a contradiction.
Therefore $a \in A'$ and since $a \in A$ is arbitrary we have $A \subseteq A'$. Since $\xi < \xi'$, there
exists $x \in \mathbf{R}$ with $\xi < x < \xi'$. Since $\xi < x$, we must have $x \in B$. But since $x \in A'$ we
infer that $A \ne A'$.
7.C. Let $A = \{x : x < 1\}$, $B = \{x : x \ge 1\}$ and $A' = \{x : x \le 1\}$, $B' = \{x : x > 1\}$.
7.E. If $x \in I_n$ for all $n$, we have a contradiction to the Archimedean Property 6.6.
7.F. If $x \in J_n$ for all $n$, we have a contradiction to Corollary 6.7(b).
7.H. Every element in $F_1$ has a ternary expansion whose first digit is either 0 or
2. The points in the four subintervals of $F_2$ have ternary expansions beginning
0.00..., 0.02..., 0.20..., 0.22...,
and so forth.
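The digit pattern in 7.H can be made concrete. The sketch below assumes the usual middle-thirds construction, in which $F_n$ consists of $2^n$ closed intervals of length $3^{-n}$; the function name `cantor_left_endpoints` is ours. It builds the left endpoints of those intervals from ternary digits restricted to 0 and 2.

```python
from fractions import Fraction
from itertools import product


def cantor_left_endpoints(n: int) -> list[Fraction]:
    """Left endpoints of the 2**n intervals of F_n, built from ternary digits 0 and 2."""
    endpoints = []
    for digits in product((0, 2), repeat=n):
        # The digit string d_1 d_2 ... d_n stands for the sum of d_i / 3**i.
        value = sum(
            (Fraction(d, 3 ** (i + 1)) for i, d in enumerate(digits)),
            Fraction(0),
        )
        endpoints.append(value)
    return sorted(endpoints)
```

For $n = 2$ the digit strings 00, 02, 20, 22 give the endpoints 0, 2/9, 2/3, 8/9, which are the left ends of the four subintervals mentioned in the hint.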
7.J. If $n$ is sufficiently large, $1/3^n < b - a$.
7.K. As close to 1 as desired.

Section 8

8.E. Property 8.3(ii) is not satisfied.


8.H. The set $S_1$ is the interior of the square with vertices $(0, \pm 1)$, $(\pm 1, 0)$ and $S_\infty$ is
the interior of the square with vertices $(1, \pm 1)$, $(-1, \pm 1)$.
8.K. Take $a = 1/\sqrt{p}$, $b = 1$.
8.L. Take $a = 1/p$, $b = 1$.
8.M. We have $|x \cdot y| \le \sum |x_i| |y_i| \le \{\sum |x_i|\} \sup |y_i| \le \|x\|_1 \|y\|_1$. But $|x \cdot y| \le
p \|x\|_\infty \|y\|_\infty$ and if $x = y = (1, 1, \ldots, 1)$, equality is attained.
8.N. The stated relation implies that

$\|x\|^2 + 2(x \cdot y) + \|y\|^2 = \|x + y\|^2 = (\|x\| + \|y\|)^2 = \|x\|^2 + 2 \|x\| \|y\| + \|y\|^2$.

Hence $x \cdot y = \|x\| \|y\|$ and the condition for equality in Theorem 8.7 holds provided
the vectors are non-zero.
8.P. Since $\|x + y\|^2 = \|x\|^2 + 2(x \cdot y) + \|y\|^2$, the stated relation holds if and only if
$x \cdot y = 0$.
8.Q. A set $K$ is convex if and only if it contains the line segment joining any two
points in $K$. If $x, y \in K_1$, then $\|tx + (1 - t)y\| \le t \|x\| + (1 - t) \|y\| \le t + (1 - t) = 1$ so
$tx + (1 - t)y \in K_1$ for $0 \le t \le 1$. The points $(\pm 1, 0)$ belong to $K_4$, but their midpoint
$(0, 0)$ does not belong to $K_4$.
8.R. If $x, y$ belong to $\bigcap K_\alpha$, then $x, y \in K_\alpha$ for all $\alpha$. Hence $tx + (1 - t)y \in K_\alpha$ for
all $\alpha$; whence it follows that $\bigcap K_\alpha$ is convex. Consider the union of two disjoint
intervals.

Section 9

9.A. If $x \in G$, let $r = \inf \{x, 1 - x\}$. Then, if $|y - x| < r$, we have $x - r < y < x + r$
whence $0 \le x - r < y < x + r \le 1$, so that $y \in G$. If $z = 0$, then there does not exist a
real number $r > 0$ such that every point $y$ in $\mathbf{R}$ satisfying $|y| < r$ belongs to $F$.
Similarly for $z = 1$.
9.B. If $x \in G$, take $r = 1 - \|x\|$. If $x \in H$, take $r = \inf \{\|x\|, 1 - \|x\|\}$. If $z = (1, 0)$,
then for any $r > 0$, there is a point $y$ in $\mathcal{C}(F)$ such that $\|y - z\| < r$.
9.G. Enumerate the points in the open set with all coordinates rational numbers.
Then proceed as in the proof of Theorem 9.11 using open balls with center at
these rational points.
9.H. Argue as in the preceding exercise, but this time use closed balls.
9.I. Take complements and apply 9.H.
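9.J. The set $A^\circ$ is the union of the collection of all open sets in $A$. Hence any
open set $G \subseteq A$ must be contained in $A^\circ$. By its definition we must have $A^\circ \subseteq A$.
It follows that $(A^\circ)^\circ \subseteq A^\circ$. Since $A^\circ$ is open and $(A^\circ)^\circ$ is the union of all open sets
in $A^\circ$, we must have $A^\circ \subseteq (A^\circ)^\circ$. Therefore $(A^\circ)^\circ = A^\circ$. Since $A^\circ \subseteq A$ and $B^\circ \subseteq B$
it follows that $A^\circ \cap B^\circ \subseteq A \cap B$; but since $A^\circ \cap B^\circ$ is open this implies that
$A^\circ \cap B^\circ \subseteq (A \cap B)^\circ$. On the other hand $(A \cap B)^\circ$ is an open set and is contained
in $A$ and $B$; therefore $(A \cap B)^\circ \subseteq A^\circ$ and $(A \cap B)^\circ \subseteq B^\circ$, whence $(A \cap B)^\circ \subseteq
A^\circ \cap B^\circ$. Consequently $A^\circ \cap B^\circ = (A \cap B)^\circ$. Since $\mathbf{R}^p$ is open, $(\mathbf{R}^p)^\circ = \mathbf{R}^p$. Let $A$
be the set of all rational numbers in $(0, 1)$ and $B$ be the set of all irrational numbers
in $(0, 1)$. Then $A^\circ \cup B^\circ = \emptyset$, while $(A \cup B)^\circ = (0, 1)$.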
9.L. Either argue as in 9.J, or take complements and use 9.J.
9.N. If $p = 1$, take $A = \mathbf{Q}$. In $\mathbf{R}^p$, take $\mathbf{Q}^p$.
9.O. Suppose $A$, $B$ are open in $\mathbf{R}$. Let $(x, y) \in A \times B$, so that $x \in A$ and $y \in B$.
There exist $r > 0$ such that if $|x' - x| < r$ then $x' \in A$ and $s > 0$ such that if $|y' - y| < s$
then $y' \in B$. Now let $t = \inf \{r, s\}$; the open ball with radius $t$ is contained in $A \times B$.
The converse is similar.

Section 10

10.C. If $x$ is a cluster point of $A$ in $\mathbf{R}^p$ and $N$ is a neighborhood of $x$, then
$N \cap \{y \in \mathbf{R}^p : \|y - x\| < 1\}$ contains a point $a_1 \in A$, $a_1 \ne x$. The set
$N \cap \{y \in \mathbf{R}^p : \|y - x\| < \|a_1 - x\|\}$ contains a point $a_2 \in A$, $a_2 \ne x$ and also $a_2 \ne a_1$.
Continue this process.
10.F. Every neighborhood of $x$ contains infinitely many points of $A \cup B$. Hence
either $A$ or $B$ (or possibly both) must possess an infinite number of elements in this
neighborhood.

Section 11

11.A. Let $G_n = \{(x, y) : x^2 + y^2 < 1 - 1/n\}$ for $n \in \mathbf{N}$.
11.B. Let $G_n = \{(x, y) : x^2 + y^2 < n^2\}$ for $n \in \mathbf{N}$.
11.C. Let $\mathcal{G} = \{G_\alpha\}$ be an open covering for $F$ and let $G = \mathcal{C}(F)$, so that $G$ is
open in $\mathbf{R}^p$. If $\mathcal{G}_1 = \mathcal{G} \cup \{G\}$, then $\mathcal{G}_1$ is an open covering for $K$; hence $K$ has a finite
subcovering $\{G, G_\alpha, G_\beta, \ldots, G_\omega\}$. Then $\{G_\alpha, G_\beta, \ldots, G_\omega\}$ forms a subcovering of $\mathcal{G}$
for the set $F$.
11.D. Observe that if $G$ is open in $\mathbf{R}$, then there exists an open subset $G_1$ of $\mathbf{R}^2$
such that $G = G_1 \cap \mathbf{R}$. Alternatively, use the Heine-Borel Theorem.
11.E. Let $\mathcal{G} = \{G_\alpha\}$ be an open covering of the closed unit interval $J$ in $\mathbf{R}^2$.
Consider those real numbers $x$ such that the square $[0, x] \times [0, x]$ is contained in
the union of a finite number of sets in $\mathcal{G}$ and let $x^*$ be their supremum.
11.G. Let $x_n \in F_n$, $n \in \mathbf{N}$. If there are only a finite number of points in
$\{x_n : n \in \mathbf{N}\}$, then at least one of them occurs infinitely often and is a common point.
If there are infinitely many points in the bounded set $\{x_n\}$, then there is a cluster
point $x$. Since $x_m \in F_n$ for $m \ge n$ and since $F_n$ is closed, then $x \in F_n$ for all $n \in \mathbf{N}$.
11.H. If $d(x, F) = 0$, then $x$ is a cluster point of the closed set $F$.
11.J. No. Let $F = \{y \in \mathbf{R}^p : \|y - x\| = r\}$, then every point of $F$ has the same
distance to $x$.
11.K. Let $G$ be an open set and let $x \in \mathbf{R}^p$. If $H = \{y - x : y \in G\}$, then $H$ is an
open set in $\mathbf{R}^p$.
11.M. Follow the argument in 11.7, except use open cells instead of open balls.
11.Q. Suppose that $\mathbf{Q} = \bigcap \{G_n : n \in \mathbf{N}\}$, where $G_n$ is open in $\mathbf{R}$. The complement
$F_n$ of $G_n$ is a closed set which does not contain any non-empty open subset, by
Theorem 6.10. Hence the set of irrationals is the union of a countable family of
closed sets not one of which contains a non-empty open set; but this contradicts
Exercise 11.P.

Section 12
12.B. Let $A$, $B$ be a disconnection for $C' = C \cup \{x\}$. Then $A \cap C'$ and $B \cap C'$
are disjoint, non-empty, and have union $C'$. One of these sets must contain $x$;
suppose it is $B$. Since $B$ is an open set, it also contains points of $C$ so
$C \cap (B \setminus \{x\}) \ne \emptyset$. But then $A$, $B \setminus \{x\}$ form a disconnection of $C$.
12.E. Modify the proof of Theorem 12.4.
12.G. By Theorem 12.8, the sets $C_1$ and $C_2$ are intervals. It is easily seen that
$C_1 \times C_2$ is convex so 12.E applies.

Section 13
13.A. Examine the geometrical position of $iz = (-y, x)$ in terms of $z = (x, y)$.
13.B. Note that $cz = (x \cos\theta - y \sin\theta, x \sin\theta + y \cos\theta)$, and this corresponds to a
counter-clockwise rotation of $\theta$ radians about the origin.
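A short numerical check of 13.B, assuming (as the formula suggests) that $c = \cos\theta + i \sin\theta$. The helper names `rotate` and `rotate_by_formula` are ours; the sketch only illustrates that complex multiplication by $c$ agrees with the coordinate formula in the hint.

```python
import math


def rotate(z: complex, theta: float) -> complex:
    """Multiply z by c = cos(theta) + i sin(theta) (assumed form of c)."""
    c = complex(math.cos(theta), math.sin(theta))
    return c * z


def rotate_by_formula(x: float, y: float, theta: float) -> tuple[float, float]:
    """The coordinate formula (x cos t - y sin t, x sin t + y cos t) from the hint."""
    return (
        x * math.cos(theta) - y * math.sin(theta),
        x * math.sin(theta) + y * math.cos(theta),
    )
```

Rotating $z = 1$ by $\pi/2$ gives (up to rounding) $i$, the counter-clockwise quarter turn.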
13.C. The circle $|z - c| = r$ is mapped into the circle $|w - (ac + b)| = |a| r$. We can
write $z = a^{-1} w - a^{-1} b$ and calculate $x = \mathrm{Re}\, z$, $y = \mathrm{Im}\, z$ in terms of $u = \mathrm{Re}\, w$,
$v = \mathrm{Im}\, w$. Doing so we easily see that the equation $ax + by = c$ transforms into an
equation of the form $Au + Bv = C$.

13.D. A circle is left fixed by g if and only if its center lies on the real axis. The
only lines left fixed by g are the real and the imaginary axis.
13.E. Circles passing through the origin are sent into lines by h. All lines not
passing through the origin are sent into circles passing through the origin; all lines
passing through the origin are sent into lines passing through the origin.
13.F. Every point of $\mathbf{C}$, except the origin, is the image under $g$ of two elements
of $\mathbf{C}$. If $\mathrm{Re}\, g(z) = k$, then $x^2 - y^2 = k$. If $\mathrm{Im}\, g(z) = k$, then $2xy = k$. If $|g(z)| = k$,
then $k \ge 0$ and $|z| = \sqrt{k}$.

Section 14

14.B. Note that

$0 < \dfrac{1}{n} - \dfrac{1}{n+1} = \dfrac{1}{n(n+1)} < \dfrac{1}{n}$.

14.D. We have $0 \le |\, \|x_n\| - \|x\| \,| \le \|x_n - x\|$.
14.I. Let $r \in \mathbf{R}$ be such that $\lim (x_{n+1}/x_n) < r < 1$. Since the interval $(-1, r)$ is a
neighborhood of this limit, there exists $K \in \mathbf{N}$ such that $0 < x_{n+1}/x_n < r$ for all $n \ge K$.
Now show that $0 < x_n \le C r^n$ for some $C$ and $n \ge K$.
14.K. Consider (1/n) and (n).
14.L. Sequences (a), (b), (e), and (f) converge; sequences (c) and (d) diverge.
14.M. Let $r \in \mathbf{R}$ be such that $\lim (x_n^{1/n}) < r < 1$. Since the interval $(-1, r)$ is a
neighborhood of this limit, there exists $K \in \mathbf{N}$ such that $0 < x_n^{1/n} < r$, whence
$0 < x_n < r^n$ for all $n \ge K$.

Section 15

15.A. Consider $z_n = y_n - x_n$ and apply Example 15.5(c) and Theorem 15.6(a).
15.C. (a) Converges to 1. (b) Diverges. (f) Diverges.
15.D. Let $Y = -X$.
15.F. Consider two cases: $x = 0$ and $x > 0$.
15.G. Yes.
15.H. Use the hint in Exercise 15.F.
15.L. Observe that $b \le x_n \le b\, 2^{1/n}$.

Section 16

16.A. By induction $1 < x_n < 2$ for $n \ge 2$. Since $x_{n+1} - x_n = (x_n - x_{n-1})/(x_n x_{n-1})$, the
sequence is monotone.
16.C. The sequence is monotone and bounded. The limit is $(1 + (1 + 4a)^{1/2})/2$.
16.D. The sequence $X$ is monotone decreasing and bounded.
16.E. An element $x_k$ of $X = (x_n)$ is called a “peak” for $X$ if $x_k \ge x_n$ for $n > k$.
(i) If there are infinitely many peaks with indices $k_1 < k_2 < \cdots$, then the sequence
$(x_{k_j})$ of peaks is a decreasing subsequence of $X$.
(ii) If there are only a finite number of peaks with indices $k_1 < \cdots < k_r$, let
$m_1 > k_r$. Since $x_{m_1}$ is not a peak, there exists $m_2 > m_1$ such that $x_{m_1} < x_{m_2}$. Continuing
in this way we obtain a strictly increasing subsequence $(x_{m_j})$ of $X$.
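The notion of a peak in 16.E can be illustrated on a finite list. This is only a sketch: in a finite list the last element is always a peak, so case (ii) of the hint has no finite analogue, and the function name `peak_indices` is ours.

```python
def peak_indices(xs: list[float]) -> list[int]:
    """Indices k with xs[k] >= xs[n] for every n > k, in increasing order."""
    peaks = []
    running_max = float("-inf")  # maximum of the elements to the right of k
    for k in range(len(xs) - 1, -1, -1):
        if xs[k] >= running_max:
            peaks.append(k)
            running_max = xs[k]
    peaks.reverse()
    return peaks
```

Reading the values at the returned indices from left to right gives a non-increasing subsequence, which is the finite counterpart of case (i).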
16.G. The sequence is increasing and $x_n \le n/(n+1) < 1$.
16.K. There exists $K \in \mathbf{N}$ such that if $n \ge K$, then $L - \varepsilon \le x_{n+1}/x_n \le L + \varepsilon$. Now
use an argument similar to the one in Exercise 14.I.
16.M. (a) e, (b) e*”, (c) Hint: (1+ 2/n)=(14+ 1/n)(1+ 1/(n+1)), (d) e*.
16.P. Let $y_n \in F$ be such that $\|x - y_n\| < d + 1/n$. If $y = \lim (y_{n_k})$, then $\|x - y\| = d$.

Section 17

17.A. All.
17.C. If $x \in \mathbf{Z}$, the limit is 1; if $x \notin \mathbf{Z}$, the limit is 0.
17.E. If x =0, the limit is 1; if x0, the limit is 0.
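17.G. If $x > 0$ and $0 < \varepsilon < \pi/2$, then $\tan (\pi/2 - \varepsilon) > 0$. Therefore $nx \ge
\tan (\pi/2 - \varepsilon)$ for all sufficiently large $n$, from which $\pi/2 - \varepsilon \le \mathrm{Arctan}\, nx < \pi/2$.
17.H. If $x > 0$, then $e^{-x} < 1$. 17.J. Not necessarily.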
17.M. Consider the sequence (1/n) or note that ||fillp = 3.
17.P. Yes. 17.Q. Yes.

Section 18

18.A. (a) $\pm 1$. (b) 0. (c) +1. (d) +1.
18.E. Let $m, p \in \mathbf{N}$, $p \le m$. Then $v_m(X + Y) = \sup \{x_n + y_n : n \ge m\} \le
\sup \{x_n : n \ge m\} + \sup \{y_n : n \ge m\} = v_m(X) + v_m(Y) \le v_p(X) + v_m(Y)$. Therefore
$(x + y)^* = \inf \{v_m(X + Y) : m \in \mathbf{N}\} \le v_p(X) + y^*$.
Since this is true for all $p \in \mathbf{N}$, we infer that $(x + y)^* \le x^* + y^*$.
18.G. (a) +0. (c) 0, +.
Section 19

19.E. If $j \le n$, then $x_j \le x_{n+1}$ and $x_j (1 + 1/n) \le x_j + (1/n) x_{n+1}$. Now add.
19.I. If $X$ is increasing and not convergent in $\mathbf{R}$, then $X$ is not bounded.
19.K. (a) None exist. (b,c) All three are equal. (d) The iterated limits are
different and the double limit does not exist. (e) The double limit and one iterated
limit are equal. (f) The iterated limits are equal, but the double limit does not
exist.
19.L. Let $x_{mn} = n$ if $m = 1$ and $x_{mn} = 0$ if $m > 1$.
19.N. In (b, c, e).
19.O. Apply Corollary 19.7 to $x = \sup \{x_{mn} : m, n \in \mathbf{N}\}$.
19.P. Let x =0 for mn and Xm =(-1)"/n for m =n.

Section 20

20.A. If $a = 0$, take $\delta(\varepsilon) = \varepsilon^2$. If $a > 0$, use the estimate

$|\sqrt{x} - \sqrt{a}| = \dfrac{|x - a|}{\sqrt{x} + \sqrt{a}} \le \dfrac{|x - a|}{\sqrt{a}}$.
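The estimate in 20.A suggests an explicit $\delta$. The sketch below takes $\delta = \varepsilon^2$ at $a = 0$ and, as one admissible choice that follows from the estimate (our choice, not stated in the hint), $\delta = \varepsilon \sqrt{a}$ for $a > 0$; the name `sqrt_delta` is hypothetical.

```python
import math


def sqrt_delta(a: float, eps: float) -> float:
    """A delta that works for the continuity of sqrt at a >= 0 with tolerance eps."""
    if a == 0:
        return eps**2
    # From |sqrt(x) - sqrt(a)| <= |x - a| / sqrt(a): |x - a| < eps * sqrt(a) suffices.
    return eps * math.sqrt(a)


def within_tolerance(a: float, eps: float, x: float) -> bool:
    """Check |sqrt(x) - sqrt(a)| < eps for a sample point x >= 0."""
    return abs(math.sqrt(x) - math.sqrt(a)) < eps
```

Sampling points $x \ge 0$ with $|x - a| < \delta$ and calling `within_tolerance` gives a quick experimental confirmation of the bound.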
20.B. Apply Example 20.5(b) and Theorem 20.6.
20.C. Apply Exercise 20.B and Theorem 20.6.
20.E. Show that $|f(x) - f(\tfrac{1}{2})| = |x - \tfrac{1}{2}|$.


20.F. Every real number is a limit of a sequence of rational numbers.
20.J. There exist sequences $(x_n)$, $(y_n)$ such that $\lim (h(x_n)) = 1$, $\lim (h(y_n)) = -1$.
20.L. Show that f(a+h)—f(a)=f(h)—f(0). If f is monotone on R, then it is
continuous at some point.
20.M. Show that $f(0) = 0$ and $f(n) = nc$ for $n \in \mathbf{N}$. Also $f(n) + f(-n) = 0$, so
$f(n) = nc$ for $n \in \mathbf{Z}$. Since $f(m/n) = m f(1/n)$, it follows on taking $m = n$ that
$f(1/n) = c/n$, whence $f(m/n) = c(m/n)$. Now use the continuity of $f$.
20.N. Either g(0) =0, in which case g(x) =0 for all x in R, or g(0) = 1, in which
case g(a +h)— g(a) = g(a){g(h) — g(0)}.

Section 21

21.C. $f(1, 1) = (3, 1, -1)$, $f(1, 3) = (5, 1, -3)$.
21.D. A vector $(a, b, c)$ is in the range of $f$ if and only if $a - 2b + c = 0$.
21.G. If $\Delta = 0$, then $f(-b, a) = (0, 0)$. If $\Delta \ne 0$, then the only solution of

$ax + by = 0$, $cx + dy = 0$

is $(x, y) = (0, 0)$.
21.I. Note that $g(x) = g(y)$ if and only if $g(x - y) = \theta$.
21.P. Note that $c_{ij} = e_i \cdot f(e_j)$ and apply the Schwarz Inequality.

Section 22

22.C. If $f(x_0) > 0$, then $V = \{y \in \mathbf{R} : y > 0\}$ is a neighborhood of $f(x_0)$.
22.H. Let $f(s, t) = 0$ if $st = 0$ and $f(s, t) = 1$ if $st \ne 0$.
22.L. Suppose the coefficient of the highest power is positive. Show that there
exist $x_1 < 0 < x_2$ such that $f(x_1) < 0 < f(x_2)$.
22.M. Let $f(x) = x^n$. If $c > 1$, then $f(0) = 0 < c < f(c)$.
22.N. If $f(c) > 0$, there is a neighborhood of $c$ on which $f$ is positive, whence
$c \ne \sup N$. Similarly if $f(c) < 0$.
22.O. Since $f$ is strictly increasing, and $a < b$, then $f$ maps the open interval
$(a, b)$ in a one-one fashion onto the open interval $(f(a), f(b))$, from which it follows
that $f^{-1}$ is continuous.
22.P. Yes. Let $a < b$ be fixed and suppose that $f(a) < f(b)$. If $c$ is such that
$a < c < b$, then either (i) $f(c) = f(a)$, (ii) $f(c) < f(a)$, or (iii) $f(a) < f(c)$. Case (i) is
excluded by hypothesis. If (ii), then there exists $a_1$ in $(c, b)$ such that $f(a_1) = f(a)$, a
contradiction. Hence (iii) must hold. Similarly, $f(c) < f(b)$ and $f$ is strictly
increasing.
22.Q. Assume that $g$ is continuous and let $c_1 < c_2$ be the two points in $I$ where $g$
attains its supremum. If $0 < c_1$, choose numbers $a_1$, $a_2$ such that $0 < a_1 < c_1 < a_2 < c_2$
and let $k$ satisfy $g(a_i) < k < g(c_i)$. Then there exist three numbers $b_i$ such that
$a_1 < b_1 < c_1 < b_2 < a_2 < b_3 < c_2$ and where $k = g(b_i)$, a contradiction. Therefore, we
must have $c_1 = 0$ and $c_2 = 1$. Now apply the same type of argument to the points
where $g$ attains its infimum to obtain a contradiction.
22.S. Note that $\varphi^{-1}(S)$ is not compact. Also $g^{-1}$ is not continuous at $(1, 0)$.

Section 23
23.A. The functions in Example 20.5(a, b, i) are uniformly continuous on R.
23.G. The function $g$ is bounded and uniformly continuous on $[0, p]$.
23.I. If $(x_n)$ is a sequence in $(0, 1)$ with $x_n \to 0$, then $(f(x_n))$ is a Cauchy
sequence and is therefore convergent in $\mathbf{R}$.
23.K. Take $f(x) = \sin x$, $g(x) = x$, for $x \in \mathbf{R}$.

Section 24

24.B. Take $((1/n) f)$, where $f$ is as in Example 20.5(g).


24.C. Obtain the function in Example 20.5(h) in this way.
24.E. (a) The convergence is uniform on $[0, 1]$. (b) The convergence is uniform
on any closed set not containing 1. (c) The convergence is uniform on $[0, 1]$
or on $[c, +\infty)$, $c > 1$.
24.J. It follows that $f$ is monotone increasing. Since $f$ is uniformly continuous, if
$\varepsilon > 0$, let $0 = x_0 < x_1 < \cdots < x_h = 1$ be such that $f(x_j) - f(x_{j-1}) < \varepsilon$ and let $n_j$ be such
that if $n \ge n_j$, then $|f(x_j) - f_n(x_j)| < \varepsilon$. If $n \ge \sup \{n_0, n_1, \ldots, n_h\}$, show that
$|f(x) - f_n(x)| < 3\varepsilon$ for all $x \in I$.
24.S. Any polynomial (or uniform limit of a sequence of polynomials) is
bounded on a bounded interval.

Section 25

25.G. (b) If $\varepsilon > 0$, there exists $\delta(\varepsilon) > 0$ such that if $c < x < c + \delta(\varepsilon)$, $x \in D(f)$,
then $|f(x) - b| < \varepsilon$. (c) If $(x_n)$ is any sequence in $D(f)$ such that $c < x_n$ and
$c = \lim (x_n)$, then $b = \lim (f(x_n))$.
25.J. (a) If $M > 0$, there exists $m > 0$ such that if $x \ge m$ and $x \in D(f)$, then
$f(x) \ge M$. (b) If $M < 0$, there exists $\delta > 0$ such that if $0 < |x - c| < \delta$, then $f(x) < M$.
25.L. (a) Let $\varphi(r) = \sup \{f(x) : x > r\}$ and set $L = \lim_{r \to +\infty} \varphi(r)$. Alternatively, if
$\varepsilon > 0$ there exists $m(\varepsilon)$ such that if $r \ge m(\varepsilon)$, then $|\sup \{f(x) : x > r\} - L| < \varepsilon$.
25.M. Apply Lemma 25.12.
25.N. Consider the function $f(x) = -1/|x|$ for $x \ne 0$ and $f(0) = 0$.
25.P. Consider Example 20.5(h).
25.R. Not necessarily. Consider $f_n(x) = -x^n$ for $x \in I$.
25.S. Yes.

Section 26

26.B. Show that the collection of polynomials in $\cos x$ satisfies the hypotheses
of the Stone-Weierstrass Theorem.
26.E. If $f(0) = f(\pi) = 0$, first approximate $f$ by a function $g$ vanishing on some
intervals $[0, \delta]$ and $[\pi - \delta, \pi]$. Then consider $h(x) = g(x)/\sin x$ for $x \in (0, \pi)$,
$h(x) = 0$ for $x = 0, \pi$.
26.I. Consider $f(x) = \sin (1/x)$ for $x \ne 0$.
26.K. Use the Heine-Borel Theorem or the Lebesgue Covering Theorem as in
the proof of the Uniform Continuity Theorem.
26.Q. (a) Domain compact, sequence uniformly equicontinuous but not
bounded. (b) Domain compact, sequence bounded but not uniformly
equicontinuous. (c) Domain not compact, sequence bounded, and uniformly
equicontinuous.

Section 27

27.D. Observe that $g'(0) = 0$ and that $g'(x) = 2x \sin (1/x) - \cos (1/x)$ for $x \ne 0$.
27.E. Yes.
27.L. We can write

$\dfrac{f(x) - f(y)}{x - y} = \dfrac{x - c}{x - y} \cdot \dfrac{f(x) - f(c)}{x - c} - \dfrac{y - c}{x - y} \cdot \dfrac{f(y) - f(c)}{y - c}$.

27.S. (b) If $b \ne 0$, then for $n \in \mathbf{N}$ sufficiently large, given $x > n$, there is an $x_n > n$
such that

$|(f(x) - f(n))/x| = |(x - n)/x| \, |f'(x_n)| \ge |(x - n)/x| \, |b|/2$.

Section 28

28.F. Between consecutive roots of $p'$ the polynomial is strictly monotone. If $x_0$
is a root of odd multiplicity of $p'$, then $x_0$ is a point of strict extremum for $p$.
28.H. The function $f$ has roots of multiplicity $n$ at $x = \pm 1$; $f'$ has roots of
multiplicity $n - 1$ at $x = \pm 1$, and a simple root inside $(-1, 1)$; etc.
28.O. Use Exercise 27.O.

Section 29
29.D. If $\varepsilon > 0$, then there are rational numbers $r_1, \ldots, r_m$ in $I$ such that
$0 \le f(x) < \varepsilon$ if $x \ne r_j$. Let $P$ be a partition such that each of the (at most $2m$)
subintervals containing one of the $r_1, \ldots, r_m$ has length less than $\varepsilon/2m$. Show that
$0 \le S(P; f, g) \le 2\varepsilon$.
29.J. If $f_1(x) = f(x)$ for $x \notin \{c_1, \ldots, c_m\}$ and $\varepsilon > 0$, let $P$ be a partition such that
each of the subintervals containing one of the $c_1, \ldots, c_m$ has length less than
$\varepsilon/2mM$, where $M = \sup \{\|f\|_J, \|f_1\|_J\}$. Using the same intermediate points, we have
$|S(P; f, g) - S(P; f_1, g)| < \varepsilon$, where $g(x) = x$ for $x \in J$.
29.N. Suppose $c \in (a, b)$; then $f$ is $g$-integrable over $[a, c]$ and $[c, b]$. If $g_1$ is the
restriction of $g$ to $[a, c]$, it follows from 27.N that $g_1'$ is continuous on $[a, c]$;
similarly for the restriction $g_2$ of $g$ to $[c, b]$. It follows from Theorem 29.8 that $f g_1'$
is integrable over $[a, c]$ and that $f g_2'$ is integrable over $[c, b]$ and that

$\int_a^c f \, dg = \int_a^c f g_1'$, $\int_c^b f \, dg = \int_c^b f g_2'$.

Now let $(fg')(x) = f(x) g_1'(x)$ for $a \le x \le c$ and $(fg')(x) = f(x) g_2'(x)$ for $c < x \le b$.
29.P. If $\|P\| < \delta$ and if $Q$ is a refinement of $P$, then $\|Q\| < \delta$.
29.R. If $\varepsilon > 0$, let $P_\varepsilon = (x_0, x_1, \ldots, x_n)$ be a partition of $J$ such that if $P \supseteq P_\varepsilon$ and
$S(P; f)$ is any corresponding Riemann sum, then $|S(P; f) - \int_a^b f| < \varepsilon$. Let $M \ge \|f\|_J$
and let $\delta = \varepsilon/4nM$. If $Q = (y_0, y_1, \ldots, y_m)$ is a partition with norm $\|Q\| < \delta$, let
$Q^* = Q \cup P_\varepsilon$, so that $Q^* \supseteq P_\varepsilon$ and has at most $n - 1$ more points than $Q$. Show that
$S(Q^*; f) - S(Q; f)$ reduces to at most $2(n - 1)$ terms of the form $\pm \{f(\xi) - f(\eta)\}
(x_j - y_k)$ with $|x_j - y_k| < \delta$.

Section 30

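30.C. If $\varepsilon > 0$ is given, let $P_\varepsilon$ be as in the proof of 30.2. If $P$ is any refinement of
$P_\varepsilon$, then

$|S(P_\varepsilon; f, g) - S(P; f, g)| \le \sum |f(u_k) - f(v_k)| \, |g(x_k) - g(x_{k-1})|$

where $|u_k - v_k| < \delta(\varepsilon)$ and hence this sum is dominated by $\varepsilon M$. Now use the
Cauchy Criterion.
30.E. Direct estimation yields

$\left( \int_a^b (f(x))^n \, dx \right)^{1/n} \le M (b - a)^{1/n}$.

Conversely, $f(x) \ge M - \varepsilon$ on some subinterval of $[a, b]$.
30.H. If $m \le f(x) \le M$ for $\alpha \le x \le \beta$, there exists an $A$ with $m \le A \le M$ such
that

$F(\beta) - F(\alpha) = \int_\alpha^\beta f \, dg = A \{g(\beta) - g(\alpha)\}$.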


30.I. Let $f(x) = -1$ for $x \in [-1, 0)$ and $f(x) = 1$ for $x \in [0, 1]$.
30.J. Apply the Mean Value Theorem 27.6 to obtain F(b) — F(a) as a Riemann
sum for the integral of f.
30.M. If $m \le f(x) \le M$ for $x \in J$, then

$m \int_a^b p \le \int_a^b f p \le M \int_a^b p$.

Now use Bolzano’s Theorem 22.4.
30.P. The functions $\varphi$, $\varphi^{-1}$ are one-one and continuous. The partitions of $[c, d]$
are in one-one correspondence with the partitions of $[a, b]$ and Riemann-Stieltjes
sums of $f \circ \varphi$ with respect to $g \circ \varphi$ are in one-one correspondence with those of $f$
with respect to $g$.
30.V. (a)% (€) 9 fe) 1/2.

Section 31

31.K. Since $f_n(x) - f_n(c) = \int_c^x f_n'$, we can apply Theorem 31.2 to obtain $f(x) -
f(c) = \int_c^x g$ for all $x \in J$. Show that $g = f'$.
31.8. Apply Theorem 30.9 to (31.2) with h(t)=(b-1).
31.V. Prove that the functions G and H are continuous. The remainder of the
proof is as in 31.9.
31.X. The function $f$ is uniformly continuous on $J_1 \times J_2$.
31.Y. Let g(0)=0, g(x) =} for O0<x <3, and g(x) =1.

Section 32
32.D. (a), (b), (d), (e) are convergent.
32.E. (a) is convergent for p, q>—1. (b) is convergent for p+q>—1.
32.F. (a) and (c) are absolutely convergent. (b) is divergent.
32.G. (a) is absolutely convergent if $q > p + 1$. (b) is convergent if $q > 0$ and
absolutely convergent if q>1.

Section 33

33.A. If $0 \le t \le \beta$, then $x^t e^{-x} \le x^\beta e^{-x}$.


33.B. Apply the Dirichlet Test 33.4.
33.C. (a) is uniformly convergent for $|t| \ge a > 0$. (b) diverges if $t \le 0$ and is
uniformly convergent if $t \ge c > 0$. (c) and (e) are uniformly convergent for all $t$.
33.F. va.

Section 34

34.C. Group the terms in the series $\sum_{n=1}^{\infty} (-1)^n$ to produce convergence to $-1$
and to 0.
34.G. Consider $\sum ((-1)^n n^{-1/2})$. However, consider also the case where $a_n \ge 0$.
34.H. If $a, b \ge 0$, then $2(ab)^{1/2} \le a + b$.
34.J. Show that $b_1 + b_2 + \cdots + b_n \ge a_1 (1 + \tfrac{1}{2} + \cdots + 1/n)$.
34.3. Use Exercise 34.F(a).
34.K. Show that $a_1 + a_2 + \cdots + a_{2^n}$ is bounded below by $\tfrac{1}{2} \{a_1 + 2a_2 + \cdots + 2^n a_{2^n}\}$
and above by $a_1 + 2a_2 + \cdots + 2^{n-1} a_{2^{n-1}} + a_{2^n}$.
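The two bounds in 34.K (the Cauchy condensation estimate for a decreasing sequence of positive terms) can be checked exactly with rational arithmetic. In the sketch below the middle terms of the sums are taken to be $2^j a_{2^j}$, which is our reading of the ellipses; the function name `condensation_bounds` is ours as well.

```python
from fractions import Fraction
from typing import Callable


def condensation_bounds(
    a: Callable[[int], Fraction], n: int
) -> tuple[Fraction, Fraction, Fraction]:
    """Return (lower, partial_sum, upper) for a_1 + ... + a_{2**n}."""
    partial = sum((a(k) for k in range(1, 2**n + 1)), Fraction(0))
    # Lower bound: (1/2) * (a_1 + 2 a_2 + 4 a_4 + ... + 2**n a_{2**n}).
    lower = Fraction(1, 2) * sum(
        (2**j * a(2**j) for j in range(n + 1)), Fraction(0)
    )
    # Upper bound: a_1 + 2 a_2 + ... + 2**(n-1) a_{2**(n-1)} + a_{2**n}.
    upper = sum((2**j * a(2**j) for j in range(n)), Fraction(0)) + a(2**n)
    return lower, partial, upper
```

With the harmonic terms $a_k = 1/k$ the lower bound grows like $(n+1)/2$, which is the usual route to the divergence of the harmonic series.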
34.O. Consider the partial sums $s_k$ with $n/2 \le k \le n$ and apply the Cauchy
Criterion.

Section 35

35.C. (a) and (e) are divergent. (b) is convergent.


35.D. (b), (c), and (e) are divergent.
35.G. (a) is convergent. (c) is divergent.
35.L. If $r < 1$, then $\log m < \log (m + 1) - r/m$ for sufficiently large $m \in \mathbf{N}$. Show
that the sequence $(x_n n \log n)$ is increasing.

Section 36

36.A. Apply Dirichlet’s Test.


36.D. (a) is convergent. (b) is divergent.
36.E. (c) If $\sum (a_n)$ is absolutely convergent, so is $\sum (b_n)$. If $a_n = 0$ except when
$\sin n$ is near $\pm 1$, we can obtain a counter-example. (d) Consider $a_n = 1/n(\log n)^2$.
36.I. If $m > n$, then $s_{mn} = +1$; if $m = n$, then $s_{mn} = 0$; if $m < n$, then $s_{mn} = -1$.
36.K. Note that $2mn \le m^2 + n^2$.

Section 37
37.A. (a) and (c) converge uniformly for all $x$. (b) converges for $x \ne 0$ and
uniformly for $x$ in the complement of any neighborhood of $x = 0$. (d) converges for
$x > 1$ and uniformly for $x \ge a$, where $a > 1$.
37.C. If the series is uniformly convergent, then

$|c_n \sin nx + \cdots + c_{2n} \sin 2nx| < \varepsilon$,

provided $n$ is sufficiently large. Now restrict attention to $x$ in an interval such that
$\sin kx > \tfrac{1}{2}$ for $n \le k \le 2n$.
37.H. (a) $\infty$, (c) $1/e$, (f) 1.
37.L. Apply the Uniqueness Theorem 37.17.
37.N. Show that if $n \in \mathbf{N}$, then there exists a polynomial $P_n$ such that if $x \ne 0$,
then $f^{(n)}(x) = e^{-1/x^2} P_n(1/x)$.
37.T. The series $A(x) = \sum (a_n x^n)$, $B(x) = \sum (b_n x^n)$, and $C(x) = \sum (c_n x^n)$ converge
to continuous functions on $I$. By the Multiplication Theorem 37.8, $C(x) =
A(x)B(x)$ for $0 \le x < 1$, and by continuity $C(1) = A(1)B(1)$.
37.U. The sequence of partial sums is increasing on the interval $[0, 1]$.
37.V. If $\varepsilon > 0$, then $|a_n| \le \varepsilon p_n$ for $n > N$. Break the sum $\sum (a_n x^n)$ into a sum over
$n = 1, \ldots, N$ and a sum over $n > N$.

Section 38

38.B. (b) If a, =0, then

Flx4+20)= [ro dt = [10 dt+ [ wT p(t) dt


= F(x) +0= F(x)

for all xER.


38.E. (c) Calculate f,(0) in two ways.
38.G. (a) If k, were continuous, then k,(—a) =~—7” but since k, has period 2a,
then k,(-m)=k,(7) =a
2 4 [cos2x cos 4x cos 6x |
38.1. (b) 2~4[ss2 ro

oases.1 2[cosx

cos2x
cos3x , cos 5x

cos4x cos 6x
(e) | LP? + 7 + 32 |

sin 2x 2 sin 4x Sein 6...


38K. (b) ae 3 3-5 5

() sys x, sin 3x, sin Sx, |


T L 3 5S

38.N. (d) Use Exercise 38.G(b).



38.R. (a) Alfsininx


4|snim sin ax ae
sin3ax |
.
4y7_ 34601)" nnx (-1" . ne
(b) 135] ra O88G4 + sin |.

38.S. Use Fejér’s Theorem 38.12 and Theorem 19.3.


38.T. Modify the proofs of 38.7 and 38.12.

Section 39

39.G. We have $|G(u, v) - G(0, 0)| \le u^2 + v^2 = \|(u, v)\|^2$ so that $DG(0, 0)(u, v) =
0$. If $(x, y) \ne (0, 0)$, then $D_1 G(x, x) = 2x \sin (2x^2)^{-1} - x^{-1} \cos (2x^2)^{-1}$, which is not
bounded as $x \to 0$.
39.L. (a) $\nabla_{(a,b,c)} f_1 = (2a, 2b, 2c)$.
(c) $\nabla_{(a,b,c)} f_3 = (bc, ac, ab)$.
39.M. (a) 0. (c) $4/\sqrt{6}$.
39.O. (a) At $(1, 2)$ we have $\{(x, y, z) : z - 5 = 2(x - 1) + 4(y - 2)\}$.
(c) At $(1, 1)$ we have $\{(x, y, z) : z - \sqrt{2} = -(x + y - 2)/\sqrt{2}\}$.
39.Q. (a) At £=0 we have {(x, y,z):x=t,.y=0,z=0}; at t=1 we have
{(x, y, z):x=1+s, y=14+2s,z=14+3s}.
(c) At t=3a we have {(x, y, z):x=~—2s, y=2, z=ia0+t+s}.
39.S. (b) At the point $(3, -1, -3)$ corresponding to $(s, t) = (1, 2)$ we have:
$S_b = \{(x, y, z) : x = 3 + (s - 1) + (t - 2), y = -1 + (s - 1) - (t - 2),
z = -3 + 2(s - 1) - 4(t - 2)\}$.
(d) At the point (1,0,0) corresponding to (s,1)=(0,3ar) we have: S,=
{(x, y,z):x=1,y=s,z2 =-(t-ia)}.
39.V. Note that if ye R’, z¢R’, then (y,z)e R*xR'=R" is such that
Ky, 2P = lly P+ lle.
Section 40
40.A. $F'(t) = 2(3t + 1)3 + 2(2t - 3)2 = 26t - 6$.
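The chain-rule computation in 40.A can be checked against a finite difference. The form of $F$ below, $F(t) = (3t + 1)^2 + (2t - 3)^2$, is an assumption read off from the factors in the hint (it is not quoted from the exercise), and the helper names are ours.

```python
from typing import Callable


def F(t: float) -> float:
    """Assumed composite: x**2 + y**2 with x = 3t + 1 and y = 2t - 3."""
    return (3 * t + 1) ** 2 + (2 * t - 3) ** 2


def F_prime(t: float) -> float:
    """Derivative from the hint: 2(3t + 1) * 3 + 2(2t - 3) * 2 = 26t - 6."""
    return 26 * t - 6


def central_difference(func: Callable[[float], float], t: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient approximating func'(t)."""
    return (func(t + h) - func(t - h)) / (2 * h)
```

Because $F$ is quadratic, the symmetric difference quotient agrees with $26t - 6$ up to rounding error.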
40.D. $D_1 F(s, t) = (\sin s \cos t + \sin t)(-\sin s) + (\cos s + \sin t)(\cos s \cos t) + 0$.
40.G. (a) $D_1 F(x, y) = f'(xy) y$, $D_2 F(x, y) = f'(xy) x$.
(d) $D_1 F(x, y) = f'(x^2 - y^2)(2x)$, $D_2 F(x, y) = f'(x^2 - y^2)(-2y)$.
40.K. (b) Since $g'(t) = D_1 f(tc) c_1 + \cdots + D_p f(tc) c_p$, it follows from Euler’s Relation
that we have

$t g'(t) = (t c_1) D_1 f(tc) + \cdots + (t c_p) D_p f(tc) = k f(tc) = k g(t)$.

Therefore (why?) $g(t) = C t^k$ for some constant $C$. Since $f(c) = g(1) = C$, we deduce
that $f(tc) = g(t) = t^k f(c)$, whence $f$ is homogeneous of degree $k$.
40.M. Since

$\|B(x + u, y + v) - B(x, y) - (B(x, v) + B(u, y))\|
= \|B(u, v)\| \le M \|u\| \|v\| \le \tfrac{1}{2} M (\|u\|^2 + \|v\|^2) = \tfrac{1}{2} M \|(u, v)\|^2$,

it follows that $DB(x, y)(u, v)$ exists and equals $B(x, v) + B(u, y)$.
40.P. Since $Dg(c)(u) = (u g_1'(c), \ldots, u g_p'(c)) = u g'(c)$ for $u \in \mathbf{R}$, it follows from
the Chain Rule that

$Dh(c)(u) = Df(g(c))(Dg(c)(u)) = Df(g(c))(u g'(c)) = u \, Df(g(c))(g'(c))$

whence $h'(c) = Df(g(c))(g'(c))$.
40.Q. If $f = (f_1, \ldots, f_q)$, there exist points $c_i \in S$ such that $f_i(b) - f_i(a) =
Df_i(c_i)(b - a)$. Now let $L$ have the matrix representation $[D_j f_i(c_i)]$.
40.R. By Theorem 12.7 any two points in $\Omega$ can be joined by a polygonal curve
lying inside Q. Apply the Mean Value Theorem to each segment of this curve.
40.U. Indeed D,f(x, y) = y(x?- y*Mx?+ y?)' + 4x7y"(x?+ y7)? and D,,.f(0, 0) =
~—1, while D,,f(0, 0)=+1.
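40.W. If $\varphi : (-\varepsilon, 1 + \varepsilon) \to \mathbf{R}^q$ is defined by $\varphi(t) = f(a + t(b - a))$, then $\varphi'(t) =
Df(a + t(b - a))(b - a)$. Write $\varphi(t) = (\varphi_1(t), \ldots, \varphi_q(t))$ where $\varphi_i(t) = f_i(a + t(b - a))$
and note that $\varphi_i(1) - \varphi_i(0) = \int_0^1 \varphi_i'(t) \, dt$.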

Section 41

41.A. Use Exercise 21.P. 41.D. Here Df(0)=0. No.


41.E. Consider Exercises 27.H and 22.0.
41.F. Consider Exercise 40.L.
41.J. Find the relative extrema near 0.
41.Q. (a) At (1, 1,2) we have S-{(x, y, z):2x+2y-—z=2}.
(c) At (4,3, 2) we have S, ={(x, y, z):x+8y —2z =4}.
41.S. If $D_1 f$ vanishes on an open set, apply Exercise 40.S. If $D_1 f(x_0, y_0) \ne 0$,
then consider $F(x, y) = (f(x, y), y)$ near $(x_0, y_0)$.
41.T. If Dig(c) #0, then consider G(x, y)= g(x)+(Q, y).
41.V. Show, as in the proof of 41.6, that if |ly||<m/2, then there is a vector x € R”
with ||x||< 1 such that y = L,(x).
41.W. If yeR’, let x»=0 and

Xn = Xu — (F(X) — F(%n—1) — on — Xn-1))-


Show, as in the proof of 41.6, that x =lim (x,) exists and f(x)= y.

Section 42
42.A. (a) Saddle point at (0,0). (b) Relative strict minimum at (—2.}3).
(c) Saddle point at (0, —1); relative strict minimum at (0,3). (f) Saddle point at
(0, 0); relative strict minima at (0, —1) and (0, 2).
42.D. If $f$ is not constant, then either the supremum or the infimum of $f$ on
$S = \{x \in \mathbf{R}^p : \|x\| \le 1\}$ is not 0. Since $S$ is compact, this supremum (or infimum) is
attained at a point $c \in S$. The hypothesis rules out the possibility that $\|c\| = 1$.
42.F. (a,d) Relative minimum at (0,0). (b,c,e) Saddle point at (0,0).
(f) Relative strict minimum at (0, 0).
42.G. Saddle point at (1,1).
42.H. Monkeys have tails.

42.1, 4. 42.K. 3.
42.S. (a) Maximum value = 1, attained at (+1, 0); minimum value = —1, attained
at (0,+1). (b) Maximum value=3, attained at (1,0); minimum value= ~—1,
attained at (-1,0). (c) Maximum value= 4, attained at (1, +1); minimum value=
—1, attained at (—1,0). (d) Maximum value= 1, attained at (0, 7/2); minimum
value = —1; attained at (0, —7/2).
42.U. Maximum value= 1, attained at (1, 0, 0); minimum value =, attained at
(—s 3 3).

Section 43

43.B. If peN is given, let n>(2'°-1)"'. If a cell I in R°® has side lengths
O<a,<a,<:-+:<4,, let c=a,/n. Then I is contained in the union of n((a,/c]+
1)---([a,/ce]+1) cubes with side length c, having total content less than
2(a,-+-a,)=2c(1). Hence, if Z is contained in the union of cells with total
content less than e, it is contained in the union of cubes with total content less than
26.
43.E. No.
43.H. If the closure of J_j is [a_{j1}, b_{j1}] × ··· × [a_{jp}, b_{jp}], for j = 1, ..., n, and if
I = [a_1, b_1] × ··· × [a_p, b_p], let P_1 be the partition of [a_1, b_1] obtained by using the
points {a_{j1}, b_{j1} : j = 1, ..., n}, ..., and P_p be the partition of [a_p, b_p] obtained by
using {a_{jp}, b_{jp} : j = 1, ..., n}. The partitions P_1, ..., P_p induce a partition of I.
43.I. Enclose Z in the union of a finite number of closed cells in I with total
content less than ε. Now apply 43.H.
43.J. Enclose Z in the union of a finite number of open cells in I with total
content less than ε. Now apply 43.H.
43.K. We form a sequence of partitions of I into cubes with side length 2^{−n}δ by
successive bisection of the sides of I. Given a cube K ⊆ I with side length r, enclose
K in the union of all cubes in the nth partition which have non-empty intersection
with K. If n is so large that (1 + δ/2^{n−1}r)^p < 2, then this union has total content less
than 2c(K).
43.L. Use 43.G. 43.M. Use 43.L.
43.R. First treat the case f = g; then consider (f + g)^2.
43.T. Let M > ||f||_K, ||g||_K. Since f and g are uniformly continuous on K, if P_ε is
sufficiently fine, then f and g vary less than ε/2M on each K_j, and such that for any
p_j ∈ K_j we have |∫_K fg − Σ f(p_j)g(p_j)c(K_j)| ≤ (ε/2)c(K). Then we have

|∫_K fg − Σ f(p_j)g(q_j)c(K_j)| ≤ |∫_K fg − Σ f(p_j)g(p_j)c(K_j)|
    + |Σ f(p_j)[g(p_j) − g(q_j)]c(K_j)| ≤ ε c(K).
43.V. (d) If Z is compact and contained in the union of open cells J_1, J_2, ...,
then it is contained in the union of a finite number of these cells.
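The counting argument in 43.B can be checked numerically. The sketch below is only an illustration: it reads the hint as choosing an integer n > (2^{1/p} − 1)^{−1}, setting c = a_1/n, and covering the cell by n([a_2/c] + 1)···([a_p/c] + 1) cubes of side c. The function names and the sample side lengths are hypothetical and do not come from the text.

```python
from fractions import Fraction
from math import floor, prod


def smallest_n(p):
    """Smallest integer n with n > 1 / (2**(1/p) - 1) (floating-point estimate)."""
    return floor(1 / (2 ** (1 / p) - 1)) + 1


def cube_cover_content(sides, n):
    """Total content of the cubes of side c = a_1/n used to cover the cell.

    Assumes the count n * ([a_2/c] + 1) * ... * ([a_p/c] + 1) from the hint,
    where a_1 is the shortest side and [.] is the greatest integer function.
    """
    a = sorted(Fraction(s) for s in sides)
    p = len(a)
    c = a[0] / n
    count = n * prod(floor(x / c) + 1 for x in a[1:])
    return count * c ** p


# Hypothetical cell in R^3 with side lengths 1/3, 1/2, 7/5.
sides = [Fraction(1, 3), Fraction(1, 2), Fraction(7, 5)]
n = smallest_n(len(sides))
content = prod(sides)                  # c(I) = a_1 * a_2 * a_3
covered = cube_cover_content(sides, n)
print(n, content, covered, covered < 2 * content)
```

Exact rational arithmetic is used for the side lengths so that the greatest integer function is not disturbed by rounding; only the choice of n relies on floating point.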

Section 44

44.B. (a) If c ∉ b(A), then either c is an interior point of A or it is an interior
point of C(A). In either case, there is a neighborhood of c disjoint from b(A), so
C(b(A)) is open.
44.C. In Example 43.2(g), we have S^- = b(S) = I × I. However b(I × I) =
(I × {0, 1}) ∪ ({0, 1} × I).
44.D. Since (A ∩ B)^- ⊆ A^- ∩ B^-, it follows that

b(A ∩ B) = (A ∩ B)^- ∩ (C(A ∩ B))^- ⊆ A^- ∩ B^- ∩ (C(A) ∪ C(B))^-
         = A^- ∩ B^- ∩ (C(A)^- ∪ C(B)^-)
         = (B^- ∩ b(A)) ∪ (A^- ∩ b(B)) ⊆ b(A) ∪ b(B).
44.J. If ε > 0, let P_ε be a partition such as that in Exercise 43.P and such that the
union of the cells in P_ε which contain points in b(A) has total content less than
ε/2||f||. Now apply Exercise 43.P to the restriction of f to A.
44.K. Since m g(x) ≤ f(x)g(x) ≤ M g(x) for x ∈ A, it follows that m ∫_A g ≤ ∫_A fg ≤
M ∫_A g. If ∫_A g ≠ 0, take μ = (∫_A fg)(∫_A g)^{−1}.
44.N. The set b(K)=K does not have content zero.
44.R. Note that F(x, y)=J} {fi f(s, t) ds} dt.

Section 45

45.A. Examine the proofs of 45.1-45.4.


45.D. (a) 6m. 45.F. (e-1).
45.H. Let u = xy, v = y/x^2. The area equals (log 2)/3.
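The answer in 45.H can be checked with a short Jacobian computation. The sketch below takes v = y/x^2 and assumes that the region corresponds to 1 ≤ u ≤ 2 and 1 ≤ v ≤ 2. That region is an assumption: it is consistent with the stated answer, but the exercise itself is not reproduced in these hints.

```latex
\[
\frac{\partial(u,v)}{\partial(x,y)}
  = \det\begin{pmatrix} y & x \\ -2y/x^{3} & 1/x^{2} \end{pmatrix}
  = \frac{y}{x^{2}} + \frac{2y}{x^{2}} = 3v ,
\]
\[
\text{Area}
  = \int_{1}^{2}\!\int_{1}^{2} \frac{1}{3v}\,du\,dv
  = \frac{\log 2}{3}.
\]
```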
Index

Abel, N. H., 306 for sequences, 108


Abel summability, 325 Bonnet, O., 234
Abel’s Lemma on partial summation, 306 Bound, lower, 37
Abel’s Test, for convergence, 307 upper, 37
for uniform convergence, 318 Boundary point, 65, 422
Abel’s Theorem, 325 Bounded Convergence Theorem, 242
Absolute convergence, of an integral, 265 function, 119
of a series, 289, 310 set, 69
Absolute value, of a real number, 35 sequence, 93
of a complex number, 88 variation, 225
of a function, 143 Borel, E., 74
Accumulation point, 69 Brouwer, L. E. J., 163
Additive function, 145, 437 Bunyakovskii, V., 56
Affine function, 350
Algebraic properties of R, 28 ff. Cantor, G., 24
Alternating series, 308 Cantor Intersection Theorem, 77
Appell, P., 328 Cantor set, 48
Approximation Lemma, 377 Cartesian product, 8
Approximation theorems, 167 ff., 183 ff. space, 56
Archimedean Property, 39 Category Theorem, 80
Archimedes, 39 Cauchy, A.L.,
Arithmetic mean, 129, 411 Cauchy Condensation Test, 293
Arzelà, C., 189 Convergence Criteria, 109, 121, 131, 216,
Arzelà-Ascoli Theorem, 189 261, 267, 289, 317, 416
Ascoli, G., 189 Mean Value Theorem, 197
Axiom of Choice, 25 principal value, 259, 269
product, 313
Baire, R., 80 Root Test, 295
Baire’s Theorem, 80 sequence, 108, 113
Ball, in a Cartesian space, 57 Cauchy-Hadamard Theorem, 320
Bernoulli, J., 36 Cell, in R, 46
Bernoulli’s Inequality, 36 in R^p, 69, 413
Bernstein, S. N., 169 Cesàro, E., 129
Bernstein’s Approximation Theorem, 171 Cesàro’s method of summation, 129, 339
Bernstein’s Theorem, 324 Chain Rule, 361
Bessel, F. W., 201 Change of variable, 234, 443 ff.
Bessel’s Inequality, 334 Chebyshev, P. L., 62
Beta function, 283 Chebyshev’s Inequality, 62
Bilinear function, 373 Choice, Axiom of, 25
Bilinearity of the Riemann-Stieltjes integral, Circumscribing Contour Theorem, 78
217
Bijection, 18
Binary operation, 28 Closed set, 64
Binomial Expansion, 208, 327 Closure of a set, 68, 423
Block partial derivatives, 360, 385 Cluster point, 69
Bolzano, B., 70 Collection, 1
Bolzano Intermediate Value Theorem, 153 Compact set, 73
Bolzano-Weierstrass Theorem, for infinite Compactness, preservation of, 154
sets, 70 Comparison Tests, 262, 294

Complement of a set, 7 Descartes, R., 8
Complex number system, 86 ff. Diagonal method, 24, 190, 227
Components of a vector, 57 Difference, of two functions, 142
Conditional convergence, 289 of two sequences, 91
Conjugate, of a complex number, 86 symmetric, 10
Connected set, 80 Differentiable function, 349
Connectedness, preservation of, 153 Differential equation, 256
Constraint, 402 Differentiation Theorem, for integrals, 231
Content, inner, 432 for power series, 322
of a cell, 413 Dini, U., 173
outer, 432 Direct image, 19
of a set, 423 Directional derivative, 349
zero, 413 Dirichlet, P. G. L., 140
Continuity, 136 ff. Dirichlet’s discontinuous function, 140
of the inverse function, 156 Test for convergence, 263, 306
one-sided, 215 Test for uniform convergence, 268, 318
uniform, 158 ff. Disconnected set, 80
Contour, 78 Disconnection, 80
Contraction, 161 Discontinuity Criterion, 138
Convergence, absolute, 265 Discrete metric, 60
of Fourier series, 335 ff. Disjoint sets, 4
in mean, 253 Divergence, of a sequence, 92, 127
in mean square, 254 Domain, of a function, 12
in a metric space, 103 Dominated Convergence Theorem, 273
interval of, 320 Dot product, 54
radius of, 320 Double limit, 130
of a sequence, 92 sequence, 130 ff.
of a sequence of functions, 114 series, 310 ff.
uniform, 117, 133, 267, 316
Convex function, 211 Element, of a set, 1
set, 59 Equicontinuity, 189 ff.
Coordinates, cylindrical, 454 Equivalent sequences, 128
polar, 449 Euler, L., 373
spherical, 449 Exponential function, 44, 146, 208, 329
of a vector, 57 Extension, of a continuous function, 186
Correspondence, 23
Cosine series, 329 of a function, 15
Countable set, 23 Exterior point of a set, 65
Covering, 73 Extremum, 397
Critical point, 398
Cube, 413 Fejér, L., 338
Curve, polygonal, 83 Fejér’s Theorem, 339
space-filling, 415 Field, 28
Cut, 45 Finite set, 23
Property, 46 First Mean Value Theorem, 230, 232
Cylindrical coordinates, 454 Fixed point, 161
Flyswatter Principle, 102
D’Alembert, J., 296 Fourier, J.-B. J., 330
Darboux, G., 197 Fourier coefficients, 331
Darboux’s Theorem, 199 cosine series, 342
Decreasing function, 145 series, 330 ff.
sequence, 105 sine series, 343
Dedekind, R., 45 Fresnel, A., 264
De Moivre, A., 239 Fresnel integral, 264
De Morgan, A., 8 Frobenius, G., 328
De Morgan’s Laws, 8 Function, 10 ff.
Density of a set function, 437 absolute value of, 143
of the rational numbers, 41 additive, 145, 437
Denumerable set, 23 affine, 350
Derivative, 194 ff., 349 ff. Beta, 283
block partial, 360, 385 bijective, 18
directional, 349 bilinear, 373
one-sided, 200, 335 bounded variation, 225
partial, 348 Class C^1, 376
composition, 15 Hyperbolic functions, 211
content, 423, 426 Hypergeometric series, 304
continuous, 137
convex, 211 Identity element, of a field, 28
decreasing, 145 Image, 12, 19, 20
derivative of, 194, 349 Imaginary part, 86
differentiable, 349 Implicit Function Theorem, 384, 395, 396
direct image of, 19 Improper integrals, 257 ff.
domain of, 12 Increasing function, 145
even, 200, 332 sequence, 104
exponential, 44, 146, 208, 329 Inequalities, basic properties of, 32 ff.
Gamma, 264, 282 Inequality, arithmetic-geometric, 60, 411
greatest integer, 145, 221 Bernoulli, 36
harmonic, 410 Bessel, 334
homogeneous, 372 Cauchy, 60
hyperbolic, 211 Chebyshev, 62
increasing, 145 Hölder, 61, 202, 411, 435
injective, 17 Minkowski, 61, 411
inverse, 17 Schwarz, 56
inverse sine, 18 Triangle, 36, 56
inverse image of, 19 Infimum, 38
Laplace transform of, 283 Property, 39
linear, 147 Infinite integral, 259 ff.
logarithm, 45, 146, 209, 238 limits, 127
monotone, 145 product, 305
nondifferentiable, 195 series, 286 ff.
odd, 200, 332 set, 23
periodic, 164, 330 Injection, 17
piecewise, continuous, 330 Injective function, 17
piecewise linear, 168 Injective Mapping Theorem, 377
polynomial, 144 Inner content, 432
positively homogeneous, 372 Inner product, 54, 254
range of, 12 product space, 54
semicontinuous, 180 Integrability theorems, 216, 228-229, 418,
step, 167 420, 436
surjective, 18 Integral, 212 ff., 415 ff.
square root, 18, 157 improper, 257 ff.
trigonometric, 209 ff., 238, 329 infinite, 259 ff.
Fundamental Theorem, of algebra, 88 iterated, 246
of integral calculus, 231 lower, 225, 422
partial, 259
Gamma function, 264, 282 transformation of, 234, 443 ff.
Gauss, C. F., 88 upper, 225, 422
Geometric mean, 60, 411 Integral Test, for series, 300
series, 289 Integrand, 214
Global Continuity Theorem, 151 Integration by parts, 219, 233
Gradient, 357 Integrator, 214
Graves, L. M., 378 Interchange theorems, relating to continuity,
Greatest integer function, 145, 221 166, 244, 270, 316
lower bound (= infimum), 38 relating to differentiation, 204, 245, 271,
Grid, 433 317, 322, 369
relating to infinite integrals, 270 ff.
Hadamard, J., 320 relating to integration, 241 ff., 244 ff.,
Half-closed cell (or interval), 47 316 ff., 321, 430 ff.
Half-open cell (or interval), 47 relating to sequences, 166, 204, 241 ff.,
Hardy, G. H., 260 273 ff.
Harmonic function, 410 relating to series, 316 ff., 321
series, 290 Interior maximum, 195
Heine, E., 74 point, 65
Heine-Borel Theorem, 74 of a set, 67, 423
Helly Selection Theorem, 227 Intermediate Value Theorem, 153
Hölder, O., 61 Intersection of sets, 4
Hölder’s Inequality, 61, 202, 411, 435 Interval, of convergence, 320
Homogeneous function, 372-373 in R, 47
unit, 47 of Fourier series, 337
Inverse function, 17 Mean Value Theorem, for derivatives in R,
continuity of, 156 196 ff.
Inverse image, 20 for derivatives in R^p, 364 ff.
Inverse sine function, 18 for integrals in R, 230, 232-233
Inversion mapping in C, 89 for integrals in R^p, 429
Inversion Theorem, 381, 396 Means, arithmetic, 129
Irrational elements of a field, 31 Measure zero, 421
Irrational powers of a real number, 44 Member of a set, 2
Iterated integrals, 246, 275 ff., 430 ff. Mertens, F., 313
limits, 131 ff. Metric, 59
suprema, 42 space, 59
Minimum, relative, 195, 397
Jacobi, C. G. J., 354 Value Theorem, 154
Jacobian determinant, 354 Minkowski, H., 61
Jacobian Theorem, 442 Minkowski’s Inequality, 61, 411
Jump of a function, 146 Models for R, 49
Monotone Convergence Theorem, for
Kronecker, L., 50 infinite integrals, 274
for integrals, 243
Lagrange, J.-L., 60 for sequences, 104
Lagrange identity, 60 Multiplication of power series, 323
multiplier, 402 ff. Multiplier, Lagrange, 402 ff.
Landau, E., 128
Laplace, P.-S., 283 Nearest Point Theorem, 78
Laplace transform, 283 ff. Neighborhood, 65
Least squares, 409 Nested Cells Property, in R, 47
upper bound (= supremum), 38 in R^p, 69
Lebesgue, H., 77 Newton’s method, 200-201, 375, 396
Lebesgue Covering Theorem, 77 Nondifferentiable functions, 195
integral, 212 Norm, 54
number, 77 of a function, 119, 253, 254
Lebesgue’s Criterion for Integrability, 436 of a matrix, 150
Leibniz, G. W., 353 of a partition, 223
Leibniz’s Alternating Series Test, 308 of a vector, 54
formula, 245 Norm Convergence of Fourier series, 337
L’Hospital, G. F., 203 Normed space, 54
Limit, deleted, 175 Null space, 387
of a double sequence, 130 Nullity, 387
of a function, 175 Numbers, complex, 4, 86 ff.
inferior, 123 natural, 3
nondeleted, 175 rational, 4, 31
right-hand, 181 real, 27 ff.
of a sequence, 92
superior, 123, 178 O, o, 128
upper, 178 Open Mapping Theorem, 380
Linear function, 147 Open set, 62
functional, 247 Operation, binary, 28
transformation, 255 Order Properties of R, 32 ff.
Lipschitz, R., 161 Ordered pair, 8
Lipschitz condition, 161 Ordinate set, 433
Logarithm, 45, 146, 209, 238 Orthogonal vectors, 59
Lower bound, 37 Orthonormal set of functions, 344
integral, 225, 422 Oscillation of a function, 164
Outer content, 432
Machine, 14
Maclaurin, C., 299 Pair, ordered, 8
McShane, E. J., 408 Parallelepiped, 69
Mapping, 12 Parallelogram Identity, 58
Matrix, 148 Parametrization Theorem, 388
Maximum, relative, 195, 397 Parseval’s Equality, 338
Value Theorem, 154 Partial derivative, 348
Mean convergence, 253 integral, 259
Mean square convergence, 254 map, 360
product, 305 sum, 213
sum, 287, 306, 310 Riesz, F., 248
Partition, 213, 415 Riesz Representation Theorem, 248
Peano curve, 415 Rolle, M., 196
Periodic function, 164, 330 Rolle’s Theorem, 196
Perpendicular, 59 Root, multiplicity of, 207
Piecewise continuous function, 330 simple, 207
Point, accumulation, 69 Root Test, 295
boundary, 65, 422 Rosenberg, A., 50, 57
cluster, 69 Rota, G. C., 480
critical, 398
exterior, 65 Saddle point, 398
interior, 65 Schoenberg, I. J., 420
saddle, 398 Schwartz, J. T., 444
Pointwise Convergence of Fourier series, Schwarz, H. A., 56
Schwarz Inequality, 56
Polar coordinates, 449 Second Derivative Test, 399
curve, 453 Mean Value Theorem, 233
Pólya, G., 173 Semicontinuity, 180
Polygonal curve, 83 Sequence(s), 91 ff.
Polynomial, Bernstein, 169 of arithmetic means, 129
trigonometric, 332 bounded, 93
Positive class, 32 in a Cartesian space, 91
Power, of a real number, 31, 43-44 Cauchy, 108
series, 319 ff. convergent, 91
Preservation, of Compactness, 154 double, 130
of Connectedness, 153 difference of, 91, 100
Product, Cauchy, 313 divergent, 91, 127
dot, 54 equivalent, 128
of functions, 142 of functions, 113 ff., 165 ff.
infinite, 305 iterated, 131-132
of a real number and a vector, 53 limit of, 91
of sequences, 91 in a metric space, 103
Property, 3 monotone, 104
product of, 91, 100
Quotient, of functions, 143 quotient of, 92, 100
of sequences, 92 sum of, 91, 100
unbounded, 126
Raabe, J. L., 298 Series, 286 ff.
Raabe’s Test, 298 absolutely convergent, 289
Radius of convergence, 320 alternating, 308
Range of a function, 12, 387 conditionally convergent, 289
Rank, 387 double, 310 ff.
Rank Theorem, 391 Fourier, 330 ff.
Ratio Test, 296 of functions, 315 ff.
Rational number, 31 geometric, 289
power of a real number, 44 harmonic, 290
Ray, 46 hypergeometric, 304
Real part, 86 p-series, 290
Rearrangement Theorem, 292 power, 319 ff.
Remainder in Taylor’s Theorem, 206, rearrangements of, 291 ff.
Set(s), accumulation point of, 69
Cauchy’s form, 206 boundary point of, 65, 422
integral form, 243 bounded, 69
Lagrange’s form, 206 Cantor, 48
Restriction of a function, 15 Cartesian product of, 8
Riemann, B., 212 closed, 64
Riemann Criterion for Integrability, 228, closure of, 68, 423
420 cluster point of, 69
integral of a function on R, 214 compact, 73
integral of a function on R^p, 415 complement of, 7
sum, 213, 415 connected, 80
Riemann-Lebesgue Lemma, 334 content of, 423
Riemann-Stieltjes integral, 212 ff. convex, 59
denumerable, 23 Surjective function, 18
disconnected, 80 Surjective Mapping Theorem, 378
disjoint, 4 Symmetric difference, 10
empty, 4
enumerable, 23 Tangent line, 358
equality of, 2 plane, 350, 358, 359, 394
exterior point of, 65 space, 358, 394
finite, 22 Tauber, A., 325
infinite, 22 Tauber’s Theorem, 325
interior of, 68, 423 Taylor, B., 205
interior point of, 65 Taylor’s Theorem, 205, 243, 371
intersection of, 4 Tests for convergence of series, 294 ff.
nonintersecting, 4 Tietze, H., 186
open, 62 Tietze’s Extension Theorem, 186
ordinate, 433 Topology, 63, 72
relative complement of, 7 Translation of a set, 80
symmetric difference of, 10 Transformation, 14
union of, 4 of integrals, 234, 443 ff.
void, 4 Triangle Inequality, 36, 56
Shuffled sequence, 112 Trichotomy Property, 32
Side condition, 402 Trigonometric functions, 209 ff., 238, 329
Simple root, 207 polynomial, 332
Sine series, 329
Space, inner product, 54 Uniform continuity, 159
metric, 59, 72 Uniform convergence, of Fourier series, 337
normed, 54 of an infinite integral, 267
topological, 72 of a sequence of functions, 117
vector, 52 of a sequence of sequences, 133
Space-filling curve, 415 of a series of functions, 316
Sphere in a Cartesian space, 57 Uniform norm, 118 ff.
Spherical coordinates, 449 Union of sets, 4
Solid of revolution, 453 Uniqueness Theorem for power series, 322
Stieltjes, T. J., 212 Unit ball, cell, 47
Stirling, J., 239 content of, 454-455
Stirling’s formula, 239-240 interval, 47
Stone, M. H., 183 Upper bound, 37
Stone Approximation Theorem, 183 integral, 225, 422
Stone-Weierstrass Theorem, 184
Subsequence, 98 Value, of a function, 12
Subset, 2 Vector space, 52
Sum, Riemann, 213, 415
partial, 287 Wallis, J., 239
Riemann-Stieltjes, 213 Wallis product, 239
of two functions, 53, 142 Weierstrass, K., 70
of two sequences, 91 Weierstrass Approximation Theorem, 172,
of two vectors, 52 186, 340
Summability, Abel, 325 Weierstrass M-Test, for infinite integrals,
Cesàro, 129, 339 268
Supremum, 38 for series, 317
iterated, 42 Well-Ordering Property, 22
norm, 119
Property, 39 Zero, content, 413
Surjection, 18 measure, 421
