
Advanced Mathematics 1

Dr. Stefan Kühnlein

Institut für Algebra und Geometrie, Karlsruher Institut für Technologie, October 2023

Introduction
These lecture notes are a companion to my lectures on this topic in the winter term 2021/22. Although many things in the lectures were inspired by the lecture notes of Prof. Axenovich, I often had another focus. Still, these lecture notes owe quite a bit to their ancestor.

We will treat the basic objects of (real) analysis. Based on studying sequences and series we will study the notions of continuity, differentiability and integration. We will then consider first examples of differential equations.

It takes some patience until we are at the point where you want to be, but in order to be able to apply mathematics properly it is helpful to understand WHY things are as stated. Therefore we will prove several statements quite thoroughly, because they should be true always. This will involve certain proofs where you will think there is nothing to prove. This is a consequence of the fact that in mathematics things are defined in a strict way even if we have an intuitive understanding, and that we have to make sure that our definitions do what we expect them to do.
I do not include many pictures, because they are part of the lectures. The pictures
I did include were created by me. Try to find the right pictures for the things
we write down. This would be a good starting point to understand the abstract
world of mathematics.

Karlsruhe, October 2023


Stefan Kühnlein
Contents

1 Basic Notions from Mathematics
  1.1 Sets
  1.2 Inequalities
  1.3 Methods of Proof

2 Numbers and Functions
  2.1 Sums and Products
  2.2 Complex numbers
  2.3 Polynomial and rational functions

3 Sequences
  3.1 Convergence
  3.2 Subsequences and accumulation points

4 Continuity
  4.1 Basic definitions and examples
  4.2 Two Important Theorems

5 Series
  5.1 Infinite sums
  5.2 Criteria for absolute convergence
  5.3 Power Series
  5.4 The exponential function and some relatives
  5.5 Two examples for calculations with power series

6 Differentiability
  6.1 The derivative
  6.2 Rules for calculating the derivative
  6.3 The mean value theorem and extremal values
  6.4 Taylor polynomials and power series

7 Integration
  7.1 Lebesgue Integral
  7.2 The main theorems on integration
  7.3 Calculating some integrals
  7.4 Some elementary differential equations
  7.5 Improper integrals
  7.6 Integrals of rational functions

8 Indices
  8.1 Important Theorems
  8.2 Some Symbols
  8.3 Terminology
Chapter 1

Basic Notions from Mathematics

1.1 Sets
The notion of sets and constructions from set theory are basic tools in all of mathematics. We nevertheless have to start with a naive definition, which would have to be stated differently if we really wanted to spend time on the basics of mathematics. This would take too long, and for our needs the naive version will do.

Definition 1.1.1 Sets and elements


A set is a collection S of objects such that for every possible object it is decidable
whether it belongs to the collection or not.
The objects belonging to the set are its elements.
If the object s belongs to S, we write s ∈ S. Otherwise we write s ∉ S.
The empty set ∅ is the set which has no elements. For every object s, s ∉ ∅.

Remark 1.1.2 There are no restrictions on what the objects of a set can be. They can be rather complicated; in particular the elements of a set can themselves be sets as well. Everything can be an object in a set. We will very often be concerned with more modest sets, for instance sets of numbers.
Things might become clearer by looking at examples (see the next number).
A set can be written in several ways. Sometimes we can write out all elements, as in S = {2, 3, 5, 7}. Sometimes it is more clever to write down properties which correspond to being an element of a specific set. We then mark this condition by a vertical line or by a colon. For instance, the above set with four elements can also be written as S = {p | p is a prime number and p ≤ 10}. In particular, infinite sets often need to be described in that way.


Example 1.1.3 The following are examples of sets which you should have seen before:

• N, the set of all natural numbers; its elements are 1, 2, 3, … and we then write N = {1, 2, 3, …}, hoping that everyone agrees what the dots should mean. If we want to include 0 in this set, we call it N₀.
• Z, the set of all integers, consisting of all natural numbers, zero, and all elements −n for n ∈ N:
Z = {x | x ∈ N₀ or −x ∈ N₀}.

• Q, the set of all rational numbers, i.e. an object is an element of Q if and only if it is a number which can be written as a fraction z/n, where z ∈ Z and n ∈ N:
Q = {z/n | z ∈ Z and n ∈ N}.
• R, the set of real numbers, which we understand as the set of all numbers which have a decimal expansion. This is cheating a bit, but it is too time-consuming to introduce real numbers on a conceptual level.
• The set of all people who are Members of Parliament in your favourite country hopefully is “well-defined”, i.e. we hope that for every object it is clear whether it is a member of this parliament or not.
• The set of all squares in the plane is a set. NB: Its elements are sets.
The set of all squares having a circle as their boundary is the empty set.

Definition 1.1.4 Subsets, intersections and unions


Given two sets S and T, we call T a subset of S if every element in T also is
an element of S. The notation for this is
T ⊆ S.
If S and T have exactly the same elements, then they are equal. This is the case
if and only if S ⊆ T and T ⊆ S simultaneously.
The intersection of two sets S and T is the set of all objects which are elements
of S and also elements of T. We therefore can write it as
S ∩ T = {a | a ∈ S and a ∈ T }.

The union of S and T is the set of all elements which belong to S or to T,


where we understand this “or” as an inclusive or. Notation:

S ∪ T = {a | a ∈ S or a ∈ T }.
As an example, Z = N0 ∪ {−n | n ∈ N0 }.

Rules 1.1.5 Intersections and Unions


We have the following rules concerning intersections and unions of sets S, T, U :

S ∩ ∅ = ∅, S ∪ ∅ = S.
S ∩ T = T ∩ S, S ∪ T = T ∪ S.
(S ∪ T ) ∪ U = S ∪ (T ∪ U ), (S ∩ T ) ∩ U = S ∩ (T ∩ U ),
S ∩ (T ∪ U ) = (S ∩ T ) ∪ (S ∩ U ), S ∪ (T ∩ U ) = (S ∪ T ) ∩ (S ∪ U ).

The identities in the first line are quite clear: There is no element in ∅, and
therefore also no element in ∅ which belongs to S. The elements which belong
to S or to ∅ are exactly the elements in S.
The second and third line are left as an exercise.
The first identity in the fourth line means: an element belongs to S and to (T or U) if and only if it belongs to (S and T) or to (S and U). Written more formally, this looks like: Let x be an object. If x belongs to S and to T ∪ U, then it belongs to T or to U. If it belongs to T, then it belongs to S ∩ T, hence to (S ∩ T) ∪ (S ∩ U), and similarly if it belongs to U. Therefore

S ∩ (T ∪ U) ⊆ (S ∩ T) ∪ (S ∩ U).

If x belongs to S ∩ T or to S ∩ U, then in both cases it belongs to S and to T or U, therefore

(S ∩ T) ∪ (S ∩ U) ⊆ S ∩ (T ∪ U).

The last rule can be derived similarly.
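These rules are easy to spot-check with any set implementation. A minimal Python sketch of ours (the sample sets are ad hoc) verifies them on a small example:

```python
# Spot-check of the rules 1.1.5 on concrete finite sets.
# Python's set operators: & is intersection, | is union.
S, T, U = {1, 2, 3, 4}, {3, 4, 5}, {4, 6}

assert S & set() == set() and S | set() == S      # rules for the empty set
assert S & T == T & S and S | T == T | S          # commutativity
assert (S | T) | U == S | (T | U)                 # associativity of union
assert S & (T | U) == (S & T) | (S & U)           # distributivity
assert S | (T & U) == (S | T) & (S | U)           # distributivity
print("all rules hold for this example:", S & (T | U))  # {3, 4}
```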

Definition/Remark 1.1.6 Complement


If S, T are two sets, there is a new set

S ∖ T := {s ∈ S | s ∉ T},

called the complement (to T in S ).


For instance R ∖ Q, the complement of the set of rational numbers in the set of real numbers, is the set of irrational numbers (by definition).
The set of irrational numbers is not empty: √2 (cf. 1.3.3) and π (this is hard to prove) are prominent members of this set.

Construction 1.1.7 Pairs, Triples


If S and T are two sets we build a new set

S × T = {(s, t) | s ∈ S and t ∈ T }.

This is the product of S and T; its elements are called pairs. If S = T, we write S² instead of S × S.
We similarly define

S × T × U = {(s, t, u) | s ∈ S, t ∈ T, u ∈ U }

and tacitly identify (S × T) × U with this new set of triples. If S = T = U we also write S³ = S × S × S and invite the reader to guess what Sⁿ could mean for n ∈ N.
For instance, R² is the set of all pairs of coordinates of points in the plane, and R³ the set of coordinates of points in three-dimensional space.

Definition 1.1.8 Maps and functions


Given two sets D and C, a map from D to C is a “rule” f which associates
to each element d ∈ D a unique element f (d) ∈ C. Here we are vague about
the notion of a rule, but the only important property is that f (d) is uniquely
determined by f and d. Note that we have maps between sets, and that there need not exist a way to calculate values. Of course, given subsets C, D ⊆ R, there might be a way to calculate f(d), but this does not belong to the defining properties of this notion.
D is called the domain of f and C the codomain. Of course they could be
denoted by different letters. We write f : D → C if f is a map from D to C.
The set

f (D) = {f (d) | d ∈ D} = {c ∈ C | there exists some d ∈ D with f (d) = c}

of values of f is called the range of f.


A map can be thought of as a kind of machine which takes the elements of D as
its input and offers f (d) as a result.
To every map one can associate its graph

G(f ) = {(d, f (d)) | d ∈ D} ⊆ D × C,

which has the property that for every d ∈ D there exists a unique c ∈ C (namely
c = f (d) ) such that (d, c) ∈ G(f ).
If, to the contrary, we are given a set G ⊆ D × C such that for every d ∈ D
there exists exactly one c ∈ C such that (d, c) ∈ G, this set G is the graph of a
map.
If the codomain of a map is contained in R, we will often call f a function. As soon as we have introduced complex numbers, maps with complex values will also be called functions. We will not be too strict with this wording.

Example 1.1.9 Not “well defined”


The following looks like a function: associate to every real number its real square root. However, this is not a function; it is not “well defined”, for two reasons: not every real number has a (real) square root, and if it exists, it is only defined up to sign, so there are two values of this “function”. In order to find a function you have to adjust the domain and only admit non-negative real numbers, and you have to choose one of the two solutions of the equation y² = x as the value of the function at x. The usual choice is to take the non-negative square root. Then we get a function f : R≥0 → R≥0, f(x) = the unique y ≥ 0 with y² = x.

Definition 1.1.10 Composition of maps


Given three sets D, C, B and maps f : D → C, g : C → B, we can define a new
map from D to B which associates to each d ∈ D the value of g at f (d) as a
result.
g ◦ f : D → B, (g ◦ f )(d) = g(f (d)).
This map is called the composition of f and g.
It is a formalization of substituting the variable of g by the result of f.

Example 1.1.11 Matriculation numbers


If D is the set of students in a cohort in mechanical engineering at KIT and C = N≥2, the set of all natural numbers exceeding 1, every student should have been given exactly one matriculation number in C. This is a function m : D → C. If P is the set of all prime numbers, we have a map from C to P, mapping every c ∈ C to its smallest prime factor. Composing these two maps associates to every student d ∈ D the smallest prime factor of her or his matriculation number.
Of course, there will be many students who share the smallest prime factor 2, but this is okay. What the notion of a map forbids is that some student is associated with two or more different values (or no value at all). This should be taken care of by the students' office :-)

Example 1.1.12 There will be more compositions later on


You know functions which are composed from other functions. Think of examples like
x ↦ sin(x² + 3x),
which is a composition of the function f(x) = x² + 3x with the function g(y) = sin(y).
In later chapters it will be important to understand which properties (like continuity or differentiability) of more complicated functions are inherited from those of their more basic building blocks.

And sometimes it is helpful to substitute an argument in a function by the outcome of another – this is nothing but composing two maps!
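As a small illustration of ours (the helper name compose is ad hoc, not standard library), composition of maps is a one-liner in Python:

```python
import math

def compose(g, f):
    """Return the map g ∘ f, i.e. d ↦ g(f(d)), as in 1.1.10."""
    return lambda d: g(f(d))

f = lambda x: x**2 + 3*x   # inner function f(x) = x² + 3x
g = math.sin               # outer function g(y) = sin(y)
h = compose(g, f)          # h(x) = sin(x² + 3x)

print(h(1.0))              # same value as math.sin(4.0)
assert h(1.0) == math.sin(1.0**2 + 3*1.0)
```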

Definition/Remark 1.1.13 Inverse Map


Let f : D → C be a map. We call g : C → D the inverse map to f, in symbols: g = f⁻¹, if for all d ∈ D and c ∈ C we have

g(f (d)) = d and f (g(c)) = c.

The maps g ◦ f and f ◦ g are “the identity” on D resp. C, i.e. the maps which
map every element from their domain to itself.
Every set has its own identity map.
Sometimes one is interested in inverting functions. You know examples of this:
The square root function is inverse to the squaring function (both on non-negative real numbers):
f : R≥0 → R≥0, f(x) = x², and g : R≥0 → R≥0, g(y) = √y
are inverses to one another.
And if you want to know which number x satisfies eˣ = y for a given positive y, you have to invent the natural logarithm, which is inverse to the exponential map.
For a map to be invertible (i.e. for an inverse map to exist) you need that every element of C is realized as a value of f, i.e. the range is all of C (“f is surjective”), and that no value is attained more than once (“f is injective”). We will not make extensive use of this terminology, but sometimes it is helpful. In particular, the squaring function is not injective on all of R (as, e.g., 1² = (−1)²) and is not surjective if we use R as the codomain, because the square of a real number is never negative.
If D, C ⊆ R and f : D → C is invertible, the graph of the inverse function is
produced from the graph of f by reflecting it along the diagonal in the first/third
quadrant.

1.2 Inequalities
For any two real numbers x, y we say that x is larger than y if x − y is positive.
In symbols:
x > y iff x − y > 0.
Here, “iff” is an abbreviation for “if and only if” which is quite common in
mathematical texts.

Rules 1.2.1 Properties of >


As every real number is either positive or zero or negative (this time the or
is exclusive :-) ) we see that for any x, y ∈ R one of the following exclusive
alternatives holds:
x > y or x = y or y > x.
We also write x < y in the last case. This threefold alternative is called the
trichotomy of the order relation.
We also see that for all x, y, z ∈ R we have the following implication:¹

[x < y and y < z] ⇒ x < z.

This is the transitivity of the order relation. The additive monotonicity of the
order relation says that for any x, y, z ∈ R we have

x < y ⇒ x + z < y + z,

and the multiplicative monotonicity says that for all x, y, z ∈ R with z > 0 we
have
x < y ⇒ xz < yz.
These rules are well known from high school.
We also denote by x ≤ y that x < y or x = y, and similarly x ≥ y means x > y or x = y. Both ≤ and ≥ satisfy transitivity, additive monotonicity and multiplicative monotonicity, but not the trichotomy (because ≤ and = are not exclusive). The replacement for trichotomy is

[x ≤ y and y ≤ x] ⇒ x = y.

Example 1.2.2 Let x ∈ R satisfy 1 ≤ x. Then by the multiplicative monotonicity x ≤ x².
Similarly, for 0 ≤ x < y, we have x² ≤ xy < y², showing that the squaring function is strictly monotonically growing on the positive real numbers, cf. 1.2.7.

Definition 1.2.3 Absolute value


If x is a real number, we define its absolute value to be
|x| := x, if x ≥ 0, and |x| := −x, if x < 0.

The absolute value satisfies a most important rule which will be used again and again in many situations:
¹ If A, B are two assertions, we write A ⇒ B if the truth of A implies the truth of B.

Rules 1.2.4 Triangle inequality


Given two real numbers x, y we always have

|x + y| ≤ |x| + |y|.

This is clear if x and y are both positive, both negative, or if one of them is zero; then we even have equality. The interesting case is x > 0, y < 0 (or vice versa, which goes similarly). Then if x + y ≥ 0, we have

|x + y| = x + y = |x| − |y| < |x| + |y|,

and similarly for x + y < 0 :

|x + y| = −(x + y) = −x − y = −|x| + |y| < |x| + |y|.

There is another variant of the triangle inequality, namely

||x| − |y|| ≤ |x − y|.

Its proof is left to the reader.

Definition 1.2.5 Intervals


Given two real numbers a ≤ b, we define the closed interval

[a, b] := {x ∈ R | a ≤ x and x ≤ b}.

The open interval is

(a, b) := {x ∈ R | a < x and x < b}.

We have to take care not to mistake this for the pair having a and b as its components. The context (or an explicit remark) should make clear which meaning the notation has in the specific situation. Note that (a, a) = ∅.
We also have half open intervals

[a, b) := {x ∈ R | a ≤ x < b}, (a, b] := {x ∈ R | a < x ≤ b},

where we combine the two inequalities in one chain of inequalities.


As certain degenerate cases of these intervals we write

[a, ∞) := {x ∈ R | x ≥ a}

and similarly (a, ∞), (−∞, b), (−∞, b], but we do not include ∞ in the interval,
as this fails to be a real number.
The last special case is (−∞, ∞) = R.

Definition/Remark 1.2.6 Maximum and Minimum


Let S ⊆ R be a set of real numbers. An element s ∈ S is called the maximum
of S , denoted as max(S), if for every x ∈ S the inequality x ≤ s is valid. It is
the minimum of S, if s ≤ x holds for every x ∈ S.
Note that not every set needs to possess a maximum or a minimum. Every non-empty finite set certainly does, because of the trichotomy of <, but for instance the interval [0, 1) does not have a maximum, because for every element s ∈ [0, 1) there exists a larger element in [0, 1), for example (1 + s)/2.
Every non-empty subset of N contains a minimum.
S ⊆ R is bounded (from above) if there exists a B ∈ R such that every element
s ∈ S satisfies s ≤ B. Such numbers B are then called upper bounds of S.
Similarly, S is called bounded (from below) if there exists some b ∈ R (a lower
bound) such that every s ∈ S satisfies s ≥ b.
If a maximum exists, it is an upper bound (even the smallest upper bound), and
if a minimum exists, it is a lower bound (the largest lower bound).
If S is both bounded from above and from below, then it is called bounded.
The set S = {1/n | n ∈ N} does not contain a smallest element, because for every 1/n ∈ S, 1/(2n) is smaller. However, it still is bounded, as every member is larger than 0 and at most 1. Here 1 is the maximum and 0 the largest lower bound.
Every non-empty subset S ⊂ Z with an upper bound B does have a maximum,
because the upper bound B can be chosen to be an integer and then −S + B =
{B − s | s ∈ S} ⊆ N0 has a minimum m giving the maximum B − m of S.

Definition 1.2.7 Monotonicity


Let D ⊆ R be given and f : D → R a function on D. Then f is called monotonically growing if for all x, y ∈ D:

x ≤ y ⇒ f(x) ≤ f(y).

If x < y always implies f(x) < f(y), f is strictly monotonically growing. If x ≤ y always implies f(x) ≥ f(y), then f is monotonically decreasing, and strictly monotonically decreasing if x < y always implies f(x) > f(y).
If either of these cases holds, we call f (strictly) monotonic.

1.3 Methods of Proof


In mathematics we want to convince (hopefully) everybody that the statements
we talk about are valid. We build upon a fundament which everyone should accept

(e.g. what is a set, what are numbers, what does addition mean?) and which in “pure” mathematics is reduced to a smallest possible list of accepted truths. In this course we take a larger base coming from your experience in school, although we will prove many facts which you already know from school.
Given this base of true statements, we will derive everything else from it using rigorous arguments, which are then called proofs. There are several schemes of argumentation which may be used for proving new statements. You will see proofs over and over again, but it might be helpful to describe two general ways to argue and one specific way to deal with natural numbers.

Remark 1.3.1 Direct Proofs


If A and B are two (mathematical) statements we sometimes accept A and
want to prove B. If this is possible, i.e. if A implies B, we write A ⇒ B. We
already used that in 1.2.1.
A direct proof for such an implication is a proof without much ado which tries
to take a direct route.
It will be easier to understand what “direct proof” means if we introduce the contrary concept.

Remark 1.3.2 Indirect Proof


If we want to prove that assertion A implies assertion B, it sometimes sparks our creativity if we instead try to show that the falsity of B implies the falsity of A.
Both implications are equivalent, as (in our naive world) every assertion is either true or false. Therefore, if A is true and the negation of B would imply the negation of A, then B has to be true.
Such an argument is called proof by contradiction or indirect proof. We look at two examples to demonstrate that this line of thought can be useful.

Lemma 1.3.3 The square root of 2 is irrational


Let x ∈ R satisfy x² = 2, x > 0. Then x ∉ Q.

Proof. Assume otherwise that x = a/b with a, b ∈ N. Write a = 2^e · ã and b = 2^f · b̃, where e, f are non-negative integers and ã, b̃ ∈ N are odd. Then x² = 2 implies

2^(2f+1) · b̃² = 2b² = a² = 2^(2e) · ã².

As ã² and b̃² are odd, after cancelling 2^(min(2f+1, 2e)) we are left with the equality of an even and an odd number, which gives the desired contradiction.

Lemma 1.3.4 Sum of rational and irrational numbers


Let q ∈ Q be rational and s ∈ R ∖ Q be irrational. Then q + s is irrational.

Proof. We assume to the contrary that q + s = x is rational. But then

s = x − q

is the difference of two rational numbers, which clearly is rational – contradicting the hypothesis that s ∉ Q. ○
We denote by this circle that the proof is now complete (at least according to the taste of the author :-)).

Consequence 1.3.5 Rational and irrational numbers are “dense”


Let a < b be real numbers. Then there exist rational and irrational numbers in
the interval (a, b).

Proof. We first show the existence of rational numbers. If n ∈ N is larger than 1/(b − a), then

1/n < b − a.

We look at the multiples z/n, z ∈ Z. Then – due to 1.2.6 – there is a largest multiple z/n ≤ a, because the set {z ∈ Z | z ≤ na} has a maximum. This means that

z/n ≤ a < (z+1)/n ≤ a + 1/n < b,

which shows (z+1)/n ∈ (a, b).

Now let s be any irrational number, for instance s = π or √2. Then we know from what we just did that there is a rational number r in the interval (a − s, b − s), but then r + s ∈ (a, b), and 1.3.4 tells us that this number is irrational.
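The first half of the proof is completely constructive, so it can be turned into a small program. Here is a sketch of ours in Python (floating point stands in for the reals, so this is illustrative only):

```python
import math

def rational_between(a, b):
    """Numerator and denominator of a rational in (a, b), as in the proof."""
    n = math.floor(1 / (b - a)) + 1   # guarantees 1/n < b - a
    z = math.floor(n * a)             # largest integer with z/n <= a
    return z + 1, n                   # then (z+1)/n lies in (a, b)

p, q = rational_between(0.31, 0.37)
print(p, "/", q, "=", p / q)          # 6 / 17 ≈ 0.3529
assert 0.31 < p / q < 0.37
```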

Example 1.3.6 A specific decimal number


Another statement which can nicely be proven by contradiction is the equality

0.9999999999… = 0.9̄ = 1.

The second equation is the one in question; the first one only illustrates what the notation 0.9̄ means: a periodic decimal number. If this number 0.9̄ were different from 1, it would certainly be the largest real number smaller than 1. But such a number does not exist, because for any two different real numbers x < y (in particular for y = 1), their arithmetic mean (x + y)/2 satisfies

x < (x + y)/2 < y.

Remark 1.3.7 Mathematical Induction


If we have a sequence of assertions A(n), depending on a parameter n ∈ N, we
sometimes face the task to show that A(n) is true for every n ∈ N. This means
we have to show that the set

S := {n ∈ N | A(n) is false}

is empty.
Assume that S is non-empty. Then – as indicated in 1.2.6 – S contains a smallest
element. This shows that if we can demonstrate that S does not contain a smallest
element we know it has to be empty. This idea creates what is called mathematical
induction.
In order to show that A(n) is true for every n ∈ N, it suffices to show that A(1)
is true (the base case) and that the truth of A(n) for any single natural number
n implies the truth of A(n + 1) (the induction step).
Note that we do not assume that A(n) is true for every n in order to prove A(n + 1); we only assume that A(n) is true for one (arbitrary) n.
In other words, we show each of the implications

A(1) ⇒ A(2), A(2) ⇒ A(3), A(3) ⇒ A(4), …

and as A(1) is true by the base case, we successively see that A(2) also has to be true, hence also A(3), hence A(4), and so on.
An example should make this procedure clearer.

Lemma 1.3.8 Bernoulli’s inequality


If h ∈ R satisfies h ≥ −1, then for every n ∈ N

(1 + h)^n ≥ 1 + nh.

Proof. We prove this statement for fixed h and parameter n ∈ N by mathematical induction. The statement A(n) is the truth of the said inequality.
The base case A(1) demands to show that (1 + h)¹ ≥ 1 + 1·h, which is clearly true – both sides are even equal.
To make the induction step we assume that for some n the inequality (1 + h)^n ≥ 1 + nh is true (inductive hypothesis) and have to show that

(1 + h)^(n+1) ≥ 1 + (n + 1)h.

This can be done as follows:

(1 + h)^(n+1) = (1 + h)·(1 + h)^n ≥(1) (1 + h)·(1 + nh) = 1 + h + nh + nh² ≥(2) 1 + (n + 1)h.
1.3. METHODS OF PROOF 19

Here we use that h ≥ −1, as this implies 1 + h ≥ 0, hence we can use the multiplicative monotonicity of ≥ (cf. 1.2.1) in step (1); and as nh² ≥ 0, we can use the additive monotonicity of ≥ in step (2).
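A brute-force numerical spot-check of the inequality (not a proof, of course – the induction above is the proof) might look like this in Python:

```python
# Check (1 + h)^n >= 1 + n*h on a grid of sample values with h >= -1.
for h in [-1.0, -0.5, 0.0, 0.1, 2.0]:
    for n in range(1, 20):
        assert (1 + h)**n >= 1 + n*h, (h, n)
print("Bernoulli's inequality holds for all sampled (h, n)")
```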

Remark 1.3.9 Bernoulli for h ≥ −2


Let us point out that Bernoulli's inequality also holds for all real h and n = 1 (clear) or n = 2, because

(1 + h)² = 1 + 2h + h² ≥ 1 + 2h.

It also holds for h ≥ −2 and every n, but the proof by induction no longer works for −2 ≤ h < −1, because then 1 + h < 0 and therefore the monotonicity of ≥ cannot be applied in this case.
However, for these values of h, |1 + h| ≤ 1 and therefore |(1 + h)^n| = |1 + h|^n ≤ 1, hence (1 + h)^n ≥ −1, while 1 + nh < −1 for n ≥ 3.

We will soon have many occasions to see other proofs by mathematical induction.
Chapter 2

Numbers and Functions

2.1 Sums and Products


Definition 2.1.1 Sums
Given real (and later complex) numbers a₁, …, aₙ for some natural number n, we define

∑_{i=1}^{n} aᵢ := a₁ + a₂ + ··· + aₙ.
The letter Σ is the capital “sigma”, the Greek counterpart to our “s”, the initial of the word “sum”. Sometimes the summation index i can also run through another finite set of integers between a lower and an upper bound, for instance

∑_{i=m}^{n} aᵢ = aₘ + aₘ₊₁ + ··· + aₙ, if m ≤ n.

In particular, for any natural number m between 1 and n, we find

∑_{i=1}^{n} aᵢ = ∑_{i=1}^{m} aᵢ + ∑_{i=m+1}^{n} aᵢ.

More generally, if S is a finite set and a : S → R (or later C) is a function, we use ∑_{s∈S} a(s) with the now obvious meaning. If S = U ∪ V with U ∩ V = ∅, then

∑_{s∈S} a(s) = ∑_{u∈U} a(u) + ∑_{v∈V} a(v),

because every summand from the left hand side is taken into account exactly once on the right hand side.
This suggests that it is reasonable to set an empty sum (i.e. no summands at all, S = ∅) equal to 0, because addition of 0 does not change the result of the summation.


It is certainly helpful to make some calculations for specific examples.

Example 2.1.2 Arithmetic and Geometric Progression


(a) For every natural number n we have ∑_{i=1}^{n} i = n(n+1)/2.
This can be shown by mathematical induction. The base case n = 1 says

∑_{i=1}^{1} i = 1 = (1·2)/2.

The inductive step is

∑_{i=1}^{n+1} i = ∑_{i=1}^{n} i + (n+1) =(IH) n(n+1)/2 + (n+1) = (n+2)(n+1)/2,

as desired. Here, (IH) marks the use of the inductive hypothesis, i.e. the assertion for the value n under consideration.
More generally, for given values a, b ∈ R, the numbers

aᵢ := b + ia, i = 0, 1, 2, 3, …

are called an arithmetic progression. We find

∑_{i=0}^{n} aᵢ = ∑_{i=0}^{n} b + a · ∑_{i=0}^{n} i = (n+1)·b + (n(n+1)/2)·a.

(b) A most important equality is the following.
Let q ≠ 1 be a real number. Then for n ∈ N₀ we have

1 + q + q² + ··· + q^n = ∑_{i=0}^{n} qⁱ = (q^(n+1) − 1)/(q − 1).

This equality can be seen by multiplying both sides by q − 1. Multiplication of the left hand side by q − 1 gives

(q − 1) · ∑_{i=0}^{n} qⁱ = ∑_{i=0}^{n} (q^(i+1) − qⁱ) = q^(n+1) − 1,

as all other powers of q occur in the sum once with positive and once with negative sign as summands, and these two summands cancel, leaving only the highest and the lowest power at the end.
A proof by mathematical induction is also possible.

(c) In a similar way as the first example, we can show for instance that

∑_{i=0}^{n} i² = (1/3) · n · (n + 1/2) · (n + 1).
3 2

We leave this as an exercise.
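All three identities – including the one left as an exercise – are easy to spot-check by machine. A small Python sketch of ours, using exact integer and Fraction arithmetic so that rounding plays no role:

```python
from fractions import Fraction

for n in range(0, 50):
    # (a): sum of 1, ..., n
    assert sum(range(1, n + 1)) == n * (n + 1) // 2
    # (c): sum of squares, the identity left as an exercise
    assert sum(i * i for i in range(n + 1)) == \
        Fraction(1, 3) * n * (n + Fraction(1, 2)) * (n + 1)

q = Fraction(3, 7)  # any q ≠ 1 works; a rational q keeps everything exact
for n in range(0, 20):
    # (b): geometric progression
    assert sum(q**i for i in range(n + 1)) == (q**(n + 1) - 1) / (q - 1)

print("identities (a), (b), (c) confirmed for the sampled n")
```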

Definition 2.1.3 Products


Given real (and later complex) numbers a₁, …, aₙ for some natural number n, we define

∏_{i=1}^{n} aᵢ := a₁ · a₂ · … · aₙ.

The letter Π is the capital “pi”, the Greek counterpart to our “p”, the initial of the word “product”. Sometimes the multiplication index i can also run through another finite set of integers (or even more general index sets – see later) between a lower and an upper bound, for instance

∏_{i=m}^{n} aᵢ, if m ≤ n.

In particular, for any natural number m between 1 and n, we find

∏_{i=1}^{n} aᵢ = ∏_{i=1}^{m} aᵢ · ∏_{i=m+1}^{n} aᵢ.

We again use a similar notation for a finite set S and a function a defined on S: ∏_{s∈S} a(s). If S = U ∪ V, where U ∩ V = ∅, we again have

∏_{s∈S} a(s) = ∏_{u∈U} a(u) · ∏_{v∈V} a(v).

Looking at the case U = S, V = ∅, this suggests that it is reasonable to set an empty product (i.e. no factors at all) equal to 1.

Definition/Remark 2.1.4 Factorial


For any n ∈ N₀ we define

n! := ∏_{i=1}^{n} i = 1 · 2 · 3 · … · n.

In particular, 0! = 1 (an empty product). The notation is pronounced “n factorial”.



The number n! is the number of different ways to order n different items. For
instance, there are 3! = 6 ways to write down the numbers 1, 2, 3 in different
orders:
123, 132, 213, 231, 312, 321.
We have (n+1)! = (n+1) · n! for every n. More generally, we have for 0 ≤ k ≤ n

n! = k! · ∏_{i=k+1}^{n} i.

We will use this in a second.

Definition/Remark 2.1.5 Binomial Coefficients


For integers 0 ≤ k ≤ n we set

(n choose k) := n! / (k! · (n−k)!).

Cancelling the factor k! in numerator and denominator, this can also be written as

(n choose k) = (∏_{i=k+1}^{n} i) / (n−k)!.

These numbers are called the binomial coefficients.
The combinatorial meaning of the binomial coefficients is that (n choose k) is the number of subsets of {1, …, n} having exactly k elements. This can be seen as follows.


For every such subset S with k elements there is an ordering of A = {1, . . . , n}
such that the elements of S are the first k of this ordering. However, we are
not interested in the ordering of the elements of S or of A r S, only in the set
S itself. Therefore, reordering the elements of S internally or the elements of
A r S internally, will not change the set S of the first k elements. This shows
that there are k! · (n − k)! orderings which lead to the same set S, showing the
assertion.
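In Python, the standard library exposes these numbers directly as math.comb; a quick check of ours of the factorial formula and of the subset-counting interpretation:

```python
from itertools import combinations
from math import comb, factorial

n, k = 7, 3
# factorial formula from 2.1.5
assert comb(n, k) == factorial(n) // (factorial(k) * factorial(n - k))
# (n choose k) counts the k-element subsets of {1, ..., n}
assert comb(n, k) == len(list(combinations(range(1, n + 1), k)))
print(comb(n, k))  # 35
```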

Theorem 2.1.6 Pascal’s Triangle


For any integers 0 ≤ k < n, we have

(n+1 choose k+1) = (n choose k) + (n choose k+1).

Proof. There are several ways to prove this theorem. We show two of them, you
can choose your favourite one.
2.1. SUMS AND PRODUCTS 25

First Proof: By calculation:

(n choose k) + (n choose k+1) = n!/(k! · (n−k)!) + n!/((k+1)! · (n−k−1)!)
= ((k+1) · n! + (n−k) · n!) / ((k+1)! · (n−k)!)
= (n+1)! / ((k+1)! · (n+1−(k+1))!) = (n+1 choose k+1).

From the first to the second line we have expanded the fractions to the common denominator (k+1)! · (n−k)! and then added the numerators.

Second Proof: By combinatorial arguments:
There are two sources for subsets S ⊆ {1, …, n+1} with exactly k+1 elements: either S ⊆ {1, …, n}, or n+1 ∈ S and then S′ := S ∩ {1, …, n} has k elements.
According to our interpretation of the binomial coefficients, there are (n choose k+1) sets of the first type and (n choose k) sets of the second type. As the two types are exclusive and cover every possibility, our assertion follows.

Remark 2.1.7 Where is the triangle?


In order to explain the name of the last Theorem it is helpful to arrange the binomial coefficients (n choose 0), …, (n choose n) as the n-th row in a triangular shape.

1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
and so on. Every entry in the next row is the sum of the two entries left and right
of it in the row before. The missing entries left and right of the 1s are zero.
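The recursion of Theorem 2.1.6 is also a convenient way to generate the triangle row by row; a minimal Python sketch of ours (the helper name pascal_rows is ad hoc):

```python
def pascal_rows(count):
    """Yield the first `count` rows of Pascal's triangle."""
    row = [1]
    for _ in range(count):
        yield row
        # each new entry is the sum of the two entries above it;
        # padding with 0 supplies the missing entries at the edges
        row = [a + b for a, b in zip([0] + row, row + [0])]

for r in pascal_rows(6):
    print(r)
# [1], [1, 1], [1, 2, 1], [1, 3, 3, 1], [1, 4, 6, 4, 1], [1, 5, 10, 10, 5, 1]
```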

Theorem 2.1.8 Binomial Formula For any real (or later complex) numbers a, b and every n ∈ N₀ we have

(a + b)^n = ∑_{i=0}^{n} (n choose i) aⁱ b^(n−i).

Proof. Again, there are several proofs, and we give two of them.

First Proof: By calculation, using mathematical induction with respect to n.
For n = 0 or n = 1 the assertion is clearly true. Assume it to be true for some arbitrary but specific n. Then

(a + b)^(n+1) = (a + b)^n · (a + b)
=(IH) (∑_{i=0}^{n} (n choose i) aⁱ b^(n−i)) · (a + b)
=(DL) ∑_{i=0}^{n} (n choose i) (a^(i+1) b^(n−i) + aⁱ b^(n+1−i))
= ∑_{i=1}^{n+1} (n choose i−1) aⁱ b^(n+1−i) + ∑_{i=0}^{n} (n choose i) aⁱ b^(n+1−i)
=(PT) ∑_{i=0}^{n+1} (n+1 choose i) aⁱ b^(n+1−i),

where (IH) is the inductive hypothesis, (DL) is the distributive law and (PT) is Pascal's triangle. Here, we set (n choose −1) = (n choose n+1) = 0.

Second Proof: Using combinatorics again. We expand (a + b)^n using the distributive law as a sum

(a + b)^n = ∑_{i=0}^{n} cᵢ aⁱ b^(n−i).

The factor cᵢ is the number of summands of the shape aⁱ b^(n−i), and we have to find out how many summands there are for a specific value of i. The summands we get are the result of choosing, in each of the n factors (a + b), either a or b and multiplying the chosen summands. As there are (n choose i) ways to choose the i factors from which we take a (taking b from the others), we see the desired result.

Example 2.1.9 Some particular cases


The most well-known case of this law is the case n = 2:

(a + b)² = a² + 2ab + b².

We also see that

11² = (10 + 1)² = 10² + 2·10¹ + 1·10⁰ = 121,
11³ = (10 + 1)³ = 10³ + 3·10² + 3·10¹ + 10⁰ = 1331,
11⁴ = (10 + 1)⁴ = 10⁴ + 4·10³ + 6·10² + 4·10¹ + 10⁰ = 14641.

That this fails to go on so nicely with just the binomial coefficients as decimals
is caused by the binomial coefficients becoming larger than 10.
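Both the formula and the 11^n pattern can be checked mechanically; a short Python sketch of ours:

```python
from math import comb

# the binomial formula for one sample choice of a, b, n
a, b, n = 3, 5, 9
assert (a + b)**n == sum(comb(n, i) * a**i * b**(n - i) for i in range(n + 1))

# powers of 11 next to the rows of Pascal's triangle
for n in range(6):
    print(11**n, [comb(n, i) for i in range(n + 1)])
# up to n = 4 the digits are exactly the binomial coefficients;
# for n = 5 the entry 10 needs two digits and the pattern breaks
```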

2.2 Complex numbers


Complex numbers are an extension of real numbers to a larger supply of numbers.
They are important in many branches of science, in particular in quantum physics
(and nowadays of course quantum computing) and electronic engineering.

Remark 2.2.1 What do we want?


One usually introduces the real numbers because some numbers seem to be missing from the perspective of rational numbers. For instance, 2 does not have a square root in Q, and the ratio between radius and circumference of a circle is not rational. It would be possible to remove several such “gaps” by different means, but it turned out to be useful to introduce real numbers, which close a large amount of such gaps at one stroke. Seen from the perspective of rational numbers, however, real numbers are quite a large step and are an idealization which later on in the history of mathematics and the sciences proved to be irreplaceable.
The real numbers, however, are separated into the non-negative and the negative numbers. The negative ones do not have a square root in R, because the square of a positive number is positive, and so is that of a negative one. It is a natural desire to extend the range of numbers to a larger one in which −1 has a square root. To that end one should explain what “number” should then mean. What we want is a set S containing R and equipped with an addition and a multiplication extending those from the real numbers and still satisfying:

commutativity , i.e. z + w = w + z and zw = wz for all w, z ∈ S

associativity , i.e. t + (w + z) = (t + w) + z and t(wz) = (tw)z for all t, w, z ∈ S

distributive law , i.e. t(w + z) = (tw) + (tz) for all t, w, z ∈ S

neutral elements , i.e. 0 + z = z and 1 · z = z for all z ∈ S.

We want moreover that S contains an element i which satisfies i² = −1.


Let us assume for a while that such a set S with said properties exists. We fix one element i with i² = −1.
What do we see from this assumption?

Remark 2.2.2 Necessities


Such a set S contains all elements of the form x + y·i, x, y ∈ R, because it contains all real numbers and i, and we may multiply i by y and then add x. We call this subset C:
C := {x + yi | x, y ∈ R}.

The set C is closed under addition and multiplication, which means: for all
z = x + yi, w = u + vi, where u, v, x, y ∈ R,

w + z = (u + vi) + (x + yi) = u + x + vi + yi = (u + x) + (v + y)i ∈ C,

because x + u, y + v ∈ R. We made use of the associativity of addition (resolving the brackets), the commutativity of addition and the distributive law. Similarly,

wz = (u + vi) · (x + yi) = ux + uyi + vix + viyi = (ux − vy) + (uy + vx)i ∈ C,

where we now also used the associativity and commutativity of multiplication and that i² = −1.
But this means that C already is a set of the type we want, and now we start the other way round and define the desired set, with multiplication and addition inspired by what we have just seen.
Note that addition and multiplication in C are governed by certain rules for combining the real numbers u, v, x, y.

Construction 2.2.3 Complex Numbers


We now define the set of complex numbers to be

C := R² = {(x, y) | x, y ∈ R}

and define addition and multiplication by the rules suggested by what we just
saw. For pairs (x, y), (u, v) ∈ C we set

(x, y) + (u, v) = (x + u, y + v),


(x, y) · (u, v) = (xu − yv, xv + yu).

Then one has to check that these operations are commutative and associative and obey the distributive law (which is tedious but works well), and that identifying the real number x with (x, 0) ∈ C leads to the fact that our new operations extend those for the real numbers, i.e.

(x, 0) + (u, 0) = (x + u, 0) and (x, 0) · (u, 0) = (xu, 0),

and that 1 ≙ (1, 0) and 0 ≙ (0, 0) are neutral elements for the new multiplication and addition as well. Moreover,

(0, 1)² = (−1, 0) ≙ −1.

Here, the sign ≙ denotes that we have identified a real number with the corresponding pair having 0 as second entry. Having noted this, we will write = in the future.

This finally shows that our dream came true, and we now set

i := (0, 1)

and can write every element (x, y) ∈ C as

(x, y) = x · 1 + y · i = x + yi.

This is just the usual vector addition and scalar multiplication, but now we also
can multiply two vectors by the rule invented above.
Every complex number z = x + yi is given by its real part x and its imaginary
part y. We write

x = Re(z), y = Im(z), z = Re(z) + Im(z) · i.

Note that now, given all the properties we wanted in 2.2.1, we may define sums ∑_{i=0}^{n} zᵢ and products ∏_{i=0}^{n} zᵢ as for real numbers, and the binomial formula will also be true.
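The construction can be carried out literally on a computer. The following Python sketch of ours (the names cadd and cmul are ad hoc) implements the two rules for pairs and confirms them against Python's built-in complex type:

```python
def cadd(z, w):
    (x, y), (u, v) = z, w
    return (x + u, y + v)

def cmul(z, w):
    (x, y), (u, v) = z, w
    return (x*u - y*v, x*v + y*u)   # the rule (xu - yv, xv + yu)

i = (0, 1)
assert cmul(i, i) == (-1, 0)        # i² = -1

z, w = complex(2, 3), complex(4, 5)
p = cmul((2, 3), (4, 5))
assert p == ((z * w).real, (z * w).imag)   # agrees with built-in complex
print(p)  # (-7, 22)
```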

Remark 2.2.4 Complex conjugation


We define a function z ↦ z̄ from C to C which is called complex conjugation and is given by

x + yi ↦ x − yi, i.e. Re(z̄) = Re(z) and Im(z̄) = −Im(z).

It is easy to see that for any w, z ∈ C the conjugate of wz is w̄ · z̄, and the conjugate of w + z is w̄ + z̄.
Looking at the number z as a point in the plane, given by rectangular coordinates x and y, one sees that, following Pythagoras, the distance between 0 and z is

|z| := √(x² + y²) = √(z · z̄).

We call this number the absolute value or modulus of z. We see that the absolute value satisfies a multiplicative property:

|zw| = √(zw · z̄w̄) = √(z · z̄) · √(w · w̄) = |z| · |w|, z, w ∈ C.

The absolute value also satisfies the triangle inequality

|z + w| ≤ |z| + |w|, z, w ∈ C.

We finally also see that every z ≠ 0 has an inverse with respect to multiplication, namely z⁻¹ = (1/(z·z̄)) · z̄, because z·z̄ ≠ 0 is real and has an inverse in R.
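With Python's built-in complex type (which has conjugate() and abs()), the multiplicativity of the modulus and the formula for the inverse can be tried out directly; a sketch of ours:

```python
import math

z, w = 3 - 4j, 1 + 2j

# |zw| = |z|·|w| (up to floating point rounding)
assert math.isclose(abs(z * w), abs(z) * abs(w))

# the inverse (1/(z·z̄))·z̄ ; note that z·z̄ is real
inv = z.conjugate() / (z * z.conjugate()).real
print(z * inv)   # (1+0j) up to rounding
```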

Definition/Remark 2.2.5 Polar coordinates; no order on C


If z ∈ C is nonzero, it can be written as z = |z| · w, where |w| = 1. Now w is a
point on the unit circle in the plane with center 0, and therefore it can be written
as w = cos(α)+sin(α)·i for some angle α which is defined up to adding multiples
of 2π. Sometimes it is helpful to insist that the angle is in the interval (−π, π],
and then it is unique. But as it is easy to digest the possible non-uniqueness of
α, we often will not do so.
Summing up we write
z = r · (cos(α) + sin(α)i)
and call the pair (r, α) a pair of polar coordinates of z; r is the modulus of z
and α the argument. Two nonzero numbers z and z′ are equal if and only if
they have the same modulus and the same argument up to adding a multiple of
2π.
Using the addition formula for the trigonometric functions (cf. 2.2.6) we see that

r (cos(α) + sin(α)i) · s(cos(β) + sin(β)i) = rs(cos(α + β) + sin(α + β)i),

i.e. multiplying two complex numbers means for the polar coordinates to multiply
the moduli and add the arguments. In particular, the polar coordinates are very
convenient to describe powers of complex numbers:

(r · (cos(α) + sin(α)i))^n = r^n · (cos(nα) + sin(nα)i).

For instance, every complex number has a square root. This is clear for z = 0, and for non-zero z = r · (cos(α) + sin(α)i) the possible choices for the square root are

±√r · (cos(α/2) + sin(α/2)i).
In particular it is not feasible to establish an order relation on C satisfying all properties from 1.2.1, because if we had such an ordering, a square – and hence every complex number! – would always be ≥ 0.
For example, we would have −1 < 0 due to 0 < 1 and the monotonicity of addition, and −1 = i² = (−i)² > 0 due to the monotonicity of multiplication, as i > 0 or −i > 0. Of course this contradiction shows that such an ordering cannot be established.
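The standard library module cmath converts between rectangular and polar coordinates, which makes the rules above easy to try out; a sketch of ours:

```python
import cmath

z = 1 + 1j
r, alpha = cmath.polar(z)           # modulus √2 and argument π/4

# powers: raise the modulus to the n-th power, multiply the argument by n
z5 = cmath.rect(r**5, 5 * alpha)
print(z5, z**5)                     # both ≈ -4-4j

# one of the two square roots of z, via half the argument
root = cmath.rect(r**0.5, alpha / 2)
print(root * root)                  # ≈ 1+1j again
```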

Lemma 2.2.6 Addition formula for trigonometric functions


For all α, β ∈ R we have

cos(α + β) = cos(α) cos(β) − sin(α) sin(β),


sin(α + β) = sin(α) cos(β) + cos(α) sin(β).

Proof. We make use of the geometric properties of the sine and cosine functions, which hopefully are well known from high school.
The proof is based on the following picture, which works well if all angles α, β, α + β are between 0 and π/2. We hence restrict to this case; the other possibilities can be dealt with similarly.
The point 0 is the origin and the center of a circle of radius 1 on which the points P and Q are placed.
[Figure: the unit circle around 0 through P and Q, with angle α between the x-axis and the line 0P and angle β between 0P and 0Q; Q′ is the foot of the perpendicular from Q onto the line 0P, with legs ℓ₁ from 0 to Q′ and ℓ₂ from Q′ to Q; R lies on the x-axis and S completes the small triangle at Q, as described below.]

The sine of α + β is the length of the perpendicular from Q to the x-axis.
We drop the perpendicular from Q to the line through 0 and P and call its end point Q′. As the right triangle with corners 0, Q and Q′ has angle β at 0, its legs have length cos(β) – the leg ℓ₁ from 0 to Q′ – and sin(β) – the leg ℓ₂ from Q′ to Q.
The point S is chosen so that the line from S to Q′ is horizontal and that from Q to S is vertical. The angle at Q in the small triangle turns out to be α again. But now the legs ℓ₁ and ℓ₂ are the hypotenuses of the right triangles with corners 0, R, Q′ and with corners Q′, Q, S respectively. As both triangles have angle α at 0 and Q respectively, the leg opposite to 0 of the former one has length cos(β) · sin(α), and the leg at Q of the latter one has length sin(β) · cos(α).
As the sum of these lengths is the length of the perpendicular from Q to the x-axis, we find the addition formula sin(α + β) = sin(α) cos(β) + cos(α) sin(β).
The other addition formula follows from this by using that cos(γ) = sin(π/2 + γ)
for every γ :
cos(α + β) = sin(π/2 + (α + β)) = sin((π/2 + α) + β)
= sin(π/2 + α) cos(β) + cos(π/2 + α) sin(β)
= cos(α) cos(β) − sin(α) sin(β),
as cos(π/2 + α) = sin(π + α) = − sin(α).

Consequence 2.2.7 Quadratic equations


We now use that according to 2.2.5 every complex number has a square root
in order to show that every quadratic equation with complex coefficients has a
complex solution (which of course sometimes will be real, but every real number
is a complex number as well:-))
Let a, b, c ∈ C be given and a ≠ 0. We want to show that there exists a z ∈ C such that

az² + bz + c = 0.
This goes as in the well-known real case, because we have all the properties from
2.2.1 (associativity, commutativity, distributive law) and may make use of them.
We already said in 2.2.3 that the binomial formula holds for complex numbers.
We can therefore “complete the squares”:
az² + bz + c = a(z² + (b/a)z + (b/(2a))² − (b/(2a))² + c/a),

and this is zero if and only if

(z + b/(2a))² = z² + (b/a)z + (b/(2a))² = (b/(2a))² − c/a = b²/(2a)² − 4ac/(2a)²,

i.e.

z = (−b ± √(b² − 4ac)) / (2a),

where we use √w to denote any possible square root of w. We will see later that there are two of them if w ≠ 0. In contrast to the real situation it is not so easy to choose one of them in a coherent way, as we do not have notions like positive or negative for arbitrary complex numbers.

Example 2.2.8 A specific quadratic equation


We consider the specific quadratic equation

z² + (1 + i)z + 1 = 0,

which – following our general procedure – has the solution

z = (−1/2) · (1 + i ± √(2i − 4)),

because (1 + i)² = 2i. For finding the square root of 2i − 4 we either have to find the modulus and argument of this number, which can be tedious, or we more directly have to find real numbers u, v such that (u + vi)² = 2i − 4.
Comparing real and imaginary parts of both sides of this equation leads to two equations which have to be valid simultaneously:

u² − v² = −4, 2uv = 2.

The second equation demands v = u⁻¹, and substituting this into the first gives, after multiplication with u²,

u⁴ + 4u² − 1 = 0,

which is a quadratic equation in u². The formula for solving this tells us that

u² = (−4 ± √(16 + 4)) / 2 = −2 ± √5,

but as u has to be real, u² is positive (which still makes sense for real numbers :-)), hence u² = √5 − 2. Accordingly,

v² = 1/u² = 1/(√5 − 2) = √5 + 2

(as expanding the fraction with √5 + 2 shows).
The solutions we sought are therefore (as v = 1/u again)

z = (−1/2) · (1 + i ± (√(√5 − 2) + √(√5 + 2) i)).
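The whole computation can also be delegated to cmath.sqrt, which returns a square root of an arbitrary complex number; a Python sketch of ours that solves the same equation numerically:

```python
import cmath

def quadratic_roots(a, b, c):
    """Both solutions of az² + bz + c = 0 via the formula from 2.2.7."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

for z in quadratic_roots(1, 1 + 1j, 1):    # z² + (1+i)z + 1 = 0
    print(z, z**2 + (1 + 1j) * z + 1)      # residuals ≈ 0
```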

2.3 Polynomial and rational functions


Polynomial functions are functions which can be written down easily using just the basic arithmetic of R or C.
Note: To make the formulation smoother, we will denote by F one of the number systems R or C in this section, unless a distinction between these two cases is helpful or necessary.

Definition/Remark 2.3.1 Polynomial functions


A function f : F → F is a polynomial function if there are numbers d ∈ N₀ and c₀, …, c_d ∈ F such that

f(x) = ∑_{i=0}^{d} cᵢ xⁱ for all x ∈ F.

If in the situation above c_d ≠ 0, we call d the degree of f, d = deg(f), and c_d the leading coefficient. c₀ is called the constant term. The degree of the zero polynomial f(x) = 0 is not defined by this; we formally set deg(0) = −∞.
Polynomials of degree 0 or −∞ are called constant polynomials, those of degree
1 are called linear polynomials, and those of degree 2 quadratic polynomials.
Elements z ∈ F with f (z) = 0 are called roots or zeroes of f. This terminology
will also be maintained for other functions, where of course roots have to be
elements in the domain (which need not be contained in F ).

Definition 2.3.2 Sum and product of functions


If D is some set and f, g : D → F functions with values in F, we can add and
multiply f and g by the recipe

∀x ∈ D : (f + g)(x) = f (x) + g(x), (f · g)(x) = f (x) · g(x).

This defines two new functions f +g, f ·g : D → F. Addition and multiplication of


functions on D satisfy associativity, commutativity and the distributive law. But
now the product of two functions can be the zero-function (constant 0) without
one of them being the zero-function.
If D = F and f and g are polynomial functions, then f + g and f · g are
polynomial functions as well.
Observe the formula forced upon us by the distributive law:

(∑_{i=0}^{d} cᵢ xⁱ) · (∑_{j=0}^{e} bⱼ xʲ) = ∑_{k=0}^{d+e} (∑_{i=0}^{k} cᵢ b_{k−i}) x^k.

Here we set b_{k−i} = 0 if k − i > e, and cᵢ = 0 if i > d.

Remark 2.3.3 Properties of degree, leading and constant term


The following rules are easily established: if f, g are two polynomial functions,
then

deg(f + g) ≤ max(deg(f ), deg(g)), deg(f g) = deg(f ) + deg(g),

where of course we define −∞ + d = −∞ and max(−∞, d) = d. If f, g are


both non-zero, then the leading coefficient of f g is the product of the leading
coefficients of the factors, and the same for the constant terms. As the constant
term of f is f (0), the constant term of f + g is (f + g)(0) = f (0) + g(0), the
sum of the constant terms. There is no such rule for the leading coefficient of a
sum.

Lemma 2.3.4 A linear factor


Let f : F → F be a polynomial function, and let x₀ ∈ F be given. Then f can be written as a multiple of the linear polynomial x − x₀ with a polynomial cofactor if and only if f(x₀) = 0.

Proof. We write x = x₀ + (x − x₀) and use this together with the binomial formula 2.1.8 to see

f(x) = f(x₀ + (x − x₀))
= ∑_{i=0}^{d} cᵢ (x₀ + (x − x₀))ⁱ
= ∑_{i=0}^{d} cᵢ · ∑_{j=0}^{i} (i choose j) x₀^(i−j) · (x − x₀)ʲ
= ∑_{j=0}^{d} (∑_{i=j}^{d} (i choose j) cᵢ x₀^(i−j)) · (x − x₀)ʲ
= f(x₀) + ∑_{j=1}^{d} (∑_{i=j}^{d} (i choose j) cᵢ x₀^(i−j)) · (x − x₀)ʲ
= f(x₀) + (x − x₀) · ∑_{j=1}^{d} (∑_{i=j}^{d} (i choose j) cᵢ x₀^(i−j)) · (x − x₀)^(j−1),

where in the second-to-last step we have singled out the summand for j = 0, which gives f(x₀), and can then write the rest as a multiple of (x − x₀) by the polynomial g(x) = ∑_{j=1}^{d} (∑_{i=j}^{d} (i choose j) cᵢ x₀^(i−j)) · (x − x₀)^(j−1).
This shows that if f(x₀) = 0, we have f(x) = (x − x₀) · g(x). If, on the other hand, f is a multiple of (x − x₀), then of course f(x₀) is a multiple of (x₀ − x₀) = 0, hence is zero.

Consequence 2.3.5 Roots of polynomials


Every polynomial function f on F of degree d ≥ 0 has at most d roots, i.e. at
most d elements x ∈ F with f (x) = 0.

Proof. We use mathematical induction on d. For d = 0, a polynomial function of degree 0 is a non-zero constant function, which therefore has no root at all.
For d = 1, f(x) = c₁x + c₀ with c₁ ≠ 0, and therefore

f(x) = 0 if and only if x = −c₀/c₁,

hence there is exactly one root.


Assuming now that every polynomial function of (fixed) degree d ≥ 0 has at most d roots, and being given a polynomial function f of degree d + 1, we can either have no root at all (hence fewer than d + 1 = deg(f) many, which is what we want)
or there is a root x0 of f and we can write

f (x) = (x − x0 ) · g(x),

where g(x) is a polynomial function. Due to 2.3.3, g has degree d and therefore
at most d roots. However, if r ∈ F is a root of f, then

0 = f (r) = (r − x0 ) · g(r),

and this product of two elements of F can only be 0 if at least one of the factors
is zero. Therefore r = x0 or r is a root of g, which together gives at most d + 1
candidates, proving the inductive step.

Remark 2.3.6 Fundamental Theorem of Algebra


It is the most striking difference between real and complex numbers that in the
latter case every non-constant polynomial has a root. This fact is called the
Fundamental Theorem of Algebra. We will not prove it in this course.
Using this, 2.3.4, and mathematical induction, every non-constant complex polynomial can be written as a product of linear complex polynomials.

Definition 2.3.7 Rational functions


Given two polynomial functions f, g on F, where g is not the constant polynomial 0, we denote by
Z := {z ∈ F | g(z) = 0}
the (finite) set of zeros of g.
We then can define a function

f/g : F ∖ Z → F, x ↦ f(x)/g(x).

Functions of this type are called rational functions.

Definition/Remark 2.3.8 Partial Fraction Decomposition


Let f/g be a rational function, i.e. f, g are polynomials, g ≠ 0. It sometimes is helpful to decompose it into summands of a specific shape, which can be most easily described if

g(x) = (x − z₁)^(m₁) · (x − z₂)^(m₂) · … · (x − zₙ)^(mₙ)

is a product of linear factors, where the zeros z₁, …, zₙ ∈ F are pairwise different and occur with certain multiplicities m₁, …, mₙ ∈ N. Under this condition there exist constants

c_{i,j}, 1 ≤ i ≤ n, 1 ≤ j ≤ mᵢ,

and a polynomial function h(x) with

f(x)/g(x) = h(x) + ∑_{i=1}^{n} ∑_{j=1}^{mᵢ} c_{i,j} / (x − zᵢ)ʲ.

This is called the partial fraction decomposition of f/g.


The denominators in the expression are more “clean cut” than the original denominator g itself, and it can be interesting to focus on specific zeros of g, in particular when we later want to integrate rational functions. There is a similar, slightly more complicated formula in the case that the denominator does not split as a product of linear factors.
We do not prove the existence of those decompositions.

Example 2.3.9 Two examples


The proof would consist of two steps, but of course we can believe the result and try to find the decomposition without knowing the proof.

(a) Try to find the partial fraction decomposition for the rational function 1/(x²(x − 1)²): here z₁ = 0, z₂ = 1, m₁ = m₂ = 2.
The numbers c_{i,j} have to satisfy the condition that

1/(x²(x − 1)²) − c_{1,1}/x − c_{1,2}/x² − c_{2,1}/(x − 1) − c_{2,2}/(x − 1)²

is a polynomial. Using a common denominator this means that

1 − c_{1,1}·x(x − 1)² − c_{1,2}·(x − 1)² − c_{2,1}·x²(x − 1) − c_{2,2}·x²

is divisible by x²(x − 1)². But as the combination above has degree ≤ 3 (as every summand does), it then must be zero.
Expanding the brackets leads to

−(c_{1,1} + c_{2,1})x³ + (2c_{1,1} − c_{1,2} + c_{2,1} − c_{2,2})x² + (−c_{1,1} + 2c_{1,2})x + (1 − c_{1,2}) = 0.

This forces c_{1,2} = 1 (constant term), hence c_{1,1} = 2 (coefficient of x), hence c_{2,1} = −2 (coefficient of x³) and finally c_{2,2} = 1 (coefficient of x²).

(b) The general recipe in the case n = 1, i.e. g(x) = (x − z₁)^m, consists in writing x as z₁ + (x − z₁) and substituting this into f. Then f is a polynomial expression in (x − z₁) with coefficients depending on f and z₁, and then one can separate summands.
We illustrate this with g(x) = (x − 1)³ and f(x) = x⁴ + 3x.
Then using the binomial formula we calculate

f(x) = f(1 + (x − 1)) = (1 + (x − 1))⁴ + 3(1 + (x − 1))
= (x − 1)⁴ + 4(x − 1)³ + 6(x − 1)² + 4(x − 1) + 1 + 3 + 3(x − 1)
= (x − 1)⁴ + 4(x − 1)³ + 6(x − 1)² + 7(x − 1) + 4.

Division by (x − 1)³ gives the desired partial fraction decomposition

f(x)/g(x) = x + 3 + 6/(x − 1) + 7/(x − 1)² + 4/(x − 1)³.
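If the SymPy library is available, its apart routine computes partial fraction decompositions symbolically and can be used to double-check both hand calculations; a minimal sketch of ours:

```python
from sympy import symbols, apart

x = symbols('x')
print(apart(1 / (x**2 * (x - 1)**2)))
# expected (possibly in another order): 2/x + 1/x**2 - 2/(x - 1) + 1/(x - 1)**2
print(apart((x**4 + 3*x) / (x - 1)**3))
# expected: x + 3 + 6/(x - 1) + 7/(x - 1)**2 + 4/(x - 1)**3
```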
Chapter 3

Sequences

Sequences are a good tool for understanding the different ways in which one can approximate complex numbers better and better by other numbers.

3.1 Convergence
Definition 3.1.1 Sequences
A (complex or real) sequence is a map from N (or sometimes N0 ) to the complex
or real numbers.
We denote a sequence often as a = (an )n∈N , where an is the value of the given
sequence a at n. Sometimes we also write a sequence as

a = (a1 , a2 , a3 , . . .) or a = (an )n .

Note that, in contrast to a set, we here have to control the specific order in which
the values come, and that values may occur more than once – maybe even
infinitely often! For instance, the sequences

(an)n = (2, 3, 5, 7, 11, 13, . . .), i.e. an = the n-th prime,

and

(bn)n = (2, 3, 2, 5, 2, 7, 2, 11, 2, 13, . . .), i.e. b2k−1 = 2, b2k = the (k + 1)-st prime,

are totally different, although they take the same values.

As every real sequence also is a complex sequence, we will mostly treat complex
sequences when talking about general rules.


Definition 3.1.2 Limits and Convergence


Given a complex sequence (an)n∈N and a number ℓ ∈ C, ℓ is called the limit of
(an)n if for every real ε > 0 (the symbol ε is the Greek letter epsilon; if you
don't like it, you may use any other letter) there exists a real number N such that

for all n > N : |an − ℓ| ≤ ε.

This says that for all but finitely many n the values an are located in the disc of
radius ε around ℓ. We will sometimes abbreviate this by saying that for almost
all n this condition is satisfied.
If such a limit exists, then we write

lim_{n→∞} an = ℓ

and say that the sequence is convergent: it converges to ℓ, as n goes to infinity.


If the sequence has real values, it sometimes is helpful to talk about convergence
to ∞ or −∞, meaning that – in the first case – for every R ∈ R almost all an
are larger than R and – in the second case – that almost all an are smaller than
R.
For instance, lim_{n→∞} n^2 = ∞, because for every real R the number of squares
being smaller than R is finite.
A sequence converging to 0 will be called a null sequence. This just means that
for every ε > 0 and almost all n ∈ N, |an| < ε.
A sequence (an)n converges to ℓ ∈ C if and only if the sequence (an − ℓ)n is a
null sequence.

Remark 3.1.3 A geometric point of view


A real sequence (an)n∈N can be illustrated by its graph. This is just the discrete
set of points {(n, an) | n ∈ N}, which we can think of as a subset of the plane.

[Figure: the points (n, an) plotted over n, together with a narrow horizontal
strip around the line at height ℓ; only finitely many points lie outside the strip.]
The number ℓ is the limit of the sequence if only finitely many points (n, an) lie
outside the strip around the line {(x, ℓ) | x ∈ R}, no matter how narrow this
strip is.
If we only know that infinitely many points lie in every one of these strips, then
this will not be enough. But such values of ℓ will later on be called accumulation
points.

Example 3.1.4 Decimal expansion


We are used to dealing with sequences of rational numbers as soon as we deal with
the decimal expansion of a given real number. For instance, the statement

π = 3.1415926535897932384626433832795028842 . . .

just says that there is a sequence of rational numbers which converges to π (which
is given as the ratio between circumference and diameter of a circle) and starts as

3, 3.1, 3.14, 3.141, 3.1415, . . . ,

and the more decimals you consider the closer you come to π.
But of course there are many more sequences of numbers converging to the same
limit, and this forces us to consider this phenomenon in a more conceptional way.

Lemma 3.1.5 Uniqueness of the limit and a helpful criterion


If a sequence (an) is convergent, then its limit is unique and the sequence
(an+1 − an)n∈N is a null sequence.

Proof. We show the uniqueness of the limit by a proof by contradiction. To that
end, we assume that ℓ, ℓ′ ∈ C are different limits of the convergent sequence (an).
Choose some positive ε < |ℓ − ℓ′|/2.
Then we know from the definition of convergence that for some N ∈ N and all
n ≥ N, both |an − ℓ| and |an − ℓ′| are at most ε. Therefore, using the triangle
inequality from 2.2.4, we see for every such n

|ℓ − ℓ′| = |ℓ − an + an − ℓ′| ≤ |ℓ − an| + |an − ℓ′| ≤ 2ε < |ℓ − ℓ′|,

which of course is a contradiction. Therefore no two different limits for one
sequence can exist, and the limit hence is unique.
Now let ℓ ∈ C be the limit of (an)n. For every real ε > 0 we find that for some
N ∈ R all values an, n ≥ N, satisfy |an − ℓ| ≤ ε/2, because such an inequality
holds for every positive number, and ε/2 is a positive number. We therefore find,
using the triangle inequality again,

|an+1 − an| = |an+1 − ℓ + ℓ − an| ≤ |an+1 − ℓ| + |ℓ − an| ≤ 2 · ε/2 = ε.

Now this inequality holds for almost all n, and therefore the sequence of
differences an+1 − an is a null sequence.
Note that the condition that (an+1 − an)n is a null sequence is not sufficient
for convergence of (an). We will see a counterexample later, when we treat the
harmonic series.
Nevertheless, we are now in a good situation to treat our first examples.

Example 3.1.6 First examples

a) The sequence an = 1/n converges to 0, because for every positive ε there
are only finitely many natural numbers n smaller than 1/ε, and therefore
almost all n satisfy

n > 1/ε,

which of course is equivalent to

1/n < ε.

Note that here 1/n = |1/n − 0|, hence 0 is the limit.
In a similar fashion, one can show that if (an) is a sequence of (non-zero)
real numbers converging to ∞, the sequence (1/an)n is a null sequence.
b) For q ∈ C we look at the sequence an = q^n. We claim that (an)n converges
if and only if |q| < 1 or q = 1, and that

lim_{n→∞} an = 0 if |q| < 1,  and  lim_{n→∞} an = 1 if q = 1.

To begin with, it is clear that for q = 1 we have an = 1 for all n and
therefore |an − 1| = 0 ≤ ε for every ε > 0 and every n, clearly showing the
convergence in this case. If q has modulus 1 but is different from 1 we see

|an+1 − an| = |(q − 1)an| = |q − 1|,

which shows that (an+1 − an)n cannot be a null sequence, whence (an)n
cannot converge due to 3.1.5.
Now take |q| > 1 and write |q| = 1 + h, h > 0. If (q^n)n converged
to some ℓ ∈ C, we in particular would have |q^n − ℓ| ≤ 1 for almost all n
(because 1 is a legal choice for our positive ε) and therefore by the triangle
inequality

(1 + h)^n = |q^n| = |q^n − ℓ + ℓ| ≤ |ℓ| + 1.

On the other hand we know from Bernoulli's inequality 1.3.8 that

|q^n| = (1 + h)^n ≥ 1 + nh,

and this will be larger than |ℓ| + 1 as soon as n > |ℓ|/h. Therefore we not
only see that q^n cannot converge to ℓ, we even see that |q^n| converges to
∞, if |q| > 1.
Our last case now is |q| < 1. Here we note that |1/q| > 1, and therefore
we just saw that |1/q^n| →n→∞ ∞, which in turn implies – using our first
example a) in this list – that |q^n| →n→∞ 0, proving our last claim.

c) For q ∈ C and n ∈ N we look at

an = Σ_{k=0}^{n} q^k.

Then (an)n converges to some complex number if and only if |q| < 1, in
which case we have lim_{n→∞} an = 1/(1 − q).
It is clear from what we just saw that an+1 − an = q^{n+1} will only be a
null sequence if |q| < 1, which shows the necessity of this condition for
convergence.
In order to prove the convergence in the case |q| < 1, we recall from 2.1.2
b) that an = (1 − q^{n+1})/(1 − q), and this shows that

|an − 1/(1 − q)| = |q^{n+1}/(1 − q)|.

Given any real ε > 0, we know from example b) that for almost all n we
have |q^{n+1}| < ε · |1 − q|, which is just some fixed real number, and therefore
for almost all n we have |an − 1/(1 − q)| ≤ ε, proving

Σ_{k=0}^{n} q^k →n→∞ 1/(1 − q), if |q| < 1.

I again stress that this example – the geometric series – is one of the most
important examples in the whole lecture.
One specific case is q = 0.1. Then the sequence considered starts as

1, 1.1, 1.11, 1.111, 1.1111, 1.11111, . . .

and it is clear that the limit is 1.111 . . . (all digits after the point being 1).
We therefore have shown that

1.111 . . . = 1/(1 − 0.1) = 10/9,

which implies

0.111 . . . = 1/9, hence 0.999 . . . = 1.

We already know that from 1.3.6.
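
A minimal numerical sketch of this (plain Python, only the standard library): the
powers q^n tend to 0 and the partial sums of the geometric series tend to
1/(1 − q), which for q = 0.1 is 10/9 = 1.111 . . .

    # Partial sums of the geometric series for q = 0.1 (Example 3.1.6 c)).
    q = 0.1
    partial_sum = 0.0
    for n in range(20):
        partial_sum += q ** n          # adds q^0, q^1, ..., q^n
        print(n, q ** n, partial_sum)

    print("limit 1/(1-q) =", 1 / (1 - q))   # 1.1111... = 10/9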

Definition 3.1.7 Sums and products of sequences


Let a = (an )n∈N and b = (bn )n∈N be two sequences. We then define the sequences
a + b (sum of a and b ) and a · b (product of a and b ) by
a + b = (an + bn )n∈N and a · b = (an · bn )n∈N .
This is merely a special case of 2.3.2.

Definition 3.1.8 Bounded sets and functions


Extending the definition of boundedness from 1.2.6 to subsets of C, where upper
and lower bounds do not make sense, we define S ⊂ C to be bounded if there is
some R ∈ R such that every s ∈ S satisfies |s| ≤ R.
We define the open disc BR (c) with center c ∈ C and radius R ∈ R>0 to be

BR (c) = {z ∈ C | |z − c| < R}.

The closed disc is


BR (c) = {z ∈ C | |z − c| ≤ R}.
Here the letter “ B ” comes from the word “ball”, because in higher dimensions
it is more usual to think of balls instead of discs.
Open and closed discs are bounded, as can be seen from the triangle inequality.
If D is some set and f : D → C a function, then the function f is bounded if
the range of f is bounded:

f (D) ⊆ BR (0) for some R > 0.

In particular, a sequence (an)n is bounded if for some R > 0 and all n

|an| < R.

It is easily seen that a convergent sequence is bounded, because if lim_{n→∞} an = ℓ,
there is an N ∈ N with

|an − ℓ| ≤ 1, hence |an| ≤ |ℓ| + 1,

for all n ≥ N, and therefore

|an| < R

for all n if R = max({|ai| | i ≤ N} ∪ {|ℓ|}) + 1.

Theorem 3.1.9 Sums and products of convergent sequences converge


Let (an)n and (bn)n be two convergent (complex) sequences with limits

lim_{n→∞} an = ℓ,  lim_{n→∞} bn = m.

Then their sum and product do converge and we have

lim_{n→∞} (an + bn) = ℓ + m,  lim_{n→∞} (an · bn) = ℓ · m.

If, moreover, m ≠ 0, then bn ≠ 0 for almost all n and we have

lim_{n→∞} an/bn = ℓ/m.

Before giving the proof we admit that we are a bit careless about an /bn . This
quotient only is defined if bn 6= 0, and we will see in the proof that this is the
case for almost all n, which is what we need for talking about convergence.
Proof. To warm up, we start with an + bn. If any real ε > 0 is given, we know
that for almost all n

|an − ℓ|, |bn − m| < ε/2.

Therefore for these n the triangle inequality renders

|(an + bn) − (ℓ + m)| = |(an − ℓ) + (bn − m)| ≤ |an − ℓ| + |bn − m| ≤ 2 · ε/2 = ε,

showing the convergence.

For the product we need that a convergent sequence is bounded. As (bn)n and
(an)n are bounded, there is some R > max(|ℓ|, |m|) with |an|, |bn| ≤ R for all
n. Given ε > 0, we find that

|an − ℓ| < ε/(2R), |bn − m| < ε/(2R)

for almost all n (because ε/(2R) is real and positive) and therefore by the triangle
inequality

|an bn − ℓm| = |an bn − ℓbn + ℓbn − ℓm|
            ≤ |bn| · |an − ℓ| + |ℓ| · |bn − m|
            ≤ R · ε/(2R) + R · ε/(2R)
            = ε

is true for almost all n.
The last assertion concerning the case m ≠ 0 uses that for almost all n we have

|bn − m| < |m|/2, hence |bn − 0| ≥ |m|/2 and |1/bn| ≤ 2/|m|.

Therefore for almost all n we have

|1/bn − 1/m| = |(m − bn)/(bn m)| ≤ (2/|m|^2) · |m − bn|,

which is smaller than any given ε, if |bn − m| < ε|m|^2/2.

We therefore see that lim_{n→∞} 1/bn = 1/m, and the rest follows from what we
know about products of convergent sequences.

Consequence 3.1.10 “Polynomials are continuous”


Let (an)n be a convergent sequence with limit ℓ, let d ∈ N be some natural
number and let c0, c1, . . . , cd ∈ C be given.
Then the sequence (Σ_{i=0}^{d} ci an^i)n converges to Σ_{i=0}^{d} ci ℓ^i.

Proof. This is clear from what we said in the last Theorem 3.1.9. As (an) is
convergent, we find

lim_{n→∞} an · an = ℓ^2,  lim_{n→∞} (an)^2 · an = ℓ^3

and so on, i.e. lim_{n→∞} an^i = ℓ^i for every i ∈ N0, and then – using the constant
sequence (ci, ci, ci, ci, . . . ), which converges to ci –

lim_{n→∞} ci an^i = ci ℓ^i.

We finally may add these finitely many convergent sequences.


We will later say that due to this phenomenon the polynomial function
z ↦ c0 + c1 z + c2 z^2 + . . . + cd z^d is continuous.

Consequence 3.1.11 Complex vs. real sequences


A complex sequence (an)n converges to ℓ if and only if the real sequences (Re(an))n
resp. (Im(an))n converge to Re(ℓ) resp. Im(ℓ).

Proof. If a complex sequence (an)n converges to the limit ℓ, then the sequence of
complex conjugates (a̅n)n converges to ℓ̅, because conjugation preserves absolute
values:

|a̅n − ℓ̅| = |an − ℓ|.

Therefore the real resp. imaginary parts of an converge to the real resp. imaginary
part of ℓ, as for every z ∈ C

Re(z) = (z + z̅)/2,  Im(z) = (z − z̅)/(2i).

On the other hand, if the real sequences (Re(an))n resp. (Im(an))n converge to x
resp. y, then (an)n = (Re(an) + Im(an)i)n converges to x + yi.

3.2 Subsequences and accumulation points


It is desirable to have conditions which imply the existence of a limit of a sequence
without having to check that for every ε > 0 . . .

Definition 3.2.1 Subsequences


Given a sequence a = (an )n∈N we define a subsequence of a to be a sequence of
the form
(ank )k∈N = (an1 , an2 , an3 , . . . )

where
n1 < n2 < n3 < . . .
is a strictly monotonically growing sequence of natural numbers.
For instance, the sequence (1, 3, 5, 7, 9, . . . ) of odd integers is a subsequence of the
sequence a = (1, 2, 3, 4, 5, 6, 7, . . . ) of all natural numbers (both in their natural
order). The sequence (1, 2, 1, 2, 1, 2, . . . ) is not a subsequence of a.
It is clear that every subsequence of a convergent sequence converges to the same
limit.

Remark 3.2.2 Divide and rule!


If a has two subsequences such that every an for large enough n is a member
of one of them, then a converges if and only if both subsequences converge and
have the same limit.
For instance, if (a2k+1 )k converges to some limit and (a2k )k converges to the
same limit then so does (an )n .
For the nice application in 3.2.9 of this tool we need some more preparation.
In particular, to handle this type of phenomenon more clearly, the limits of con-
verging subsequences get a name.

Definition 3.2.3 Accumulation points


Let a = (an)n be a complex sequence. A complex number h is an accumulation
point of a if there exists a subsequence of a which converges to h. This means
that for every ε > 0 there exist infinitely many n such that |an − h| < ε.
We already touched upon this condition in 3.1.3.
Indeed, if this holds, we can “construct” a subsequence converging to h recur-
sively as follows: choose n1 such that |an1 − h| < 1. If n1, . . . , nk are chosen,
choose nk+1 > nk such that

|an_{k+1} − h| < 1/(k + 1).

This is possible as there are infinitely many n such that an has distance less
than 1/(k + 1) from h. Then (ank)k converges to h because

|ank − h| < 1/k →k→∞ 0.

A convergent sequence has exactly one accumulation point. What else can hap-
pen?

Example 3.2.4 Many accumulations points

(a) It is easy to construct sequences which have a given finite set as accumulati-
on points and no other accumulation point at all. If S = {s1 , s2 , . . . , sc } ⊆ R
is a finite set, then just take the sequence which starts as

a1 = s 1 , a 2 = s 2 , . . . , a c = s c

and is periodic with period c, i.e. an = an+c for all n.


Then si is the limit of the subsequence (ai+kc )k∈N which is constant and
has value si all over. If, on the other hand, (ank)k is any subsequence, there
will be some i ∈ {1, . . . , c} such that infinitely many of the nk are of
the shape i + lc, l ∈ N, showing that si is an accumulation point of this
subsequence. Therefore, if the subsequence converges, it converges to si,
and there cannot be any other accumulation points but elements of S.

(b) A sequence can have infinitely many accumulation points. In particular,


consider the following sequence:
The first 3 values are the elements of {1/1, 0/1, −1/1}. The next values are the
rational numbers of the shape z/n, z ∈ Z, n ∈ N, max(|z|, n) = 2. There are
7 of these. Then go on with the 11 numbers of the shape z/n, z ∈ Z, n ∈
N, max(|z|, n) = 3. Go on in that fashion, raising max(|z|, n) to 4, then
to 5, 6, 7 and so on. For every one of these maxima you have finitely many
numbers which you sort in some way and make them the next values of
the sequence. You could start off like

1, 0, −1, 2/1, 2/2, 1/2, 0/2, −1/2, −2/2, −2/1, 3/1, 3/2, 3/3, 2/3, 1/3, 0/3,
−1/3, −2/3, −3/3, −3/2, −3/1, 4/1, 4/2, 4/3, 4/4, . . .

Every rational number will be a value of this sequence at least once, even
infinitely often. Therefore every rational number is an accumulation point.
Even worse (or even more interesting, depending on your taste): as for every
real number x and every ε > 0 there are infinitely many rational numbers
in (x − ε, x + ε) by the argument given in 1.3.5, x is an accumulation point.
The set of accumulation points is the whole real line!
Note, that this argument also shows that Q cannot be the set of all accu-
mulation points of a sequence.
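
The start of this sequence can be generated mechanically; the following is a small
sketch in Python (the ordering inside each block is a free choice, so the output
need not match the listing above exactly).

    # Enumerate the fractions z/n with max(|z|, n) == k for k = 1, 2, 3, ...
    # (Example 3.2.4 (b)); every rational number shows up infinitely often.
    from fractions import Fraction

    def block(k):
        """All pairs (z, n) with z in Z, n in N and max(|z|, n) == k."""
        return [(z, n) for z in range(-k, k + 1) for n in range(1, k + 1)
                if max(abs(z), n) == k]

    sequence = []
    for k in range(1, 4):                 # blocks of 3, 7 and 11 values
        sequence.extend(Fraction(z, n) for z, n in block(k))
    print(sequence)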

We already saw that every convergent sequence is bounded. The converse is false,
as you can see from the sequence ((−1)^n)n∈N. But nevertheless the following
theorem is very important.

Theorem 3.2.5 Bolzano-Weierstraß


Every complex bounded sequence has at least one accumulation point.

Proof. If we know the theorem for real sequences, then it follows for complex
sequences as well, because for a bounded complex sequence (an )n the real- and
imaginary parts are also bounded. The bounded real sequence of real parts has a
convergent subsequence (<(ank ))k and then the bounded real sequence (=(ank ))k
again has a convergent subsequence. The corresponding subsequence of (an )n
therefore converges due to 3.1.11.
We may therefore assume that (an )n is a bounded real sequence, the values being
contained in the interval (−R, R). After adding R and dividing by 2R we may
even assume 0 ≤ an ≤ 1 for all n.
We now have to “find” an accumulation point, which we “construct” as a decimal
number – this is an existence proof! We cannot really single out an accumulation
point explicitely.
We exclude the case that infinitely many members of the sequence are 1 – in this
case, 1 would be an accumulation point and we would be done.
We now start by choosing the maximal integer b1 such that for infinitely many
n

b1/10 ≤ an.

This b1 exists because the set

{z ∈ Z | z ≤ 10an for infinitely many n}

is not empty (it contains every integer z < 0) and bounded (no such z is larger
than 9).
More generally we define for k ∈ N the integer bk to be maximal with the
property that for infinitely many n

bk ≤ an · 10^k.

Again, it is clear that these bk exist by the same reason. As moreover
10bk ≤ an · 10^{k+1} infinitely often but 10(bk + 1) > an · 10^{k+1} for almost all n
(due to the choice of bk), we see that

10bk ≤ bk+1 < 10bk + 10.

Therefore there exists an integer dk+1 ∈ {0, 1, . . . , 9} such that

bk+1 = 10bk + dk+1.
This defines a decimal number x = 0.d1 d2 d3 . . . = Σ_{k=1}^{∞} dk 10^{−k}, and
ℓ = b0 + x = lim_{k→∞} bk/10^k.

This number ℓ is an accumulation point of (an)n: for every ε > 0 we take a
k ∈ N such that 10^{−k} < ε. Then by construction

bk/10^k ≤ ℓ ≤ (bk + 1)/10^k,

and as also an lies between these two bounds infinitely often, we get

|an − ℓ| ≤ 10^{−k} < ε

infinitely often, i.e. ℓ is an accumulation point, as desired.

Theorem 3.2.6 Convergence Criterion


A bounded complex sequence (an )n converges if and only if it has exactly one
accumulation point.

Proof. It is clear that if (an)n converges there is exactly one accumulation point,
namely its limit.
To show the converse we give a proof by contradiction.
Assume that (an)n has exactly one accumulation point ℓ which is not the limit
of (an)n. Then there exists an ε > 0 such that for infinitely many n

|an − ℓ| > ε.

Therefore there is a subsequence (ank)k outside Bε(ℓ). This subsequence still is
bounded and therefore – by Bolzano and Weierstraß, 3.2.5 – has an accumulation
point, which also is an accumulation point of (an)n. But as no ank comes closer
to ℓ than with distance ε, the accumulation point of (ank)k is different from ℓ,
contradicting the uniqueness of this accumulation point.

Remark 3.2.7 Boundedness is necessary


Note that the condition of boundedness is necessary in the last theorem. Indeed,
the sequence

an = 0 if n is not prime,  an = n if n is prime,

i.e. (0, 2, 3, 0, 5, 0, 7, 0, 0, 0, 11, 0, 13, . . . ), has 0 as its only accumulation point
but does not converge.

Theorem 3.2.8 Monotonicity Criterion


If a real sequence (an )n is bounded and monotonic (i.e. either for all n an ≤ an+1
or for all n an ≥ an+1 ) then it converges.

Proof. We only deal with the case that an ≤ an+1 for all n, the other case being
very similar.
As the sequence is bounded, there exists an accumulation point. This accumula-
tion point is unique, as one can see by contradiction. To that end, assume that
h < h′ are two accumulation points. Then there exists an N ∈ N such that

|aN − h′| ≤ (h′ − h)/2,

which implies

aN ≥ h + (h′ − h)/2.

As all later members of the sequence are at least as large as aN, we have

|an − h| ≥ (h′ − h)/2 for all n ≥ N,

and therefore h cannot be an accumulation point. 3.2.6 then gives the desired
convergence.
We now are in good shape to treat some examples which will turn out to be useful
later.

Example 3.2.9 Fibonacci and the golden ratio


In this example we want to treat the notion of recursively defined sequences and
explain how one sometimes can calculate limits for this type of sequences. We
will simultaneously illustrate 3.2.2.
A sequence (an)n∈N0 is defined by a simple recursion formula if there is a function
f such that an+1 = f(an) for every n ∈ N0. This means that (an)n is specified by
a0 and the function f (where we tacitly assume that the domain of f contains
a0 and the range of f, so that f(an) always is defined).
Similarly, if f depends on two variables, one sometimes can define a two-step
recursion formula by giving a0 , a1 and setting an+1 = f (an−1 , an ) for n ≥ 1.
The reader is invited to invent the notion of a k -step recursive formula.
Now we look at our example: the Fibonacci sequence. This is given by the two-
step formula
F0 = F1 = 1, Fn+1 = Fn + Fn−1 for n ≥ 1.
It starts as
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, . . . .
But maybe for us more interesting than (Fn )n itself is the sequence

Qn := Fn /Fn−1 , n ∈ N.

It satisfies the simple recursion formula

Qn+1 = Fn+1 /Fn = (Fn + Fn−1 )/Fn = 1 + 1/Qn .


52 KAPITEL 3. SEQUENCES

It is desirable to prove convergence of this, because then by 3.1.9 we see that the
limit ℓ satisfies

ℓ = 1 + 1/ℓ,  i.e.  ℓ^2 − ℓ − 1 = 0,  i.e.  ℓ = (1 ± √5)/2,

and as Qn always is positive, we have ℓ = (1 + √5)/2.
Let us write down the first members of the sequence (Qn )n .
Q1 = 1, Q2 = 2, Q3 = 3/2, Q4 = 5/3, Q5 = 8/5, Q6 = 13/8, . . .
We see that
Q1 < Q3 < Q5 < Q6 < Q4 < Q2 .
Observe that

Qn+2 = 1 + 1/Qn+1 = (2Qn + 1)/(Qn + 1).

As now for all x > y > −1 we find (2x + 1)/(x + 1) > (2y + 1)/(y + 1), it follows
from Q1 < Q3 that Q3 < Q5 < Q7 < Q9 < . . . , i.e. the subsequence (Q2n−1)n∈N
is monotonically growing. It of course is bounded, because Qn ≥ 1 implies
Qn+2 = 1 + 1/(1 + 1/Qn) ≤ 2. Similarly, (Q2n)n is monotonically decreasing,
because Q2 > Q4 > Q6 > . . . , and is bounded from below by 1. Therefore both
subsequences converge by the monotonicity criterion 3.2.8, and using 3.1.9 for the
subsequences we see that the limit ℓodd = lim_{n→∞} Q2n−1 satisfies

ℓodd = (2 ℓodd + 1)/(ℓodd + 1),

and the same equation holds for ℓeven = lim_{n→∞} Q2n. The equation
x = (2x + 1)/(x + 1), however, gives after multiplication by 1 + x

x + x^2 = 2x + 1,  i.e.  x^2 − x − 1 = 0,

which in the end shows what we wanted.
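
A short numerical sketch (plain Python) of the two interleaved monotone
subsequences and their common limit:

    # Fibonacci quotients Q_n = F_n / F_{n-1} converging to the golden ratio.
    f_prev, f_cur = 1, 1                    # F_0 = F_1 = 1
    for n in range(1, 16):
        print(n, f_cur / f_prev)            # Q_n: odd n grow, even n shrink
        f_prev, f_cur = f_cur, f_prev + f_cur

    print("golden ratio:", (1 + 5 ** 0.5) / 2)   # 1.6180339887...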

Example 3.2.10 d -th roots


Let x ≥ 0 and d ∈ N be given. We want to show that there is a non-negative
number ℓ such that ℓ^d = x.
Following Bernoulli 1.3.8 we see that – writing a non-negative number as 1 + h
for some h ≥ −1 –

(1 + h)^d ≥ 1 + dh > x

as soon as h > (x − 1)/d. Therefore the set

S := {y ∈ R | y^d ≤ x}

is bounded from above by 1 + (x − 1)/d. Therefore for every n ∈ N the set of
rational numbers

Rn := {z · 10^{−n} | z ∈ Z, z · 10^{−n} ∈ S}

has a maximal element, because the numerators are bounded from above. Define
an := max(Rn).
As Rn ⊆ Rn+1, we find that (an)n is a monotonically growing sequence which
still is bounded by 1 + (x − 1)/d. Therefore by 3.2.8 this sequence has a limit
ℓ. It is clear that ℓ^d ≤ x, because by 3.1.10 lim_{n→∞} an^d = ℓ^d. It is also clear
that ℓ = max(S), because on the one hand ℓ ∈ S and on the other hand, for
every y ∈ S, truncating y at the n-th decimal gives an element bn in Rn, hence
bn ≤ an for all n, and as y = limn bn we see y ≤ ℓ.
We want to show that ℓ^d = x. We now look at the sequence (ℓ + 1/n)n and
remember that all elements ℓ + 1/n are outside S because they are larger than
ℓ = max(S). Therefore (ℓ + 1/n)^d > x, and as ℓ + 1/n →n ℓ we again use 3.1.10
and see ℓ^d ≥ x, showing equality.
This number ℓ ≥ 0 with ℓ^d = x is unique, because if ℓ < m we have x = ℓ^d < m^d,
and if m < ℓ we have m^d < ℓ^d = x. We therefore have defined a function

R≥0 → R≥0,  x ↦ x^{1/d},

such that (x^{1/d})^d = x for every x ≥ 0. This is called the d-th root of x.
Due to the bound on the set S coming from Bernoulli's inequality one sees that
for every x ≥ 1 one has

lim_{d→∞} x^{1/d} = 1,

because the d-th root is at least 1 and at most 1 + (x − 1)/d, which converges
to 1.
As (1/x)^{1/d} obviously equals 1/(x^{1/d}), the same limit holds for all x ∈ (0, 1].
For x = 0, x^{1/d} = 0 for every d ∈ N.
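
The construction can be imitated numerically; the sketch below computes the
truncations an = max(Rn) for given x and d (floating point arithmetic is used,
so only a few digits should be trusted).

    # a_n = largest decimal number with n digits after the point whose
    # d-th power is still <= x (the sequence from Example 3.2.10).
    def root_truncations(x, d, digits):
        approximations = []
        z = 0
        for n in range(1, digits + 1):
            z *= 10                            # refine the last truncation
            while ((z + 1) / 10 ** n) ** d <= x:
                z += 1
            approximations.append(z / 10 ** n)
        return approximations

    print(root_truncations(2.0, 2, 8))   # 1.4, 1.41, 1.414, ... -> sqrt(2)
    print(2.0 ** 0.5)                    # built-in comparison value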

Example 3.2.11 The exponential function


Let x ≥ 0 be a non-negative real number. We define the sequence (an)n by
an = Σ_{k=0}^{n} x^k/k!. This is a monotonically growing sequence, as all summands
are non-negative. For natural N ≥ x and k > N we have

x^k/k! = (x^N/N!) · x^{k−N}/((N + 1) · . . . · k) ≤ (x^N/N!) · (x/(N + 1))^{k−N}.

As q := x/(N + 1) satisfies 0 ≤ q < 1, we see that (for n > N)

an = Σ_{k=0}^{n} x^k/k! ≤ Σ_{k=0}^{N} x^k/k! + (x^N/N!) · (1 + q + q^2 + q^3 + . . . + q^{n−N})
   ≤ Σ_{k=0}^{N} x^k/k! + (x^N/N!) · 1/(1 − q).

Therefore, the sequence (an)n is bounded and hence convergent.

By a similar calculation, the sequence defined by this recipe for x < 0 also is
bounded, but now it is no longer monotonic. That it nevertheless converges can
either be shown by the Leibniz criterion (coming soon) or by showing that it
has only one accumulation point. Let x < 0 be given and an = Σ_{k=0}^{n} x^k/k!.
Due to the convergence for positive x we have for every ε > 0 that there is an
N ∈ N such that for all n > m ≥ N we have

|an − am| = |Σ_{k=m+1}^{n} x^k/k!| ≤ Σ_{k=m+1}^{n} |x|^k/k! ≤ ε.

This shows that no two different accumulation points can exist. A similar argu-
ment will later be used in the context of “absolute convergence” of series.
We now define the function exp : R → R by

exp(x) = lim_{n→∞} Σ_{k=0}^{n} x^k/k!.

This is called the exponential function and will play an important role again and
again.
In particular one sets e := exp(1) and can calculate good approximations to
this number – called Euler's number – as the sequence converges rapidly
towards its limit:

e = lim_{n→∞} Σ_{k=0}^{n} 1/k! = 2.718281828459045 . . .

Unfortunately we cannot yet justify the wording, but will do so later.
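
A quick numerical sketch of the rapid convergence (plain Python; math.factorial
is from the standard library):

    # Partial sums of the exponential series from Example 3.2.11.
    import math

    def exp_partial_sum(x, n):
        return sum(x ** k / math.factorial(k) for k in range(n + 1))

    for n in (5, 10, 15):
        print(n, exp_partial_sum(1.0, n))      # approaches e quickly
    print("math.e =", math.e)                  # 2.718281828459045
    print(exp_partial_sum(-1.0, 20), 1 / math.e)   # the case x < 0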
Kapitel 4

Continuity

4.1 Basic definitions and examples


Definition 4.1.1 Continuity
Let D ⊆ C be a set and f : D → C a function. Given a point z0 ∈ D we say
that f is continuous at z0 , if for every sequence (an )n , an ∈ D, converging to
z0 the sequence (f (an ))n converges to f (z0 ) :

lim_{n→∞} f(an) = f(z0).

If f is continuous at every z0 ∈ D we just say that f is continuous.


As every real function on a subset D of the real numbers also is a complex
function on a subset of C, we do not have to invent another definition there.

Many phenomena in nature, in particular in classical mechanics, are based on
continuity of functions: “natura non facit saltus” – nature does not make jumps –
is one of the leading principles of ancient natural philosophy, and it influenced
Newton and Leibniz in inventing infinitesimal calculus.
That there are also discontinuous functions which play an enormous role in mo-
dern science (Chaos Theory, Quantum Mechanics) will not be our main concern.

Definition/Remark 4.1.2 Accumulation points of sets and limits


There is the notion of an accumulation point for subsets of C. Let S ⊆ C be
a subset. A complex number z is called an accumulation point of S if for every
ε > 0 there exist infinitely many s ∈ S with |z − s| < ε.
Note that an accumulation point need not be an element of S. For example, 1
is an accumulation point of the open interval (0, 1), as the sequence ((n − 1)/n)n
has values in this interval and converges to 1.


If S is the set of values of a complex sequence and z ∈ C is an accumulation


point of S, then it also is an accumulation point of the sequence. However, the
converse need not be the case, because, e.g., 0 is an accumulation point of the
constant sequence (0, 0, 0, . . .) but not of the set {0} of all values of this sequence.
The correct statement is: z ∈ C is an accumulation point of S ⊆ C if and only
if z is the limit of some sequence with values in S r {z}.
If f : S → C is a function and z0 an accumulation point of S, we call ℓ ∈ C
the limit of f(s) as s goes to z0, if for all sequences (sn)n in S converging to
z0 we have

lim_{n→∞} f(sn) = ℓ.

We then write

lim_{s→z0} f(s) = ℓ,

where we tacitly imply that all s under consideration lie in S, so that f(s) is
defined.
Of course there are examples where the said limit does not exist at all. For
instance lim_{x→0} sin(1/x) does not exist, as, e.g., the limit of the sequence
(sin(1/xn))n with xn = 2/(πn) does not exist.
If z0 ∈ S is an accumulation point of S then f is continuous at z0 if and only
if lims→z0 f (s) = f (z0 ).

Lemma 4.1.3 Another point of view


Given D ⊆ C and f : D → C, f is continuous at z0 ∈ D if and only if for
every ε > 0 there exists a δ > 0 such that

for all z ∈ D with |z − z0| ≤ δ we have |f(z) − f(z0)| ≤ ε.

Proof. If the new condition is satisfied and (sn)n is a sequence in D converging
to z0, then – given an arbitrary ε > 0 and a suitable δ > 0 as in the statement –
we have |sn − z0| ≤ δ for almost all n and therefore, by the choice of δ,
|f(sn) − f(z0)| ≤ ε for almost all n. As this is true for every ε, we have the
desired convergence of (f(sn))n to f(z0).
In the other direction, if f is continuous at z0, we prove the asserted property
by contradiction. We hence assume that there is some ε > 0 such that for every
δ > 0 there exists at least one z ∈ D with |z − z0| ≤ δ but |f(z) − f(z0)| > ε.
In particular, there is an ε > 0 such that for every natural number n there is an
sn ∈ D with

|sn − z0| ≤ 1/n, but |f(sn) − f(z0)| > ε.

But as lim_{n→∞} sn = z0, this clearly contradicts the continuity of f at z0.

This lemma gives a precise meaning to the vague statement that f is continuous
at z0 if f (z) is close to f (z0 ) as soon as z ∈ D is close enough to z0 .

Example 4.1.4 Some examples

a) The function

| · | : C → R, z ↦ |z|,

is continuous. Because: ||z| − |w|| ≤ |z − w| by 1.2.4. This means: for fixed
z0 and any sequence (sn)n converging to z0 we find that (|z0| − |sn|)n is a
null sequence, whence |sn| converges to |z0|.
b) The function

f : R → R,  f(x) = 0 if x ∈ Q,  f(x) = 1 if x ∉ Q,

is nowhere continuous. Because: for every x0 ∈ R and every δ > 0 there
are rational and irrational numbers in the interval (x0 − δ, x0 + δ) due to
1.3.5, and therefore in every such interval both values 0 and 1 are taken by
f infinitely often.
Similarly, for this f, the function g(x) = f(x) · x^2 is continuous only at
x0 = 0, and nowhere else.
c) From 3.1.10 we remember that – without having had the terminology then
– we verified that polynomial functions always are continuous everywhere.
d) From 3.1.9 we see that sums and products of continuous functions defined
on the same domain are continuous again. The same holds for quotients, if
the divisor does not take the value 0 anywhere.

4.2 Two Important Theorems


Definition 4.2.1 Closed Sets and Compact Sets

(a) A subset A ⊆ C is called closed if for every convergent sequence (an )n in


A we have limn→∞ an ∈ A.
This implies that for every z ∈ C r A there exists an ε > 0 with Bε(z) ∩
A = ∅ (compare 3.1.8 for the notation). Because: otherwise we would find
elements of A arbitrarily close to z ∉ A, which gives a convergent sequence
in A with limit outside A.
Indeed, both conditions are even equivalent.
Every closed interval [a, b] is closed. Open bounded intervals (a, b) are not
closed.

(b) A subset C ⊂ C is called compact if it is closed and bounded.

Lemma 4.2.2 Real compact sets have a maximum


If C ⊂ R is compact and non-empty, then C has a maximal and a minimal
element.

Proof. We prove the existence of the maximum. The minimum goes similarly. As
C is bounded, there exists an upper bound b0 of C. As C is non-empty, there
is an element c0 ∈ C. We have c0 ≤ b0 and successively define two sequences:
(bn )n will be a decreasing sequence of upper bounds of C and (cn )n an increasing
sequence of elements of C such that |bn − cn | is a null sequence.
If b0 ≥ b1 ≥ . . . ≥ bn and c0 ≤ c1 ≤ . . . ≤ cn are already defined, we look at
x = (bn + cn)/2. If x is an upper bound for C, we set cn+1 = cn, bn+1 = x. If it
is not an upper bound for C, there exists a cn+1 ∈ C with x < cn+1, and we set
bn+1 = bn. In both cases |bn+1 − cn+1| ≤ (1/2)|bn − cn|, which makes the
difference a null sequence.
As both sequences are monotonic and bounded, they converge by the monotoni-
city criterion 3.2.8, and as their difference is a null sequence, they converge to the
same element M. But as C is closed and cn ∈ C, we have M ∈ C. On the other
hand, as M is a limit of upper bounds, it also is an upper bound. Therefore, M
is the maximum of C.

Remark 4.2.3 Struggling alone does not help

(a) If D ⊆ R is bounded and f : D → R is continuous, the range f(D) of f
need not be bounded. This is shown by the example D = (0, 1), f(x) = 1/x.
The range is (1, ∞), hence unbounded.

(b) If D ⊆ R is closed and f : D → R continuous, then its range f(D) need
not be closed. This is shown by the example D = [1, ∞), f(x) = 1/x again.
Now the range is (0, 1] and has no minimum, as lim_{n→∞} 1/n = 0 ∉ f(D).

The miracle now is that the range of a continuous function on a compact domain
is compact.

Theorem 4.2.4 Compactness is preserved by continuous functions


Let C ⊂ C be compact and f : C → C continuous. Then the range f (C) is
compact.

Proof. We have to show that f (C) is both bounded and closed.



Boundedness: Assume that f(C) is not bounded. Then for each n ∈ N there
exists an element cn ∈ C with |f(cn)| ≥ n. As C is bounded, (cn)n has a
convergent subsequence by Bolzano-Weierstraß, 3.2.5. Without loss of generality
we may assume (cn)n itself to be convergent. As C is closed, the limit ℓ = limn cn
also belongs to C, but then limn f(cn) = f(ℓ), which contradicts |f(cn)| ≥ n >
|f(ℓ)| + 1, which holds for large n. This shows boundedness of f(C).
Closedness: If (an)n is a convergent sequence in f(C), we choose cn ∈ C such
that f(cn) = an for every n. Again, (cn)n contains a convergent subsequence
(cnk)k, and then

lim_n an = lim_k ank = lim_k f(cnk) = f(lim_k cnk) ∈ f(C),

as lim_k cnk ∈ C, showing that f(C) is closed.

Consequence 4.2.5 Existence of maxima


Let C ⊆ C be compact and f : C → R continuous. Then there exists a c+ ∈ C
such that for all x ∈ C, f(x) ≤ f(c+), and there exists a c− ∈ C such that for
all x ∈ C, f(x) ≥ f(c−).

Proof. As the range of f is compact, it has a maximum and a minimum by 4.2.2.


Choose c+ such that f (c+ ) is the maximum of the range and c− such that f (c− )
is the minimum.

Remark 4.2.6 Optimization


The consequence just stated is the source for many applications of mathematics.
Many applications are concerned with “optimizing” certain parameters, and this
often means you have to find maxima and minima of values of certain functions.
If these functions are continuous and real-valued and the domain is compact, then
it at least is certain, that such extremal values exist. Of course, this sometimes
(more often than not) does not help at all in finding extremal values. It only says
that somewhere in the mist there should be this point of maximality, shrouded
in the circumstances of real life. . .

Another basic property of continuous functions is special for the real case. To
formulate it smoothly we say that a real number x lies between two other real
numbers y, z if y ≤ x ≤ z or z ≤ x ≤ y, i.e. x ∈ [min(y, z), max(y, z)].

Theorem 4.2.7 Intermediate value theorem


Let a ≤ b be real numbers and f : [a, b] → R a continuous function and v a
number between f (a) and f (b).
Then there exists an x0 ∈ [a, b] with f (x0 ) = v.

Proof. Without loss of generality, we assume that f (a) ≤ f (b).


If v = f (a) or v = f (b) we can take x0 = a or x0 = b. We therefore may assume

f (a) < v < f (b).

Define

S := {x ∈ [a, b] | f(x) ≤ v}.

Then S is closed, because if (sn)n is a convergent sequence in S with limit ℓ,
continuity implies f(ℓ) = limn f(sn) ≤ v, i.e. ℓ ∈ S. As S ⊂ [a, b], S is bounded,
hence it is compact and therefore has a maximum x0 by 4.2.2. As x0 ∈ S, we
have f(x0) ≤ v.
As v < f(b), we have x0 < b, so there are sequences (tn)n in (x0, b] converging
to x0. But tn > x0 implies f(tn) > v, and therefore – by continuity again –
f(x0) = limn f(tn) ≥ v. This shows f(x0) = v.

Example 4.2.8 Root functions again


We already made a similar construction as in the proof of the intermediate value
theorem when we proved much earlier, in 3.2.10, that for every positive number
x and every d ∈ N a d-th root exists. The argument given then was an ad hoc
combination of several proofs we gave in this section, tailor-made for the situa-
tion. As x lies between 0 and (1 + (x − 1)/d)^d and y ↦ y^d is continuous on
[0, 1 + (x − 1)/d], there is a y in this interval with y^d = x. Therefore the function
f : R≥0 → R≥0, f(y) = y^d, is surjective and – due to monotonicity – injective,
hence there is an inverse function, namely the d-th root x ↦ x^{1/d}.

Example 4.2.9 Roots of polynomials


If f is a continuous real-valued function on an interval I , the intermediate value
theorem can be reformulated in the following way: the range of f, f (I), also is
an interval.
In particular, if for two elements x, y ∈ I we have f(x) > 0 and f(y) < 0, then
there is an x0 between x and y with f(x0) = 0. Nature does not make a jump
from the positive to the negative real numbers, it goes through zero.
In particular, if f is a polynomial function on R of odd degree, then
lim_{x→∞} f(−x)/f(x) = −1, and therefore there has to be a sign change and hence
a real root of f. As, for every constant c, f − c also is a polynomial of odd degree,
it also has a root, which means in turn that f(x1) = c for some x1: f is surjective.
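
The proof of the intermediate value theorem is non-constructive, but it suggests
the classical bisection scheme: halving an interval on which f changes sign again
and again encloses a zero. A minimal sketch in Python, for the sample polynomial
f(x) = x^3 − x − 1 of odd degree:

    # Bisection in the spirit of Theorem 4.2.7.
    def bisect(f, a, b, steps=50):
        assert f(a) * f(b) <= 0, "f must change sign on [a, b]"
        for _ in range(steps):
            mid = (a + b) / 2
            if f(a) * f(mid) <= 0:
                b = mid                 # sign change in the left half
            else:
                a = mid                 # sign change in the right half
        return (a + b) / 2

    f = lambda x: x ** 3 - x - 1        # f(1) < 0 < f(2)
    print(bisect(f, 1.0, 2.0))          # approx. 1.3247179572...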

Consequence 4.2.10 Injectivity vs. monotonicity


Let I be an interval and f : I → R a continuous function.
Then f is injective if and only if it is strictly monotonic.
4.2. TWO IMPORTANT THEOREMS 61

Proof. If f is strictly monotonic, then it is injective, because if x < y ∈ I are
given, then f(x) < f(y) if f is strictly monotonically growing, or f(x) > f(y)
if it is strictly monotonically decreasing. In either case, f(x) ≠ f(y). We do not
need that I is an interval for that direction.
Let now f be injective. We have to show that for any three elements x < y < z
in I, f(y) lies (strictly) between f(x) and f(z). Note that for any three distinct
real numbers, exactly one of them lies between the others. Now if f(z) were to
lie between f(x) and f(y), by 4.2.7 we would find an x0 ∈ [x, y] with f(x0) = f(z),
which contradicts injectivity. Similarly, if f(x) lies between f(y) and f(z), we
would have an x0 ∈ [y, z] with f(x) = f(x0). Therefore the only possibility left
is that f(y) lies between f(x) and f(z).
Kapitel 5

Series

5.1 Infinite sums


Remark 5.1.1 Another view on sequences
When we discussed sequences, we realized in 3.1.5 that for every convergent
sequence (an)n≥0 the sequence of differences (an+1 − an)n is a null sequence.
Setting a−1 = 0 and di = ai − ai−1 for i ∈ N0 we get

an = Σ_{i=0}^{n} di

for every n, and it sometimes is very helpful to write sequences in this way, with
given numbers di . This leads to the definition of series.

Definition 5.1.2 Series


Given a sequence (an)n≥0 of complex numbers, we call

sn = Σ_{i=0}^{n} ai

the n-th partial sum of (an)n. The sequence of partial sums (sn)n is called the
(infinite) series defined by (an)n, and if it converges to a limit ℓ, we write

ℓ = Σ_{i=0}^{∞} ai.

As recalled in the introductory remark, this series can only converge if (an) =
(sn − sn−1) is a null sequence. That this is not sufficient for convergence will be
exemplified in one of the coming examples.
One often also says (by abuse of terminology) that the infinite series Σ_{k=0}^{∞} ak
does not converge if the sequence of partial sums does not converge.


Example 5.1.3 Exponential series, geometric series, harmonic series

(a) The exponential series

exp(x) = lim_{n→∞} Σ_{k=0}^{n} x^k/k!

converges for every x ∈ R (cf. 3.2.11). We will see in 5.1.5, combined with
5.1.6, that it even converges for every x ∈ C.
(b) The geometric series Σ_{n=0}^{∞} q^n converges if and only if |q| < 1, as we saw
in 3.1.6. For such q we now may write

Σ_{i=0}^{∞} q^i = 1/(1 − q).
(c) The harmonic series Σ_{n=1}^{∞} 1/n does not converge, although the summands
form a null sequence. This is a consequence of the fact that for the partial
sums sn = Σ_{k=1}^{n} 1/k we see that

s_{2^n} − s_{2^{n−1}} = Σ_{k=2^{n−1}+1}^{2^n} 1/k ≥ 2^{n−1} · 1/2^n = 1/2,

showing s_{2^n} ≥ n/2, so that (sn) is unbounded – contradicting convergence.
This can be more pictorially presented as

1 + 1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8) + . . . + (1/(2^{n−1} + 1) + . . . + 1/2^n) + . . . ,

where every bracket contributes at least 1/2 to the sum.
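
A small numerical sketch (plain Python) of how slowly these partial sums grow,
compared with the lower bound n/2:

    # Partial sums s_{2^k} of the harmonic series (Example 5.1.3 (c)):
    # they grow beyond every bound, but only like k/2.
    s, n = 0.0, 0
    for k in range(21):
        while n < 2 ** k:
            n += 1
            s += 1.0 / n
        print(k, s, "bound:", k / 2)       # s_{2^k} >= k/2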

Definition 5.1.4 Absolute convergence


Given a sequence (an)n of complex numbers, we say that the series Σ_{n=0}^{∞} an
converges absolutely if the series

Σ_{n=0}^{∞} |an|

converges.
As the sequence of partial sums of the |an| is real and monotonically growing,
by the Monotonicity Criterion 3.2.8 this is equivalent to

(Σ_{n=0}^{k} |an|)k is bounded,

which might be much easier to decide.



Example 5.1.5 Exponential series


As |z^n| = |z|^n for every z ∈ C and n ∈ N0, we see that the exponential series
exp(z) = Σ_{n=0}^{∞} z^n/n! is absolutely convergent for every z ∈ C, which follows
from 3.2.11.
The same observation holds for the geometric series Σ_{n=0}^{∞} q^n in the case |q| < 1.
Note that we did not yet show that exp(z) converges for every z ∈ C. This will,
however, follow from its absolute convergence and the following lemma.

Lemma 5.1.6 Absolute convergence implies convergence


Let (an)n≥0 be a complex sequence.
If the series Σ_{n=0}^{∞} |an| converges, then Σ_{n=0}^{∞} an also converges.

Proof. Let sn = Σ_{k=0}^{n} ak be the n-th partial sum of (ak)k and tn = Σ_{k=0}^{n} |ak|
the n-th partial sum of (|ak|)k.
As (tn)n converges, the (real) sequence (tn) is bounded by some R > 0. By the
triangle inequality,

|sn| = |Σ_{k=0}^{n} ak| ≤ Σ_{k=0}^{n} |ak| = tn ≤ R,

so (sn)n also is bounded and therefore has an accumulation point by
Bolzano-Weierstraß, 3.2.5.
As (by the triangle inequality again) for n < m, and due to the convergence of
(tn)n,

|sm − sn| = |Σ_{k=n+1}^{m} ak| ≤ Σ_{k=n+1}^{m} |ak| = |tm − tn| ≤ Σ_{k=n+1}^{∞} |ak| →n→∞ 0,

there can only be one accumulation point of (sn)n.

Because: if ℓ, ℓ′ were two such accumulation points with |ℓ − ℓ′| = d > 0, there
would be infinitely many values for n and m such that

|ℓ − sn|, |ℓ′ − sm| ≤ d/3,

resulting in

|sn − sm| ≥ d/3

infinitely often, which we just disproved.
As (sn)n is bounded and has exactly one accumulation point, it is convergent by
3.2.6.

Theorem 5.1.7 Cauchy Convolution Formula


Let (an)n≥0, (bn)n≥0 be two complex sequences such that

Σ_n an and Σ_n bn both converge absolutely.

We define cn = Σ_{i=0}^{n} ai b_{n−i}.
Then Σ_n cn converges absolutely and we have

Σ_{n=0}^{∞} cn = (Σ_{n=0}^{∞} an) · (Σ_{n=0}^{∞} bn).

Proof. Let R be a common upper bound for Σ_{n=0}^{∞} |an| and Σ_{n=0}^{∞} |bn|.
Then for every k,

Σ_{n=0}^{k} |cn| ≤ Σ_{m,n: m+n≤k} |am| · |bn| ≤ (Σ_{m=0}^{k} |am|) · (Σ_{n=0}^{k} |bn|) ≤ R^2,

so the partial sums are bounded, showing absolute convergence.


We now show the identity stated in the theorem.
We will use that for every ε > 0 there is an N ∈ N such that

Σ_{n=N}^{∞} |an|, Σ_{n=N}^{∞} |bn| ≤ ε.

We know from 3.1.9 that the product on the right hand side of the claimed
identity is the limit of the products of the partial sums:

(Σ_{n=0}^{∞} an) · (Σ_{n=0}^{∞} bn) = lim_{k→∞} (Σ_{n=0}^{k} an) · (Σ_{n=0}^{k} bn)
                                = lim_{k→∞} Σ_{0≤m,n≤k} am bn.

On the other hand,

Σ_{n=0}^{2k} cn = Σ_{m,n≥0, m+n≤2k} am bn.

We get

|Σ_{n=0}^{2k} cn − Σ_{n,m=0}^{k} am bn| = |Σ_{(m,n)∈Tk} am bn|,

where

Tk = {(m, n) ∈ N0 × N0 | m + n ≤ 2k and (m > k or n > k)}.
For given ε > 0 and large enough k, Σ_{m=k+1}^{∞} |am| ≤ ε/(2R) and
Σ_{n=k+1}^{∞} |bn| ≤ ε/(2R). Therefore, using the triangle inequality, we have

|Σ_{(m,n)∈Tk} am bn| ≤ Σ_{(m,n)∈Tk} |am| · |bn|
                    ≤ (Σ_{m=k+1}^{∞} |am|) · (Σ_{n=0}^{∞} |bn|) + (Σ_{m=0}^{∞} |am|) · (Σ_{n=k+1}^{∞} |bn|)
                    ≤ 2 · R · ε/(2R) = ε,

showing in combination with the last identity that the two sequences have the
same limit.
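
A numerical sketch of the convolution formula (plain Python), for two geometric
series with ratio q = 0.3, where cn = (n + 1) q^n and the product of the sums is
1/(1 − q)^2:

    # Cauchy product of the geometric series with itself (Theorem 5.1.7).
    q, N = 0.3, 60                      # 60 terms suffice in double precision
    a = [q ** n for n in range(N)]
    c = [sum(a[i] * a[n - i] for i in range(n + 1)) for n in range(N)]

    print(sum(c))                       # the convolved series, summed
    print(sum(a) ** 2)                  # product of the two series
    print(1 / (1 - q) ** 2)             # closed form, approx. 2.0408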

Consequence 5.1.8 Functional equation of the exponential function


We already know from 5.1.5 that for every complex number z the series

exp(z) = Σ_{n=0}^{∞} z^n/n!

converges absolutely.
Using the Cauchy Convolution Formula and the Binomial Formula 2.1.8, this
implies for all z, w ∈ C

exp(z) · exp(w) = (Σ_{m=0}^{∞} z^m/m!) · (Σ_{n=0}^{∞} w^n/n!)
                = Σ_{k=0}^{∞} Σ_{m=0}^{k} z^m w^{k−m}/(m! (k − m)!)
                = Σ_{k=0}^{∞} (Σ_{m=0}^{k} (k choose m) z^m w^{k−m})/k!
                = Σ_{k=0}^{∞} (z + w)^k/k!
                = exp(z + w).

This identity is called the functional equation of the exponential function. We will
use it again and again.
First of all it shows that for natural numbers d

exp(d) = exp(Σ_{i=1}^{d} 1) = exp(1)^d = e^d,

and exp(−1) exp(1) = exp(0) = 1, which shows exp(−1) = 1/exp(1) = e^{−1}.
As exp(x) > 0 for positive x is obvious, we also see that exp(−x) = 1/exp(x) > 0
as well, which is not at all obvious from the exponential series. Therefore, on
real numbers, exp only takes positive real values.
Moreover, as we have exp(x/d)^d = exp(d · x/d) = exp(x) for fixed d ∈ N, we see
that exp(x/d) is the unique positive d-th root of exp(x), cf. 3.2.10.
Therefore for rational numbers r, exp(r) = e^r.

We also see that on real numbers the exponential function is monotonically in-
creasing: for all real numbers x < y,

exp(y) = exp(x) · exp(y − x),

and as exp(x) > 0 and exp(y − x) = Σ_{k=0}^{∞} (y − x)^k/k! ≥ 1 + (y − x) > 1
(y − x being positive), we see

exp(y) > exp(x).

For complex arguments this is meaningless, and we will even see that on C the
function exp never takes the value 0, while every non-zero value is taken infinitely
often.

Example 5.1.9 Not every convergent series is absolutely convergent


We know from 5.1.3 (c) that the harmonic series Σ_{k=1}^{∞} 1/k is not convergent.
However, the series

Σ_{k=1}^{∞} (−1)^{k+1}/k = 1 − 1/2 + 1/3 − 1/4 + 1/5 − . . . ± . . .

indeed does converge. This is a special case of the Leibniz criterion:

Theorem 5.1.10 Leibniz criterion


Let (an)n≥0 be a monotonically decreasing sequence of non-negative real numbers
with lim_{n→∞} an = 0.
Then the series

Σ_{n=0}^{∞} (−1)^n an = a0 − a1 + a2 − a3 + . . .

does converge. For the limit ℓ we find, for every k,

|ℓ − Σ_{n=0}^{k} (−1)^n an| ≤ ak+1.

Proof. We study the sequence of partial sums

sk = Σ_{n=0}^{k} (−1)^n an.

For even k we see that

sk+2 = sk − ak+1 + ak+2 = sk − (ak+1 − ak+2) ≤ sk,

as ak+2 ≤ ak+1.
For odd k we find

sk+2 = sk + ak+1 − ak+2 ≥ sk

for the same reason.
As s0 = a0 ≥ 0 and s1 = a0 − a1 ≥ 0, we see, due to this monotonicity, sk ≥ 0
for odd k. As then sk+1 = sk + ak+1, this is also true for all even k. Similarly,
for all even k we see sk ≤ a0, and this then also holds for sk+1 = sk − ak+1.
Therefore, all partial sums sk satisfy

0 ≤ sk ≤ a0,

and by their monotonicity, the subsequences

(s2n)n and (s2n−1)n

converge. As their difference is a null sequence (only here we need that an
converges to 0), they have the same limit. As in 3.2.2 we now see that this limit
is the only accumulation point ℓ of (sk)k and therefore sk converges.
Due to the monotonicity of the odd and even subsequences of partial sums, ℓ
lies between sk and sk+1, showing

|ℓ − sk| ≤ |sk+1 − sk| = ak+1,

as desired.
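
A numerical sketch of the error bound, for the alternating harmonic series from
5.1.9 (its limit is log 2, a fact we take on faith here from later material):

    # Leibniz bound: the distance to the limit is at most the next summand.
    import math

    limit = math.log(2)                 # limit of 1 - 1/2 + 1/3 - ...
    s = 0.0
    for k in range(1, 31):
        s += (-1) ** (k + 1) / k
        assert abs(limit - s) <= 1 / (k + 1)   # the bound a_{k+1}
    print(s, "error:", abs(limit - s), "bound:", 1 / 31)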

Remark 5.1.11 Warning: Observe the order!


As now we have our first example of a convergent series which does not converge
absolutely, we should warn that in this situation the given order of the summands
is absolutely essential. In fact, if the summands are real and the convergence is
not absolute, one can rearrange the summands by permuting the indices so that
the reordered summation converges to any given real number!!! The reason is
that the series over the positive summands and that over the negative summands
both do diverge.
This phenomenon does not happen in the case of absolute convergence. We will
not prove this here.

5.2 Criteria for absolute convergence


Lemma 5.2.1 Comparison test
If (an)n is a complex sequence and (bn)n a real sequence such that

|an| ≤ bn for almost all n and Σ_{n=0}^{∞} bn converges,

then Σ_{n=0}^{∞} an converges absolutely.

Proof. As |an| ≤ bn for all n ≥ N for some suitable N, we find that Σ_{n=0}^{k} |an|
is bounded by Σ_{n=0}^{N} |an| + Σ_{n=N+1}^{∞} bn, which is finite as the series over
the bn converges.

Lemma 5.2.2 Sums of series


If (an)n, (bn)n are two complex sequences such that Σ_{n=0}^{∞} an and Σ_{n=0}^{∞} bn
converge, then for any complex numbers w, z we have

Σ_{n=0}^{∞} (w an + z bn) = w Σ_{n=0}^{∞} an + z Σ_{n=0}^{∞} bn.

Proof. Looking at the sequences of partial sums, this is clear from 3.1.9.

Theorem 5.2.3 Ratio test


Let (an)n be a complex sequence with an ≠ 0 for all n. If there is a real positive
q < 1 satisfying

|an+1|/|an| ≤ q for almost all n,

then Σ_{n=0}^{∞} an converges absolutely.

Proof. Choose N ∈ N such that |an+1|/|an| ≤ q for all n ≥ N.
Then by mathematical induction for all k ∈ N0 we have

|aN+k| ≤ q^k |aN|,

which shows absolute convergence by 5.2.1, using the convergence of the geometric
series.

Remark 5.2.4 Special case


If in the situation of the last theorem we have

ℓ = lim_{n→∞} |an+1/an| < 1,

then for almost all n we have

|an+1/an| ≤ (1 + ℓ)/2 < 1,

proving convergence of the series by 5.2.3.

Theorem 5.2.5 Root test


Let (an)n be a complex sequence. If there is a real positive q < 1 satisfying

|an|^{1/n} ≤ q for almost all n,

then Σ_{n=0}^{∞} an converges absolutely.

Proof. From the hypothesis we find

|an| ≤ q^n

for almost all n, which shows absolute convergence by 5.2.1, using the conver-
gence of the geometric series.

Remark 5.2.6 Special case


Again, if

lim_{n→∞} |an|^{1/n} < 1,

the conditions in the last Theorem are satisfied.


Note, however, that this limit often will not exist.

The most important application of these theorems will be the topic of the next
section.

5.3 Power Series


Definition 5.3.1 Power series
A power series is a series of the shape

Σ_{n=0}^{∞} cn x^n,

where the cn ∈ C are the coefficients and x is the variable. The set of all x for
which this series converges is called its domain of convergence.

Example 5.3.2 Exponential function and geometric series


We defined the exponential function as a power series. It converges for all z ∈ C.
The geometric series Σ_{n=0}^{∞} x^n is a power series which converges for all x with
|x| < 1 and has 1/(1 − x) as its limit there.
More generally, looking at f : C r {0} → C, f(z) = 1/z, and choosing some
z0 ≠ 0, for every h with |h| < |z0| we have that

f(z0 + h) = 1/(z0 + h) = (1/z0) · 1/(1 − (−h/z0)) = (1/z0) Σ_{n=0}^{∞} (−1/z0)^n h^n

is a convergent power series in the variable h.


If a function f can be represented as a power series around x0 , i.e. there exists
an r > 0 such that

X
f (x0 + h) = cn hn for all h with |h| < r,
n=0

then f is called analytic at x0 . This is a very strong condition, analytic functions


are very special, but for instance the important exponential function, the function
x 7→ 1/x and every polynomial function (almost all coeffcients in the power series
are 0) and also rational functions belong to this class.
Sometimes, if you need a function with a certain property, it might be possible
to define it as a power series (more often, however, this will not work:-( ).

Theorem 5.3.3 Convergence behaviour of power series


Let f(x) = Σ_{n=0}^{∞} cn x^n be a power series which converges for some z ∈ C,
z ≠ 0. Then f(w) converges absolutely for every w ∈ C with |w| < |z|.

Proof. As f(z) converges, the sequence (|cn z^n|)n is a null sequence, and in par-
ticular it is bounded from above by some R > 0.
As |w| < |z|, we have q := |w/z| < 1. Then

|cn w^n|^{1/n} = (|cn z^n| q^n)^{1/n} ≤ R^{1/n} · q →n→∞ q < 1,

and hence

|cn w^n|^{1/n} ≤ (1 + q)/2 < 1 for almost all n,

which shows absolute convergence by the Root Test 5.2.5.

Consequence 5.3.4 Radius of convergence


Let f(x) = Σ_{n=0}^{∞} cn x^n be a power series.
Then either f(z) converges absolutely for every z ∈ C, or there exists some real
ρ ≥ 0 such that f(z) converges absolutely for every z ∈ C with |z| < ρ and
diverges for every z ∈ C with |z| > ρ.
This number ρ is called the radius of convergence of f, and in the first case we
formally set ρ = ∞.
The exponential series has radius of convergence ∞, the geometric series has
radius of convergence 1.
The series Σ_{n=1}^{∞} x^n/n converges for x = −1 (cf. 5.1.9) and diverges for x = 1
(harmonic series). Therefore its radius of convergence simultaneously is ≥ 1 and
≤ 1; it therefore is 1.
Note that f(0) always converges and we have f(0) = c0.
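
A crude numerical sketch (plain Python): when the limit of |cn|^{1/n} exists, it
equals 1/ρ, so evaluating this root for a large n hints at the radius of convergence
(this is only a heuristic; the limit need not exist in general).

    # Root-test estimates of 1/rho for three power series at n = 40.
    import math

    n = 40
    print((1 / math.factorial(n)) ** (1 / n))   # exp series: tends to 0
    print(1.0 ** (1 / n))                       # geometric: tends to 1
    print((1 / n) ** (1 / n))                   # sum x^n/n: tends to 1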

Lemma 5.3.5 Expansion around z0


Let f(x) = Σ_{n=0}^{∞} cn x^n be a power series with convergence radius ρ and z0 ∈ C
a number with |z0| < ρ.
Then f(x) is analytic at z0.

Proof. We have to show that the function g(h) := f(z0 + h) can be written as a
power series in h (if |h| is small enough).
Let h ∈ C have absolute value |h| < ρ − |z0|. Then f(z0 + h) converges absolutely
and we have

f(z0 + h) = Σ_{n=0}^{∞} cn (z0 + h)^n = Σ_{n=0}^{∞} cn · (Σ_{k=0}^{n} (n choose k) h^k z0^{n−k}).

As this converges absolutely, we may rearrange the summands and put together
summands involving the same power of h, i.e.

f(z0 + h) = Σ_{k=0}^{∞} (Σ_{n=k}^{∞} cn (n choose k) z0^{n−k}) · h^k = Σ_{k=0}^{∞} c̃k h^k,

where c̃k = Σ_{n=k}^{∞} cn (n choose k) z0^{n−k}.
We gain from the proof that the series defining c̃k converges, but this also could
be established independently from the absolute convergence of c̃0 = f(z0).

Lemma 5.3.6 Power series are continuous


Let f(x) = Σ_{n=0}^{∞} cn x^n be a power series with radius of convergence ρ. Then
f(x) is continuous at every z0 with |z0| < ρ.

Proof. As f(x) is expandable as a power series around z0, we may assume that
z0 = 0. We have to show that for every ε > 0 there exists a δ > 0 such that for
all z with |z| < δ we have |f(z) − f(0)| < ε. Note that f(0) = c0.
Choose some positive s < ρ. We know from the absolute convergence of f(s)
that there is some N ∈ N with

Σ_{n=N+1}^{∞} |cn| s^n < ε/2.

If |z| < s, by the triangle inequality we therefore find

Σ_{n=N+1}^{∞} |cn| |z|^n < ε/2.

As Σ_{n=0}^{N} cn z^n is a polynomial in z, we know that there is some positive δ,
which may be chosen smaller than s, such that

for all z with |z| < δ : |Σ_{n=0}^{N} cn z^n − f(0)| < ε/2.

This comes from the fact that f(0) = Σ_{n=0}^{N} cn 0^n = c0.
Putting together these two estimates (and using the triangle inequality) gives the
desired

|f(z) − c0| ≤ 2 · ε/2 = ε if |z| < δ.

5.4 The exponential function and some relatives


We already saw the exponential function exp(z) = Σ_{n=0}^{∞} z^n/n! many times.
We now define two new functions, which we call S and C for the moment, by:

Definition/Remark 5.4.1 The functions S and C


We define complex functions S and C on C by

S(z) = (exp(zi) − exp(−zi))/(2i),  C(z) = (exp(zi) + exp(−zi))/2.

Using the power series defining the exponential function, this gives, after some
cancellation due to i^2 = −1, the power series

S(z) = Σ_{n=0}^{∞} (−1)^n z^{2n+1}/(2n + 1)!,  C(z) = Σ_{n=0}^{∞} (−1)^n z^{2n}/(2n)!.

On the other hand we see from the definition that

exp(zi) = C(z) + S(z)i.

It is clear from the definition that S(−z) = −S(z) and C(−z) = C(z). This
also follows from the fact that in C only even powers of z are involved, while in
S only odd powers are involved.
The definition of S and C and the identity exp(−z) = exp(z)^{−1} lead to

S(z)^2 + C(z)^2 = 1 for all z ∈ C.

For real x, S(x) and C(x) are real, because the coefficients in the power series
are real. For two real numbers x, u we use the Functional Equation of exp , 5.1.8,
in the following calculation:

C(x + u) + S(x + u)i = exp((x + u)i)


= exp(xi) · exp(ui)
= (C(x) + S(x)i) · (C(u) + S(u)i)
= (C(x)C(u) − S(x)S(u)) + (C(x)S(u) + S(x)C(u))i.

We then compare the real and imaginary parts, and see

C(x + u) = C(x)C(u) − S(x)S(u) and S(x + u) = C(x)S(u) + C(u)S(x).

This also holds for complex x and u and is the same addition formula as for the
cosine and sine function from 2.2.6.
We will see later in 6.1.5 that C and S actually are the cosine and sine function.
The next lemma will be one step further in this direction.
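
A hedged numerical illustration (our own Python sketch; the names S_partial
and C_partial are not from the notes): evaluating partial sums of the two series
at a real point and comparing with the library sine and cosine already hints at
the identification proved later.

    import math

    def S_partial(x, N=20):
        return sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1) for n in range(N))

    def C_partial(x, N=20):
        return sum((-1)**n * x**(2*n) / math.factorial(2*n) for n in range(N))

    x = 1.3
    print(S_partial(x), math.sin(x))            # both ~0.9635581...
    print(C_partial(x), math.cos(x))            # both ~0.2674988...
    print(S_partial(x)**2 + C_partial(x)**2)    # ~1, as shown above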

Lemma 5.4.2 A zero of C


The function C from above has exactly one zero between 0 and 2; the function
S is positive on (0, 2].

Proof. For every x ∈ [0, 2] the sequence (x^{2n}/(2n)!)_{n≥1} is positive and mono-
tonically decreasing. Therefore, using the last assertion of Leibniz' Criterion 5.1.10,
we find that
|C(x) − (1 − x^2/2)| ≤ x^4/24,
showing that C(2) < 0, while C(0) = 1 > 0. By the Intermediate Value Theorem
4.2.7 we know that the continuous function C has at least one zero in [0, 2].
With a similar argument we find that for all x ∈ [0, 2],

S(x) ∈ [x − x^3/6, x]

and, as x^3/6 ≤ x for all x under consideration, in particular S(x) ≥ 0. We even
have S(x) > 0 for x ∈ (0, 2].
We still want to show that there is only one zero of C in [0, 2]. Let u, v ∈ [0, 2]
be two numbers, u < v. Set x = (u+v)/2 and h = (v−u)/2, so that u = x − h, v = x + h.
Then
C(v) − C(u) = C(x + h) − C(x − h)
= C(x)C(h) − S(x)S(h) − C(x)C(−h) + S(x)S(−h)

and as C is even and S odd this implies

C(v) − C(u) = −2S(x)S(h) < 0.

Therefore, C is strictly monotonically decreasing on [0, 2], and hence has at
most one zero.
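
Since C is continuous with C(0) = 1 > 0 and C(2) < 0, the zero can also be
located numerically by bisection. The following sketch (our own code, assuming
a truncated series is accurate enough on [0, 2]) finds the zero η; the notes will
later identify 2η with π.

    import math

    def C(x, N=40):  # partial sum of the series defining C
        return sum((-1)**n * x**(2*n) / math.factorial(2*n) for n in range(N))

    a, b = 0.0, 2.0
    for _ in range(60):        # bisection: keep C(a) > 0 >= C(b)
        m = (a + b) / 2
        if C(m) > 0:
            a = m
        else:
            b = m
    print(a, math.pi / 2)      # both ~1.5707963...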

Consequence 5.4.3 More similarities to sine and cosine


We call η the unique number with 0 ≤ η ≤ 2 and C(η) = 0, and – still for the
time being – we set ℘ = 2η. As C(η) = 0 and C(η)^2 + S(η)^2 = 1, we find S(η) = 1,
as S(η) is positive.
This shows
exp(ηi) = C(η) + S(η)i = i,
and, by the functional equation of the exponential function, for every integer k

exp(ηki) = i^k.

A special case is exp(℘i) = −1. Later on we will see that ℘ = π and then this
last equation is Euler’s equation.
As consequently exp(2℘i) = 1, we – again by the functional equation – derive
that the exponential function is periodic:

exp(z) = exp(z + 2℘i) for all complex z.


Therefore, the functions S and C have period 2℘ :

S(z + 2℘) = S(z), C(z + 2℘) = C(z) for all complex z.

We also get from exp(z + ηi) = exp(z) · i the identities

S(z + ℘/2) = C(z) and C(z + ℘/2) = −S(z)

as well as
S(z + ℘) = −S(z), C(z + ℘) = −C(z).

For real numbers x, y

exp(x + yi) = exp(x) · (C(y) + S(y)i),

and in particular |exp(x + yi)| = exp(x). Remember that for real numbers x > 0
we have exp(x) > 1 (as every summand in the power series is positive), and
for real numbers x < 0 we have exp(x) = 1/exp(−x) < 1.
This implies that for solving exp(z) = 1 you need Re(z) = 0 and C(Im(z)) = 1.
As S(y) > 0 for y ∈ (0, 2) by 5.4.2 and for y ∈ (℘ − 2, ℘) by the other results,
℘ is the smallest positive zero of S. As the zeros of S have period ℘, we must
have Im(z) ∈ Z · ℘, showing z = k℘i. For odd k, exp(z) = −1. Hence we have

exp(z) = 1 ⇔ z ∈ Z · 2℘i.

As already said, all that points pretty much in the direction that C = cos, S = sin
and ℘ = π, but we do not know that yet.
The missing step is that we have to show that the arc length covered by the
curve t ↦ (C(t), S(t)), 0 ≤ t ≤ x, is x. This curve has values on the unit circle,
because S^2 + C^2 = 1. As already indicated, this gap will be closed in 6.1.5.

Lemma 5.4.4 The range of exp


The range of exp is C \ {0}.
For z, w ∈ C we have exp(w) = exp(z) if and only if z − w = 2℘ik for some
k ∈ Z.

Proof. The last assertion is clear by what we just saw as exp(z) = exp(w) if and
only if exp(z − w) = 1.
As exp(z) · exp(−z) = 1, 0 does not belong to the range of exp .
We now have to show that every non-zero complex number w does belong to the
range. Write
w = r · (x + yi),  r > 0,  x, y ∈ R,  x^2 + y^2 = 1.

For every real t > 0, exp(t) > 1 by the power series defining exp. Therefore,
using 3.1.6(b), the sequence of numbers exp(nt) = exp(t)^n, n ∈ N, is unbounded.
By the intermediate value theorem, 4.2.7, every real number r ≥ 1 can therefore
be written as r = exp(t) for some t ∈ [0, ∞). If 0 < r < 1, there is a t ≥ 0 with
1/r = exp(t), i.e. r = exp(−t).
Note that exp is strictly monotonically increasing on R by the functional equa-
tion.
Now we have to find a real u with exp(ui) = (x + yi).

We know that C takes every x ∈ [0, 1] as a value on [0, ℘/2]. Due
to C(℘ − u) = −C(u) we see that C takes every x ∈ [−1, 1] as a value on
[0, ℘]. Having found u with C(u) = x, we know that S(u) = ±y, because
S(u)^2 = 1 − C(u)^2 = 1 − x^2 = y^2.
As (C(−u), S(−u)) = (C(u), −S(u)), we finally find a suitable u ∈ [−℘, ℘] with
(C(u), S(u)) = (x, y), if x^2 + y^2 = 1.

Remark 5.4.5 The logarithm


Starting with real numbers, we know that exp : R → R>0 is bijective, hence
there is an inverse map, called the natural logarithm.
ln : R_{>0} → R,  y ↦ the unique x ∈ R with exp(x) = y.
It is clear from the functional equation of the exponential function, that for all
positive real y, v we have ln(yv) = ln(y) + ln(v).
With a bit of care we can extend this map to the complex world: Every non-zero
complex w can be written as exp(z) for a unique z with Im(z) ∈ (−℘, ℘]. The
map defined by this again is called the natural logarithm, but now we only get
that
ln(w1 w2 ) = ln(w1 ) + ln(w2 ) + 2℘ik for some k ∈ Z.
Remember that we will later be able to see that ℘ = π.

Definition/Remark 5.4.6 Other bases for exponential functions


We now stick to the real case. The exponential function we introduced is
exp(x) = ∑_{n=0}^∞ x^n/n!, and it turned out later that for rational x and after setting e := exp(1)
we get the equation
e^x = exp(x).
We can now view this as a definition of e^x for arbitrary real x.
As for any integer k and any x ∈ R we have

exp(kx) = exp(x)^k  and  exp(x/k) = exp(x)^{1/k}, if k > 0,

we see that for given positive a and rational r we have
a^r = exp(ln(a) · r).
Again, as the right hand side is continuous in the real variable r, we now define
a^r = exp(ln(a) · r) = e^{ln(a) · r}.
This function takes every positive real number as a value, and we can
define its inverse function on the positive reals as the solution of the equation
exp(ln(a) · log_a(x)) = x, i.e. log_a(x) = ln(x)/ln(a).
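
A minimal sketch of these definitions (our own code; power and log_base are
illustrative names, assuming the library exp and ln behave as the functions
above):

    import math

    def power(a, r):               # a^r := exp(ln(a) * r), for a > 0
        return math.exp(math.log(a) * r)

    def log_base(a, x):            # log_a(x) := ln(x) / ln(a)
        return math.log(x) / math.log(a)

    print(power(2.0, 0.5), math.sqrt(2.0))    # both ~1.4142135...
    print(log_base(2.0, 8.0))                 # 3.0
    print(power(10.0, log_base(10.0, 7.0)))   # recovers 7.0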

Definition 5.4.7 Tangent and the like


We can now also express other trigonometric functions in terms of sine and
cosine.
The tangent is defined as

tan(x) = sin(x)/cos(x),  x ∉ {π/2 + kπ | k ∈ Z},

the cotangent is

cot(x) = cos(x)/sin(x),  x ∉ Zπ,

and these definitions are even valid for complex numbers.
The tangent – in the real world again – is injective on (−π/2, π/2) and assumes
every real number as a value; there is an inverse function called "arcus tangens",
tan(arctan(x)) = x. That every real number arises as a value comes from the
intermediate value theorem, because

lim_{x↗π/2} sin(x)/cos(x) = ∞,   lim_{x↘−π/2} sin(x)/cos(x) = −∞.

The injectivity comes from – setting c = cos(x), c̃ = cos(y) –

√(1 − c^2)/c = √(1 − c̃^2)/c̃  ⇒  c^2 = c̃^2,

and as cos is non-negative between −π/2 and π/2 we get c = c̃.

Example 5.4.8 Hyperbolic functions


The hyperbolic counterparts to sine and cosine (or S and C we should say for
the moment) are
sinh(z) = (exp(z) − exp(−z))/2  and  cosh(z) = (exp(z) + exp(−z))/2.
It is immediate that

cosh(z) = cosh(−z), sinh(z) = − sinh(−z).

We can write down the power series giving these functions, inserting those for
the exponential function in the defining equation:
cosh(z) = ∑_{n=0}^∞ z^{2n}/(2n)!,   sinh(z) = ∑_{n=0}^∞ z^{2n+1}/(2n+1)!.

They converge for every complex number. This time we find

cosh(z)^2 − sinh(z)^2 = 1,

which is responsible for the fact that we can parametrize the branch with
positive coordinates of the real hyperbola

{(x, y) ∈ R^2 | xy = 1}

as
{(cosh(t) + sinh(t), cosh(t) − sinh(t)) | t ∈ R} = {(e^t, e^{−t}) | t ∈ R}.
This justifies the names hyperbolic cosine and hyperbolic sine.
In 7.4.4 we will sketch why the hyperbolic cosine is the function describing a
catenary.
One can even define a hyperbolic tangent and the like as in the trigonometric
case.
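
A quick numerical check of the parametrization (an illustration of ours, not
part of the notes): since cosh(t) + sinh(t) = e^t and cosh(t) − sinh(t) = e^{−t},
the product of the two coordinates is always 1.

    import math

    for t in (-2.0, -0.5, 0.0, 1.0, 3.0):
        x = math.cosh(t) + math.sinh(t)
        y = math.cosh(t) - math.sinh(t)
        print(t, x * y)    # always 1.0 up to rounding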

5.5 Two examples for calculations with power series
For the two examples I have in mind we prove the following lemma.

Lemma 5.5.1 The coefficients are unique


Let f(x) = ∑_{n=0}^∞ c_n x^n = ∑_{n=0}^∞ c̃_n x^n for all x in some interval (−r, r), where
r > 0.
Then c_n = c̃_n for all n.

Proof. Taking differences, we have to show that the only way to represent the
zero function on (−r, r) by a power series is with all coefficients being zero.
Assume now that

0 = ∑_{n=0}^∞ c_n x^n for all x ∈ (−r, r).

Looking at x = 0 we find c_0 = 0.
Now assume further that not all coefficients are zero, and let m be the smallest
index with c_m ≠ 0. We want to derive a contradiction.
Write the right hand side as

x^m · ∑_{n=0}^∞ c_{m+n} x^n.

As the second factor is a convergent power series by assumption, it is continuous


by 5.3.6. As the product can only be zero if one of the factors is zero, and as

x^m ≠ 0 for x ≠ 0, we have

∑_{n=0}^∞ c_{m+n} x^n = 0 for all x ∈ (−r, r), x ≠ 0.

But as this power series is continuous, it has to be zero on the whole interval
(−r, r), giving constant term 0 again: c_m = 0.
This contradicts our assumption and shows that all coefficients are zero.
We already knew this for the special case of polynomials, which are finite power
series, and where we saw the uniqueness of coefficients in 2.3.5: A non-zero poly-
nomial cannot vanish on a whole interval.

Example 5.5.2 The square root function


We are looking at the square root function in a neighbourhood of 1 and want to
find a power series satisfying

√ X
1+x= cn x n
n=0

for all x with small absolute value.


Of course, looking at the constant term, we see c0 = 1. We now use Cauchy’s
Convolution Formula 5.1.7 to calculate the square of the right hand side:

( ∑_{n=0}^∞ c_n x^n )^2 = ∑_{n=0}^∞ b_n x^n,

where b_n = ∑_{k=0}^n c_k c_{n−k}.
This list starts off like
b_0 = c_0^2,  b_1 = 2c_1c_0,  b_2 = 2c_2c_0 + c_1^2,  b_3 = 2c_3c_0 + 2c_2c_1, ...
But we know the coefficients of the power series (a polynomial in this case) giving
(√(1 + x))^2 = 1 + x = ∑_{n=0}^∞ b_n x^n :
b_0 = b_1 = 1,  b_2 = b_3 = b_4 = b_5 = · · · = 0.
This leads to equations for c_0, c_1, c_2, ... which are – starting with c_0 = 1 –
uniquely solvable. This starts off with

c_0 = 1,  c_1 = 1/2,  c_2 = −1/8,  c_3 = 1/16, ...

In practice, this means that the polynomial

1 + (1/2)x − (1/8)x^2 + (1/16)x^3

is a good approximation to √(1 + x) for (very) small x; in fact, evaluating this at
x = 0.21 gives approximately 1.10007, which is quite close to √1.21 = 1.1.
We will later come back to this topic when we study Taylor series and then be
able to give a formula for all coefficients in 6.4.8.
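
The recursion described above is easy to carry out mechanically. The following
hedged sketch (our own code) solves the convolution equations
b_n = ∑_{k=0}^n c_k c_{n−k} with b_0 = b_1 = 1 and b_n = 0 for n ≥ 2:

    def sqrt_series_coeffs(N):
        c = [1.0]                                          # c_0 = 1
        for n in range(1, N):
            b_n = 1.0 if n == 1 else 0.0
            s = sum(c[k] * c[n - k] for k in range(1, n))  # terms already known
            c.append((b_n - s) / (2 * c[0]))               # solve 2*c_0*c_n + s = b_n
        return c

    c = sqrt_series_coeffs(4)
    print(c)                                          # [1.0, 0.5, -0.125, 0.0625]
    x = 0.21
    print(sum(cn * x**n for n, cn in enumerate(c)))   # ~1.10007, close to sqrt(1.21) = 1.1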

Example 5.5.3 The natural logarithm


In this example we want to study the natural logarithm and try to find a power
series which satisfies

ln(1 + x) = ∑_{n=0}^∞ c_n x^n

at least for small |x|. As ln(1) = 0, we must have c0 = 0.


This means that we want to solve for the coefficients c_n in

1 + x = exp( ∑_{n=1}^∞ c_n x^n ).

In order to calculate at least the first 4 coefficients, you can truncate this after
n = 4 and neglect all higher powers of x. Setting g(x) = c_1x + c_2x^2 + c_3x^3 + c_4x^4,
this leads to

1 + x = 1 + g(x) + g(x)^2/2 + g(x)^3/6 + g(x)^4/24 + higher powers of x.

Now the calculation is slightly more involved, and we find

1 + x = 1 + c_1x + (c_1^2/2 + c_2)x^2 + (c_1^3/6 + c_2c_1 + c_3)x^3
      + (c_1^4/24 + c_2c_1^2/2 + c_3c_1 + c_2^2/2 + c_4)x^4 + higher terms.

This in turn leads to

c_1 = 1,  c_2 = −1/2,  c_3 = 1/3,  c_4 = −1/4.

Indeed, we will see in 6.4.3 that

ln(1 + x) = ∑_{n=1}^∞ (−1)^{n−1} x^n/n.
Chapter 6

Differentiability

6.1 The derivative


Definition/Remark 6.1.1 The differential quotient
Given a function f : D → R, where D ⊆ R is some open set, i.e. for every
x ∈ D there exists some r > 0 such that (x − r, x + r) ⊆ D, we consider points
(x, f (x)) on the graph of f as x approaches some given x0 ∈ D.
[Figure: the graph of f with the secant line through the points (x_0, f(x_0)) and (x, f(x)).]
What one wants to study is the question whether – as x approaches x0 – the line
segment joining the points (x, f (x)) and (x0 , f (x0 )) comes close to the tangent
line of the graph at (x0 , f (x0 )).
As this does not need to exist, one studies the question whether the slope of the
line segment has a limit as x approaches x0 .
One says that f is differentiable at x0 , if

lim_{x→x_0} (f(x) − f(x_0)) / (x − x_0)


exists. If this is the case, the limit is called the derivative of f at x_0, denoted by
f′(x_0).
The quotient studied in the limit is called the difference quotient, its limit – if it
exists – the differential quotient of f at x0 .
Often one calculates the derivative as
lim_{h→0} (f(x_0 + h) − f(x_0)) / h,   0 ≠ h ∈ D − x_0.
This notation just replaces x by x0 + h, and of course one is interested in values
of h such that 0 < |h| is small enough so that x0 + h ∈ D holds.
If the derivative exists on all of D, we say that f is differentiable on D. If then
the derivative again is differentiable, its derivative is called the second derivative
f′′ of f.
Recursively, one defines the (n + 1)-st derivative of f as the derivative of the
n-th derivative, if this exists. The n-th derivative is denoted by f^(n) (and not
by f followed by n primes :-)).
Sometimes, in particular if f also depends on some parameters, we write
df/dx (x_0) = f′(x_0).
This will become the preferred notation for functions in more than one variable.
If in physical applications the variable is time, then it often is denoted by t and
the derivative of f with respect to time is ḟ (yes, there is a dot over the f).
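
As a hedged numerical illustration of the definition (our own code; the choice
f = exp is an assumption made only for the demonstration, and the value of the
limit is derived in 6.1.4(c)), the difference quotients visibly approach the
derivative as h shrinks:

    import math

    f, x0 = math.exp, 0.0
    for h in (0.1, 0.01, 0.001, 1e-6):
        print(h, (f(x0 + h) - f(x0)) / h)   # tends to exp'(0) = 1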

Remark 6.1.2 Best linear approximation


The function f : D → R, D ⊆ R open, is differentiable at x_0 with f′(x_0) = q if
and only if for all h ∈ D − x_0

f(x_0 + h) = f(x_0) + qh + r_{x_0}(h),

where lim_{h→0} r_{x_0}(h)/h = 0. In some sense this means that f(x_0) + qh is the best
possible linear approximation to f(x_0 + h) close to x_0. Every other linear ap-
proximation will produce a linear error term as h approaches 0.
This is what one means by saying that the tangent line to the graph of f at the
point (x_0, f(x_0)) has slope f′(x_0). The tangent line is the line τ given by

τ = {(x_0, f(x_0)) + t · (1, f′(x_0)) | t ∈ R} = {(x_0 + t, f(x_0) + t·f′(x_0)) | t ∈ R}.

This idea of best approximation will be taken up again in the section on Tay-
lor polynomials. We will also use it in the proof of the chain rule and in some
examples.

We note, however, that there could be other criteria for the quality of an approxi-
mation, and that this pretty much depends on what you need the approximation
for.

Consequence 6.1.3 Differentiability implies continuity


Let D ⊆ R be open and f : D → R a map which is differentiable at x_0 ∈ D.
Then f is continuous at x_0.

Proof. This is clear from 6.1.2, as we can write

f(x_0 + h) = f(x_0) + h · f′(x_0) + r_{x_0}(h),

where certainly the middle term will go to 0 as h does, and the last as well,
because

lim_{h→0} r_{x_0}(h) = lim_{h→0} h · lim_{h→0} r_{x_0}(h)/h = 0.

Here, h always is meant to be located in D − x_0.

Example 6.1.4 Some derivatives

(a) Let n ∈ N_0 be given and f(x) = x^n on all of R. Given x_0 ∈ R, we
calculate

f(x_0 + h) = (x_0 + h)^n
           = ∑_{k=0}^n \binom{n}{k} h^k x_0^{n−k}
           = x_0^n + n x_0^{n−1} h + h^2 · ∑_{k=2}^n \binom{n}{k} h^{k−2} x_0^{n−k}.

As the last summand is h^2 multiplied with some polynomial in h, we see
that

lim_{h→0} (1/h) · ( h^2 ∑_{k=2}^n \binom{n}{k} h^{k−2} x_0^{n−k} ) = 0,

and this shows that f is differentiable at x_0 with

dx^n/dx (x_0) = f′(x_0) = n x_0^{n−1}.
(b) If f(x) = ∑_{n=0}^∞ c_n x^n is a power series with positive convergence radius, we
see that the difference quotient at x_0 = 0 is

(f(x) − f(0)) / (x − 0) = ( ∑_{n=1}^∞ c_n x^n ) / x = ∑_{n=1}^∞ c_n x^{n−1},

hence this is a power series which is continuous and evaluates to its constant
term c_1 at x = 0 :
f′(0) = c_1.

(c) We apply this last insight to f(x) = exp(x) and see exp′(0) = 1. Now we
look at the functional equation of the exponential function and calculate
for h ≠ 0

(exp(x_0 + h) − exp(x_0)) / h = exp(x_0) · (exp(h) − 1) / h
                             = exp(x_0) · (exp(h) − exp(0)) / (h − 0)
                             →_{h→0} exp(x_0) · exp′(0)
                             = exp(x_0).

The exponential function is its own derivative!
For the functions S and C from 5.4.1 we see from (b) that S′(0) = 1 and
C′(0) = 0, and again, using the addition formula from the same number,
that
S′(x) = C(x),  C′(x) = −S(x).

Remark 6.1.5 Curves in Space

(a) In physics (or other sciences) the movement of a particle in space during
some time interval [0, T ] is given by a map

F : t ↦ (x(t), y(t), z(t)),  t ∈ [0, T].

Often this movement is governed by exterior conditions which we do not


talk about at the moment. We just assume the map F to be given.
The velocity of the particle at time t ∈ [0, T ] is

Ḟ (t) = (ẋ(t), ẏ(t), ż(t)),

if every coordinate function is differentiable. There are areas of science, e.g.


in thermodynamics, where one should not expect everything to be differen-
tiable. But for instance in classical mechanics one certainly does.
The trace of the particle in space is the range of the map F. This usually
will be a curve in space, and if Ḟ (t) is nonzero at a specific time t then
locally this curve will have a tangent. The direction of this tangent then is
the direction of the velocity.
The velocity is a vector valued function. Its absolute value is the speed you
see on the speedometer:
‖Ḟ(t)‖ = √( ẋ(t)^2 + ẏ(t)^2 + ż(t)^2 ).

The arc length L the particle has travelled as it moves from t = 0 to t = T
is given by the integral

L = ∫_0^T ‖Ḟ(t)‖ dt.

As we will only talk about integrals in the following chapter, we just point
out that if the speed does not change, i.e. ‖Ḟ(t)‖ = s for fixed s and every
t ∈ [0, T], you get
L = T · s.

Similar rules hold for a movement in the plane, which of course can be
viewed as a part of three dimensional space.

(b) Now look at the functions S and C from 5.4.1 again. We just learned that
S′(t) = C(t), C′(t) = −S(t), and we already know that S(t)^2 + C(t)^2 = 1
for every real t. So we look at a "particle" moving along the unit circle,
described by

F : [0, T] → R^2,  F(t) = (C(t), S(t)).

This particle starts at (1, 0) and moves up to some point on the unit circle,
which we now can describe in two ways:

F (T ) = (C(T ), S(T )) = (cos(α), sin(α))

for suitable α ∈ R. As the measurement for angles is the arc length (up
to multiples of 2π ) we derive from this that – as the speed of the particle
constantly is 1 –
T = α + 2πk, k ∈ Z suitable.

This observation finally shows that

S = sin and C = cos,

meaning more concretely

sin(t) = ∑_{n=0}^∞ (−1)^n t^{2n+1}/(2n+1)!  and  cos(t) = ∑_{n=0}^∞ (−1)^n t^{2n}/(2n)!,  t ∈ R.

In particular, we now see that ℘ = π and

sin′(t) = cos(t),  cos′(t) = − sin(t).

We could have derived this from the addition formula, given that we believe
that sin′(0) = 1, but now we have proved it, although much more indirectly.
The motivation for finding the power series describing sine and cosine is
outlined in the section on Taylor series, 6.4.10.

6.2 Rules for calculating the derivative


Lemma 6.2.1 Sums, products and quotients
If f, g : D → R are real valued functions on an open domain D ⊆ R, x0 ∈ D,
and f, g are both differentiable at x0 , then

(a) for every λ ∈ R, the function f + λg is differentiable at x_0 with

(f + λg)′(x_0) = f′(x_0) + λg′(x_0),

(b) f · g is differentiable at x_0 with

(fg)′(x_0) = (fg′ + f′g)(x_0),

(c) if g(x_0) ≠ 0, f/g is defined on an open interval containing x_0, is differen-
tiable there and we have

(f/g)′(x_0) = ((g f′ − f g′)/g^2)(x_0).

The formula in (b) is called the product formula.


Proof. We write

f(x_0 + h) = f(x_0) + f′(x_0)h + r(h),   g(x_0 + h) = g(x_0) + g′(x_0)h + s(h),

where by definition of differentiability

lim_{h→0} r(h)/h = lim_{h→0} s(h)/h = 0.

(a) We have

(f + λg)(x_0 + h) = f(x_0 + h) + λg(x_0 + h)
                  = f(x_0) + λg(x_0) + f′(x_0)h + λg′(x_0)h + r(h) + λs(h)
                  = (f + λg)(x_0) + (f′(x_0) + λg′(x_0))h + (r + λs)(h).

Observe that

lim_{h→0} (r + λs)(h)/h = lim_{h→0} (r(h)/h + λs(h)/h) = 0,

and therefore, again by 6.1.2, we see the assertion.



(b) This time,

(fg)(x_0 + h) = (f(x_0) + f′(x_0)h + r(h)) · (g(x_0) + g′(x_0)h + s(h))
             = (fg)(x_0) + (f(x_0)g′(x_0) + f′(x_0)g(x_0))h + t(h),

where

t(h) = f′(x_0)g′(x_0)h^2 + (f(x_0) + f′(x_0)h) · s(h) + (g(x_0) + g′(x_0)h + s(h)) · r(h).

Looking at these summands individually we see that

lim_{h→0} t(h)/h = 0,

again showing the claim.

(c) We now first assume that g is differentiable and non-zero at x_0 and that
also 1/g is differentiable at x_0. Then the product formula applied to
1 = g(x) · (1/g(x)) renders the equality (as 1 is the constant function and
clearly has derivative zero)

0 = 1′ = g′(x_0) · (1/g(x_0)) + g(x_0) · (1/g)′(x_0),

giving

(1/g)′(x_0) = −g′(x_0) / g(x_0)^2.

Using the product formula again, we derive from this formula the one clai-
med for the derivative of f/g.
What remains is to prove the differentiability of 1/g at x_0. We leave this
as an exercise.

Example 6.2.2 Derivative of Polynomial Functions


If f(x) = ∑_{k=0}^d c_k x^k is a polynomial function, then by 6.2.1(a) and our calcu-
lations in 6.1.4 we find f′(x) = ∑_{k=1}^d c_k k x^{k−1}. This is a polynomial of smaller
degree. In particular, if f has degree d > 0, i.e. c_d ≠ 0 in the above formula, f′
has degree d − 1, f′′ has degree d − 2, and so on, until the d-th derivative f^(d)
has degree 0, i.e. it is constant. Therefore the (d + 1)-st derivative of f is zero.
We will learn later in 6.3.7 that a function which is defined on an interval I (of
positive length) is polynomial of degree at most d if and only if it is differentiable
infinitely often and the (d + 1)-st derivative f^(d+1) is zero on all of I. We do not
know the converse implication yet, but it certainly is important that the domain
is an interval. We will need the following observation later:

Every polynomial function is the derivative of a polynomial function, namely
f = F′ if

F(x) = ∑_{k=0}^d c_k/(k+1) · x^{k+1}.
Note that you may replace F by F + c for any constant c without changing the
derivative.
We observe another phenomenon which will be interesting later:
f^(k)(0) = k! · c_k.
This gives a precise recipe to recover the polynomial f from the values of its
derivatives at 0.
Similarly, given x_0 ∈ R, we can write

f(x) = f(x_0 + (x − x_0)) = ∑_{k=0}^d c̃_k (x − x_0)^k

for some coefficients c̃_k and find

c̃_k = f^(k)(x_0) / k!.
This will later be the starting point for calculating Taylor polynomials.

Lemma 6.2.3 Chain rule


Let D, E ⊆ R be open, f : D → R differentiable at x0 , f (D) ⊆ E and g : E →
R differentiable at f (x0 ).
Then the composition g ◦ f is differentiable at x0 and we have
(g ◦ f)′(x_0) = g′(f(x_0)) · f′(x_0).

Proof. As before we write

f(x_0 + h) = f(x_0) + f′(x_0)h + r(h),   g(f(x_0) + k) = g(f(x_0)) + g′(f(x_0))k + s(k),

where by definition of differentiability

lim_{h→0} r(h)/h = lim_{k→0} s(k)/k = 0.

Of course this time we have to expand g at f(x_0).
Then

(g ◦ f)(x_0 + h) = g(f(x_0) + f′(x_0)h + r(h))
                = g(f(x_0)) + g′(f(x_0)) · (f′(x_0)h + r(h)) + s(f′(x_0)h + r(h))
                = (g ◦ f)(x_0) + g′(f(x_0)) · f′(x_0) · h + t(h),

where t(h) = g′(f(x_0))r(h) + s(f′(x_0)h + r(h)). As f′(x_0)h + r(h) goes to zero
as h goes to zero, we have

lim_{h→0} s(f′(x_0)h + r(h)) / h = lim_{h→0} [ s(f′(x_0)h + r(h)) / (f′(x_0)h + r(h)) ] · [ (f′(x_0)h + r(h)) / h ] = 0,

because the first factor goes to 0, while the second factor goes to f′(x_0).
Here we may tacitly only look at the case f′(x_0)h + r(h) ≠ 0, because s(0) = 0
anyway. Together with r(h)/h →_{h→0} 0, this shows that also

lim_{h→0} t(h)/h = 0,

showing what we wanted.

Consequence 6.2.4 The derivative of the inverse function


Let D ⊆ R be open and f an injective real valued function on D with range
E = f (D). Then f has an inverse function
f −1 : E → D, f −1 (y) = x if f (x) = y.
If then f is differentiable at x_0 ∈ D with f′(x_0) ≠ 0, the inverse f^{−1} is diffe-
rentiable at y_0 = f(x_0) with

(f^{−1})′(y_0) = 1 / f′(x_0).

Proof.
We use the chain rule 6.2.3 in order to explain the formula giving (f^{−1})′(y_0).
We do not prove differentiability formally, which could be done similarly to the
differentiability of a composition.
From y = f(f^{−1}(y)) for all y ∈ E we find, differentiating both sides,

1 = f′(f^{−1}(y_0)) · (f^{−1})′(y_0),

which implies the claim because f^{−1}(y_0) = x_0.

Remark 6.2.5 A geometric explanation


The geometric meaning of the last insight becomes clear by remembering that
the graph of the inverse function f^{−1} is the result of reflecting the graph of f
at the line with slope 1 through the origin, which just exchanges the x- and the
y-coordinate. This process also maps the tangent line to the graph of f at the
point (x_0, y_0) to the tangent line to the graph of f^{−1} at the point (y_0, x_0), but
it inverts slopes of lines; the image of the tangent line to the graph of f is:

{(y_0 + f′(x_0)t, x_0 + t) | t ∈ R} = {(y_0 + s, x_0 + (f^{−1})′(y_0)s) | s ∈ R}.

Example 6.2.6 Derivative of the logarithm


As the logarithm ln is the inverse function of exp : R → R_{>0}, we find from
exp′ = exp

ln′(y) = 1/exp(ln(y)) = 1/y.

The logarithm is an antiderivative of the function y ↦ 1/y on the positive real
numbers.
The tangent tan = sin/cos gives a differentiable bijective map between (−π/2, π/2)
and R which has an inverse function called arctan (in words: arcus tangens). We
find from the quotient rule 6.2.1(c) that

tan′(x) = (cos(x)^2 + sin(x)^2) / cos(x)^2 = 1 + tan(x)^2.

This then gives

arctan′(y) = 1/(1 + y^2).
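
Both formulas can be checked numerically with central difference quotients
(our own sketch; num_deriv is an illustrative helper, not a library function):

    import math

    def num_deriv(f, y, h=1e-6):
        return (f(y + h) - f(y - h)) / (2 * h)

    y = 2.0
    print(num_deriv(math.log, y), 1 / y)             # both ~0.5
    print(num_deriv(math.atan, y), 1 / (1 + y**2))   # both ~0.2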

6.3 The mean value theorem and extremal values
Definition 6.3.1 Local points of extremum
Let f : D → R be a real valued function. Recall that f has a (global) maximum
at x0 ∈ D if f (x) ≤ f (x0 ) is true for every x ∈ D.
If D ⊆ R, we call x0 ∈ D a local point of maximum if there exists some r > 0
such that for all x ∈ D ∩ B_r(x_0) again

f (x) ≤ f (x0 ).

Similarly we call x0 a local point of minimum if the converse inequality is valid


in such a part of the domain.
In both cases, we call x0 a local point of extremum for f.
(NB: The notion of a global maximum can always be considered for real-valued
functions defined on any set; for local points of extremum you need some way
to define a distance on the domain (as e.g. for D ⊆ R ) or at least an open
neighborhood of x0 . )

Lemma 6.3.2 A necessary condition for local extremum


If f : (a, b) → R is differentiable at x0 and has x0 as a local point of extremum,
then f′(x_0) = 0.

Proof. We treat the case of a local point of minimum, the other case being similar.
By definition of local points of minimum, in a (sufficiently small) subinterval
(x0 − r, x0 + r) ⊆ (a, b) the numerator of the difference quotient

Q(x) = (f(x) − f(x_0)) / (x − x_0),   x ≠ x_0,

always is non-negative, while the denominator changes sign.
This shows Q(x) ≤ 0 for x < x_0 and Q(x) ≥ 0 for x > x_0.
Therefore f′(x_0) = lim_{x→x_0} Q(x) = 0.
This lemma implies a pre-version of the mean value theorem.

Theorem 6.3.3 Rolle’s theorem


Let a < b be real numbers and f : [a, b] → R continuous with f (a) = f (b).
If f is differentiable on (a, b), then there exists an x_0 ∈ (a, b) with f′(x_0) = 0.

Proof. As the interval [a, b] is closed and bounded, we can apply 4.2.5 and deduce
that f does attain a maximal and a minimal value on [a, b]. If f (a) = f (b) is
both maximal and minimal, the function is constant and therefore the derivative
vanishes at every point.
In the other case, there must be a point of extremum x0 ∈ (a, b). As every point
of extremum also is a local point of extremum, by 6.3.2 we have f′(x_0) = 0 at
this point.

Theorem 6.3.4 The mean value theorem


If f : [a, b] → R is continuous on [a, b] and differentiable on (a, b), then there is
some x0 ∈ (a, b) satisfying

f′(x_0) = (f(b) − f(a)) / (b − a).

Proof. We modify f slightly to get a function for which we can apply Rolle’s
Theorem 6.3.3.
To that end, let
F (x) = f (x) − λ(x − a).
We choose λ ∈ R such that

F (a) = f (a) = F (b) = f (b) − λ · (b − a),


giving λ = (f(b) − f(a)) / (b − a).

As F clearly still is continuous on [a, b] and differentiable on (a, b), we find some
x_0 ∈ (a, b) with F′(x_0) = 0, which, by the rules for taking derivatives, leads to
0 = F′(x_0) = f′(x_0) − λ,
which is the desired equality.

Consequence 6.3.5 Monotonicity and the Derivative


Let I ⊆ R be an interval, I° the set of all points in I which are not an end
point, and f : I → R continuous on I and differentiable on I°.

(a) If f′(x) = 0 for every x ∈ I° then f is constant.

(b) If f′(x) ≥ 0 for every x ∈ I° then f is monotonically increasing. If
f′(x) > 0 on I°, f is strictly monotonically increasing.

Proof. Let α < β be two elements in I. Then f gives a function on [α, β] ⊆ I


satisfying all conditions of the mean value theorem 6.3.4.
Therefore there exists an x0 ∈ (α, β) such that
f(β) − f(α) = (β − α) · f′(x_0).

(a) If f′(x) = 0 for every x ∈ I°, we in particular have f′(x_0) = 0 and
therefore f(β) = f(α). Therefore f is constant.

(b) If f′(x) ≥ 0 for every x ∈ I°, we in particular have f′(x_0) ≥ 0 and
therefore f(β) − f(α) has the same sign as β − α or is zero. Therefore f
is monotonically increasing. Similarly if f′(x) > 0 everywhere.

With the same argument, f′(x) ≤ 0 (resp. < 0) on I° will lead to f being
monotonically decreasing (resp. strictly monotonically decreasing).

Consequence 6.3.6 A sufficient condition for local extrema


If f : (a, b) → R is twice continuously differentiable and f′(x_0) = 0 > f′′(x_0)
(resp. 0 < f′′(x_0)), then x_0 is a local point of maximum (resp. minimum) for f.

Proof. We only consider the case f′′(x_0) < 0, the other case being quite similar.
As, due to continuity, f′′(x) is negative in a small interval (x_0 − r, x_0 + r)
around x_0, we may apply 6.3.5 and see that f′ is strictly decreasing on this
interval.
In particular, f′(x) > 0 for x ∈ (x_0 − r, x_0) and f′(x) < 0 for x ∈ (x_0, x_0 + r).
Therefore, again by 6.3.5, f is strictly monotonically increasing on (x_0 − r, x_0]
and decreasing on [x_0, x_0 + r), showing the (local) maximality of f(x_0).

Consequence 6.3.7 Polynomial Functions

(a) We already know from 6.2.2 that a polynomial function of degree d satisfies
f^(d+1) = 0.
Let now I ⊆ R be an interval and f : I → R a function which is d + 1
times differentiable and satisfies f^(d+1) = 0.
We know from 6.2.2 that every polynomial function is the derivative of a
polynomial function, hence from 6.3.5(a) that every function on the interval
I which has a polynomial as its derivative is itself a polynomial.
Repeating this process several times, starting with f^(d+1) = 0, we see that
f^(d) is a polynomial and then f^(d−1) is a polynomial and so on, showing
that f itself is a polynomial function of degree at most d.

(b) This is closely related to: If g : I → R (I an interval) has an antiderivative
G : I → R, i.e. G′ = g everywhere, then for every other antiderivative G̃,
the difference G − G̃ is constant, because on all of I° we have

(G − G̃)′ = G′ − G̃′ = g − g = 0.

There is still another variant of the mean value theorem we should look at.

Theorem 6.3.8 Another variant of the mean value theorem


Let a < b be real numbers and f, g : [a, b] → R be functions which are continuous
on [a, b] and differentiable on (a, b).
Then there exists an x ∈ (a, b) such that

(f(b) − f(a)) · g′(x) = (g(b) − g(a)) · f′(x).

In particular, if g′(t) ≠ 0 for all t ∈ (a, b), we have

(f(b) − f(a)) / (g(b) − g(a)) = f′(x) / g′(x)

for this x.

Proof. Again, we invent a new function F for which we can apply Rolle’s Theorem
6.3.3. This now is

F (x) = (f (b) − f (a)) · g(x) − (g(b) − g(a)) · f (x).

We find F (a) = f (b)g(a) − g(b)f (a) = F (b) and therefore Rolle’s Theorem gives
us the x we wanted.

If here g′ is nowhere zero, again Rolle's Theorem implies that g(b) ≠ g(a), and
therefore we can divide the whole equation by (g(b) − g(a)) · g′(x), giving the
second equation.

There are two interesting consequences of this variant. One of them will be consi-
dered in treating the remainder term of Taylor’s polynomials in the next section.
The other one is:

Theorem 6.3.9 L'Hôpital's rule


Let a < b be real numbers and f, g : [a, b) → R be functions which are continuous
on [a, b) and differentiable on (a, b). Assume that f(a) = g(a) = 0 and g(x) ≠
0 ≠ g′(x) for all x ∈ (a, b).
Then, if lim_{x→a} f′(x)/g′(x) exists, we have

lim_{x→a} f(x)/g(x) = lim_{x→a} f′(x)/g′(x).

Proof. We know from 6.3.8 that for every x ∈ (a, b) there exists some x̃ ∈ (a, x)
such that
(f(x) − f(a)) / (g(x) − g(a)) = f′(x̃) / g′(x̃).
If x goes to a, the corresponding x̃ also goes to a, because it sits between a
and x.
Therefore the right hand side converges to a limit, as x → a, but the left hand
side has the same values, forcing it also to converge to the same limit. As f (a) =
g(a) = 0 this just is what we wanted.

Remark 6.3.10 Variants of L'Hôpital's rule


There are several variants of this rule. It holds in a similar way, if f and g are
only defined on (a, b) and converge to ∞ as x goes to a : Replace f by 1/g
and g by 1/f and extend both continuously to a by setting the value equal to
0.
Similarly if the interval where the two functions are defined is (a, ∞) and we
look at the limit of the quotients as x goes to ∞. Then, again, if f and g have
limit 0 for x → ∞, we can use the same comparison as in the last theorem, and
similarly if f and g both go to ∞ as x → ∞.

Example 6.3.11 Exponential growth


(a) We know that lim_{x→∞} 1/e^x = 0. As 1 is the derivative of x and e^x is its own
derivative, we have

lim_{x→∞} x/e^x = lim_{x→∞} 1/e^x = 0.

But then

lim_{x→∞} x^2/e^x = lim_{x→∞} 2x/e^x = 0,

and inductively we see

lim_{x→∞} x^d/e^x = 0

for every natural number d.


This means that the exponential function grows much faster than any poly-
nomial as x goes to infinity. This is what makes exponential growth special.
Be aware, however, that there are functions which grow much faster than
the exponential function, for instance

ex ex
lim x = lim ex x = 0.
x→∞ ee x→∞ e · e

x
And then of course there are functions which grow much faster than ee . . .

(b) We now want to prove that for fixed x ∈ R

lim_{t→∞} (1 + x/t)^t = e^x.

Using the continuity of the exponential function and

(1 + x/t)^t = e^{t ln(1 + x/t)},

it is sufficient to prove that

lim_{t→∞} t ln(1 + x/t) = x.

This is implied by

lim_{t→∞} t ln(1 + x/t) = lim_{t→∞} ln(1 + x/t) / (1/t)
                        = lim_{t→∞} [ (1/(1 + x/t)) · (−x/t^2) ] / (−1/t^2)
                        = lim_{t→∞} x / (1 + x/t) = x,

where we used L'Hôpital's rule in between, taking the derivative with
respect to t of course (x being fixed).
Of course, this identity can now again be used to calculate

e = lim_{n→∞} (1 + 1/n)^n.
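
A short numerical illustration of the limit just proved (our own code, with
x = 2 chosen arbitrarily):

    import math

    x = 2.0
    for t in (10.0, 100.0, 1e4, 1e6):
        print(t, (1 + x / t) ** t)   # approaches e^2
    print(math.exp(x))               # ~7.3890560...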

6.4 Taylor polynomials and power series


Lemma 6.4.1 Power series are differentiable
Let f(x) = ∑_{k=0}^∞ c_k x^k be a power series with positive radius of convergence
r ∈ R_{>0} ∪ {∞}.
Then the function f is differentiable in (−r, r) and has derivative

f′(x) = ∑_{k=0}^∞ (k + 1) c_{k+1} x^k.

Proof. We make use of the calculations in 5.3.5 and remind ourselves of the fact
that for fixed x_0 ∈ (−r, r) and small enough h we have

f(x_0 + h) = ∑_{k=0}^∞ c̃_k h^k,

where c̃_k = ∑_{n=k}^∞ c_n \binom{n}{k} x_0^{n−k}. This power series in the variable h now is diffe-
rentiable by 6.1.4(b) at h = 0 with derivative c̃_1 = ∑_{n=1}^∞ c_n n x_0^{n−1}. Replacing n
with k = n − 1 gives the formula from the statement.

Remark 6.4.2 Power series are differentiable infinitely often


The last Lemma tells us that the derivative of a power series again is a power
series. One can check that it has the same radius of convergence. The formula
for the derivative is just that for polynomials from 6.2.2, extended to the infinite
sum.
As f′ is a power series, it again is differentiable, and f′′ is a power series, hence
differentiable, and you can imagine how this sentence could go on if we had an
infinite amount of space and time. To make it short, for every natural n, f is
differentiable n times, f^(n) being a power series again.
In particular, we get the same recipe to calculate the coefficients of the expansion
of a power series f around some point x0 in the domain of convergence as for
polynomials, cf. 6.2.2:

f(x_0 + h) = ∑_{k=0}^∞ f^(k)(x_0)/k! · h^k.
We can also exhibit a function having our power series f(x) = ∑_{k=0}^∞ c_k x^k as its
derivative:

F(x) = ∑_{k=0}^∞ c_k/(k+1) · x^{k+1}

will do the job. We will call F an antiderivative of f.

Example 6.4.3 The Taylor series for ln


We know the power series representing the function f(x) = 1/x near x_0 = 1,
namely – from the geometric series –

1/(1 + x) = ∑_{n=0}^∞ (−1)^n x^n  for |x| < 1.

This now gives a function having f as its derivative near x_0 = 1 as

F(1 + x) = ∑_{n=1}^∞ (−1)^{n−1}/n · x^n.
As ln(1 + x) also has 1/(1 + x) as its derivative, cf. 6.2.6, we have

F (1 + x) = ln(1 + x) + constant for |x| < 1

by 6.3.5(a) applied to F − ln . But F (1) = 0 = ln(1), and therefore this constant


is zero, resulting in

ln(1 + x) = ∑_{n=1}^∞ (−1)^{n−1}/n · x^n.
This will later be called the Taylor series for the logarithm at x0 = 1. We already
had calculations pointing towards this power series in 5.5.3.

Example 6.4.4 A differential equation


Sometimes one can try to use power series for solving certain differential equati-
ons. This is an equation involving the derivatives of an unknown function, and
one can then try to find (maybe all) functions satisfying these conditions.
As an example, we look at the equation

f′(x) = x · f(x)

which should hold now for all real x. If you do not have an idea how to deal with
this, then just try to write f as a power series around x0 = 0 :

f(x) = ∑_{k=0}^∞ c_k x^k,

which leads to

f′(x) = ∑_{k=0}^∞ (k + 1) c_{k+1} x^k = x · f(x) = ∑_{k=1}^∞ c_{k−1} x^k,

i.e.

c_1 + 2c_2x + 3c_3x^2 + 4c_4x^3 + · · · = c_0x + c_1x^2 + c_2x^3 + c_3x^4 + ...

Now, according to 5.5.1, for every k the coefficients of x^k have to coincide,
because both power series give the same function. Therefore,

c_1 = 0,  2c_2 = c_0,  3c_3 = c_1,  4c_4 = c_2, ...,  k·c_k = c_{k−2}, if k ≥ 2.

This shows that c_{2k+1} = 0, k ∈ N_0, and c_{2k} = c_0/(2^k · k!). This gives

f(x) = c_0 · ∑_{k=0}^∞ x^{2k}/(2^k · k!) = c_0 · exp(x^2/2).

And indeed, plugging this function into the equation we wanted to solve, we
see that it does satisfy

d/dx exp(x^2/2) = exp(x^2/2) · x

by the chain rule.
We could have guessed this solution, but sometimes it is good not to depend on
one's capacity to guess something.
How should we have guessed this? If you want to solve the differential equation

f′(x) = g(x) · f(x)

for some fixed function g (a potential or something), and if this "potential" has
an antiderivative G, i.e. G′(x) = g(x) for all x, then at least one type of solution
is of the form

f(x) = c_0 · exp(G(x)),

again using the chain rule. And of course G(x) = x^2/2 is an antiderivative of
g(x) = x.
Whether these are all solutions for this particular equation or whether there
are other solutions at all is a totally different question which we will address in
7.4.5(b).
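
The recursion k·c_k = c_{k−2} derived above is also easy to run mechanically.
A hedged sketch (our own code) with c_0 = 1 and c_1 = 0, compared against
exp(x^2/2):

    import math

    def ode_coeffs(N, c0=1.0):
        c = [c0, 0.0]
        for k in range(2, N):
            c.append(c[k - 2] / k)   # from k*c_k = c_{k-2}
        return c

    c = ode_coeffs(40)
    x = 0.7
    print(sum(ck * x**k for k, ck in enumerate(c)))   # ~1.277621...
    print(math.exp(x**2 / 2))                         # same value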

Lemma 6.4.5 A preparatory lemma


Let f : (a, b) → R be a function which is n + 1 times continuously differentiable.
Let x0 ∈ (a, b) be a point such that the k -th derivatives satisfy

f (k) (x0 ) = 0, 0 ≤ k ≤ n.

Then for every x ∈ (a, b) there exists a z between x0 and x such that

f (x) f (n+1) (z)


= .
(x − x0 )n+1 (n + 1)!

Proof. We prove this by induction on n.
For n = 0 we have to find some z satisfying (as f(x_0) = 0)

(f(x) − f(x_0)) / (x − x_0) = f′(z),

which is just the assertion of the mean value theorem 6.3.4.
Now let n ≥ 0 be given and assume that for this n the assertion always is true.
Then let f be a function fulfilling the conditions for n + 1 instead of n. Using
the variant of the mean value theorem we find some z̃ between x_0 and x such
that (remember that f(x_0) = f′(x_0) = 0!)

f(x) / (x − x_0)^{n+2}  =(6.3.8)=  f′(z̃) / ((n + 2)(z̃ − x_0)^{n+1})  =(IH)=  f^(n+2)(z) / (n + 2)!,

where we use the induction hypothesis (IH) for the function f′, which again
satisfies the conditions.
Of course here the new z lies between x_0 and z̃, and therefore also between x_0
and x.

Remark 6.4.6 What should the Taylor polynomial do?


We do understand polynomials quite well. They are given by a finite set of num-
bers (the coefficients).
Most functions we have to deal with in applied sciences are much more complica-
ted. It would be desirable to approximate these functions by some other functions
which we understand. Depending on the context this might be polynomials.
Another question of course is how we define “approximate”, what we expect from
this. As right now we deal with derivatives, it might be a good approach to search
for easier functions which have the same value for the first few derivatives at some
given point x0 in the domain of the function we study.
This leads to the following construction, because we remember 6.2.2, where at
the end we related the coefficients of a polynomial to the values of the derivative.

Construction 6.4.7 The Taylor polynomial


Let f : (a, b) → R be a function which is differentiable n times for some n ∈ N.
Let x0 ∈ (a, b) be some point. Then the n -th Taylor polynomial pn (x) of f at
x0 is defined as
p_n(x) = ∑_{k=0}^n f^(k)(x_0)/k! · (x − x_0)^k.

According to what we saw before, we have

f^(k)(x_0) = p_n^(k)(x_0),  0 ≤ k ≤ n.

The n-th remainder term is defined as

R_n(x) = f(x) − p_n(x),  x ∈ (a, b).

This remainder has the property that it is differentiable n times (as f and p_n
are) and R_n^(k)(x_0) = 0 for 0 ≤ k ≤ n. Therefore, if f is even differentiable (n+1)
times, we find for every x a z between x_0 and x such that

R_n(x) = f^(n+1)(z)/(n + 1)! · (x − x_0)^{n+1}.

This is due to our lemma 6.4.5 and the fact that

R_n^(n+1) = f^(n+1) − p_n^(n+1) = f^(n+1),

because p_n is a polynomial of degree at most n.
This way to express the remainder term is called the Lagrange form of the re-
mainder term.
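
As a hedged illustration (our own code, using that all derivatives of sin are
bounded by 1 in absolute value, so that the Lagrange form gives
|R_n(x)| ≤ |x|^{n+1}/(n+1)!):

    import math

    def taylor_sin(x, n):
        derivs = [0.0, 1.0, 0.0, -1.0]   # sin^(k)(0) repeats with period 4
        return sum(derivs[k % 4] * x**k / math.factorial(k) for k in range(n + 1))

    x, n = 1.0, 7
    print(taylor_sin(x, n), math.sin(x))            # ~0.8414682 vs ~0.8414710
    print(abs(taylor_sin(x, n) - math.sin(x)),      # actual error ...
          abs(x)**(n + 1) / math.factorial(n + 1))  # ... stays below the bound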

Example 6.4.8 Power functions


If a is some real number, we have defined the function

f : R_{>0} → R,  x ↦ x^a = exp(a ln(x)).

We know that f is differentiable n times (in fact for every n) with

f′(x) = a x^{a−1},  f′′(x) = a · (a − 1) · x^{a−2},


k
!
Y
f (k) (x) = (a + 1 − i) · xa−k .
i=1

Therefore, the n-th Taylor polynomial of f at x_0 = 1 is

p_n(x) = ∑_{k=0}^n ( ∏_{i=1}^k (a + 1 − i) ) / k! · (x − 1)^k.

The coefficients in this expression are abbreviated as

( ∏_{i=1}^k (a + 1 − i) ) / k! = \binom{a}{k},

and for natural a this is the old binomial coefficient indeed.



For instance, looking at a = 1/2, we have the following table of values:

k                0    1     2     3     4       5
\binom{1/2}{k}   1    1/2   −1/8  1/16  −5/128  7/256

Comparing that to the values we started off with in 5.5.2 should persuade you
that Taylor series are good to have. They facilitate the work we had there and
make it much more elegant. Lagrange's form of the remainder term tells us that

√(1 + x) − ∑_{k=0}^n \binom{1/2}{k} x^k = \binom{1/2}{n+1} z^{1/2−n−1} x^{n+1}

for some z between 1 and 1 + x. One can use this estimate to show that for
x ∈ (−1, 1) we get

√(1 + x) = ∑_{k=0}^∞ \binom{1/2}{k} x^k.

With similar considerations one more generally ends up with

(1 + x)^a = ∑_{k=0}^∞ \binom{a}{k} x^k,  x ∈ (−1, 1).

This expression is the binomial series.

Definition 6.4.9 The Taylor series


Let f : (a, b) → R be differentiable infinitely often, choose some x0 ∈ (a, b).
If T(h) = ∑_{k=0}^∞ f^(k)(x_0)/k! · h^k has positive convergence radius as a power series in h,
then we call T the Taylor series of f at the expansion point x_0.

Example 6.4.10 The sine and cosine function


From the addition formula together with sin′(0) = 1, which can be viewed as a
normalization relating the measurement of the angle with that of the ratio
between the opposite leg and the hypotenuse in right-angled triangles, we find
that

cos′(x) = − sin(x),  sin′(x) = cos(x).

This results in
cos^(2n)(0) = (−1)^n = sin^(2n+1)(0),  n ∈ N_0,
while the other derivatives are zero. The corresponding Taylor series therefore
are the power series representing cos and sin from 6.1.5(b).

Remark 6.4.11 Not always helpful


We know many examples where the Taylor series is helpful. In particular, we
know the Taylor series for exp (5.1.3), sin, cos (6.1.5) at the expansion point 0,
and of x^a (6.4.8) as well as ln (cf. 6.4.3) at the expansion point 1.
There are, however, functions where the Taylor series converges, but does not
represent the given function.
The standard example for this is the function

exp(−1/x2 ), if x 6= 0,
f : R → R, f (x) =
0 if x = 0.

This is continuous, as for x ≠ 0 it is a composition of continuous functions,
and the limit, as x goes to 0, is lim_{t→∞} exp(−t) = 0. Outside zero, f clearly is
differentiable infinitely often. The limit of the (higher) derivatives, as x goes to
0, can always be calculated by L'Hôpital's rule to be 0. Therefore,

f^(k)(0) = 0,  k ∈ N_0.

The Taylor series is the power series ∑_{k=0}^∞ 0 · x^k, which of course is constantly
equal to zero. But x = 0 is the only argument where f(x) = 0. The Taylor series
does not help at all in this case for understanding the function near 0. At
every other real number the function is analytic, however, but we will not prove
this here.

Remark 6.4.12 Table of some derivatives


We end this section with a table of some common derivatives.

f(x)         f′(x)                        remarks

x^i          i·x^{i−1}                    i ∈ Z or x > 0
e^{ax}       a·e^{ax}                     a ∈ R
ln(ax)       1/x                          x, a > 0
log_a(x)     1/(x·ln(a))                  x, a > 0
sin(x)       cos(x)
cos(x)       −sin(x)
tan(x)       1/cos(x)^2 = 1 + tan(x)^2    if cos(x) ≠ 0
sinh(x)      cosh(x)
cosh(x)      sinh(x)
arctan(x)    1/(x^2 + 1)

Here, arctan is the "arcus tangens", inverting the tangent restricted to the in-
terval (−π/2, π/2).
Chapter 7

Integration

Integral calculus is concerned with calculating areas of certain regions in the
plane as its motivating problem and – as it turns out a posteriori – with
calculating antiderivatives of functions (if possible). We already did that, but
only for power series, cf. 6.4.2.
There are several variants of introducing integrals. We will mainly be concerned
with Lebesgue integrals and will later only make few comments on Riemann
integrals. For continuous functions the difference between these methods is not
very large. The Lebesgue theory, however, seems to be more appropriate for
studying differential equations more deeply than we will do it in this course.

7.1 Lebesgue Integral


Definition/Remark 7.1.1 Where do null sets come from?
If we want to generalize the notion of the length of (bounded) intervals and intro-
duce "one-dimensional volumes" of certain subsets of the real line, we naturally
have certain expectations which should be dealt with. To formulate them, we
denote this one-dimensional volume of A ⊆ R by ℓ(A).

• For intervals we have ℓ([a, b]) = ℓ((a, b)) = b − a, a ≤ b ∈ R.

• If for A, B ⊆ R the sizes ℓ(A), ℓ(B) are defined and A ∩ B = ∅, then
ℓ(A ∪ B) = ℓ(A) + ℓ(B).

• If ℓ(A) is defined, then ℓ(A) ≥ 0.

In particular this implies that for A ⊆ B ⊆ R,

ℓ(B) = ℓ(A ∪ (B \ A)) = ℓ(A) + ℓ(B \ A) ≥ ℓ(A).


It is quite natural to demand that the second wish extends to infinite unions in
the following way:
If U_i, i ∈ N, are subsets for which ℓ(U_i) always is defined, then

ℓ( ⋃_{i=1}^∞ U_i ) ≤ ∑_{i=1}^∞ ℓ(U_i).

Now assume that A = {a_i | i ∈ N} is the range of a sequence. We call such sets
countable.¹ Choose some natural n > 1 and consider

U_i = (a_i − 1/n^i, a_i + 1/n^i).

We have ℓ(U_i) = 2/n^i. We therefore arrive at

ℓ(A) ≤ ∑_{i=1}^∞ 2/n^i = 2 · 1/(n − 1),

using the geometric series (note, however, that the summation starts with i = 1).
This shows (with n going to infinity) that ℓ(A) necessarily is 0.
The procedure motivates the following definition:
If for a set A ⊆ R and for every ε > 0 there exists a sequence of intervals
I_i = (a_i, b_i) ⊆ R such that

A ⊆ ⋃_{i=1}^∞ I_i  and  ∑_{i=1}^∞ (b_i − a_i) ≤ ε,

then we call A a null set.


For calculating one-dimensional volumes, null sets are negligible.
The union of two (and even countably many) null sets is a null set.
If S ⊆ R is a set and some property P, depending on s ∈ S, holds for all s ∈ S
outside some null set, we say that P is true almost everywhere in S.

Example 7.1.2 Rational numbers and the Cantor set

(a) The fact that the unit interval [0, 1] has length 1 implies that it is not a
null set. As we know from 7.1.1 that countable sets are null sets, we see that
[0, 1] cannot be countable; it therefore cannot be the range of a sequence.
¹ Some people prefer "at most countable" and treat the finite sets independently. For our
purposes, however, it is more convenient to call the finite sets countable as well.

(b) It is quite surprising that the rational numbers Q are a countable set.
However, we already saw that in 3.2.4 (b), where we constructed a sequence
attaining every rational number as a value! Therefore, rational numbers are
a null set.
As a consequence, the set of irrational numbers cannot be a null set, because
otherwise [0, 1] would be a null set, which it is not. This implies that in a certain
sense, there are many more irrational numbers than rational ones.

(c) There are also null sets which are not countable. A prominent example
is the Cantor set. This can be constructed by starting with [0, 1] = C_0,
and once you have C_n, which is a union of 2^n intervals of length 1/3^n,
you construct C_{n+1} by deleting the (open) middle third of each of these
previously remaining intervals.
The first few sets in this sequence are
C0 = [0, 1],
C1 = [0, 1/3] ∪ [2/3, 1],
C2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1].
Then define the Cantor set to be the intersection of all these C_n,

C_∞ = ⋂_{n=0}^∞ C_n.

As C_∞ is contained in C_n, and C_n is a union of intervals whose lengths
add up to (2/3)^n, we see that C_∞ is a null set. Its description as

C_∞ = {x ∈ [0, 1] | in the expansion of x to base 3, the digit 1 does not occur}

can be used to show that C_∞ is uncountable, using Cantor's diagonalization
procedure. We do not explain this here.

Definition 7.1.3 Step functions


Let I = [a, b] ⊆ R be an interval. A step function on I is a function σ : I → R
such that there exists a number n ∈ N and real numbers

a = x0 < x1 < x2 < . . . < xn−1 < xn = b

as well as real numbers c_i, 1 ≤ i ≤ n, such that

∀ i = 1, ..., n :  σ(x) = c_i,  if x_{i−1} < x < x_i.

There will be values σ(x0 ), σ(x1 ), . . . , σ(xn ) as well, but we do not make any
instructions concerning these. This has the advantage that sums of step functions
again are step functions.

We already define the integral of the step function σ over the interval I as

∫_a^b σ(x) dx = ∑_{i=1}^n c_i · (x_i − x_{i−1}).

This is motivated by the area of a rectangle being the product of its side lengths.
Note, however, that the c_i may be negative, so that the interpretation of the
integral as an area is not fully justified – unless you speak about "signed" area.
If all c_i are non-negative, the integral ∫_a^b σ(x) dx indeed is the sum of the areas
of the rectangles with side lengths (x_i − x_{i−1}) and c_i. This is the area of the
region between the x-axis and the graph of σ.

Definition 7.1.4 Positive Lebesgue integrals


The idea of integrating non-negative functions is to exhaust the region bounded
by their graph and the x-axis by corresponding regions for step functions. This
will not work for every function, and therefore the definition is slightly more
technical.
Let I = [a, b] as before. A function f : I → R belongs to L⁺(I) if the following
three conditions hold:

• f(x) is non-negative almost everywhere in I.

• There exists a sequence of step functions σ_n : I → R such that for every n
0 ≤ σ_n(x) ≤ σ_{n+1}(x) almost everywhere in I, and for almost every x ∈ I
lim_{n→∞} σ_n(x) = f(x).

• For this sequence, lim_{n→∞} ∫_a^b σ_n(x) dx exists.

Then we define

∫_a^b f(x) dx = lim_{n→∞} ∫_a^b σ_n(x) dx.

We will explain later why this limit does not depend on the chosen sequence of
step functions.
We will also write ∫_I f(x) dx = ∫_a^b f(x) dx if I = [a, b].

Example 7.1.5 My first Lebesgue integral


To get an idea of how this sometimes can be made explicit, we treat the following
example. Let b > 0 be fixed and look at

f : I = [0, b] → R,  f(x) = x.

We expect the integral to be b²/2, the area of a right-angled triangle with both
legs equal to b.

The step functions we look at are

σ_n : I → R,  σ_n(x) = (i − 1)b/2^n,  if (i − 1)b/2^n ≤ x < i·b/2^n,

and σ_n(b) = 0.
The following picture shows the graph of σ2 in black and what is put on top as
you go to σ3 in red.
[Figure: the graph of the step function σ_2 in black, with the additional steps added for σ_3 in red.]
As |σ_n(x) − f(x)| ≤ b/2^n for x ≠ b, we see that σ_n(x) converges to f(x) almost
everywhere. We get

∫_0^b σ_n(x) dx = ∑_{i=1}^{2^n} (i − 1)b/2^n · b/2^n = b²/2^{2n} · 2^n(2^n − 1)/2 →_{n→∞} b²/2.

Here we used 2.1.2(a) in between.
This is already very promising.
Similarly, we can integrate f(x) = x² for x ∈ [0, b] and – using similarly defined
step functions and 2.1.2(c) – end up with

∫_0^b x² dx = b³/3.

Example 7.1.6 A discontinuous function


Here we look at the function

f : [0, 1] → R,  f(x) = 0 if x ∈ Q,  f(x) = 1 if x ∉ Q.

We saw this function in Example 4.1.4, where we had defined it on the real line.
Now we restrict the domain to the unit interval. The constant function 1 on [0, 1]
is a step function which is nonnegative and coincides with f almost everywhere,
so we can use it to calculate

∫_0^1 f(x) dx = 1.
In fact, we could have prescribed arbitrary values for f at the rational numbers
without changing the integral, because the rational numbers are a null set.

Lemma 7.1.7 Independence of choices


In Definition 7.1.4, the value ∫_a^b f(x) dx is independent of the sequence of step
functions.

Proof. We have to show that, if (σ_n)_n and (τ_n)_n are two sequences of non-negative
step functions such that for almost all x ∈ [a, b] the sequences (σ_n(x))_n and
(τ_n(x))_n grow monotonically and converge to f(x), then

lim_{n→∞} ∫_a^b σ_n(x) dx = lim_{n→∞} ∫_a^b τ_n(x) dx.
In order to achieve that, we fix some natural number m and consider the difference

δ̃_n(x) := τ_m(x) − σ_n(x),  x ∈ [a, b].

We then define

δ_n(x) := max(δ̃_n(x), 0),

such that δ_n is a step function with non-negative values, and as m is fixed, we
get δ̃_{n+1}(x) ≤ δ̃_n(x) for all x, so that δ_n(x) also decreases.
As σ_n(x) converges to f(x), which is at least τ_m(x) almost everywhere, we will
have lim_{n→∞} δ_n(x) = 0 almost everywhere. We now invoke, without proof, that
under this condition

lim_{n→∞} ∫_a^b δ_n(x) dx = 0.
This is intuitively plausible, but not really easy to show.
As a consequence,

∫_a^b τ_m(x) dx ≤ lim_{n→∞} ∫_a^b σ_n(x) dx + lim_{n→∞} ∫_a^b δ_n(x) dx = lim_{n→∞} ∫_a^b σ_n(x) dx.

Now we may let m tend to infinity and see that

lim_{m→∞} ∫_a^b τ_m(x) dx ≤ lim_{n→∞} ∫_a^b σ_n(x) dx.

Exchanging the roles of τ_m and σ_n shows that "≥" also is true, hence both
limits coincide.
7.2. THE MAIN THEOREMS ON INTEGRATION 111
Definition 7.1.8 Lebesgue integral
Let I = [a, b] again. A function f : I → R is called Lebesgue integrable if it can
be written as
$$f = f^+ - f^-, \qquad f^+, f^- \in L^+(I).$$
We then set
$$\int_a^b f(x)\,dx = \int_a^b f^+(x)\,dx - \int_a^b f^-(x)\,dx$$
and call this the integral of f over the interval [a, b].
We denote by L(I) the set of all Lebesgue integrable functions.
Remark 7.1.9 Riemann’s integral
If $f : [a,b] \to \mathbb{R}$ is a real valued function, for introducing Riemann's integral
one looks at partitions $a = x_0 < x_1 < \dots < x_k = b$ and step functions $\sigma_i$ on $[a,b]$
which on every subinterval $[x_{i-1}, x_i]$ have value $f(\xi_i)$ for some $\xi_i \in [x_{i-1}, x_i]$.
There is no assumption of monotonicity of the $\sigma_i$ or of convergence to $f$ almost
everywhere. Instead, one assumes that for every such sequence of step functions
the Riemann sums, which are just the Lebesgue integrals of the step functions,
converge to some limit independent of the chosen subintervals and $\xi_i$, as the
maximum length of the subintervals tends to $0$. This limit is then the value of
the Riemann integral.
For example, the function in 7.1.6 is not integrable in the sense of Riemann,
as the limit of the Riemann sums depends on which $\xi_i$ you choose for the step
functions. If $t \in [0,1]$ is some fixed number and you use irrational $\xi_i$ for intervals
which contain numbers $\le t$ and rational ones for all intervals in $(t, 1]$, then the
sequence of Riemann sums will converge to $t$, and is not independent of the $\xi_i$.
In practice, this will mean that more functions tend to be integrable in the
Lebesgue sense than in the Riemann sense. On the other hand, for many functions
both integrals will exist and give the same value.
7.2 The main theorems on integration
Theorem 7.2.1 Basic rules
Let $I = [a, b] \subseteq \mathbb{R}$ be a compact interval and $f, g \in L(I)$.

(a) For every $\lambda \in \mathbb{R}$ we have
$$\int_a^b (f(x) + \lambda g(x))\,dx = \int_a^b f(x)\,dx + \lambda \int_a^b g(x)\,dx.$$
(b) If $g(x) \le f(x)$ almost everywhere in $I$, then
$$\int_a^b g(x)\,dx \le \int_a^b f(x)\,dx.$$
In particular, if $f(x) = g(x)$ almost everywhere, the integrals coincide.
(c) The function $x \mapsto |f(x)|$ is integrable in the sense of Lebesgue and
$$\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx.$$

(d) If $f(x) \ge 0$ almost everywhere and $\int_a^b f(x)\,dx = 0$, then $f(x) = 0$ almost
everywhere.

(e) For every $c \in (a, b)$, the restrictions of $f$ to $[a, c]$ and $[c, b]$ are integrable
and
$$\int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx.$$
Proof.
(a) This assertion is clear for functions in L+ (I) and λ ≥ 0, because given
sequences (σn )n , (τn )n of non negative step functions which calculate the
integrals of f and g respectively, the step functions σn + λτn will do so for
f + λg.
For general f and g the assertion can be deduced from this by making
several case distinctions for f + , f − , g + , g − .
(b) We treat the case of non negative functions only. If $(\sigma_n)_n, (\tau_n)_n$ are
sequences of non negative step functions which calculate the integrals of $f$
and $g$ respectively, we can copy the proof from 7.1.7 and see from $g - f \le 0$
almost everywhere that
$$\int_a^b \tau_m(x)\,dx \le \lim_{n\to\infty} \int_a^b \sigma_n(x)\,dx.$$
But this limit just is $\int_a^b f(x)\,dx$, and as $m$ goes to infinity, the claim follows.
The last assertion follows by symmetry.
(c) We omit the proof.
(d) If $(\sigma_n)_n$ is a sequence of non-negative step functions calculating the
Lebesgue integral of $f$, then $\int_a^b \sigma_n(x)\,dx > 0$ if $\sigma_n$ is non-zero. The
monotonicity implies that all $\sigma_n$ have to be zero. As they converge to $f$
almost everywhere, we have the assertion.
(e) It is enough to show this for non negative functions, in which case one can
just consider the case of step functions $\sigma_n$ as in the definition. For every $n$,
$$\int_a^b \sigma_n(x)\,dx = \int_a^c \sigma_n(x)\,dx + \int_c^b \sigma_n(x)\,dx,$$
and this identity carries over to the limit by 3.1.7.
Remark 7.2.2 Exchanged bounds
If $a < b$ are given and $f : [a,b] \to \mathbb{R}$ is integrable, then we define
$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx.$$
This will be helpful for circumventing several case distinctions later on.
Theorem 7.2.3 Continuous functions are integrable

Let $I = [a, b] \subseteq \mathbb{R}$ be a compact interval and $f : I \to \mathbb{R}$ continuous. Then $f \in L(I)$.
Proof. As $f$ has a minimal value by 4.2.5, there is some $C > 0$ such that $f + C$ is
positive. If we can show that this is integrable, then $f = (f + C) - C$ is integrable,
as a constant function certainly is.
Therefore it suffices to consider the case of $f(x) \ge 0$ for every $x$.
We construct a family of step functions $\sigma_n \ge 0$ in the following manner. Fix a
natural number $n$. Consider the subintervals $I_i = \left[a + (i-1)\cdot\frac{b-a}{2^n},\, a + i\cdot\frac{b-a}{2^n}\right]$,
$i = 1, \dots, 2^n$. Let $v_i$ be the minimal value of $f$ on this compact interval, which exists
again by 4.2.5. Then $\sigma_n$ is defined to be the function taking the value $v_i$ on the
interval $I_i \smallsetminus \{a + i\cdot\frac{b-a}{2^n}\}$ and $0$ at $b$.
As every interval for the construction of σn+1 is contained in one of the intervals
of the generation before, the minimal value will not be smaller than there, and
(σn )n is a monotonically increasing sequence of step functions.
Now let $x \in [a, b)$ and some $\varepsilon > 0$ be given. As $f$ is continuous there exists
some $\delta > 0$ such that
$$|f(x) - f(y)| \le \varepsilon, \quad\text{if } y \in (x - \delta, x + \delta) \cap I.$$
As soon as $n$ satisfies $\frac{b-a}{2^n} < \delta$, we therefore see that
$$|\sigma_n(x) - f(x)| \le \varepsilon.$$
This implies that $f(x) = \lim_{n\to\infty} \sigma_n(x)$.
As lastly the sequence $\left(\int_a^b \sigma_n(x)\,dx\right)_n$ is monotonically increasing and obviously
bounded by $(b - a) \cdot \max(f(I))$, we get convergence of this sequence by 3.2.8.
Therefore all we have demanded for $f$ to be integrable is satisfied.
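The construction from this proof can be imitated numerically (a sketch only: the exact minimum on each subinterval is approximated here by sampling, and the function and interval are arbitrary sample choices):

    import math

    # Lower step-function integral: on each of the 2^n dyadic subintervals
    # of [a, b], approximate the minimal value of f by dense sampling and
    # sum value * length.
    def lower_step_integral(f, a, b, n, samples=50):
        N = 2 ** n
        width = (b - a) / N
        total = 0.0
        for i in range(N):
            left = a + i * width
            v = min(f(left + k * width / samples) for k in range(samples + 1))
            total += v * width
        return total

    f = lambda x: math.sin(x) + 2.0
    for n in range(1, 7):
        print(n, lower_step_integral(f, 0.0, 3.0, n))
    # The values increase with n (up to the sampling error) and approach
    # the integral of sin(x) + 2 over [0, 3], namely 7 - cos(3) = 7.98999...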
Remark 7.2.4 Riemann’s sums for continuous functions
If $f : [a,b] \to \mathbb{R}$ is continuous, then it is also integrable in the sense of Riemann,
and its Riemann integral coincides with that in the sense of Lebesgue.
The sequence of step functions we just constructed for the Lebesgue integral is
also suitable for calculating the Riemann integral, but it is slightly more subtle
to show the independence of the integral from the chosen sequence, which is
something we demanded for the Riemann case.
Theorem 7.2.5 Mean value theorem for integral calculus
Let $f, g : [a,b] \to \mathbb{R}$ be given such that $g \in L^+([a,b])$ and $f$ is continuous.
Then there exists a $z \in (a, b)$ with
$$\int_a^b f(x)g(x)\,dx = f(z) \cdot \int_a^b g(x)\,dx.$$
Proof. As $f$ is bounded, there exist a minimal value $m$ and a maximal value $M$
of $f$ in the compact interval $[a, b]$. As $g$ is non negative almost everywhere, we
see that
$$m \cdot g(x) \le f(x) \cdot g(x) \le M \cdot g(x)$$
for almost every $x \in [a, b]$. The integrability of $fg$ can be seen similarly to that of
$f$ itself as in 7.2.3, because $g$ is supposed to be integrable.
But then 7.2.1 tells us that
$$m \int_a^b g(x)\,dx \le \int_a^b f(x)g(x)\,dx \le M \int_a^b g(x)\,dx.$$
If $\int_a^b g(x)\,dx = 0$, then $g$ is zero almost everywhere by 7.2.1, which shows $fg$ to
be zero almost everywhere, hence $\int_a^b f(x)g(x)\,dx = 0$, showing the assertion for
every $z \in (a, b)$.
In the other case,
$$m \le \frac{\int_a^b f(x)g(x)\,dx}{\int_a^b g(x)\,dx} \le M,$$
and by the intermediate value theorem 4.2.7 we know that every number between
$m$ and $M$ is a value of $f$, showing the assertion.
Remark 7.2.6 Important special case
In the special case $g = 1$ (constantly) we find for continuous $f$ that there is
some $z \in (a, b)$ with
$$\int_a^b f(x)\,dx = f(z) \cdot (b - a).$$
There is nothing to prove, because $\int_a^b 1\,dx = b - a$.
Theorem 7.2.7 Fundamental theorem of differential and integral calculus

Let $I$ be an open interval and $f : I \to \mathbb{R}$ a continuous function. Choose some
$c \in I$ and define the function $F$ on $I$ by
$$F(x) = \int_c^x f(t)\,dt.$$
Then $F$ is differentiable on $I$ and we have $F' = f$.
Proof. Fix $x \in I$ and look at $h \in \mathbb{R}$ such that $x + h \in I$. Then by 7.2.1 and
7.2.6
$$F(x+h) - F(x) = \int_x^{x+h} f(t)\,dt = h \cdot f(z)$$
for some $z$ between $x$ and $x + h$. This is also true for negative $h$ thanks to 7.2.2.
As $f$ is continuous this implies
$$\lim_{h\to 0} \frac{F(x+h) - F(x)}{h} = \lim_{z\to x} f(z) = f(x).$$
Theorem 7.2.8 Variant of this

If $I$ is an open interval and $F : I \to \mathbb{R}$ is continuously differentiable on $I$ then
for all $a, b \in I$ we have
$$\int_a^b F'(t)\,dt = F(b) - F(a).$$
Proof. The function $\tilde F(x) = \int_a^x F'(t)\,dt$ is an antiderivative of $F'$ by 7.2.7. Two
antiderivatives of $F'$ only differ by an additive constant due to 6.3.7. As $\tilde F$ and
$F - F(a)$ are both antiderivatives of $F'$ with value $0$ at $x = a$, these functions
coincide, and hence
$$\int_a^b F'(t)\,dt = \tilde F(b) = F(b) - F(a).$$
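A quick numerical illustration of 7.2.7 (the trapezoidal rule used here is only an approximation and not how the integral was defined; all names are ad hoc):

    import math

    # Approximate F(x) = integral of f from c to x by the trapezoidal rule.
    def trapez(f, a, b, steps=10000):
        h = (b - a) / steps
        return h * (0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, steps)))

    f = math.cos
    c, x, h = 0.0, 1.0, 1e-4
    F = lambda t: trapez(f, c, t)
    # The difference quotient of F recovers f(1) = cos(1) = 0.5403...
    print((F(x + h) - F(x)) / h)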
Remark 7.2.9 Yet another variant
This variant can be extended to the following situation: If $I = [a,b]$ is a compact
interval, $f : [a,b] \to \mathbb{R}$ is continuous on $[a,b]$ and continuously differentiable on
$(a,b)$, and if $f'$ is integrable on $[a,b]$, then
$$\int_a^b f'(t)\,dt = f(b) - f(a).$$
The crucial point is that in 7.2.8 the interval $[a, b]$ is contained in some larger
interval where $f$ can be extended in a reasonable way. There really is something
to do in order to prove this equality if such an extension is not possible.
Definition 7.2.10 Notation for this difference and indefinite integrals
If $f$ is continuous and $F$ is an antiderivative of $f$, then $\int_a^b f(x)\,dx = F(b) - F(a)$.
The common shorthand notation for this difference is
$$F(t)\big|_{t=a}^{t=b} = F(b) - F(a).$$
If the context is clear we will also write $F\big|_a^b$ for this.
It is common to write the antiderivative of a continuous function $f$ on an interval
as an integral without bounds, called an indefinite integral:
$$F(x) = \int f(x)\,dx,$$
where the functions are only fixed up to some additive constant, which can be
chosen freely as it does not change the derivative.
The author would prefer a notation like
$$\int^x f(t)\,dt,$$
to stress that the variable of $F(x)$ is not the variable for the integration but the
upper bound in the integral. The missing lower bound still leaves place for the
additive constant.
7.3 Calculating some integrals
The theory now is helpful in doing concrete calculations.
Example 7.3.1 Power functions
Let $r \ne -1$ be a real number and $f : \mathbb{R}_{>0} \to \mathbb{R},\ f(x) = x^r$. We know from 6.4.8
that
$$f'(x) = r x^{r-1},$$
and it is not hard to guess from this that
$$F(x) = \frac{x^{r+1}}{r+1}$$
is an antiderivative of $f$.
Therefore for positive numbers $a, b$ we get
$$\int_a^b x^r\,dx = \frac{1}{r+1}\left(b^{r+1} - a^{r+1}\right).$$
An antiderivative of $f(x) = 1/x$ is the natural logarithm.
As an application of the first rule we find that for $r < -1$ and every natural
number $n$
$$\sum_{k=1}^{n} k^r \le 1 + \sum_{k=2}^{n} k^r \le 1 + \int_1^n x^r\,dx = 1 + \frac{1}{r+1}\left(n^{r+1} - 1\right) \longrightarrow_{n\to\infty} 1 - \frac{1}{r+1},$$
as $r + 1$ still is negative. Therefore the infinite series
$$\sum_{k=1}^{\infty} k^r$$
converges, as the sequence of partial sums is monotonically increasing and – as
we now know – bounded.
This is in sharp contrast to the case $r = -1$ (harmonic series) where no upper
bound for the partial sums exists. Because: the logarithm is an antiderivative of
$\frac{1}{x}$ and is unbounded as $x$ goes to $\infty$.
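A quick numerical look at the integral bound (an illustration with the sample choice $r = -2$):

    # Partial sums of sum k^r versus the bound 1 + (n^(r+1) - 1)/(r+1).
    r = -2.0
    s = 0.0
    for k in range(1, 10001):
        s += k ** r
        if k in (10, 100, 1000, 10000):
            bound = 1 + (k ** (r + 1) - 1) / (r + 1)
            print(k, s, bound)
    # For r = -2 the partial sums stay below the bound (which tends to 2);
    # the series in fact converges to pi^2/6 = 1.6449...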
Remark 7.3.2 Integration by parts
From 6.2.1 we know that for two differentiable functions $f, g$ on an interval $I$ we
have
$$(fg)' = f'g + fg'.$$
In particular, $fg$ is an antiderivative of $f'g + fg'$. Depending on the nature of
$f$ and $g$ this can sometimes be used in order to reduce the integration of $f'g$ to
that of $fg'$ by writing
$$\int_a^b f'(x)g(x)\,dx = \big[f(x)g(x)\big]_a^b - \int_a^b f(x)g'(x)\,dx.$$
This strategy is called integration by parts.
The task in practice would be to decompose a function $h$ you might want to
integrate as a product $h = f'g$ of two functions where $g$ is continuously
differentiable, where you know an antiderivative $f$ of $f'$ and where $h$ and $fg'$ are
integrable over the interval in question.
If moreover the integral of $fg'$ is more accessible than that of $f'g$, then you might
profit from this investment.
Example 7.3.3 Some examples
(a) In order to integrate $\ln(x)$ you can use $f'(x) = 1$, $g(x) = \ln(x)$. Together
with $f(x) = x$ and $g'(x) = \frac{1}{x}$ this gives for positive $a, b$
$$\int_a^b \ln(x)\,dx = \big[x\ln(x)\big]_{x=a}^{x=b} - \int_a^b 1\,dx = b\ln(b) - a\ln(a) - b + a.$$
An antiderivative of $\ln(x)$ is $x\ln(x) - x$.

(b) In order to integrate $x\sin(x)$ from $a$ to $b$ one can use $g(x) = x$, $f'(x) = \sin(x)$,
which with $g'(x) = 1$, $f(x) = -\cos(x)$ leads to
$$\int_a^b x\sin(x)\,dx = \big[-x\cos(x)\big]_{x=a}^{x=b} + \int_a^b \cos(x)\,dx = \big[-x\cos(x) + \sin(x)\big]_{x=a}^{x=b}.$$
An antiderivative of $x\sin(x)$ is $-x\cos(x) + \sin(x)$.
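Both antiderivatives can be cross-checked with a computer algebra system (a sketch assuming the sympy library is available):

    import sympy as sp

    x = sp.symbols('x', positive=True)
    # Differentiating the claimed antiderivatives gives back the integrands.
    print(sp.simplify(sp.diff(x * sp.log(x) - x, x)))           # log(x)
    print(sp.simplify(sp.diff(-x * sp.cos(x) + sp.sin(x), x)))  # x*sin(x)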
Remark 7.3.4 Substitution
Let $f$ and $g$ be two differentiable functions which can be composed (i.e. the
range of $f$ is contained in the domain of $g$).
As $(g \circ f)'(x) = g'(f(x)) \cdot f'(x)$, we see that $g \circ f$ is an antiderivative of
$g'(f(x))f'(x)$, which leads to
$$\int_{f(a)}^{f(b)} g'(t)\,dt = g(t)\Big|_{t=f(a)}^{t=f(b)} = (g \circ f)(x)\Big|_{x=a}^{x=b} = \int_a^b g'(f(x)) \cdot f'(x)\,dx.$$
Of course here instead of $g$ only $g'$ might be given. Writing $h = g'$ leads to
$$\int_{f(a)}^{f(b)} h(t)\,dt = \int_a^b h(f(x))f'(x)\,dx.$$
Example 7.3.5 The area of a disc
You will not be surprised to learn that the area of a disc is $\pi$ times the square
of its radius. Nevertheless we will prove that now. Let $D$ be the disc with radius
$r$ centered at the origin of the plane. By Pythagoras' Theorem, the boundary of
the disc is the set
$$C = \{(x,y) \in \mathbb{R}^2 \mid x^2 + y^2 = r^2\} = \{(x,y) \in \mathbb{R}^2 \mid -r \le x \le r,\ y = \pm\sqrt{r^2 - x^2}\}.$$
The integral
$$A = \int_{-r}^{r} \sqrt{r^2 - x^2}\,dx$$
calculates half of the area of this disc.
The geometric picture we have in mind suggests that we also may write $x$ as
$$x = f(t) = r\cos(t), \qquad 0 \le t \le \pi.$$
The above formula then reads as (note the changed roles of $x$ and $t$ ;-))
$$A = -\int_0^\pi \sqrt{r^2 - r^2\cos^2(t)} \cdot (-r\sin(t))\,dt = r^2 \int_0^\pi \sin^2(t)\,dt.$$
The addition formula 2.2.6 for trigonometric functions tells us that
$$\sin^2(x) = \frac{1}{2}(1 - \cos(2x)),$$
hence
$$A = \frac{r^2}{2} \int_0^\pi (1 - \cos(2t))\,dt = \frac{r^2}{2}\left[t - \frac{\sin(2t)}{2}\right]_0^\pi = \frac{\pi}{2}\, r^2.$$
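A numerical sanity check of this result (midpoint rule, an approximation only; the radius is an arbitrary sample value):

    import math

    # Midpoint-rule approximation of the half-disc integral.
    def midpoint(f, a, b, steps=100000):
        h = (b - a) / steps
        return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

    r = 1.5
    A = midpoint(lambda x: math.sqrt(r * r - x * x), -r, r)
    print(A, math.pi * r * r / 2)  # both approximately 3.534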
Example 7.3.6 Another example
We now look at the integral
$$\int_a^b \frac{1}{\sqrt{1+x^2}}\,dx.$$
As we have
$$\cosh(t)^2 - \sinh(t)^2 = 1 \quad\text{and}\quad \cosh(t) > 0,$$
we find
$$\cosh(t) = \sqrt{1 + \sinh^2(t)}.$$
We therefore substitute $x = \sinh(t)$ and choose $\alpha, \beta$ such that
$$\sinh(\alpha) = a, \qquad \sinh(\beta) = b,$$
leading to
$$\int_a^b \frac{1}{\sqrt{1+x^2}}\,dx = \int_\alpha^\beta \frac{1}{\sqrt{1+\sinh^2(t)}}\,\cosh(t)\,dt = \int_\alpha^\beta 1\,dt = \beta - \alpha.$$
As $\alpha = \operatorname{Arsinh}(a)$ (this is the common name of the inverse function to $\sinh$), we
see that
$$\int_0^a \frac{1}{\sqrt{1+x^2}}\,dx = \operatorname{Arsinh}(a).$$
7.4 Some elementary differential equations
Remark 7.4.1 Introductory remarks
We remind the reader that in 6.4.4 we already introduced the notion of a differential
equation.
The most basic differential equation one might think of is the equation
$$f'(x) = u(x),$$
and we just learned how to solve this on an interval, using the indefinite integral
$\int^x u(t)\,dt$.
It might therefore be a good place to say something more on at least certain
classes of differential equations we can deal with right now.
Example 7.4.2 Gravitation
It is one of the basic observations in classical mechanics that the acceleration of a
“point with mass m ” moving in vacuum is equal to the gravitational constant g
at the place, where the point just happens to be. For a short range movement the
gravitational constant deserves this name. If we describe the vertical movement
of our point by a function x(t) (where t denotes time) we get the equation
$$\ddot x(t) = g.$$
This leads to
$$\dot x(t) = g \cdot t + v_0, \qquad v_0 \text{ a constant},$$
and this again leads to
$$x(t) = \frac{1}{2}gt^2 + v_0 \cdot t + x_0, \qquad x_0 \text{ a constant}.$$
Of course for a specific movement there should be no choices of the constants –
they should be prescribed from the circumstances. But then we see that x0 = x(0)
is the place, where the point is at time 0, and v0 = ẋ(0) is the speed at time 0.
And of course the movement will depend on where the point happens to be at
time 0 and whether you just let it fall or throw it into the air or smash it to the
ground.
Thus in addition to the differential equation itself we also will need certain
“boundary conditions” to get a solution for a specific problem.
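A minimal numerical sketch of this example (the values of $g$, $v_0$, $x_0$ are arbitrary samples, and the Euler scheme only approximates the exact solution):

    # Integrate x'' = g twice with a simple Euler scheme and compare
    # with the closed-form solution x(t) = g t^2 / 2 + v0 t + x0.
    g, v0, x0 = 9.81, 2.0, 1.0
    dt, T = 1e-4, 2.0
    x, v, t = x0, v0, 0.0
    while t < T:
        x += v * dt   # position changes with the current speed
        v += g * dt   # speed changes with the constant acceleration
        t += dt
    print(x, 0.5 * g * T * T + v0 * T + x0)  # both approximately 24.62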
Example 7.4.3 The oscillating pendulum
One of the other classical examples is that of an oscillating pendulum. This is some
weight with mass m fixed at a rope of length l which hangs from some scaffold

so that the weight can oscillate freely without being hindered by whatever. . . only
gravitation plays its role. We describe the location of the weight by its angle φ
with respect to the vertical axis emanating from the upper end of the rope. The
angle is a function φ(t) with respect to time, and it turns out that its acceleration
is given by
$$\ddot\varphi(t) = -\frac{g}{l}\sin(\varphi(t)).$$
This equation already is quite hard to solve, and it is much nicer to look at the
approximation for small angles, i.e. in the case that the initial displacement is
not too large. Then $\sin(\varphi(t)) \sim \varphi(t)$, and we see solutions, namely
$$a \cdot \cos\left(\sqrt{g/l}\cdot t\right) + b \cdot \sin\left(\sqrt{g/l}\cdot t\right), \qquad a, b \in \mathbb{R}.$$
One can even show (but not we, at least not now!) that these are the only
solutions.
Example 7.4.4 The catenary
The catenary is the shape of a rope or chain hanging freely from two fixed ends.
We describe this as the graph of a function f on some interval [a, b] and assume
that the length of the rope is longer than the distance between the positions of
its ends. Then this function has a minimum in (a, b). We move the x -axis such
that the minimum is attained at x = 0. We now pick some point x > 0 ( x < 0
will work similarly) and describe the three (blue) forces working at the (red) part
of the rope above the interval [0, x] in order to derive information on f 0 (x). The
hopefully helpful picture comes here:
[Figure: the part of the rope above $[0, x]$ (red), with the tension forces $F$, $-F$ at both ends and the gravitational force $F_g$ (blue); the tension force at the right end is decomposed into its horizontal and vertical components (dashed).]
The forces which contribute are the two tension forces at the ends of the red part
of the rope (in tangential direction) and the gravitational force. We decompose
the tension force at the right end into its vertical and horizontal components
(dashed blue lines). As the three forces have to add up to zero (otherwise we
would have some movement of this part of the rope) we see that the horizontal
part at the right end is the same as that on the left end (other direction) and
that the vertical part at the right end is the gravitational force. As the tension
force at the right end is tangential to the curve, its slope is $f'(x)$, and we get
that
$$f'(x) \cdot F = F_g = \text{density} \cdot (\text{length of red curve}).$$
Here, “density” is understood as the density with respect to length; the rope is
an ideal rope, homogeneous and without width;-)
As the force $F$ at the left end is independent of $x$, we see that $f'(x)$ is proportional
to the length of the red curve, which depends on $x$. This is the arc length
of the curve
$$\gamma : [0, x] \to \mathbb{R}^2, \qquad \gamma(t) = (t, f(t)).$$
The velocity of this curve is $\gamma'(t) = (1, f'(t))$, and the absolute value of the speed
therefore
$$\|\gamma'(t)\| = \sqrt{1 + f'(t)^2}.$$
Therefore the length of the red curve is – cf. 6.1.5 –
$$\int_0^x \sqrt{1 + f'(t)^2}\,dt.$$
We arrange the units (for length, weight) such that the above mentioned
proportionality becomes an equality, and get
$$f'(x) = \int_0^x \sqrt{1 + f'(t)^2}\,dt.$$
Differentiating both sides with respect to $x$ leads to
$$f''(x) = \sqrt{1 + f'(x)^2},$$
and squaring then gives
$$f''(x)^2 = 1 + f'(x)^2.$$
We again differentiate and get
$$f''(x)f'''(x) = f'(x)f''(x).$$
As the slope of the rope grows monotonically, we have $f''(x) > 0$ everywhere
such that we may divide by $f''(x)$:
$$f'(x) = f'''(x), \qquad a \le x \le b.$$
This forces by repeated differentiation that for every $n \in \mathbb{N}$
$$f^{(2n)}(x) = f''(x) \quad\text{and}\quad f^{(2n-1)}(x) = f'(x).$$
We now remember that $f'(0) = 0$, because $f$ has a local minimum there, and
hence $f''(0) = 1$, as it is positive with square $1$. We therefore can write down the
Taylor series for $f'$ at $0$, which is
$$T_{f'}(x) = \sum_{n=0}^{\infty} f^{(n+1)}(0)\,\frac{x^n}{n!} = \sum_{n=1}^{\infty} \frac{x^{2n-1}}{(2n-1)!},$$
because the even derivatives of $f'$ vanish at $0$ and the odd derivatives are $1$.
Therefore, cf. 5.4.8, (the Taylor series of) $f'(x) = \sinh(x)$ and $f(x) = \cosh(x) + c$,
where $c$ depends on the height where the nails holding the rope are hit into the
wall.
* * * * *
We now study three types of differential equations of first order, i.e. only involving
the unknown function and its first derivative. These types are all of the shape
$$u'(x) = f(x, u(x)),$$
where $f : I \times \mathbb{R} \to \mathbb{R}$ is some given function, $I$ is the interval where we study the
function, and often we impose some boundary condition on $u$ by demanding
that $u(x_0) = u_0$ for a specified number $x_0 \in I$.
This is just a series of exemplary situations and will only demonstrate that one has
to adapt the tools to the given differential equation. There is no recipe working
for all of them, and there will be other strategies than the ones described here as
well.
Example 7.4.5 Separable differential equations
The term separable here means that $f(x, u) = g(x) \cdot h(u)$ is a product of two
functions, each depending on one of both variables only.
We often assume that $h(u)$ never gets zero, so that we can divide by it and truly
separate the $u$-part and the $x$-part, arriving at
$$\frac{u'(x)}{h(u(x))} = g(x).$$
If $H$ is an antiderivative of $\frac{1}{h}$, then by the chain rule
$$\frac{d}{dx} H(u(x)) = \frac{u'(x)}{h(u(x))} \overset{!}{=} g(x),$$
such that $H(u(x))$ is an antiderivative of $g$, which we can calculate by integration,
so that
$$u(x) = H^{-1}\left(\int_{x_0}^x g(t)\,dt\right)$$
solves the original equation. Note that $H$ is injective as $h(u)$ never gets zero,
which means there are no sign changes for the values of $h$, and therefore $H$ is
strictly monotonic.
This is the general strategy, and we now look at two more special cases.

$$u'(x) = h(u(x)).$$

This is the case if, for instance, the physical conditions which are imposed
on the movement of a particle at position u in dependence on time x do
not change during the time of observation.
Then an antiderivative of $g$ is $G(x) = x + \text{const.}$, and $u(x) = H^{-1}(x + \text{const.})$.
There is also an additive constant involved in the choice of $H$, and we should
arrange for the constants in such a way that $H^{-1}(x + \text{const.})$ is defined at least
at $x = x_0$ and such that the value there is our boundary value $u_0$ – if possible.
No one can promise that such a choice always is possible, and in particular
it might not be possible to define u on all of I.
Sometimes one is particularly interested in stationary solutions, which are
just constant functions and are possible if h(u0 ) = 0. Then it will be
interesting whether this solution is stable, i.e. small changes will lead to a
new solution which approaches u0 , or unstable, i.e. a small change of the
conditions leads to a new solution moving away from u0 . But as you see,
in this case the old assumption that h never gets zero is no longer valid, so
that the original strategy will not work unchanged.
(b) The case h(u) = u, the so called homogeneous linear differential equation
of first order:
$$u'(x) = g(x) \cdot u(x).$$
We already dealt with this type of equation in 6.4.4 and saw the specific
solution
$$u(x) = \exp(G(x)), \quad\text{where } G' = g.$$
We can now calculate such a G for every continuous g on an interval
I, using integration. We now take a second solution ũ of the equation
and compare it to the first solution u by dividing ũ by u, i.e. looking at
v(x) = ũ(x) exp(−G(x)).
Differentiation as well as application of the product rule and the chain rule
lead to
$$v'(x) = g(x) \cdot \tilde u(x) \cdot \exp(-G(x)) + \tilde u(x) \cdot (-g(x)) \cdot \exp(-G(x)) = 0.$$
Therefore v(x) is constant by 6.3.7 and hence the functions
$$x \mapsto c \cdot u(x), \qquad c \in \mathbb{R} \text{ constant},$$
are the only solutions of our homogeneous linear equation.
As $u(x_0) = \exp(G(x_0)) \ne 0$, we can arrange for a suitable constant $c$ such
that $cu(x_0) = u_0$ for any given boundary value $u_0$.
Conclusion: For every boundary value $u_0$ and every $x_0 \in I$, there is a
unique solution $u$ for the equation $u' = g \cdot u$ with boundary condition
$u(x_0) = u_0$.
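A tiny worked instance of case (b) (a sketch; the concrete $g$ and the boundary values are chosen only for illustration):

    import math

    # u'(x) = u(x)/x on (0, oo): here g(x) = 1/x and G(x) = ln(x), so the
    # solutions are u(x) = c * exp(G(x)) = c * x.
    g = lambda x: 1.0 / x
    G = math.log                  # an antiderivative of g
    x0, u0 = 2.0, 5.0
    c = u0 * math.exp(-G(x0))     # arrange c so that u(x0) = u0
    u = lambda x: c * math.exp(G(x))
    print(u(x0), u(4.0))          # 5.0 and 10.0: here u(x) = 2.5 * x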
Example 7.4.6 The inhomogeneous linear differential equation
This type is closely related to the last case we studied, but it introduces an
$$u'(x) = g(x) \cdot u(x) + h(x), \qquad x \in I.$$
Note that we are no longer in the separable case and that h has a different
meaning here. The function h is called the inhomogeneity of this equation.
As we can already solve the equation in case the inhomogeneity should be zero,
we could try to start with a solution of the homogeneous equation $u' = gu$
and modify this in order to get a solution of the inhomogeneous equation we
are interested in. The idea is that the derivative of a product is a sum, and we
now try to arrange for suitable factors. The attempt we make is the variation of
constants and uses a function
$$u(x) = c(x) \cdot \exp(G(x)),$$
where again $G$ is an antiderivative of $g$ and now $c$ is a function on $I$, often
called an integrating factor. Calculating the derivative of this function $u$ gives
$$u'(x) = c(x) \cdot g(x) \cdot \exp(G(x)) + c'(x) \cdot \exp(G(x)).$$
Here the first summand already is $g(x) \cdot u(x)$, so we should try to solve for
$$c'(x) \cdot \exp(G(x)) = h(x).$$
But as the exponential function never is zero, this just means
$$c'(x) = h(x) \cdot \exp(-G(x)).$$
Therefore $c$ has to be an antiderivative of the right hand side.
Reversing the calculations shows that for every antiderivative $c$ of $h \cdot e^{-G}$ the
function $c \cdot e^{G}$ solves the differential equation.
Again, if $u$ and $\tilde u$ are two solutions of the equation, $v = u - \tilde u$ solves the
homogeneous linear equation
$$v'(x) = g(x) \cdot v(x),$$
and this shows that for a fixed solution $u_s$ (the index $s$ standing for “special”)
the functions
$$u_s(x) + a \cdot \exp(G(x)), \qquad a \in \mathbb{R},$$
are all solutions of the inhomogeneous linear differential equation. Again, for every
boundary condition $u(x_0) = u_0$ there is a unique value of $a = (u_0 - u_s(x_0)) \cdot \exp(-G(x_0))$ giving the right solution.
We will see an example of this in 7.4.8.
Example 7.4.7 Bernoulli’s differential equation
There is yet another first order differential equation which can be solved by an
easy trick, namely Bernoulli's equation:
$$u'(x) = g(x) \cdot u(x) + h(x) \cdot u(x)^\alpha,$$
where $\alpha$ is some real constant which we may assume to be neither $0$ nor $1$, as
these cases would be linear again (inhomogeneous for $\alpha = 0$ and homogeneous
for $\alpha = 1$).
The idea is to substitute a suitable power for $u$ and get rid of the exponent $\alpha$,
i.e. reduce to the inhomogeneous linear case.
Setting $u = v^\lambda$, we see that
$$u' = \lambda v^{\lambda - 1} \cdot v'.$$
This gives the equation
$$\lambda v' = u' \cdot v^{1-\lambda} = g \cdot v + h \cdot v^{\lambda\alpha + 1 - \lambda}.$$
If $\lambda\alpha + 1 - \lambda = 0$, i.e. $\lambda = \frac{1}{1-\alpha}$, this is an inhomogeneous linear differential
equation for $v$ (after dividing by $\lambda$).
As now we have to transform back to $u$, we will often have to assume that $v$ only
takes positive values, such that $u = v^\lambda$ is defined.
We now solve such an equation in an example.
Example 7.4.8 One specific case
In this example we look at the equation
$$u'(x) = \frac{u(x)}{x} + \frac{x}{u(x)}, \qquad x > 0, \quad u(1) = 1,$$
which is a special case of Bernoulli’s differential equation with α = −1.
Therefore we set $\lambda = \frac{1}{2}$, $u = \sqrt{v}$, $g(x) = 1/x$, $h(x) = x$, and $v$ has to satisfy
$$v'(x) = 2v(x)/x + 2x.$$
Using the approach from 7.4.6 (now with $2/x$ and $2x$ in the roles of $g$ and $h$)
we calculate an integrating factor $c$ to be an antiderivative of
$$h(x)\exp(-G(x)) = 2x \cdot \exp(-2\ln(x)) = 2/x.$$
The solution for this is
$$c(x) = 2\ln(x) + k, \qquad k \text{ constant},$$
so that the candidates for $v$ are the functions
$$v(x) = (2\ln(x) + k) \cdot \exp(2\ln(x)) = (2\ln x + k) \cdot x^2.$$
As now we want $v = u^2$, we also get the boundary condition $v(1) = 1$, and this
forces $k = 1$, so that our solution will be
$$u(x) = x \cdot \sqrt{2\ln(x) + 1}.$$
But as we want a real square root, we have to demand that the expression in the
square root is non-negative, leading to $x \ge \frac{1}{\sqrt e}$, so that the solution cannot be
defined for all positive real numbers.
In the general task to solve differential equations this already indicates that we
will have to find a suitable (maximal. . . ) domain where the equation can be
solved, and this will also depend on the boundary conditions.
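The solution can be checked numerically with difference quotients (a sketch; the point $x = 2$ is an arbitrary sample):

    import math

    u = lambda x: x * math.sqrt(2 * math.log(x) + 1)
    x, h = 2.0, 1e-6
    lhs = (u(x + h) - u(x - h)) / (2 * h)   # u'(x), approximately
    rhs = u(x) / x + x / u(x)               # right hand side of the equation
    print(u(1.0), lhs, rhs)                 # 1.0, then twice 2.1921...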
7.5 Improper integrals
Remark 7.5.1 A new look at Lebesgue integrals
In our original definition of integrals in 7.1.8 we always had integrals over compact
intervals. We now want to relax this condition.
If f is only defined on a bounded interval I we can extend f to a function on
the closure of I by setting its values at the ends to be zero and define the integral
over I to be the integral over its closure. As a change of the integrand on a null
set does not change the integral this definition is independent of the choice of
values at the end points.
But we could also have done without this artificial choice of extension of f by
defining step functions slightly differently, namely that a step function on an

interval I (or even some other subset of the reals) is a step function on R which
has value zero outside I. In particular, such a step function is zero outside some
compact interval, as it has finitely many values on finitely many intervals of finite
length.
We can then transfer the definition of the Lebesgue integral word for word to the
case of unbounded intervals.
The task now still is to evaluate such integrals, as we can no longer use the
Fundamental Theorem 7.2.9, having no end points of the interval where we can
evaluate the antiderivative. The natural idea of course is to approximate the
domain by compact intervals. The technical tool for doing this is the following
theorem which we state in a special case only.
Theorem 7.5.2 Beppo Levi’s Theorem on monotonic convergence

Let $I \subseteq \mathbb{R}$ be an interval and $f : I \to \mathbb{R}$ be given.
If there is a sequence $(f_j)_j$ of Lebesgue integrable functions on $I$ such that for
almost every $x \in I$ the sequence $(f_j(x))_j$ converges monotonically increasing to
$f(x)$, and if $\left(\int_I f_j(x)\,dx\right)_j$ is bounded, then
$$\int_I f(x)\,dx = \lim_{j\to\infty} \int_I f_j(x)\,dx.$$
Remark 7.5.3 Two comments on this
The main difference to the very definition of the Lebesgue integral is that here
we do not use step functions only but any integrable functions. For the proof it
really is important that the convergence of the functions is “pointwise monotonic
almost everywhere”. In some sense one can use this to extract from step functions
$\sigma_{j,n}$ calculating the integral of $f_j$ one sequence of step functions for $f$.
The main application of this abstract result is the following:
Consequence 7.5.4 An exhaustion of I
Let $I \subseteq \mathbb{R}$ be an interval and $f : I \to \mathbb{R}$ a function with non negative values.
Let $I_j \subseteq I$ be subintervals such that $I_j \subseteq I_{j+1}$ for all $j$ and $\bigcup_j I_j = I$.
If $f$ is integrable on every $I_j$ and $\left(\int_{I_j} f(x)\,dx\right)_j$ is bounded, then
$$\int_I f(x)\,dx = \lim_{j\to\infty} \int_{I_j} f(x)\,dx.$$
Proof. We construct a sequence $(f_j)_j$ of functions on $I$ for which 7.5.2 can be
used, namely
$$f_j(x) = \begin{cases} f(x), & \text{if } x \in I_j,\\ 0, & \text{if } x \notin I_j.\end{cases}$$
Then for fixed $x \in I$ the sequence $(f_j(x))_j$ is
$$0, 0, 0, 0, \dots, 0, f(x), f(x), f(x), \dots$$
and remains constant from some index on. Therefore $(f_j)_j$ clearly satisfies the
conditions from 7.5.2.
Example 7.5.5 A Laplace transform
With the strategy just developed we can calculate
$$\int_0^\infty e^{-x}\,dx = \lim_{b\to\infty} \int_0^b e^{-x}\,dx = -\lim_{b\to\infty} e^{-x}\Big|_0^b = \lim_{b\to\infty}\left(1 - e^{-b}\right) = 1.$$
This gets slightly more interesting when the integrand is somewhat more intricate,
e.g.
$$\int_0^\infty x e^{-x}\,dx = \lim_{b\to\infty} \int_0^b x e^{-x}\,dx = \lim_{b\to\infty}\left(\left(-x e^{-x}\right)\Big|_0^b + \int_0^b e^{-x}\,dx\right) = 1,$$
because $\lim_{b\to\infty} b e^{-b} = 0$ due to 6.3.11.
Integrals of the shape $\int_0^\infty f(x)e^{-x}\,dx$ are special values of the Laplace transform
of $f$. Laplace transforms are a tool for solving differential equations and will be
introduced in the AM2-course as close relatives of Fourier transforms.
The first two integrals evaluated above can be extended inductively, showing
finally that for a polynomial $f(x) = \sum_{i=0}^d a_i x^i$
$$\int_0^\infty f(x)e^{-x}\,dx = \sum_{i=0}^d a_i \cdot i!.$$
Example 7.5.6 An unbounded function
Similarly, if $I = (a,b)$ and $f : I \to \mathbb{R}$ is non negative one can also use this
strategy and define $(f_j)_j$ as
$$f_j(x) = \begin{cases} f(x), & \text{if } a + \frac{1}{j} \le x \le b - \frac{1}{j},\\ 0, & \text{else}.\end{cases}$$
Then if each $f_j$ is integrable and the integrals $\int_I f_j(x)\,dx$ are bounded, $f$ will
be integrable as well and
$$\lim_{j\to\infty} \int_I f_j(x)\,dx = \int_I f(x)\,dx.$$
For instance,
$$\int_0^1 \frac{1}{\sqrt x}\,dx = \lim_{j\to\infty} \int_{1/j}^1 \frac{1}{\sqrt x}\,dx = \lim_{j\to\infty} 2\sqrt{x}\,\Big|_{1/j}^1 = 2.$$
Definition/Remark 7.5.7 Improper integrals
If the function under consideration is not non negative, one – as in the definition of
the Lebesgue integral – can write it as a difference of two non negative functions,
$f = f^+ - f^-$, and may use the above procedure for $f^+$ and $f^-$.
It can, however, happen that $f^+$ and $f^-$ are not integrable over all of $I = (a,b)$
but still $\lim_{\alpha\to a^+} \lim_{\beta\to b^-} \int_\alpha^\beta f(x)\,dx$ exists. Here, $a$ can also be $-\infty$ and $b$ can
also be $\infty$. It is this type of limit which then is called the improper integral of $f$
over $(a,b)$.
Example 7.5.8 An example
Take $I = (0, \infty)$ and $f(x) = \sin(x)/x$. As $f$ is bounded (due to its continuity
and $\lim_{x\to 0} f(x) = \sin'(0) = 1$) we see that for every $b \in \mathbb{R}_{>0}$ the integral
$\int_0^b f(x)\,dx$ exists.
What about the limit for b → ∞ ?
We first note that $f$ is not integrable over $[0, \infty)$ in the Lebesgue sense because
then the integral of its absolute value would have to exist, which it does not,
because for $k \in \mathbb{N}$
$$\int_{(k-1)\pi}^{k\pi} \frac{|\sin(x)|}{x}\,dx \ge \frac{1}{k\pi} \int_{(k-1)\pi}^{k\pi} |\sin(x)|\,dx = \frac{1}{k\pi} \int_0^\pi \sin(x)\,dx = \frac{2}{k\pi},$$
and therefore
$$\int_0^{k\pi} \frac{|\sin(x)|}{x}\,dx \ge \frac{2}{\pi} \sum_{j=1}^k \frac{1}{j},$$
which diverges for $k$ going to infinity.
On the other hand, we have that
$$\int_{(k-1)\pi}^{k\pi} |f(x)|\,dx \ge \int_{k\pi}^{(k+1)\pi} |f(x)|\,dx,$$
because $|f(x + \pi)| \le |f(x)|$, as $|\sin(x+\pi)| = |\sin(x)|$ and $\frac{1}{x+\pi} < \frac{1}{x}$.
Due to the signs of the sine function, we have
$$\int_0^{k\pi} f(x)\,dx = \sum_{j=1}^k (-1)^{j+1} \int_{(j-1)\pi}^{j\pi} |f(x)|\,dx,$$
which now is an alternating sum of summands with decreasing absolute value
going to zero, so that the Leibniz criterion 5.1.10 shows that the limit exists. We
call this limit $\ell$.
However, we have to show that the limit of $\int_0^\beta f(x)\,dx$ exists as $\beta$ goes to $\infty$. To
that end we take a positive real $\beta$ and the largest multiple $k\pi \le \beta$ and calculate
$$\left|\int_0^\beta f(x)\,dx - \ell\right| \le \left|\int_0^\beta f(x)\,dx - \int_0^{k\pi} f(x)\,dx\right| + \left|\int_0^{k\pi} f(x)\,dx - \ell\right|$$
$$\le \left|\int_{k\pi}^\beta f(x)\,dx\right| + \left|\int_{k\pi}^{(k+1)\pi} f(x)\,dx\right| \le 2\left|\int_{k\pi}^{(k+1)\pi} f(x)\,dx\right| \le \frac{4}{k\pi},$$
where the triangle inequality, the Leibniz criterion 5.1.10 and the estimate from
above were used. As $k\pi > \beta - \pi$, this bound tends to $0$ as $\beta$ goes to $\infty$,
implying the desired result.
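Numerically, the partial integrals over $[0, k\pi]$ indeed settle down (a sketch; one can show, although we do not do it here, that $\ell = \pi/2$):

    import math

    # Midpoint-rule value of the integral of sin(x)/x over [0, k*pi];
    # the integrand extends continuously by the value 1 at x = 0.
    def partial(k, per_unit=2000):
        b = k * math.pi
        n = int(b * per_unit)
        h = b / n
        return h * sum(math.sin((j + 0.5) * h) / ((j + 0.5) * h) for j in range(n))

    for k in (1, 2, 5, 10, 50):
        print(k, partial(k))
    print(math.pi / 2)  # 1.5707...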
If $g : (0, \infty) \to \mathbb{R}_{\ge 0}$ is some continuous function which is decreasing for all
$x > x_0$, has limit $\lim_{x\to\infty} g(x) = 0$ and for which $\lim_{x\to 0} g(x)\sin(x)$ exists, then
the same kind of argument shows that the improper integral
$$\int_0^\infty g(x)\sin(x)\,dx = \lim_{b\to\infty} \int_0^b g(x)\sin(x)\,dx$$
exists.
However, although $\lim_{k\to\infty} \int_0^{2k\pi} \sin(x)\,dx = 0$, the improper integral $\int_0^\infty \sin(x)\,dx$
does not exist.
There are other obvious generalizations of the existence of the improper integral,
e.g. replacing $\sin$ by some piecewise continuous function $s : [0, \infty) \to \mathbb{R}$ for
which a positive $h$ exists with
$$s(x + h) = -s(x)$$
and for which $s(x) \ge 0$ for $x \in [0, h]$.
7.6 Integrals of rational functions
Remark 7.6.1 More than a short reminder
When we studied rational functions we already discussed the so called decomposition
in partial fractions, cf. 2.3.8.
We will now use this in order to integrate rational functions. It is good to know
antiderivatives of $\frac{1}{x^n}$, $n \in \mathbb{N}$. For $n = 1$ this is $\ln|x|$, for $n > 1$ it is $\frac{1}{(1-n)x^{n-1}}$.
Looking more closely at partial fractions, there are also summands of the type
$\frac{ax+b}{(x^2+cx+d)^n}$, where the polynomial function $x^2 + cx + d$ does not have a real root.
This is the case if and only if $4d - c^2 > 0$. Note that therefore $x^2 + cx + d$ is
always positive, and so is $d - \frac{c^2}{4}$, which allows us to take its square root in the
real numbers. We use this now after completing the square:
$$x^2 + cx + d = \left(x + \frac{c}{2}\right)^2 + d - \frac{c^2}{4} = \left(d - \frac{c^2}{4}\right) \cdot \left(\left(\frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}}\right)^2 + 1\right).$$
Let us now look at the case $n = 1$:
$$\frac{ax+b}{x^2+cx+d} = \frac{a}{2} \cdot \frac{2x+c}{x^2+cx+d} + \frac{b - \frac{ac}{2}}{x^2+cx+d}$$
$$= \frac{a}{2} \cdot \frac{d}{dx}\ln(x^2+cx+d) + \frac{b - \frac{ac}{2}}{d - \frac{c^2}{4}} \cdot \frac{1}{\left(\frac{x + \frac{c}{2}}{\sqrt{d - \frac{c^2}{4}}}\right)^2 + 1}$$
$$= \frac{d}{dx}\left[\frac{a}{2}\ln(x^2+cx+d) + \frac{2b - ac}{\sqrt{4d - c^2}}\arctan\left(\frac{2x+c}{\sqrt{4d - c^2}}\right)\right].$$
Here we use the derivative of the $\arctan$-function from 6.4.12.
There are similar formulae for larger n, which can be derived from this by similar
substitutions and integration by parts. We do not give them here.
Example 7.6.2 An example
We now treat one specific example showing how to arrange for the decomposition
of a specific rational function.
We want to find an antiderivative of the function
$$f(x) = \frac{3x^3 + 8x^2 + 6x + 7}{x^4 + 2x^3 + 3x^2 + 4x + 2}.$$
For the partial fraction decomposition we need a factorization of the denominator,
which turns out to be
$$x^4 + 2x^3 + 3x^2 + 4x + 2 = (x+1)^2(x^2+2).$$
As the degree of the numerator of $f$ is smaller than that of the denominator, our
decomposition will be of the shape
$$\frac{A}{x+1} + \frac{B}{(x+1)^2} + \frac{Cx+D}{x^2+2},$$

and we now bring this to a common denominator, resulting in
$$\frac{A(x+1)(x^2+2) + B(x^2+2) + (Cx+D)(x+1)^2}{x^4+2x^3+3x^2+4x+2} = \frac{(A+C)x^3 + (A+B+2C+D)x^2 + (2A+C+2D)x + 2A+2B+D}{x^4+2x^3+3x^2+4x+2}.$$
Comparing the coefficients of the numerator here with those of the numerator of
$f$ leads to a system of four equations in the variables $A, B, C, D$:
$$A + C = 3$$
$$A + B + 2C + D = 8$$
$$2A + C + 2D = 6$$
$$2A + 2B + D = 7$$
One can now solve these equations by using Gaußian elimination. For instance,
subtracting two times the first equation from the third gives $C = 2D$, hence
$A = 3 - 2D$. Plugging this into equations 2 and 4 results in
$$B + 3D = 5$$
$$2B - 3D = 1,$$
which after addition gives $3B = 6$, and then successively
$$B = 2, \quad D = 1, \quad C = 2, \quad A = 1.$$
Therefore
$$f = \frac{1}{x+1} + \frac{2}{(x+1)^2} + \frac{2x+1}{x^2+2},$$
and this has antiderivative
$$\ln|x+1| - \frac{2}{x+1} + \ln(x^2+2) + \frac{1}{\sqrt 2}\arctan\left(\frac{x}{\sqrt 2}\right) + \text{const.}$$
To be honest, this derivation of an antiderivative does not at all use the theory
of integrals and could have been performed much earlier. But now we might
have some more motivation to calculate antiderivatives because they have a new
meaning as an indefinite integral.
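The decomposition and the antiderivative can be cross-checked with a computer algebra system (a sketch assuming the sympy library is available; for simplicity we work with $x > 0$, where $|x+1| = x+1$):

    import sympy as sp

    x = sp.symbols('x', positive=True)
    f = (3*x**3 + 8*x**2 + 6*x + 7) / (x**4 + 2*x**3 + 3*x**2 + 4*x + 2)
    # The three partial fractions from above (possibly in a different order):
    print(sp.apart(f))
    F = (sp.log(x + 1) - 2/(x + 1) + sp.log(x**2 + 2)
         + sp.atan(x/sp.sqrt(2))/sp.sqrt(2))
    print(sp.simplify(sp.diff(F, x) - f))  # 0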
Remark 7.6.3 Making use of substitutions
The result on rational functions can sometimes be put into action for other
functions as well, if there are suitable substitutions.
For instance, as by the addition formula 2.2.6 for sine and cosine
$$\cos(x) = \frac{1 - \tan^2(x/2)}{1 + \tan^2(x/2)}, \qquad \sin(x) = \frac{2\tan(x/2)}{1 + \tan^2(x/2)}, \qquad \frac{d}{dx}\tan(x/2) = \frac{1}{2}\left(1 + \tan^2(x/2)\right),$$
we have for $t = \tan(x/2)$
$$\int \frac{1}{\sin(x) + \cos(x)}\,dx = \int \frac{2}{1 + 2t - t^2}\,dt,$$
which – making use of the formula from 7.6.1 – results in
$$\int \frac{1}{\sin(x) + \cos(x)}\,dx = \frac{1}{\sqrt 2}\,\ln\left|\frac{\tan(x/2) - 1 + \sqrt 2}{\tan(x/2) - 1 - \sqrt 2}\right|.$$
Chapter 8

Indices

8.1 Important Theorems

Bolzano-Weierstraß 3.2.5
Cauchy Convolution Formula 5.1.7
Chain Rule 6.2.3
Comparison test 5.2.1
Existence of maxima 4.2.5
Functional Equation 5.1.8
Fundamental theorem of differential and integral calculus 7.2.7,7.2.8
Intermediate Value Theorem 4.2.7
Leibniz Criterion 5.1.10
L’Hôpital’s rule 6.3.9
Mean value theorem 6.3.4
Mean value theorem for integral calculus 7.2.5
Monotonicity Criterion 3.2.8
Ratio Test 5.2.3
Rolle’s Theorem 6.3.3
Root Test 5.2.5


8.2 Some Symbols

∅ 1.1.1 empty set
[a, b], (a, b], … 1.2.5 intervals
|x|, |z| 1.2.3, 2.2.4 absolute value
∪, ∩ 1.1.4 union and intersection
max(S), min(S) 1.2.6 maximum and minimum of S
$\sum_{i=m}^{n} a_i$ 2.1.1 sum
$\prod_{i=m}^{n} a_i$ 2.1.3 product
n! 2.1.4 n factorial
$\binom{n}{k}$ 2.1.5 binomial coefficient
C 2.2.3 the set of complex numbers
$\bar{z}$ 2.2.4 complex conjugate of z
deg(f) 2.3.1 degree of a polynomial
$\lim_{n\to\infty} a_n$ 3.1.2 limit of a sequence
$B_R(c)$ 3.1.8 disc with center c, radius R
$x^{1/d} = \sqrt[d]{x}$ 3.2.10 d-th root of x ≥ 0
exp 3.2.11 exponential function
e = exp(1) 3.2.11 Euler’s number
$\lim_{s\to z_0} f(s)$ 4.1.2 limit of a function
$\sum_{n=1}^{\infty} a_n$ 5.1.2 infinite series
ln 5.4.5 natural logarithm
sinh, cosh 5.4.8 hyperbolic sine resp. cosine
sin, cos 6.1.5 power series for sine and cosine
$\int_a^b f(x)\,dx$ 7.1.8 Lebesgue integral
$F(t)\big|_{t=a}^{t=b}$ 7.2.10 F(b) − F(a)
$\int f(x)\,dx$ 7.2.10 indefinite integral

8.3 Terminology

absolute convergence 5.1.4
accumulation point 3.2.3, 4.1.2
addition formula 2.2.6
almost everywhere 7.1.1
antiderivative 6.3.7
arithmetic mean 1.3.6
binomial coefficient 2.1.5
binomial formula 2.1.8
binomial series 6.4.8
boundary condition 7.4.2
bounded 1.2.6, 3.1.8
Cantor Set 7.1.2
closed 4.2.1
codomain 1.1.8
compact 4.2.1
complex conjugation 2.2.4
complex numbers 2.2.3
composition 1.1.10
constant term 2.3.1
convergence 3.1.2
countable 7.1.1
degree 2.3.1
derivative 6.1.1
differential quotient 6.1.1
disc 3.1.8
domain 1.1.8
exponential function 3.2.11, 5.1.8
exponential series 5.1.3(a)
function 1.1.8
geometric series 5.1.3(b)
harmonic series 5.1.3(c)
hyperbolic sine/cosine 5.4.8
imaginary part 2.2.3
improper integral 7.5.7
indefinite integral 7.2.10
integral 7.1.8
integrating factor 7.4.6
integration by parts 7.3.2
interval 1.2.5
inverse map 1.1.13
invertible 1.1.13
leading coefficient 2.3.1
Lebesgue integrable 7.1.8
limit 3.1.2, 4.1.2
local point of extremum 6.3.1
logarithm 5.4.5
map 1.1.8
mathematical induction 1.3.7
maximum 1.2.6
minimum 1.2.6
monotonic 1.2.7
null sequence 3.1.2
null set 7.1.1
partial fraction 2.3.8
partial sum 5.1.2
Pascal’s triangle 2.1.6
polar coordinates 2.2.5
polynomial function 2.3.1
power series 5.3.1
radius of convergence 5.3.4
range 1.1.8
rational functions 2.3.7
real part 2.2.3
recursion formula 3.2.9
Riemann integral 7.1.9
Riemann sum 7.1.9
roots 2.3.1
sequence 3.1.1
series 5.1.2
set 1.1.1
step functions 7.1.3
subsequence 3.2.1
Taylor polynomial 6.4.7
Taylor series 6.4.9
triangle inequality 1.2.4
zeros 2.3.1