
Lecture notes on Abstract Algebra

Uli Walther
© 2021
Version of Spring 2021
Contents

Basic notions
0.1. How to use these notes
0.2. Set lingo
0.3. Size of sets
0.4. Finite vs infinite
0.5. Inclusion/Exclusion

Chapter I. Week 1: Introduction
1. Induction
1.1. Setup
1.2. The idea
1.3. Archimedean property and well-order
2. Arithmetic
2.1. The Euclidean algorithm
2.2. Primes and irreducibles
2.3. Some famous theorems and open problems on prime numbers
3. Modular arithmetic
3.1. Computing “modulo”: Z/nZ
3.2. Divisibility tests

Chapter II. Week 2: Groups
1. Symmetries
2. Groups
3. Cyclic and Abelian groups
4. Automorphisms
5. Free groups

Chapter III. Week 3: Z/nZ and cyclic groups
1. Subgroups of cyclic groups
2. Products and simultaneous modular equations
3. U(n): Automorphisms of Z/nZ

Chapter IV. Week 4: Cosets and morphisms
1. Equivalence relations
2. Morphisms
3. Cosets for subgroups
4. Kernels and normal subgroups

Chapter V. Week 5: Permutations and the symmetric group

Chapter VI. Week 6: Quotients and the Isomorphism Theorem
1. Making quotients
2. The isomorphism theorem

Chapter VII. Week 7: Finitely generated Abelian groups
1. Row reduced echelon form over the integers
2. Generating groups

Chapter VIII. Week 8: Group actions

Review

Chapter IX. Week 9: Introduction to rings

Chapter X. Week 10: Ideals and morphisms

Chapter XI. Week 11: Euclidean rings
1. Euclidean rings

Chapter XII. Divisibility, Field Extensions
1. Divisibility
2. Making new fields from old

Chapter XIII. Splitting fields and extension towers
1. Roots with multiplicity

Chapter XIV. Week 14: Minimal polynomials and finite fields
1. Minimal Polynomials
2. Finite Fields

Chapter XV. Galois
1. The Frobenius
2. Symmetries
3. Applications

Stuff for later
3.1. Zerodivisors
3.2. Cartesian Products, Euler’s φ-function, Chinese Remainder
3.3. Fermat’s little theorem

Expected progress:
• Week 1: Archimedes, Factorization
• Week 2: Symmetries, Groups, Subgroups, Order, Aut(G)
• Week 3: Z/nZ, Products, U(n)
• Week 4: Symmetric and Free Group
• Week 5: Cosets, Morphisms
• Week 6: Normal Subgroups, Quotients, Isomorphism Theorem
• Week 7: Finitely Generated Abelian Groups
• Week 8: Review, Group Actions
• Week 9: Reading day, Intro to Rings
• Week 10: Midterm, Ideals, Morphisms
• Week 11: Euclidean Algorithm, PID, UFD
• Week 12: Fields, Eisenstein, Extensions
• Week 13: Degrees and Splitting Fields
• Week 14: Minimal Polynomials, Finite Fields
• Week 15: Galois Outlook, Review
Basic notions

0.1. How to use these notes. These notes contain all I say in class, plus
on occasion a lot more. If there are exercises in this text, you may do them but
there is no credit, and you need not turn them in. All exercises that are due are
specifically listed on Gradescope.
This initial chapter is here so we have a common understanding of the basic
symbols and words. This should be known from MA375 (at least if I teach it).
Chapter 1 is still more than we did in week 1, but almost all of it should be
familiar, and the rest (the open problems on primes) is for your entertainment.
Future chapters correspond to actual weeks of classes and are much less verbose
than the Basic Notions.
The chapter “Stuff for later” can be ignored. It will be worked in when the
time comes.
Remark .1. There are typos in these notes. If you find some, please inform
me.
0.2. Set lingo. The mathematical word set denotes what in colloquial life
would be called a collection of things. The “things” are in mathematics referred to
as elements of the set. If S is a set and s is an element of S then one writes s ∈ S.
A sub-collection S′ of elements of S is a subset and one writes S′ ⊆ S, allowing
for the possibility of S′ being all of S, or of having no element at all. There is,
strangely, a set that contains nothing. It is called the empty set (denoted ∅) and,
despite its humble content, it is one of the most important sets. One uses the
notation S = {s_1, . . . , s_n, . . .} to indicate that S consists of exactly the elements
s_i. (In many cases, one must allow the index set to be different from N. In other
words, not all sets can be “numbered”, a fact we explore a bit below.)
A function φ : A → B from the set A to the set B is an assignment (think:
a black box) that turns elements of A into elements of B. The crucial conditions
are: the assignment works for every single input a ∈ A (so the black box does not
choke on any input from A) and for each input there is exactly one output specified
(no more, no less). Graphically, functions are often depicted with the help of arrows
(starting at the various elements of A and ending at the value φ(a) for each input
a). (For an example, suppose φ : Z → Z is the process of doubling. Then one could
write φ(3) = 6, or 3 ↦ 6.) The set A is usually referred to as the source, the set B
as the target.
Definition .2. The function φ : A → B is
(1) surjective (“onto”) if every element b of B appears as an output of φ;
(2) injective (“into”) if the equality of outputs φ(a_1) = φ(a_2) occurs exactly
when the inputs a_1 and a_2 were equal;
(3) bijective if it is injective and surjective.

An injective map is indicated as A ↪ B, a surjective one by A ↠ B.


For example, the function φ(x) = x^3 is a bijection from A = R to B = R
(every real number has exactly one real cube root); the function φ(x) = x^2 is
neither injective (since φ(x) = φ(−x) for all x) nor surjective (since negative
numbers have no real square roots).
We will often say “map” instead of “function”.

0.3. Size of sets. We wish to attach to each set S a size denoted |S|. In order
to make sense of this, we need to compare sets by size.
Definition .3. We write |S| ≤ |S′| if there is an injective map φ : S ↪ S′.
Do not confuse the symbols ≤ and ⊆. The following examples illustrate the
nature of the relation ≤.
Example .4.
|N| ≤ |Z| since each natural number is an integer. □
Exercise .5. Show that |Z| ≤ |N|. □
Example .6.
• |Z| ≤ |Q| since each integer is a rational number.
• |Q| ≤ |R| since each rational number is also real.
• Somewhat shockingly, |Q| ≤ |Z|. To see this, it will be sufficient to prove
that there is a way of labeling the rational numbers with integer labels. (One can
then make an injective map that sends each rational to its label.) How does one
label? Imagine sorting the positive rational numbers p/q into a two-way infinite
table, with rows indexed by the numerator p and columns by the denominator q,
as follows:

    p \ q    1     2     3     4    · · ·
      1     1/1   1/2   1/3   1/4   · · ·
      2     2/1   2/2   2/3   2/4   · · ·
      3     3/1   3/2   3/3   3/4   · · ·
      4     4/1   4/2   4/3   4/4   · · ·
      .      .     .     .     .     .

Clearly all positive rationals appear (multiple times) in the table. Now suppose
you are moving through the table “on diagonals” where p + q is constant: start at
1/1, the only square on its diagonal (where p + q = 2). Next go on the diagonal
p + q = 3, starting on the square with 1/2 and then moving down and to the left.
Next walk along the diagonal p + q = 4 starting on 1/3 and moving down and left.
It is clear that this process allows you to label each square: 1/1 is number 1, 1/2 is
number 2, 2/1 is number 3, and so on. So, the set of all squares is in bijection with
the set N. Since all positive rationals are sorted into the various squares, it follows
that |Q| ≤ |{all squares}| ≤ |N|. A similar idea takes care of the negative numbers,
and this shows that |Q| ≤ |Z|.
• In contrast, the statement |R| ≤ |Q| is false. The idea is due to Cantor, and
goes like this. If you believe that you can inject R into Q then you can also inject
R into Z because |Q| ≤ |Z|. Since |Z| ≤ |N|, this also implies that you can inject R
into N. To inject R into N means to label the real numbers by using only natural
(non-repeated) indices. In particular, this can be done to the reals between 0 and
1.

Suppose we have an exhaustive enumeration (0, 1) = {r_1, r_2, r_3, . . .} of all real
numbers in the unit interval. Let r_{i,j} be the j-th digit in the decimal expansion of
r_i. So, r_i is the real number with expansion 0.r_{i,1} r_{i,2} r_{i,3} . . .. Now write the real
numbers into a two-way infinite table:

    i \ j     1        2        3        4      · · ·
      1    r_{1,1}  r_{1,2}  r_{1,3}  r_{1,4}   · · ·
      2    r_{2,1}  r_{2,2}  r_{2,3}  r_{2,4}   · · ·
      3    r_{3,1}  r_{3,2}  r_{3,3}  r_{3,4}   · · ·
      4    r_{4,1}  r_{4,2}  r_{4,3}  r_{4,4}   · · ·
      .       .        .        .        .       .

We construct now the real number ρ whose decimal expansion is determined as
follows: the i-th decimal of ρ is the i-th decimal of r_i. So ρ is “the diagonal”.
Finally, concoct a new real number σ whose i-th decimal is: 1 if ρ_i = 3; and 3 if
ρ_i ≠ 3. The point is that by looking at the i-th decimal, it is clear that σ is not r_i
(as they don’t agree in that position). So, σ is not on our list. So, one cannot make
a list (indexed by N) that contains all real numbers. In particular, there are seriously
more reals than rationals or integers.
One can (and for example Cantor did) try to determine whether there are sets
S such that |Q| ≤ |S| ≤ |R| but neither |S| ≤ |Q| nor |R| ≤ |S|. So the question
is whether there is a set S that sits between Q and R in size, in such a way that
one cannot inject R into S and also cannot inject S into Q. That there is no such
set is called the continuum hypothesis. As it has turned out through fundamental
work of Gödel and Cohen, this question cannot be answered within the framework
of the axioms of Zermelo and Fraenkel.¹ (Gödel proved that no matter what system
of axioms you take, it is either self-contradictory or allows unanswerable questions.
Cohen showed that, in particular, the continuum hypothesis cannot be decided with
Zermelo–Fraenkel’s axioms.) □
0.4. Finite vs infinite. Some sets S allow injections into themselves that are
not surjective. For example, one can make a function φ : N → N that sends x to
x + 1 and so is clearly injective but not onto. Such sets are called infinite. A set
for which every injection φ : S → S has to be also surjective is finite.
Finite sets allow us to attach a familiar quantity to S, by answering the question
“what is the smallest n such that |S| ≤ |{1, 2, . . . , n}|?”. One writes |S| = n in that
case and calls it the cardinality, although we will still call it the size of S. For
infinite sets, one needs new symbols since the size of such a set will not be a natural
number. One writes |N| = ℵ0 (this symbol is pronounced “aleph”, and ℵ0 denotes
the size of the smallest set that is not finite) and |R| = ℵ1. While we can’t answer
the question whether there is something between ℵ0 and ℵ1, it is known that there
is no upper limit to sizes because of the following construction.
Example .7. The power set of a set S is the collection of all subsets of S,
denoted 2^S. This power set includes the empty set ∅ and the whole set S as special
cases. By the exercise below, if S is finite, then the size of the power set is given
¹ All mathematical sentences we use are built from basic axioms laid down by Zermelo and
Fraenkel. For a somewhat digestible description of the axioms and the surrounding issues, see
http://en.wikipedia.org/wiki/Zermelo-Fraenkel_set_theory

by |2^S| = 2^|S|. If S is infinite, such an equation makes no sense. But in any event,
2^S is strictly larger than S in the sense that there is no injection 2^S ↪ S. The idea
of the proof is the same as the Cantor diagonal trick for S = N we saw above. □

Exercise .8. If the set S is finite, prove that |2^S| = 2^|S|. (Hint: an element
of 2^S is a subset of S. What question do you need to answer for each element of S
when you form a subset of S? How many possible answers can you get?) □
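The counting idea in the hint can be tried out on a computer. The following is a
minimal Python sketch (our illustration, not part of the notes; the helper name
power_set is an assumption) that enumerates all subsets of a small set:

    from itertools import combinations

    def power_set(S):
        # For each element we choose "in" or "out", so a set with n
        # elements has 2**n subsets; here we enumerate them by size.
        S = list(S)
        for r in range(len(S) + 1):
            for combo in combinations(S, r):
                yield set(combo)

    subsets = list(power_set({1, 2, 3}))
    print(len(subsets))  # 8 == 2**3, matching |2^S| = 2^|S|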

Exercise .9. Let S be a finite set of size n. Determine (in terms of n) the
number of pairs of sets (A, B) where both A and B are subsets of S, and where no
element of S is both in A and B. Prove the formula you find.
So, for example, if S has one element called s, then the options for (A, B) are:
(∅, ∅), (∅, {s}) and ({s}, ∅). □

Exercise .10. Let S be a finite set of size n as in the previous exercise. We


consider all pairs of sets C ⊆ D where D ⊆ S. Show that the number of such pairs
is the same as the number of pairs (A, B) from the previous exercise. □

0.5. Inclusion/Exclusion.

Notation .11. Given two sets A and B, their union is the set A ∪ B that
contains any element in A, any element in B, and no other. On the other hand,
the intersection A ∩ B is the set that contains exactly those elements of A that are
also in B, and no other.
For a list of sets A_1, . . . , A_k their common intersection is denoted ∩_{i=1}^k A_i and
their union ∪_{i=1}^k A_i.

Suppose A and B are two finite sets; we want to know the size of their union
A ∪ B. A first order approximation would be |A| + |B|, but this is likely to be off
because A and B might have overlap and elements in the overlap A ∩ B would be
counted twice, once in A and once in B. So, we must correct the count by removing
one copy of each element in the overlap:

|A ∪ B| = |A| + |B| − |A ∩ B|.

How about three sets? In that case, there are three intersections: A ∩ B, B ∩ C
and A ∩ C, whose sizes should presumably all be removed from |A| + |B| + |C|.
This is the right idea but doesn’t quite capture it. For example, if A = {1, 2, 3},
B = {3, 4} and C = {2, 3, 5} then |A| + |B| + |C| − |A ∩ B| − |A ∩ C| − |B ∩ C| is
3 + 2 + 3 − 1 − 2 − 1 = 4 while the union is the set {1, 2, 3, 4, 5}. To understand
what happened, look at each element separately. The expression above counts each
of 1, 2, 4, 5 exactly once. But the element 3 is counted three times (once in each of
A, B, C), and then removed three times (once in each pairwise intersection). So,
the count is off by one. Inspection shows that this error will always happen if the
intersection A ∩ B ∩ C is not empty, and the count will be off by as many elements
as this intersection contains. So, we conclude:

|A ∪ B ∪ C| = |A| + |B| + |C|
            − |A ∩ B| − |A ∩ C| − |B ∩ C|
            + |A ∩ B ∩ C|.

It is clear then what the general pattern is:



Theorem .12 (Inclusion/Exclusion Formula). For any n finite sets A_1, . . . , A_n,

|∪_{1≤i≤n} A_i| = Σ_{i=1}^n |A_i|
               − Σ_{1≤i1<i2≤n} |A_{i1} ∩ A_{i2}|
               + Σ_{1≤i1<i2<i3≤n} |A_{i1} ∩ A_{i2} ∩ A_{i3}|
               − · · ·
               + (−1)^{n+1} |A_1 ∩ A_2 ∩ · · · ∩ A_n|.
Remark .13. In the special case where the sets A_i are pairwise disjoint (all
pairwise intersections A_i ∩ A_j with i ≠ j are empty), the formula just says: the
size of the union of disjoint sets is the sum of the separate sizes. □
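For readers who like to check such identities mechanically, here is a short Python
sketch (our illustration, not from the notes; the function name is hypothetical)
that evaluates the right hand side of the Inclusion/Exclusion Formula and compares
it with the size of the union:

    from itertools import combinations

    def inclusion_exclusion(sets):
        # Sum, over all nonempty groups of k of the sets, the size of the
        # group's common intersection, with sign (-1)^(k+1).
        total = 0
        for k in range(1, len(sets) + 1):
            for group in combinations(sets, k):
                total += (-1) ** (k + 1) * len(set.intersection(*group))
        return total

    A, B, C = {1, 2, 3}, {3, 4}, {2, 3, 5}
    print(inclusion_exclusion([A, B, C]))  # 5
    print(len(A | B | C))                  # 5, the size of {1, 2, 3, 4, 5}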
Exercise .14. Let S = {1, 2, . . . , 666}. Determine the number of elements of
S that are
(1) divisible by 3
(2) divisible by 7 (Careful!)
(3) divisible by 3 or 2 or 37 (“or” means “divisible by at least one of them”)
(4) divisible by 6 and 4
(5) divisible by 6 or 4
(6) divisible by 3 and 37 but not by 2
(7) divisible by 6 or 4 but not by 9.

CHAPTER I

Week 1: Introduction

Notation. The following symbols will be used throughout to denote number


sets:
• the natural numbers 0, 1, 2, 3, . . . are denoted by N;
• the integer numbers . . . , −3, −2, −1, 0, 1, 2, 3, . . . are denoted by Z;
• the rational numbers (all p/q with p, q in Z and q ≠ 0) are denoted by Q;
• the real numbers are denoted by R;
• the complex numbers are denoted C.
If S is a set (compare Section 0.2 and the surrounding discussion), and s is an
element of S then we shorthand this to s ∈ S. If a small set S is contained in a big
set B we write S ⊆ B. If we want to stress that S does not fill all of B we write
S ⊊ B. If a set S is not actually contained in another set B we write S ⊈ B. (Note
the logical and notational difference of the last two!)
We denote by |S| the size of the set S, which is explained in Section 0.3.
If n is a natural number we will need to differentiate between a set
{a_1, . . . , a_n} and an ordered n-tuple (a_1, . . . , a_n). The difference is that when we
use round brackets, it is important in what order the numbers come. It is also
possible that entries are repeated in an ordered n-tuple. The ordered n-tuples are
in one-to-one correspondence with the points in n-space. There is also a hybrid: a
family a_1, . . . , a_n has no emphasis on order, but it can have repeated entries.
If S = {a_s}_{s∈S} is a set whose elements are numbers, we write Σ_{s∈S} a_s for the
sum of all elements in S, and Π_{s∈S} a_s for their product. In the extreme case where
S is empty, the sum over S is (by agreement) equal to zero, and the product equal
to 1.

1. Induction
Suppose you are faced with the task of proving that, for all natural numbers
n, the sum 1 + 2 + . . . + n equals n(n + 1)/2. A few tests show that the formula is
probably right, but no matter how many checks you do, there are infinitely many
others yet to run. It seems like a hopeless proposition. Mathematical induction is
a tool that allows you to complete precisely this sort of job.

1.1. Setup. Suppose that, for each n ∈ N, there is given a statement P (n)
that involves the symbol n more or less explicitly. (For example, P (n) could be the
statement “the sum 1 + 2 + . . . + n equals n(n + 1)/2” from above).
The task at hand is to prove that all statements P (0), P (1), . . . are true.

1.2. The idea. Imagine you are standing in front of a ladder that starts at
your feet (level 0) and goes up indefinitely. Your job is to convince your friend that
you are capable of climbing up to any step of the ladder. How might you do that?

One approach is to check that you can indeed make it to the lowest rung of the
ladder (the “base case”) and then to exhibit a kind of cranking mechanism that,
for any position on the ladder (no matter which exact one), say rung n + 1, lets
you move to rung n + 1 from a lower rung you have already reached.
If you can do these two things then clearly you can make it to any desired level.
This is what induction does: imagine that the n-th step of the ladder symbolizes
proving statement P (n). The “base case” means that you should check explicitly
the lowest n for which the statement P (n) makes sense, and the “crank” requires
you to provide a logical argument that says “If P (k) is true for all k ≤ n then
P (n + 1) is also true”. This “crank” is called the inductive step where the part “If
P (k) is true for all k ≤ n” is known as the inductive hypothesis. The “base case”
is the induction basis.
Remark I.1. In many cases, you will only use P (n) in order to prove P (n + 1),
but there are exceptions where using only P (n) is not convenient. Some people call
usage of all P (i) with i ≤ n “strong induction”. But there is nothing strong about
this sort of induction: one can show that what can be proved with strong induction
can also be proved if you just assume P (n) for the sake of proving P (n + 1). 
Example I.2. We consider the question from the start of the section: show
that 0 + 1 + . . . + n = n(n + 1)/2. So, for n ∈ N we let the statement P (n) be
“0 + 1 + . . . + n = n(n + 1)/2”.
The base case would be n = 0 or n = 1, depending on your taste. In either
case the given statement is correct: if n = 0 then the sum on the left is the empty
sum (nothing is being added) and that means (by default) that the sum is zero. Of
course, so is 0(0 + 1)/2. One might be more sympathetic towards the case n = 1
in which the purported identity becomes 1 = 1(2)/2, clearly correct.
For the crank, one needs a way to convince other people that if one believes in
the equation
P (n) : 1 + 2 + . . . + n = n(n + 1)/2

then one should also believe in the equation

P (n + 1) : 1 + 2 + . . . + n + (n + 1) = (n + 1)(n + 1 + 1)/2.
In induction proofs for equational statements like this it is usually best to compare
the left hand side (LHS) of the presumed and the desired equality and to show that
their difference (or quotient, as the case may be) is the same as those of the right
hand sides (RHS). In other words, one tries to manufacture the new equation from
the old.
In the case at hand, the difference of the LHSs is visibly n + 1. The RHS
difference is (n+1)(n+2)/2−n(n+1)/2 = (n+2−n)(n+1)/2 = 2(n+1)/2 = n+1.
So, if one believes in the equation given by P (n) then, upon adding n + 1 on both
sides, one is forced to admit that equation P (n + 1) must also be true. This
completes the crank and the principle of induction asserts now that all statements
P (n) are true, simply because P (0) is and because one can move from any P (n) to
the next “higher” one via the crank. 
Remark I.3. For the functionality of induction it is imperative that both the
base case and the crank are in order. (It’s clear that without crank there is not

much hope, but the checking of the base case is equally important, even if the crank
has already been established!)
Consider for example the following attempt at proving that 1 + 2 + . . . + n =
n(n+1)/2 + 6. Let P′(n) be the statement “1 + 2 + . . . + n = n(n+1)/2 + 6”.
Now argue as follows: suppose that for some n ∈ N, P′(n) is true: 1 + 2 + . . . + n =
n(n+1)/2 + 6. Add n + 1 on both sides to obtain 1 + 2 + . . . + n + (n+1) =
n(n+1)/2 + 6 + n + 1 = [n(n+1) + 2(n+1)]/2 + 6 = (n+1)(n+2)/2 + 6. So,
truth of P′(n) implies truth of P′(n+1).
Of course, if you believe that we did the right thing in Example I.2 above, then
P′(n) can’t hold ever (unless you postulate 6 = 0). The problem with climbing the
P′-ladder is that while we have a crank that would move us from any step to the
next step up, we never ever actually are on any step: the base case failed! □
Remark I.4. The usual principle of induction only works with collections of
statements that are labeled by the natural numbers. If your statements involve
labels that are not natural numbers then, typically, induction cannot be used in-
discriminately.
One can make various errors in induction proofs. Indicated here are two, by
way of incorrect proofs.
(1) “Theorem”: all horses have the same color.
Proof by induction: let P (n) (n ∈ N) be the statement “within any
group of n horses, all horses have the same color”. The base case P (0) is
void (there is no horse to talk about, so P (0) is true) and P (1) is clearly
true as well.
Now suppose P (n) is true and we prove P (n + 1) from that. It
means that we must show that in any group of n + 1 horses all horses
have the same color. So let S be a group of n + 1 horses, which we
name H1 , H2 , . . . , Hn+1 . Let T1 stand for the size n group of the first
n horses, T1 = {H1 , . . . , Hn }. Let T2 stand for the last n horses, T2 =
{H2 , . . . , Hn+1 }. Since T1 has n horses in it, statement P (n) kicks in and
says that all horses inside T1 have the same color, which we denote by c1 .
Similarly, all horses in group T2 (of size n) have all one color, called c2 .
However, the horses H2 , . . . , Hn appear in both sets, and so have colors
c1 and c2 simultaneously. We conclude c1 = c2 and so all horses in S had
the same color!
(2) “Theorem”: Let a be any positive real number. Then, for all n ∈ N, one
has a^n = 1.
Proof by induction: let P(n) be “a^n = 1”. The base case is n = 0. In
that case, a^0 = a^{1−1} = a^1/a^1 = 1.
Now assume that P(i) is true for all i = 0, . . . , n. We want to show
a^{n+1} = 1. Rewrite: a^{n+1} = (a^n · a^n)/a^{n−1}. Both “a^n = 1” (statement
P(n)) and “a^{n−1} = 1” (statement P(n−1)) are covered by the inductive
hypothesis and so P(n+1) must be true.
Both proofs imply wrong results, so they can’t really be proofs. What’s the prob-
lem? The errors are not of the same type, although similar. In the first case, we
use the collection H2 , . . . , Hn of horses that are simultaneously in both T1 and T2 .
The problem is that if n = 1 then there aren’t any such horses: T1 is just {H1 } and
T2 is just {H2 }. So there is actually no horse that can be used to compare colors
in group T1 with those in group T2 , and so c1 and c2 have no reason to be equal.

In the second proof, you were sneakily made to believe that “a^{n−1} = 1” is
covered by the inductive hypothesis, by calling it “P(n−1)”. But if n = 0 then
n − 1 is negative, and we are not entitled to use negative indices on our statement!!!
One must be very careful not to feed values of n outside the set of naturals into an
inductive argument. 

1.3. Archimedean property and well-order. Here is a fundamental prop-


erty of the natural numbers:
N is well-ordered.
What that means is this: every subset of N, as soon as it has any element at all,
will have a minimal element. In particular, if B is the set of “bad” numbers n for
which some statement P(n) is false, and B is not empty, then there is a first n for
which P(n) fails. Call this smallest bad guy b.
This is a notable property because the subset might have infinitely many elements.
Any finite set has a minimum for sure, because you can test one pair at a time. But
for infinite sets this is not an option. And not all infinite sets have a minimum;
just look at the open interval (0, 1).

Remark I.5. The second example in the remark above underscores an important
point: induction only works well for the index set N. What’s so special about the
naturals? Let’s go back to the drawing board of induction. The idea is (rephrased):
if P(n) ever failed, let B be the set of bad indices: n is in B exactly if P(n) is false.
Question: what could the smallest bad index b be? Answer: surely not b = 0
since the base case requires us explicitly to check that P(0) is true. So, the minimal
bad b is positive. Since it’s positive, b − 1 is natural (not just an integer, but actually
non-negative). Since b was the minimal bad guy, P(0), P(1), . . . , P(b−1) can’t be
false, so they are all true. And now comes the kicker: since we do have a crank,
P(b) must also be true! It follows that we were mistaken: the little bad b never
existed, and the claims P(n) are all true.
Could one hope for proofs by induction when the index set is something different
from N? Not so much. Some thought reveals that making inductive proofs is the
same as the index set being well-ordered. But not many sets are well-ordered. For
example, Z is not (it has no smallest element, so one could not meaningfully speak of
a lowest rung on the Z-ladder). Also, the set of real numbers in the closed interval
[0, 1] is not well-ordered (for example, the subset given by the half-open interval
(0, 1] doesn’t have a smallest element; this says that there is no real number that
“follows” 0). So, induction with index set Z or R or [0, 1] is not on the table. □

Remark I.6. One can formally turn induction “upside down”. The purpose
would be to prove that all statements in an infinite sequence P (n) are false. The
idea is: check that P (0) is false explicitly; then provide a crank that shows: if some
statement P (n) is true then there must already be an earlier statement P (i) with
i < n that is true.
This reverse method is called infinite descent and illustrated in the next exam-
ple. 

Well-order of the natural numbers is a very basic property and closely related
to the following “obvious” result:

Theorem I.7 (Archimedean property). Choose a, b ∈ N with 0 < a. Then
the sequence a, a + a, a + a + a, . . . contains an element that exceeds b, so that
b − (a + . . . + a) < 0. In other words, ∃k ∈ N with ka > b. □
I call this a theorem because if one wrote down the axiomatic definition of N
then this property is one that one needs to prove from the axioms. This axiomatic
definition, translated into English, says roughly that there is a natural number 0,
and another natural number 1, and for each natural number a there is another one
called a + 1, and there aren’t any other natural numbers than those you can make
this way. And one says a < b if b can be made from a by iterating the procedure
a ↦ a + 1.
It is not always true that collecting lots of small things (like a) gives you
something big (like b). For example, adding lots of polynomials of degree 3 does
not give a polynomial of degree 5.
Remark I.8. The Archimedean property implies well-ordering; one can see
that as follows. Suppose S ⊆ N is not empty. Pick some s ∈ S. Then the sequence
s − 1, s − 2, s − 3 . . . eventually becomes negative. So only finitely many elements
of S other than s could be the minimum of S. Compare them to one another and
take the smallest one.

Example I.9. We shall prove the following theorem: √2 is not a rational
number. (We are going to assume that we have some reasonable understanding
of what “√2” means. The square root of 2 must then be a number whose square
equals 2. The point of the problem is to show that this root is not an easy number
to determine.)
This statement seems not much related to induction, because it is not of the
P(n)-type. However, consider the following variant: let P(n) be the statement
“there is a fraction m/n that equals √2, where m ∈ N”. If we can show that all
P(n) are false, √2 cannot be represented by a rational number.
The base cases n = 0 and n = 1 have P(0) and P(1) false. For n = 0 this is
because 0 may not be a denominator, while for n = 1 it follows from the fact that
0^2 and 1^2 are less than 2 while m^2 > 2 for m > 1.
Now suppose, in the spirit of infinite descent, that P(n) is true for some n ∈ N
and try to show that P(i) must then also be true for some natural i < n. To
say “P(n) is true” is to say that 2 = m^2/n^2 for some natural m. In particular,
2n^2 = m^2 so that m must be even (we are borrowing here a bit from the next
chapter), m = 2m′. Now feed this info back into the equation: we have 2n^2 = 4m′^2,
that is, n^2 = 2m′^2. The same reasoning shows now that n must be even, n = 2n′
for some n′ ∈ N. This now leads to the equation 2n′^2 = m′^2, and it seems we made
no progress. However, stepping back, we realize that m/n = 2m′/2n′ = m′/n′,
which would suggest that √2 = m′/n′. But this is a representation of √2 with a
denominator only half the size of n. So we have shown that if P(n) holds then n is
even and P(n/2) also holds.
In concrete terms, if the first n for which P(n) holds is called b then b must
be even and P(b/2) is also true. Of course, in most cases b/2 is less than b (so b
wouldn’t actually be the first), and the only natural number for which this does
not cause a problem is b = 0. So if there is any n with P(n) true, then P(0) should
also be true. But, as we checked, it isn’t. So we deduce that P(n) is always false
and √2 must be irrational. □

Remark I.10. The following reformulation of induction exists: Suppose there


is a statement P (n) for every natural number n, and suppose further we have
checked that P (0) is correct. Then P (n) is correct for any n ∈ N provided that:
for every k ∈ N we have a crank that says
If P (0), P (1), . . . , P (k) are ALL true, then P (k + 1) is also true.
This is usually referred to in textbooks as “strong induction”, but is not stronger
than usual induction. But the formulation has its advantages as we will show soon.
Exercise I.11.
(1) Show that 1 = 1/(1·2) + 1/(2·3) + 1/(3·4) + · · ·. (Hint: find a
guess for the partial sums and then use induction.)
(2) Show that 1/(1·2·3) + · · · + 1/(n·(n+1)·(n+2)) = n(n+3)/(4·(n+1)·(n+2)) and determine the
limit of the corresponding series.
(3) Show that 7 divides 11^n − 4^n for all n ∈ N.
(4) Show that 3 divides 4^n + 5^n for odd n ∈ N.
(5) If one defines a number sequence by f_0 = 1, f_1 = 1, and f_{i+1} = f_i + f_{i−1}
for i ≥ 1 then show that f_i ≤ 2^i.
(6) Show the Bernoulli inequality: if h ≥ −1 then 1 + nh ≤ (1 + h)^n for all
n ∈ N.
(7) Recall that 1 + . . . + 1 = n and 1 + 2 + . . . + n = n(n+1)/2. Now show that
1·2 + 2·3 + . . . + n·(n+1) = n(n+1)(n+2)/3.
(8) Show that 1·2·3 + 2·3·4 + . . . + n·(n+1)·(n+2) = n(n+1)(n+2)(n+3)/4.
(9) Generalize the two previous exercises to products of any length.
(10) Show that 1 + 3 + 5 + . . . + (2n − 1) = n^2 both by induction and by a
picture that needs no words.
(11) Show that 5 divides n^5 − n for all n ∈ N.
(12) Show that Σ_{∅≠S⊆{1,...,n}} 1/(Π_{σ∈S} σ) = n. (For example, if n = 2 then the
possible sets S are {1}, {2} and {1, 2} and then the sum is 1/1 + 1/2 + 1/(1·2),
which equals 2 as the formula predicts.) Hint: for P(n+1), split the set
of possible sets S into those which do and those which do not contain the
number n + 1.
(13) The list of numbers 1, 2, 3, . . . , 2N is written on a sheet of paper. Some-
one chooses N + 1 of these numbers. Prove, by induction, that of those
numbers that were picked, at least one divides another one. (Hint: this
is not easy. Consider cases: 1. What if all chosen numbers are at most
2N − 2? 2. What if at least N of the chosen numbers are at most 2N − 2?
3. If both 2N − 1 and 2N are chosen, ask whether N was chosen. If
yes, something nice happens. If not, focus on the chosen numbers that
are at most 2N − 2, and pretend that N was also chosen (even though
it was not). Now use the inductive hypothesis. How do you deal with the
fact that N was not really chosen? Recall that you DO have 2N − 1 and
especially 2N.)


2. Arithmetic

We must begin with some algebra. We officially meet the definition of a ring
only in week X > 1, but I state it here already. The idea is to list all the important
properties of the set of integers.

Definition I.12. A (commutative) ring R is a collection of things that have
properties like the integers, namely
(1) there is an operation called addition (and usually written with a plus-sign)
on R such that
• r + s = s + r for all r, s in R (“addition is commutative”);
• r + (s + t) = (r + s) + t for all r, s, t in R (“addition is associative”);
• there is a neutral additive element (usually called “zero” and written
0_R) such that r + 0_R = r = 0_R + r;
• for each r there is an additive opposite number (usually called the
“negative”, and written −r) with r + (−r) = 0_R;
(2) there is an operation called multiplication on R (and usually written with
a dot) such that
• r · s = s · r for all r, s in R (“multiplication is commutative”);
• r · (s · t) = (r · s) · t for all r, s, t in R (“multiplication is associative”);
• there is a neutral multiplicative element 1_R (usually called the iden-
tity) such that 1_R · r = r = r · 1_R for each r ∈ R;
(3) the law of distribution applies: r · (s + t) = r · s + r · t for all r, s, t in R.
Note that no assumption is made on being able to divide (although subtraction
is guaranteed, because each ring element has a negative).
Remark I.13. (1) In some cases one may want to consider rings where
the existence of 1_R is not certain, or one may allow r · s and s · r to differ. There
is a place and time for such less pleasant rings, but not here.
(2) We will usually drop the subscripts in 0_R and 1_R if it is clear what ring we
mean.
(3) We often skip the dot and write ab for a · b.

Example I.14. We list some examples of rings. If nothing is said, addition
and multiplication are what you think.
• The integers, Z. (The case after which the definition is modeled).
• The set of real numbers (or the complex numbers, or the rational num-
bers). These three rings are special, because in them all nonzero num-
bers even have inverses (one can divide by them). Such things are called
“fields”.
• The collection R[x] of all polynomials in the variable x with real coeffi-
cients.
• A weird one: look at all expressions of the form a + b√−5 where a and
b are integers. It is clear that adding such things gives other such things.
It is slightly less obvious that multiplying has the same property (check
it!). This ring is denoted Z[√−5].

Exercise I.15. Show that the set of natural numbers N is not a ring. Find
another set that is not a ring and point out why it isn’t. 
2.1. The Euclidean algorithm. The Archimedean property allows us to
formulate division with remainder:
For all a, b ∈ Z with b ≠ 0 there are q, r ∈ Z such that a = bq + r and
0 ≤ r ≤ |b| − 1.

The number r is the remainder of a under division by b.


This property has pleasant consequences. In order to get concrete, recall that
the greatest common divisor and the least common multiple of two integer numbers
are defined as follows.
Definition I.16. Let a, b ∈ Z. Then gcd(a, b) = max{d ∈ N with d|a, d|b} and
lcm(a, b) = min{m ∈ N with a|m and b|m}. In concrete terms, factorize them into
prime powers:
a = 2^{a_2} · 3^{a_3} · · · p^{a_p},
b = 2^{b_2} · 3^{b_3} · · · p^{b_p}.
Then
gcd(a, b) = 2^{min(a_2,b_2)} · 3^{min(a_3,b_3)} · · · p^{min(a_p,b_p)},
lcm(a, b) = 2^{max(a_2,b_2)} · 3^{max(a_3,b_3)} · · · p^{max(a_p,b_p)}. □
Consider the equation a = qb + r derived from integers a > b through the


Archimedean property. If a number d ∈ Z divides a and b then it clearly also
divides r = a − qb. Conversely, a common divisor of b, r also divides a. So, the set
of numbers dividing a, b equals the set of numbers dividing b, r and in particular
gcd(a, b) = gcd(b, r). We now exploit this to make an algorithm to find gcd(a, b).
Example I.17 (Euclid’s Algorithm).
Input: a, b ∈ Z with b 6= 0.
Initialize:
• c0 = a, c1 = b, i = 1.
Iterate:
• Write c_{i−1} = q_i c_i + r_i where q_i, r_i ∈ Z and 0 ≤ r_i ≤ |c_i| − 1.
• Set c_{i+1} = r_i.
Until:
• c_{i+1} = 0 (otherwise, replace i by i + 1 and repeat).
Output: gcd(a, b) = c_i. □
From what we said above, gcd(cj , cj+1 ) = gcd(cj−1 , cj ) at all stages of the
algorithm. In particular, gcd(a, b) = gcd(ci , ci+1 ) = gcd(ci , 0) by our choice of
aborting the loop. The gcd of any number and zero is that “any number”, so
gcd(a, b) is really the last nonzero remainder ci we found.
There is another aspect to the Euclidean algorithm, which is the following.
The last equation says how to write c_i in terms of the previous two: c_i = r_{i−1} =
c_{i−2} − q_{i−1} c_{i−1}. The second to last equation can be used to express c_{i−1} in terms
of c_{i−2} and c_{i−3}. Substituting, we can write c_i in terms of c_{i−2} and c_{i−3}. Iterating
this backwards, one arrives at a linear combination of the form gcd(a, b) = αa + βb
for suitable integers α, β. This is a fact to remember:
Proposition I.18. Working backwards from the end of Euclid’s algorithm de-
termines a Z-linear combination
gcd(a, b) = αa + βb.

Example I.19. Let a = 56 = c0 , b = 35 = c1 . We find 56 = 1 · 35 + 21,


so c2 = 21. Then 35 = 1 · 21 + 14, so c3 = 14. Next, 21 = 1 · 14 + 7 and so
c4 = 7. In the next iteration we get to the end: 14 = 2 · 7 + 0 so that c5 = 0. This
certifies c4 = 7 = gcd(35, 56). Working backwards, 7 = 21 − 14 = 21 − (35 − 21) =
2 · 21 − 35 = 2(56 − 35) − 35 = 2 · 56 − 3 · 35. 
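Both the loop and the back-substitution of Proposition I.18 are easy to mechanize.
Below is a hedged Python sketch (the function names are ours, not the notes'):
gcd follows Euclid's iteration, and extended_gcd carries the coefficients along so
that it returns (g, α, β) with g = gcd(a, b) = αa + βb.

    def gcd(a, b):
        # Euclid: replace (a, b) by (b, r) until the remainder r is 0.
        while b != 0:
            a, b = b, a % b
        return abs(a)

    def extended_gcd(a, b):
        # Iterative version of the back-substitution in Proposition I.18:
        # the invariant old_r == old_s*a + old_t*b holds at every step.
        old_r, r = a, b
        old_s, s = 1, 0
        old_t, t = 0, 1
        while r != 0:
            q = old_r // r
            old_r, r = r, old_r - q * r
            old_s, s = s, old_s - q * s
            old_t, t = t, old_t - q * t
        return old_r, old_s, old_t

    print(extended_gcd(56, 35))  # (7, 2, -3): indeed 7 = 2*56 - 3*35, as in Example I.19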
Exercise I.20. Find gcd(a, b) as a Z-linear combination of a, b in the following
cases:
(1) (a, b) = (192, 108);
(2) (a, b) = (3626, 111);
(3) (a, b) = (34, 13).

2.2. Primes and irreducibles. We now inspect prime numbers. Feel free to
substitute “integer” for “ring element”.
Definition I.21. A unit of a ring R is a number to which there exists an
inverse in the given ring. (Note that this is a relative notion: 2 is a unit in R with
inverse 1/2, but not a unit in Z since the only candidate for an inverse, 1/2, fails
to be an integer. The only units in Z are ±1).
Definition I.22. For ring elements a, b in R we shall write a|b when the number
a divides the number b in R (which just means that there is some element r of R
such that ar = b). If divisibility fails, we write a ∤ b. The example in the previous
paragraph indicates that one needs to know the ring in order to answer divisibility
questions. (2|1 in the ring R since 1/2 ∈ R, but 2 ∤ 1 in the ring Z.)
The element p in the ring R is prime if p|ab implies that either p|a or p|b. On
the other hand, p is irreducible if p = ab implies that one of a and b must be a unit.
Note that if in some ring the element p is prime or irreducible, then the same is
true for −p, its additive opposite. One of the fundamental properties of the integers
is that being prime is the same as being irreducible in Z:
Theorem I.23. For any 0 6= n ∈ Z the statements “n is prime” and “n is
irreducible” are equivalent.
Proof. Choose 0 ≠ n ∈ Z. An integer n is prime if and only if −n is, and it is
irreducible if and only if −n is. So we can actually assume that n ∈ N.
Suppose n ∈ Z is prime. We want to show that it is irreducible, which means
that whenever n = ab appears as a product of natural numbers, then a or b is a unit.
But that is automatic (not specific to the integers) from the definition of “prime”
and “irreducible”: if n = ab then n divides ab and so it must (as a prime) divide
one factor. Say, n divides a so that a = nq with q ∈ N. Now we have n = ab = nqb
and so 1 = qb upon division. It follows that b, as a divisor of 1, is a unit. Again,
this had nothing to do with the integers beyond the fact that we could “cancel” n
in the product above.
Now suppose n is irreducible, and we try to show that it is prime. So, suppose
n divides a product ab with a, b ∈ N; we need to show that n divides a or b. Let g
be gcd(a, n). If g > 1 then n = gq with q ∈ N implies that q is a unit and hence
q = 1, and so n = g divides a. On the other hand, if g = 1 then the Euclidean
algorithm says that we can write 1 = g = αa + βn with α, β ∈ Z. Recall that we
started with n | ab, so cn = ab for some c ∈ N. Multiplying 1 = αa + βn by c
we get c = αca + βcn = αac + βab = a(αc + βb). In particular, a divides c. But
then, the equation cn = ab becomes (c/a)an = ab and cancellation of a shows that
n divides b. Note that this part used the Euclidean algorithm, and is not true for
all rings. □

Theorem I.24. Integers enjoy unique factorization. This means first off that
for all 0 ≠ n ∈ Z there is a prime factorization, which is an equation
n = c · p_1 · · · p_k
where each p_i is a prime number and where c is a unit (which in Z implies c = ±1).
It means secondly that any two such factorizations are almost the same: if
d · q_1 · · · q_ℓ = n = c · p_1 · · · p_k
are two such factorizations (c, d units and p_i, q_j prime) then k = ℓ and (up to sign
and suitable reordering) p_i = q_i.
For example, 14 = 1 · 2 · 7 = (−1) · 2 · (−7) are two different but essentially
equivalent prime factorizations of 14.

Proof. What we need to show comes in two stages: given a natural number
n, we need to show it factors at all into prime factors. And then we need to show
that any two such factorizations agree, up to reordering. (Note that we can focus
on n > 0, since −n = (−1) · n and so a factorization of n corresponds to one of
−n.)
We use strong induction. The base case is clear: 1 and 2 are surely factorizable
into units and primes: 1 = 1 and 2 = 2. So we focus on the crank. So let 2 ≤ n ∈ N
and assume that the numbers 1, 2, . . . , n all have a factorization into positive prime
numbers. (We don’t need to show that we can factor stuff into positive prime
numbers, but it is convenient when the number to be factored is already positive.)
We consider now n + 1. There are two cases: either n + 1 is prime, in which case
we can write n + 1 = 1 · (n + 1) as prime factorization. Or, n + 1 is not prime. Then
n + 1 is also not irreducible (since we showed above that prime = irreducible, hence
not prime = not irreducible), and so it factors as n + 1 = 1 · a · b with a, b not units.
Since n + 1 was positive, we can arrange a, b to be positive, and so they both fall
into the set of numbers 1, 2, . . . , n about which we already know that they can all
be factored. So, factor a = 1 · a_1 · · · a_k and b = 1 · b_1 · · · b_ℓ into primes, so that
n + 1 = a · b = 1 · a_1 · · · a_k · b_1 · · · b_ℓ has a factorization. So, all natural numbers do
have prime factorizations.
Now we need to show that these factorizations are unique. Take any natural
number n with two prime factorizations c_1 · a_1 · · · a_k = n = c_2 · b_1 · · · b_ℓ where c_1, c_2
are units and each a_i, b_j is a prime number. If any a_i or b_j is negative, we can
turn their signs by moving the signs into c_1 and c_2. So all a_i, b_j can be assumed to
be positive.
Since a_1 is prime, and since it divides the product c_2 · b_1 · · · b_ℓ, a_1 must divide
one of the factors of this product. It cannot divide c_2 since c_2 = ±1 and a_1 as a
prime has absolute value 2 or more. So, a_1 divides some b_t, so b_t = a_1 q_1 for some
integer q_1. But b_t was supposed to be prime, hence irreducible, so q_1 is a unit. But
a_1 and b_t are positive, so q_1 = 1 and a_1 = b_t.
Divide out a_1 = b_t to get c_1 · a_2 · · · a_k = n/a_1 = c_2 · b_1 · · · b̂_t · · · b_ℓ, where the
hat indicates that b_t has disappeared from the product. So these are two prime
factorizations for n/a_1. If we now set up a (strong) induction process, we can
assume that we already know that n/a_1 has unique (up to reordering and shuffling
of units) prime factorization. But then, up to units, the a_i for i > 1 are the b_j with
j ≠ t. Since a_1 = b_t, it follows that n also has unique prime factorization. □
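The existence half of this proof is effectively an algorithm: keep splitting off a
smallest factor. Here is a minimal Python sketch of that idea via trial division
(our own illustration, not a method from the notes):

    def prime_factorization(n):
        # Factor n > 1 by repeatedly dividing out the smallest d > 1 that
        # divides n; such a d is necessarily prime.
        factors = []
        d = 2
        while d * d <= n:
            while n % d == 0:
                factors.append(d)
                n //= d
            d += 1
        if n > 1:
            factors.append(n)  # the leftover is prime
        return factors

    print(prime_factorization(14))   # [2, 7]
    print(prime_factorization(360))  # [2, 2, 2, 3, 3, 5]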
What is left in this subsection will be discussed in the distant future. It is only
here to amuse.
Example I.25. In Z[√−5] one can write 6 = 2 · 3 = (1 + √−5) · (1 − √−5). It
turns out that 2, 3, 1 ± √−5 are all irreducible (see the exercise below). So Z[√−5]
does not have unique factorization. □

Exercise I.26.
(1) On R = Z[√−5] define a norm function
N : a + b√−5 ↦ N(a + b√−5) := a^2 + 5b^2 ∈ Z.
Convince yourself that the norm of a number is the square of its (complex)
absolute value (if you read the number as a complex number).
(2) Show that the norm is multiplicative: N((a + b√−5) · (c + d√−5)) =
N(a + b√−5) · N(c + d√−5).
(3) Find all elements of our ring R that have norm 1.
(4) Show that no element has norm 2 or 3.
(5) Show that 2, 3, 1 ± √−5 are all irreducible by inspecting the ways of fac-
toring N(2), N(3) and N(1 ± √−5).

2.3. Some famous theorems and open problems on prime numbers.
The proofs of several theorems here are rather beyond us, but if interested you
might look at [?] for further pointers.
The most basic, famous, and memorable was (together with the proof given
here) already known to Euclid:
Theorem I.27. There are infinitely many prime numbers.
Proof. Suppose p1 , . . . , pk are prime numbers. Then M = p1 · . . . · pk + 1 is
not divisible by any of them. It might be the case that M is prime, but that does
not need to be so. However, M does have a prime factorization, M = c · q_1 · · · q_t
with c a unit and all q_i prime. Since no p_i divides M, none of the q_i is on the list of
primes p_1, . . . , p_k. In other words, any finite list of primes is missing at least one
other prime. □
Recall now the harmonic series
1/1 + 1/2 + (1/3 + 1/4) + (1/5 + 1/6 + 1/7 + 1/8) + (1/9 + · · · + 1/16) + · · ·
whose terms we group as indicated: a_0 = 1/1, a_1 = 1/2, a_2 = 1/3 + 1/4,
a_3 = 1/5 + · · · + 1/8, a_4 = 1/9 + · · · + 1/16, and so on.

Exercise I.28. Show that the harmonic series diverges (has no finite value).
(Hint: what can you say about each a_i?) □
Suppose you only took the terms that are inverses of prime numbers,
1/2 + 1/3 + 1/5 + 1/7 + 1/11 + 1/13 + 1/17 + 1/19 + 1/23 + 1/29 + 1/31 + · · · ,
how does this series behave? We will need to use the following fact.
Exercise I.29. Show:
(1) For any real or complex number x ≠ 1 and any integer n ≥ 0,
(1 − x^{n+1})/(1 − x) = 1 + x + x^2 + · · · + x^n.
(2) As long as |x| < 1, Σ_{i=0}^∞ x^i = 1/(1 − x).

Theorem I.30. The sum of the reciprocals of all the prime numbers is still
divergent.
This of course implies also that there are infinitely many primes. However, its
proof is far more delicate than Euclid’s proof above. We give here the idea behind
the proof; that the steps can be made rigorous is somewhat involved.
Proof. Any positive integer n is uniquely the product of positive prime num-
bers. The emphasis is on “unique”: if you multiply together different sets of numbers
you get different end products, pun intended.
Consider now the product
(1 + 1/2 + 1/4 + 1/8 + · · ·) · (1 + 1/3 + 1/3^2 + · · ·) · (1 + 1/5 + 1/5^2 + · · ·) · · · (1 + 1/p + 1/p^2 + · · ·),
where p is some prime number.
If you actually multiply this out, the resulting mess contains, for each choice
of finitely many primes p1 , . . . , pk bounded by p, the quantity 1/(p1 · · · pk ). So, the
mess actually contains exactly one copy of the inverse of every natural number that
has a prime factorization in which only powers of primes occur that are bounded
by p. Taking the limit p → ∞, one might try to believe in an equation
Σ_{n=1}^∞ 1/n = Π_{p prime} (1 + 1/p + 1/p^2 + 1/p^3 + · · ·),
and conclude that the right hand side is, like the harmonic series on the left, infinite.
The art (which we omit) is to make this argument and all that builds on it, so that
it becomes mathematically sound.
Using the geometric series, this suggests Σ_{n=1}^∞ 1/n = Π_{p prime} 1/(1 − 1/p). Now take
logs on both sides. The RHS turns into Σ_{p prime} ln(1/(1 − 1/p)) = −Σ_{p prime} ln(1 −
1/p). Looking at the graph of the log-function, one sees that ln(1/(1 − x)) ≤ 2x
for 0 ≤ x ≤ 1/2, so Σ_{p prime} ln(1/(1 − 1/p)) ≤ Σ_{p prime} 2/p. But the left hand side
was already infinite, so therefore the sum of the prime reciprocals must be too. □
This theorem says that there are still quite a lot of primes, namely enough
to make the sum diverge. (Remember: if you just looked at the subsum given
by powers of your favorite prime, this would be geometric and hence convergent.)
How many primes are there in comparison to all numbers? This is best asked in
the context of prime density.
Theorem I.31 (Prime Number Theorem). Let p_k be the k-th prime number.
Then the fraction p_k/(k · ln(k)) approaches 1 as k → ∞.
Equivalently, if you pick randomly a number n near the number N then the
chance that n is prime is about 1/ln(N).
Another set of questions one could ask is about primes in arithmetic progres-
sions (rather than in all numbers). That means: let
A(a, n) = {a, a + n, a + 2n, . . .}

be the arithmetic progression starting with a ∈ N and with shift n ∈ N.


Definition I.32. The Euler φ-function attaches to each natural number n the
number of natural numbers less than n that are relatively prime to n.
For example, φ(12) = 4 because of 1, 5, 7, 11, and φ(7) = 6 because of 1, 2, 3, 4, 5, 6.
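Straight from the definition, φ(n) can be computed by counting the k below n
that are coprime to n. A one-line Python sketch (ours, for illustration only):

    from math import gcd

    def phi(n):
        # Count 1 <= k < n with gcd(k, n) == 1. (For n = 1 this strict
        # reading gives 0; the convention phi(1) = 1 is mentioned below.)
        return sum(1 for k in range(1, n) if gcd(k, n) == 1)

    print(phi(12), phi(7))  # 4 6, matching the examples above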
Theorem I.33. If a and n have no common factor, the set A(a, n) contains
approximately x/(ln(x) · φ(n)) prime numbers of size less than x.
If you agree that φ(1) = 1 then this theorem specializes to the Prime Number
Theorem above.
A prime twin is a pair {p, p + 2} of primes (such as {101, 103}).
Conjecture I.34. There are infinitely many twin primes.
Not much is known except that if you look at the subsum of the harmonic
series that is comprised of the reciprocals of twin primes only, then this sum does
converge. So, there are rather fewer twin primes than prime numbers. However,
you might relax your twin focus a little and ask “how many pairs of primes are
there that are no further apart than 70 million?”. In 2013, Yitang Zhang, who has
a mathematics PhD from Purdue, proved that the answer is “infinitely many”; he
was awarded a very prestigious MacArthur Grant for this, see http://www.macfound.
org/fellows/927/.
There are two more famous conjectures. The first is easy to state:
Conjecture I.35. • (Goldbach, strong form) All even integers n > 2
can be written as the sum of two prime numbers.
• (Goldbach, weak form) All integers n > 1 can be written as the sum of at
most three primes.
The other requires a bit of preparation.
Definition I.36. The Riemann zeta function is
ζ(s) := Σ_{n=1}^∞ 1/n^s.

Here, the input s is to be taken as a complex rather than a real number. This sum
will converge (absolutely) if the real part of s is greater than 1 because of things
we know about the geometric series from Exercise I.29. On the other hand, the
harmonic series teaches that at s = 1 the value of ζ is infinite. In the places where
s has real part less than 1, one can use a graduate-level technique called “analytic
continuation” to make sense of the series (even though it probably diverges). The
result is a nice function in s that can have poles every now and then (such as in
s = 1). Values of the zeta function appear in physics (how odd!) and chemistry
(no more even!), and they have a tendency to involve the number π. At negative
even integers, ζ(s) is zero for reasons that come from a “functional equation” that
ζ satisfies:
ζ(s) = 2^s π^{s−1} sin(πs/2) Γ(1 − s) ζ(1 − s).
Here, π is the usual π from trigonometry, and Γ is a version of “factorial” for non-
integer input. (If you believe this equation, you must believe that ζ(−2n) = 0 by
looking at the contribution of the sine.)

Conjecture I.37 (Riemann hypothesis). Apart from negative even integers,


all other values s where ζ(s) = 0 satisfy: s has real part 1/2.
This one is one of the seven Clay Millennium problems, the complete list of
which can be found under http://www.claymath.org/millennium-problems. It
would earn you $1,000,000 to crack it. It also featured (with the Goldbach Con-
jecture) as one of Hilbert’s Problems. This list, see http://en.wikipedia.org/
wiki/Hilbert’s_problems, was compiled by perhaps the last person who un-
derstood all of mathematics as it existed in that person’s lifetime. The
list was presented (in part) at the International Congress of Mathematicians in
1900. It has hugely influenced mathematics and mathematicians during the 20th
century (although lots of mathematics was made that didn’t relate directly to
the list), and the list of Clay Millennium Problems can be viewed as its descen-
dant. Some solutions to Hilbert’s problems have been awarded with a Fields medal
http://en.wikipedia.org/wiki/Fields_Medal.

3. Modular arithmetic
We are perfectly used to claiming that 4 hours after it was 11 o’clock it will be
3 o’clock. In effect, we equate 12 with zero in these calculations. In this section
we learn how to calculate on more general “clocks” and even solve equations on
“clocks”.
3.1. Computing “modulo”: Z/nZ.
Definition I.38. For any integer n, write nZ for the set of all integer multiples
of n, so nZ stands for {. . . , −3n, −2n, −n, 0, n, 2n, . . .}.
For an integer a write then a + nZ for the collection of all integers that leave
the same remainder as a when divided by n, so a + nZ = {. . . , a − 2n, a − n, a, a + n, a + 2n, . . .}.
Note that a + nZ = (a + n) + nZ. Such sets we call cosets modulo n, while
the various integers that float around in a given coset are called representatives.
They are also sometimes written as “a mod n”. If the value of n is understood
from the context, we may write ā for a + nZ. (If n = 12, and if a = 3, then
3̄ = 3 mod 12 = 3 + 12Z = {. . . , −21, −9, 3, 15, 27, . . .}. This is the set of all times
on an absolute clock at which a usual clock shows “3 o’clock”.)
Finally, write Z/nZ (“zee modulo enn zee”) for the collection of all cosets
modulo n. (If n = 12, this is the set of all possible full hours, clustered by what a
12-hour-clock would show).
Here is an example on a very small “clock”.
Example I.39. Let n = 4. There are four cosets modulo 4, namely
0̄ = 0 + 4Z = {. . . , −8, −4, 0, 4, 8, . . .}, 1̄ = 1 + 4Z = {. . . , −7, −3, 1, 5, 9, . . .},
2̄ = 2 + 4Z = {. . . , −6, −2, 2, 6, 10, . . .}, 3̄ = 3 + 4Z = {. . . , −5, −1, 3, 7, 11, . . .}.
So, Z/4Z = {0̄, 1̄, 2̄, 3̄}. Representatives of 3̄ include 3, 7, −133 among others. □
Remark I.40. While this “modular arithmetic” may seem a bit esoteric, be
assured that it is far more important than you can imagine. For example, all
computers on this planet calculate in Z/2Z or a slightly more complicated scheme.
Without modular arithmetic, there would be no twitter, no email, no instagram.
Not even a digital watch. 

Amusingly, one can calculate with these cosets as if they were numbers. Re-
garding addition, we always knew that 4 hours after it was 11 o’clock it will be 3
o’clock because the coset of 11 plus the coset of 4 gives the coset of 15, which is to
say, of 3. That one can also multiply is a new idea:
(a + nZ) + (b + nZ) := (a + b) + nZ;
(a + nZ) − (b + nZ) := (a − b) + nZ;
(a + nZ) · (b + nZ) := (a · b) + nZ.
The amusing part is that this works well on any choice of representatives. For
example, in order to compute (2 + 7Z) + (3 + 7Z) you could say: pick representative
−5 = 2 + (−1) · 7 for 2 and representative 24 = 3 + 3 · 7 for 3. Add them to get 19
and so (2 + 7Z) + (3 + 7Z) = 19 + 7Z. Of course, you probably would have chosen
2 = 2 + 0 · 7 and 3 = 3 + 0 · 7 as representatives, resulting in the coset of 5. The
point is that 5 and 19 are actually the same. In order to prove that this is always
ok and not just in our explicit example you should carry out
Exercise I.41. Show that for all choices of a, a′, b, b′, n with n|(a − a′) and n|(b − b′) one has:
• n divides (a + b) − (a′ + b′); (this says that the cosets of a + b and of a′ + b′ always agree);
• n divides (a − b) − (a′ − b′); (this says that the cosets of a − b and of a′ − b′ always agree);
• n divides ab − a′b′; (this says that the cosets of ab and of a′b′ always agree).
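In case you want machine evidence before writing the proof, here is a minimal Python sketch (the function name check is ours, not part of the course) that tests the three claims on randomly chosen representatives; it is a spot-check, not a proof.

    import random

    # Spot-check that coset arithmetic does not depend on the chosen
    # representatives: replace a, b by other representatives of the same
    # cosets mod n and compare the results of +, -, and *.
    def check(n, trials=1000):
        for _ in range(trials):
            a, b = random.randrange(-100, 100), random.randrange(-100, 100)
            a2 = a + n * random.randrange(-10, 10)   # another rep of a + nZ
            b2 = b + n * random.randrange(-10, 10)   # another rep of b + nZ
            assert (a + b) % n == (a2 + b2) % n
            assert (a - b) % n == (a2 - b2) % n
            assert (a * b) % n == (a2 * b2) % n
        return True

    print(check(7))   # True: no assertion failed for n = 7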

3.2. Divisibility tests. Suppose you are asked “is 1234567 divisible by 3?”.
You could sit down and calculate, or ask a friend with a computer, but you could
also think. Such as: n = 1234567 comes to me as a decimally expanded number, n = 10^k · ak + · · · + 10 · a1 + a0 where k = 6, a6 = 1, a5 = 2, a4 = 3, a3 = 4, a2 = 5, a1 = 6 and a0 = 7. In order to test divisibility of n by 3, I’d like to know whether n mod 3 is zero or not. But n mod 3 = (10^k · ak + · · · + 10 · a1 + a0) mod 3, and “mod” goes well with addition and multiplication:
    n mod 3 = Σ_i (ai mod 3) · (10 mod 3)^i
            = Σ_i (ai mod 3) · (1 mod 3)^i
            = Σ_i (ai mod 3).
It follows that n is a multiple of 3 if and only if the sum of its digits is a multiple
of 3. Of course, if you want, you can reapply this idea to the output:
1234567 mod 3 = (1 + 2 + 3 + 4 + 5 + 6 + 7) mod 3 = 28 mod 3
= (2 + 8) mod 3 = 10 mod 3
= (1 + 0) mod 3 = 1 mod 3.
Hence, 1234567 leaves remainder 1 when divided by 3.
Obviously, a similar argument works for 9 instead of 3, since any power of 10 leaves remainder 1 when divided by 9.
Exercise I.42. Prove that 11 divides n if and only if it divides a0 − a1 + a2 − a3 + · · · + (−1)^k ak.
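Both digit tests are easy to put on a computer. The following Python sketch (the function names are ours) implements the digit-sum idea for 3 and 9 and the alternating sum from Exercise I.42.

    # Digit-sum test for divisibility by 3 or 9, applied repeatedly.
    def digit_sum_mod(n, d):
        while n >= 10:
            n = sum(int(c) for c in str(n))
        return n % d

    # Alternating digit sum a0 - a1 + a2 - ... computes n mod 11.
    def alternating_sum_mod11(n):
        digits = [int(c) for c in str(n)]        # a_k, ..., a_1, a_0
        return sum((-1) ** i * a for i, a in enumerate(reversed(digits))) % 11

    print(digit_sum_mod(1234567, 3))    # 1, as computed above
    print(alternating_sum_mod11(121))   # 0, since 121 = 11 * 11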

Example I.43. Here is a test for divisibility by 7 by way of a picture. Take the decimal representation of your number. Start at the bottom node in the picture. Starting with the front digit, do the following for each digit: go as many steps along simple arrows as the current digit says; then go one step along a double arrow.
If you end up at the bottom node, n is a multiple of 7. (In general, the node index is the remainder of n divided by 7.)

Question: Following the single arrows just counts how big the current digit is.
What is the function of the double arrows?
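We cannot reproduce the picture here, but the usual construction behind such divisibility graphs is Horner's rule: read the decimal digits from the front and update the remainder by r ↦ 10r + digit (mod 7). Under that reading (which is our assumption about the missing picture), the simple arrows add 1 mod 7, and the double arrows multiply by 10 mod 7; this suggests an answer to the question. A sketch in Python:

    # Horner's rule: process the decimal digits front to back and keep
    # track of the remainder mod 7.  (Our reading of the missing picture:
    # "+ digit" = simple arrows, "x 10" = double arrow.)
    def remainder_mod7(n):
        r = 0
        for c in str(n):
            r = (10 * r + int(c)) % 7
        return r

    print(remainder_mod7(7 * 123456))   # 0: multiples of 7 end at the bottom node
    print(remainder_mod7(1234567))      # 5, the remainder of 1234567 mod 7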
CHAPTER II

Week 2: Groups

1. Symmetries

Example II.1. Let T be an equilateral triangle △. We imagine its vertices to carry labels A, B, C, but these are not visibly written on the vertices.
We want to discuss ways to move the triangle around so that it looks the same way after the movement as before. There are the rotations by 0° = e, 120° =: ℓ, 240° =: r. Then there are 3 reflections, a, b, c. Here, a leaves A fixed and interchanges B with C, and so on. One checks there are no other symmetries (3! is an upper bound). So Sym(△) = {e, r, ℓ, a, b, c}.
Symmetries can be composed. For example, rr = ℓ, rrr = e. The 36 products are as follows (the entry in row r and column a, for example, is the composition ar):

      e r ℓ a b c
    e e r ℓ a b c
    r r ℓ e b c a
    ℓ ℓ e r c a b
    a a c b e ℓ r
    b b a c r e ℓ
    c c b a ℓ r e
Note: if you write ra for example, you mean “first a, then r”, in the same way as f(g(x)) evaluates first g(x) and then stuffs this into f. So, we imagine ra really means “r applied to (the result of a applied to the triangle)”.
Definition II.2. The full set of symmetries of a regular n-gon is denoted Dn
and called the n-th dihedral group.
Note that Dn consists of n rotations and n reflections (for n ≥ 3). In contrast, D2 is the symmetries of a line segment, which is just {e, f} where f is the flip exchanging the ends. D1 is the symmetries of a point, so just {e}. The two composition tables are

      e f
    e e f      and      e
    f f e             e e
Remark II.3. It is clear that the row labeled e and the column labeled e always agree with the top row and column that simply list the symmetries. We will henceforth skip this row and column and place e in the upper left corner. So the table for D2 would just be

    e f
    f e

Example II.4. Let OS be the oriented square. Its symmetry group has fewer elements than that of the square, namely only the rotations {e, ℓ, ℓ^2, ℓ^3} with

a composition table

    e   ℓ   ℓ^2 ℓ^3
    ℓ   ℓ^2 ℓ^3 e
    ℓ^2 ℓ^3 e   ℓ
    ℓ^3 e   ℓ   ℓ^2

since ℓ^4 = e and there is no other relation. This group of symmetries is called the cyclic group C4, since you only need to know one element of it (such as ℓ) and every other element is a power of it. The “4” comes from the fact that ℓ^4 = e and no lower positive power will do (or because there are 4 elements in this cyclic group; it is like a clock with 4 hours).
Example II.5. Now we look at the symmetries of the letter H. It has 4 elements: the identity e, the rotation x by 180°, the left-right flip ↔, and the up-down flip ↕. The table is

    e ↔ ↕ x
    ↔ e x ↕
    ↕ x e ↔
    x ↕ ↔ e
(You should actually check a few of the products listed here).
This set of symmetries is called the Klein 4-group and denoted KV4. Felix Klein was the superstar of symmetry investigations. Note that Sym(H) ⊆ Sym(□) since drawing a box □ around the H does not change the symmetries.
Note also that the tables for KV4 and C4 are seriously different, since e shows up on the diagonal with different multiplicity. (The element e is special and can be recognized even if you use a different letter, as it is the one element for which ex = x for every symmetry x.)

2. Groups
We are now ready to define what a group is. It generalizes the symmetry studies
above.
Definition II.6. A group is a set G with an operation · that takes ordered
pairs (g, g 0 ) from G × G and “multiplies” them to other elements of G. (In other
symbols, · : G × G → G). This operation must satisfy the following conditions:
(1) a · (b · c) = (a · b) · c for all a, b, c ∈ G (associativity);
(2) there is an identity or neutral element e ∈ G such that ∀g ∈ G one has
e · g = g · e = g;
(3) ∀g ∈ G there is an inverse element g̃ ∈ G with g · g̃ = g̃g = e.
Remark II.7. (1) As a matter of convenience we often skip the dot and just write ab for a · b (as we have done above for symmetries). Also, one usually writes g^{-1} for the g̃ in item (3) of the definition.
(2) Be warned that one of the conditions you might have placed here is missing: we do not require that ab = ba in general. If you think of group elements as procedures that you compose, this is clear: it usually makes some difference whether you put on socks first and then shoes, or the other way round.
(3) Note the following quirk of this asymmetry: if ab = c then c^{-1} = b^{-1}a^{-1}. Thinking of socks and shoes makes this more obvious. You also have seen this for taking inverses of matrices, and of course a matrix is just a procedure acting on a
vector (by multiplication), so this all fits together. The invertible n × n matrices
with real entries are one of the standard examples of a group. It is called the general
linear group Gl(n, R).
(4) Associativity implies that the product g · g · · · g of many copies of the same element is uniquely defined and does not depend on the order in which we multiply the copies. (For example, you could take 4 copies and multiply them like ((gg)g)g or like ((gg)(gg)). For 3 factors, this is explicit from the associativity rule, and for more than 3 we discuss it in Lemma II.13 below.)
Theorem II.8. The symmetries on any chosen object form a group.
Proof. The set G is the set of symmetries, the operation of G is composition
of symmetries. The identity is the symmetry that does not move. The inverse of a
symmetry is the symmetry done backwards. The associativity rule comes from the
fact that it holds for composition of functions (where it boils down to reading the
definition of composition). 
For each group one can write down a table similar to the tables we have looked at for symmetries. For group tables one uses the phrase Cayley table. You surely have noticed at this point that each row and column of such a table contains each element (once). That is no accident: if you had ac = bc, for example, the same element showing up as a product twice in the same column, then also (ac)c^{-1} = (bc)c^{-1} and so a(cc^{-1}) = b(cc^{-1}), or ae = be, which entails a = b according to the various group axioms. We say that groups have the cancellation property.
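Here is a quick Python sketch (ours, not part of the course) that exhibits the cancellation property in a concrete Cayley table: for (Z/6Z, +), every row and every column is a permutation of the group elements.

    # Build the Cayley table of (Z/6Z, +) and check that each row and
    # column contains every element exactly once.
    n = 6
    table = [[(a + b) % n for b in range(n)] for a in range(n)]
    assert all(sorted(row) == list(range(n)) for row in table)
    assert all(sorted(col) == list(range(n)) for col in zip(*table))
    print("every row and column of the Cayley table is a permutation")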
Example II.9. Here is a list of important groups with their operations. The ∗
just indicates usual multiplication.
(1) (Z, +), the integers with addition.
(2) (Z/nZ, +), modular addition (verification in HW);
(3) (Rn , +), the n-dimensional vector space has as part of its axioms the group
axioms for +;
(4) ({1, −1}, ∗), with a Cayley table similar to that of the dihedral group D2;
(5) (R>0, ∗), with identity 1 and inverse 1/x;
(6) (R \ {0}, ∗), which contains both previous groups and uses the same operation, identity and inverse;
(7) (Gl(n, R), ∗) and (Gl(n, C), ∗) as previously mentioned;
Example II.10. We consider here the list of all possible groups with 4 or fewer
elements.
(1) If |G| = 1 then G is just e and the Cayley table is that of the dihedral
group D1 .
(2) If |G| = 2 then G = {e, f } for some other element f , and by the cancellation
rule f f can’t be ef and so must be e. So G has a Cayley table essentially that of
the dihedral group D2 .
(3) If |G| = 3, G = {e, a, b}. Since ab can’t be ae = a, but also not eb = b, by cancellation, it must be ab = e. Then we are forced to concede aa = b and bb = a, and so a^3 = e. So the table is the one you get from the rotational symmetries of the equilateral triangle alone:

    e   a   a^2
    a   a^2 e
    a^2 e   a

with b = a^2. This is essentially C3.
(4) If |G| = 4, with elements e, a, b, c, then by the same reasoning as before, ab is e or c.
First case: ab = c. Then ae = a, ab = c, and since ac can’t be c (since ec is c), we conclude ac = b. That then settles it, using associativity: we get

    e a b c
    a e c b
    b c e a
    c b a e

This is, up to relabeling a to ↔, b to ↕, and c to x, the same table as that of KV4.
Second case: ab = e. Then a and b are mutual inverses. Since e is its own inverse ee = e, c must also be (for lack of other partners) its own inverse cc = e. Moreover, for cancellation reasons, ac can’t be a or c, and it is not e (since the inverse of c is not a but c itself), and so ac = b. We now know ae = a, ab = e, ac = b. Thus, aa = c. Next, in the same way we found ac = b we also find ca = b. That forces cb = a since cc = e, ca = b, ce = c. At this point, our knowledge is:

    e a b c
    a c e b
    b e
    c b a e

But now the b-row is automatic. In particular, one sees a = a^1, c = a^2, b = a^3, e = a^4. This is the same table as for C4, just with the letter a replacing the letter ℓ.
Definition II.11. If two groups G and G′ of equal size permit a pairing of their elements such that renaming the elements of G by their partner elements of G′ turns the Cayley table of G into the Cayley table of G′, then we call G and G′ isomorphic and write G ≅ G′.
For example, Sym(S) ≅ D2 ≅ Sym(A), although the actual motions that carry S to S (the rotation x) and A to A (the flip ↔) are very different. We only care about the abstract relationships of the symmetries, and they are in both cases given by the table

    e x
    x e

with x being the rotation in one case, and ↔ in the other.

3. Cyclic and Abelian groups


Definition II.12. A group (G, ·) is called cyclic if there is some element g ∈ G such that every other element g′ ∈ G is a (possibly negative) power of g.
The element g is a generator for G.
The standard example is (Z, +), where there are two generators: every integer is a multiple of 1, but it is also a multiple of −1.
Other examples include the group of rotational symmetries on a regular n-gon
(these are the rotations that form 50% of the dihedral group Dn , n ≥ 3), and the
groups (Z/nZ, +) for any n ∈ N.
Writing down the Cayley tables for these cyclic groups one notices that these
Cayley tables are all symmetric. In other words, ab = ba for all a, b in such a group.
This is no accident as we show now.
Lemma II.13. If G is cyclic, generated by g ∈ G, then for all elements a, b ∈ G
we have ab = ba.
Proof. In fact, we pay back a debt here on the meaning of g^i. We denote by g^2 the product gg, and by g^3 the product g(gg) = (gg)g, the results being the same by
associativity. For higher powers, argue as follows. Suppose we have proved that the product of k copies of g is independent of the placement of parentheses. Then, for i + j = k + 1 and i, j > 0 we have (g^i)(g^j) = (g^i)(g(g^{j−1})) = ((g^i)g)(g^{j−1}) = (g^{i+1})(g^{j−1}). So one may shuffle one copy of g after the other from one factor to the other without changing the product. So, a product of k copies of g only depends on g and k, but not on the placing of parentheses.
Let a, b be in a cyclic group generated by g. According to the definition of a cyclic group, there are numbers i, j ∈ Z such that a = g^i, b = g^j. But then g^i g^j = g^j g^i since they are both the product of i + j copies of g. □
Definition II.14. If in a group (G, ·) it is true that gh = hg for all g, h ∈ G
then G is Abelian.
Cyclic groups are Abelian, but lots of groups are not, such as Sym(△). (The elements a, b, c only have two different powers, e and themselves, and ℓ, r only have the three powers e, ℓ, r.) Also, Sym(H) is not cyclic, as one sees easily from the squares: every element of Sym(H) squares to e, so no element has four distinct powers.
The question when a power of an element is e seems to be important:
Definition II.15. For an element g of the group (G, ·), the smallest number k ∈ N>0 such that g^k = e is its order ord(g). (There might not be such a k, like for 3 ∈ (Z, +) for example; we then say ord(g) = 0 or ord(g) = ∞.)
We call |G| the order of the group.

Inside Sym(△), both the powers of ℓ and the powers of a form what we call a subgroup.
Definition II.16. If (G, ·) is a group, then a subgroup is a subset H ⊆ G
which, when equipped with the multiplication of G, is a group in its own right.

As mentioned, H1 = {e, ℓ, r} and H2 = {e, a} are subgroups of Sym(△). The Cayley table of a subgroup is simply the appropriate subtable of the Cayley table for G.
Note that G counts as a subgroup of G, but the empty set is not a subgroup. This is because one group axiom postulates the existence of an identity in G, so {e} is the smallest subgroup of any G (called the “trivial subgroup”). A subgroup different from G and {e} is called a proper subgroup.
Remark II.17. (1) If you recall the idea of a vector subspace, there was a criterion that said “if W ⊆ V is a subset then it is a subspace provided that W is closed under addition, and under scaling by real numbers”. There is a similar test for subgroups: ∅ ≠ H ⊆ G is a subgroup if for all h1, h2 ∈ H the element h1^{-1}h2 is again in H.
Why? Associativity is inherited from G; e is in H because if h ∈ H then by the test, h^{-1}h = e is in H; if h ∈ H then h^{-1}e = h^{-1} is also in H; and H is then closed under multiplication since for h1, h2 ∈ H we now know h1^{-1} ∈ H, so (h1^{-1})^{-1}h2 = h1h2 is in H.
(2) If H ⊆ G is a subgroup and h ∈ H then the order of h as element of H is
the same as the order of h as element of G, since we use the same operation.
Example II.18. This is rehashing a previous remark. Suppose G and G′ are groups of the same size, and assume further that there is a bijection between the elements of G and the elements of G′ that turns one Cayley table into the other. (We called such groups isomorphic.)
If you take an element g ∈ G then the order of g in G is the same as the order of its twin in G′. This follows from the translation of the Cayley tables. Basically, this says that if φ is the bijection then φ(a ·G b) = φ(a) ·G′ φ(b).
The upshot is that one can use order to discriminate between groups. For
example, KV4 is not C4 because KV4 has 3 elements of order 2, and C4 only one.
One can also count subgroups and compare: KV4 has 5 subgroups, namely {e}, {e, ↔}, {e, ↕}, {e, x}, KV4. But C4 has only three: {e}, {e, ℓ^2}, C4. So these two groups cannot be isomorphic.
Recall that for sets A, B the Cartesian product A × B is the set of all ordered
pairs (a, b) with a ∈ A, b ∈ B.
Definition II.19. If G, G′ are groups, then G × G′ is also a group, with multiplication (g1, g1′) · (g2, g2′) = (g1 ·G g2, g1′ ·G′ g2′).
For example, (R^2, +) is simply (R, +) × (R, +).
Example II.20. The cyclic groups C2 = {e, a} with a^2 = e and C3 = {e, b, b^2} with b^3 = e have Cayley tables as discussed earlier. In these groups, e has order 1, a has order 2, and b has order 3. What about elements of C2 × C3?
The list of elements has 2 × 3 members, namely (e, e), (e, b), (e, b^2), (a, e), (a, b), (a, b^2). One sees easily that (e, e) has order 1 = lcm(1, 1); (e, b) and (e, b^2) have order 3 = lcm(1, 3); (a, e) has order 2 = lcm(2, 1); and (a, b) and (a, b^2) have order 6 = lcm(2, 3).
(We explain the lcm statements: in general one has ord(x, y) = lcm(ord(x), ord(y)). Why? Surely, raising (x, y) to the power lcm(ord(x), ord(y)) gives (e, e). Now suppose y^k = e. Writing gcd(ord(y), k) = u · ord(y) + v · k with u, v ∈ Z, we get y^{gcd(ord(y),k)} = (y^{ord(y)})^u (y^k)^v = e. But the gcd can’t be bigger than ord(y) because it needs to divide it, and it can’t be smaller than the order because of the definition of order. The only way out is that gcd(ord(y), k) = ord(y). So the order of y divides any k with y^k = e. Similarly, the order of x divides any exponent i with x^i = e, and the order of (x, y) divides any exponent i with (x^i, y^i) = (e, e). So whatever the order of (x, y) is, it must be a multiple of ord(x) and ord(y), while being as small as possible. That is simply the lcm.)
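The lcm rule is easy to test by machine. A small Python sketch (the function names are ours), written additively with C2 = Z/2Z and C3 = Z/3Z:

    from math import gcd

    # Order of a + nZ in (Z/nZ, +) is n / gcd(n, a); see Theorem III.8
    # later in these notes for the general statement.
    def order_in_Zn(a, n):
        return n // gcd(n, a)

    def lcm(a, b):
        return a * b // gcd(a, b)

    for x in range(2):
        for y in range(3):
            print((x, y), "has order", lcm(order_in_Zn(x, 2), order_in_Zn(y, 3)))
    # (1, 1) has order 6 = lcm(2, 3), so C2 x C3 is cyclic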

4. Automorphisms
Definition II.21. An automorphism of a group G is a relabeling of its elements that preserves the Cayley table.
For example, C3 is the group {e, a, b} with rules ab = ba = e, ea = ae = a, be = eb = b. This is completely symmetric in a, b. So the bijection
a ↦ b,
b ↦ a,
e ↦ e
is an automorphism of C3. (Geometrically, this switches left rotation with right rotation in the rotational symmetries of an equilateral triangle.)
In principle, an automorphism is just a special permutation of the elements.
So one can search through all permutations and just keep those that preserve the
Cayley table. This is not efficient if G has many elements; one should use the group structure in the search.
Note that one possible automorphism is always just to leave everything as is.
That is like the e in a group. In fact, automorphisms do form a group in their
own right. e ∈ Aut(G) is the relabeling that sends every element of G to itself;
multiplication of two automorphisms is just doing one relabeling after the other;
the inverse of an automorphism is the relabeling done backwards.
Looking at C3 : there are only two automorphisms, the identity on C3 , and the
switch discussed above. This is because eG must be sent to eG (yx = x for all x is
something only the element y = e does, and relabelings must preserve products!).
Composing the switch with itself gives the identity on C3 . So, it is fair to say that
Aut(C3 ) is basically the group with table as in Example II.10 part (2).
Another interesting example occurs in C4, which is the group of symmetries of the oriented square, with elements a = ℓ, b = ℓ^2, c = ℓ^3 and the understanding ℓ^4 = e. Here one can interchange a and c while keeping e, b fixed:
a ↦ c,
c ↦ a,
b ↦ b,
e ↦ e.
It is easy to check that this relabeling preserves the table when written with a, b, c, e.
Again, this is the only automorphism besides the identity on C4, since we must send e to e (the only element of order one) and b to b (the only element of order two). So, Aut(C4) is the “same” group as Aut(C3), both isomorphic to C2, sameness in the sense that their Cayley tables are the same after renaming.
Example II.22. The automorphisms of KV4 are more interesting.
Suppose we fix a. Then we could fix b but that also forces us to fix c as the only
remaining element of order 2. That then comes down to the identity, sending each
element of KV4 to itself. Alternatively, if we do not fix b, the only open destination
for b is c. So a 7→ a, b 7→ c, c 7→ b.
Alternatively, we can try sending a 7→ b. If we fix c then we are in a similar
situation as before, because then b must go to a. On the other hand, we could send
a 7→ b and b 7→ c which forces c 7→ a.
The cases where a 7→ c are similar, with the letters b and c exchanged.
Altogether, the 6 options are summed up in the following table, where each
row represents an automorphism, and where it sends the elements of G is recorded
in the row.
         e a b c
    ψe   e a b c
    ψa   e a c b
    ψb   e c b a
    ψc   e b a c
    ψℓ   e b c a
    ψr   e c a b
For notation: ψe keeps everyone fixed; ψx for x ∈ {a, b, c} keeps e, x fixed and switches the other two; ψℓ encodes a rotation (b, c, a) of the letters a, b, c to the left, in the sense that we read the sequence (b, c, a) as the instruction “a goes where b
was, b goes where c was, and c goes where a was” (which is indeed a rotation to the left); and ψr moves them to the right, according to the instruction (c, a, b).
The notation is intentionally reminding you of Sym(△). Indeed, if you align ψx in Aut(KV4) with x ∈ Sym(△) then you find this to be an isomorphism (see Definition IV.7 below): it is a one-to-one correspondence between the elements of Sym(△) and the elements of Aut(KV4). For example, ψℓ after ψa first sends a to a, and then to b. And it sends b first to c, and then that c is sent to a. So ψℓψa is e ↦ e, a ↦ b, b ↦ a, c ↦ c. This is the same effect as that of ψc, and so ψℓψa = ψc. If we compare to the Cayley table of Sym(△) then we also have correspondingly ℓa = c. Checking the entire list of products, we see Aut(KV4) ≅ Sym(△).

5. Free groups
Definition II.23. A group is free (on the elements g1 , . . . , gk ) if, for some
k ∈ N, it is isomorphic to the group Fk of all words in the letter set Lk =
{e, x1 , . . . , xk , y1 , . . . , yk } with the rules (and no other rules) of
• ez = z = ze for all z ∈ Lk;
• xi yi = e = yi xi for all 1 ≤ i ≤ k;
• associativity.
Here, the group operation is simply writing two words next to each other in the
given sequence.
These groups are “free” because their elements have no other constraints aside
from the group axioms. They are not Abelian for k > 1 (since we do not require
xixj = xjxi). In contrast, F1 = {. . . , y1^2, y1, e, x1, x1^2, . . .} is isomorphic to the Abelian group (Z, +) via the identification x1^k ↔ k ∈ Z, y1^k ↔ −k ∈ Z.
There are also free groups on infinitely many letters. We will not look at them much.
It is a fact that all subgroups of a free group are free (basically because there are no relations, but the proof is not so easy), and somewhat shockingly, F2 contains subgroups isomorphic to F3, F4, . . .. We won’t discuss this phenomenon.
It is also a fact that one can take any group G and interpret it as a free group
“with extra rules”.
Definition II.24. If G is a group we call a list L of elements a generating set
if every element of G is a product of elements from L ∪ L0 where L0 is the list of
inverses of L.
If such list has been chosen, we refer to elements of L as generators.
Evidently, L = G is a generating set, although usually not an interesting one.
Example II.25. Z × Z is generated by {(1, 0), (0, 1)}. Because of this we can view Z × Z as “F2 with the additional rules x1x2 = x2x1 and x1y2 = y2x1 and x2y1 = y1x2 and y1y2 = y2y1”.
To see this, note first that we get the relations x1y1 = y1x1 and x2y2 = y2x2 for free, because all four of these products give e.
Secondly, we read x1 as (1, 0) and x2 as (0, 1), which then suggests y1 is (−1, 0) and y2 is (0, −1). Then all additional rules imposed on F2 above correspond to Z × Z being Abelian.
CHAPTER III

Week 3: Z/nZ and cyclic groups

The main hero this week is the group Z/nZ with addition, where n ∈ N.
Recall that it is a cyclic group, generated by the coset of 1. The order of the element
1 + nZ is n as one easily sees, and the order of Z/nZ is also n.
All groups Z/nZ are Abelian, because Z is Abelian and we just install new
rules in order to make Z/nZ from Z.

1. Subgroups of cyclic groups


We want to study first how different the elements in Z/nZ are for the purpose
of generating subgroups.
Example III.1. Let G = Z/12Z. We check for each element of G what group
it generates inside G. We find:
• 1 + 12Z, 5 + 12Z, 7 + 12Z, 11 + 12Z all generate all of G. For example, the
multiples of 7 are {7, 2, 9, 4, 11, 6, 1, 8, 3, 10, 5, 0} in that sequence.
• 2 + 12Z, 10 + 12Z both generate the subgroup of cosets of even numbers.
• 3 + 12Z, 9 + 12Z both generate the subgroup of cosets of numbers divisible
by 3.
• 4 + 12Z, 8 + 12Z both generate the subgroup of cosets of numbers divisible
by 4.
• 6 + 12Z generates the subgroup of cosets of numbers divisible by 6.
• 0 + 12Z generates the subgroup of cosets of numbers divisible by 12.
Note that the elements listed in the same item above always have the same order (this is kind of obvious since the order of an element is precisely the order of the cyclic group it generates, and we have grouped in the same item the elements that generate the same group).
Note also that if we had asked “classify the elements of Z/12Z by their order”, we would have written the exact same list. This is because to each possible subgroup size (namely, 1, 2, 3, 4, 6, 12) there is exactly one subgroup of that size, even though there are usually several different ways to generate that subgroup.
It is natural to ask at this point how one can predict which elements will
generate the same subgroup. But perhaps an easier question is “if I take k + nZ,
what is its order?”. We now consider these questions. For this we collect some
facts.
Lemma III.2. If ord(g) = n > 0 then the exponents i with g^i = e are precisely the multiples of n.
In other words, g^i = g^j if and only if n|(i − j).
Proof. If i = kn then g^i = (g^n)^k = e^k = e. Conversely, if g^i = e (and i > 0) and also g^n = e, then write the gcd of i, n as a linear combination an + bi with
a, b ∈ Z. Note that this gcd is positive since n, i are. Then compute g^{an+bi} = (g^n)^a (g^i)^b = e^a e^b = ee = e. So gcd(n, i) is an exponent that when used over g gives e. But n = ord(g) is supposedly the smallest positive exponent of this sort. So, gcd(n, i) = n and so n|i.
For the last part, g^i = g^j implies, when multiplying with the inverse of g^j, that g^{i−j} = e, which then by the first part gives n|(i − j). If on the other hand we have n|(i − j) then g^{i−j} = e and so g^i = g^j. □
Definition III.3. If g ∈ G we write ⟨g⟩ for the group of all powers (negative and positive) of g in G. This is the cyclic subgroup generated by g.
Corollary III.4. Up to renaming, (⟨g⟩, ·) is (Z/ord(g)Z, +), in the sense that the renaming identifies the Cayley tables.
Proof. Let n = ord(g). Then we associate to g^i ∈ ⟨g⟩ the element i + nZ in Z/nZ. Then g^i · g^j = g^{i+j} corresponds to (i + nZ) + (j + nZ) = (i + j) + nZ, and g^n = e corresponds to the sum of n copies of 1 + nZ, which is 0 + nZ. □

The next result then tells us how the groups generated by powers of g ∈ G will
look.
Corollary III.5. Let g ∈ G have order n. Then the group ⟨g^k⟩ generated by g^k is the same group as the group ⟨g^{gcd(n,k)}⟩ that is generated by g^{gcd(n,k)}. Moreover, abstractly this group is the same as the cyclic group C_{n/gcd(n,k)}.
Proof. By the same argument as in the previous proof, ⟨g^k⟩ contains g^{gcd(n,k)}, and so also all its powers. Conversely, gcd(n, k) divides k and so of course ⟨g^{gcd(n,k)}⟩ contains g^k and all its powers. So, the groups ⟨g^k⟩ and ⟨g^{gcd(n,k)}⟩ are contained one in the other in both directions and hence equal.
Let h = g^{gcd(n,k)}. What could the order of h be? Write n = d · gcd(n, k); then h^d = (g^{gcd(n,k)})^d = g^n = e, and so the order of h is no more than d. But if h^i = e for some i < d then we also have e = h^i = g^{gcd(n,k)·i}, and this would contradict ord(g) = n since gcd(n, k) · i < gcd(n, k) · d = n. □
We can now complete a table from above on subgroups of Z/12Z:
k (so g := k + 12Z)   size of ⟨g⟩   gcd(12, k)   12/gcd(12, k) = ord(k + 12Z)
1, 5, 7, 11 12 1 12
2, 10 6 2 6
3, 9 4 3 4
4, 8 3 4 3
6 2 6 2
0 1 12 1
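The table is easy to reproduce by machine; a short Python sketch (ours):

    from math import gcd

    # ord(k + 12Z) = 12 / gcd(12, k), which is also the size of <k + 12Z>.
    n = 12
    for k in range(n):
        print(k, "generates a subgroup of size", n // gcd(n, k))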
Looking at this table, the next natural question is: how do you predict the exponents i that give an equation ⟨g⟩ = ⟨g^i⟩?
As a starter, let’s ask for the generators of Z/nZ, the elements for which ⟨g⟩ is the entire group Z/nZ. For n = 12, the relevant cosets are 1, 5, 7, 11. These are the numbers that are coprime to 12. (Note: any representative of k + nZ is coprime to n when k is coprime to n. For example, gcd(5, 12) = 1 and so also gcd(5 + 127 · 12, 12) = 1, and 5 + 127 · 12 lives in the same coset as 5.)
The magic therefore lies in coprimeness.
Definition III.6. For n ∈ Z let φ(n) be the Euler φ-function that counts the number of cosets in Z/nZ that consist of representatives coprime to n.
For example, φ(12) = 4 since modulo 12 the cosets 1+12Z, 5+12Z, 7+12Z, 11+
12Z are those that are made of numbers coprime to 12.
Lemma III.7. If G = ⟨g⟩ is cyclic of order n then the generators of G are exactly the elements g^k with gcd(n, k) = 1.
Proof. Any element h of G is some power h = g^k of g since G = ⟨g⟩. A generator is an element g^k of G with ⟨g^k⟩ = G, which is the case exactly when ord(g^k) = n. But ord(g^k) = n/gcd(n, k), and so we find that g^k is a generator if and only if gcd(n, k) = 1. So counting the generators is the same as counting the cosets of Z/nZ that are made of numbers coprime to n. □

We can now move on and ask when ⟨g^i⟩ = ⟨g^j⟩ for some exponents i, j. Since the size of ⟨g^i⟩ is n/gcd(n, i), we find the implication
[⟨g^i⟩ = ⟨g^j⟩] ⇒ [gcd(n, i) = gcd(n, j)].
In reverse, if the gcd equality holds, then gcd(n, i) = gcd(n, j) is a divisor of j, which forces g^j inside ⟨g^{gcd(n,i)}⟩, and so ⟨g^i⟩ = ⟨g^{gcd(n,i)}⟩ contains ⟨g^j⟩. Exchanging i, j gives the reverse containment, hence an equality.
We have now seen all parts of
Theorem III.8. Let g be an element of order n. So ⟨g⟩ is Cn up to relabeling.
(1) Subgroups of cyclic groups are always cyclic.
(2) For all i ∈ Z, ord(g^i) divides n and equals n/gcd(n, i).
(3) If k|n then there is a unique subgroup of size k inside ⟨g⟩, and it is exactly ⟨g^{n/k}⟩, the set of powers of g^{n/k}.
(4) If k|n then the number of elements of order k inside ⟨g⟩ is equal to φ(k). If k does not divide n, no elements have order k.
(5) Obviously, if g^i generates a subgroup of order k then it does not generate a subgroup of order different from k. It follows from the previous item that n = Σ_{d|n} φ(d).

To see the last part in action, look at Z/12Z. Our table above on elements and the groups they generate runs in the left column through all the cosets and puts them into one row if they generate the same subgroup. There are 12 such elements; they get grouped as
12 = 4 + 2 + 2 + 2 + 1 + 1 = φ(12) + φ(6) + φ(4) + φ(3) + φ(2) + φ(1).
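Item (5) of the theorem is also pleasant to check by machine. A Python sketch (ours), with φ computed by brute force:

    from math import gcd

    # Brute-force Euler phi: count 1 <= a <= n with gcd(a, n) = 1.
    def phi(n):
        return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

    n = 12
    divisors = [d for d in range(1, n + 1) if n % d == 0]
    print(sum(phi(d) for d in divisors))   # 12, as predicted
    print([phi(d) for d in divisors])      # [1, 1, 2, 2, 2, 4]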

2. Products and simultaneous modular equations


Example III.9. • Z/2Z × Z/2Z is not cyclic, since no element can have
order 4.
• Z/3Z × Z/2Z is generated by (1, 1).
Lemma III.10. G := Z/n1Z × Z/n2Z × · · · × Z/nkZ is cyclic if and only if gcd(ni, nj) = 1 for every pair i ≠ j.
Proof. If the gcd condition is in force, take the element g = (1, . . . , 1). Its order is a multiple of every ni, but as the ni have no common factors, it is then a multiple of the product n1 · · · nk, which is |G|. But no element can have order greater than |G|, so ord(g) = n1 · · · nk and so g generates G.
On the other hand, any element of G is always of order at most lcm(n1, . . . , nk), since this power creates the neutral element in every component of the product. If gcd(ni, nj) > 1 for any i ≠ j then this lcm cannot be the product n1 · n2 · · · nk = |G|, so every element’s order is less than |G|. So G will then have no element of order |G|. □
In particular, this says that a product Z/p1^{e1}Z × Z/p2^{e2}Z × · · · × Z/pk^{ek}Z for distinct primes p1 < p2 < . . . < pk is cyclic.
Note that our first example showed that distinctness is crucial.
Example III.11. Let’s try to make this more explicit. We know that Z/7Z ×
Z/5Z is cyclic, and must be of order 7×5 = 35. So abstractly we know Z/7Z×Z/5Z
is Z/35Z in disguise. But can we see that inside Z/35Z?
We are looking for an identification of Z/35Z with the product Z/7Z×Z/5Z that
preserves the Cayley table (which means it has to preserve the group operation +).
Let’s make a naïve guess: take i + 35Z and attach to it the element (i + 7Z, i + 5Z) in
Z/7Z×Z/5Z. Surely, this attachment will respect addition since (i+35Z)+(j+35Z)
would be attached to (i + 7Z, i + 5Z) + (j + 7Z, j + 5Z) = ((i + j) + 7Z, (i + j) + 5Z)
as you would expect. We write π for this recipe, π(i + 35Z) = (i + 7Z, i + 5Z).
(Important note here: in Z/35Z, we have grouped numbers together into a coset
whenever they differ by a multiple of 35. Since multiples of 35 are also multiples
of both 5 and 7, we can make “cosets of cosets” and read for example the cosets
3 + 35Z, 8 + 35Z, 13 + 35Z, 18 + 35Z, 23 + 35Z, 28 + 35Z, 33 + 35Z as a partition of
the coset 3 + 5Z in Z/5Z. So, moving from i + 35Z to i + 5Z actually makes sense
since it does not destroy cosets but preserves them and makes them even larger. So
it is actually legal to go from Z/35Z to Z/5Z by the assignment “i + 35Z becomes
i + 5Z”. Same argument for going from Z/35Z to Z/7Z. But you could not, for
example, go from Z/35Z to Z/6Z: in Z/35Z, 3 and 38 belong to the same coset,
but in Z/6Z they do not. Destroying cosets is not legal when moving groups about.)
So we have a way to go from Z/35Z to Z/7Z × Z/5Z. Big question, how do we
go back? In other words, given a pair (a + 7Z, b + 5Z) in Z/7Z × Z/5Z, how do we
find i + 35Z such that (a + 7Z, b + 5Z) = π(i + 35Z)?
What we know is that this is supposed to work based on the fact that 5 and 7
are coprime. So gcd(7, 5) = 1 must get used somewhere. The Euclidean algorithm
says that there are numbers x, y ∈ Z with 1 = 7x + 5y. (Specifically, x = −2 and
y = 3 works.) Then let’s consider the number i = a · y · 5 + b · x · 7. (That one should look at this particular number is not obvious; it only becomes clear after a good number of examples.) Then we compute:
(a · y · 5 + b · x · 7) + 7Z = a · y · 5 + 7Z = a(1 − 7x) + 7Z = a + 7Z,
(a · y · 5 + b · x · 7) + 5Z = b · x · 7 + 5Z = b(1 − 5y) + 5Z = b + 5Z.
We have basically proved:
Lemma III.12. If m, n are relatively prime and a, b ∈ N are given, then the
simultaneous equations
i mod mZ = a mod mZ,
i mod nZ = b mod nZ
have a solution given by i = a · y · n + b · x · m where 1 = mx + ny. 
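The recipe of the lemma is entirely constructive; here is a Python sketch (the function names are ours) that finds x, y with the extended Euclidean algorithm and assembles i.

    # Extended Euclid: return (g, x, y) with g = gcd(m, n) = m*x + n*y.
    def extended_gcd(m, n):
        if n == 0:
            return m, 1, 0
        g, x, y = extended_gcd(n, m % n)
        return g, y, x - (m // n) * y

    # Solve i = a mod m and i = b mod n for coprime m, n (Lemma III.12).
    def crt(a, m, b, n):
        g, x, y = extended_gcd(m, n)
        assert g == 1, "m and n must be coprime"
        return (a * y * n + b * x * m) % (m * n)

    print(crt(3, 7, 2, 5))   # 17, and indeed 17 = 3 mod 7, 17 = 2 mod 5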


Remark III.13. If three pairwise coprime numbers m, n, p are given, one can also solve simultaneous equations
i mod mZ = a mod mZ,
i mod nZ = b mod nZ,
i mod pZ = c mod pZ.
First deal with two equations, then throw in the last.

3. U (n): Automorphisms of Z/nZ


We have seen that the generators of (Z/nZ, +) are the cosets a + nZ for elements a that have the property gcd(n, a) = 1. So for example, we can think of Z/5Z as the group ⟨1 + 5Z⟩ generated by 1 + 5Z as we usually do, but also as the group ⟨3 + 5Z⟩. Abstractly, there is no difference how we think. The two interpretations
align any coset a + 5Z with the coset of 3a + 5Z, since we are required to respect the group operation + and so the a-fold sum (1 + · · · + 1) + 5Z must correspond to the a-fold sum (3 + · · · + 3) + 5Z. Then this correspondence ψ is as follows:

    g           0 + 5Z   1 + 5Z   2 + 5Z   3 + 5Z   4 + 5Z
    ψ(g) = 3g   0 + 5Z   3 + 5Z   1 + 5Z   4 + 5Z   2 + 5Z
You could think of this as having a clock with 5 hours that fell off the table.
Now the clockwork is still ok, but the face is broken. You try to reassemble the
face in such a way that the clock still works, but you make a mistake and read “3”
as “1” in the dark. It’s still a clock with 5 hours, but made for aliens that count
3, 1, 4, 2, 5 = 0 instead of how we count.
Instead of sending 1 + 5Z to 3 + 5Z we could have taken any other generator. BUT, we could not have sent it to 0 + 5Z since that is not a generator.
Going back to gcd tests for being a generator, note that gcd(ab, n) = 1 for a, b ∈ Z happens if and only if both gcd(a, n) = 1 and gcd(b, n) = 1. We conclude that if we take two generators a + nZ and b + nZ of the group Z/nZ then their product is another such generator. That leads to the idea of taking the set of generators for Z/nZ and turning it into a group with multiplication.
Definition III.14. Let n ∈ Z and define U (n) to be the subset of Z/nZ whose elements are the cosets a + nZ with gcd(a, n) = 1. We call U (n) the n-th unit group.
Each element u + nZ of U (n) corresponds to an automorphism of Z/nZ that is
determined by sending 1 + nZ to u + nZ and then using additivity.
So multiplication is an operation · : U (n) × U (n) → U (n) (by the above gcd considerations) that is associative (because multiplication of integers is already associative), and there is an identity element for this multiplication process (namely the coset 1 + nZ). The interesting claim is that U (n) also has inverses. Namely, if gcd(a, n) = 1 then we know from Euclid’s algorithm that there are x, y ∈ Z with ax + ny = 1. This implies directly that gcd(x, n) is also 1 (since 1 is a linear combination of x and n and therefore is divisible by the actual gcd) and also that (a + nZ) · (x + nZ) = (ax + nZ) = ((1 − ny) + nZ) = 1 + nZ. So x + nZ is an inverse for a + nZ.
So, U (n) is a group, and encodes the automorphisms of Z/nZ:
U (n) = Aut(Z/nZ).
You can think of making U (n) from Z/nZ by asking “if I want to make a multiplication group from Z/nZ, what do I need to do?”
Answer: The new identity will be 1 + nZ. Wanting inverses forces you to dump 0 + nZ. And if n divides ab then (a + nZ)(b + nZ) = 0 + nZ = (0 + nZ)(b + nZ) would contradict the cancellation property. So all a + nZ with gcd(a, n) > 1 must also be kicked out.
Here are some examples.
Example III.15. (1) If n = 2 then U (n) is just the coset 1 + 2Z, which is its own inverse. So, U (2) is up to relabeling the trivial group {e}.
(2) If n = 3, U (n) = {1 + 3Z, 2 + 3Z} with the rule aa = e, writing e for 1 + 3Z and a for 2 + 3Z. So, U (3) is the same as the group Z/2Z and also the same as D2.
(3) If n = 4, U (n) is {1 + 4Z, 3 + 4Z}, with the same Cayley table as U (3).
(4) If n = 5 then U (5) has 4 = 5 − 1 elements (as 5 is prime) and since 2^2 = 4, 2^3 = 8 = 3 + 5, 2^4 = 16 = 1 + 3 · 5, every element of U (5) is a power of 2 + 5Z. So, U (5) is cyclic and of order 4, so it must be C4.
(5) If p is prime then U (p^k) has p^{k−1}(p − 1) elements. Indeed, if you want to be coprime to p^k all you need to do is not have p as a factor. So out of any p consecutive numbers, only p − 1 will make it into U (p^k). Since Z/p^kZ has p^k elements, U (p^k) will have p^k · (p − 1)/p = p^{k−1}(p − 1) elements.
(6) If p is a prime number then of course U (p) has p − 1 elements. We will see later that U (p) is always cyclic. In fact, unless p = 2 we will also see that U (p^n) is cyclic. (In contrast, U (2^k) is not cyclic for k > 2; U (8), for example, has three elements of order 2.)
There are lots of non-prime numbers n for which U (n) is cyclic.
Example III.16. For example, U (6) is Z/2Z. Let’s try to understand that.
Recall from last time that we proved that there is an assignment ψ : Z/6Z → Z/2Z × Z/3Z that sends the coset a + 6Z to the coset pair (a + 2Z, a + 3Z), that this map respects addition and multiplication, and that it is bijective. Since Z/6Z is cyclic, generated for example by 1 + 6Z, we conclude that Z/2Z × Z/3Z is also cyclic.
We make this more explicit. If you start with a coprime to 6 then this is the
same as saying that a is coprime both to 3 and 2. So, if a+6Z actually lives in U (6)
then a + 2Z lives in U (2) and a + 3Z lives in U (3). This is also true conversely since
gcd(a, 2) = 1 and gcd(a, 3) = 1 implies gcd(a, 6) = 1. What this means is that ψ
sets up not just a correspondence between Z/6Z on one side and Z/2Z × Z/3Z on
the other, but also that under this identification U (6) corresponds to U (2) × U (3).
Explicitly, this correspondence relates 1 + 6Z with (1 + 2Z, 1 + 3Z) and 5 + 6Z with (5 + 2Z, 5 + 3Z) = (1 + 2Z, 2 + 3Z).
By making the above paragraphs more abstract (replace 2 by m, and 3 by n),
one obtains the following theorem.
Theorem III.17. If m, n have gcd(m, n) = 1 then U (mn) = U (m) × U (n).
Again, if you have 3 or more pairwise coprime numbers, one gets corresponding results on products of unit groups.
Example III.18. How many elements does U (750) have?
The bad way is to write them all out. The enlightened 453 student says: 750 = 2 · 3 · 5^3, and so U (750) = U (2) × U (3) × U (5^3). I know |U (2)| = 1, |U (3)| = 2, and |U (5^3)| = 5^{3−1}(5 − 1) = 25 · 4 = 100. Hence |U (750)| = 1 · 2 · 100 = 200.
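A Python sketch (ours) confirming the count, both directly and via the factorization:

    from math import gcd

    # |U(n)| by brute force: count cosets a + nZ with gcd(a, n) = 1.
    def unit_count(n):
        return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)

    print(unit_count(750))                                   # 200
    print(unit_count(2) * unit_count(3) * unit_count(125))   # 200 as well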
Remark III.19. Recall the Euler φ-function that counts for n ∈ N how many
numbers from 1, . . . , n are relatively prime to n. Recall also that a is relatively
prime to n if and only if a + nZ is a generator of the group Z/nZ. (In other words,
the order of a + nZ is n, or yet in other words, na is the lowest positive multiple of
a that is divisible by n).
Since U (n) is made of the cosets of Z/nZ that come from numbers relatively
prime to n, there are exactly φ(n) elements in U (n). That means also that if m, n
are relatively prime, then φ(mn) = φ(m)φ(n) because of the theorem above.
CHAPTER IV

Week 4: Cosets and morphisms

1. Equivalence relations
Definition IV.1. Let S be a set. An equivalence relation is a binary relation ≃ on S such that
• a ≃ a for all a ∈ S (reflexivity);
• [a ≃ b] ⇔ [b ≃ a] for all a, b ∈ S (symmetry);
• [a ≃ b and b ≃ c] ⇒ [a ≃ c] for all a, b, c ∈ S (transitivity).
Examples of such equivalence relations are:
• the usual equality of numbers;
• congruence of geometric figures;
• equality in modular arithmetic (this is really the relation i ≃ j on Z whenever n|(i − j)).
An example of a relation that is not an equivalence relation is the usual ≤, because it is not symmetric: 3 ≤ 4 but not 4 ≤ 3.
Lemma IV.2. If S is a set with equivalence relation ≃ then one can partition S into cosets/equivalence classes, where any coset contains all the elements that are mutually equivalent to one another.
If we denote the cosets S1, S2, . . . , then we have: Si ∩ Sj is empty unless Si = Sj. Moreover, S is the union of all Si.
Lemma IV.3. Let G be any group, and pick n ∈ N. Then let A be the collection
of all group elements a ∈ G that have order exactly n. Then |A| is a multiple of
φ(n).
Proof. If a ∈ A, then ⟨a⟩ is a cyclic group of order n. By last week’s results, ⟨a⟩ contains exactly φ(n) elements whose order is exactly n, and these are exactly the generators of ⟨a⟩. So, make an equivalence relation on A where x ≃ y if and only if ⟨x⟩ = ⟨y⟩. Each equivalence class has size φ(n), the classes do not meet, and their union is A. So the class size φ(n) divides |A|. □
Note: if G is cyclic and n = |G|, then G = ⟨g⟩ and |A| = φ(n) by last week.

2. Morphisms
Let G, G′ be two groups.
Definition IV.4. A morphism (or homomorphism) is a function ψ : G → G′ that respects the group operations:
ψ(g1 ·G g2) = ψ(g1) ·G′ ψ(g2)
for all g1, g2 ∈ G.

You have seen many examples already.


Definition IV.5. Denote by R× and C× the nonzero real numbers and the nonzero complex numbers, respectively.
Here is a list of morphisms that you have seen at least in part.
• G = Z = G′, ψ = multiplication by 42.
• G = GL(2) = the invertible 2 × 2 matrices with matrix multiplication, G′ = (R×, ·), and ψ = the determinant: det(AB) = det(A) det(B).
• The exponential map (R, +) → (R×, ·), since exp(x + y) = exp(x) · exp(y).
• The logarithm function ln : (R>0, ·) → (R, +), since ln(a · b) = ln(a) + ln(b).
• The square root function √ : (R>0, ·) → (R>0, ·), since √(a · b) = √a · √b.
• The third power map (−)^3 : (R×, ·) → (R×, ·), since (a · b)^3 = a^3 · b^3.
• The third power map (−)^3 : U (7) → U (7), since ((a + 7Z)(b + 7Z))^3 = (a + 7Z)^3 · (b + 7Z)^3.
Example IV.6. Suppose we want to make a morphism k : Z/mZ → Z/nZ that
sends a + mZ to ka + nZ. That means that every element of the form a + tm with
t ∈ Z should be turned by multiplication by k into an element of the form ka + sn
where s ∈ Z.
Let’s examine this. For example, if a = 0 and t = 1 this means that km should
look like sn for a suitable s ∈ Z. This just asks that km is a multiple of n. As
one can check, n|km is also enough to make sure everything else goes well. We
conclude:
Multiplication by k sets up a morphism k : Z/mZ → Z/nZ if
and only if n divides mk.
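The criterion can be spot-checked by machine; a Python sketch (the function name is ours):

    # Multiplication by k is well defined as a map Z/mZ -> Z/nZ exactly
    # when representatives differing by m land in the same coset mod n,
    # which amounts to n | km.
    def is_morphism(k, m, n):
        return all((k * a) % n == (k * (a + m)) % n for a in range(m))

    print(is_morphism(3, 4, 6))   # True:  6 divides 3 * 4
    print(is_morphism(1, 4, 6))   # False: 6 does not divide 1 * 4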
Definition IV.7. A morphism ψ is an isomorphism if it is bijective (= injective + surjective = one-to-one + onto). It is an automorphism if it is an isomorphism where G′ = G.
Note that this forces ψ(eG) = eG′, since eG eG = eG forces ψ(eG)ψ(eG) = ψ(eG), and (because of the cancellation property in G′) such an equation can only be satisfied by eG′.
An automorphism is a relabeling of G that preserves the group structure. An isomorphism is a way of linking in twin pairs the elements of G and G′ while making sure that products work in both groups the same way. That means, an isomorphism is the matching of 2 Cayley tables, and an automorphism is a switch in columns and rows of a Cayley table that reproduces the same Cayley table.
For example, exp : (R, +) → (R>0, ·) is an isomorphism, but not an automorphism. The map that multiplies by 5 is an automorphism of Z/12Z (since it sends the generator 1 + 12Z to the generator 5 + 12Z).
Here is a list of things an isomorphism ψ needs to do/have. This is a good list for checking whether isomorphisms between G and G′ can exist at all.
• |G| = |G′|;
• both G and G′ are cyclic, or neither one is cyclic;
• both are Abelian, or neither is;
• the number of elements of G that have order k is the same as the number of elements of G′ that have order k (for any k).
Example IV.8. If G, G′ are both cyclic of the same order n then they are isomorphic. Namely, if G = ⟨g⟩ and G′ = ⟨g′⟩, let the morphism send g^i to (g′)^i.
We saw last week that the automorphisms of Z/nZ are labelled by the cosets
k + nZ with gcd(k, n) = 1. In other words,
Aut(Z/nZ, +) = U (n).
Example IV.9. While U (12) and U (10) both have 4 elements, they are not
isomorphic. Indeed, U (10) is cyclic generated by 3, and U (12) has no element of
order 4.
One natural way of making automorphisms is the following:
Definition IV.10. Let a ∈ G be a group element. Define a map ψa : G → G by setting
ψa(g) = aga^{-1}.
This is the inner automorphism on G induced by a.
Note first that this indeed respects multiplication: ψa(g1)ψa(g2) = ag1a^{-1}ag2a^{-1} = ag1g2a^{-1} = ψa(g1g2).
Note next that if G is Abelian, then ψa(g) = aga^{-1} = aa^{-1}g = eg = g for any choice of g and a. So in an Abelian group, every inner automorphism is the identity map.

Example IV.11. If G = Sym(△), then the 6 inner automorphisms are as follows:
• ψe fixes every element;
• ψℓ sends e → e, r → r, ℓ → ℓ, a → c, b → a, c → b;
• ψr sends e → e, r → r, ℓ → ℓ, a → b, b → c, c → a;
• ψa, ψb, ψc are quite similar: ψx fixes e, r, ℓ, x and interchanges the two remaining elements of G.
If ψx, ψy are inner automorphisms of G, they can be composed: ψx(ψy(z)) = xyzy^{-1}x^{-1} = ψxy(z). Thus, we find:
Lemma IV.12. The assignment a 7→ ψa is a morphism innG from the group G
to the group of its inner automorphisms Inn(G).

We will see later that since Inn(Sym(△)) has 6 different elements, just like Sym(△) itself, the conclusion is that innG is actually an isomorphism, so that as abstract groups there is no difference between Sym(△) and Inn(Sym(△)).

3. Cosets for subgroups


Definition IV.13. Let H be a subgroup of G and choose g ∈ G. We write
gH for the set of all products gh with h ∈ H. We call gH the left coset of H to g.
The set of products Hg is the right coset of H to g.
Note that if G is Abelian, then left and right cosets agree, gH = Hg. Note
also that (because of the cancellation property) gH and Hg contain equally many
elements, namely |H| many.

Example IV.14. Let G = Sym(△) and H = {e, a}. Then e · H = a · H = H = {e, a}, r · H = b · H = {b, r}, and ℓ · H = c · H = {c, ℓ}.
Note that these are disjoint, one of them is H, and their union is G.
These observations generalize as follows.


Lemma IV.15. Let H be a subgroup of G. For all a, b ∈ G we have:
(1) a ∈ aH (since e ∈ H).
(2) aH meets H if and only if a ∈ H (since ah = h′ gives a = h′h^{-1}).
(3) aH = bH or aH ∩ bH = ∅ (since c = ah = bh′ implies b^{-1}a = h′h^{-1} ∈ H, and so b^{-1}aH = H by the previous item, and so aH = bH).
(4) |aH| = |H| (since cancellation dictates that the map h → ah from H to aH is injective, and it is surjective by the very definition of aH).
(5) From the definitions, aH = Ha if and only if aHa^{-1} = H if and only if H is stable (as a set, not necessarily element by element) under the inner automorphism ψa.
(6) aH is a subgroup of G iff a ∈ H (since a subgroup needs e, and e ∈ aH means a^{-1} ∈ H, hence a ∈ H).
The main upshot of this lemma is
Theorem IV.16. Let G be a finite group, H a subgroup. Then |H| divides |G|, and the number of left cosets of H is equal to |G|/|H|.
Proof. The various cosets gH with g ∈ G are either equal to one another or disjoint. So G is the disjoint union of a bunch of cosets, and they all have size |H| by the lemma. □

Definition IV.17. The quotient |G|/|H| from the theorem is denoted [G : H] and called the index of H in G.
Corollary IV.18 (Lagrange’s Theorem). If g ∈ G then the order of g divides
the order of G.
Proof. The cyclic group ⟨g⟩ is a subgroup of G. It has ord(g) elements and by the theorem this number divides |G|. □

Corollary IV.19. If a group has a prime number of elements, it must be cyclic.
Proof. Take an element g ∈ G. Its order divides |G|, which is supposed to be prime. So the choices are ord(g) = |G| or ord(g) = 1. In the second case, g = e must be the identity. So, take another element that is not the identity. Now ord(g) cannot be 1, so it must be |G| as in the first case. But if an element has order |G| then the cyclic subgroup it generates has |G| elements, and that means it fills out G completely. So G = ⟨g⟩ for any g different from e. □

Theorem IV.20 (Fermat’s little theorem). If p is a prime number then p|(a^{p−1} − 1) for all a ∈ Z that are not multiples of p. In particular, a^p mod pZ = a mod pZ for all a ∈ Z.
Proof. The group of units U (p) has p − 1 elements. That means that the order of every element in U (p) divides p − 1. In other words, g^{p−1} is the identity for all g ∈ U (p). Unraveling this gives a^{p−1} + pZ = 1 + pZ for all a ∈ Z coprime to p. The second statement follows from multiplication by a (and holds trivially when p divides a). □

Note that p prime is essential: Fermat’s little theorem fails for p = 4. Question: are there non-primes for which this does work?
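Here is a Python sketch (the function name is ours) for experimenting with this question; it checks whether g^{n−1} is the identity for every g ∈ U(n).

    from math import gcd

    # True iff a^(n-1) = 1 mod n for every a coprime to n.
    def fermat_holds(n):
        return all(pow(a, n - 1, n) == 1 for a in range(1, n) if gcd(a, n) == 1)

    print(fermat_holds(7))   # True, as the theorem predicts for primes
    print(fermat_holds(4))   # False: 3^3 = 27 = 3 mod 4
    # now hunt for composite n with fermat_holds(n) == True...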
4. Kernels and normal subgroups


Recall from homework the concept of conjugation:
Definition IV.21. If H is a subgroup of G, and if a, g ∈ G, then the conjugate of g with respect to a is the element aga^{-1}. The set of conjugates of elements of H, aHa^{-1}, is the conjugate of H with respect to a.
We note that (aga^{-1})(ag′a^{-1}) = agg′a^{-1}, so that products of conjugates are conjugates. In particular, a conjugates eG to eG, and the inverse of aga^{-1} is ag^{-1}a^{-1}. In fact, conjugation by a provides an inner automorphism ψa of G, and aHa^{-1} is a subgroup of G.
Definition IV.22. Suppose φ : G → G′ is a morphism. Let ker(φ) = {g ∈ G | φ(g) = eG′} be the kernel of φ.
Theorem IV.23. For any morphism φ : G → G′, the kernel ker(φ) is a subgroup of G. Moreover, ker(φ) is stable under all inner automorphisms ψx for x ∈ G.
Proof. For being a subgroup we need to show that ker(φ) is closed under G-multiplication and under taking inverses. We will use that we already proved that a morphism must take eG to eG′. I will be very explicit about where multiplications happen, in G or in G′.
So let g1, g2 ∈ ker(φ). By definition that means φ(g1) = φ(g2) = eG′, the identity in G′. The morphism property then gives φ(g1 ·G g2) = φ(g1) ·G′ φ(g2) = eG′ ·G′ eG′ = eG′. Moreover, if g ∈ G then eG′ = φ(eG) = φ(g ·G g^{-1}) = φ(g) ·G′ φ(g^{-1}), which shows that φ(g) and φ(g^{-1}) are always inverse to one another in G′. In particular, if φ(g) = eG′ then the same is true for g^{-1}. That shows that if g ∈ ker(φ) then g^{-1} ∈ ker(φ). We have therefore shown that ker(φ) is a subgroup, which we denote H for brevity in the rest of the proof.
Now consider conjugation by x ∈ G. All elements of xHx^{-1} have the form xgx^{-1} with g ∈ ker(φ), so φ(g) = eG′. We need to show that xgx^{-1} is also in the kernel of φ. So we test it: φ(x ·G g ·G x^{-1}) = φ(x) ·G′ φ(g) ·G′ φ(x^{-1}) = φ(x) ·G′ eG′ ·G′ φ(x^{-1}) = φ(x) ·G′ φ(x^{-1}) = eG′. So, indeed we have xgx^{-1} ∈ H. So H is stable under conjugation. □
Definition IV.24. If H ⊆ G is a subgroup that is stable under all inner automorphisms, we call H a normal subgroup.
As a side remark, this is not “normal” behavior in the usual sense of language.
Looking at all subgroups H of a given group G, it is usually quite unnormal for H
to be normal. Normal subgroups are quite special.
Note that aHa−1 = H is equivalent to aH = Ha so that left and right cosets
agree for each a ∈ G precisely when H is normal.
Example IV.25. The kernel of any morphism is normal as we proved in the
theorem above.
Example IV.26. The subgroup H = {e, a} ⊆ Sym(△) is not normal. Indeed, a ∈ H but ℓaℓ^{-1} = cℓ^{-1} = cr = b is not in H.
One can check that there are not many normal subgroups of Sym(△): the only ones stable under all conjugations are the trivial group {e}, the rotation subgroup, and the whole group. (The trivial group and the whole group are always normal and never interesting as subgroups.)
Remark IV.27. A subgroup is normal if and only if the left cosets aH agree with the right cosets Ha. This follows because [aH = Ha] ⇔ [aHa^{-1} = H], as one sees by multiplying with a^{-1} on the right.
Example IV.28. Let G be the 2 × 2 invertible matrices with real entries, with matrix multiplication as group operation. Let φ : G → R× be the morphism that takes determinants. Linear algebra says that det(ABA^{-1}) = det(A) det(B)/det(A) = det(B). So if det(B) = 1 then this is also true for all its conjugates.
CHAPTER V

Week 5: Permutations and the symmetric group

Definition V.1. The symmetric group Sn is the group of all permutations on n elements. It makes no difference what the permuted n things are. We usually assume they are the numbers 1, . . . , n.
Note that Sn has n! elements.

Example V.2. We have met S3 as Sym(△). We denote the elements of Sn by arrays. For example, if our triangle has the letters A, B, C written on the vertices in counterclockwise order, then we have the correspondence

    e ↔ (A B C)    r ↔ (A B C)    ℓ ↔ (A B C)
        (A B C)        (C A B)        (B C A)

    a ↔ (A B C)    b ↔ (A B C)    c ↔ (A B C)
        (A C B)        (C B A)        (B A C)

between symmetries of the triangle (on the left) and the permutations of A, B, C.
The meaning of, for example,

    (A B C)
    (B C A)

is that the lower row indicates the letter that is replaced by the one on top of it. So, B (below) is replaced by A (above it) and so on. Another way of saying this is: the letter A (above) moves to where the letter B (below A) was. Better yet is to say “what used to be in the A-bucket is now moving into the B-bucket”.
If one composes, one gets for example
    
A B C A B C A B C
ra = b ↔ = .
C A B A C B C B A
It takes some practice to read this product correctly. The important bit is that one
again carries it out right to left. So, if you want to know what this product on the
right does to the letter B, you first check what the right factor $\begin{pmatrix} A & B & C \\ A & C & B \end{pmatrix}$ does
to B. And you find, it sends it to where C used to be. So B is now in bucket 3.
Next, you ask what the left factor $\begin{pmatrix} A & B & C \\ C & A & B \end{pmatrix}$ does to stuff in bucket 3, so you
look at the column labeled C. And it says that stuff in bucket 3 is being moved
to bucket 2, since under the C is a B. So, combining both steps, B first moves to
bucket 3 and then back to bucket 2. So in the product one should have B above
B, which is exactly right.
Similarly, A in the right factor moves to bucket 1, and then with the left factor
to bucket 3. So A should be above C in the product. Finally, C moves with the
right factor to bucket 2, and then with the left factor to bucket 1. So, C in the
product should stand above A.
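If you like to experiment, here is a small Python sketch (an illustration I am adding, not part of the formal development) that encodes the bucket rule: a permutation is a dictionary sending each top letter to the letter below it, and products are composed right to left.

    # Permutations as dicts: sigma[top] = bottom, i.e. "contents of the
    # top bucket move to the bottom bucket".
    r = {'A': 'C', 'B': 'A', 'C': 'B'}   # the right rotation
    a = {'A': 'A', 'B': 'C', 'C': 'B'}   # the A-flip

    def compose(left, right):
        # the right factor acts first, then the left one
        return {x: left[right[x]] for x in right}

    print(compose(r, a))   # {'A': 'C', 'B': 'B', 'C': 'A'}, which is b, as above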
Definition V.3. There is another way to write permutations, called cycle notation.
You start with some letter, say “(A”, and then record where A goes. For example,
for the right rotation r = $\begin{pmatrix} A & B & C \\ C & A & B \end{pmatrix}$ we write down “(A, C”. Next you ask where
C goes, and under r that is B. So we continue to “(A, C, B”. But B now is moved
to A, and that closes the cycle, so we write (A, C, B).
If a cycle closes before you have written down what happens to all elements,
just open another cycle. So, the permutation $\begin{pmatrix} A & B & C & D & E & F \\ B & C & A & D & F & E \end{pmatrix}$ has cycle
notation (A, B, C)(D)(E, F ). It rotates A, B, C in a 3-cycle, also rotates E, F
in a 2-cycle, and leaves D put.
One may or may not indicate 1-cycles. (Since they are talking about elements
that do not move, the convention is that if an element does not show up in a cycle
that you wrote, then it is not moving. For example, (1, 3, 5) is a permutation that
leaves 2 and 4 fixed.)
A cycle of length 2 is a transposition.
How does one compose cycles? Just the same as always: start on the right.
So, (1, 4, 5)(2, 3, 4, 1)(3, 5) is decoded as follows. Start with 1. Under (3,5) it stays
in position 1, then 1 goes under (2,3,4,1) to position 2. So the 1 we started with is
now in position 2. Stuff in position 2 does not move under (1,4,5) at all, so position 2
is the final destination of 1. So we start writing the product as (1, 2.
Next we redo this all with input 2. Under (3,5), 2 stays put. Under (2,3,4,1),
stuff in bucket 2 moves to bucket 3. And then under (1,4,5), stuff in bucket 3 stays
put. So overall, 2 moves to bucket 3. So we are now at (1, 2, 3.
Restart with input 3. Under (3,5), 3 moves to bucket 5, and under (2,3,4,1)
bucket 5 stays put. Then at the end (1,4,5) moves bucket 5 to bucket 1, and that
means our 3 lands in bucket 1. So, we have found the first part of the product cycle
as (1,2,3).
This does not yet explain what happens to 4 and 5 under the product. Let's
check 4. Under (3,5) the bucket 4 stays put. Then it is moved to bucket 1 under
(2,3,4,1). And bucket 1 is moved to bucket 4 under (1,4,5). Hence the number 4
stays put overall. That means 5 also must stay put, since there is no more open
space. So, the product is (1, 2, 3)(4)(5).
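To practice this, one can let a computer replay the bookkeeping. The sketch below (my own helper functions, not from the notes) multiplies a list of cycles right to left and then extracts the disjoint cycles of the product.

    def cycles_to_map(cycles, n):
        """Compose the listed cycles right to left into a dict on 1..n."""
        prod = {i: i for i in range(1, n + 1)}
        for cyc in reversed(cycles):              # rightmost factor acts first
            step = {i: i for i in range(1, n + 1)}
            for pos, i in enumerate(cyc):
                step[i] = cyc[(pos + 1) % len(cyc)]
            prod = {i: step[prod[i]] for i in prod}
        return prod

    def disjoint_cycles(perm):
        """Rewrite a permutation dict as a list of disjoint cycles (1-cycles omitted)."""
        seen, out = set(), []
        for start in sorted(perm):
            x, cyc = start, []
            while x not in seen:
                seen.add(x)
                cyc.append(x)
                x = perm[x]
            if len(cyc) > 1:
                out.append(tuple(cyc))
        return out

    p = cycles_to_map([(1, 4, 5), (2, 3, 4, 1), (3, 5)], 5)
    print(disjoint_cycles(p))   # [(1, 2, 3)] -- 4 and 5 stay put, as computed above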
Remark V.4.
• What our product procedure produces is disjoint cycles. That is, the cycles
we write down as answer are such that no number occurs in more than one
cycle. Disjoint cycles are preferable since we “understand” them better.
• For example, the order of any cycle (in the group theory sense) is its own
length: if you rotate a bunch of k people at a round table one seat to the left,
you need to repeat this k times until everyone is back in his own seat. Moreover,
if you have a product of disjoint cycles, then the order of this product is the
lcm of the cycle lengths (see the quick check below). For example, the order of
(1,2,3)(4,5) is 6, because only iteration counts that are multiples of 3 make 1,2,3 go back
home, and only even numbers of iterations make 4,5 go back home.
In contrast, (1,2,3)(2,3,4) = (1,2)(3,4) has order 2, not 3 (the lack of
disjointness messes with things!)
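Here is the quick check of the lcm rule (my addition): compute the order of (1,2,3)(4,5) by brute-force iteration and compare with lcm(3, 2).

    from math import lcm

    p = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4}       # the permutation (1,2,3)(4,5)
    q, order = dict(p), 1
    while any(q[i] != i for i in q):          # iterate p until we reach the identity
        q = {i: p[q[i]] for i in q}
        order += 1
    print(order, lcm(3, 2))                   # 6 6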
Theorem V.5. Any permutation is a product of 2-cycles (usually not disjoint!).
Proof. It is enough to show that any single cycle can be made from 2-cycles.
If n = 2 this is clear (except that you need to say that the identity can be written
as (1, 2)(1, 2)).
If n ≥ 3, check that (a1 , . . . , ak ) = (a1 , a3 , . . . , ak )(a1 , a2 ). So the theorem
follows by induction. □
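One can spot-check the cycle identity used in the induction, say for k = 4 (a small standalone check of my own):

    lhs = {1: 2, 2: 3, 3: 4, 4: 1}           # (1, 2, 3, 4)
    t = {1: 2, 2: 1, 3: 3, 4: 4}             # (1, 2), applied first
    c = {1: 3, 2: 2, 3: 4, 4: 1}             # (1, 3, 4), applied second
    print(lhs == {i: c[t[i]] for i in lhs})  # True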
Lemma V.6. If you take a permutation σ and write it as a product of transpositions,
then the number of transpositions is not determined by σ, but its parity is: for
the same σ, that number is either always odd or always even.
Before we embark on a proof, one more concept:
Definition V.7. Let σ be a permutation of 1, . . . , n. We say that [i, j] is a
switch of σ if i < j but σ places i in a position behind where it places j. In other
words, if you write σ = $\begin{pmatrix} 1 & 2 & \dots & n \\ \sigma_1 & \sigma_2 & \dots & \sigma_n \end{pmatrix}$, then i < j but σi > σj .
The disorder of σ is the number of switches of σ. The parity of σ is the answer
to the question “Is the disorder of σ even or odd?”
For example, the cycle (1,2,3,4) is the permutation $\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$ and so has
switches [1, 4], [2, 4], [3, 4]; it therefore has disorder 3 and is an odd permutation (has odd
parity).
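Counting switches is easy to mechanize. The following snippet (my own illustration, not part of the notes) computes the disorder and the parity of a permutation given by its bottom row:

    def disorder(p):
        """Number of switches [i, j] with i < j but p[i] > p[j]."""
        n = len(p)
        return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])

    p = (2, 3, 4, 1)                        # bottom row of the cycle (1,2,3,4)
    print(disorder(p), disorder(p) % 2)     # 3 1, so odd parity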
Proof of the lemma: It is enough to prove that the identity cannot be written
as the product of an odd number of transpositions. (Since e = (1, 2)(1, 2) is an even
way of writing the identity.)
The main idea is that if some σ is composed with a transposition, then its
parity changes. Let's check that. So we imagine, for 1 ≤ i < j ≤ n, composing the
transposition (i, j) with the permutation σ = $\begin{pmatrix} 1 & 2 & \dots & n \\ \sigma_1 & \sigma_2 & \dots & \sigma_n \end{pmatrix}$ and we count
the change in the disorder.
Let's say the output of σ is the sequence
s1 , . . . , si−1 , si , si+1 , . . . , sj−1 , sj , sj+1 , . . . , sn .
Then the output of (i, j)σ is
s1 , . . . , si−1 , sj , si+1 , . . . , sj−1 , si , sj+1 , . . . , sn .
We consider the change in the number of switches.
If a switch of σ involves neither i nor j, then it is a switch also of the composition
(i, j)σ. So we need to focus on switches that involve i or j or both. We next
study when a switch involves exactly one of i, j.
If k < i, the number of st that appear to the right of sk but are smaller than
sk does not change if we interchange si with sj .
If k > j, then the number of st that are to the right of sk and are smaller than
sk does not change either.
If i < k < j, there are 4 cases:
(1) If sk < si and sk < sj then [k, i] is a switch in σ and [k, j] is a switch in
(i, j)σ.
(2) If sk < si and sk > sj then [k, i], [k, j] are a switch in σ and neither of
[k, i], [k, j] is a switch in (i, j)σ.
54 V. WEEK 5: PERMUTATIONS AND THE SYMMETRIC GROUP

(3) If sk > si and sk < sj then neither of [k, i], [k, j] is a switch in σ and both
[k, i], [k, j] are a switch in (i, j)σ.
(4) If sk > si and sk > sj then [k, j] is a switch in σ and [k, i] is a switch in
(i, j)σ.
In all cases then, so far, the change in the number of switches in σ versus (i, j)σ is
even.
Finally, consider the pair i, j itself. If it is not a switch for σ then it must be one for
(i, j)σ, and conversely. So overall, the change in the number of switches is an even number
plus or minus one, and hence odd.
What this means is that if you write any σ as a product of transpositions, the
parity of the number of transpositions must agree with the parity of σ (which does not depend
on how you write σ as such a product!). So, the number of transpositions used is
even if and only if the parity of σ is even. □

Definition V.8. Let An be the alternating group of all even permutations.
Note that we just proved that this definition makes sense since products of even
permutations are even (just as odd times even or even times odd is odd, and odd
times odd is even).
Now recall the Cayley table of the two-element group D2 , and line it up via e ↔ even, f ↔ odd.
Note that D2 can be viewed as (Z/2Z, +). That means there is a morphism
sign : Sn → (Z/2Z, +)
that sends even permutations to 0 + 2Z and odd permutations to 1 + 2Z, and turns
composition of permutations into addition of signs. The kernel of this morphism is
An .
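One can also test the morphism property numerically. The sketch below (my addition) samples pairs of permutations in S5 and checks that parities add under composition, with parity computed as disorder mod 2:

    from itertools import permutations
    from random import choice

    def parity(p):
        n = len(p)
        return sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j]) % 2

    perms = list(permutations(range(5)))
    for _ in range(200):
        s, t = choice(perms), choice(perms)
        st = tuple(s[t[i]] for i in range(5))     # t acts first, then s
        assert parity(st) == (parity(s) + parity(t)) % 2
    print("parity(st) = parity(s) + parity(t) mod 2 on all samples")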
Example V.9. For n = 3, An consists of the rotations. Note that indeed composition
of An -elements (rotations) gives you other An -elements (rotations).
Note that for n > 3 the rotations do not fill out An , although they do belong
to An . For example, (1, 2)(3, 4) is not a rotation but is still even.
The following explains the special position of permutation groups.
Theorem V.10. Any group G can be viewed as a permutation group.
Proof. Take the base set for the permutations to be the elements of G. Then
take g ∈ G and read it as a permutation σg by recording in the permutation σg the
products g · g′, letting g′ run through all of G. By the cancellation property you do
indeed get a permutation. Multiplication in G then corresponds to composition of
permutations. □
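To see the recipe of this proof in a concrete case, here is a tiny sketch (my own, using G = (Z/4Z, +) as the example) in which each g becomes the permutation “add g” of {0, 1, 2, 3}:

    G = range(4)
    sigma = {g: tuple((g + h) % 4 for h in G) for g in G}   # the permutation "add g"
    for g in G:
        print(g, '->', sigma[g])
    # multiplication in G corresponds to composition of the permutations:
    g1, g2 = 1, 3
    composed = tuple(sigma[g1][sigma[g2][h]] for h in G)
    print(composed == sigma[(g1 + g2) % 4])                 # True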

Example V.11. Recall that KV4 is the symmetry group of the letter H. We
can make it a subgroup of S4 as follows. Take as symbols of the group the “letters”
e, ↔, ↕, ×. Now ask what the effect of multiplying by the group elements is on the
sequence {e, ↔, ↕, ×}. We find
e · {e, ↔, ↕, ×} = {ee, e↔, e↕, e×} = {e, ↔, ↕, ×}
↕ · {e, ↔, ↕, ×} = {↕, ×, e, ↔}
↔ · {e, ↔, ↕, ×} = {↔, e, ×, ↕}
× · {e, ↔, ↕, ×} = {×, ↕, ↔, e}.
If one now reads these as permutations, one can write them as cycles:
e becomes ()
↕ becomes (e, ↕)(↔, ×)
↔ becomes (e, ↔)(↕, ×)
× becomes (e, ×)(↕, ↔).
If one translates into symbols 1, 2, 3, 4 we get
KV4 = {(), (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)}.
Remark V.12. In many cases, there is a more obvious way of embedding a
given group into a symmetric group. For example, the symmetry group of a cube
is naturally a subgroup of S8 , since the symmetries of the cube move around the
8 vertices. But that is sort of an accident: not every group comes to us as the
symmetry group of a small set of things (with certain constraints). If someone
hands us the symmetry group of a cube without saying what it really is, and if
we don't notice it, we would have to take recourse to the recipe of the proof of
the proposition. And that would view the symmetry group of the cube (with 48
elements) as a subgroup of S48 , a rather unpleasant idea. So the proposition conveys
a principle, but it pays to be opportunistic.
CHAPTER VI

Week 6: Quotients and the Isomorphism Theorem

Let me start by recalling some ideas from the past. If H ⊆ G is a subgroup
(same identity element, same multiplication) then it is a normal subgroup if it has
no conjugate subgroups aside from itself. This is saying that aHa⁻¹ = H (or
aH = Ha) for any a ∈ G. Note that this says that aha⁻¹ is again in H, but it does
not require that aha⁻¹ = h (though it also does not forbid it).
Let us also recall that H can be used to make H-shaped clusters in G by looking
at the left cosets aH; any element of G belongs to one such coset, so their union is
all of G, and two cosets either do not meet at all, or agree completely. No partial
agreement is possible (because of the cancellation property).

1. Making quotients
Definition VI.1. Let us denote the collection of all left H-cosets in G by
G/H.
Note the similarities: when G = Z is all integers, and H = nZ the subgroup of
integers divisible by n then G/H = Z/nZ is exactly the collection of cosets a + nZ
with a ∈ Z.
Note also that Z/nZ is a group itself; we would like to arrange for G/H to be a
group as well. The natural plan would be to define (aH) ∗ (bH) = abH. Let's look
at an example to see what that is like.
Example VI.2. Let G = S4 be the symmetry group of the equilateral tetrahedron
(also known as the permutation group on 4 elements) and take as H the group of
permutations {(), (12)(34), (13)(24), (14)(23)}. We saw at the end of last class that
this group can be identified with KV4 , the symmetry group of the letter H.
As S4 has 24 elements and H has 4, the clusters we make for cosets have size
4 and there will be 6 such cosets. They are (no shortcut here, I just sat down and
computed each set aH by hand):
E := H = {(), (1, 2)(3, 4), (1, 3)(2, 4), (1, 4)(2, 3)},
γ := (12)H = {(1, 2), (3, 4), (1, 3, 2, 4), (1, 4, 2, 3)},
β := (13)H = {(1, 3), (1, 2, 3, 4), (2, 4), (1, 4, 3, 2)},
α := (14)H = {(1, 4), (1, 2, 4, 3), (1, 3, 4, 2), (2, 3)},
λ := (123)H = {(1, 2, 3), (1, 3, 4), (2, 4, 3), (1, 4, 2)},
ρ := (124)H = {(1, 2, 4), (1, 4, 3), (1, 3, 2), (2, 3, 4)}.
Now we would like to make these 6 clusters into a group. As mentioned above, we
aim for (aH)(bH) = abH. In order to avoid problems such as we met in Assignment
4a, when we were looking at morphisms from Z/mZ to Z/nZ that were not even
functions (because they destroyed cosets), we need to keep cosets together. More
explicitly, we need that for all choices of g, g′ ∈ G and h, h′ ∈ H we have
ghg′h′ ∈ gg′H. (Multiplication should not depend on the specific representatives we
picked; if it does, multiplication would destroy cosets.) But
[ghg′h′ ∈ gg′H]
⇔ [hg′h′ ∈ g′H] (cancelling a g)
⇔ [g′⁻¹hg′h′ ∈ H] (left-multiplying by g′⁻¹)
⇔ [g′⁻¹hg′ ∈ H] (right-multiplying by h′⁻¹)
⇔ [a⁻¹Ha ⊆ H for all a] (renaming g′ to a)
⇔ [H is normal in G].
So, we should check whether H is normal. Since every element of S4 is a
product of transpositions (i, j), we do not need to test for all 24 elements a ∈ G
whether aH = Ha, but only for transpositions. And since H stays H when you
arbitrarily rename the permuted objects 1, 2, 3, 4, it suffices to check that aH = Ha
for a = (1, 2). (Note: H consists of the identity and all 3 possible products of
disjoint 2-cycles. This description takes no recourse to the names of actual elements,
so renaming keeps H stable.)
We compute:
(1, 2)()(1, 2)⁻¹ = (),
(1, 2)((1, 2)(3, 4))(1, 2)⁻¹ = (1, 2)(3, 4),
(1, 2)((1, 3)(2, 4))(1, 2)⁻¹ = (1, 4)(2, 3),
(1, 2)((1, 4)(2, 3))(1, 2)⁻¹ = (1, 3)(2, 4).
So, the conjugate by (1, 2) of every element of H is again an element of H. It follows
that H is normal and our idea of setting (aH)(bH) = abH will indeed work.
As a side remark, note that every element of H is an even permutation. (Since
they are made of zero or of two 2-cycles). It follows that each coset aH is either
completely even or completely odd.
Now that we know that our S4 /KV4 is a group, it is a reasonable question to
ask: what group is it? A first step towards this is always to compute the Cayley
table. An easy but painstaking computation reveals that it is as follows:
$$\begin{pmatrix} E & \rho & \lambda & \alpha & \beta & \gamma \\ \rho & \lambda & E & \beta & \gamma & \alpha \\ \lambda & E & \rho & \gamma & \alpha & \beta \\ \alpha & \gamma & \beta & E & \lambda & \rho \\ \beta & \alpha & \gamma & \rho & E & \lambda \\ \gamma & \beta & \alpha & \lambda & \rho & E \end{pmatrix}$$
Now checking back all the way at the start of Week 2, if we use the translations
e ↔ E, a ↔ α, b ↔ β, c ↔ γ, r ↔ ρ, ℓ ↔ λ,
we see that up to the renaming we are looking at Sym(△) = S3 .
Could we have seen this somehow? Yes, I think so, and here is how. The fact
that we have any group structure at all on G/H is because the normality of H
assures us that whenever a′ belongs to the coset aH and b′ belongs to the coset bH,
then a′b′ is in the coset abH. Now look at the 6 cosets, and pick out the elements
in each coset that do not use 4. We find () ∈ E, (1, 2) ∈ γ, (1, 3) ∈ β, (2, 3) ∈ α,
(1, 2, 3) ∈ λ and (1, 3, 2) ∈ ρ. The remarkable fact is that there is exactly one in
each coset. Composing or inverting these elements can only produce other elements
that also do not use 4, so these 6 elements actually form a group by themselves,
a subgroup of S4 . And it is easy to see that this subgroup is exactly S3 . The
renaming was made in such a way that each Greek letter names the coset containing
the element of S3 denoted by the corresponding Roman letter.
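For the skeptical reader, here is a brute-force confirmation (my own addition, not part of the notes) that this H is stable under conjugation by all of S4 ; permutations are encoded as tuples p with p[i] the image of i, on the symbols 0, 1, 2, 3:

    from itertools import permutations

    H = {(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)}

    def mul(p, q):                       # apply q first, then p
        return tuple(p[q[i]] for i in range(4))

    def inv(p):
        out = [0] * 4
        for i, pi in enumerate(p):
            out[pi] = i
        return tuple(out)

    print(all(mul(mul(a, h), inv(a)) in H
              for a in permutations(range(4)) for h in H))   # True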
Let us look at a somewhat easier example, easier because of commutativity.
Example VI.3. Let G = (Z/24Z, +) and let H be the subgroup formed by
the multiples of 6. As in the previous example, |G| = 24 and |H| = 4, so there are
again 6 cosets. But in this case there is no question that H is normal, since G is
Abelian and so even ah = ha element by element, and not just aH = Ha as a set.
The cosets are then
(3) {0 + 24Z, 6 + 24Z, 12 + 24Z, 18 + 24Z},
(4) {1 + 24Z, 7 + 24Z, 13 + 24Z, 19 + 24Z},
(5) {2 + 24Z, 8 + 24Z, 14 + 24Z, 20 + 24Z},
(6) {3 + 24Z, 9 + 24Z, 15 + 24Z, 21 + 24Z},
(7) {4 + 24Z, 10 + 24Z, 16 + 24Z, 22 + 24Z},
(8) {5 + 24Z, 11 + 24Z, 17 + 24Z, 23 + 24Z}.
So, for example, the last of these cosets contains all numbers that leave remainder 5 when
divided by 6. If we call these collections 0, . . . , 5 (in the given order), and recall
that we are supposed to use addition, it is clear how we want to think of the group
G/H: it is Z/6Z.
We formulate officially what we have seen in examples.
Theorem VI.4. If H is a normal subgroup of G then one can equip the col-
lection of left cosets {aH|a ∈ G} with a group structure. The multiplication in this
group takes aH and bH and multiplies them to abH. The resulting group is denoted
G/H and called the quotient group of G by H.
If H is normal, the same construction can be carried out for the right cosets
Ha, and that also leads to a group. One can check that these are the same groups,
so that the symbol “G/H” is unambiguous. 
It is often good for understanding a definition when one sees a case where the
defined concept is absent.

Example VI.5. Let G be S3 = Sym(△) and take H to be the subgroup
H = {e, a}. We have checked multiple times that H is not normal. Let us see how
this impacts G/H being a group.
The collection of left cosets is {e, a} = eH, {b, ba = r} = bH and {c, ca = ℓ} = cH.
Let us name these cosets E, R, L in that order.
If we hope that we can make {E, R, L} a group, then the product structure must
come from multiplication in G. That means, for example, that we should be able
to multiply any element of the coset E with any element of the coset R and get
elements that all live in the same coset. (Presumably, that product coset should
be R again, since E contains the identity e and therefore should be the coset that
gives the identity element in G/H.)
However, we find: eb = b, er = r, ab = ℓ, ar = c. These four products do not
all lie in any one of the three cosets E, R, L but in fact cover two of them, R and L.
Thus, there is no meaningful product E · R, and we cannot hope to make G/H into
a group.
Note how this failure is quite similar to looking for morphisms Z/mZ → Z/nZ
using multiplication by k that does not satisfy n|km. The underlying theme is
that one is only allowed to do operations on cosets that do not destroy the cosets.
Friends should stay friends!

2. The isomorphism theorem


Now suppose ψ : G → G′ is a morphism. We checked previously that
ker(ψ) is a subgroup of G. We also checked that this subgroup is normal in G (since
if ψ(h) = eG′ then ψ(aha⁻¹) = ψ(a)ψ(h)ψ(a⁻¹) = ψ(a) eG′ ψ(a)⁻¹ = eG′ , showing
that aha⁻¹ belongs to ker(ψ) as well).
It follows that G/ ker(ψ) can be turned into a group. We want to find out what
this quotient group has to do with ψ in concrete terms.

Example VI.6. Let G = Z/28Z, G′ = Z/42Z and ψ : G → G′ be “multiplication
by 9” in the sense that ψ(a + 28Z) = 9a + 42Z.
Start with noting that 42 indeed divides 9 · 28, so by our often-mentioned
criterion “multiplication by 9” does indeed give a function that does not destroy
cosets.
The kernel of ψ consists of those cosets a + 28Z for which 9a is a multiple of
42. But 42|9a if and only if 14|a. So, ker(ψ) has two elements, {0 + 28Z, 14 + 28Z}.
We call this subgroup H.
You might want to think of H as Z/2Z “stretched by the factor 14”: as groups
of 2 elements they are isomorphic. The identity is 0 + 28Z and the Cayley table is
$\begin{pmatrix} 0 & 14 \\ 14 & 0 \end{pmatrix}$. Formally, that is the table of Z/2Z.
Now in G/H we make “cosets of cosets”. For example, in the coset eH we throw
together 0 + 28Z and 14 + 28Z. Note that this boils down to lumping together all
the multiples of 14 into one big family, and that this is going to be the identity
element of the quotient group G/H. So, G/H is “G with 14 declared to be zero”.
But that just means Z/14Z.
Now that we have understood the quotient, let us see what else ψ can tell
us. The morphism ψ involves the groups G and G′, and we have also concocted
the group ker(ψ). There is a fourth group lurking here, namely the group of all
elements that are outputs of ψ. In the case at hand, that is all cosets of the form
9a + 42Z. So, these are the cosets modulo 42 of 0, 9, 18, 27, 36, 45, . . .. But in
Z/42Z, 45 counts the same as 3 = 45 − 42. So this set of output representatives really
reads 0, 9, 18, 27, 36, 3, 12, 21, 30, 39, 6, 15, 24, 33, 0. Then it cycles, and so we get
any b + 42Z for which 3|b.
The collection of cosets 3 + 42Z, 6 + 42Z, . . . , 0 + 42Z is a group with addition
and sits inside Z/42Z. It is called the image of ψ, written im(ψ), and it is made
of all the possible outputs of ψ. You can view it as Z/14Z “scaled up” by a factor
of 3. But remember: Z/14Z was also what the group G/H looked like! The two
groups G/ ker(ψ) and im(ψ) have the same structure!
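The whole example can be confirmed with a few lines of arithmetic (a sanity check I am adding, not part of the notes):

    ker = [a for a in range(28) if (9 * a) % 42 == 0]
    img = sorted({(9 * a) % 42 for a in range(28)})
    print(ker)                          # [0, 14]
    print(img)                          # [0, 3, 6, ..., 39], the multiples of 3
    print(28 // len(ker), len(img))     # 14 14: |G/ker(psi)| equals |im(psi)|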
The fact that this is typical behavior is our next theorem.
Before we state it, let me remind you that you have seen something like this
before: if A is a real m × n matrix, you can view it as a way to turn vectors v ∈ Rⁿ
into vectors A · v of Rᵐ. Both Rⁿ and Rᵐ are groups (the first three axioms that you
learned for a vector space say just that it is a group for addition of vectors!).
The kernel of the matrix A used to be the set of vectors v with Av = 0, and since
the zero vector is the identity element for vector addition, this old “kernel” idea
for vector spaces agrees exactly with our new one for groups. And you were also
told that the image of A (you used to call it the column span) is a vector space
(hence group!) of dimension rk(A). And to top it all off, you learned that rank plus
nullity gives n. In new and fancy terms this can be phrased as: the kernel of A is a
vector space of dimension n − rk(A), and Rⁿ/ ker(A) is a vector space of dimension
n − (n − rk(A)) = rk(A), just like the column space of A. And indeed, the theorem
below matches Rⁿ/ ker(A) with the column space.
Theorem VI.7 (The isomorphism theorem). If
ψ : G → G′
is a morphism of groups with kernel H := ker(ψ) sitting inside G, and with image
im(ψ) sitting inside G′, then there is an isomorphism
$$\bar{\psi} : G/\ker(\psi) \simeq \operatorname{im}(\psi)$$
where $\bar{\psi}(aH) = \psi(a)$.
Here, the group operation in G/H is (aH)(bH) = abH and the operation in
im(ψ) is the one from G′.
I will not prove this theorem in detail, but here is why you should think it is
true:
(1) As you move from G to G′ using ψ, products are preserved but all of H
is crunched down to eG′ , basically by definition. Therefore, if you want
to relate stuff in G with stuff in G′, you need to form the cosets G/H to
account for the “lumping together” of anything in H.
(2) You are not going to be able to relate elements of G′ that are not outputs
of ψ to anything in G, since ψ is your only comparison vehicle, and stuff
in G′ that ψ “does not see” is stuff that ψ has no opinion about.
(3) So, really the question is what G/ ker(ψ) has to do with im(ψ). And the
function $\bar{\psi}$ mentioned above, which sends a coset aH to ψ(a), can be shown
to be a morphism (easy, since ψ is), and injective (confusing, but easy),
and surjective (easy). But that makes it an isomorphism.
In particular, if ψ is surjective,
G/ ker(ψ) ≃ G′.
Example VI.8. Here I will talk through some examples.
(1) Let ψ be the morphism from Z/15Z to Z/25Z that multiplies by 5, sending
a + 15Z to 5a + 25Z. (Recall that if k is to be used as morphism from
Z/mZ to Z/nZ then we need that n|km).
Then ker(ψ) = {a+15Z|5a+25Z = 0+25Z}. This requires that 25|5a
so that a must be a multiple of 5. So, ker(ψ) = {0+15Z, 5+15Z, 10+15Z}.
You can view this as Z/3Z “inflated by a factor of 5”.
The image im(ψ) of ψ consists of the cosets {5a + 25Z}. That is a group of
5 elements. We know (5 is prime) that this is a cyclic group, and indeed
5 + 25Z is a generator, as all other image elements are multiples of 5 + 25Z.
Abstractly, im(ψ) therefore looks like Z/5Z. And one could say that it is
Z/5Z “inflated by a factor of 5”.
So, the isomorphism theorem says that (Z/15Z)/(5 ◦ (Z/3Z)) ≃ 5 ◦ (Z/5Z),
where 5◦ means “inflate by 5”.
(2) More generally, let n|km and consider the morphism k : Z/mZ → Z/nZ
that multiplies by k. Then the kernel consists of the elements a + mZ with n|ak,
and these are just the cosets in Z/mZ corresponding to the multiples of
n/ gcd(n, k). (The lowest a with n|ak is the minimal a that satisfies:
ak is a multiple of k; ak is a multiple of n. So we want the smallest
a for which ak is a multiple of lcm(n, k), and of course that smallest
ak that is a multiple of lcm(n, k) is just lcm(n, k). It follows that the
corresponding a is lcm(n, k)/k and so equals n/ gcd(n, k) since in general
xy = lcm(x, y) · gcd(x, y).)
The number of elements in the kernel is κ := m/(n/ gcd(n, k)) =
m · gcd(n, k)/n = mk/ lcm(n, k). This group looks like Z/κZ inflated by
lcm(n, k)/k.
The image is the subgroup of Z/nZ consisting of the cosets of elements
of the form ak. Since gcd(k, n) is a linear combination of k and n, this
image is the same as the subgroup generated by the coset gcd(k, n) + nZ.
It can be viewed as the group Z/νZ inflated by gcd(k, n), where ν :=
n/ gcd(k, n) = lcm(k, n)/k.
Altogether, the isomorphism theorem says:
$$(\mathbb{Z}/m\mathbb{Z})\,/\,(\nu \circ (\mathbb{Z}/\kappa\mathbb{Z})) \;\simeq\; \gcd(k, n) \circ (\mathbb{Z}/\nu\mathbb{Z}).$$
Note that κ · ν = (mk/ lcm(n, k)) · (lcm(k, n)/k) = m, as it should be.
(A small numeric check of these formulas follows after this example.)
(3) In the previous items, normality came for free since the groups were
Abelian. Let now G be the quaternion group Q8 = {±1, ±i, ±j, ±k}. Its center Z
consists of the elements {±1}, as is easy to see. The center of a group is made of the
elements that commute with everyone, so in particular the center of G is a normal
subgroup.
The quotient of the quaternion group by its center is a group of 8/2 = 4
elements. Which group is it? We know it can only be Z/4Z or KV4 , since
these are the only groups of size 4.
If you look at the cosets, they are E = {±1}, I = {±i}, J = {±j}, K =
{±k}. Note that I · I is exactly E, and similarly J · J = E = K · K. It
follows that no element of G/Z has order 4, and so G/Z must be KV4 . We
can check explicitly (element by element) that I · J = J · I = K, I · K =
K · I = J, J · K = K · J = I. So we can align G/Z with KV4 by
E ↔ (), I ↔ (12)(34), J ↔ (13)(24), K ↔ (14)(23), preserving Cayley
tables.
In fact, (one can check that) we can make a morphism π : G → KV4
by sending 1 and −1 to E, i and −i to I, j and −j to J, and k and −k to
K, respecting multiplication. The kernel of this morphism is {±1}, and
so the isomorphism theorem predicts G/Z ≃ KV4 .
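Here is the small numeric check promised in item (2), reusing m = 15, n = 25, k = 5 from item (1) (a sketch of my own, not part of the notes):

    from math import gcd

    m, n, k = 15, 25, 5
    assert (k * m) % n == 0                      # the condition n | km
    ker = [a for a in range(m) if (k * a) % n == 0]
    img = {(k * a) % n for a in range(m)}
    kappa = m * gcd(n, k) // n                   # predicted kernel size
    nu = n // gcd(n, k)                          # predicted image size
    print(len(ker) == kappa, len(img) == nu)     # True True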
CHAPTER VII

Week 7: Finitely generated Abelian groups

1. Row reduced echelon form over the integers


A linear transformation (in linear algebra) is a function T : V → W such that
T (v + v′) = T (v) + T (v′) and T (λv) = λT (v) for all v, v′ ∈ V and all λ ∈ R.
A moment's thought shows that this is a (somewhat special) morphism from the
group (V, +) to the group (W, +).
Suppose dim(V ) = n, dim(W ) = m, and suppose one has chosen bases $B_V = \{b_1^V, \dots, b_n^V\}$ and $B_W = \{b_1^W, \dots, b_m^W\}$ in V and W respectively. (You might want
to think of $B_V$, $B_W$ as matrices whose columns are the elements of the basis.) Then
to each v ∈ V there is a coefficient vector $c_{B_V}(\vec v) \in \mathbb{R}^n$ such that $\vec v$ is the linear
combination $B_V \cdot c_{B_V}(\vec v) = \sum_i c_{B_V}(\vec v)_i\, b_i^V$.
Recall that if T : V → W is a linear transformation (like in linear algebra) then
there is a real m × n matrix A such that if $c_{B_V}(\vec v)$ is the coefficient vector for $\vec v$
relative to the basis $B_V$, then $A \cdot c_{B_V}(\vec v)$ is the coefficient vector of $T(\vec v)$ relative to
the basis $B_W$, and this happens for all $\vec v$. In other words, $T(B_V \cdot c_{B_V}(\vec v)) = B_W \cdot (A \cdot c_{B_V}(\vec v))$.
If we change the bases on the source and target space to $B'_V$ and $B'_W$, then
there are matrices $Q_V \in \mathbb{R}^{n \times n}$ and $Q_W \in \mathbb{R}^{m \times m}$ such that $B'_V Q_V = B_V$ and
$B'_W Q_W = B_W$. The coefficient vector for $\vec v$ relative to $B'_V$ is then the vector
$c_{B'_V}(\vec v)$ such that $B'_V \cdot c_{B'_V}(\vec v) = \vec v = B_V \cdot c_{B_V}(\vec v)$; but as $B'_V Q_V = B_V$ this means
$B'_V Q_V \cdot c_{B_V}(\vec v) = B'_V \cdot c_{B'_V}(\vec v)$, hence $c_{B'_V}(\vec v) = Q_V \cdot c_{B_V}(\vec v)$.
Then we have
$$T(\vec v) = \vec w = B_W\, c_{B_W}(\vec w) = B_W A\, c_{B_V}(\vec v) = B'_W Q_W A\, c_{B_V}(\vec v)
= B'_W Q_W A (Q_V)^{-1} Q_V\, c_{B_V}(\vec v) = B'_W\, [Q_W A (Q_V)^{-1}]\, c_{B'_V}(\vec v).$$

This says that the transformation T (which exists independently of the choice of
basis) is represented relative to the bases $B'_W$, $B'_V$ by the matrix $A' = Q_W A Q_V^{-1}$.
(As a special case, if V = W and one chooses $B_V = B_W$, then the change of
coordinates has the effect of conjugation on A.) In some sense this is clear: if you
have a recipe for a transformation (called A) that works in one language (the bases
$B_V$, $B_W$) and you want to use it in a different language (the bases $B'_V$, $B'_W$), then you
first translate the ingredients from the new into the old language (by $Q_V^{-1}$), then
use the recipe (namely A), and then translate the result into the new language (by
$Q_W$). Once again, this goes right to left because that is the way functions work.
The moral of this linear algebra story is that a transformation is not affected
by the way we think of the input and the output spaces, but the tools we use
to compute what the transformation does (namely, A) do change, and do so in a
predictable manner.
Since we don't care too much what the bases are that we use, but want to
understand only the nature of T , we can perhaps arrange them
so that the matrix A looks very simple.
Recall now that a change of basis requires that QV , QW be invertible (so
that you can undo the change, with the inverse matrix). Recall also that in linear
algebra you learned that row reduction leads to row reduced echelon form and can
be accomplished by three elementary row operation steps: (I) interchanging two
rows, (II) adding some number of copies of one row to another, and (III) scaling a
row by an invertible number. Recall finally that the process of row reduction of
a matrix A is mirrored by multiplication of A on the left by elementary matrices,
corresponding to the three steps, and so the row reduced echelon form of A can
be achieved as a product E · A where E is the product of all the elementary row
operations used to row reduce A. Naturally, this E is invertible since each row
reduction step can be reversed.
Similar to row operations, one can discuss column operations and column reduced
echelon form, which is practically the transpose of the row reduced echelon
form of the transpose of A. Now imagine what the row reduced echelon form
turns into when you column reduce it. The row reduced echelon form has rank
many nonzero rows, and they start with leading ones placed on a Northwest-to-
Southeast “diagonal”. If you now column reduce, all that remains are the rank
many leading ones.
Now we need to make a leap of sorts: we need to consider what happens to all
this when we do not use real numbers, but just integers.
The main issue that comes up is that we can't divide most of the time. In
particular, our usual formula for an inverse matrix,
$$A^{-1} = \frac{1}{\det(A)}\, \operatorname{adj}(A),$$
involving the adjoint matrix, indicates that most matrices cannot be inverted over
the integers. It only works if det(A) = ±1. This rules out one of the basic
row operation steps, the one that says “rescale row i by λ”. So we will have to
live without that. On the other hand, switching 2 rows or 2 columns, or adding
integer multiples of one row to another row, or adding integer multiples of one column to another
column, are all processes that can be inverted over the integers. So we still get to use
these 2 kinds of operations, but now on rows and on columns.
Example VII.1. Suppose G = Z³, the set of all 3-vectors with integer coordinates,
and we want to understand the quotient G/H by the subgroup H that is
generated by the columns (1, 0, −1)ᵀ, (4, 3, −1)ᵀ, (0, 9, 3)ᵀ and (3, 12, 3)ᵀ. So, H
consists of all linear combinations of these 4 columns. The difficulty in understanding
the ramifications of “setting elements of H to zero” in the process of going from
G to G/H is that the individual coordinates of a vector in H are not independent
from one another.
So make a matrix
$$A = \begin{pmatrix} 1 & 4 & 0 & 3 \\ 0 & 3 & 9 & 12 \\ -1 & -1 & 3 & 3 \end{pmatrix}.$$
Read it as a map from Z⁴ to Z³, sending $\vec v \in \mathbb{Z}^4$ to $A \cdot \vec v$ in Z³.
Row reduction says that the relations of these 4 columns don't change (and
neither does the row span) if you add row 1 to row 3, using the 1 to wipe out the −1,
which leads to
$$\begin{pmatrix} 1 & 4 & 0 & 3 \\ 0 & 3 & 9 & 12 \\ 0 & 3 & 3 & 6 \end{pmatrix}.$$
Of course, it does have an effect on the column span of the matrix, so this amounts
to a coordinate change in Z³ (the target of the map).
Now our row reduction can go on, with 3 as pivot, erasing the 3 below it. We get
$$\begin{pmatrix} 1 & 4 & 0 & 3 \\ 0 & 3 & 9 & 12 \\ 0 & 0 & -6 & -6 \end{pmatrix}.$$
That is another change of basis in the target space Z³. We can now change the −6
to a 6, and then normal row reduction would stop.
The row steps are a reflection of a change of basis in the target of the transformation,
but we can also change basis in the source. That is encoded by (invertible)
column operations. For example, we can use the top left 1 to wipe out the other
numbers in row I to get
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 9 & 12 \\ 0 & 0 & 6 & 6 \end{pmatrix}.$$
And now we can use the 3 to wipe out all that is to its right:
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 6 & 6 \end{pmatrix}.$$
And then the left 6 to kill the right 6:
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 6 & 0 \end{pmatrix}.$$

Definition VII.2. The shape of this matrix is called Smith normal form. It
features: only nonzero entries on the diagonal, and from upper left to lower right
the diagonal entries are multiples of the previous entries.

The business of base change in source and target does not change the structure
of the quotient group (target modulo column span), although it changes how we think of it (as
any coordinate change does). So, our quotient group G/H now turns out to be Z³
modulo the linear combinations of the columns of the last matrix above. In other
words,
G/H ≃ (Z × Z × Z)/{(a, 3b, 6c) | a, b, c ∈ Z}.
The point of the row reduction work is that the stuff in H has now been “decoupled”:
the first coordinate of an element of H is any number, the second is any number
divisible by 3, the last is any multiple of 6. The coordinates no longer “talk to
each other”; they have become independent.
This also makes clear what G/H is equal to: (Z/1Z) × (Z/3Z) × (Z/6Z). Note
that Z/1Z is the trivial group as 1Z = Z.
Recall that Z/6Z = (Z/2Z) × (Z/3Z) since 2, 3 are coprime. So G/H =
(trivial group) × (Z/3Z) × ((Z/2Z) × (Z/3Z)).
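For readers who want to automate such reductions: recent versions of SymPy ship a Smith normal form routine, and (assuming SymPy is available; this check is my addition, not part of the notes) one can confirm the computation:

    from sympy import Matrix, ZZ
    from sympy.matrices.normalforms import smith_normal_form

    A = Matrix([[1, 4, 0, 3], [0, 3, 9, 12], [-1, -1, 3, 3]])
    print(smith_normal_form(A, domain=ZZ))
    # expect diagonal entries 1, 3, 6, matching the hand computation above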

There is one big hurdle we did not meet in the previous example: our pivots
came as a free gift. The following example shows what to do when lunch is not
free.

Example VII.3. Let's try this for H the subgroup of G = Z³ generated by
(10, −4, 8)ᵀ, (−6, −6, −16)ᵀ, (4, −10, −8)ᵀ, which yields the matrix
$$\begin{pmatrix} 10 & -6 & 4 \\ -4 & -6 & -10 \\ 8 & -16 & -8 \end{pmatrix}.$$
There is no element here that can be used as a pivot, because a pivot should divide
all the other numbers it is used to wipe out (we don't have access to fractions. . . ).
This means we have to make a pivot first, by clever row or column operations, or
both.
The main question is what we can hope and aim for. Surely, we can't make a 1
here since all numbers are even. But we could hope for a 2, and that would divide
every other number. And we can make a 2 by subtracting row III from row I, to get
$$\begin{pmatrix} 2 & 10 & 12 \\ -4 & -6 & -10 \\ 8 & -16 & -8 \end{pmatrix}.$$
Now clean out the front column:
$$\begin{pmatrix} 2 & 10 & 12 \\ 0 & 14 & 14 \\ 0 & -56 & -56 \end{pmatrix}.$$
Then one more row step leads to
$$\begin{pmatrix} 2 & 10 & 12 \\ 0 & 14 & 14 \\ 0 & 0 & 0 \end{pmatrix},$$
and then 3 column operations produce
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 14 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
We infer that G/H ≃ (Z/2Z) × (Z/14Z) × (Z/0Z) = (Z/2Z) × (Z/2Z) × (Z/7Z) × Z,
since 2, 7 are coprime.
Note that the zero on the diagonal is actually very important here: it tells us
about Z being a factor of G/H (and so makes G/H have infinitely many elements).
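Independently of any library, one can cross-check such a diagonal through the so-called determinantal divisors: the product of the first k elementary divisors equals the gcd of all k × k minors, a quantity that row and column operations never change. A small sketch of my own for this example's matrix:

    from math import gcd
    from itertools import combinations

    B = [[10, -6, 4], [-4, -6, -10], [8, -16, -8]]

    def minor_gcd(M, k):
        """gcd of all k x k minors of the 3 x 3 matrix M."""
        def det(rows, cols):                      # Laplace expansion on the first row
            if len(rows) == 1:
                return M[rows[0]][cols[0]]
            return sum((-1) ** j * M[rows[0]][cols[j]]
                       * det(rows[1:], cols[:j] + cols[j + 1:])
                       for j in range(len(cols)))
        g = 0
        for rows in combinations(range(3), k):
            for cols in combinations(range(3), k):
                g = gcd(g, abs(det(list(rows), list(cols))))
        return g

    d1, d2, d3 = minor_gcd(B, 1), minor_gcd(B, 2), minor_gcd(B, 3)
    print(d1, d2 // d1, d3 // d2 if d2 else 0)    # 2 14 0, the elementary divisors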
Definition VII.4. If A is an integer m × n matrix with m ≤ n, then let A′
be the Smith normal form of A. The diagonal elements of A′ are the elementary
divisors of A.
If m > n, first augment A with m − n columns of zeros on the right, and then
proceed to compute the Smith normal form. (This has the effect of adding m − n zeros
to the set of elementary divisors.)
Theorem VII.5 (FTFGAG, Part 1). We assume that A is m × n, with m ≤ n.
If m > n, augment A with m − n columns of zeros. We start with properties of
Smith normal form and elementary divisors.
(1) The Smith normal form of A can be computed by row and column operations
of types I and II.
(2) The Smith normal form of A is determined by A alone, and does not depend on how
we compute the normal form by pivot choices.
(3) The elementary divisors d1 , . . . , dm of A are the m numbers on the diagonal
of the Smith normal form A′ of A.
(4) The elementary divisors satisfy di |di+1 for all i.

2. Generating groups
Recall that if a group Q is cyclic with generator g then the elements of Q are
powers of either g or g⁻¹. This gives a presentation
Q = Z/ ord(g)Z
as a quotient of Z by a suitable subgroup.
More generally, we say that a set {g1 , . . . , gk } of elements of Q is a generating
set if every element of Q is a product of powers of the gi and/or their inverses.
For a general group Q with generating set {g1 , . . . , gk } one can make a surjective
morphism from the free group Fk on k symbols to Q, by sending the i-th symbol
of Fk to gi . If Q is Abelian, one can also make a surjective morphism Zᵏ → Q by
sending the i-th unit vector in Zᵏ to gi . These surjections are called presentations.
Theorem VII.6 (FTFGAG, Part 2). We consider a subgroup H of G = Zᵐ
and investigate the quotient G/H.
(5) Any subgroup H of Zᵐ can be generated by a finite number of columns
of a suitable matrix A.
(6) The group G/H is isomorphic to
(Z/d1 Z) × · · · × (Z/dm Z)
where d1 , . . . , dm are the elementary divisors of A. They do not depend
on how one chooses A.
(7) For comparisons of different groups Zᵐ/H and Zᵐ′/H′, one can split
Z/di Z further using coprime factors.
(8) Two quotients Zᵐ/H and Zᵐ′/H′ are isomorphic if and only if their lists
of elementary divisors are equal after striking all appearances of di = 1
from both lists.
A comment on the last item: Z/1Z is the trivial group, and so (Z/1Z) × G = G
for any group G. So erasing instances of Z/1Z from the lists in the theorem
does not change anything.
All parts except the last one are clear from what we have done and said in
examples. At the end of the section I explain why the last part is true (why
different elementary divisors must come from non-isomorphic groups).
Let us ask what we can do for arbitrary Abelian groups. The answer is: with
a bit of preprocessing, the exact same things.
Example VII.7. Let G = KV4 = {e, ↕, ↔, ×}. This group is Abelian, and has
3 elements aside from the identity. The main observation of this example is that
we can make a morphism π : Z³ → KV4 that sends (1, 0, 0) to ↕, (0, 1, 0) to ↔ and
(0, 0, 1) to ×.
This map is surjective, but surely not an isomorphism (for example, because Z³
is infinite and KV4 is not). What is in the kernel of π? These are the expressions
in ↕, ↔, × that give the identity in KV4 . For example, ↕ · ↕ = e in KV4 and so
(1, 0, 0) + (1, 0, 0) ∈ ker(π). To understand how this came about, recall that we
have the morphism rule π(v + w) = π(v) · π(w). So if v = w = (1, 0, 0) then
π((1, 0, 0) + (1, 0, 0)) = π((1, 0, 0)) · π((1, 0, 0)) = ↕ · ↕ = e, placing (1, 0, 0) + (1, 0, 0)
in the kernel of π. (Recall: the kernel is whoever is mapped to the identity.)
Other elements in the kernel are (0, 2, 0) and (0, 0, 2), basically for similar reasons.
But there is another, more interesting relation: since ↕ · ↔ = ×, the corresponding
relation is (1, 1, −1). It turns out that these 4 elements generate the kernel of π.
So let us run the elementary divisor business on H, the subgroup of Z³ spanned
by the kernel of π, which is the column span of
$$\begin{pmatrix} 2 & 0 & 0 & 1 \\ 0 & 2 & 0 & 1 \\ 0 & 0 & 2 & -1 \end{pmatrix}.$$
If we move the
(1, 1, −1) column to the left and then clear out the lower parts of the left column, we get
$$\begin{pmatrix} 1 & 2 & 0 & 0 \\ 0 & -2 & 2 & 0 \\ 0 & 2 & 0 & 2 \end{pmatrix}.$$
We then use the −2 as pivot to get
$$\begin{pmatrix} 1 & 2 & 0 & 0 \\ 0 & -2 & 2 & 0 \\ 0 & 0 & 2 & 2 \end{pmatrix}.$$
At last, we do column operations to get to
$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 0 \end{pmatrix},$$
which certifies that KV4 = Z³/H
is isomorphic to (Z/1Z) × (Z/2Z) × (Z/2Z) = (Z/2Z) × (Z/2Z). We can associate
KV4 with (Z/2Z) × (Z/2Z) via
e ↔ (0 + 2Z, 0 + 2Z), ↕ ↔ (1 + 2Z, 0 + 2Z), ↔ ↔ (0 + 2Z, 1 + 2Z), × ↔ (1 + 2Z, 1 + 2Z),
and this assignment is an isomorphism.
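The SymPy check from earlier (same assumption that smith_normal_form is available) confirms this diagonal too:

    from sympy import Matrix, ZZ
    from sympy.matrices.normalforms import smith_normal_form

    K = Matrix([[2, 0, 0, 1], [0, 2, 0, 1], [0, 0, 2, -1]])
    print(smith_normal_form(K, domain=ZZ))
    # expect diagonal 1, 2, 2, confirming KV4 = (Z/2Z) x (Z/2Z)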
One can use this result to count the number of Abelian groups of a certain
order, and also compare different Abelian groups for being isomorphic.
Example VII.8. How many Abelian groups G with 168 elements are there?
For each, find the elementary divisors.
168 = 2³ · 3¹ · 7¹. By FTFGAG, G should be a product of some $\mathbb{Z}/2^{d_i}\mathbb{Z}$
and $\mathbb{Z}/3^{e_i}\mathbb{Z}$ and $\mathbb{Z}/7^{f_i}\mathbb{Z}$. Of course, in order to make the group indeed have 168
elements, we need the sum of the di to be 3, the sum of the ei to be 1, and
the sum of the fi to be 1 as well. That actually leaves very little choice, since an
exponent of 0 can be ignored. We must have one e and one f of value 1. The only
interesting bit is how we partition 3. As we know, this could be as 1 + 1 + 1 or as
1 + 2 or as 3.
So the possibilities are:
(Z/8Z) × (Z/3Z) × (Z/7Z),
(Z/2Z) × (Z/4Z) × (Z/3Z) × (Z/7Z),
(Z/2Z) × (Z/2Z) × (Z/2Z) × (Z/3Z) × (Z/7Z).
The elementary divisors satisfy: their product is 168, and they divide each other (and
1's can be ignored since they lead to Z/1Z factors which are trivial). Since 3 and
7 appear with power 1 in 168, both appear only in the last (biggest) elementary
divisor. The possibilities are: 168, or 2 · 84, or 2 · 2 · 42. One can see that the
partitions of the exponent 3 of 2 correspond to these factorizations: 1 + 1 + 1
corresponds to (2¹) · (2¹) · (2¹ · 3¹ · 7¹), 1 + 2 to (2¹) · (2² · 3¹ · 7¹), and 3 to (2³ · 3¹ · 7¹).
Of course, the same applies to the partitions of the exponent 1 over 3 and 7, but
since 1 can't be partitioned nontrivially, that is not so thrilling, and 3, 7 only appear
in the last elementary divisor. □
We want to explain, lastly, why for example G = (Z/2Z) × (Z/4Z) is not isomorphic
to Z/8Z, and in the process understand all similar questions with more or
higher exponents.
The underlying idea is to find elements that are “killed” by 2 (or its
powers). By this we mean elements g ∈ G that are zero when you double them.
In $\mathbb{Z}/2^e\mathbb{Z}$, there is always exactly one element that is not zero but yet
killed by 2, namely the coset of $2^{e-1}$. More generally, we learned when we studied
cyclic groups that the number of elements in a cyclic group Z/nZ of exact order
d is either zero (when d does not divide n) or (if d|n) equals φ(d), the Euler phi
function that counts the numbers relatively prime to d. Since φ(2) = 1 this agrees with
the above search.
So in a cyclic group of order divisible by p, the number of elements that are
killed by the prime number p is exactly φ(p) + 1, the 1 coming from the fact that
the identity is killed by p but already dead (and so did not count for the order-p
count in φ(p)). But φ(p) + 1 = p, so in a cyclic group of order divisible by p there
are exactly p elements killed by p if p is prime.
Now, in a product such as (Z/2Z) × (Z/4Z) there are 2 × 2 elements killed
by 2, because if a pair is killed by 2 then each component is killed by 2. And since
there are 2 choices in each component, there are 2 × 2 such pairs.
More generally, in a product $(\mathbb{Z}/p^{e_1}\mathbb{Z}) \times \cdots \times (\mathbb{Z}/p^{e_k}\mathbb{Z})$, exactly $p^k$ elements will be
killed by p. So, groups with a different number of factors of the sort $\mathbb{Z}/p^e\mathbb{Z}$ cannot
be isomorphic, because they have different numbers of elements that are killed by
p.
If the number of such $\mathbb{Z}/p^e\mathbb{Z}$ is the same, consider the number of elements
killed by p². In each $\mathbb{Z}/p^{e_i}\mathbb{Z}$, if $e_i = 1$ there are p elements killed by p², but if
$e_i > 1$ then there are p² such elements. So in $(\mathbb{Z}/p^{e_1}\mathbb{Z}) \times \cdots \times (\mathbb{Z}/p^{e_k}\mathbb{Z})$ there are
$p^{\#\{e_i \ge 1\}} \cdot p^{\#\{e_i > 1\}}$ elements killed by p². So, groups with equal numbers of factors
of type Z/pZ but different numbers of factors Z/p²Z are not isomorphic.
Remark VII.9. The above is relevant to the finite part of a group (the part
corresponding to elementary divisors different from 0). In homework you will show that Zᵐ and
Zⁿ are isomorphic exactly if m = n. That then finishes the last part of FTFGAG,
stated next.
Theorem VII.10 (FTFGAG, Part 3). Let G be any finitely generated Abelian
group, and choose generators g1 , . . . , gm . Then G has a presentation π : Zᵐ ↠ G
with π(ei ) = gi , where ei is the i-th unit vector of Zᵐ. This identifies G =
Zᵐ/H as Zᵐ modulo some subgroup H of Zᵐ.
One can find a matrix A whose column span is exactly H. The elementary
divisors of A do not depend on the chosen presentation of G, nor do they depend on
the chosen matrix A. They only depend on G.
The finitely generated Abelian group G is characterized by the elementary divisors
in the sense that two groups have the same elementary divisors if and only if
they are isomorphic.
CHAPTER VIII

Week 8: Group actions

We have seen in two different places that one can read a group as a bunch of
permutations. First, as symmetries of actual objects (like an equilateral triangle,
for example), where the permutations occur at special places of the objects (the
corners of the triangle). Secondly, and much more formally, we have interpreted a
group element g ∈ G as a permutation σg of the elements of G via left multiplication:
σg (g′) = gg′. In this section we formalize this sort of idea and discuss some
consequences.
Definition VIII.1. Let X be a set and let G be a group. Under the following
circumstances we shall speak of a left action of G on X:
(1) There should be a way of “multiplying” any element of G onto any element
of X. In other words, we need a function
λ : G × X → X,
(g, x) ↦ λ(g, x).
We then want that this action behaves well with respect to group multiplication
as follows.
(2) The identity element e = eG should “fix” every element of X, so that we
have
λ(eG , x) = x
for all x ∈ X.
(3) Given any two group elements g, g′ ∈ G we require
λ(g, λ(g′, x)) = λ(gg′, x).
We will look exclusively at left actions, and henceforth just say “action” when
we mean “left action”. (Just to fill the void: a right action ρ : X × G → X would
want that ρ(g′, ρ(g, x)) = ρ(gg′, x); note the reversal in the order of g, g′ here.)
We will often write less officially gx for the result λ(g, x) of g acting on x. Then
the two rules above become
ex = x ∀x ∈ X,
g(g′x) = (gg′)x ∀g, g′ ∈ G, ∀x ∈ X.
I recommend thinking of the elements of X as physical objects (“points”) that
one can draw and touch, and the process λ(g, −) as a way of moving the points in
X about. Here, λ(g, −) is the process that lets g ∈ G act on all points of X; the −
is just a placeholder.
In order to say interesting things about group actions, we need a few more
concepts that arise naturally.
Definition VIII.2. Let λ be an action of G on X and choose x ∈ X.
• The orbit of x is those points y in X that you can “get to from x” using
multiplication of x by elements of G. In symbols, denoting the orbit of x
by orbG (x),
orbG (x) = {y ∈ X | ∃g ∈ G with gx = λ(g, x) = y},
or simply orbG (x) = Gx.
• If starting from x, the action can carry you to all other points of X, then
we say that the action is transitive. If G acts transitively on X then it is
customary to call X a homogeneous G-space.
• Complementary to the orbit of x is the notion of the stabilizer of x,
StabG (x) = {g ∈ G with gx = x},
the group elements that do not move x. Here we say that g moves x if
gx ≠ x.
• If no element of G moves x, that is when StabG (x) = G, we call x a fixed
point of G. If g does not move x, we say that x is a fixed point for g, or
that g fixes x. We write FixX (g) for the points x ∈ X for which gx = x.
Remark VIII.3. You will show in homework that StabG (x) is a subgroup of
G.
We consider some examples, concrete and abstract.

Example VIII.4. Let G = Sym(△) and let X consist of the vertices of the
triangle. As we said many times, G can also be interpreted as SX , the permutation
group on the elements of X.
Let x be the A-vertex. Then StabG (x) consists of the identity e and the A-flip
a, since the other 4 elements b, c, ℓ, r of G all move x.
Similarly, the stabilizer of C is {e, c} and that of B is {e, b}.
The rotations ℓ, r have no fixed points, and the fixed points of e are all points
of X. The reflections a, b, c have only one fixed point each.
The action is transitive, since already the rotations are capable of carrying any
point to any other point.
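Anticipating the counting theorem below, here is a tiny check (my own) for this example, with the vertices A, B, C coded as 0, 1, 2:

    from itertools import permutations

    G = list(permutations(range(3)))        # S3 acting on the vertices {0, 1, 2}
    x = 0                                   # the A-vertex
    orbit = {g[x] for g in G}
    stab = [g for g in G if g[x] == x]
    print(len(orbit), len(stab), len(G) == len(orbit) * len(stab))   # 3 2 True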
Example VIII.5. Let Ḡ be the full symmetry group of a cube, and let G be the rigid
symmetry group of the cube. (The latter is the subgroup of all symmetries of the
cube consisting of just the rigid motions.) We found that
|Ḡ| = 48: the 24 rotations from G, plus 24 non-rigid motions that are a composition of
a rotation and the antipodal map (which sends each vertex to the one diametrically
across).
Let X be the vertices of the cube and study the action of G (or Ḡ) on X.
If x is the upper left front vertex, there are 3 rigid motions that stabilize it (the
3 rotations that fix the big diagonal on which x lies) and then 3 more non-rigid
motions that combine the antipodal map with the 3 rotations that exchange x with
its antipode. So | StabG (x)| = 3 and | StabḠ (x)| = 6.
Both actions are transitive. (Since G ⊆ Ḡ, it is enough to check that for G,
but we know that one can rotate any vertex of the cube into any other.)
Most elements of G have no fixed point in X. Note that if a motion fixes a
vertex, it must also fix the antipodal point of that vertex. The 2 × 4 non-trivial
rotations that fix a big diagonal have 2 fixed points. The identity of G has 8 fixed
points.
The 3 × 4 motions from Ḡ that combine the antipodal map with a rotation about
one of the big diagonals followed by a reflection about the plane perpendicular to
this diagonal also have two fixed points.
Example VIII.6. Let G be any group and H a subgroup. We do not require
H to be normal. Let G/H be the set of all cosets gH relative to H. We take X to
be G/H and act on it by left multiplication:
λ(g, g′H) = gg′H for all g, g′ ∈ G.
It is straightforward to check the group action rules: λ(eG , gH) = eG gH = gH, and
λ(g, λ(g′, g″H)) = gg′g″H = λ(gg′, g″H) because of associativity of multiplication
in G.
The stabilizer of a coset gH is the set of all a ∈ G with agH = gH, which says
g⁻¹agH = H and that is equivalent to g⁻¹ag ∈ H. For example, if g = e and so
gH = eH = H, the condition becomes a ∈ H, so the stabilizer of the “point” eH
in X is exactly H. In general, the equation agH = gH means that for every
h ∈ H the expression agh should be of the form gh′ for some h′ ∈ H. That means
ag = gh′h⁻¹ and so a = gh′h⁻¹g⁻¹. Since the product h′h⁻¹ is again in H, we find
that a must be in gHg⁻¹. On the other hand, (gHg⁻¹)(gH) = gH(g⁻¹g)H =
gHH = gH, so the stabilizer of gH is exactly the set gHg⁻¹. This says that
the stabilizers of the various gH are always conjugates of H. In particular, if H
happens to be normal (but only then), each stabilizer is equal to H.
If gH wants to be a fixed point for multiplication by g′ then we need g′gH = gH,
which amounts to g⁻¹g′gH = H. This forces g⁻¹g′g to be in H, so there should
be an element h ∈ H with g⁻¹g′g = h, or g′ = ghg⁻¹. So, gH is a fixed point for
g′ precisely if g′ is in the conjugate subgroup gHg⁻¹.
In reverse, given g′, then gH is fixed under multiplication with g′ precisely when
gHg⁻¹ contains g′. Note that belonging to gHg⁻¹ may not be very easy for g′. For
example, if H is normal then the condition “g′ should belong to some conjugate
subgroup of H” just boils down to “g′ must be in H”. Specifically, this applies in
an Abelian group G, as then all subgroups H are normal.
We are interested in counting. That usually means G and X should be finite.
Theorem VIII.7 (Stabilizer–Orbit Theorem). If G acts on X and
both are finite, then
|G| = | orbG (x)| · | StabG (x)|
for every point x of X. If the action is transitive, so that the only orbit is
X itself, this becomes
|G| = |X| · | StabG (x)|.
I won't prove this formally, but give some ideas.
If x, y ∈ X are in the same orbit, then StabG (x) and StabG (y) are conjugate
subgroups, as you show in homework. So in particular, they have the same size. This
explains why in the theorem it is not important which x one takes: within one orbit,
all stabilizers are conjugate to one another.
Next you cluster the elements of G in such a way that g, g′ are in the same
cluster if and only if gx = g′x. Note that the elements of g · StabG (x) all end up in the
same cluster, since they all send x to gx. Note that these sets are just the left cosets
relative to H := StabG (x). One then checks (easy but detailed) that g, g′ belonging
to different H-cosets rules out the possibility of gx = g′x. So, the clusters are all of the
same size as StabG (x). So, G is partitioned into clusters of size | StabG (x)|, and elements
g, g′ in different clusters produce different outputs gx ≠ g′x when multiplied against
x. But the collection of all outputs Gx is just the orbit of x. So, as the theorem
claims, |G| = | orbG (x)| · | StabG (x)|.
(This all should remind you much of Lagrange's Theorem and its proof. In
fact, this proof here is the proof of Lagrange's Theorem if you take X to be the
coset space for the subgroup H as in the example above. Then the Stabilizer–Orbit
Theorem becomes Lagrange's: |G| = |G/H| · |H|. In reverse, this theorem and
its proof are simply Lagrange applied to G and its subgroup H := StabG (x).)
Finally, if G acts transitively, there is only one orbit, and so orbG (x) = X.

Example VIII.8. Let G be the rigid symmetries of a cube, choose as X the
vertices of the cube, and let x be the upper front left vertex. The orbit of x is X
since the action is transitive, and |X| = 8. The stabilizer of x has 3 elements (the
big-diagonal rotations that fix x) as discussed above. And indeed, 3 · 8 = 24 = |G|.

Now we discuss fixed point counts. Recall that for g ∈ G, the set FixX (g) is
the points of X that are unmoved by g, so gx = x. Let us alo write X/G for the
orbit space of X under G. This is just the set of all orbits, the notation sugesting
that X/G arises from X by clustering elements of X where clusters are orbits. The
following theorem addresses the question of counting the number of orbits.

Theorem VIII.9 (Burnside). If G acts on X and both are finite, then the size
of the orbit space is
1 X
|X/G| = | FixX (g)|.
|G|
g∈G

Again, I won't give a very formal proof, but the main ideas. Let us count
$\sum_{g \in G} |\operatorname{Fix}_X(g)|$ as follows. Look at the collection of pairs (g, x) in the Cartesian
product G × X for which gx = x. Let F be the collection of all such pairs. We can
sort them by the separate g, or the separate x. If we sort them by g, then we get
clusters FixX (g) and so the number of all such pairs is precisely $\sum_{g \in G} |\operatorname{Fix}_X(g)|$.
But if we cluster by x, then each cluster has the form {g ∈ G | gx = x} and that is
exactly StabG (x). So, if we now sum this over all x we get $\sum_{x \in X} |\operatorname{Stab}_G(x)|$. Of
course, these two counts must agree:
$$\sum_{x \in X} |\operatorname{Stab}_G(x)| = \sum_{g \in G} |\operatorname{Fix}_X(g)|.$$
We now need to interpret the sum on the left a bit differently. From the
Stabilizer–Orbit Theorem, if we let G just act on the orbit Gx of x, we know
that |G| = | StabG (x)| · | orbG (x)|. So, restricting the sum to the orbit of x, we get
$$\sum_{y \in \operatorname{orb}_G(x)} |\operatorname{Stab}_G(y)| = \sum_{y \in \operatorname{orb}_G(x)} \frac{|G|}{|\operatorname{orb}_G(x)|} = |G| \sum_{y \in \operatorname{orb}_G(x)} \frac{1}{|\operatorname{orb}_G(x)|} = |G|.$$
So, orbit by orbit, the expression $\sum_{x \in X} |\operatorname{Stab}_G(x)|$ contributes one copy of |G|.
If you sum over all orbits, this is |G| times the number of orbits. The
latter is |X/G|, and so we find that $|G| \cdot |X/G| = \sum_{x \in X} |\operatorname{Stab}_G(x)|$. Combined
with the displayed equality of the two counts above, this shows the Burnside Theorem.
Note that there is very little “power” in this proof; it relies on two ways of counting
the same thing.
Example VIII.10. How many different dice can one make with labels 1 through
6 on them? It turns out, this question is made for Mr Burnside.
First off, if the die can’t move, there are 720 = 6! ways to paint numbers on the
faces of the cube. The problem is that dice can move, and so many of the seemingly
different dice will turn out to be the same.
Let us write X for the set of the 720 different dice that we painted. Let G be the rigid
symmetry group of the cube; it moves the dice around and has |G| = 24 elements.
If 2 dice are truly differently labeled, they would not look the same under any
symmetry. So they would not be in the same G-orbit. In other words, we want to
count the size of the orbit space X/G.
If we plan to use the Burnside Theorem, we need to study the fixed points of
all motions. Note that a “fixed point” is now a labeling of the cube that looks the
same no matter what we do with that cube. But it is clear that every rigid motion
of the cube other than the identity will move a face, and in fact several. So there is
no such g with a fixed point. The identity motion, of course, has every labeling
as a fixed point. So, in the Burnside formula there is exactly one summand that
contributes anything, namely the one that belongs to g = e. And the summand for
g = e is | FixX (e)| = |X| = 720. All other summands belong to a g without fixed
points and contribute 0. So the formula says |X/G| = (1/24)(720 + 0 + · · · + 0) = 30.
The example makes clear a special case of the Burnside Theorem:
Corollary VIII.11. If G acts on X and no element e ≠ g ∈ G has any fixed
point, then |X/G| = |X|/|G|.
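For readers who like to double-check such counts by machine, here is a minimal Python sketch of the Burnside count for the dice example. The face numbering and the two generating quarter turns are conventions chosen only for this illustration.

from itertools import permutations

# Faces: 0=up, 1=down, 2=front, 3=back, 4=left, 5=right (our own convention).
# A rotation is encoded as a tuple p with (moved die)[i] = (old die)[p[i]].
r = (0, 1, 4, 5, 3, 2)  # quarter turn about the up-down axis
s = (3, 2, 0, 1, 4, 5)  # quarter turn about the left-right axis

def compose(p, q):
    # first apply q, then p
    return tuple(q[p[i]] for i in range(6))

G = {(0, 1, 2, 3, 4, 5), r, s}
while True:  # close the set under composition
    new = {compose(p, q) for p in G for q in G} - G
    if not new:
        break
    G |= new
assert len(G) == 24  # the rigid symmetries of the cube

X = list(permutations(range(1, 7)))  # the 720 painted dice
# Burnside: |X/G| = (1/|G|) * sum over g of |Fix_X(g)|
total = sum(sum(1 for L in X if all(L[p[i]] == L[i] for i in range(6)))
            for p in G)
print(total // len(G))  # prints 30

As predicted, only the identity contributes (all 720 labelings), so the count is 720/24 = 30.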
Review

• Week 1
– induction, well ordering
– modular arithmetic
– primes and irreducibles in a domain
– Euclidean algorithm in Z, gcd, lcm, relative prime (coprime)
• Week 2
– symmetries of an object and composition of symmetries
– group (axioms), and Cayley table
– cancellation property in groups
– examples: symmetry groups, KV4 , GL(n, R), vector spaces, (Z/nZ, +),
U (n), Cn , free groups, Zn
– Abelian groups, cyclic groups
– order of a group, and of elements in a group
– subgroup
– product group G × H
– Aut(G), the relabelings of G that preserve the Cayley table, a group
with composition
• Week 3
– the Euler φ-function
– the number of elements in the cyclic group Cn that have order d
(distinction for d | n and d ∤ n)
– the number of subgroups of a given size in Cn
– the number of generators for Cn
– φ(Z/pqZ) = φ(Z/pZ) · φ(Z/qZ) if gcd(p, q) = 1
– if a = a1 · · · ak then Ca1 × Ca2 × · · · × Cak = Ca provided that the ai are
pairwise coprime
– solving x mod n = a mod n, x mod m = b mod m when m, n co-
prime
– U (mn) = U (m) × U (n) if coprime
– |U (p^k)| = p^{k−1}(p − 1), and why
– φ : Z/mnZ → (Z/mZ) × (Z/nZ) via 1 + mnZ ↦ (1 + mZ, 1 + nZ) is
an isomorphism provided m, n coprime
• Week 4
– left and right cosets of G relative to the subgroup H; coset space
G/H
– morphisms ψ : G → G0 and a list of examples
– know how to test whether multiplication by k ∈ N gives a morphism
Z/mZ → Z/nZ
– conjugation by a, g ↦ aga^{−1}


– inner automorphisms φa : G → G, sending g ↦ aga^{−1}


– properties of cosets: either equal or disjoint; union is G; all same size
– Lagrange: ord(g) | ord(G); |G| = |H| · |G/H|
– if |G| prime then G cyclic
– normal subgroup (stable under conjugation)
– kernels of morphisms are normal
– G Abelian, then every subgroup normal
– index 2 subgroups are normal
• Week 5
– the symmetric group Sn
– 3 notations: output notation, standard notation, cycle notation;
know how to convert one into the other and how to make cycles
disjoint
– transposition = 2-cycle
– disorder as number of switches
– odd/even: parity
– sign of a permutation
– the alternating group as the kernel of σ ↦ sign(σ), a group morphism
from Sn to ({±1}, ·).
– every group is a subgroup of a permutation group
• Week 6
– suppose here H is normal. Then G/H can be made a group, (g1 H) ·
(g2 H) = (g1 g2 H). That this works is precisely because H is normal.
– the kernel of a morphism is a normal subgroup
– if φ : G → G′ is a morphism, denote H the kernel. Then G/H is
isomorphic to the image of φ (this is a subgroup of G′)
• Week 7
– A ∈ Zm,n has a Smith normal form, computable via standard row
and column reduction steps
– the diagonal elements of the Smith normal form are the elementary
divisors of A, independent of pivot choices in the reduction
– elementary divisors d1 , . . . , dm divide one another, di |di+1
– If G = Zm /H where H is the column span of A ∈ Zm,n with m ≤
n, then the elementary divisors d1 | . . . | dm of A characterize G: if
you discard any "1" on that list, then two Abelian groups G, G′ give
the same lists of elementary divisors if and only if G and G′ are
isomorphic. So, elementary divisors solve the "classification problem"
for finitely generated Abelian groups
CHAPTER IX

Week 9: Introduction to rings

This begins the second part of the course, where we study structures that
allow both addition and multiplication. The standard example is Z, with Q, R, C
following closely behind.
Definition IX.1. A ring is a set R with a binary operation + : R × R → R
called addition and a second binary operation · : R × R → R called multiplication
such that
(1) (R, +) is an Abelian group;
(2) multiplication is associative, (r · s) · t = r · (s · t) for all r, s, t ∈ R;
(3) the distributivity law is intact: r(s+t) = r ·s+r ·t and (r +s)·t = r ·t+s·t
for all r, s, t ∈ R;
(4) there is a neutral element for multiplication, written 1R , with 1R · r = r =
r · 1R for all r ∈ R.
It is perhaps useful to make some comments here.
• We denote 0R (or just 0) the neutral element for addition in R, and write
(−a) for the additive inverse of a ∈ R. Note the following two facts.
a · 0 = a · (0 + 0) = (a · 0) + (a · 0), so a · 0 = 0;
0 = a · 0 = a · (1 + (−1)) = a · 1 + a · (−1),
so that (−1) · a is the additive inverse of a. We usually denote it by −a
and write b − a for b + (−1) · a.
• We will almost exclusively look at commutative rings, which are those
where r · s = s · r for all r, s ∈ R. But there is a general consensus that
non-commutative rings are important enough for not being disqualified
from the start.
• Some people do not require the existence of 1R . Rings without multiplica-
tive identity are not difficult to find, but they lack key features of rings
that we want to discuss in our remaining chapters.
• One thing to note is something that a ring need not have, and that is
multiplicative inverses. We are not saying that inverses must not exist
(after all, 1R is always its own inverse!); we just concede that they may
not exist in all cases. Do not confuse + and ·; + is always commutative
by definition.
• However, if an element a ∈ R does have a multiplicative inverse, this
inverse is unique, because if a′, a″ both are inverses to a ∈ R then a′ =
a′ · 1R = a′ · (a · a″) = (a′ · a) · a″ = 1R · a″ = a″.
Example IX.2. Here is a list of standard rings that come up all the time in
mathematics. The first three are all commutative.

• The ring after which all others are modelled is (Z, +, ·), the set of integers
with usual addition and multiplication.
• The three collections Q, R, C of rational, real and complex numbers re-
spectively are all rings as well. They are rather special rings, since in
contrast to Z, every non-zero number in these three rings does have a
multiplicative inverse (whereas in Z that is only the case for ±1.)
• The groups Z/nZ are also all rings, with addition and multiplication of
cosets.
• A collection of non-commutative matrix rings arises for each number n
and choice of coefficient ring K as follows. Let Mn (K) be the set of all n×n
matrices with entries in K. Then usual matrix addition and multiplication
has the usual properties, which are those listed in Definition IX.1 above.
Note that in general A · B 6= B · A so that Mn (K) is not commutative.
• Another collection of rings are the polynomial rings K[x1 , . . . , xn ] over
a chosen coefficient field K. These are commutative rings, and their
elements are the polynomials in the variables x1 , . . . , xn which have coef-
ficients in K.
• A type of ring we will not look at much is popular in analysis: the set of
all real-valued functions on the interval [0, 1]. Addition and multiplication
is pointwise, which means that (f + g)(x) is declared as f (x) + g(x) and
likewise for multiplication.

Example IX.3. Here is an example of an extension ring. Let Z[√−1] be the set
of complex numbers whose real and imaginary parts are both integers. So, Z[√−1] =
{a + b√−1 with a, b ∈ Z}. You might want to think of this as a "vector space of
dimension 2 over Z, spanned by 1 ∈ Z and √−1". So, we add componentwise:
(a + b√−1) + (c + d√−1) = (a + c) + (b + d)√−1. Multiplication has a bit of a
surprise, as it does not go componentwise, but instead works like for complex numbers in
general: (a + b√−1) · (c + d√−1) = (ac − bd) + (bc + ad)√−1. This is the ring of
Gaussian integers.
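These rules are easy to put on a computer. A minimal Python sketch, with a + b√−1 stored as the pair (a, b) (a convention chosen for this illustration):

def g_add(u, v):
    # componentwise addition
    return (u[0] + v[0], u[1] + v[1])

def g_mul(u, v):
    # (a + b*sqrt(-1))(c + d*sqrt(-1)) = (ac - bd) + (bc + ad)*sqrt(-1)
    a, b = u
    c, d = v
    return (a * c - b * d, b * c + a * d)

print(g_mul((1, 1), (1, -1)))  # (1+i)(1-i) = 2, printed as (2, 0)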
Definition IX.4. If in a ring R we have a, b ∈ R both nonzero, but ab = 0,
then we call a and b zero-divisors.
Most of the rings listed in examples here do not have zero-divisors. The excep-
tions are: Z/nZ if n is not prime; Mn (R) in the case n > 1 and also in the case
that R itself has zero-divisors; the polynomial ring R[x1 , . . . , xn ] in the case that R
has zero-divisors. You might want to check these three claims explicitly by finding
one example of zero-division in each of the three scenarios.
Definition IX.5. A commutative ring that has no zero-divisors is called a
domain.
Note that if a ∈ R has an inverse, then a cannot be a zero-divisor. Indeed, if
ab = 1 and ca = 0 then c = c · 1 = cab = 0 · b = 0.
Definition IX.6. Consider 1R , 1R +1R , 1R +1R +1R , . . .. This sequence might
or might not contain the element 0R . If it does, there is a smallest number c ∈ N+
such that adding 1R c times gives 0R . We call this c the characteristic of R.
If this sequence never produces 0R we say that the characteristic of R is zero.
Lemma IX.7. If R is a domain, its characteristic is a prime number or zero.

Proof. Suppose 1 + 1 + . . . + 1 (c copies) = 0, where c is the characteristic
(the smallest positive such c). Suppose c = mn factors. Then let em = 1 + 1 + . . . + 1
(m copies) and en = 1 + 1 + . . . + 1 (n copies). Using the distributive property,
em · en is the sum of mn copies of 1, hence zero. Since R is supposed to be a domain,
it can't have zero-divisors, and so we must have em = 0 or en = 0. But if c = mn is
really a factorization of c, then m and n are strictly less than c, which makes it
impossible that em = 0 or en = 0. We conclude c does not factor and is therefore
a prime number. □

Definition IX.8. A commutative ring that has multiplicative inverses for each
nonzero element is called a field.

We will discuss fields in detail later.

Theorem IX.9. If R has finitely many elements (“R is finite”), is commutative


and is a domain, then it is a field.

Proof. We need to show that the absence of zero-divisors forces the presence
of inverses when R is finite. Take a ∈ R nonzero. Then multiplication by a gives
a permutation of the nonzero elements of R. Indeed, let r1 , . . . , rt be the complete list
of nonzero elements of R. Then ar1 , . . . , art is another list of elements of R. No
expression ari can be zero, since a ≠ 0 and ri ≠ 0, and R is supposed to be a domain.
Also, there is no repetition on this list, since if ari = arj then a(ri − rj ) = 0, and
a ≠ 0 now forces ri − rj = 0, as otherwise we would be looking at zero-divisors,
which can't exist in a domain. So, the second list is a permutation of the first list,
because both list all nonzero elements of R. (This is where finiteness is used: if
R were infinite we could not argue like this. For example, the multiples of 2 are
not a permutation of the nonzero integers. But in a finite set, if you list as many
different elements as the set has, you have listed them all.) It follows that one of
the elements on the second list is 1R , which amounts to saying that there is ri ∈ R
with a · ri = 1R . □

Remark IX.10. A postscript of this proof goes like this: let R have p elements.
Then the nonzero elements are a group with multiplication, since the theorem
assures the existence of inverses. This group has p − 1 elements. So Lagrange says
that if a is a nonzero element of R then its multiplicative order divides p − 1. In
particular, there is a power c such that a^c = 1R . But then a^{c−1} · a = 1R and so the
inverse of a is actually a power of a.
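For R = Z/pZ with p prime this postscript is directly computable. A quick Python check (the choice p = 101 is arbitrary):

p = 101
for a in range(1, p):
    inv = pow(a, p - 2, p)  # a^(p-2) is the inverse, since a^(p-1) = 1
    assert (a * inv) % p == 1
print("all", p - 1, "nonzero classes are invertible")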

Example IX.11. In the same way the Gaussian integers are an extension of
the integers, one can make extensions of fields. For example, Q[√2] is the collection
of all expressions a + b√2 with a, b ∈ Q. One adds componentwise, (a + b√2) +
(c + d√2) = (a + c) + (b + d)√2, and multiplies according to (a + b√2) · (c + d√2) =
(ac + 2bd) + (bc + ad)√2.
Note how one computes inverses here:
(a + b√2)^{−1} = (a − b√2)/((a + b√2)(a − b√2)) = (a − b√2)/(a^2 − 2b^2) = a/(a^2 − 2b^2) + (−b/(a^2 − 2b^2))√2
is of the required form. Recall that we proved that there cannot be rational numbers a, b
(not both zero) with a^2 = 2b^2 , and so the denominator is nonzero.

Example IX.12. One can do this also with modular numbers. Let R =
(Z/3Z)[√2]. Here, √2 stands for a symbol whose square is the coset of 2. (Note
that there is no element in Z/3Z whose square is the coset of 2, just like there was
no rational number whose square was 2.)
This is a ring with 9 elements, the possible expressions of the form a + b√2
with a, b ∈ Z/3Z. You calculate exactly as expected, always going modulo 3.
So for example, the inverse of 2 + 1√2 is
1/(2 + 1√2) = (2 − 1√2)/((2 + 1√2)(2 − 1√2)) = (2 − 1√2)/(4 − 2) = 1 − 2√2 = 1 + 1√2.
And indeed, (2 + 1√2)(1 + 1√2) = 1 + 0√2.
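A brute-force confirmation by machine: the Python sketch below (elements stored as pairs (a, b) meaning a + b√2, our convention) finds an inverse for each of the 8 nonzero elements of this 9-element ring.

def mul2(u, v, p=3):
    # (a + b*sqrt(2))(c + d*sqrt(2)) = (ac + 2bd) + (ad + bc)*sqrt(2), mod p
    a, b = u
    c, d = v
    return ((a * c + 2 * b * d) % p, (a * d + b * c) % p)

elems = [(a, b) for a in range(3) for b in range(3)]
for u in elems:
    if u != (0, 0):
        inv = next(v for v in elems if mul2(u, v) == (1, 0))
        print(u, "has inverse", inv)
# in particular (2, 1) has inverse (1, 1), matching the computation above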
CHAPTER X

Week 10: Ideals and morphisms

Recall that we insist that our rings have a (multiplicative) 1. (All rings have a
neutral element for + (which we write as 0), since R with + is a group).
Definition X.1. A ring morphism is a function f : R → R′ from one ring to
another such that it is a morphism of groups (R, +) → (R′, +), and moreover it
respects ring multiplication: f (r1 r2 ) = f (r1 )f (r2 ).
Examples of such things abound.
• the inclusions Z ,→ Q ,→ R ,→ C and the inclusions Z ,→ Z[√−1] ,→ C;
• the surjection Z → Z/nZ sending k to k + nZ for any n;
• if m|n, the surjection Z/nZ → Z/mZ sending k + nZ to k + mZ;
• complex conjugation;
• the "conjugation" Z[√2] → Z[√2] sending a + b√2 to a − b√2, and any
similar constructs;
• the polynomial map C[x, y] → C[t] that sends x ↦ t^2 , y ↦ t^3 ;
• If O is the collection of real functions defined on the real line, then any
a ∈ R induces an evaluation morphism εa : O → R that sends f (x) ∈ O
to the value f (a) of f at a.
Example X.2. Recall that there are rings of positive characteristic. If char(R) =
p > 0 is prime, there is the Frobenius morphism Frob : R → R that sends r ∈ R to
Frob(r) = rp . That this is then a morphism is due to freshman’s dream in algebra:
(x + y)p = xp + y p in characteristic p, since by the binomial theorem every missing
term of (x + y)p is a multiple of p.
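A quick numeric sanity check of the freshman's dream, here in Z/7Z (Python; the modulus is an arbitrary prime):

p = 7
for x in range(p):
    for y in range(p):
        # (x + y)^p = x^p + y^p modulo p
        assert pow(x + y, p, p) == (pow(x, p, p) + pow(y, p, p)) % p
print("freshman's dream verified mod", p)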
Definition X.3. If f : R → R′ is a ring morphism, its kernel is the set of elements
of R that are sent to 0 ∈ R′ by f .
Definition X.4. An ideal in a ring R is a subset I ⊆ R such that
• I is a subgroup of R with respect to addition;
• For all x ∈ I and all r ∈ R, the product xr is in I.
Remark X.5. A standard way of producing ideals is as follows. Let f1 , . . . , fk
be elements of the ring R. Then let I be the set of all R-linear combinations you
can make from f1 , . . . , fk . In other words, I is made precisely of all things like
r1 f1 + . . . + rk fk where r1 , . . . , rk run through all possible elements in R. Then I
is an ideal: sums of such things as well as differences of such things are such things
again, and multiplying any such element by an arbitrary ring element gives another
thing of this type.
It is important to note that it is allowed for an ideal to have infinitely many generators.
Often, one can simplify such a situation to finitely many generators, but not always.
The rings we consider will all have only ideals that are finitely generated, but
proving this can be dicey (although we will prove it in some nice cases).

For example, the multiples of 641 are an ideal of Z. So are the C[x, y]-linear
combinations of x3 − y 2 and x3 + y 4 in C[x, y].
Proposition X.6. The kernel of a ring morphism is an ideal.
Proof. That the kernel of a ring morphism f : R → R′ is a subgroup of
R follows straight from the fact that f is a group morphism. Now take x ∈ I := ker(f )
and r ∈ R. Then f (x) = 0 and so f (x)f (r) = 0 and so f (rx) = 0 and so
rx ∈ ker(f ) = I. □

We next turn this around and use ideals to make factor rings and morphisms.
Definition X.7. Let I ⊆ R be an ideal. The factor ring R/I is the group R/I
together with multiplication (x + I)(y + I) = xy + I. There is an induced morphism
π : R → R/I that sends r ∈ R to r + I.
That this construction indeed produces a ring is not difficult to see. One
basically needs to check that multiplication is well-defined (this means that if x + I =
x′ + I and y + I = y′ + I then xy + I = x′y′ + I), but that is quite easy.
If f : R → R′ is a ring morphism and I an ideal of R and J an ideal of R′, then
inspection shows that
• f (I) may not be an ideal in R′ (for example, 2Z is an ideal in Z but when
you inject Z ,→ R then the even integers are no longer an ideal; make sure
you believe this, it is due to the fact that products of integers and reals
are often not integers).
• the preimage f^{−1}(J) in contrast is always an ideal of R. This is seen
as follows. Since f is a group morphism, the preimage is a group. Take
x ∈ f^{−1}(J) and y ∈ R. Then f (xy) = f (x)f (y) ∈ J · R′ = J and so
xy ∈ f^{−1}(J) as required.
• If char(R) = n then there is a natural morphism Z/nZ ,→ R induced by
sending 1 ∈ Z to 1 ∈ R and using the morphism rule.
The main structure theorem for ideals says:
Theorem X.8. If I is an ideal of R then under the natural surjection π : R →
R/I the ideals of R/I correspond to the ideals of R that contain I. More precisely,
if J is an ideal of R that contains I then the quotient group J/I is an ideal of R/I.
In reverse, if J/I is an ideal of R/I then the preimage π^{−1}(J/I) is an ideal of R.
For example, if R = Z and I = 6Z then R/I has 4 ideals: the whole ring R/I =
Z/6Z, the zero ideal {0 + 6Z}, and two interesting ideals J2 = {0 + 6Z, 2 + 6Z, 4 + 6Z}
and J3 = {0 + 6Z, 3 + 6Z}. To J2 corresponds the ideal 2Z of R, and it indeed
contains I. To J3 corresponds the ideal 3Z of R, and indeed it contains I. The
only ideals that contain I are I, 2Z, 3Z, Z. The first of these corresponds to the
zero ideal in R/I and the last one to the whole of R/I.
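This correspondence can be checked by brute force. A small Python sketch that tests every subset of Z/6Z containing 0 against the two ideal axioms (closure under negation follows from scaling by 5):

from itertools import combinations

n = 6
for size in range(1, n + 1):
    for S in combinations(range(n), size):
        S = set(S)
        is_ideal = (0 in S
                    and all((a + b) % n in S for a in S for b in S)
                    and all((r * a) % n in S for a in S for r in range(n)))
        if is_ideal:
            print(sorted(S))
# printed, one per line: [0], [0, 3], [0, 2, 4], [0, 1, 2, 3, 4, 5]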
We come now to talk about certain special types of ideals.
Definition X.9. A prime ideal of a ring R is an ideal P such that if a, b ∈ R
with ab ∈ P then at least one of a, b is in P .
Being a prime ideal is equivalent to saying that R/P is a domain. (There are
a, b ∈ R but not in P such that ab ∈ P if and only if in R/P we have (a+P )(b+P ) =
0 + P which can happen if and only if R/P is not a domain).

Definition X.10. An ideal M of R is maximal if there is no other ideal between


M and R. So, M is as large as it can be without equaling R.
Remark X.11. If you think of an ideal as generated by some elements of R,
then primeness and maximality can change with the ring. For example, the multiples
of 3 form a prime ideal in Z, but in Z[√−2] you can factor 3 = (1 + √−2)(1 − √−2).
Similarly, maximality can toggle: the multiples of 2 are a maximal ideal in Z, but
in Q the multiples of 2 are all of Q, which does not qualify as maximal.
By the main structure theorem on factor rings, the ideal I is maximal if and
only if R/I has only two ideals, I/I and R/I.
Definition X.12. A commutative ring with only two ideals is called a field. A
non-commutative ring with only two ideals is a skew-field.
If a ring has only two ideals, one has to be the ideal ⟨0⟩, and the other the ideal
R = ⟨1⟩. So, in a field it is clear what the two ideals involved are. In any ring, the
zero ideal and the whole ring are considered "improper ideals". Not in the sense
that they are running around naked, but in the sense that we don't want to truly
(properly) call them interesting ideals.
Lemma X.13. In a field, every nonzero element is invertible. In particular,
fields are domains.
Proof. Let 0 ≠ x ∈ R and let I be the ideal defined by x (so, I consists
precisely of all the multiples of x). Since x ≠ 0, I is not ⟨0⟩. So, as we are in a
field, I = R. This means in particular that 1R ∈ I and so 1R is a multiple of x.
But then x must be invertible.
If a field R is not a domain, then 0 = ab for some a ≠ 0 ≠ b in R. But as a field,
R contains an inverse for a and then a^{−1}ab = a^{−1} · 0 leads to a contradiction. □
Examples of fields include Q, R, C but also things like Z/pZ with p prime:
Lemma X.14. If p ∈ Z is prime then Z/pZ is a field and conversely.
Proof. We look at the morphism Z → Z/pZ that sends 1 to 1 + pZ. Then
the zero ideal in Z/pZ corresponds to the ideal ⟨p⟩ of Z by the theorem on factor
rings, and we need to show that there is no ideal of Z strictly between pZ and Z.
Suppose we have an ideal I with pZ ⊆ I but I strictly greater than pZ. Then
I contains all multiples of p, and at least one number a that is not a multiple of
p. The Euclidean algorithm says that 1 = gcd(a, p) can be written as a linear
combination 1 = ax + py with x, y integers. Thus, in Z/pZ, a + pZ and x + pZ are
inverses, and in particular I/pZ contains 1 + pZ = (a + pZ)(x + pZ). So, I/pZ is in
fact Z/pZ and hence Z/pZ is a field.
On the other hand, if p is not prime and can be factored as p = mn with m, n
not units, then in Z/pZ we have (m + pZ)(n + pZ) = 0 + pZ and so neither factor
can have an inverse. But m + pZ is not zero since p does not divide m (since n is
not a unit). So Z/pZ can't be a field. □
More generally,
Proposition X.15. Let I be an ideal of R.
(1) I is a prime ideal if and only if R/I is a domain;
(2) I is a maximal ideal if and only if R/I is a field.

Proof. The second claim, as mentioned previously, follows directly from the
structure theorem of factor rings. The proof for the first claim is analogous to the
proof of the preceding lemma. Namely, if I is a prime ideal and (a + I)(b + I) = 0 + I
in R/I then we must have ab ∈ I, and so by primeness of I one of a, b is in I, and
thus one of a + I, b + I is zero in R/I. If I is not prime, there are a, b ∈ R that are
not in I but with ab ∈ I. Then (a + I) and (b + I) are zerodivisors and so R/I
is not a domain. □
Theorem X.16. Every ideal in Z and in Z/nZ is generated by one element.
Proof. Suppose the ideal I ⊆ Z contains a and b. By the Euclidean algorithm,
it also contains their gcd g. On the other hand, a, b are multiples of g and so we
see that any ideal that contains a, b also contains gcd(a, b) and conversely.
Iterating this argument, ⟨a, b, c⟩ = ⟨gcd(a, b), c⟩ = ⟨gcd(a, b, c)⟩, and ⟨a, b, c, d⟩ =
⟨gcd(a, b), c, d⟩ = ⟨gcd(a, b, c), d⟩ = ⟨gcd(a, b, c, d)⟩, and in this way every finite gen-
erator set a1 , . . . , ak for an ideal can be replaced by the single generator given by
the gcd of all ai .
Now imagine an infinite list a1 , a2 , . . . , an , . . .. We know that
gcd(a1 ) ≥ gcd(a1 , a2 ) ≥ gcd(a1 , a2 , a3 ) . . . ≥ 0.
It follows that this sequence of ≥ symbols reaches a point (say, when the index is
k) from where onwards each ≥ is actually a =.
What this means is that gcd(a1 , . . . , ak ) divides ak+1 , ak+2 , . . .. But then
ak+1 , ak+2 , . . . are already in the ideal generated by a1 , . . . , ak and we can say
that
⟨a1 , a2 , . . . , an , . . .⟩ = ⟨a1 , . . . , ak ⟩ = ⟨gcd(a1 , . . . , ak )⟩
is generated by one element. □
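The computational content of this proof is just iterated gcds. In Python, the single generator of ⟨a1, . . . , ak⟩ ⊆ Z is found by:

from math import gcd
from functools import reduce

gens = [24, 60, 2021, 14]
print(reduce(gcd, gens))  # 1, so <24, 60, 2021, 14> = <1> = Z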
Definition X.17. Ideals generated by one element are called principal. The
theorem says that Z has only principal ideals. Since Z is a domain (has no zerodi-
visors), it is referred to as a principal ideal domain.
Remark X.18. You will prove in homework that ideals in Z/nZ are also all
principal.
CHAPTER XI

Week 11, Euclidean rings

We start with polynomial rings. As before, rings are commutative (unless


expressly indicated not to be) and have a 1.
Definition XI.1. Let R be any ring and x a symbol (distinct from any element
of R). We let x^i, for i ∈ N, be a new symbol and we postulate that the symbols
x^0, x^1, x^2, . . . are linearly independent over R. (Of course, we think of them as
powers of x, but what really is a power of a symbol???) Then x is an indeterminate
over R. We abbreviate x^1 to x and identify x^0 with 1R .
A polynomial f (x) in x with coefficients in R is an infinite series f (x) =
Σ_{i=0}^∞ ri x^i in which almost all coefficients ri are zero. In other words, only finitely
many coefficients are allowed to be nonzero.
The collection of all these polynomials is denoted R[x] and called the polynomial
ring in x over R.
We consider two such expressions equal,
Σ_{i=0}^∞ ri x^i = Σ_{i=0}^∞ ri′ x^i,
if and only if we have ri = ri′ for all i. Note that for large i this is automatic since
eventually all coefficients are zero.
Given a polynomial Σ_{i=0}^∞ ri x^i, there is a largest index d for which rd is nonzero,
and this index d we call the degree deg(f ) of the polynomial. If d = deg(f ) then
we usually write r0 + r1 x + . . . + rd x^d instead of Σ_{i=0}^∞ ri x^i, and call rd the leading
coefficient lc(f ) of f (x).
We add polynomials degree by degree:
Σ_{i=0}^∞ ri x^i + Σ_{i=0}^∞ ri′ x^i = Σ_{i=0}^∞ (ri + ri′) x^i.

We multiply them according to x^i x^j = x^{i+j} and extend by requiring linearity.
It is easy to see that these two operations make R[x] into a commutative ring,
with zero element 0x^0 + 0x^1 + 0x^2 + . . . and 1-element 1x^0 + 0x^1 + 0x^2 + . . ..
Remark XI.2.
• deg(f g) = deg(f ) + deg(g) and deg(f + g) ≤ max(deg(f ), deg(g)).
• lc(f g) = lc(f ) · lc(g). That implies that if R is a domain then also R[x] is
a domain, since f g = 0 implies lc(f ) lc(g) = 0.

1. Euclidean rings
Definition XI.3. A domain R has a Euclidean measure if there is a function
δ : R \ {0} → N that satisfies
(1) δ(a) ≤ δ(ab) for all a ≠ 0 ≠ b in R;

(2) if a, b ∈ R are given, there is an oracle that finds q, r ∈ R with a = bq + r


and either r = 0 or δ(b) > δ(r).
Example XI.4. We already know some examples like this.
• R = Z, with Euclidean measure δ(n) = |n| the absolute value. This works
because for any a ∈ Z and 0 ≠ b ∈ Z there is some multiple qb of b with
q ∈ Z such that |a − qb| is less than |b|.
• If R is a polynomial ring over a field, we can take δ(f ) = deg(f ). This
works because division of a by b leaves a remainder r of
degree less than deg(b).
• As you will check in HW, the ring Z[√−1] is also equipped with a Eu-
clidean measure, δ(a + b√−1) = a^2 + b^2 .
Theorem XI.5. A domain R with Euclidean measure is a Euclidean ring (has
a Euclidean algorithm).
Proof. Let δ be a Euclidean measure on the domain R and pick a, b ∈ R.
If b = 0 there is nothing to do, since gcd(a, 0) = a. If b ≠ 0, according to the
definitions, there are q, r ∈ R with a = bq + r and either r = 0 or δ(r) < δ(b).
Let inductively a0 = a, b0 = b, q0 = q, r0 = r. For each index i for which
ai , bi , qi , ri have already been found with bi ≠ 0, define ai+1 = bi , bi+1 = ri and
choose qi+1 , ri+1 so that ai+1 = qi+1 bi+1 + ri+1 with either ri+1 = 0 or δ(ri+1 ) <
δ(bi+1 ).
Note that this scheme is set up so that gcd(ai , bi ) = gcd(ai − qi bi , bi ) = gcd(ri , bi ) =
gcd(bi+1 , ai+1 ). So, the gcd of a, b is the same as that of ai , bi for all i.
Since δ(bi ) > δ(ri ) = δ(bi+1 ), the sequence {δ(bi )} is strictly descending. But
these are all natural numbers (since δ can only have natural output by definition).
This seems to be a contradiction, since no eternally strictly descending chain
of natural numbers can exist. The only way out is that at some point bi was zero,
since then we would not try to go another round.
Now bi = 0 means ri−1 = 0 and so ai−1 = qi−1 bi−1 . But then clearly
gcd(ai−1 , bi−1 ) = bi−1 = ri−2 and we have found the gcd of a, b using repeatedly
the oracle of the Euclidean measure. □
Note that one can now use back substitution from ai−2 = qi−2 bi−2 + ri−2 to
rewrite ri−2 as linear combination of ai−3 , bi−3 and then of ai−4 , bi−4 , etc, and
finally as linear combination of a and b.
Corollary XI.6. If R has a Euclidean measure then for all a, b ∈ R one can
find x, y ∈ R such that gcd(a, b) = ax + by.
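For R = Z, the back substitution is the classical extended Euclidean algorithm. A minimal Python sketch:

def extended_gcd(a, b):
    # returns (g, x, y) with g = gcd(a, b) and g = a*x + b*y
    if b == 0:
        return (a, 1, 0)
    g, x, y = extended_gcd(b, a % b)
    return (g, y, x - (a // b) * y)

print(extended_gcd(240, 46))  # (2, -9, 47): indeed 2 = 240*(-9) + 46*47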
Example XI.7. Let's find gcd(a := 3x^2 + 4x + 3, b := 4x^2 + 2x + 4) in Z/5Z[x].
(Strictly speaking, I should write bars over every number, but maybe we can live
without that for a moment.)
We have (keep in mind that 2 · 3 = 1 and 4 · 4 = 1 modulo 5, so 3/4 = 3 · 4 = 2):
a = (3/4) b + (3x^2 + 4x + 3 − 2 · (4x^2 + 2x + 4)) = (3/4) b + (−5x^2 + 0x − 5),
and modulo 5, −5 = 0. So actually, a = (3/4) b = 2b, and gcd(a, b) = b up to
units. This is a warning that in modular
arithmetic it is not easy to see whether two polynomials are multiples of one another.

Example XI.8. Let’s do one that is a bit more thrilling. Let’s compute gcd of
x10 − 1 and of x6 − 1 in Q[x].

x10 − 1 = x4 (x6 − 1) + (x4 − 1).


x6 − 1 = x2 (x4 − 1) + (x2 − 1).
x4 − 1 = x2 (x2 − 1) + (x2 − 1),
and so
x4 − 1 = x2 (x2 − 1) + 1(x2 − 1) = (x2 + 1)(x2 − 1) + 0.
So, gcd(x10 − 1, x6 − 1) = gcd(x6 − 1, x4 − 1) = gcd(x4 − 1, x2 − 1) = gcd(x2 − 1, 0) =
x2 − 1.
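Computations like these are mechanical enough to automate. Below is a Python sketch of the polynomial Euclidean algorithm over Z/pZ with p prime; polynomials are lists of coefficients with the constant term first (all conventions ours). Over Q one would use exact fractions instead; run modulo 7, it reproduces the answer x^2 − 1 of the last example.

def trim(f):
    # drop trailing zero coefficients
    while f and f[-1] == 0:
        f.pop()
    return f

def polymod(a, b, p):
    # remainder of a under division by b, coefficients in Z/pZ (p prime)
    a = trim(a[:])
    inv = pow(b[-1], p - 2, p)  # invert the leading coefficient of b
    while len(a) >= len(b):
        q = (a[-1] * inv) % p
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[i + shift] = (a[i + shift] - q * c) % p
        a = trim(a)
    return a

def polygcd(a, b, p):
    a, b = trim(a[:]), trim(b[:])
    while b:
        a, b = b, polymod(a, b, p)
    return a

f = [-1 % 7] + [0] * 9 + [1]  # x^10 - 1
g = [-1 % 7] + [0] * 5 + [1]  # x^6 - 1
print(polygcd(f, g, 7))  # [6, 0, 1], i.e. x^2 + 6 = x^2 - 1 mod 7
print(polygcd([3, 4, 3], [4, 2, 4], 5))  # [4, 2, 4]: a = 2b, so gcd = b (Example XI.7)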
In Euclidean rings, a lot is like in Z.
Theorem XI.9. In Euclidean rings, all ideals are principal.
Proof. This goes parallel to the proof of Theorem X.16, where we really
used only that Z has a Euclidean algorithm. The main idea (as shown there)
is that for any sequence a1 , a2 , . . . of generators of an ideal we have ⟨a1 , a2 , . . .⟩ =
⟨gcd(a1 , a2 ), a3 , a4 , . . .⟩ = ⟨gcd(a1 , a2 , a3 ), a4 , a5 , . . .⟩ = · · · and that δ(a1 ) ≥ δ(gcd(a1 , a2 )) ≥
δ(gcd(a1 , a2 , a3 )) ≥ · · · .
This decreasing sequence has to level off, since it can't go down indefinitely. It
follows that from some index i onwards, δ(gcd(a1 , . . . , ai )) = δ(gcd(a1 , . . . , aj )) for
all j > i. This in turn implies that ai+1 , . . . are all divisible by g = gcd(a1 , . . . , ai ).
But then the ideal generated by all the ak is the same as the ideal of just a1 , . . . , ai , and
this is just the multiples of g. □
Theorem XI.10. In a Euclidean ring, “prime” and “irreducible” are the same
concepts.
Proof. Recall that prime things are always irreducible. So we need to show
that irreducible elements are prime. Let p ∈ R be irreducible. Take a product ab
that is a multiple of p. Suppose p does not divide a, and try to show it must divide
b.
Let g = gcd(a, p) and find with the Euclidean algorithm x, y ∈ R with ax + py =
g. Since g is the end product of the Euclidean algorithm of a, p, we can say that
δ(p) > δ(g). As g does divide p we can find h ∈ R with gh = p. If h is a unit,
g = ph^{−1} and so p would divide g, but then p would also divide a, which we know
to be false. So, h is not a unit, but then irreducibility of p implies that g is a unit.
Then g = ax + py gives b = bg −1 g = bg −1 (ax + py) = g −1 (abx + pby) is a multiple
of p (since ab is). But that is what we wanted to show. 
Example XI.11. The ring of polynomials with integer coefficients R = Z[x] is
not Euclidean.
Proof. On the face of it, this seems almost unprovable, since we are required to
show that one cannot put a Euclidean measure on R. The strategy is therefore to say
“if R were Euclidean, it should have some properties that follow from Euclideanness,
and maybe we can find one such property that R does not have”.
Above we proved that in Euclidean rings all ideals are principal. So if we find
an ideal in Z[x] that is not principal, R can’t be Euclidean. Let’s look at the ideal

generated by 2 and x, I = {2a + xb | a, b ∈ R}. Let us assume for the moment that I
is principal, generated by the polynomial f . So that means that 2 is a multiple of
f and also x is a multiple of f ,
2 = f g, x = f h,
with g, h ∈ R. Plugging x = 0 into the second equation, 0 = f (0)h(0) and so one of
f (0) and h(0) has to be zero. Plugging x = 0 into the first equation, 2 = f (0)g(0),
and this says that f (0) is not zero, hence h(0) = 0. But then h is a multiple of x,
h = xk with k ∈ R. Together then, x = f h = f kx says that 1 = f k after dividing
out x. That says that the ideal ⟨f ⟩ of multiples of f contains 1. Since we labor
under the belief that ⟨2, x⟩ = ⟨f ⟩, 1 should be a linear combination of 2 and x,
1 = 2a + xb with a, b ∈ R. Then evaluation at x = 0 gives 1 = 2a(0), which is not
possible since a(0) is an integer.
It follows that ⟨2, x⟩ is not principal and so R cannot have a Euclidean algorithm
and thus cannot have a Euclidean measure. □

Remark XI.12. A domain in which every ideal is principal is called a principal


ideal domain, PID for short. We have seen that Euclidean rings (ER for short)
are PIDs. And we have seen that in Euclidean rings the notion of primeness and
of irreducibility agree. This can be used to show that in a Euclidean ring every
element has a decomposition into prime factors, just like in Z. Such rings are called
unique factorization domains, UFD for short. As it turns out, a PID always has
the UFD property, so there is a sequence of implications
[R is an ER] ⇒ [R is a PID] ⇒ [R is a UFD] ⇒ [R is a domain].
One can show that each implication is strict, so there are domains that are not
UFDs, and there are UFDs that are not PIDs, and there are PIDs that are not
ERs.
Recall that we called a norm on a ring R a function N : R → N for which N (rs) =
N (r) · N (s), and [N (r) = 0] ⇔ [r = 0], and [N (r) = 1] ⇔ [r is a unit].
Example XI.13. (1) The ring Z[√−5] is a domain but not a UFD. To
see this, note that it has a multiplicative norm function given by complex absolute
value, squared: N (a + b√−5) = a^2 + 5b^2 .
Now 2 · 3 = (1 + √−5)(1 − √−5) are two different factorizations of the number
6. Note that N (2) = 4, N (3) = 9, N (1 ± √−5) = 6, and we use this to show that
the factors 2, 3, 1 ± √−5 are irreducible.
For example, if 2 could be factored, 2 = rs with r, s ∈ Z[√−5], then 4 =
N (r)N (s). The point of the norm is that we are now down to integer arithmetic
only. So, either N (r) = 1, or N (r) = 4, or N (r) = 2. The last case is impossible
since no expression a^2 + 5b^2 can ever be 2 (with integer a, b). On the other hand,
N (a + b√−5) = 1 implies b = 0 and a = ±1. So the only factorization of 2 is as
product of ±1 with ±2. So 2 is irreducible.
For 3, 1 ± √−5 the calculations are similar (see HW).
(2) The ring Z[x] is not a PID by the example above; we'll prove the UFD
property below.
(3) The ring Z[(1 + √−19)/2] is a PID but not a Euclidean ring. That this is
so is a bit out of the realm of this course: you need to know a bit of what is called
number theory.

Definition XI.14. If R is a domain, then its ring of fractions is the ring whose
elements are fractions of the form f /g with f, g ∈ R but g nonzero. Addition and
multiplication are exactly as you would think.
So, for example, the ring of fractions of the domain Z is the ring of rational
numbers, and the ring of fractions of the polynomial ring R[x] is the ring of rational
functions with real coefficients.
Let us note that in a ring of fractions, f /g has inverse g/f unless f = 0. This
means that a ring of fractions of a domain is actually a field. Note also that there is
an inclusion of rings of R into its ring of fractions that sends f ∈ R to the fraction
f /1R . This is the natural generalization of the inclusion Z ,→ Q via z 7→ z/1.
The notion of a ring of fractions comes up in the proof of the next result.
Theorem XI.15 (The Gauß Lemma). If R is a domain and has unique fac-
torization, then so does R[x].
Proof. The idea is as follows. Let K be the ring of fractions of R. Then we
have an inclusion R[x] ,→ K[x] that is a ring morphism. Given f (x) a polynomial in
R[x], we can now also read it as a polynomial in K[x]. But as K is a field, we have
shown that K[x] is Euclidean, and therefore a UFD. So, in K[x] we can uniquely
factor f (x) = g1 (x) · · · gk (x) where each gi (x) is a polynomial in x with coefficients
in K, and no gi (x) can be factored further in K[x].
The question is how to translate this back into R[x]. The problems are: first
off, no gi (x) might be in R[x] (because of the fractions in the coefficients); secondly,
if we ever manage to make a translation, why is the resulting factorization for f (x)
in R[x] unique?
Skipping all of the details, the main part of the work consists now in showing
that one can rearrange the denominators in the various gi (x) such that after the
rewriting all factors have coefficients in R. In other words, if a product of poly-
nomials with fractional coefficients only has "whole" coefficients, then one can rewrite
it to a factorization with whole coefficients in each factor. For example, we can take
x^2 − 1 in Z[x] and rewrite it in Q[x] as (2x + 2)((1/2)x − 1/2), but by moving the
factor 2 around we can also rewrite to x^2 − 1 = (x + 1)(x − 1).
The official statement to be proved is:
Lemma XI.16. If f ∈ R[x] can be factored as f (x) = g(x)h(x) with g, h ∈ R[x],
then any prime element p ∈ R that divides f coefficient by coefficient, must divide
one of g or h coefficient by coefficient.
The proof of the lemma proceeds by an iterated induction on the degrees of
f, g, h.
With the lemma in hand one can prove that a factorization of f in K[x] always
yields a related factorization in R[x]. Uniqueness is then rather easy. You might
look at the proof of the Gauß Lemma in any textbook, if you are curious. □
CHAPTER XII

Divisibility, Field Extensions

1. Divisibility
Let R be a commutative ring with 1, and take an element f (x) in the polynomial
ring R[x]. For every r ∈ R there is an evaluation morphism
εr : R[x] → R
that sends f (x) to the element f (r) in R. It is immediately clear that if a polynomial
f (x) is a multiple of x − r then εr (f ) = 0 simply because εr (x − r) = 0. So, the
kernel of εr contains at least all multiples of x − r (that is, the ideal generated by
x − r).
It turns out that this kernel is precisely the ideal generated by x − r. The
argument is the following. Write f (x) = a0 + a1 x + . . . + ad xd , d the degree of f ,
and suppose εr (f ) = 0. Since εf (x − r) = 0 as well, then for arbitrary g(x) ∈ R[x]
we also have εr (f (x) − g(x) · (x − r)) = 0, since we can do the plug-in process
separately in the two polynomials.
Let’s pick a g1 (x) in such a way that f1 (x) := f (x) − g1 (x) · (x − r) has degree
less than d. By construction, εr (f ) = εr (f1 ). Now repeat: find g2 (x) such that
f2 (x) := f1 (x) − g2 (x) · (x − r) has degree less than deg(f1 ). Keep going. At the
end of the day, this must stop, because when you found a fk (x) that is constant,
you can’t keep the iteration going.
We have εr (f ) = εr (f1 ) = εr (f2 ) = . . . = εr (fk ) and that fk (x) is a constant.
But as a constant, pluggin in has no effect. So εr (f ) = fk . This says that the
remainder that you get when you divide f (x) by x − r (that is hat fk (x) really is!)
is precisely the value of f (x) at input x = r.
Lemma XII.1. Let f (x) ∈ R[x] and choose r ∈ R. The value f (r) is the
remainder of division of f (x) by x − r.
Going back to the kernel of our morphism εr , this lemma says: f (x) ∈
ker(εr ) happens if and only if f (x) has remainder zero when dividing by x − r. But
the latter statement is just a euphemism for "f (x) is a multiple of x − r". So,
ker(εr ) = R[x] · (x − r).
Definition XII.2. If f (x) ∈ ker(εr ) we call r a root of f (x) in R.
Roots can be funny.
Example XII.3. (1) The roots of x^2 − 1 in Z/12Z are 1 + 12Z, 5 + 12Z, 7 +
12Z, 11 + 12Z. So a degree 2 polynomial can have more than 2 roots. The culprit
is the fact that Z/12Z is not a domain. Note that this is also reflected in possible
factorizations: x^2 − 1 = (x − 1)(x + 1) = (x − 5)(x − 7) in Z/12Z.
(2) (x + 1)2 has only one root, -1, but with multiplicity two.

(3) The roots of x2 − 1 in Z/2Z are 1 and 1 again, since (x − 1)2 = x2 − 1


modulo 2. So 1 is a double root.
(4) x2 + 1 has no roots in Q.
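Root-counting in Z/nZ is a one-liner; for (1) above, in Python:

n = 12
print([a for a in range(n) if (a * a - 1) % n == 0])  # [1, 5, 7, 11]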
Theorem XII.4. If R is a domain, then any polynomial f (x) has at most
deg(f ) roots in R, even when counting with multiplicity.
Proof. If r1 is a root of f (x) then we can write f (x) = (x − r1 ) · f1 (x) because
of the lemma above. Iterate this to get that f (x) = (x − r1 )(x − r2 ) . . . (x − rk )fk (x),
and we can keep going with this until fk (x) does not have any roots in R.
Now suppose that perhaps f (x) has yet another root r. Then x − r divides
f (x), and so it divides (x − r1 )(x − r2 ) . . . (x − rk )fk (x). At this point we use that
polynomial rings over fields are Euclidean rings, and in particular are UFDs. The
way we use this is: x − r can't be a product of two polynomials (since its degree
is 1) except for products of the form (unit of R) times (x − r divided by that unit).
So, x − r is irreducible. Since R[x] is a UFD, x − r is prime. As x − r is prime
and divides the product (x − r1 )(x − r2 ) . . . (x − rk )fk (x), it must divide one of
the factors of this product. The factor it divides can't be fk (x), because then fk (x)
would have root r, but we agreed that fk (x) has no root. So, x − r divides some
x − ri . In other words, r = ri . □

In a while, we will go and try to manufacture new fields from old. As a stepping
stone we need to know when a polynomial is irreducible. We will be mainly concerned
with R = Z and Z/pZ with prime p ∈ Z. Note that over R = Z/pZ we can
actually go and test all elements of the field on whether they are roots, as there are
finitely many things to test. Over Q that is much harder. Here is a basic test for
irreducibility.
Lemma XII.5. Let f (x) ∈ Z[x] be given, and assume that the coefficients of
f (x) have no common factor. If you can find a prime number p ∈ Z such that
f (x) mod p is irreducible and of the same degree as f , then f is irreducible in Z[x]
and even in Q[x].
Proof. The Gauß Lemma says that if we can show that f (x) is irreducible in
Z[x] then it is also irreducible in Q[x]. So we focus on irreducibility in Z[x].
Suppose f = gh with g, h ∈ Z[x]. Then take this equation and reduce modulo
p to get f (x) mod p = (g(x) mod p)(h(x) mod p). Since f (x) mod p is supposed
to be irreducible, this new equation must have one of g mod p, h mod p be a unit.
But units must have degree zero. So between g(x) mod p and h(x) mod p, one
has degree zero. However, that means that the other factor must have degree
deg(f mod p) = deg(f ), and so one of g or h themselves has degree deg(f ). That
now means that the other one of g, h has degree zero and so is an integer.
We have shown that f (x) can only be factored in Z[x] as (integer) times (poly-
nomial of degree deg(f )). Since the coefficients of f have no common factor by
hypothesis, the integer factor is a unit and we are done. □

Remark XII.6. The lemma says that irreducibility “lifts” from Z/pZ[x] to
Z[x]. It is not true that reducibility also lifts. Many polynomials are reducible
modulo p but irreducible over Z. There are even polynomials in Z[x] that are
irreducible but become reducible modulo every prime p. You will work through
one such example (f (x) = x4 + 1) in the homeworks.

Moreover, it is pretty complicated to predict whether the reduction modulo some
prime p of a polynomial f (x) will give you something irreducible. For example,
x^2 + 2 is irreducible over Z (since √−2 is not an integer) and does not factor modulo
5, but does factor over Z/2Z since there it is just x^2. But there are much more
"non-obvious" factorizations, as the above-mentioned homework problem shows.
It would be good to have a test for irreducibility based on reduction modulo a
prime where one can see right away that it will work.
Theorem XII.7 (Eisenstein Criterion). Let f (x) = a0 + . . . + ad x^d ∈ Z[x] be of
degree d with no common factor in the coefficients. If there is a prime p ∈ Z with
(1) p ∤ ad ,
(2) p | ai for i = 0, . . . , d − 1,
(3) p^2 ∤ a0 ,
then f is of degree d in Z/pZ[x], and irreducible in Z[x] and Q[x].
Proof. Let's assume that f = gh in Z[x], with both factors of positive degree,
and find a contradiction.
Since p ∤ ad , the degree of f mod p is also d. In fact, since p divides all
coefficients except the top one, f (x) mod p = ad x^d mod p. So, in Z/pZ[x], which
is a Euclidean domain, and thus a UFD, the only factorizations of f mod p =
(g mod p)(h mod p) are (up to units) of the type g = x^{d−k} mod p, h = x^k mod p
with 0 < k < d. So, there are α, β ∈ Z[x] with g = x^{d−k} + pα, h = x^k + pβ (after
scaling by units). But this implies that the constant term of f = gh is twice divisible
by p, hence by p^2 . And there is our contradiction.
So f is irreducible over Z[x]. Irreducibility over Q[x] follows now from the Gauss
Lemma. □
Example XII.8. Let f (x) = xn − p. It fits the Eisenstein conditions, so is
irreducible over Z[x] and Q[x].
Definition XII.9. For n ∈ N let Φn (x) ∈ Z[x] be the n-th cyclotomic poly-
nomial, defined as the factor of x^n − 1 that does not appear in x^m − 1 for any
m < n.
Example XII.10. (1) x^1 − 1 = (x − 1) and Φ1 (x) = x − 1.
(2) x^2 − 1 = (x − 1)(x + 1) and Φ2 (x) = x + 1.
(3) x^3 − 1 = (x − 1)(x^2 + x + 1) and Φ3 (x) = x^2 + x + 1.
(4) x^4 − 1 = (x − 1)(x + 1)(x^2 + 1) and Φ4 (x) = x^2 + 1.
(5) x^5 − 1 = (x − 1)(x^4 + x^3 + x^2 + x + 1) and Φ5 (x) = x^4 + x^3 + x^2 + x + 1.
(6) x^6 − 1 = (x − 1)(x + 1)(x^2 + x + 1)(x^2 − x + 1) and Φ6 (x) = x^2 − x + 1.
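The definition can be turned directly into a computation: divide x^n − 1 by the cyclotomic polynomials of all proper divisors of n. A Python sketch (integer coefficient lists, constant term first; conventions ours):

def polydiv(num, den):
    # exact division of integer polynomials, constant term first
    num = num[:]
    q = [0] * (len(num) - len(den) + 1)
    for shift in range(len(num) - len(den), -1, -1):
        c = num[shift + len(den) - 1] // den[-1]
        q[shift] = c
        for i, d in enumerate(den):
            num[shift + i] -= c * d
    assert all(x == 0 for x in num)  # here the division is exact
    return q

def cyclotomic(n):
    f = [-1] + [0] * (n - 1) + [1]  # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            f = polydiv(f, cyclotomic(d))
    return f

print(cyclotomic(6))   # [1, -1, 1]       <->  x^2 - x + 1
print(cyclotomic(12))  # [1, 0, -1, 0, 1] <->  x^4 - x^2 + 1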
If p is prime, then x^p − 1 = (x − 1)(x^{p−1} + x^{p−2} + . . . + x + 1) and it turns out
that the second factor is irreducible, and thus
Φp (x) = x^{p−1} + x^{p−2} + . . . + x + 1 if p is prime.
To see this, note that (x^p − 1)/(x − 1) becomes under x = y + 1 the quotient
((y + 1)^p − 1)/y, and the binomial theorem says that
((y + 1)^p − 1)/y = y^{p−1} + cp,1 y^{p−2} + . . . + cp,k y^{p−k−1} + . . . + cp,p−1 ,
with cp,k the binomial coefficients discussed in the lemma below; note that the constant
term is cp,p−1 = p, divisible by p but not by p^2. We will check in a minute that cp,k is
always divisible by p for 0 < k < p. That means that ((y + 1)^p − 1)/y satisfies the
conditions of the Eisenstein test, and must be irreducible. But then (x^p − 1)/(x − 1)
is also irreducible, because you get one from the other by a linear substitution (that
can be done backwards).
Lemma XII.11. For natural 0 < k < p, the binomial coefficient cp,k := (p choose k) is a multiple of p.


Proof. You learned in discrete math that the number cp,k is the number of
ways to pick k things from p given ones, and that there is an explicit formula
cp,k = p!/(k! (p − k)!). In particular, the numerator is a multiple of p. It then suffices to
show that the denominator is not a multiple of p (since p is prime!). To see this,
note that k and p − k are both less than p, and so neither k! nor (p − k)! has a
factor divisible by p. Since p is prime, and p divides neither factor, it also does not
divide the product. □

2. Making new fields from old


Recall that a field is a ring in which every nonzero element has an inverse. Ex-
amples include Q, R, C, Z/pZ for p prime, and more fancy ones like R(x), the ring
of all rational functions in x with real coefficients. (A function is rational if it is
the quotient of two polynomials.)
We have seen that if F is a field then F[x] is a Euclidean domain, the Euclidean
measure being degree of the polynomial.
Definition XII.12. If F is a field and f (x) ∈ F[x] is an irreducible polynomial,
then the quotient ring F[x]/⟨f ⟩ is the Kronecker extension of F by f . We denote it
Kron(f, F).
Lemma XII.13. If f ∈ F[x] is irreducible, Kron(f, F) is actually a field.
Proof. The point is that we need to show that every nonzero element of
Kron(f, F) has an inverse. Recall that F[x] is Euclidean. Take g ∈ F[x]. If the coset
of g in Kron(f, F) is nonzero, g is not a multiple of f . If in addition f is irreducible,
then gcd(f, g) = 1. So in this case, we can write af + bg = 1 for some a, b ∈ F[x].
This equation in Kron(f, F) means that b is the inverse of g in Kron(f, F). 
Example XII.14. If F = R and f (x) = x^2 + 1 then F[x]/⟨f ⟩ is, as a real vector
space, spanned by the class of the constant polynomial 1 and the class of the polyno-
mial x in F[x]/⟨f ⟩. Since x^2 + 1 = 0 in this quotient, we have x^2 = −1. So, we can
identify the coset a + bx in F[x]/⟨f ⟩ with the complex number a + b√−1, and this
identification preserves addition and multiplication. It is thus a ring isomorphism.
Another way to make new fields is as follows.
Definition XII.15. Let F ⊆ E be fields, and pick β ∈ E. Then F(β) is defined
to be the smallest field that contains F and β. It is also the intersection of all fields
that contain F and β. Clearly, F(β) is inside E.
Theorem XII.16 (Kronecker Extension Theorem). Let f ∈ F[x] be an irre-
ducible polynomial.
The Kronecker extension Kron(F, f ) = F[x]/⟨f ⟩ is always a field (provided that
f is irreducible). If viewed as a vector space over F, its dimension is the degree of
f.
If any extension field E ⊇ F contains a root β of f (x) then the smallest field
F(β) inside E that contains both F and β is isomorphic to Kron(F, f ).
Proof. Write R for Kron(F, f ). For the first part we need to show that this
ring is a domain, and that every nonzero element has an inverse.
Since F is a field, F[x] is a Euclidean ring. In particular, it is a UFD and so
"prime" is the same as "irreducible". So f is a prime element in F[x]. Now suppose
g, h ∈ F[x] are such that their cosets are zerodivisors in R. That means that
gh ∈ ⟨f ⟩, so that gh = αf for some α ∈ F[x]. But then gh is divisible by f and since
f is prime, f must divide one of them, say f | g. But then the coset of g is 0 in R,
so R has no zerodivisors and is a domain.
Now take g ∈ F[x] with nonzero coset in R and look for an inverse. Since this
coset is nonzero, f can't divide g. Since F[x] is a Euclidean ring, the gcd of f, g is a
linear combination of f, g. But this gcd is 1 since f is irreducible. Thus there are
polynomials a(x), b(x) with f (x)a(x) + g(x)b(x) = 1. Read this modulo ⟨f ⟩ to get
g · b = 1 in R. So b is an inverse to g.
From the definition, it is clear that Kron(F, f ) has a basis given by 1, x, . . . , x^{deg f −1}.
Now we prove the last claim. Let us make a ring morphism from F[x] to F(β)
by sending x to β and any element of F to itself. The kernel is the polynomials p(x)
for which p(β) = 0. These are the multiples of the minimal polynomial of β. This
minimal polynomial is a divisor of f (x) and since f (x) is irreducible, this minimal
polynomial is f (x) itself.
It follows that we can actually make a ring morphism from Kron(F, f ) to F(β)
by sending the coset of x to β and all elements of F to themselves.
As we know, Kron(F, f ) is a field, and so is by definition F(β). So we have
one field contained in another, which makes the bigger field a vector space over the
smaller one. The bigger one is generated over F by 1, β, β^2, . . . , β^{deg f −1}, and each
of these is in the image of the morphism, x^i ↦ β^i . So, the morphism is injective
and surjective, hence an isomorphism. □

Example XII.17. Let F = Z/3Z and choose f (x) = x^2 + x − 1. We study the
Kronecker extension R = F[x]/⟨f ⟩.
Denote α the coset of x in R. Then α^2 + α − 1 = 0 in R. So, powers of α higher
than the first power can be replaced by lower powers. So, R is a Z/3Z-vector space
spanned by 1 and α. So the elements of R are
0, 1, 2, α, 1 + α, 2 + α, 2α, 1 + 2α, 2 + 2α.
Moreover, α^2 + α − 1 = 0 in R means that the polynomial y^2 + y − 1 has a root
in R. (This is built into the construction for any Kronecker extension.) One could
ask "what is the other root?". Let's find it. We do long division of y^2 + y − 1 by
(y − α). The answer is: y^2 + y − 1 = (y − α)(y + α + 1). (You might want to check
this: (y − α)(y + α + 1) = y^2 + y(−α + α + 1) + (−α)(α + 1). The linear term is fine,
and for the constant term observe that it equals −α^2 − α. But as α^2 + α − 1 = 0,
this constant term is −1.)
So, if you make a Kronecker extension, a previously irreducible polynomial will
have at least one root. This says that you can (somewhat artificially) make bigger
fields in which your favorite polynomial splits completely into linear terms. That
is a topic of a future lecture. For now, more examples.
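Here is a machine check of the example just done, a Python sketch with elements of R written as pairs (c0, c1) standing for c0 + c1·α (our convention):

p = 3

def kmul(u, v):
    # multiply (u0 + u1*alpha)(v0 + v1*alpha) using alpha^2 = 1 - alpha = 1 + 2*alpha
    c0 = u[0] * v[0]
    c1 = u[0] * v[1] + u[1] * v[0]
    c2 = u[1] * v[1]
    return ((c0 + c2) % p, (c1 + 2 * c2) % p)

def kadd(u, v):
    return ((u[0] + v[0]) % p, (u[1] + v[1]) % p)

def f(z):
    # evaluate y^2 + y - 1 at z
    return kadd(kadd(kmul(z, z), z), (-1 % p, 0))

alpha = (0, 1)
other = (2, 2)  # the second root -alpha - 1 = 2 + 2*alpha mod 3
print(f(alpha), f(other))  # (0, 0) (0, 0)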
Example XII.18. Let F = Q and choose f (x) = x3 − 2. Note that we need no
fancy theorems to let us know that this is an irreducible polynomial: the cubic root
of 2 is not a rational number (you can carry out the same proof as for irrationality
of the square root of 2) and so f (x) has no linear factor. But it’s cubic, so if it
factors at all it must have a linear factor.
As f (x) is irreducible, R = F[x]/⟨f ⟩ is a field. Denote again the coset of x in
R by α. Then in R we have a linear dependence 1 · α^3 + 0 · α^2 + 0 · α^1 − 2 · α^0 = 0,
which allows us to rewrite all powers of α in terms of just second, first, and zeroth
powers. So, R is, as a vector space over Q, spanned by 1, α, α^2 .
Supposedly, f has a root in R called α. Let's find the other factor: divide y^3 − 2
by y − α. We find a quotient of y^2 + αy + α^2 .
It is entirely reasonable to ask whether this quadratic splits further. In other
words, does f have a further root in R? To find out, pick an element of R; it will
look like aα^2 + bα + c with rational numbers a, b, c. Now take this and plug it in for
y into y^2 + αy + α^2 . After sifting through the mess, you find that you obtained
(using that α^3 = 2 of course)
α^2 (b^2 + 2ac + b + 1) + α(2a^2 + 2bc + c) + 1 · (c^2 + 4ab + 2a).
We are asking whether this can be zero for suitable choices of a, b, c ∈ Q. This is
here not totally easy, and in general (other Kronecker extensions) can be extremely
hard.
Here we can argue as follows: if the displayed expression is zero, then the three
expressions b^2 + 2ac + b + 1, 2a^2 + 2bc + c = 2a^2 + c(2b + 1), c^2 + 4ab + 2a = c^2 + 2a(2b + 1)
are all zero. But then 2a^2 = −c(2b + 1) and c^2 = −2a(2b + 1) gives c/(2a) = 2a^2/c^2 . So
(2a/c)^3 = 2, and as we know there are no rational numbers a, c such that (2a/c)^3 = 2.
It follows that the quadratic does not have any further roots in R.
CHAPTER XIII

Splitting fields and extension towers

Definition XIII.1. If f ∈ F[x] is a polynomial over F then a splitting field for
f is any field extension E ⊇ F such that f splits as a product of linear polynomials
over E.
We let Split(f ) stand for any splitting field of f that is minimal with this
property.
Let us prove that such things exist:
Theorem XIII.2. For any field F and any f ∈ F[x], splitting fields exist.
Proof. If f factors over F, factor it as much as you can. Then you are left with
proving that you can split any irreducible polynomial. So assume from the start
that f is irreducible.
Let E be the Kronecker extension Kron(F, f ). Then we know that f has a root
in E, namely the coset β of x. So you can split off a linear factor from f . Do that,
split what is left as far as you can in E, and repeat the argument. Since the degree
goes down at each step, eventually you will arrive at an extension in which f splits
completely. □
Note that this also proves that Split(f ) exists. However, there are choices being
made in the process and it is not clear right away to what extent these choices
have an impact on the final result. It is a fact that no matter how you construct
Split(f ), each version is isomorphic to any other, and that any such isomorphism
can be arranged to identify the copy of F each splitting field contains. One says
that the various versions of Split(f ) are isomorphic over F.
Note also that the worst case scenario is that we need to make Kronecker
extensions to polynomials of degrees deg(f ), deg(f ) − 1, . . . , 3, 2.
Example XIII.3. If f (x) = x2 − 2 with F = Q, then Split(f ) = Kron(Q, f ).
Indeed, x2 − 2 does not split over Q, so Split(f ) is not Q. On the other hand,
Kron(Q, f ) contains one root β of f , and dividing x2 − 2 by x − β leaves a linear
polynomial. So, f splits over Kron(Q, f ). It follows that we can find a copy of
Split(f ) inside Kron(Q, f ).
On the other hand, as a vector space over Q, Kron(Q, f ) is 2-dimensional, with
basis {1, β}. So Split(f ), another vector space over Q, is wedged between the one-
dimensional Q-vector space Q and the 2-dimensional Q-vector space Kron(Q, f ).
Since we know that Split(f ) can’t be Q, it must be Kron(Q, f ).
This argument works of course for any base field and any irreducible quadratic.
Indeed, if β is a root of x^2 + ax + b then x^2 + ax + b = (x − β)(x + a + β) in
Kron(F, f )[x].
Example XIII.4. As we have seen, the quadratic factor x2 + βx + β 2 = (x3 −
2)/(x − β) remains irreducible over Kron(Q, x3 − 2). So in order to make a splitting

field for x3 − 2 we need first β and then additionally a Kronecker extension that
catches a root of x2 + βx + β 2 .
It is time for the following concept.
Definition XIII.5. If F ⊆ E is an extension of fields, E is a vector space over
F. We denote the vector space dimension of E over F by [E : F] and call it the
degree of the extension.
Clearly, the degree of a Kronecker extension Kron(F, f ) is the degree of the
polynomial f , since Kron(F, f ) has the basis 1, x, x2 , . . . , xdeg f −1 .
Example XIII.6. Kron(Q, x3 − 2) is degree 3 over Q; Split(Q, x3 − 2) is degree
6 over Q; Split(F, f ) is of degree at most (deg(f ))! over F.
Example XIII.7. It is definitely possible for a cubic polynomial to split in a
degree 3 extension (within the first Kronecker extension of the iterative splitting
process).
Let F = Z/2Z and choose f = x^3 + x + 1. Since f has no roots in F (check
that!), it has no linear factors over Z/2Z. Since it is degree 3, it has no nontrivial
factors and is thus irreducible.
Let β be the Kronecker root for f in K := Kron(F, f ). Then β^2 and β^2 + β are
also roots of f inside K. This can be seen by stupidly plugging in: (β^2)^3 + (β^2)^1 + 1 =
β^6 + β^2 + 1 = (β^3 + β + 1)^2 since we are in characteristic 2. But β^3 + β + 1 = 0.
Similarly, (β^2 + β)^3 + (β^2 + β)^1 + 1 = β^6 + 3β^5 + 3β^4 + β^3 + β^2 + β + 1 =
(β^6 + β^2 + 1) + (β^5 + β^3 + β^2) + (β^4 + β^2 + β) (remember that 2 = 0 here!). Each
bracket is zero since it is a multiple of β^3 + β + 1.
bracket is zero since it is a multiple of β 3 + β + 1.


You might wonder how I knew these 2 other roots in the example. One comes
as follows.
Lemma XIII.8. If β is a root of f (x) with coefficients in Z/pZ and the field has
characteristic p, then β p is also a root.
Proof. In characteristic p, freshman’s dream for p-th powers gives for such f
that f (x)p = f (xp ), since each coefficient satisfies cp = c by little Fermat. Plug in
x = β. 

On the third root in the example: if a cubic f (x) has two roots r1 , r2 in some
field, the third root is also in the field, and you can find it by long division: divide
f (x) first by x − r1 and then the quotient by x − r2 . You’ll be left with x − r3 ,
and that is what I did.

1. Roots with multiplicity


Some polynomials, like x2 + 2x−3 = (x+ 3)(x−1), have all their roots distinct.
For polynomials x2 + px + q of degree 2, we know that this happens exactly when
the discriminant p2 − 4q is nonzero. (I am assuming here that we can use the
quadratic formula, which necessitates that 2 ≠ 0 in the ring). So for example,
when p = 6, q = 9 we find that x2 + px + q = x2 + 6x + 9 = (x + 3)2 and one is
prompted to say that −3 is a root of x2 + 6x + 9 of multiplicity two.
For polynomials of higher degree, there is a similar discriminant test; the prob-
lem is that the formula for the discriminant gets impossibly difficult to remember.
For example, for a cubic x3 + px2 + qx + r, the discriminant is 18pqr − 4p3 r + p2 q 2 − 4q 3 − 27r2 .

It is kind of clear that if a polynomial has a factor of the sort (x − r)k , then r
is a root and of multiplicity at least k. So, before one investigates multiplicity, one
should perhaps split the polynomial as far as one can into irreducibles.
As one sees by examples, it is often interesting to take a polynomial over one
ring R and ask for its roots in a bigger ring. For example, we know that we need to
look inside C for roots of x2 + 1.
Some strange things can happen.
Example XIII.9. Let K = Z/pZ. Then xp −1 (which over the complex numbers
has the p different roots of 1 as solutions) is equal to (x−1)p (because of freshman’s
dream in characteristic p). So, it has only one root, x = 1, and that with multiplicity
p.
Stranger yet, there are polynomials that are irreducible and yet have multiple
roots in a suitable larger ring.
Example XIII.10. Let K = Z/pZ(t). So, p = 0 in our ring, t is a variable, and
we are looking at the rational functions in t with coefficients in Z/pZ. (Recall that
“rational function” means “quotient of two polynomials”.)
Now look at the polynomial xp − t. In K, this has no roots because the p-th
root of a variable is not expressible as a quotient of 2 polynomials in that variable.
We will show in a bit that xp − t is also irreducible. Let’s make the field bigger,
say K̃ = Z/pZ(t1/p ), the rational functions with Z/pZ coefficients in the symbol
t1/p , a p-th root of t.
If p = 2, we have (x − t1/2 )(x − t1/2 ) = x2 − 2xt1/2 + t = x2 + t = x2 − t since
2 = 0.
For p = 3 we have (x − t1/3 )3 = x3 − 3x2 t1/3 + 3x(t1/3 )2 − t = x3 − t since 3 = 0.
In general, (x − t1/p )p = xp + p · (stuff) − t, where the middle part is the stuff
that comes from the binomial theorem for i = 1, . . . , p − 1. In all cases then,
xp − t = (x − t1/p )p has a p-fold root in K̃ while it was irreducible over K.
Here is a way for testing whether a polynomial can ever have multiple roots.
The prime in the theorem denotes taking the derivative according to the rules of
calculus: product rule and power rule. (You might ask “What other rules might I
possibly want to use for a derivative, isn’t that a stupid thing to say?”. You are sort
of right. There are no other rules one should ever use. But the fact is that in some
environments, calculus seems like a dubious activity to engage in. For example, in
Z/3Z[x], what could “differentiation” mean? Normally, a derivative is a limit, but
in Z/pZ there are only finitely many “numbers”, so limits are very limited in their
nature. . . )
Theorem XIII.11. Let F be a field and f (x) ∈ F[x]. Then f (x) has a double
root in some (perhaps mysterious) extension field E ⊇ F if and only if gcd(f, f ′ ) is
not 1.
In other words, if f, f ′ are coprime then f has single roots in any extension field.
Proof. If in some extension field E we have (x − r)2 |f (x) (so r ∈ E is a
multiple root) write f (x) = (x − r)2 · g(x). Then taking derivatives, we have
f ′ (x) = 2(x − r)g(x) + (x − r)2 g ′ (x) = (x − r)[2g(x) + (x − r) · g ′ (x)], a multiple
of x − r. Of course, so is f itself, and so x − r divides both f, f ′ and hence must
divide their gcd. This means that a multiple root in an extension field prevents the
gcd of f, f ′ being 1.
Now suppose the gcd of f, f ′ is not 1, or in other words, some g(x) of positive
degree divides both f and f ′ . Then let E be the Kronecker extension of F for
any irreducible factor of g(x). In E, g(x) has the Kronecker root β, and so g(x)
is a multiple of x − β and also f (x) is a multiple of x − β. So we can write
f (x) = h(x)(x − β). Then the derivative of f is f ′ (x) = h′ (x)(x − β) + h(x). Now
plug in x = β. We know that (x − β)|g(x)|f ′ (x), so f ′ (β) = 0. But then h(β) must
also be zero. That says that (x − β) divides h(x), and so f (x) = (x − β)h(x) has
x − β twice as a factor. So β is a double root of f in E. 
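The theorem translates into a very mechanical test. Here is a Python sketch of it (my own code, not from the notes, with polynomials stored as lists of Fractions, lowest degree first); it recovers the two quadratic examples from the start of this section:

```python
from fractions import Fraction as Fr

def deriv(f):
    """Formal derivative by the power rule."""
    return [Fr(i) * c for i, c in enumerate(f)][1:]

def polymod(a, b):
    """Remainder of a divided by b (b nonzero)."""
    a = a[:]
    while len(a) >= len(b) and any(a):
        q = a[-1] / b[-1]
        for i in range(len(b)):
            a[len(a) - len(b) + i] -= q * b[i]
        a.pop()
    return a

def polygcd(a, b):
    """Euclidean algorithm on polynomials over Q."""
    while any(b):
        a, b = b, polymod(a, b)
    return a

# x^2 + 6x + 9 = (x+3)^2 has the double root -3:
f = [Fr(9), Fr(6), Fr(1)]
print(polygcd(f, deriv(f)))   # a degree-1 gcd, so f has a multiple root

# x^2 - 2 has two distinct roots (in an extension):
g = [Fr(-2), Fr(0), Fr(1)]
print(polygcd(g, deriv(g)))   # a nonzero constant, so g is separable
```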
Remark XIII.12. In characteristic zero (when F contains Q) an irreducible
polynomial is relatively prime to its own derivative, because the derivative is a
nonzero polynomial of lower degree, and so cannot have a common divisor with the
irreducible f .
In prime characteristic, the derivative f ′ (x) can be zero without f being a
constant. For example, the polynomial xp − t from the example above has derivative
zero, since (xp )′ = pxp−1 and p = 0. (Note that we take x-derivatives, so (t)′ = 0
as t and x do not relate in that example!) In that case, then, we have gcd(f, f ′ ) =
gcd(f, 0) = f .
Definition XIII.13. A polynomial f (x) with coefficients in the field F is sep-
arable if f does not have multiple roots in any extension field of F. Any other
polynomial is inseparable.
The choice of “separable” indicates that separable polynomials have their roots
“separated out” in any extension: the roots never equal one another. In character-
istic zero, “irreducible” implies “separable”. But in characteristic p, separability is
an actual condition. It is a fact that over a finite field, “irreducible” still implies
“separable”, but in infinite fields of characteristic p one needs to be careful.
CHAPTER XIV

Week 14: Minimal polynomials and finite fields

1. Minimal Polynomials
Recall that a field extension F ⊆ E makes E a vector space over F. (Think of
R ⊆ C). The start of our investigations is based on
Definition XIV.1. If F ⊆ E is a field extension it is called algebraic if for any
α ∈ E the powers 1, α, α2 , . . . of α are linearly dependent over F.
There are many field extensions that are not algebraic. For example, Q ⊆ R
is not algebraic, because the powers 1, π, π 2 , . . . of π have no Q-linear dependence.
Another way to say this is that π does not occur as a root of a polynomial in Q[x]:
π is transcendental. Algebraicity of a field extension indicates that the two fields
are not too far from one another, in a sense to be discussed this week and next.
For now we note the obvious
Theorem XIV.2. If F ⊆ E is algebraic, then for any α ∈ E there is a monic
irreducible polynomial f (x) ∈ F[x] such that f (α) = 0.
Note that previously we started with a polynomial and looked for roots; this is
now the other way round.
Proof. If the powers of α are dependent, there is an expression r0 + r1 α +
· · · + rk αk , with coefficients in F not all zero, that equates to zero. The polynomial
r0 + r1 x + · · · + rk xk is almost what we are looking for. It can be made monic
(have lead coefficient 1) by dividing out the actual lead coefficient; this does not
affect the vanishing at x = α. Finally, factor the result into monic irreducibles
over F: since E is a field and so has no zerodivisors, one of the irreducible factors
must already vanish at x = α, and that factor is the f we want. 

If we have several polynomials that vanish for x = α, then their gcd has the
same property. So, the gcd of all polynomials that vanish at x = α is one such as
well, and clearly the one of lowest degree. (Note: if I is the ideal of all polyno-
mials vanishing at x = α then the generator for this ideal—principal since F[x] is
Euclidean—is the one we want.)
Definition XIV.3. If F ⊆ E is a field extension, and if the powers of α ∈ E are
linearly dependent over F, then the monic polynomial f (x) ∈ F[x] of minimal degree
with f (α) = 0 is the minimal polynomial of α over F and denoted minpolF (α).
Note that one really needs to know F in this definition: minpolR (√−1) = x2 + 1,
but minpolC (√−1) = x − √−1. We note for future purposes that the second
(complex) minimal polynomial divides the first (real).
Definition XIV.4. If F ⊆ E is a field extension and if the powers of α ∈
E are linearly dependent over F then α is algebraic over F. Elsewise we call α
transcendental over F.
Definition XIV.5. If F ⊆ E is a field extension, it is called finite if E is a
finite-dimensional vector space over F.
Lemma XIV.6. A field extension F ⊆ E that is finite is also algebraic.
Proof. If E is a finite-dimensional vector space, then any infinite collection of
elements in E must be linearly dependent, since the size of any linearly indepen-
dent set is a lower bound for the size of any basis (and the size of a basis is the
dimension of the vector space). In particular, the infinitely many powers of α must
be dependent. 
It is time to introduce some notation regarding field and ring extensions.
Definition XIV.7. If R ⊆ S are rings and α is some element of S, then R[α]
denotes the smallest subring of S that contains α and all of R.
If F ⊆ K are fields, and α ∈ K, then F(α) is the smallest field that contains F
and α. (This may be considerably larger than F[α] since it also must contain all
inverses to the elements of F[α]).
Many, but not all, algebraic extensions are finite. For example, if F is any field
and α is a root of some polynomial in F[x], then F(α) (the smallest field that
contains F and α) is finite over F. This is simply because F(α) is the Kronecker
extension Kron(F, g) = F[x]/hgi for g = minpolF (α), which is a vector space of
dimension deg(g) over F, spanned by 1, x, . . . , xdeg(g)−1 .
In fact, since deg(f ) is finite, so is its factorial, and it follows that Split(f, F)
is a finite extension of F.
On the other hand, it is not true that any algebraic extension is also finite. For
example, the field that you get when you start with Q and then throw in all n-th
roots of 2 (n = 2, 3, . . .) is algebraic but not finite. That it is not finite is kind of
believable since whatever finite basis this field might have over Q, this basis can’t
involve all roots of 2. It is more difficult to believe that this extension is algebraic,
because while of course the n-th root of 2 fits the equation xn = 2 (and thus is
algebraic over Q), it is far less clear that unpleasantries such as
(33 · 46√2 + 112 · 17√2 − 641 · 666√2) / (6 · 4√2 − 3352295 · 532√2)
(here n√2 denotes the n-th root of 2)
fit into a polynomial with rational coefficients. It turns out that they indeed do,
and the reason is the following.
Recall that we defined the vector space dimension of E over F as the degree of
the extension, and wrote [E : F]. If one iterates extensions, F ⊆ F′ ⊆ F′′ , we have a
formula
[F′′ : F] = [F′′ : F′ ] · [F′ : F].
This is kind of clear from linear algebra: if F′′ looks like (F′ )r and F′ looks like Fs ,
then F′′ looks like (Fs )r = Frs .
This formula implies that any extension of Q of the form Q[ 2√2, 3√2, . . . , k√2 ] is
still finite over Q. So any element of such an extension is still algebraic
over Q, even if it is often very difficult to find a polynomial that it fits into. In
particular, the monster in the display above has some minimal polynomial. (My
guess is that it has degree equal to a number of about 40 digits).
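For the skeptical, here is a Python sketch (my own encoding, not part of the notes) that finds a minimal polynomial by exactly the linear-dependence mechanism of Definition XIV.1, for the friendlier element α = √2 + √3; the same scheme would in principle digest the monster above, just with a much larger basis:

```python
from fractions import Fraction as Fr

def times_alpha(v):
    # multiply a + b*sqrt2 + c*sqrt3 + d*sqrt6 by alpha = sqrt2 + sqrt3
    a, b, c, d = v
    return (2*b + 3*c, a + 3*d, a + 2*d, b + c)

powers = [(Fr(1), Fr(0), Fr(0), Fr(0))]       # alpha^0 = 1
for _ in range(4):
    powers.append(times_alpha(powers[-1]))    # alpha^1, ..., alpha^4

# Solve c0 + c1*alpha + c2*alpha^2 + c3*alpha^3 = alpha^4 coordinate-wise
# by Gaussian elimination over Q (each basis coordinate is one equation).
A = [[powers[j][i] for j in range(4)] + [powers[4][i]] for i in range(4)]
for col in range(4):
    piv = next(r for r in range(col, 4) if A[r][col] != 0)
    A[col], A[piv] = A[piv], A[col]
    A[col] = [entry / A[col][col] for entry in A[col]]
    for r in range(4):
        if r != col and A[r][col] != 0:
            A[r] = [x - A[r][col] * y for x, y in zip(A[r], A[col])]

print([int(A[i][4]) for i in range(4)])  # [-1, 0, 10, 0]
# So alpha^4 = -1 + 10*alpha^2: the minimal polynomial is x^4 - 10x^2 + 1.
```

The design choice is to make the linear dependence of 1, α, α2 , . . . completely explicit: each power is a vector over Q, and the minimal polynomial is read off from the first power that is a combination of the earlier ones.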
Example XIV.8. If F = Q and f = x3 − 2, then we can build splitting fields
one step at a time. F′ := Kron(F, f ) = Q[x]/hf i contains at least the Kronecker
root β. So in F′ [x] we can factor x3 − 2 = (x − β)(x2 + xβ + β 2 ). This first extension
is degree 3, because we adjoined the root of a cubic.
If we picture β as the real third root of 2, it is clear that F′ is still inside the field
of real numbers, and in particular can’t contain all three third roots of 2, because
the other two are complex and not real.
Thus, x2 + xβ + β 2 has no roots in F′ and hence does not factor in F′ [x]. A
second Kronecker extension F′′ := Kron(F′ , x2 + xβ + β 2 ) is a degree two extension
of F′ and therefore a degree 2 · 3 = 6 extension of F. In that field, f splits completely.
Hence F′′ = Split(F, f ).

2. Finite Fields
Example XIV.9. Let F = Z/2Z and take f (x) = x3 + x + 1, g(x) = x3 + x2 + 1.
It is easy to check that neither f nor g has a root in F, and so (as cubics) both are
irreducible.
Kron(F, f ) = F[x]/hf i has 8 = 23 elements {0, 1, x, x + 1, x2 , x2 + 1, x2 + x, x2 +
x + 1}. The same is true for Kron(F, g) = F[y]/hgi, but we must not confuse elements
in the different extensions, because in the first we go modulo f and in the other we
go modulo g. I intentionally write different variables x, y here.
Let α be the Kronecker root for f , so α = x mod hf i. Write β for the
Kronecker root of g, so β = y mod hgi.
So α is a root of f ; who else? Long division gives f (x)/(x − α) = x2 + αx + (α2 + 1),
which we call f2 (x). If you plug α2 into f2 (x), you get zero, so α2 is a root of f2 and
then also of f (x). The remaining root can be found as α2 + α. (As a test, if you
multiply out (x − α)(x − α2 )(x − α2 − α) you get f (x) back, using that f (α) = 0.)
So Kron(F, f ) is actually the splitting field of f over F.
Now plug x − 1 into f (x). You get (x − 1)3 + (x − 1) + 1 = x3 + x2 + 1 = g(x).
So, g has roots equal to those of f shifted up by 1. They are α + 1, α2 + 1, α2 + α + 1.
In particular, Kron(F, f ) is also the splitting field for g(x). So there is no real
difference between α+1 in Kron(F, f ) and β ∈ Kron(F, g). There is only one degree
3 extension of F.

Remark XIV.10. Here is an amusing computation that explains the previous
example. Any degree 3 extension of Z/2Z will be a vector space of dimension 3 over
Z/2Z. As such, it contains 23 elements, of which 7 are nonzero. Since in a field all
nonzero elements have inverses, these 7 elements form a group under multiplication.
So, all group elements have order dividing 7, by Lagrange. That translates to: “all
nonzero elements satisfy a7 = 1”, and so all elements satisfy a8 − a = 0. We can
factor x8 − x = x(x + 1)(x3 + x + 1)(x3 + x2 + 1) and we find here the irreducibles
f and g as factors of x8 − x.
So, any field of 8 elements contains all roots to f , and all roots to g. There is
only one field with 8 elements.
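A one-line sanity check of that factorization, as a Python sketch (carry-less multiplication of bit-encoded polynomials; the helper pmul is mine, not from the notes):

```python
def pmul(a, b):
    """Polynomial product over Z/2Z, polynomials encoded bitwise."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        a <<= 1
        b >>= 1
    return out

factors = [0b10, 0b11, 0b1011, 0b1101]   # x, x+1, x^3+x+1, x^3+x^2+1
prod = 1
for f in factors:
    prod = pmul(prod, f)
print(bin(prod))  # 0b100000010, i.e. x^8 + x, which is x^8 - x mod 2
```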

Theorem XIV.11. Let F = Z/pZ be the field with p elements, p a prime
number.
Choose e ∈ N with e ≥ 1. Then there exists a field GF(p, e) with p^e elements.
All elements a ∈ GF(p, e) are roots of the polynomial x^(p^e) − x, and x^(p^e) − x
completely splits over GF(p, e). In other words, GF(p, e) is the splitting field of the
polynomial x^(p^e) − x.
Every element of GF(p, e) is equal to its p^e -th power, and so every element has
a p^e -th root in GF(p, e).
One has GF(p, 1) = Z/pZ.
The degree of the field extension [GF(p, e) : GF(p, 1)] is e. In consequence,
GF(p, e) is the Kronecker extension Kron(F, g(x)) for every irreducible polynomial
g(x) of degree e over F.
Sketch. Existence: Take any splitting field K for f (x) := x^(p^e) − x (for
example, a suitable field inside an iteration of Kronecker extensions), and denote
by GF(p, e) the set of roots of f (x) in K.
Note that if a, b are both roots of f (x) then so are a ± b and ab and a/b,
provided that b ≠ 0. (Why? For ± the binomial theorem gives you p-divisible
coefficients in (a ± b)^(p^e) in all but the first and last term. For ab and a/b this is
very easy.) It follows that GF(p, e) is closed under + and −, and under multiplication
and division. So, this set of roots is a field. (This is really weird and only happens
over finite fields. For example, the field Q[√2] has lots and lots of elements that
are not roots of x2 − 2. . . )
Splitting: By construction, f has all its roots in GF(p, e), so GF(p, e) is the
smallest field over which f splits. It follows that GF(p, e) is the splitting field.
Size: The gcd of f (x) and f ′ (x) = p^e · x^(p^e −1) − 1 is 1, since p^e = 0 and so
f ′ (x) = −1. So, f has no multiple roots in any extension, and in particular not in
GF(p, e). So, GF(p, e) is full of single roots of f (x) and so must have p^e elements
(it is a splitting field!).
Uniqueness: Any field with p^e elements has a^(p^e) − a = 0 for all its elements
a, by the argument of the remark above, so any field with p^e elements is the
splitting field of f .
(1) Since f (a) = 0 for all a ∈ GF(p, e), each element agrees with its own p^e -th
power.
(2) If e = 1, we want the splitting field over Z/pZ of xp − x. But Little Fermat
says that ap = a for each a ∈ Z/pZ. So all roots of xp − x are in Z/pZ and we need
no extension.
(3) A vector space with p^e elements over a field of p elements has to have
e basis vectors. Let g be an irreducible polynomial of degree e over Z/pZ. Its
Kronecker extension is a field extension of degree e, so it has p^e elements, and so
must be a splitting field for x^(p^e) − x. So Kron(F, g) = GF(p, e). 

Corollary XIV.12. The nonzero elements U (p, e) of GF(p, e) form an Abelian
group with respect to multiplication. This group is cyclic.
Proof. The first sentence is clear since fields have commutative multiplication
and every nonzero element in a field has an inverse.
As an Abelian group, U (p, e) can be written as Z/a1 Z × · · · × Z/ak Z with a1 |a2 | · · · |ak ,
by FTFGAG. Every element of this product group has order dividing ak . So the el-
ements of U (p, e) have their ak -th power equal to the identity. That means they
are roots of x^ak − 1, and so all elements of GF(p, e) are roots of x^(ak +1) − x. But such
a polynomial can have only ak + 1 roots, and we know GF(p, e) is the set of these
roots, p^e in number. So p^e = ak + 1, i.e. ak = p^e − 1. But a1 · a2 · · · ak should be
p^e − 1 = |U (p, e)|, and that means that k = 1 and so U (p, e) is cyclic. 
Example XIV.13. Let p = 2 and take f (x) = x4 + x + 1. Since f (0) = f (1) =
1 ∈ Z/2Z, f has no linear factors.
If f were to factor, then, it would have to factor as the product of 2 quadrics. But
over Z/2Z there is (easy check!) only one irreducible quadric, x2 + x + 1. And
x2 + x + 1 does not divide f (x). So f (x) is irreducible and Kron(Z/2Z, f ) has
2^4 = 16 elements.
By the theorem, Kron(Z/2Z, f ) = GF(2, 4).
Let α be the Kronecker root and compute explicitly: (α2 + α + 1)2 + (α2 + α +
1) + 1 = 0 in Kron(Z/2Z, f ). This says that α2 + α + 1 is a root in GF(2, 4) of the
irreducible quadric x2 + x + 1. In particular, GF(2, 4) contains Kron(Z/2Z, x2 + x + 1) =
GF(2, 2).
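Here is the same computation done by machine, as a hedged Python sketch (the bit-encoding and helper names are my own, not from the notes):

```python
# Cosets modulo x^4 + x + 1 over Z/2Z are 4-bit integers; alpha is 0b0010.
F, DEG = 0b10011, 4   # the bits of x^4 + x + 1

def mul(a, b):
    """Multiply two cosets, reducing modulo x^4 + x + 1."""
    out = 0
    while b:
        if b & 1:
            out ^= a
        b >>= 1
        a <<= 1
        if a >> DEG:
            a ^= F
    return out

alpha = 0b0010
gamma = mul(alpha, alpha) ^ alpha ^ 1          # alpha^2 + alpha + 1
assert mul(gamma, gamma) ^ gamma ^ 1 == 0      # gamma solves x^2 + x + 1
print("GF(2,2) sits inside GF(2,4) as {0, 1, gamma, gamma+1}")
```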
Example XIV.14. Let p = 2 and take f (x) = x3 + x + 1, an irreducible cubic
in Z/2Z[x]. Take GF(2, 3) to be its splitting field Kron(Z/2Z, f ). The 8 elements
of GF(2, 3) are precisely the roots of x^(2^3) − x = x(x − 1)(x3 + x + 1)(x3 + x2 + 1).
Let us look for a copy of GF(2, 2) in here. Since elements of GF(2, 2) are
characterized by satisfying x^(2^2) − x = 0, we would need the 4 roots of x(x − 1)(x2 + x + 1)
to be in GF(2, 3). But gcd(x2 + x + 1, x8 − x) = 1, so any element that makes both
of these polynomials zero would also have to make the polynomial 1 zero. That being
preposterous, no root of x2 + x + 1 can be in GF(2, 3); in other words, GF(2, 2)
does not sit inside GF(2, 3).
It is natural to ask when finite fields sit inside one another.
Theorem XIV.15. GF(p, e) sits inside GF(p′ , e′ ) if and only if p = p′ and e|e′ .
Proof. In GF(p, e) we have that 1 + . . . + 1 (p copies) gives 0. In GF(p′ , e′ ),
this is so with p′ copies. If p ≠ p′ then gcd(p, p′ ) = 1 copies of 1 also amount to
zero, which is absurd. We conclude p = p′ is necessary.
Suppose GF(p, e) sits inside GF(p, e′ ). Then GF(p, e′ ) is a vector space over
GF(p, e). Since one has p^e elements and the other p^(e′) elements, p^(e′) should be
a power of p^e , and that means that e′ should be a multiple of e.
Now suppose e|e′ and look for GF(p, e) inside GF(p, e′ ). The main step is to
see that e|e′ implies that x^(p^e) − x divides x^(p^(e′)) − x. To see this, write e′ = de
and use the identity
u^d − 1 = (u − 1)(u^(d−1) + u^(d−2) + · · · + u + 1)
twice: taking u = p^e , it shows that m := p^e − 1 divides n := p^(e′) − 1; taking
u = x^m and the exponent n/m in place of d, it shows that x^m − 1 divides x^n − 1.
Multiplying by x, we get that x^(p^e) − x divides x^(p^(e′)) − x. But then the splitting
field of x^(p^(e′)) − x must contain the splitting field of x^(p^e) − x, as we wanted to
show. 
It is a natural question to ask: “if GF(p, e) is a subfield of GF(p, e′ ), how do
we best identify the smaller field?” (So far we only know that it must be in there
somewhere.)
Corollary XIV.16. If e|e′ , the subfield GF(p, e) inside GF(p, e′ ) consists of
exactly those elements that satisfy x^(p^e) = x.
One can obtain elements of GF(p, e) by raising any element of GF(p, e′ ) to the
power (p^(e′) − 1)/(p^e − 1).
Proof. That GF(p, e) is inside GF(p, e′ ) comes from the preceding theorem.
Since any element in any version of GF(p, e) is a root of x^(p^e) − x, selecting the ones
that do this is the right strategy.
Since U (p, e′ ) is cyclic of order p^(e′) − 1 and since U (p, e) of size p^e − 1 sits inside
U (p, e′ ), it must be so that U (p, e) is the set of (p^(e′) − 1)/(p^e − 1)-th powers of
the elements of U (p, e′ ). 
Example XIV.17. Let’s look at the finite fields inside the field of size 2^24 .
They are the fields of sizes 2^12 , 2^8 , 2^6 , 2^4 , 2^3 , 2^2 , 2^1 . The containment relations are
GF(2, 1) ⊆ GF(2, 2) ⊆ GF(2, 4) ⊆ GF(2, 8) ⊆ GF(2, 24),
GF(2, 1) ⊆ GF(2, 3) ⊆ GF(2, 6) ⊆ GF(2, 12) ⊆ GF(2, 24),
and additional containments GF(2, 2) ⊆ GF(2, 6) and GF(2, 4) ⊆ GF(2, 12).
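One can let a computer confirm the divisibility criterion behind this picture. The following Python sketch (my own helpers, using the bit-encoding of polynomials over Z/2Z from earlier sketches) checks for small e, e′ that x^(2^e) − x divides x^(2^(e′)) − x exactly when e divides e′:

```python
def pmod(a, b):
    """Remainder of a modulo b, both bit-encoded polynomials over Z/2Z."""
    db = b.bit_length() - 1
    while a and a.bit_length() - 1 >= db:
        a ^= b << (a.bit_length() - 1 - db)   # cancel the leading term
    return a

for e in range(1, 5):
    for e2 in range(1, 9):
        small = (1 << 2**e) | 2      # x^(2^e) + x, i.e. x^(2^e) - x mod 2
        big = (1 << 2**e2) | 2
        divides = pmod(big, small) == 0
        assert divides == (e2 % e == 0)
print("divisibility matches e | e' in all tested cases")
```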
CHAPTER XV

Galois

1. The Frobenius
In a ring of characteristic p > 0 (such as GF(p, e), or indeed any ring containing
Z/pZ), we have
(a + b)p = ap + bp
since the intermediate terms arising from the binomial theorem are all multiples of
p, hence zero. It follows that
Frob : GF(p, e) → GF(p, e),
γ ↦ γ p
is a morphism of additive groups. Since clearly 1p = 1 and (γγ ′ )p = γ p (γ ′ )p , the
Frobenius also respects the multiplicative structure. It is therefore a ring mor-
phism.
Theorem XV.1. The p-Frobenius (p-th power map) is a field isomorphism
Frob : GF(p, e) → GF(p, e)
for any e.
If e′ |e, so that GF(p, e′ ) sits inside GF(p, e), then the Frobenius sends
elements of this subfield into the subfield.
The e-fold iteration of the Frobenius on GF(p, e) is the identity map. One can
interpret this as:
the group Z/eZ acts on GF(p, e) by sending the coset of t to the
t-fold iteration of Frob.
The elements of GF(p, e) that are fixed by Frob are exactly the elements of
GF(p, 1) = Z/pZ. More generally, the elements of GF(p, e) that are fixed under the
k-fold iteration of the p-th power map are precisely the elements of GF(p, gcd(e, k)).
Suppose α ∈ GF(p, e) is the root of a polynomial f (x) with coefficients in Z/pZ.
Then αp is a root of f (x) as well. In fact, iterating the p-th power map will produce
all the roots in GF(p, e) of the minimal polynomial of α. In other words,
the orbit of α ∈ GF(p, e) under the action of Z/eZ above is the
set of all roots of the minimal polynomial of α over Z/pZ.
If k ∈ N one can define an action of Z/eZ by letting the coset of t act as the
t-fold iteration of Frob^k , that is, λ(t, α) = α^(p^(kt)) . The orbit of α under this
action is the set of roots of the minimal polynomial of α over GF(p, gcd(e, k)).
Proof. Let α be in the kernel of Frob. Then αp = 0 and since a field has no
zerodivisors, α = 0. So Frob is injective. By the isomorphism theorem, the image
of Frob is isomorphic to GF(p, e). But that means it has pe elements, and hence fills
out the target field. So, Frob is bijective and hence an isomorphism. It permutes
the elements of GF(p, e).

All elements in GF(p, e) satisfy α^(p^e) = α. If e′ |e, the elements of GF(p, e′ ) inside
GF(p, e) are characterized by being those elements for which α^(p^(e′)) = α already.
Take such an α and raise it to the p-th power. Then note that (αp )^(p^(e′)) =
α^(p·p^(e′)) = (α^(p^(e′)) )p = αp . In other words, Frob(α) belongs to GF(p, e′ ) again.
So the isomorphisms that the Frobenius induces on the various fields GF(p, −) are
compatible with inclusions.
The e-fold iteration of Frob sends α ∈ GF(p, e) to α^(p^e) = α, so it is the identity
on GF(p, e). It follows that we can read Frob as a group action of Z/eZ on the
elements of GF(p, e) via λ(t mod eZ, α) = Frob^t (α), which is nothing but α^(p^t) .
Since Frob^e is the identity on GF(p, e), Frob^k acts the same way as Frob^gcd(e,k) .
(You should make sure you believe this before going on. It can be seen via the
Euclidean algorithm: Frob^k (α) = Frob^(k−e) (α); now iterate.) If αp = α then α
is a root of xp − x, and there are exactly p of those, the elements of Z/pZ =
GF(p, 1). If Frob^k (α) = α then, by the above, α is a root of x^(p^gcd(e,k)) − x and
therefore belongs to GF(p, gcd(k, e)).
Suppose e′ |e, so GF(p, e′ ) sits inside GF(p, e). If f (α) = 0 and the coefficients
of f come from the field GF(p, e′ ), then the coefficients ci satisfy Frob^e′ (ci ) = ci .
Thus, 0 = f (α) = Σ ci αi produces under e′ -fold Frobenius that 0 = Σ ci (α^(p^(e′)) )i .
In other words, Frob^e′ (α) is a root of the same polynomial as α. Now suppose α
generates GF(p, e), so that its degree over GF(p, 1) is e. Since that degree is the
product of the degree of α over GF(p, e′ ) with [GF(p, e′ ) : GF(p, 1)] = e′ , the degree
of the minimal polynomial of α over GF(p, e′ ) is e/e′ . This implies
that iterating Frob^e′ on α makes it circle through all the roots of f . (If it did not
move through all roots, one could take the roots it does move through and construct a
minimal polynomial of lower degree, which cannot be). 

Example XV.2. Let p = 3 and e = 4. There are 81 elements in GF(3, 4).


An irreducible polynomial of degree 4 is x4 − x3 − 1, so we can view GF(3, 4) as
Kron(Z/3Z, x4 − x3 − 1) = Z/3Z[x]/h(ix4 − x3 − 1).
A slightly horrendous calculation shows that x^(3^4) − x factors as (x) ∗ (x − 1) ∗ (x + 1)
times (x2 + 1) ∗ (x2 − x − 1) ∗ (x2 + x − 1) times
(x4 − x − 1) ∗ (x4 + x − 1) ∗ (x4 − x2 − 1) ∗ (x4 + x2 − 1) ∗ (x4 + x2 − x + 1) ∗
(x4 + x2 + x + 1) ∗ (x4 − x3 − 1) ∗ (x4 − x3 + x + 1) ∗ (x4 − x3 − x2 + x − 1) ∗
(x4 − x3 + x2 + 1) ∗ (x4 − x3 + x2 − x + 1) ∗ (x4 − x3 + x2 + x − 1) ∗ (x4 + x3 − 1) ∗
(x4 + x3 − x + 1) ∗ (x4 + x3 − x2 − x − 1) ∗ (x4 + x3 + x2 + 1) ∗
(x4 + x3 + x2 − x − 1) ∗ (x4 + x3 + x2 + x + 1).

In particular, there are 3 irreducible linear polynomials, 3 irreducible quadrics, and
18 irreducible quartics over Z/3Z. (We learn nothing about cubics, because cubics
make a field extension of degree 3, and right now we are looking at an extension of
degree 4; since 3 does not divide 4, no GF(3, 3) is inside GF(3, 4).)
Note that 3 · 1 + 3 · 2 + 18 · 4 = 3 + 6 + 72 = 81, as it should be.
So the roots of x81 − x, which are precisely the elements of GF(3, 4), come in
3 types:
• elements of GF(3, 1): as the roots to x = 0, x − 1 = 0, x + 1 = 0;
• elements of GF(3, 2) that are not in GF(3, 1): they come in pairs of the
roots of x2 + 1 = 0, x2 − x − 1 = 0, x2 + x − 1 = 0;
• elements in GF(3, 4) that are not in GF(3, 2): these come in quadruplets
as the roots of the 18 irreducible quartics.
Let us take the irreducible quadric x2 + 1, and let α be the Kronecker root
of GF(3, 4) = Kron(Z/3Z, x4 − x3 − 1) for f (x) = x4 − x3 − 1. In other words,
α = x. Let’s try to find a copy of GF(3, 2) inside this field. This would require, for
example, finding the roots to x2 + 1 (one of the three irreducible quadrics above).
We calculate
(α3 + α2 + 1)2 + 1 = α6 + 2α5 + α4 + 2α3 + 2α2 + 1 + 1
= α2 · (α3 + 1) + 2α5 + α4 + 2α3 + 2α2 + 2
= 3α5 + α4 + 2α3 + 3α2 + 2
= α4 − α3 − 1 = 0.
It follows that α3 + α2 + 1 is a root of x2 + 1. (The other root is 2(α3 + α2 + 1).)
So, inside GF(3, 4) the copy of GF(3, 2) consists of the Z/3Z-linear combinations
of 1 and β := α3 + α2 + 1. These are the 9 elements
0, 1, 2, β, β + 1, β + 2, 2β, 2β + 1, 2β + 2.
Now look at what the Frobenius (third power map) does to them:
03 = 0,
13 = 1,
23 = 8 = 2,
(α3 + α2 + 1)3 = α9 + α6 + 1 = . . . = 2α3 + 2α2 + 2,
((α3 + α2 + 1) + 1)3 = . . . = 2α3 + 2α2 + 0,
((α3 + α2 + 1) + 2)3 = . . . = 2α3 + 2α2 + 1,
(2(α3 + α2 + 1))3 = . . . = α3 + α2 + 1,
(2(α3 + α2 + 1) + 1)3 = . . . = α3 + α2 + 2,
(2(α3 + α2 + 1) + 2)3 = . . . = α3 + α2 + 0.
So, the third-power map flips them about in pairs. The 3 pairs correspond to the
roots of the 3 irreducible quadrics above.
Now let us look at what the Frobenius does to general elements of GF(3, 4), those
that do not live in smaller fields. As a starter, we look at what happens to α itself
under iterates of Frob. It is clear that Frob(α) = α3 and we leave it like that since
we can’t rewrite polynomials of degree less than four.
Then Frob(Frob(α)) = α9 and that can be rewritten (with labor) as α3 +α2 +2α.
The third power of this is α3 + 2α2 + 1, and the Frobenius sends this last guy to
α. So the Frobenius action circles
α ↦ α3 ↦ α3 + α2 + 2α ↦ α3 + 2α2 + 1 ↦ α.
These 4 elements are the roots of x4 − x3 − 1, since we took one such root and
applied Frobenius. (Frobenius takes the equation x4 − x3 − 1 = 0 and turns it into
(x3 )4 − (x3 )3 − 1 = 0, so that if you “Frobenius a root” then you get a root back).
The same sort of thing happens to the roots of the other 17 irreducible quartics:
the Frobenius circles them within their lucky clover leaf, preserving that they are
roots of whatever quartic they are roots of.
So, Z/4Z (the 4 is because e = 4 and the 4-th power of the Frobenius is the
identity) acts on the 81 elements of GF(3, 4). Three elements are fixed points, there
are three orbits of size 2 (pairing the roots of the quadrics) and there are 18 orbits
of size 4 (the 18 quadruplets that occur as roots of the irreducible quartics).
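The orbit count is easy to reproduce by machine. Here is a Python sketch (my own encoding of GF(3, 4) as coefficient 4-tuples; none of it is from the notes) that applies the cube map to all 81 elements and tallies the orbit sizes:

```python
from itertools import product
from collections import Counter

def mul(u, v):
    """Multiply in Z/3Z[x]/<x^4 - x^3 - 1>; tuples hold coeffs of 1..alpha^3."""
    raw = [0] * 7
    for i in range(4):
        for j in range(4):
            raw[i + j] += u[i] * v[j]
    for k in range(6, 3, -1):      # reduce: alpha^k = alpha^(k-1) + alpha^(k-4)
        raw[k - 1] += raw[k]
        raw[k - 4] += raw[k]
        raw[k] = 0
    return tuple(c % 3 for c in raw[:4])

def frob(u):                        # the cube map
    return mul(u, mul(u, u))

orbits = {}
for el in product(range(3), repeat=4):
    orbit = [el]
    while frob(orbit[-1]) != el:
        orbit.append(frob(orbit[-1]))
    orbits[frozenset(orbit)] = len(orbit)

print(Counter(orbits.values()))
# Counter({4: 18, 1: 3, 2: 3}): 3 fixed points, 3 pairs, 18 quadruplets.
```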

2. Symmetries
3. Applications
Stuff for later

3.1. Zerodivisors. When we listed the arithmetic operations that we can
perform with cosets, we did not list division. There are good reasons for that. First
off, we don’t really expect division to work in general since even for usual integers
division is problematic (try dividing 3 by 2, for example). But it is stranger than
that. Even if dividing one integer by another would be just fine (let’s say you
planned to divide 12 by 6) it is not clear that in the modulo world this is still going
as expected.
Example XV.3. To get a feeling, let’s try to divide 12 by 6 but modulo 8. The
quotient, let’s call it a, should live in Z/8Z and have the property that a · 6 = 12.
Of course, a could be 2.
But if you list the multiples of 6 you find:
0 · 6 = 0, 1 · 6 = 6, 2 · 6 = 4, 3 · 6 = 2,
4 · 6 = 0, 5 · 6 = 6, 6 · 6 = 4, 7 · 6 = 2.
So we see that there are actually two different cosets that compete for being a
quotient 12/6, namely 2 and 6. This comes from the fact that we can think of 12
also as 4.
Moreover, one can see that the people in Z/8Z split into two classes, the cosets
that are multiples of 6 and those that are not, where each coset that shows up at
all as multiple of 6 shows exactly twice. 
Example XV.4. This time, let’s try to divide 7 by 5. Usually that would not
seem like a good idea (at least if you hope for integer answers), but let’s do this
again modulo 8. Writing out the multiples of 5 in Z/8Z we find
0 · 5 = 0, 1 · 5 = 5, 2 · 5 = 2, 3 · 5 = 7,
4 · 5 = 4, 5 · 5 = 1, 6 · 5 = 6, 7 · 5 = 3.
So, quite against expectations, 7/5 can be found in Z/8Z, and there is exactly one
answer: 3. In fact, as one can see, any coset in Z/8Z can be divided by 5 in exactly
one way.
In this section we will try to understand and predict this kind of behavior. 

The coset of 0 in Z/nZ is “the zero” in this new system of numbers, since adding
it to any coset does not change the coset. As seen in Example XV.3 above, it is
possible that this new zero shows up as a product of nonzero inputs, a phenomenon
not encountered in the integers.
Definition XV.5. If a, b are in Z/nZ, with neither a nor b the coset of zero
(that is, neither divisible by n), then they are called zerodivisors if ab = 0.

This ability to multiply to zero in Z/nZ of course comes from the fact that we
equate (every multiple of) n with zero. So, a composite n will allow for products
to be zero (that is, multiples of n) in several ways. We try to understand by way
of an example.
Example XV.6. Let n = 6; then 2 · 3 = 0.
Indeed, in order to prepare what is to come in a bit, let’s list all multiples of 2:
2 · 0 = 0, 2 · 1 = 2, 2 · 2 = 4, 2 · 3 = 0, 2 · 4 = 2, 2 · 5 = 4.

The reason that 2 was capable of yielding 0 when multiplied with a nonzero coset
was of course that 2 has an interesting common factor with 6. In the general case,
suppose a is a coset in Z/nZ and we look for another element b ∈ Z/nZ such that
ab = 0. If we set gcd(a, n) = d and if d happens to be greater than 1, then we can
write n = d · e and so a · e is a multiple of d · e = (de) = n = 0. But a multiple of 0
must be 0 itself.
On the other hand, pick now an a such that gcd(a, n) = 1. This means by
Proposition I.18 that there are integers α, β with aα + nβ = 1. Reading this
“modulo n”, we get a · α + (nβ) = 1. Naturally, (nβ) = n · β = 0. So, a · α = 1. It
follows that for any b ∈ Z, a · (bα) = b. This says that every single coset in Z/nZ
is the result of some coset being multiplied by a.
Let’s try to understand what this means. There are n cosets in Z/nZ, each
of which you can multiply with a. The process of multiplication produces all n
of these (provided gcd(a, n) = 1). It follows there is exactly one coset that when
multiplied by a gives you any given coset b. In particular, there is only one coset
that when multiplied gives 0 (and of course this one coset is 0 itself).
Putting it all together, we have proved most of the following theorem:
Theorem XV.7. If gcd(a, n) = 1 then multiplication by a is a bijection on
Z/nZ. In other words, for each b ∈ Z/nZ there is exactly one x ∈ Z/nZ such that
a · x = b. Yet in other words, a becomes a unit in Z/nZ.
Conversely, if gcd(a, n) = d > 1 then multiplication by a is neither surjective
nor injective. There are exactly n/d different cosets that arise through multiplication
by a, and each of them arises d times as the output of such a multiplication. In this
case, a is a zerodivisor in Z/nZ.
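The theorem is easy to watch in action. Here is a small Python sketch (mine, not from the notes) that tabulates multiplication by a on Z/8Z, reproducing the two examples above:

```python
from math import gcd

n = 8
for a in range(1, n):
    image = sorted({(a * x) % n for x in range(n)})
    d = gcd(a, n)
    assert len(image) == n // d   # exactly n/d cosets are hit ...
    print(f"a={a}, gcd={d}: multiples of {a} are {image}")
    # ... and each of them is hit d times, so a is a unit iff gcd(a,n)=1
```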
Example XV.8. Let n = 6. The numbers a that have gcd(a, n) = 1 are living
in the cosets 1 and 5. Everyone is a multiple of 1 for obvious reasons, and everyone
is also a multiple of 5 because 5 = −1.
The multiples of 2 are {0, 2, 4}, and these are also exactly the multiples of 4.
Note that each one of {0, 2, 4} is a multiple of both 2 and 4 in 2 = gcd(6, 2) =
gcd(6, 4) ways. For example, 4 = 4 × 1 = 4 × 4 and also 4 = 2 × 2 = 2 × 5.
The multiples of 3 are 3 and 0, and each of {0, 3} arises 3 = gcd(3, 6) times as
multiple. For example, 3 = 3 × 1 = 3 × 3 = 3 × 5. 
Exercise XV.9. For n = 10 and a = 1, 2, . . . , 9 determine
(1) which cosets in Z/10Z are multiples of a;
(2) how many cosets in Z/10Z are multiples of a and express these numbers in
terms of a and 10.

Theorem XV.7 implies that the units of Z/nZ are exactly the cosets of those
numbers between 1 and n − 1 inclusive that are relatively prime to n. All other
cosets exhibit ambiguity (at best) or impossibility (at worst) when trying to divide
by them. Which case happens depends on the two cosets to be divided. For
example, in Z/4Z, trying to divide by 2 one fails when the input is 1 or 3 while one
gets too many suggestions when one divides 2 by 2 (namely, 1 and 3) or when one
divides 0 by 2 (namely, 0 and 2).
In order to explain this behavior, we shall need the following observation:
Exercise XV.10. Prove that lcm(a, b) · gcd(a, b) = ab. 
Now suppose a · x = b has at least one solution, so ax − b is a multiple of
n. If you added c = n/ gcd(a, n) to x then we calculate: a(x + c) = ax + ac =
b + (an/ gcd(a, n)) = b + lcm(a, n) = b since lcm(a, n) = 0 (like any other multiple
of n) represents the coset of zero. It follows that besides x all expressions x + i · c
are also solutions to ax = b.
How many such are there? On the face of it, infinitely many but recall that
x+i·c and x+j·c are in the same coset of Z/nZ as soon as (x+i·c)−(x+j·c) = (i−j)c
is a multiple of n. That of course happens exactly if i − j is a multiple of n/c. So,
there are n/c different cosets x, x + c, . . . , x + ((n/c) − 1)c that all solve ax = b.
(Of course, n/c = gcd(a, n) by definition of c).
Exercise XV.11. Group the elements of Z/24Z in such a way that two cosets
a, b are in the same group exactly when their sets of multiples {1a, 2a, 3a, . . .} and
{1b, 2b, 3b, . . .} agree as sets (perhaps after reordering). Describe in words each
group. 
Remark XV.12. The Euclidean algorithm can also be carried out in the poly-
nomial ring R[x]; the idea of size (absolute value for integers) in the Archimedean
principle is then taken over by the degree of the polynomial. The relevant statement
is then:
For all polynomials a(x), b(x) in R[x] with b(x) ≠ 0 there are q(x), r(x) ∈ R[x]
such that a(x) = b(x)q(x) + r(x) and either r(x) = 0 or deg(r) ≤ deg(b) − 1.
The polynomials q(x) and r(x) are furnished by the method of (polynomial) long di-
vision. Exactly as for integers, one can work this division process into an algorithm
to compute the gcd between polynomials. 
Exercise XV.13. Compute the gcd between
(1) x3 + 1 and x1 + 1;
(2) x3 + 1 and x2 + 1;
(3) x3 + 1 and x4 + 1;
(4) x3 + 1 and x5 + 1;
(5) x3 + 1 and x6 + 1;
(6) x3 + 1 and xn + 1 for any natural number n (this will require to consider
cases depending on the remainder of division of n by 6.

3.2. Cartesian Products, Euler’s φ-function, Chinese Remainder. We
wish to find a formula for the number of cosets in Z/nZ that are units. By Theorem
XV.7, we need to count the numbers on the list 1, . . . , n − 1 that are coprime to n.
For this, recall the Euler φ-function from Definition I.32.
If p is a prime number, it is clear that φ(p) = p − 1. So, Z/pZ has p − 1 units
whenever p is prime.
Exercise XV.14. (1) Determine for n = 4, 8, 9, 16 the value of φ(n) by
listing explicitly the units in Z/nZ.
(2) Suppose n = pk is a power of a prime number p. Prove that φ(n) is n−n/p.

If n is composite, the question becomes more interesting. Below, we will discuss
that if n is factored into relatively prime factors ab = n there is an easy formula: if
gcd(a, b) = 1 then φ(ab) = φ(a) · φ(b). For example, φ(12) = φ(4) × φ(3) = 2 · 2 = 4.
(It is important to note that the gcd-condition is crucial: φ(16) is not φ(4) · φ(4),
compare Exercise XV.14 above).
In order to understand why for coprime a, b the φ-function should be multi-
plicative, we take the following approach. Let n = ab and assume gcd(a, b) = 1. For
simplicity of notation, write Φ(n) for the numbers on the list {0, 1, . . . , n − 1} that
are coprime to n, and for any two numbers r, s ∈ Z write r%s for the remainder of
division of r by s coming from Euclid’s Theorem. Note that |Φ(n)| = φ(n).
Now pick i ∈ Φ(n). Then surely gcd(i, a) = gcd(i, b) = 1, and so i%a is in Φ(a)
while i%b is in Φ(b). So one could make up a function that takes inputs in Φ(n)
and outputs pairs whose first component is in Φ(a) and whose second component
is in Φ(b); the function would just send i to (i%a, i%b). If we could show that this
function is reversible (that is, one could construct for each pair with first coordinate
in Φ(a) and with second coordinate in Φ(b) an i ∈ Φ(n) that produces this pair via
the function) then φ(n) should be φ(a) · φ(b).
Let’s look at an example.
Example XV.15. For n = 12, a = 4 and b = 3 we have Φ(12) = {1, 5, 7, 11},
Φ(4) = {1, 3} and Φ(3) = {1, 2}. The function discussed above sends: 1 mod 12 to
(1 mod 4, 1 mod 3); 5 mod 12 to (1 mod 4, 2 mod 3); 7 mod 12 to (3 mod 4, 1 mod 3);
11 mod 12 to (3 mod 4, 2 mod 3).
Unfortunately, if you multiply the Φ-sets directly, you get {1, 2} · {1, 3} =
{1, 2, 3, 6} which are mostly not units in Z/12Z. So while Φ(12) = {1, 5, 7, 11} is
not Φ(4) · Φ(3) = {1, 2, 3, 6}, we do have at least φ(12) = φ(4) · φ(3). 
The example teaches that one should not multiply units in Z/aZ with units in
Z/bZ and hope to get units in Z/nZ. Indeed, what we need to do is: given r ∈ Z/aZ
and s ∈ Z/bZ, find x ∈ Z such that
(x mod a) = (r mod a) and (x mod b) = (s mod b).
Before we go and look for x, note that changing a solution x by a multiple of n = ab
does not change the solution property: if x is a solution then so are . . . , x − 2ab, x −
ab, x, x + ab, x + 2ab, . . .. Conversely, if x and x′ are both solutions, then a|(x − x′ )
and b|(x − x′ ). Of course, this means that x − x′ is a simultaneous multiple of both
a and b (and hence of their lcm), and since gcd(a, b) = 1 is assumed, x − x′ is a
multiple of lcm(a, b) = ab/ gcd(a, b) = ab = n. Therefore, the solutions x, if they
exist at all, form precisely one coset of Z/nZ.
So the whole question boils down to: if you take a pair of cosets modulo a and
b respectively, can you find a coset modulo n that “gives birth” to the given cosets
by going modulo a and b respectively. Let’s look at an example.
Example XV.16. Let n = 36, factored as 36 = 4 · 9. Choose r = 2 and s = 7.
Is there x + 36Z such that x + 4Z = 2 + 4Z while x + 9Z = 7 + 9Z?
Some experimentation reveals that, yes, there is such an x; anything of the form
34 + k · 36 will do, so that x + 36Z = 34 + 36Z. But how can one go about this
systematically? Here is how.
If x leaves rest 2 when divided by 4 then x mod 36 must look like one of the
numbers 2 + 4k, 0 ≤ k ≤ 8. Similarly, if x leaves rest 7 when divided by 9, then
x mod 36 must look like 7 + 9ℓ, with 0 ≤ ℓ ≤ 3. In other words, we want k and ℓ such
that (7 + 9ℓ) and (2 + 4k) differ by a multiple of 36: (7 + 9ℓ) + 36Z = (2 + 4k) + 36Z.
Now go back to Z/4Z where this reads (7 + 9ℓ) + 4Z = (2 + 4k) + 4Z, or
3 + 1 · ℓ + 4Z = 2 + 4Z. This is fancy speak for: 1 + 1 · ℓ is a multiple of 4. Pick
ℓ = 3. It follows that x = 7 + 9ℓ = 7 + 9 · 3 = 34 mod 36.
One could also have gone the other way, and reduce modulo 9: one learns from
(7 + 9ℓ) + 36Z = (2 + 4k) + 36Z that one should also have (7 + 9ℓ) + 9Z = (2 + 4k) + 9Z,
which boils down to: 5 − 4k is a multiple of 9. So we’d like to solve 5 = 4k mod 9.
(This doesn’t seem so easy as before. We were rather lucky earlier, because 9 mod 4
is 1 and so the coefficient in “1 + 1 · ℓ is a multiple of 4” was 1, making it easy
to solve for ℓ.) Since 4 is coprime to 9 (by hypothesis, not by accident!) there is
actually such a k. Tests show that 4 · 8 = 32 = 5 mod 9. So, k = 8 would work.
We see then that x + 36Z should be 2 + 4 · 8 + 36Z = 34 + 36Z, as we already found
twice. 
In the following theorem, we state formally what exactly happened in the com-
putation above, and how to accomplish it.
Theorem XV.17 (Chinese Remainder Theorem). Suppose gcd(a, b) = 1 and
set ab = n. Choose integers r, s. The set of integers x for which
x + aZ = r + aZ and x + bZ = s + bZ
fills exactly one coset inside Z/nZ.
This coset can be found as follows. Find integers i, j such that i · b = 1 mod a
and j · a = 1 mod b. Let x = r · i · b + s · j · a. Then x + nZ is the sought after coset.
In the above example, a = 4, b = 9, i = 1, j = 7, r = 2, s = 7. Hence x =
2 · 1 · 9 + 7 · 7 · 4 = 18 + 196 = 214 = 34 mod 36.
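Here is the recipe of the theorem as a Python sketch (the function names crt2 and ext_gcd are mine; the extended Euclidean algorithm supplies the numbers i and j):

```python
def ext_gcd(u, v):
    """Return (g, s, t) with s*u + t*v = g = gcd(u, v)."""
    if v == 0:
        return u, 1, 0
    g, s, t = ext_gcd(v, u % v)
    return g, t, s - (u // v) * t

def crt2(r, a, s, b):
    """Solve x = r mod a and x = s mod b for coprime a, b."""
    g, i, j = ext_gcd(b, a)   # i*b + j*a = 1 since gcd(a, b) = 1
    assert g == 1
    return (r * i * b + s * j * a) % (a * b)

print(crt2(2, 4, 7, 9))    # 34, as in Example XV.16
print(crt2(13, 29, 8, 12)) # 332 = -712 mod 348; compare the next example
```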
Example XV.18. Let’s solve
x = 13 mod 29 and x = 8 mod 12.
Matching letters, we have a = 29, b = 12, r = 13, s = 8. We need to find i, j
with i · 12 = 1 mod 29 and j · 29 = 1 mod 12. Both will come out of the Euclidean
algorithm when applied to 12 and 29:
29 = 2 · 12 + 5; 12 = 2 · 5 + 2; 5 = 2 · 2 + 1.
We derive (going backwards):
1 = 5 − 2 · 2 = 5 − 2 · (12 − 2 · 5) = 5 · 5 − 2 · 12 = 5 · (29 − 2 · 12) − 2 · 12 = 5 · 29 − 12 · 12.
It follows that we can take i = −12 and j = 5. This gives x = r ·i·b+s·j ·a = −712.
One can test easily that this is correct. 
If one needs to solve three simultaneous equations,
x = r mod a; x = s mod b; x = t mod c,
with gcd(a, b) = gcd(b, c) = gcd(c, a) = 1, one has two options. Either, take the
souped-up version of the Chinese Remainder Theorem which we state below. Or,
one first solves x = r mod a and x = s mod b as above and then y = x mod ab and
y = t mod c again as above.
Here is the multiverse formulation of the Chinese Remainder Theorem; its proof
is in parallel to its little brother above.
Theorem XV.19 (Chinese Remainder Theorem). Let n1 , n2 , . . . , nt be pairwise
coprime numbers. Choose values a1 , . . . , at . Then the set of integers x which leave
remainder ai when divided by ni for all i are the elements in the coset x + n1 · · · nt Z
determined as follows. Let N = n1 · · · nt and set Ni = N/ni . Find, for each i, a
solution xi to the equation Ni · xi = 1 mod ni . Then
x = x1 N1 a1 + x2 N2 a2 + · · · + xt Nt at .
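In Python, the formula of the theorem becomes the following sketch (my code; pow(Ni, -1, ni), available from Python 3.8 on, computes the inverse xi):

```python
def crt(residues, moduli):
    """Solve x = a_i mod n_i for pairwise coprime n_i."""
    N = 1
    for n in moduli:
        N *= n
    x = 0
    for a, n in zip(residues, moduli):
        Ni = N // n
        xi = pow(Ni, -1, n)   # solves Ni * xi = 1 mod n (Python >= 3.8)
        x += xi * Ni * a
    return x % N

print(crt([2, 7], [4, 9]))   # 34 again
```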
Exercise XV.20. Solve the simultaneous equations


(1) x = 1 mod 3, x = 2 mod 5, x = 3 mod 7.
(2) x = 2 mod 3, x = 2 mod 5, x = 3 mod 7.
(3) x = 1 mod 9, x = 2 mod 5, x = 3 mod 7.

It now follows from the Chinese Remainder Theorem that:
Corollary XV.21. If you factor n = a1 · · · ak into pairwise coprime numbers
a1 , . . . , ak then φ(n) = φ(a1 ) · . . . · φ(ak ).
In particular, if n = p1^a1 · p2^a2 · · · pk^ak with all pi prime and distinct and all
ai positive, then
φ(n) = n · (1 − 1/p1 )(1 − 1/p2 ) · · · (1 − 1/pk ) = (p1^a1 − p1^(a1 −1) ) · · · (pk^ak − pk^(ak −1) ).
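A Python sketch of the product formula (my code, factoring n by plain trial division):

```python
def phi(n):
    """Euler's phi via the product formula over the prime factors of n."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p      # multiply by (1 - 1/p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                          # one prime factor may remain
        result -= result // m
    return result

print(phi(12), phi(4) * phi(3))   # 4 4, as claimed for coprime factors
print(phi(16), phi(4) * phi(4))   # 8 4: the gcd-condition really matters
```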
Exercise XV.22. (1) Determine φ(666).
(2) Determine φ(720).
(3) Is there an n ≠ 29 with φ(n) = 28?
(4) Is there an n with φ(n) = 24?
(5) Is there an n with φ(n) = 14?
(6) Prove that φ(n) is always even if n > 2.

3.3. Fermat’s little theorem.
Theorem XV.23. For any prime number p and any a ∈ N,
ap = a mod p.
In fact, unless p divides a, one has ap−1 = 1 mod p.
Proof. If p divides a then the first equation is obviously true, so assume p
does not divide a. Then gcd(p, a) = 1 since p is a prime number.
The numbers 1, 2, . . . , p − 1 are coprime to p and so their cosets modulo p are
units in Z/pZ. In particular, a is a unit in Z/pZ and so multiplication by a is a
bijection on the elements of Z/pZ, since for units multiplication is an invertible
operation. That means that the sets {1, 2, . . . , p − 1} and {a, 2a, . . . , (p − 1)a} are
the same, up to reordering.
Equal sets have equal products:
1 · 2 · · · (p − 1) = (a · 1) · (a · 2) · · · (a · (p − 1)) = ap−1 · (1 · 2 · · · (p − 1)) mod p.
The product of units is a unit, so one can cancel the factor 1 · 2 · · · (p − 1) and obtain
1 = ap−1 mod p. 
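And a quick numerical confirmation, as a Python sketch (mine), using the built-in three-argument pow for modular exponentiation:

```python
for p in (3, 5, 7, 11, 13):
    for a in range(1, p):
        assert pow(a, p - 1, p) == 1   # a^(p-1) = 1 mod p when p does not divide a
    assert all(pow(a, p, p) == a % p for a in range(2 * p))   # a^p = a mod p always
print("Fermat's little theorem checks out for small primes")
```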
Exercise XV.24. By looking at the smallest composite number, show that for
composite numbers n the equation an = a mod n may fail. 
Exercise XV.25. (Not easy). Formulate and prove a theorem like Fermat’s
little theorem in the case where p is not prime. 
Exercise XV.26. Show that (n − 1)n(n + 1) is a multiple of 24 if n is a prime
number greater than 2. Is “prime” really needed? 
Exercise XV.27. Suppose n is a number with k digits ak , . . . , a0 : n = ak ·
10k + . . . + a1 · 10 + a0 . Show that n is divisible by 7 if and only if the expression
· · · + 5a5 + 4a4 + 6a3 + 2a2 + 3a1 + 1 · a0
is divisible by 7. Here, the dots towards the left mean that the coefficient pattern
5, 4, 6, 2, 3, 1 that appears should be repeated. So, a6 gets coefficient 1 again (like
a0 ), a7 gets 3 again (like a1 ) and so on: the coefficient of ai+6 is the same as of ai .

