Elimination theory
The material of this chapter is largely taken from [Win96], where also proofs of theo-
rems are given.
Before we start with the technical details, let us briefly review the historical devel-
opment leading to the concept of Gröbner bases. In his seminal paper of 1890 D. Hilbert
gave a proof of his famous Basis Theorem as well as of the structure and length of the
sequence of syzygy modules of a polynomial system. Implicitly he also showed that the
Hauptproblem, i.e. the problem whether f ∈ I for a given polynomial f and polynomial
ideal I, can be solved effectively. Hilbert’s solution of the Hauptproblem (and similar
problems) was reinvestigated by G. Hermann in 1926. She counted the field operations
required in this effective procedure and arrived at a double exponential upper bound in
the number of variables. In fact, Hermann's (and for that matter Hilbert's) algorithm always
exhibits this worst-case double exponential complexity. The next important step
came when B. Buchberger, in his doctoral thesis of 1965 advised by W. Gröbner, intro-
duced the notion of a Gröbner basis (he did not call it that at this time) and also gave an
algorithm for computing it. Gröbner bases are very special and useful bases for polynomial
ideals. In subsequent publications Buchberger exhibited important additional applications
of his Gröbner bases method, e.g. to the solution of systems of polynomial equations. In
the worst case, Buchberger’s Gröbner bases algorithm is also double exponential in the
number of variables, but in practice there are many interesting examples which can be
solved in reasonable time. Still, the worst-case double exponential behaviour cannot be
avoided by any algorithm capable of solving the Hauptproblem, as was shown by E.W. Mayr
and A.R. Meyer in 1982.
When we are solving systems of polynomial (algebraic) equations, the important pa-
rameters are the number of variables n and the degree of the polynomials d. The Buch-
berger algorithm for constructing Gröbner bases is at the same time a generalization of
Euclid’s algorithm for computing the greatest common divisor (GCD) of univariate poly-
nomials (the case n = 1) and of Gauss’ triangularization algorithm for linear systems (the
case d = 1). Both these algorithms are concerned with solving systems of polynomial
equations, and they determine a canonical basis (either the GCD of the inputs or a tri-
angularized form of the system) for the given polynomial system. Buchberger’s algorithm
can be seen as a generalization to the case of arbitrary n and d.
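The two specializations can be observed directly in a computer algebra system. The following sketch (our own illustration, using the sympy library, which is not part of the text) compares the reduced Gröbner basis with the GCD for n = 1 and with a triangularized linear system for d = 1:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

# n = 1: for univariate inputs the reduced Groebner basis is the monic GCD,
# exactly what Euclid's algorithm computes: gcd(x^2-1, x^2-3x+2) = x - 1.
G1 = sp.groebner([x**2 - 1, x**2 - 3*x + 2], x, order='lex')
assert G1.exprs == [x - 1]

# d = 1: for linear inputs the reduced Groebner basis is a triangularized
# (here even fully solved) system, as produced by Gaussian elimination.
G2 = sp.groebner([x + y + z - 6, x - y + z - 2, x + y - z], x, y, z, order='lex')
assert G2.exprs == [x - 1, y - 2, z - 3]
```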
Let K be a computable field and K[X] = K[x1, . . . , xn] the polynomial ring in n
indeterminates over K. If F is any subset of K[X] we write ⟨F⟩ or ideal(F) for the ideal
generated by F in K[X]. By [X] we denote the monoid (under multiplication) of power
products x1^{i1} · · · xn^{in} in x1, . . . , xn; 1 = x1^0 · · · xn^0 is the unit element in the monoid [X].
lcm(s, t) denotes the least common multiple of the power products s, t.
Commutative rings with 1 in which the basis condition holds, i.e. in which every ideal
has a finite basis, are usually called Noetherian rings. This terminology is motivated by the
following lemma.
Lemma 2.1.1. In a Noetherian ring there are no infinite ascending chains of ideals. □
Theorem 2.1.2. (Hilbert’s Basis Theorem) If R is a Noetherian ring then also the uni-
variate polynomial ring R[x] is Noetherian.
A proof of Hilbert’s Basis Theorem will be given in a later chapter. Hilbert’s Basis
Theorem implies that the multivariate polynomial ring K[X] is Noetherian, if K is a field.
So every ideal I in K[X] has a finite basis, and if we can compute effectively with
finite bases then we can effectively handle all the ideals in K[X].
We will define a Gröbner basis of a polynomial ideal via a certain reduction relation
for polynomials. A Gröbner basis will be a basis with respect to which the corresponding
reduction relation is confluent. Before we can define the reduction relation on the polyno-
mial ring, we have to introduce an ordering of the power products with respect to which
the reduction relation should be decreasing.
Definition 2.1.1. Let < be an ordering on [X] that is compatible with the monoid
structure, i.e.
(i) 1 = x1^0 · · · xn^0 < t for all t ∈ [X] \ {1}, and
(ii) s < t =⇒ su < tu for all s, t, u ∈ [X].
We call such an ordering < on [X] an admissible ordering. □
Example 2.1.1. We give some examples of frequently used admissible orderings on [X].
(a) The lexicographic ordering with x_{π(1)} > x_{π(2)} > . . . > x_{π(n)}, π a permutation of
{1, . . . , n}:
x1^{i1} · · · xn^{in} <lex,π x1^{j1} · · · xn^{jn} iff there exists a k ∈ {1, . . . , n} such that
i_{π(l)} = j_{π(l)} for all l < k, and i_{π(k)} < j_{π(k)}.
If π = id, we get the usual lexicographic ordering <lex.
(b) The graduated lexicographic ordering w.r.t. the permutation π and the weight function
w : {1, . . . , n} → R+ :
for s = x1^{i1} · · · xn^{in}, t = x1^{j1} · · · xn^{jn} we define s <glex,π,w t iff
∑_{k=1}^{n} w(k) ik < ∑_{k=1}^{n} w(k) jk, or (∑_{k=1}^{n} w(k) ik = ∑_{k=1}^{n} w(k) jk and s <lex,π t).
(c) The graduated reverse lexicographic ordering: s <grevlex t iff
deg(s) < deg(t) or (deg(s) = deg(t) and t <lex,π s, where π(j) = n − j + 1).
(d) The product ordering w.r.t. i ∈ {1, . . . , n − 1} and the admissible orderings <1 on
X1 = [x1 , . . . , xi ] and <2 on X2 = [xi+1 , . . . , xn ]:
for s = s1 s2 , t = t1 t2 , where s1 , t1 ∈ X1 , s2 , t2 ∈ X2 , we define s <prod,i,<1 ,<2 t iff
Definition 2.1.3. Any admissible ordering < on [X] induces a partial ordering ≪ on
R[X], the induced ordering, in the following way:
f ≪ g iff f = 0 and g ≠ 0, or
f ≠ 0, g ≠ 0 and lpp(f) < lpp(g), or
f ≠ 0, g ≠ 0, lpp(f) = lpp(g) and red(f) ≪ red(g). □
One of the central notions of the theory of Gröbner bases is the concept of polynomial
reduction.
If we want to indicate which power product and coefficient are used in the reduction, we
write
g −→f,b,t h, where b = c/lc(f).
We say that g reduces to h w.r.t. F (g −→F h) iff there is an f ∈ F such that g −→f h. □
Definition 2.1.6. (a) −→ is Noetherian or has the termination property iff every reduction
sequence terminates, i.e. there is no infinite sequence x1 , x2 , . . . in M such that x1 −→
x2 −→ . . . .
(b) −→ is Church–Rosser or has the Church–Rosser property iff a ←→∗ b implies a ↓∗ b.
(c) −→ is confluent iff x ↑∗ y implies x ↓∗ y, or graphically, every diamond of the following
form can be completed:

          u
      ∗↙    ↘∗
    x          y
      ↘∗    ↙∗
          v

(d) −→ is locally confluent iff x ↑ y implies x ↓∗ y, or graphically, every diamond of the
following form can be completed:

          u
       ↙    ↘
    x          y
      ↘∗    ↙∗
          v                                                            □
As an immediate consequence of the previous definitions we get that the reduction
relation −→ is (nearly) compatible with the operations in the polynomial ring. Moreover,
the reflexive–transitive–symmetric closure ←→∗F of the reduction relation −→F is equal to
the congruence modulo the ideal generated by F .
Theorem 2.1.7. Let F ⊆ K[X]. The ideal congruence modulo ⟨F⟩ equals the reflexive–
transitive–symmetric closure of −→F , i.e. ≡⟨F⟩ = ←→∗F. □
So the congruence ≡⟨F⟩ can be decided if −→F has the Church–Rosser property. Of
course, −→F need not have this property for an arbitrary set F. Sets F for which it does
hold are distinguished bases for polynomial ideals; they are called Gröbner bases.
Definition 2.1.5. A subset F of K[X] is a Gröbner basis (for ⟨F⟩) iff −→F is Church–
Rosser. □
For testing whether a given basis F of an ideal I is a Gröbner basis it suffices to test for
local confluence of the reduction relation −→F . This, however, does not yield a decision
procedure, since there are infinitely many situations f ↑F g. However, Buchberger was
able to reduce this test for local confluence to testing only a finite number of situations
f ↑F g [Buchberger 1965]. For that purpose he introduced the notion of subtraction
polynomials, or S–polynomials for short.
is the critical pair of f and g. The difference of the elements of cp(f, g) is the S–polynomial
spol(f, g) of f and g. □
If cp(f, g) = (h1 , h2 ) then we can depict the situation graphically in the following way:
        lcm(lpp(f), lpp(g))
                •
            f↙    ↘g
          •          •
         h1          h2
The critical pairs of elements of F describe exactly the essential branchings of the reduction
relation −→F .
Theorem 2.1.8. (Buchberger’s Theorem) Let F be a subset of K[X].
(a) F is a Gröbner basis if and only if g1 ↓∗F g2 for all critical pairs (g1 , g2 ) of elements of
F.
(b) F is a Gröbner basis if and only if spol(f, g) −→∗F 0 for all f, g ∈ F .
Buchberger’s theorem suggests an algorithm for checking whether a given finite basis
is a Gröbner basis: reduce all the S–polynomials to normal forms and check whether they
are all 0. In fact, by a simple extension we get an algorithm for constructing Gröbner
bases.
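Buchberger's criterion can be checked mechanically for a concrete basis. The sketch below is our own illustration in sympy (the helper `spol` and the chosen example are not from the text): it forms all S-polynomials of a basis that sympy reports as a Gröbner basis and verifies that each one reduces to 0.

```python
import sympy as sp

x, y = sp.symbols('x y')
order = 'grevlex'

def spol(f, g):
    """S-polynomial of f and g w.r.t. the chosen admissible ordering."""
    lt_f = sp.LT(f, x, y, order=order)
    lt_g = sp.LT(g, x, y, order=order)
    m = sp.lcm(sp.LM(f, x, y, order=order), sp.LM(g, x, y, order=order))
    return sp.expand(sp.cancel(m / lt_f) * f - sp.cancel(m / lt_g) * g)

# A Groebner basis computed by sympy; by Buchberger's Theorem (b),
# every S-polynomial of its elements must reduce to 0 modulo the basis.
F = [sp.expand(g) for g in sp.groebner([x**2 + y, x*y - 1], x, y, order=order).exprs]
for f in F:
    for g in F:
        if f != g:
            remainder = sp.reduced(spol(f, g), F, x, y, order=order)[1]
            assert remainder == 0
```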
Theorem 2.1.9. (Dickson’s Lemma) Every A ⊆ [X] contains a finite subset B, such that
every t ∈ A is a multiple of some s ∈ B.
The termination of GRÖBNER B also follows from Hilbert’s Basis Theorem applied
to the initial ideals of the sets G constructed in the course of the algorithm, i.e. hin(G)i.
See Exercise 8.3.4.
The algorithm GRÖBNER B provides a constructive proof of the following theorem.
that there is a free choice of pairs in the loop):
(1) spol(f1 , f2 ) = f1 − yf2 = −xy + y − 1 =: f3 is irreducible, so G := {f1 , f2 , f3 }.
(2) spol(f2 , f3 ) = f2 + xf3 = xy −→f3 y − 1 =: f4 , so G := {f1 , f2 , f3 , f4 }.
(3) spol(f3 , f4 ) = f3 + xf4 = y − x − 1 −→f4 −x =: f5 , so G := {f1 , . . . , f5 }.
All the other S–polynomials now reduce to 0, so GRÖBNER B terminates with
G = {x^2y^2 + y − 1, x^2y + x, −xy + y − 1, y − 1, −x}. □
In addition to the original definition and the characterizations given in Theorem 2.1.8,
there are many other characterizations of Gröbner bases. We list only a few of them.
Theorem 2.1.11. Let I be an ideal in K[X], F ⊆ K[X], and ⟨F⟩ ⊆ I. Then the following
are equivalent.
(a) F is a Gröbner basis for I.
(b) f −→∗F 0 for every f ∈ I.
(c) every f ∈ I \ {0} is reducible w.r.t. −→F.
(d) For all g ∈ I, h ∈ K[X]: if g −→∗F h and h is irreducible, then h = 0.
(e) For all g, h1, h2 ∈ K[X]: if g −→∗F h1, g −→∗F h2, and h1, h2 are irreducible, then h1 = h2.
(f) ⟨in(F)⟩ = ⟨in(I)⟩.
The Gröbner basis G computed in Example 2.1.3 is much too complicated. In fact,
{y − 1, x} is a Gröbner basis for the ideal. There is a general procedure for simplifying
Gröbner bases.
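The simplification for the running example can be reproduced in sympy (our illustration, not part of the text), which returns the normed reduced basis directly:

```python
import sympy as sp

x, y = sp.symbols('x y')

# The ideal of the example above; its normed reduced Groebner basis
# collapses to {x, y - 1}.
G = sp.groebner([x**2*y**2 + y - 1, x**2*y + x], x, y, order='lex')
assert set(G.exprs) == {x, y - 1}
```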
Theorem 2.1.12. Let G be a Gröbner basis for an ideal I in K[X]. Let g, h ∈ G with
g ≠ h.
(a) If lpp(g) | lpp(h) then G′ = G \ {h} is also a Gröbner basis for I.
(b) If h −→g h′ then G′ = (G \ {h}) ∪ {h′ } is also a Gröbner basis for I.
From Theorem 2.1.12 we obviously get an algorithm for transforming any Gröbner
basis for an ideal I into a normed reduced Gröbner basis for I. No matter from which
Gröbner basis of I we start and which path we take in this transformation process, we
always reach the same uniquely defined normed reduced Gröbner basis of I.
Theorem 2.1.13. Every ideal in K[X] has a unique finite normed reduced Gröbner basis.
Observe that the normed reduced Gröbner basis of an ideal I depends, of course, on
the admissible ordering <. Different orderings can give rise to different Gröbner bases.
However, if we decompose the set of all admissible orderings into sets which induce the
same normed reduced Gröbner basis of a fixed ideal I, then this decomposition is finite.
This leads to the consideration of universal Gröbner bases. A universal Gröbner basis for
I is a basis for I which is a Gröbner basis w.r.t. any admissible ordering of the power
products.
If we have a Gröbner basis G for an ideal I, then we can compute in the vector space
K[X]/I over K. The irreducible power products (with coefficient 1) modulo G form a basis
of K[X]/I . We get that dim(K[X]/I ) is the number of irreducible power products modulo
G. Thus, this number is independent of the particular admissible ordering.
Example 2.1.4. Let I = ⟨x^3y − 2y^2 − 1, x^2y^2 + x + y⟩ in Q[x, y]. Let < be the graduated
lexicographic ordering with x > y. Then the normed reduced Gröbner basis of I has
leading power products x^4, x^3y, x^2y^2, y^3. So there are 9 irreducible power products.
If < is the lexicographic ordering with x > y, then the normed reduced Gröbner basis
of I has leading power products x and y^9. So again there are 9 irreducible power products.
In fact, dim(Q[x, y]/I) = 9. □
For a 0–dimensional ideal I in regular position a very strong structure theorem has
been derived by Gianni and Mora. I is in regular position w.r.t. the variable x1, if a1 ≠ b1
for any two different zeros (a1, . . . , an), (b1, . . . , bn) of I. An arbitrary 0–dimensional ideal
is very likely to be in regular position w.r.t. x1; if it is not, nearly every linear change of
coordinates will bring it into regular position.
Theorem 2.1.14. (Shape Lemma) Let I be a radical 0–dimensional ideal in K[X], regular
in x1. Then there are g1(x1), . . . , gn(x1) ∈ K[x1] such that g1 is squarefree, deg(gi) <
deg(g1) for i > 1, and the normed reduced Gröbner basis F for I w.r.t. the lexicographic
ordering < with x1 < · · · < xn is of the form F = {g1(x1), x2 − g2(x1), . . . , xn − gn(x1)}.
On the other hand, if the normed reduced Gröbner basis for I w.r.t. < is of this form,
then I is a radical 0–dimensional ideal.
Proof: Since I is in regular position, the first coordinates of the zeros of I are all different,
say a11, . . . , a1m. Then the squarefree polynomial g1(x1) = ∏_{i=1}^{m} (x1 − a1i) is in
I ∩ K[x1] and so it has to be in F. Since by the observation above m is the dimension of
K[X]/I, the normed reduced Gröbner basis for I has to have the specified form.
To prove the converse, let a11, . . . , a1m be the zeros of g1(x1). Then the zeros of I are
{(a1i, g2(a1i), . . . , gn(a1i)) | i = 1, . . . , m}. □
2.2 Solving ideal theoretic problems by Gröbner bases
Theorem 2.2.1. Let I be an ideal in K[X] and G a Gröbner basis for I. The irreducible
power products modulo G, viewed as polynomials with coefficient 1, form a basis for the
vector space K[X]/I over K.
Ideal membership
By definition Gröbner bases solve the ideal membership problem for polynomial ideals,
i.e.
given: f, f1 , . . . , fm ∈ K[X],
decide: f ∈ ⟨f1, . . . , fm⟩.
Let G be a Gröbner basis for I = ⟨f1, . . . , fm⟩. Then f ∈ I if and only if the normal form
of f modulo G is 0.
4z − 4xy^2 − 16x^2 − 1 = 0,
2y^2z + 4x + 1 = 0,
2x^2z + 2y^2 + x = 0
between the quantities x, y, z, and we want to decide whether the additional relation (hy-
pothesis)
g(x, y) = 4xy^4 + 16x^2y^2 + y^2 + 8x + 2 = 0
follows from them, i.e. whether we can write g as a linear combination of the axioms or,
in other words, whether g is in the ideal I generated by the axioms.
Trying to reduce the hypothesis g w.r.t. the given axioms does not result in a reduction
to 0. But we can compute a Gröbner basis for I w.r.t. the lexicographic ordering with
x < y < z, e.g. G = {g1 , g2 , g3 } where
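The basis elements g1, g2, g3 are not reproduced here, but the whole membership test can be replayed in sympy (our illustration, not part of the text); note that for the lexicographic ordering with x < y < z the largest variable z is listed first:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f1 = 4*z - 4*x*y**2 - 16*x**2 - 1
f2 = 2*y**2*z + 4*x + 1
f3 = 2*x**2*z + 2*y**2 + x
g  = 4*x*y**4 + 16*x**2*y**2 + y**2 + 8*x + 2

# Lexicographic ordering with x < y < z: list the largest variable first.
G = sp.groebner([f1, f2, f3], z, y, x, order='lex')
_, r = sp.reduced(g, list(G.exprs), z, y, x, order='lex')
assert r == 0          # g lies in the ideal, so the hypothesis follows

# Indeed g is even an explicit linear combination of the axioms:
assert sp.expand(2*f2 - y**2*f1 - g) == 0
```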
Radical membership
Sometimes, especially in applications in geometry, we are not so much interested in
the ideal membership problem but in the radical membership problem, i.e.
given: f, f1 , . . . , fm ∈ K[X],
decide: f ∈ radical(⟨f1, . . . , fm⟩).
The radical of an ideal I is the ideal consisting of all those polynomials f, some power
of which is contained in I. So f ∈ radical(I) ⇐⇒ f^n ∈ I for some n ∈ N. Geometrically,
f ∈ radical(⟨f1, . . . , fm⟩) means that the hypersurface defined by f contains all the points
in the variety (algebraic set) defined by f1, . . . , fm.
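Radical membership can be decided by a single Gröbner basis computation via the well-known Rabinowitsch trick (not stated explicitly above): f ∈ radical(⟨F⟩) iff 1 ∈ ⟨F, 1 − t·f⟩ in K[X, t] for a new variable t. A sympy sketch, with our own helper name `in_radical`:

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

def in_radical(f, F, gens):
    """Rabinowitsch trick: f is in radical(<F>) iff
    1 lies in <F, 1 - t*f> in K[gens, t]."""
    G = sp.groebner(list(F) + [1 - t*f], *gens, t, order='lex')
    return G.exprs == [1]

# x vanishes on V(x^2), so x is in the radical but not in the ideal;
# x + 1 does not vanish there, so it is not in the radical.
assert in_radical(x, [x**2], (x, y))
assert not in_radical(x + 1, [x**2], (x, y))
```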
The following extremely important theorem relates the radical of an ideal I to the
set of common roots V (I) of the polynomials contained in I. We will give a proof of this
theorem later.
Equality of ideals
We want to decide whether two given ideals are equal, i.e. we want to solve the ideal
equality problem:
given: f1, . . . , fm, g1, . . . , gk ∈ K[X],
decide: I = J, where I = ⟨f1, . . . , fm⟩ and J = ⟨g1, . . . , gk⟩.
Choose any admissible ordering. Let GI , GJ be the normed reduced Gröbner bases of
I and J, respectively. Then by Theorem 2.1.13 I = J if and only if GI = GJ .
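This test is immediate in sympy (our illustration, assuming an example of our own choosing), since `groebner` returns the normed reduced basis:

```python
import sympy as sp

x, y = sp.symbols('x y')

# Two different bases generate the same ideal iff their normed reduced
# Groebner bases (w.r.t. the same admissible ordering) coincide.
GI = sp.groebner([x + y, x - y], x, y, order='lex')
GJ = sp.groebner([x, y], x, y, order='lex')
assert GI.exprs == GJ.exprs == [x, y]
```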
f1(x1, . . . , xn) = 0,
  ⋮                                                          (2.1)
fm(x1, . . . , xn) = 0,
Theorem 2.2.3. Let G be a normed Gröbner basis of I. (2.1) is unsolvable in K̄^n if and
only if 1 ∈ G. (Here K̄ denotes the algebraic closure of K.)
Now suppose that (2.1) is solvable. We want to determine whether there are finitely
or infinitely many solutions of (2.1) or, in other words, whether or not the ideal I is
0–dimensional.
Theorem 2.2.4. Let G be a Gröbner basis of I. Then (2.1) has finitely many solutions
(i.e. I is 0–dimensional) if and only if for every i, 1 ≤ i ≤ n, there is a polynomial gi ∈ G
such that lpp(gi ) is a pure power of xi . Moreover, if I is 0–dimensional then the number
of zeros of I (counted with multiplicity) is equal to dim(K[X]/I ).
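The criterion of Theorem 2.2.4 is straightforward to implement. The sketch below is our own sympy illustration (the helper name `is_zero_dimensional` and the examples are ours): it checks whether, for each variable, some leading power product is a pure power of that variable.

```python
import sympy as sp

x, y = sp.symbols('x y')
gens = (x, y)

def is_zero_dimensional(polys):
    """Theorem 2.2.4: finitely many solutions iff for every variable
    some leading power product of the basis is a pure power of it."""
    G = sp.groebner(polys, *gens, order='lex')
    lead = [sp.Poly(g, *gens).monoms(order='lex')[0] for g in G.exprs]
    return all(any(e[i] > 0 and all(e[j] == 0
                                    for j in range(len(gens)) if j != i)
                   for e in lead)
               for i in range(len(gens)))

assert is_zero_dimensional([x**2 + y**2 - 1, x - y])   # two points
assert not is_zero_dimensional([x - y])                # a whole line
```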
The rôle of the Gröbner basis algorithm GRÖBNER B in solving systems of algebraic
equations is the same as that of Gaussian elimination in solving systems of linear equations,
namely to triangularize the system, or carry out the elimination process. The crucial
observation, first stated by Trinks in 1978, is the elimination property of Gröbner bases. It
states that if G is a Gröbner basis of I w.r.t. the lexicographic ordering with x1 < . . . < xn ,
then the i–th elimination ideal of I, i.e. I ∩K[x1 , . . . , xi ], is generated by those polynomials
in G that depend only on the variables x1 , . . . , xi .
A proof can be found in [Win96], Chap.8. Theorem 2.2.5 can clearly be generalized
to product orderings, without changing anything in the proof.
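The elimination property can be observed in a small sympy example of our own (not from the text); with the lexicographic ordering in which x is the largest variable, the basis elements free of x generate the elimination ideal I ∩ Q[y]:

```python
import sympy as sp

x, y = sp.symbols('x y')

# Lexicographic ordering with y < x: x is eliminated first.
G = sp.groebner([x*y - 1, x**2 - y], x, y, order='lex')
# The elimination ideal I ∩ Q[y] is generated by the basis elements
# that do not involve x (Trinks' elimination property).
elim = [g for g in G.exprs if x not in g.free_symbols]
assert elim == [y**3 - 1]
```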
are polynomials in Q[x, y, z]. We are looking for solutions of this system of algebraic
equations in Q̄^3, where Q̄ is the field of algebraic numbers.
Let < be the lexicographic ordering with x < y < z. The algorithm GRÖBNER B
applied to F = {f1 , f2 , f3 } yields (after reducing the result) the reduced Gröbner basis
G = {g1 , g2 , g3 }, where
By Theorem 2.2.3 the system is solvable. Furthermore, by Theorem 2.2.4, the system has
finitely many solutions. The Gröbner basis G yields an equivalent triangular system in
which the variables are completely separated. So we can get solutions by solving the
univariate polynomial g3 and propagating the partial solutions upwards to solutions of the
full system. The univariate polynomial g3 is irreducible over Q, and the solutions are
(α, ± (1/(2√6)) · √(α(16α^3 − 108α^2 + 16α − 17)), −(1/65)(64α^4 − 432α^3 + 168α^2 − 354α + 104)),
Proof: (a) and (b) are easily seen.
(c) Let f ∈ I ∩ J. Then tf ∈ ⟨t⟩·I(t) and (1 − t)f ∈ ⟨1 − t⟩·J(t). Therefore
f = tf + (1 − t)f ∈ ⟨t⟩·I(t) + ⟨1 − t⟩·J(t).
On the other hand, let f ∈ (⟨t⟩·I(t) + ⟨1 − t⟩·J(t)) ∩ K[X]. So f = g(X, t) + h(X, t),
where g ∈ ⟨t⟩·I(t) and h ∈ ⟨1 − t⟩·J(t). In particular, h(X, t) is a linear combination of the
basis elements (1 − t)g1, . . . , (1 − t)gs of ⟨1 − t⟩·J(t). Evaluating t at 0 we get
f = g(X, 0) + h(X, 0) = h(X, 0) ∈ J.
Similarly, by evaluating t at 1 we get f = g(X, 1) ∈ I.
(d) h ∈ I : J if and only if hg ∈ I for all g ∈ J, if and only if hgj ∈ I for all 1 ≤ j ≤ s, if
and only if h ∈ I : ⟨gj⟩ for all 1 ≤ j ≤ s.
If f ∈ ⟨h1/g, . . . , hm/g⟩ and a ∈ ⟨g⟩ then af ∈ ⟨h1, . . . , hm⟩ = I ∩ ⟨g⟩ ⊂ I, i.e. f ∈ I : ⟨g⟩.
Conversely, suppose f ∈ I : ⟨g⟩. Then fg ∈ I ∩ ⟨g⟩. So fg = ∑_k bk hk for some bk ∈ K[X].
Thus,
f = ∑_k bk · (hk/g) ∈ ⟨h1/g, . . . , hm/g⟩,
where each hk/g is a polynomial. □
So all these operations can be carried out effectively by operations on the bases of the
ideals. In particular the intersection can be computed by Theorem 2.2.6(c).
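The intersection construction of Theorem 2.2.6(c) amounts to eliminating a tag variable t from t·I + (1 − t)·J. A sympy sketch with an example of our own (I = ⟨x⟩, J = ⟨y⟩, so I ∩ J = ⟨xy⟩):

```python
import sympy as sp

x, y, t = sp.symbols('x y t')

# I ∩ J = ( t*I + (1-t)*J ) ∩ K[x, y]: eliminate t with a lexicographic
# basis in which t is the largest variable.
I, J = [x], [y]
mixed = [t*f for f in I] + [(1 - t)*g for g in J]
G = sp.groebner(mixed, t, x, y, order='lex')
inter = [g for g in G.exprs if t not in g.free_symbols]
assert inter == [x*y]
```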
We always have I · J ⊂ I ∩ J. However, I ∩ J could be strictly larger than I · J.
For example, if I = J = ⟨x, y⟩, then I · J = ⟨x^2, xy, y^2⟩ and I ∩ J = I = J = ⟨x, y⟩.
Both I · J and I ∩ J correspond to the same variety. Since a basis for I · J is more easily
computed, why should we bother with I ∩ J? The reason is that the intersection behaves
much better with respect to the operation of taking radicals (recall that it is really the
radical ideals that uniquely correspond to algebraic sets). Whereas the product of radical
ideals in general fails to be radical (consider I ·I), the intersection of radical ideals is always
radical. For a proof see Theorem 8.4.10 in [Win96].
Theorem 2.2.7. Let I, J be ideals in K[X]. Then √(I ∩ J) = √I ∩ √J (here √I denotes
the radical of I).
Now let us compute the ideal I6 defining V(I5) − V(I3), i.e. the Zariski closure of
V(I5) \ V(I3), which is the smallest algebraic set containing V(I5) \ V(I3).
For proving Theorem 2.2.8 we use the following lemma (as suggested in [CLO98]), the
proof of which we leave as an exercise.
Lemma 2.2.9. Let I be an ideal in K[x1, . . . , xn], and let p = (x1 − a1) · · · (x1 − ad), where
a1, . . . , ad are distinct.
(a) Then I + ⟨p⟩ ⊂ ∩_j (I + ⟨x1 − aj⟩).
(b) Let pj = ∏_{i≠j} (x1 − ai). Then pj · (I + ⟨x1 − aj⟩) ⊂ I + ⟨p⟩.
(c) p1, . . . , pd are relatively prime, and therefore there are polynomials h1, . . . , hd such
that 1 = ∑_j hj pj.
(d) ∩_j (I + ⟨x1 − aj⟩) ⊂ I + ⟨p⟩.
(e) From these partial results we finally get
I + ⟨p⟩ = ∩_{j=1}^{d} (I + ⟨x1 − aj⟩).
Proof of Theorem 2.2.8: Write J = I + ⟨p̃1, . . . , p̃n⟩. We first prove that J is a radical
ideal, i.e. that √J = J. For each i, using the fact that K is algebraically closed, we can
factor p̃i to obtain
p̃i = (xi − ai1) · · · (xi − ai,di),
where the aij are distinct for given i ∈ {1, . . . , n}. Then
J = J + ⟨p̃1⟩ = ∩_j (J + ⟨x1 − a1j⟩),
where the first equality holds since p̃1 ∈ J and the second follows from Lemma 2.2.9. Now
use p̃2 to decompose each J + ⟨x1 − a1j⟩ in the same way. This gives
J = ∩_{j,k} (J + ⟨x1 − a1j, x2 − a2k⟩).
Continuing with p̃3, . . . , p̃n we finally decompose J into an intersection of ideals of the
form J + ⟨x1 − a1j1, . . . , xn − anjn⟩. Since ⟨x1 − a1j1, . . . , xn − anjn⟩ is a maximal ideal,
the ideal J + ⟨x1 − a1j1, . . . , xn − anjn⟩ is either ⟨x1 − a1j1, . . . , xn − anjn⟩ or the whole
polynomial ring K[x1, . . . , xn]. Since a maximal ideal is radical and an intersection of
radical ideals is radical (compare Theorem 2.2.7), we conclude that J is a radical ideal.
Now we can prove that J = √I. The inclusion I ⊂ J is obvious from the definition
of J. The inclusion J ⊂ √I follows from Hilbert’s Nullstellensatz (Theorem 4.2.3), since
the polynomials p̃i vanish at all the points of V(I). Hence we have
I ⊂ J ⊂ √I.
Taking radicals in this chain of inclusions shows that √J = √I. But J is radical, so
J = √J = √I and we are done. □
Observe that all modules in this sequence except M are free.
If there is an l ∈ N such that nl ≠ 0 but nk = 0 for all k > l, then we say that the resolution
is finite, of length l. A finite resolution of length l is usually written as
Let’s see how we can construct a free resolution of a finitely generated module M =
hm1 , . . . , mn0 i. We determine a basis (generating set) {s1 , . . . , sn1 } of Syz(m1 , . . . , mn0 ),
the syzygy module of (m1 , . . . , mn0 ). Let
ϕ0 : R^{n0} −→ M,    (r1, . . . , r_{n0})^T ↦ ∑ ri mi,
ϕ1 : R^{n1} −→ R^{n0},    (r1, . . . , r_{n1})^T ↦ ∑ ri si.
Then we have im(ϕ1 ) = Syz(mi ) = ker(ϕ0 ), so the sequence
is exact. Continuing this process with Syz(mi ) instead of M , we finally get a free resolution
of M .
I = ⟨x^2 − x, xy, y^2 − y⟩, generated by F = (x^2 − x, xy, y^2 − y),
in R = K[x, y]. In geometric terms, I is the ideal of the variety V = {(0, 0), (1, 0), (0, 1)}
in K 2 . Let
ϕ0 : R^3 −→ I,    (r1, r2, r3)^T ↦ (x^2 − x, xy, y^2 − y) · (r1, r2, r3)^T = A · (r1, r2, r3)^T,
where A denotes the row matrix (x^2 − x, xy, y^2 − y).
The mapping ϕ0 represents the generation of I from the free module R3 . Next we determine
relations between the generators, i.e. (first) syzygies. The columns of the matrix
         y       0
B =   −x + 1   y − 1
         0      −x
generate the syzygy module Syz(F ). Bases for syzygy modules can be computed via
Gröbner bases; see for instance Theorem 8.4.8 in [Win96]. So for
ϕ1 : R^2 −→ R^3,    (r1, r2)^T ↦ B · (r1, r2)^T,
we get the exact sequence
R^2 −→^{ϕ1} R^3 −→^{ϕ0} I −→ 0.
The resolution process terminates right here. If (c1 , c2 ) is any syzygy of the columns of B,
i.e. a second syzygy of F , then
c1 · (y, −x + 1, 0)^T + c2 · (0, y − 1, −x)^T = (0, 0, 0)^T.
Looking at the first component we see that c1 y = 0, so c1 = 0. Similarly, from the third
component we get c2 = 0. Hence the kernel of ϕ1 is the zero module 0. There are no non-
trivial relations between the columns of B, so the first syzygy module Syz(F ) is isomorphic
to the free module R2 . Finally this leads to the free resolution
0 −→ R^2 −→^{ϕ1} R^3 −→^{ϕ0} I −→ 0
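The syzygy computation above can be double-checked mechanically: each column of B must be a relation among the generators, i.e. A·B = 0. A sympy sketch (our illustration, not part of the text):

```python
import sympy as sp

x, y = sp.symbols('x y')

A = sp.Matrix([[x**2 - x, x*y, y**2 - y]])   # the generators of I as a row
B = sp.Matrix([[y,      0    ],
               [-x + 1, y - 1],
               [0,      -x   ]])             # candidate syzygies as columns

# Each column of B is a syzygy of F: the product A * B must vanish.
assert (A * B).applyfunc(sp.expand) == sp.zeros(1, 2)
```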
f1 z1 + . . . + fs zs = f, (2.2)
f1 z1 + . . . + fs zs = 0. (2.3)
Let F be the vector (f1 , . . . , fs ). The general solution of (2.2) and (2.3) is to be sought in
K[X]s . The solutions of (2.3) form a module over the ring K[X], a submodule of K[X]s
over K[X].
Definition 2.2.5. Any solution of (2.3) is called a syzygy of the sequence of polynomials
f1, . . . , fs. The module of all solutions of (2.3) is the module of syzygies Syz(F) of
F = (f1, . . . , fs). □
Theorem 2.2.6. If the elements of F = (f1 , . . . , fs ) are a Gröbner basis, then from the
reductions of the S-polynomials to 0 one can extract a basis for Syz(F ).
Theorem 2.2.7. Let F = (f1 , . . . , fs )T be a vector of polynomials in K[X] and let the
elements of G = (g1 , . . . , gm )T be a Gröbner basis for hf1 , . . . , fs i. We view F and G as
column vectors. Let the r rows of the matrix R be a basis for Syz(G) and let the matrices
A, B be such that G = A · F and F = B · G. Then the rows of Q are a basis for Syz(F ),
where
Q = ( Is − B · A
        R · A   ).
What we still need is a particular solution of the inhomogeneous equation (2.2). Let
G = (g1 , . . . , gm ) be a Gröbner basis for hF i and let A be the transformation matrix such
that G = A · F (G and F viewed as column vectors). Then a particular solution of (2.2)
exists if and only if f ∈ ⟨F⟩ = ⟨G⟩. If the reduction of f to normal form modulo G yields
f′ ≠ 0, then (2.2) is unsolvable. Otherwise we can extract from this reduction polynomials
h′1, . . . , h′m such that
g1 h′1 + . . . + gm h′m = f.
So H = (h′1 , . . . , h′m ) · A is a particular solution of (2.2).
2.3 Basis conversion for 0-dimensional ideals — FGLM
Gröbner bases are strongly dependent on the admissible ordering < on the terms or
power products. Also the computational complexity is strongly influenced by <.
Let I be a 0-dimensional ideal, i.e. an ideal having finitely many common solutions
in K̄^n. Let a basis of I be given by polynomials of total degree at most d. Then the
complexity of computing a Gröbner basis for I is as follows:
• w.r.t. the graduated reverse lexicographic ordering:
d^{O(n^2)}
in general, and
d^{O(n)}
if there are also only finitely many common solutions “at infinity”, e.g. if the basis is
homogeneous,
• w.r.t. the lexicographic ordering:
d^{O(n^3)}.
But lexicographic orderings are often the orderings of choice for practical problems,
because such Gröbner bases have the elimination property (Theorem 2.2.5). Let D =
dim_K K[x1, . . . , xn]/I, i.e. D is the number of solutions of I, counted with multiplicities.
Then by methods of linear algebra we can transform a Gröbner basis w.r.t. an ordering
<1 into a Gröbner basis w.r.t. another ordering <2, where the number of arithmetic
operations in this transformation is O(n · D^3).
We will use the following notation:
R = K[x1 , . . . , xn ],
I is a 0-dimensional ideal in R,
G is a reduced Gröbner basis w.r.t. the ordering <,
D(I) = dim_K R/I, the degree of the ideal I,
B(G) = {b | b irreducible w.r.t. G}, the canonical basis of the K-vector space R/I,
M(G) = {xi b | b ∈ B(G), 1 ≤ i ≤ n, xi b ∉ B(G)}, the margin of G.
Theorem 2.3.1. For every m ∈ M (G) exactly one of the following conditions holds:
(i) for every variable xi occurring in m (i.e. xi |m) we have m/xi ∈ B(G); this is the case
if and only if m = lpp(g) for some g ∈ G,
(ii) m = xj mk for some j and some mk ∈ M (G).
Proof: (i) For such an m we have m −→G (m is reducible by G), i.e. lpp(g) | m for
some g ∈ G. But m/xi is irreducible modulo G for every variable xi. So we must have
m = lpp(g). Since G is a reduced Gröbner basis, we also get the converse.
(ii) Let xj | m and m/xj ∉ B(G). Let mk = m/xj. So m = xj mk = xi b for some variable
xi and b ∈ B(G). We have i ≠ j and mk/xi = b/xj ∈ B(G). Thus mk = xi (b/xj) ∈ M(G).
□
By nf G (f ) we denote the (uniquely defined) normal form of the polynomial f w.r.t. the
Gröbner basis G. We investigate the n linear mappings φi (i = 1, . . . , n) on B(G), defined by
φi : bk ↦ nf_G(xi bk).
Definition 2.3.1. The translational tensor T(G) = (tijk) of order n × D(I) × D(I) is
defined by nf_G(xi bk) = ∑_{j=1}^{D(I)} tijk bj, where B(G) = {b1, . . . , bD(I)}.
Theorem 2.3.2. Let I be a 0-dimensional ideal, G1 a reduced Gröbner basis for I w.r.t.
<1 , and <2 a different admissible ordering.
Then a Gröbner basis G2 for I w.r.t. <2 can be computed by means of linear algebra.
This requires O(n · D(I)3 ) arithmetic operations.
Proof: Let
B(G1 ) = {a1 , . . . , aD(I) }, M (G1 ), T (G1 ) as above.
We have to determine
B(G2 ) = {b1 . . . , bD(I) } and G2 .
If I = R, then obviously G1 = G2 = {1} and B(G1 ) = B(G2 ) = ∅. So let us assume that
I 6= R.
For determining B(G2) and G2 we construct a matrix C = (cki) such that
bi = ∑_{j=1}^{D(I)} cji aj,    for every bi ∈ B(G2).
We proceed iteratively and start by setting
Now let
m := min_{<2} { xj bi | 1 ≤ j ≤ n, bi ∈ B(G2), xj bi ∉ B(G2) ∪ M(G2) }.
Then, by Theorem 2.3.1, we are necessarily in one of the following three cases:
(1) m = lpp(g) for some g which has to be added to G2 ,
(2) m has to be added to B(G2 ),
(3) m has to be added to M (G2 ), but m is a proper multiple of lpp(g) for some g ∈ G2 .
Case (3) can be checked easily: lpp(g) < m for every admissible ordering <, and therefore
this g has already been added to G2 , i.e. we already have lpp(g) in M (G2 ).
Now let us consider the cases (1) and (2): using the precomputed tensor T(G1) = (tijk)
and the already computed components of C we can determine the coordinates of m = xj bi
w.r.t. B(G1) as follows:
m = xj bi = xj ∑_k cki ak
          = ∑_k cki (xj ak)
          = ∑_k cki ∑_h tjhk ah
          = ∑_h ( ∑_k tjhk cki ) ah
          =: ∑_h c(m)h ah.
If the vector
c(m) = (c(m)1 , . . . , c(m)D(I) )
is linearly independent of the columns of C, then we are in case (2) and we have found a
new term m ∈ B(G2).
On the other hand, if c(m) is linearly dependent on the columns of C, then from this
dependence we get a new element g ∈ G2.
We leave the complexity bound as an exercise. □
The proof of Theorem 2.3.2 is constructive and we can extract the following algorithm
for basis transformation. Since this algorithm is based on the paper of Faugère, Gianni,
Lazard, and Mora, it is called the FGLM algorithm.
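This kind of basis conversion is available off the shelf: sympy exposes it (to our knowledge as the method `GroebnerBasis.fglm`, which requires a 0-dimensional ideal over Q). A small sketch with an example of our own:

```python
import sympy as sp

x, y = sp.symbols('x y')

# A cheap grevlex basis, converted to the expensive lex basis by FGLM.
F = [x**2 + y, x*y - 1]          # a 0-dimensional ideal (three solutions)
G_grevlex = sp.groebner(F, x, y, order='grevlex')
G_lex = G_grevlex.fglm('lex')

# The conversion agrees with a direct lex computation.
assert list(G_lex.exprs) == list(sp.groebner(F, x, y, order='lex').exprs)
```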
is a reduced Gröbner basis w.r.t. <1 , the graduated lexicographic ordering with x < y, for
I = hG1 i.
We want to determine a Gröbner basis G2 for I w.r.t. <2 , the lexicographic ordering
with x < y.
In Step (1) of FGLM we determine
a = (a1, . . . , a7) = B(G1) = (1, x, x^2, y, xy, x^2y, y^2),
M(G1) = {y^3, xy^2, x^2y^2, x^3y, x^3},
T(G1):    1    x    x^2    y    xy    x^2y    y^2
  x·a1         1
  y·a1                     1
  x·a2              1
  y·a2                          1
  x·a3                     2    −2
  y·a3                                1
  x·a4                          1
  y·a4                                        1
  x·a5                                1
  y·a5                          1
  x·a6                          −2            2
  y·a6                                1
  x·a7                          1
  y·a7                                        1
The matrix C will be determined in the course of the execution of FGLM, but we already
give the final result here.
C:       b1   b2   b3    b4    b5    b6    b7
         1    x    x^2   x^3   x^4   x^5   x^6
  a1     1    0    0     0     0     0     0
  a2     0    1    0     0     0     0     0
  a3     0    0    1     0     0     0     0
  a4     0    0    0     2     0     0     0
  a5     0    0    0     −2    2     4     −8
  a6     0    0    0     0     −2    2     4
  a7     0    0    0     0     0     −4    4
m = min_{<2} {y, x^2, xy} = x^2,
not case (3),
m = x^2 = (0, 0, 1, 0, 0, 0, 0) · a, linearly independent of the first two columns of C,
so m is added to B(G2);

m = min_{<2} {y, xy, x^3, x^2y} = x^3,
not case (3),
m = x^3 = (0, 0, 0, 2, −2, 0, 0) · a, linearly independent of the first three columns of C,
so m is added to B(G2);

m = min_{<2} {y, xy, x^2y, x^4, x^3y} = x^4,
not case (3),
m = x^4 = (0, 0, 0, 0, 2, −2, 0) · a, linearly independent of the first four columns of C,
so m is added to B(G2);

m = min_{<2} {y, xy, x^2y, x^3y, x^5, x^4y} = x^5,
2.4 Resultants
Lemma 2.4.1. Let a, b ∈ K[x] be polynomials of degrees m > 0 and n > 0, respectively.
Then a and b have a common nonconstant factor if and only if there are polynomials
c, d ∈ K[x] such that:
(i) ac + bd = 0,
(ii) c and d are not both zero,
(iii) c has degree at most n − 1 and d has degree at most m − 1.
We can use linear algebra to decide the existence of c and d, and in the positive case
compute them. The idea is to turn ac + bd = 0 into a system of linear equations as follows:
a = a_m x^m + · · · + a_0,   a_m ≠ 0,
b = b_n x^n + · · · + b_0,   b_n ≠ 0,
c = c_{n−1} x^{n−1} + · · · + c_0,
d = d_{m−1} x^{m−1} + · · · + d_0.
Def. 2.4.1. The Sylvester matrix of a and b w.r.t. x, denoted Syl_x(a, b), is the coefficient
matrix of the linear system above.
The resultant of a and b w.r.t. x, denoted res_x(a, b), is the determinant of the
Sylvester matrix. □
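The connection between common factors and the resultant (Lemma 2.4.1 together with the definition above) can be tried out in sympy, which provides the resultant directly (our illustration, with examples of our own):

```python
import sympy as sp

x = sp.symbols('x')

a = (x - 1)*(x - 2)
b = (x - 1)*(x - 3)
# A common nonconstant factor (here x - 1) forces the resultant to vanish ...
assert sp.resultant(a, b, x) == 0
# ... while coprime inputs give a nonzero resultant.
assert sp.resultant(x - 2, x - 3, x) != 0
```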
Solving systems of algebraic equations by resultants
Theorem 2.4.3 suggests a method for determining the solutions of a system of alge-
braic, i.e. polynomial, equations over an algebraically closed field. Suppose, for example,
that a system of three algebraic equations is given as
Let, e.g.,
b(x) = resz (resy (a1 , a2 ), resy (a1 , a3 )),
c(y) = resz (resx (a1 , a2 ), resx (a1 , a3 )),
d(z) = resy (resx (a1 , a2 ), resx (a1 , a3 )).
In fact, we might compute these resultants in any other order. By Theorem 2.4.3, all the
roots (α1 , α2 , α3 ) of the system satisfy b(α1 ) = c(α2 ) = d(α3 ) = 0. So if there are finitely
many solutions, we can check for all of the candidates whether they actually solve the
system.
Unfortunately, there might be solutions of b, c, or d, which cannot be extended to
solutions of the original system, as we can see from Example 1.2.
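The elimination step can be illustrated in sympy with a small bivariate system of our own choosing (not from the text); here all candidate values happen to extend to true solutions:

```python
import sympy as sp

x, y = sp.symbols('x y')

a1 = x**2 + y**2 - 5
a2 = x*y - 2

# Eliminate x: every solution (alpha, beta) satisfies c(beta) = 0.
c = sp.resultant(a1, a2, x)
assert sp.expand(c) == y**4 - 5*y**2 + 4   # roots beta in {1, -1, 2, -2}

# Here every candidate beta extends to a solution, via x = 2/beta:
sols = [(sp.Rational(2, b), sp.Integer(b)) for b in (1, -1, 2, -2)]
assert all(a1.subs({x: u, y: v}) == 0 and a2.subs({x: u, y: v}) == 0
           for u, v in sols)
```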
For further reading on resultants we refer to [CLO98].