
Chapter 2

Elimination theory

2.1 Existence and construction of Gröbner bases

The material of this chapter is largely taken from [Win96], which also contains proofs
of the theorems.
Before we start with the technical details, let us briefly review the historical devel-
opment leading to the concept of Gröbner bases. In his seminal paper of 1890 D. Hilbert
gave a proof of his famous Basis Theorem as well as of the structure and length of the
sequence of syzygy modules of a polynomial system. Implicitly he also showed that the
Hauptproblem, i.e. the problem whether f ∈ I for a given polynomial f and polynomial
ideal I, can be solved effectively. Hilbert’s solution of the Hauptproblem (and similar
problems) was reinvestigated by G. Hermann in 1926. She counted the field operations
required in this effective procedure and arrived at a double exponential upper bound in
the number of variables. In fact, Hermann's (and for that matter Hilbert's) algorithm
always exhibits this worst-case double exponential complexity. The next important step
came when B. Buchberger, in his doctoral thesis of 1965 advised by W. Gröbner, intro-
duced the notion of a Gröbner basis (he did not call it that at this time) and also gave an
algorithm for computing it. Gröbner bases are very special and useful bases for polynomial
ideals. In subsequent publications Buchberger exhibited important additional applications
of his Gröbner bases method, e.g. to the solution of systems of polynomial equations. In
the worst case, Buchberger’s Gröbner bases algorithm is also double exponential in the
number of variables, but in practice there are many interesting examples which can be
solved in reasonable time. But still, in the worst case, the double exponential behaviour
is not avoided. And, in fact, it cannot be avoided by any algorithm capable of solving the
Hauptproblem, as was shown by E.W. Mayr and A.R. Meyer in 1982.

When we are solving systems of polynomial (algebraic) equations, the important pa-
rameters are the number of variables n and the degree of the polynomials d. The Buch-
berger algorithm for constructing Gröbner bases is at the same time a generalization of
Euclid’s algorithm for computing the greatest common divisor (GCD) of univariate poly-
nomials (the case n = 1) and of Gauss’ triangularization algorithm for linear systems (the
case d = 1). Both these algorithms are concerned with solving systems of polynomial
equations, and they determine a canonical basis (either the GCD of the inputs or a tri-
angularized form of the system) for the given polynomial system. Buchberger’s algorithm
can be seen as a generalization to the case of arbitrary n and d.
Let K be a computable field and K[X] = K[x1 , . . . , xn ] the polynomial ring in n
indeterminates over K. If F is any subset of K[X] we write ⟨F⟩ or ideal(F ) for the ideal
generated by F in K[X]. By [X] we denote the monoid (under multiplication) of power
products x1^i1 · · · xn^in in x1, . . . , xn; 1 = x1^0 · · · xn^0 is the unit element of the monoid [X].
lcm(s, t) denotes the least common multiple of the power products s and t.
Commutative rings with 1 in which the basis condition holds, i.e. in which every ideal
has a finite basis, are usually called Noetherian rings. This terminology is motivated by the
following lemma.

Lemma 2.1.1. In a Noetherian ring there are no infinitely ascending chains of ideals. ⊓

Theorem 2.1.2. (Hilbert’s Basis Theorem) If R is a Noetherian ring then also the uni-
variate polynomial ring R[x] is Noetherian.
A proof of Hilbert’s Basis Theorem will be given in a later chapter. Hilbert’s Basis
Theorem implies that the multivariate polynomial ring K[X] is Noetherian, if K is a field.
So every ideal I in K[X] has a finite basis, and if we can compute effectively with finite
bases, then we can effectively handle all the ideals in K[X].
We will define a Gröbner basis of a polynomial ideal via a certain reduction relation
for polynomials. A Gröbner basis will be a basis with respect to which the corresponding
reduction relation is confluent. Before we can define the reduction relation on the polyno-
mial ring, we have to introduce an ordering of the power products with respect to which
the reduction relation should be decreasing.

Definition 2.1.1. Let < be an ordering on [X] that is compatible with the monoid
structure, i.e.
(i) 1 = x1^0 · · · xn^0 < t for all t ∈ [X] \ {1}, and
(ii) s < t =⇒ su < tu for all s, t, u ∈ [X].
We call such an ordering < on [X] an admissible ordering. ⊓

Example 2.1.1. We give some examples of frequently used admissible orderings on [X].
(a) The lexicographic ordering with x_π(1) > x_π(2) > . . . > x_π(n), π a permutation of
{1, . . . , n}:
x1^i1 · · · xn^in <lex,π x1^j1 · · · xn^jn iff there exists a k ∈ {1, . . . , n} such that
i_π(l) = j_π(l) for all l < k, and i_π(k) < j_π(k).
If π = id, we get the usual lexicographic ordering <lex.
(b) The graduated lexicographic ordering w.r.t. the permutation π and the weight function
w : {1, . . . , n} → R+:
for s = x1^i1 · · · xn^in, t = x1^j1 · · · xn^jn we define s <glex,π,w t iff

    Σ_{k=1}^n w(k) ik < Σ_{k=1}^n w(k) jk   or
    ( Σ_{k=1}^n w(k) ik = Σ_{k=1}^n w(k) jk and s <lex,π t ).

We get the usual graduated lexicographic ordering <glex by setting π = id and w ≡ 1.
(c) The graduated reverse lexicographic ordering:
we define s <grlex t iff

    deg(s) < deg(t)   or   ( deg(s) = deg(t) and t <lex,π s, where π(j) = n − j + 1 ).

(d) The product ordering w.r.t. i ∈ {1, . . . , n − 1} and the admissible orderings <1 on
X1 = [x1, . . . , xi] and <2 on X2 = [xi+1, . . . , xn]:
for s = s1 s2, t = t1 t2, where s1, t1 ∈ X1, s2, t2 ∈ X2, we define s <prod,i,<1,<2 t iff

    s1 <1 t1   or   ( s1 = t1 and s2 <2 t2 ). ⊓
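The orderings of Example 2.1.1 can be sketched as comparisons of exponent vectors (a minimal sketch with our own helper functions, taking π = id and weight w ≡ 1 throughout; the assertions use pairs of power products that the orderings rank differently):

```python
def lex_less(s, t):
    # s <_lex t: at the first index where the exponent vectors differ,
    # s has the smaller exponent (variables ordered x1 > x2 > ... > xn)
    return list(s) < list(t)

def glex_less(s, t):
    # graduated lex: compare total degree first, break ties with <_lex
    return (sum(s), list(s)) < (sum(t), list(t))

def grlex_less(s, t):
    # graduated reverse lex: compare total degree first; on a tie, compare
    # t <_lex,pi s where pi reverses the order of the variables
    if sum(s) != sum(t):
        return sum(s) < sum(t)
    return list(reversed(t)) < list(reversed(s))

# In three variables: x1*x2^2 = (1,2,0) and x1^2*x3 = (2,0,1), both of degree 3.
assert glex_less((1, 2, 0), (2, 0, 1))    # x1*x2^2 <_glex x1^2*x3
assert grlex_less((2, 0, 1), (1, 2, 0))   # but x1^2*x3 <_grlex x1*x2^2
```

The two orderings agree on total degree but break ties differently, which the degree-3 pair above makes visible.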


A complete classification of admissible orderings was given by L. Robbiano in 1985.

Lemma 2.1.3. Let < be an admissible ordering on [X].


(i) If s, t ∈ [X] and s divides t, then s ≤ t.
(ii) < (or actually >) is Noetherian, i.e. there are no infinite chains of the form t0 > t1 >
t2 > . . ., and consequently every non-empty subset of [X] has a smallest element.

Throughout this chapter let R be a commutative ring with 1, K a field, X a set of


variables, and < an admissible ordering on [X].

Definition 2.1.2. Let s be a power product in [X], f a non-zero polynomial in R[X], F


a subset of R[X].
By coeff(f, s) we denote the coefficient of s in f .
lpp(f ) := max< {t ∈ [X] | coeff(f, t) ≠ 0} (leading power product of f ),
lc(f ) := coeff(f, lpp(f )) (leading coefficient of f ),
in(f ) := lc(f )lpp(f ) (initial of f ),
red(f ) := f − in(f ) (reductum of f ),
lpp(F ) := {lpp(f ) | f ∈ F \ {0}},
lc(F ) := {lc(f ) | f ∈ F \ {0}},
in(F ) := {in(f ) | f ∈ F \ {0}},
red(F ) := {red(f ) | f ∈ F \ {0}}. ⊓

If I is an ideal in R[X], then lc(I) ∪ {0} is an ideal in R. However, in(F ) ∪ {0} in


general is not an ideal in R[X].

Definition 2.1.3. Any admissible ordering < on [X] induces a partial ordering ≪ on
R[X], the induced ordering, in the following way:
f ≪ g iff ( f = 0 and g ≠ 0 ) or
( f ≠ 0, g ≠ 0 and lpp(f ) < lpp(g) ) or
( f ≠ 0, g ≠ 0, lpp(f ) = lpp(g) and red(f ) ≪ red(g) ). ⊓

Lemma 2.1.4. ≪ (or actually ≫) is a Noetherian partial ordering on R[X]. ⊓


One of the central notions of the theory of Gröbner bases is the concept of polynomial
reduction.

Definition 2.1.4. Let f, g, h ∈ K[X], F ⊆ K[X]. We say that g reduces to h w.r.t.
f (g −→f h) iff there are power products s, t ∈ [X] such that s has a non–vanishing
coefficient c in g (coeff(g, s) = c ≠ 0), s = lpp(f ) · t, and

    h = g − (c / lc(f )) · t · f.

If we want to indicate which power product and coefficient are used in the reduction, we
write

    g −→f,b,t h,   where b = c / lc(f ).

We say that g reduces to h w.r.t. F (g −→F h) iff there is an f ∈ F such that g −→f h. ⊓

Example 2.1.2. Let F = {. . . , f = x1 x3 + x1 x2 − 2x3, . . .} in Q[x1, x2, x3], and g =
x3^3 + 2 x1 x2 x3 + 2 x2 − 1. Let < be the graduated lexicographic ordering with x1 < x2 < x3.
Then g −→F x3^3 − 2 x1 x2^2 + 4 x2 x3 + 2 x2 − 1 =: h, and in fact g −→f,2,x2 h. ⊓
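The reduction step of Example 2.1.2 can be reproduced with sympy's division algorithm (a sketch; `reduced` computes a full normal form, which in this example happens to coincide with the single step):

```python
from sympy import symbols, reduced

x1, x2, x3 = symbols('x1 x2 x3')

f = x1*x3 + x1*x2 - 2*x3
g = x3**3 + 2*x1*x2*x3 + 2*x2 - 1

# Listing the generators as x3, x2, x1 makes x3 the largest variable, so
# 'grlex' is the graduated lexicographic ordering with x1 < x2 < x3.
quotients, h = reduced(g, [f], x3, x2, x1, order='grlex')

assert quotients == [2*x2]   # the step g -->_{f, 2, x2} h uses b = 2 and t = x2
assert h == x3**3 - 2*x1*x2**2 + 4*x2*x3 + 2*x2 - 1
```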

Definition 2.1.5. Let −→ be a reduction relation, i.e. a binary relation, on a set M.
• by −→∗ we denote the reflexive and transitive closure of the relation −→;
• by ←→ we denote the symmetric closure of −→;
• x −→ means x is reducible, i.e. x −→ y for some y;
• x −/→ means x is irreducible or in normal form w.r.t. −→, i.e. there is no y with
x −→ y. We omit mentioning the reduction relation if it is clear from the context;
• x ↓ y means that x and y have a common successor, i.e. x −→ z ←− y for some z;
• x ↑ y means that x and y have a common predecessor, i.e. x ←− z −→ y for some z;
• x is a −→–normal form of y iff y −→∗ x and x is in normal form. ⊓

Definition 2.1.6. (a) −→ is Noetherian or has the termination property iff every reduction
sequence terminates, i.e. there is no infinite sequence x1 , x2 , . . . in M such that x1 −→
x2 −→ . . . .
(b) −→ is Church–Rosser or has the Church–Rosser property iff a ←→∗ b implies a ↓∗ b.
(c) −→ is confluent iff x ↑∗ y implies x ↓∗ y, or graphically every diamond of the following
form can be completed:
u
∗ ւ ց∗
x y
ց∗ ւ ∗
v
(d) −→ is locally confluent iff x ↑ y implies x ↓∗ y, or graphically every diamond of the
following form can be completed:
u
ւ ց
x y
ց∗ ւ ∗
v ⊓

As a consequence of the Noetherianity of admissible orderings we get that −→F is


Noetherian for any set of polynomials F ⊂ K[X]. So, in contrast to the general theory
of rewriting, termination is not a problem for polynomial reductions. But we still have to
worry about the Church-Rosser property.

Theorem 2.1.5. (a) −→ is Church–Rosser if and only if −→ is confluent.


(b) (Newman Lemma) Let −→ be Noetherian. Then −→ is confluent if and only if −→ is
locally confluent.

As an immediate consequence of the previous definitions we get that the reduction
relation −→ is (nearly) compatible with the operations in the polynomial ring. Moreover,
the reflexive–transitive–symmetric closure ←→∗F of the reduction relation −→F is equal to
the congruence modulo the ideal generated by F .

Lemma 2.1.6. Let a ∈ K ∗ , s ∈ [X], F ⊆ K[X], g1 , g2 , h ∈ K[X].


(a) −→F ⊆ ≫,
(b) −→F is Noetherian,
(c) if g1 −→F g2 then a · s · g1 −→F a · s · g2 ,
(d) if g1 −→F g2 then g1 + h ↓∗F g2 + h. ⊓

Theorem 2.1.7. Let F ⊆ K[X]. The ideal congruence modulo hF i equals the reflexive–
transitive–symmetric closure of −→F , i.e. ≡hF i = ←→∗F . ⊓

So the congruence ≡⟨F⟩ can be decided if −→F has the Church–Rosser property. Of
course, this is not the case for an arbitrary set F. Those distinguished sets (bases for
polynomial ideals) for which it does hold are called Gröbner bases.

Definition 2.1.7. A subset F of K[X] is a Gröbner basis (for ⟨F⟩) iff −→F is Church–
Rosser. ⊓

A Gröbner basis of an ideal I in K[X] is by no means uniquely defined. In fact,


whenever F is a Gröbner basis for I and f ∈ I, then also F ∪ {f } is a Gröbner basis for I.

For testing whether a given basis F of an ideal I is a Gröbner basis it suffices to test for
local confluence of the reduction relation −→F . This, however, does not yield a decision
procedure, since there are infinitely many situations f ↑F g. However, Buchberger has
been able to reduce this test for local confluence to testing only a finite number of situations
f ↑F g [Buchberger 1965]. For that purpose he has introduced the notion of subtraction
polynomials, or S–polynomials for short.

Definition 2.1.8. Let f, g ∈ K[X]∗, t = lcm(lpp(f ), lpp(g)). Then

    cp(f, g) = ( t − (1/lc(f )) · (t/lpp(f )) · f ,  t − (1/lc(g)) · (t/lpp(g)) · g )

is the critical pair of f and g. The difference of the elements of cp(f, g) is the S–polynomial
spol(f, g) of f and g. ⊓

If cp(f, g) = (h1, h2) then we can depict the situation graphically in the following way:

        lcm(lpp(f ), lpp(g))
         f ւ          ց g
        h1              h2

The critical pairs of elements of F describe exactly the essential branchings of the reduction
relation −→F.

Theorem 2.1.8. (Buchberger’s Theorem) Let F be a subset of K[X].
(a) F is a Gröbner basis if and only if g1 ↓∗F g2 for all critical pairs (g1 , g2 ) of elements of
F.
(b) F is a Gröbner basis if and only if spol(f, g) −→∗F 0 for all f, g ∈ F .

Buchberger’s theorem suggests an algorithm for checking whether a given finite basis
is a Gröbner basis: reduce all the S–polynomials to normal forms and check whether they
are all 0. In fact, by a simple extension we get an algorithm for constructing Gröbner
bases.

algorithm GRÖBNER B(in: F ; out: G);
[Buchberger algorithm for computing a Gröbner basis. F is a finite subset of K[X]∗;
G is a finite subset of K[X]∗ such that ⟨G⟩ = ⟨F⟩ and G is a Gröbner basis.]
(1) G := F ;
    C := {{g1, g2} | g1, g2 ∈ G, g1 ≠ g2};
(2) while not all pairs {g1, g2} ∈ C are marked do
      { choose an unmarked pair {g1, g2};
        mark {g1, g2};
        h := normal form of spol(g1, g2) w.r.t. −→G;
        if h ≠ 0
        then { C := C ∪ {{g, h} | g ∈ G};
               G := G ∪ {h} };
      };
(3) return G ⊓

Every polynomial h constructed in GRÖBNER B is in ⟨F⟩, so ⟨G⟩ = ⟨F⟩ throughout
GRÖBNER B. Thus, by Theorem 2.1.8, GRÖBNER B yields a correct result if it stops. The
termination of GRÖBNER B is a consequence of Dickson's Lemma, which implies that in
[X] there is no infinite sequence of elements s1, s2, . . . such that si ∤ sj for all 1 ≤ i < j. The
leading power products of the polynomials added to the basis form such a sequence in [X],
so this sequence must be finite.
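A toy version of GRÖBNER B can be sketched with sympy's polynomial primitives (`s_poly` and `buchberger` are our own illustrative helpers, without any of the standard optimizations such as Buchberger's criteria; this is not the strategy a production system would use):

```python
from sympy import symbols, expand, lcm, reduced, LT

def s_poly(f, g, gens, order):
    # spol(f, g): cancel the leading terms against lcm(lpp(f), lpp(g))
    ltf = LT(f, *gens, order=order)
    ltg = LT(g, *gens, order=order)
    t = lcm(ltf, ltg)
    return expand((t / ltf) * f - (t / ltg) * g)

def buchberger(F, gens, order='grlex'):
    # naive GROBNER B: keep adding non-zero normal forms of S-polynomials
    # until all critical pairs reduce to 0
    G = [expand(f) for f in F]
    pairs = [(i, j) for i in range(len(G)) for j in range(i + 1, len(G))]
    while pairs:
        i, j = pairs.pop()
        s = s_poly(G[i], G[j], gens, order)
        if s == 0:
            continue
        _, h = reduced(s, G, *gens, order=order)
        if h != 0:
            pairs += [(k, len(G)) for k in range(len(G))]
            G.append(h)
    return G

x, y = symbols('x y')
# the input of Example 2.1.3; gens (y, x) makes 'grlex' the graduated
# lexicographic ordering with x < y
G = buchberger([x**2*y**2 + y - 1, x**2*y + x], (y, x))
```

The returned G is a Gröbner basis but, as in Example 2.1.3, generally not a reduced one.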

Theorem 2.1.9. (Dickson’s Lemma) Every A ⊆ [X] contains a finite subset B, such that
every t ∈ A is a multiple of some s ∈ B.

The termination of GRÖBNER B also follows from Hilbert’s Basis Theorem applied
to the initial ideals of the sets G constructed in the course of the algorithm, i.e. hin(G)i.
See Exercise 8.3.4.
The algorithm GRÖBNER B provides a constructive proof of the following theorem.

Theorem 2.1.10. Every ideal I in K[X] has a Gröbner basis. ⊓


Example 2.1.3. Let F = {f1, f2}, with f1 = x^2 y^2 + y − 1, f2 = x^2 y + x. We compute
a Gröbner basis of ⟨F⟩ in Q[x, y] w.r.t. the graduated lexicographic ordering with x < y.
The following describes one way in which the algorithm GRÖBNER B could execute (recall
that there is a free choice of pairs in the loop):
(1) spol(f1, f2) = f1 − y·f2 = −xy + y − 1 =: f3 is irreducible, so G := {f1, f2, f3}.
(2) spol(f2, f3) = f2 + x·f3 = xy −→f3 y − 1 =: f4, so G := {f1, f2, f3, f4}.
(3) spol(f3, f4) = f3 + x·f4 = y − x − 1 −→f4 −x =: f5, so G := {f1, . . . , f5}.
All the other S–polynomials now reduce to 0, so GRÖBNER B terminates with

    G = {x^2 y^2 + y − 1, x^2 y + x, −xy + y − 1, y − 1, −x}. ⊓

In addition to the original definition and the characterizations given in Theorem 2.1.8,
there are many other characterizations of Gröbner bases. We list only a few of them.

Theorem 2.1.11. Let I be an ideal in K[X], F ⊆ K[X], and hF i ⊆ I. Then the following
are equivalent.
(a) F is a Gröbner basis for I.
(b) f −→∗F 0 for every f ∈ I.
(c) f −→F for every f ∈ I \ {0}.
(d) For all g ∈ I, h ∈ K[X]: if g −→∗F h then h = 0.
(e) For all g, h1 , h2 ∈ K[X]: if g −→∗F h1 and g −→∗F h2 then h1 = h2 .
(f) hin(F )i = hin(I)i.

The Gröbner basis G computed in Example 2.1.3 is unnecessarily complicated. In fact,
{y − 1, x} is a Gröbner basis for the same ideal. There is a general procedure for simplifying
Gröbner bases.
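Computer algebra systems return the simplified basis directly: sympy's `groebner`, for example, computes the normed reduced Gröbner basis. For the ideal of Example 2.1.3 (generators listed as y, x, so that `grlex` is the graduated lexicographic ordering with x < y) it yields exactly the simplified basis:

```python
from sympy import symbols, groebner

x, y = symbols('x y')
gb = groebner([x**2*y**2 + y - 1, x**2*y + x], y, x, order='grlex')
assert set(gb.exprs) == {y - 1, x}   # the normed reduced Groebner basis
```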

Theorem 2.1.12. Let G be a Gröbner basis for an ideal I in K[X]. Let g, h ∈ G and
g ≠ h.
(a) If lpp(g) | lpp(h) then G′ = G \ {h} is also a Gröbner basis for I.
(b) If h −→g h′ then G′ = (G \ {h}) ∪ {h′ } is also a Gröbner basis for I.

Observe that the elimination of basis polynomials described in Theorem 2.1.12(a) is


only possible if G is a Gröbner basis. In particular, we are not allowed to do this during
a Gröbner basis computation. Based on Theorem 2.1.12 we can show that every ideal has
a unique Gröbner basis after suitable pruning and normalization.

Definition 2.1.9. Let G be a Gröbner basis in K[X].
G is minimal iff lpp(g) ∤ lpp(h) for all g, h ∈ G with g ≠ h.
G is reduced iff for all g, h ∈ G with g ≠ h we cannot reduce h by g.
G is normed iff lc(g) = 1 for all g ∈ G. ⊓

From Theorem 2.1.12 we obviously get an algorithm for transforming any Gröbner
basis for an ideal I into a normed reduced Gröbner basis for I. No matter from which
Gröbner basis of I we start and which path we take in this transformation process, we
always reach the same uniquely defined normed reduced Gröbner basis of I.

Theorem 2.1.13. Every ideal in K[X] has a unique finite normed reduced Gröbner basis.

Observe that the normed reduced Gröbner basis of an ideal I depends, of course, on
the admissible ordering <. Different orderings can give rise to different Gröbner bases.

However, if we decompose the set of all admissible orderings into sets which induce the
same normed reduced Gröbner basis of a fixed ideal I, then this decomposition is finite.
This leads to the consideration of universal Gröbner bases. A universal Gröbner basis for
I is a basis for I which is a Gröbner basis w.r.t. any admissible ordering of the power
products.
If we have a Gröbner basis G for an ideal I, then we can compute in the vector space
K[X]/I over K. The irreducible power products (with coefficient 1) modulo G form a basis
of K[X]/I . We get that dim(K[X]/I ) is the number of irreducible power products modulo
G. Thus, this number is independent of the particular admissible ordering.

Example 2.1.4. Let I = ⟨x^3 y − 2y^2 − 1, x^2 y^2 + x + y⟩ in Q[x, y]. Let < be the graduated
lexicographic ordering with x > y. Then the normed reduced Gröbner basis of I has
leading power products x^4, x^3 y, x^2 y^2, y^3. So there are 9 irreducible power products.
If < is the lexicographic ordering with x > y, then the normed reduced Gröbner basis
of I has leading power products x and y^9. So again there are 9 irreducible power products.
In fact, dim(Q[x, y]/I ) = 9. ⊓
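The count of irreducible power products can be checked mechanically (a sketch; `standard_count` is our own helper, which collects the leading monomials of the reduced Gröbner basis and counts the power products not divisible by any of them):

```python
from itertools import product
from sympy import symbols, groebner, Poly

x, y = symbols('x y')
F = [x**3*y - 2*y**2 - 1, x**2*y**2 + x + y]

def standard_count(order):
    # leading monomials (as exponent vectors) of the reduced Groebner basis
    gb = groebner(F, x, y, order=order)            # both orderings have x > y
    corners = [Poly(g, x, y).monoms(order=order)[0] for g in gb.exprs]
    # the ideal is 0-dimensional, so pure powers of x and y occur among the
    # corners and the staircase fits into a finite box
    bound = max(max(c) for c in corners)
    return sum(
        1
        for a, b in product(range(bound + 1), repeat=2)
        if not any(a >= cx and b >= cy for cx, cy in corners)
    )

# the dimension of K[x,y]/I does not depend on the admissible ordering
assert standard_count('grlex') == standard_count('lex') == 9
```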

For a 0–dimensional ideal I in regular position a very strong structure theorem has
been derived by Gianni and Mora. I is in regular position w.r.t. the variable x1 , if a1 6= b1
for any two different zeros (a1, . . . , an), (b1, . . . , bn) of I. An arbitrary 0–dimensional
ideal is very likely to be in regular position w.r.t. x1; if it is not, nearly every linear
change of coordinates will bring it into regular position.

Theorem 2.1.14. (Shape Lemma) Let I be a radical 0–dimensional ideal in K[X], regular
in x1 . Then there are g1 (x1 ), . . . , gn (x1 ) ∈ K[x1 ] such that g1 is squarefree, deg(gi ) <
deg(g1 ) for i > 1 and the normed reduced Gröbner basis F for I w.r.t. the lexicographic
ordering < with x1 < · · · < xn is of the form

{g1 (x1 ), x2 − g2 (x1 ), . . . , xn − gn (x1 )}.

On the other hand, if the normed reduced Gröbner basis for I w.r.t. < is of this form,
then I is a radical 0–dimensional ideal.
Proof: Since I is in regular position, the first coordinates of the zeros of I are all different,
say a11, . . . , a1m. Then the squarefree polynomial g1(x1) = (x1 − a11) · · · (x1 − a1m) is in
I ∩ K[x1] and so it has to be in F. Since, by the observation above, m is the dimension
of K[X]/I, the normed reduced Gröbner basis for I has to have the specified form.
To prove the converse, let a11, . . . , a1m be the zeros of g1(x1). Then the zeros of I are
{(a1i, g2(a1i), . . . , gn(a1i)) | i = 1, . . . , m}. ⊓
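A quick sketch of the Shape Lemma with sympy (the ideal is our own example; note that sympy expects generators from largest to smallest, so the lexicographic ordering with x < y lists y first):

```python
from sympy import symbols, groebner, sqf_part

x, y = symbols('x y')
# zeros (0,0), (1,1), (-1,1): distinct first coordinates, so regular in x
I = [x**3 - x, y - x**2]
gb = groebner(I, y, x, order='lex')   # lexicographic ordering with x < y
# shape {g1(x), y - g2(x)} with g1 squarefree and deg(g2) < deg(g1)
assert set(gb.exprs) == {x**3 - x, y - x**2}
assert sqf_part(x**3 - x) == x**3 - x   # g1 is squarefree, as predicted
```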

2.2 Solving ideal theoretic problems by Gröbner bases

Computation in the vector space of polynomials modulo an ideal


The ring K[X]/I of polynomials modulo the ideal I is a vector space over K. If I is
a prime ideal then this ring is called the coordinate ring of V (I) (compare Chapter 6). A
Gröbner basis G provides a basis for this vector space.

Theorem 2.2.1. The irreducible power products modulo G, viewed as polynomials with
coefficient 1, form a basis for the vector space K[X]/I over K.

Ideal membership
By definition Gröbner bases solve the ideal membership problem for polynomial ideals,
i.e.
given: f, f1 , . . . , fm ∈ K[X],
decide: f ∈ hf1 , . . . , fm i.
Let G be a Gröbner basis for I = hf1 , . . . , fm i. Then f ∈ I if and only if the normal form
of f modulo G is 0.

Example 2.2.1. Suppose that we know the polynomial relations (axioms)

    4z − 4xy^2 − 16x^2 − 1 = 0,
    2y^2 z + 4x + 1 = 0,
    2x^2 z + 2y^2 + x = 0

between the quantities x, y, z, and we want to decide whether the additional relation
(hypothesis)

    g(x, y) = 4xy^4 + 16x^2 y^2 + y^2 + 8x + 2 = 0

follows from them, i.e. whether we can write g as a linear combination of the axioms or,
in other words, whether g is in the ideal I generated by the axioms.
Trying to reduce the hypothesis g w.r.t. the given axioms does not result in a reduction
to 0. But we can compute a Gröbner basis for I w.r.t. the lexicographic ordering with
x < y < z, e.g. G = {g1, g2, g3} where

    g1 = 32x^7 − 216x^6 + 34x^4 − 12x^3 − x^2 + 30x + 8,
    g2 = 2745y^2 − 112x^6 − 812x^5 + 10592x^4 − 61x^3 − 812x^2 + 988x + 2,
    g3 = 4z − 4xy^2 − 16x^2 − 1.

Now g −→∗G 0, i.e. g(x, y) = 0 follows from the axioms. ⊓
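The membership test of this example can be reproduced with sympy: compute a Gröbner basis of the axioms and reduce g to its normal form.

```python
from sympy import symbols, groebner, reduced

x, y, z = symbols('x y z')
axioms = [4*z - 4*x*y**2 - 16*x**2 - 1,
          2*y**2*z + 4*x + 1,
          2*x**2*z + 2*y**2 + x]
g = 4*x*y**4 + 16*x**2*y**2 + y**2 + 8*x + 2

# lexicographic ordering with x < y < z: list the generators as z, y, x
G = groebner(axioms, z, y, x, order='lex')
_, r = reduced(g, list(G.exprs), z, y, x, order='lex')
assert r == 0    # g is in the ideal, so g = 0 follows from the axioms
```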


Radical membership
Sometimes, especially in applications in geometry, we are not so much interested in
the ideal membership problem but in the radical membership problem, i.e.
given: f, f1 , . . . , fm ∈ K[X],
decide: f ∈ radical(hf1 , . . . , fm i).
The radical of an ideal I is the ideal containing all those polynomials f, some power
of which is contained in I. So f ∈ radical(I) ⇐⇒ f^n ∈ I for some n ∈ N. Geometrically
f ∈ radical(hf1 , . . . , fm i) means that the hypersurface defined by f contains all the points
in the variety (algebraic set) defined by f1 , . . . , fm .
The following extremely important theorem relates the radical of an ideal I to the
set of common roots V (I) of the polynomials contained in I. We will give a proof of this
theorem later.

Theorem 2.2.2. (Hilbert’s Nullstellensatz) Let I be an ideal in K[X], where K is an


algebraically closed field. Then radical(I) consists of exactly those polynomials in K[X]
which vanish on all the common roots of I.

By an application of Hilbert’s Nullstellensatz we get that f ∈ radical(hf1 , . . . , fm i)


if and only if f vanishes at every common root of f1, . . . , fm if and only if the system
f1 = · · · = fm = z · f − 1 = 0 has no solution, where z is a new variable. I.e.

    f ∈ radical(⟨f1, . . . , fm⟩) ⇐⇒ 1 ∈ ⟨f1, . . . , fm, z · f − 1⟩.

So the radical membership problem is reduced to the ideal membership problem.
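A minimal sketch of this reduction (the example f = x, I = ⟨x²⟩ is our own: x lies in the radical but not in the ideal itself):

```python
from sympy import symbols, groebner, reduced

x, z = symbols('x z')

# x is not in <x**2> ...
_, r = reduced(x, [x**2], x, order='lex')
assert r != 0

# ... but x is in radical(<x**2>): 1 lies in <x**2, z*x - 1>
gb = groebner([x**2, z*x - 1], z, x, order='lex')
assert list(gb.exprs) == [1]
```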

Equality of ideals
We want to decide whether two given ideals are equal, i.e. we want to solve the ideal
equality problem:
given: f1, . . . , fm, g1, . . . , gk ∈ K[X],
decide: I = J, where I = ⟨f1, . . . , fm⟩ and J = ⟨g1, . . . , gk⟩.

Choose any admissible ordering. Let GI, GJ be the normed reduced Gröbner bases of
I and J, respectively. Then by Theorem 2.1.13, I = J if and only if GI = GJ.
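Since normed reduced Gröbner bases are unique, ideal equality becomes a set comparison (the example ⟨x + y, x − y⟩ = ⟨x, y⟩, valid over Q, is our own):

```python
from sympy import symbols, groebner

x, y = symbols('x y')
GI = groebner([x + y, x - y], x, y, order='lex')
GJ = groebner([x, y], x, y, order='lex')
# both normed reduced Groebner bases are {x, y}, so the ideals are equal
assert set(GI.exprs) == set(GJ.exprs) == {x, y}
```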

Solution of algebraic equations by Gröbner bases


We consider a system of equations

    f1(x1, . . . , xn) = 0,
        ...                                        (2.1)
    fm(x1, . . . , xn) = 0,

where f1, . . . , fm ∈ K[X]. The system (2.1) is called a system of polynomial or algebraic
equations. First let us decide whether (2.1) has any solutions in K̄^n, K̄ being the
algebraic closure of K. Let I = ⟨f1, . . . , fm⟩.

Theorem 2.2.3. Let G be a normed Gröbner basis of I. (2.1) is unsolvable in K̄^n if and
only if 1 ∈ G.

Now suppose that (2.1) is solvable. We want to determine whether there are finitely
or infinitely many solutions of (2.1) or, in other words, whether or not the ideal I is
0–dimensional.

Theorem 2.2.4. Let G be a Gröbner basis of I. Then (2.1) has finitely many solutions
(i.e. I is 0–dimensional) if and only if for every i, 1 ≤ i ≤ n, there is a polynomial gi ∈ G
such that lpp(gi ) is a pure power of xi . Moreover, if I is 0–dimensional then the number
of zeros of I (counted with multiplicity) is equal to dim(K[X]/I ).
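Theorems 2.2.3 and 2.2.4 can be checked mechanically; a sketch with sympy (the two example systems are our own, and sympy exposes the pure-power criterion of Theorem 2.2.4 as the property `is_zero_dimensional`):

```python
from sympy import symbols, groebner

x, y = symbols('x y')

# unsolvable system: x = 0 and x = 1 cannot hold simultaneously,
# so the normed reduced Groebner basis is {1}
assert list(groebner([x, x - 1], x, y, order='lex').exprs) == [1]

# finitely many solutions: a circle intersected with a line
gb = groebner([x**2 + y**2 - 1, x - y], x, y, order='lex')
assert gb.is_zero_dimensional

# infinitely many solutions: the circle alone is not 0-dimensional
assert not groebner([x**2 + y**2 - 1], x, y, order='lex').is_zero_dimensional
```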

The rôle of the Gröbner basis algorithm GRÖBNER B in solving systems of algebraic
equations is the same as that of Gaussian elimination in solving systems of linear equations,
namely to triangularize the system, or carry out the elimination process. The crucial
observation, first stated by Trinks in 1978, is the elimination property of Gröbner bases. It
states that if G is a Gröbner basis of I w.r.t. the lexicographic ordering with x1 < . . . < xn ,
then the i–th elimination ideal of I, i.e. I ∩K[x1 , . . . , xi ], is generated by those polynomials
in G that depend only on the variables x1 , . . . , xi .

Theorem 2.2.5. (Elimination Property of Gröbner Bases) Let G be a Gröbner basis of I


w.r.t. the lexicographic ordering x1 < . . . < xn. Then

    I ∩ K[x1, . . . , xi] = ⟨G ∩ K[x1, . . . , xi]⟩,

where the ideal on the right hand side is generated over the ring K[x1, . . . , xi].

A proof can be found in [Win96], Chap.8. Theorem 2.2.5 can clearly be generalized
to product orderings, without changing anything in the proof.

Example 2.2.2. Consider the system of equations f1 = f2 = f3 = 0, where

    f1 = 4xz − 4xy^2 − 16x^2 − 1,
    f2 = 2y^2 z + 4x + 1,
    f3 = 2x^2 z + 2y^2 + x

are polynomials in Q[x, y, z]. We are looking for solutions of this system of algebraic
equations in Q̄^3, where Q̄ is the field of algebraic numbers.
Let < be the lexicographic ordering with x < y < z. The algorithm GRÖBNER B
applied to F = {f1, f2, f3} yields (after reducing the result) the reduced Gröbner basis
G = {g1, g2, g3}, where

    g1 = 65z + 64x^4 − 432x^3 + 168x^2 − 354x + 104,
    g2 = 26y^2 − 16x^4 + 108x^3 − 16x^2 + 17x,
    g3 = 32x^5 − 216x^4 + 64x^3 − 42x^2 + 32x + 5.

By Theorem 2.2.3 the system is solvable. Furthermore, by Theorem 2.2.4, the system has
finitely many solutions. The Gröbner basis G yields an equivalent triangular system in
which the variables are completely separated. So we can get solutions by solving the

univariate polynomial g3 and propagating the partial solutions upwards to solutions of the
full system. The univariate polynomial g3 is irreducible over Q, and the solutions are

    ( α,  ±(1/√26) · √( α (16α^3 − 108α^2 + 16α − 17) ),
          −(1/65) · (64α^4 − 432α^3 + 168α^2 − 354α + 104) ),

where α is a root of g3 . We can also determine a numerical approximation of a solution


from G, e.g.
(−0.1284722871, 0.3211444930, −2.356700326). ⊓
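The computation can be reproduced with sympy; here we only verify the structural claims (zero-dimensionality, the triangular shape with a univariate last polynomial, and that the basis generates the same ideal as F):

```python
from sympy import symbols, groebner, reduced

x, y, z = symbols('x y z')
F = [4*x*z - 4*x*y**2 - 16*x**2 - 1,
     2*y**2*z + 4*x + 1,
     2*x**2*z + 2*y**2 + x]

gb = groebner(F, z, y, x, order='lex')   # lexicographic ordering x < y < z
assert gb.is_zero_dimensional            # finitely many solutions
assert gb.exprs[-1].free_symbols == {x}  # last basis element is univariate in x
# the basis generates the same ideal as F
assert all(reduced(f, list(gb.exprs), z, y, x, order='lex')[1] == 0 for f in F)
```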

Arithmetic of polynomial ideals


In commutative algebra and algebraic geometry there is a strong correspondence be-
tween radical polynomial ideals and algebraic sets, the sets of zeros of such ideals over the
algebraic closure of the field of coefficients. For any ideal I in K[x1 , . . . , xn ] we denote
by V (I) the set of all points in An (K), the n–dimensional affine space over the algebraic
closure of K, which are common zeros of all the polynomials in I. Such sets V (I) are
called algebraic sets (we will introduce algebraic sets below). On the other hand, for any
subset V of An (K) we denote by I(V ) the ideal of all polynomials vanishing on V . Then
for radical ideals I and algebraic sets V the functions V (·) and I(·) are inverses of each
other, i.e.
V (I(V )) = V and I(V (I)) = I.
This correspondence extends to operations on ideals and algebraic sets in the following
way:
    ideal             algebraic set
    I + J             V (I) ∩ V (J)
    I · J,  I ∩ J     V (I) ∪ V (J)
    I : J             Zariski closure of V (I) − V (J)

So we can effectively compute intersection, union, and difference of varieties if we can carry
out the corresponding operations on ideals.

Definition 2.2.1. Let I, J be ideals in K[X].
The sum I + J of I and J is defined as
    I + J = {f + g | f ∈ I, g ∈ J}.
The product I · J of I and J is defined as
    I · J = ⟨{f · g | f ∈ I, g ∈ J}⟩.
The quotient I : J of I and J is defined as
    I : J = {f | f · g ∈ I for all g ∈ J}. ⊓

Theorem 2.2.6. Let I = ⟨f1, . . . , fr⟩ and J = ⟨g1, . . . , gs⟩ be ideals in K[X].
(a) I + J = ⟨f1, . . . , fr, g1, . . . , gs⟩.
(b) I · J = ⟨fi gj | 1 ≤ i ≤ r, 1 ≤ j ≤ s⟩.
(c) I ∩ J = (⟨t⟩ · I(t) + ⟨1 − t⟩ · J(t)) ∩ K[X], where t is a new variable, and I(t), J(t)
are the ideals generated by I, J, respectively, in K[X, t].
(d) I : J = ∩_{j=1}^s (I : ⟨gj⟩) and
I : ⟨g⟩ = ⟨h1/g, . . . , hm/g⟩, where I ∩ ⟨g⟩ = ⟨h1, . . . , hm⟩.

Proof: (a) and (b) are easily seen.
(c) Let f ∈ I ∩ J. Then t·f ∈ ⟨t⟩ · I(t) and (1 − t)·f ∈ ⟨1 − t⟩ · J(t). Therefore
f = t·f + (1 − t)·f ∈ ⟨t⟩ · I(t) + ⟨1 − t⟩ · J(t).
On the other hand, let f ∈ (⟨t⟩ · I(t) + ⟨1 − t⟩ · J(t)) ∩ K[X]. So f = g(X, t) + h(X, t),
where g ∈ ⟨t⟩ · I(t) and h ∈ ⟨1 − t⟩ · J(t). In particular, h(X, t) is a linear combination of
the basis elements (1 − t)·g1, . . . , (1 − t)·gs of ⟨1 − t⟩ · J(t). Evaluating t at 0 we get

    f = g(X, 0) + h(X, 0) = h(X, 0) ∈ J.

Similarly, by evaluating t at 1 we get f = g(X, 1) ∈ I.
(d) h ∈ I : J if and only if h·g ∈ I for all g ∈ J if and only if h·gj ∈ I for all 1 ≤ j ≤ s
if and only if h ∈ I : ⟨gj⟩ for all 1 ≤ j ≤ s.
If f ∈ ⟨h1/g, . . . , hm/g⟩ and a ∈ ⟨g⟩ then a·f ∈ ⟨h1, . . . , hm⟩ = I ∩ ⟨g⟩ ⊂ I, i.e. f ∈ I : ⟨g⟩.
Conversely, suppose f ∈ I : ⟨g⟩. Then f·g ∈ I ∩ ⟨g⟩. So f·g = Σ_k bk·hk for some bk ∈ K[X].
Thus,

    f = Σ_k bk · (hk/g) ∈ ⟨h1/g, . . . , hm/g⟩,

where each hk/g is a polynomial. ⊓
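Part (c) reduces intersection to an elimination problem, which we can sketch with sympy's `groebner` (`intersect` is our own helper, and the example I = ⟨x⟩, J = ⟨y⟩ with I ∩ J = ⟨xy⟩ is ours as well):

```python
from sympy import symbols, groebner

x, y, t = symbols('x y t')

def intersect(I, J):
    # basis of <t>*I_(t) + <1-t>*J_(t), then eliminate t: by the elimination
    # property (Theorem 2.2.5), drop the basis elements involving t
    H = [t*f for f in I] + [(1 - t)*g for g in J]
    gb = groebner(H, t, x, y, order='lex')   # lex with t largest
    return [g for g in gb.exprs if t not in g.free_symbols]

assert intersect([x], [y]) == [x*y]   # <x> n <y> = <x*y>
```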

So all these operations can be carried out effectively by operations on the bases of the
ideals. In particular the intersection can be computed by Theorem 2.2.6(c).
We always have I · J ⊂ I ∩ J. However, I ∩ J can be strictly larger than I · J.
For example, if I = J = ⟨x, y⟩, then I · J = ⟨x^2, xy, y^2⟩ and I ∩ J = I = J = ⟨x, y⟩.
Both I · J and I ∩ J correspond to the same variety. Since a basis for I · J is more easily
computed, why should we bother with I ∩ J? The reason is that the intersection behaves
much better with respect to the operation of taking radicals (recall that it is really the
radical ideals that uniquely correspond to algebraic sets). Whereas the product of radical
ideals in general fails to be radical (consider I ·I), the intersection of radical ideals is always
radical. For a proof see Theorem 8.4.10 in [Win96].
Theorem 2.2.7. Let I, J be ideals in K[X]. Then √(I ∩ J) = √I ∩ √J ( √I denotes the
radical of I ).

Example 2.2.3. Consider the ideals

    I1 = ⟨2x^4 − 3x^2 y + y^2 − 2y^3 + y^4⟩,
    I2 = ⟨x, y^2 − 4⟩,
    I3 = ⟨x, y^2 − 2y⟩,
    I4 = ⟨x, y^2 + 2y⟩.

The coefficients are all integers, but we consider them as defining algebraic sets in the
affine plane over C. In fact, V (I1) is the tacnode curve (compare Fig. 1.1.3). V (I2) =
{(0, 2), (0, −2)}, V (I3) = {(0, 2), (0, 0)}, V (I4) = {(0, 0), (0, −2)}.
First, let us compute the ideal I5 defining the union of the tacnode and the 2 points
in V (I2). I5 is the intersection of I1 and I2, i.e.

    I5 = I1 ∩ I2 = (⟨z⟩I1 + ⟨1 − z⟩I2) ∩ Q[x, y]
       = ⟨−4y^2 + 8y^3 − 3y^4 + 12x^2 y − 8x^4 − 2y^5 + y^6 − 3x^2 y^3 + 2y^2 x^4,
          x y^2 − 2x y^3 + x y^4 − 3x^3 y + 2x^5⟩.

Now let us compute the ideal I6 defining the Zariski closure of V (I5) \ V (I3), i.e. the
smallest algebraic set containing V (I5) \ V (I3).

    I6 = I5 : I3 = (I5 : ⟨x⟩) ∩ (I5 : ⟨y^2 − 2y⟩)
       = ⟨2x^4 − 3x^2 y + y^2 − 2y^3 + y^4⟩ ∩
         ⟨y^5 − 3y^3 + 2y^2 − 3x^2 y^2 + 2y x^4 − 6x^2 y + 4x^4, 2x^5 − 3x^3 y + x y^2 − 2x y^3 + x y^4⟩
       = ⟨y^5 − 3y^3 + 2y^2 − 3x^2 y^2 + 2y x^4 − 6x^2 y + 4x^4, 2x^5 − 3x^3 y + x y^2 − 2x y^3 + x y^4⟩.

V (I6) is the tacnode plus the point (0, −2).
Finally, let us compute the ideal I7 defining the Zariski closure of V (I6) \ V (I4).

    I7 = I6 : I4 = (I6 : ⟨x⟩) ∩ (I6 : ⟨y^2 + 2y⟩)
       = ⟨2x^4 − 3x^2 y + y^2 − 2y^3 + y^4⟩ ∩ ⟨2x^4 − 3x^2 y + y^2 − 2y^3 + y^4⟩
       = I1.

So we get back the ideal I1 defining the tacnode curve. ⊓




The radical √I of an ideal I generalizes the square-free part of a polynomial. For principal ideals ⟨p(x)⟩ in K[x] the radical is simply the ideal generated by the square-free part of the generator p, i.e.

√⟨p⟩ = ⟨q⟩,

where q(x) is the square-free part of p(x). There are also algorithms for computing the radical of an ideal in K[x1, . . . , xn]. Here we only quote Proposition (2.7) of p. 39 in [CLO98], which shows how to get the radical of a 0-dimensional ideal.
Theorem 2.2.8. Let K be algebraically closed, and let I be a 0-dimensional ideal in K[x1, . . . , xn]. For each i = 1, . . . , n, let pi be the unique monic generator of I ∩ K[xi], and let p̃i be the square-free part of pi. Then

√I = I + ⟨p̃1, . . . , p̃n⟩.
For proving Theorem 2.2.8 we use the following lemma (as suggested in [CLO98]), the
proof of which we leave as an exercise.
Lemma 2.2.9. Let I be an ideal in K[x1, . . . , xn], and let p = (x1 − a1) · · · (x1 − ad), where a1, . . . , ad are distinct.

(a) Then I + ⟨p⟩ ⊂ ∩_j (I + ⟨x1 − aj⟩).

(b) Let pj = ∏_{i≠j} (x1 − ai). Then pj · (I + ⟨x1 − aj⟩) ⊂ I + ⟨p⟩.

(c) p1, . . . , pd are relatively prime, and therefore there are polynomials h1, . . . , hd such that 1 = ∑_j hj pj.

(d) ∩_j (I + ⟨x1 − aj⟩) ⊂ I + ⟨p⟩.

(e) From these partial results we finally get

I + ⟨p⟩ = ∩_{j=1}^{d} (I + ⟨x1 − aj⟩).
Proof of Theorem 2.2.8: Write J = I + ⟨p̃1, . . . , p̃n⟩. We first prove that J is a radical ideal, i.e., that √J = J. For each i, using the fact that K is algebraically closed, we can factor p̃i to obtain

p̃i = (xi − ai1) · · · (xi − ai,di),

where the aij are distinct for given i ∈ {1, . . . , n}. Then

J = J + ⟨p̃1⟩ = ∩_j (J + ⟨x1 − a1j⟩),

where the first equality holds since p̃1 ∈ J and the second follows from Lemma 2.2.9. Now use p̃2 to decompose each J + ⟨x1 − a1j⟩ in the same way. This gives

J = ∩_{j,k} (J + ⟨x1 − a1j, x2 − a2k⟩).

If we do this for all i = 1, 2, . . . , n, we get the expression

J = ∩_{j1,...,jn} (J + ⟨x1 − a1j1, . . . , xn − anjn⟩).

Since ⟨x1 − a1j1, . . . , xn − anjn⟩ is a maximal ideal, the ideal J + ⟨x1 − a1j1, . . . , xn − anjn⟩ is either ⟨x1 − a1j1, . . . , xn − anjn⟩ or the whole polynomial ring K[x1, . . . , xn]. Since a maximal ideal is radical and an intersection of radical ideals is radical (compare Theorem 2.2.7), we conclude that J is a radical ideal.

Now we can prove that J = √I. The inclusion I ⊂ J is obvious from the definition of J. The inclusion J ⊂ √I follows from Hilbert's Nullstellensatz (Theorem 4.2.3), since the polynomials p̃i vanish at all the points of V(I). Hence we have

I ⊂ J ⊂ √I.

Taking radicals in this chain of inclusions shows that √J = √I. But J is radical, so √J = J and we are done. ⊓⊔
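Theorem 2.2.8 can be turned into a small computation. A sketch with SymPy, for a hypothetical 0-dimensional ideal chosen for illustration; the univariate generators pi are obtained by lex elimination, and `sqf_part` supplies the square-free parts p̃i.

```python
from sympy import symbols, groebner, sqf_part

x, y = symbols('x y')
gens = [x**2, y**3 - y**2]   # 0-dimensional but not radical: V = {(0, 0), (0, 1)}

def univariate_generator(ideal, var, allvars):
    # generator of I ∩ K[var]: lex elimination with var as the smallest variable
    others = [v for v in allvars if v != var]
    gb = groebner(ideal, *others, var, order='lex')
    return [p for p in gb.exprs if p.free_symbols <= {var}][0]

# square-free parts of the univariate generators
sqfree = [sqf_part(univariate_generator(gens, v, [x, y])) for v in (x, y)]

# by Theorem 2.2.8, I + <p~1, ..., p~n> generates the radical
radical = groebner(gens + sqfree, x, y, order='lex')
```

For this ideal the reduced lex basis of the radical is {x, y^2 − y}, matching the two points of the variety, each now with multiplicity one.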
Resolution of modules and ideals

In the following let R be a commutative ring with 1.
Definition 2.2.2. Consider a sequence of R-modules and homomorphisms

· · · −→ M_{i+1} −→^{ϕ_{i+1}} M_i −→^{ϕ_i} M_{i−1} −→ · · ·

We say the sequence is exact at Mi iff im(ϕ_{i+1}) = ker(ϕ_i). The entire sequence is said to be exact iff it is exact at each Mi which is not at the beginning or the end of the sequence. ⊓⊔
Definition 2.2.3. Let M be an R-module. A free resolution of M is an exact sequence of the form

· · · −→ R^{n2} −→^{ϕ2} R^{n1} −→^{ϕ1} R^{n0} −→^{ϕ0} M −→ 0.

Observe that all modules in this sequence except M are free. If there is an l ∈ N s.t. nl ≠ 0 but nk = 0 for all k > l, then we say that the resolution is finite, of length l. A finite resolution of length l is usually written as

0 −→ R^{nl} −→ R^{n_{l−1}} −→ · · · −→ R^{n1} −→ R^{n0} −→ M −→ 0. ⊓⊔
Let us see how we can construct a free resolution of a finitely generated module M = ⟨m1, . . . , mn0⟩. We determine a basis (generating set) {s1, . . . , sn1} of Syz(m1, . . . , mn0), the syzygy module of (m1, . . . , mn0). Let

ϕ0 : R^{n0} −→ M,      (r1, . . . , rn0)^T ↦ ∑ ri mi,
ϕ1 : R^{n1} −→ R^{n0},  (r1, . . . , rn1)^T ↦ ∑ ri si.

Then we have im(ϕ1) = Syz(mi) = ker(ϕ0), so the sequence

R^{n1} −→^{ϕ1} R^{n0} −→^{ϕ0} M −→ 0

is exact. Continuing this process with Syz(mi) instead of M, we finally get a free resolution of M.
Example 2.2.4. (from [CLO98], Chap. 6.1) Consider the ideal (which is also a module)

I = ⟨x^2 − x, xy, y^2 − y⟩

in R = K[x, y]; we write F for the sequence of generators (x^2 − x, xy, y^2 − y). In geometric terms, I is the ideal of the variety V = {(0, 0), (1, 0), (0, 1)} in K^2. Let

ϕ0 : R^3 −→ I,    (r1, r2, r3)^T ↦ (x^2 − x, xy, y^2 − y) · (r1, r2, r3)^T.
The mapping ϕ0 represents the generation of I from the free module R^3. Next we determine relations between the generators, i.e. (first) syzygies. The columns of the matrix

B = (  y       0    )
    ( −x + 1   y − 1 )
    (  0      −x    )

generate the syzygy module Syz(F). Bases for syzygy modules can be computed via Gröbner bases; see for instance Theorem 8.4.8 in [Win96]. So for

ϕ1 : R^2 −→ R^3,    (r1, r2)^T ↦ B · (r1, r2)^T
we get the exact sequence

R^2 −→^{ϕ1} R^3 −→^{ϕ0} I −→ 0.

The resolution process terminates right here. If (c1, c2) is any syzygy of the columns of B, i.e. a second syzygy of F, then

c1 · (y, −x + 1, 0)^T + c2 · (0, y − 1, −x)^T = (0, 0, 0)^T.

Looking at the first component we see that c1 y = 0, so c1 = 0. Similarly, from the third component we get c2 = 0. Hence the kernel of ϕ1 is the zero module. There are no non-trivial relations between the columns of B, so the first syzygy module Syz(F) is isomorphic to the free module R^2. Finally this leads to the free resolution

0 −→ R^2 −→^{ϕ1} R^3 −→^{ϕ0} I −→ 0

of length 1 of the module (ideal) I in R = K[x, y]. ⊓⊔
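The claim that the columns of B are syzygies of F can be checked directly: the matrix identity F · B = 0 (F written as a row vector) says precisely that each column is a relation among the generators. A quick SymPy check:

```python
from sympy import symbols, Matrix, expand

x, y = symbols('x y')
F = Matrix([[x**2 - x, x*y, y**2 - y]])   # generators of I as a row vector
B = Matrix([[y,      0],
            [-x + 1, y - 1],
            [0,      -x]])

# F * B is a 1x2 row; it is zero iff each column of B is a syzygy of F
product = (F * B).applyfunc(expand)
```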
The problem of computing syzygies is considered in [Win96], Chapter 8. Here we only remind ourselves of the basic definition and facts. For given polynomials f1, . . . , fs, f in K[X] we consider the linear equation

f1 z1 + . . . + fs zs = f,        (2.2)

or the corresponding homogeneous equation

f1 z1 + . . . + fs zs = 0.        (2.3)

Let F be the vector (f1, . . . , fs). The general solution of (2.2) and (2.3) is to be sought in K[X]^s. The solutions of (2.3) form a module over the ring K[X], a submodule of K[X]^s.

Definition 2.2.5. Any solution of (2.3) is called a syzygy of the sequence of polynomials f1, . . . , fs. The module of all solutions of (2.3) is the module of syzygies Syz(F) of F = (f1, . . . , fs). ⊓⊔
Theorem 2.2.6. If the elements of F = (f1, . . . , fs) are a Gröbner basis, then from the reductions of the S-polynomials to 0 one can extract a basis for Syz(F).

Theorem 2.2.7. Let F = (f1, . . . , fs)^T be a vector of polynomials in K[X] and let the elements of G = (g1, . . . , gm)^T be a Gröbner basis for ⟨f1, . . . , fs⟩. We view F and G as column vectors. Let the r rows of the matrix R be a basis for Syz(G) and let the matrices A, B be such that G = A · F and F = B · G. Then the rows of Q are a basis for Syz(F), where

Q = ( I_s − B · A )
    ( R · A      ).
What we still need is a particular solution of the inhomogeneous equation (2.2). Let G = (g1, . . . , gm) be a Gröbner basis for ⟨F⟩ and let A be the transformation matrix such that G = A · F (G and F viewed as column vectors). Then a particular solution of (2.2) exists if and only if f ∈ ⟨F⟩ = ⟨G⟩. If the reduction of f to normal form modulo G yields f′ ≠ 0, then (2.2) is unsolvable. Otherwise we can extract from this reduction polynomials h′1, . . . , h′m such that

g1 h′1 + . . . + gm h′m = f.

So H = (h′1, . . . , h′m) · A is a particular solution of (2.2).
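The solvability test and the cofactors h′i can be imitated with SymPy's `GroebnerBasis.reduce`, which returns exactly the coefficients of f with respect to the Gröbner basis together with the normal form. A sketch, with hypothetical F and f chosen for illustration; note that SymPy does not track the transformation matrix A, so this yields a solution w.r.t. G rather than the solution H w.r.t. F.

```python
from sympy import symbols, groebner, expand

x, y = symbols('x y')
F = [x**2 - x, x*y]                 # hypothetical generators
f = x**3*y - x*y                    # hypothetical right-hand side, here f ∈ <F>

gb = groebner(F, x, y, order='lex')
coeffs, r = gb.reduce(f)            # f = sum(coeffs[i] * gb[i]) + r
solvable = (r == 0)                 # (2.2) is solvable iff the normal form is 0

# sanity check of the reduction identity g1*h1' + ... + gm*hm' = f
identity_ok = expand(sum(c*g for c, g in zip(coeffs, gb.exprs)) + r - f) == 0
```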
2.3 Basis conversion for 0-dimensional ideals — FGLM

This section is based on

J.C. Faugère, P. Gianni, D. Lazard, T. Mora, "Efficient Computation of Zero-dimensional Gröbner Bases by Change of Ordering", J. Symbolic Computation 16, pp. 329–344 (1993).
Gröbner bases depend strongly on the admissible ordering < on the terms or power products, and the complexity of computing them is also strongly influenced by <.

Let I be a 0-dimensional ideal, i.e. an ideal having finitely many common solutions in K^n. Let a basis of I be given by polynomials of total degree less than or equal to d. Then the complexity of computing a Gröbner basis for I is as follows:

• w.r.t. the graduated reverse lexicographic ordering:

d^{O(n^2)}

in general, and

d^{O(n)}

if there are also only finitely many common solutions "at infinity", e.g. if the basis is homogeneous;

• w.r.t. the lexicographic ordering:

d^{O(n^3)}.

But lexicographic orderings are often the orderings of choice for practical problems, because such Gröbner bases have the elimination property (Theorem 2.2.5). Let D = dim_K K[x1, . . . , xn]/I, i.e. D is the number of solutions of I, counted with multiplicities. Then by methods of linear algebra we can transform a Gröbner basis w.r.t. an ordering <1 into a Gröbner basis w.r.t. <2, where the number of arithmetic operations in this transformation is O(n · D^3).
We will use the following notation:
R = K[x1, . . . , xn],
I is a 0-dimensional ideal in R,
G is a reduced Gröbner basis of I w.r.t. the ordering <,
D(I) = dim_K R/I, the degree of the ideal I,
B(G) = {b | b irreducible w.r.t. G}, the canonical basis of the K-vector space R/I,
M(G) = {xi b | b ∈ B(G), 1 ≤ i ≤ n, xi b ∉ B(G)}, the margin of G.

Example 2.3.1. Consider the Gröbner basis

G = {x^3 + 2xy − 2y, y^3 − y^2, xy^2 − xy}

of I w.r.t. the graduated ordering with x < y. Then

D(I) = 7,
B(G) = {1, x, y, x^2, xy, y^2, x^2 y},
M(G) = {x^3, x^3 y, x^2 y^2, xy^2, y^3}. ⊓⊔
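B(G) and D(I) are determined entirely by the leading power products of G: a power product lies in B(G) iff no leading power product divides it. The data of Example 2.3.1 can be recomputed in SymPy; a sketch, assuming we realize "graduated ordering with x < y" by listing the generators as (y, x) under grlex, and that all of B(G) fits in a small exponent box (it does, since the staircase here is finite).

```python
from sympy import symbols, groebner, Poly

x, y = symbols('x y')
# Gröbner basis from Example 2.3.1; generator order (y, x) makes y the greater variable
gb = groebner([x**3 + 2*x*y - 2*y, y**3 - y**2, x*y**2 - x*y], y, x, order='grlex')

# leading exponent vectors (deg in y, deg in x) of the basis elements
lead = [Poly(p, y, x).monoms(order='grlex')[0] for p in gb.exprs]

def irreducible(m):
    # m belongs to B(G) iff no leading power product divides it componentwise
    return not any(all(mi >= li for mi, li in zip(m, l)) for l in lead)

bound = 8   # safe box: every element of B(G) has exponents below the staircase
B = [(i, j) for i in range(bound) for j in range(bound) if irreducible((i, j))]
```

The seven surviving exponent pairs correspond exactly to {1, x, y, x^2, xy, y^2, x^2 y}, so D(I) = 7.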
Theorem 2.3.1. For every m ∈ M(G) exactly one of the following conditions holds:
(i) for every variable xi occurring in m (i.e. xi | m) we have m/xi ∈ B(G); this is the case if and only if m = lpp(g) for some g ∈ G;
(ii) m = xj mk for some j and some mk ∈ M(G).

Proof: (i) For such an m we have m −→G (m is reducible by G), i.e. lpp(g) | m for some g ∈ G. But m/xi is irreducible modulo G for every variable xi. So we must have m = lpp(g). Since G is a reduced Gröbner basis, we also get the converse.
(ii) Let xj | m and m/xj ∉ B(G). Let mk = m/xj. Since m ∈ M(G), we have m = xj mk = xi b for some variable xi and b ∈ B(G). We have i ≠ j and mk/xi = b/xj ∈ B(G). Thus mk = xi (b/xj) ∈ M(G). ⊓⊔
The finitely many elements of B(G) can be listed as

B(G) = {b1, . . . , bD(I)}.

By nf_G(f) we denote the (uniquely defined) normal form of the polynomial f w.r.t. the Gröbner basis G. We investigate the n linear mappings φi on B(G), defined by

φi : bk ↦ nf_G(xi bk).

Definition 2.3.1. The translational tensor T(G) = (tijk) of order n × D(I) × D(I) is defined as

tijk := the j-th coordinate w.r.t. the basis B(G) of nf_G(xi bk), for bk ∈ B(G).

So φi(bk) = ∑_{j=1}^{D(I)} tijk bj. ⊓⊔

The tensor T(G) can be computed in O(n · D(I)^3) arithmetic operations.
Theorem 2.3.2. Let I be a 0-dimensional ideal, G1 a reduced Gröbner basis for I w.r.t. <1, and <2 a different admissible ordering. Then a Gröbner basis G2 for I w.r.t. <2 can be computed by means of linear algebra. This requires O(n · D(I)^3) arithmetic operations.

Proof: Let B(G1) = {a1, . . . , aD(I)}, and let M(G1), T(G1) be as above. We have to determine B(G2) = {b1, . . . , bD(I)} and G2. If I = R, then obviously G1 = G2 = {1} and B(G1) = B(G2) = ∅. So let us assume that I ≠ R.

For determining B(G2) and G2 we construct a matrix C = (cji) such that

bi = ∑_{j=1}^{D(I)} cji aj,   for every bi ∈ B(G2).
We proceed iteratively and start by setting

B(G2) := {1}, M(G2) := ∅, G2 := ∅.

Now let

m := min_{<2} { xj bi | 1 ≤ j ≤ n, bi ∈ B(G2), xj bi ∉ B(G2) ∪ M(G2) }.

Then, by Theorem 2.3.1, we are necessarily in one of the following three cases:
(1) m = lpp(g) for some g which has to be added to G2,
(2) m has to be added to B(G2),
(3) m has to be added to M(G2), but m is a proper multiple of lpp(g) for some g ∈ G2.

Case (3) can be checked easily: lpp(g) < m for every admissible ordering <, and therefore this g has already been added to G2, i.e. we already have lpp(g) in M(G2).
Now let us consider the cases (1) and (2): using the precomputed tensor T(G1) = (tijk) and the already computed components of C we can determine the coordinates of m = xj bi w.r.t. B(G1) as follows:

m = xj bi = xj ∑_k cki ak
  = ∑_k cki (xj ak)
  = ∑_k cki ( ∑_h tjhk ah )
  = ∑_h ( ∑_k tjhk cki ) ah
  =: ∑_h c(m)h ah.

If the vector

c(m) = (c(m)1, . . . , c(m)D(I))

is linearly independent of the vectors in C, then we are in case (2) and we have found a new term m ∈ B(G2). On the other hand, if c(m) is linearly dependent on the vectors in C, then from this dependence we get a new element g ∈ G2. We leave the complexity bound as an exercise. ⊓⊔
The proof of Theorem 2.3.2 is constructive, and we can extract the following algorithm for basis transformation. Since this algorithm is based on the paper of Faugère, Gianni, Lazard, and Mora, it is called the FGLM algorithm.

algorithm FGLM(in: G1, <1, <2; out: G2);
[FGLM algorithm for Gröbner basis transformation.
G1 is a reduced Gröbner basis w.r.t. <1 of I ≠ R, <2 an admissible ordering;
G2 is a Gröbner basis for I(G1) w.r.t. <2.]
(1) determine a := B(G1) = (a1, . . . , aD(I)), D(I), M(G1), T(G1);
(2) B(G2) := (1); M(G2) := ∅; G2 := ∅; C1. := (1, 0, . . . , 0)^T;
(3) N := {xj bi | 1 ≤ j ≤ n, bi ∈ B(G2), xj bi ∉ B(G2) ∪ M(G2)};
    while N ≠ ∅ do
        m := min_{<2} N;
        determine c(m) such that m = a · c(m);
        decide cases (1), (2), (3) in the proof of Theorem 2.3.2, and update accordingly;
        N := {xj bi | 1 ≤ j ≤ n, bi ∈ B(G2), xj bi ∉ B(G2) ∪ M(G2)};
    end;
(4) return G2 ⊓⊔
Example 2.3.2. We consider the polynomial ring Q[x, y].

G1 = {x^3 + 2xy − 2y, y^3 − y^2, xy^2 − xy}

is a reduced Gröbner basis w.r.t. <1, the graduated lexicographic ordering with x < y, for I = ⟨G1⟩. We want to determine a Gröbner basis G2 for I w.r.t. <2, the lexicographic ordering with x < y.

In Step (1) of FGLM we determine
a = (a1, . . . , a7) = B(G1) = (1, x, x^2, y, xy, x^2 y, y^2),
M(G1) = {y^3, xy^2, x^2 y^2, x^3 y, x^3},
T(G1) (blank entries are 0):

         1    x    x^2  y    xy   x^2y  y^2
x·a1          1
y·a1                    1
x·a2               1
y·a2                         1
x·a3                    2    −2
y·a3                              1
x·a4                         1
y·a4                                    1
x·a5                              1
y·a5                         1
x·a6                         −2         2
y·a6                              1
x·a7                         1
y·a7                                    1
The matrix C will be determined in the course of the execution of FGLM, but we already give the final result here:

C:    b1   b2   b3   b4   b5   b6   b7
      1    x    x^2  x^3  x^4  x^5  x^6
a1    1    0    0    0    0    0    0
a2    0    1    0    0    0    0    0
a3    0    0    1    0    0    0    0
a4    0    0    0    2    0    0    0
a5    0    0    0    −2   2    4    −8
a6    0    0    0    0    −2   2    4
a7    0    0    0    0    0    −4   4
In Step (2) we set

B(G2) := (1), M(G2) := ∅, G2 := ∅, C1. := (1, 0, . . . , 0)^T.

Finally we execute the loop in Step (3): N = {x, y}.

m = min_{<2} {x, y} = x;
not case (3);
m = x = (0, 1, 0, 0, 0, 0, 0) · a, independent of C1., so m is added to B(G2);
m = min_{<2} {y, x^2, xy} = x^2;
not case (3);
m = x^2 = (0, 0, 1, 0, 0, 0, 0) · a, independent of C1., C2., so m is added to B(G2);

m = min_{<2} {y, xy, x^3, x^2 y} = x^3;
not case (3);
m = x^3 = (0, 0, 0, 2, −2, 0, 0) · a, independent of C1., C2., C3., so m is added to B(G2);

m = min_{<2} {y, xy, x^2 y, x^4, x^3 y} = x^4;
not case (3);
m = x^4 = (0, 0, 0, 0, 2, −2, 0) · a, independent of C1., . . . , C4., so m is added to B(G2);

m = min_{<2} {y, xy, x^2 y, x^3 y, x^5, x^4 y} = x^5;
not case (3);
m = x^5 = (0, 0, 0, 0, 4, 2, −4) · a, independent of C1., . . . , C5., so m is added to B(G2);

m = min_{<2} {y, xy, x^2 y, x^3 y, x^4 y, x^6, x^5 y} = x^6;
not case (3);
m = x^6 = (0, 0, 0, 0, −8, 4, 4) · a, independent of C1., . . . , C6., so m is added to B(G2);

m = min_{<2} {y, xy, x^2 y, x^3 y, x^4 y, x^5 y, x^7, x^6 y} = x^7;
not case (3);
m = x^7 = (0, 0, 0, 0, −4, −8, 8) · a, and (0, 0, 0, 0, −4, −8, 8) = −2 · C6. + 2 · C5., so x^7 + 2x^5 − 2x^4 is added to G2;

m = min_{<2} {y, xy, x^2 y, x^3 y, x^4 y, x^5 y, x^6 y} = y;
not case (3);
m = y = (0, 0, 0, 1, 0, 0, 0) · a, and (0, 0, 0, 1, 0, 0, 0) = (1/2) · C7. + (1/2) · C6. + (3/2) · C5. + (1/2) · C4., so 2y − x^6 − x^5 − 3x^4 − x^3 is added to G2;

m = min_{<2} {xy, x^2 y, x^3 y, x^4 y, x^5 y, x^6 y} = xy;
case (3), so xy is added to M(G2);
all other power products are also added to M(G2).
The algorithm terminates with the Gröbner basis

G2 = {x^7 + 2x^5 − 2x^4, 2y − x^6 − x^5 − 3x^4 − x^3}

for I w.r.t. <2. ⊓⊔
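The outcome of Example 2.3.2 can be cross-checked by computing the lex Gröbner basis of I directly with SymPy (listing the generators as (y, x) to realize the ordering x < y). Up to scaling by 1/2 in the second polynomial, the reduced lex basis agrees with the G2 produced by FGLM.

```python
from sympy import symbols, groebner

x, y = symbols('x y')
G1 = [x**3 + 2*x*y - 2*y, y**3 - y**2, x*y**2 - x*y]

# lexicographic ordering with x < y: list the generators as (y, x)
gb_lex = groebner(G1, y, x, order='lex')

# the two polynomials produced by FGLM in Example 2.3.2
g_a = x**7 + 2*x**5 - 2*x**4
g_b = 2*y - x**6 - x**5 - 3*x**4 - x**3
```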
2.4 Resultants

Lemma 2.4.1. Let a, b ∈ K[x] be polynomials of degrees m > 0 and n > 0, respectively. Then a and b have a common non-constant factor if and only if there are polynomials c, d ∈ K[x] such that:
(i) ac + bd = 0,
(ii) c and d are not both zero,
(iii) c has degree at most n − 1 and d has degree at most m − 1.
We can use linear algebra to decide the existence of c and d, and in the positive case compute them. The idea is to turn ac + bd = 0 into a system of linear equations as follows:

a = am x^m + · · · + a0,   am ≠ 0,
b = bn x^n + · · · + b0,   bn ≠ 0,
c = c_{n−1} x^{n−1} + · · · + c0,
d = d_{m−1} x^{m−1} + · · · + d0.

Then the equation ac + bd = 0 leads to the linear equations

am c_{n−1} + bn d_{m−1} = 0                                          (coeff. of x^{m+n−1})
a_{m−1} c_{n−1} + am c_{n−2} + b_{n−1} d_{m−1} + bn d_{m−2} = 0      (coeff. of x^{m+n−2})
...
a0 c0 + b0 d0 = 0                                                    (coeff. of x^0)
We can write this as

M · (c, d)^T = 0,

where c = (c_{n−1}, . . . , c0)^T, d = (d_{m−1}, . . . , d0)^T, and the matrix M consists of n shifted columns of coefficients of a and m shifted columns of coefficients of b.

Definition 2.4.1. The Sylvester matrix of f and g w.r.t. x, denoted Syl_x(f, g), is the coefficient matrix of the linear system above. The resultant of f and g w.r.t. x, denoted Res_x(f, g), is the determinant of the Sylvester matrix. ⊓⊔

Theorem 2.4.2. Let f, g ∈ K[x] be of positive degree.
(i) Res_x(f, g) ∈ K is an integer polynomial in the coefficients of f and g.
(ii) f and g have a common non-constant factor in K[x] if and only if Res_x(f, g) = 0.
(iii) There are polynomials A, B ∈ K[x] s.t. Af + Bg = Res_x(f, g). The coefficients of A and B are integer polynomials in the coefficients of f and g.
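The Sylvester construction is easy to carry out explicitly. A sketch in SymPy, for a hypothetical pair f, g that shares the root x = 1 (so the resultant must vanish by Theorem 2.4.2 (ii)); we build the shifted coefficient rows rather than columns, which leaves the determinant unchanged since det(M^T) = det(M).

```python
from sympy import symbols, resultant, Matrix, Poly

x = symbols('x')
f = Poly(x**3 - 2*x + 1, x)   # m = 3; f(1) = 0
g = Poly(x**2 - 1, x)         # n = 2; g(1) = 0

def sylvester(a, b):
    m, n = a.degree(), b.degree()
    ac, bc = a.all_coeffs(), b.all_coeffs()
    rows = [[0]*i + ac + [0]*(n - 1 - i) for i in range(n)]    # n shifted copies of a
    rows += [[0]*i + bc + [0]*(m - 1 - i) for i in range(m)]   # m shifted copies of b
    return Matrix(rows)

S = sylvester(f, g)   # (m+n) x (m+n); its determinant is Res_x(f, g)
```

Here both det(S) and SymPy's built-in `resultant` give 0, reflecting the common factor x − 1.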
Solving systems of algebraic equations by resultants

Theorem 2.4.3. Let K be an algebraically closed field, let

a(x1, . . . , xr) = ∑_{i=0}^{m} ai(x1, . . . , x_{r−1}) xr^i,   b(x1, . . . , xr) = ∑_{i=0}^{n} bi(x1, . . . , x_{r−1}) xr^i

be elements of K[x1, . . . , xr] of positive degrees m and n in xr, and let c(x1, . . . , x_{r−1}) = res_{xr}(a, b). If (α1, . . . , αr) ∈ K^r is a common root of a and b, then c(α1, . . . , α_{r−1}) = 0. Conversely, if c(α1, . . . , α_{r−1}) = 0, then one of the following holds:
(a) am(α1, . . . , α_{r−1}) = bn(α1, . . . , α_{r−1}) = 0,
(b) for some αr ∈ K, (α1, . . . , αr) is a common root of a and b.

Proof: c = ua + vb for some u, v ∈ K[x1, . . . , xr]. If (α1, . . . , αr) is a common root of a and b, then evaluating both sides of this equation immediately yields c(α1, . . . , α_{r−1}) = 0.

Now assume c(α1, . . . , α_{r−1}) = 0. Suppose am(α1, . . . , α_{r−1}) ≠ 0, so we are not in case (a). Let φ be the evaluation homomorphism x1 = α1, . . . , x_{r−1} = α_{r−1}, and let k = deg(b) − deg(φ(b)). By Lemma 4.3.1 in [Win96] we have 0 = c(α1, . . . , α_{r−1}) = φ(c) = φ(res_{xr}(a, b)) = φ(am)^k res_{xr}(φ(a), φ(b)). Since φ(am) ≠ 0, we have res_{xr}(φ(a), φ(b)) = 0. Since the leading coefficient of φ(a) is non-zero, φ(a) and φ(b) must have a common non-constant factor, say d(xr) (see (van der Waerden 1970), Sec. 5.8). Let αr be a root of d in K. Then (α1, . . . , αr) is a common root of a and b. Analogously we can show that (b) holds if bn(α1, . . . , α_{r−1}) ≠ 0. ⊓⊔

Theorem 2.4.3 suggests a method for determining the solutions of a system of algebraic, i.e. polynomial, equations over an algebraically closed field. Suppose, for example, that a system of three algebraic equations is given as

a1(x, y, z) = a2(x, y, z) = a3(x, y, z) = 0.

Let, e.g.,

b(x) = res_z(res_y(a1, a2), res_y(a1, a3)),
c(y) = res_z(res_x(a1, a2), res_x(a1, a3)),
d(z) = res_y(res_x(a1, a2), res_x(a1, a3)).

In fact, we might compute these resultants in any other order. By Theorem 2.4.3, all the roots (α1, α2, α3) of the system satisfy b(α1) = c(α2) = d(α3) = 0. So if there are finitely many solutions, we can check for each of the candidates whether it actually solves the system. Unfortunately, there might be roots of b, c, or d which cannot be extended to solutions of the original system, as we can see from Example 1.2.
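The projection-and-extension procedure can be sketched in SymPy for a hypothetical bivariate system (a circle and a hyperbola, chosen for illustration): eliminate y by a resultant, solve the resulting univariate polynomial, and then check which candidates extend to genuine solutions.

```python
from sympy import symbols, resultant, solve

x, y = symbols('x y')
a1 = x**2 + y**2 - 5   # hypothetical system: a circle ...
a2 = x*y - 2           # ... and a hyperbola

b = resultant(a1, a2, y)   # eliminates y; b vanishes at every solution's x-coordinate
candidates = solve(b, x)

# extend each candidate x-coordinate and keep only the genuine common roots
roots = [(xv, yv) for xv in candidates
         for yv in solve(a2.subs(x, xv), y)
         if a1.subs({x: xv, y: yv}) == 0]
```

For this pair every candidate x-coordinate does extend, giving the four intersection points; in general some candidates may be discarded at the final check, as the text warns.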
For further reading on resultants we refer to [CLO98].