“ And ±i0 < βi0 .

We leave as an exercise to the reader to check that both of the above order

relations satisfy the compatibility property and that the lexicographic ordering

also satis¬es the well-ordering property. However, the reverse lexicographic

order relation is not well ordered. Indeed, with a single variable, we see that

xi revlex xj , if and only if j > i. As the consequence, the in¬nite set S of

monomials de¬ned by:

S = {xi | i ∈ N} (11.20)

has no smallest element.

Thus, the reverse lexicographic order is not a proper monomial ordering.

On the contrary, the lexicographic ordering is admissible and is one of the

most frequent monomial orderings used with Gr¨bner bases.

o

Several other monomial orderings are also used in this context. To introduce

two of them, let us ¬rst de¬ne the total degree of a monomial X ± , as the sum of

n

the exponents, deg(X ± ) = i=1 ±i . It is called the total degree by opposition

to the partial degree in each of the variables. When clear by context, the total

degree of a monomial is simply called its degree. With this notion of degree,

we can de¬ne two additional monomial orderings:

• The graded lexicographic ordering deglex is de¬ned by: ± deglex β

if and only if:

“ Either deg(±) > deg(β)

“ Or deg(±) = deg(β) and ± lex β.

• The graded reverse lexicographic ordering, often known as “grevlex”

grevlex is de¬ned by: ± grevlex β if and only if there exists an index

1 ¤ i0 ¤ n such that:

“ Either deg(±) > deg(β)

“ Or deg(±) = deg(β) and ± revlex β.

1 As explained below, this is not a monomial ordering.

© 2009 by Taylor and Francis Group, LLC

348 Algorithmic Cryptanalysis

Both orderings satisfy the compatibility and well-ordering properties (left as

exercise). In the sequel, the expression graded ordering or total degree

ordering is used to collectively refer to these two orders.

Note that the three monomials ordering we have now de¬ned are, in truth,

families of order parametrized by the order of the variables themselves. Since

n variables can be arranged in n! ways, there are n! monomial orderings in

each family. There exists more monomial orderings that can be used with

Gr¨bner bases. We do not describe these orderings and refer the reader to a

o

textbox on Gr¨bner bases such as [CLO07].

o

Given a monomial ordering and a polynomial f , we de¬ne the head

monomial of f as the largest monomial that appear with non-zero coe¬cient

in f . We denote this head monomial of f by in (f ). When clear by context,

we omit from the notation and simply write in(f ). Using head monomials,

we can associate to any ideal I a monomial ideal in(I) generated by the head

monomials of all polynomials in I. When I already is a monomial ideal, we

see that in(I) = I. In the general case, in(I) can be used to de¬ne Gr¨bner

o

bases:

DEFINITION 11.1 A family of polynomials f1 , f2 , . . . , fm is a Gr¨bner

o

basis for I, if and only if:

I = (f1 , . . . , fm ) and (11.21)

in(I) = (in(f1 ), · · · , in(fm )). (11.22)

An important fact is that any ideal admits a Gr¨bner basis (see [CLO07,

o

Chapter 2]). Moreover, a Gr¨bner basis for an ideal I can be used to construct

o

a Euclidean division rule for I. More precisely:

THEOREM 11.2

If (f1 , · · · , fm ) is a Gr¨bner basis for the ideal I it generates, then any poly-

o

nomial g of K[x1 , · · · , xn ] can be written as:

n

g= gi fi + r, (11.23)

i=1

where none of the monomials with non-zero coe¬cient in r belong to in(I).

This remainder r is unique and does not depend on the choice of the Gr¨bner

o

basis but only on the ideal I and on the chosen monomial ordering. In par-

ticular, if g ∈ I then the remainder is always 0.

The remainder r is called the normal form of g with respect to the ideal I

(and monomial ordering ).

PROOF See [CLO07, Chapter 2].

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 349

As a direct consequence, Gr¨bner bases are often used to determine whether

o

a polynomial belongs to a given ideal or not. The algorithmic version of the

computation of a normal form by Euclidean division as described above is

given as Algorithm 11.1.

Algorithm 11.1 Computation of normal form

Require: Input polynomial g and Gr¨bner basis (f1 , . . . , fm ) of ideal I

o

Let r ←’ 0

Let M ←’ HM (g) {HM stands for Head monomial (with respect to mono-

mial order) with convention HM (0) = 0}

while M = 0 do

for i from 1 to m do

if HM (fi ) divides m then

Let µ ←’ m/HM (fi )

Let » ←’ HC(g)/HC(fi ) {HM stands for Head coe¬cient}

Let g ←’ g ’ »µ fi

Break from FOR loop

end if

end for

if M = HM (g) then

Let r ←’ r + HC(g) M {M has not been reduced, transfer to r}

Let g ←’ g ’ HC(g) M

end if

Let M ←’ HM (g)

end while

Output r (Normal form of g)

11.3.3 Computing roots with Gr¨bner bases

o

Coming back to our initial motivation of ¬nding roots of systems of algebraic

equations, we now consider this problem in the light of Gr¨bner bases. In fact,

o

Gr¨bner bases can be used to address a slightly more general problem, the

o

problem of unknowns elimination, based on the following theorem:

THEOREM 11.3

For any ideal I in the polynomial ring K[x1 , · · · , xn ] and any integer t ∈ [1, n],

let I (t) denotes the set of polynomials in I © K[xt , · · · , xn ]. Then, I (t) is an

ideal in K[xt , · · · , xn ]. Moreover, if G = (f1 , · · · , fm ) is a Gr¨bner basis for I

o

(t)

with respect to the lexicographic order lex , then the subset G of polynomials

in G that belongs to I (t) is a Gr¨bner basis for I (t) .

o

© 2009 by Taylor and Francis Group, LLC

350 Algorithmic Cryptanalysis

PROOF See [CLO07, Chapter 3].

Note that this theorem only holds for the lexicographic ordering. As a

consequence, the applications we now describe require a Gr¨bner basis with

o

respect to this speci¬c ordering. If we are given a Gr¨bner basis for a di¬er-

o

ent ordering and want to perform elimination, we ¬rst need to convert it to

lexicographic ordering (see Section 11.6.4).

To ¬nd roots of an ideal using a Gr¨bner basis for the lexicographic ordering,

o

we start by eliminating all unknowns but one. Equivalently, we consider the

elimination ideal I (n) . Since this ideal is univariate, we know how to ¬nd all

possible for xn . It remains to lift such a partial solution in xn to complete

solution(s) of the original system. We now distinguish several cases.

11.3.3.1 The case of incompatible systems

The simplest case occurs when the initial system of algebraic equations is

incompatible, i.e., when no solution for the system exists over the algebraic

¯

closure K, the Gr¨bner basis for I contains a constant polynomial, say f1 = 1

o

after normalization. Note that in this special case, the elimination ideal I (n)

is also spanned by 1 and there are no partial solutions in xn either.

With incompatible system, the good news is that the Gr¨bner basis con-

o

tains 1, regardless of the monomial ordering that is used. Thus, the knowledge

of a Gr¨bner basis for I under any monomial ordering is enough to check that

o

no solution exists.

11.3.3.2 The case of dimension 0

When a system algebraic equations has a ¬nite number of solutions over

¯ (at least one), it is said to be of dimension 0. In this case, a Gr¨bner o

K

basis G for I as in Theorem 11.3 has a very special form. Namely, for each

unknown xi , there is a polynomial in G whose head monomial is a power of xi .

In addition, if we are considering the lexicographic ordering, this polynomial

only involves the unknowns xi , . . . , xn . For simplicity, we let fi denote the

polynomial of the Gr¨bner basis with a power of xi as head monomial and

o

that only contains xi to xn . We might expect that I only contains these

polynomials; however, this is not true in general.

In any case, the roots of the system can be determined using a simple

backtracking algorithm. First, one ¬nds the roots of fn , which is a univari-

ate polynomial in xn . Then plugging each possible value for xn in fn’1 , one

computes the corresponding values of xn’1 by solving another univariate poly-

nomial. And so on, until the values of all unknowns back to x1 are computed.

In addition, if there are extra polynomials in the Gr¨bner basis, we should

o

check that they are satis¬ed by the candidate solutions. Note that the total

number of solutions can be quite large; counting multiple roots with their

multiplicities, the number of candidates before the ¬nal checking is equal to

the product of the degree of the head monomials in the polynomials f1 to fn .

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 351

Thus, except in the good case where many of the head monomials have degree

1, the total number of roots is at least exponential in n. As a consequence,

computing all these roots can be very costly. However, if we only ask for one

solution of the algebraic system or otherwise restrict the acceptable roots, this

process can become quite e¬cient.

Coming back to the problem of extra polynomials in addition to f1 , . . . ,

fn as de¬ned above, this, in fact, covers two di¬erent things. First, there

is a general fact that Gr¨bner basis can contain too many polynomials, this

o

can be addressed by using the notion of reduced Gr¨bner basis (see De¬ni-

o

tion 11.2 later). Second, somewhat surprisingly, even a reduced Gr¨bner basis

o

may contain extraneous polynomials, in addition to the polynomials f1 to fn

mentioned above.

11.3.3.3 The case of higher dimension

When the system of algebraic equations has an in¬nite number of solutions

¯

over K, the system is under-determined and there are some (at least one)

degrees of freedom left. The dimension of the corresponding ideal is precisely

the number of degrees of freedom that are left. This can be determined from

the Gr¨bner basis of Theorem 11.3. We do not go into more details for this

o

case, because most cryptanalytic applications consider algebraic systems of

dimension 0. However, it remains possible in this case to ¬nd solutions of the

algebraic system.

11.3.4 Homogeneous versus a¬ne algebraic systems

A very frequent special case which is encountered when working with Gr¨bner

o

bases is the case of homogeneous system of equations. To address this case,

we ¬rst need to de¬ne the notion of homogeneous polynomials. Given a mul-

tivariate polynomial F , we say that F is homogeneous, if and only if all the

monomials that occur in F with a non-zero coe¬cient have the same total

degree. For example, F1 (x, y) = x2 + xy + y 2 is a homogeneous polynomial.

An algebraic system of equations is said to be homogeneous, if and only if,

all the polynomials in the system are homogeneous. The de¬nition of homo-

geneous ideals is more complicated. This stems from the fact that all ideals,

including homogeneous ideals, contain non-homogeneous polynomials. Going

back to our example, the ideal generated by the polynomial F1 (x, y) con-

tains (x + 1)F1 (x, y) which is not homogeneous. To get rid of this di¬culty,

we say that an ideal is homogeneous, if and only if, there exists a family of

homogeneous generators for this ideal.

Homogeneous ideals satisfy many additional properties. They are often

considered when working with Gr¨bner bases because they are better behaved

o

and easier to analyze. In particular, given a homogeneous ideal I, it is very

useful to consider its subset Id containing only homogeneous polynomials of

degree d and the zero polynomial. This subset Id is a ¬nite dimensional vector

© 2009 by Taylor and Francis Group, LLC

352 Algorithmic Cryptanalysis

space which naturally occurs in Gr¨bner basis computations.

o

In this book, for lack of space, we do not consider homogeneous systems

but deal with ordinary systems, usually known as a¬ne systems of equations.

This choice is more natural for readers who do not already know Gr¨bner o

basis; however, it induces many complications when going into the details. To

balance this, we sometimes remain imprecise in the following discussion.

To shed some light on the di¬erence between homogeneous and a¬ne sys-

tems in Gr¨bner basis computations, let us quickly mention the notion of

o

“degree fall.” This notion refers to the fact that the sum of two homoge-

neous polynomials of degree d is either a polynomial of degree d or 0. On

the contrary, with a¬ne polynomials, this is no longer true, the sum of two

polynomials of degree d may be 0, a polynomial of degree d or a non-zero

polynomial of degree < d. In this last case, we say that a degree fall occurs

during the sum. While computing Gr¨bner basis, degree falls may occur dur-

o

ing the process and they greatly complicate matters. One consequence is that

in the a¬ne case, in order to obtain vector spaces corresponding to Id for

homogeneous ideals, we need to consider the subset I¤d of all polynomials in

the ideal of degree at most d.

11.4 Buchberger algorithm

The ¬rst algorithm for computing Gr¨bner bases was proposed by Buch-

o

berger. To introduce this algorithm, let us consider a simple example of

an ideal with 2 generators in 2 unknowns, namely the ideal I generated by

f1 = x2 y + 1 and f2 = xy 2 ’ 2. We ¬rst remark that the basis (f1 , f2 ) for I is

not a Gr¨bner basis (for any monomial ordering). Indeed, the ideal in(I) is

o

not generated by the head monomials x2 y and xy 2 of f1 and f2 . In particular,

considering f3 = yf1 ’ xf2 = y + 2x, we see that its head monomial, which

is either x or y depending on the considered ordering, does not belong to the

ideal (x2 y, xy 2 ).

The method we used to construct f3 is the key ingredient of Buchberger™s

algorithm. This polynomial f3 is constructing by subtracting a multiple of f1

and a multiple of f2 in order to guarantee that the two multiples share the

same head monomial and that the contribution of this head monomial vanishes

in f3 . Clearly, the resulting polynomial f3 is unique, up to multiplication by

a constant. It is called a principal syzygy or sometimes a S-polynomial for

f1 and f2 . The principal syzygy of f1 and f2 is denoted by S(f1 , f2 ). In the

sequel, we often write syzygy instead of principal syzygy. Note however that

in general, syzygy has a wider meaning.

The ¬rst application of syzygies is an algorithm, also due to Buchberger,

for testing whether a basis G for an ideal I is a Gr¨bner basis for I. This

o

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 353

algorithm takes all pairs of polynomials from G, constructs their principal

syzygies and computes the remainder of each syzygy modulo G using the

normal form Algorithm 11.1. A theorem of Buchberger states that G is a

Gr¨bner basis for I, if and only if the normal form obtained for each principal

o

syzygy with respect to G is 0. For a proof, refer to [CLO07, Chapter 2].

A simple version of Buchberger™s algorithm can be derived from this decision

algorithm. It su¬ces to add to the initial basis G any non-zero remainder that

appears and to repeat the process until the decision algorithm is happy with

the current basis. Since each of the computed remainders clearly belongs to

the ideal I, the ¬nal basis generates the correct ideal and it is a Gr¨bner basis.

o

This simple version is given as Algorithm 11.2. Note that it needs to compute

the GCD of two monomials M1 and M2 , this is easily done by keeping for

each unknown the minimum of the two exponents in M1 and M2 .

Algorithm 11.2 Basic version of Buchberger™s algorithm

Require: Input list of polynomials G {Throughout the algorithm |G| denotes

the current size of list G}

Let i ←’ 2

while i ¤ |G| do

Let g1 ←’ G[i]

Let M1 ←’ HM (g1 )

Let »1 ←’ HC(g1 )

for j from 1 to i ’ 1 do

Let g2 ←’ G[j]

Let M2 ←’ HM (g2 )

Let »2 ←’ HC(g2 )

Let M ←’ GCD(M1 , M2 )

Let h ←’ »2 (M2 /M )g1 ’ »1 (M1 /M )g2

Let r be the normal form of h with respect to G {Algorithm 11.1}

if r = 0 then

Append r to G {This increments |G|}

end if

end for

Let i ←’ i + 1

end while

Output Gr¨bner basis G

o

Buchberger™s algorithm can be improved from this basic version by adding

rules to avoid the computation of any principal syzygy, for which we can

predict a reduction to zero; see [CLO07, Chapter 2].

We now continue our example, using a monomial ordering such that y x.

Thus y is the head monomial of f3 . We see that the principal syzygy of f1

© 2009 by Taylor and Francis Group, LLC

354 Algorithmic Cryptanalysis

and f3 is S(f1 , f3 ) = f1 ’ x2 f3 = ’2x3 + 1, since x3 is not a multiple of

the previously encountered head monomial, the remainder of this syzygy is

f4 = ’2x3 + 1. Similarly, the syzygy of f2 and f3 is S(f2 , f3 ) = f2 ’ xyf3 =

’2yx2 ’2, adding 2f1 , we see that the corresponding remainder is 0. Next, we

turn to S(f1 , f4 ) = 2xf1 +yf4 = 2x+y, whose remainder is 0 since S(f1 , f4 ) =

f3 . We also see that S(f2 , f4 ) = 2x2 f2 + y 2 f4 = ’4x2 + y 2 = (’2x + y)f3 has

remainder 0 and that S(f3 , f4 ) = x2 f3 + f4 = yx2 + 1 = f1 has remainder 0.

We conclude that (f1 , f2 , f3 , f4 ) is a Gr¨bner basis for I.

o

However, let us remark that this Gr¨bner basis is too large. Indeed, f1 =

o

2 2

x f3 + f4 and f2 = (xy ’ 2x )f3 ’ 2f4 , thus f1 and f2 can safely be removed

and (f3 , f4 ) also is a Gr¨bner basis for I. This also shows that Gr¨bner bases

o o

are not uniquely determined when an ideal and monomial ordering are ¬xed.

Should unicity be required, the notion of reduced Gr¨bner basis can be

o

used.

DEFINITION 11.2 A family of polynomials F = (f1 , f2 , · · · , fm ) is a

reduced Gr¨bner basis for I, if and only if:

o

1. F is a Gr¨bner basis for I

o

2. For all fi in F , no monomial appearing in fi is a head monomial in the

subfamily F ’ {fi }.

3. The coe¬cient of the head monomial in each polynomial fi is 1.

It is easy to obtain a reduced Gr¨bner basis from a Gr¨bner basis with a

o o

simple reduction procedure given as Algorithm 11.3. Moreover, any non-zero

ideal I has a unique reduced Gr¨bner basis; see [CLO07, Chapter 2].

o

11.5 Macaulay™s matrices

In Section 11.2, we saw that in the bivariate case, polynomial systems can

be computed using resultants, which rely on the computation of determinants

of Sylvester™s matrices. In the general case, this can be somehow generalized

by using Macaulay™s matrices.

Macaulay™s matrices are usually de¬ned to represent homogeneous ideals

(see Section 11.3.4). Following our convention in this chapter, we adapt the

de¬nition to the a¬ne case. Given an ideal I, a Macaulay™s matrix of degree

d for I is a matrix that describes a basis for the vector space I¤d introduced

in Section 11.3.4).

Given a Gr¨bner basis for I, it is easy to construct a Macaulay™s matrix

o

M¤d (I) of I¤d . The columns of this matrix are labelled by the monomials

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 355

Algorithm 11.3 Reduction of a Gr¨bner basis

o

Require: Input Gr¨bner basis G as list of polynomials {Throughout the

o

algorithm |G| denotes the current size of list G}

Let i ←’ 1

while i ¤ |G| do

Let g ←’ G[i]

Let r be the normal form of g with respect to G ’ G[i] {Algorithm 11.1}

if r = 0 then

Append G[i] from G {This decrements |G| and renumber polynomials}

else

Let » ←’ HC(r)

Let r ←’ r/»

Replace G[i] by r

Let i ←’ i + 1

end if

end while

Output reduced Gr¨bner basis G

o

of degree ¤ d. With n variables in degree at most d, there are n+d such d

monomials. The rows of the matrix are labelled by simple multiples of the

polynomials in the Gr¨bner. For example, if f is a polynomial of degree d in

o

the Gr¨bner basis, for each monomial m of degree at most d’d there is a row

o

labelled mf . In particular, the matrix of Sylvester can be interpreted in that

way. Starting from f1 and f2 of respective degree d1 and d2 in x, the matrix

contains d1 +d2 columns labelled from x0 to xd1 +d2 ’1 and d1 +d2 rows labelled

x0 f1 to xd2 ’1 f1 and x0 f2 to xd1 ’1 f2 . The only noticeable di¬erence is that

for Sylvester™s matrices, the highest degree monomials are traditionally the

leftmost columns, while for Macaulay™s matrices they are usually the rightmost

ones. Rede¬ning the order of rows and columns in Sylvester™s matrices would,

depending on the degrees d1 and d2 , a¬ect the sign of resultants.

Conversely, Lazard showed in [Laz83] that by performing linear algebra on

Macaulay™s matrices of the ideal I, up to high enough a degree, it is possible

to obtain a Gr¨bner basis for I. This property is, in particular, used for the

o

algorithms presented in the next section.

11.6 Faug`re™s algorithms

e

When computing Gr¨bner bases with Buchberger™s algorithm, the practical

o

limits are quickly encountered. As a consequence, many applications, espe-

cially in cryptanalysis need to use more advanced algorithms. Among those,

© 2009 by Taylor and Francis Group, LLC

356 Algorithmic Cryptanalysis

two algorithms by Faug`re, called F4 and F5 (see [Fau99, Fau02]) are often

e

considered. The papers presenting these two algorithms focus on di¬erent

aspects of Gr¨bner basis computations. On the one hand, F4 considers how

o

to improve the computation of reduced syzygies that occurs in Buchberger™s

algorithm, by using (sparse) linear algebra algorithms on Macaulay™s matri-

ces. On the other hand, F5 focuses on the issue of avoiding reduction to

zero during the computation. Indeed, it is well known that a large fraction

of the syzygies considered during Gr¨bner basis computations yields the zero

o

polynomial after reduction modulo the current basis of the ideal.

The basic idea of the F4 algorithm is to use Macaulay™s matrices in order to

represent polynomial ideals, compute syzygies and speed up the computation

required by Buchberger™s algorithm. The basic idea of the F5 algorithm is to

organize the computation at the abstract algebraic level in order to predict and

remove many of the reductions to zero. For optimal performance of Gr¨bner o

basis computations, it is best to use both the F4 and F5 ideas at the same

time.

11.6.1 The F4 approach

As said above, the F4 approach heavily relies on the use of Macaulay™s

matrices. Thus, it is useful to explicitly specify the correspondence between

polynomial ideals and matrices. If we restrict ourselves to polynomials of total

degree at most d within an ideal, we need to represent a vector space of ¬nite

dimension.

Let us start by considering a vector space Vd be a vector space generated by

an arbitrary (¬nite) set Sd of polynomials of degree at most d. We can easily

describe Vd with a matrix if we label the columns of the matrix by monomials

and ¬ll each row with a polynomial by placing in the column corresponding to

a monomial the coe¬cient of this monomial in the polynomials. In particular,

if we multiply this matrix by the vector formed of all monomials in the same

order, we obtain a vector whose coordinates are the polynomials of Sd . To

make sure that this encoding is meaningful in the context of Gr¨bner basis, the

o

mapping between monomials and columns should conform to the monomial

ordering for which the Gr¨bner basis is computed. Moreover, for the sake of

o

compatibility with the linear algebra algorithms, it is preferable to consider

that this mapping is in decreasing order starting from the leftmost column.

With this convention, the head monomial of a polynomial corresponds to the

leftmost non-zero entry in the corresponding row.

In the case of ideals, we encounter complications because a family of poly-

nomials which generates an ideal does not su¬ce to generate the full vector

space. Temporarily forgetting about the degree limit, we can construct an

in¬nite generating family simply by putting together all multiples of the orig-

inal polynomials. Putting back the degree limit, two very di¬erent cases arise.

With homogeneous ideals, it su¬ces to consider the multiples of the original

polynomials up to the degree limit. With a¬ne ideals, because of degree falls,

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 357

this is no longer true and we might need additional polynomials.

To illustrate this by a basic example, let us consider the ideal generated by

f1 (x, y) = x and f2 (x, y) = x2 + y. Up to degree 1, there is a single multiple

to consider: f1 itself. However, the ideal generated by f1 and f2 contains the

2

polynomial f2 ’ f1 = y, which is of degree one.

Thus, we see that for ¬xed degree d, we are faced with two challenges.

First, we need to make sure that the matrix we construct truly covers all

polynomials of degree up to d. Second, we should represent this matrix in a

convenient form which allows us to quickly test whether a given polynomial

belongs to the vector subspace spanned by this matrix.

The second issue is quite easy to address, it su¬ces to transform the matrix

into row echelon form as described in Section 3.3.3 of Chapter 3. However,

in the context of Gr¨bner basis computations, we need to slightly change

o

the implementation of the linear algebra algorithm, in order to mix some

dense and some sparse rows within the same matrix. Indeed, on the one

hand, in some rows we simply need to store multiples of polynomials of lower

degree and it is preferable to store them in factored form. On the other hand,

in some rows we need to store polynomials which have been computed as

syzygies and it is often easier to store them in dense form. Note that in tuned

implementations, it can be very useful to use various techniques, described

in [Fau99], to also compress these polynomials.

The ¬rst issue is more complicated. Indeed, due to degree fall, when com-

puting the row echelon form of Macaulay™s matrix up to degree d, new poly-

nomials of degree smaller than d may appear due to a degree fall. In that

case, we should not only add this polynomial to the matrix but also all its

multiples. Due to this possibility, it is not enough to compute the multiples

of the initial polynomials up to degree d and to perform linear algebra in

order to make sure that Macaulay™s matrix is complete. Instead, the F4 ap-

proach couples this linear algebra approach with the notion of syzygy as in

Buchberger™s algorithm.

Still, the basic idea of constructing all multiples of the given polynomial

up to some degree d, putting them in a matrix and reducing this matrix

in row echelon form is a very useful thought experiment. Note that, if d is

high enough, this yields a Gr¨bner basis. The main problem is that, short of

o

taking an unreasonably high degree, we do not know beforehand how high d

should be. Moreover, the degree of monomials that needs to be considered

with such a direct approach is usually larger than the maximum degree that

occurs during more advanced Gr¨bner basis algorithm. In addition, except

o

for systems of equations with speci¬c properties, in order to ¬nd the right

value of d, we essentially need to compute a Gr¨bner basis of the system.

o

Yet, we give for further reference an algorithmic version of this basic idea as

Algorithm 11.4. Note that this algorithm is essentially what has been called

linearization/relinearization in [KS99].

© 2009 by Taylor and Francis Group, LLC

358 Algorithmic Cryptanalysis

Algorithm 11.4 A basic linear algebra based Gr¨bner basis algorithm

o

Require: Number of unknowns n

Require: Input list of polynomials G

Require: Maximal degree D to consider

Let N ←’ n+d {Number of monomials, see Exercise 6}

d

Create array A (with N columns)

for i from 1 to |G| do

for all M monomial of degree at most D ’ deg G[i] do

Let h ←’ M · G[i]

Add a new row in A encoding h

end for

Compute row echelon form for A

Let m denote the number of non-zero rows in A

Create empty list H

for i from m downto 1 do

Transform i-th row of A into polynomial h {This loop considers head

monomials in increasing order.}

if HM (h) not divisible by any head monomial of polynomial in H

then

Add h to H

end if

end for

end for

Output H

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 359

11.6.2 The F5 approach

With Buchberger™s algorithm and also when using the F4 approach, it has

been remarked that a large fraction of the polynomials that are constructed

reduces to zero. As a consequence, the algorithms spend a large fraction

of their running time computing and reducing polynomials which are then

discarded.

To analyze this problem, let us consider the thought experiment Algo-

rithm 11.4. Assume that f1 and f2 are two polynomials of degree d1 and

d2 in the initial system. If we construct the matrix containing all multiples

up to degree d1 +d2 , the identity f2 f1 ’f1 f2 = 0 is going to induce a reduction

to zero during the linear algebra. To see why, it is useful to label the rows we

are inserting in the matrix. Given an initial polynomial fk and a monomial m,

we label the rows containing a description of the polynomial mfk by the label

mFk . Once this is done, we can easily label a linear combination of rows, in

N

the following way: given a polynomial g = i=1 ±i mi then gFk denotes the

N

linear combination of rows: i=1 ±i (mi Fk ). With this convention, any linear

combination of rows can be described as a sum of products of the form gi Fi .

We now remark that the above identity implies that the linear combination

f2 F1 ’ f1 F2 is equal to 0. This is a generic reduction to zero which occurs for

an arbitrary system of equations.

The goal of the F5 approach is precisely to avoid all such generic reductions

to 0. In the context of Algorithm 11.4, let us describe all possible generic

reductions to 0. Assume that the initial system contains polynomials from f1

to fk , labeled F1 to Fk and denote by I the ideal generated by the initial

sequence of polynomial from f1 to f . Let g be an arbitrary polynomial in

I , which can be written as i=1 gi fi . Assuming that < k, we have an

identity gf +1 ’ f +1 g = 0. From this identity, we easily see that the linear

combination

’

gF (f +1 gi )Fi

+1

i=1

is equal to zero. To avoid the corresponding reduction to zero, we should make

sure that at least one monomial involved in this combination is not included

in Macaulay™s matrix. Moreover, if we remove a single row for this reduction

to zero, we are not changing the vector space spanned by the matrix. Thus,

the row echelon form we obtain is una¬ected2 . In [Fau02], Faug`re proposed

e

to remove the row mF +1 corresponding to the product of the head monomial

m of g by the polynomial f +1 . Of course, this should be done for all possible

head monomials in I . Faug`re also proved in [Fau02] that for homogeneous

e

ideals, this approach removes all generic reduction to 0.

Up to now, we have described Faug`re™s idea in the context of Algorithm 11.4.

e

However, we would like to use it with practical Gr¨bner basis algorithms. In

o

2 More precisely, the only change to the row echelon form is the removal of a row of zeros.

© 2009 by Taylor and Francis Group, LLC

360 Algorithmic Cryptanalysis

order to do this, we need to be able to label all the polynomials that occur

during these computations. The problem is that, with these algorithm, we are

no longer inserting multiples of the initial polynomials but more complicated

combinations obtained from syzygies. Of course, we could always label the

k

linear combination as above with a label of the form i=1 gi Fi . However, this

would be cumbersome to manipulate. The good news is that to remove generic

reduction to zero, it su¬ces to keep the “head term” of this complicated label.

More precisely, if g is the ¬rst non-zero polynomial in the complete label and

m is its head monomial, then we simply keep m F as simpli¬ed label. These

labels can be easily tracked during the computations of syzygies. Indeed, let

us consider a syzygy between f and g of the form ±f mf f ’±g mg g and assume

that f and g are respectively labelled m F and m F with > . Then the

syzygy is simply labelled by (mf m )F . If = , we should keep the largest

of the two monomials m and m in the label. Note that we cannot have

both = and m = m because it would correspond to inserting twice the

same row in Macaulay™s matrix and lead to a generic reduction to zero.

11.6.3 The speci¬c case of F2

A speci¬c case which arises quite often in cryptography is the computation

of Gr¨bner basis over the ¬nite ¬eld F2 . Moreover, in this case, we usually

o

want to ¬nd solutions of the system over the ¬nite ¬eld itself and not over its

algebraic closure. In order to make this information available to the Gr¨bner

o

basis, a traditional method is to add the so-called “¬eld equations.” For an

unknown x, the ¬eld equation over F2 simply is x2 +x = 0. It is clear that over

F2 both 0 and 1 satisfy this equation. Moreover, any value strictly belonging

to an extension ¬eld of F2 does not satisfy it. Thus, adding ¬eld equations let

us focus on the ground ¬eld solutions. Note that this can be generalized to

Fp by considering the equations xp ’ x = 0. However, when p becomes even

moderately large, the degree of the ¬eld equations is too big to be useful in

Gr¨bner basis computations. With F2 , the ¬eld equations are only quadratic

o

and very helpful to speed up the computations of Gr¨bner bases.

o

However, this is not the only optimization which can be used over F2 . We

can also improve the linear algebra as in Chapter 3 using low-level optimiza-

tion tricks. In fact, similar tricks can be used to add the ¬eld equations

implicitly, rede¬ning multiplication to re¬‚ect the property that x2 = x. This

allows to represent the monomials in a more compact way. However, a bad

side e¬ect is that the syzygies between a polynomial and an implicit ¬eld

equation should be addressed in a speci¬c fashion.

For illustration purposes, a simple implementation of Gr¨bner basis com-

o

putation over F2 simultaneously using the F4 and F5 approaches is available

on the book™s website.

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 361

11.6.4 Choosing and changing monomial ordering for Gr¨bner

o

bases

In theory, all the algorithms that we have seen for computing Gr¨bner bases

o

can be applied with an arbitrary monomial ordering. However, in practice, it

has often been remarked that the choice of monomial ordering greatly impacts

the running time of the computation. As a general rule, total degree orderings

are to preferred. More precisely, the ordering grevlex is usually the best

possible choice. Depending on the speci¬c problem at hand, one can also

¬ne-tune the order of the individual variables within the monomial ordering.

This becomes a problem when the application we have in mind requires

a speci¬c monomial ordering. For example, we have seen that to solve a

algebraic system of equations, the lexicographic ordering should be preferred.

The good news is that, once we have computed a Gr¨bner basis of an ideal for

o

some monomial ordering, it is easier to derive from this a Gr¨bner basis for

o

another ordering than to recompute this second basis from scratch. Moreover,

speci¬c algorithms have been proposed to address this problem. They are

called ordering change algorithms for Gr¨bner bases. In this section, we brie¬‚y

o

discuss these algorithms.

When considering ordering change, two cases need to be distinguished. For

ideals of dimension 0, i.e., ideals with ¬nitely many solutions, there is a method

due to Faug`re, Gianni, Lazard and Mora (see [FGLM93]), which is called

e

FGLM. This method is based on linear algebra and very e¬cient. We sketch

its main idea below in a simpli¬ed version. Thanks to this method, the best

approach to compute Gr¨bner bases in dimension 0 is usually to start by

o

computing a basis for the grevlex ordering and then to convert this basis to

the desired ordering. For ideals of larger dimension, there exists a technique

called the Gr¨bner Walk [CKM97] which can be used for conversion. However,

o

in practice, it is much less e¬cient than FGLM. Thus, problems which involve

ideals of positive dimension are often harder to solve than similar problems

in dimension 0. The details of the Gr¨bner Walk algorithm are out of the

o

scope of this book; however, it is useful to know that this algorithm is often

available in Gr¨bner basis computation packages.

o

11.6.4.1 A simpli¬ed variant of FGLM

Assume that we are given a reduced Gr¨bner basis G for some ideal I

o

and some monomial ordering, usually grevlex. We know that to solve the

algebraic system, it is useful to convert this to obtain a Gr¨bner basis G for

o

the lexicographic ordering. Moreover, if I has dimension 0, there exists in G

a univariate polynomial f (x1 ) involving only the smallest variable x1 in the

lexicographic ordering.

To illustrate FGLM, we now present a simpli¬ed version of this algorithm,

whose goal is to e¬ciently recover f (x1 ) from G. In order to do this, we

¬rst de¬ne the set B(G) of all monomials which are not divisible by a head

monomial of G, with respect to the monomial ordering used for computing G.

© 2009 by Taylor and Francis Group, LLC

362 Algorithmic Cryptanalysis

A very important fact is that, since I has dimension 0, B(G) is a ¬nite set.

Let us also consider the set V (G) of all polynomials after reduction modulo

G. This set is a ¬nite dimensional vector space and B(G) is a basis for V (G).

In the vector space V (G), we can consider the action of the multiplication

by x1 . More precisely, starting from a polynomial P in V (G), we multiply it

by x1 and perform reduction modulo G. The result x1 P (mod G) is a poly-

nomial in V (G). Since, V (G) is ¬nite dimensional, any large enough family

of polynomials in V (G) is linearly dependent. In particular, the sequence of

monomials 1, x1 , . . . xN reduced modulo G is linearly dependent when N is

1

large enough. Moreover, such a dependence relation can be translated into a

polynomial f such that f (x1 ) is equal to 0 modulo G. Thus, f (x1 ) belongs to

I. If f is minimal, i.e., if N is minimal, this polynomial is the ¬rst polynomial

of G , that we were looking for.

From an algorithmic point-of-view, we need to consider two issues. First,

we need a fast method for multiplication by x1 in order to produce the se-

quence xi . Second, we need to recover f from the expression of the monomials.

1

The ¬rst issue can be addressed by remarking that to multiply a polynomial

by x1 , it su¬ces to multiply each of its terms by x1 and to sum the results.

For monomials, the multiplication by x1 is very easy and we only need to

study the reduction modulo G. For the monomial m, two cases arise: ei-

ther x1 m is in B(G) and there is nothing to do, or x1 m is divisible by the

head monomial of a polynomial g in G. In that case, we need to perform a

full reduction modulo G. An important improvement on this basic method

is proposed in [FGLM93] to replace this full reduction by a simple subtrac-

tion of a multiple of g, followed by the re-use of already computed monomial

multiplications. We do not fully describe this improvement which requires

to simultaneously compute the multiplication of monomials by each possible

variable xi .

The second issue is a simple matter of using linear algebra algorithms to

¬nd a linear dependency between the reduced expressions for the monomials

xi . The rest of the polynomials in G are found by computing extra linear

1

dependencies in other families of monomials.

11.7 Algebraic attacks on multivariate cryptography

To illustrate the use of Gr¨bner basis computations in cryptography, we are

o

going to consider attacks against a speci¬c public key cryptosystem based on

multivariate cryptography and called HFE (Hidden Field Equations). This

section presents a description of HFE together with an account of Gr¨bner-

o

based attacks against this cryptosystem. Note that these attacks do not work

on all the variations of the HFE system as proposed in [Pat96], they target

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 363

the basic HFE scheme. For more information about multivariate public key

cryptography, the reader may refer to [BBD07, pages 193“242].

11.7.1 The HFE cryptosystem

The HFE scheme proposed by Patarin in [Pat96] is a multivariate crypto-

graphic scheme based on the hardness of solving quadratic systems of multi-

variate equations over ¬nite ¬elds. Its binary version uses quadratic systems

over F2 . The rationale behind this scheme is that solving quadratic systems

over F2 is an NP-complete problem.

In order to describe the HFE cryptosystem, we are going to alternate be-

tween two di¬erent representations of the polynomial ring F2n [X]. In what

we call the secret view, we consider the ¬nite ¬eld with its complete alge-

braic structure and take a univariate polynomial ring over this ¬nite ¬eld. In

the public view, we strip a large amount of structure and consider the mul-

tivariate polynomial ring F2 [x0 , · · · , xn’1 ]. To link the two representations,

we choose a basis for F2n , for example a polynomial basis given by 1, ±, . . . ,

n’1

±n’1 , replace X by i=0 xi ±i and take the coordinates of the result. Using

this link, any polynomial F in X is translated into n polynomials f0 , . . . ,

fn’1 (with fi denoting the ±i -coordinate). Moreover, if the degree of the

monomials appearing in F is well chosen, it is possible to make sure that the

coordinates fi are quadratic. Indeed, it su¬ces to remark that any monomial

i

X 2 is obtained by iterating the Frobenius map i times and has a linear ex-

i j

pression in the xi s. As a consequence, any monomial X 2 +2 with i = j leads

to a quadratic expression.

The key idea behind HFE is that in the secret view, evaluating F at a point

X is easy and solving the equation Y = F (X) for a given value of Y is also

easy. In the public view, evaluating n quadratic polynomials remains easy,

but solving a system of n quadratic equations is hard in general. Thus, F

is a candidate trapdoor one-way function, the knowledge of the secret view

being the trapdoor. Note that to make sure that F remains easy to invert,

i

we need to put an upper bound D on the degree of the monomials X 2 and

i j

X 2 +2 appearing in F . In order to make the description of HFE complete,

a single ingredient is missing: how do we guarantee that the secret view is

hidden and not revealed by the public view? The answer proposed in [Pat96]

is astonishingly simple: perform two (invertible) linear changes, one on the

input variables and one on the resulting system of equations. Despite the

simplicity of this ingredient, no known simple attack can recover the secret

view from the public view of an HFE system.

To make the description of HFE more precise, we ¬rst ¬x the two parameters

n and D and choose a random polynomial F in F2n [X] of degree at most

D that only contains that constant monomial and monomials of the form

i i j

X 2 and X 2 +2 . We also choose two random invertible n — n matrices over

F2 and produce the public system of quadratic functions by considering the

composition T —¦ f —¦ S. In other words, we proceed as follows:

© 2009 by Taylor and Francis Group, LLC

364 Algorithmic Cryptanalysis

• Let x0 , . . . , xn’1 be the input variables and ¬rst compute a linear change

of variables by writing:

« «

x0 x0

¬ x1 · ¬ x1 ·

¬ . · = S · ¬ . ·. (11.24)

¬ · ¬ ·

. .

. .

xn’1 xn’1

n’1

• Let X = i=0 xi ±i , consider the polynomial F (X) and take its coor-

dinates writing:

n’1

fi (x0 , · · · , xn’1 )±i .

F (X) = (11.25)

i=0

• Compute the second change of coordinates, letting:

« «

f0 f0

¬ f1 · ¬ f1 ·

¬ . · = T · ¬ . ·. (11.26)

¬ · ¬ ·

. .

. .

fn’1 fn’1

After computing the polynomials fi , they are published as the public key of

the HFE scheme. Since these polynomials are quadratic in n variables over F2 ,

they can be described quite compactly. Indeed, there are n(n’1)/2 quadratic

monomials xi xj , n linear monomials xi and a single constant monomial. All

in all, the n polynomials can be encoded using about n3 /2 bits. The HFE

public operation simply consists of evaluating the n public polynomials on a

n-uple of bits. For encryption purposes, it is important to make sure that

each ciphertext has a unique decryption. Since this is not guaranteed for an

arbitrary function F , it is necessary in this case to add redundancy in the

encrypted n-uple, in order to allow unique decryption. We do not describe

this mechanism here.

A important remark about HFE is that inverting F , with knowledge of

the secret view, requires time polynomial in n and D. As a consequence, to

express this time in terms of a single parameter, it is natural to assume that

there exists a parameter γ such that D = O(nγ ). Letting t denote log2 (D),

this implies that t = O(log n). Note that t is an upper bound on the values

i i j

i and j that may appear in the monomials X 2 and X 2 +2 . In practice, a

value of D around 128 is frequently considered.

11.7.2 Experimental Gr¨bner basis attack

o

Clearly, in order to break HFE, it su¬ces to ¬nd a method that allows

inversion of an algebraic system of equations based on the polynomials fi

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 365

and on a target n-uple of bits. In general, for large values of n, currently

available algorithms completely fail to compute a Gr¨bner basis for n random

o

quadratic polynomials over F2 . In particular, for n = 80 which is one of the

parameters proposed as a challenge by Patarin, random quadratic systems are

out of range of Gr¨bner algorithm.

o

However, quadratic systems based on the HFE trapdoor are not random.

In particular, this was illustrated by Faug`re who was able to break an HFE

e

challenge for n = 80, see [FJ03]. The experimental fact is that, when faced

with HFE systems, the Gr¨bner basis algorithm only involves the computation

o

of polynomials and syzygies of surprisingly low degree. This degree is related

to the degree D of the secret polynomial, for example for 17 ¤ D ¤ 128 no

polynomial of degree higher than 4 appears. Moreover, the resulting Gr¨bner

o

basis contains a very large number of linear polynomials, almost n. As a

consequence, once the Gr¨bner basis is obtained for the grevlex ordering,

o

there is no need to perform a change of ordering, it su¬ces to compute all

the solutions of the linear system and test each of them against the original

system of equations.

This experimental result was published, together with a partial heuristic

explanation based on monomial count in [FJ03].

11.7.3 Theoretical explanation

The reason for the experimental behavior of HFE systems can be analyzed

in a much more precise fashion. The key argument is that the degree of

the polynomials and syzygies that appear in a Gr¨bner basis computation is

o

essentially una¬ected by the two sorts of linear changes that are used when

constructing the HFE secret key.

As a consequence, it su¬ces to analyze the highest degree that can be

reached for a Gr¨bner basis computation, when the linear transforms are

o

omitted. Thanks to the use of normal bases, it is even possible to study

an even simpler system of equations over the extension ¬eld F2n . The idea

is to represent an element X in F2n in a redundant manner as a n-uple

n’1

(X, X 2 , X 4 , · · · , X 2 ) containing X and all its copies by the Frobenius map,

i.e., by squaring. There is a linear change, with coe¬cients in F2n between

this n-uple and the coordinates of X in any ¬xed basis for F2n over F2 . For

simplicity, we give a name to each unknown in the above n-uple and let Xi

i

denote X 2 . With this representation of X, the polynomial F can be writ-

ten as a quadratic polynomial G(X0 , X1 , · · · , Xn’1 ). The good news is that,

when looking more precisely at the construction of G from F , this quadratic

polynomial only involves a subset of the unknowns (X0 , · · · , Xt ) where t is

the parameter t = log2 D that we introduced earlier. Moreover, applying the

Frobenius maps i times to the equation G(X0 , X1 , · · · , Xn’1 ) = Y , we obtain

new equations:

i

G(Xi , Xi+1 , · · · , Xi+n’1 ) = Y 2 , (11.27)

© 2009 by Taylor and Francis Group, LLC

366 Algorithmic Cryptanalysis

where the indices are reduced modulo n in case of over¬‚ow.

Of course, we also have extra quadratic equations that translate the Frobe-

2

nius relations, i.e., the equations Xi+1 = Xi . Putting all this together, if

we take t copies of G by applying the Frobenius map up to t ’ 1 times, we

have a total of 3t ’ 1 quadratic equations in 2t unknows. Since the number of

equations in this system is 1.5 times the number of unknowns, it is possible

to use generic complexity analysis results on Gr¨bner bases computation to

o

O(t)

show that the running time is bounded by t ; see [GJS06]. Note that our

speci¬c use of this complexity result involves a heuristic argument. However,

it can be made rigorous if we only aim at constructing a distinguisher between

HFE public keys and random quadratic systems with equivalent parameters.

To summarize, the relative weakness of HFE schemes to Gr¨bner techniques

o

comes from the fact that the system hides by linear changes a system that

can be obtained from the secret view and that involves much fewer equations.

11.7.4 Direct sparse approach on Macaulay™s matrix

To conclude this chapter, it is important to mention that it is, sometimes,

possible to consider a very direct approach to Gr¨bner basis computations for

o

cryptographic applications, similar to Algorithm 11.4. To give a single exam-

ple, consider once again the case of HFE systems, we know that a Gr¨bner o

basis computation only involves polynomial up to some predictable degree and

that the resulting Gr¨bner basis for the corresponding ideal contains many

o

linear polynomials.

Then, an option is to consider a matrix similar to Macaulay™s matrix de¬ned

in Section 11.5 and to search for linear polynomials in the vector space spanned

by this matrix. To construct this matrix, we proceed exactly as in Section 11.5

or Algorithm 11.4. Note that, since we a not starting from a Gr¨bner basis

o

for our ideal, we obtain an incomplete copy of the Macaulay matrix of the

same degree. If we consider high enough a degree, we may hope, thanks to

degree falls, to ¬nd linear polynomials in the vector space spanned by this

matrix. Ignoring this di¬culty for now, we may see that linear polynomials

in the space spanned by Macaulay™s matrix can be found by looking for kernel

elements of a truncated Macaulay™s matrix from which the columns labelled

by the constant and linear monomials have been removed. Clearly, such a

kernel element de¬nes a linear combination of rows that is non-zero on all

monomials of degree 2 or more. This yields either the zero polynomial, the

unit polynomial or a non-trivial linear polynomial. Note the unit polynomial

can only occur if the algebraic system of equations generates the unit ideal,

in that case, the system of equations does not have any solutions.

Since we are looking for kernel elements in a matrix, we may use the al-

gorithms of Chapter 3. Moreover, looking more precisely at our variation of

Macaulay™s matrix, we may remark that it is a sparse matrix. Indeed, each

row in the matrix represents a multiple of a polynomial of low degree and,

thus, it cannot contain too many non-zero coe¬cients. As a consequence, we

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 367

can apply sparse linear algebra algorithms to ¬nd kernel elements. In fact,

we can do better than that. The key argument is to remember that, with

sparse linear algebra algorithm, the main requirement is to be able to e¬-

ciently multiply vectors on the left by the matrix we are considering and also

by its transpose. Since Macaulay™s matrix is obtained by putting together

several shifted copies of the matrix encoding the initial polynomials, we can

implement multiplication of a vector by the complete matrix by executing

several multiplications of well-chosen subvectors by the original matrix and

by reassembling the results into a single vector. This yields a faster matrix

by vector multiplication than a straightforward sparse approach. Indeed, this

approach is compatible with bitslice operations and also cache-friendly.

11.8 On the complexity of Gr¨bner bases computation

o

The complexity of a speci¬c Gr¨bner basis computation depends on several

o

parameters, the underlying ¬eld, the number of unknowns, the number of

equations and their degrees. However, the most important parameter is the

maximum degree of the polynomials that occurs during the computation.

This maximum degree may greatly vary even for similarly looking systems of

equations. In fact, this is the parameter which explains why HFE systems

can be solved, while random systems with similar speci¬cations cannot.

When using F5 with the grevlex technique on homogeneous system, it is

possible by using sophisticated mathematical techniques to compute an upper

bound on a technical parameter called the degree of semi-regularity of the

system.

Without entering the technical details, let us simply mention that the de-

gree of semi-regularity provides a heuristic bound on the maximum degree of

polynomials occuring in the Gr¨bner basis computation.

o

A good reference for this, written in French, is the Ph.D. thesis of Bardet [Bar04].

Let us quote a few very useful theorems from [Bar04]. We start by two theo-

rems which hold for computing Gr¨bner bases over large ¬nite ¬elds.

o

THEOREM 11.4 Bardet 4.1.2

For a ¬xed value of k and n tending to in¬nity, the degree of semi-regularity

for a system of n + k equations of degree D in n unknowns is asymptotically

upper bounded by:

√

D2 ’ 1

D’1

’ ±k

n n + o( n), (11.28)

2 6

for some constant ±k .

© 2009 by Taylor and Francis Group, LLC

368 Algorithmic Cryptanalysis

THEOREM 11.5 Bardet 4.4.1

For a ¬xed value of ± > 1 and n tending to in¬nity, the degree of semi-

regularity for a system of ±n quadratic equations in n unknowns is asymptot-

ically upper bounded by:

’a1 1/3 2± ’ 1

+ o(n’1/3 ),

(± ’ 1/2 ’ n ’ 2’

β) n + (11.29)

1/6 4β 1/2

2β

where β = ±(±’1) and a1 ≈ ’2.33811 is a constant (the largest root of Airy™s

Ai function).

For algebraic systems over F2 , the results are slightly di¬erent. We have:

THEOREM 11.6 Bardet 4.4.3

For a ¬xed value of ± > 1 and n tending to in¬nity, the degree of semi-

regularity reached for a system of ±n quadratic equations in n unknowns over

F2 is asymptotically upper bounded by:

11

n + O(n1/3 ). (11.30)

2±2 ’ 10± ’ 1 + 2(± + 2)

’± + + ±(± + 2)

22

In this theorem, the ¬eld equations are implicitly counted and should not

be counted when determining the value of ±. For large values of ±, the

bound given by this theorem becomes close to the bound obtained by using

Theorem 11.5 for (± + 1)n equations. However, when ± is small the speci¬c

F2 bound is much tighter.

© 2009 by Taylor and Francis Group, LLC

Polynomial systems and Gr¨bner base computations

o 369

Exercises

1. Consider the monomial orderings described in this chapter and show

that they indeed satisfy the well-ordering and compatibility properties.

Show that the reverse lexicographic order relation is not a monomial

ordering.

2. Compute the resultant of x and x + 25. Also compute the resultant of

x2 and x2 + 5. Which of these two pairs of equations has a common

root modulo 25? Conclude that resultants only give information about

roots modulo prime, not prime powers.

3h . Let f be a polynomial of degree d. Recall that the reciprocal polynomial

of f is xd f (1/x). Given two polynomials f and g, what is the relation

between the resultant of f and g and the resultant of their reciprocal

polynomials?

4. Consider the polynomials xy + 1 and x2 y + x + y and compute their

resultant, eliminating x. What are the roots of the resultant? Show

that they cannot be completed into solutions of the bivariate system.

5h . Construct an example of a bivariate ideal of dimension 0 whose Gr¨bner

o

contains more than 2 polynomials.

6h . What is the number of monomials of degree d in n unknowns? The

number of monomials of degree at most d? Write a program that assigns

a number to each monomial of degree at most d and explicitly computes

the bijection between a monomial and its number.

7. Let f (x) be a polynomial of degree d. Compute the resultant of f

and ax + b. This expression is especially useful for the algorithms of

Chapter 15.

This chapter can be a source for numerous implementations projects. A

good start is to implement the basic linear algebra approach given as Algo-

rithm 11.4. This can be completed by considering algorithmic techniques to

reduce the amount of memory: compressing the polynomial representations

or using iterative algorithms. An alternative approach is to add F5 criterias

to this algorithm to construct smaller matrices with very few reductions to

zero during the row echelon form computation.

© 2009 by Taylor and Francis Group, LLC

Part III

Applications

© 2009 by Taylor and Francis Group, LLC

Chapter 12

Attacks on stream ciphers

Stream ciphers are widely encountered in applications where resources are

limited and speed is critical. They come in two ¬‚avors: keystream generators

and self-synchronizing stream ciphers. However, keystream generators are

much more frequent and we only consider this kind of stream ciphers in this

chapter.

Before studying keystream generators, let us ¬rst recall Shannon™s one time

pad. In its binary version, each message is viewed as a sequence of bits and

encrypted by bitwise xoring each bit of message with a corresponding bit

of key. Despite its extreme simplicity, this encryption algorithm provably

ensures the con¬dentiality of encrypted messages. The only information that

can be learned by an eavesdropper is the length of the message. However,

the security proof of the one time pad requires a perfect key: each bit of key

should be generated at random from a uniform distribution and should be

independent from other key bits. In particular, it is well know that re-using

the key material with a one time pad yields a severely broken system; it is

known as the parallel message attack. As a consequence, using the one time

pad in practice is very challenging and rarely done.

Even with a perfect key, the security of the one time pad holds against

a very limited class of adversaries: passive eavesdroppers. If active attacks

are allowed, the one time pad is no longer secure. There are two main di-

rections that allow active adversaries to attack one time pad systems. The

¬rst approach is to remark that encrypted strings are malleable and thus do

not protect the integrity of transmitted messages. The second approach is

to remark that from a plaintext/ciphertext pair, the encryption key can be

directly recovered. As a consequence, if the attacker can trick the receiver

into decrypting a fake message, he can obtain the key and decrypt the real

message later on.

Keystream generators can be seen as a practical way of using the one time

pad. Instead of using an extremely long key, the two parties of the encrypted

communication possess a common short secret key and use the keystream

generator to generate a large random looking string. This idea of random

looking strings can be formalized into polynomial time indistinguishability

from random, called IND$ in Chapter 1.

Note that encryption schemes based on keystream generators inherit the

weaknesses of the one time pad. The same keystream should never be re-

373

© 2009 by Taylor and Francis Group, LLC

374 Algorithmic Cryptanalysis

used to avoid the parallel message attack. Moreover, a secure integrity check

mechanism should be used together with the keystream generator is order to

protect the integrity of messages. For example, following the Encrypt-then-

MAC approach presented in Chapter 1, it su¬ces to add a MAC tag of the

encrypted message to avoid the malleability issue.

Another feature of practical keystream generators is that they usually ac-

cept on auxiliary input, the initial value or IV that allows the users to generate

di¬erent keystreams for di¬erent messages without requiring too much has-

sle to manage these keystreams and to avoid re-using key material between

messages.

12.1 LFSR-based keystream generators

Linear Feedback Shift Registers (LFSR) are often used as a basis for pseudo-

random generators. One important reason is that these generators have a long

period, in particular, when the feedback polynomial is primitive (see Chap-

ter 2, Section 2.5.2), the period of an LFSR with n cells is 2n ’ 1. However,

while directly using an LFSR as pseudo-random source in a video game or

in scienti¬c computing is a good idea, where cryptographic applications are

concerned, this is not acceptable. Indeed, after observing 2n output bits, an

adversary can completely reconstruct the state and feedback polynomial of

the LFSR being observed, using either a simple linear algebra approach or

the more speci¬c Berlekamp-Massey algorithm presented in Chapter 2.

However, the long period property, together with the good statistical prop-

erties of output sequences are highly desirable and it remains tempting to use

LFSRs as core components of keystream generators. Several basic construc-

tions are frequently used. They all aim at using the good properties of LFSRs

while hiding the linearity.

Throughout the rest of this chapter, we are mostly going to study the

security of one particular type of LFSR-based stream ciphers: ¬ltered gen-

erators. However, several kinds of LFSR-based generators exist; let us now

review some of these constructions, starting with the ¬ltered generator.

The ¬ltered generator The ¬ltered generator tries to hide the linearity of

a core LFSR by using a complicated non-linear output function on a few bits.

At each step, the output function takes as input t bits from the inner state

of the LFSR. These bits are usually neither consecutive, nor evenly spaced

within the register.

This output function produces a single bit, which is given as the LFSR

output. The main di¬culty when using the ¬ltered generator is to choose an

adequate function. It should not be too simple, otherwise the keystream gen-

© 2009 by Taylor and Francis Group, LLC

Attacks on stream ciphers 375

erator becomes weak; it should not be too complicated, otherwise it becomes

the bottleneck of the generator.

The function f is usually described either as a table of values or as a polyno-

mial. Note that, using the techniques of Section 9.2, f can always be expressed

as a multivariate polynomial over F2 .

Non-linear output from multiple generators A variant of the ¬ltered

generator is to run several LFSRs in parallel and to output at each clock

a non-linear function of bits coming from these generators. A particularly

simple version is the Ge¬e generator, it is constructed from three LFSRs,

one serves as a control LFSR and the other two as output LFSRs. When

the control LFSR produces a 0, the output of the keystream comes from the

¬rst of the output LFSRs. When it produces a 1, the output comes from the

second generator. With each clock signal, all three LFSRs advance.

The Ge¬e generator can be generalized to the selection from one generator

among many, using a control value. Ge¬e™s generators are highly vulnerable

to correlation attacks.

Another variation on this idea is the shrinking generator. Here we only

have two LFSRs. The ¬rst one serves as a control LFSR, the second one as

an output generator. When the control LFSR produces a ˜0™, no keystream

output is generated. When the control LFSR produces a ˜1™, the output bit of

the output generator is added to the keystream. Once again, with each clock

signal, both LFSRs advance. Implementing the shrinking generator requires

special care to hide the irregular rhythm of the output production. Indeed,

if this rhythm can be precisely measured, then the state of both LFSRs can

easily be recovered through a basic side-channel attack.

Keystream generators with memory In order to obtain more complex

keystream generators, it is also possible to add memory that preserves some

non-linear information from one clock step to the next.

A typical example is the summation generator. From a mathematical point-

of-view, this generator views the output of each of its n internal LFSRs as the

binary representation of a large integer, with low bits coming ¬rst. Then, it

sums these n numbers and outputs the binary representation of the sum.

In practice, this is realized by using a small binary adder and a carry regis-

ter. Initially, the carry is initialized to 0. The carry register can store a small

integer in the range [0, n ’ 1] and the binary adder computes the sum of the n

LFSR output and of the previous carry. The result of this sum S is a number

in the range [0, 2n ’ 1]. Its parity is the output of the keystream generator,

the high order bits of S, i.e., the value S/2 is recycled as the next value of

the carry register to allow carry propagation.

Clock controlled generators Another way of combining LFSRs in a non-

linear fashion is to use the output of a control LFSR or of a simple keystream

© 2009 by Taylor and Francis Group, LLC

376 Algorithmic Cryptanalysis

p

1 1

1

’

p

xt zt

p

’

1

p

0 0

Figure 12.1: Noisy LFSR (Binary Symmetric Channel) model

generator to clock the output generator. For example, with a control LFSR

and two output LFSRs, we obtain the alternating step generator. As each step

in time, the output of the keystream is the XOR of the two output LFSRs.

However, at each step, only one of the two LFSRs states is advanced. When

the control bit is a ˜0™, the ¬rst output LFSR advances, when the control bit

is a ˜1™, the second one does.

12.2 Correlation attacks

12.2.1 Noisy LFSR model

A very convenient way of modeling LFSR based keystream generators is to

consider that the keystream is the output of a regular LFSR masked by some

noise. This approach is called the noisy LFSR model or the binary symmetric

channel model. In this model, if xt denotes the output of the LFSR at time

t, the keystream zt is constructed as follows:

xt with probability p,

zt = (12.1)

xt • 1 with probability 1 ’ p.

This is often represented as in Figure 12.1.

In this model, when the probability p is equal to 1/2, zt is purely random

and carries no information about xt . When p is equal to 0 or 1, zt is simply a

(possibly inverted) copy of xt . Due to symmetry of the construction, we may

safely assume that p > 1/2. Otherwise, if su¬ces to ¬‚ip zt in order to replace

p by 1 ’ p. Under this assumption, it is convenient to write p = (1 + )/2 or

equivalently to let = 2p ’ 1. This value is called the bias of the binary

symmetric channel.

In order to see why this model is relevant in the case of the ¬ltered genera-

tor, we can use the techniques of Chapter 9. Assume that the bit of keystream

zt is obtained from k bits of output as the LFSR as f (xt , xt’δ1 , · · · , xt’δk’1 ).

Then f is a function from {0, 1}k to {0, 1}. Using a Walsh transform based

approach, we compute the best possible linear approximation Lf of f , this