is the initial encoding of the function S in an adequate representation. In
the binary case, this initial encoding is a simple change from an additive
representation by elements of {0, 1} to a multiplicative representation {−1, 1}.
In the general case, this is more complex, since we need to transform the table
S of p^n elements into a larger (p − 1) × p^n array. In fact, this initial encoding
can be obtained by specifying the Walsh transform of a constant function
with no input, say S_0() = α_0. Using the definition, since the function can
only be evaluated on the (unique) empty input, we have L^M_1(S_0)(α_0) = 1 and
L^M_1(S_0)(α) = 0, for α ≠ α_0. As a direct consequence, we find:

\[ \bar{L}^M_1(S_0)(\alpha_0) = \frac{p-1}{p} \quad\text{and}\quad \bar{L}^M_1(S_0)(\alpha) = -\frac{1}{p}, \ \text{for } \alpha \neq \alpha_0. \tag{9.30} \]
To avoid denominators, it is easier to multiply these numbers by p. Of course,
since all computations are linear, this implies that the results are scaled by

© 2009 by Taylor and Francis Group, LLC
Fourier and Hadamard-Walsh transforms 291

the same factor of p. It is interesting to remark that in the binary case, we
obtain (up to sign) the usual encoding.

In addition, it is easy to remark that this Walsh transform on F_p is almost
its own inverse. As in the binary case, during inversion we need to add a
division by p. Moreover, during inversion, we need an additional change of
sign when relating S to its restrictions. More precisely, in Equation (9.29),
α − m_0 x_0 needs to be replaced by α + m_0 x_0. Of course, in the binary case,
since 1 = −1 (mod 2), this change of sign is not visible. We give a pseudo-code
description of the Walsh transform over F_p and its inverse as Algorithm 9.8;
the preliminary encoding is given as Algorithm 9.7.

Algorithm 9.7 Pre-Walsh transform encoding over F_p
Require: Input table S containing p^n elements of F_p
  Create (p − 1) × p^n table S′
  for i from 0 to p^n − 1 do
    for j from 1 to p − 1 do
      Let S′[j][i] ←− −1
    end for
    if S[i] ≠ 0 then
      Let S′[S[i]][i] ←− p − 1
    end if
  end for
Require: Output encoded table S′
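As an illustration, the encoding step can be sketched in Python. The function name and the list-of-rows layout are our own choices; the table S′[j][i] for j = 1, ..., p − 1 is stored here as row j − 1 of the returned list:

```python
def pre_walsh_encode(S, p, n):
    """Sketch of Algorithm 9.7: encode a table S of p**n elements of F_p
    into a (p-1) x p**n array.  Every entry of column i is set to -1,
    except that row S[i] is set to p - 1 when S[i] is non-zero."""
    E = [[-1] * (p ** n) for _ in range(p)]   # row 0 is implicit, never returned
    for i in range(p ** n):
        if S[i] != 0:
            E[S[i]][i] = p - 1
    return [E[j] for j in range(1, p)]        # keep only rows 1..p-1
```

For p = 2 this reproduces (up to sign) the usual {0, 1} → {±1} encoding mentioned above.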

9.4.1 Complexity analysis

Since Algorithm 9.8 is presented in iterative form, its complexity analysis is
quite easy. First, when p is fixed, we see that only the outer loop depends on
n and that the running time is O(n · p^n), where the implicit constant depends
on p. When p also varies, we need to be more careful in our analysis. The
contribution of the outer loop is easy: it occurs n times. Put together, the
next two loops on Pos and i have a total contribution of p^{n−1} iterations, since
Pos advances by steps of p · Sz. At the deepest level, we have a maximum
of three additional levels of loops, with a total contribution bounded by O(p^3). As
a consequence, when accounting for both n and p, in terms of arithmetic
operations the running time can be written as O(n · p^{n+2}). Asymptotically,
this can be improved for large values of p by using fast Fourier transforms;
this is described in Section 9.5.

292 Algorithmic Cryptanalysis

Algorithm 9.8 Walsh transform algorithm over F_p
Require: Input encoded (p − 1) × p^n table S
Require: Input Boolean value inverse
Comment: Variable Sz is the small table size
Comment: Variable Pos is the small table position
  Create temporary p × p table T.
  for l from 0 to n − 1 do
    Let Sz ←− p^l, Pos ←− 0
    while (Pos < p^n) do
      for i from 0 to Sz − 1 do
        for k from 0 to p − 1 do
          Let Sum ←− 0
          for m from 1 to p − 1 do
            Let T[m][k] ←− S[m][Pos + k · Sz + i]
            Let S[m][Pos + k · Sz + i] ←− 0
            Let Sum ←− Sum + T[m][k]
          end for
          Let T[0][k] ←− −Sum
        end for
        for k from 0 to p − 1 do
          for m from 1 to p − 1 do
            for j from 0 to p − 1 do
              if inverse = false then
                Let m′ ←− m − kj (mod p)
              else
                Let m′ ←− m + kj (mod p)
              end if
              Let S[m][Pos + k · Sz + i] ←− S[m][Pos + k · Sz + i] + T[m′][j]
            end for
          end for
        end for
        if inverse = true then
          for k from 0 to p − 1 do
            for m from 1 to p − 1 do
              Let S[m][Pos + k · Sz + i] ←− S[m][Pos + k · Sz + i]/p
            end for
          end for
        end if
      end for
      Let Pos ←− Pos + p · Sz
    end while
  end for
Output overwritten content of S containing the (inverse) Walsh transform
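A direct Python transcription of this pseudocode may help the reader; the sketch below uses our own naming. It takes the p − 1 encoded rows, applies the transform digit by digit, rebuilds the implicit row 0 as the negated column sum, and on inversion uses the division by p, which is exact on valid inputs:

```python
def walsh_fp(S, p, n, inverse=False):
    """Sketch of Algorithm 9.8: (inverse) Walsh transform over F_p.
    S is a list of p-1 rows (row index m-1 holds row m) of length p**n,
    as produced by the pre-Walsh encoding; a transformed copy is returned."""
    A = [row[:] for row in S]
    N = p ** n
    for l in range(n):
        Sz = p ** l
        Pos = 0
        while Pos < N:
            for i in range(Sz):
                T = [[0] * p for _ in range(p)]   # temporary p x p table
                for k in range(p):
                    total = 0
                    for m in range(1, p):
                        T[m][k] = A[m - 1][Pos + k * Sz + i]
                        A[m - 1][Pos + k * Sz + i] = 0
                        total += T[m][k]
                    T[0][k] = -total              # implicit row 0: columns sum to 0
                for k in range(p):
                    for m in range(1, p):
                        for j in range(p):
                            # forward uses m - kj, the inverse uses m + kj
                            mp = (m + k * j) % p if inverse else (m - k * j) % p
                            A[m - 1][Pos + k * Sz + i] += T[mp][j]
                if inverse:
                    for k in range(p):
                        for m in range(1, p):
                            A[m - 1][Pos + k * Sz + i] //= p   # exact division
            Pos += p * Sz
    return A
```

Applying the forward transform and then the inverse recovers the encoded table, as the text claims.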


9.4.2 Generalization of the Moebius transform to F_p

As in the binary case, the algebraic normal form of a function f is obtained
by writing:

\[ f(x_0, \dots, x_{n-1}) = \sum_{(a_0,\dots,a_{n-1}) \in \mathbb{F}_p^n} g(a_0, \dots, a_{n-1}) \prod_{i=0}^{n-1} x_i^{a_i} \pmod{p}. \tag{9.31} \]

There are two essential differences here compared to the binary case. The
first difference is that g takes values modulo p instead of binary values. The
second one is that the exponent of each variable runs from 0 to p − 1, instead
of simply being 0 or 1. The reason behind the second difference is that for
any value x in F_p, x^p − x is always zero. This implies that when giving a
polynomial expression for f, any exponent greater than or equal to p can be
reduced modulo p − 1. Note that x^{p−1} should not be replaced by 1, because
the two expressions differ when x = 0.
Starting again with a univariate function f on a single entry x_0 modulo p,
we see that:

\[ f(0) = g(0), \quad f(1) = \sum_{i=0}^{p-1} g(i), \quad \dots, \quad f(j) = \sum_{i=0}^{p-1} g(i)\, j^i, \quad \dots \tag{9.32} \]

Thus, the vector of values of f is obtained by multiplying the vector of values
of g by a fixed matrix:

\[ H_p = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 1 & 1 & 1 & \cdots & 1 \\ 1 & 2 & 4 & \cdots & 2^{p-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & p-1 & (p-1)^2 & \cdots & (p-1)^{p-1} \end{pmatrix}, \tag{9.33} \]

defined modulo p.
For symmetry, it is better to reorder the rows of H_p. For this, choose a
generator h of the multiplicative group F_p^* and write:

\[ f(h^j) = \sum_{i=0}^{p-1} g(i)\, h^{ij} = (g(0) + g(p-1)) + \sum_{i=1}^{p-2} g(i)\, h^{ij}. \tag{9.34} \]


The sum on the right-hand side of this equation can be recognized as a Fourier
transform (see next section) of the vector G of p − 1 elements defined by
G(0) = g(0) + g(p − 1) and G(i) = g(i) for 1 ≤ i ≤ p − 2. Modulo p, the
inverse transform is given by:

\[ g(0) = f(0), \qquad g(j) = -\sum_{i=0}^{p-2} f(h^i)\, h^{-ij} \ \text{for } 1 \le j \le p-2, \qquad g(p-1) = -\sum_{i=0}^{p-2} f(h^i) - f(0). \tag{9.35} \]

This can be used to construct Algorithm 9.9 for Moebius transforms over F_p.

9.5 Fast Fourier transforms

Fourier transforms and fast algorithms for computing them are very general
tools in computer science and they are very closely related to Walsh trans-
forms. In order to define Fourier transforms in the general case, we require
the existence of a root of unity ξ of order N. The Fourier transform defined
from this root ξ maps vectors of N numbers to vectors of N numbers; the
transform of a vector X is the vector X̂ given by:

\[ \forall i:\ \hat{X}_i = \lambda_\xi \sum_{j=0}^{N-1} X_j\, \xi^{ij}, \tag{9.36} \]

where λ_ξ is a renormalization factor. In practice, λ_ξ is chosen to maximize
convenience and its value may depend on the context we are considering.
It is easy to show that, up to a constant factor, the Fourier transform based
on ξ^{−1} is the inverse of the Fourier transform based on ξ. Indeed, the i-th

Algorithm 9.9 Moebius transform algorithm over F_p
Require: Input table S describing function f on n variables, with p^n entries
Comment: Variable Sz is the small table size
Comment: Variable Pos is the small table position
  Create table T of p − 1 elements
  Choose h, a generator of F_p^*
  for i from 0 to n − 1 do
    Let Sz ←− p^i, Pos ←− 0
    while (Pos < p^n) do
      for j from 0 to Sz − 1 do
        Set table T to 0
        for k from 0 to p − 2 do
          for l from 0 to p − 2 do
            Let T[l] ←− T[l] + S[Pos + h^k · Sz + j] · h^{−kl} mod p
          end for
        end for
        for l from 1 to p − 2 do
          Let S[Pos + l · Sz + j] ←− −T[l]
        end for
        Let S[Pos + (p − 1) · Sz + j] ←− −T[0] − S[Pos + j]
      end for
      Let Pos ←− Pos + p · Sz
    end while
  end for
Output overwritten content of S containing Moebius transform
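A minimal Python rendering of this algorithm may make the indexing clearer; the function name is ours, and h must be a generator of the multiplicative group of F_p:

```python
def moebius_fp(f, p, n, h):
    """Sketch of Algorithm 9.9: recover the coefficients of the algebraic
    normal form of a function on F_p^n from its table of p**n values.
    Works on a copy of the value table; h generates F_p^*."""
    S = list(f)
    hinv = pow(h, p - 2, p)                  # h^(-1) mod p, by Fermat
    N = p ** n
    for i in range(n):
        Sz = p ** i
        Pos = 0
        while Pos < N:
            for j in range(Sz):
                T = [0] * (p - 1)
                for k in range(p - 1):
                    for l in range(p - 1):
                        T[l] = (T[l] + S[Pos + pow(h, k, p) * Sz + j]
                                * pow(hinv, k * l, p)) % p
                for l in range(1, p - 1):
                    S[Pos + l * Sz + j] = (-T[l]) % p
                S[Pos + (p - 1) * Sz + j] = (-T[0] - S[Pos + j]) % p
            Pos += p * Sz
    return S
```

For n = 1 the result can be checked directly against multiplication by the matrix H_p of Equation (9.33).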


component after applying both transforms is:

\[ C_i = \lambda_{\xi^{-1}} \lambda_\xi \sum_{j=0}^{N-1} \xi^{-ij} \sum_{k=0}^{N-1} X_k\, \xi^{jk} = \lambda_{\xi^{-1}} \lambda_\xi \sum_{k=0}^{N-1} X_k \sum_{j=0}^{N-1} \xi^{j(k-i)} = \lambda_{\xi^{-1}} \lambda_\xi\, N\, X_i. \tag{9.37} \]

The last equality in this chain comes from the fact that \(\sum_{j=0}^{N-1} \xi^{jt}\) is equal to 0
whenever t is non-zero modulo N and equal to N when t is 0 modulo N.
Due to this inversion formula, two choices of the renormalization factors
are frequently encountered. In theoretical texts, the usual choice is:

\[ \lambda_\xi = \lambda_{\xi^{-1}} = \frac{1}{\sqrt{N}}. \tag{9.38} \]

In computer science, and especially when writing programs, it is often more
convenient to choose:

\[ \lambda_\xi = 1 \quad\text{and}\quad \lambda_{\xi^{-1}} = \frac{1}{N}. \tag{9.39} \]

Indeed, with this choice, we do not need to perform any divisions when com-
puting the direct transform. Moreover, all the divisions by N that occur
during the inverse transform are known to be exact. These divisions by N
are analogous to the divisions by 2 that occur in the inverse Walsh transform
Algorithm 9.3. Of course, if the value of ξ is not an integer, this point is moot.
In particular, this is the case when the Fourier transforms are performed over
the complex numbers, using for example:

\[ \xi = e^{2\pi\sqrt{-1}/N}. \tag{9.40} \]

However, when performing Fourier transform computations over finite fields,
choosing λ_ξ = 1 is a real asset.
From the definition of the Fourier transform, it is very easy to write a
simple algorithm with quadratic running time to compute Fourier transforms;
see Exercise 2. However, this algorithm is not optimal and, as in the case of
Walsh transforms, it is possible to compute Fourier transforms much faster,
in time O(N log N). Interestingly, the algorithms involved greatly depend on the
value of the order N of the root ξ, but the end result holds for all values of N.
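The quadratic algorithm alluded to above can be written in a few lines of Python; running it once with ξ and once with ξ^{−1}, then dividing by N, illustrates the inversion formula (this sketch, with names of our choosing, uses floating-point complex numbers):

```python
import cmath

def dft(X, sign=1):
    """Quadratic-time discrete Fourier transform straight from Equation (9.36),
    with renormalization factor 1; sign=-1 uses the root xi^(-1), giving the
    unnormalized inverse (divide the result by N afterwards)."""
    N = len(X)
    xi = cmath.exp(sign * 2j * cmath.pi / N)    # root of unity of order N
    return [sum(X[j] * xi ** (i * j) for j in range(N)) for i in range(N)]
```

Each output coefficient costs N multiplications, so the whole transform takes O(N^2) operations.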

9.5.1 Cooley-Tukey algorithm

The most commonly known algorithm for computing discrete Fourier trans-
forms is due to J. Cooley and J. Tukey [CT65]. This algorithm is based on a
mathematical expression that makes it possible to write the discrete Fourier
transform of order N = N_1 N_2 in terms of smaller transforms of order N_1 or N_2. As
above, let ξ be a primitive root of order N and define ξ_1 = ξ^{N_2} and ξ_2 = ξ^{N_1}.
Clearly, ξ_1 is a primitive root of order N_1 and ξ_2 is a primitive root of order
N_2. The key idea is to renumber X and X̂, writing:

\[ X_{i_1,i_2} = X_{i_1 N_2 + i_2} \ \text{with } 0 \le i_1 < N_1 \text{ and } 0 \le i_2 < N_2, \text{ and} \tag{9.41} \]
\[ \hat{X}_{j_1,j_2} = \hat{X}_{j_1 + j_2 N_1} \ \text{with } 0 \le j_1 < N_1 \text{ and } 0 \le j_2 < N_2. \]

Note that two different renumberings are used for X and X̂. Using these
renumberings and ignoring the renormalization factor λ_ξ, we can write:
\[ \hat{X}_{j_1,j_2} = \sum_{i=0}^{N-1} X_i\, \xi^{i(j_1 + j_2 N_1)} = \sum_{i_2=0}^{N_2-1} \sum_{i_1=0}^{N_1-1} X_{i_1,i_2}\, \xi^{(i_1 N_2 + i_2)(j_1 + j_2 N_1)} \tag{9.42} \]
\[ = \sum_{i_2=0}^{N_2-1} \sum_{i_1=0}^{N_1-1} X_{i_1,i_2}\, \xi_1^{i_1 j_1}\, \xi_2^{i_2 j_2}\, \xi^{i_2 j_1} = \sum_{i_2=0}^{N_2-1} \xi^{i_2 j_1}\, \xi_2^{i_2 j_2} \left( \sum_{i_1=0}^{N_1-1} X_{i_1,i_2}\, \xi_1^{i_1 j_1} \right). \]

Defining an intermediate step X̃_{k_1,k_2} as:

\[ \tilde{X}_{k_1,k_2} = \sum_{i_1=0}^{N_1-1} X_{i_1,k_2}\, \xi_1^{i_1 k_1}, \tag{9.43} \]

we see that X̃ can be computed by N_2 independent evaluations of discrete
Fourier transforms of order N_1. After this first series of Fourier transforms,
we obtain:

\[ \hat{X}_{j_1,j_2} = \sum_{i_2=0}^{N_2-1} \xi^{i_2 j_1}\, \tilde{X}_{j_1,i_2}\, \xi_2^{i_2 j_2}. \tag{9.44} \]

Thus, X̂ can be obtained by a second series of N_1 independent Fourier trans-
forms of order N_2 applied to X̃′ defined by:

\[ \tilde{X}'_{k_1,k_2} = \xi^{k_1 k_2}\, \tilde{X}_{k_1,k_2}. \tag{9.45} \]

This approach of Cooley and Tukey works very nicely when N can be writ-
ten as a product of small primes. In that case, the above decomposition can
be applied recursively and yields a fast discrete Fourier transform algorithm.
The typical case of application is N = 2^n and is described in Algorithm 9.10.
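The decomposition above can be checked on a small example. The following Python sketch (our own code, using a naive quadratic DFT as the inner transform) performs one Cooley-Tukey step for N = N_1 N_2 exactly as in the displayed equations:

```python
import cmath

def naive_dft(X, xi):
    # quadratic-time DFT with a given root of unity xi
    N = len(X)
    return [sum(X[j] * xi ** (i * j) for j in range(N)) for i in range(N)]

def ct_step(X, N1, N2):
    """One Cooley-Tukey decomposition step (Equations (9.41)-(9.45)):
    a DFT of order N = N1*N2 via N2 DFTs of order N1, twiddle factors
    xi^(i2*j1), and N1 DFTs of order N2."""
    N = N1 * N2
    xi = cmath.exp(2j * cmath.pi / N)
    xi1, xi2 = xi ** N2, xi ** N1
    # first series: transforms of order N1 on X[i1*N2 + i2], i2 fixed
    inner = [naive_dft([X[i1 * N2 + i2] for i1 in range(N1)], xi1)
             for i2 in range(N2)]
    out = [0] * N
    for j1 in range(N1):
        # twiddle by xi^(i2*j1), then a transform of order N2
        twisted = [xi ** (i2 * j1) * inner[i2][j1] for i2 in range(N2)]
        col = naive_dft(twisted, xi2)
        for j2 in range(N2):
            out[j1 + j2 * N1] = col[j2]
    return out
```

The output agrees with a direct order-N DFT, confirming the renumbering of Equation (9.41).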


Algorithm 9.10 Fast Fourier transform algorithm on N = 2^n values
Require: Input table T containing 2^n entries
Require: Input Boolean value inverse
  if inverse = false then
    Let ξ ←− e^{2π√−1/2^n}
  else
    Let ξ ←− e^{−2π√−1/2^n}
  end if
  for k from 0 to n − 1 do
    Let Sz ←− 2^{n−1−k}, Pos ←− 0
    while (Pos < 2^n) do
      for j from 0 to Sz − 1 do
        Let Sum ←− T[Pos + j] + T[Pos + Sz + j]
        Let Diff ←− ξ^j · (T[Pos + j] − T[Pos + Sz + j])
        if inverse = true then
          Let T[Pos + j] ←− Sum/2
          Let T[Pos + Sz + j] ←− Diff/2
        else
          Let T[Pos + j] ←− Sum
          Let T[Pos + Sz + j] ←− Diff
        end if
      end for
      Let Pos ←− Pos + 2 · Sz
    end while
    Let ξ ←− ξ^2
  end for
Output overwritten content of T containing the (inverse) Fourier transform, with entries listed in bit-reversed index order
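For reference, the same N = 2^n transform can also be written recursively. The sketch below splits the input into even- and odd-indexed halves; it is a standard textbook formulation in our own code, not a line-by-line transcription of Algorithm 9.10, and it returns the coefficients in natural order:

```python
import cmath

def fft2(X, sign=1):
    """Recursive radix-2 Cooley-Tukey FFT for len(X) = 2**n.
    sign=-1 uses the root xi^(-1): dividing the result by len(X) then gives
    the inverse, matching the choice lambda_xi = 1, lambda_{xi^-1} = 1/N."""
    N = len(X)
    if N == 1:
        return list(X)
    E = fft2(X[0::2], sign)        # transform of the even-indexed entries
    O = fft2(X[1::2], sign)        # transform of the odd-indexed entries
    xi = cmath.exp(sign * 2j * cmath.pi / N)
    out = [0] * N
    for j in range(N // 2):
        t = xi ** j * O[j]         # twiddle factor applied to the odd half
        out[j] = E[j] + t
        out[j + N // 2] = E[j] - t
    return out
```

The recursion performs O(N log N) arithmetic operations, as announced at the start of this section.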

Multiplication of polynomials
The Fourier transform is often used as an ingredient of fast multiplication
algorithms. These fast multiplication algorithms can be applied to integers
and to polynomials. The case of polynomials is especially interesting since
it sheds a different light on Fourier transform algorithms. To explain the
relation between the discrete Fourier transform of order N and polynomial
multiplication, we first associate to any vector X of N elements a polynomial
P_X defined as:

\[ P_X(x) = \sum_{i=0}^{N-1} X_i\, x^i. \tag{9.46} \]

With this notation, we easily remark that the coefficients of the discrete
Fourier transform of X correspond to the evaluations of the polynomial P_X
at roots of unity. More precisely:

\[ \hat{X}_i = P_X(\xi^i), \tag{9.47} \]

where as before ξ is a primitive root of unity of order N.
Since the discrete Fourier transform is a specific case of polynomial evalu-
ation, its inverse is a special case of Lagrange interpolation. With this dic-
tionary in mind, we can now study the multiplication of polynomials. Given
two polynomials P_X and P_Y of degree less than N and their corresponding
vectors X and Y, we know that X̂_i = P_X(ξ^i) and Ŷ_i = P_Y(ξ^i). Multiplying
these two identities, we find:

\[ \hat{X}_i \hat{Y}_i = P_X(\xi^i) \cdot P_Y(\xi^i) = (P_X \cdot P_Y)(\xi^i). \tag{9.48} \]

We now denote by Ẑ the pointwise product of X̂ and Ŷ, i.e., we let Ẑ_i = X̂_i Ŷ_i
for all 0 ≤ i < N. The inverse transform of Ẑ is a vector Z with associated
polynomial P_Z, such that:

\[ P_Z(\xi^i) = (P_X \cdot P_Y)(\xi^i), \ \text{for all } 0 \le i < N. \tag{9.49} \]

This implies that P_Z is equal to P_X · P_Y modulo the polynomial x^N − 1. If,
in addition, we take care to choose N larger than the sum of the degrees of
P_X and P_Y, then P_Z is exactly the product P_X · P_Y.

Convolution product

In fact, even without the above restriction on the sum of the degrees of P_X
and P_Y, the vector Z is still worth considering. Writing down explicitly the
rule for polynomial multiplication modulo x^N − 1, we find that:

\[ Z_i = \sum_{j=0}^{N-1} X_j\, Y_{(i-j) \bmod N} = \sum_{j=0}^{i} X_j\, Y_{i-j} + \sum_{j=i+1}^{N-1} X_j\, Y_{N+i-j}, \tag{9.50} \]


under the convention that the second sum is empty when i = N − 1. This
vector Z is called the convolution product of X and Y and denoted by X ∗ Y.
Clearly, from an algorithmic point-of-view, convolution products can be
computed very efficiently for N = 2^n. It suffices to compute the Fourier
transforms of X and Y, to perform a pointwise multiplication of X̂ and Ŷ, and
finally to apply the inverse discrete Fourier transform to the pointwise product.
However, the restriction N = 2^n is not necessary: convolution products can
be computed efficiently for all values of N. The trick is to make the following
remark. Given two vectors X and Y of order N (not a power of two), we pad
them into vectors X̄ and Ȳ of size N̄, where N̄ is the smallest power of two
at least equal to 2N − 1. The padded vectors X̄ and Ȳ are defined in two
different ways. The first vector X̄ is a copy of X with as many zeros as needed
in positions from N to N̄ − 1. The second vector Ȳ is obtained by assembling
a complete copy of Y, then zeros, then a final segment of Y, i.e., in positions
from N̄ − N + 1 to N̄ − 1 we find Y_1, Y_2, ..., up to Y_{N−1}. Since N̄ is a power
of two, thanks to the algorithm of Cooley and Tukey, the convolution product
of X̄ and Ȳ is computed efficiently. Moreover, the first N elements of this
product form the convolution product of X and Y, see Exercise 3.
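The padding trick can be tested in Python. In the sketch below (our own code, with a naive DFT standing in for a true power-of-two FFT), the cyclically wrapped part of Y is placed at the end of the padded vector, and the padded length is taken at least 2N − 1, so that the first N entries of the padded product give the length-N cyclic convolution:

```python
import cmath

def fourier(X, sign=1):
    # helper: quadratic DFT; a real implementation would use a 2**n FFT here
    N = len(X)
    xi = cmath.exp(sign * 2j * cmath.pi / N)
    return [sum(X[j] * xi ** (i * j) for j in range(N)) for i in range(N)]

def cyclic_convolution(X, Y):
    """Length-N cyclic convolution computed at a power-of-two size M >= 2N-1:
    X is padded with zeros; Y is padded with zeros followed by a copy of
    Y[1:], so that wrapped indices land on the right entries and the first
    N coordinates of the size-M product equal the size-N convolution."""
    N = len(X)
    M = 1
    while M < 2 * N - 1:
        M *= 2
    Xp = X + [0] * (M - N)
    Yp = Y + [0] * (M - 2 * N + 1) + Y[1:]
    A, B = fourier(Xp), fourier(Yp)
    Z = fourier([a * b for a, b in zip(A, B)], -1)   # pointwise, then inverse
    return [round((z / M).real) for z in Z[:N]]
```

The rounding at the end assumes integer inputs; with exact arithmetic it would be unnecessary.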

9.5.2 Rader's algorithm

Since the approach of Cooley and Tukey yields a fast Fourier transform
algorithm only for highly composite values of the order N, a different approach
is also needed, in order to cover all possible values of N. Without loss of
generality, we may restrict ourselves to prime values of N, relying on the
Cooley-Tukey approach to reduce the general case to the prime case. Thus,
we now assume that N = p is prime and we let g be a generator of the
multiplicative group F_p^*. The computation of X̂ can then be decomposed into
two cases: X̂_0 and X̂_{g^{−i}} with 0 ≤ i < p − 1. Indeed, any non-zero number
modulo p can be written as a power of g and we can always choose a negative
exponent if we desire to do so. For X̂_0, we have:

\[ \hat{X}_0 = \sum_{j=0}^{p-1} X_j. \tag{9.51} \]

For X̂_{g^{−i}}, we can write:

\[ \hat{X}_{g^{-i}} = \sum_{j=0}^{p-1} X_j\, \xi^{j g^{-i}} = X_0 + \sum_{k=0}^{p-2} X_{g^k}\, \xi^{g^{k-i}}, \quad \text{using } j = g^k. \tag{9.52} \]

Computing the above sum simultaneously for all values of i is equivalent to
a convolution product of the two sequences X_{g^k} and ξ^{g^k}, as defined in the
previous section. Thus, thanks to the results of that section, it can be computed
using discrete Fourier transforms of order 2^t, where 2^t is the smallest power
of two at least equal to 2(p − 1) − 1. Once this power of 2 has been chosen,
the sequences are padded and the convolution can be computed using the
Cooley-Tukey algorithm for the discrete Fourier transform.
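The decomposition behind Rader's algorithm can be checked numerically. The following Python sketch (our own code) computes a prime-order DFT through the permuted sequences X_{g^k} and ξ^{g^k}, evaluating the length-(p − 1) cyclic correlation directly rather than by FFT; the point here is the index bookkeeping, not speed:

```python
import cmath

def rader_dft(X, p, g):
    """DFT of prime order p organized as in Rader's algorithm: X_hat[0] is a
    plain sum, and X_hat[g^-i] = X[0] + sum_k X[g^k] * xi^(g^(k-i)), a cyclic
    correlation of length p-1 of the sequences a and b below."""
    xi = cmath.exp(2j * cmath.pi / p)
    powers = [pow(g, k, p) for k in range(p - 1)]   # g^0, ..., g^(p-2) mod p
    out = [0] * p
    out[0] = sum(X)
    a = [X[powers[k]] for k in range(p - 1)]        # X permuted by powers of g
    b = [xi ** powers[k] for k in range(p - 1)]     # xi^(g^k), periodic in k
    for i in range(p - 1):
        m = powers[(p - 1 - i) % (p - 1)]           # m = g^(-i) mod p
        out[m] = X[0] + sum(a[k] * b[(k - i) % (p - 1)] for k in range(p - 1))
    return out
```

Replacing the inner sum by the padded-convolution routine of the previous section turns this into an O(p log p) transform.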

An alternative approach is to avoid the padding to a power of two and to
remark that p − 1 is always composite. As a consequence, it is possible to
perform the complete Fourier transform by alternately using the approaches
of Rader and Cooley-Tukey. However, this makes the complexity analysis
much more difficult. Moreover, since the approach of Rader involves the
computation of convolutions, it requires at least three applications of the
Fourier transform of size p − 1. Thus, using Rader's approach more than once
is unlikely to be efficient. Yet, there is one special case where it is a good
idea. This case occurs when sequences of length p − 1 fit in memory, while
padded sequences of length 2^t do not. In that case, using Rader's approach
twice guarantees that all computations are made in main memory, and it is
faster than swapping the computation to disk.

Application to Walsh transform over F_p

We saw in Section 9.4 that the Walsh transform of functions from F_p^n
to F_p can be computed in O(n · p^{n+2}) arithmetic operations. When p is fixed,
this can be rewritten as O(n · p^n); however, when p is allowed to grow as p^n
increases, it is not clear whether O(n · p^{n+2}) is optimal. To answer this question,
we now extract the core routine of the Walsh transform over F_p. This core
routine transforms a p × p array T into another array T̂. We give its description
in Algorithm 9.11.

Algorithm 9.11 Core transform of extended Walsh over F_p
  Initialize p × p array T̂ to 0
  for k from 0 to p − 1 do
    for m from 1 to p − 1 do
      for j from 0 to p − 1 do
        Let T̂[m][k] ←− T̂[m][k] + T[m − kj mod p][j]
      end for
    end for
  end for
Output T̂


Equivalently, we can write the transform as in the following equation:

\[ \hat{T}_{m,k} = \sum_{j=0}^{p-1} T_{m-kj,\,j}, \tag{9.53} \]

where the index m − kj is taken modulo p. Using Algorithm 9.11, this trans-
form is computed in O(p^3) arithmetic operations. In order to improve the
extended Walsh transform as a whole, we are going to exhibit an asymptoti-
cally faster algorithm for this core transform.
For any p-th root of unity ξ, let us define:

\[ C_j^{(\xi)} = \sum_{m=0}^{p-1} \xi^m\, T_{m,j} \quad\text{and}\quad \hat{C}_k^{(\xi)} = \sum_{j=0}^{p-1} \xi^{kj}\, C_j^{(\xi)}. \tag{9.54} \]

We can remark that:

\[ \hat{C}_k^{(\xi)} = \sum_{j=0}^{p-1} \sum_{m=0}^{p-1} \xi^{m+kj}\, T_{m,j} = \sum_{j=0}^{p-1} \sum_{n=0}^{p-1} \xi^n\, T_{n-kj,\,j} = \sum_{n=0}^{p-1} \xi^n\, \hat{T}_{n,k}. \tag{9.55} \]

If we now consider a primitive p-th root of unity ξ, we can recover the
array T̂ from the collection of vectors Ĉ^{(ξ^0)}, Ĉ^{(ξ^1)}, ..., Ĉ^{(ξ^{p−1})}. Indeed, we can
see that:

\[ p\,\hat{T}_{n,k} = \sum_{i=0}^{p-1} \xi^{-in}\, \hat{C}_k^{(\xi^i)}. \tag{9.56} \]

Indeed, in this sum, the coefficient in front of T̂_{n,k} is 1 in each summand,
while the coefficient in front of T̂_{m,k} is ξ^{(m−n)i}. Thus, for m ≠ n the sum of
the coefficients is 0.
As a consequence, we can compute the p × p array T̂ in three steps:

1. Compute the p^2 values C_j^{(ξ^i)}, where ξ is a primitive p-th root of unity,
from the values T_{m,j} using:

\[ C_j^{(\xi^i)} = \sum_{m=0}^{p-1} \xi^{im}\, T_{m,j}. \]

2. Compute the p^2 values Ĉ_k^{(ξ^i)} from the values C_j^{(ξ^i)} using:

\[ \hat{C}_k^{(\xi^i)} = \sum_{j=0}^{p-1} \xi^{ikj}\, C_j^{(\xi^i)}. \]

3. Compute the p^2 values T̂_{n,k} from the values Ĉ_k^{(ξ^i)} using:

\[ \hat{T}_{n,k} = \frac{1}{p} \sum_{i=0}^{p-1} \xi^{-in}\, \hat{C}_k^{(\xi^i)}. \]

In each of these three steps, we recognize p copies of a Fourier transform on p
elements. In the first step, the transform is performed on index m, in the sec-
ond step on index j, and in the third step on index i. Of course, using Rader's
algorithm, all these Fourier transforms can be computed using O(p log(p)) arith-
metic operations. As a consequence, replacing the core Algorithm 9.11, with
cost O(p^3), by these fast Fourier transforms, with total cost O(p^2 log(p)), yields
a fast extended Walsh algorithm with complexity O(n log(p) p^{n+1}) arithmetic
operations, where the O constant is independent of both p and n.
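The three steps can be verified against Algorithm 9.11 on a small example. The sketch below (our own code, over the complex numbers) implements both the direct core transform and its three-step Fourier version; here each step is written as a plain sum, but each is exactly p independent Fourier transforms on p points:

```python
import cmath

def core_direct(T, p):
    # Equation (9.53): That[m][k] = sum_j T[(m - k*j) mod p][j]
    return [[sum(T[(m - k * j) % p][j] for j in range(p)) for k in range(p)]
            for m in range(p)]

def core_fourier(T, p):
    """The same p x p core transform computed via Equations (9.54)-(9.56)."""
    xi = cmath.exp(2j * cmath.pi / p)
    # Step 1: C[i][j] = sum_m xi^(i*m) T[m][j]      (transform on index m)
    C = [[sum(xi ** (i * m) * T[m][j] for m in range(p)) for j in range(p)]
         for i in range(p)]
    # Step 2: Ch[i][k] = sum_j xi^(i*k*j) C[i][j]   (transform on index j)
    Ch = [[sum(xi ** (i * k * j) * C[i][j] for j in range(p)) for k in range(p)]
          for i in range(p)]
    # Step 3: That[n][k] = (1/p) sum_i xi^(-i*n) Ch[i][k]   (transform on i)
    return [[sum(xi ** (-i * n) * Ch[i][k] for i in range(p)) / p
             for k in range(p)]
            for n in range(p)]
```

Both routines agree up to floating-point error, confirming the inversion formula (9.56).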
Since we are considering the asymptotic complexity, it is also important to
express it in terms of bit operations, rather than arithmetic operations. For
this, we need to specify where the roots of unity ξ are taken and what the
size of the involved integers is. It is clear that the integers are smaller than p^n
in absolute value and can, thus, be represented on n log(p) bits. For the
roots of unity, two options are available: we can either take complex roots
with sufficient precision to recover the array T̂, or p-th roots of unity modulo a
large enough prime Q such that p divides Q − 1. In both cases, assuming that
we are using a fast asymptotic multiplication algorithm, the bit complex-
ity is O(log(n) log log(p) (n log(p))^2 p^{n+1}). Thus, the fast Fourier transform
can be used to asymptotically improve the extension of the Walsh transform
algorithm to F_p. However, in practice, p^n cannot be too large, since the rep-
resentation of S needs to fit in memory. As a consequence, p and n cannot both
be large at the same time and this asymptotic approach is very unlikely to be
useful.
9.5.3 Arbitrary finite abelian groups

All the previous sections in this chapter deal with Fourier (or Walsh) trans-
forms in abelian groups of various forms, Z/NZ or F_p^n. We now generalize this
and show that Fourier transforms can be defined and computed efficiently for
arbitrary finite abelian groups. Let (G, ×) be a finite abelian group. A char-
acter on G is a multiplicative map χ from G to the field³ C of complex
numbers. In this context, multiplicative means that for any pair of group
elements (g, h), we have χ(gh) = χ(g)χ(h). In particular, this implies the
following properties.

1. If e denotes the neutral element of G, we have χ(e) = 1.

³Here C could be replaced by any field containing enough roots of unity. For simplicity, we
only describe the case of the complex numbers.

2. For any g in G and any α in Z: χ(g^α) = χ(g)^α.

3. For any g in G, we have g^{|G|} = e and χ(g)^{|G|} = 1. Thus, χ takes its
values in the group M_G of |G|-th roots of unity in C. As usual, |G|
denotes the order (or cardinality) of G. Note that depending on G, the
range of characters may be a proper subgroup of M_G.

4. The set of all characters on G forms a group, denoted Ĝ, where multipli-
cation of characters is defined by:

\[ \forall g \in G:\ (\chi_1 \chi_2)(g) = \chi_1(g)\,\chi_2(g). \tag{9.57} \]

5. The neutral element of Ĝ is the so-called trivial character, which sends
any group element to 1 in C.

6. The group Ĝ is abelian and finite. Commutativity is clear from the
definition. Finiteness is a direct consequence of the fact that both G
and the set M_G of |G|-th roots of unity are finite; thus there is only a
finite number of possible mappings from G to the |G|-th roots of unity.

7. If the group G can be written as a direct product G_1 × G_2, then
Ĝ = Ĝ_1 × Ĝ_2.

PROOF In one direction, if χ_1 is a character on G_1 and χ_2 a charac-
ter on G_2, we can define a character χ_1 × χ_2 on G by letting (χ_1 × χ_2)(g) =
χ_1(g_1)χ_2(g_2) for g = (g_1, g_2) ∈ G_1 × G_2. Thus, Ĝ_1 × Ĝ_2 is naturally
embedded into Ĝ.
In the other direction, let χ ∈ Ĝ and define χ_1 and χ_2, respectively, in
Ĝ_1 and Ĝ_2 by the following formulas:

\[ \forall g_1 \in G_1:\ \chi_1(g_1) = \chi((g_1, 1)) \quad\text{and} \tag{9.58} \]
\[ \forall g_2 \in G_2:\ \chi_2(g_2) = \chi((1, g_2)). \tag{9.59} \]

Clearly, χ_1 and χ_2 are multiplicative. Furthermore, χ = χ_1 × χ_2. This
concludes the proof.

8. There is an isomorphism between G and Ĝ.

PROOF Thanks to the previous property and decomposing G into
a product of cyclic groups, it suffices to prove this fact when G is cyclic.
We assume that g_0 denotes a generator of G.
In that case, a character χ is entirely determined by its value at g_0.
Indeed, any element g in G can be written as g_0^α and, by multiplicativity,
we find χ(g) = χ(g_0)^α. Since χ(g_0) belongs to the set M_{|G|} of |G|-th
roots of unity, there are |G| possible choices for χ(g_0). Thus, G and Ĝ
have the same cardinality. It remains to prove that Ĝ is also cyclic to
conclude that they are isomorphic. This is done by choosing a primitive
|G|-th root of unity µ in C and by proving that the character χ_µ defined
by χ_µ(g_0) = µ is a generator for Ĝ. Indeed, for any χ it is possible to
express χ(g_0) as µ^β; this implies that χ = χ_µ^β.

9. Furthermore, the dual of Ĝ is naturally isomorphic to G.

PROOF The existence of an isomorphism is clear from the previous
item. Moreover, given an element g of G, we have a natural action of g
on Ĝ given by:

\[ g(\chi) = \chi(g). \]

To check that this is indeed the group isomorphism we seek, we need
to show that two different elements of G induce different actions on Ĝ.
Equivalently, it suffices to prove that for g ≠ e, there exists χ ∈ Ĝ such
that g(χ) ≠ 1. As before, it suffices to give a proof for cyclic groups. Using
the previous notations, g = g_0^α, and we see that g(χ_µ) = µ^α ≠ 1.

10. For g ≠ e in G, we have:

\[ \sum_{\chi \in \hat{G}} \chi(g) = 0. \tag{9.60} \]

In addition:

\[ \sum_{\chi \in \hat{G}} \chi(e) = |\hat{G}| = |G|. \tag{9.61} \]

PROOF Equation (9.61) is clear, since for all χ, we have χ(e) = 1.
To prove Equation (9.60), we write G as a product of cyclic groups
G = G_1 × G_2 × ··· × G_k and the sum can be rewritten as:

\[ \sum_{\chi_1 \in \hat{G}_1} \cdots \sum_{\chi_k \in \hat{G}_k} \chi_1(g_1) \cdots \chi_k(g_k) = 0. \]

As a consequence, it suffices to prove Equation (9.60) for a single cyclic
component with g_i ≠ e_i. In this case, the sum becomes a sum of powers
of some root of unity and is equal to 0.


Given all characters on a group G, we can now define the Fourier transform
of an arbitrary function f from G to C as:

\[ \hat{f}(\chi) = \sum_{g \in G} \chi(g)\, f(g). \tag{9.62} \]

The Fourier transform is invertible and the inverse is obtained by:

\[ f(g) = \frac{1}{|G|} \sum_{\chi \in \hat{G}} \chi^{-1}(g)\, \hat{f}(\chi). \tag{9.63} \]

This is a simple consequence of Equations (9.61) and (9.60).
In the general case, the Fourier transform can be used as a tool to compute
the so-called convolution product f_1 ∗ f_2 of two functions f_1 and f_2 from G to
C, defined as:

\[ (f_1 * f_2)(g_0) = \sum_{g \in G} f_1(g)\, f_2(g_0\, g^{-1}). \tag{9.64} \]

Indeed, in general, we have \(\widehat{f_1 * f_2} = \hat{f}_1 \cdot \hat{f}_2\).

Link with the previous cases
To illustrate this general setting, let us compare it to the special cases of the
Walsh and Fourier transforms we already addressed. In each case, it suffices
to specify how the objects we are considering are encoded into functions from
a group G to C. In the case of the Fourier transform, we are considering
vectors of N elements and work with the group Z/NZ. The link with the
general case is obtained by mapping the vector X to a function f_X over
Z/NZ defined by f_X(i) = X_i. Since the group Z/NZ has cardinality N, the
N-th root of unity ξ = e^{2π√−1/N} naturally arises.
For the Walsh transform, we are already considering functions from F_p^n to
F_p, and it suffices to represent the range F_p using the p-th roots of unity in C.
Note that there are several possible representations: one is trivial and maps
every element of F_p to 1. There are also p − 1 non-trivial representations,
obtained by sending 1 (an additive generator for F_p) to one of the p − 1
primitive p-th roots of unity, say ξ, and by sending an element x of F_p to ξ^x. In
truth, the algorithms of Section 9.4 can be viewed as p (or p − 1) parallel
applications of the general case, one for each possible representation.
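These definitions are easy to experiment with. The Python sketch below (names and the additive convention are our own choices) builds all characters of a product of cyclic groups Z/m_1 × ··· × Z/m_k, written additively, and checks the convolution identity numerically:

```python
import cmath
from itertools import product

def characters(ms):
    """All characters of G = Z/m1 x ... x Z/mk, one per element a of G:
    chi_a(g) = prod_t exp(2*pi*i*a_t*g_t/m_t)."""
    G = list(product(*[range(m) for m in ms]))
    def chi(a):
        return lambda g: cmath.exp(2j * cmath.pi *
                                   sum(a[t] * g[t] / ms[t]
                                       for t in range(len(ms))))
    return G, [chi(a) for a in G]

def fourier(f, G, chars):
    # Equation (9.62): fhat(chi) = sum_g chi(g) f(g)
    return [sum(ch(g) * f[g] for g in G) for ch in chars]

def convolution(f1, f2, G, ms):
    # Equation (9.64), written additively: (f1*f2)(g0) = sum_g f1(g) f2(g0 - g)
    sub = lambda g0, g: tuple((g0[t] - g[t]) % ms[t] for t in range(len(ms)))
    return {g0: sum(f1[g] * f2[sub(g0, g)] for g in G) for g0 in G}
```

For G = Z/2 × Z/3 the transform of a convolution equals the pointwise product of the transforms, up to floating-point error.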


Exercises

1. Looking back at truncated differentials and at Equation (9.16), we would
like to compute truncated differentials for an S-box S on a single bit
from a table of ordinary differentials. Given a fixed input mask with k
unknown bits, show that it suffices to sum 2^k values. Write a program
which computes the correct sum.

2h. Directly following Equation (9.36), write a program to compute the Fourier
transform. What is the complexity of this program?

3. In the section on convolution products, we have seen that to compute the
convolution product of two vectors X and Y on N elements, we can pad X
with zeros up to a large enough power of two, pad Y with zeros and a copy
of a segment of Y, and compute the convolution product of the padded
copies. Check that both convolution products are indeed equal by using
the program resulting from Exercise 2. Prove the equality.

4h. How many multiplications (of reals) are needed to multiply two complex
numbers?

5. Implement the Moebius transform on F_3.

6h. Study the Walsh transform given as Program 9.3 and its timings. Pro-
pose a method to avoid the performance penalty due to cache effects.

Chapter 10
Lattice reduction

10.1 Definitions

Lattices are rich mathematical objects; thus, before studying the algorithmic
issues involving lattices, it is essential to recall some important definitions and
properties concerning lattices.

A lattice is a discrete additive subgroup of R^n.

While very short, this definition needs to be studied carefully to better
understand the notion. First recall that an additive subgroup of R^n is a set
S of elements such that:

• If x ∈ S then −x ∈ S.

• If x ∈ S and y ∈ S then x + y ∈ S.

As a consequence, the zero vector belongs to S and any linear combination of
vectors of S with integer coefficients is a vector of S.
However, a lattice L is more than an additive subgroup: it is a discrete
additive subgroup. This additional topological property implies that it is not
possible to construct a sequence of non-zero vectors in L with a zero limit.
Alternatively, it means that there is a small n-dimensional ball centered at
zero that contains no non-zero vector of L. As a consequence, if we are given
a norm on R^n, usually the Euclidean norm, there exists a non-zero vector
which minimizes this norm. The norm, or length, of this vector is called the
first minimum of the lattice and denoted by λ_1(L). Let v_1 be a vector of L
with norm λ_1(L); then any vector w in L such that the pair (v_1, w) is linearly
dependent can be written as w = αv_1 for some integer α. Indeed, assuming
that α is not an integer implies that w − ⌊α⌉v_1 is a non-zero vector of L with
norm smaller than λ_1(L), which leads to a contradiction.
This basic consideration with a single short vector illustrates two very im-
portant properties of lattices: the existence of bases and the sequence of
successive minima.
A basis for a lattice L is a set of linearly independent vectors b_1, ..., b_r
belonging to L such that any vector in L can be written as a linear combina-
tion, with integer coefficients, of the vectors b_i. Any lattice L admits a basis;
moreover, the cardinality r is the same for all bases of a given lattice and is
called the rank of L. It is equal to the dimension of the vector subspace of
R^n spanned by L.
In cryptography, we often encounter, as a special case, lattices that only
contain vectors with integer coordinates. Such lattices are called integer
lattices.
Let L be a lattice of rank r in Rn and B and B be two bases of L. Each
basis can be represented as a matrix of dimension r — n, where each row of
the matrix B (resp. B ) contains the coordinates of one vector in the basis
B. Since all vectors of B belong to L, they can be expressed as a linear
combination of the vectors of B with integer coe¬cients. As a consequence,
there exists a square integer matrix U of dimension r such that:

B = U B. (10.1)

Of course, there also is a matrix U′ with integer coefficients such that B =
U′ B′. Putting the two relations together, we find that U and U′ are inverses
of each other. Since both U and U′ are integer matrices, this implies that the
determinant of U is equal to 1 or −1. Such a matrix is called unimodular.
A direct consequence of the above relationship between two bases of L is
that for any basis B, the determinant det(B · B^t) is independent of the choice
of B. Indeed:

det(B′ · B′^t) = det(U B · B^t · U^t) = det(U) det(B · B^t) det(U^t) = det(B · B^t).    (10.2)

Since in addition this invariant is a positive number, it is traditional to define
the determinant of the lattice L as:

det(L) = √det(B · B^t).    (10.3)

The determinant of L is also called the volume or sometimes co-volume of L.
In fact, the matrix B · B^t itself is also important and is known as the Gram
matrix of the basis B.
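This invariance is easy to check numerically. The following Python sketch (our own illustrative code, not from the book) computes the Gram matrix B · B^t of a rank-2 basis in Z^3 and verifies that det(B · B^t) is unchanged when the basis is multiplied by a unimodular matrix U; the example values are ours.

```python
# Illustrative check that det(B * B^t) does not depend on the basis.
# Rows of B are the basis vectors; U is unimodular (det(U) = 1).

def gram(B):
    # Gram matrix G = B * B^t, entry (i, j) = (b_i | b_j).
    return [[sum(x * y for x, y in zip(u, v)) for v in B] for u in B]

def det2(G):
    # Determinant of a 2x2 matrix.
    return G[0][0] * G[1][1] - G[0][1] * G[1][0]

B = [[1, 0, 2], [0, 1, 3]]   # basis of a rank-2 lattice inside Z^3
U = [[2, 1], [1, 1]]         # unimodular change of basis
Bp = [[sum(U[i][k] * B[k][j] for k in range(2)) for j in range(3)]
      for i in range(2)]     # B' = U * B

print(det2(gram(B)), det2(gram(Bp)))  # both equal det(L)^2 = 14
```

Here det(L) = √14 for both bases, in accordance with Equation (10.3).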
For every lattice L, we can generalize the notion of first minimum λ1(L) and
define the sequence of successive minima. The k-th minimum of the lattice L
is defined as the smallest positive real number λk(L) such that there exists at
least one set of k linearly independent vectors of L, with each vector of norm
at most λk(L). Clearly, the sequence of successive minima satisfies:

λ1(L) ≤ λ2(L) ≤ λ3(L) ≤ · · ·    (10.4)

It is important to remark that in this definition, we do not only count
the number of vectors, but also ask that they are linearly independent. To
understand why, note that in any lattice, there are at least two vectors with
minimal length, since when x is one, −x is another. Of course, for some
lattices, it is possible to have more than two vectors with minimal length.


From a mathematical point-of-view, there exists an interesting relationship
between the values of the successive minima and the determinant of the lattice.
In particular, a theorem of Minkowski states that:

For every integer r > 1, there exists a constant γr, such that for any lattice
L of rank r and for all 1 ≤ k ≤ r:

(λ1(L) · λ2(L) · · · λk(L))^(1/k) ≤ √γr · det(L)^(1/r).    (10.5)

For more information about lattices, the reader may refer to [MG02].

10.2 Introductory example: Gauss reduction

Before studying general lattices and their reduction, it is useful to start with
simple examples in small dimension. In this section, we consider integer
lattices in dimension 2, i.e., full rank additive subgroups of Z^2, their
descriptions and the related algorithms. Note that, in the special case of
dimension 2, lattice reduction can also be expressed in the alternate language
of binary quadratic forms; for a detailed exposition, see [BV07].
To describe such a lattice, we are given a basis (u, v) formed of two linearly
independent vectors. For example, we might consider the lattice given in
Figure 10.1. We see on this figure that an arbitrary basis does not necessarily
give a pleasant description of the lattice. In fact, for this example, a much
better description is the reduced basis presented in Figure 10.2. In particular,
we see that the reduced basis contains much shorter vectors than the initial
basis; we also see that its vectors are nearly orthogonal. Computing a reduced
basis in dimension 2 can be done by using the Gauss reduction algorithm. This
algorithm is a greedy algorithm which iteratively reduces the length of the
longest vector in the basis. It continues as long as the newly reduced vector
is shorter than the other one, then it stops. We describe it as Algorithm 10.1.
A sample run is presented in Figure 10.3. The reader may also notice the
similarity of this algorithm with Euclid's GCD Algorithm 2.1. Throughout
this chapter, the length of vectors is measured using the Euclidean norm ‖·‖.

In order to better understand Gauss's algorithm, we are going to check a
few simple facts and prove its correctness:

1. The algorithm always outputs a basis of the correct lattice. This can
be shown by remarking that the lattice generated by (u, v) is a loop


Algorithm 10.1 Gauss's reduction algorithm
Require: Initial lattice basis (u, v)
  if ‖u‖ < ‖v‖ then
    Exchange u and v
  end if
  repeat
    Find the integer λ that minimizes ‖u − λv‖ by:
      λ ←− ⌊(u|v)/‖v‖²⌉ (rounded to the nearest integer)
    Let u ←− u − λv
    Swap u and v
  until ‖u‖ ≤ ‖v‖
  Output (u, v) as reduced basis
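Algorithm 10.1 can be transcribed into Python as follows; this is a sketch of ours for integer vectors in dimension 2, and the name gauss_reduce is illustrative, not from the book.

```python
# A direct Python transcription of Gauss reduction in dimension 2.

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def gauss_reduce(u, v):
    # Ensure u is the longer vector before entering the loop.
    if dot(u, u) < dot(v, v):
        u, v = v, u
    while True:
        # Integer lambda minimizing ||u - lambda*v||: round (u|v)/||v||^2.
        # (Python rounds ties to even; either nearest integer is fine.)
        lam = round(dot(u, v) / dot(v, v))
        u = (u[0] - lam * v[0], u[1] - lam * v[1])
        u, v = v, u                      # swap
        if dot(u, u) <= dot(v, v):       # until ||u|| <= ||v||
            return u, v

print(gauss_reduce((1, 0), (4, 1)))  # ((1, 0), (0, 1)): this lattice is Z^2
```

On the example, the basis ((1, 0), (4, 1)) has determinant ±1, so the lattice is all of Z^2 and the reduced basis consists of two unit vectors.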

Figure 10.1: A 2-dimensional lattice with a basis

invariant. In other words, the lattice entering any step of the main
loop is the same as the lattice exiting this loop. As a consequence, the
algorithm as a whole preserves the input lattice. We simply need to
check that the bases (u, v) and (U′, V′) = (u − λv, v) generate identical
lattices. Take an element x = aU′ + bV′ of the second lattice, i.e., with
integer values a and b. Clearly x = au + (b − λa)v and x belongs to the
first lattice. Conversely, take x = au + bv in the first lattice and check


Figure 10.2: A reduced basis of the same lattice

that x = aU′ + (b + λa)V′ belongs to the second lattice.

2. The algorithm always stops. The norm of the longest vector in the basis
decreases from one loop to the next. Since the squared norm is a positive
integer which also decreases, the algorithm must stop.

3. The computation of λ in the algorithm indeed minimizes the norm of
u − λv. Let x = u − λv; we have to minimize:

‖x‖² = ‖u‖² − 2λ(u|v) + λ²‖v‖².

This norm is a quadratic function in λ. For real values of λ it is minimized
at λ = (u|v)/‖v‖². For integer values of λ the minimum is obtained by
rounding this quantity to the nearest integer. The minimum is unique,
except when the value to be rounded is a half integer, in which case
there are two equally good solutions.

4. The first vector of the reduced basis is a shortest non-zero lattice vector.
First, if the algorithm outputs (u, v) we have ‖u‖ ≤ ‖v‖, thanks to the
loop exit condition. Then take x = au + bv any non-zero element of the
lattice, i.e., with a and b integers and (a, b) ≠ (0, 0). Since the algorithm


(a) First reduction step

(b) Second reduction step

(c) Final reduction step

Figure 10.3: Applying Gauss™s algorithm

ensures that |(u|v)| ≤ ‖u‖²/2, we can write:

‖x‖² = a²‖u‖² + 2ab(u|v) + b²‖v‖²
     ≥ (a² − |ab| + b²) ‖u‖².

Clearly, T = a² − |ab| + b² is an integer. Moreover, thanks to the
symmetry in a and b, we can assume without loss of generality that
b ≠ 0. In that case, we can write:

T = |b|² · (|a/b|² − |a/b| + 1).

From this expression, we conclude that T > 0. Indeed |b|² > 0 and
the quadratic expression x² − x + 1 has negative discriminant and thus
x² − x + 1 > 0 for all real values of x, including x = |a/b|. Since T is an
integer, we conclude that T ≥ 1 and finally:

‖x‖² ≥ ‖u‖².
This shows that no non-zero vector in the lattice can be shorter than u.
However, a few other vectors may be as short as u. Studying our bounds
in more detail, we see that for all such vectors we have −1 ≤ a ≤ 1
and −1 ≤ b ≤ 1. Clearly −u is also a shortest vector in any lattice, and
whenever ‖v‖ > ‖u‖ it is the only other possibility. When ‖v‖ = ‖u‖, we
have two additional solutions, v and −v. If in addition |(u|v)| = ‖u‖²/2,
two more vectors reach the minimum length: either u − v and its opposite,
or u + v and its opposite, depending on the sign of (u|v). Lattices
illustrating these three possibilities are shown in Figure 10.4. Note that
this third option cannot be achieved with integer lattices. Indeed, a
typical basis for this case is generated by:

u = (1, 0) and v = (1/2, √3/2).

Since √3 is irrational, it is clear that this lattice can only be approximated
by integer lattices.
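The hexagonal example above can be verified numerically. The sketch below (our own code, not from the book) enumerates small integer combinations au + bv and counts those reaching the first minimum; it recovers the three pairs of shortest vectors.

```python
# Counting minimal vectors in the hexagonal lattice spanned by
# u = (1, 0) and v = (1/2, sqrt(3)/2).
import math

u = (1.0, 0.0)
v = (0.5, math.sqrt(3) / 2)

shortest = []
for a in range(-3, 4):
    for b in range(-3, 4):
        if (a, b) == (0, 0):
            continue
        x = (a * u[0] + b * v[0], a * u[1] + b * v[1])
        # Squared norm of a*u + b*v is a^2 + ab + b^2; the minimum is 1.
        if abs(x[0] ** 2 + x[1] ** 2 - 1.0) < 1e-9:
            shortest.append((a, b))

print(len(shortest))  # 6: the pairs {u, -u}, {v, -v}, {u - v, v - u}
```

The six solutions (a, b) are (±1, 0), (0, ±1) and ±(1, −1), i.e., exactly the three pairs of shortest vectors of Figure 10.4(c).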
We leave as an exercise to the reader to show that no vector in the lattice
linearly independent from u can be shorter than v. There are either two
vectors, v and −v, reaching this minimum, or four vectors. With four
vectors, the additional pair is formed of u − v and its opposite or of
u + v and its opposite, depending on the sign of (u|v).

10.2.1 Complexity analysis

In order to give a more precise analysis of Gauss's algorithm, it is very useful
to introduce a variant called t-Gauss reduction that takes a parameter t ≥ 1
and is described as Algorithm 10.2.


(a) Single pair of short vectors

(b) Two pairs of short vectors

(c) Three pairs of short vectors

Figure 10.4: Typical cases of short vectors in 2-dimensional lattices


Algorithm 10.2 t-Gauss reduction algorithm
Require: Initial lattice basis (u, v)
Require: Parameter t ≥ 1
  if ‖u‖ < ‖v‖ then
    Exchange u and v
  end if
  repeat
    Find the integer λ that minimizes ‖u − λv‖ by:
      λ ←− ⌊(u|v)/‖v‖²⌉ (rounded to the nearest integer)
    Let u ←− u − λv
    Swap u and v
  until ‖u‖ ≤ t‖v‖
  Output (u, v) as reduced basis
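In code, t-Gauss differs from Gauss reduction only in its exit test. The following self-contained Python sketch (ours, with illustrative names) takes t as a parameter and coincides with Gauss's algorithm when t = 1.

```python
# Sketch of t-Gauss reduction for integer vectors in dimension 2.

def t_gauss(u, v, t=1.0):
    def dot(x, y):
        return x[0] * y[0] + x[1] * y[1]
    if dot(u, u) < dot(v, v):
        u, v = v, u
    while True:
        lam = round(dot(u, v) / dot(v, v))
        u = (u[0] - lam * v[0], u[1] - lam * v[1])
        u, v = v, u
        # Relaxed exit test: until ||u|| <= t * ||v|| (squared on both sides).
        if dot(u, u) <= t * t * dot(v, v):
            return u, v

print(t_gauss((1, 0), (4, 1), t=3 ** 0.5))  # ((1, 0), (0, 1))
```

With t = 1 this is exactly the earlier Gauss reduction; with t = √3 it may stop one iteration earlier on some inputs, which is the point of the analysis below.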

The only difference between t-Gauss and Gauss's algorithms is the loop exit
condition. With t-Gauss, we no longer ask for the new vector to be shorter
than the previous vectors, we only require it to be not too long compared to
the vector v that is kept from the previous iteration. With this new condition,
the length of the shortest vector decreases by a factor t at every iteration but
the last one. This also implies that the length of the longest vector decreases
by a factor t at every iteration, except maybe during the first one. Indeed,
the longest vector in an iteration is the shortest vector in the next one. As
a consequence, the t-Gauss algorithm for t > 1 is a polynomial time algorithm,
since its number of iterations is smaller than 1 + log(max(‖u‖, ‖v‖))/log(t).
Moreover, if we denote by k(t) the number of iterations in t-Gauss, we have:

k(√3) ≤ k(1) ≤ k(√3) + 1.

This implies that Gauss's algorithm, which is identical to t-Gauss when t = 1,
also is a polynomial time algorithm. The lower bound is clear. To prove the
upper bound, we now let t = √3 and study the output basis of √3-Gauss.
Let α and β denote the two real numbers such that the output basis (u, v)
satisfies:

‖u‖ = α‖v‖ and (u|v) = β‖v‖².    (10.6)

Several cases are possible:

1. If α ≤ 1, the basis is already Gauss reduced and thus in this instance,
k(1) = k(√3).

2. If √3 ≥ α > 1 and −1/2 < β < 1/2, Gauss's algorithm would have one
additional iteration, which would simply exchange u and v. In this case,
k(1) = k(√3) + 1.


3. If √3 ≥ α > 1 and |β| > 1/2, we know that after the final iteration of
√3-Gauss, we have:

|(u|v)| ≤ ‖u‖²/2 = (α²/2) ‖v‖² ≤ (3/2) ‖v‖².    (10.7)

Thus, the next iteration of the complete Gauss algorithm would replace
u by either u + v or u − v, depending on the sign of β. Moreover, due
to Equation (10.7), we notice that |β| ≤ α²/2. This implies that:

‖u − v‖² = ‖u‖² − 2(u|v) + ‖v‖²
         = (α² + 1 − 2β) ‖v‖² ≥ ‖v‖².    (10.8)

And similarly ‖u + v‖² = (α² + 1 + 2β) ‖v‖² ≥ ‖v‖². Thus, after this
additional iteration, Gauss's algorithm also stops and k(1) = k(√3) + 1.

4. Finally, if √3 ≥ α > 1 and |β| = 1/2, depending on the rounding
convention, one of the two above cases applies. We again have k(1) =
k(√3) + 1.
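The bounds k(√3) ≤ k(1) ≤ k(√3) + 1 are easy to test empirically. The sketch below (our own experiment, not from the book) counts the iterations of t-Gauss for t = 1 and t = √3 on random integer bases and checks both inequalities in every trial.

```python
# Empirical check of k(sqrt(3)) <= k(1) <= k(sqrt(3)) + 1.
import random

def iterations(u, v, t):
    def dot(x, y):
        return x[0] * y[0] + x[1] * y[1]
    if dot(u, u) < dot(v, v):
        u, v = v, u
    k = 0
    while True:
        k += 1
        lam = round(dot(u, v) / dot(v, v))
        u = (u[0] - lam * v[0], u[1] - lam * v[1])
        u, v = v, u
        if dot(u, u) <= t * t * dot(v, v):
            return k

random.seed(1)
ok = True
for _ in range(200):
    u = (random.randint(-50, 50), random.randint(-50, 50))
    v = (random.randint(-50, 50), random.randint(-50, 50))
    if u[0] * v[1] - u[1] * v[0] == 0:
        continue                      # skip linearly dependent pairs
    k1 = iterations(u, v, 1.0)
    k3 = iterations(u, v, 3 ** 0.5)
    ok = ok and (k3 <= k1 <= k3 + 1)
print(ok)
```

The lower bound holds because the √3 exit condition is weaker than the t = 1 condition, so √3-Gauss never runs longer; the case analysis above accounts for the +1 in the upper bound.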

Much more is known about Gauss's algorithm and its complexity; useful
references are [DFV97] or [Val91].

10.3 Higher dimensions

In order to reduce lattices in higher dimensions, it is important to first
determine what a reduced lattice is. A first attempt is to generalize the
mathematical criterion that says that in dimension 2 a basis of a lattice L is
reduced when its first and second vectors realize the first and second minima
λ1(L) and λ2(L). Generalizing this, we would ask for a basis that realizes the
successive minima of a higher dimensional lattice. However, in large dimension,
this is not possible. For example, look at the following 5-dimensional lattice
(generated by its rows):
        ( 1 1 1 1 1 )
        ( 0 2 0 0 0 )
L =     ( 0 0 2 0 0 ) .
        ( 0 0 0 2 0 )
        ( 0 0 0 0 2 )

The successive minima of L are:

λ1 = 2, λ2 = 2, λ3 = 2, λ4 = 2 and λ5 = 2.


To realize these minima, we may take the following family of vectors:

        ( 2 0 0 0 0 )
        ( 0 2 0 0 0 )
FL =    ( 0 0 2 0 0 ) .
        ( 0 0 0 2 0 )
        ( 0 0 0 0 2 )

Clearly, this family is not a lattice basis, because it cannot yield the vector
(1, 1, 1, 1, 1).
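The failure of FL can also be read off determinants: assuming L is generated by (1, 1, 1, 1, 1) together with 2e2, . . . , 2e5 as above, both matrices are triangular, so each determinant is the product of the diagonal entries. A small sketch of ours (names are illustrative) shows that FL spans only a sublattice of index 2 in L.

```python
# det from the triangular basis of L is 16, while FL spans a sublattice
# of determinant 32; hence FL cannot generate all of L.

def det_triangular(M):
    # Determinant of a triangular matrix: product of the diagonal.
    d = 1
    for i in range(len(M)):
        d *= M[i][i]
    return d

L_basis = [[1, 1, 1, 1, 1],
           [0, 2, 0, 0, 0],
           [0, 0, 2, 0, 0],
           [0, 0, 0, 2, 0],
           [0, 0, 0, 0, 2]]
FL = [[2 if i == j else 0 for j in range(5)] for i in range(5)]

print(det_triangular(L_basis), det_triangular(FL))  # 16 32
```

Equivalently, every integer combination of the rows of FL has all-even coordinates, while (1, 1, 1, 1, 1) has all-odd coordinates.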
In fact, in large dimension, there is no good unique definition of a reduced
lattice. We can try to minimize the basis with respect to a variety of
properties; however, the resulting definitions are not equivalent. In 1982,
Lenstra, Lenstra and Lovász [LLL82] proposed a polynomial time algorithm
able to produce reasonably good bases, using a relaxed notion of reduced
basis. This notion, called LLL-reduction, is defined as follows.

DEFINITION 10.2 A basis B of a lattice L is said to be δ-LLL reduced
for a parameter 1/4 < δ ≤ 1 if the following conditions are satisfied:

∀ i < j :  |(b_j | b_i^*)| / ‖b_i^*‖² ≤ 1/2,    (10.9)

∀ i :  δ ‖b_i^*‖² ≤ ‖b_{i+1}^*‖² + ((b_{i+1} | b_i^*) / ‖b_i^*‖²)² · ‖b_i^*‖²,    (10.10)

where the vectors b_i^* result from the Gram-Schmidt orthogonalization of the
basis B.

A basis that satisfies the single condition of Equation (10.9) is said to be
size-reduced. Moreover, Equation (10.10) is often referred to as the Lovász
condition.
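Given the Gram-Schmidt vectors, both conditions are straightforward to test. The following Python sketch (our own floating-point code; is_lll_reduced and the other names are illustrative, not the book's) checks Definition 10.2 for a basis given as a list of row vectors.

```python
# Testing the size-reduction condition (10.9) and the Lovasz
# condition (10.10) on a basis given as rows.

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(B):
    # Returns the orthogonalized vectors B* and the coefficients
    # mu[j][i] = (b_j | b_i^*) / ||b_i^*||^2 for i < j.
    Bs, mu = [], []
    for i, b in enumerate(B):
        mu.append([dot(b, Bs[j]) / dot(Bs[j], Bs[j]) for j in range(i)])
        s = [float(c) for c in b]
        for j in range(i):
            s = [sc - mu[i][j] * bc for sc, bc in zip(s, Bs[j])]
        Bs.append(s)
    return Bs, mu

def is_lll_reduced(B, delta=0.75):
    Bs, mu = gram_schmidt(B)
    size_ok = all(abs(mu[j][i]) <= 0.5
                  for j in range(len(B)) for i in range(j))
    lovasz_ok = all(delta * dot(Bs[i], Bs[i])
                    <= dot(Bs[i + 1], Bs[i + 1])
                    + mu[i + 1][i] ** 2 * dot(Bs[i], Bs[i])
                    for i in range(len(B) - 1))
    return size_ok and lovasz_ok

print(is_lll_reduced([[1, 0], [0, 1]]), is_lll_reduced([[1, 0], [4, 1]]))
# True False
```

The second basis fails the size-reduction condition, since (b2 | b1^*)/‖b1^*‖² = 4 > 1/2.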

10.3.1 Gram-Schmidt orthogonalization

Since Gram-Schmidt orthogonalization is already present in the definition
of LLL-reduction, before presenting the Lenstra, Lenstra and Lovász lattice
reduction algorithm, it is essential to recall the details of the Gram-Schmidt
orthogonalization process. This algorithm takes as input a basis B = (b1, b2, · · · , bn)
of any vector subspace of R^m (with m ≥ n) and outputs an orthogonal basis
of the same vector subspace B^* = (b_1^*, b_2^*, · · · , b_n^*). In addition, B^*
satisfies the following conditions:

1. b_1^* = b_1,

2. b_i^* is the projection of b_i, orthogonally to the vector subspace generated
by the i − 1 first vectors of B.

