. 13
( 19)


entries in where the equation is defined. From this we get
the equation set

Since every in this equation set is defined in and we can
replace with the corresponding bit from the keystream z. Thus, is
An Improved Correlation Attack 383

a sequence of pointers to z and we can write the equations over z as the equation

We are now finished with the precomputation. Let be the number of
equations in that hold. We iterate as follows:
Input The keystream z of length M, the equation the equation set the
index sequence the states and and let

1. Calculate
2. Use Lemma 2 to generate and lower all indexes in the
equation set by one. Theorem 1 guarantees that the equations are defined
3. If the first equation in gets a negative index, then remove the equation
from Find a new index at the end of where is defined, and add
the new equation over z to
4. Calculate metric as the number of equations in that hold.
5. If set and
6. Set and go to step 1.
7. Output as the initialization state for
Remark 3. The algorithm is presented this way to make it readable and to show
the basic idea. To reach the complexity a few technical details on
the implementation of the algorithm are needed. These details are given in Ap-
pendix A.

5 Theoretical Properties
Success Formula
We can let an (unusual) encoder be defined by removing the Boolean function
from the cipher. Then we can use coding theory to evaluate the attack. Let the
initialization state for define the information bits in such a system.
Let be the (not filtered) irregular clocked stream from
that is y = Q(c, u) and Then the bitstream y defines
the codeword that is sent over a noisy channel. Let the keystream z = Q(c, v)
(the filtered version of y) be the received codeword.
Assume we have the wrong guess for then approximately of the equa-
tions in the set (13) will hold. Now assume we have have guessed the correct
According to the observation in Section 2.1 the equations in the set (13) will
independently of the initializa-
hold with probability
tion bits
384 Håvard Holland and Tor Helleseth

Let define the channel ˜noise™. The uncertainty is defined by
and the channel capacity is given by
We can approximate with Following Shannon™s noisy
coding theorem we can set up this bound for success.
Proposition 1. The attack will succeed with if the number of
parity check equations is

where and where is the number of input bits
When is close to we expect the probability for success to be close to 1,
see [15]. The simulations of our algorithm show that if we set the
success rate is approximately 99%.

5.2 Keystream Length
If the generator polynomial has weight we must find a multiple
of of weight 4 and a degree We need at least the v stream to be
of length In addition, to find entries in v where the equation is defined v
must at least have length

From the expectation (6) of N we get
which proves the following proposition:
Proposition 2. Let an equation over v be defined by of weight 4 and degree
To obtain an equation set of equations over z, the length of the z stream
must be

The keystream length M depends on the number of equations the deletion
rate and the degree of The degree is then again highly dependent
on the search algorithm we use to find When we use the search algorithm
in [11,17] the degree of will be of order which is close to
the theoretical expected degree [5] for

5.3 Runtime Complexity
The runtime complexity for our attack is

parity check tests, where is calculated using Corollary 4. Note that the runtime
is independent of the length of
An Improved Correlation Attack 385

5.4 Memory Complexity
If we implement the attack directly as described in Sections 4.3 and 4.4 the
algorithm will need around bits of computer memory. The reason
for the 32N term is that of length N is a sequence
of pointers of 32 bits. In appendix A.2 we show how we can store using
N memory bits without affecting the runtime complexity. The total amount of
memory bytes needed is then

6 Simulations of the Attack
The LILI-128 cipher[16] is based on the general model we attack in this paper.
To be able to compare our attack with previous attacks, we have tested the
attack on this cipher.

6.1 The LILI-128 Cipher
In the LILI cipher the clock control generator is defined by

and The data generator sub system is

and de-
fined by a Boolean table of size 1024. Further on we get and
for and The number of keybits in the secret key
is 39 + 89 = 128.

6.2 Simulations
We have done the simulations on some versions of the LILI-128 cipher with
LFSRs of different lengths to empirically verify the success formula in Section
5.1. See Table 2 for the simulations. Note that we use the full size from
the LILI cipher in the three attacks in the bottom of the table. For and
we get
We have implemented the attack in C code using the Intel icc compiler on a
Pentium IV processor. Using the full 32-bit capability and all the implementation
tricks explained in Appendix A our implementation uses only approximately 7
cycles per parity check test. Hence the algorithm works fast in practice and will
take processor cycles.
Each attack is run 100 times, and the table shows that the estimated success
rate holds and that the algorithm is efficient.
386 Håvard Holland and Tor Helleseth

6.3 A Complete Attack on LILI-128
Preprocessing. For the LILI cipher, we have found a multiple
which corresponds to the recursion
and we have that

This precomputation took only 5 hours and 40 Gbyte hard disk space. We see

Finding We have and
To be almost sure to succeed we use equations. Hence, the
runtime for attacking LILI-128 is

parity checks. Using our implementation this corresponds to processor
cycles. Using Proposition 2 with we need a keystream of length
The attack needs about 290 Mbyte of RAM. It can easily be parallelized
and distributed among processors with virtually no overhead, since there is no
need for communcation between the processor, and no need for shared memory.
If we have 1024 Pentium IV 2.53 GHz processors, each having access to about
290 MB of memory, the attack would take about 4.5 months using 68 Mbyte of
keystream data.

Finding when is known. Our attack only finds the initialization bits
for It is possible to combine the Quick Metricfrom [12] with the previous
attack against LILI in [7] to find when is given. Since this is not the scope
of this paper we will not go into details, and we refer to [7,12] for the exact
description. The preprosessing stage will have complexity of order memory
An Improved Correlation Attack 387

lookups, and runtime complexity of order parity checks. The complexity
for the method above is much lower than the complexity for finding and will
therefore have little effect on the overall runtime for a full attack.

7 Conclusion
We have proposed a new key recovery correlation attack on irregular clocked
keystream generators where the stream is filtered by a nonlinear Boolean func-
tion. Our attack uses a correlation property of Boolean functions, that gives
higher correlation than previous methods. Thus we need fewer equations to suc-
ceed. The property holds even if the function is correlation immune. Using this
property together with the iteration techniques from [11] we get a low runtime
and low memory complexity algorithm for attacking the model. The algorithm
outputs the initialization bits for Knowing there exist previous
algorithms which can determine efficiently.

We would like to thank Matthew Parker, John Erik Mathiassen and the anony-
mous referees for many helpful comments.

1. V. Chepyzhov, T. Johansson, and B. Smeets. A simple algorithm for fast correlation
attacks on stream ciphers. In Fast Software Encryption, FSE 2000, volume 1978
of Lecture Notes in Computer Science, pages 181“195. Springer-Verlag, 2001.
2. Philippe Chose, Antoine Joux, and Michel Mitton. Fast correlation attacks: An
algorithmic point of view. In Advances in Cryptology - EUROCRYPT 2002, volume
2332 of Lecture Notes in Computer Science, pages 209“221. Springer-Verlag, 2002.
3. Nicolas Courtois. Fast algebraic attacks on stream ciphers with linear feedback. In
Advances in Cryptology-CRYPTO™ 2003, volume 2729 of Lecture Notes in Com-
puter Science, pages 176“194, 2003.
4. Nicolas Courtois and Willi Meier. Algebraic attacks on stream ciphers with linear
feedback. In Advances in Cryptology - EUROCRYPT 2003, volume 2656 of Lecture
Notes in Computer Science, pages 345“359, 2003.
5. Computation of low-weight parity-check polynomials. Electronic Letters,
October 1996. 32(21):1981-1982.
6. T. Johansson and F. Jönsson. Theoretical analysis of a correlation attack based
on convolutional codes. In Proceedings of 2000 IEEE International Symposium on
Information Theory, IEEE Transaction on Information Theory, page 212, 2000.
7. Fredrik Jönsson and Thomas Johansson. A fast correlation attack on LILI-128. In
Inf. Process. Lett. 81(3), pages 127“132, 2002.
8. Sabine Leveiller, Gilles Z©mor, Philippe Guillot, and Joseph Boutros. A new crypt-
analytic attack for pn-generators filtered by a boolean function. In Selected Areas
in Cryptography: 9th Annual International Workshop, SAC 2002, volume 2595 of
Lecture Notes in Computer Science, pages 232“249. Springer-Verlag, 2003.

Håvard Molland and Tor Helleseth

9. M. Matsui. Linear cryptanalysis method for DES cipher. In Advances in
Cryptology-EUROCRYPT™93, volume 765 of Lecture Notes in Computer Science,
pages 386“397. Springer-Verlag, 1994.
10. W. Meier and O. Staffelbach. Fast correlation attacks on stream ciphers. In Ad-
vances in Cryptology-EUROCRYPT™88, volume 330 of Lecture Notes in Computer
Science, pages 301“314. Springer-Verlag, 1998.
11. Håvard Molland. Improved linear consistency attack on irregular clocked keystream
generators. In Fast Software Encryption, FSE 2004, To appear in LNCS. Springer-
Verlag, 2004. Available at http://www.ii.uib.no/˜molland/crypto
12. Håvard Molland, John Erik Mathiassen, and Tor Helleseth. Improved fast correla-
tion attack using low rate codes. In Cryptography and Coding, IMA 2003, volume
2898 of Lecture Notes in Computer Science, pages 67“81, 2003.
13. W.T. Penzhorn and G.J Kuhn. Computation of low-weight parity checks for corre-
lation attacks on stream ciphers. In Cryptography and Coding, IMA 1995, volume
1025 of Lecture Notes in Computer Science, pages 74“83. Springer-Verlag, 1995.
14. Markku-Juhani Olavi Saarinen. A time-memory tradeoff attack against LILI-128.
In Fast Software Encryption, FSE 2002, volume 2365 of Lecture Notes in Computer
Science, pages 231“236, 2002.
15. T. Siegenthaler. Decrypting a class of stream ciphers using ciphertext only. IEEE
Trans, on Comp., C-34:81“85, 1985.
16. L. Simpson, E. Dawson, and W. Millan. LILI keystream generator. In
SAC™2000, volume 2012 of Lecture Notes in Computer Science, pages 231“236.
Springer-Verlag, 2002. Available at http://www.isrc.qut.edu.au/lili.
17. D. Wagner. A generalized birthday problem. In Advances in cryptology-CRYPTO™
2002, volume 2442 of Lecture Notes in Computer Science, pages 288“303, 2002.
18. Xian-Mo Zhang and Yuliang Zheng. The nonhomomorphicity of boolean functions.
In Selected Areas in Cryptography, SAC 98, volume 1556 of Lecture Notes in Com-
puter Science, pages 280“295. Springer-Verlag, 1998.

A Implementation Details
To reach the runtime complexity and memory complexity down to
bits, the implementation of the algorithm has some tricks. Since not
all of these tricks are obvious we give more detailed descriptions of them below.

A.1 Runtime Details
Sliding window. In Lemma 2 we get by among other things deleting the
first bits of This is done using the sliding window technique, which means
that we move the viewing to the right instead of shifting the whole sequence to
the left. This way the shifting can be done in just a couple of operations. To
avoid heavy use of memory, we slide the window over an array of fixed length
N, so that the entries that become free at the beginning of the array are reused.
Thus, the left and right indexes of the sliding window after iterations will be

where for all

An Improved Correlation Attack 389

The same sliding window technique is also used on the equation set when
equations are deleted and added to the equation set.

Updating the indices. In Lemma 2 every pointer in is replaced with
for every which would take M operations. If we skip the replacements
we note that after iterations the entry in will become It is also
important to note that when we write the entries
are pointers from to z. They are not the actual key bits.
Thus, in the implementation we do not replace with But when we after
iterations in the search for equations find an equation
that is defined, we replace the corresponding equation with
to compensate.

Reducing the memory access time. When we test an equation we must use
pointers to pointers to the keystream. Then each equation test will have high
memory access time. We can reduce this significantly by testing the equations
on 32 states simultaneously. This is possible since the next state is tested
by shifting all the equations one entry to the left over z. We can now take the
bits for each of the term in the equations and
put them into 32 bit registers. Now we can test the states and add one to the
metrics of the states that satisfy the equation. This speeds up the runtime by a
factor of approximately 20.

A.2 Memory Details
Reducing the use of memory. Instead of storing all the pointers, we set 1
in where the bits are defined and 0 otherwise. When we search in to find
entries where the equation is defined, we keep track of where in z the four
terms in points to by counting the number of 1™s we pass during the search.
This is done for each of the 4 terms in the equation This way we always
know where in z the given equation of points to. Using this trick the number
of memory bits needed during an attack is reduced from bits to

Implementing this trick will not affect the runtime of the attack.

Rewriting Variables: The Complexity
of Fast Algebraic Attacks on Stream Ciphers

Philip Hawkes1 and Gregory G. Rose1
Qualcomm Australia, Level 3, 230 Victoria Rd, Gladesville, NSW 2111, Australia

Abstract. Recently proposed algebraic attacks [2,6] and fast algebraic
attacks [1,5] have provided the best analyses against some deployed
LFSR-based ciphers. The process complexity is exponential in the de-
gree of the equations. Fast algebraic attacks were introduced [5] as a
way of reducing run-time complexity by reducing the degree of the sys-
tem of equations. Previous reports on fast algebraic attacks [1,5] have
underestimated the complexity of substituting the keystream into the
system of equations, which in some cases dominates the attack. We also
show how the Fast Fourier Transform (FFT) [4] can be applied to de-
crease the complexity of the substitution step. Finally, it is shown that
all functions of degree satisfy a common, function-independent linear
combination that may be used in the pre-computation step of the fast
algebraic attack. An explicit factorization of the corresponding charac-
teristic polynomial yields the fastest known method for performing the
pre-computation step.

1 Introduction
Many popular stream ciphers are based on linear feedback shift registers (LF-
SRs) [11]. Such ciphers include E0 [3], LILI-128 [12] and Toyocrypt(see [10]).
They consist of a memory register called the state that is updated (changed)
every time a keystream output is produced, and an additional device, called the
nonlinear combiner. The nonlinear combiner computes a keystream output as
a function of the current LFSR state1. The sequence of states produced by an
LFSR depends on the initial state of LFSR, which is always presumed to be se-
cret. Since recovering this initial state allows prediction of unknown keystream,
we follow the convention of [5] and call it K as if it was actually the key. Most
practical stream ciphers initialize this state from the real key and a nonce. The
advantages of LFSRs are many. LFSRs can be constructed very efficiently in
hardware and some recent designs are also very efficient in software. LFSRs can
be chosen such that the produced sequence has a high period and good statistical
Some LFSR-based stream ciphers have a non-linear filter that maintains some bits of
memory, but research has shown that such ciphers can be analyzed in the same way
as ciphers without memory. Some designs use multiple LFSRs, but again these are
usually equivalent to a single LFSR. Some modern stream ciphers use units larger
than bits, but this discussion applies equally to such ciphers, so we will talk only in
terms of bits.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 390“406, 2004.
© International Association for Cryptologic Research 2004
Rewriting Variables: The Complexity of Fast Algebraic Attacks 391

While there are many approaches to the cryptanalysis of LFSR-based stream
ciphers, this paper is concerned primarily with the recently proposed algebraic
attacks [2,6] and fast algebraic attacks [1,5]. Such attacks have provided the
best analyses against some theoretical and deployed ciphers.
An algebraic attack consists of three steps. The first step is to find a system
of algebraic equations that relate the bits of the initial state K and bits of the
keystream Some methods [2,6] have been proposed for finding “lo-
calized” equations (where the keystream bits are in a small range
This first step is a pre-computation: the attacker must compute these equations
before attacking a key-stream. Furthermore, the computation need only be per-
formed once, and the attacker can use the same equations for attacking multiple
key-streams. The second and third steps are performed after the attacker has
observed some keystream. In the second step, the observed keystream bits are
substituted into the algebraic equations (from the first step) to obtain a system
of algebraic equations in the bits of K. The third step is to solve these algebraic
equations to determine K. This will be possible if the equations are of low degree
in the bits of K, and a sufficient number of equations can be obtained from the
observed keystream.
The process complexity of the third step is exponential in the degree of the
equations. Fast algebraic attacks were introduced by Courtois at Crypto 2003 [5]
as a way of reducing run-time complexity by reducing the degree of the system
of equations. This method requires an additional pre-computation step; this step
determines a linear combination of equations in the initial system that cancels
out terms of high degree (provided the algebraic equations are of a special form).
This yields a second system of equations relating K and the keystream Z that
contains only terms of low degree. In the second step, the appropriate keystream
values are now substituted into this second system to obtain a new system of
algebraic equations in the bits of K. Solving the new system (in the third step)
is easier than solving the old system because the new system contains only terms
of low degree.
Courtois [5] proposes using a method based on the Berlekamp-Massey algo-
rithm [8] for determining the linear combination obtained in the additional pre-
computation step. The normal Berlekamp-Massey algorithm has a complexity of
while an asymptotically-fast implementation has a complexity of C·D(log D)
for some large constant C. It is unclear which method would be best for the size
of D considered in these attacks. Armknecht [1] provides a method for improving
the complexity when the cipher consists of multiple LFSRs.
Contributions of this paper. The first contribution is to note that previous
reports on fast algebraic attacks (such as [1,5]) appear to have underestimated
the complexity of substituting the keystream into the second system of equa-
tions2. The complexity was originally underestimated as only O(DE) [5], where
D is the size of the linear combination and E is the size of the second system
We are aware (via private communication) of other proposed algebraic attacks in
which the substitution complexities were initially ignored. In one case, the complexity
of simple substitution was almost the square of the complexity of solving the system.

392 Philip Hawkes and Gregory G. Rose

of equations. Table 1 lists the values of O(DE) for previously published attacks
from [1,5]. However, simple substitution would require a complexity of
(see Section 2.3), and no other method was suggested for reducing the complex-
ity. It is true that E bitwise operations of the substitution can be performed
in parallel, reducing the time complexity to but in cases where E is
large, the process complexity should still be considered in the absence of
specialized hardware. In many cases actually exceeds the complexity of
solving the system of equations, as shown in Table 13. The second contribution
of this paper is to show how the Fast Fourier Transform (FFT) [4] can be applied
to decrease the complexity of the substitution step to D. The resulting
complexities of the FFT approach are also listed in Table 1.

The final contribution of this paper is to provide an efficient method for
determining the linear combination obtained in the additional pre-computation
step of the fast algebraic attack. First, we make the observation that all functions
of degree satisfy a common function-independent linear combination of length
that is defined exclusively by the LFSR. Then we provide a direct
method for computing this linear combination (based on the work of Key [7]).
This method requires operations for small constant
This is a significant improvement on the complexities of previous methods.

This paper is organized as follows: Section 2 describes fast algebraic attacks.
In Section 3 we discuss the complexity of substitution step for fast algebraic
attacks. Section 4 reviews the Fast Fourier Transform and Section 5 describes
The attack on LILI-128 requires only every bit from a keystream of
length The process complexity of selecting these bits is ignored in the literature,
and could be an area for useful discussion.

Rewriting Variables: The Complexity of Fast Algebraic Attacks 393

how the FFT can speed up the substitution step. Section 6 contains some ob-
servations on the pre-computation step. Section 7 concludes the paper.

2 Fast Algebraic Attacks

The length of the LFSR is that is the internal state of the LFSR is
A state is derived from the previous state by applying
an (invertible) linear mapping with The
function L can be represented by an matrix over GF(2), which is called the
state update matrix. Notice that we can write Each keystream bit
is generated by first updating the LFSR state (by applying L) and then applying
a Boolean function to the bits of the LFSR state. For the purposes of this paper,
everything about the cipher is presumed to be known to the attacker, except the
initial state of the LFSR and any subsequent state derived from it.
Linearization: Recall that the first two steps of the attack result in a system
of nonlinear algebraic equations in a small number of unknown variables (these
variables being the bits of the initial state). The most successful algebraic attacks
(to date), have been based on linearization. The basis of this technique is to
“linearize” a system of nonlinear algebraic equations by assigning a new unknown
variable to each monomial term that appears in the system. The same monomial
term appearing in distinct equations is assigned the same new unknown variable.
The system of equations then changes from a system of non-linear equations
(with few unknown variables) into a system of linear equations (with a large
number of unknown variables). If the number of linear equations exceeds the
number of new unknown variables, then an attacker can solve the system to
obtain the new unknown variables of the linear system (which will in turn reveal
the unknown variables of the non-linear system). The advantage of linearization
is that the attacker can use the large body of knowledge about the solution of
linear systems.

2.1 The Monomial State

This section introduces some notation that is useful for describing linearization.
For a given value of the state and for a given degree we shall let (the
monomial state) denote the GF(2) column vector with each component being a
corresponding monomial of degree or less. The number of such monomials is
so contains D components. The initial monomial
corresponds to the initial state K.

Example 1. If (that is, and then there are
D = 11 monomials of degree 2:

394 Philip Hawkes and Gregory G. Rose

where “T” denotes the transpose of the matrix to make a column vector. For
the values of the monomial components of are:

The ordering of the monomial components is arbitrary; for consistency we will
enumerate using lower subscripts first, as shown.

Expressing Functions of the LFSR State. We can express any Boolean
function of the LFSR state as a product of the matrix with a row vector.
Example 2. Consider a Boolean function of the state (using
the LFSR state from Example 1). This function can be expressed:

where the addition and multiplication operations are performed in GF(2). We
have now expressed the Boolean function as the product of the matrix
with a row vector

that selects the values of the specific monomials required to evaluate
The row vector f depends only on the function and is independent of the
LFSR feedback polynomial, the value of the initial state, and the index
The Monomial State Rewriting Matrix. The mapping from one LFSR state
to the next LFSR state can be expressed as a matrix product It
is also possible to determine the mapping from one monomial state to the next
monomial state as a matrix product
Example 3. Consider a 4-bit LFSR as in Example 1 with monomial state
If the LFSR has is
of the form then the next state has a corresponding next
monomial state
which is related to the original monomial state as follows (only some relation-
ships have been shown in order to save space):

Rewriting Variables: The Complexity of Fast Algebraic Attacks 395

Each component of the next monomial state is a linear function of the original
monomial state. These linear functions for can be combined into
a matrix (the “rewriting matrix”) such that

Notice that the matrix depends only on the LFSR and the degree This
example generalizes: for every LFSR and degree there is a “monomial state
rewriting matrix” such that Moreover, for every
the monomial state after clocks of the LFSR can be expressed as a GF(2)
matrix operation

where is the initial monomial state. Combining equations (1) and (2), we
get another expression for

where the vector depends solely on the function the monomial
state update matrix and the number of clocks (all of which are known to
the attacker). For example, the vectors corresponding to the
function in Example 2 are:

2.2 Algebraic Attacks
We always assume that the monomial state is unknown; it is the goal of algebraic
attacks to determine the initial monomial state (and thereby determine the
initial LFSR state).
Step 1. The first step in an algebraic attack is to find a Boolean function such
that the equation

396 Philip Hawkes and Gregory G. Rose

is true for all clocks (or indices) The degree of with respect to the bits
of we shall denote by Various methods have been proposed for finding
such equations (see [2,6]). These equations typically have small values for For
simplicity we shall hereafter combine the keystream bit values into
a keystream vector
For the linearization approach, it is convenient to obtain an expression for
in terms of keystream bits and bits of the initial monomial state
1. Express as the inner product of and a keystream-dependent

2. Now, Equation (2) can be substituted into Equation (5);

3. Equation (4) is thereby transformed to the form:

The components of depend on: (a) the function
(b) the monomial state rewriting matrix associated with the monomials of
degree of degree or less; (c) the number of clocks and (d) the small keystream
vector An attacker has access to all of this information, so the attacker is able
to compute all of the components of This means that the only unknowns
in Equation (6) are the components of the initial monomial state
Step 2. The second step of an algebraic attack consists of substituting the
observed keystream vector into the components of the vectors and then
computing the vector The vectors are evaluated for
many indices Each of the evaluated vectors provides the attacker with
a linear equation in the D unknown bits of the initial monomial state
Since there are D unknowns, around D linear equations will be required to
obtain a solvable system. An initial choice of D equations may contain linearly
dependent equations, so more than D equations may be required in order to get a
completely solvable system. It is thought that not many more than D equations
will be required in practice (see remark at the end of section 5.1 of [5]), so we
will assume D equations are sufficient.
Step 3. The third step recovers by solving the resulting system of linear
equations. The system can be solved by Gaussian elimination or more efficient
methods [13]. The complexity of solving such a system of equations is estimated
to be where (known as the Gaussian co efficient) is estimated to be
In general, D will be about

Rewriting Variables: The Complexity of Fast Algebraic Attacks 397

Complexities. The complexities of an algebraic attack are as follows:
The complexity of finding the equation depends on many factors
and is beyond the scope of this paper.
The amount of keystream required for the second step (the data complexity)
The complexity of the second step (substituting the keystream into the equa-
tions) is assuming that the functions are rela-
tively simple functions of the keystream; and
the complexity of solving the system in the third step is
Note 1. The complexity is exponential in the degree Hence, a low degree is
required for an efficient attack. Therefore, an attacker using an algebraic attack
will always try to find a system of low degree equations.

2.3 Fast Algebraic Attacks
Courtois [5] proposed “fast algebraic attacks”, as a method for decreasing the
degree of a given system of equations. For fast algebraic attacks, we presume
that the function can be written in the form

where is of degree in the bits of is of degree in the bits of
and only depends on the keystream. Since the functions and are of two
distinct degrees (in the bits of it is simplest to consider them as depending
on distinct monomial states and with corresponding monomial state
monomials of
rewriting matrices and There are
monomials of degree or less.
degree or less, and
A fast algebraic attack gains an advantage over the normal algebraic attacks
by including an additional pre-computation step in which the attacker deter-
mines linear combinations of equation (7) that will cancel out the high-degree
monomials of degree that occur in but not in
As in equation (3), and are written as vector inner-products:
where is a vector with D
components (all of which are independent of the keystream); and
where is a vector
with E components (some of which are dependent on the keystream).
Equation (7) is then transformed to:

In the fast algebraic attack pre-computation step, the attacker finds (D + 1)
coefficients such that

398 Philip Hawkes and Gregory G. Rose

Equations (8) and (9) can be combined:

Thus, we obtain a linear expression in

The second step of a fast algebraic attack is to evaluate many vectors
by substituting observed keystream vectors into the vectors in equa-
tion (11). Each of the evaluated vectors provides the attacker with a linear
equation in the E unknown bits of the initial monomial state Equation (10)
involves fewer unknowns than the initial equation (7); this means that the fast
algebraic attack requires fewer equations in order to solve for the unknowns.
Reducing the number of unknowns and equations significantly improves the third
step of the attack as solving the system of E equations (10) takes significantly
less time than solving the system of D equations of (6). The complexity of the
third step is now
Courtois [5] and Armknecht [1] have proposed efficient methods for finding
the coefficients of equation (9). The details are not relevant to this paper, but
the complexities are provided in Table 2 for the purposes of comparison with
the method proposed in Section 6 of this paper.
Data Complexity. Evaluating the vector (for each equation (10)) requires
substituting the bits from the D keystream vectors Obtaining
E equations (10) can be achieved using the set of keystream vectors
These keystream vectors can be
obtained from the keystream bits Hence, the attack can
be performed using as few as keystream bits.

3 Substitution Complexity of Fast Algebraic attacks
Normal algebraic attacks and fast algebraic attacks differ in the complexity of
substituting the keystream into the equations in Step 2. The vector is a
function of a small number of keystream bits but the vector
is a function of a large number of keystream bits
As discussed in the introduction, a misunderstanding resulted the attacks [1,5]
failing to account for this difference.
The naïve approach to substituting the keystream is to compute the vectors
first and then substitute these vectors into the equations (11) individually4.
Computing a single component of the vector for a single
value of will require complexity D/2, since (on average) half of the coefficients
We ignore the cost of computing as this cost is independent of the cost of
determining from the values of

Rewriting Variables: The Complexity of Fast Algebraic Attacks 399

are expected to be zero. There are E components in each vector so the
complexity of substituting the keystream to obtain a single vector using
equation (11) is E — (D/2) = ED/2. That is, obtaining a single equation (11)
has complexity is ED/2. Since E equations are required in order to solve the
system, the total cost of simple substitution will be
Table 3 lists the complexity of simple substitution for the fast algebraic attacks
in the literature. Note that simple substitution is significantly more complex than
solving the linear system of equations in these cases.

4 The Discrete Fourier Transform
Real Spectral Analysis: First, we™ll consider a quick tangential topic. A com-
mon tool in analyzing a real-valued function (such as a sound wave) eval-
uated on a real domain is to represents the function as a sum
of simple periodic functions (cosine and sine curves) where the function is
specified by the amplitudes of these periodic functions:

with amplitudes and assigned for each frequency The sequences
and are the Fourier series for and evaluating the amplitudes is called
a spectral analysis.
Discrete Spectral Analysis: Suppose is a function defined at discrete
values and the values of lie in a field Such discrete
functions are equivalent to sequences written Discrete spectral anal-
ysis of like real spectral analysis, represents using simple periodic sequences
with period P. These periodic sequences are of the form
where and is an element of multiplicative order P in some
field these functions are analogous to the sine and cosine curves.
In some cases, has elements of multiplicative order P, and can be an
element of that is, In other cases, must be chosen in an a larger
field that is an extension field of In either case, the field is a vector space
over that is, elements of are of the form for some basis
Elements are mapped to elements where
400 Philip Hawkes and Gregory G. Rose

is the identity element of Thus, the sequence of elements of is mapped
to the sequence with elements of A discrete spectral analysis determines
a sequence of P “amplitudes” such that the
sequence can be expressed as:

In this way, each sequence value is represented as a linear combination of
sum of P periodic sequences It is well known that the
sequence of amplitudes A can be computed directly from the sequence as:

The calculation of A from as in (13) is called the Discrete Fourier Transform
(DFT), while the calculation of from A is the Inverse DFT. The most efficient
method for performing the DFT, known as the Fast Fourier Transform (FFT) [4],
requires a total of operations in the field There is also an Inverse
FFT that uses the same amount of computation to invert the DFT.
Convolutions and the DFT. The convolution of two discrete sequences and
of period P is another sequences of period P with
These are sequences of elements from the field It is common
to write Computing the convolution according to first principles
would take multiplication and addition operations in the field However,
the Convolution Property provides us with an alternative method.
Convolution Property if and only if
The convolution can be computed by applying the FFT to and to form A
and B, forming and finally applying the inverse FFT to Y
in order to form The total complexity is
operations in the field In the cases where the FFT method is faster
by a factor of In other cases, computations in cost more than
computations in and the advantage is less. This “trick” has been applied in
many areas such as the fast multiplication of larger numbers and polynomials
(the product of two polynomials is the convolution of the two corresponding
sequences of coefficients). We shall use this trick in the next two sections.

5 Applying the FFT to the Substitution Step
The calculation in equation (11) is performed component-wise, so we will begin
by focussing on the sequence of values for only one of the monomial components
of the vectors and the
corresponding components of the vectors Assume that the attacker
has observed a sufficient amount of keystream, evaluated the values of in
equation (8) for and determined the values The
Rewriting Variables: The Complexity of Fast Algebraic Attacks 401

attacker now needs fast way to determine the values
(see equation (11)) from the values
The inefficiency of using simple substitution is indicated by two things:
Equations (11) often re-use the same values of when computing

Equations (11) all use the same linear combination;
This problem appears similar to computing for an appropriate
sequence Indeed, if is defined as and
then for Thus, the FFT may
be combined with the Convolution Property for computing The sequences
and are defined on the field so will be a field of the form
We choose to be the smallest value such that and define
This choice seems best because it uses the smallest number of bits
to represent elements of
Basic DFT-Based Substitution Algorithm
1. Map the sequences and in to sequences and in
Apply the DFT to obtain the sequence of amplitudes B from
3. Apply the DFT to obtain the sequence of amplitudes V from
5. Apply the inverse DFT to obtain from Q. Note
6. Extract from if else
Complexity. This may seem like a strange way to compute but the
algorithm is very efficient when the FFT is used to compute the DFTs:
The values are computed via the FFT using field operations
(operations in the field For given values the same sequence
is used for each monomial component and for each attacked keystream.
The attacker should pre-compute and store B to save time.
The values are computed via the FFT using field operations.
The values are computing using (D + E) field multiplications.
The sequence can be obtained from by applying the standard Inverse
FFT; this requires in time field operations.
The pre-computation of B requires field operations. The run-time
total complexity for computing the value of from the values is ap-
proximately field operations. These field operations are more complex
than GF(2) (logical) operations. To a good approximation, each field opera-
tion is equal in complexity to logical operations (much of this can be
parallelized). Thus, for our calculations, the run-time complexity of the above
algorithm is equivalent to around logical operations.
Improvement 1. The above algorithm computes all (D + E) values of but
only E of these values are ever used. An efficient alternative is to divide the
linear combination into segments of length and perform the FFTs
402 Philip Hawkes and Gregory G. Rose

on these segments using a smaller field where
If we define appropriate sequences then we may write:

with sub-sequences Now, if
and only if Thus, computing requires: pre-
computing the FFTs of computing the FFTS of computing and
applying the inverse FFT to Q to obtain and thus The FFTs and Inverse
FFT dominate the complexity, requiring logical operations at
run-time, where The basic algorithm above uses The optimal
choice for (providing the lowest complexity) depends on D and E.
Improvement 2. The DFT-based substitution algorithm computes the values
of for only one component of the vectors
There are a total of E monomial components for each
such vector; thus if each component is computed separately, then the total
complexity of computing all components of every vector would be ap-
proximately field operations, or logical
operations. Fortunately, monomial components can be packed into each com-
putation. For a set of monomial indices we define a sequence

provides values for the price of one, dividing the total complexity by
Improved DFT-Based Substitution Algorithm
1. Form
Form sub-sequences
3. Apply DFT to obtain from
Apply inverse DFT to obtain from Q .
The run-time complexity, after applying these two improvements, is

Table 4 shows the complexities of the FFT method for substitution for the
current fast algebraic attacks in literature. In the case of E0, the improvement
has been significant, and the substitution step no-longer dominates the run-time
Rewriting Variables: The Complexity of Fast Algebraic Attacks 403

complexity. The improvement in the substitution complexity is less noticeable
required for LILI-128, and insignificant for Toyocrypt. The substitution step still
comprises a significant portion of the complexity for these attacks.
In all cases, the first improvement did not affect the complexity significantly;
the largest improvement was by a factor of 4. Interestingly, the optimal value of
was for which and The corresponding complexity
is around

6 Improving the Pre-computation Step
A square matrix satisfies its characteristic polynomial. That is, if
is the characteristic polynomial of then

where 0 represents the all-zero matrix. Suppose the coefficients of
Equation (9) are assigned the values of the coefficients of the charac-
teristic polynomial of Then, for any function of degree

The characteristic polynomial of depends on the LFSR and the degree and
is otherwise independent of the function Thus, the coefficients
(of the characteristic polynomial of can be substituted for the coefficients
in equations (9) and (11) for all functions of degree
Most functions of degree have a minimal polynomial of degree
(see [9, Fact 6.55]); the minimal polynomial for these functions will
be However, there are functions the minimal polynomial is a smaller
factor of For example, in the attack on E0 [1], has length D =
11,017,633; while the minimal polynomial of the Boolean function used in the
attack is of slightly smaller length Using in the attack
on E0 (instead of using the minimal polynomial) would increase the complexity
404 Philip Hawkes and Gregory G. Rose

by a small amount. An advantage of the methods proposed in [1,5] is that those
method will find the minimal polynomial for a specified Boolean function, even
if the minimal polynomial is smaller than
It is not difficult to show that the polynomial divides This
suggests that the linear combination may also be zero, thus
resulting in a trivial equation that provides no information about the
initial monomial state. The probability of this cancellation occurring is small;
the vectors in the sum depend
on the keystream. Nonetheless, this suggests that a better approach would be
to cancel only those components corresponding to monomials of degree greater
than using the polynomial

The linear combination cancels components corresponding
to monomials of degree in the range but will not cancel components
corresponding to monomials of degree or less. Hence,
can be considered as an E-dimensional vector for some vector
Equation 10 would then become

The probability that will depend on the probability that
for a random z. Unless v(z) is constant, this probability will
be less than 1/2. After substitution of many such vectors, the probability that
will be very small and Equation (15) is highly unlikely to be

6.1 Direct Computation of the Linear Combination
Suppose an LFSR state of length is updated according to state update
matrix L, and the characteristic polynomial of L is primitive5. The following
theorem, while not explicitly stated by Key [7], is a fairly obvious consequence
of Key™s ideas, so no proof is given. This result provides a direct method for
Theorem1. (Largely due to Key [7]) If is a root of the char-
acteristic polynomial of the LFSR state update matrix, then the characteristic
polynomial of is where and
denotes the Hamming weight of (that is, the number of 1™s in the radix-2 rep-
resentation of the integer
Factoring into GF(2) polynomials Computing the entire
product while in would be costly. Fortunately, the
This approach can be extended to cases where the characteristic polynomial is not
primitive; for example, when the keystream is a function of more than one LFSR.
See Key [7] for more details.

Rewriting Variables: The Complexity of Fast Algebraic Attacks 405

factors in can be easily grouped into GF(2) polynomials of degree
or less. We define an equivalence relation where if and only if
for some value Since the set is closed under
multiplication by 2, the polynomial has coefficients
that are either 0 or the identity element I. That is, the product can be
represented as a GF(2) polynomial. Thus, can compute in two phases:
1. Compute the GF(2) polynomials for all of weight or less.
2. Multiply the GF(2) polynomials to form

Computing The FFT over may be used to compute the poly-
nomials First, apply the FFT to the sequences corresponding to
for to obtain sequences Second, form sequence with
Finally, apply the inverse FFT to to obtain The
first step is the most costly; it requires logical operations for each factor.
There are D factors, so the total combined cost is logical operations.
Multiplying to form The second phase has polynomials with
coefficients in GF(2) and uses FFT™s in extension fields of GF(2). Multiplying
two GF(2) polynomials in to get a product of degree less than can be
performed (via the FFT) using operations in the
logical operations6. Use the
extension field this is equal to
FFT to first multiply pairs of polynomials of degree to get polynomials of
degree Then multiply pairs of polynomials of degree to get polynomials
of degree and so forth until is formed. The total complexity is

The combined complexity for the two phases is In Ta-
ble 2, the complexity of this method is compared against the previous methods.

7 Conclusion
We have shown that some published “fast algebraic attacks” on stream ciphers
underestimate the process complexity of one of the steps, and we provide correct
complexity estimates for these cases. We then show an improved method, us-
ing Fast Fourier Transforms, for substituting keystream bits into the system of
equations needing to be solved. We also made some observations about the linear
combination used in the pre-computation step of the fast algebraic attack. In par-
ticular, we found the fastest known method for performing the pre-computation.
The fast algebraic attack remains an extremely powerful technique for analyzing
LFSR-based stream ciphers.
The attacker can “pack” multiple GF(2) polynomials into a single sequence
and thereby compute the convolution of multiple pairs GF(2) polynomials using the
same amount of computation. This reduces complexity by a relatively small factor.

406 Philip Hawkes and Gregory G. Rose

Acknowledgement is due to Nicolas Courtois for inspirational discussions and fine
French cuisine during which the results in Section 6 came to light. The reviewers
also provided many useful insights; in particular, the two improvements to the
basic DFT-based substitution algorithm.

1. F. Armknecht: Improving Fast Algebraic Attacks, to be presented at FSE2004 Fast
Software Encryption Workshop, Delhi, India, February 5-7, 2004.
2. F. Armknecht and M. Krause: Algebraic Attacks on Combiners with Memory,
proceedings of Crypto2003, Lecture Notes in Computer Science, vol. 2729, pp.
162-176, Springer 2003.
3. Bluetooth CIG, Specification of the Bluetooth system, Version 1.1, February 22,
2001. Available from www.bluetooth.com.
4. J. W.Cooley and J. W. Tukey: An algorithm for the machine calculation of complex
Fourier series, Mathematics of Computation, 19, 90, pp. 297-301, 1965.
5. N. Courtois: Fast Algebraic Attacks on Stream Ciphers with Linear Feedback,
Crypto2003, Lecture Notes in Computer Science, vol. 2729, pp. 177-194, Springer
6. N. Courtois and W. Meier: Algebraic Attacks on Stream Ciphers with Linear Feed-
back, EUROCRYPT 2003, Warsaw, Poland, Lecture Notes in Computer Science,
vol. 2656, pp. 345-359, Springer, 2003.
7. E. Key: An Analysis of the Structure and Complexity of Nonlinear Binary Se-
quence Generators, IEEE Transactions on Information Theory, vol. IT-22, No. 6,
November 1976.
8. J. Massey: Shift Register synthesis and BCH decoding, IEEE Transactions on
Information Theory, IT-15 (1969), pp. 122-127.
9. A. Menezes, P.van Oorschot and A. Vanstone Handbook of Applied Cryptography,
CRC Press series on discrete mathematics and its applications, CRC Press LLC,
10. M. Mihaljevic and H. Imai: Cryptanalysis of Toyocrypt-HS1 stream cipher, IEICE
Transactions on Fundamentals, vol E85-A, pp. 66-73, January 2002. Available at
11. R. Rueppel: Stream Ciphers; Contemporary Cryptology: The Science of Informa-
tion Integrity. G. Simmons ed., IEEE Press, New York, 1991.
12. L. Simpson, E. Dawson, J. Golic and W. Millan: LILI Keystream Generator; Se-
lected Areas in Cryptography SAC™2000, Lecture Notes in Computer Science, vol.
1807, Springer, pp. 392-407.
13. V. Strassen: Gaussian Elimination is Not Optimal; Numerische Mathematik, vol.
13, pp. 354-356, 1969.

Faster Correlation Attack
on Bluetooth Keystream Generator E0

Yi Lu* and Serge Vaudenay

Abstract. We study both distinguishing and key-recovery attacks
against E0, the keystream generator used in Bluetooth by means of cor-
relation. First, a powerful computation method of correlations is formu-
lated by a recursive expression, which makes it easier to calculate corre-
lations of the finite state machine output sequences up to 26 bits for E0
and allows us to verify the two known correlations to be the largest for
the first time. Second, we apply the concept of convolution to the analy-
sis of the distinguisher based on all correlations, and propose an efficient
distinguisher due to the linear dependency of the largest correlations.
Last, we propose a novel maximum likelihood decoding algorithm based
on fast Walsh transform to recover the closest codeword for any linear
code of dimension L and length It requires time and
memory This can speed up many attacks such as fast corre-
lation attacks. We apply it to E0, and our best key-recovery attack works
in time given consecutive bits after precomputation. This
is the best known attack against E0 so far.

1 Background
Correlation properties play an important role in the security of nonlinear LFSR-
based combination generators in stream ciphers. As name implies, the word
correlation in stream ciphers is frequently referred to as the intrinsic relation
between the keystream and a subset of the LFSR subsequences. The earliest
studies dated back to [21,25,27] in the 80™s and the concept of correlation im-
munity was proposed as a security criterion. In the 90™s Meier-Staffelbach [22]
analyzed correlation properties of combiners with one memory bit, followed by
[12] focusing on correlation properties of a general combiner with
memory. Recently, a series of fast correlation attacks sprang up, to name but a
few [5“7,16,24]. Thereupon we dedicate this paper to the generalized correlation
attacks against E0, a combiner with 4-bit memory used in the short-range wire-
less technology Bluetooth. Prior to our work, existed various attacks [1,8,10,11,
13“15,17,26] against E0. The best key-recovery attacks are algebraic attacks [1,
Supported in part by the National Competence Center in Research on Mobile Infor-

mation and Communication Systems (NCCR-MICS), a center of the Swiss National
Science Foundation under the grant number 5005-67322.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 407“425, 2004.
© International Association for Cryptologic Research 2004
408 Yi Lu and Serge Vaudenay

8], whose basic approach is to use the polynomial canceling all memory bits and
involving only key bits, instead of considering the multiple polynomial to cancel
the key bits in the distinguishing attack; besides, [9,13,14] discussed correlations
of E0. In [14], Hermelin-Nyberg for the first time presented a rough computation
method to compute the correlation (called bias for our purpose), but neither did
they formalize the computation systematically, nor did they attempt to find a
larger correlation. In [9,13], two larger correlations for a short sequence of up to
6 bits were exposed. However, due to the limit of the computation method, no
one was certain about the existence of a larger correlation for a longer sequence,
which is critical to the security of E0.
Our first contribution in the paper is that based on Hermelin-Nyberg [14] we
formulate a powerful computation method of correlations by a recursive expres-
sion, which makes it easier to calculate correlations of the Finite State Machine
(FSM) output sequences up to 26 bits for E0 (and allows us to prove the two
known correlations to be the only largest for the first time). Second, we apply
the concept of convolution to the analysis of the distinguisher based on all cor-
relations, which allows us to build an efficient distinguisher that halves the data
complexity of the basic uni-bias-based distinguisher due to the linear depen-
dency of the two largest biases. Our best distinguishing attack takes time
given keystream with precomputation . Finally, by means of Fast
Walsh Transform (FWT), we propose a novel Maximum Likelihood Decoding
(MLD) algorithm to recover the closest codeword for any linear code. Our pro-
posed algorithm can be easily applied to speed up a class of fast correlation
attacks. Furthermore the algorithm is optimal when the length of the code
and the dimension L satisfy the relation which is the case when we ap-
ply it to recover for E0. Our best key-recovery attack works in time given
consecutive bits after precomputation. Compared with the minimum
time complexity in algebraic attacks [1,8], this is the best known attack
against E0.
This paper is structured as follows: in Section 2, a description of E0 is given.
In Section 3, we analyze the bias inside E0 systematically. Then based on one
largest bias, we build a primary distinguisher for E0 in Section 4; an efficient
way is shown in Section 5 that makes full use of all the largest biases to advance
the distinguisher. In Section 6 we investigate the MLD algorithm for a linear
code; the result is then applied to a key-recovery attack against E0 in Section 7.
Finally we conclude in Section 8.

2 Description of the Bluetooth Keystream Generator E0
As specified in [3], the keystream generator E0 used in Bluetooth belongs to a
combination generator with four memory bits2, denoted by at
time where The whole system (Fig.1) uses four Linear Feedback
Throughout this paper, O(·) is used to provide a rough estimate on complexities, eg.
here means operations, where is a small constant.
The description of E0 (sometimes called one-level E0) here only involves the
keystream generation after the initialization.

Faster Correlation Attack on Bluetooth Keystream Generator E0 409

Fig. 1. Outline of E0

Shift Registers (LFSRs) denoted by with lengths
and primitive feedback polynomials

respectively. At clock cycle the four LFSRs™ output bits will be
added as integers. The sum is represented in the binary system.
Let denote its least significant bit A 16-state machine (the
dashed box in Fig.1) emits one bit out of its state and takes the
input to update by Finally, the keystream is obtained by xoring
with That is,

The detailed mechanism of the FSM is beyond the scope of the paper except
the fact that the embedded delay cell (the box labeled in Fig.1) makes
depend only on the initial state of the FSM as well as the past vectors
For completeness, we briefly outline it: given together with
the state the FSM moves into the state Table 1 shows the state transi-
tion of the FSM, where the four-bit state is represented in the quaternary system
(e.g. the FSM changes from into by the input
More formally, by introducing two temporary bits each
clock, the following indirect iterative expressions between and suffice
to update

410 Yi Lu and Serge Vaudenay

One can check Table 1 by those equations. We denote hereafter the content
of LFSRs at time Then the state of E0 at time is fully represented by the

3 Biases Inside E0
Property 1. Assuming holds for then

Proof. It™s easy to verify that the state transition given (the third bottom
row in Table 1) is indeed a linear transformation over that actually
satisfies the recurrence relation: where states are
represented by column vectors of and A is the following 4 — 4
square matrix over GF(2):

Note that is the minimal polynomial of A, from which we
Remark 2. Since this seemingly suggests that

As mentioned in [9,13] (without relating to the above special case), this bit
exhibits a much higher bias as shown later in Corollary 7. We will now introduce
essential material in order to find a systematic algorithm to compute biases.
Proposition 3. If is random and uniformly distributed, then for any
is random and uniformly distributed,
is independent of

Faster Correlation Attack on Bluetooth Keystream Generator E0 411

Proof (sketch). The former half of the theorem is justified by the fact that
is a permutation of for any About the latter half of theorem, first,
we know that is random and uniformly distributed by previous
conclusion. Thus, are i.i.d. random variables all independent of
both and By Eq.(2,3,4), we complete the proof.
Interestingly, we deduce that if is uniformly distributed, then any se-
quence of 39-bit consecutive E0 keystream is uniformly distributed; in particular,
no better key-recovery attack against E0 exists other than tradeoffs given a se-
quence of 39-bit consecutive keystream.
The following definition is derived from normalized correlation [22, p.71].
Definition 4. The bias of a random Boolean variable X is defined as

The normalized correlation between two random Boolean variables X and Y
is just the bias of Assuming that is the sum of four balanced inde-
pendent random bits and that is uniformly distributed, then we know that
is a constant for any denoted by
Table 2 shows computed by Eq.(2), where dashed entries are zeros. The
following important lemma (see Appendix A for proof) inspired by [14], gives an
easy way of computing the bias for iterative structures.

Lemma 5. Given and let
X and Y be two independent random variables in and respectively.
Assuming that is uniformly distributed in for any
we have

Corollary 6. We set to be a permutation defined
over and where
Assuming is uniformly distributed, for any we have

412 Yi Lu and Serge Vaudenay

Proof. By Eq.(2,3,4) we have

Then we apply Lemma 5 with
and we

Note that the assumption of Lemma 5 holds by Proposition 3.
Now we use Corollary 6 iteratively to deduce some important biases of with
Table 2 and the initial values and for A
full list of nonzero triplets is given below for illustration:

Corollary 7. Assuming is random and uniformly distributed, we have

Note that both biases were mentioned in [9,13] (without formal proof). Now by
Corollary 6, we can easily prove it as shown next.
Proof. We show the equivalent of the first bias as follows:

The second bias is similarly proved from
Also, we computed all biases for and found that
are the only largest ones. All biases for are listed in Ta-
ble 14, Appendix C. Throughout the paper we let

Faster Correlation Attack on Bluetooth Keystream Generator E0 413

4 A Primary Distinguisher for E0
4.1 The Connection Polynomial of the Equivalent Single LFSR
Let be the order of the connection polynomial of for
Since all are primitive polynomials, furthermore, by Lemma
6.57 of [18, p.218], the equivalent LFSR to generate the same sequence of the sum
of the four original LFSRs outputs over GF(2) has the connection polynomial
with order (by Lemma 6.50, [18, p.214])
and degree

4.2 Finding the Multiple Polynomial with Low Weight
Let be the degree of a general polynomial We use the standard approx-
imation to estimate the minimal weight of multiples of with degree at
most by the following constraint: is the smallest such that

Listed in Table 3 is the estimated3 corresponding to with
by solving Inequality (5).
To find multiples with low weight, efficient algorithms like [4] exist provided
the degree is low, say, less than 2000, which does not apply to E0. So we can use
the conventional birthday paradox to find with the minimal (i.e.
which takes precomputation time or we apply the generalized
birthday problem [29] to find of same weight but higher degree with much
less precomputation as tradeoff. Table 4 compares the two algorithms. In Ap-
pendix B, we also provide some non-optimal multiples as examples, including
with and

4.3 Building a Uni-Bias-Based Distinguisher for E0
Let be the normalized multiple of with degree
and weight where As holds for
all by Eq.(l), we deduce

Two special cases occur for and because we know the exact value of

414 Yi Lu and Serge Vaudenay

By the Piling-up Lemma [20] and Corollary 7, we know the right-hand side
of Eq.(6) is equal to zero with probability With standard linear
cryptanalysis techniques, we can distinguish the keystream of E0 from a
truly random sequence with samples, simply by checking the left-hand
side of Eq.(6) equals zero most of the time. Based on with and we
minimize the data complexity by choosing Table 5 shows the
minimum is achieved with Table 6 summarizes the
best performance of our primary distinguisher for E0 based on either the use of
with weight 4 in Appendix B, or a search of

5 The Advanced Multi-bias-Based Distinguisher for E0
5.1 Preliminaries
Definition 8. Given for we define



where 1 denotes a constant function equal to 1
Faster Correlation Attack on Bluetooth Keystream Generator E0 415

Note that the first two definitions correspond to convolution and Walsh trans-
form respectively. We recall these basic facts: for any we

if is a distribution, i.e. and for all then
the distribution of the XOR of i.i.d. random vectors with distribution
is moreover,
If the random Boolean variable A follows the distribution then
where is defined in Definition 4.

5.2 An Efficient Way to Deploy Multi-biases in E0 Simultaneously
Given a linear mapping of rank we define vectors
and Note that can be derived from
the keystream directly. Except for accidentally bad choices of we make
a heuristic assumption that all are independent. Let be the probability
distribution of the vector and let be the probability
distribution of the vector The Walsh transforms of and are linked

Now we discuss how to design in order to reduce data complexity. From
Baignères [2, Theorem 3, p.10], we know that we can distinguish a distribution
of random vectors from a uniform distribution with samples.
Here, the distribution of is So the modified distinguisher needs
data complexity

Let be the number of the largest Walsh coefficients over all nonzero
with absolute value4 Since we obtain

In order to lower it™s necessary to have This implies the largest
coefficients are linearly dependent, which happens to be true in E0: recall that
the 6-bit vectors of the three largest biases satisfy the linear relation,

As a simple solution we may just pick
and (where denotes the row of then we obtain
Note that from Subsection 4.3 we have for regardless of and

416 Yi Lu and Serge Vaudenay

And is reduced to a factor of for negligible Indeed, recall that we
proved by computation that the largest Walsh coefficient for are either
(0,.. .,0,1,1,1,1,1,0,...,0) or (0,...,0,1,0,0,0,0,1,0,...,0). Thus
This leads to a more general solution, if we pick and the
row of as

then we obtain And so the improved factor of data complexity


. 13
( 19)