the equation set

Since every in this equation set is defined in and we can

replace with the corresponding bit from the keystream z. Thus, is

TEAM LinG

An Improved Correlation Attack 383

a sequence of pointers to z and we can write the equations over z as the equation

set

We are now finished with the precomputation. Let be the number of

equations in that hold. We iterate as follows:

Input The keystream z of length M, the equation the equation set the

index sequence the states and and let

1. Calculate

2. Use Lemma 2 to generate and lower all indexes in the

equation set by one. Theorem 1 guarantees that the equations are defined

over

3. If the first equation in gets a negative index, then remove the equation

from Find a new index at the end of where is defined, and add

the new equation over z to

4. Calculate metric as the number of equations in that hold.

5. If set and

6. Set and go to step 1.

7. Output as the initialization state for

Remark 3. The algorithm is presented this way to make it readable and to show

the basic idea. To reach the complexity a few technical details on

the implementation of the algorithm are needed. These details are given in Ap-

pendix A.

5 Theoretical Properties

Success Formula

5.1

We can let an (unusual) encoder be defined by removing the Boolean function

from the cipher. Then we can use coding theory to evaluate the attack. Let the

initialization state for define the information bits in such a system.

Let be the (not filtered) irregular clocked stream from

that is y = Q(c, u) and Then the bitstream y defines

the codeword that is sent over a noisy channel. Let the keystream z = Q(c, v)

(the filtered version of y) be the received codeword.

Assume we have the wrong guess for then approximately of the equa-

tions in the set (13) will hold. Now assume we have have guessed the correct

According to the observation in Section 2.1 the equations in the set (13) will

independently of the initializa-

hold with probability

tion bits

TEAM LinG

384 Håvard Holland and Tor Helleseth

Let define the channel ˜noise™. The uncertainty is defined by

and the channel capacity is given by

We can approximate with Following Shannon™s noisy

coding theorem we can set up this bound for success.

Proposition 1. The attack will succeed with if the number of

parity check equations is

where and where is the number of input bits

in

When is close to we expect the probability for success to be close to 1,

see [15]. The simulations of our algorithm show that if we set the

success rate is approximately 99%.

5.2 Keystream Length

If the generator polynomial has weight we must find a multiple

of of weight 4 and a degree We need at least the v stream to be

of length In addition, to find entries in v where the equation is defined v

must at least have length

From the expectation (6) of N we get

which proves the following proposition:

Proposition 2. Let an equation over v be defined by of weight 4 and degree

To obtain an equation set of equations over z, the length of the z stream

must be

The keystream length M depends on the number of equations the deletion

rate and the degree of The degree is then again highly dependent

on the search algorithm we use to find When we use the search algorithm

in [11,17] the degree of will be of order which is close to

the theoretical expected degree [5] for

5.3 Runtime Complexity

The runtime complexity for our attack is

parity check tests, where is calculated using Corollary 4. Note that the runtime

is independent of the length of

TEAM LinG

An Improved Correlation Attack 385

5.4 Memory Complexity

If we implement the attack directly as described in Sections 4.3 and 4.4 the

algorithm will need around bits of computer memory. The reason

for the 32N term is that of length N is a sequence

of pointers of 32 bits. In appendix A.2 we show how we can store using

N memory bits without affecting the runtime complexity. The total amount of

memory bytes needed is then

6 Simulations of the Attack

The LILI-128 cipher[16] is based on the general model we attack in this paper.

To be able to compare our attack with previous attacks, we have tested the

attack on this cipher.

6.1 The LILI-128 Cipher

In the LILI cipher the clock control generator is defined by

and The data generator sub system is

and de-

fined by a Boolean table of size 1024. Further on we get and

for and The number of keybits in the secret key

is 39 + 89 = 128.

6.2 Simulations

We have done the simulations on some versions of the LILI-128 cipher with

LFSRs of different lengths to empirically verify the success formula in Section

5.1. See Table 2 for the simulations. Note that we use the full size from

the LILI cipher in the three attacks in the bottom of the table. For and

we get

We have implemented the attack in C code using the Intel icc compiler on a

Pentium IV processor. Using the full 32-bit capability and all the implementation

tricks explained in Appendix A our implementation uses only approximately 7

cycles per parity check test. Hence the algorithm works fast in practice and will

take processor cycles.

Each attack is run 100 times, and the table shows that the estimated success

rate holds and that the algorithm is efficient.

TEAM LinG

386 Håvard Holland and Tor Helleseth

6.3 A Complete Attack on LILI-128

Preprocessing. For the LILI cipher, we have found a multiple

which corresponds to the recursion

and we have that

This precomputation took only 5 hours and 40 Gbyte hard disk space. We see

that

Finding We have and

To be almost sure to succeed we use equations. Hence, the

runtime for attacking LILI-128 is

parity checks. Using our implementation this corresponds to processor

cycles. Using Proposition 2 with we need a keystream of length

The attack needs about 290 Mbyte of RAM. It can easily be parallelized

and distributed among processors with virtually no overhead, since there is no

need for communcation between the processor, and no need for shared memory.

If we have 1024 Pentium IV 2.53 GHz processors, each having access to about

290 MB of memory, the attack would take about 4.5 months using 68 Mbyte of

keystream data.

Finding when is known. Our attack only finds the initialization bits

for It is possible to combine the Quick Metricfrom [12] with the previous

attack against LILI in [7] to find when is given. Since this is not the scope

of this paper we will not go into details, and we refer to [7,12] for the exact

description. The preprosessing stage will have complexity of order memory

TEAM LinG

An Improved Correlation Attack 387

lookups, and runtime complexity of order parity checks. The complexity

for the method above is much lower than the complexity for finding and will

therefore have little effect on the overall runtime for a full attack.

7 Conclusion

We have proposed a new key recovery correlation attack on irregular clocked

keystream generators where the stream is filtered by a nonlinear Boolean func-

tion. Our attack uses a correlation property of Boolean functions, that gives

higher correlation than previous methods. Thus we need fewer equations to suc-

ceed. The property holds even if the function is correlation immune. Using this

property together with the iteration techniques from [11] we get a low runtime

and low memory complexity algorithm for attacking the model. The algorithm

outputs the initialization bits for Knowing there exist previous

algorithms which can determine efficiently.

Acknowledgment

We would like to thank Matthew Parker, John Erik Mathiassen and the anony-

mous referees for many helpful comments.

References

1. V. Chepyzhov, T. Johansson, and B. Smeets. A simple algorithm for fast correlation

attacks on stream ciphers. In Fast Software Encryption, FSE 2000, volume 1978

of Lecture Notes in Computer Science, pages 181“195. Springer-Verlag, 2001.

2. Philippe Chose, Antoine Joux, and Michel Mitton. Fast correlation attacks: An

algorithmic point of view. In Advances in Cryptology - EUROCRYPT 2002, volume

2332 of Lecture Notes in Computer Science, pages 209“221. Springer-Verlag, 2002.

3. Nicolas Courtois. Fast algebraic attacks on stream ciphers with linear feedback. In

Advances in Cryptology-CRYPTO™ 2003, volume 2729 of Lecture Notes in Com-

puter Science, pages 176“194, 2003.

4. Nicolas Courtois and Willi Meier. Algebraic attacks on stream ciphers with linear

feedback. In Advances in Cryptology - EUROCRYPT 2003, volume 2656 of Lecture

Notes in Computer Science, pages 345“359, 2003.

5. Computation of low-weight parity-check polynomials. Electronic Letters,

October 1996. 32(21):1981-1982.

6. T. Johansson and F. Jönsson. Theoretical analysis of a correlation attack based

on convolutional codes. In Proceedings of 2000 IEEE International Symposium on

Information Theory, IEEE Transaction on Information Theory, page 212, 2000.

7. Fredrik Jönsson and Thomas Johansson. A fast correlation attack on LILI-128. In

Inf. Process. Lett. 81(3), pages 127“132, 2002.

8. Sabine Leveiller, Gilles Z©mor, Philippe Guillot, and Joseph Boutros. A new crypt-

analytic attack for pn-generators filtered by a boolean function. In Selected Areas

in Cryptography: 9th Annual International Workshop, SAC 2002, volume 2595 of

Lecture Notes in Computer Science, pages 232“249. Springer-Verlag, 2003.

TEAM LinG

Håvard Molland and Tor Helleseth

388

9. M. Matsui. Linear cryptanalysis method for DES cipher. In Advances in

Cryptology-EUROCRYPT™93, volume 765 of Lecture Notes in Computer Science,

pages 386“397. Springer-Verlag, 1994.

10. W. Meier and O. Staffelbach. Fast correlation attacks on stream ciphers. In Ad-

vances in Cryptology-EUROCRYPT™88, volume 330 of Lecture Notes in Computer

Science, pages 301“314. Springer-Verlag, 1998.

11. Håvard Molland. Improved linear consistency attack on irregular clocked keystream

generators. In Fast Software Encryption, FSE 2004, To appear in LNCS. Springer-

Verlag, 2004. Available at http://www.ii.uib.no/˜molland/crypto

12. Håvard Molland, John Erik Mathiassen, and Tor Helleseth. Improved fast correla-

tion attack using low rate codes. In Cryptography and Coding, IMA 2003, volume

2898 of Lecture Notes in Computer Science, pages 67“81, 2003.

13. W.T. Penzhorn and G.J Kuhn. Computation of low-weight parity checks for corre-

lation attacks on stream ciphers. In Cryptography and Coding, IMA 1995, volume

1025 of Lecture Notes in Computer Science, pages 74“83. Springer-Verlag, 1995.

14. Markku-Juhani Olavi Saarinen. A time-memory tradeoff attack against LILI-128.

In Fast Software Encryption, FSE 2002, volume 2365 of Lecture Notes in Computer

Science, pages 231“236, 2002.

15. T. Siegenthaler. Decrypting a class of stream ciphers using ciphertext only. IEEE

Trans, on Comp., C-34:81“85, 1985.

16. L. Simpson, E. Dawson, and W. Millan. LILI keystream generator. In

SAC™2000, volume 2012 of Lecture Notes in Computer Science, pages 231“236.

Springer-Verlag, 2002. Available at http://www.isrc.qut.edu.au/lili.

17. D. Wagner. A generalized birthday problem. In Advances in cryptology-CRYPTO™

2002, volume 2442 of Lecture Notes in Computer Science, pages 288“303, 2002.

18. Xian-Mo Zhang and Yuliang Zheng. The nonhomomorphicity of boolean functions.

In Selected Areas in Cryptography, SAC 98, volume 1556 of Lecture Notes in Com-

puter Science, pages 280“295. Springer-Verlag, 1998.

Appendix

A Implementation Details

To reach the runtime complexity and memory complexity down to

bits, the implementation of the algorithm has some tricks. Since not

all of these tricks are obvious we give more detailed descriptions of them below.

A.1 Runtime Details

Sliding window. In Lemma 2 we get by among other things deleting the

first bits of This is done using the sliding window technique, which means

that we move the viewing to the right instead of shifting the whole sequence to

the left. This way the shifting can be done in just a couple of operations. To

avoid heavy use of memory, we slide the window over an array of fixed length

N, so that the entries that become free at the beginning of the array are reused.

Thus, the left and right indexes of the sliding window after iterations will be

where for all

TEAM LinG

An Improved Correlation Attack 389

The same sliding window technique is also used on the equation set when

equations are deleted and added to the equation set.

Updating the indices. In Lemma 2 every pointer in is replaced with

for every which would take M operations. If we skip the replacements

we note that after iterations the entry in will become It is also

important to note that when we write the entries

are pointers from to z. They are not the actual key bits.

Thus, in the implementation we do not replace with But when we after

iterations in the search for equations find an equation

that is defined, we replace the corresponding equation with

to compensate.

Reducing the memory access time. When we test an equation we must use

pointers to pointers to the keystream. Then each equation test will have high

memory access time. We can reduce this significantly by testing the equations

on 32 states simultaneously. This is possible since the next state is tested

by shifting all the equations one entry to the left over z. We can now take the

bits for each of the term in the equations and

put them into 32 bit registers. Now we can test the states and add one to the

metrics of the states that satisfy the equation. This speeds up the runtime by a

factor of approximately 20.

A.2 Memory Details

Reducing the use of memory. Instead of storing all the pointers, we set 1

in where the bits are defined and 0 otherwise. When we search in to find

entries where the equation is defined, we keep track of where in z the four

terms in points to by counting the number of 1™s we pass during the search.

This is done for each of the 4 terms in the equation This way we always

know where in z the given equation of points to. Using this trick the number

of memory bits needed during an attack is reduced from bits to

Implementing this trick will not affect the runtime of the attack.

TEAM LinG

Rewriting Variables: The Complexity

of Fast Algebraic Attacks on Stream Ciphers

Philip Hawkes1 and Gregory G. Rose1

Qualcomm Australia, Level 3, 230 Victoria Rd, Gladesville, NSW 2111, Australia

{phawkes,ggr}@qualcomm.com

Abstract. Recently proposed algebraic attacks [2,6] and fast algebraic

attacks [1,5] have provided the best analyses against some deployed

LFSR-based ciphers. The process complexity is exponential in the de-

gree of the equations. Fast algebraic attacks were introduced [5] as a

way of reducing run-time complexity by reducing the degree of the sys-

tem of equations. Previous reports on fast algebraic attacks [1,5] have

underestimated the complexity of substituting the keystream into the

system of equations, which in some cases dominates the attack. We also

show how the Fast Fourier Transform (FFT) [4] can be applied to de-

crease the complexity of the substitution step. Finally, it is shown that

all functions of degree satisfy a common, function-independent linear

combination that may be used in the pre-computation step of the fast

algebraic attack. An explicit factorization of the corresponding charac-

teristic polynomial yields the fastest known method for performing the

pre-computation step.

1 Introduction

Many popular stream ciphers are based on linear feedback shift registers (LF-

SRs) [11]. Such ciphers include E0 [3], LILI-128 [12] and Toyocrypt(see [10]).

They consist of a memory register called the state that is updated (changed)

every time a keystream output is produced, and an additional device, called the

nonlinear combiner. The nonlinear combiner computes a keystream output as

a function of the current LFSR state1. The sequence of states produced by an

LFSR depends on the initial state of LFSR, which is always presumed to be se-

cret. Since recovering this initial state allows prediction of unknown keystream,

we follow the convention of [5] and call it K as if it was actually the key. Most

practical stream ciphers initialize this state from the real key and a nonce. The

advantages of LFSRs are many. LFSRs can be constructed very efficiently in

hardware and some recent designs are also very efficient in software. LFSRs can

be chosen such that the produced sequence has a high period and good statistical

properties.

1

Some LFSR-based stream ciphers have a non-linear filter that maintains some bits of

memory, but research has shown that such ciphers can be analyzed in the same way

as ciphers without memory. Some designs use multiple LFSRs, but again these are

usually equivalent to a single LFSR. Some modern stream ciphers use units larger

than bits, but this discussion applies equally to such ciphers, so we will talk only in

terms of bits.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 390“406, 2004.

© International Association for Cryptologic Research 2004

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 391

While there are many approaches to the cryptanalysis of LFSR-based stream

ciphers, this paper is concerned primarily with the recently proposed algebraic

attacks [2,6] and fast algebraic attacks [1,5]. Such attacks have provided the

best analyses against some theoretical and deployed ciphers.

An algebraic attack consists of three steps. The first step is to find a system

of algebraic equations that relate the bits of the initial state K and bits of the

keystream Some methods [2,6] have been proposed for finding “lo-

calized” equations (where the keystream bits are in a small range

This first step is a pre-computation: the attacker must compute these equations

before attacking a key-stream. Furthermore, the computation need only be per-

formed once, and the attacker can use the same equations for attacking multiple

key-streams. The second and third steps are performed after the attacker has

observed some keystream. In the second step, the observed keystream bits are

substituted into the algebraic equations (from the first step) to obtain a system

of algebraic equations in the bits of K. The third step is to solve these algebraic

equations to determine K. This will be possible if the equations are of low degree

in the bits of K, and a sufficient number of equations can be obtained from the

observed keystream.

The process complexity of the third step is exponential in the degree of the

equations. Fast algebraic attacks were introduced by Courtois at Crypto 2003 [5]

as a way of reducing run-time complexity by reducing the degree of the system

of equations. This method requires an additional pre-computation step; this step

determines a linear combination of equations in the initial system that cancels

out terms of high degree (provided the algebraic equations are of a special form).

This yields a second system of equations relating K and the keystream Z that

contains only terms of low degree. In the second step, the appropriate keystream

values are now substituted into this second system to obtain a new system of

algebraic equations in the bits of K. Solving the new system (in the third step)

is easier than solving the old system because the new system contains only terms

of low degree.

Courtois [5] proposes using a method based on the Berlekamp-Massey algo-

rithm [8] for determining the linear combination obtained in the additional pre-

computation step. The normal Berlekamp-Massey algorithm has a complexity of

while an asymptotically-fast implementation has a complexity of C·D(log D)

for some large constant C. It is unclear which method would be best for the size

of D considered in these attacks. Armknecht [1] provides a method for improving

the complexity when the cipher consists of multiple LFSRs.

Contributions of this paper. The first contribution is to note that previous

reports on fast algebraic attacks (such as [1,5]) appear to have underestimated

the complexity of substituting the keystream into the second system of equa-

tions2. The complexity was originally underestimated as only O(DE) [5], where

D is the size of the linear combination and E is the size of the second system

2

We are aware (via private communication) of other proposed algebraic attacks in

which the substitution complexities were initially ignored. In one case, the complexity

of simple substitution was almost the square of the complexity of solving the system.

TEAM LinG

392 Philip Hawkes and Gregory G. Rose

of equations. Table 1 lists the values of O(DE) for previously published attacks

from [1,5]. However, simple substitution would require a complexity of

(see Section 2.3), and no other method was suggested for reducing the complex-

ity. It is true that E bitwise operations of the substitution can be performed

in parallel, reducing the time complexity to but in cases where E is

large, the process complexity should still be considered in the absence of

specialized hardware. In many cases actually exceeds the complexity of

solving the system of equations, as shown in Table 13. The second contribution

of this paper is to show how the Fast Fourier Transform (FFT) [4] can be applied

to decrease the complexity of the substitution step to D. The resulting

complexities of the FFT approach are also listed in Table 1.

The final contribution of this paper is to provide an efficient method for

determining the linear combination obtained in the additional pre-computation

step of the fast algebraic attack. First, we make the observation that all functions

of degree satisfy a common function-independent linear combination of length

that is defined exclusively by the LFSR. Then we provide a direct

method for computing this linear combination (based on the work of Key [7]).

This method requires operations for small constant

This is a significant improvement on the complexities of previous methods.

This paper is organized as follows: Section 2 describes fast algebraic attacks.

In Section 3 we discuss the complexity of substitution step for fast algebraic

attacks. Section 4 reviews the Fast Fourier Transform and Section 5 describes

3

The attack on LILI-128 requires only every bit from a keystream of

length The process complexity of selecting these bits is ignored in the literature,

and could be an area for useful discussion.

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 393

how the FFT can speed up the substitution step. Section 6 contains some ob-

servations on the pre-computation step. Section 7 concludes the paper.

2 Fast Algebraic Attacks

The length of the LFSR is that is the internal state of the LFSR is

A state is derived from the previous state by applying

an (invertible) linear mapping with The

function L can be represented by an matrix over GF(2), which is called the

state update matrix. Notice that we can write Each keystream bit

is generated by first updating the LFSR state (by applying L) and then applying

a Boolean function to the bits of the LFSR state. For the purposes of this paper,

everything about the cipher is presumed to be known to the attacker, except the

initial state of the LFSR and any subsequent state derived from it.

Linearization: Recall that the first two steps of the attack result in a system

of nonlinear algebraic equations in a small number of unknown variables (these

variables being the bits of the initial state). The most successful algebraic attacks

(to date), have been based on linearization. The basis of this technique is to

“linearize” a system of nonlinear algebraic equations by assigning a new unknown

variable to each monomial term that appears in the system. The same monomial

term appearing in distinct equations is assigned the same new unknown variable.

The system of equations then changes from a system of non-linear equations

(with few unknown variables) into a system of linear equations (with a large

number of unknown variables). If the number of linear equations exceeds the

number of new unknown variables, then an attacker can solve the system to

obtain the new unknown variables of the linear system (which will in turn reveal

the unknown variables of the non-linear system). The advantage of linearization

is that the attacker can use the large body of knowledge about the solution of

linear systems.

2.1 The Monomial State

This section introduces some notation that is useful for describing linearization.

For a given value of the state and for a given degree we shall let (the

monomial state) denote the GF(2) column vector with each component being a

corresponding monomial of degree or less. The number of such monomials is

so contains D components. The initial monomial

corresponds to the initial state K.

state

Example 1. If (that is, and then there are

D = 11 monomials of degree 2:

TEAM LinG

394 Philip Hawkes and Gregory G. Rose

where “T” denotes the transpose of the matrix to make a column vector. For

the values of the monomial components of are:

The ordering of the monomial components is arbitrary; for consistency we will

enumerate using lower subscripts first, as shown.

Expressing Functions of the LFSR State. We can express any Boolean

function of the LFSR state as a product of the matrix with a row vector.

Example 2. Consider a Boolean function of the state (using

the LFSR state from Example 1). This function can be expressed:

where the addition and multiplication operations are performed in GF(2). We

have now expressed the Boolean function as the product of the matrix

with a row vector

that selects the values of the specific monomials required to evaluate

The row vector f depends only on the function and is independent of the

LFSR feedback polynomial, the value of the initial state, and the index

The Monomial State Rewriting Matrix. The mapping from one LFSR state

to the next LFSR state can be expressed as a matrix product It

is also possible to determine the mapping from one monomial state to the next

monomial state as a matrix product

Example 3. Consider a 4-bit LFSR as in Example 1 with monomial state

If the LFSR has is

of the form then the next state has a corresponding next

monomial state

which is related to the original monomial state as follows (only some relation-

ships have been shown in order to save space):

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 395

Each component of the next monomial state is a linear function of the original

monomial state. These linear functions for can be combined into

a matrix (the “rewriting matrix”) such that

Notice that the matrix depends only on the LFSR and the degree This

example generalizes: for every LFSR and degree there is a “monomial state

rewriting matrix” such that Moreover, for every

the monomial state after clocks of the LFSR can be expressed as a GF(2)

matrix operation

where is the initial monomial state. Combining equations (1) and (2), we

get another expression for

where the vector depends solely on the function the monomial

state update matrix and the number of clocks (all of which are known to

the attacker). For example, the vectors corresponding to the

function in Example 2 are:

2.2 Algebraic Attacks

We always assume that the monomial state is unknown; it is the goal of algebraic

attacks to determine the initial monomial state (and thereby determine the

initial LFSR state).

Step 1. The first step in an algebraic attack is to find a Boolean function such

that the equation

TEAM LinG

396 Philip Hawkes and Gregory G. Rose

is true for all clocks (or indices) The degree of with respect to the bits

of we shall denote by Various methods have been proposed for finding

such equations (see [2,6]). These equations typically have small values for For

simplicity we shall hereafter combine the keystream bit values into

a keystream vector

For the linearization approach, it is convenient to obtain an expression for

in terms of keystream bits and bits of the initial monomial state

1. Express as the inner product of and a keystream-dependent

vector

2. Now, Equation (2) can be substituted into Equation (5);

where

3. Equation (4) is thereby transformed to the form:

The components of depend on: (a) the function

(b) the monomial state rewriting matrix associated with the monomials of

degree of degree or less; (c) the number of clocks and (d) the small keystream

vector An attacker has access to all of this information, so the attacker is able

to compute all of the components of This means that the only unknowns

in Equation (6) are the components of the initial monomial state

Step 2. The second step of an algebraic attack consists of substituting the

observed keystream vector into the components of the vectors and then

computing the vector The vectors are evaluated for

many indices Each of the evaluated vectors provides the attacker with

a linear equation in the D unknown bits of the initial monomial state

Since there are D unknowns, around D linear equations will be required to

obtain a solvable system. An initial choice of D equations may contain linearly

dependent equations, so more than D equations may be required in order to get a

completely solvable system. It is thought that not many more than D equations

will be required in practice (see remark at the end of section 5.1 of [5]), so we

will assume D equations are sufficient.

Step 3. The third step recovers by solving the resulting system of linear

equations. The system can be solved by Gaussian elimination or more efficient

methods [13]. The complexity of solving such a system of equations is estimated

to be where (known as the Gaussian co efficient) is estimated to be

In general, D will be about

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 397

Complexities. The complexities of an algebraic attack are as follows:

The complexity of finding the equation depends on many factors

and is beyond the scope of this paper.

The amount of keystream required for the second step (the data complexity)

is

The complexity of the second step (substituting the keystream into the equa-

tions) is assuming that the functions are rela-

tively simple functions of the keystream; and

the complexity of solving the system in the third step is

Note 1. The complexity is exponential in the degree Hence, a low degree is

required for an efficient attack. Therefore, an attacker using an algebraic attack

will always try to find a system of low degree equations.

2.3 Fast Algebraic Attacks

Courtois [5] proposed “fast algebraic attacks”, as a method for decreasing the

degree of a given system of equations. For fast algebraic attacks, we presume

that the function can be written in the form

where is of degree in the bits of is of degree in the bits of

and only depends on the keystream. Since the functions and are of two

distinct degrees (in the bits of it is simplest to consider them as depending

on distinct monomial states and with corresponding monomial state

monomials of

rewriting matrices and There are

monomials of degree or less.

degree or less, and

A fast algebraic attack gains an advantage over the normal algebraic attacks

by including an additional pre-computation step in which the attacker deter-

mines linear combinations of equation (7) that will cancel out the high-degree

monomials of degree that occur in but not in

As in equation (3), and are written as vector inner-products:

where is a vector with D

components (all of which are independent of the keystream); and

where is a vector

with E components (some of which are dependent on the keystream).

Equation (7) is then transformed to:

In the fast algebraic attack pre-computation step, the attacker finds (D + 1)

coefficients such that

TEAM LinG

398 Philip Hawkes and Gregory G. Rose

Equations (8) and (9) can be combined:

Thus, we obtain a linear expression in

The second step of a fast algebraic attack is to evaluate many vectors

by substituting observed keystream vectors into the vectors in equa-

tion (11). Each of the evaluated vectors provides the attacker with a linear

equation in the E unknown bits of the initial monomial state Equation (10)

involves fewer unknowns than the initial equation (7); this means that the fast

algebraic attack requires fewer equations in order to solve for the unknowns.

Reducing the number of unknowns and equations significantly improves the third

step of the attack as solving the system of E equations (10) takes significantly

less time than solving the system of D equations of (6). The complexity of the

third step is now

Courtois [5] and Armknecht [1] have proposed efficient methods for finding

the coefficients of equation (9). The details are not relevant to this paper, but

the complexities are provided in Table 2 for the purposes of comparison with

the method proposed in Section 6 of this paper.

Data Complexity. Evaluating the vector (for each equation (10)) requires

substituting the bits from the D keystream vectors Obtaining

E equations (10) can be achieved using the set of keystream vectors

These keystream vectors can be

obtained from the keystream bits Hence, the attack can

be performed using as few as keystream bits.

3 Substitution Complexity of Fast Algebraic attacks

Normal algebraic attacks and fast algebraic attacks differ in the complexity of

substituting the keystream into the equations in Step 2. The vector is a

function of a small number of keystream bits but the vector

is a function of a large number of keystream bits

As discussed in the introduction, a misunderstanding resulted the attacks [1,5]

failing to account for this difference.

The naïve approach to substituting the keystream is to compute the vectors

first and then substitute these vectors into the equations (11) individually4.

Computing a single component of the vector for a single

value of will require complexity D/2, since (on average) half of the coefficients

4

We ignore the cost of computing as this cost is independent of the cost of

determining from the values of

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 399

are expected to be zero. There are E components in each vector so the

complexity of substituting the keystream to obtain a single vector using

equation (11) is E — (D/2) = ED/2. That is, obtaining a single equation (11)

has complexity is ED/2. Since E equations are required in order to solve the

system, the total cost of simple substitution will be

Table 3 lists the complexity of simple substitution for the fast algebraic attacks

in the literature. Note that simple substitution is significantly more complex than

solving the linear system of equations in these cases.

4 The Discrete Fourier Transform

Real Spectral Analysis: First, we™ll consider a quick tangential topic. A com-

mon tool in analyzing a real-valued function (such as a sound wave) eval-

uated on a real domain is to represents the function as a sum

of simple periodic functions (cosine and sine curves) where the function is

specified by the amplitudes of these periodic functions:

with amplitudes and assigned for each frequency The sequences

and are the Fourier series for and evaluating the amplitudes is called

a spectral analysis.

Discrete Spectral Analysis: Suppose is a function defined at discrete

values and the values of lie in a field Such discrete

functions are equivalent to sequences written Discrete spectral anal-

ysis of like real spectral analysis, represents using simple periodic sequences

with period P. These periodic sequences are of the form

where and is an element of multiplicative order P in some

field these functions are analogous to the sine and cosine curves.

In some cases, has elements of multiplicative order P, and can be an

element of that is, In other cases, must be chosen in an a larger

field that is an extension field of In either case, the field is a vector space

over that is, elements of are of the form for some basis

Elements are mapped to elements where

TEAM LinG

400 Philip Hawkes and Gregory G. Rose

is the identity element of Thus, the sequence of elements of is mapped

to the sequence with elements of A discrete spectral analysis determines

a sequence of P “amplitudes” such that the

sequence can be expressed as:

In this way, each sequence value is represented as a linear combination of

sum of P periodic sequences It is well known that the

sequence of amplitudes A can be computed directly from the sequence as:

The calculation of A from as in (13) is called the Discrete Fourier Transform

(DFT), while the calculation of from A is the Inverse DFT. The most efficient

method for performing the DFT, known as the Fast Fourier Transform (FFT) [4],

requires a total of operations in the field There is also an Inverse

FFT that uses the same amount of computation to invert the DFT.

Convolutions and the DFT. The convolution of two discrete sequences and

of period P is another sequences of period P with

These are sequences of elements from the field It is common

to write Computing the convolution according to first principles

would take multiplication and addition operations in the field However,

the Convolution Property provides us with an alternative method.

Convolution Property if and only if

The convolution can be computed by applying the FFT to and to form A

and B, forming and finally applying the inverse FFT to Y

in order to form The total complexity is

operations in the field In the cases where the FFT method is faster

by a factor of In other cases, computations in cost more than

computations in and the advantage is less. This “trick” has been applied in

many areas such as the fast multiplication of larger numbers and polynomials

(the product of two polynomials is the convolution of the two corresponding

sequences of coefficients). We shall use this trick in the next two sections.

5 Applying the FFT to the Substitution Step

The calculation in equation (11) is performed component-wise, so we will begin

by focussing on the sequence of values for only one of the monomial components

of the vectors and the

corresponding components of the vectors Assume that the attacker

has observed a sufficient amount of keystream, evaluated the values of in

equation (8) for and determined the values The

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 401

attacker now needs fast way to determine the values

(see equation (11)) from the values

The inefficiency of using simple substitution is indicated by two things:

Equations (11) often re-use the same values of when computing

Equations (11) all use the same linear combination;

This problem appears similar to computing for an appropriate

sequence Indeed, if is defined as and

then for Thus, the FFT may

be combined with the Convolution Property for computing The sequences

and are defined on the field so will be a field of the form

We choose to be the smallest value such that and define

This choice seems best because it uses the smallest number of bits

to represent elements of

Basic DFT-Based Substitution Algorithm

1. Map the sequences and in to sequences and in

Apply the DFT to obtain the sequence of amplitudes B from

2.

3. Apply the DFT to obtain the sequence of amplitudes V from

Compute

4.

5. Apply the inverse DFT to obtain from Q. Note

6. Extract from if else

Complexity. This may seem like a strange way to compute but the

algorithm is very efficient when the FFT is used to compute the DFTs:

The values are computed via the FFT using field operations

(operations in the field For given values the same sequence

is used for each monomial component and for each attacked keystream.

The attacker should pre-compute and store B to save time.

The values are computed via the FFT using field operations.

The values are computing using (D + E) field multiplications.

The sequence can be obtained from by applying the standard Inverse

FFT; this requires in time field operations.

The pre-computation of B requires field operations. The run-time

total complexity for computing the value of from the values is ap-

proximately field operations. These field operations are more complex

than GF(2) (logical) operations. To a good approximation, each field opera-

tion is equal in complexity to logical operations (much of this can be

parallelized). Thus, for our calculations, the run-time complexity of the above

algorithm is equivalent to around logical operations.

Improvement 1. The above algorithm computes all (D + E) values of but

only E of these values are ever used. An efficient alternative is to divide the

linear combination into segments of length and perform the FFTs

TEAM LinG

402 Philip Hawkes and Gregory G. Rose

on these segments using a smaller field where

If we define appropriate sequences then we may write:

with sub-sequences Now, if

and only if Thus, computing requires: pre-

computing the FFTs of computing the FFTS of computing and

applying the inverse FFT to Q to obtain and thus The FFTs and Inverse

FFT dominate the complexity, requiring logical operations at

run-time, where The basic algorithm above uses The optimal

choice for (providing the lowest complexity) depends on D and E.

Improvement 2. The DFT-based substitution algorithm computes the values

of for only one component of the vectors

There are a total of E monomial components for each

such vector; thus if each component is computed separately, then the total

complexity of computing all components of every vector would be ap-

proximately field operations, or logical

operations. Fortunately, monomial components can be packed into each com-

putation. For a set of monomial indices we define a sequence

Then

provides values for the price of one, dividing the total complexity by

Improved DFT-Based Substitution Algorithm

Inputs:

Outputs:

1. Form

Form sub-sequences

2.

3. Apply DFT to obtain from

Compute

4.

Apply inverse DFT to obtain from Q .

5.

Set

6.

Output

7.

The run-time complexity, after applying these two improvements, is

Table 4 shows the complexities of the FFT method for substitution for the

current fast algebraic attacks in literature. In the case of E0, the improvement

has been significant, and the substitution step no-longer dominates the run-time

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 403

complexity. The improvement in the substitution complexity is less noticeable

required for LILI-128, and insignificant for Toyocrypt. The substitution step still

comprises a significant portion of the complexity for these attacks.

In all cases, the first improvement did not affect the complexity significantly;

the largest improvement was by a factor of 4. Interestingly, the optimal value of

was for which and The corresponding complexity

is around

6 Improving the Pre-computation Step

A square matrix satisfies its characteristic polynomial. That is, if

is the characteristic polynomial of then

where 0 represents the all-zero matrix. Suppose the coefficients of

Equation (9) are assigned the values of the coefficients of the charac-

teristic polynomial of Then, for any function of degree

The characteristic polynomial of depends on the LFSR and the degree and

is otherwise independent of the function Thus, the coefficients

(of the characteristic polynomial of can be substituted for the coefficients

in equations (9) and (11) for all functions of degree

Most functions of degree have a minimal polynomial of degree

(see [9, Fact 6.55]); the minimal polynomial for these functions will

be However, there are functions the minimal polynomial is a smaller

factor of For example, in the attack on E0 [1], has length D =

11,017,633; while the minimal polynomial of the Boolean function used in the

attack is of slightly smaller length Using in the attack

on E0 (instead of using the minimal polynomial) would increase the complexity

TEAM LinG

404 Philip Hawkes and Gregory G. Rose

by a small amount. An advantage of the methods proposed in [1,5] is that those

method will find the minimal polynomial for a specified Boolean function, even

if the minimal polynomial is smaller than

It is not difficult to show that the polynomial divides This

suggests that the linear combination may also be zero, thus

resulting in a trivial equation that provides no information about the

initial monomial state. The probability of this cancellation occurring is small;

the vectors in the sum depend

on the keystream. Nonetheless, this suggests that a better approach would be

to cancel only those components corresponding to monomials of degree greater

than using the polynomial

The linear combination cancels components corresponding

to monomials of degree in the range but will not cancel components

corresponding to monomials of degree or less. Hence,

can be considered as an E-dimensional vector for some vector

Equation 10 would then become

The probability that will depend on the probability that

for a random z. Unless v(z) is constant, this probability will

be less than 1/2. After substitution of many such vectors, the probability that

will be very small and Equation (15) is highly unlikely to be

trivial.

6.1 Direct Computation of the Linear Combination

Suppose an LFSR state of length is updated according to state update

matrix L, and the characteristic polynomial of L is primitive5. The following

theorem, while not explicitly stated by Key [7], is a fairly obvious consequence

of Key™s ideas, so no proof is given. This result provides a direct method for

computing

Theorem1. (Largely due to Key [7]) If is a root of the char-

acteristic polynomial of the LFSR state update matrix, then the characteristic

polynomial of is where and

denotes the Hamming weight of (that is, the number of 1™s in the radix-2 rep-

resentation of the integer

Factoring into GF(2) polynomials Computing the entire

product while in would be costly. Fortunately, the

5

This approach can be extended to cases where the characteristic polynomial is not

primitive; for example, when the keystream is a function of more than one LFSR.

See Key [7] for more details.

TEAM LinG

Rewriting Variables: The Complexity of Fast Algebraic Attacks 405

factors in can be easily grouped into GF(2) polynomials of degree

or less. We define an equivalence relation where if and only if

for some value Since the set is closed under

multiplication by 2, the polynomial has coefficients

that are either 0 or the identity element I. That is, the product can be

represented as a GF(2) polynomial. Thus, can compute in two phases:

1. Compute the GF(2) polynomials for all of weight or less.

2. Multiply the GF(2) polynomials to form

Computing The FFT over may be used to compute the poly-

nomials First, apply the FFT to the sequences corresponding to

for to obtain sequences Second, form sequence with

Finally, apply the inverse FFT to to obtain The

first step is the most costly; it requires logical operations for each factor.

There are D factors, so the total combined cost is logical operations.

Multiplying to form The second phase has polynomials with

coefficients in GF(2) and uses FFT™s in extension fields of GF(2). Multiplying

two GF(2) polynomials in to get a product of degree less than can be

performed (via the FFT) using operations in the

logical operations6. Use the

extension field this is equal to

FFT to first multiply pairs of polynomials of degree to get polynomials of

degree Then multiply pairs of polynomials of degree to get polynomials

of degree and so forth until is formed. The total complexity is

The combined complexity for the two phases is In Ta-

ble 2, the complexity of this method is compared against the previous methods.

7 Conclusion

We have shown that some published “fast algebraic attacks” on stream ciphers

underestimate the process complexity of one of the steps, and we provide correct

complexity estimates for these cases. We then show an improved method, us-

ing Fast Fourier Transforms, for substituting keystream bits into the system of

equations needing to be solved. We also made some observations about the linear

combination used in the pre-computation step of the fast algebraic attack. In par-

ticular, we found the fastest known method for performing the pre-computation.

The fast algebraic attack remains an extremely powerful technique for analyzing

LFSR-based stream ciphers.

6

The attacker can “pack” multiple GF(2) polynomials into a single sequence

and thereby compute the convolution of multiple pairs GF(2) polynomials using the

same amount of computation. This reduces complexity by a relatively small factor.

TEAM LinG

406 Philip Hawkes and Gregory G. Rose

Acknowledgements

Acknowledgement is due to Nicolas Courtois for inspirational discussions and fine

French cuisine during which the results in Section 6 came to light. The reviewers

also provided many useful insights; in particular, the two improvements to the

basic DFT-based substitution algorithm.

References

1. F. Armknecht: Improving Fast Algebraic Attacks, to be presented at FSE2004 Fast

Software Encryption Workshop, Delhi, India, February 5-7, 2004.

2. F. Armknecht and M. Krause: Algebraic Attacks on Combiners with Memory,

proceedings of Crypto2003, Lecture Notes in Computer Science, vol. 2729, pp.

162-176, Springer 2003.

3. Bluetooth CIG, Specification of the Bluetooth system, Version 1.1, February 22,

2001. Available from www.bluetooth.com.

4. J. W.Cooley and J. W. Tukey: An algorithm for the machine calculation of complex

Fourier series, Mathematics of Computation, 19, 90, pp. 297-301, 1965.

5. N. Courtois: Fast Algebraic Attacks on Stream Ciphers with Linear Feedback,

Crypto2003, Lecture Notes in Computer Science, vol. 2729, pp. 177-194, Springer

2003.

6. N. Courtois and W. Meier: Algebraic Attacks on Stream Ciphers with Linear Feed-

back, EUROCRYPT 2003, Warsaw, Poland, Lecture Notes in Computer Science,

vol. 2656, pp. 345-359, Springer, 2003.

7. E. Key: An Analysis of the Structure and Complexity of Nonlinear Binary Se-

quence Generators, IEEE Transactions on Information Theory, vol. IT-22, No. 6,

November 1976.

8. J. Massey: Shift Register synthesis and BCH decoding, IEEE Transactions on

Information Theory, IT-15 (1969), pp. 122-127.

9. A. Menezes, P.van Oorschot and A. Vanstone Handbook of Applied Cryptography,

CRC Press series on discrete mathematics and its applications, CRC Press LLC,

1997.

10. M. Mihaljevic and H. Imai: Cryptanalysis of Toyocrypt-HS1 stream cipher, IEICE

Transactions on Fundamentals, vol E85-A, pp. 66-73, January 2002. Available at

www.csl.sony.co.jp/ATL/papers/IEICEjan02.pdf.

11. R. Rueppel: Stream Ciphers; Contemporary Cryptology: The Science of Informa-

tion Integrity. G. Simmons ed., IEEE Press, New York, 1991.

12. L. Simpson, E. Dawson, J. Golic and W. Millan: LILI Keystream Generator; Se-

lected Areas in Cryptography SAC™2000, Lecture Notes in Computer Science, vol.

1807, Springer, pp. 392-407.

13. V. Strassen: Gaussian Elimination is Not Optimal; Numerische Mathematik, vol.

13, pp. 354-356, 1969.

TEAM LinG

Faster Correlation Attack

on Bluetooth Keystream Generator E0

Yi Lu* and Serge Vaudenay

EPFL

http://lasecwww.epfl.ch

Abstract. We study both distinguishing and key-recovery attacks

against E0, the keystream generator used in Bluetooth by means of cor-

relation. First, a powerful computation method of correlations is formu-

lated by a recursive expression, which makes it easier to calculate corre-

lations of the finite state machine output sequences up to 26 bits for E0

and allows us to verify the two known correlations to be the largest for

the first time. Second, we apply the concept of convolution to the analy-

sis of the distinguisher based on all correlations, and propose an efficient

distinguisher due to the linear dependency of the largest correlations.

Last, we propose a novel maximum likelihood decoding algorithm based

on fast Walsh transform to recover the closest codeword for any linear

code of dimension L and length It requires time and

memory This can speed up many attacks such as fast corre-

lation attacks. We apply it to E0, and our best key-recovery attack works

in time given consecutive bits after precomputation. This

is the best known attack against E0 so far.

1 Background

Correlation properties play an important role in the security of nonlinear LFSR-

based combination generators in stream ciphers. As name implies, the word

correlation in stream ciphers is frequently referred to as the intrinsic relation

between the keystream and a subset of the LFSR subsequences. The earliest

studies dated back to [21,25,27] in the 80™s and the concept of correlation im-

munity was proposed as a security criterion. In the 90™s Meier-Staffelbach [22]

analyzed correlation properties of combiners with one memory bit, followed by

[12] focusing on correlation properties of a general combiner with

memory. Recently, a series of fast correlation attacks sprang up, to name but a

few [5“7,16,24]. Thereupon we dedicate this paper to the generalized correlation

attacks against E0, a combiner with 4-bit memory used in the short-range wire-

less technology Bluetooth. Prior to our work, existed various attacks [1,8,10,11,

13“15,17,26] against E0. The best key-recovery attacks are algebraic attacks [1,

Supported in part by the National Competence Center in Research on Mobile Infor-

*

mation and Communication Systems (NCCR-MICS), a center of the Swiss National

Science Foundation under the grant number 5005-67322.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 407“425, 2004.

© International Association for Cryptologic Research 2004

TEAM LinG

408 Yi Lu and Serge Vaudenay

8], whose basic approach is to use the polynomial canceling all memory bits and

involving only key bits, instead of considering the multiple polynomial to cancel

the key bits in the distinguishing attack; besides, [9,13,14] discussed correlations

of E0. In [14], Hermelin-Nyberg for the first time presented a rough computation

method to compute the correlation (called bias for our purpose), but neither did

they formalize the computation systematically, nor did they attempt to find a

larger correlation. In [9,13], two larger correlations for a short sequence of up to

6 bits were exposed. However, due to the limit of the computation method, no

one was certain about the existence of a larger correlation for a longer sequence,

which is critical to the security of E0.

Our first contribution in the paper is that based on Hermelin-Nyberg [14] we

formulate a powerful computation method of correlations by a recursive expres-

sion, which makes it easier to calculate correlations of the Finite State Machine

(FSM) output sequences up to 26 bits for E0 (and allows us to prove the two

known correlations to be the only largest for the first time). Second, we apply

the concept of convolution to the analysis of the distinguisher based on all cor-

relations, which allows us to build an efficient distinguisher that halves the data

complexity of the basic uni-bias-based distinguisher due to the linear depen-

dency of the two largest biases. Our best distinguishing attack takes time

1

given keystream with precomputation . Finally, by means of Fast

Walsh Transform (FWT), we propose a novel Maximum Likelihood Decoding

(MLD) algorithm to recover the closest codeword for any linear code. Our pro-

posed algorithm can be easily applied to speed up a class of fast correlation

attacks. Furthermore the algorithm is optimal when the length of the code

and the dimension L satisfy the relation which is the case when we ap-

ply it to recover for E0. Our best key-recovery attack works in time given

consecutive bits after precomputation. Compared with the minimum

time complexity in algebraic attacks [1,8], this is the best known attack

against E0.

This paper is structured as follows: in Section 2, a description of E0 is given.

In Section 3, we analyze the bias inside E0 systematically. Then based on one

largest bias, we build a primary distinguisher for E0 in Section 4; an efficient

way is shown in Section 5 that makes full use of all the largest biases to advance

the distinguisher. In Section 6 we investigate the MLD algorithm for a linear

code; the result is then applied to a key-recovery attack against E0 in Section 7.

Finally we conclude in Section 8.

2 Description of the Bluetooth Keystream Generator E0

As specified in [3], the keystream generator E0 used in Bluetooth belongs to a

combination generator with four memory bits2, denoted by at

time where The whole system (Fig.1) uses four Linear Feedback

1

Throughout this paper, O(·) is used to provide a rough estimate on complexities, eg.

here means operations, where is a small constant.

2

The description of E0 (sometimes called one-level E0) here only involves the

keystream generation after the initialization.

TEAM LinG

Faster Correlation Attack on Bluetooth Keystream Generator E0 409

Fig. 1. Outline of E0

Shift Registers (LFSRs) denoted by with lengths

and primitive feedback polynomials

respectively. At clock cycle the four LFSRs™ output bits will be

added as integers. The sum is represented in the binary system.

Let denote its least significant bit A 16-state machine (the

dashed box in Fig.1) emits one bit out of its state and takes the

input to update by Finally, the keystream is obtained by xoring

with That is,

The detailed mechanism of the FSM is beyond the scope of the paper except

the fact that the embedded delay cell (the box labeled in Fig.1) makes

depend only on the initial state of the FSM as well as the past vectors

For completeness, we briefly outline it: given together with

the state the FSM moves into the state Table 1 shows the state transi-

tion of the FSM, where the four-bit state is represented in the quaternary system

(e.g. the FSM changes from into by the input

More formally, by introducing two temporary bits each

clock, the following indirect iterative expressions between and suffice

to update

TEAM LinG

410 Yi Lu and Serge Vaudenay

One can check Table 1 by those equations. We denote hereafter the content

of LFSRs at time Then the state of E0 at time is fully represented by the

pair

3 Biases Inside E0

Property 1. Assuming holds for then

Proof. It™s easy to verify that the state transition given (the third bottom

row in Table 1) is indeed a linear transformation over that actually

satisfies the recurrence relation: where states are

represented by column vectors of and A is the following 4 — 4

square matrix over GF(2):

Note that is the minimal polynomial of A, from which we

deduce

Remark 2. Since this seemingly suggests that

As mentioned in [9,13] (without relating to the above special case), this bit

exhibits a much higher bias as shown later in Corollary 7. We will now introduce

essential material in order to find a systematic algorithm to compute biases.

Proposition 3. If is random and uniformly distributed, then for any

is random and uniformly distributed,

is independent of

TEAM LinG

Faster Correlation Attack on Bluetooth Keystream Generator E0 411

Proof (sketch). The former half of the theorem is justified by the fact that

is a permutation of for any About the latter half of theorem, first,

we know that is random and uniformly distributed by previous

conclusion. Thus, are i.i.d. random variables all independent of

both and By Eq.(2,3,4), we complete the proof.

Interestingly, we deduce that if is uniformly distributed, then any se-

quence of 39-bit consecutive E0 keystream is uniformly distributed; in particular,

no better key-recovery attack against E0 exists other than tradeoffs given a se-

quence of 39-bit consecutive keystream.

The following definition is derived from normalized correlation [22, p.71].

Definition 4. The bias of a random Boolean variable X is defined as

The normalized correlation between two random Boolean variables X and Y

is just the bias of Assuming that is the sum of four balanced inde-

pendent random bits and that is uniformly distributed, then we know that

is a constant for any denoted by

Table 2 shows computed by Eq.(2), where dashed entries are zeros. The

following important lemma (see Appendix A for proof) inspired by [14], gives an

easy way of computing the bias for iterative structures.

Lemma 5. Given and let

X and Y be two independent random variables in and respectively.

Assuming that is uniformly distributed in for any

we have

Corollary 6. We set to be a permutation defined

over and where

Assuming is uniformly distributed, for any we have

TEAM LinG

412 Yi Lu and Serge Vaudenay

Proof. By Eq.(2,3,4) we have

Then we apply Lemma 5 with

and we

and

obtain

Note that the assumption of Lemma 5 holds by Proposition 3.

Now we use Corollary 6 iteratively to deduce some important biases of with

Table 2 and the initial values and for A

full list of nonzero triplets is given below for illustration:

Corollary 7. Assuming is random and uniformly distributed, we have

Note that both biases were mentioned in [9,13] (without formal proof). Now by

Corollary 6, we can easily prove it as shown next.

Proof. We show the equivalent of the first bias as follows:

The second bias is similarly proved from

Also, we computed all biases for and found that

are the only largest ones. All biases for are listed in Ta-

ble 14, Appendix C. Throughout the paper we let

TEAM LinG

Faster Correlation Attack on Bluetooth Keystream Generator E0 413

4 A Primary Distinguisher for E0

4.1 The Connection Polynomial of the Equivalent Single LFSR

Let be the order of the connection polynomial of for

Since all are primitive polynomials, furthermore, by Lemma

6.57 of [18, p.218], the equivalent LFSR to generate the same sequence of the sum

of the four original LFSRs outputs over GF(2) has the connection polynomial

with order (by Lemma 6.50, [18, p.214])

and degree

4.2 Finding the Multiple Polynomial with Low Weight

Let be the degree of a general polynomial We use the standard approx-

imation to estimate the minimal weight of multiples of with degree at

most by the following constraint: is the smallest such that

Listed in Table 3 is the estimated3 corresponding to with

by solving Inequality (5).

To find multiples with low weight, efficient algorithms like [4] exist provided

the degree is low, say, less than 2000, which does not apply to E0. So we can use

the conventional birthday paradox to find with the minimal (i.e.

which takes precomputation time or we apply the generalized

birthday problem [29] to find of same weight but higher degree with much

less precomputation as tradeoff. Table 4 compares the two algorithms. In Ap-

pendix B, we also provide some non-optimal multiples as examples, including

with and

4.3 Building a Uni-Bias-Based Distinguisher for E0

Let be the normalized multiple of with degree

and weight where As holds for

all by Eq.(l), we deduce

3

Two special cases occur for and because we know the exact value of

TEAM LinG

414 Yi Lu and Serge Vaudenay

By the Piling-up Lemma [20] and Corollary 7, we know the right-hand side

of Eq.(6) is equal to zero with probability With standard linear

cryptanalysis techniques, we can distinguish the keystream of E0 from a

truly random sequence with samples, simply by checking the left-hand

side of Eq.(6) equals zero most of the time. Based on with and we

minimize the data complexity by choosing Table 5 shows the

minimum is achieved with Table 6 summarizes the

best performance of our primary distinguisher for E0 based on either the use of

with weight 4 in Appendix B, or a search of

5 The Advanced Multi-bias-Based Distinguisher for E0

5.1 Preliminaries

Definition 8. Given for we define

1.

2.

3.

where 1 denotes a constant function equal to 1

4.

TEAM LinG

Faster Correlation Attack on Bluetooth Keystream Generator E0 415

Note that the first two definitions correspond to convolution and Walsh trans-

form respectively. We recall these basic facts: for any we

have

if is a distribution, i.e. and for all then

the distribution of the XOR of i.i.d. random vectors with distribution

is moreover,

If the random Boolean variable A follows the distribution then

where is defined in Definition 4.

5.2 An Efficient Way to Deploy Multi-biases in E0 Simultaneously

Given a linear mapping of rank we define vectors

and Note that can be derived from

the keystream directly. Except for accidentally bad choices of we make

a heuristic assumption that all are independent. Let be the probability

distribution of the vector and let be the probability

distribution of the vector The Walsh transforms of and are linked

by

Now we discuss how to design in order to reduce data complexity. From

Baignères [2, Theorem 3, p.10], we know that we can distinguish a distribution

of random vectors from a uniform distribution with samples.

Here, the distribution of is So the modified distinguisher needs

data complexity

Let be the number of the largest Walsh coefficients over all nonzero

with absolute value4 Since we obtain

In order to lower it™s necessary to have This implies the largest

coefficients are linearly dependent, which happens to be true in E0: recall that

the 6-bit vectors of the three largest biases satisfy the linear relation,

As a simple solution we may just pick

and (where denotes the row of then we obtain

4

Note that from Subsection 4.3 we have for regardless of and

TEAM LinG

416 Yi Lu and Serge Vaudenay

And is reduced to a factor of for negligible Indeed, recall that we

proved by computation that the largest Walsh coefficient for are either

(0,.. .,0,1,1,1,1,1,0,...,0) or (0,...,0,1,0,0,0,0,1,0,...,0). Thus

This leads to a more general solution, if we pick and the

row of as

then we obtain And so the improved factor of data complexity