3152

Lecture Notes in Computer Science

Commenced Publication in 1973

Founding and Former Series Editors:

Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board

David Hutchison

Lancaster University, UK

Takeo Kanade

Carnegie Mellon University, Pittsburgh, PA, USA

Josef Kittler

University of Surrey, Guildford, UK

Jon M. Kleinberg

Cornell University, Ithaca, NY, USA

Friedemann Mattern

ETH Zurich, Switzerland

John C. Mitchell

Stanford University, CA, USA

Moni Naor

Weizmann Institute of Science, Rehovot, Israel

Oscar Nierstrasz

University of Bern, Switzerland

C. Pandu Rangan

Indian Institute of Technology, Madras, India

Bernhard Steffen

University of Dortmund, Germany

Madhu Sudan

Massachusetts Institute of Technology, MA, USA

Demetri Terzopoulos

New York University, NY, USA

Doug Tygar

University of California, Berkeley, CA, USA

Moshe Y. Vardi

Rice University, Houston, TX, USA

Gerhard Weikum

Max-Planck Institute of Computer Science, Saarbruecken, Germany


Matt Franklin (Ed.)

Advances in Cryptology – CRYPTO 2004

24th Annual International Cryptology Conference

Santa Barbara, California, USA, August 15-19, 2004

Proceedings

Springer


eBook ISBN: 3-540-28628-4

Print ISBN: 3-540-22668-0

©2005 Springer Science + Business Media, Inc.

Print ©2004 International Association for Cryptologic Research

All rights reserved

No part of this eBook may be reproduced or transmitted in any form or by any means, electronic,

mechanical, recording, or otherwise, without written consent from the Publisher

Created in the United States of America

Visit Springer's eBookstore at: http://ebooks.springerlink.com

and the Springer Global Website Online at: http://www.springeronline.com


Preface

Crypto 2004, the 24th Annual Crypto Conference, was sponsored by the International Association for Cryptologic Research (IACR) in cooperation with the IEEE Computer Society Technical Committee on Security and Privacy and the Computer Science Department of the University of California at Santa Barbara.

The program committee accepted 33 papers for presentation at the conference. These were selected from a total of 211 submissions. Each paper received at least three independent reviews. The selection process included a Web-based discussion phase, and a one-day program committee meeting at New York University.

These proceedings include updated versions of the 33 accepted papers. The authors had a few weeks to revise them, aided by comments from the reviewers. However, the revisions were not subjected to any editorial review.

The conference program included two invited lectures. Victor Shoup's invited talk was a survey on chosen ciphertext security in public-key encryption. Susan Landau's invited talk was entitled “Security, Liberty, and Electronic Communications”. Her extended abstract is included in these proceedings.

We continued the tradition of a Rump Session, chaired by Stuart Haber. Those presentations (always short, often serious) are not included here.

I would like to thank everyone who contributed to the success of this conference. First and foremost, the global cryptographic community submitted their scientific work for our consideration. The members of the Program Committee worked hard throughout, and did an excellent job. Many external reviewers contributed their time and expertise to aid our decision-making. James Hughes, the General Chair, was supportive in a number of ways. Dan Boneh and Victor Shoup gave valuable advice. Yevgeniy Dodis hosted the PC meeting at NYU.

It would have been hard to manage this task without the Web-based submission server (developed by Chanathip Namprempre, under the guidance of Mihir Bellare) and review server (developed by Wim Moreau and Joris Claessens, under the guidance of Bart Preneel). Terri Knight kept these servers running smoothly, and helped with the preparation of these proceedings.

Matt Franklin

June 2004


CRYPTO 2004

August 15–19, 2004, Santa Barbara, California, USA

Sponsored by the

International Association for Cryptologic Research (IACR)

in cooperation with

IEEE Computer Society Technical Committee on Security and Privacy,

Computer Science Department, University of California, Santa Barbara

General Chair

James Hughes, StorageTek

Program Chair

Matt Franklin, U.C. Davis, USA

Program Committee

Bill Aiello AT&T Labs, USA

Jee Hea An SoftMax, USA

Eli Biham Technion, Israel

John Black University of Colorado at Boulder, USA

Anne Canteaut INRIA, France

Ronald Cramer University of Aarhus, Denmark

Yevgeniy Dodis New York University, USA

Yuval Ishai Technion, Israel

Lars Knudsen Technical University of Denmark, Denmark

Hugo Krawczyk Technion/IBM, Israel/USA

Pil Joong Lee POSTECH/KT, Korea

Phil MacKenzie Bell Labs, USA

Tal Malkin Columbia University, USA

Willi Meier Fachhochschule Aargau, Switzerland

Daniele Micciancio U.C. San Diego, USA

Ilya Mironov Microsoft Research, USA

Tatsuaki Okamoto NTT, Japan

Rafail Ostrovsky U.C.L.A., USA

Torben Pedersen Cryptomathic, Denmark

Benny Pinkas HP Labs, USA

Bart Preneel Katholieke Universiteit Leuven, Belgium

Alice Silverberg Ohio State University, USA

Nigel Smart Bristol University, UK

David Wagner U.C. Berkeley, USA

Stefan Wolf University of Montreal, Canada


Advisory Members

Dan Boneh (Crypto 2003 Program Chair) Stanford University, USA

Victor Shoup (Crypto 2005 Program Chair) New York University, USA

External Reviewers

Masayuki Abe Marine Minier

Pierrick Gaudry

Siddhartha Annapuredy Bodo Moeller

Rosario Gennaro

Frederik Armknecht Håvard Molland

Craig Gentry

Daniel Augot Shafi Goldwasser David Molnar

Boaz Barak Jovan Golic Tal Mor

Elad Barkan Rob Granger Sara Miner More

Amos Beimel Jens Groth François Morain

Mihir Bellare Stuart Haber Waka Nagao

Shai Halevi

Daniel Bleichenbacher Phong Nguyen

Dan Boneh Helena Handschuh Antonio Nicolosi

Carl Bosley Danny Harnik Jesper Nielsen

Ernie Brickell Johan Håstad Miyako Ohkubo

Ran Canetti Alejandro Hevia Kazuo Ohta

Jung Hee Cheon Jim Hughes Roberto Oliveira

Don Coppersmith Yong Ho Hwang Seong-Hun Paeng

Jean-Sébastien Coron Oleg Izmerly Dan Page

Nicolas Courtois Markus Jakobsson Dong Jin Park

Christophe De Cannière Stanislaw Jarecki Jae Hwan Park

Anand Desai Rob Johnson Joonhah Park

Yael Tauman Kalai Matthew Parker

Simon-Pierre Desrosiers

Irit Dinur Jonathan Katz Rafael Pass

Mario di Raimondo Dan Kenigsberg Kenny Paterson

Orr Dunkelman Dmitriy Kharchenko Erez Petrank

Aggelos Kiayias David Pointcheval

Glenn Durfee

Prashant Puniya

Iwan Duursma Eike Kiltz

Kihyun Kim Tal Rabin

Stefan Dziembowski

Andreas Enge Haavard Raddum

Ted Krovetz

Nelly Fazio Zulfikar Ramzan

Klaus Kursawe

Eyal Kushilevitz

Serge Fehr Oded Regev

Joseph Lano

Marc Fischlin Omer Reingold

In-Sok Lee

Matthias Fitzi Renato Renner

Arjen Lenstra

Caroline Fontaine Leonid Reyzin

Yehuda Lindell

Michael J. Freedman Vincent Rijmen

Hoi-Kwong Lo Phillip Rogaway

Atsushi Fujioka

Pankaj Rohatgi

Pierre Loidreau

Eiichiro Fujisaki

Anna Lysyanskaya Adi Rosen

Martin Gagne

Karl Rubin

Steven Galbraith John Malone-Lee

Dominic Mayers Alex Russell

Juan Garay


Amit Sahai Martijn Stam Luis von Ahn

Gorm Salomonsen Jacques Stern Jason Waddle

Louis Salvail Douglas Stinson Shabsi Walfish

Tomas Sander Koutarou Suzuki Andreas Winter

Hovav Shacham Keisuke Tanaka Christopher Wolf

Ronen Shaltiel Edlyn Teske Juerg Wullschleger

Jonghoon Shin Christian Tobias Go Yamamoto

Victor Shoup Yuuki Tokunaga Yeon Hyeong Yang

Thomas Shrimpton Vinod Vaikuntanathan Sung Ho Yoo

Berit Skjernaa Brigitte Vallee Young Tae Youn

Adam Smith R. Venkatesan Dae Hyun Yum

Jerome A. Solinas Frederik Vercauteren Moti Yung

Jessica Staddon Felipe Voloch


Table of Contents

Linear Cryptanalysis

    On Multiple Linear Approximations
    Alex Biryukov, Christophe De Cannière, and Michaël Quisquater

    Feistel Schemes and Bi-linear Cryptanalysis
    Nicolas T. Courtois

Group Signatures

    Short Group Signatures
    Dan Boneh, Xavier Boyen, and Hovav Shacham

    Signature Schemes and Anonymous Credentials from Bilinear Maps
    Jan Camenisch and Anna Lysyanskaya

Foundations

    Complete Classification of Bilinear Hard-Core Functions
    Thomas Holenstein, Ueli Maurer, and Johan Sjödin

    Finding Collisions on a Public Road, or Do Secure Hash Functions Need Secret Coins?
    Chun-Yuan Hsiao and Leonid Reyzin

    Security of Random Feistel Schemes with 5 or More Rounds
    Jacques Patarin

Efficient Representations

    Signed Binary Representations Revisited
    Katsuyuki Okeya, Katja Schmidt-Samoa, Christian Spahn, and Tsuyoshi Takagi

    Compressed Pairings
    Michael Scott and Paulo S.L.M. Barreto

    Asymptotically Optimal Communication for Torus-Based Cryptography
    Marten van Dijk and David Woodruff

    How to Compress Rabin Ciphertexts and Signatures (and More)
    Craig Gentry


Public Key Cryptanalysis

    On the Bounded Sum-of-Digits Discrete Logarithm Problem in Finite Fields
    Qi Cheng

    Computing the RSA Secret Key Is Deterministic Polynomial Time Equivalent to Factoring
    Alexander May

Zero-Knowledge

    Multi-trapdoor Commitments and Their Applications to Proofs of Knowledge Secure Under Concurrent Man-in-the-Middle Attacks
    Rosario Gennaro

    Constant-Round Resettable Zero Knowledge with Concurrent Soundness in the Bare Public-Key Model
    Giovanni Di Crescenzo, Giuseppe Persiano, and Ivan Visconti

    Zero-Knowledge Proofs and String Commitments Withstanding Quantum Attacks
    Ivan Damgård, Serge Fehr, and Louis Salvail

    The Knowledge-of-Exponent Assumptions and 3-Round Zero-Knowledge Protocols
    Mihir Bellare and Adriana Palacio

Hash Collisions

    Near-Collisions of SHA-0
    Eli Biham and Rafi Chen

    Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions
    Antoine Joux

Secure Computation

    Adaptively Secure Feldman VSS and Applications to Universally-Composable Threshold Cryptography
    Masayuki Abe and Serge Fehr

    Round-Optimal Secure Two-Party Computation
    Jonathan Katz and Rafail Ostrovsky

Invited Talk

    Security, Liberty, and Electronic Communications
    Susan Landau


Stream Cipher Cryptanalysis

    An Improved Correlation Attack Against Irregular Clocked and Filtered Keystream Generators
    Håvard Molland and Tor Helleseth

    Rewriting Variables: The Complexity of Fast Algebraic Attacks on Stream Ciphers
    Philip Hawkes and Gregory G. Rose

    Faster Correlation Attack on Bluetooth Keystream Generator E0
    Yi Lu and Serge Vaudenay

Public Key Encryption

    A New Paradigm of Hybrid Encryption Scheme
    Kaoru Kurosawa and Yvo Desmedt

    Secure Identity Based Encryption Without Random Oracles
    Dan Boneh and Xavier Boyen

Bounded Storage Model

    Non-interactive Timestamping in the Bounded Storage Model
    Tal Moran, Ronen Shaltiel, and Amnon Ta-Shma

Key Management

    IPAKE: Isomorphisms for Password-Based Authenticated Key Exchange
    Dario Catalano, David Pointcheval, and Thomas Pornin

    Randomness Extraction and Key Derivation Using the CBC, Cascade and HMAC Modes
    Yevgeniy Dodis, Rosario Gennaro, Johan Håstad, Hugo Krawczyk, and Tal Rabin

    Efficient Tree-Based Revocation in Groups of Low-State Devices
    Michael T. Goodrich, Jonathan Z. Sun, and Roberto Tamassia

Computationally Unbounded Adversaries

    Privacy-Preserving Datamining on Vertically Partitioned Databases
    Cynthia Dwork and Kobbi Nissim

    Optimal Perfectly Secure Message Transmission
    K. Srinathan, Arvind Narayanan, and C. Pandu Rangan

    Pseudo-signatures, Broadcast, and Multi-party Computation from Correlated Randomness
    Matthias Fitzi, Stefan Wolf, and Jürg Wullschleger

Author Index


On Multiple Linear Approximations*

Alex Biryukov**, Christophe De Cannière***, and Michaël Quisquater***

Katholieke Universiteit Leuven, Dept. ESAT/SCD-COSIC,

Kasteelpark Arenberg 10,

B–3001 Leuven-Heverlee, Belgium

{abiryuko, cdecanni, mquisqua}@esat.kuleuven.ac.be

Abstract. In this paper we study the long standing problem of information extraction from multiple linear approximations. We develop a formal statistical framework for block cipher attacks based on this technique and derive explicit and compact gain formulas for generalized versions of Matsui's Algorithm 1 and Algorithm 2. The theoretical framework allows both approaches to be treated in a unified way, and predicts significantly improved attack complexities compared to current linear attacks using a single approximation. In order to substantiate the theoretical claims, we benchmarked the attacks against reduced-round versions of DES and observed a clear reduction of the data and time complexities, in almost perfect correspondence with the predictions. The complexities are reduced by several orders of magnitude for Algorithm 1, and the significant improvement in the case of Algorithm 2 suggests that this approach may outperform the currently best attacks on the full DES algorithm.

Keywords: Linear cryptanalysis, multiple linear approximations,

stochastic systems of linear equations, maximum likelihood decoding,

key-ranking, DES, AES.

1 Introduction

Linear cryptanalysis [8] is one of the most powerful attacks against modern cryptosystems. In 1994, Kaliski and Robshaw [5] proposed the idea of generalizing this attack using multiple linear approximations (the previous approach considered only the best linear approximation). However, their technique was mostly limited to cases where all approximations derive the same parity bit of the key. Unfortunately, this approach imposes a very strong restriction on the approximations, and the additional information gained by the few surviving approximations is often negligible.

In this paper we start by developing a theoretical framework for dealing with multiple linear approximations. We first generalize Matsui's Algorithm 1 based on this framework, and then reuse these results to generalize Matsui's Algorithm 2. Our approach allows us to derive compact expressions for the performance of the attacks in terms of the biases of the approximations and the amount of data available to the attacker. The contribution of these theoretical expressions is twofold. Not only do they clearly demonstrate that the use of multiple approximations can significantly improve classical linear attacks, they also shed new light on the relations between Algorithm 1 and Algorithm 2.

* This work was supported in part by the Concerted Research Action (GOA) Mefisto-2000/06 of the Flemish Government.

** F.W.O. Researcher, Fund for Scientific Research – Flanders (Belgium).

*** F.W.O. Research Assistant, Fund for Scientific Research – Flanders (Belgium).

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 1–22, 2004.

© International Association for Cryptologic Research 2004

The main purpose of this paper is to provide a new generally applicable cryptanalytic tool, which performs strictly better than standard linear cryptanalysis. In order to illustrate the potential of this new approach, we implemented two attacks against reduced-round versions of DES, using this cipher as a well established benchmark for linear cryptanalysis. The experimental results, discussed in the second part of this paper, are in almost perfect correspondence with our theoretical predictions and show that the latter are well justified.

This paper is organized as follows: Sect. 2 describes a very general maximum likelihood framework, which we will use in the rest of the paper; in Sect. 3 this framework is applied to derive and analyze an optimal attack algorithm based on multiple linear approximations. In the last part of this section, we provide a more detailed theoretical analysis of the assumptions made in order to derive the performance expressions. Sect. 4 presents experimental results on DES as an example. Finally, Sect. 5 discusses possible further improvements and open questions. A more detailed discussion of the practical aspects of the attacks and an overview of previous work can be found in the appendices.

2 General Framework

In this section we discuss the main principles of statistical cryptanalysis and set up a generalized framework for analyzing block ciphers based on maximum likelihood. This framework can be seen as an adaptation or extension of earlier frameworks for statistical attacks proposed by Murphy et al. [11], Junod and Vaudenay [3,4,14] and Selçuk [12].

2.1 Attack Model

We consider a block cipher $E_k$ which maps a plaintext $P \in \mathcal{P}$ to a ciphertext $C = E_k(P) \in \mathcal{C}$. The mapping is invertible and depends on a secret key $k \in \mathcal{K}$. We now assume that an adversary is given N different plaintext–ciphertext pairs $(P_i, C_i)$ encrypted with a particular secret key $k^*$ (a known plaintext scenario), and his task is to recover the key from this data. A general statistical approach, also followed by Matsui's original linear cryptanalysis, consists in performing the following three steps:

Distillation phase. In a typical statistical attack, only a fraction of the information contained in the N plaintext–ciphertext pairs is exploited. A first step therefore consists in extracting the relevant parts of the data, and discarding all information which is not used by the attack. In our framework, the distillation operation is denoted by a function $h : \mathcal{P} \times \mathcal{C} \to \mathcal{X}$ which is applied to each plaintext–ciphertext pair. The result is a vector $x = (x_1, \dots, x_N)$ with $x_i = h(P_i, C_i)$, which contains all relevant information. If $N \gg |\mathcal{X}|$, which is usually the case, we can further reduce the data by counting the occurrence of each element of $\mathcal{X}$ and only storing a vector of counters $t = (t_1, \dots, t_{|\mathcal{X}|})$. In this paper we will not restrict ourselves to a single function $h$, but consider $m$ separate functions $h_j$, each of which maps the text pairs into different sets $\mathcal{X}_j$ and generates a separate vector of counters $\mathbf{t}_j$.

Analysis phase. This phase is the core of the attack and consists in generating a list of key candidates from the information extracted in the previous step. Usually, candidates can only be determined up to a set of equivalent keys, i.e., typically, a majority of the key bits is transparent to the attack. In general, the attack defines a function $g : \mathcal{K} \to \mathcal{Z}$ which maps each key $k$ onto an equivalent key class $z = g(k)$. The purpose of the analysis phase is to determine which of these classes are the most likely to contain the true key $k^*$ given the particular values of the counters $\mathbf{t}_j$.

Search phase. In the last stage of the attack, the attacker exhaustively tries all keys in the classes suggested by the previous step, until the correct key is found. Note that the analysis and the searching phase may be intermixed: the attacker might first generate a short list of candidates, try them out, and then dynamically extend the list as long as none of the candidates turns out to be correct.

2.2 Attack Complexities

When evaluating the performance of the general attack described above, we need to consider both the data complexity and the computational complexity. The data complexity is directly determined by N, the number of plaintext–ciphertext pairs required by the attack. The computational complexity depends on the total number of operations performed in the three phases of the attack. In order to compare different types of attacks, we define a measure called the gain of the attack:

Definition 1 (Gain). If an attack is used to recover an $n$-bit key and is expected to return the correct key after having checked on the average M candidates, then the gain of the attack, expressed in bits, is defined as:

$$\gamma = -\log_2 \frac{2 \cdot M - 1}{2^n}. \qquad (1)$$

Let us illustrate this with an example where an attacker wants to recover an $n$-bit key. If he does an exhaustive search, the number of trials before hitting the correct key can be anywhere from 1 to $2^n$. The average number M is $(2^n + 1)/2$, and the gain according to the definition is 0. On the other hand, if the attack immediately derives the correct candidate, M equals 1 and the gain is $\gamma = n$. There is an important caveat, however. Let us consider two attacks which both require a single plaintext–ciphertext pair. The first deterministically recovers one bit of the key, while the second recovers the complete key, but with a probability of 1/2. In this second attack, if the key is wrong and only one plaintext–ciphertext pair is available, the attacker is forced to perform an exhaustive search. According to the definition, both attacks have a gain of 1 bit in this case. Of course, by repeating the second attack for different pairs, the gain can be made arbitrarily close to $n$ bits, while this is not the case for the first attack.
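The numbers quoted in this example can be checked with a few lines of Python. The sketch assumes that the gain of Definition 1 has the form $-\log_2((2M - 1)/2^n)$, which reproduces the two values stated in the text (0 for exhaustive search, $n$ for immediate recovery). The function name `gain_bits`, the 16-bit key length, and the cost model used for the second attack (one wasted trial followed by an exhaustive search over the remaining keys) are illustrative assumptions.

```python
import math

def gain_bits(n_key_bits: int, avg_trials: float) -> float:
    """Gain in bits, assuming gain = -log2((2M - 1) / 2^n)."""
    return n_key_bits - math.log2(2.0 * avg_trials - 1.0)

n = 16  # illustrative key length

# Exhaustive search: between 1 and 2^n trials, (2^n + 1) / 2 on average.
m_exhaustive = (2**n + 1) / 2

# Attack 1: one key bit is known, the remaining n - 1 bits are searched.
m_one_bit = (2**(n - 1) + 1) / 2

# Attack 2: the full key is returned, but it is correct only half of the time.
# A wrong candidate costs one wasted trial, after which the remaining
# 2^n - 1 keys are searched exhaustively (assumed cost model).
m_coin_flip = 0.5 * 1 + 0.5 * (1 + (2**n - 1 + 1) / 2)

gains = {
    "exhaustive": gain_bits(n, m_exhaustive),
    "immediate": gain_bits(n, 1),
    "one_bit": gain_bits(n, m_one_bit),
    "coin_flip": gain_bits(n, m_coin_flip),
}
```

Under these assumptions the first attack has a gain of exactly 1 bit and the second a gain marginally below 1 bit, which matches the statement that both have a gain of 1 bit.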

2.3 Maximum Likelihood Approach

The design of a statistical attack consists of two important parts. First, we need to decide on how to process the N plaintext–ciphertext pairs in the distillation phase. We want the counters to be constructed in such a way that they concentrate as much information as possible about a specific part of the secret key in a minimal amount of data. Once this decision has been made, we can proceed to the next stage and try to design an algorithm which efficiently transforms this information into a list of key candidates. In this section, we discuss a general technique to optimize this second step. Notice that throughout this paper, we will denote random variables by capital letters.

In order to minimize the number of trials in the search phase, we want the candidate classes which have the largest probability of being correct to be tried first. If we consider the correct key class as a random variable Z and denote the complete set of counters extracted from the observed data by t, then the ideal output of the analysis phase would consist of a list of classes $z$ sorted according to the conditional probability $\Pr[Z = z \mid t]$. Taking the Bayesian approach, we express this probability as follows:

$$\Pr[Z = z \mid t] = \frac{\Pr[T = t \mid z] \cdot \Pr[Z = z]}{\Pr[T = t]}. \qquad (2)$$

The factor $\Pr[Z = z]$ denotes the a priori probability that the class $z$ contains the correct key $k^*$, and is equal to the constant $1/|\mathcal{Z}|$, with $|\mathcal{Z}|$ the total number of classes, provided that the key was chosen at random. The denominator is determined by the probability that the specific set of counters t is observed, taken over all possible keys and plaintexts. The only expression in (2) that depends on $z$, and thus affects the sorting, is the factor $\Pr[T = t \mid z]$, compactly written as $P_z(t)$. This quantity denotes the probability, taken over all possible plaintexts, that a key from a given class $z$ produces a set of counters t. When viewed as a function of $z$ for a fixed set t, the expression $\Pr[T = t \mid z]$ is also called the likelihood of $z$ given t, and denoted by $L_t(z)$, i.e.,

$$L_t(z) = P_z(t) = \Pr[T = t \mid z].$$

This likelihood and the actual probability $\Pr[Z = z \mid t]$ have distinct values, but they are proportional for a fixed t, as follows from (2). Typically, the likelihood expression is simplified by applying a logarithmic transformation. The result is denoted by

$$\mathcal{L}_t(z) = \log L_t(z)$$

and called the log-likelihood. Note that this transformation does not affect the sorting, since the logarithm is a monotonically increasing function.

Assuming that we can construct an efficient algorithm that accurately estimates the likelihood of the key classes and returns a list sorted accordingly, we are now ready to derive a general expression for the gain of the attack.

Let us assume that the plaintexts are encrypted with an $n$-bit secret key $k^*$, contained in the equivalence class $z^*$, and let $\mathcal{Z} \setminus \{z^*\}$ be the set of classes different from $z^*$. The average number of wrong classes checked during the searching phase before the correct key is found is given by the expression

$$\sum_{z \in \mathcal{Z} \setminus \{z^*\}} \Pr\left[L_T(z) \ge L_T(z^*) \mid z^*\right],$$

where the random variable T represents the set of counters generated by a key from the class $z^*$, given N random plaintexts. Note that the search also visits the correct key class itself, but since this class will be treated differently later on, we do not include it in the sum. In order to compute the probabilities in this expression, we define the sets $\mathcal{T}_z = \{t \mid L_t(z) \ge L_t(z^*)\}$. Using this notation, we can write

$$\Pr\left[L_T(z) \ge L_T(z^*) \mid z^*\right] = \sum_{t \in \mathcal{T}_z} P_{z^*}(t).$$

Knowing that each class $z$ contains $2^n/|\mathcal{Z}|$ different keys, we can now derive the expected number of trials M*, given a secret key $k^*$. Note that the number of keys that need to be checked in the correct equivalence class $z^*$ is only $(2^n/|\mathcal{Z}| + 1)/2$ on the average, yielding

$$M^* = \frac{2^n}{|\mathcal{Z}|} \cdot \left[\frac{1}{2} + \sum_{z \in \mathcal{Z} \setminus \{z^*\}} \sum_{t \in \mathcal{T}_z} P_{z^*}(t)\right] + \frac{1}{2}. \qquad (3)$$

This expression needs to be averaged over all possible secret keys $k^*$ in order to find the expected value M, but in many cases¹ we will find that M* does not depend on the actual value of $k^*$, such that M = M*. Finally, the gain of the attack is computed by substituting this value of M into (1).
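As a sanity check on the expression for M*, the following Python sketch evaluates it in two extreme cases. It assumes that the expected number of trials has the form $M^* = \frac{2^n}{|\mathcal{Z}|}\left(\frac{1}{2} + \sum_z p_z\right) + \frac{1}{2}$, where $p_z$ is the probability that a wrong class $z$ is ranked ahead of the correct one, and that the gain is $-\log_2((2M - 1)/2^n)$. The names `expected_trials` and `p_outrank` and the chosen sizes are hypothetical.

```python
import math

def expected_trials(n_key_bits, num_classes, p_outrank):
    """Assumed form of M*: full cost for each wrong class ranked first,
    plus half of the correct class on average."""
    keys_per_class = 2**n_key_bits / num_classes
    return keys_per_class * (0.5 + sum(p_outrank)) + 0.5

def gain_bits(n_key_bits, avg_trials):
    """Assumed form of the gain: -log2((2M - 1) / 2^n)."""
    return n_key_bits - math.log2(2.0 * avg_trials - 1.0)

n, num_classes = 20, 2**8  # illustrative sizes

# Uninformative counters: each wrong class beats the correct one half the time.
m_blind = expected_trials(n, num_classes, [0.5] * (num_classes - 1))

# Perfect ranking: no wrong class is ever ranked ahead of the correct one.
m_perfect = expected_trials(n, num_classes, [0.0] * (num_classes - 1))

gain_blind = gain_bits(n, m_blind)
gain_perfect = gain_bits(n, m_perfect)
```

With uninformative counters the expression collapses to the exhaustive-search average $(2^n + 1)/2$ and the gain is 0; with a perfect ranking the gain equals $\log_2 |\mathcal{Z}|$, the number of key bits the classes can distinguish.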

3 Application to Multiple Approximations

In this section, we apply the ideas discussed above to construct a general framework for analyzing block ciphers using multiple linear approximations.

¹ In some cases the variance of the gain over different keys would be very significant. In these cases it might be worthwhile to exploit this phenomenon in a weak-key attack scenario, like in the case of the IDEA cipher.


The starting point in linear cryptanalysis is the existence of unbalanced linear expressions involving plaintext bits, ciphertext bits, and key bits. In this paper we assume that we can use $m$ such expressions (a method to find them is presented in an extended version of this paper [1]):

$$\Pr\left[P[\chi_P^j] \oplus C[\chi_C^j] \oplus K[\chi_K^j] = 0\right] = \frac{1}{2} + \epsilon_j, \quad 1 \le j \le m, \qquad (4)$$

with (P, C) a random plaintext–ciphertext pair encrypted with a random key K. The notation $X[\chi]$ stands for $X_{l_1} \oplus X_{l_2} \oplus \dots$, where $X_{l_1}, X_{l_2}, \dots$ represent particular bits of X. The deviation $\epsilon_j$ is called the bias of the linear expression.

We now use the framework of Sect. 2.1 to design an attack which exploits the information contained in (4). The first phase of the cryptanalysis consists in extracting the relevant parts from the N plaintext–ciphertext pairs. The linear expressions in (4) immediately suggest the following functions $h_j$:

$$h_j(P, C) = P[\chi_P^j] \oplus C[\chi_C^j], \quad 1 \le j \le m,$$

with $h_j(P, C) \in \{0, 1\}$. These values are then used to construct $m$ counter vectors $\mathbf{t}_j = (t_j, N - t_j)$, where $t_j$ and $N - t_j$ reflect the number of plaintext–ciphertext pairs for which $h_j(P, C)$ equals 0 and 1, respectively².

In the second step of the framework, a list of candidate key classes needs to be generated. We represent the equivalent key classes induced by the $m$ linear expressions in (4) by an $m$-bit word $z = (z_1, \dots, z_m)$ with $z_j = k[\chi_K^j]$. Note that $m$ might possibly be much larger than the length $n$ of the key $k$. In this case, only a subspace of all possible $m$-bit words corresponds to a valid key class. The exact number of classes $|\mathcal{Z}|$ depends on the number of independent linear approximations (i.e., the rank of the corresponding linear system).
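The distillation step for linear approximations can be sketched in Python as follows. The bit masks, the helper names, and the toy cipher (a plain XOR with the key, chosen only because its linear approximations hold with certainty) are hypothetical. The normalisation $\hat{c}_j = 2 t_j / N - 1$, which turns the counter of zeros $t_j$ into an estimated imbalance, is an assumed convention.

```python
def parity(x: int) -> int:
    """XOR of all bits of x."""
    return bin(x).count("1") & 1

def distill(pairs, masks):
    """For each (plaintext_mask, ciphertext_mask), count how often
    P[mask_p] xor C[mask_c] equals 0 and return the estimated imbalances."""
    zeros = [0] * len(masks)
    for p, c in pairs:
        for j, (mask_p, mask_c) in enumerate(masks):
            if parity(p & mask_p) ^ parity(c & mask_c) == 0:
                zeros[j] += 1
    n = len(pairs)
    return [2 * t / n - 1 for t in zeros]

# Toy 8-bit "cipher": C = P xor k, so P[mask] xor C[mask] = k[mask] always.
key = 0b10110001
pairs = [(p, p ^ key) for p in range(256)]
masks = [(0x01, 0x01), (0x02, 0x02), (0x30, 0x30)]
c_hat = distill(pairs, masks)

# With imbalances of magnitude 1, the sign of c_hat[j] reveals the key
# parity bit z_j directly: +1 means parity 0, -1 means parity 1.
z = [0 if c > 0 else 1 for c in c_hat]
```

For a real cipher the imbalances are tiny and the signs of the estimates are noisy, which is why the key classes have to be ranked statistically rather than read off directly.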

3.1 Computing the Likelihoods of the Key Classes

We will for now assume that the linear expressions in (4) are statistically independent for different plaintext–ciphertext pairs and for different values of $j$ (in the next section we will discuss this important point in more detail). This allows us to apply the maximum likelihood approach described earlier in a very straightforward way. In order to simplify notations, we define the probabilities $p_j$ and $q_j$, and the imbalances³ $c_j$ of the linear expressions as

$$p_j = 1 - q_j = \frac{1 + c_j}{2} = \frac{1}{2} + \epsilon_j.$$

We start by deriving a convenient expression for the probability $P_z(t)$. To simplify the calculation, we first give a derivation for the special key class $z = (0, \dots, 0)$.

² The vectors $\mathbf{t}_j$ are only constructed to be consistent with the framework described earlier. In practice of course, the attacker will only calculate $t_j$ (this is a minimal sufficient statistic).

³ Also known in the literature as “correlations”.


Fig. 1. Geometrical interpretation for $m = 2$. The correct key class $z^*$ has the second largest likelihood in this example. The numbers in the picture represent the number of trials M* when $\hat{c}$ falls in the associated area.

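Assuming independence of different approximations and of different $(P_i, C_i)$ pairs, the probability that this key generates the counters $t_j$ is given by the product

$$P_z(t) = \prod_{j=1}^{m} \binom{N}{t_j} \cdot p_j^{t_j} \cdot q_j^{N - t_j}.$$

In practice, $p_j$ and $q_j$ will be very close to 1/2, and N very large. Taking this into account, we approximate the $m$-dimensional binomial distribution above by an $m$-dimensional Gaussian distribution:

$$P_z(t) \approx \prod_{j=1}^{m} \frac{e^{-\frac{N}{2} (\hat{c}_j - c_j)^2}}{\sqrt{\pi N / 2}} = \frac{e^{-\frac{N}{2} \sum_{j=1}^{m} (\hat{c}_j - c_j)^2}}{\left(\sqrt{\pi N / 2}\right)^m}.$$

The variable $\hat{c}_j$ is called the estimated imbalance and is derived from the counters $t_j$ according to the relation $(1 + \hat{c}_j)/2 = t_j/N$. For any key class $z$ we can repeat the reasoning above, yielding the following general expression:

$$P_z(t) \approx \frac{e^{-\frac{N}{2} \sum_{j=1}^{m} \left(\hat{c}_j - (-1)^{z_j} c_j\right)^2}}{\left(\sqrt{\pi N / 2}\right)^m}. \qquad (6)$$

This formula has a useful geometrical interpretation: if we take a key from a fixed key class $z^*$ and construct an $m$-dimensional vector $\hat{c} = (\hat{c}_1, \dots, \hat{c}_m)$ by encrypting N random plaintexts, then $\hat{c}$ will be distributed around the vector $c_{z^*} = ((-1)^{z_1^*} c_1, \dots, (-1)^{z_m^*} c_m)$ according to a Gaussian distribution with a diagonal variance-covariance matrix $\frac{1}{N} \cdot I_m$ (a standard deviation of $1/\sqrt{N}$ in each coordinate), where $I_m$ is an $m \times m$ identity matrix. This is illustrated in Fig. 1. From (6) we can now directly compute the log-likelihood:

$$\mathcal{L}_t(z) = \log L_t(z) = \log P_z(t) \approx C - \frac{N}{2} \sum_{j=1}^{m} \left(\hat{c}_j - (-1)^{z_j} c_j\right)^2.$$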


The constant C depends on $m$ and N only, and is irrelevant to the attack. From this formula we immediately derive the following property.

Lemma 1. The relative likelihood of a key class $z$ is completely determined by the Euclidean distance $|\hat{c} - c_z|$, where $\hat{c}$ is an $m$-dimensional vector containing the estimated imbalances derived from the known texts, and $c_z = ((-1)^{z_1} c_1, \dots, (-1)^{z_m} c_m)$.

The lemma implies that $L_t(z) > L_t(z^*)$ if and only if $|\hat{c} - c_z| < |\hat{c} - c_{z^*}|$. This type of result is common in coding theory.
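The ranking implied by Lemma 1 can be sketched in Python as a nearest-point search. The sketch assumes that every $m$-bit word is a valid key class (the paper notes that in general only a subspace is) and that the expected imbalance vector of class $z$ is $((-1)^{z_1} c_1, \dots, (-1)^{z_m} c_m)$. The function name `rank_classes` and the numerical values are illustrative only.

```python
import itertools

def rank_classes(c_hat, c):
    """Return all m-bit key classes, most likely first, by sorting on the
    squared Euclidean distance between c_hat and the expected vector c_z."""
    def dist2(z):
        return sum((ch - (-1) ** zj * cj) ** 2
                   for ch, zj, cj in zip(c_hat, z, c))
    return sorted(itertools.product((0, 1), repeat=len(c)), key=dist2)

c = [0.10, 0.05, 0.20]          # imbalances of the approximations (made up)
c_hat = [0.08, -0.06, -0.15]    # estimated imbalances from data (made up)

ranking = rank_classes(c_hat, c)
best, worst = ranking[0], ranking[-1]
```

Enumerating all $2^m$ words is only feasible for small $m$; the point of the sketch is that the maximum likelihood ordering coincides with the ordering by distance.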

3.2 Estimating the Gain of the Attack

Based on the geometrical interpretation given above, and using the results from

Sect. 2.3, we can now easily derive the gain of the attack.

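Theorem 1. Given $m$ approximations and N independent pairs $(P_i, C_i)$, an adversary can mount a linear attack with a gain equal to:

$$\gamma = -\log_2 \left[ 2 \cdot \frac{1}{|\mathcal{Z}|} \sum_{z \in \mathcal{Z} \setminus \{z^*\}} \Phi\left(-\sqrt{N} \cdot \frac{|c_z - c_{z^*}|}{2}\right) + \frac{1}{|\mathcal{Z}|} \right],$$

where $\Phi(\cdot)$ is the cumulative normal distribution function, $c_z = ((-1)^{z_1} c_1, \dots, (-1)^{z_m} c_m)$, and $|\mathcal{Z}|$ is the number of key classes induced by the approximations.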

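Proof. The probability that the likelihood of a key class $z$ exceeds the likelihood of the correct key class $z^*$ is given by the probability that the vector $\hat{c}$ falls into the half plane $\mathcal{H}_z = \{c \mid |c - c_z| \le |c - c_{z^*}|\}$. Considering the fact that $\hat{c}$ describes a Gaussian distribution around $c_{z^*}$ with a variance-covariance matrix $\frac{1}{N} \cdot I_m$, we need to integrate this Gaussian over the half plane $\mathcal{H}_z$ and due to the zero covariances, we immediately find:

$$\Pr\left[L_T(z) \ge L_T(z^*) \mid z^*\right] = \Phi\left(-\sqrt{N} \cdot \frac{|c_z - c_{z^*}|}{2}\right).$$

By summing these probabilities as in (3) we find the expected number of trials:

$$M^* = \frac{2^n}{|\mathcal{Z}|} \cdot \left[\frac{1}{2} + \sum_{z \in \mathcal{Z} \setminus \{z^*\}} \Phi\left(-\sqrt{N} \cdot \frac{|c_z - c_{z^*}|}{2}\right)\right] + \frac{1}{2}.$$

The gain is obtained by substituting this expression for M* in equation (1).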
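For a small number of approximations, the gain of Theorem 1 can be evaluated by direct enumeration. The Python sketch below assumes the formula $\gamma = -\log_2\left[\frac{2}{|\mathcal{Z}|} \sum_{z \ne z^*} \Phi\left(-\sqrt{N}\,|c_z - c_{z^*}|/2\right) + \frac{1}{|\mathcal{Z}|}\right]$, that all $2^m$ words are valid key classes, and that $z^* = (0, \dots, 0)$, which loses no generality here because the distances depend only on the positions where $z$ and $z^*$ differ. The function names and the imbalance values are hypothetical.

```python
import itertools
import math

def phi(x: float) -> float:
    """Cumulative distribution function of the standard normal."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def theorem1_gain(c, n_pairs):
    """Gain in bits for imbalances c and n_pairs known texts (assumed formula)."""
    classes = list(itertools.product((0, 1), repeat=len(c)))
    total = 0.0
    for z in classes[1:]:  # classes[0] is z* = (0, ..., 0)
        # c_z and c_z* differ by 2 * c_j in every coordinate where z_j = 1.
        d = 2.0 * math.sqrt(sum(cj * cj for zj, cj in zip(z, c) if zj))
        total += phi(-math.sqrt(n_pairs) * d / 2.0)
    return -math.log2(2.0 * total / len(classes) + 1.0 / len(classes))

c = [0.1, 0.1, 0.1]  # illustrative imbalances
gain_no_data = theorem1_gain(c, 0)
gain_some_data = theorem1_gain(c, 100)
gain_much_data = theorem1_gain(c, 10**6)
```

Without data the gain is 0, and with abundant data it saturates at $\log_2 |\mathcal{Z}| = m$ bits, since the approximations cannot distinguish keys within a class.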

The formula derived in the previous theorem can easily be evaluated as long as $|\mathcal{Z}|$ is not too large. In order to estimate the gain in the other cases as well, we need to make a few approximations.


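Corollary 1. If $|\mathcal{Z}|$ is sufficiently large, the gain derived in Theorem 1 can accurately be approximated by

$$\gamma \approx -\log_2 \left[ 2 \cdot \Phi\left(-\sqrt{N \cdot \bar{c}^2 / 2}\right) + \frac{1}{|\mathcal{Z}|} \right],$$

where $\bar{c}^2 = \sum_{j=1}^{m} c_j^2 = 4 \cdot \sum_{j=1}^{m} \epsilon_j^2$.

Proof. See App. A.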

An interesting conclusion that can be drawn from the corollary above is that the gain of the attack is mainly determined by the product $N \cdot \bar{c}^2$. As a result, if we manage to increase $\bar{c}^2$ by using more linear characteristics, then the required number of known plaintext–ciphertext pairs N can be decreased by the same factor, without affecting the gain. Since the quantity $\bar{c}^2$ plays a very important role in the attacks, we give it a name and define it explicitly.

Definition 2. The capacity $\bar{c}^2$ of a system of $m$ approximations is defined as

$$\bar{c}^2 = \sum_{j=1}^{m} c_j^2 = 4 \cdot \sum_{j=1}^{m} \epsilon_j^2.$$
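The trade-off between capacity and data can be illustrated with a short Python sketch. It assumes that the capacity is $\bar{c}^2 = 4 \sum_j \epsilon_j^2$ (each imbalance being twice the bias) and that the approximation of Corollary 1 reads $\gamma \approx -\log_2\left[2\,\Phi\left(-\sqrt{N \bar{c}^2 / 2}\right) + 1/|\mathcal{Z}|\right]$. The function names, the biases and the number of classes are made up for illustration.

```python
import math

def capacity(biases):
    """Assumed capacity: sum of squared imbalances c_j = 2 * eps_j."""
    return 4.0 * sum(e * e for e in biases)

def approx_gain(cap, n_pairs, num_classes):
    """Assumed Corollary 1 approximation of the gain, in bits."""
    x = math.sqrt(n_pairs * cap / 2.0)
    phi_minus_x = 0.5 * math.erfc(x / math.sqrt(2.0))  # Phi(-x)
    return -math.log2(2.0 * phi_minus_x + 1.0 / num_classes)

num_classes = 2**8

cap_4 = capacity([0.01] * 4)   # four approximations with equal bias
cap_8 = capacity([0.01] * 8)   # eight approximations with the same bias

# Doubling the capacity while halving the data leaves N * capacity unchanged.
gain_4 = approx_gain(cap_4, 2000, num_classes)
gain_8 = approx_gain(cap_8, 1000, num_classes)

# More data with the same capacity increases the gain.
gain_4_more = approx_gain(cap_4, 8000, num_classes)
```

Under these assumptions, eight approximations reach with 1000 texts the same gain that four approximations of equal bias reach with 2000 texts.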

3.3 Extension: Multiple Approximations and Matsui's Algorithm 2

The approach taken in the previous section can be seen as an extension of Matsui's Algorithm 1. Just as in Algorithm 1, the adversary analyses parity bits of the known plaintext–ciphertext pairs and then tries to determine parity bits of internal round keys. An alternative approach, which is called Algorithm 2 and yields much more efficient attacks in practice, consists in guessing parts of the round keys in the first and the last round, and determining the probability that the guess was correct by exploiting linear characteristics over the remaining rounds. In this section we will show that the results derived above can still be applied in this situation, provided that we modify some definitions.

Let us denote by the set of possible guesses for the targeted subkeys of the

outer rounds (round 1 and round For each guess and for all N plaintext–ciphertext

pairs, the adversary does a partial encryption and decryption at the

top and bottom of the block cipher, and recovers the parity bits of the intermedi-

ate data blocks involved in different linear characteristics. Using

this data, he constructs counters which can be transformed

into a vector containing the estimated imbalances.
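The counting step described above can be sketched as follows. This is only an illustration under stated assumptions: `partial` is a hypothetical stand-in for the partial encryption and decryption of the outer rounds, each approximation is represented by a pair of masks on the two intermediate blocks, and a counter t is converted into an estimated imbalance as (2t - N)/N, which assumes that counters record how often a parity bit is zero.

```python
def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") & 1


def estimated_imbalances(pairs, guesses, partial, masks):
    """Return, for each outer-key guess, one estimated imbalance per approximation.

    pairs   : list of (plaintext, ciphertext) integers
    guesses : iterable of candidate outer subkeys
    partial : hypothetical function (guess, p, c) -> (u, v), the intermediate
              blocks after peeling off the first and the last round
    masks   : list of (mask_u, mask_v), one entry per linear approximation
    """
    n = len(pairs)
    table = {}
    for g in guesses:
        counters = [0] * len(masks)
        for p, c in pairs:
            u, v = partial(g, p, c)
            for j, (mask_u, mask_v) in enumerate(masks):
                if parity(u & mask_u) ^ parity(v & mask_v) == 0:
                    counters[j] += 1
        # Assumed conversion from counter to imbalance: (2t - N) / N.
        table[g] = [(2 * t - n) / n for t in counters]
    return table
```

Concatenating the per-guess lists in a fixed order of the guesses gives one long vector of estimated imbalances. This naive loop touches every pair for every guess; App. B.2 discusses why a practical implementation should avoid that.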

As explained in the previous section, the linear characteristics involve

parity bits of the key, and thus induce a set of equivalent key classes, which we

will here denote by (I from inner). Although not strictly necessary, we will

for simplicity assume that the sets and are independent, such that each

guess can be combined with any class thereby determining a

subclass of keys with

At this point, the situation is very similar to the one described in the previous

section, the main difference being a higher dimension The only remaining

question is how to construct the vectors for each key class

To solve this problem, we will need to make some assumptions.

Remember that the coordinates of are determined by the expected imbalances

of the corresponding linear expressions, given that the data is encrypted with

a key from class For the counters that are constructed after guessing the

correct subkey the expected imbalances are determined by and equal to

For each of the other counters, however, we

will assume that the wrong guesses result in independent random-looking parity

bits, showing no imbalance at all⁴. Accordingly, the vector has the following

form:

With the modified definitions of and given above, both Theorem 1 and

Corollary 1 still hold (the proofs are given in App. A). Notice however that the

gain of the Algorithm-2-style linear attack will be significantly larger because it

depends on the capacity of linear characteristics over rounds instead of

rounds.

3.4 Influence of Dependencies

When deriving (5) in Sect. 3, we assumed statistical independence. This assump-

tion is not always fulfilled, however. In this section we discuss different potential

sources of dependencies and estimate how they might influence the cryptanalysis.

Dependent plaintext–ciphertext pairs. A first assumption made by equation (5)

concerns the dependency of the parity bits with computed

with a single linear approximation for different plaintext–ciphertext pairs.

The equation assumes that the probability that the approximation holds for a

single pair equals regardless of what is observed for other pairs.

This is a very reasonable assumption if the N plaintexts are chosen randomly,

but even if they are picked in a systematic way, we can still safely assume that

the corresponding ciphertexts are sufficiently unrelated as to prevent statistical

dependencies.

Dependent text mask. The next source of dependencies is more fundamental

and is related to dependent text masks. Suppose for example that we want to use

three linear approximations with plaintext–ciphertext masks

and that It is immediately clear

that the parity bits computed for these three approximations cannot possibly be

independent: for all pairs, the bit computed for the 3rd approximation

is equal to

4

Note that for some ciphers, other assumptions may be more appropriate. The rea-

soning in this section can be applied to these cases just as well, yielding very similar

results.

Even in such cases, however, we believe that the results derived in the pre-

vious section are still quite reasonable. In order to show this, we consider the

probability that a single random plaintext encrypted with an equivalent key

yields a vector⁵ of parity bits Let us denote by the concatenation

of both text masks and Without loss of generality, we can

assume that the masks are linearly independent for and linearly

dependent (but different) for This implies that x is restricted to a

subspace We will only consider the key class in

order to simplify the equations. The probability we want to evaluate is:

These (unknown) probabilities determine the (known) imbalances of the linear

approximations through the following expression:

We now make the (in many cases reasonable) assumption that all masks

which depend linearly on the masks but which differ from the ones

considered by the attack, have negligible imbalances. In this case, the equation

above can be reversed (note the similarity with the Walsh-Hadamard transform),

and we find that:

Assuming that we can make the following approximation:

Apart from an irrelevant constant factor this is exactly what we need:

it implies that, even with dependent masks, we can still multiply probabilities

as we did in order to derive (5). This is an important conclusion, because it

indicates that the capacity of the approximations continues to grow, even when

exceeds twice the block size, in which case the masks are necessarily linearly

dependent.
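The dependency mentioned at the start of this discussion can be checked directly. The sketch below assumes that the third text mask is the XOR of the first two; in that case the third parity bit is always the XOR of the other two, whatever the cipher does. The mask values, the 16-bit block size, and the function names are hypothetical.

```python
import random


def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") & 1


def approx_bit(p, c, mask):
    """Parity bit of one approximation; mask = (plaintext mask, ciphertext mask)."""
    return parity(p & mask[0]) ^ parity(c & mask[1])


def third_is_xor(mask1, mask2, trials=1000, bits=16, seed=1):
    """Check that the XOR-combined mask always yields the XOR of both parity bits."""
    rng = random.Random(seed)
    mask3 = (mask1[0] ^ mask2[0], mask1[1] ^ mask2[1])
    for _ in range(trials):
        p, c = rng.getrandbits(bits), rng.getrandbits(bits)
        if approx_bit(p, c, mask3) != approx_bit(p, c, mask1) ^ approx_bit(p, c, mask2):
            return False
    return True


print(third_is_xor((0x0021, 0x8004), (0x000F, 0x0100)))
```

The check succeeds for random text pairs because the parity of a masked value is linear in the mask; it says nothing about the imbalances themselves, which is why the argument above is still needed.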

Dependent trails. A third type of dependency might be caused by merging linear trails. When analyzing the best linear approximations for DES, for example, we notice that most of the good linear approximations follow a very limited number of trails through the inner rounds of the cipher, which might result in dependencies. Although this effect did not appear to have any influence on our experiments (with up to 100 different approximations), we cannot exclude at this point that it will affect attacks using many more approximations.

5

Note a small abuse of notation here: the definition of x differs from the one used in

Sect. 2.1.

Dependent key masks. We finally note that we did not make any assumption

about the dependency of key masks in the previous sections. This implies that

all results derived above remain valid for dependent key masks.

4 Experimental Results

In Sect. 3 we derived an optimal approach for cryptanalyzing block ciphers using

multiple linear approximations. In this section, we implement practical attack

algorithms based on this approach and evaluate their performance when applied

to DES, the standard benchmark for linear cryptanalysis. Our experiments show

that the attack complexities are in perfect correspondence with the theoretical

results derived in the previous sections.

4.1 Attack Algorithm MK 1

Table 1 summarizes the attack algorithm presented in Sect. 2 (we call this al-

gorithm Attack Algorithm MK 1). In order to verify the theoretical results, we

applied the attack algorithm to 8 rounds of DES. We picked 86 linear approx-

imations with a total capacity (see Definition 2). In order to speed

up the simulation, the approximations were picked to contain 10 linearly inde-

pendent key masks, such that Fig. 2 shows the simulated gain for

Algorithm MK 1 using these 86 approximations, and compares it to the gain of

Matsui's Algorithm 1, which uses the best one only We clearly see

a significant improvement. While Matsui's algorithm requires about pairs

to attain a gain close to 1 bit, only pairs suffice for Algorithm MK 1. The

theoretical curves shown in the figure were plotted by computing the gain using

Fig. 2. Gain (in bits) as a function of data (known plaintext) for 8-round DES.

the exact expression for M* derived in Theorem 1 and using the approximation

from Corollary 1. Both fit nicely with the experimental results.

Note that the attack presented in this section is just a proof of concept; even higher gains would be possible with more optimized attacks. For a more detailed discussion of the technical aspects that play a role in the implementation of Algorithm MK 1, we refer to App. B.

4.2 Attack Algorithm MK 2

In this section, we discuss the experimental results for the generalization of Matsui's Algorithm 2 using multiple linear approximations (called Attack Algorithm MK 2). We simulated the attack algorithm on 8 rounds of DES and compared the results to the gain of the corresponding Algorithm 2 attack described in Matsui's paper [9].

Our attack uses eight linear approximations spanning six rounds with a total

capacity In order to compute the parity bits of these equations,

eight 6-bit subkeys need to be guessed in the first and the last rounds (how this

is done in practice is explained in App. B). Fig. 3 compares the gain of the attack

to Matsui's Algorithm 2, which uses the two best approximations

For the same amount of data, the multiple linear attack clearly achieves a much

higher gain. This reduces the complexity of the search phase by multiple orders

of magnitude. On the other hand, for the same gain, the adversary can reduce

the amount of data by at least a factor of 2. For example, for a gain of 12 bits, the

data complexity is reduced from to This is in close correspondence

with the ratio between the capacities. Note that both simulations were carried

Fig. 3. Gain (in bits) as a function of data (known plaintext) for 8-round DES.

out under the assumption of independent subkeys (this was also the case for

the simulations presented in [9]). Without this assumption, the gain will closely

follow the graphs in the figure, but stop increasing as soon as the gain equals

the number of independent key bits involved in the attack.

As in Sect. 4.1 our goal was not to provide the best attack on 8-round DES,

but to show that Algorithm-2-style attacks do gain from the use of multiple linear

approximations, with a data reduction proportional to the increase in the joint

capacity. We refer to App. B for the technical aspects of the implementation of

Algorithm MK 2.

4.3 Capacity – DES Case Study

In Sect. 3 we argued that the minimal amount of data needed to obtain a certain

gain compared to exhaustive search is determined by the capacity of the linear

approximations. In order to get a first estimate of the potential improvement of

using multiple approximations, we calculated the total capacity of the best

linear approximations of DES for The capacities were computed

using an adapted version of Matsui's algorithm (see [1]). The results, plotted for

different numbers of rounds, are shown in Figs. 4 and 5, both for approximations

restricted to a single S-box per round and for the general case. Note that the

single best approximation is not visible on these figures due to the scale of the

graphs.

Kaliski and Robshaw [5] showed that the first 10 006 approximations with a

single active S-box per round have a joint capacity of for 14 rounds

Fig. 4. Capacity (14 rounds). Fig. 5. Capacity (16 rounds).

of DES⁶. Fig. 4 shows that this capacity can be increased to when

multiple S-boxes are allowed. Comparing this to the capacity of Matsui's best

approximation the factor 38 gained by Kaliski and Robshaw is

increased to 304 in our case. Practical techniques to turn this increased capacity

into an effective reduction of the data complexity are presented in this paper,

but exploiting the full gain of 10000 unrestricted approximations will require

additional techniques. In theory, however, it would be possible to reduce the

data complexity from (in Matsui's case, using two approximations) to about

(using 10000 approximations).

In order to provide a more conservative (and probably rather realistic) es-

timation of the implications of our new attacks on full DES, we searched for

14-round approximations which only require three 6-bit subkeys to be guessed

simultaneously in the first and the last rounds. The capacity of the 108 best

approximations satisfying this restriction is This suggests that an

MK 2 attack exploiting these 108 approximations might reduce the data complexity

by a factor of 4 compared to Matsui's Algorithm 2 (i.e., instead of

This is comparable to the Knudsen-Mathiassen reduction [6], but would preserve

the advantage of being a known-plaintext attack rather than a chosen-plaintext

one.

Using very high numbers of approximations is somewhat easier in practice

for MK 1 because we do not have to impose restrictions on the plaintext and

ciphertext masks (see App. B). Analyzing the capacity for the 10000 best 16-

round approximations, we now find a capacity of If we restrict the

complexity of the search phase to an average of trials (i.e., a gain of 12 bits),

we expect that the attack will require known plaintexts. As expected, this

theoretical number is larger than for the MK 2 attack using the same number

of approximations.

5 Future Work

In this paper we proposed a framework which allows the information contained in multiple linear approximations to be used in an optimal way. The topics below are possible further improvements and open questions.

6

Note that Kaliski and Robshaw calculated the sum of squared biases:

Application to 16-round DES. The results in this paper suggest that Algo-

rithms MK 1 and MK 2 could reduce the data complexity to known

plaintexts, or even less when the number of approximations is further in-

creased. An interesting problem related to this is how to merge multiple lists

of key classes (possibly with overlapping key-bits) efficiently.

Application to AES. Many recent ciphers, e.g., AES, are specifically designed

to minimize the bias of the best approximation. However, this artificial flat-

tening of the bias profile comes at the expense of a large increase in the

number of approximations having the same bias. This suggests that the gain

made by using multiple linear approximations could potentially be much

higher in this case than for a cipher like DES. Considering this, we expect

that one may need to add a few rounds when defining bounds of provable se-

curity against linear cryptanalysis, based only on best approximations. Still,

since AES has a large security margin against linear cryptanalysis we do not

believe that linear attacks enhanced with multiple linear approximations will

pose a practical threat to the security of the AES.

Performance of Algorithm MD. Using a very high number of independent

approximations seems impractical in Algorithms MK 1 and MK 2, but could

be feasible with Algorithm MD described in App. B.3. Additionally, this method would allow the multiple linear approximations to be replaced by multiple linear hulls.

Success rate. In this paper we derived simple formulas for the average number

of key candidates checked during the final search phase. Deriving a simple

expression for the distribution of this number is still an open problem. This would make it possible to compute the success rate of the attack as a function of the number of plaintexts and a given maximal number of trials.

6 Conclusions

In this paper, we have studied the problem of generalizing linear cryptanalytic

attacks given multiple linear approximations, which was stated in 1994

by Kaliski and Robshaw [5]. In order to solve the problem, we have developed

a statistical framework based on maximum likelihood decoding. This approach

is optimal in the sense that it utilizes all the information that is present in the

multiple linear approximations. We have derived explicit and compact gain for-

mulas for the generalized linear attacks and have shown that for a constant gain,

the data-complexity N of the attack is proportional to the inverse joint capacity

of the multiple linear approximations: The gain formulas hold for

the generalized versions of both algorithms proposed by Matsui (Algorithm 1

and Algorithm 2).

In the second half of the paper we have proposed several practical methods

which deliver the theoretical gains derived in the first part of the paper. We

have proposed a key-recovery algorithm MK 1 which has a time complexity

and a data complexity where is the number of

solutions of the system of equations defined by the linear approximations. We

have also designed an algorithm MK 2 which is a direct generalization of Matsui's

Algorithm 2, as described in [9]. The performances of both algorithms are very

close to our theoretical estimations and confirm that the data-complexity of the

attack decreases proportionally to the increase in the joint capacity of multiple

approximations. We have used 8-round DES as a standard benchmark in our

experiments and in all cases our attacks perform significantly better than those

given by Matsui. However, our goal in this paper was not to produce the

optimal attack on DES, but to construct a new cryptanalytic tool applicable to

a variety of ciphers.

References

1. A. Biryukov, C. De Cannière, and M. Quisquater, “On multiple linear approximations (extended version).” Cryptology ePrint Archive: Report 2004/057, http://eprint.iacr.org/2004/057/.

2. J. Daemen and V. Rijmen, The Design of Rijndael: AES – The Advanced Encryption Standard. Springer-Verlag, 2002.

3. P. Junod, “On the optimality of linear, differential, and sequential distinguishers,” in Advances in Cryptology – EUROCRYPT 2003 (E. Biham, ed.), Lecture Notes in Computer Science, pp. 17–32, Springer-Verlag, 2003.

4. P. Junod and S. Vaudenay, “Optimal key ranking procedures in a statistical cryptanalysis,” in Fast Software Encryption, FSE 2003 (T. Johansson, ed.), vol. 2887 of Lecture Notes in Computer Science, pp. 1–15, Springer-Verlag, 2003.

5. B. S. Kaliski and M. J. Robshaw, “Linear cryptanalysis using multiple approximations,” in Advances in Cryptology – CRYPTO'94 (Y. Desmedt, ed.), vol. 839 of Lecture Notes in Computer Science, pp. 26–39, Springer-Verlag, 1994.

6. L. R. Knudsen and J. E. Mathiassen, “A chosen-plaintext linear attack on DES,” in Fast Software Encryption, FSE 2000 (B. Schneier, ed.), vol. 1978 of Lecture Notes in Computer Science, pp. 262–272, Springer-Verlag, 2001.

7. L. R. Knudsen and M. J. B. Robshaw, “Non-linear approximations in linear cryptanalysis,” in Proceedings of Eurocrypt'96 (U. Maurer, ed.), no. 1070 in Lecture Notes in Computer Science, pp. 224–236, Springer-Verlag, 1996.

8. M. Matsui, “Linear cryptanalysis method for DES cipher,” in Advances in Cryptology – EUROCRYPT'93 (T. Helleseth, ed.), vol. 765 of Lecture Notes in Computer Science, pp. 386–397, Springer-Verlag, 1993.

9. M. Matsui, “The first experimental cryptanalysis of the Data Encryption Standard,” in Advances in Cryptology – CRYPTO'94 (Y. Desmedt, ed.), vol. 839 of Lecture Notes in Computer Science, pp. 1–11, Springer-Verlag, 1994.

10. M. Matsui, “Linear cryptanalysis method for DES cipher (I).” (extended paper), unpublished, 1994.

11. S. Murphy, F. Piper, M. Walker, and P. Wild, “Likelihood estimation for block cipher keys,” Technical report, Information Security Group, Royal Holloway, University of London, 1995.

12. A. A. Selçuk, “On probability of success in linear and differential cryptanalysis,” in Proceedings of SCN'02 (S. Cimato, C. Galdi, and G. Persiano, eds.), vol. 2576 of Lecture Notes in Computer Science, Springer-Verlag, 2002. Also available at https://www.cerias.purdue.edu/papers/archive/2002-02.ps.

13. T. Shimoyama and T. Kaneko, “Quadratic relation of S-box and its application to the linear attack of full round DES,” in Advances in Cryptology – CRYPTO'98 (H. Krawczyk, ed.), vol. 1462 of Lecture Notes in Computer Science, pp. 200–211, Springer-Verlag, 1998.

14. S. Vaudenay, “An experiment on DES statistical cryptanalysis,” in 3rd ACM Conference on Computer and Communications Security, CCS, pp. 139–147, ACM Press, 1996.

A Proofs

A.1 Proof of Corollary 1

Corollary 1. If is sufficiently large, the gain derived in Theorem 1 can

accurately be approximated by

where is called the total capacity of the linear characteristics.

Proof. In order to show how (11) is derived from (8), we just need to construct

an approximation for the expression

We first define the function Denoting the average value

of a set of variables by we can reduce (12) to the compact expression

with By expanding into a Taylor series around the

average value we find

Provided that the higher order moments of are sufficiently small, we can use

the approximation Exploiting the fact that the jth coordinate

of each vector is either or we can easily calculate the average value

When is sufficiently large (say the right hand part can be ap-

proximated by (remember that and thus

Substituting this into the relation we find

By applying this approximation to the gain formula derived in Theorem 1, we

directly obtain expression (11).

A.2 Gain Formulas for the Algorithm-2-Style Attack

With the modified definitions of and given in Sect. 3.3, Theorem 1 can

immediately be applied. This results in the following corollary.

Corollary 2. Given approximations and N independent pairs an

adversary can mount an Algorithm-2-style linear attack with a gain equal to:

The formula above involves a summation over all elements of Motivated

by the fact that is typically very large, we now derive

a more convenient approximated expression similar to Corollary 1. In order to

do this, we split the sum into two parts. The first part considers only keys

where the second part sums over

all remaining keys In this second case, we have that

for all such that

For the first part of the sum, we apply the approximation used to derive Corol-

lary 1 and obtain a very similar expression:

Combining both results, we find the counterpart of Corollary 1 for an Algorithm-2-style linear attack.

Corollary 3. If is sufficiently large, the gain derived in Theorem 2 can

accurately be approximated by

where is the total capacity of the linear characteristics.

Notice that although Corollaries 1 and 3 contain identical formulas, the gain of

the Algorithm-2-style linear attack will be significantly larger because it depends

on the capacity of linear characteristics over rounds instead of rounds.

B Discussion – Practical Aspects

When attempting to calculate the optimal estimators derived in Sect. 3, the

attacker might be confronted with some practical limitations, which are often

cipher-dependent. In this section we discuss possible problems and propose ways

to deal with them.

B.1 Attack Algorithm MK 1

When estimating the potential gain in Sect. 3, we did not impose any restrictions

on the number of approximations However, while it does reduce the complex-

ity of the search phase (since it increases the gain), having an excessively high

number increases both the time and the space complexity of the distillation

and the analysis phase. At some point the latter will dominate, cancelling out

any improvement made in the search phase.

Analyzing the complexities in Table 1, we can make a few observations. We

first note that the time complexity of the distillation phase should be compared

to the time needed to encrypt plaintext–ciphertext pairs. Given that

a single counting operation is much faster than an encryption, we expect the

complexity of the distillation to remain negligible compared to the encryption

time as long as is only a few orders of magnitude (say

The second observation is that the number of different key classes clearly

plays an important role, both for the time and the memory complexities of the

algorithm. In a practical situation, the memory is expected to be the strongest

limitation. Different approaches can be taken to deal with this problem:

Straightforward, but inefficient approach. Since the number of different

key classes is bounded by the most straightforward solution is to limit

the number of approximations. A realistic upper bound would be

The obvious drawback of this approach is that it does not allow very high capacities to be attained.

Exploiting dependent key masks. A better approach is to impose a bound

on the number of linearly independent key masks This way, we limit

the memory requirements to but still allow a large number of approximations

(e.g., a few thousand). This approach restricts the choice

of approximations, however, and thus reduces the maximum attainable ca-

pacity. This is the approach taken in Sect. 4.1. Note also that the attack

described in [5] can be seen as a special case of this approach, with

Merging separate lists. A third strategy consists in constructing separate

lists and merging them dynamically. Suppose for simplicity that the key

masks considered in the attack are all independent. In this case, we can

apply the analysis phase twice, each time using approximations. This

will result in two sorted lists of intermediate key classes, both containing

classes. We can then dynamically compute a sorted sequence of final

key classes constructed by taking the product of both lists. The ranking of

the sequence is determined by the likelihood of these final classes, which is

just the sum of the likelihoods of the elements in the separate lists. This

approach slightly increases⁷ the time complexity of the analysis phase, but

will considerably reduce the memory requirements. Note that this approach

can be generalized in order to allow some dependencies in the key masks.
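The third strategy, merging separate lists, can be sketched with a priority queue that produces the combined classes lazily, best first. This is a minimal illustration rather than the implementation used in the experiments: it assumes that each list holds (score, key class) entries sorted by decreasing score, that the score of a combined class is the sum of both scores as described above, and that the two lists are independent. The function name is hypothetical.

```python
import heapq


def merged_classes(list_a, list_b):
    """Yield (score, class_a, class_b) in decreasing order of score_a + score_b.

    list_a, list_b: lists of (score, key_class), sorted by decreasing score.
    Only the frontier of candidate pairs is kept in memory.
    """
    if not list_a or not list_b:
        return
    # heapq is a min-heap, so scores are negated to pop the largest sum first.
    heap = [(-(list_a[0][0] + list_b[0][0]), 0, 0)]
    seen = {(0, 0)}
    while heap:
        neg_score, i, j = heapq.heappop(heap)
        yield -neg_score, list_a[i][1], list_b[j][1]
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(list_a) and nj < len(list_b) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (-(list_a[ni][0] + list_b[nj][0]), ni, nj))


# Hypothetical scores for two small lists of intermediate key classes.
a = [(5, "a0"), (3, "a1"), (1, "a2")]
b = [(4, "b0"), (2, "b1")]
for score, ka, kb in merged_classes(a, b):
    print(score, ka, kb)
```

Because the generator can be stopped as soon as the correct key is found, only the part of the product that is actually searched needs to be computed.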

7

In cases where the gain of the attack is several bits, this approach will actually

decrease the complexity, since we expect that only a fraction of the final sequence

will need to be computed.

B.2 Attack Algorithm MK 2

We now briefly discuss some practical aspects of the Algorithm-2-style multiple

linear attack, called Attack Algorithm MK 2. As discussed earlier, the ideas of

the attack are very similar to Attack Algorithm MK 1, but there are a number of

additional issues. In the following paragraphs, we denote the number of rounds

of the cipher by

Choice of characteristics. In order to limit the number of guesses in rounds 1

and only parts of the subkeys in these rounds will be guessed. This restricts

the set of useful characteristics to those that only depend on

bits which can be derived from the plaintext, the ciphertext, and the partial

subkeys. This obviously reduces the maximum attainable capacity.

Efficiency of the distillation phase. During the distillation phase, all N

plaintexts need to be analyzed for all guesses Since is rather

large in practice, this could be very computationally intensive. For example,

a naive implementation would require steps and even Matsui's

counting trick would use steps. However, the distillation can

be performed in steps by gradually guessing parts of and

re-processing the counters.

Merging separate lists. The idea of working with separate lists can be applied here just as for MK 1.

Computing distances. In order to compare the likelihoods of different keys,

we need to evaluate the distance for all classes The vectors

and are both When calculating this distance as

a sum of squares, most terms do not depend on however. This allows the

distance to be computed very efficiently, by summing only terms.
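The shortcut for computing distances can be illustrated as follows. The sketch assumes, as in Sect. 3.3, that the expected vector of a key class is zero everywhere except in the block of counters belonging to its outer-key guess, where it equals the expected imbalances of the inner class. The squared norm of the full vector of estimates is then computed once, and each class only needs the terms of its own block. Function and variable names are hypothetical.

```python
def full_distance_sq(c_hat_blocks, guess, c_inner):
    """Reference computation: squared distance summed over all blocks."""
    total = 0.0
    for g, block in enumerate(c_hat_blocks):
        expected = c_inner if g == guess else [0.0] * len(block)
        total += sum((x - e) ** 2 for x, e in zip(block, expected))
    return total


def fast_distance_sq(norm_sq, block, c_inner):
    """Same distance, using only the terms of the guessed block.

    norm_sq is the squared norm of all estimated imbalances, computed once.
    """
    return (norm_sq
            - sum(x * x for x in block)
            + sum((x - e) ** 2 for x, e in zip(block, c_inner)))


# Hypothetical estimates: three outer-key guesses, two approximations each.
c_hat = [[0.10, -0.20], [0.30, 0.05], [0.00, 0.40]]
c_inner = [0.25, 0.10]
norm_sq = sum(x * x for block in c_hat for x in block)
for g, block in enumerate(c_hat):
    print(g, full_distance_sq(c_hat, g, c_inner), fast_distance_sq(norm_sq, block, c_inner))
```

Since `norm_sq` is the same for every class, it can even be dropped when the distances are only used to rank the classes.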

B.3 Attack Algorithm MD (distinguishing/key-recovery)

The main limitation of Algorithms MK 1 and MK 2 is the bound on the number

of key classes In this section, we show that this limitation disappears if

our sole purpose is to distinguish an encryption algorithm from a random

permutation R. As usual, the distinguisher can be extended into a key-recovery

attack by adding rounds at the top and at the bottom.

If we observe N plaintext–ciphertext pairs and assume for simplicity that the

a priori probability that they were constructed using the encryption algorithm

is 1/2, we can construct a distinguishing attack using the maximum likelihood

approach in a similar way as in Sect. 3. Assuming that all secret keys are equally

probable, one can easily derive the likelihood that the encryption algorithm was

used, given the values of the counters t:

This expression is correct if all text masks and key masks are independent, but

is still expected to be a good approximation, if this assumption does not hold

(for the reasons discussed in Sect. 3.4). A similar likelihood can be calculated

for the random permutation:

Contrary to what was found for Algorithm MK 1, both likelihoods can be computed

in time proportional to i.e., independent of The complete distinguishing

algorithm, called Attack Algorithm MD, consists of two steps:

Distillation phase. Obtain N plaintext–ciphertext pairs For

count the number of pairs satisfying

and If

Analysis phase. Compute decide that

the plaintexts were encrypted with the algorithm (using some unknown

key

The analysis of this algorithm is a matter of further research.

C Previous Work: Linear Cryptanalysis

Since the introduction of linear cryptanalysis by Matsui [8–10], several generalizations of the linear cryptanalysis method have been proposed. Kaliski and Robshaw [5] suggested using many linear approximations instead of one, but provided an efficient method for doing so only for the case when all the approximations cover the same parity bit of the key. Realizing that this limited the number of useful approximations, the authors also proposed a simple (but somewhat inefficient) extension to their technique which removes this restriction by guessing a relation between the different key bits. The idea of using non-linear approximations was suggested by Knudsen and Robshaw [7]. It was used by Shimoyama and Kaneko [13] to marginally improve the linear attack on DES. Knudsen and Mathiassen [6] suggested converting linear cryptanalysis into a chosen-plaintext attack, which would gain the first round of approximation for free. The gain is small, since Matsui's attack gains the first round rather efficiently as well.

A more detailed overview of the history of linear cryptanalysis can be found

in the extended version of this paper [1].

Feistel Schemes and Bi-linear Cryptanalysis

(Extended Abstract)

Nicolas T. Courtois

Axalto Smart Cards Crypto Research,

36-38 rue de la Princesse, BP 45, F-78430 Louveciennes Cedex, France

courtois@minrank.org

Abstract. In this paper we introduce the method of bi-linear cryptanalysis (BLC), designed specifically to attack Feistel ciphers. It allows the construction of periodic biased characteristics that combine for an arbitrary number of rounds. In particular, we present a practical attack on DES based on a 1-round invariant, the fastest known based on such an invariant, and about as fast as Matsui's best attack. For ciphers similar to DES, based on small S-boxes, we claim that BLC is very closely related to LC, and we do not expect to find a bi-linear attack much faster than by LC. Nevertheless we have found bi-linear characteristics that are strictly better than Matsui's best result for 3, 7, 11 and more rounds.

For more general Feistel schemes there is no reason whatsoever for BLC

to remain only a small improvement over LC. We present a construction

of a family of practical ciphers based on a big Rijndael-type S-box that

are strongly resistant against linear cryptanalysis (LC) but can be easily

broken by BLC, even with 16 or more rounds.

Keywords: Block ciphers, Feistel schemes, S-box design, inverse-based

S-box, DES, linear cryptanalysis, generalised linear cryptanalysis, I/O

sums, correlation attacks on block ciphers, multivariate quadratic equa-

tions.

1 Introduction

In spite of the growing importance of AES, Feistel schemes and DES remain widely used in practice, especially in the financial/banking sector. Linear cryptanalysis (LC), due to Gilbert and Matsui, is the best known-plaintext attack on DES, see [4, 25, 27, 16, 21]. (For chosen-plaintext attacks, see [21, 2].)

A straightforward way of extending linear attacks is to consider nonlinear multivariate equations. Exact multivariate equations can give a tiny improvement to the last round of a linear attack, as shown at Crypto'98 [18]. A more powerful idea is to use probabilistic multivariate equations, for every round, and replace Matsui's biased linear I/O sums by nonlinear I/O sums as proposed by Harpes, Kramer, and Massey at Eurocrypt'95 [9]. This is known as Generalized Linear Cryptanalysis (GLC). In [10, 11] Harpes introduces partitioning cryptanalysis (PC) and shows that it generalizes both LC and GLC. The correlation cryptanalysis (CC) introduced in Jakobsen's master thesis [13] is claimed to be even more general. Moreover, in [12] it is shown that all these attacks, including Differential Cryptanalysis, are closely related and can be studied in terms of the Fast Fourier Transform for the cipher round function. Unfortunately, computing this transform is in general infeasible for a real-life cipher and, until now, non-linear multivariate I/O sums have played a marginal role in attacking real ciphers. Accordingly, these attacks may be excessively general and there is probably no substitute for finding and studying in detail interesting special cases.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 23–40, 2004.

© International Association for Cryptologic Research 2004