2. There exists a polynomial time Turing Machine such that, on input an index and an element of the domain group, it outputs the image of that element. The functions of the family are therefore efficiently computable, at least granted some extra information sampled with the index.

3. There exists a polynomial time Turing Machine which, on input an index and any candidate element, answers whether or not the element belongs to the range group.

4. There exists a (deterministic) polynomial time Turing Machine Inv such that, for all indices and for all elements of the range group, Inv recovers the pre-image of the element, granted the trapdoor.

5. For every probabilistic polynomial time Turing Machine, the probability of inverting a function of the family on a random point is, for large enough security parameters, bounded by a negligible function.

The last property is our formal hard-to-invert notion, which is quite similar to the usual one-way notion: the two notions only differ in whether the function is required to be efficiently computable without any secret; when it is, hard-to-invert coincides with one-way.


482 Dario Catalano, David Pointcheval, and Thomas Pornin

2.2 Verifiable Sub-family of Trapdoor Hard-to-Invert Isomorphisms

In the above definition, it is clear that for any index the function is an isomorphism from the domain group onto the range group. However, in practice, the family of functions may be indexed by a potentially larger set S (i.e. I is a subset of S), for which there may exist some indices that do not lead to an isomorphism. Therefore, we require more properties to be satisfied:

– there exists a large subset I of S such that the corresponding sub-family is a family of trapdoor hard-to-invert isomorphisms;

– there exists a set J of indices which provide an isomorphism, with I contained in J contained in S, which admits an efficient zero-knowledge proof of membership.

The last property turns out to be crucial for the application we have in mind. In

our setting the client has to choose the specific function to use in the protocol.

This means that a dishonest client (i.e. one that does not share a password

with the server) could propose an index whose corresponding function is not

an isomorphism. This would give him the ability to run a partition attack (as

already explained for RSA). For this reason we require the client to produce a

function together with a proof that it is actually an isomorphism.

2.3 Zero-Knowledge Proofs of Membership

As noticed above, the only property we want to be able to verify is the isomorphism one, and thus the fact that the index actually lies in J: we just want the adversary not to be able to prove a wrong statement; we do not care about malleability [11]. A second point is that the zero-knowledge property will be required in the security proof: a valid index is given, and one tries to use the adversary to solve a hard problem related to it. Thus, we need to be able to provide a proof of validity of the index without any witness. Note however that the simulation is performed for valid statements only, and thus simulation soundness [24] is not required. Moreover, since we just have to simulate one proof without the witness (other executions will be performed as in an actual execution), concurrent zero-knowledge is not needed either.

For efficiency reasons, we will focus on a specific class of zero-knowledge proofs: for a given statement, the verifier sends a random seed seed, and then the prover non-interactively provides a proof, using a witness, that the statement holds w.r.t. the random seed seed; the proof can be checked without the witness. In our protocol, honest players will sample the index and thus obtain the trapdoor together with it. This trapdoor will generally be a good witness. More formally we require:

Completeness: the proving and verification procedures are two efficient (polynomial time) algorithms and, for any valid statement and any challenge seed, a witness helps to build a proof which is always accepted: the verification algorithm accepts the honestly generated proof;


IPAKE: Isomorphisms for Password-Based Authenticated Key Exchange 483

Soundness: for any invalid statement, the probability for any adversary (over its random tape and the random seed seed) to forge a valid proof (accepted by the verification algorithm) is negligible; we consider the maximal such success probability for any adversary running within time t.

ROM-simulatability: granted the programmability of the random oracle, for any valid statement and any seed, there exists an efficient way to perfectly simulate an accepted proof.

2.4 Concrete Examples

The Diffie-Hellman Family. The most natural example of a family of trapdoor hard-to-invert isomorphisms is the Diffie-Hellman one. The machine Gen, on input the security parameter, does as follows. First it chooses a random prime q of appropriate size, and a prime p such that q divides p − 1. Next, it chooses the subgroup G of order q in Z_p* and a corresponding generator g. Finally it chooses a random element x, sets X = g^x mod p, and outputs the pair (index, trapdoor), where the trapdoor is x and the index is an encoding of (p, q, g, X). This defines our set I.

Now the function(1) is instantiated as follows: both the domain and the range are the group G, and the function maps a point Y of G to Y^x mod p. Moreover, Inv is defined, granted the trapdoor, as raising to the power x^(-1) mod q (for any point of G).

Clearly, to efficiently evaluate the function on a random point Y, one should know either the trapdoor information x, or some y such that Y = g^y mod p (assuming, of course, that the computational Diffie-Hellman problem is infeasible in G). Similarly, knowledge of the trapdoor is sufficient to invert the function on a random point; however, inverting the function without knowing the trapdoor seems to be infeasible. Nevertheless, membership in G is efficiently decidable: one simply checks whether Y^q = 1 mod p or not.

For our functions to be isomorphisms, one just needs x to be co-prime with q, where q is actually the order of G. For better efficiency, the group information (p, q, g) can be fixed and considered as common trusted parameters. Therefore, Gen just chooses x and sets X = g^x mod p; one just needs to check that X belongs to G and X is not 1, and no witness is required, nor additional proof: the prover does not need any witness for outputting any proof, since the verifier simply checks the above equality/inequality.
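To make the family concrete, here is a toy sketch in Python. The parameter sizes are deliberately tiny, and the names gen, f, inv and the membership test mirror the text rather than any exact specification from the paper:

```python
import random

# Toy subgroup parameters: q divides p - 1 and g generates the subgroup
# of order q in Z_p*. Real parameters would be hundreds of digits long.
p, q, g = 23, 11, 4

def gen():
    """Gen: pick a random trapdoor x and publish the index X = g^x mod p."""
    x = random.randrange(1, q)
    return pow(g, x, p), x          # (index X, trapdoor x)

def f(x, Y):
    """f(Y) = Y^x mod p: a CDH instance for anyone who does not know x."""
    return pow(Y, x, p)

def inv(x, Z):
    """Inversion granted the trapdoor: Z^(x^(-1) mod q) mod p."""
    return pow(Z, pow(x, -1, q), p)

def in_group(Y):
    """Decidability of the group: Y is in the subgroup iff Y^q = 1 mod p."""
    return pow(Y, q, p) == 1
```

As the text explains, anyone who knows an exponent y with Y = g^y mod p can also evaluate f without the trapdoor, since Y^x = X^y mod p; the multiplicative property f(Y * Y') = f(Y) * f(Y') mod p is what makes the function a group isomorphism.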

The RSA Family. Another natural example is the RSA permutation. In this case the machine Gen, on input the security parameter, does as follows. First it chooses two random primes p and q of appropriate size and sets n = pq. Next, it chooses a public exponent e such that gcd(e, phi(n)) = 1. Finally it outputs the pair (index, trapdoor), where the trapdoor is d = e^(-1) mod phi(n) and the index is an encoding of (n, e). This defines our set I.

The function is instantiated as follows: both the domain and the range are Z_n*, and the associated encoding is simply the identity function. The function

1 Note that we allow a slight misuse of notation here: the function should actually be defined with its domain and range made fully explicit. However, we prefer to adopt a simpler (and somewhat incorrect) notation for visual comfort.


is defined as x -> x^e mod n (for any x in Z_n*). Hence, it is a permutation of Z_n*. The Inv algorithm is straightforward, granted the trapdoor d. And the membership-checking algorithm simply has to check whether the element is prime to n. As already noticed, since the function is also easy to evaluate, the RSA family is not only a trapdoor hard-to-invert isomorphism family, but also a trapdoor one-way permutation family. However, to actually be an isomorphism, the index does not really need to lie in I, which would be very costly to prove (while still possible): it just needs to satisfy gcd(e, phi(n)) = 1, which defines our set J. An efficient proof of validity is provided in the full version [8], where both the proving and the verification algorithms are formally defined.
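The RSA instantiation admits an equally short sketch. This is an illustrative toy (tiny primes, names chosen for readability; not the paper's formal algorithms):

```python
from math import gcd

# Toy parameters standing in for the two random primes of size k/2.
p, q = 11, 19
n, phi = p * q, (p - 1) * (q - 1)
e = 7                       # public exponent with gcd(e, phi(n)) = 1
d = pow(e, -1, phi)         # trapdoor

def f(x):
    """f(x) = x^e mod n: an automorphism of Z_n* whenever gcd(e, phi(n)) = 1."""
    return pow(x, e, n)

def inv(y):
    """Inversion granted the trapdoor d = e^(-1) mod phi(n)."""
    return pow(y, d, n)

def check(y):
    """Membership in Z_n*: the element must be prime to n."""
    return gcd(y, n) == 1
```

Note that check only tests membership in the set J-style condition on elements; validity of the exponent e itself (gcd(e, phi(n)) = 1) is what the zero-knowledge proof of the text establishes, since phi(n) is secret.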

The Squaring Family. As a final example, we suggest the squaring function, which is defined as the RSA function with the variant that e = 2. A problem here arises from the fact that squaring is not a permutation over Z_n*, simply because 2 is not co-prime with phi(n). However, if one considers Blum moduli (i.e. composites of the form n = pq where p and q are both congruent to 3 modulo 4), then it is easy to check that the squaring function becomes an automorphism of the group of quadratic residues modulo n (in the following we refer to this group as QR_n). However this is not enough for our purposes. An additional difficulty comes from the fact that we need an efficient way to check if a given element belongs to the group (which would be QR_n here): the need of an efficient membership-checking algorithm. The most natural extension of QR_n is the subset of Z_n* which contains all the elements with Jacobi symbol equal to +1. Note that for a Blum modulus this set is isomorphic to the direct product of QR_n with {+1, -1} (this is because −1 has a Jacobi symbol equal to +1, but is not a square). By these positions we get the signed squaring(2) isomorphism.

For this family, the machine Gen, on input the security parameter, does as follows. First it chooses two random Blum primes p and q of appropriate size and sets n = pq. Then it outputs the pair (index, trapdoor), where the trapdoor is (p, q) and the index is an encoding of n. This thus defines our set I. The function is instantiated as follows: both the domain and the range are the set of Jacobi-(+1) elements described above, the associated encoding is simply the identity function, and the function is the signed squaring (applied to any element of the group). The Inv algorithm is straightforward, granted the trapdoor. And the membership-checking algorithm simply computes the Jacobi symbol.
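The signed squaring family can be sketched as follows. One caveat: the exact signed encoding appears only in the paper's displayed formula, which is not reproduced here, so this sketch adopts one plausible convention as an assumption: each class {x, n−x} of Jacobi-(+1) elements is represented by its member below n/2, and with that convention squaring is indeed a bijection on the representatives for a Blum modulus:

```python
from math import gcd

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:           # pull out factors of two
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                 # reciprocity step (flip test is symmetric)
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

p, q = 7, 11                        # Blum primes: both congruent to 3 mod 4
n = p * q

def in_group(x):
    """A class representative: below n/2, invertible, Jacobi symbol +1."""
    return 0 < x < n / 2 and gcd(x, n) == 1 and jacobi(x, n) == 1

def f(x):
    """Squaring on representatives: reduce x^2 mod n back below n/2."""
    y = pow(x, 2, n)
    return min(y, n - y)
```

For n = 77 the set of representatives has phi(77)/4 = 15 elements and f permutes it, matching the text's isomorphism claim; the group law on representatives is multiplication modulo n followed by the same reduction below n/2.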

As above, since the function is also easy to evaluate, the squaring family is not only a trapdoor hard-to-invert isomorphism family, but also a trapdoor one-way permutation family. However, to actually be an isomorphism, n does not really need to be a Blum modulus, which would be very costly to prove. What we need is just that −1 has Jacobi symbol +1 and any square in Z_n* admits exactly 4 roots. A validity proof is provided, with the mathematical justification, in Section 6, which thus formally defines both the proving and the verification algorithms.

2 By signed, we mean that the output of the function has a sign (plus or minus).


3 The Formal Model

3.1 Security Model

Players. We denote by A and B two parties that can participate in the key exchange protocol P. Each of them may have several instances, called oracles, involved in distinct, possibly concurrent, executions of P. We denote A (resp. B) instances by their indexed oracle names, or by U when we consider any user instance. The two parties share a low-entropy secret pw which is drawn from a small dictionary Password, according to a distribution D. In the following, we use a notation for the probability that the password falls in the most probable set of passwords of a given size: under the uniform distribution among N passwords, this probability for a set of size q is simply q/N.

Queries. We use the security model introduced by Bellare et al. [1], to which

paper we refer for more details. In this model, the adversary has the entire

control of the network, which is formalized by allowing it to ask the following queries:

Execute: This query models passive attacks, where the adversary gets access to honest executions of P between two instances by eavesdropping.

Reveal(U): This query models the misuse of the session key by any instance U (use of a weak encryption scheme, leakage after use, etc). The query is only available to the adversary if the attacked instance actually "holds" a session key, and it releases the latter to the adversary.

Send(U, m): This query models sending a message m to instance U. The adversary gets back the response U generates in processing the message m according to the protocol P. A Send-query addressed to an A instance with a special start message initializes the key exchange algorithm, and thus the adversary receives the flow A should send out to B.

In the active scenario, the Execute-query may seem rather useless: after all, the Send-query already gives the adversary the ability to carry out honest executions of P among parties. However, even in the active scenario, Execute-queries are essential to properly deal with dictionary attacks. Actually, the number of Send-queries directly asked by the adversary does not take into account the number of Execute-queries. Therefore, the former represents the number of flows the adversary may have built by itself, and thus the number of passwords it may have tried. Even better, the number of A instances plus the number of B instances involved in the attack is an upper-bound on the number of passwords it may have tried. For the sake of simplicity, we restricted queries to A and B only. One can indeed easily extend the model, and the proof, to the more


general case, keeping in mind that we are interested in the security of executions

involving at least A or B, with the password pw shared by them. Additional

queries would indeed use distinct passwords, which could be assumed public in

the security analysis (known to our simulator).

3.2 Security Notions

Two main security notions have been defined for key exchange protocols. The

first one is the semantic security of the key, which means that the exchanged key

is unknown to anybody else than the players. The second one is unilateral or

mutual authentication, which means that either one, or both, of the participants

actually know the key.

AKE Security. The semantic security of the session key is modeled by an additional query Test(U). The Test-query can be asked at most once by the adversary and is only available if the attacked instance U is Fresh. The freshness notion captures the intuitive fact that a session key is not "obviously" known to the adversary. An instance is said to be Fresh if the instance has accepted (i.e. the flag accept is set to true) and neither it nor its partner (i.e. the other instance with the same session tag, or SID, which is defined as the view the player has of the protocol, the flows, before it accepts) have been asked a Reveal-query. The Test-query is answered as follows: one flips a (private) coin b and forwards sk (the value Reveal(U) would output) if b = 1, or a random value if b = 0.

We denote the AKE advantage as the probability that the adversary correctly guesses the value of b: more precisely, the advantage is twice this guessing probability minus one, where the probability space is over the password, all the random coins of the adversary and all the oracles, and where the guess is the bit output by the adversary for the bit b involved in the Test-query. The protocol P is said to be secure when this advantage is smaller than the claimed bound for any adversary running within the corresponding time.

Entity Authentication. Another goal of the adversary is to impersonate a party. We may consider unilateral authentication of either A (A-Auth) or B (B-Auth); we thus consider the probability that the adversary successfully impersonates an A instance (resp. a B instance) in an execution of P, which means that B (resp. A) terminates (i.e. the terminate flag is set to true) even though it does not actually share the key with any accepting partner A (resp. B).

A protocol P is said to be secure in this sense if the success probability for breaking either A-Auth or B-Auth is smaller than the claimed bound for any adversary running within the corresponding time. Such a protocol then provides mutual authentication.

4 Algorithmic Assumptions

In this section we state some algorithmic assumptions we need in order to con-

struct an IPAKE protocol. As already sketched in section 1.2, our basic building


block is a family of trapdoor hard-to-invert bijections. More precisely, each bijection needs to be a group isomorphism from a domain group into a range group, each equipped with its own internal law and the corresponding inverse operation.(3)

As an additional assumption, we require the existence of a generalized full-domain hash function which, on a new input, outputs a uniformly distributed element in the group. This is the reason why we need the decidability of the group: in practice, this hash function will be implemented by iterating a standard hash function until the output is in the group.
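The iterate-until-in-the-group idea can be sketched directly. The counter encoding and the choice of SHA-256 are assumptions of this sketch, not part of the text:

```python
import hashlib

def fdh(label: bytes, modulus: int, in_group) -> int:
    """Hash label into {1..modulus-1} and re-hash with an incrementing
    counter until the output satisfies the group-membership predicate."""
    counter = 0
    while True:
        digest = hashlib.sha256(label + counter.to_bytes(4, "big")).digest()
        candidate = int.from_bytes(digest, "big") % modulus
        if candidate != 0 and in_group(candidate):
            return candidate
        counter += 1
```

Each attempt succeeds with probability roughly the density of the group within the hash range, so the expected number of iterations is small whenever the group is dense.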

The non-invertibility of the functions in the family is measured by the "ability", for any adversary, of inverting a random function (in the family) on a random point, uniformly drawn from its range. More precisely, we consider the maximal success probability over all the adversaries running within time t. A simpler task for the adversary may be to output a list of elements which contains the solution: as above, we consider the maximal success probability over all the adversaries running within time t which output sets of a given size.

4.1 The RSA Family

As described in Section 2.4, the function is defined by a modulus n and an exponent e, and it maps any x to x^e mod n. For a correctly generated modulus and a valid exponent (i.e. an e such that gcd(e, phi(n)) = 1), the non-invertibility of the function is equivalent to the widely conjectured one-wayness of RSA. This leads to a tight reduction whose overhead is an upper-bound on the time required to perform an exponentiation.

4.2 The Diffie-Hellman Family

Let G be any cyclic group of (preferably) prime order q. As sketched in Section 2.4, the function is defined by a point X = g^x (and thus by the trapdoor x), the domain and range both being G. Consider any attacker against the computational Diffie-Hellman problem in the finite cyclic group of prime order q generated by g: it is a probabilistic machine running within some time bound whose success probability is that of computing g^(ab) from g^a and g^b,

3 For visual comfort, in the following we adopt lighter symbols for the two group laws (respectively) rather than fully indexed notations.


Fig. 1. An execution of the IPAKE protocol. Auth denotes the authenticator computed by Alice (resp. Bob), and sk the session key computed by Alice (resp. Bob), from their respective views of the execution.

where the probability is taken over the random values a and b. As usual, we denote the maximal success probability over every adversary running within a given time; then, when the group and the generator are fixed, the non-invertibility of this family is measured by exactly this quantity. Using Shoup's result [25] about "self-correcting Diffie-Hellman", one can see that outputting a polynomial-size list of candidates containing the Diffie-Hellman value cannot be noticeably easier than computing the value itself.

4.3 The Squaring Family

As discussed in Section 2.4, if one assumes that the modulus n is the product of two Blum primes, the signed squaring function becomes an isomorphism on the group of Jacobi-(+1) elements. Furthermore, for a correctly generated modulus, the non-invertibility of the function is trivially equivalent to the one-wayness of factoring Blum composites. This leads us to a reduction which provides a very tight bound because, in this case, the overhead represents the time required to perform a single modular multiplication (i.e. to square).
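The equivalence with factoring is the classical Rabin-style argument: an inverter that returns a random square root can be turned into a factoring algorithm, since gcd(x − y, n) is a nontrivial factor whenever y is a root of x^2 other than x or n − x. A toy sketch (the brute-force oracle below stands in for a hypothetical inverter):

```python
from math import gcd
import random

n = 7 * 11                          # toy Blum modulus (both primes are 3 mod 4)

def factor_with_root_oracle(root_oracle):
    """Each attempt succeeds with probability 1/2: for a random x, the
    returned root is in the "other" pair {y, n-y} half the time, and then
    gcd(x - y, n) is a nontrivial factor of n."""
    while True:
        x = random.randrange(1, n)
        if gcd(x, n) != 1:
            return gcd(x, n)        # lucky: x already shares a factor with n
        y = root_oracle(pow(x, 2, n))
        if y not in (x, n - x):
            return gcd(x - y, n)

def toy_oracle(square):
    """Brute-force square-root oracle (fine only for a tiny modulus)."""
    return next(z for z in range(1, n) if pow(z, 2, n) == square)
```

This is why the bound is so tight: the reduction's only real cost per attempt is one modular squaring.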

5 Security Proof for the IPAKE Protocol

5.1 Description and Notations

In this section we show that the IPAKE protocol distributes session keys that

are semantically secure and provides unilateral authentication for the client A.


The specification of the protocol can be found in Figure 1. Some remarks about notation are in order.

We assume we are given a correct family, with a verifiable sub-family of trapdoor hard-to-invert isomorphisms from the domain group into the range group; in the following, we identify the two groups whenever convenient. We denote the size of the index set I, and we use a lower bound on the size of any group of the family. For this choice of parameters we can define the hash function which is assumed to behave like a generalized full-domain random oracle: on input a couple, it outputs a random element, uniformly distributed in the group. Since we only consider unilateral authentication (of A to B), we just introduce a terminate flag for B.

5.2 Security Proof

Theorem 1 (AKE/UA Security). Let us consider the protocol IPAKE, over a family of trapdoor hard-to-invert isomorphisms, where Password is a dictionary equipped with the distribution D. Consider any adversary running within a time bound t, with less than a given number of active interactions with the parties (Send-queries) and of passive eavesdroppings (Execute-queries), and asking a given number of hash queries to each random oracle. Then both the success probability against unilateral authentication and the AKE advantage are upper-bounded by a main term, counting the most probable passwords once per instance involved in active interactions, plus negligible terms. Here the numbers of A and B instances involved during the attack are each upper-bounded by the number of active interactions, and the reductions account for the time needed for evaluating one law operation. Let us remind that the output length of the authenticator also enters the bound.

For lack of space, we refer to the full version [8] for the full proof; here we justify the main terms in the security result.

Ideally, when one considers a password-based authenticated key exchange, one would like to prove that the two above success/advantage quantities are upper-bounded by the password-guessing probability over the active sessions, plus some negligible terms. For technical reasons in the proof (to get a clean proof) we have a small additional constant factor. This main term is indeed the basic attack one cannot avoid: the adversary guesses a password and makes an on-line trial. Other ways for it to break the protocol are:

– use a function that is not a permutation, and in particular not a surjection. With the view of the transcript, the adversary tries all the passwords, and only a strict fraction leads to a value in the image of the function: this is a partition attack. But for that, it has to forge a proof of validity for the index; hence a term bounded by the soundness of the proof of membership.


– use the authenticator Auth to check the correct password; but this requires the ability to compute the function on a random point without the trapdoor; hence a term bounded by the non-invertibility success probability.
– send a correct authenticator Auth, but being lucky; hence a term accounting for guessing the authenticator at random.

Additional negligible terms come from very unlikely collisions. All the remaining

kinds of attacks need some information about the password.

6 A Concrete Example: The SQRT-IPAKE Protocol

An important contribution of this work (at least from a practical point of view)

is the first efficient and provably secure password-based key exchange protocol

based on factoring. The formal protocol appears in Figure 2. Here we describe

the details of this specific implementation.

Fig. 2. The SQRT-IPAKE protocol

6.1 Description of the SQRT-IPAKE Protocol

In order for the protocol to be correct we need to make sure that the adopted function is actually an isomorphism. As seen in Section 2.4, this is the case if one assumes that the modulus n is the product of two Blum primes, and the function is the signed squaring function.


We thus set both the domain and the range to be the group of Jacobi-(+1) elements described in Section 2.4 and, of course, the internal law is the multiplication in the group. In order for the password PW to be generated correctly, we need a hash function onto this group. Constructing such a function is pretty easy: we start from a standard hash function and we iterate it until we get an output in the group. The details of this technique are deferred to the full version of this paper [8]. Here we stress that if the modulus is close to a power of two, then very few iterations are sufficient. As already noticed, we require Alice to prove the following about the modulus n, so that the function is actually an isomorphism:

– The modulus n is in the correct range;
– The Jacobi symbol of −1 is +1 in Z_n* (this is to make sure that the function is actually a morphism);
– The signed squaring function is actually an isomorphism from the group onto itself (this is to make sure that any square in Z_n* has exactly 4 roots).
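The first two statements indeed admit direct checks; a sketch of the verifier's side follows, with the exact range bounds taken as an assumed parameter. For odd n the Jacobi symbol of −1 equals (−1)^((n−1)/2), so the second statement reduces to n being congruent to 1 modulo 4 (which every Blum modulus satisfies):

```python
def check_modulus(n: int, k: int) -> bool:
    """Cheap public checks on the claimed modulus (range bound assumed)."""
    if not (2 ** (k - 1) < n < 2 ** k):   # statement 1: correct range
        return False
    if n % 2 == 0:                        # an even n is trivially invalid
        return False
    # statement 2: Jacobi(-1/n) = (-1)^((n-1)/2) = +1 iff n = 1 mod 4
    return n % 4 == 1
```

The third statement, surjectivity of the signed squaring, is exactly what the interactive Prove-Surjective protocol of Figure 3 establishes.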

Proving the first two statements is trivial. For the third one we need some new

machinery.

6.2 Proof of Correct Modulus

With the following theorem (whose proof can be found in the full version of this

paper [8]) we show that if is a composite modulus (with at least two different

prime factors) then the proposed function is an isomorphism.

Theorem 2. Let be a composite modulus containing at least two different

prime factors and such that “1 has Jacobi symbol +1 in Moreover let be

the morphism defined above. The following facts are true

1. If is surjective then it is an isomorphism.

2. If is not surjective, then at most half of the elements in have a pre-

image.

The theorem above leads to the protocol Prove-Surjective (see Figure 3). The

basic idea of this protocol is that we prove that our function is a bijection by

proving it is surjective. Soundness follows from the second statement. However,

in order to fall into the hypotheses of the theorem, we need to make sure that n is actually a composite modulus of the required form (i.e. with at least two

distinct prime factors). We achieve this with the Prove-Composite protocol

(see Figure 3). The correctness (completeness, soundness and zero-knowledge

properties) of these protocols is deferred to the full version of this paper [8].

Remark 3. We point out that our protocol is very efficient, for the verifier, in

terms of modular multiplications. It is also possible for Alice to use the same

modulus for different sessions.

Acknowledgments

We thank the anonymous referees for their fruitful comments.


Fig. 3. Proof of Correct Modulus

References

1. M. Bellare, D. Pointcheval, and P. Rogaway. Authenticated Key Exchange Secure Against Dictionary Attacks. In Eurocrypt '00, LNCS 1807, pages 139–155. Springer-Verlag, Berlin, 2000.
2. M. Bellare and P. Rogaway. The AuthA Protocol for Password-Based Authenticated Key Exchange. Contributions to IEEE P1363. March 2000.
3. S. M. Bellovin and M. Merritt. Encrypted Key Exchange: Password-Based Protocols Secure against Dictionary Attacks. In Proc. of the Symposium on Security and Privacy, pages 72–84. IEEE, 1992.
4. C. Boyd, P. Montague, and K. Nguyen. Elliptic Curve Based Password Authenticated Key Exchange Protocols. In ACISP '01, LNCS 2119, pages 487–501. Springer-Verlag, Berlin, 2001.
5. V. Boyko, P. MacKenzie, and S. Patel. Provably Secure Password Authenticated Key Exchange Using Diffie-Hellman. In Eurocrypt '00, LNCS 1807, pages 156–171. Springer-Verlag, Berlin, 2000. Full version available at: http://cm.bell-labs.com/who/philmac/research/.
6. E. Bresson, O. Chevassut, and D. Pointcheval. Security Proofs for Efficient Password-Based Key Exchange. In Proc. of the 10th CCS, pages 241–250. ACM Press, New York, 2003.
7. E. Bresson, O. Chevassut, and D. Pointcheval. New Security Results on Encrypted Key Exchange. In PKC '04, LNCS, pages 145–159. Springer-Verlag, Berlin, 2004.
8. D. Catalano, D. Pointcheval, and T. Pornin. IPAKE: Isomorphisms for Password-based Authenticated Key Exchange. In Crypto '04, LNCS. Springer-Verlag, Berlin, 2004. Full version available from http://www.di.ens.fr/users/pointche/.
9. W. Diffie and M. E. Hellman. New Directions in Cryptography. IEEE Transactions on Information Theory, IT-22(6):644–654, November 1976.
10. Y. Dodis, J. Katz, S. Xu, and M. Yung. Strong Key-Insulated Signature Schemes. In PKC '03, LNCS, pages 130–144. Springer-Verlag, Berlin, 2003.
11. D. Dolev, C. Dwork, and M. Naor. Non-Malleable Cryptography. SIAM Journal on Computing, 30(2):391–437, 2000.
12. T. El Gamal. A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms. IEEE Transactions on Information Theory, IT-31(4):469–472, July 1985.
13. R. Gennaro and Y. Lindell. A Framework for Password-Based Authenticated Key Exchange. In Eurocrypt '03, LNCS 2656, pages 524–543. Springer-Verlag, Berlin, 2003.
14. O. Goldreich and Y. Lindell. Session-Key Generation Using Human Passwords Only. In Crypto '01, LNCS 2139, pages 408–432. Springer-Verlag, Berlin, 2001.
15. S. Halevi and H. Krawczyk. Public-Key Cryptography and Password Protocols. In Proc. of the 5th CCS. ACM Press, New York, 1998.
16. J. Katz, R. Ostrovsky, and M. Yung. Efficient Password-Authenticated Key Exchange Using Human-Memorizable Passwords. In Eurocrypt '01, LNCS 2045, pages 475–494. Springer-Verlag, Berlin, 2001.
17. S. Lucks. Open Key Exchange: How to Defeat Dictionary Attacks Without Encrypting Public Keys. In Proc. of the Security Protocols Workshop, LNCS 1361. Springer-Verlag, Berlin, 1997.
18. P. MacKenzie, S. Patel, and R. Swaminathan. Password-Authenticated Key Exchange Based on RSA. In Asiacrypt '00, LNCS 1976, pages 599–613. Springer-Verlag, Berlin, 2000.
19. P. MacKenzie and R. Swaminathan. Secure Network Authentication with Password Identification. Submission to IEEE P1363a. August 1999.
20. T. Okamoto and S. Uchiyama. A New Public Key Cryptosystem as Secure as Factoring. In Eurocrypt '98, LNCS 1403, pages 308–318. Springer-Verlag, Berlin, 1998.
21. P. Paillier. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In Eurocrypt '99, LNCS 1592, pages 223–238. Springer-Verlag, Berlin, 1999.
22. M. O. Rabin. Digitalized Signatures. In R. Lipton and R. De Millo, editors, Foundations of Secure Computation, pages 155–166. Academic Press, New York, 1978.
23. R. Rivest, A. Shamir, and L. Adleman. A Method for Obtaining Digital Signatures and Public Key Cryptosystems. Communications of the ACM, 21(2):120–126, February 1978.
24. A. Sahai. Non-Malleable Non-Interactive Zero-Knowledge and Chosen-Ciphertext Security. In Proc. of the 40th FOCS. IEEE, New York, 1999.
25. V. Shoup. Lower Bounds for Discrete Logarithms and Related Problems. In Eurocrypt '97, LNCS 1233, pages 256–266. Springer-Verlag, Berlin, 1997.
26. F. Zhu, A. H. Chan, D. S. Wong, and R. Ye. Password Authenticated Key Exchange Based on RSA for Imbalanced Wireless Networks. In Proc. of ISC '02, LNCS 2433, pages 150–161. Springer-Verlag, Berlin, 2002.


Randomness Extraction and Key Derivation

Using the CBC, Cascade and HMAC Modes*

Yevgeniy Dodis1, Rosario Gennaro2, Johan Håstad3,

Hugo Krawczyk4, and Tal Rabin2

1 New York University, dodis@cs.nyu.edu
2 IBM Research, {rosario,talr}@watson.ibm.com
3 Royal Institute, Sweden, johanh@nada.kth.se
4 Technion, Israel, and IBM Research, hugo@ee.technion.ac.il

Abstract. We study the suitability of common pseudorandomness

modes associated with cryptographic hash functions and block ciphers

(CBC-MAC, Cascade and HMAC) for the task of "randomness extraction", namely, the derivation of keying material from semi-secret and/or

semi-random sources. Important applications for such extractors include

the derivation of strong cryptographic keys from non-uniform sources of

randomness (for example, to extract a seed for a pseudorandom genera-

tor from a weak source of physical or digital noise), and the derivation

of pseudorandom keys from a Diffie-Hellman value.

Extractors are closely related in their applications to pseudorandom

functions and thus it is attractive to (re)use the common pseudoran-

dom modes as randomness extractors. Yet, the crucial difference between

pseudorandom generation and randomness extraction is that the former

uses random secret keys while the latter uses random but known keys. We

show that under a variety of assumptions on the underlying primitives

(block ciphers and compression functions), ranging from ideal random-

ness assumptions to realistic universal-hashing properties, these modes

induce good extractors. Hence, these schemes represent a more practical

alternative to combinatorial extractors (that are seldom used in practice), and a better-analyzed alternative to the common practice of using

SHA-1 or MD5 (as a single un-keyed function) for randomness extraction.

In particular, our results serve to validate the method of key extraction

and key derivation from Diffie-Hellman values used in the IKE (IPsec's

Key Exchange) protocol.

1 Introduction

1.1 Key Derivation and Randomness Extractors

Key derivation is a central functionality in cryptography concerned with the

process of deriving secret and random cryptographic keys from some source of

* Extended abstract. Full version available at eprint.iacr.org/2004/

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 494–510, 2004.
© International Association for Cryptologic Research 2004


Randomness Extraction and Key Derivation 495

semi-secret randomness. In general, it is sufficient to derive a single random and

secret key (say of length 128) which can then be used to key a pseudorandom

function (or a pseudorandom generator) to obtain further pseudorandom keys

as needed. Thus, a basic question, which motivates the work presented here,

is how to derive such a random and secret key when all that is given is an

imperfect source of randomness which contains some good amount of secret

(computational) entropy, but this entropy is not presented in a direct form of

uniformly (or pseudorandomly) distributed secret bits. This problem arises in

a variety of scenarios such as when deriving keys from a non-uniform source of

noise (as used, for example, by physical random generators) or from semi-random

data (say, coming from user's input or the sampling of computer events, etc.).

This is also the case when deriving keys from a Diffie-Hellman (DH) exchange.

Let us elaborate on the latter case.

Let's assume that two parties run a DH protocol in order to agree on a shared secret key, namely, they exchange the DH exponentials g^x and g^y and compute the DH value g^xy. In this case, g^x and g^y as seen by the attacker fully determine g^xy. Yet, it is assumed (by the Decisional Diffie-Hellman, DDH, assumption) that a computationally-bounded attacker cannot distinguish g^xy from a random element in the group generated by g. Thus, one can assume that g^xy contains, relative to the view of the attacker, as many bits of computational entropy as the size of the group allows (for a formal treatment of computational entropy in the DH context see [GKR04]). However, this entropy is spread over the whole value g^xy, which may be significantly longer.(1) Thus, we are in a situation similar to that of an imperfect source of randomness as discussed above. In particular, g^xy cannot be used directly as a cryptographic key, but rather as a source from which to extract a shorter string (say, of length 128) of full computational entropy, which can then be used as a cryptographic key.

The tools used to derive a uniform key from these sources of imperfect ran-

domness are often referred to as randomness extractors. The amount of theo-

retical results in this area is impressive; moreover, some of the constructions

that have proven extraction guarantees are also efficient (see [Sha02] for a recent

survey). One such example is the so called “pairwise independent universal hash

functions” (also called “strongly universal”) [CW79] which have quite efficient

implementations and provable extraction properties. In particular, [HILL99]

shows (see also [Lub96,Gol01]) that if an input distribution has sufficient min-

entropy (meaning that no single value is assigned a too-large probability even

though the distribution may be far from uniform) then hashing this input into a

(sufficiently) shorter output using a function chosen at random from a family of

strongly universal hash functions results in an output that is statistically-close

to uniform. (This result is often referred to as the “Leftover Hash Lemma”.)
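The Leftover Hash Lemma extraction step can be sketched in a few lines. The family h_{a,b}(x) = ((a·x + b) mod p) mod 2^K below is a standard (approximately) pairwise-independent construction; the particular prime, input size, and output length are illustrative choices, not values from the text:

```python
import secrets

# (Almost) pairwise-independent family over Z_p, truncated to K output bits:
# h_{a,b}(x) = ((a*x + b) mod P) mod 2**K.  P and K are illustrative.
P = (1 << 127) - 1          # a Mersenne prime, large enough for ~126-bit inputs
K = 64                      # output length in bits

def sample_key():
    # The hash key (a, b) is chosen at random but need NOT be kept secret.
    a = secrets.randbelow(P - 1) + 1    # a != 0
    b = secrets.randbelow(P)
    return (a, b)

def extract(key, x):
    a, b = key
    return ((a * x + b) % P) % (1 << K)

# Usage: hash a high-entropy (but possibly non-uniform) sample down to 64 bits.
key = sample_key()
sample = secrets.randbits(120)          # stand-in for an imperfect source
out = extract(key, sample)
assert 0 <= out < 1 << K
```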

¹ For example, consider that g is an element of prime order q in Z_p* (i.e., p and q are primes and q divides p − 1) and that |q| = 512 and |p| = 1024. In this case the DDH assumption guarantees that the value g^xy hides (from the attacker) 512 bits of computational entropy, yet these bits are spread in some unknown way among the 1024 bits of g^xy.


496 Yevgeniy Dodis et al.

Yet, in spite of these results and an extensive literature studying their ap-

plication to real cryptographic problems (such as those mentioned earlier, and

in particular for the DH case [GKR04]) one seldom encounters in practice the

use of strong universal hashing or other proven extractors. Instead, the common

practice is to use cryptographic hash functions (such as MD5 and SHA-1) for

the purpose of randomness extraction. A main reason for this practice, justified

by engineering considerations, is that cryptographic hash functions are readily

available in software and hardware implementations, and are required by most

cryptographic applications for purposes other than randomness extraction (e.g.,

as pseudorandom functions). Therefore, it is attractive and convenient to use

them for key extraction as well. Also, the common perception that these hash functions behave as random functions (formalized via the notion of "random oracles") makes them intuitively appealing for extraction applications.

1.2 Randomness Extraction via Common Chaining Modes

In this paper we attempt to bridge between the world of provable extraction

and the common practice of relying on idealized hash functions. The question

that we ask is what is the best way to use cryptographic hash functions, or

other widely available cryptographic tools such as block ciphers, for the task of

randomness extraction. Specifically, we consider three common modes of oper-

ation: CBC chaining, cascade (or Merkle-Damgård) chaining, and HMAC, and

analyze the appropriateness of these modes as extraction tools. Since the goal is

to provide as general (and generic) as possible results, we do not investigate the

extraction properties of specific functions (say SHA-1 or AES) but rather ab-

stract the basic primitives (the compression functions in the case of the cascade

and HMAC modes, and block ciphers in the case of CBC), as random functions

or permutations².

Before going on with the description of our results, it is worth considering

the following issue: given that the common practice is to extract randomness using a hash function modeled as a random oracle, how much do we gain by analyzing the above modes under the weaker, but still idealized, randomness assumption on the underlying basic primitives? There are several aspects to this question.

The first thing to note is that modeling the compression function of SHA-

1, for example, as a random function, or as a family of random functions, is a

strict relaxation to modeling SHA-1 (as a single un-keyed function) as a ran-

dom function. This is easily seen from the fact that even if one starts with a

random function as the compression function the result of the cascade chain-

ing (which is how SHA-1 is derived) is not a random function. (For example,

in the cascade construction, the probability that two L-block inputs that differ only in their first block are mapped to the same output is roughly L·2^{−k} (k being the output length), while for

2

In the case of HMAC we obtain results based on non-ideal assumptions on the

underlying basic primitives (see Section 1.3).


a random function this probability is 2^{−k}.) Another important point is that

cryptographic design work focuses on building the basic blocks, i.e. a compres-

sion function or a block cipher. Thus, making an assumption on these primitives

will represent the design goals which there can be an attempt to satisfy. Also

analysis of the type presented here, rather than implying the security of any spe-

cific implementation of these functions, serves to validate the suitability of the

corresponding chaining modes for some defined goal (in our case the goal is ran-

domness extraction). Indeed, the common approach for analyzing such modes

(e.g., [Dam89,BKR94,BCK96a,BCK96b]) is to make some assumption on the

basic primitive (for example, assuming the underlying compression function to

be a pseudorandom function, or a secure MAC, or a collision-resistant hash func-

tion) and then proving that these or other properties are preserved or implied

by the chaining operation.

In addition, the “monolithic” randomness assumption on a single (unkeyed)

function such as SHA-1 is inappropriate for the setting of randomness extraction

as no single function (even if fully random) can extract a close-to-uniform distri-

bution from arbitrary high-entropy input distributions. This is so, since once the

function is fixed (even if to purely random values) then there are high-entropy

input distributions that will be mapped to small subsets of outputs³. Therefore,

the viable approach for randomness extraction is to consider a family (or col-

lection) of functions indexed by a set of keys. When an application requires the

hashing of an input for the purpose of extracting randomness, then a random

element (i.e., a function) from this family is chosen and the function is applied

to the given input. While there may be specific input distributions that interact

badly with specific functions in the family, a good randomness-extraction family

will make this “bad event” happen with very small probability. Universal hash

families, mentioned before, are examples of this approach. An important point

here is that, while the choice of a function from the family is done by selecting

a random index, or key, this key does not need to be kept secret (this is im-

portant in applications that use extraction to generate secret keys; otherwise,

if we required this index to be secret then we would have a “chicken and egg”

problem).

In our setting, families of keyed functions come up naturally with block ci-

phers and compression functions (for the latter we consider, as in HMAC, the

variable IV as the key to the function). These functions are defined on fixed

length inputs (e.g., 512 bits in the case of compression function of SHA-1, or

128 in the case of AES). Then, to hash arbitrarily long inputs, we extend these

families by the appropriate chaining mode: cascade chaining (or HMAC) for com-

pression functions, and CBC-MAC in the case of block ciphers. What makes the

analysis of these functions challenging (in the setting of randomness extraction)

is that, as discussed before, the key to the function is random but known. For ex-

³ For example, let F be a random function from n bits to n bits and let S denote the subset of {0,1}^n that is mapped by F to outputs with a low-order bit of zero. If we consider the uniform distribution on S as the input distribution, then this distribution has almost full entropy, yet the output of F on S is trivially distinguishable from uniform.
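The footnote's counterexample is easy to simulate; this toy sketch uses a 12-bit domain (an illustrative size) and checks that the bad input set S defeats the fixed function:

```python
import random

random.seed(1)
n = 12                                   # toy domain: 2**12 inputs
F = [random.getrandbits(n) for _ in range(1 << n)]   # a fixed "random function"

# S: the inputs mapped by F to outputs whose low-order bit is 0.
S = [x for x in range(1 << n) if F[x] & 1 == 0]

# Uniform on S has almost full min-entropy (|S| is about half the domain) ...
assert len(S) > (1 << n) // 3

# ... yet F's output on S always has low bit 0: trivially far from uniform.
assert all(F[x] & 1 == 0 for x in S)
```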


ample, the fact that the above functions are widely believed to be pseudorandom

does not help much here since, once the key is revealed, the pseudorandom prop-

erties may be lost completely (see full paper). Yet, as we will see in Section 4.2,

we do use the pseudorandom property in some of our analysis. Also worth not-

ing is that using families that are pseudorandom for extraction is particularly

convenient since these same functions can then be used by the same application

(for example, a key-exchange protocol, a random generator, etc.) for further key

derivation (using the extracted key to key the pseudorandom function).

The last question is how to generate the random known keys used by the

extractor. Technically this is not hard, as the parties can generate the appropri-

ate randomness, but the exact details depend on the application. For example,

in the DH key exchange discussed earlier, the parties exchange in the clear ran-

domly chosen values, which are then combined to generate a single key for the

extractor family κ (e.g., for HMAC-SHA1). The shared key is then set to H_κ(g^xy).

We note that this is substantially the procedure in place in the IKE protocol

[RFC2409,IKEv2] (see also [Kra03]), and this paper presents the first formal

analysis of this design.
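A minimal sketch of this derivation in the style of IKE's SKEYSEED computation, with HMAC-SHA1 as the extractor family; the nonce lengths and the 1024-bit stand-in for the DH value are illustrative assumptions:

```python
import hashlib
import hmac
import secrets

# Hypothetical nonces exchanged in the clear, and a stand-in DH value g^xy.
Ni, Nr = secrets.token_bytes(16), secrets.token_bytes(16)
g_xy = secrets.token_bytes(128)          # 1024-bit DH value (illustrative)

# Extractor key = the public nonces; HMAC-SHA1 plays the role of the
# randomness-extraction family, in the style of IKE's SKEYSEED = prf(Ni | Nr, g^xy).
extractor_key = Ni + Nr
shared_key = hmac.new(extractor_key, g_xy, hashlib.sha1).digest()
assert len(shared_key) == 20             # a 160-bit extracted key
```

Note that the extractor key is public: only the DH value itself carries the secret entropy.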

A similar DH key extraction step is required in non-interactive scenarios,

such as ElGamal or Cramer-Shoup encryption. There the extractor key can

be chosen either by the encryptor and appended to the ciphertext, or chosen by

the decryptor and included in the public key (this choice is mandatory in case

we want CCA-security, as we don't want to leave the choice of the extractor key in the hands

of the adversary). For a different example, consider a cryptographic hardware

device, containing a physical random generator that samples some imperfect

source of noise. In this case the application can choose a random hash function

in the family and wire-in its key into the randomness generation circuit [BST03].

Notice that by using our results, it will be possible to perform the extraction

step using circuitry (such as a block-cipher or a cryptographic hash function)

which is very likely to already be part of the device.

1.3 Our Results

The Extraction Properties of CBC-MAC Mode. We show, in Section 3, that if π is a random permutation over {0,1}^n and X is an input distribution with sufficiently high min-entropy, then the statistical distance between F(X) (where F represents the function computed in CBC-MAC mode over L blocks) and the uniform distribution on {0,1}^n is negligible. As an example, in the application (discussed before) in which we use the CBC-MAC function F to hash a Diffie-Hellman value computed over a DDH group of sufficiently large order, we get that the output distribution is computationally indistinguishable from a distribution that is negligibly far from uniform, hence proving (under DDH) that the output is computationally indistinguishable from uniform (and thus suitable for use as a cryptographic key). Note that if one works over Z_p* for a 1024-bit prime p, then all we need to assume is a min-entropy of 256 out of the 1024 bits of g^xy. In the full paper we show that


for input distributions with particularly high entropy (in particular those that

contain a block of almost-full entropy) the CBC-MAC mode guarantees an

almost-uniform output for any family of permutations.

The Extraction Properties of Cascade Chaining. In Section 4 we study the cas-

cade (or Merkle-Damgård) chaining used in common hash functions such as

MD5 and SHA-1. We show these families to be good extractors when modeling

the underlying compression function as a family of random functions. However,

in this case we need a stronger assumption on the entropy of the input dis-

tribution. Specifically, if the output of the compression function is k bits long (typically, k = 128 or 160) we assume a min-entropy of 2k over the whole input,

and “enough” min-entropy over the distribution induced on the last block of

input (typically of length 512 bits). For example, if the last block has sufficiently many bits of min-entropy, and we assume L blocks, then the output of the cascade construction is statistically close to the uniform distribution on {0,1}^k. We note that the above restriction on the last-block distribu-

tion is particularly problematic in the case of practical functions such as MD5

and SHA since the input-padding conventions of these functions may cause a

full fixed block to be added as the last block of input. In this case, the output

distribution is provably far from uniform. Fortunately, we show that our anal-

ysis is applicable also to the padded-input case. However, instead of proving a

negligible statistical distance, what we show is that the output of the “padded

cascade” is computationally indistinguishable from uniform, a result that suffices

for the cryptographic applications of extraction. Finally, we prove that when ev-

ery block of input has large-enough min-entropy (conditioned on the distribution

of previous blocks), then the above extraction results hold under the sole (and

non-ideal) assumption that the underlying compression function is a δ-universal family of functions (for a sufficiently small δ).

The Extraction Properties of HMAC. HMAC is the most widely used pseudoran-

dom mode based on functions such as MD5 or SHA, thus proving its extraction

properties is extremely important. Our main result concerning the good extrac-

tion properties of HMAC is proven on the basis of a high min-entropy (sufficiently many bits) in the input distribution, without relying on any particular entropy in the

last block of input. Specifically, let F denote the keyed hash function underlying

an instantiation of HMAC (e.g., F is SHA-1 with a random IV) and let h be the corresponding outer compression function. Then we show that if F is collision resistant and h is modeled as a random function, then the output of HMAC (on input drawn from the distribution X) is indistinguishable from uniform for any attacker that is restricted in the number of queries to the function h. Moreover,

if the compression function itself is a good extractor, then HMAC is a good

extractor too. However, in this latter case, if we are interested in an output of k close-to-uniform bits, then the key to the underlying compression function needs to be sufficiently larger than k. As a concrete example, if k = 160 (e.g., we need to generate a pseudorandom key of length 160) then we can use HMAC with

SHA2-512. Note that this result is particularly interesting in the sense that it


uses no idealized assumptions, and yet the output of HMAC is provably close to

uniform (even against completely unbounded attackers, including attackers that

can break the collision resistance of F).

Remark (Pseudorandom functions with known keys). It is tempting to use the

pseudorandomness properties enjoyed by the modes studied here as a basis to

claim good (computational) extraction properties. For example, in spite of the

fact that the output of these functions may be statistically very far from uniform,

it is still true that no (efficient) standard statistical test will be able to tell apart

this output from random (simply because such a test does not use the knowledge

of the key, even if this key is known). Yet, for cryptographic applications, using a

family of functions as extractors, based solely on the assumption that the family

is pseudorandom, may be totally insecure. We illustrate this point by showing,

in the full paper, an example of a secure pseudorandom family whose output is

trivially distinguishable from randomness once the key is known.
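The counterexample in the full paper is not reproduced here, but the folklore construction below illustrates the same point: take any PRF F and define G_k to misbehave only on the input x = k, which a key-oblivious attacker essentially never queries, while a key-aware distinguisher hits it immediately (HMAC-SHA256 is only a stand-in PRF):

```python
import hashlib
import hmac
import os

BLK = 16

def F(key, x):
    # Underlying stand-in PRF (HMAC-SHA256, truncated to BLK bytes).
    return hmac.new(key, x, hashlib.sha256).digest()[:BLK]

def G(key, x):
    # Still a PRF against attackers who do NOT know the key: they query
    # x == key only with negligible probability.
    return b"\x00" * BLK if x == key else F(key, x)

key = os.urandom(BLK)
# A distinguisher that KNOWS the key wins with a single query ...
assert G(key, key) == b"\x00" * BLK
# ... while on all other inputs G agrees with the stand-in PRF F.
assert G(key, b"A" * BLK) == F(key, b"A" * BLK)
```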

All proofs appear in the full version of the paper.

2 Universal Hashing and Randomness Extraction

Preliminaries. For a probability distribution X we use the notation x ← X to mean that x is chosen according to the distribution X. For a set S, x ← S is used to mean that x is chosen from S with uniform probability. Also, for a probability distribution X we use the notation Pr_X[x] to denote the probability assigned by X to the value x. (We often omit the subscript X when the probability distribution is clear from the context.) Throughout the paper, we will use d(L) to denote the maximal number of divisors of any number smaller than or equal to L. As a very crude upper bound, we will sometimes use the fact that d(L) ≤ L.

MIN-ENTROPY AND COLLISION PROBABILITY. For a probability distribution X over {0,1}^n, we define its min-entropy as the maximum integer m such that Pr_X[x] ≤ 2^{−m} for all x. We denote the min-entropy of such X by H∞(X). The collision probability of X is Col(X) = Σ_x Pr_X[x]², and the Renyi (or collision) entropy of X is H₂(X) = −log₂ Col(X). It is easy to see that these two notions of entropy are related: H∞(X) ≤ H₂(X) ≤ 2·H∞(X). In particular, we will frequently use the fact that Col(X) ≤ 2^{−H∞(X)}.
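These quantities are straightforward to compute for explicit distributions; the sketch below also checks the relations H∞(X) ≤ H₂(X) ≤ 2·H∞(X) and Col(X) ≤ 2^{−H∞(X)} on an example distribution:

```python
import math

def min_entropy(probs):
    # H_inf(X) = -log2(max_x Pr[x])
    return -math.log2(max(probs))

def collision_prob(probs):
    # Col(X) = sum_x Pr[x]^2
    return sum(p * p for p in probs)

def renyi_entropy(probs):
    # H_2(X) = -log2(Col(X))
    return -math.log2(collision_prob(probs))

# A skewed distribution over 4 values:
D = [0.5, 0.25, 0.125, 0.125]
h_inf, h2 = min_entropy(D), renyi_entropy(D)
assert h_inf <= h2 <= 2 * h_inf              # H_inf <= H_2 <= 2*H_inf
assert collision_prob(D) <= 2 ** (-h_inf)    # Col(X) <= 2^{-H_inf(X)}
```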

STATISTICAL DISTANCE. Let X, Y be two probability distributions over the set S. The statistical distance between the distributions X and Y is defined as SD(X, Y) = ½ Σ_{s∈S} |Pr_X[s] − Pr_Y[s]|. If two distributions have statistical distance of (at most) δ then we refer to them as δ-close. We note that δ-close distributions cannot be distinguished with probability better than δ, even by a computationally unbounded adversary. It is also well known that if X has support on some set S and U is the uniform distribution over this set, then SD(X, U) ≤ ½ √(|S| · Col(X) − 1). (1)
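For explicit (small-support) distributions, the statistical distance can be computed directly from the definition:

```python
def stat_dist(P, Q):
    # SD(P, Q) = (1/2) * sum over s of |P(s) - Q(s)|,
    # with distributions given as {value: probability} dicts.
    support = set(P) | set(Q)
    return sum(abs(P.get(s, 0) - Q.get(s, 0)) for s in support) / 2

# A biased bit vs. a uniform bit:
P = {0: 0.75, 1: 0.25}
U = {0: 0.5, 1: 0.5}
assert stat_dist(P, U) == 0.25
assert stat_dist(P, P) == 0.0
```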


Definition 1. Let n and k be integers, and let H be a family of hash functions with domain {0,1}^n, range {0,1}^k and key space K. We say that the family H is δ-universal if for every pair of different inputs x, y from {0,1}^n it holds that Pr[h(x) = h(y)] ≤ δ, where the probability is taken over h ← H. For a given probability distribution X on {0,1}^n, we say that H is δ-universal w.r.t. X if Pr[h(x) = h(y)] ≤ δ, where the probability is taken over h ← H and x, y ← X, conditioned to x ≠ y. Clearly, a family is δ-universal if it is δ-universal w.r.t. all distributions on {0,1}^n.

The notion of universal hashing originates with the seminal papers by Carter

and Wegman [CW79,WC81]; the variant used here was first formulated in

[Sti94]. The main usefulness of this notion comes from the following lemma whose

proof is immediately obtained by conditioning on whether the two independent

samples from collide or not (below E denotes the expected value).

Lemma 1. If H is δ-universal w.r.t. X, then E_{h←H}[Col(h(X))] ≤ Col(X) + δ.

Now, using the above lemma and Eq. (1), the lemma below extends the

well-known “Leftover Hash Lemma” (LHL) from [HILL99] in two ways. First,

it relaxes the pairwise-independence condition assumed by that lemma on the

family of hash functions, and allows for “imperfect” families in which the collision

probability is only close to perfect (i.e., 2^{−k} + δ instead of exactly 2^{−k}). Second, it

allows for the collision probability to depend on the input distribution rather

than being an absolute property of the family of hash functions. We use these

straightforward extensions of the LHL in an essential way for achieving our

results. We also note that the standard LHL can be obtained from Lemma 2

below by setting δ = 0.

Lemma 2. Let n and k be integers, let X be a probability distribution over {0,1}^n, and let H be a family of hash functions with domain {0,1}^n and range {0,1}^k. If H is δ-universal w.r.t. X, U is uniform over {0,1}^k, and h is uniform over H, then SD((h, h(X)), (h, U)) ≤ ½ √(2^k (Col(X) + δ) − 1).

Remark 1. It is important to note that for the above lemma to be useful one needs δ ≪ 2^{−k}, or otherwise the derived bound on the statistical closeness approaches 1. Moreover, this fact is not a result of a sub-optimal analysis, but rather there are examples of 2^{−k+1}-universal families that generate outputs that are easily distinguishable from uniform. For example, if H is a family of pairwise independent hash functions with k-bit outputs, and we define a new family H' which is identical to H except that it replaces the last bit of output with 0, then the new family has collision probability of 2^{−k+1}, yet its output (which has a fixed bit of output) is trivially distinguishable from uniform. The fact that we need δ ≪ 2^{−k} makes the analysis of CBC and the cascade construction presented in the next sections non-trivial. In particular,


the existing analyses of these functions (such as [BKR94,BCK96b]) are too weak

for our purposes, as they yield upper bounds on the collision probability of these constructions that are significantly larger than 2^{−k}.
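The fixed-bit family of Remark 1 can be seen concretely in a toy sketch; the small modular family below stands in for a pairwise-independent one (all parameters are illustrative):

```python
import random

P, K = 101, 4                       # toy prime modulus and a 4-bit output

def h(a, b, x):
    # A small pairwise-independent-style family over Z_P, truncated to K bits.
    return ((a * x + b) % P) % (1 << K)

def h_bad(a, b, x):
    # The same family with the last output bit zeroed: collision probability
    # roughly doubles, and the output is trivially non-uniform.
    return h(a, b, x) & ~1

# Every output of the modified family has low bit 0.
random.seed(0)
for _ in range(100):
    a, b, x = random.randrange(1, P), random.randrange(P), random.randrange(P)
    assert h_bad(a, b, x) & 1 == 0
```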

3 The CBC-MAC Construction

Here we study the suitability of the CBC-MAC mode as a randomness extractor. Recall that for a given permutation π on {0,1}^n, the CBC-MAC computation of π on an input x = (x_1, ..., x_L) with L blocks in {0,1}^n is defined as y_L, where the latter value is set by the recursion: y_0 = 0^n, y_i = π(y_{i−1} ⊕ x_i) for i = 1, ..., L. We denote the output of the above CBC-MAC process by F_π(x).
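The recursion above can be modeled directly; this toy sketch uses n = 8 and an explicitly sampled random permutation (sizes chosen only for illustration):

```python
import random

N = 8                                    # toy block size: n = 8 bits

# A random permutation on {0,1}^n, as in the model of this section.
random.seed(7)
perm = list(range(1 << N))
random.shuffle(perm)

def cbc_mac(pi, blocks):
    # y_0 = 0^n ; y_i = pi(y_{i-1} XOR x_i) ; output y_L.
    y = 0
    for x in blocks:
        y = pi[y ^ x]
    return y

out = cbc_mac(perm, [0x12, 0x34, 0x56])
assert 0 <= out < 1 << N
```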

Our main result in this section states the extraction properties of CBC-MAC

for a random permutation π on 2^n elements. To state it more compactly,

we let and notice that

when (here we use the fact that

Theorem 1. Let F denote the CBC-MAC mode over a random permutation π on {0,1}^n, and let X be an input distribution to F defined over L-block strings. Then the statistical distance between F(X) and the uniform distribution on {0,1}^n is at most

In particular, assuming and the above statistical distance

is at most

The proof of the theorem follows from Lemma 2 in combination with the

following lemma that shows that CBC-MAC mode with a random permutation

is δ-universal for a sufficiently small δ.

Lemma 3. Let F denote the CBC-MAC mode over a permutation π on {0,1}^n. For any pair of distinct L-block inputs, if L is not too large then the collision probability of F exceeds 2^{−n} by only a small additive term, where the probability is over the choice of a random permutation π.

4 The Cascade Construction

We first recall the Merkle-Damgård approach to the design of cryptographic hash functions and introduce some notation and terminology. For given integers b and k, let f = {f_a} be a family of functions such that a ∈ {0,1}^k and, for all a, the function f_a maps b bits into k bits. On the basis of this family we build another family F = {F_a} that works on inputs of any length which is a multiple of b and produces a k-bit output. For each a ∈ {0,1}^k the function F_a is defined as follows. Let x = x_1 x_2 ... x_L, for some L and x_i ∈ {0,1}^b (for all i), denote the input to F_a; we define L variables (each of length k)


as y_1 = f_a(x_1) and y_i = f_{y_{i−1}}(x_i) for i = 2, ..., L, and set F_a(x) = y_L. For processing inputs of arbitrary length one needs a rule for padding inputs to a length that is a multiple of b. Specific functions that build on the above approach, such as MD5 and SHA-1, define their specific padding; we expand on this issue in Section 4.2. For the moment we assume inputs of length Lb for some L. Some more notation: sometimes we use F to denote the family {F_a}, and we write F(x) to denote the random variable F_a(x) for a ← {0,1}^k. Finally, we use K to denote the key space {0,1}^k. The family f is called the "compression function(s)" and the family F is referred to as the "cascade construction" (over the compression function f). Typical values for the parameters of the compression function are b = 512 and k = 128 or 160.
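A toy model of the cascade construction; the SHA-256-based keyed compression function and the 16-bit chaining value are illustrative stand-ins, not the compression function of any real scheme:

```python
import hashlib

K_BITS, B_BYTES = 16, 4        # toy parameters: 16-bit chaining key, 4-byte blocks

def f(key, block):
    # Stand-in compression function f_key: {0,1}^b -> {0,1}^k, built from
    # SHA-256 purely for illustration (real schemes use a dedicated function).
    d = hashlib.sha256(key.to_bytes(2, "big") + block).digest()
    return int.from_bytes(d[:2], "big")

def cascade(iv, blocks):
    # y_0 = iv ; y_i = f_{y_{i-1}}(x_i) ; output y_L.
    y = iv
    for x in blocks:
        y = f(y, x)
    return y

out = cascade(0x0000, [b"abcd", b"efgh", b"ijkl"])
assert 0 <= out < 1 << K_BITS
```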

4.1 The Basic Cascade Construction

The main result of this section is the following. Assume X is an input distribution with 2k bits of (overall) min-entropy and "enough" bits of min-entropy in its last block ("enough" will be quantified below). For the cascade construction we model the underlying family of compression functions as a family of random functions (with k-bit outputs). Then the output of F on the distribu-

tion is statistically close to uniform. This result is formalized in the following

theorem. As in Section 3, we let and notice

that when

Theorem 2. Let F = {F_a} be the cascade construction defined, as above, over a family of random functions f. Let X be the input distribution to F defined over L-block strings, and denote by X_L the probability distribution induced by X on the last block. Then, if U is the uniform distribution over {0,1}^k,

we have

In particular, if and then

The proof of the theorem follows from Lemma 2 in combination with the

following lemma that shows that the cascade construction with a random family

of compression functions is δ-universal for a sufficiently small δ.

Lemma 4. Let F = {F_a} be the cascade construction defined over a family of random functions f, and let X be an input distribution as assumed in Theorem 2. Then, the family F is δ-universal w.r.t. X for a suitably small δ.

The proof of this lemma is based on the following two propositions: the first

analyzes the collision probability of the cascade function F over random com-

pression functions on inputs that differ (at least) in the last block (Proposition 1);


then we extend the analysis to the general case, i.e., for any two different inputs x and x'

(Proposition 2). All proofs appear in the final paper.

Proposition 1. Let F = {F_a} be the cascade construction defined over a family of random functions f. Let x, x' be two inputs to F that differ (at least) in the last block, namely x_L ≠ x'_L, and let a be any value of the initial key. Then the collision probability Pr[F_a(x) = F_a(x')] is only negligibly larger than 2^{−k}, where the probability is taken over the

choice of random functions in F.

Proposition 2. Let F be defined as above, let x ≠ x' be two different inputs to F, and let a be any value of the initial key. Then Pr[F_a(x) = F_a(x')] is similarly bounded, where the probability is taken over the choice of random functions

in F.

THE VALUE OF THE INITIAL KEY. We note that the above analysis holds for any value of the initial key a when the functions f are truly random, which means that in principle a can be fixed to a constant. However, in practice not all the functions of the function family satisfy this requirement. Thus, choosing a at random allows us, for example, to extend our analysis to the situation where a negligible fraction of the functions in F are not close to random functions.

NECESSITY OF MIN-ENTROPY IN THE LAST BLOCK. We argue that assuming non-trivial min-entropy in the last block is required, even if the family of compression functions is completely random. Assume an input distribution in which the last block is fixed to a value B. The first L − 1 blocks induce some distribution on the last key y_{L−1} in the sequence. Examining the distribution on f_{y_{L−1}}(B) induced by (any) distribution on y_{L−1}, it is easy to see that this distribution is statistically far from the uniform distribution. In particular, we expect that with high probability a constant fraction of the elements will not appear in the distribution.

4.2 The Cascade Construction with Input Padding

The conditions imposed by Theorem 2 on the input distribution conflict with a

technical detail of the practical implementations of the cascade paradigm (such

as MD5 and SHA-1): rather than applying the cascade process to the input

these functions modify by concatenating enough padding bits as to

obtain a new input whose length is a full multiple of the block length. In some

cases this padding results in a full fixed block appended to Therefore, even

if has the property that the last block of input has relatively high entropy

(as required by Theorem 2) the actual input to the cascade does not have

this property any more. This fact is sufficient to make our main result from

Section 4.1 irrelevant to these real-world functions; luckily, however, we show

here that this serious obstacle can be lifted.

In order to better understand this problem we first describe the actual way

this padding is performed. We consider the concrete value b = 512 used as the block length in these functions. Let ℓ denote the length of the input x and let t = ℓ mod 512.


If t ≤ 447, then x is padded with the binary string 1 0^{447−t} followed by a 64-bit representation of ℓ. If t > 447, then a whole new block is added, with a padding similar to the one described above (with the binary representation of ℓ occupying the last 64 bits of the added block). From this

description, we see that if, for example, the original input x had a length which was an exact multiple of 512, then x' = x ∥ B, where B is a whole new block (and ∥ represents the concatenation operation). Moreover, the value of B is the same (and fixed) for all inputs of the length of x. In particular, if we consider the case in which we hash Diffie-Hellman values of length 1024 or 2048 bits (this is the common case when working over Z_p* groups), then we get that the padded input x' will always have a fixed last block. In other words, regardless of the entropy existing in the original input x, the actual input x' to the cascade process now has a zero-entropy last block.
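The padding arithmetic just described can be checked directly; md_pad below computes only the length (in bits) of the appended Merkle-Damgård padding:

```python
def md_pad(bit_len, b=512):
    # Length of MD-strengthening padding: a '1' bit, then '0' bits up to
    # 64 bits short of a block boundary, then the 64-bit length field.
    zeros = (b - 64 - 1 - bit_len) % b
    return 1 + zeros + 64                 # total padding length in bits

# A 1024-bit input (an exact multiple of 512, e.g. a 1024-bit DH value):
pad = md_pad(1024)
assert pad == 512                         # the padding is one whole, fixed block
assert (1024 + pad) % 512 == 0
```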

For this case, in which the last block is fixed, we show here that a somewhat

weaker (but still very useful) result holds. Specifically, combining Theorem 2 with

the assumption that the family of (compression) functions is pseudorandom,

we can prove that the output from the cascade construction is pseudorandom,

i.e., computationally indistinguishable from uniform (and thus sufficient for most cryptographic applications). This result holds even though the key to the cascade function F is revealed! We note that the assumption that the family f is

pseudorandom is clearly implied by the modeling (from the previous subsection)

of these functions as random. But we also note that assuming the compression

function (family) of actual schemes such as MD5 or SHA-1 to be pseudorandom

is a standard and widely-used cryptographic assumption (see [BCK96b] for some

analytical results on these PRF constructions).

Lemma 5. Let f = {f_a} be a family of pseudorandom functions which is ε-indistinguishable from random for attackers restricted to time T and a single query. Let F = {F_a} denote the cascade construction over the family f. Further, let X be a probability distribution on L-block strings from which the inputs to F are chosen, and let B be a fixed block. If the output distribution F_a(X), with a random but known key a, is δ-close to uniform, then the distribution F_a(X ∥ B) (for random but known a) is (δ + ε)-indistinguishable from uniform by attackers that run in time T.

The above lemma together with Theorem 2 shows that if f is a family of random functions, then the cascade construction with a fixed last block B

is indistinguishable from random, provided that the original input distribution

(before padding!) satisfies the conditions of Theorem 2.

It is also worth noting that Lemma 5 can be generalized to input distributions

that can be described as the concatenation of two probability distributions X₁ and X₂, where X₁ satisfies the conditions of Theorem 2, and X₂ is an arbitrary (polynomial-time samplable) distribution independent from X₁.

A PRACTICAL CONSIDERATION. Note that the application of Lemma 5 to an input distribution X still requires the last block of X (before padding) to have


relatively high min-entropy. To maximize this last-block min-entropy it is advisable that any input x whose length is not a multiple of b be "shifted to the right" (to a block boundary) by prepending a sufficient number of bits (say, all zeros) to the beginning of x. This way, the resultant string x' is of length a multiple of b and, more importantly, its last block contains the full entropy of the last b bits in x. Also, this shifting forces the appended padding described earlier to add a full block, as assumed in Lemma 5.⁴
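The "shift to the right" preprocessing can be sketched as follows (bit-strings are represented as Python strings purely for illustration):

```python
def shift_to_block_boundary(msg_bits: str, b=512):
    # Prepend zero bits so the total length is a multiple of b; the
    # high-entropy tail of the input then fills the last block completely.
    pad = (-len(msg_bits)) % b
    return "0" * pad + msg_bits

x = "1" * 1800                       # e.g. an 1800-bit input
shifted = shift_to_block_boundary(x)
assert len(shifted) == 2048          # 248 prepended zeros -> four 512-bit blocks
assert shifted.endswith(x)           # the original bits occupy the tail
```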

4.3 Modeling the Compression Function as a δ-universal Family

In Section 4.1 we presented an analysis of the basic cascade construction under

the modeling of the compression function as a family of random functions. Here

we study the question of what can be guaranteed on the output distribution of

the cascade under the simple assumption that the family of compression func-

tions is a good extractor (or, more generally, that this family is δ-universal). Clearly

this is a more realistic assumption on the underlying compression function. On

the other hand, in order to prove a close-to-uniform output in this case we are

going to require a stronger assumption on the input distribution. Specifically, we

are going to assume that the distribution on every block of input has a high min-entropy (e.g., 2k bits of min-entropy out of the b bits in the block), conditioned

on the distribution of the previous blocks. We prove below that under these

conditions the output of the cascade function is statistically close to uniform.

We note that the above requirement on the input distribution, while strin-

gent, is met in some important cases, such as applications that extract keys from

a Diffie-Hellman value computed over a high-entropy group. In particular, this

requirement is satisfied by the DH groups in use with the IKE protocol.

CONDITIONAL ENTROPY. Let two probability distributions be given, one over the conditioning strings and one over the conditioned strings. For a fixed string x, we denote the second distribution conditioned on the event that x is selected according to the first. Then we can define the conditional min-entropy given x as the largest integer m such that every outcome of the conditioned distribution has probability at most 2^(-m). We define the conditional min-entropy of one distribution with respect to the other as the expectation, over the conditioning value x, of the conditional min-entropy given x.
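In symbols, the two notions just defined can be written as follows (the notation, including H∞, is ours, reconstructed from the standard definition):

```latex
% Conditional min-entropy given a particular conditioning value x:
H_\infty(\mathcal{Y}\mid x)
  \;=\; \min_{y}\,\bigl(-\log_2 \Pr[\,\mathcal{Y}=y \mid \mathcal{X}=x\,]\bigr)

% Conditional min-entropy of Y with respect to X, taken as an
% expectation over the conditioning value:
H_\infty(\mathcal{Y}\mid\mathcal{X})
  \;=\; \mathbb{E}_{x\leftarrow\mathcal{X}}\bigl[\,H_\infty(\mathcal{Y}\mid x)\,\bigr]
```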

Lemma 6. Assume that the family of compression functions from b bits to k bits has the property that, for any probability distribution over b-bit blocks with sufficient min-entropy, the output distribution (for a random key and an input drawn from that distribution) is δ-close to uniform (for some given δ). Further, assume an input distribution on L blocks with the property that, for each block,

4 For example, assume the inputs from the distribution to be of length 1800 bits. Given such an input x, we prepend to it 248 '0's, resulting in a 4-block string x'. Now, when this input is processed by, say, SHA-1, an additional fifth block is added to x'. The important thing is that the last block of x' receives as much entropy from the last 512 bits of x as possible.


Randomness Extraction and Key Derivation

the distribution induced on that block has high conditional min-entropy with respect to the distribution induced on the previous blocks. Then, the cascade construction over the family applied to this input distribution is (Lδ)-close to uniform.

In particular, if we assume that the family is pairwise independent and the min-entropy of each input block (as defined above) is sufficiently high, then we get (using Lemma 2) a statistical distance between the cascade construction on L blocks and the uniform distribution of at most L times the per-block distance guaranteed by Lemma 2.

Combining Lemmas 6 and 5 we get that, under the above assumption on the input distribution, if the family of compression functions is both a good extractor and pseudorandom, then the output of the padded cascade (see Section 4.2) is pseudorandom (i.e., indistinguishable from uniform).
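The cascade construction itself is simple to state in code. The following is a minimal sketch of ours: SHA-256 applied to the concatenation of the chaining state and the current block stands in for a generic keyed compression function, which is a simplification, not the paper's construction.

```python
import hashlib

def cascade(key: bytes, blocks: list) -> bytes:
    """Basic cascade: chain a compression function over the input blocks.

    The chaining state starts at the key; each step absorbs one block.
    Real constructions use the hash function's native compression
    function on the chaining variable directly."""
    state = key
    for blk in blocks:
        state = hashlib.sha256(state + blk).digest()
    return state
```

Iterating block by block like this is exactly the structure whose output distribution Lemma 6 bounds, block-conditioned min-entropy feeding each step.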

5 HMAC Construction

We now turn to the study of HMAC [BCK96a] as a randomness extraction family. HMAC (and its underlying family NMAC) is defined using the cascade construction over a family of compression functions with b-bit blocks and k-bit outputs (as usual, we denote K = 2^k). The family of functions NMAC uses two independent keys drawn from the key space and is defined over {0,1}* by applying the outer keyed function to the result of the inner keyed cascade, where both the input and the result of the inner cascade are padded as described in Section 4.2. On the basis of NMAC one defines the family HMAC, in which the two keys are derived from a single key by applying the compression function, keyed with a fixed value iv, to the key XORed with each of ipad and opad; the value iv is fixed to the IV defined by the underlying hash function, and ipad and opad are two different fixed strings of length b. The analysis of HMAC is based on that of NMAC under the specialized assumption that the two derived keys are "essentially independent". We keep this assumption and develop our analysis on NMAC. (The reason for this form of derivation of the keys in HMAC is to allow for the use, without modification, of the underlying hash function; in particular, without having to replace the fixed IV with a variable value.)
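The NMAC/HMAC structure can be sketched as follows (our own sketch: SHA-256 of key-prefixed data stands in for the keyed cascade, and the names `nmac` and `hmac_from_nmac` are ours):

```python
import hashlib

def nmac(k_in: bytes, k_out: bytes, msg: bytes) -> bytes:
    """NMAC structure: an outer keyed application on top of an inner
    keyed cascade.  SHA-256 of (key || data) stands in for keying the
    hash via its chaining variable."""
    inner = hashlib.sha256(k_in + msg).digest()    # inner cascade
    return hashlib.sha256(k_out + inner).digest()  # outer function

def hmac_from_nmac(key: bytes, msg: bytes, b: int = 64) -> bytes:
    """HMAC derives NMAC's two keys from a single key using the fixed
    pads ipad (0x36) and opad (0x5C); assumes len(key) <= b, the block
    size in bytes."""
    key = key.ljust(b, b"\x00")
    k_in = bytes(x ^ 0x36 for x in key)   # key XOR ipad -> inner key
    k_out = bytes(x ^ 0x5C for x in key)  # key XOR opad -> outer key
    return nmac(k_in, k_out, msg)
```

With this stand-in, `hmac_from_nmac` coincides with standard HMAC-SHA-256 for keys shorter than the block size, which can be checked against Python's `hmac` module.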

We start by observing that if one considers the family NMAC as a universal family, then the best collision probability we can claim is roughly 2/K. This is so since, for any two inputs, the probability that NMAC sends both values to the same output is the sum of the probability that the inner cascade collides on them (which is at least 1/K) plus the probability that the inner cascade does not collide but the outer function maps the two different results to the same value (which is also at least 1/K). Therefore we cannot apply the results of Section 2 directly to the analysis of NMAC.

However, we provide three analyses, which, under different assumptions, es-

tablish the security of NMAC as a randomness extractor.

DROPPING SOME OUTPUT BITS. Specifically, we assume that the "outer" function outputs fewer than k bits (e.g., in case it is a random function outputting k bits, one can simply drop some of the trailing bits and view it as a random function outputting fewer bits). In this case, a straightforward application of Lemma 1 and Lemma 2 shows that if the inner family is almost universal w.r.t. the input distribution and the outer family is a good extractor, then NMAC extracts bits at small statistical distance from uniform. Assuming now that both families consist of random functions, Proposition 2 implies that the composed family is almost universal, and so NMAC extracts bits which are statistically close to uniform. In fact, the same is true even if the outer function family is merely a good extractor (e.g., if it is pairwise independent). In any case, we get that dropping roughly (log L + 160) bits from the output of the NMAC construction makes it a good extractor. To make this result meaningful, however, we must consider compression functions whose key is non-trivially larger than 160 bits, such as the compression function of SHA2-512.
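The dropped-bits variant can be sketched as follows (an illustrative sketch only: we use HMAC-SHA-512 as the NMAC-style construction, the name `extract` and the default `out_bits` are ours, and the actual number of bits to drop should follow the (log L + 160)-bit rule above):

```python
import hashlib
import hmac

def extract(salt: bytes, high_entropy_input: bytes, out_bits: int = 256) -> bytes:
    """Extractor sketch: run HMAC-SHA-512 over the high-entropy input
    and keep only the first out_bits bits, dropping the rest of the
    512-bit output."""
    full = hmac.new(salt, high_entropy_input, hashlib.sha512).digest()
    return full[: out_bits // 8]
```

Here the large (512-bit) output of SHA2-512 is what leaves room to drop bits while still extracting a usefully long key.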

COMPUTATIONAL SECURITY. Our second approach to analyzing NMAC is sim-

ilar to the analysis of the padded cascade from Lemma 5. We will present it in

the full version.

MODELING THE OUTER FUNCTION AS A RANDOM ORACLE. As we remarked, even if the outer function is truly random, the output of NMAC cannot be statistically close to uniform, even if the input to the outer function were perfectly uniform. This was argued under an extremely strong distinguisher that can evaluate the outer function at all of its inputs. This is different from the typical modeling of a hash function as a random oracle. Namely, in the random oracle model it is assumed that the adversary can evaluate the function at at most a bounded number of points, q. This assumption can be seen as restrictive, but it is in fact a realistic characterization of the adversary's capabilities. Thus, we show that when we model the outer function of NMAC as a random oracle, the construction is a good randomness extractor. We start by showing the general result about the quality of using a random oracle as an extractor, and then apply it to the NMAC construction.

5.1 Random Oracle as an Extractor

In this section, we show that by utilizing the power of the random oracle to the fullest, we can provide some provable guarantees on the quality of the random oracle as a randomness extractor. Our precise modeling of the random oracle is the following. The adversary is allowed to adaptively query the random oracle at up to q points, and possibly make the input distribution depend on these queries. However, we assume that the remaining "unqueried" values of the oracle are chosen randomly, independently of the input distribution, and are never given to the adversary.⁵ Finally, given a distribution and a number q, we consider the probability mass of the q heaviest elements under that distribution.

5 We stress that this is very different from our modeling of a random function from before, where the adversary first chooses the distribution, after which the function is chosen at random (independently of the distribution) and given to the adversary in its entirety.


Lemma 7. Assume f is a random oracle, and the adversary can evaluate f at no more than q points. Then, for any distribution on the inputs, the maximal probability with which the adversary can distinguish f applied to that distribution from the uniform distribution on the outputs is at most the probability mass of the q heaviest elements of the input distribution, plus a term exponentially small in its min-entropy.

Remark 2. We already remarked that no single function can be a universally good extractor, which means that one has to use a function family instead, indexed by some key. On the other hand, in the idealized random oracle model, we manage to use a single random oracle in Lemma 7. This is not a contradiction, since we critically assumed that the adversary cannot read the entire description of the random oracle. In essence, the choice of the random oracle can be viewed as a key, but the distinguisher cannot read the entire key (although it has a choice of which parts of it to read) and therefore cannot adversarially choose a bad distribution. Put differently, in our analysis we could assume that a large part of the key (i.e., the unqueried values of the oracle) is chosen independently of the input distribution, which is consistent with the conventional extractors such as those obtained by the LHL. However, unlike the conventional extractors, we (restrictively) assume that the adversary never learns the entire description of the key (i.e., the unqueried parts of the oracle), which allowed us to get a much stronger bound than what we could get with the LHL. For example, the LHL requires the input min-entropy to grow with the number of extracted bits, while Lemma 7 only requires it to be large enough to make the error term above small.

We will also use the following corollary of Eq. (4) and Lemma 1.

Corollary 1. If a family of functions indexed by a key κ is universal w.r.t. the input distribution, f is a random oracle, U is the uniform distribution on the output space, and the adversary can make at most q queries to the random oracle, then the maximal probability with which the adversary can distinguish the pair (κ, f applied to the hashed input) from (κ, U) is bounded as in Lemma 7.

The above corollary implies that the composition of a collision-resistant hash function and a random oracle can be viewed as a relatively good extractor. This is because a (computationally) collision-resistant function must be (information-theoretically) almost universal. More precisely, if a function family is collision-resistant with exact security against non-uniform adversaries running in linear time, it must also be almost universal. For uniform adversaries running in time T, it must be almost universal w.r.t. any distribution which is samplable in time T/2.

APPLICATION TO NMAC. We can now apply Corollary 1 to the case of NMAC, assuming that the outer function is a random oracle which can be evaluated at no more than q places. By Proposition 2, the inner cascade family is almost universal when the underlying function family is chosen at random. Thus, Corollary 1 implies that NMAC extracts bits which cannot be distinguished from random with more than negligible probability, for suitable settings of the parameters.


Acknowledgments

We thank David Wagner for valuable discussions.

References

[BST03] B. Barak, R. Shaltiel, and E. Tromer. True Random Number Generators Secure in a Changing Environment. In CHES '03, pages 166–180, 2003.

[BCK96a] M. Bellare, R. Canetti, and H. Krawczyk. Keying Hash Functions for Message Authentication. In Crypto '96, pages 1–15, 1996. LNCS No. 1109.

[BCK96b] M. Bellare, R. Canetti, and H. Krawczyk. Pseudorandom Functions Revisited: The Cascade Construction and Its Concrete Security. In Proc. 37th FOCS, pages 514–523. IEEE, 1996.

[BKR94] M. Bellare, J. Kilian, and P. Rogaway. The Security of Cipher Block Chaining. In Crypto '94, pages 341–358, 1994. LNCS No. 839.

[CW79] L. Carter and M. N. Wegman. Universal Classes of Hash Functions. JCSS, 18(2):143–154, April 1979.

[Dam89] I. Damgård. A Design Principle for Hash Functions. In Crypto '89, pages 416–427, 1989. LNCS No. 435.

[GKR04] R. Gennaro, H. Krawczyk, and T. Rabin. Secure Hashed Diffie-Hellman over Non-DDH Groups. In Eurocrypt '04, 2004. LNCS No. 3027.

[Gol01] Oded Goldreich. Foundations of Cryptography. Cambridge University Press, 2001. Preliminary version: http://philby.ucsd.edu/cryptolib.html/.

[HILL99] J. Håstad, R. Impagliazzo, L. Levin, and M. Luby. Construction of a Pseudo-random Generator from any One-way Function. SIAM J. Computing, 28(4):1364–1396, 1999.

[IKEv2] Internet Key Exchange (IKEv2) Protocol, draft-ietf-ipsec-ikev2-13.txt (to be published as an RFC). Editor: C. Kaufman, March 2004.

[Kra03] H. Krawczyk. SIGMA: The 'SIGn-and-MAc' Approach to Authenticated Diffie-Hellman and Its Use in the IKE Protocols. In Crypto '03, pages 400–425, 2003. LNCS No. 2729.

[Lub96] Michael Luby. Pseudorandomness and Cryptographic Applications. Princeton Computer Science Notes, Princeton University Press, January 1996.

[RFC2409] RFC 2409. The Internet Key Exchange (IKE). Editors: D. Harkins and D. Carrel, Nov 1998.

[Sha02] R. Shaltiel. Recent Developments in Extractors. Bulletin of the European Association for Theoretical Computer Science, Volume 77, June 2002, pages 67–95. Available at: http://www.wisdom.weizmann.ac.il/~ronens/papers/survey.ps

[Sti94] D. R. Stinson. Universal Hashing and Authentication Codes. Designs, Codes and Cryptography, 4 (1994), 369–380.

[WC81] M. N. Wegman and L. Carter. New Hash Functions and Their Use in Authentication and Set Equality. JCSS, 22(3):265–279, July 1981.


Efficient Tree-Based Revocation

in Groups of Low-State Devices

Michael T. Goodrich¹*, Jonathan Z. Sun¹*, and Roberto Tamassia²**

¹ Dept. of Computer Science, Univ. of California, Irvine, CA 92697-3425
{goodrich,zhengsun}@ics.uci.edu

² Dept. of Computer Science, Brown Univ., Providence, RI 02912
rt@cs.brown.edu

Abstract. We study the problem of broadcasting confidential information to a collection of n devices while providing the ability to revoke an arbitrary subset of those devices (and tolerating collusion among the revoked devices). In this paper, we restrict our attention to low-memory devices, that is, devices that can store at most O(log n) keys. We consider solutions for both the zero-state and low-state cases, where such devices are organized in a tree structure T. We allow the group controller to encrypt broadcasts to any subtree of T, even if the tree is based on a multi-way organizational chart or a severely unbalanced multicast tree.

1 Introduction

In the group broadcast problem, we have a group S of devices and a group

controller (GC) that periodically broadcasts messages to all the devices over

an insecure channel [8]. Such broadcast messages are encrypted so that only

valid devices can decrypt them. For example, the messages could be important

instructions from headquarters being sent to PDAs carried by employees in a

large corporation. We would like to provide for revocation, that is, for an arbi-

trary subset we would like to prevent any device in R from decrypting

the messages.

We are interested in schemes that work efficiently with low-memory devices, that is, devices that can store at most O(log n) secret keys. Such a scenario models the likely situation where the devices are small and the secure firmware dedicated to storing keys is smaller still. We refer to this as the log-key restriction. We consider two variants of this model.

A static or zero-state version: the keys on each device cannot be

changed once the device is deployed. For example, memory for the devices

could be written into secure firmware at deployment.

* This work was supported in part by NSF Grants CCR-0312760, CCR-0311720, CCR-0225642, and CCR-0098068.

** This work was supported in part by NSF Grants CCR-0311510, CCR-0098068, and IIS-0324846 and by a research gift from Sun Microsystems.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 511–527, 2004.

© International Association for Cryptologic Research 2004


The dynamic or low-state version: any of the keys on each device

can be updated in response to broadcast messages. For example, such de-

vices might have a small tamper-resistant secure cache in which to store and

update secret keys.

Organizing Devices Using Trees. The schemes we consider organize the set

of devices in a tree structure, associating each device with a different leaf in

the tree. In fact, we consider three possible kinds of trees that the devices can

conceptually be organized into.

1. A balanced tree. In this case, the devices are associated with the leaves of a balanced tree where each internal node has a constant number of children; hence, each leaf is at depth O(log n). This tree is usually chosen purely for the sake of efficiency and, in fact, has been the only tree considered in the previous related work we are familiar with. For example, it forms the basis of the Logical Key Hierarchy (LKH) scheme [26, 28], the One-way Function Tree (OFT) scheme [21], the Subset-Difference Revocation (SDR) scheme [16], and the Layered Subset Difference (LSD) scheme [10].

2. An organizational chart. In this case, the devices are associated with the leaves of a tree that represents an organizational chart, such as that of a corporation or university. For example, internal nodes could correspond to campuses, colleges, and departments. The height of this tree is assumed to be O(log n), but the number of children of an internal node is not assumed to be bounded by a constant. Thus, the straightforward conversion of this tree into an equivalent bounded-degree tree may cause the height to become super-logarithmic.

3. A multicast tree. In this case, the devices are associated with the nodes of

a multicast tree rooted at the group controller. The logical structure of this

tree could be determined in an ad hoc manner so that no bound is assumed

on either the tree height or the degree of internal nodes. Thus, this tree

may be quite imbalanced and could in fact have height that is exponentially

greater than the number of keys each device can hold.
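For the balanced-tree case, the LKH-style schemes cited in item 1 give each device the keys of every node on its root-to-leaf path, which is where the O(log n) key bound comes from. A minimal sketch of ours, using heap-order node numbering over a complete binary tree:

```python
def path_key_ids(leaf: int, n_leaves: int) -> list:
    """LKH-style key assignment over a complete binary tree stored in
    heap order (root = 1, children of v are 2v and 2v+1): the device at
    leaf index `leaf` holds the keys of every node on its root-to-leaf
    path, i.e. O(log n) keys for n leaves."""
    node = n_leaves + leaf  # heap index of the leaf
    path = []
    while node >= 1:
        path.append(node)
        node //= 2          # move to the parent
    return path
```

Revoking a device then amounts to re-keying exactly the nodes on its path, so that the revoked leaf's stored keys become useless while every other leaf shares at least one fresh key with the controller.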

In using trees, particularly in the latter two cases, we feel it is important to provide the group controller with the capability of encrypting a message so that it may be decrypted only by the devices associated with nodes in a certain subtree.

For instance, a sporting event might be broadcast to just a single region, or a

directive from headquarters might be intended just for a single division. We call

such a broadcast a subtree broadcast, which can also be modeled by multiple

GCs, each assigned to a different subtree. We continue in this case to assume