1. f_i implements a bijection from G_i onto H_i;
2. there exists a polynomial-time Turing machine which, on input the index i and a point of G_i (together with the auxiliary information used to sample it), outputs the image of that point under f_i; the function can therefore be evaluated efficiently, granted this information;
3. there exists a polynomial-time Turing machine which, on input i and any candidate element, answers whether it belongs to the group or not;
4. there exists a (deterministic) polynomial-time Turing machine Inv such that, for all i and for all y in the range, Inv outputs the preimage of y under f_i, granted the trapdoor;
5. for every probabilistic polynomial-time Turing machine A we have that, for large enough security parameter k, the probability that A inverts a random function of the family on a random point is bounded by a negligible function of k.

The last property is our formal hard-to-invert notion, which is quite similar to the usual one-wayness notion: the two only differ in that a one-way function must also be easy to evaluate, whereas here evaluating f_i may itself require auxiliary information (as in the Diffie-Hellman family below).

482 Dario Catalano, David Pointcheval, and Thomas Pornin

2.2 Verifiable Sub-family of Trapdoor Hard-to-Invert Isomorphisms

In the above definition, it is clear that for any index in I the function is an isomorphism from the group G_i onto H_i. However, in practice, the family of functions may be indexed by a potentially larger set S (i.e. I is a subset of S), for which there may exist some indices that do not lead to an isomorphism.
Therefore, we require more properties to be satisfied:

– there exists a large subset I of S such that the corresponding sub-family is a
family of trapdoor hard-to-invert isomorphisms;
– there exists a set J of indices which provide an isomorphism, with I contained in J and J contained in S,
which admits an efficient zero-knowledge proof of membership.

The last property turns out to be crucial for the application we have in mind. In
our setting the client has to choose the specific function to use in the protocol.
This means that a dishonest client (i.e. one that does not share a password
with the server) could propose an index whose corresponding function is not
an isomorphism. This would give him the ability to run a partition attack (as
already explained for RSA). For this reason we require the client to produce a
function together with a proof that it is actually an isomorphism.

2.3 Zero-Knowledge Proofs of Membership

As noticed above, the only property we want to be able to verify is the isomorphic one, and thus the fact that the index i actually lies in J: we just want the adversary not to be able to prove a wrong statement; we do not care about malleability [11]. A second point is that the zero-knowledge property will be required in the security proof: a valid index i is given, and one tries to use the adversary to solve a hard problem related to i. Thus, we need to be able to provide a proof of validity of i without any witness. Note however that the simulation is performed for valid statements only, and thus simulation soundness [24] is not required. Moreover, since we just have to simulate one proof without the witness (other executions will be performed as in an actual execution), concurrent zero-knowledge is not needed either.
For efficiency reasons, we will focus on a specific class of zero-knowledge proofs: for a given statement, the verifier sends a random seed, and then the prover non-interactively provides a proof, using a witness, w.r.t. that random seed; the proof can be checked without the witness. In our protocol, honest players will sample the index and thus get the trapdoor together with it. This trapdoor will generally be a good witness. More formally, we require:

Completeness – proof generation and verification are two efficient (polynomial-time) algorithms, and for any index in J and any challenge seed, a witness helps to build a proof which is always accepted.
IPAKE: Isomorphisms for Password-Based Authenticated Key Exchange 483

Soundness – for any index outside J, the probability for any adversary (over its random tape and the random seed) to forge a valid proof (accepted by the verification algorithm) is negligible within time t; we denote accordingly the maximal success probability for any adversary within time t.
ROM-simulatability – granted the programmability of the random oracle, for any index in J and any seed, there exists an efficient way to perfectly simulate an accepted proof.

2.4 Concrete Examples
The Diffie-Hellman Family. The most natural example of a family of trapdoor hard-to-invert isomorphisms is the Diffie-Hellman one. The machine Gen, on input the security parameter k, does as follows. First it chooses a random prime q of size k, and a prime p such that q divides p − 1. Next, it chooses a subgroup G of order q in Z_p^* and a corresponding generator g. Finally it chooses a random element x in Z_q, sets h = g^x, and outputs the pair (s, t), where t = x is the trapdoor and s is an encoding of (p, q, g, h). This defines our set I.
Now f_s is instantiated as follows. Set G = H = the subgroup generated by g, and f_s is defined as f_s(X) = X^x. Moreover, Inv is defined accordingly, using the trapdoor x to extract x-th roots in the group.
Clearly, to efficiently evaluate f_s on a random point X, one should know either the trapdoor information x, or some r such that X = g^r, in which case f_s(X) = h^r (assuming, of course, that the computational Diffie-Hellman problem is infeasible in G). Similarly, knowledge of the trapdoor is sufficient to invert f_s on a random point. However, inverting the function without knowing the trapdoor seems to be infeasible. Nevertheless, membership in G is efficiently decidable: Gmap simply checks whether X^q = 1 mod p or not.
For our functions to be isomorphisms, one just needs x to be co-prime with q, where q is actually the order of G. For better efficiency, the group information (p, q, g) can be fixed and considered as common trusted parameters. Therefore, Gen just chooses x and sets h = g^x; one then just needs to check that h^q = 1 mod p and that h is not the unit element, and no witness is required, nor any additional proof: the prover does not need any witness for outputting a proof, since the verifier simply checks the above equality/inequality.
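This family can be illustrated with a few lines of Python. The parameters below are toy-sized and purely illustrative (a real instantiation needs a large prime-order group), and all function and variable names are our own; `gmap` is the membership test X^q = 1 mod p just described.

```python
# Toy sketch of the Diffie-Hellman trapdoor hard-to-invert family.
# Illustrative parameters only; real use needs a large prime-order group.
p, q, g = 23, 11, 4           # q divides p-1, g generates the order-q subgroup

def gen(x):
    """The index is h = g^x; the trapdoor is x itself."""
    return pow(g, x, p)

def f(X, x):
    """f(X) = X^x: easy with the trapdoor x (or with r when X = g^r)."""
    return pow(X, x, p)

def gmap(X):
    """Membership test for the group G: X^q == 1 (mod p)."""
    return pow(X, q, p) == 1

x = 7                          # trapdoor
h = gen(x)
X = pow(g, 5, p)               # a random point X = g^r, here with r = 5
assert gmap(X)
assert f(X, x) == pow(h, 5, p)  # f(g^r) = h^r = g^(x*r): either x or r suffices
```

The last assertion is the key feature of the family: whoever knows r can evaluate f without the trapdoor, but a party knowing neither x nor r faces the computational Diffie-Hellman problem.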

The RSA Family. Another natural example is the RSA permutation. In this case the machine Gen, on input the security parameter k, does as follows. First it chooses two random primes p and q of size k/2, and sets N = pq. Next, it chooses a public exponent e co-prime with the order of the group. Finally it outputs the pair (s, t), where t is the trapdoor (the inverse of e modulo the group order) and s is an encoding of (N, e). This defines our set I.
The function f_s is instantiated as follows. Set G = H = Z_N^*, where the representation of elements is simply the identity function, i.e. elements are written as themselves. The function
Note that we allow a slight misuse of notation here. Actually, the function should be defined over the group rather than over the integers. However, we prefer to adopt a simpler (and somewhat incorrect) notation for visual comfort.

f_s is defined as f_s(x) = x^e mod N (for any x in Z_N^*). Hence, the inverse map is the d-th power map, with d the trapdoor exponent. The Inv algorithm is straightforward, granted the trapdoor. And the Gmap algorithm simply has to check whether the element is prime to N.
As already noticed, since f_s is easy to evaluate, the RSA family is not only a trapdoor hard-to-invert isomorphism family, but also a trapdoor one-way permutation family. However, for f_s to actually be an isomorphism, the index does not really need to lie in I, which would be very costly to prove (while still possible). It just needs to have e co-prime with the order of the group, which defines our set J. An efficient proof of validity is provided in the full version [8], where both the proof and the verification algorithms are formally defined.
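The RSA instantiation can likewise be sketched in Python. The modulus below is a toy; the helper names are our own, and the membership test is exactly the co-primality check mentioned above.

```python
from math import gcd

# Toy sketch of the RSA family as a trapdoor hard-to-invert (in fact one-way)
# isomorphism of (Z/NZ)^*. Parameters are illustrative, not secure.
p, q = 11, 19
N = p * q                      # 209
phi = (p - 1) * (q - 1)        # 180
e = 7                          # valid index: gcd(e, phi) == 1
d = pow(e, -1, phi)            # trapdoor exponent

def f(x):
    return pow(x, e, N)

def inv(y):
    return pow(y, d, N)

def gmap(x):
    """Membership in (Z/NZ)^*: x is co-prime with N."""
    return gcd(x, N) == 1

assert gmap(2) and inv(f(2)) == 2
assert f(3 * 5 % N) == f(3) * f(5) % N   # multiplicative: a group isomorphism
```

The second assertion checks the isomorphism property, which is exactly what fails when e shares a factor with the group order.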

The Squaring Family. As a final example, we suggest the squaring function, which is defined as the RSA function with the variant that e = 2. A problem here arises from the fact that squaring is not a permutation over Z_N^*, simply because 2 is not co-prime with the group order. However, if one considers Blum moduli (i.e. composites N = pq where both primes are congruent to 3 mod 4), then it is easy to check that the squaring function becomes an automorphism onto the group of quadratic residues modulo N (in the following we refer to this group as QR_N). However, this is not enough for our purposes. An additional difficulty comes from the fact that we need an efficient way to check if a given element belongs to the group (which would be QR_N here): the need of an efficient algorithm Gmap. The most natural extension of QR_N is the subset J_N of Z_N^* which contains all the elements with Jacobi symbol equal to +1. Note that for a Blum modulus this set is isomorphic to the product of {±1} with QR_N (this is because −1 has a Jacobi symbol equal to +1, but is not a square). With this in place, we get the signed squaring² function over J_N.
For this family, the machine Gen, on input the security parameter k, does as follows. First it chooses two random Blum primes p and q of size k/2, and sets N = pq. Then it outputs the pair (s, t), where t is the trapdoor (the factorization of N) and s is an encoding of N. This thus defines our set I. The function f_s is instantiated as follows. Set G = H = J_N, where again the representation of elements is simply the identity function. The function f_s is the signed squaring defined above (for any element of J_N). The Inv algorithm is straightforward, granted the trapdoor. And the Gmap algorithm simply computes the Jacobi symbol.
As above, since f_s is easy to evaluate, the squaring family is not only a trapdoor hard-to-invert isomorphism family, but also a trapdoor one-way permutation family. However, for f_s to actually be an isomorphism, N does not really need to be a Blum modulus, which would be very costly to prove. What we need is just that −1 has Jacobi symbol +1 and that any square in Z_N^* admits exactly 4 roots. A validity proof is provided, with the mathematical justification, in Section 6, which thus formally defines both the set J and the proof algorithms.
² By signed, we mean that the output of the function has a sign (plus or minus).
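The Gmap test for this family, the Jacobi symbol computation, can be sketched as follows; the algorithm shown is the standard binary Jacobi algorithm (our choice, not spelled out in the paper), and N = 77 is a toy Blum modulus.

```python
from math import gcd

# Membership test for J_N in the squaring family: compute the Jacobi symbol.
# N = 77 = 7 * 11 is a toy Blum modulus (both primes are 3 mod 4).
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, via quadratic reciprocity."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:        # pull out factors of 2: (2/n) depends on n mod 8
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a              # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

N = 77
# -1 has Jacobi symbol +1 modulo a Blum modulus...
assert jacobi(N - 1, N) == 1
# ...yet is not a square: no unit squares to -1 mod N
assert all(x * x % N != N - 1 for x in range(1, N) if gcd(x, N) == 1)
# and every quadratic residue has exactly 4 square roots
roots = [x for x in range(1, N) if gcd(x, N) == 1 and x * x % N == 4]
assert len(roots) == 4
```

The three assertions are exactly the facts the text relies on: −1 lies in J_N without being a square, and squares have 4 roots, so J_N splits as {±1} times QR_N.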

3 The Formal Model
3.1 Security Model
Players. We denote by A and B two parties that can participate in the key exchange protocol P. Each of them may have several instances, called oracles, involved in distinct, possibly concurrent, executions of P. We denote A (resp. B) instances by A^i (resp. B^j), or by U when we consider any user instance. The two parties share a low-entropy secret pw, which is drawn from a small dictionary Password, according to a distribution D. In the following, we consider, for an integer n, the probability of falling in the set of the n most probable passwords.
If we denote by U_N the uniform distribution among N passwords, this probability is simply n/N.

Queries. We use the security model introduced by Bellare et al. [1], to which paper we refer for more details. In this model, the adversary has entire control of the network, which is formalized by allowing it to ask the following queries:
– Execute(A^i, B^j): This query models passive attacks, where the adversary gets access to honest executions of P between the instances A^i and B^j by eavesdropping.
– Reveal(U): This query models the misuse of the session key by any instance U (use of a weak encryption scheme, leakage after use, etc.). The query is only available to the adversary if the attacked instance actually "holds" a session key, and it releases the latter to the adversary.
– Send(U, m): This query models the adversary sending a message m to instance U. The adversary gets back the response U generates in processing the message m according to the protocol P. A query Send(A^i, Start) initializes the key exchange algorithm, and thus the adversary receives the flow A should send out to B.
In the active scenario, the Execute-query may seem rather useless: after all, the Send-query already gives the adversary the ability to carry out honest executions of P among parties. However, even in the active scenario, Execute-queries are essential to properly deal with dictionary attacks. Actually, the number q_s of Send-queries directly asked by the adversary does not take into account the number of Execute-queries. Therefore, q_s represents the number of flows the adversary may have built by itself, and thus the number of passwords it may have tried. Even better, the number of A and B instances involved in the attack upper-bounds the number of passwords it may have tried. For the sake of simplicity, we restricted queries to A and B only. One can indeed easily extend the model, and the proof, to the more
486 Dario Catalano, David Pointcheval, and Thomas Pornin

general case, keeping in mind that we are interested in the security of executions
involving at least A or B, with the password pw shared by them. Additional
queries would indeed use distinct passwords, which could be assumed public in
the security analysis (known to our simulator).

3.2 Security Notions
Two main security notions have been defined for key exchange protocols. The first one is the semantic security of the key, which means that the exchanged key is unknown to anybody other than the players. The second one is unilateral or mutual authentication, which means that either one, or both, of the participants actually know the key.

AKE Security. The semantic security of the session key is modeled by an additional query Test(U). The Test-query can be asked at most once by the adversary and is only available if the attacked instance U is Fresh. The freshness notion captures the intuitive fact that a session key is not "obviously" known to the adversary. An instance is said to be Fresh if the instance has accepted (i.e. the flag accept is set to true) and neither it nor its partner (i.e. the other instance with the same session tag, or SID, which is defined as the view the player has of the protocol (the flows) before it accepts) have been asked for a Reveal-query. The Test-query is answered as follows: one flips a (private) coin b and forwards sk (the value Reveal(U) would output) if b = 1, or a random value if b = 0.
We denote the AKE advantage as twice the probability that the adversary correctly guesses the value of b, minus one, where the probability space is over the password, all the random coins of the adversary and all the oracles, and the guess is the bit output by the adversary for the bit b involved in the Test-query. The protocol P is said to be AKE-secure if the advantage is smaller than some given bound for any adversary running within the given time.

Entity Authentication. Another goal of the adversary is to impersonate a party. We may consider unilateral authentication of either A (A-Auth) or B (B-Auth); we thus consider the probability that the adversary successfully impersonates an A instance (resp. a B instance) in an execution of P, which means that B (resp. A) terminates (i.e. the terminate flag is set to true) even though it does not actually share the key with any accepting partner A (resp. B).
A protocol P is said to be Auth-secure if the success probability for breaking either A-Auth or B-Auth is smaller than the given bound for any adversary running within the given time. Such a protocol then provides mutual authentication.

4 Algorithmic Assumptions
In this section we state some algorithmic assumptions we need in order to construct an IPAKE protocol. As already sketched in Section 1.2, our basic building
block is a family of trapdoor hard-to-invert bijections. More precisely, each bijection needs to be a group isomorphism from a group with one internal law into a group with another, each law coming with its inverse operation.
As an additional assumption, we require the existence of a generalized full-domain hash function which, on a new input, outputs a uniformly distributed element in the group G. This is the reason why we need the decidability of G: in practice, this hash function will be implemented by iterating an ordinary hash function until the output is in G.
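The iterate-until-in-the-group construction can be sketched as follows. SHA-256, the counter encoding and the toy group (the order-11 subgroup of Z_23^*) are illustrative choices of ours, not mandated by the paper.

```python
import hashlib

# Sketch of a generalized full-domain hash: iterate an ordinary hash with a
# counter until the output lies in the target group, whose decidability we
# rely on (here: x^q == 1 mod p).
p, q = 23, 11

def in_group(x):
    return 1 <= x < p and pow(x, q, p) == 1

def fdh(msg: bytes) -> int:
    ctr = 0
    while True:
        digest = hashlib.sha256(msg + ctr.to_bytes(4, "big")).digest()
        x = int.from_bytes(digest, "big") % p
        if in_group(x):
            return x
        ctr += 1

assert in_group(fdh(b"some password"))
```

Since the subgroup contains roughly half of the residues, a couple of iterations suffice on average, matching the remark below that very few iterations are needed in practice.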
The non-invertibility of the functions in the family is measured by the "ability", for any adversary, to invert a random function of the family on a random point, uniformly drawn from its range.
More precisely, we denote by Succ(t) the maximal success probability over all the adversaries running within time t. A simpler task for the adversary may be to output a list of elements which contains the solution.
As above, we denote by Succ(t, n) the maximal success probability over all the adversaries running within time t which output sets of size n.

4.1 The RSA Family
As described in Section 2.4, the function f_s is defined by N and e: f_s(x) = x^e mod N, for any x in Z_N^*. For a correctly generated N and a valid e (i.e. an e co-prime with the order of the group), the non-invertibility of the function is equivalent to the widely conjectured one-wayness of RSA. This leads to a bound in which the adversarial time is only increased by an upper-bound on the time required to perform an exponentiation.

4.2 The Diffie-Hellman Family
Let G be any cyclic group of (preferably) prime order q, generated by g. As sketched in Section 2.4, the function f_s is defined by a point h = g^x, and thus f_s(X) = X^x. An attacker against this family is a probabilistic machine running in time t whose success is the probability that it computes the Diffie-Hellman value X^x from h = g^x and a random point X = g^r,
For visual comfort, in the following we adopt lighter symbols for the group operations.

Fig. 1. An execution of the IPAKE protocol, showing how Alice and Bob each compute the authenticator Auth and the session key sk.

where the probability is taken over the random values x and r in Z_q. As usual, we denote by Succ_cdh(t) the maximal success probability over every adversary running within time t; then, when the group and the generator are fixed, the non-invertibility of the family is measured by this quantity. Using Shoup's result [25] about "self-correcting Diffie-Hellman", one can see that an adversary which only outputs a small list of candidates containing the solution can be turned into a full solver, at a moderate cost.
4.3 The Squaring Family
As discussed in Section 2.4, if one assumes that the modulus N is the product of two Blum primes, the signed squaring function becomes an isomorphism from J_N onto J_N. Furthermore, for a correctly generated N, the non-invertibility of the function is trivially equivalent to the one-wayness of factoring Blum composites. This leads us to a very tight bound because, in this case, the overhead represents the time required to perform a single modular multiplication (i.e. to square).

5 Security Proof for the IPAKE Protocol
5.1 Description and Notations
In this section we show that the IPAKE protocol distributes session keys that
are semantically secure and provides unilateral authentication for the client A.
The specification of the protocol can be found in Figure 1. Some remarks about notation are in order.
We assume the family to be a correct one, with a verifiable sub-family of trapdoor hard-to-invert isomorphisms from G into H. In the following, we identify the two groups. We denote by |I| the size of I. Furthermore, we use a lower bound on the size of any of the groups in the family.
For this choice of parameters for the family, we can define the hash function onto the group, which is assumed to behave like a generalized full-domain random oracle. In particular, we model it as follows: on input a couple, it outputs a random element, uniformly distributed in the group.
Since we only consider unilateral authentication (of A to B), we just introduce a terminate flag for B.

5.2 Security Proof
Theorem 1 (AKE/UA Security). Let us consider the protocol IPAKE, over a family of trapdoor hard-to-invert isomorphisms, with security parameter k, where Password is a dictionary equipped with the distribution D. For any adversary within a time bound t, with less than q_s active interactions with the parties (Send-queries), q_e passive eavesdroppings (Execute-queries), and a bounded number of hash queries to the random oracles, both the AKE advantage and the success probability against unilateral authentication are upper-bounded by an expression whose main term is the probability of the q_s most probable passwords (up to a small constant factor), plus negligible terms. Here q_A and q_B denote the number of A and B instances involved during the attack, and the time bound also accounts for the time needed to evaluate one law operation. Let us recall that the output length of the hash used for the authenticator also enters the bound.

For lack of space, we refer to the full version [8] for the complete proof; here we justify the main terms in the security result.
Ideally, when one considers a password-based authenticated key exchange, one would like to prove that the two above success/advantage quantities are upper-bounded by the probability of the q_s most probable passwords, plus some negligible terms. For technical reasons in the proof (to get a clean proof) we have a small additional constant factor. This main term is indeed the basic attack one cannot avoid: the adversary guesses a password and makes an on-line trial. Other ways for it to break the protocol are:
– use a function that is not a permutation, and in particular not a surjection. With the view of the flows, the adversary tries all the passwords, and only a strict fraction leads to a value in the image of the function: this is a partition attack. But for that, it has to forge a proof of validity for the index. Hence a term involving the soundness of the proof of validity.
– use the authenticator Auth to check the correct password. But this requires the ability to compute the function. Hence a term involving the success probability of inverting the family.
– send a correct authenticator Auth, but being lucky. Hence a term proportional to the probability of guessing the authenticator at random.
Additional negligible terms come from very unlikely collisions. All the remaining kinds of attacks need some information about the password.

6 A Concrete Example: The SQRT-IPAKE Protocol
An important contribution of this work (at least from a practical point of view)
is the first efficient and provably secure password-based key exchange protocol
based on factoring. The formal protocol appears in Figure 2. Here we describe
the details of this specific implementation.

Fig. 2. The SQRT-IPAKE protocol

6.1 Description of the SQRT-IPAKE Protocol
In order for the protocol to be correct, we need to make sure that the adopted function is actually an isomorphism. As seen in Section 2.4, this is the case if one assumes that the modulus N is the product of two Blum primes and the function is the signed squaring.
We thus set G = H = J_N and, of course, the internal law is the multiplication in the group. In order for the password PW to be generated correctly, we need a hash function onto J_N. Constructing such a function is pretty easy: we start from a hash function onto Z_N and we iterate it until we get an output in J_N. The details of this technique are deferred to the full version of this paper [8]. Here we stress that, since J_N contains roughly half of the residues, very few iterations are sufficient. As already noticed, we require Alice to prove the following about the modulus N, so that the function is actually an isomorphism:
– The modulus N is in the correct range;
– The Jacobi symbol of −1 is +1 in Z_N^* (this is to make sure that the function is actually a morphism);
– The signed squaring function is actually an isomorphism from J_N onto J_N (this is to make sure that any square in Z_N^* has exactly 4 roots).
Proving the first two statements is trivial. For the third one we need some new tools.
6.2 Proof of Correct Modulus
With the following theorem (whose proof can be found in the full version of this paper [8]) we show that if N is a composite modulus (with at least two different prime factors) then the proposed function is an isomorphism as soon as it is surjective.
Theorem 2. Let N be a composite modulus containing at least two different prime factors and such that −1 has Jacobi symbol +1 in Z_N^*. Moreover, let f be the morphism defined above. The following facts are true:
1. If f is surjective, then it is an isomorphism.
2. If f is not surjective, then at most half of the elements in J_N have a preimage.
The theorem above leads to the protocol Prove-Surjective (see Figure 3). The basic idea of this protocol is that we prove that our function is a bijection by proving it is surjective. Soundness follows from the second statement. However, in order to fall within the hypotheses of the theorem, we need to make sure that N is actually a composite modulus of the required form (i.e. with at least two distinct prime factors). We achieve this with the Prove-Composite protocol (see Figure 3). The correctness (completeness, soundness and zero-knowledge properties) of these protocols is deferred to the full version of this paper [8].
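The soundness idea behind a Prove-Surjective-style check can be illustrated with a toy cut-and-choose round. This is not the paper's actual protocol (which appears in Figure 3), but a sketch of the underlying argument; for concreteness it uses plain squaring on QR_N, which is an automorphism for a Blum modulus, instead of the signed variant, and all helper names are our own.

```python
import random
from math import gcd

# Cut-and-choose sketch: the verifier sends random target points and the
# prover, who knows the factorization of N, exhibits preimages. If the map
# missed half the points (Theorem 2), each round would fail with probability
# at least 1/2, so k rounds leave a soundness error of about 2^-k.
p, q = 7, 11                   # toy Blum primes (both are 3 mod 4)
N = p * q

def crt(rp, rq):
    """Combine residues mod p and mod q into a residue mod N."""
    return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % N

def sqrt_qr(y):
    """Prover side: the unique square root of y that is itself a QR."""
    rp, rq = pow(y, (p + 1) // 4, p), pow(y, (q + 1) // 4, q)
    for sp in (rp, p - rp):
        for sq in (rq, q - rq):
            x = crt(sp, sq)
            # keep the root that is a quadratic residue modulo both primes
            if pow(x, (p - 1) // 2, p) == 1 and pow(x, (q - 1) // 2, q) == 1:
                return x
    raise ValueError("not a quadratic residue")

random.seed(1)
for _ in range(10):            # k = 10 challenge rounds
    r = random.randrange(1, N)
    while gcd(r, N) != 1:
        r = random.randrange(1, N)
    y = r * r % N              # verifier's random point of QR_N
    x = sqrt_qr(y)             # prover answers with a preimage
    assert x * x % N == y      # verifier's check
```

A cheating prover whose function is not surjective cannot answer a challenge without a preimage, which is what the second statement of Theorem 2 exploits.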
Remark 3. We point out that our protocol is very efficient for the verifier, in terms of modular multiplications. It is also possible for Alice to use the same modulus N for different sessions.

Acknowledgements
We thank the anonymous referees for their fruitful comments.
Fig. 3. Proof of Correct Modulus

References

1. M. Bellare, D. Pointcheval, and P. Rogaway. Authenticated Key Exchange Secure Against Dictionary Attacks. In Eurocrypt '00, LNCS 1807, pages 139–155. Springer-Verlag, Berlin, 2000.
2. M. Bellare and P. Rogaway. The AuthA Protocol for Password-Based Authenticated Key Exchange. Contributions to IEEE P1363. March 2000.
3. S. M. Bellovin and M. Merritt. Encrypted Key Exchange: Password-Based Protocols Secure against Dictionary Attacks. In Proc. of the Symposium on Security and Privacy, pages 72–84. IEEE, 1992.
4. C. Boyd, P. Montague, and K. Nguyen. Elliptic Curve Based Password Authenticated Key Exchange Protocols. In ACISP '01, LNCS 2119, pages 487–501. Springer-Verlag, Berlin, 2001.
5. V. Boyko, P. MacKenzie, and S. Patel. Provably Secure Password Authenticated Key Exchange Using Diffie-Hellman. In Eurocrypt '00, LNCS 1807, pages 156–171. Springer-Verlag, Berlin, 2000. Full version available at: http://cm.bell-labs.com/who/philmac/research/.
6. E. Bresson, O. Chevassut, and D. Pointcheval. Security Proofs for Efficient Password-Based Key Exchange. In Proc. of the 10th CCS, pages 241–250. ACM Press, New York, 2003.
7. E. Bresson, O. Chevassut, and D. Pointcheval. New Security Results on Encrypted Key Exchange. In PKC '04, LNCS, pages 145–159. Springer-Verlag, Berlin, 2004.
8. D. Catalano, D. Pointcheval, and T. Pornin. IPAKE: Isomorphisms for Password-based Authenticated Key Exchange. In Crypto '04, LNCS. Springer-Verlag, Berlin, 2004. Full version available from http://www.di.ens.fr/users/pointche/.
9. W. Diffie and M. E. Hellman. New Directions in Cryptography. IEEE Transactions on Information Theory, IT-22(6):644–654, November 1976.
10. Y. Dodis, J. Katz, S. Xu, and M. Yung. Strong Key-Insulated Signature Schemes. In PKC '03, LNCS, pages 130–144. Springer-Verlag, Berlin, 2003.
11. D. Dolev, C. Dwork, and M. Naor. Non-Malleable Cryptography. SIAM Journal on Computing, 30(2):391–437, 2000.
12. T. El Gamal. A Public Key Cryptosystem and a Signature Scheme Based on Discrete Logarithms. IEEE Transactions on Information Theory, IT-31(4):469–472, July 1985.
13. R. Gennaro and Y. Lindell. A Framework for Password-Based Authenticated Key Exchange. In Eurocrypt '03, LNCS 2656, pages 524–543. Springer-Verlag, Berlin, 2003.
14. O. Goldreich and Y. Lindell. Session-Key Generation Using Human Passwords Only. In Crypto '01, LNCS 2139, pages 408–432. Springer-Verlag, Berlin, 2001.
15. S. Halevi and H. Krawczyk. Public-Key Cryptography and Password Protocols. In Proc. of the 5th CCS. ACM Press, New York, 1998.
16. J. Katz, R. Ostrovsky, and M. Yung. Efficient Password-Authenticated Key Exchange Using Human-Memorizable Passwords. In Eurocrypt '01, LNCS 2045, pages 475–494. Springer-Verlag, Berlin, 2001.
17. S. Lucks. Open Key Exchange: How to Defeat Dictionary Attacks Without Encrypting Public Keys. In Proc. of the Security Protocols Workshop, LNCS 1361. Springer-Verlag, Berlin, 1997.
18. P. MacKenzie, S. Patel, and R. Swaminathan. Password-Authenticated Key Exchange Based on RSA. In Asiacrypt '00, LNCS 1976, pages 599–613. Springer-Verlag, Berlin, 2000.
19. P. MacKenzie and R. Swaminathan. Secure Network Authentication with Password Identification. Submission to IEEE P1363a. August 1999.
20. T. Okamoto and S. Uchiyama. A New Public Key Cryptosystem as Secure as Factoring. In Eurocrypt '98, LNCS 1403, pages 308–318. Springer-Verlag, Berlin, 1998.
21. P. Paillier. Public-Key Cryptosystems Based on Composite Degree Residuosity Classes. In Eurocrypt '99, LNCS 1592, pages 223–238. Springer-Verlag, Berlin, 1999.
22. M. O. Rabin. Digitalized Signatures. In R. Lipton and R. De Millo, editors, Foundations of Secure Computation, pages 155–166. Academic Press, New York, 1978.
23. R. Rivest, A. Shamir, and L. Adleman. A Method for Obtaining Digital Signatures and Public Key Cryptosystems. Communications of the ACM, 21(2):120–126, February 1978.
24. A. Sahai. Non-Malleable Non-Interactive Zero-Knowledge and Chosen-Ciphertext Security. In Proc. of the 40th FOCS. IEEE, New York, 1999.
25. V. Shoup. Lower Bounds for Discrete Logarithms and Related Problems. In Eurocrypt '97, LNCS 1233, pages 256–266. Springer-Verlag, Berlin, 1997.
26. F. Zhu, A. H. Chan, D. S. Wong, and R. Ye. Password Authenticated Key Exchange based on RSA for Imbalanced Wireless Network. In Proc. of ISC '02, LNCS 2433, pages 150–161. Springer-Verlag, Berlin, 2002.

Randomness Extraction and Key Derivation
Using the CBC, Cascade and HMAC Modes*

Yevgeniy Dodis (1), Rosario Gennaro (2), Johan Håstad (3),
Hugo Krawczyk (4), and Tal Rabin (2)
(1) New York University
(2) IBM Research
(3) Royal Institute of Technology, Sweden
(4) Technion, Israel, and IBM Research

Abstract. We study the suitability of common pseudorandomness modes associated with cryptographic hash functions and block ciphers (CBC-MAC, Cascade and HMAC) for the task of "randomness extraction", namely, the derivation of keying material from semi-secret and/or semi-random sources. Important applications for such extractors include the derivation of strong cryptographic keys from non-uniform sources of randomness (for example, to extract a seed for a pseudorandom generator from a weak source of physical or digital noise), and the derivation of pseudorandom keys from a Diffie-Hellman value.
Extractors are closely related in their applications to pseudorandom functions and thus it is attractive to (re)use the common pseudorandom modes as randomness extractors. Yet, the crucial difference between pseudorandom generation and randomness extraction is that the former uses random secret keys while the latter uses random but known keys. We show that under a variety of assumptions on the underlying primitives (block ciphers and compression functions), ranging from ideal randomness assumptions to realistic universal-hashing properties, these modes induce good extractors. Hence, these schemes represent a more practical alternative to combinatorial extractors (that are seldom used in practice), and a better-analyzed alternative to the common practice of using SHA-1 or MD5 (as a single un-keyed function) for randomness extraction. In particular, our results serve to validate the method of key extraction and key derivation from Diffie-Hellman values used in the IKE (IPsec's Key Exchange) protocol.

1 Introduction
1.1 Key Derivation and Randomness Extractors
Key derivation is a central functionality in cryptography concerned with the
process of deriving secret and random cryptographic keys from some source of
* Extended abstract. Full version available at eprint.iacr.org/2004/

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 494“510, 2004.
© International Association for Cryptologic Research 2004
semi-secret randomness. In general, it is sufficient to derive a single random and
secret key (say of length 128) which can then be used to key a pseudorandom
function (or a pseudorandom generator) to obtain further pseudorandom keys
as needed. Thus, a basic question, which motivates the work presented here,
is how to derive such a random and secret key when all that is given is an
imperfect source of randomness which contains some good amount of secret
(computational) entropy, but this entropy is not presented in a direct form of
uniformly (or pseudorandomly) distributed secret bits. This problem arises in
a variety of scenarios such as when deriving keys from a non-uniform source of
noise (as used, for example, by physical random generators) or from semi-random
data (say, coming from user™s input or the sampling of computer events, etc.).
This is also the case when deriving keys from a Diffie-Hellman (DH) exchange.
Let us elaborate on the latter case.
Let's assume that two parties run a DH protocol in order to agree on a shared secret key: namely, they exchange DH exponentials g^x and g^y, and compute the DH value g^{xy}. In this case, g^x and g^y as seen by the attacker fully determine g^{xy}. Yet, it is assumed (by the Decisional Diffie-Hellman, DDH, assumption) that a computationally-bounded attacker cannot distinguish g^{xy} from a random element in the group generated by g. Thus, one can assume that g^{xy} contains roughly as many bits of computational entropy as the bit-length of the group order, relative to the view of the attacker (for a formal treatment of computational entropy in the DH context see [GKR04]). However, this entropy is spread over the whole value g^{xy}, which may be significantly longer than that. Thus, we are in a situation similar to that of an imperfect source of randomness as discussed above. In particular, g^{xy} cannot be used directly as a cryptographic key, but rather as a source from which to extract a shorter string (say, of length 128) of full computational entropy, which can then be used as a cryptographic key.
The tools used to derive a uniform key from these sources of imperfect ran-
domness are often referred to as randomness extractors. The amount of theo-
retical results in this area is impressive; moreover, some of the constructions
that have proven extraction guarantees are also efficient (see [Sha02] for a recent
survey). One such example is the so-called "pairwise independent universal hash functions" (also called "strongly universal") [CW79], which have quite efficient implementations and provable extraction properties. In particular, [HILL99] shows (see also [Lub96, Gol01]) that if an input distribution has sufficient min-entropy (meaning that no single value is assigned a too-large probability, even though the distribution may be far from uniform) then hashing this input into a (sufficiently) shorter output, using a function chosen at random from a family of strongly universal hash functions, results in an output that is statistically close to uniform. (This result is often referred to as the "Leftover Hash Lemma".)
For example, consider that g is an element of prime order q in Z_p^* (i.e., p and
q are primes and q divides p-1), and that |q| = 512 and |p| = 1024. In this case the
DDH assumption guarantees that the value g^xy hides (from the attacker) 512 bits of
computational entropy, yet these bits are spread in some unknown way among the
1024 bits of g^xy.
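To make the "leftover hash" style of extraction concrete, here is a minimal sketch of ours (not a construction from the paper): the family h_{a,b}(x) = ((a*x + b) mod p) mod 2^m over a prime p larger than the input domain is close to pairwise independent, and the key (a, b) may be public.

```python
import secrets

# Hypothetical sketch: h_{a,b}(x) = ((a*x + b) mod p) mod 2^m. For prime
# p > 2^n this family is close to pairwise independent; truncating mod 2^m
# adds only a small error term to the Leftover Hash Lemma bound.
P = (1 << 521) - 1  # the Mersenne prime 2^521 - 1; 512-bit inputs fit below it

def sample_key():
    """Pick a random function from the family; the key may be published."""
    return secrets.randbelow(P - 1) + 1, secrets.randbelow(P)

def extract(key, x, m=128):
    """Hash a high-min-entropy input x (0 <= x < P) down to m bits."""
    a, b = key
    return ((a * x + b) % P) % (1 << m)

# Usage: derive a 128-bit key from a 512-bit secret whose entropy is spread out.
key = sample_key()
k = extract(key, secrets.randbits(512))
assert 0 <= k < 1 << 128
```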

496 Yevgeniy Dodis et al.

Yet, in spite of these results and an extensive literature studying their ap-
plication to real cryptographic problems (such as those mentioned earlier, and
in particular for the DH case [GKR04]) one seldom encounters in practice the
use of strong universal hashing or other proven extractors. Instead, the common
practice is to use cryptographic hash functions (such as MD5 and SHA-1) for
the purpose of randomness extraction. A main reason for this practice, justified
by engineering considerations, is that cryptographic hash functions are readily
available in software and hardware implementations, and are required by most
cryptographic applications for purposes other than randomness extraction (e.g.,
as pseudorandom functions). Therefore, it is attractive and convenient to use
them for key extraction as well. Also, the common perception that these hash
functions behave as random functions (formalized via the notion of "random
oracles") makes them intuitively appealing for extraction applications.

1.2 Randomness Extraction via Common Chaining Modes

In this paper we attempt to bridge the gap between the world of provable extraction
and the common practice of relying on idealized hash functions. The question
that we ask is what is the best way to use cryptographic hash functions, or
other widely available cryptographic tools such as block ciphers, for the task of
randomness extraction. Specifically, we consider three common modes of oper-
ation: CBC chaining, cascade (or Merkle-Damgard) chaining, and HMAC, and
analyze the appropriateness of these modes as extraction tools. Since the goal is
to provide as general (and generic) as possible results, we do not investigate the
extraction properties of specific functions (say SHA-1 or AES) but rather ab-
stract the basic primitives (the compression functions in the case of the cascade
and HMAC modes, and block ciphers in the case of CBC), as random functions
or permutations².
Before going on with the description of our results, it is worth considering
the following issue. Given that the common practice is to extract randomness
using a hash function modeled as a random oracle, then how much do we gain
by analyzing the above modes under the weaker, but still idealized, randomness
assumption on the underlying basic primitives? There are several aspects to this question.
The first thing to note is that modeling the compression function of SHA-
1, for example, as a random function, or as a family of random functions, is a
strict relaxation of modeling SHA-1 (as a single un-keyed function) as a ran-
dom function. This is easily seen from the fact that even if one starts with a
random function as the compression function, the result of the cascade chain-
ing (which is how SHA-1 is derived) is not a random function. (For example,
in the cascade construction, the probability that two L-block inputs that differ
only in their first block are mapped to the same output is roughly L times larger
than for a random function.)
² In the case of HMAC we obtain results based on non-ideal assumptions on the
underlying basic primitives (see Section 1.3).

Randomness Extraction and Key Derivation 497

Another important point is that
cryptographic design work focuses on building the basic blocks, i.e. a compres-
sion function or a block cipher. Thus, making an assumption on these primitives
captures the design goals that implementors can attempt to satisfy. Also,
analysis of the type presented here, rather than implying the security of any spe-
cific implementation of these functions, serves to validate the suitability of the
corresponding chaining modes for some defined goal (in our case the goal is ran-
domness extraction). Indeed, the common approach for analyzing such modes
(e.g., [Dam89,BKR94,BCK96a,BCK96b]) is to make some assumption on the
basic primitive (for example, assuming the underlying compression function to
be a pseudorandom function, or a secure MAC, or a collision-resistant hash func-
tion) and then proving that these or other properties are preserved or implied
by the chaining operation.
In addition, the “monolithic” randomness assumption on a single (unkeyed)
function such as SHA-1 is inappropriate for the setting of randomness extraction
as no single function (even if fully random) can extract a close-to-uniform distri-
bution from arbitrary high-entropy input distributions. This is so, since once the
function is fixed (even if to purely random values) then there are high-entropy
input distributions that will be mapped to small subsets of outputs³. Therefore,
the viable approach for randomness extraction is to consider a family (or col-
lection) of functions indexed by a set of keys. When an application requires the
hashing of an input for the purpose of extracting randomness, then a random
element (i.e., a function) from this family is chosen and the function is applied
to the given input. While there may be specific input distributions that interact
badly with specific functions in the family, a good randomness-extraction family
will make this “bad event” happen with very small probability. Universal hash
families, mentioned before, are examples of this approach. An important point
here is that, while the choice of a function from the family is done by selecting
a random index, or key, this key does not need to be kept secret (this is im-
portant in applications that use extraction to generate secret keys; otherwise,
if we required this index to be secret then we would have a “chicken and egg”
In our setting, families of keyed functions come up naturally with block ci-
phers and compression functions (for the latter we consider, as in HMAC, the
variable IV as the key to the function). These functions are defined on fixed
length inputs (e.g., 512 bits in the case of compression function of SHA-1, or
128 in the case of AES). Then, to hash arbitrarily long inputs, we extend these
families by the appropriate chaining mode: cascade chaining (or HMAC) for com-
pression functions, and CBC-MAC in the case of block ciphers. What makes the
analysis of these functions challenging (in the setting of randomness extraction)
is that, as discussed before, the key to the function is random but known. For
example, the fact that the above functions are widely believed to be pseudorandom
does not help much here since, once the key is revealed, the pseudorandom prop-
erties may be lost completely (see the full paper). Yet, as we will see in Section 4.2,
we do use the pseudorandom property in some of our analysis. Also worth not-
ing is that using families that are pseudorandom for extraction is particularly
convenient, since these same functions can then be used by the same application
(for example, a key-exchange protocol, a random generator, etc.) for further key
derivation (using the extracted key to key the pseudorandom function).
³ For example, let F be a random function from n bits to n bits and let S denote
the subset of inputs that F maps to outputs with a low-order bit of zero. If we
consider the uniform distribution on S as the input distribution, then this distribution
has almost full entropy, yet the output of F on S is trivially distinguishable from
uniform.
The last question is how to generate the random known keys used by the
extractor. Technically this is not hard, as the parties can generate the appropri-
ate randomness, but the exact details depend on the application. For example,
in the DH key exchange discussed earlier, the parties exchange in the clear ran-
domly chosen values, which are then combined to generate a single key r for the
extractor family (e.g., HMAC-SHA1). The shared key is then set to the extractor's
output, under the key r, on the DH value g^xy.
We note that this is substantially the procedure in place in the IKE protocol
[RFC2409,IKEv2] (see also [Kra03]), and this paper presents the first formal
analysis of this design.
A similar DH key extraction step is required in non-interactive scenarios,
such as ElGamal or Cramer-Shoup encryption. There the extractor key can
be chosen either by the encryptor and appended to the ciphertext, or chosen by
the decryptor and included in the public key (the latter is mandatory in case
we want CCA-security, as we don't want to leave the choice of the extractor key in
the hands of the adversary). For a different example, consider a cryptographic hardware
device, containing a physical random generator that samples some imperfect
source of noise. In this case the application can choose a random hash function
in the family and wire-in its key into the randomness generation circuit [BST03].
Notice that by using our results, it will be possible to perform the extraction
step using circuitry (such as a block-cipher or a cryptographic hash function)
which is very likely to already be part of the device.

1.3 Our Results

The Extraction Properties of CBC-MAC Mode. We show, in Section 3, that if
the underlying cipher is a random permutation over {0,1}^n and the input
distribution has sufficiently high min-entropy, then the statistical distance between
the output of F (where F represents the function computed in CBC-MAC mode
over L blocks) and the uniform distribution on {0,1}^n is small. As an example,
in the application (discussed before) in which we use the CBC-MAC function F
to hash a Diffie-Hellman value computed over a DDH group of sufficiently large
order, we get that the output distribution is computationally indistinguishable
from a distribution whose distance from uniform is negligible, hence proving
(under DDH) that the output is computationally indistinguishable
from uniform (and thus suitable for use as a cryptographic key). Note that if
one works over Z_p^* for a 1024-bit prime p, then all we need to assume is a
min-entropy of 256 out of the 1024 bits of g^xy. In the full paper we show that
for input distributions with particularly high entropy (in particular those that
contain a block of almost-full entropy) the CBC-MAC mode guarantees an
almost-uniform output for any family of permutations.

The Extraction Properties of Cascade Chaining. In Section 4 we study the cas-
cade (or Merkle-Damgard) chaining used in common hash functions such as
MD5 and SHA-1. We show these families to be good extractors when modeling
the underlying compression function as a family of random functions. However,
in this case we need a stronger assumption on the entropy of the input dis-
tribution. Specifically, if the output of the compression function is c bits long
(typically, c = 128 or 160), we assume sufficient min-entropy over the whole input,
and "enough" min-entropy over the distribution induced on the last block of
the input (typically of length 512 bits). For example, if the last block has m bits of
min-entropy, and we assume L blocks, then the statistical distance between the
output of the cascade construction and the uniform distribution (on {0,1}^c) is
bounded in terms of m and L. We note that the above restriction on the last-block distribu-
tion is particularly problematic in the case of practical functions such as MD5
and SHA since the input-padding conventions of these functions may cause a
full fixed block to be added as the last block of input. In this case, the output
distribution is provably far from uniform. Fortunately, we show that our anal-
ysis is applicable also to the padded-input case. However, instead of proving a
negligible statistical distance, what we show is that the output of the “padded
cascade” is computationally indistinguishable from uniform, a result that suffices
for the cryptographic applications of extraction. Finally, we prove that when ev-
ery block of input has large-enough min-entropy (conditioned on the distribution
of previous blocks), then the above extraction results hold under the sole (and
non-ideal) assumption that the underlying compression function is a δ-AU family of
functions (for sufficiently small δ).

The Extraction Properties of HMAC. HMAC is the most widely used pseudoran-
dom mode based on functions such as MD5 or SHA, thus proving its extraction
properties is extremely important. Our main result concerning the good extrac-
tion properties of HMAC is proven on the basis of a high min-entropy
in the input distribution, without relying on any particular entropy in the
last block of the input. Specifically, let F denote the keyed hash function underlying
an instantiation of HMAC (e.g., F is SHA-1 with a random IV) and let f' be the
corresponding outer compression function. Then we show that if F is collision
resistant and f' is modeled as a random function, then the output of HMAC (on
input drawn from the given distribution) is indistinguishable from uniform for any
attacker that is restricted in the number of queries to the function f'. Moreover,
if the compression function itself is a good extractor, then HMAC is a good
extractor too. However, in this latter case, if we are interested in an output of
close-to-uniform bits, then the key to the underlying compression function needs
to be sufficiently larger than that output. As a concrete example, if we need
to generate a pseudorandom key of length 160, then we can use HMAC with
SHA2-512. Note that this result is particularly interesting in the sense that it
uses no idealized assumptions, and yet the output of HMAC is provably close to
uniform (even against completely unbounded attackers, including attackers that
can break the collision resistance of F).

Remark (Pseudorandom functions with known keys). It is tempting to use the
pseudorandomness properties enjoyed by the modes studied here as a basis to
claim good (computational) extraction properties. For example, in spite of the
fact that the output of these functions may be statistically very far from uniform,
it is still true that no (efficient) standard statistical test will be able to tell apart
this output from random (simply because such a test does not use the knowledge
of the key, even when this key is known). Yet, for cryptographic applications, using a
family of functions as extractors, based solely on the assumption that the family
is pseudorandom, may be totally insecure. We illustrate this point by showing,
in the full paper, an example of a secure pseudorandom family whose output is
trivially distinguishable from randomness once the key is known.
All proofs appear in the full version of the paper.

2 Universal Hashing and Randomness Extraction
Preliminaries. For a probability distribution X, we use the notation x ← X to
mean that x is chosen according to the distribution X. For a set S, x ← S is used
to mean that x is chosen from S with uniform probability. Also, for a probability
distribution X, we use the notation Pr_X[x] to denote the probability assigned by X
to the value x. (We often omit the subscript when the probability distribution
is clear from the context.) Throughout the paper, we will use d(L) to denote the
maximal number of divisors of any number smaller than or equal to L.
crude upper bound, we will sometimes use the fact that
MIN-ENTROPY AND COLLISION PROBABILITY. For a probability distribution X
over {0,1}^n, we define its min-entropy as the minimum integer m such that
Pr_X[x] ≤ 2^-m for all x. We denote the min-entropy of such an X by H∞(X).
The collision probability of X is Col(X) = Σ_x Pr_X[x]^2,
and the Renyi (or collision) entropy of X is H_2(X) = -log_2(Col(X)). It is easy to
see that these two notions of entropy are related: H_2(X) ≥ H∞(X) ≥ H_2(X)/2.
In particular, we will frequently use the fact that Col(X) ≤ 2^-H∞(X).
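These entropy measures are easy to sanity-check numerically; the following sketch of ours computes H∞, Col, and H_2 for a small distribution and verifies the stated relations (the functions return real-valued entropies rather than the integer floor used in the definition, which does not affect the inequalities).

```python
import math

def min_entropy(dist):
    """H_inf(X) = -log2(max_x Pr[x])."""
    return -math.log2(max(dist.values()))

def collision_prob(dist):
    """Col(X) = sum_x Pr[x]^2: the chance two independent samples collide."""
    return sum(p * p for p in dist.values())

def renyi_entropy(dist):
    """H_2(X) = -log2(Col(X))."""
    return -math.log2(collision_prob(dist))

d = {"a": 0.5, "b": 0.25, "c": 0.25}
assert renyi_entropy(d) >= min_entropy(d)           # H_2 >= H_inf
assert min_entropy(d) >= renyi_entropy(d) / 2       # H_inf >= H_2 / 2
assert collision_prob(d) <= 2 ** (-min_entropy(d))  # Col(X) <= 2^{-H_inf(X)}
```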
STATISTICAL DISTANCE. Let X and Y be two probability distributions over the
set S. The statistical distance between the distributions X and Y is defined as
SD(X, Y) = (1/2) Σ_{s in S} |Pr_X[s] - Pr_Y[s]|. If two distributions have statistical
distance of (at most) δ, then we refer to them as δ-close. We note that δ-close
distributions cannot be distinguished with probability better than δ, even by a
computationally unbounded adversary. It is also well known that if X has support
on some set S and U is the uniform distribution over this set, then
SD(X, U) ≤ (1/2) sqrt(|S| Col(X) - 1).   (1)
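The definition translates directly into code; this small sketch of ours computes SD and illustrates the distinguishing bound on a biased bit.

```python
def statistical_distance(p, q):
    """SD(X, Y) = (1/2) * sum_s |Pr_X[s] - Pr_Y[s]| over the joint support."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)

# A slightly biased bit vs. a uniform bit: the distributions are 0.1-close,
# so no adversary (even unbounded) distinguishes them with advantage > 0.1.
biased = {0: 0.6, 1: 0.4}
uniform = {0: 0.5, 1: 0.5}
assert abs(statistical_distance(biased, uniform) - 0.1) < 1e-12
```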


Definition 1. Let m and l be integers, and let H be a family of hash functions
with domain {0,1}^m, range {0,1}^l, and key space K. We say that the family
is δ-AU (δ-almost universal) if for every pair of different inputs x, x' from {0,1}^m
it holds that Pr[h_k(x) = h_k(x')] ≤ δ, where the probability is taken over k ← K.
For a given probability distribution X on {0,1}^m, we say that H
is δ-AU w.r.t. X if Pr[h_k(x) = h_k(x')] ≤ δ, where the probability is taken over
k ← K and x, x' ← X, conditioned on x ≠ x'.

Clearly, a family is δ-AU if it is δ-AU w.r.t. all distributions on {0,1}^m.
The notion of universal hashing originates with the seminal papers by Carter
and Wegman [CW79,WC81]; the variant used here was first formulated in
[Sti94]. The main usefulness of this notion comes from the following lemma whose
proof is immediately obtained by conditioning on whether the two independent
samples from collide or not (below E denotes the expected value).

Lemma 1. If H is δ-AU w.r.t. X, then E_k[Col(h_k(X))] ≤ Col(X) + δ.

Now, using the above lemma and Eq. (1), the lemma below extends the
well-known “Leftover Hash Lemma” (LHL) from [HILL99] in two ways. First,
it relaxes the pairwise-independence condition assumed by that lemma on the
family of hash functions, and allows for “imperfect” families in which the collision
probability is only close to perfect (i.e., 2^-l + δ instead of exactly 2^-l). Second, it
allows for the collision probability to depend on the input distribution rather
than being an absolute property of the family of hash functions. We use these
straightforward extensions of the LHL in an essential way for achieving our
results. We also note that the standard LHL can be obtained from Lemma 2
below as a special case.

Lemma 2. Let l and m be integers, let X be a probability distribution over
{0,1}^m, and let H be a family of hash functions with domain {0,1}^m and
range {0,1}^l. If H is δ-AU w.r.t. X, U is uniform
over {0,1}^l, and k is uniform over K, then the pair (k, h_k(X)) is close to (k, U).

Remark 1. It is important to note that for the above lemma to be useful one
needs δ very close to 2^-l, or otherwise the derived bound on the statistical closeness ap-
proaches 1. Moreover, this fact is not the result of a sub-optimal analysis; rather,
there are examples of families whose collision probability is merely twice the optimal
one and that generate outputs easily distinguishable from uniform. For example, if H'
is a family of pairwise-independent hash functions, and we de-
fine a new family H which is identical to H' except that it replaces the last
bit of output with 0, then the new family has collision probability only twice that of H', yet its
output (which has a fixed bit) is trivially distinguishable from uniform.
The fact that we need δ so close to 2^-l makes the analysis of CBC and
the cascade construction presented in the next sections non-trivial. In particular,
the existing analyses of these functions (such as [BKR94,BCK96b]) are too weak
for our purposes as they yield upper bounds on the collision probability of these
constructions that are larger than
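Remark 1's counterexample can be run directly; this toy sketch of ours zeroes one output bit of an almost-universal family, which barely changes the collision probability yet makes the output trivially distinguishable.

```python
import secrets

# Take an (almost) pairwise-independent family and zero the last output bit.
# The collision probability only doubles, yet every output is even, so the
# output distribution is trivially far from uniform.
P = (1 << 127) - 1  # Mersenne prime; toy 126-bit inputs

def h(a, b, x, m=16):
    """Member of the base family: ((a*x + b) mod P) mod 2^m."""
    return ((a * x + b) % P) % (1 << m)

def h_bad(a, b, x, m=16):
    """Same family, but with the low-order bit of the output fixed to 0."""
    return h(a, b, x, m) & ~1

a, b = secrets.randbelow(P - 1) + 1, secrets.randbelow(P)
outs = [h_bad(a, b, secrets.randbits(126)) for _ in range(1000)]
assert all(o % 2 == 0 for o in outs)  # the distinguisher just reads one bit
```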

3 The CBC-MAC Construction

Here we study the suitability of the CBC-MAC mode as a randomness extractor.
Recall that for a given permutation π on {0,1}^n, the CBC-MAC computation
of π on an input x = (x_1, ..., x_L) with L blocks in {0,1}^n is defined as y_L,
where the latter value is set by the recursion: y_0 = 0^n and y_i = π(y_{i-1} XOR x_i)
for i = 1, ..., L. We denote the output of the above CBC-MAC process by F_π(x).
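At toy size the recursion reads as follows (a sketch of ours; a truly random permutation is only samplable for small n, which is exactly how the ideal analysis treats it):

```python
import random

n = 8                      # toy block size
rng = random.Random(1)
perm = list(range(1 << n))
rng.shuffle(perm)          # pi: a uniformly random permutation on {0,1}^n

def cbc_mac(blocks):
    """y_0 = 0^n; y_i = pi(y_{i-1} XOR x_i); the output is y_L."""
    y = 0
    for x in blocks:
        y = perm[y ^ x]
    return y

# Unrolling the recursion for three blocks:
assert cbc_mac([0x3c, 0xa5, 0x17]) == perm[perm[perm[0x3c] ^ 0xa5] ^ 0x17]
```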
Our main result in this section states the extraction properties of CBC-MAC
for a random permutation on elements. To state it more compactly,
we let and notice that
when (here we use the fact that

Theorem 1. Let F denote the CBC-MAC mode over a random permutation
π on {0,1}^n, and let X be an input distribution to F defined over L-block
strings. Then the statistical distance between F(X) and the uniform distribu-
tion on {0,1}^n is at most

In particular, assuming and the above statistical distance
is at most

The proof of the theorem follows from Lemma 2 in combination with the
following lemma that shows that CBC-MAC mode with a random permutation
is δ-AU for sufficiently small δ.

Lemma 3. Let F denote the CBC-MAC mode over a permutation on
For any if then where
the probability is over the choice of a random permutation

4 The Cascade Construction

We first recall the Merkle-Damgard approach to the design of cryptographic hash
functions and introduce some notation and terminology. For given integers b and
c, let {f_k} be a family of functions such that k ranges over {0,1}^c and,
for all k, the function f_k maps b bits into c bits. On the basis of this family
we build another family {F_k} that works on inputs of any length which
is a multiple of b and produces a c-bit output. For each k, the function F_k
is defined as follows. Let x = x_1 x_2 ... x_L, for some L, with x_i in {0,1}^b
(for all i), denote the input to F_k; we define L+1 variables (each of length c)
as y_1 = k and y_{i+1} = f_{y_i}(x_i), and set F_k(x) = y_{L+1}. For processing
inputs of arbitrary length one needs a rule for padding inputs to a length that is
a multiple of b. Specific functions that build on the above approach, such as MD5
and SHA-1, define their specific padding; we expand on this issue in Section 4.2.
For the moment we assume inputs of length Lb for some L. Some more notation:
sometimes we use F to denote the family {F_k}, and we write F(x) to denote
the random variable F_k(x) for k ← {0,1}^c. Finally, we use K to denote 2^c.
The family {f_k} is called the "compression function(s)" and the family
{F_k} is referred to as the "cascade construction" (over the compression func-
tions {f_k}). Typical values for the parameters of the compression function are
b = 512 and c = 128 or 160.
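The keyed cascade can be sketched as follows; as a stand-in compression function we truncate SHA-256 of key||block, which is an illustration of ours rather than the paper's idealized random family.

```python
import hashlib

c = 16  # compression-function output, in bytes (stand-in for c bits)
b = 64  # block length, in bytes (stand-in for b bits)

def f(k, x):
    """Stand-in keyed compression function: (c-byte key, b-byte block) -> c bytes."""
    return hashlib.sha256(k + x).digest()[:c]

def cascade(k, msg):
    """y_1 = k; y_{i+1} = f_{y_i}(x_i); the output is y_{L+1}."""
    assert len(msg) % b == 0, "input must be a whole number of blocks"
    y = k
    for i in range(0, len(msg), b):
        y = f(y, msg[i:i + b])
    return y

out = cascade(b"\x00" * c, b"A" * (2 * b))
assert len(out) == c
```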

4.1 The Basic Cascade Construction

The main result of this section is the following. Assume X is an input distribu-
tion with sufficiently many bits of (overall) min-entropy and "enough" bits of min-entropy in
its last block ("enough" will be quantified below). For the cascade con-
struction we model the underlying family of compression functions as a family
of random functions (with c-bit outputs). Then the output of F on the distribu-
tion X is statistically close to uniform. This result is formalized in the following
theorem. As in Section 3, we let and notice
that when

Theorem 2. Let F be the cascade construction defined, as above, over
a family of random functions. Let X be the input distribution to F defined
over L-block strings, and let X_L denote the probability distribution induced by X on
the last block. Then, if U is the uniform distribution over {0,1}^c,
we have

In particular, if and then

The proof of the theorem follows from Lemma 2 in combination with the
following lemma that shows that the cascade construction with a random family
of compression functions is δ-AU for sufficiently small δ.

Lemma 4. Let F be the cascade construction defined over a family of
random functions. Let X be an input distribution as assumed in Theorem 2.
Then, the family F is δ-AU
w.r.t. X, for a suitably small δ.

The proof of this lemma is based on the following two propositions: the first
analyzes the collision probability of the cascade function F over random com-
pression functions on inputs that differ (at least) in the last block (Proposition 1);
then we extend the analysis to the general case, i.e. for any two different and
(Proposition 2). All proofs appear in the final paper.

Proposition 1. Let be the cascade construction defined over a family
of random functions Let be two inputs to F that differ (at least) in the
last block, namely, and let be any value of the initial key. Then
where the probability is taken over the
choice of random functions in F.

Proposition 2. Let F be defined as above, let be two different inputs to
F, and let be any value of the initial key. Then
where the probability is taken over the choice of random functions
in F.

THE VALUE OF THE INITIAL KEY. We note that the above analysis holds for
any value of the initial key when the functions are truly random, which means
that in principle the key can be fixed to a constant. However, in practice not all the
functions of the function family satisfy this requirement. Thus, choosing the key
at random allows us, for example, to extend our analysis to the situation where a
negligible fraction of the functions in F are not close to random functions.
Note also that non-trivial min-entropy in the last block is required, even if the family of
compression functions is completely random. Assume an input distribution in
which the last block is fixed to a value B. The first L - 1 blocks induce some dis-
tribution for the last key in the sequence. Examining the output distribution
induced by (any) distribution on that key, it is easy to see that this distribution is sta-
tistically far from the uniform distribution. In particular, we expect that, with high
probability, a constant fraction of the elements will not appear in the distribution.

4.2 The Cascade Construction with Input Padding

The conditions imposed by Theorem 2 on the input distribution conflict with a
technical detail of the practical implementations of the cascade paradigm (such
as MD5 and SHA-1): rather than applying the cascade process to the input
these functions modify by concatenating enough padding bits as to
obtain a new input whose length is a full multiple of the block length. In some
cases this padding results in a full fixed block appended to Therefore, even
if has the property that the last block of input has relatively high entropy
(as required by Theorem 2) the actual input to the cascade does not have
this property any more. This fact is sufficient to make our main result from
Section 4.1 irrelevant to these real-world functions; luckily, however, we show
here that this serious obstacle can be overcome.
In order to better understand this problem we first describe the actual way
this padding is performed. We consider the concrete value b = 512 used as the
block length in these functions. Let n denote the bit length of the input x and let
n' = n mod 512. If n' < 448, then x is padded with the binary string 1 0^{447-n'},
followed by a 64-bit representation of n. If n' >= 448, then a whole new block
is added, with a padding similar to the one described above (with the binary
representation of n occupying the last 64 bits of the added block). From this
description, we see that if, for example, the original input x had a length which
was an exact multiple of 512, then the padded input is x || B, where B is a whole
new block (and || represents the concatenation operation). Moreover, the value of B is the
same (and fixed) for all inputs of the length of x. In particular, if we consider
the case in which we hash Diffie-Hellman values of length 1024 or 2048 bits
(this is the common case when working over Z_p^* groups), then we get that the
padded input will always have a fixed last block. In other words, regardless
of the entropy existing in the original input, the actual input to the cascade
process now has a zero-entropy last block.
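The byte-level analogue of this padding rule (our sketch; MD5 and SHA-1 pad at the bit level) makes the fixed-last-block effect concrete:

```python
def md_pad(data: bytes, block=64) -> bytes:
    """Append 0x80, zero bytes, then the 64-bit bit-length (MD5/SHA-1 style)."""
    bitlen = (8 * len(data)).to_bytes(8, "big")
    zeros = (-(len(data) + 1 + 8)) % block
    return data + b"\x80" + b"\x00" * zeros + bitlen

# A 128-byte (1024-bit) input gains one whole extra block, and that block is
# identical for every input of this length: the last block has zero entropy.
p1 = md_pad(b"\x11" * 128)
p2 = md_pad(b"\x22" * 128)
assert len(p1) == 128 + 64
assert p1[-64:] == p2[-64:]
```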
For this case, in which the last block is fixed, we show here that a somewhat
weaker (but still very useful) result holds. Specifically, combining Theorem 2 with
the assumption that the family of (compression) functions is pseudorandom,
we can prove that the output from the cascade construction is pseudorandom,
i.e., computationally indistinguishable from uniform (and thus sufficient for most
cryptographic applications). This result holds even though the key to the cascade
function F is revealed! We note that the assumption that the family is
pseudorandom is clearly implied by the modeling (from the previous subsection)
of these functions as random. But we also note that assuming the compression
function (family) of actual schemes such as MD5 or SHA-1 to be pseudorandom
is a standard and widely-used cryptographic assumption (see [BCK96b] for some
analytical results on these PRF constructions).

Lemma 5. Let {f_k} be a family of pseudorandom functions which is
indistinguishable from random for attackers restricted to time T and a sin-
gle query. Let F denote the cascade construction over the family {f_k}.
Further, let X be a probability distribution on L-block strings from
which the inputs to F are chosen, and let B be a fixed block. If the output distri-
bution F(X), with a random but known key, is close to uniform,
then the distribution F(X || B) (for a random but known key) is
indistinguishable from uniform by attackers that run in time T.

The above lemma together with Theorem 2 shows that if {f_k} is a family of
random functions, then the cascade construction with a fixed last block
is indistinguishable from random, provided that the original input distribution
(before padding!) satisfies the conditions of Theorem 2.
It is also worth noting that Lemma 5 can be generalized to input distributions
that can be described as the concatenation of two probability distributions X_1
and X_2, where X_1 satisfies the conditions of Theorem 2 and X_2 is an arbitrary
(polynomial-time samplable) distribution independent of X_1.
A PRACTICAL CONSIDERATION. Note that the application of Lemma 5 on an
input distribution still requires the last block of (before padding) to have
506 Yevgeniy Dodis et al.

relatively high min-entropy. To maximize this last-block min-entropy it is advis-
able that any input whose length is not a multiple of the block length be "shifted to
the right" (to a block boundary) by prepending a sufficient number of bits (say,
all zeros) to the beginning of the input. This way, the resultant string is of a length
that is a multiple of the block length and, more importantly, its last block contains
the full entropy of the last bits of the original input. Also, this shifting forces the
appended padding described earlier to add a full block, as assumed in Lemma 5⁴.
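The prepending trick can be sketched as follows (our illustration, matching the footnote's 1800-bit example):

```python
def shift_right(data: bytes, block=64) -> bytes:
    """Prepend zero bytes so the length becomes a multiple of the block size,
    keeping the high-entropy tail aligned with the final block."""
    return b"\x00" * ((-len(data)) % block) + data

x = b"\xab" * 225             # 1800 bits, as in the footnote's example
s = shift_right(x)
assert len(s) == 256          # 248 zero bits prepended -> a 4-block string
assert s.endswith(x[-64:])    # the last block keeps the entropy of the tail
```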

4.3 Modeling the Compression Function as a δ-AU Family

In Section 4.1 we presented an analysis of the basic cascade construction under
the modeling of the compression function as a family of random functions. Here
we study the question of what can be guaranteed on the output distribution of
the cascade under the simple assumption that the family of compression func-
tions is a good extractor (or, more generally, that this family is δ-AU). Clearly
this is a more realistic assumption on the underlying compression function. On
the other hand, in order to prove a close-to-uniform output in this case we are
going to require a stronger assumption on the input distribution. Specifically, we
are going to assume that the distribution on every block of input has high min-
entropy (i.e., many bits of min-entropy out of the b bits in the block), conditioned
on the distribution of the previous blocks. We prove below that under these
conditions the output of the cascade function is statistically close to uniform.
We note that the above requirement on the input distribution, while strin-
gent, is met in some important cases, such as applications that extract keys from
a Diffie-Hellman value computed over a high-entropy group. In particular, this
requirement is satisfied by the DH groups in use with the IKE protocol.
CONDITIONAL ENTROPY. Let X and Y be two probability distributions over
{0,1}^a and {0,1}^b, respectively. For a string y, we denote by X|y the distribution of X
conditioned on the event that the string y is selected according to Y. Then
we can define the conditional min-entropy of X|y (and denote it by H∞(X|y))
as the minimum integer m such that Pr[x|y] ≤ 2^-m for all x.
We define the conditional min-entropy of X with respect to Y as the expec-
tation, over y ← Y, of H∞(X|y).

Lemma 6. Assume that the family of compression functions {f_k} from b bits to
c bits has the property that, for any probability distribution defined over {0,1}^b
with sufficient min-entropy, the output distribution f_k(X), for random k, is
close to uniform. Further, assume that X is an
input distribution on L blocks with the property that, for each block,
⁴ For example, assume the inputs from the distribution to be of length 1800 bits.
Given such an input, we prepend to it 248 '0's, resulting in a 4-block string.
Now, when this input is processed by, say, SHA-1, an additional fifth block
is added. The important thing is that the last block of the shifted input receives
as much entropy from the last 512 bits of the original input as possible.

the distribution induced by X on that block has conditional min-entropy
with respect to the distribution induced by X on the preceding blocks.
Then, the cascade construction over the family applied to the distribution X
is statistically close to uniform.

In particular, if we assume that the family is δ-AU and the
min-entropy of each input block (as defined above) is sufficiently high, then we
get (using Lemma 2) a statistical distance between the cascade construction on
L blocks and the uniform distribution on of at most
Combining Lemmas 6 and 5 we get that, under the above assumption on
the input distribution, if the family of compression functions is both and
pseudorandom then the output of the padded cascade (see Section 4.2) is pseu-
dorandom (i.e. indistinguishable from uniform).
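The cascade construction referenced throughout this section can be sketched as follows. This is an illustrative toy, not the paper's exact construction: SHA-256 stands in for the keyed compression function, and the padding is a simplified one rather than the exact Section 4.2 padding.

```python
import hashlib

B = 64   # block size in bytes (b bits)
C = 32   # chaining/output size in bytes (c bits)

def compress(chain: bytes, block: bytes) -> bytes:
    # Toy compression function built from SHA-256; a stand-in for f keyed
    # by the chaining value.
    assert len(chain) == C and len(block) == B
    return hashlib.sha256(chain + block).digest()

def cascade(key: bytes, msg: bytes) -> bytes:
    # Pad to a whole number of blocks (simplified; the paper's Section 4.2
    # padding appends an extra block), then iterate the compression function.
    msg = msg + b'\x80' + b'\x00' * (-(len(msg) + 1) % B)
    chain = key
    for i in range(0, len(msg), B):
        chain = compress(chain, msg[i:i + B])
    return chain

out = cascade(b'\x00' * C, b'some input whose blocks carry fresh entropy')
print(len(out))  # 32: the final c-bit chaining value is the output
```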

5 HMAC Construction

We now turn to the study of HMAC [BCK96a] as a randomness extraction
family. HMAC (and its underlying family NMAC) is defined using the cascade
construction over a family of compression functions f with domain {0,1}^b,
range {0,1}^c and keys in {0,1}^c (as usual we denote K = 2^c). The family of
functions NMAC uses two independent keys k1, k2 drawn from {0,1}^c and is
defined over {0,1}* as NMAC_{k1,k2}(x) = f_{k1}(F_{k2}(x)), where F denotes
the cascade of f and where both x and the result F_{k2}(x) are padded as
described in Section 4.2. On the basis of NMAC one defines the family HMAC
as HMAC_k(x) = NMAC_{k1,k2}(x) with k1 = f_iv(k ⊕ opad) and
k2 = f_iv(k ⊕ ipad), where the value iv is fixed to the IV defined by the
underlying hash function, and opad and ipad are two different fixed strings
of length b. The analysis of HMAC is based on that of NMAC under the
specialized assumption that the keys k1 and k2 are “essentially independent”.
We keep this assumption and develop our analysis on NMAC. (The reason for
this form of derivation of the keys in HMAC is to allow for the use, without
modification, of the underlying hash function; in particular, without having to
replace the fixed IV with a variable value.)
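The HMAC equation above is the standard construction of [BCK96a]. As a sanity check, the following sketch (variable names are ours) rebuilds HMAC-SHA-1 from plain hash calls and compares it against Python's hmac module:

```python
import hashlib
import hmac

def hmac_sha1(key: bytes, msg: bytes) -> bytes:
    B = 64  # SHA-1 block length in bytes (b bits)
    if len(key) > B:
        key = hashlib.sha1(key).digest()   # long keys are hashed first
    key = key.ljust(B, b'\x00')            # pad key to a full block
    ipad = bytes(k ^ 0x36 for k in key)
    opad = bytes(k ^ 0x5c for k in key)
    inner = hashlib.sha1(ipad + msg).digest()   # the k2-keyed cascade
    return hashlib.sha1(opad + inner).digest()  # the k1-keyed outer call

key, msg = b'secret key', b'message to extract from'
assert hmac_sha1(key, msg) == hmac.new(key, msg, hashlib.sha1).digest()
print(hmac_sha1(key, msg).hex()[:16])
```

The two XOR masks play the role of opad and ipad in the text; feeding them as prefixed blocks is exactly what derives the two "essentially independent" keys from the single key k without touching the hash function's fixed IV.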
We start by observing that if one considers the family NMAC as a δ-AU
family, then we get δ ≈ 2/K. This is so since for any two inputs x ≠ x' the
probability that NMAC sends both values to the same output is the sum of the
probability that F_{k2}(x) = F_{k2}(x') (which is at least 1/K) plus the
probability that F_{k2}(x) ≠ F_{k2}(x') but f_{k1} maps these two different
results to the same value (which is also roughly 1/K). Therefore we cannot
apply the results of Section 2 directly to the analysis of NMAC.
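The two collision sources in this counting argument can be checked numerically. The sketch below idealizes the inner values F_{k2}(x), F_{k2}(x') as fresh uniform values and f_{k1} as a random table; the toy size K = 16, the seeded RNG, and all names are our illustrative assumptions:

```python
import random

random.seed(1)
K = 16            # output space size; each collision source contributes ~1/K
trials = 20000
collisions = 0
for _ in range(trials):
    # Fresh random "inner" values for the two fixed inputs (models F_{k2})
    gx, gy = random.randrange(K), random.randrange(K)
    # A fresh random "outer" function (models f_{k1})
    f = [random.randrange(K) for _ in range(K)]
    if f[gx] == f[gy]:
        collisions += 1

rate = collisions / trials
print(round(rate, 3))  # close to 1/K + (1 - 1/K)/K = 2/K - 1/K^2 ≈ 0.121
```

The measured rate sits near 2/K rather than 1/K, matching the observation that NMAC cannot be better than roughly 2/K-AU.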
However, we provide three analyses which, under different assumptions, establish
the security of NMAC as a randomness extractor.
DROPPING SOME OUTPUT BITS. Specifically, we assume the “outer” function
f_{k1} outputs c' < c bits (e.g., in case f_{k1} is a random function outputting
c bits, one can simply drop the last c − c' bits and view it as a random function
508 Yevgeniy Dodis et al.

outputting c' bits). In this case, a straightforward application of Lemma 1
and Lemma 2 shows that if the inner family is almost universal w.r.t. the input
distribution and the outer family is close to uniform, then NMAC extracts c'
bits at small statistical distance from uniform. Assuming now that both families
consist of random functions, Proposition 2 bounds the relevant parameters, and
this means that for a suitable choice of c' NMAC extracts c' bits which are
statistically close to uniform. In fact, the same is true even if the outer
function family is merely a good extractor (e.g., if it is pairwise independent).
In any case, we get that dropping roughly (log L + 160) bits from the output of
the NMAC construction makes it a good extractor. To make this result
meaningful, however, we must consider compression functions whose key is
non-trivially larger than 160 bits, such as the compression function of
SHA2-512.
COMPUTATIONAL SECURITY. Our second approach to analyzing NMAC is similar
to the analysis of the padded cascade from Lemma 5. We will present it in
the full version.
MODELING f_{k1} AS A RANDOM ORACLE. As we remarked, even if f_{k1} is truly
random, the value f_{k1}(F_{k2}(x)) cannot be statistically close to uniform,
even if F_{k2}(x) was perfectly uniform. This was argued under an extremely
strong distinguisher that can evaluate f_{k1} at all of its inputs. This is
different from the typical modeling of f_{k1} as a random oracle. Namely, in the
random oracle model it is assumed that the adversary can evaluate f_{k1} at no
more than a bounded number of points, t. This assumption can be seen as
restrictive, but it is in fact a realistic characterization of the adversary's
capabilities. Thus, we show that when we model the outer function of NMAC as a
random oracle, the construction is a good randomness extractor. We start by
showing a general result about the quality of using a random oracle as an
extractor, and then apply it to the NMAC construction.

5.1 Random Oracle as an Extractor

In this section, we show that by utilizing the power of the random oracle to the
fullest, we can provide some provable guarantees on the quality of the random
oracle as a randomness extractor. Our precise modeling of the random oracle f
is the following. The adversary is allowed to adaptively query the random
oracle at up to t points, and possibly make the distribution X depend on these
queries. However, we assume that the remaining “unqueried” values of f are
chosen randomly and independently of X, and are never given to the adversary.
Finally, given a distribution X and a number t, we consider the probability
mass of the t heaviest elements under X.

We stress that this is very different from our modeling of a random function
from before, where the adversary first chooses the distribution X, after which
f is chosen at random (independently from X) and given to the adversary in its
entirety.


Lemma 7. Assume f is a random oracle from n bits to c bits, and the adversary
can evaluate f in at most t points. Then, for any distribution X on {0,1}^n,
the maximal probability with which the adversary can distinguish f(X) from the
uniform distribution over {0,1}^c is bounded in terms of t and of the
probability mass of the heaviest elements of X.
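The setting of Lemma 7, in its simplest non-adaptive form (the adversary makes no queries before X is sampled), can be simulated at toy sizes. All parameters and names below are our illustrative assumptions: the oracle is a random lookup table, and X is uniform over a random 2^m-element support:

```python
import random

random.seed(7)
n, c, m = 12, 4, 8          # oracle {0,1}^n -> {0,1}^c; X has min-entropy m
N, C, M = 2 ** n, 2 ** c, 2 ** m

f = [random.randrange(C) for _ in range(N)]   # the random oracle, chosen
support = random.sample(range(N), M)          # independently of X's support

# Statistical distance between f(X) and the uniform distribution on {0,1}^c:
counts = [0] * C
for x in support:
    counts[f[x]] += 1
sd = sum(abs(cnt / M - 1 / C) for cnt in counts) / 2
print(round(sd, 3))  # typically around 0.1 for these toy parameters
```

The distance shrinks as the gap between the min-entropy m and the output length c grows, which is the qualitative content of the lemma.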

Remark 2. We already remarked that no single function can be a universally
good extractor, which means that one has to use a function family instead,
indexed by some key. On the other hand, in the idealized random oracle
model, we manage to use a single random oracle f in Lemma 7. This is not
a contradiction, since we critically assumed that the adversary cannot read the
entire description of the random oracle. In essence, the choice of the random
oracle can be viewed as a key, but the distinguisher cannot read the entire
key (although it has a choice of which parts of it to read) and therefore cannot
adversarially choose a bad distribution X. Put differently, in our analysis we
could assume that a large part of the key (i.e., the unqueried values of f) is
chosen independently of X, which is consistent with conventional extractors
such as those obtained by the LHL. However, unlike the conventional extractors,
we (restrictively) assume that the adversary never learns the entire description
of the key (i.e., the unqueried parts of f), which allowed us to get a much
stronger bound than what we could get with the LHL. For example, the LHL
required the min-entropy of X to be substantially larger than the output
length, while Lemma 7 makes do with a much weaker requirement.
We will also use the following corollary of Eq. (4) and Lemma 1.
Corollary 1. If a family of functions {h_k} is universal w.r.t. a distribution
X, f is a random oracle, U is the uniform distribution on the range of f, and
the adversary can make at most t queries to the random oracle, then the maximal
probability with which the adversary can distinguish the pair (k, f(h_k(X)))
from (k, U) is correspondingly small.

The above corollary implies that the composition of a collision-resistant hash
function and a random oracle can be viewed as a relatively good extractor. This
is because a (computationally) collision-resistant function must be
(information-theoretically) almost universal. More precisely, if a function
family is collision-resistant with exact security ε against non-uniform
adversaries running in linear time, it must also be ε-universal. For uniform
adversaries running in time T, it must be ε-universal w.r.t. any distribution
which is samplable in time T.
APPLICATION TO NMAC. We can now apply Corollary 1 to the case of NMAC,
assuming that the outer function f_{k1} is a random oracle which can be
evaluated in at most t places. By Proposition 2, the inner family is almost
universal when the function family is chosen at random. Thus, Corollary 1
implies that NMAC extracts bits which cannot be distinguished from random
except with small probability, which is negligible for an appropriate setting
of the parameters.


Acknowledgments

We thank David Wagner for valuable discussions.

References

[BST03] B. Barak, R. Shaltiel, and E. Tromer. True Random Number Generators
Secure in a Changing Environment. In CHES '03, pages 166-180, 2003.
[BCK96a] M. Bellare, R. Canetti, and H. Krawczyk. Keying hash functions for
message authentication. In Crypto '96, pages 1-15, 1996. LNCS No. 1109.
[BCK96b] M. Bellare, R. Canetti, and H. Krawczyk. Pseudorandom Functions
Revisited: The Cascade Construction and Its Concrete Security. In Proc. 37th
FOCS, pages 514-523. IEEE, 1996.
[BKR94] M. Bellare, J. Kilian, and P. Rogaway. The Security of Cipher Block
Chaining. In Crypto '94, pages 341-358, 1994. LNCS No. 839.
[CW79] L. Carter and M. N. Wegman. Universal Classes of Hash Functions.
JCSS, 18(2):143-154, April 1979.
[Dam89] I. Damgård. A Design Principle for Hash Functions. In Crypto '89,
pages 416-427, 1989. LNCS No. 435.
[GKR04] R. Gennaro, H. Krawczyk, and T. Rabin. Secure Hashed Diffie-Hellman
over Non-DDH Groups. In Eurocrypt '04, 2004. LNCS.
[Gol01] Oded Goldreich. Foundations of Cryptography. Cambridge University
Press, 2001. Preliminary version at http://philby.ucsd.edu/cryptolib.html/.
[HILL99] J. Håstad, R. Impagliazzo, L. Levin, and M. Luby. Construction of a
Pseudo-random Generator from any One-way Function. SIAM J. Computing,
28(4):1364-1396, 1999.
[IKEv2] Internet Key Exchange (IKEv2) Protocol, draft-ietf-ipsec-ikev2-13.txt
(to be published as an RFC). Editor: C. Kaufman, March 2004.
[Kra03] H. Krawczyk. SIGMA: The 'SIGn-and-MAc' Approach to Authenticated
Diffie-Hellman and Its Use in the IKE Protocols. In Crypto '03, pages 400-425,
2003. LNCS No. 2729.
[Lub96] Michael Luby. Pseudorandomness and Cryptographic Applications.
Princeton Computer Science Notes, Princeton University Press, January 1996.
[RFC2409] RFC 2409. The Internet Key Exchange (IKE). Editors: D. Harkins and
D. Carrel, Nov 1998.
[Sha02] R. Shaltiel. Recent developments in Extractors. Bulletin of the
European Association for Theoretical Computer Science, Volume 77, June 2002,
pages 67-95.
[Sti94] D. R. Stinson. Universal Hashing and Authentication Codes. Designs,
Codes and Cryptography, 4 (1994), 369-380.
[WC81] M. N. Wegman and L. Carter. New Hash Functions and Their Use in
Authentication and Set Equality. JCSS, 22(3):265-279, July 1981.

Efficient Tree-Based Revocation
in Groups of Low-State Devices

Michael T. Goodrich1,*, Jonathan Z. Sun1,*, and Roberto Tamassia2,**
1 Dept. of Computer Science, Univ. of California, Irvine, CA 92697-3425
2 Dept. of Computer Science, Brown Univ., Providence, RI 02912

Abstract. We study the problem of broadcasting confidential informa-
tion to a collection of n devices while providing the ability to revoke
an arbitrary subset of those devices (and tolerating collusion among the
revoked devices). In this paper, we restrict our attention to low-memory
devices, that is, devices that can store at most O(log n) keys. We consider
solutions for both zero-state and low-state cases, where such devices are
organized in a tree structure T. We allow the group controller to encrypt
broadcasts to any subtree of T, even if the tree is based on a multi-way
organizational chart or a severely unbalanced multicast tree.

1 Introduction

In the group broadcast problem, we have a group S of devices and a group
controller (GC) that periodically broadcasts messages to all the devices over
an insecure channel [8]. Such broadcast messages are encrypted so that only
valid devices can decrypt them. For example, the messages could be important
instructions from headquarters being sent to PDAs carried by employees in a
large corporation. We would like to provide for revocation, that is, for an
arbitrary subset R ⊆ S, we would like to prevent any device in R from
decrypting the messages.
We are interested in schemes that work efficiently with low-memory devices,
that is, devices that can store at most O(log n) secret keys, where n = |S|.
Such a scenario models the likely situation where the devices are small and the
secure firmware dedicated to storing keys is smaller still. We refer to this as
the log-key restriction.
We consider two variants of this model.
The static or zero-state version: the keys on each device cannot be
changed once the device is deployed. For example, the keys could be
written into secure firmware at deployment.
* This work was supported in part by NSF Grants CCR-0312760, CCR-0311720,
CCR-0225642, and CCR-0098068.
** This work was supported in part by NSF Grants CCR-0311510, CCR-0098068, and
IIS-0324846 and by a research gift from Sun Microsystems.

M. Franklin (Ed.): CRYPTO 2004, LNCS 3152, pp. 511-527, 2004.
© International Association for Cryptologic Research 2004
512 Michael T. Goodrich, Jonathan Z. Sun, and Roberto Tamassia

The dynamic or low-state version: any of the keys on each device
can be updated in response to broadcast messages. For example, such de-
vices might have a small tamper-resistant secure cache in which to store and
update secret keys.

Organizing Devices Using Trees. The schemes we consider organize the set
of devices in a tree structure, associating each device with a different leaf in
the tree. In fact, we consider three possible kinds of trees that the devices can
conceptually be organized into.
1. A balanced tree. In this case, the devices are associated with the leaves
of a balanced tree where each internal node has a constant number of
children; hence, each leaf is at depth O(log n). This tree is usually chosen
purely for the sake of efficiency and, in fact, has been the only tree
considered in previous related work we are familiar with. For example, it
forms the basis of the Logical Key Hierarchy (LKH) scheme [26, 28], the
One-way Function Tree (OFT) scheme [21], the Subset-Difference Revocation
(SDR) scheme [16], and the Layered Subset Difference (LSD) scheme [10].
2. An organizational chart. In this case, the devices are associated with the
leaves of a tree that represents an organizational chart, such as that of a
corporation or university. For example, internal nodes could correspond to
campuses, colleges, and departments. The height of this tree is assumed
to be O(log n), but the number of children of an internal node is not assumed
to be bounded by a constant. Thus, the straightforward conversion of this
tree into an equivalent bounded-degree tree may cause the height to grow
beyond O(log n).

3. A multicast tree. In this case, the devices are associated with the nodes of
a multicast tree rooted at the group controller. The logical structure of this
tree could be determined in an ad hoc manner so that no bound is assumed
on either the tree height or the degree of internal nodes. Thus, this tree
may be quite imbalanced and could in fact have height that is exponentially
greater than the number of keys each device can hold.
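The balanced-tree case (item 1) can be made concrete with a short sketch of the key assignment and of subtree-based revocation. This is a minimal illustration of the log-key restriction using heap-style node numbering (our choice), not a faithful implementation of LKH, OFT, SDR, or LSD:

```python
# Nodes of a complete binary tree are numbered heap-style: the root is 1,
# the children of node v are 2v and 2v+1, and with n devices, device i sits
# at leaf n + i. Each device holds exactly the keys on its leaf-to-root path.

def path_keys(leaf):
    # The keys held by the device at this leaf: one per ancestor,
    # log2(n) + 1 keys in total -- the log-key restriction.
    keys = []
    v = leaf
    while v >= 1:
        keys.append(v)
        v //= 2
    return keys

def cover_excluding(n, revoked_leaf):
    # Subtrees whose leaves are exactly all devices except the revoked one:
    # the siblings of the nodes on the revoked leaf's path (the classic
    # complete-subtree idea). A broadcast excluding the revoked device is
    # encrypted once under each subtree key in this cover.
    cover = []
    v = revoked_leaf
    while v > 1:
        cover.append(v ^ 1)  # v ^ 1 is the sibling of v
        v //= 2
    return cover

n = 8                              # devices occupy leaves n .. 2n-1
print(path_keys(n + 5))            # [13, 6, 3, 1]: keys held by device 5
print(cover_excluding(n, n + 5))   # [12, 7, 2]: keys used to revoke device 5
```

The cover for a single revoked device has size log2(n), so the broadcast overhead matches the per-device key storage. A subtree broadcast, as discussed below, is even simpler: one encryption under the key of the subtree's root.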
In using trees, particularly in the latter two cases, we feel it is important to
provide the group controller with the capability of encrypting a message so
that it may be decrypted only by the devices associated with nodes in a certain
subtree.
For instance, a sporting event might be broadcast to just a single region, or a
directive from headquarters might be intended just for a single division. We call
such a broadcast a subtree broadcast, which can also be modeled by multiple
GCs, each assigned to a different subtree. We continue in this case to assume

